Tuesday 29 December 2009

Backup Exec Licensing

Am I the only person who thinks the licensing scheme of Backup Exec is insane?

Why must there be so many options, each more expensive than the last, for features that should just be included? And why exactly must we pay for enhancements which are often essentially fixes for features they never implemented properly the first time?

Grrrrrrrrr

Tuesday 10 November 2009

Reporting in Backup Exec ... no more pain and misery...

If, like me, you manage a largish Backup Exec installation with several media servers, hundreds of backups and lots of clients, you'll probably be pretty frustrated with the half-assed nature of the Backup Exec logging and reporting capabilities.

For a long time, I've wanted a simple but powerful way to do things like "show me backups that are consistently failing over 'x' period", or "show me the most likely time of day for backup jobs to fail", etc.

So, having looked everywhere and found no sane solution, I've just started writing one. Now I have a great little interface where I can review my backups, see what jobs are failing constantly, review the issue, fix it and then mark it as resolved so it can start being checked again.
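To give a flavour of the two questions above, here's a minimal sketch in Python. It is not the tool itself, and it assumes the job history has already been pulled out of Backup Exec into simple records. The JobRun fields, the function names and the min_failures threshold are all made up for illustration and are not Backup Exec's own schema.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class JobRun:
    # Hypothetical record layout, not Backup Exec's own schema.
    job_name: str
    started: datetime
    succeeded: bool


def consistently_failing(runs, since, min_failures=3):
    """Jobs with at least min_failures failures and no success since 'since'."""
    failures = Counter()
    had_success = set()
    for run in runs:
        if run.started < since:
            continue  # outside the period being reviewed
        if run.succeeded:
            had_success.add(run.job_name)
        else:
            failures[run.job_name] += 1
    return sorted(
        name for name, count in failures.items()
        if count >= min_failures and name not in had_success
    )


def failures_by_hour(runs):
    """(hour of day, failure count) pairs, most failure-prone hour first."""
    return Counter(r.started.hour for r in runs if not r.succeeded).most_common()
```

Feed it a list of JobRun records and a cut-off date, and the first function answers "what is failing constantly" while the second answers "when do jobs tend to fail".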

I'm thinking of adding lots of features and eventually making it something I can sell for a reasonable (read: not outrageous) fee to others who feel the pain...

Any suggestions welcomed...

Wednesday 30 September 2009

Error E000FE30 every day on one server...

...for months. For months I've struggled with a problem on ONE server, that happens to be at a remote site on a different subnet, connected via a WAN VPN Link.

Every day, one or more jobs would fail with Backup Exec errors, mainly E000FE30 - with the useful and generic messages about "communications failure has occurred" and sometimes the "connection lost to the remote agent".

Needless to say, I've spent some time working on this, and tried all sorts. Reconfiguring the system to use a different WAN link to ensure the fault isn't with the WAN. Nothing. Checking to ensure the issue isn't with the server, reinstalling agents, trying all sorts.

I've updated network drivers, checked all sorts of patches etc - but nothing. Still this error, and still consistently failing jobs.

I even got a colleague to look at it for a fresh pair of eyes and he too tried all sorts. Given the error, we suspected "something" to do with comms, but never found any issue, and in hundreds of tests conducted could never replicate the issue - transferring large files to/fro the server worked fine etc.

Today I found the answer. The "Large TCP Offload" feature on the Network Card. While I've seen plenty of issues with this feature before, you normally see it with terrible throughput on the system in general and so on - but this machine is solid as a rock for everything else.

Still, with the setting turned off, we've had our first complete, full backups in a few weeks... voila!

Top tip for anyone else facing this problem - don't just check the network drivers, but try turning off these features, even if you cannot see this issue at any other time on the machine.

Is this a Backup Exec issue? I'm not sure, but I'm happy to blame it since everything else works just fine.

Tuesday 18 August 2009

Debugging Mode for Remote Agent...

Right now I have a problem where a customer's server just doesn't back up. Not for love nor money. The connection is over a WAN type connection, so not your average setup, but nonetheless it was working fine previously.

First of all we suspected the comms - e.g. VPN, the routers, the provider connection - and did the usual testing to be sure that isn't the cause. We saw a couple of things that made us "think" the issue was there, but nothing too worrying.

Not a problem though, we've got multiple WAN links, and another way to get to the other end. A couple of config changes, and the route now uses a different WAN link at both ends (i.e. the BEWS media server and the RAWS-enabled server being backed up).

Problem still exists. So it's not comms then, given that the other link we switched to is used to back up another server at the same site daily (40-60Gb/day) without complaint.

So the issue is likely something with the server being backed up.

The point of the post though, is to let you know how to temporarily enable the debug backup logging on the Remote Agent (RAWS).

Stop the service, add "-debug" (no quotes) to the "startup parameters" in the services management in Admin Tools (or start > run > services.msc and hit enter...)

Start the service. You're good for logs until the next restart of services/server reboot.

Logs go in your Backup Exec install directory on the server being backed up, in the Logs subfolder, named beremote... something...

(Oh yeah, and a warning, the debug logs are HUGEEEEEE)
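Since those logs grow so quickly, here's a rough Python sketch for keeping an eye on them. It assumes you pass it the path to the Logs subfolder yourself, and it matches on "beremote" anywhere in the file name because I can't remember the exact naming. The function name and the name_fragment parameter are my own invention, not anything Backup Exec provides.

```python
from pathlib import Path


def debug_logs_by_size(logs_dir, name_fragment="beremote"):
    """Return (size_in_bytes, file_name) pairs for matching files, largest first."""
    logs = []
    for path in Path(logs_dir).iterdir():
        # Match case-insensitively on the (assumed) name fragment.
        if path.is_file() and name_fragment in path.name.lower():
            logs.append((path.stat().st_size, path.name))
    return sorted(logs, reverse=True)
```

Run it against the Logs folder once debugging is on, and again a bit later, to see how fast the disk is filling up before you forget to turn the debug switch back off.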

Saturday 27 June 2009

Backup saves your ass, Microsoft kicks it hard

Today has been an interesting day where I learned that our backup platform is a joy.

The basic scenario is pretty simple. Customer server dies and our on call guy says we need to restore.

Drop a clean Win2003 box onto our virtual platform, run a restore, bingo, first time success. Reboot the machine and...


Windows Product Activation kicks in, begging for reactivation. Many hours of hassle later, and with no help from the so-called 'partner critical support line', which seemed bothered only about making up statistics about why I was calling, I get the issue resolved myself.

The moral of the story? Backup Exec works fine and Microsoft Product Activation smacks. Again.

Monday 8 June 2009

So it just died....

About a week ago our Backup Exec CASO box decided it had had enough of talking to Managed Media Servers. Randomly declaring known working servers to be "unavailable".

The usual checks started, and nothing. Patches checked, removed etc. Nothing.

The solution. Search for any msgq*.dat files on your servers. Delete them.
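If you'd rather not hunt for them by hand, something like this Python sketch does the search. It assumes you point it at whichever folder or drive you want searched (I'm not giving a location here), and it only lists files unless you explicitly turn off dry_run. The function names and the dry_run flag are just for illustration.

```python
from pathlib import Path


def find_msgq_files(root):
    """Return all files matching msgq*.dat under root, searched recursively."""
    return sorted(p for p in Path(root).rglob("msgq*.dat") if p.is_file())


def delete_msgq_files(root, dry_run=True):
    """Return the matching files; delete them only when dry_run is False."""
    found = find_msgq_files(root)
    for path in found:
        if not dry_run:
            path.unlink()  # permanent delete, no recycle bin
    return found
```

Run it with the default dry_run=True first and read the list it returns before letting it delete anything.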

Voila. Everything works... I'm not happy...

Sunday 8 March 2009

Still here, still managing Backup Exec every day...

It's been some months since we've posted here, and the foolish (and those who never actually deal with Backup Exec) would probably believe that's because we've either forgotten about the blog, or we've got everything working.

You'd be wrong.

It's true to say that we've had a little more success, and now have 5 10d boxes and 1 12d box, all running, and most of the time it tends to play well. Which is lovely, but when it does go wrong, it tends to lose it completely.

Here are a few problems we've currently got:

a) An old "Managed Media Server" that is long since departed just won't go away from all parts of the UI - most of it knows it has gone, but some parts still show it, as if it may somehow come back one day. It won't.

b) 2-3 jobs are stuck in an eternal state where they're "On Hold, Running" according to the status. That's not true. In fact, they've been stuck there for a year, meaning I can't delete the now-redundant Policies, Templates or Selection Lists for those jobs. They're just stuck there forever.

c) Sometimes a job fails claiming the cause to be a Communications Failure. Communications Failure is Backup Exec speak for "most problems". There is naff all wrong with the communications, and normally we resolve it by deleting/recreating the job.

d) Synthetic jobs, well let's see. They suck. They only work in exacting circumstances, and the minute you step out of line of one of those or a job is missed, well that's your life made hell. They start failing, come up with lots of silly errors and you end up re-creating them, waiting for a full again. So you tend to not bother, and just do an old-style Full/Incremental set, since they normally work.

In the case of 12d, it does tend to be a little better, particularly with Exchange backups. Except it STILL doesn't properly manage media, so the IMG folders it creates don't always get deleted (although it reckons they will). Still no joy on having the B2D files reused or deleted. Hell no, that'd make sense.

So yeah, I'm still here, managing Backup Exec, 7 days a week, doing what it should do for it, and going mental every time I come in to find it's just collapsed without warning. Quality it is not.

Sadly I've still yet to find a better solution at a price point that is sane.