Saturday, 24 December 2016
DPM: Replica is inconsistent, Error 3106, "system cannot find the path specified" - despite restarting both DPM + Protected Server systems
Each volume or resource would be in a "Replica is inconsistent" state. You'd play the usual game of running consistency checks, or consistency & synchronisation, and the job would go to "OK" for a while - but the ability to make a new recovery point was missing (eg the "Create a recovery point after synchronizing" option was unavailable), and a short while later the protected volume would return to the "Replica is inconsistent" state.
After much head scratching and monitoring, we realised it was very simple... the installation path for DPM includes a temp folder, for example:
"C:\Program Files\Microsoft System Center 2012\DPM\DPM\Temp"
The folder "MTA" had been removed as part of a clear up of old temporary resources, and despite this folder not being actively used during a backup it seems not having it breaks DPM.
Simply recreate a folder called "MTA" inside that Temp folder and you'll then find everything is working just fine again - re-run those consistency checks and then make a recovery point with synchronisation and all will be well.
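If you'd rather script the fix than click around, here's a minimal Python sketch. It assumes the Temp path shown above (yours may differ depending on DPM version and install location), and the function name is just something we made up for illustration.

```python
import os

# Path taken from the example above; this is an assumption, so adjust it
# to wherever DPM is actually installed on your server.
DPM_TEMP = r"C:\Program Files\Microsoft System Center 2012\DPM\DPM\Temp"

def ensure_mta_folder(temp_dir=DPM_TEMP):
    """Create the MTA folder under the DPM Temp directory if it is missing.

    Returns (path, created) where created is True only if we had to make it.
    """
    mta_path = os.path.join(temp_dir, "MTA")
    existed = os.path.isdir(mta_path)
    os.makedirs(mta_path, exist_ok=True)
    return mta_path, not existed
```

Call ensure_mta_folder() on the DPM server, then re-run the consistency checks as described above.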
Hopefully this will help someone else with a similar issue!
Friday, 22 May 2015
The reality is that in our experience... it just works.
However, we recently had a problem where our primary DPM Server at a site failed. Having rebuilt it, and gotten things going again, we had issues getting Secondary Protection to work again, with DPM throwing error 33119.
After wondering what was happening, we found the answer pretty simple...
On the PRIMARY DPM Server...
Make sure the "DPM Writer" and "DPM Access Manager" services are running - for some reason the Primary server wasn't running those services (although it was otherwise working) - a quick service start and they've been reunited and now work.
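A rough Python sketch of that check, for anyone who wants to automate it. It shells out to the standard Windows "sc query" command and reads the STATE line. The short service names below are our assumption of what sits behind the display names, so confirm them in services.msc first; the function names are purely illustrative.

```python
import re
import subprocess

# ASSUMPTION: short service names for "DPM Writer" and "DPM Access Manager".
# Check the real names in services.msc before relying on these.
DPM_SERVICES = ["DpmWriter", "DPMAMService"]

def parse_sc_state(sc_output):
    """Return the state word (e.g. 'RUNNING') from `sc query` output, or None."""
    match = re.search(r"STATE\s*:\s*\d+\s+(\w+)", sc_output)
    return match.group(1) if match else None

def service_state(name):
    """Query one Windows service with `sc query` and return its state word."""
    result = subprocess.run(["sc", "query", name], capture_output=True, text=True)
    return parse_sc_state(result.stdout)

def stopped_services(names=DPM_SERVICES):
    """List the services that are not currently reported as RUNNING."""
    return [n for n in names if service_state(n) != "RUNNING"]
```

Anything returned by stopped_services() on the primary DPM server is worth starting before you retry secondary protection.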
Saturday, 1 June 2013
Data Protection Manager does, on the whole, do a damn good job of making backups work reliably. There are some annoying quirks, and when it does have a problem of some sort it tends to be an issue communicating with a server. The error numbers, messages and explanations are pretty hopeless in many cases, which does take the shine off an otherwise decent product.
But one area that we have recently been exploring in more depth is the "Manual Replica" feature. Traditionally we've not really had any need to get initial replicas via anything but the LAN/WAN links already in place, but more recently we had a few scenarios where it sounded like it could be a winner.
The process is pretty much undocumented as far as TechNet goes - there is a vague explanation but it doesn't cover several obvious and common scenarios whatsoever.
The basic principle is that you literally "copy" the source drive's files to another media - like a removable hard disc, then copy them onto the DPM server.
Sounds simple right?
The first stage is - assuming you've got a couple of good tools that let you grab "in use" files or get them via some VSS handiness, and can add an NTFS formatted drive etc. In practice it is the second part - getting it into DPM - that is far more convoluted, as you have to know the replica path, you have to mount it so you can access it, and only then can you copy the files.
The hassle lies in having to get the replica path from DPM's admin console (easy), copying that to clipboard and into notepad or similar (easy), extracting the volume ID (easy as long as you know which bit is the volume ID), and then finding the actual volume ID Windows has via mountvol (not that easy, and time consuming when you have a lot of volumes). Finally you need to mount that volume so you can access it, browse to the "Full" folder and copy the files (only easy once you know how).
All of this would be relatively simple if the documentation was decent, but it isn't.
What makes it more bizarre is that the process of getting the IDs and mounting volumes is something that could be done within the UI if they wanted pretty easily, or at least via some DPM PowerShell.
Since this wasn't the case (and isn't as of the latest build), one of my colleagues and I knocked up a little script to make the last part a bit neater - essentially we feed a script the DPM volume string, and the drive letter we want to mount as, and the script queries the system volumes, finds the matching one and mounts it. Voila!
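This isn't our original script, but the idea can be sketched in Python along these lines. Big assumptions: that the first GUID appearing in the replica path string is the volume ID, and that the same GUID appears in a \\?\Volume{...}\ line of the "mountvol" output. Check both by hand on your own server before trusting it. All function names are made up for illustration.

```python
import re

GUID_RE = re.compile(
    r"[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}"
)

def extract_volume_guid(replica_path):
    """Return the first GUID found in a DPM replica path, lower-cased.

    ASSUMPTION: the first GUID in the path is the volume ID.
    """
    match = GUID_RE.search(replica_path)
    if match is None:
        raise ValueError("no volume GUID found in replica path")
    return match.group(0).lower()

def find_volume_name(replica_path, mountvol_output):
    r"""Return the \\?\Volume{GUID}\ line from mountvol output for this replica."""
    guid = extract_volume_guid(replica_path)
    for line in mountvol_output.splitlines():
        candidate = line.strip()
        if candidate.startswith("\\\\?\\Volume{") and guid in candidate.lower():
            return candidate
    return None

def mount_command(replica_path, mountvol_output, drive_letter):
    """Build the mountvol argument list that would mount the replica volume."""
    volume = find_volume_name(replica_path, mountvol_output)
    if volume is None:
        raise LookupError("replica volume not present in mountvol output")
    # Normalise "X", "X:" or "X:\" to "X:\"
    return ["mountvol", drive_letter.rstrip(":\\") + ":\\", volume]
```

Feed it the replica path copied from the DPM console plus the text that "mountvol" prints with no arguments, and pass the resulting list to subprocess.run() from an elevated prompt. Building the command separately from running it means you can eyeball the volume name before anything gets mounted.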
Copy the files, then run a consistency check on the manual replica, which will then complete the job and let ongoing backups happen.
So we're done now right? Well not quite.
All of the above assumes you're creating a manual replica of a file storage volume (eg a typical "drive"). There doesn't appear to be any documentation on the TechNet site for any release of DPM that explains how you create manual replicas for resources like SQL Server, Exchange - it suggests you can, and you can make those resources go into "manual replica creation pending" but it does not make it at all clear how you actually achieve this.
So my simple request to Microsoft is PLEASE provide better documentation in this area, and perhaps throw in a couple of scripts to simplify the process so we can just "get on with it" - as a general rule the Server 2012 and DPM 2012 releases do exactly that so it'd be good to see that last bit work too.
Thursday, 23 August 2012
Recently I moved our DPM servers over to DPM 2012 - and was amazed by how smoothly it went (more because I'm used to upgrading Backup Exec and watching the earth cave in), and all seemed to be working well.
However, I'm VERY paranoid when it comes to backups - and so my colleague has responsibility for making sure backups happen - reviewing reports and so on. We'd been using the reporting in DPM 2010 quite happily; he received and reviewed the reports on a regular basis.
Since 2012 he'd not been getting them... it seems that the reporting gets broken on an upgrade, and the SMTP Server settings were not right anymore - bizarrely each server we had seemed to have different states - one had no SMTP Server details anymore, one had them but complained they weren't right, and the other had half of the settings. All very strange.
In theory this wasn't an issue - back into SMTP Settings, repopulate and reconfigure the reporting part.
Despite having all the details in the "SMTP Server" setting, and having those details set correctly (validated by the send test option and the receipt of a test e-mail etc) we couldn't set up any of the reports.
"DPM cannot setup an e-mail subscription for this report"
(and then advises you to go and setup your SMTP Server!)
It turns out that the issue is in fact that SQL Reporting Services doesn't actually have the details you entered - from what I can see, the system still looks in the DPM 2010 instance of SQL (because the upgrade doesn't remove it, or the old SQL instance it made), so it updates that instead. D'oh!
Whilst that's clearly a bug and should be fixed, the good news is that there is a quick fix.
Go into SQL Reporting Services Configuration, log into the DPM2012 instance, choose the "E-Mail Settings" and fill in the Sender Address and SMTP Server. Save that and you're golden.
Tuesday, 14 August 2012
But the good news is that unlike the blog post, the actual install of DPM 2012 went well :-)
I'll try and re-write our experience of setting it up properly later!
One of the "to clear up" items was to remove old no-longer-in-use Protected Computers from the Agents List in Management - for various reasons we'd ended up with some machines that still show up despite being long since removed and thrown away.
There's no way to do this from the main UI, but with a quick bit of DPM PowerShell you can nuke those old systems...
Run this script:
Remove-ProductionServer.ps1 -DPMServerName DPMSERVERNAMEHERE -PSName SERVERNAMEHERE
NOTE: You'll generally need to do "SERVERNAME.fullyqualified.com" if it is a domain joined server.
Once you've closed/reopened the UI you'll have a much tidier list :-)
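If you have a whole pile of dead servers to clear out, a tiny helper like this Python sketch can generate the command lines for you to paste into the DPM shell. The server names and domain suffix in the usage note are hypothetical placeholders; the only thing taken from above is the script name and its two parameters.

```python
def build_removal_commands(dpm_server, stale_servers, domain_suffix=None):
    """Return one Remove-ProductionServer.ps1 command line per stale server."""
    commands = []
    for name in stale_servers:
        # Append the domain suffix only to short names, since domain joined
        # servers generally need the fully qualified name.
        if domain_suffix and "." not in name:
            name = name + "." + domain_suffix.lstrip(".")
        commands.append(
            "Remove-ProductionServer.ps1 -DPMServerName {} -PSName {}".format(
                dpm_server, name
            )
        )
    return commands
```

For example, build_removal_commands("DPM01", ["OLDWEB", "OLDSQL"], "fullyqualified.com") gives you two lines ready to run, one per server.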
Thursday, 9 August 2012
As those familiar with this blog will know, we used to use Symantec Backup Exec to look after backups for many many servers. We ran into all sorts of problems, and that's how we ended up creating this blog – known originally just as Backup Exec Hell. Today we use the Microsoft Data Protection Manager product since our business is *almost* entirely Windows based, and since we did that we spend a lot less time dealing with backup issues.
As a result we've not spent all that much time paying attention to Backup Exec's development since we moved off it, and our core experiences of it ended with version 10d. For reasons better known to someone else, I recently decided I was a little curious to see how the product had moved on. Popping onto the Symantec Forums, I soon saw a good number of posts complaining about the latest release – Backup Exec 2012.
Obviously I haven't used the product, but one thing that sticks out is the general noise about one change…
They've made major changes to the way it is designed to operate – switching the core method of backup from the old “Selection Lists” and “Resources” to a “Server Centric” view.
This seems to be causing lots of complaints from long term users. On first glance it does sound like Symantec have done a bad thing. But actually (and I hate to say it) I see why they've made the change. It's just that the execution is lousy… (so nothing new for our chums at Symantec).
To understand why they've changed the way you handle things to be server centric, you only have to look as far as Data Protection Manager (which is sort of server centric). Today's backup systems take advantage of new technologies (certainly compared to the original tech available when Backup Exec's current methods were written), plus the nature of server technology has changed.
Back in the day you had a server which you could and did back up in a “files on the disk” manner. You couldn't back up files in use, and so on. This meant for some tools – like a database, you either had to take the database offline, or the database system had to have some backup utility, spitting files out to the disk, so you could back them up. Eventually we ended up with things like “System State” in Windows, and then alongside the crazy growth in disk storage needs, some tech got really smart – like Microsoft Exchange or SharePoint. The problem with the “way it used to be” is that it was a slow process, the inability to back up stuff in use/always open was becoming a pain, and having to do one type of backup, then another, was asking for trouble. Plus taking a system offline to do backups was horrible and, as technology is more and more critical and 24/7 in nature, unfeasible.
We now have technologies like VSS (for snapshots) and this “virtual machine” thing has happened – thanks to VMware, Hyper-V etc. With virtual machines in particular, the hypervisors now offer direct support for backing up systems (without having backup software agents on every machine, all running unaware of each other etc).
As a result of this, it's more important than it used to be to get a “snapshot” of a server's state with EVERYTHING – files, Exchange, SQL, system state etc in a consistent manner. After all, if a server fails, it's no use having a copy of the Windows files, but not the Exchange information stores. And for straightforward recovery, the information stores are most useful when the rest of the server is there too. So actually backing up “whole servers” makes sense – and if you're using tools like Hyper-V, backing up all the guests and the host in one hit makes a lot of sense. What Symantec thus appear to have done is move to this model, where you back up “a server” and not “the C drive” etc. That's royally hacked users off.
The main complaint seems to be that the upgrade process messes up the existing setup, and they end up with many more jobs and nothing makes sense. That's fair enough. It's interesting though that a couple of posts I read were from users NEW to Backup Exec 2012, and they didn't see what the fuss was because they never had a legacy setup to migrate.
From my perspective, I think Backup Exec users need to re-think how they do backups, and look at the new model as genuine progress and sensible long term. It does mean rethinking how you setup your backups sure – and if you have a lot of servers and an existing setup, I appreciate that upgrading to Backup Exec 2012 seems to be causing pain – and Symantec probably could have done a better job at making the changes clear (but from experience most people just dive in with an upgrade anyhow…)
Ignoring the other pains and issues, if you start thinking of things from a “I need to backup my server so I can recover it no matter what” – there's no reason a “server centric” approach can't and won't work. If you think still in terms of “files” and “drives” and so on, then you're not going to get on with the new version. But really you need to re-think. Backup Exec has used the method it has today for a long long time, and it's time to put it out to pasture. Who knows, maybe in the next version they can jettison more old thinking and give it half a chance of being a credible product again.
As someone who has moved to DPM, we had to get our heads around having “Protection Groups” and “Server Centric” concepts – coming from Backup Exec this was an interesting experience I admit, but would I want to go back to the crazy policies, templates and selection lists stuff…. no thanks! Today we have “Protection Groups” based on “Location or Customer – Server Product Family” – which is what suits us. So a company “Acme Ltd” might have a Protection Group “Acme Ltd – Web and SQL Servers” and another “Acme Ltd – Internal Network Servers” while we may have “London – Hosted E-Mail Service” and so on… In each group we set up each server to be backed up, with all its resources. We can still exclude a drive, folder etc if we want to… although we rarely do since just having a complete image makes sense – and thinking about it, the only reason we did that with Backup Exec was because it was slow at backing up, and the promised “Synthetic Backups” never worked, so we had to keep doing “full” backups of data that hardly ever changed and this was an issue.
With DPM, it backs up everything ONCE, then just keeps getting snapshots so we can recover. It's much better on network usage, it's far faster and because it doesn't take forever we just let it back up everything without question.
So in conclusion, we think you should go with these changes, understand there is some likely sound reasoning behind it, bite the bullet and reconfigure your setup to work the way they want you to work from now on. (Or just move to DPM if you've got Microsoft-only workloads…)
PS – I know there are plenty of other issues with Backup Exec 2012 – I just don't think that this change is the real issue!
Wednesday, 8 August 2012
Compared to Backup Exec (we stopped using it on BEWS 10d), DPM 2010 seems to "just get it" when it comes to backups. Once a backup is set up, it knows how to back up, it knows when to back up and it actually backs up.
If there's an issue it attempts to fix minor issues via consistency checks etc, and if there's really an issue, it is glaringly obvious for you to fix it. That's pretty good news when you're used to Backup Exec being hell on earth, randomly dropping jobs to "hold" status because of an issue and so on.
However yesterday a random issue cropped up which I hadn't seen before.
One of our Protected Servers' Agent Status in the DPM console was "unavailable" - and the error logged was "10048 0x02740". The protected server in question is running Exchange Server 2007.
This appears to have happened because the IIS process on the protected server was suddenly using TCP Ports 5718 and 5719. This prevents the DPM Remote Agent from starting.
To fix this, you can simply:
(a) Stop IIS (iisreset /stop)
(b) Run the DPM Agent
(c) Start IIS (iisreset /start)
...or in our case, do nothing - by this morning it had cleared itself (we hadn't restarted IIS as we didn't want to drop the live users connected via OWA and OutlookAnywhere on this box).
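If you want to confirm it's a port clash before touching IIS, a quick Python sketch like this will tell you whether those two ports are already taken on the protected server. Error 10048 is the Windows "address already in use" socket error, which fits. The function names are ours; the port numbers are the ones mentioned above.

```python
import socket

DPM_AGENT_PORTS = (5718, 5719)  # ports named in this post

def port_is_free(port, host="0.0.0.0"):
    """Return True if we can bind a TCP socket to the port, False otherwise."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind((host, port))
        except OSError:
            # Any bind failure is treated as "in use"; on Windows this is
            # typically error 10048 (WSAEADDRINUSE).
            return False
    return True

def busy_agent_ports(ports=DPM_AGENT_PORTS):
    """Return the DPM agent ports that something else is already holding."""
    return [p for p in ports if not port_is_free(p)]
```

Run busy_agent_ports() while the DPM agent is stopped: an empty list means the ports are clear, and anything else means another process (IIS in our case) has grabbed them. "netstat -ano" will then show you which process ID owns the port.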
Wednesday, 29 February 2012
We figured we should drop a quick post in to say hello - and so you know we haven't abandoned this blog! ... The truth is we've been busy, but have had some problems with Data Protection Manager too... and we'll post some updates soon.
Until then, good luck fighting Backup Exec if you're still stuck with it, and hope you avoid a few of the pitfalls we found in DPM which can be irritating if not fatal...
Monday, 13 June 2011
One area though that Backup Exec was MUCH better at is E-Mail Alerting. Firstly, it was more flexible - any SMTP server was OK, and that worked great for us. DPM however only seems to work if it's pointed at an Exchange based environment - which was a bit annoying since that's not really how I wanted it done. I guess that's the side effect of the "optimal for Microsoft based workloads" strategy, but nonetheless...
The other bit though is the alerting capability. You can have alerts for 3 categories "Informational" "Warning" and "Critical", and a list of e-mail addresses to send to. You get one list of e-mail addresses and ALL of those addresses can receive the alerts you enable.
You can't set any thresholds, you can't customise the alerts and most annoyingly, it alerts you to both "Problems" and "Resolved". Given it tries to self resolve I was hoping "Critical" would only alert you to issues it has tried to resolve and failed at or cannot resolve because it needs our intervention.
All in all a bit poor and makes me want BEWS back, just for that bit anyhow...
Thursday, 12 May 2011
If you're finding that "System State" and "Bare Metal Recovery" items are frequently sitting in a "replica is inconsistent" state ...which happens a lot on Windows 2008 systems... then the chances are you've not got "Windows Server Backup" as an installed feature on the server you're backing up (the protected server).
It's dead simple to sort: run Server Manager, click Add Feature, check "Windows Server Backup" and wait for it to install - job done - run a consistency check and they'll be sorted and work thereafter.
Of course, why the DPM installer doesn't just install this (or at least prompt) as part of the roll out since it is basically a dependency is anyone's guess...
Friday, 6 May 2011
"Access Denied (0x80070005)"
Common causes are listed all over the place, suggesting Firewalls and DCOM Permissions as the issues. All entirely possible. One other thing to consider, especially if you've set up Forest Trusts etc: make sure the AD Network holding your DPM Server(s) is fully accessible - and that this traffic isn't restricted either! In our case, we had a Cluster with 2 servers, one in a Subnet (we'll call this Subnet A), another in a different subnet (Subnet B) and our DPM Servers (and the DPM AD Network) in another (Subnet C).
While Subnet A and B could talk without restriction, and A could quite happily talk to C, for historical reasons, B and C weren't completely open for communication. So my tip - make sure you've considered Active Directory Authentication and not just "DPM to Protected Server" issues!
Agents are "unavailable" and "VssError: Invalid value for registry"
This is a bit of an odd one and just "happened" on a previously perfectly happy server. We resolved this by simply removing and re-adding the account used to push out the agents in the DCOM Config: run "dcomcnfg.exe", find the "DPM RA" in the list and remove/re-add the user. No idea what caused that mind!
Replica is inconsistent with System State and repeatedly so...
Especially if you're on a Windows 2003 SP-2 32-bit system? Yep, thought so. You've probably just not got enough space on the system drive (normally C:\). You should move the normally hidden "DPM_SYSTEM_STATE" folder to another drive, ideally with +10GB free, and then update the data source...
\Microsoft Data Protection Manager\DPM\datasources\PSdataSourceConfig.xml
so it points to wherever you put it... easily sorted.
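For what it's worth, the edit itself can be scripted. This Python sketch does a plain find-and-replace of the old folder path with the new one and keeps a .bak copy first. It assumes the config file is UTF-8 text and refers to the folder by its literal path, so check the file in notepad before running anything like this. The function name is ours.

```python
import shutil

def repoint_system_state(config_path, old_folder, new_folder):
    """Replace old_folder with new_folder in the config file, keeping a .bak copy.

    Returns the number of occurrences replaced (0 means the file was left alone).
    ASSUMPTION: the file is UTF-8 and contains the old folder as literal text.
    """
    # newline="" keeps the file's existing line endings untouched.
    with open(config_path, "r", encoding="utf-8", newline="") as handle:
        text = handle.read()
    count = text.count(old_folder)
    if count == 0:
        return 0  # nothing to change
    shutil.copyfile(config_path, config_path + ".bak")
    with open(config_path, "w", encoding="utf-8", newline="") as handle:
        handle.write(text.replace(old_folder, new_folder))
    return count
```

A return value of 0 is your cue that the path in the file isn't written the way you expected, in which case edit it by hand.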
Hopefully they'll help you for now, more tips later!
Thursday, 5 May 2011
It does seem to consume much higher amounts of storage - but it isn't yet sufficiently clear if this is worthwhile (eg. if the space is pre-allocated so it can meet retention policies and then fills it, or it simply over-estimates likely requirements resulting in lots of unused capacity). We'll find out once we've run it a few weeks in a full production environment with realistic changes and replicas - and if need be we'll tweak things a little.
Anyhow, I digress, so back to the purpose of this post... The next part of our rollout is to enable the "off site" capabilities - specifically making sure we have a second copy of each server's data at another site - you know for "total disasters".
This is called "DPM Chaining", "Secondary Protection" and various other things depending on the version of DPM, the documentation you read etc and what you are trying to achieve.
Basic steps are simple (after doing the normal DPM setup):
(a) On the second DPM server, push the protection agent to the first.
(b) On the first DPM server, push the protection agent to the second.
(c) On the second server, create protection groups, selecting the first DPM server as the data source, expanding "protected servers" and then treating it as if it was the first server.
(d) Complete the wizard, wait (a long time possibly) for replication to complete the first time.
We'll see how our trial run goes...
Thursday, 28 April 2011
One small example is where you find System State and Bare Metal Recovery Replicas keep becoming inconsistent on a Windows 2008 system that's being backed up with DPM 2010.
The fix is pretty simple. On the 2008 server you're backing up, go to "Server Manager", load features, choose "Add Feature" and make sure "Windows Server Backup" is ticked so it gets installed (this won't need a reboot).
Given DPM seems to check loads of other pre-requisites you'd expect it would either alert you to this at install time, or just enable it as part of the install (even if there was an option which said "If Windows Server Backup features are not enabled on the source for protection, enable it automatically" or something).
A silly oversight and one that just takes a tiny bit of the sparkle of clueful implementation away I think.
Wednesday, 27 April 2011
Obviously we asked them to make sure the files were not critical or important (just in case, safety first naturally!) - and then just tell us what files they wanted back. The theory being we should be able to do this without knowing in advance what's being deleted (ensuring nobody here could take extra backups or look out for anything etc).
Guess what, it worked... first time, and it is very fast. By comparison to Backup Exec, which took a minimum of 3-4 minutes even for a single 100KB Word Document (because of the whole loading media nonsense...), it did the job quickly, very quickly.
Where Backup Exec is more flexible however is if you want to restore a random set of files from different folders in a single job - DPM doesn't appear to let you do this - so I'd have to select files in a single folder, run "Recover..." then repeat for each folder (well through the UI anyhow). However, given the restore takes literally a few seconds, I'm not sure we care too much - and in reality doing this is pretty rare - normally we want a whole folder or a group of files in a folder or similar, rather than completely random odds and sods files from across a server.
Tuesday, 26 April 2011
There are a few important things to think about though if you are looking to switch, since Microsoft DPM is really only about Windows, SQL, Exchange and SharePoint. If that's what you're running, and you're on 2003 SP-2 or 2008 and above, you should be fine. If you need other platforms and apps which Backup Exec supports you're probably out of luck using this.
Microsoft DPM is a very different product. One of the key differences is that it is truly snapshot based. Backup Exec still does far too much by using file by file methods - this has terrible scaling consequences.
It is mostly about Disk backup, whereas Backup Exec has a wider range of support for traditional tape backup. DPM can do tape too (it calls this "Long Term" Storage, and uses Disk for "Short Term"; you define what short/long term is...)
So in a nutshell (kind of) here's the story so far:
1) Installation of DPM failed because the install folder was "C:\!Software\DPM2010" whereas the installer ignored the existence of ! and tried to load "C:\Software\DPM2010" and couldn't find its own files. So we just put up with that and put DPM2010 in the c:\ folder root so we could get started.
2) Installation takes a while as it also rolls out SQL 2008 (you can get it to use an existing Database but we opted not to - and this is the recommended approach).
3) Take time to read the pre-reqs and understand how DPM works. For example, make sure each DPM server has a huge volume that you have left unformatted so DPM can claim it for itself (the best scenario).
With those basics covered, the initial installation was completely successful and our first DPM server appeared.
Monday, 18 April 2011
So we figured we'd give Microsoft Data Protection Manager a go. Full of optimism, we began the install. It failed at the first hurdle.
You see the software was in a folder "C:\!Software\DPMServer2010"
Except the installer decided that is actually "C:\Software\DPMServer2010"
So although ! is a perfectly valid File System Character, the DPM Installer failed.
Folder renamed and it worked.
It isn't a good start... this is the sort of stupidity Backup Exec had!
Saturday, 23 October 2010
I still really don't like Backup Exec much.
Wednesday, 6 October 2010
CASO couldn't talk to all of the managed media servers (sort of making them unmanaged then...) Much muttering and fiddling later and nothing.
So I try and think how to describe the exact issue, and google for it (naturally you google last after messing around because that would be akin otherwise to actually reading the manual that comes with stuff you buy)...
First match... er, this blog.
I refer myself to my own sodding blog to fix an issue. The post in question, June 8th 2009...
Tuesday, 29 December 2009
Why must there be so many options which become increasingly expensive for features that should just be included, and why exactly must we pay for enhancements which are often essentially fixes for features they never implemented properly the first time?