No sane administrator wants to be on the receiving end of a failed Exchange Server, given the enormous pressure that the rush to restore services entails.  Yet the very real possibility of data loss, coupled with general inexperience with server restoration (it’s not something that happens all the time, after all), makes it all the more important to be acquainted with the steps leading to the recovery of a failed Exchange Server.

1.      Identify the problem

The first step is, of course, to identify the problem.  Doing so allows a timeframe to be estimated and sufficient resources to be allocated, and knowing the problem can also influence whether a full server backup is required.

Below is a short list of some of the categories of problems that may crop up.

  • Mailbox content deleted by mistake
  • Data is lost, restoration from backup required
  • Data is fine, but server just doesn’t boot
  • Data is corrupted: Some or all mailboxes inaccessible
  • Exchange Server is fine; problem lies somewhere else
  • Mail is not flowing between sites
  • Mail is not flowing to or from the Internet

Mailbox content deleted by mistake is actually the easiest problem to recover from.  This is because deleted data is retained in Exchange for a configurable retention period (in Exchange 2010 the defaults are 14 days for deleted items and 30 days for deleted mailboxes), and can be retrieved using the “Recover Deleted Items” tool.  You can read more about the process for items that were deleted normally as well as for “hard deleted” items.
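Whether a deleted item can still be recovered comes down to simple date arithmetic against the retention period.  The sketch below only illustrates that check; the function name is made up for this example, and the retention value must be whatever is actually configured on your mailbox database, not a number taken from this post.

```python
from datetime import datetime, timedelta


def still_recoverable(deleted_at: datetime, retention_days: int, now: datetime) -> bool:
    """Return True if an item deleted at `deleted_at` is still inside the
    retention window of `retention_days` days, as seen at time `now`."""
    if retention_days < 0:
        raise ValueError("retention_days must not be negative")
    return now - deleted_at <= timedelta(days=retention_days)


# Hypothetical usage: pass in the retention period configured on your server.
# still_recoverable(datetime(2024, 1, 1), retention_days=30, now=datetime.now())
```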

The other issues can be broadly divided into three groups: those that require a restoration from data backups, hardware and other issues not directly related to Exchange, and Exchange-centric problems.  In a nutshell, administrators should establish whether the failure is related to data, external factors, Exchange itself, or a combination of these.
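One way to make this triage explicit is to write it down as a lookup table.  The sketch below is only an illustration: the symptom strings and the causes assigned to each are my own assumptions drawn from the list above, not an official classification, and a real incident may well fall into more than one group.

```python
from enum import Enum


class Cause(Enum):
    DATA = "restore from data backup"
    EXTERNAL = "hardware or other issue not directly related to Exchange"
    EXCHANGE = "Exchange-centric problem"


# Assumed mapping from observed symptom to the groups worth investigating.
LIKELY_CAUSES = {
    "data lost": {Cause.DATA},
    "server does not boot": {Cause.EXTERNAL},
    "mailboxes inaccessible": {Cause.DATA, Cause.EXCHANGE},
    "problem lies elsewhere": {Cause.EXTERNAL},
    "no mail flow between sites": {Cause.EXTERNAL, Cause.EXCHANGE},
    "no mail flow to or from the internet": {Cause.EXTERNAL, Cause.EXCHANGE},
}


def triage(symptoms):
    """Return the union of likely causes for the given symptoms.
    Symptoms that are not in the table contribute nothing."""
    causes = set()
    for symptom in symptoms:
        causes |= LIKELY_CAUSES.get(symptom.lower(), set())
    return causes
```

Calling triage(["Server does not boot"]) returns only the external group, while combining several symptoms returns every group that needs to be looked at.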

2.      Perform a full backup of the server

Before any attempt at recovery is made, it is imperative that a full backup of the affected server be performed.  An experienced administrator will know that the race to solve the problem can sometimes cause more harm as a direct result of mistakes made along the way.  Moreover, patches or updates that inadvertently cause problems may be difficult (or even impossible) to reverse.

The rationale for a full server backup is simple: it serves as a restore point, so that harmful changes or misconfigurations can be undone by rolling back to the original state.  The exception would be external problems, such as those related to the network or a misconfiguration of DNS.
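To rule external causes such as the network or DNS in or out quickly, two basic checks go a long way: does the server's name resolve, and does its SMTP port accept connections?  The sketch below uses only Python's standard socket module; the function names are illustrative, and the host name in the usage comment is a placeholder to be replaced with your own.

```python
import socket


def resolves(hostname: str) -> bool:
    """Return True if the name can be resolved to at least one address."""
    try:
        socket.getaddrinfo(hostname, None)
        return True
    except socket.gaierror:
        return False


def port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Hypothetical usage (replace the placeholder host with your own server):
# resolves("mail.example.com") and port_open("mail.example.com", 25)
```

A failure here points towards the network or DNS rather than Exchange itself, though a success does not prove that mail flow is healthy end to end.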

3.      Initiate recovery

Only after a backup has been created should the recovery of the failed Exchange Server begin.  Below is a short list of common problems.

Boot failure: Assuming that the boot failure is not caused by a malware attack, bring up the Advanced Boot Options menu by pressing the F8 key as the boot loader starts.  Problems related to errant system services or problematic applications in the startup queue may be resolved from here, for example by booting into Safe Mode.

Disk failure: Where RAID is used, recovery may be as easy as replacing the failed disk in the array and waiting for it to rebuild.  However, should there be no data redundancy, or should multiple disks fail, the situation is akin to recovering from a complete server failure.
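The decision between a simple rebuild and a full server recovery can be sketched from the number of failed disks and the RAID level.  The table below lists the number of disk failures each common level is guaranteed to survive (RAID 1 is assumed to be a two-disk mirror, and RAID 10 may survive more if the failures fall in different mirror pairs); the function name and return strings are illustrative only.

```python
# Disk failures each level is guaranteed to survive without data loss.
GUARANTEED_TOLERANCE = {
    "RAID0": 0,   # striping only, no redundancy
    "RAID1": 1,   # assumes a two-disk mirror
    "RAID5": 1,   # single parity
    "RAID6": 2,   # double parity
    "RAID10": 1,  # worst case: both failures in the same mirror pair
}


def recovery_path(raid_level: str, failed_disks: int) -> str:
    """Suggest a recovery path for the given RAID level and failure count."""
    if failed_disks < 0:
        raise ValueError("failed_disks must not be negative")
    if failed_disks == 0:
        return "no disk recovery needed"
    # Unknown levels are treated as having no redundancy.
    tolerance = GUARANTEED_TOLERANCE.get(raid_level.upper(), 0)
    if failed_disks <= tolerance:
        return "replace failed disk and rebuild array"
    return "treat as complete server failure"
```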

Complete server failure: The first step here is to file a support ticket with your server vendor.  Depending on the support package, a replacement may be delivered in as little as four hours or on the next business day.  The decision at this stage is whether to rebuild on a new machine or to restore the server image from a backup.

Database corruption: The steps for recovering from partial or complete database corruption vary, as do the tools that can be used.  On this, I recommend reading Understanding Backup, Restore and Disaster Recovery on TechNet for additional background on the technology and methodologies behind data recovery for Exchange 2010, as well as the guidance on using a Recovery Storage Group to restore mailbox data in earlier versions of Exchange.

Configuration problem: While easy to resolve in the hands of an experienced Exchange administrator, a configuration problem can also grow into a larger one through consecutive or multiple configuration mistakes.  Making a proper backup before proceeding is highly recommended here.

4.      Prepare for easy recovery

Finally, it makes sense to prepare adequately for disasters before they actually hit.  One way to do so would be to configure your Exchange environment with redundancy or failover in mind.  An alternative for businesses that cannot afford failover hardware is to ensure that sufficient storage space is available for the server backups mentioned earlier.
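Checking that there is enough room for a backup is easy to automate.  The sketch below uses Python's standard shutil.disk_usage; the function name, the path in the usage comment, and the 20% headroom are assumptions chosen for illustration rather than recommended values.

```python
import shutil


def has_room_for_backup(backup_path: str, required_bytes: int, headroom: float = 0.2) -> bool:
    """Return True if the volume holding `backup_path` has enough free space
    for a backup of `required_bytes`, plus a safety margin (`headroom`)."""
    free_bytes = shutil.disk_usage(backup_path).free
    return free_bytes >= required_bytes * (1 + headroom)


# Hypothetical usage: is there room for a 200 GB backup on the backup volume?
# has_room_for_backup("D:\\Backups", 200 * 1024**3)
```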

Another way to be prepared would be to ensure that the Exchange Server environment is properly documented.  The information that should be documented includes the server name, the version of Windows, the version of Exchange Server, the database names, and the location and size of each database.  On this front, Microsoft has a utility called ExchDump that can assist administrators in creating a baseline snapshot of an Exchange environment.  The utility is available for download from Microsoft.
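Even without a dedicated tool, the items listed above fit comfortably into a small structured record that can be kept alongside the backups.  The sketch below is not ExchDump and does not query Exchange; it simply shows one possible shape for such a record, with class names, field names and sample values that are all placeholders.

```python
import json
from dataclasses import dataclass, field, asdict
from typing import List


@dataclass
class DatabaseInfo:
    name: str
    location: str   # path to the database file
    size_gb: float


@dataclass
class ExchangeServerRecord:
    server_name: str
    windows_version: str
    exchange_version: str
    databases: List[DatabaseInfo] = field(default_factory=list)


def to_json(record: ExchangeServerRecord) -> str:
    """Serialise the record so it can be stored with the backups."""
    return json.dumps(asdict(record), indent=2)


# Placeholder values for illustration only.
example = ExchangeServerRecord(
    server_name="EXCH01",
    windows_version="Windows Server 2008 R2",
    exchange_version="Exchange Server 2010",
    databases=[DatabaseInfo("DB01", "D:\\ExchangeDB\\DB01.edb", 120.5)],
)
```

Keeping the record as JSON means it can be compared against a later snapshot to spot changes in the environment.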

