www.snwonline.com

ARTICLE POSTED March 11th, 2002

In extreme cases, data recovery services can save the day and a lot of money
By Ron Austin

When companies suddenly lose data or experience a data emergency such as a server crash, they need very specialized knowledge and experience to resolve the situation.

Even the brightest and most experienced technicians working with the best data storage equipment lose data. Data storage systems are increasingly complex and susceptible to failure because of human error, adverse environmental conditions and occasional device failure. Storage networks are migrating to the workgroup and SME level, where the dollars available for enterprise-level redundancy and technical support are limited. When you combine this with less contingency infrastructure, human error is even more likely.

When companies experience sudden data loss or a data emergency such as a server crash, they require very specialized knowledge and experience to resolve the situation. Technical support teams are under incredible pressure to immediately restore mission-critical operations. Tape restore procedures can be lengthy and error prone, and even companies that regularly back up their systems have rarely or never conducted a practice restore.

As the sense of urgency builds and their own restore procedures fail, stumped technical teams are increasingly turning to data recovery services as a last means of digital defense.

Data recovery RX
Data recovery companies combine expertise in data storage methodologies with operating system- and application-level savvy to re-create usable data sets and rebuild file structures from recovered bits and bytes.

RAID servers crash or lose data for a number of reasons, whether they are deployed as NAS devices, as part of a SAN or under a legacy file server architecture. Physical failure or degradation is a common culprit. For example, a RAID controller could malfunction or a hard drive could partly or completely fail. Warning signs can also be missed, as when service alarms are ignored. While physical disasters such as flood, fire or a 9/11-style attack are the most catastrophic and garner the most attention, more mundane issues are at the root of most data emergencies. These include:

  • Corruption on the file system level. Volumes become unreadable, are inaccessible or seem to have disappeared completely. This can be a result of physical problems, configuration errors, bad or poorly timed software routines, a reboot at the wrong time or a rogue process.
  • Corruption within the application data itself. The reasons for this are similar to those causing file system corruption.

As mentioned, physical failures often lead to logical failures, and all three types of problems (physical failure, file system corruption and application data corruption) can be found in a single case of data loss.

Business continuance planning and rigorous backup routines should be fundamental practices of every business or organization. Only a minority of data loss situations will become full-fledged data emergencies requiring the intervention of specialists. However, the huge number of mission-critical operations deployed on countless permutations of multi-vendor storage platforms has created a growing business segment for capable data recovery companies.

The three R's: Recognize, react and resolve
There is a recommended procedure to follow in the case of sudden data loss. It involves the three R's, which are as follows:

  • Recognize a data loss situation:
    Data loss is usually characterized in one of two ways. One is the sudden inability to access data from a computer system or backup that was previously functioning well, and the other is the accidental erasure or over-writing of data.
  • React appropriately to a data emergency:
    When facing data loss, stop and review the situation. Distress and even panic are typical reactions under the circumstances, so the process of reviewing and summarizing the situation has the dual purpose of preparing for a recovery and inducing calm. Resist the pressure from co-workers, your boss or even your own deadlines for an instant fix.

    Data recovery principles begin with the medical oath to "do no harm." The best data recovery services will always make a copy of your problem media and then use that copy for subsequent recovery attempts while the original media is preserved. While a quick fix may prove successful, if it fails, then your attempts may actually increase the damage to the problem media and greatly reduce the prospects of a successful data recovery. Never attempt to restore a backup into or onto a corrupted data set, as you may over-write lost data.
  • Resolve your data emergency:
    The best data recovery services always begin by stabilizing the situation. This includes thoroughly analyzing what happened, followed by identifying available resources, including backups and alternate systems for interim use. Making a complete mirrored version of the problem media is part of the stabilizing process, and this can be done quickly unless it is impeded by severe physical damage. This allows all subsequent recovery attempts to be made on the mirrored copy while the original media is preserved.

    If you have a good and fairly recent backup copy of your data, try restoring to an alternate server as a short-term work-around. Preserve your problem media in its current state and seek expert help to recover your lost data.
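
The copy-first principle described above can be illustrated with a short Python sketch. It is only an illustration under stated assumptions, not the procedure any particular recovery lab uses: the function names (image_media, file_digest, verified_image) are hypothetical, the source is assumed to be readable as an ordinary file or device node, and no attempt is made to handle the read errors that physically damaged media would produce. The source is opened read-only, the copy is refused if the target already exists, and a SHA-256 checksum confirms that the copy matches what was read.

```python
import hashlib
from pathlib import Path

CHUNK_SIZE = 1024 * 1024  # read 1 MiB at a time (an arbitrary choice)


def image_media(source: Path, image: Path, chunk_size: int = CHUNK_SIZE) -> str:
    """Copy `source` to a new file `image` and return the SHA-256 hex digest
    of the bytes read. The source is only ever opened for reading."""
    digest = hashlib.sha256()
    # "rb" opens the source read-only; "xb" creates the image and fails with
    # FileExistsError if it already exists, so nothing is over-written.
    with open(source, "rb") as src, open(image, "xb") as dst:
        while True:
            chunk = src.read(chunk_size)
            if not chunk:
                break
            digest.update(chunk)
            dst.write(chunk)
    return digest.hexdigest()


def file_digest(path: Path, chunk_size: int = CHUNK_SIZE) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verified_image(source: Path, image: Path) -> str:
    """Image the source, then re-read the image and confirm that its
    checksum matches what was read from the source."""
    source_hash = image_media(source, image)
    if file_digest(image) != source_hash:
        raise IOError("image does not match source; do not use it for recovery")
    return source_hash
```

All later recovery attempts would then work on the image file, leaving the original media untouched. Keeping the returned checksum also makes it possible to confirm later that the working copy has not changed.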

A recent case study
As part of a major marketing initiative, a well-known California-based technology company was about to launch a new product and an upgraded Web service for its large installed base. In the course of this project, a large RAID 5 server was being expanded from six drives to eight in anticipation of enlarged user volumes. The RAID software had never previously failed, and its documentation described a "transparent" upgrade process (too transparent, as it turned out).

After physically installing the two drives and rebuilding the RAID 5 array for the new configuration, everything appeared normal. However, when a technician rebooted the system, he discovered that he had lost all access to the data on the RAID 5 server. Recognizing the seriousness of the situation, the company's IT team located some data emergency specialists by searching the Web.

Within three hours of speaking with a data recovery consultant, the company brought in data recovery technicians, who worked around the clock to overcome the unique obstacles inherent in a complex recovery of this kind. Thirty-six hours later, the data was restored, and the server was back in place and ready to support the new product launch. This emergency service was not inexpensive, but it was cost-efficient, considering the costs of delaying the launch.

Ron Austin is vice president of business development and marketing for ActionFront Data Recovery Labs, a data recovery company with locations in Atlanta, Santa Clara, Buffalo, Toronto and Tokyo.
