Data Recovery Services by Seagate Recovery Services

Data Recovery from Seagate


ActionFront Research

On April 14, 2004, ActionFront Data Recovery Labs Inc. introduced the world to its new SignalTrace™ technology and simultaneously released a white paper entitled "Recovering Unrecoverable Data - The Need for Drive-Independent Data Recovery" written by Charles H. Sobey, Chief Scientist of ChannelScience.

The white paper, commissioned by ActionFront, presents the most comprehensive overview of the data recovery industry's current methods and technologies ever published and calls for an independent data recovery trade association to be formed to certify data recovery companies and for companies to submit claims of recovery capabilities for independent review.

Mr. Sobey, an internationally respected authority on hard disk drive technology and data detection, provides a rare insight into data recovery practices and processes, explains the reasons some media are unrecoverable, and identifies the need for a new class of drive-independent data recovery techniques. He then proceeds to describe the new SignalTrace™ technology in some detail.

SignalTrace™ is the result of many years of research and development and a substantial investment of ActionFront resources. This new technology opens the possibility of recovering data from storage devices that currently cannot be recovered by anyone, anywhere, for any amount of time or money.

The first public demonstration of ActionFront's new technology was provided on the exhibit floor of the 2004 NASA/IEEE Conference on Mass Storage Systems and Technologies (MSST04). A technical overview was presented at the work-in-progress session by Chuck Sobey, of ChannelScience, who assisted with the development of SignalTrace™.

Please find below the entire text extracted from the white paper.

Get the White Paper

Recovering Unrecoverable Data - The Need for Drive-Independent Data Recovery
527KB PDF. Published April 14, 2004.

Preview first three pages (of the 30 page white paper):


Follow-up White Paper:

Drive-Independent Data Recovery - Current State of the Art
527KB PDF. PrePrint July 12, 2005.

Recovering Unrecoverable Data
The Need for Drive-Independent Data Recovery
A ChannelScience White Paper

Commissioned by
ActionFront
Data Recovery Labs, Inc.

Written by
Charles H. Sobey
April 14, 2004

7300 Cody Court, Plano TX 75024-3837 USA
Voice: (972) 814-3441
FAX: (972) 208-9095
connect@ChannelScience.com

RECOVERING UNRECOVERABLE DATA
The Need for Drive-Independent Data Recovery
1. Executive Summary
2. Introduction to Hard Disk Drive (HDD) Technology
2.1 Areal Density and Price Trends
2.2 What Happens to Data in a Hard Disk Drive?
2.2.1 Organizing the Data
2.2.2 Locating the Data
2.2.3 Detecting the Data
2.2.4 Decoding the Data
2.2.5 Drive Burn-in and Optimization: Hyper-Tuning
3. Data Recovery Market
3.1 Perception vs. Reality
3.2 A Call for Transparency
4. Data Recovery Technology
4.1 Traditional Hardware Replacement Methods
4.1.1 Replace the PCB
4.1.2 Replace the Firmware
4.1.3 Replace the Head Stack
4.1.4 Move the Disks to Another Drive
4.2 Magic Machines and Proprietary Processes
4.2.1 Spin-Stand Testers
4.2.2 Magnetic Force Microscopes (MFM)
4.2.3 The Spin-Stand MFM?
4.2.4 Exotic Recovery
5. The Frontiers of What's Possible: What Makes Data Unrecoverable?
5.1 When Firmware Replacement Fails
5.2 When Head Stack Replacement Fails
5.3 When Disk Remounting Fails
5.4 When the DATA Fails
6. Future Success Depends upon Developing Drive-Independent Data Recovery Capabilities
7. The FIRST Public Demonstration of Drive-Independent Data Recovery: ActionFront's SignalTrace™ Technology
8. Conclusions
9. References
About the Author


1. Executive Summary
When a hard disk drive containing valuable data no longer responds, the user's last hope is to send the drive to a data recovery company that specializes in drive hardware failures. There is a general perception that data recovery companies have "magic machines" for retrieving data in almost any situation. The reality is less glamorous. The most sophisticated, commercially successful recovery techniques involve careful part-replacement, in a cleanroom environment, of the heads, the spindle motor and base casting, the electronics board, and/or the drive's firmware and parameter tables. Part-replacement has historically been successful for data recovery about 40 to 60% of the time. Claimed data recovery success rates are much higher. While they may, in fact, approach 100% for some drive models, for other models and failure modes the success rate is near zero. Drive-independent data recovery methods are needed now to read these drives. Furthermore, as the data density of hard disk drives continues to increase, the number of unrecoverable drives is expected to grow.

The reason for this lack of successful recovery can be traced to the methods drive manufacturers must employ to achieve both high data density and high production yields. Specifically, current drives are hyper-tuned in the factory to optimize the performance of each section of each hard disk drive. The data format, head, disk, electronics, and firmware parameters are all optimized together. This means that it is less likely that a head stack, electronics board, or parameter table from one drive, even of the same model, will work well when used as a replacement in a failed drive.

ActionFront Data Recovery Labs' SignalTrace™ technology is the only solution known to date that demonstrates the capabilities needed for commercially viable recovery of user data that is otherwise unrecoverable using traditional part-replacement. SignalTrace™ technology instead replaces the exacting, optimized signal processing and positioning functions of the disk drive with custom hardware, software, and algorithms to precisely locate particular sectors of data and recover each bit individually, independent of the drive's specific hardware. Furthermore, its underlying design has the flexibility to provide this data recovery capability into the future as increasing data densities continue to require more hyper-tuning of disk drives in the factory.

2. Introduction to Hard Disk Drive (HDD) Technology
In 1956, IBM introduced the world's first hard disk drive, the RAMAC (Random Access Method of Accounting and Control). It was approximately the size of two refrigerators placed side-by-side and stored about 5MB of data. It cost over $50,000 [1], or $10 million/GB! Currently, HDDs routinely provide over 100GB of storage in a 3 1/2-inch form factor for less than $1/GB. This history of improvement outstrips the better-known development records in semiconductor density and telecommunications data rates. The constant research, high volumes, and low prices required by the disk drive industry have brought many great HDD-related companies into, and out of, existence. Even the HDD bellwether, IBM, sold its drive business to Hitachi in 2002.

2.1 Areal Density and Price Trends
In hard disk drives, data is arranged in concentric circles, called tracks. To get more data on a track, the spacing between each bit in the down-track direction must decrease. The data density in this direction, also called the linear density, is measured in thousands of bits per inch (kbpi). Similarly, the track density across the disk is measured in thousands of tracks per inch (ktpi). The tpi metric not only reflects the width of the written track, but also the small guardbands that are needed between tracks to provide margin for head-to-track misalignment. These metrics are illustrated in the figure to the left.

Areal density is the metric used to quantify the impressive growth in HDD data storage capacity. It is the product of bpi and tpi, which reflects the amount of user data that can be stored reliably in one square unit of area on the disk surface. It is now measured in gigabits per square inch (Gb/in^2). Areal density has increased by almost 8 orders of magnitude since the introduction of the first disk drive. This trend is shown in the areal density plot to the left.

For decades, the compound annual growth rate (CAGR) of areal density was about 30%. With the introduction of MR (magnetoresistive) head technology around 1990-1991, the rate increased to 60%. When GMR (giant magnetoresistive) heads were introduced in the late 1990s, the CAGR temporarily increased to over 100%, during which time there was an increase in the number of companies that exited the industry or merged. The pace of areal density growth is now slowing and should settle somewhere between the historical rates of 30 and 60%.
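To put these growth rates in perspective, the short Python sketch below converts a CAGR into a doubling time and a per-decade growth factor. It relies only on the standard compound-growth formula; the function names are illustrative and do not come from the white paper.

```python
import math

def doubling_time_years(cagr):
    """Years needed to double at a compound annual growth rate (0.30 means 30%)."""
    return math.log(2) / math.log(1 + cagr)

def growth_factor(cagr, years):
    """Total multiplication after the given number of years of compound growth."""
    return (1 + cagr) ** years

# The three growth regimes mentioned in the text.
for rate in (0.30, 0.60, 1.00):
    print(f"{rate:.0%} CAGR: doubles every {doubling_time_years(rate):.2f} years, "
          f"grows about {growth_factor(rate, 10):,.0f}x per decade")
```

At 30% the density doubles roughly every 2.6 years; at 100% it doubles every year.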

Currently, an areal density of 100 Gb/in^2 might be achieved by a combination of 800 kbpi and 125 ktpi, for example. This provides a bit aspect ratio (BAR) of about 6:1 (bpi/tpi). The bit-to-bit spacing in this example is 1.25 microinches (about 30 nanometers). The track-to-track spacing (track pitch) is 8 microinches (about 200 nanometers). The 5 to 10% guardbands between tracks are a fraction of a microinch (less than 20 nanometers). It is astonishing that drives routinely achieve this level of mechanical precision at a price per megabyte that has been falling at the rates shown in the graph to the left. For the past few years, it has been cheaper to store data on HDDs than on paper or film. Currently the price of HDD storage is about $1/gigabyte.
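The arithmetic in this example can be reproduced with a few lines of Python. This is only a sketch of the unit conversions; the function name and the default 5% guardband fraction are chosen for illustration.

```python
NM_PER_INCH = 25.4e6  # one inch is 25.4 mm, or 25.4 million nanometers

def areal_geometry(kbpi, ktpi, guardband_fraction=0.05):
    """Derive density and spacing figures from linear density and track density."""
    bpi = kbpi * 1e3
    tpi = ktpi * 1e3
    track_pitch_nm = NM_PER_INCH / tpi
    return {
        "areal_density_gbit_per_in2": bpi * tpi / 1e9,
        "bit_aspect_ratio": bpi / tpi,
        "bit_spacing_nm": NM_PER_INCH / bpi,
        "track_pitch_nm": track_pitch_nm,
        "guardband_nm": guardband_fraction * track_pitch_nm,
    }

# The 800 kbpi x 125 ktpi example from the text.
for name, value in areal_geometry(800, 125).items():
    print(f"{name}: {value:.2f}")
```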

When drives cost thousands of dollars, drive repair was a lower priced alternative to purchasing a new HDD. Today, the most economical option for dealing with a malfunctioning drive is to replace it with a new one. The new drive will likely be larger, cheaper, and faster. In fact, it is typically the data itself, even for the home user, that is much more valuable than the drive.

Increasingly, the home user's drive is filled with often-priceless photos and movies. The time it takes to recover a failed drive can also be more costly than the drive itself even when backups are available. (You do have backups, don't you?) However, backups typically represent a snapshot of the data some time ago (last night, last week, last month). Therefore all recent work and transactions are still lost. Unfortunately, many companies that run backups diligently do not practice restoring data from backups. Sometimes the backups themselves are corrupted. Even in redundant systems, such as drive arrays, data loss due to multiple-drive failures is not uncommon.

For these reasons, no matter what precautions have been taken, a drive may need the services of a data recovery company. For criminal investigations requiring data forensic analysis, there is no substitute for the drive in question. It must yield its information even if it has been intentionally destroyed.

2.2 What Happens to Data in a Hard Disk Drive?
When you push the Save button, and write your data to the HDD, you expect it to be returned correctly when you open the file in the future. The actual specification for this expectation of data integrity is the unrecoverable read error rate. This is typically in the range of 1 bit in error for 10^13 to 10^15 bits read. Every part and function of the drive is essential for achieving this level of data integrity; however, for the purposes of data recovery the topics discussed next are the most relevant. These include the logical-to-physical block translation system, the servo positioning system, the drive layout optimization routines, the data detection algorithms, and the data decoding.

2.2.1 Organizing the Data
Files, whether they represent text, a database, photo, song, movie, web page, executable program, or anything else, are stored as a series of sectors. A sector is a physical location on the disk that is designated to store (most commonly) 512 user bytes. Because of the encoding overhead and the requirements of the detection algorithms (discussed below), about 600 bytes are actually stored in a sector.

Sectors have traditionally been uniquely identified in a drive by cylinder, head, and sector (CHS) coordinates. The head number indicates on which surface the sector is located. The cylinder number identifies the specific concentric track on that surface where the sector can be found. And the sector number indicates which of the hundreds of sectors on the track contains the data that is sought.

How does the drive know where your file is? It doesn't. That is the job of the operating system. The operating system keeps track of which logical blocks on which drive contain your file. For convenience, we will consider a logical block to be a data sector, although each block could also point to several consecutive sectors. The operating system will request a logical block from the drive, for example block #1,635,324. The HDD must map this logical block location into a physical block (CHS) location, for example cylinder 5,000 on head 1 at sector 452. There are fast algorithms for computing this; however, the interesting complication is when the usual physical location for a logical block has a defect that precludes it from reliably storing data.

Such locations are found and mapped out during the manufacturing process. There are also provisions for doing this check and re-mapping when the drive is in use in the field. The drive has many spare sectors and even spare tracks to be used as replacements for defective sectors. This is transparent to the operating system under normal operation. The drive accepts the logical block address and performs the logical-to-physical translation itself. This varies from drive-to-drive, reflecting the mapping-out of defects found during the drive's surface scan self-test.
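The logical-to-physical translation described above can be illustrated with a deliberately simplified Python sketch. It assumes a fixed number of sectors per track (real drives are zoned), zero-based sector numbers, and a hypothetical list of defective physical sector indices that are simply skipped. None of these names or layout choices come from an actual drive's firmware.

```python
from bisect import bisect_right

def lba_to_chs(lba, heads, sectors_per_track, defects=()):
    """Map a logical block to (cylinder, head, sector), skipping defective sectors.

    `defects` holds physical sector indices that were mapped out; every defect
    at or before the target pushes the logical block one physical slot later.
    """
    defect_list = sorted(set(defects))
    physical = lba
    while True:
        shifted = lba + bisect_right(defect_list, physical)
        if shifted == physical:
            break
        physical = shifted
    cylinder, rest = divmod(physical, heads * sectors_per_track)
    head, sector = divmod(rest, sectors_per_track)
    return cylinder, head, sector

# Same logical block, two drives with different (hypothetical) defect lists.
print(lba_to_chs(1999, heads=4, sectors_per_track=500))
print(lba_to_chs(1999, heads=4, sectors_per_track=500, defects=[10]))
```

The two calls return different physical locations for the same logical block, which is why a translation table from one drive does not, in general, describe the media of another.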

In the field, the drive may acquire additional defects due to corrosion, handling, or other causes. These are typically identified in a table of exceptions (sometimes called the P-list and the G-list, for primary defects and grown defects, respectively). This table, the table of parameters, and the firmware are typically stored on the disk itself in the outermost tracks. These tracks are referred to as the system area, maintenance tracks, diskware, negative cylinders, etc. However, some drive models store the table in non-volatile memory on the printed circuit board. Clearly this table of exceptions is uniquely linked to the media in a particular drive. The table for one drive will not, in general, be the same for the media from another drive.

Up until the 1980s, drives typically had the same number of sectors on each track. However, the circumference of a track at the outer radius of the disk (called the OD, for outer diameter) is clearly much larger than the circumference of tracks at the ID (inner diameter). This means that the linear bit density (bpi) is highest only on the innermost track. All the other tracks contain less data than they have the potential to store. This is shown in the graphic to the left.

To maximize the amount of data that can be stored, each disk surface is divided into groups of adjacent tracks called zones. There are 8 to 32 (or more) zones per surface. From the ID to the OD, each zone is written with a higher frequency to counteract the bit spacing growth caused by the higher linear velocities at the larger radii. The bpi still drops slightly across each zone. While zoning makes better use of the storage capacity of the disk, it also means that many unique optimization settings must be determined for each surface during manufacturing. The figure to the left shows the additional sectors in the OD zone and the bpi taper across the disk.
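A rough Python sketch of zoning follows. It assumes each zone's sector count is set by the zone's innermost track written at the maximum linear density, uses the approximate figure of 600 stored bytes per sector quoted earlier, and ignores servo wedge overhead. The radii and zone count are illustrative assumptions, not measurements of any drive.

```python
import math

def sectors_per_track(inner_radius_in, max_bpi, stored_bytes_per_sector=600):
    """Whole sectors that fit on a track of the given radius at max_bpi."""
    bits_on_track = 2 * math.pi * inner_radius_in * max_bpi
    return int(bits_on_track // (stored_bytes_per_sector * 8))

def zone_table(id_radius_in, od_radius_in, zones, max_bpi):
    """Return (zone, inner radius, sectors per track) for equal-width zones."""
    width = (od_radius_in - id_radius_in) / zones
    table = []
    for zone in range(zones):
        inner = id_radius_in + zone * width
        table.append((zone, inner, sectors_per_track(inner, max_bpi)))
    return table

# Assumed geometry: 0.75 in to 1.8 in recording band, 16 zones, 800 kbpi.
for zone, radius, sectors in zone_table(0.75, 1.8, 16, 800e3):
    print(f"zone {zone:2d}: inner radius {radius:.3f} in, {sectors} sectors/track")
```

Within a zone every track holds the same number of sectors, so the linear density falls slightly from the zone's inner edge to its outer edge, as the text notes.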

The user's file is likely to be stored across many sectors. These sectors may be spread across different tracks in different zones and even across different disk surfaces. Furthermore, the same logical blocks may be mapped into different physical sectors on two drives depending on the unique distribution of defects on each disk.

A track of data may be less than 10 microinches in width. The drive must find this track within a few thousandths of a second and follow the repeatable and random fluctuations of the track to less than one millionth of an inch. Most amazing is that this can be accomplished in a consumer product that sells for less than $100. The servo positioning system makes this possible by using a sophisticated feedback control algorithm that controls the fast seeking and precise track following.

2.2.2 Locating the Data
For the best performance, the servo system requires a very accurate measurement of the head's position relative to the track. Each HDD surface is divided into data sectors and servo wedges. The servo wedges are arc-shaped regions that extend from the ID to the OD. They contain a unique magnetic pattern that provides a reference to the center of the track.

The servo pattern is typically written at a much lower bpi than the data and its frequency is constant across the disk. It is not zoned. This means that the bpi is lower at the OD. In other words, the servo pattern is shorter near the ID and longer near the OD, which gives it a wedge shape. There are typically 50 to 200 evenly spaced servo wedges per revolution. This embedded servo information is on each disk surface.

The figure to the left shows three data tracks (high bpi portions with guardbands in between) and an embedded servo field. The servo field begins with a single frequency pattern for establishing timing and amplitude references. A sync pattern indicates the beginning of the encoded cylinder number (or track ID). This is followed by three to six bursts of single-frequency magnetic transitions (only two are shown in the figure for clarity). These bursts provide accurate position information, relative to the track center.

The first two bursts, typically called the A burst and the B burst, are shown written off center. When the head is exactly on track center, it will get a certain amount of signal from the A burst and then an equal amount from the B burst. The relative amount (amplitude or energy) of each burst signal provides a precise measure of the head's position relative to the track center. Because the servo information is written before any tracks of data, the servo bursts actually define the center of the data tracks. The track ID indicates which track center.
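One common way to turn the two burst amplitudes into a position estimate is the normalized difference sketched below in Python. The paper says only that the relative amount of each burst signal measures position, so the exact formula here is an assumption made for illustration.

```python
def position_error(a_amplitude, b_amplitude):
    """Normalized position error signal from the A and B burst amplitudes.

    Returns 0.0 on track center; the sign indicates which way the head is off.
    """
    total = a_amplitude + b_amplitude
    if total <= 0:
        raise ValueError("no burst signal detected")
    return (a_amplitude - b_amplitude) / total

print(position_error(1.0, 1.0))  # equal bursts: head on track center
print(position_error(1.2, 0.8))  # more A than B: head displaced toward the A burst
```

Dividing by the sum makes the estimate insensitive to overall gain changes, such as variations in head fly height.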

The servo system also identifies each sector. It does this by maintaining synchronization with the first servo wedge in a revolution and timing from there to indicate the beginning and end of each data sector on the track. This timing relationship changes from zone-to-zone, but the wedge-to-wedge servo timing remains constant.

2.2.3 Detecting the Data
Every data sector is a sequence of binary 1s and 0s, stored as a pattern of magnetic transitions. A magnetic transition is a change from a north-facing magnet to a south-facing magnet or vice versa. These are sometimes called north-north transitions and south-south transitions, which stresses their polarity differences. The GMR head and its amplifier respond with a voltage pulse for each transition that is read. The polarity of the pulse indicates the transition's polarity.

An oscilloscope screen shot of a typical data sector readback waveform (top trace) is shown in the figure to the left. The trace at the bottom is the read gate. This is generated based on timing offsets from the rotational synchronization generated by the servo system. As stated above, the timing offsets vary from zone-to-zone. Detection of a sector begins with the read gate's assertion and ends with its de-assertion.

The detection of data is equivalent to the detection of the presence or absence of the pulses, and their polarity. However, detection must take place in a noisy environment, so mistakes can be made. Furthermore, the readback signal can be distorted in many ways, for example by slightly off-track placement of the head. At high bpi the pulses overlap. This intersymbol interference (ISI) shifts the apparent pulse positions and makes identifying the data sequence especially difficult. Drives today use variations and extensions of partial-response maximum-likelihood (PRML) sequence detection [2, 3] in order to correctly detect data in such environments. In the future even more sophisticated techniques, such as iterative detection, will likely be employed.

For good error rate performance, it is necessary to establish the proper gain for each sector and lock the detection process to the precise frequency and phase of the readback waveform. This places three specific requirements on the stored data.

1) Every data sector must start with a single-frequency sequence of transitions. This is usually called the preamble and is about 10 to 15 bytes long. The preamble makes it much easier to establish the proper gain and timing synchronization for the sector. Every servo field also starts with a single-frequency preamble for the same reason.

2) It is possible that the beginning of the user's data might look just like the repetitive pattern of the preamble. To precisely indicate the end of the preamble, a unique, easily identifiable transition sequence called the sync mark, or frame sync, is written in between the preamble and the user's data. The sync mark is typically 2 to 6 bytes long and may be written in two locations in case the first sync mark is missed or damaged.

3) After the sync mark is found, gain and timing lock must be maintained throughout the user's data that follows. In order to ensure this, it must contain pulses at least every two to three bytes so that gain and timing locks can be adjusted. For example, if the user stored an all-zeros pattern there would be no transitions to generate pulses to use to maintain synchronization. For this reason, the user's data is run-length limited (RLL) encoded before being written to the disk. This can expand the amount of data that must be written by about one percent to as much as 12.5%, depending on the RLL code used.

The PRML detection techniques require a target for the expected pulse shape and for how pulses interfere with each other. To ensure that the waveform is close to this target, a combination of fixed and adaptive filtering is applied to the readback signal. For best performance, all of these channel parameters must be optimized (tuned) for each zone of each head in each drive. ChannelScience's read channel simulation software package, PRMLpro™ (shown in the figure to the left), models most of the signal processing used for detecting the sequences of 1s and 0s from captured readback waveforms from magnetic disk, tape, and optical drives [4].
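To make the idea of a partial-response target concrete, the Python sketch below generates the ideal, noiseless samples for the classic PR4 target, whose polynomial is 1 - D^2. PR4 is chosen here only as a familiar textbook example; the paper does not say which targets particular drives use, and this sketch is unrelated to how PRMLpro™ is implemented.

```python
def pr4_target_samples(bits):
    """Ideal PR4 (1 - D^2) samples for a sequence of written bits (0 or 1).

    Each sample is the current bit minus the bit written two positions earlier,
    so controlled interference between neighbors is part of the target itself.
    """
    padded = [0, 0] + list(bits)  # assume the medium holds zeros before the data
    return [padded[k + 2] - padded[k] for k in range(len(bits))]

print(pr4_target_samples([1, 1, 0, 1, 0, 0]))
```

The samples take only the values -1, 0, and +1. A sequence detector then chooses the bit sequence whose ideal samples are closest to the equalized readback samples.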

Even with all of these steps, the post-detection raw error rate is only about 10^-5 to 10^-8. In order to achieve the specified unrecoverable error rates of 10^-13 to 10^-15, error correction coding must also be used.
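A short Python calculation shows what these rates mean in practice. It is a sketch that assumes independent bit errors and the approximate figure of 600 stored bytes per sector given earlier; the function names are illustrative.

```python
def expected_errors_per_sector(bit_error_rate, stored_bytes=600):
    """Average number of bit errors in one sector at the given error rate."""
    return bit_error_rate * stored_bytes * 8

def terabytes_read_per_error(bit_error_rate):
    """Average data volume read, in terabytes (10^12 bytes), per bit error."""
    return 1.0 / bit_error_rate / 8 / 1e12

print(expected_errors_per_sector(1e-5))  # raw detector output, worst case quoted
print(terabytes_read_per_error(1e-14))   # mid-range unrecoverable error spec
```

At a raw rate of 10^-5, about one sector in twenty contains an error before correction. At a specified rate of 10^-14, an unrecoverable error is expected only once in roughly 12.5 terabytes read.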

2.2.4 Decoding the Data
Inside a modern HDD, the user's data is encoded about 5 times before being written to the disk. This is done to 1) ensure no incorrect data is provided to the user, 2) correct as many detection errors as possible, and 3) improve the quality of detection by improving timing recovery and by mitigating the effects of certain error-prone patterns. Because of these levels of encoding, the user's data itself is not written to the disk. Instead it is the encoded user data that is stored. Even if a tool such as PRMLpro™ is used to recover the data, it is actually detecting the encoded data. To yield useful information that can be reassembled into files, the various encoding steps must be decoded.

One encoding step is actually a data randomizer, also called a scrambler. The scrambler may be thought of as a circuit that pseudo-randomly flips various bits from 1 to 0 or vice versa. Surprisingly, this serves a few useful purposes. 1) Repetitive patterns are broken up. That is, it is less likely that a common pattern that is difficult to detect (e.g., a control character, space, or carriage return) will appear over and over, thereby degrading the bit error rate. 2) Electromagnetic interference (EMI) that might be generated by the electronics, in response to a repetitive pattern at a certain frequency, can be reduced. 3) A common pattern with a lot of zeros may be scrambled into a pattern with more ones. This can help the gain and timing control loops remain locked to the waveform. 4) It is also possible to scramble adjacent tracks differently. This can provide some decorrelation between tracks, which might improve detection when the head is slightly mispositioned off track center.

Because the bits are flipped pseudo-randomly, the flipping sequence can be regenerated during readback so that the data is exactly unscrambled. Precise location of the sync mark is necessary for this to succeed. Notice that the scrambler does not prohibit any pattern. For example, it is possible for the user to store a bit sequence that is scrambled into an all-zeros pattern. For this reason, it is still necessary to apply an RLL code to the scrambled user data.
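The scrambling and descrambling just described can be sketched in Python as an XOR with a pseudo-random bit stream from a linear feedback shift register (LFSR). The 16-bit register, its tap positions, and the seed are arbitrary illustrative choices, not the scrambler of any particular drive.

```python
def keystream(seed, nbits):
    """Generate nbits pseudo-random bits from a 16-bit LFSR (illustrative taps)."""
    state = seed & 0xFFFF
    if state == 0:
        raise ValueError("seed must be nonzero")
    out = []
    for _ in range(nbits):
        out.append(state & 1)
        feedback = (state ^ (state >> 2) ^ (state >> 3) ^ (state >> 5)) & 1
        state = (state >> 1) | (feedback << 15)
    return out

def scramble(bits, seed=0xACE1):
    """Flip each data bit wherever the keystream bit is 1 (XOR)."""
    return [b ^ k for b, k in zip(bits, keystream(seed, len(bits)))]

# XOR with the same keystream undoes the scrambling.
descramble = scramble

user_bits = [0] * 32                      # an all-zeros pattern with no transitions
stored = scramble(user_bits)
print(stored)
print(descramble(stored) == user_bits)    # aligned start: data restored
print(descramble(stored[1:]) == user_bits[1:])  # start off by one bit: data garbled
```

The last line shows why the sync mark must be located precisely: if descrambling starts even one bit late, the regenerated flipping sequence no longer lines up with the stored bits.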

A common RLL code for PRML channels maps 16 scrambled data bits into 17 code bits. This is a coding overhead of about 6% (17/16). This type of code ensures that there are no more than a certain number of zeros (maybe 10 to 15) in between ones. This causes pulses to be present in the readback waveform often enough for gain and timing to be tracked. There are other RLL codes that have much higher rates than 16/17. There are also RLL codes that are designed to eliminate certain patterns that are more error-prone. It is possible that different RLL codes are used in different zones of a single disk surface.

Currently, most drives combine RLL codes with a parity check code. This typically adds one or two bits to the RLL code overhead. For example, a 64/65-rate code (64 user bits are encoded into 65 RLL code bits) would become a 64/66-rate code when a single parity-check bit is added. The benefit of adding this small amount of parity is that the dominant errors made by the detector can be identified and corrected with a small increase in circuitry and code overhead.
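The overhead figures for these codes, and the run-length constraint an RLL code must satisfy, are easy to check with a small Python sketch. The helper names are illustrative; the sketch does not implement an actual RLL encoder.

```python
from fractions import Fraction

def overhead_percent(user_bits, code_bits):
    """Extra bits written, as a percentage of the user bits."""
    return float(Fraction(code_bits, user_bits) - 1) * 100

def longest_zero_run(bits):
    """Length of the longest run of consecutive zeros in a bit sequence."""
    longest = current = 0
    for bit in bits:
        current = current + 1 if bit == 0 else 0
        longest = max(longest, current)
    return longest

for user, code in ((16, 17), (64, 65), (64, 66)):
    print(f"rate {user}/{code}: {overhead_percent(user, code):.4f}% overhead")

print(longest_zero_run([1, 0, 0, 0, 1, 0]))
```

A code with a zero-run limit of k is satisfied when `longest_zero_run` of every encoded sequence is at most k.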

However, all of these encoding methods combined still do not achieve the unrecoverable read error rate goal of better than 10^-13. This is possible only with error correction coding (ECC). ECC calculates parity bytes for the user's data, which provide structured redundancy that can be used during decoding to detect and correct errors. The ECC encoded user data is what is scrambled and RLL encoded. Typically, Reed-Solomon encoding is used because of its good burst error correction capability and the economy of its implementation. Bursts of errors occur because a scratch or other small mark corrupts a group of consecutive bits. It is not uncommon to have the ECC capability to correct over 200 bit errors in a sector.

The ECC can fail in two ways. One way is that there are too many errors in a sector to correct. This is an unrecoverable read error. However, the drive will typically try several heroic recovery methods, such as re-reads, off-track reads, and even some reoptimization, to try to detect the data successfully before reporting an unrecoverable read error (also called a hard error). The other way ECC fails is much more dangerous.

If there are a few more errors in a sector than the ECC can correct, and they occur in a certain way, it is possible that the ECC decoding miscorrects the data. This is disastrous in financial transactions, for example. The probability of miscorrection, also called the probability of data corruption, is not commonly specified on drive data sheets. Ideally the probability is much less than 10^-20. To ensure that it is very unlikely that data will be miscorrected, the ECC encoded data is often wrapped with a CRC (cyclic redundancy check) code. This has a very strong capability to detect errors, but is not used for correction. This provides the final check that the data is correct as delivered back to the computer over the interface.
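The detect-only role of the CRC can be illustrated with Python's standard `zlib.crc32`. This is a sketch: CRC-32 and the 4-byte trailer are assumptions made for illustration, and the paper does not state which CRC polynomial or length drives use.

```python
import zlib

def wrap_with_crc(payload: bytes) -> bytes:
    """Append a 4-byte CRC-32 of the payload."""
    return payload + zlib.crc32(payload).to_bytes(4, "big")

def crc_ok(block: bytes) -> bool:
    """Recompute the CRC over the payload and compare it with the stored value."""
    payload, stored = block[:-4], block[-4:]
    return zlib.crc32(payload).to_bytes(4, "big") == stored

sector = wrap_with_crc(bytes(512))
miscorrected = bytes([sector[0] ^ 0x01]) + sector[1:]  # simulate one wrong bit

print(crc_ok(sector))        # intact data passes the final check
print(crc_ok(miscorrected))  # altered data is flagged but cannot be repaired
```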

The figure below shows the encoding sequence and the organization of sectors on a track. Notice that to get the most benefit from zoning, sometimes data sectors are split across servo wedges. The second part of a split sector must also start with a preamble and a sync mark. The detected data sequences from both portions are concatenated and the decoding and descrambling proceed as usual.
