Hard drive reliability: reading the warning signs
Drives rarely die without a trace. SMART data, failure-rate curves and a few minutes of vetting let you spot trouble early, buy used capacity with confidence, and replace a drive before it takes your data with it.
Every drive fails eventually; the goal is to never be surprised by it. Modern drives expose a stream of health telemetry, fail in statistically predictable patterns, and give warning signs you can act on. Learn to read them and you can vet a cheap used drive, monitor your fleet, and retire a disk on your schedule rather than during a 2 a.m. data loss.
SMART: the attributes that actually matter
SMART (Self-Monitoring, Analysis and Reporting Technology) is built into virtually every modern drive and reports dozens of attributes. Most are noise; a handful correlate with imminent failure. Watch these closely (IDs are given in hexadecimal; smartctl prints the decimal equivalents):
- Reallocated Sectors Count (05): sectors the drive found bad and remapped to spares. A few may be benign, but a rising count is a serious warning.
- Current Pending Sectors (C5): sectors that failed a read and are waiting to be remapped. Any non-zero, growing value means trouble.
- Reported Uncorrectable Errors (BB) and Offline Uncorrectable (C6): data the drive could not recover. These are red flags.
- Power-On Hours (09): total run time; essential context for a used drive’s age.
- UDMA CRC Error Count (C7): often a cable or connection fault rather than the drive itself; reseat the SATA cable before condemning the disk.
- Command Timeout (BC), Spin Retry Count (0A), Reallocation Event Count (C4): supporting indicators that, when climbing, reinforce a failure picture.
The pattern matters more than any single value: a stable drive with a few old reallocated sectors is usually fine; a drive whose pending or reallocated counts are rising over weeks is on its way out. Use a tool like smartctl, CrystalDiskInfo or your NAS’s dashboard to read and trend these.
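One way to trend these values is to save a snapshot periodically and compare it with an earlier one. The sketch below assumes each snapshot is the parsed JSON from `smartctl -A -j` (smartmontools 7.0 or later), with ATA attributes listed under `ata_smart_attributes.table` and each row carrying `id` and `raw.value`. The watched-ID set and the function names are illustrative, and raw values are vendor-specific, so treat the result as a prompt to look closer rather than a verdict.

```python
# smartctl reports attribute IDs in decimal:
# 0x05 = 5, 0xC5 = 197, 0xC6 = 198, 0xBB = 187.
WATCHED = {
    5: "Reallocated Sectors",
    197: "Current Pending Sectors",
    198: "Offline Uncorrectable",
    187: "Reported Uncorrectable Errors",
}


def raw_values(snapshot):
    """Map attribute ID -> raw value for one parsed smartctl JSON snapshot."""
    table = snapshot.get("ata_smart_attributes", {}).get("table", [])
    return {row["id"]: row["raw"]["value"] for row in table}


def rising_attributes(older, newer):
    """Return {name: (old, new)} for watched attributes whose raw value grew."""
    old, new = raw_values(older), raw_values(newer)
    rising = {}
    for attr_id, name in WATCHED.items():
        before, after = old.get(attr_id, 0), new.get(attr_id, 0)
        if after > before:
            rising[name] = (before, after)
    return rising


# Hypothetical snapshots taken a few weeks apart (trimmed to the fields used).
earlier = {"ata_smart_attributes": {"table": [
    {"id": 5, "raw": {"value": 8}},
    {"id": 197, "raw": {"value": 0}},
]}}
later = {"ata_smart_attributes": {"table": [
    {"id": 5, "raw": {"value": 8}},
    {"id": 197, "raw": {"value": 3}},
]}}

print(rising_attributes(earlier, later))
```

Here the eight old reallocated sectors are stable and go unreported, while the pending count moving from 0 to 3 is flagged, which matches the "pattern over single value" rule above.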
AFR and the bathtub curve
Reliability is best understood statistically. Annualised Failure Rate (AFR) is the percentage of a population of drives expected to fail in a year — large operators publish AFRs from real-world fleets, and they are far more useful than any spec-sheet number. Failures over a drive’s life follow the famous bathtub curve: an early spike of ‘infant mortality’ as manufacturing defects surface in the first weeks or months, a long flat middle of low, steady failure rates, and a rising tail as the drive ages and wears out. The practical lessons: stress-test new drives early to flush out infant mortality under warranty, and watch ageing drives more closely as they climb the back of the curve.
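As a rough illustration of how a fleet AFR can be computed, the sketch below uses one common convention: count failures against drive-days of service rather than a head count, since drives join and leave a fleet mid-year. The function name and the fleet numbers are made up for the example.

```python
def annualized_failure_rate(failures, drive_days):
    """AFR in percent: failures per drive-year of observed service."""
    if drive_days <= 0:
        raise ValueError("drive_days must be positive")
    drive_years = drive_days / 365
    return 100 * failures / drive_years


# Hypothetical fleet: 1,000 drives that each ran a full year, with 14 failures.
print(round(annualized_failure_rate(14, 1000 * 365), 2))  # 1.4
```

Working in drive-days also lets a drive that served only three months count as a quarter of a drive-year instead of a whole one.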
| Attribute | ID (hex) | What it signals |
|---|---|---|
| Reallocated Sectors | 05 | Bad sectors remapped; rising = danger |
| Current Pending Sectors | C5 | Unstable sectors awaiting remap; act on any rise |
| Offline Uncorrectable | C6 | Unrecoverable data; red flag |
| UDMA CRC Errors | C7 | Usually cable/connection; reseat first |
| Power-On Hours | 09 | Age/usage context, vital for used drives |
| Spin Retry / Command Timeout | 0A / BC | Mechanical strain; corroborating signs |
MTBF vs real life
Manufacturers quote MTBF (Mean Time Between Failures) figures in the hundreds of thousands or even millions of hours, which sounds like a drive lasts centuries. It does not. MTBF is a population statistic measured under ideal conditions over a short window, then extrapolated; it describes the failure rate during the drive’s useful service life, not its lifespan. A 1-million-hour MTBF does not mean your drive will run for 114 years; spread over the 8,760 hours in a year, it implies that a bit under 1% of drives fail each year within the rated service life (commonly around five years). Treat MTBF as a relative quality signal, not a promise, and trust published real-world AFR data over it.
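The arithmetic behind that reading of MTBF can be sketched as follows. It assumes a constant failure rate during the service life (the flat middle of the bathtub curve) and continuous operation; both are simplifications, and the function names are illustrative.

```python
import math

HOURS_PER_YEAR = 8760  # 24 * 365, assuming the drive runs continuously


def mtbf_to_afr(mtbf_hours):
    """Implied annual failure rate in percent, assuming a constant failure rate."""
    if mtbf_hours <= 0:
        raise ValueError("mtbf_hours must be positive")
    return 100 * (1 - math.exp(-HOURS_PER_YEAR / mtbf_hours))


def mtbf_in_years(mtbf_hours):
    """The naive (and misleading) 'lifespan' reading of an MTBF figure."""
    return mtbf_hours / HOURS_PER_YEAR


print(f"{mtbf_in_years(1_000_000):.0f} years")  # 114 years
print(f"{mtbf_to_afr(1_000_000):.2f}% per year")  # 0.87% per year
```

The same spec-sheet number reads as "114 years" or as "just under 1 in 100 drives per year"; only the second is a sensible interpretation.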
Vetting a used or recertified drive
Used and recertified enterprise drives offer the lowest cost per terabyte anywhere, and you can buy them confidently with a short vetting routine. On arrival, before trusting any data to the drive:
- Read the full SMART report — check reallocated and pending sectors (ideally zero or low and stable), uncorrectable errors (zero), and power-on hours (context, not a dealbreaker — a healthy high-hours drive can outlast a flaky new one).
- Run a full surface test — a long SMART self-test plus a full read/write or badblocks pass exercises the whole platter and surfaces weak sectors.
- Buy within a return window — from sellers who publish SMART data and accept returns, so a bad drive costs you nothing but time.
- Never trust a single used drive — keep it as one copy among several, per the 3-2-1 strategy.
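The SMART part of this routine can be written down as a simple triage rule. This is a minimal sketch: the `SmartSummary` type, the verdict strings and the `low_cutoff` default are all hypothetical, and "low" is a judgment call the article leaves open.

```python
from dataclasses import dataclass


@dataclass
class SmartSummary:
    reallocated: int      # Reallocated Sectors Count (raw)
    pending: int          # Current Pending Sectors (raw)
    uncorrectable: int    # uncorrectable error count (raw)
    power_on_hours: int   # context only, never decides the verdict


def vet_used_drive(s, low_cutoff=10):
    """Triage a drive on arrival; low_cutoff is an arbitrary stand-in for 'low'."""
    if s.uncorrectable > 0 or s.reallocated > low_cutoff or s.pending > low_cutoff:
        return "return"
    if s.reallocated > 0 or s.pending > 0:
        # Non-zero but low: acceptable only if the counts hold after a full surface test.
        return "recheck after surface test"
    return "pass"


def powered_on_years(s):
    """Power-on hours expressed in years of continuous running."""
    return s.power_on_hours / 8760


# Hypothetical recertified drive: high hours, clean counters.
drive = SmartSummary(reallocated=0, pending=0, uncorrectable=0, power_on_hours=43_800)
print(vet_used_drive(drive), f"({powered_on_years(drive):.1f} years powered on)")
```

Power-on hours are reported but deliberately kept out of the verdict, reflecting the point that a healthy high-hours drive can outlast a flaky new one.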
When to replace a drive
Replace immediately if pending or reallocated sectors are climbing, if uncorrectable errors appear, or if the drive starts making unusual mechanical noises (clicking, grinding). Plan a proactive replacement as a drive approaches the wear-out tail of its life or the end of its rated service window. This matters most in a RAID array, where swapping a marginal drive on your own schedule is almost always cheaper than risking a failure during a rebuild. And if a drive does fail with important data on it, stop using it and read our data recovery guide before making things worse.
Reliability comes from process, not luck
No drive is immune. Monitor SMART and trend it over time, test new and used drives early, keep array members on CMR enterprise/NAS-grade disks, and — above all — back up. A monitored drive with a backup is a minor inconvenience when it fails; an unmonitored solo drive is a catastrophe waiting to happen.