Troubleshooting a failing hard drive

External documentation

Wikibooks: Minimizing Hard Disk Drive Failure and Data Loss

Self-Monitoring, Analysis, and Reporting Technology, or S.M.A.R.T.:

Filesystems and Mounting

Set up

Install this software and do this initial setup as soon as possible after installing your favourite Linux distribution.

Software to install

automatic check

If the 6th field (fs_passno) of a filesystem's entry in /etc/fstab is 0, fsck skips that filesystem at boot, so its mount count is never checked.
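To see which entries are affected, the 6th field can be listed per mount point. The following is a minimal sketch, not part of the original page: the function name fstab_fsck_report is hypothetical, and it assumes whitespace-separated fstab entries where a missing 6th field counts as 0.

```shell
# Report, for each fstab entry, whether boot-time fsck is enabled (6th field).
# Usage: fstab_fsck_report [path-to-fstab]   (defaults to /etc/fstab)
# fstab_fsck_report is a hypothetical helper name used for illustration.
fstab_fsck_report() {
    awk '
        /^[ \t]*#/ { next }                    # skip comment lines
        NF == 0    { next }                    # skip blank lines
        {
            pass = (NF >= 6) ? $6 + 0 : 0      # a missing 6th field counts as 0
            if (pass == 0)
                printf "%s: never checked at boot\n", $2
            else
                printf "%s: checked at boot (pass %d)\n", $2, pass
        }
    ' "${1:-/etc/fstab}"
}
```

Called without arguments it reads /etc/fstab; passing a path lets you try it on a copy first.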


Palimpsest Disk Utility (gnome-disk-utility)

On Ubuntu, the package name is:


This will also install a mail server.

Things to do regularly

... to prevent disaster.


When things get bad...

Symptoms of a failing drive

Here are a few examples of input/output errors encountered by various programs when the disk starts to fail:

* Directory listing output is full of question marks:

ls: cannot access *** Input/output error
total 0
d????????? ? ? ? ?                ? mydirectory

* Cannot remove files:

$ sudo rm -f image.JPG
rm: cannot remove `image.JPG': Input/output error

* Cannot write to file, nor perform any operation which requires write access to the failing drive.

First steps to take

Whatever the problem is, the priority is to save as much data as possible.

First of all, use dd to make an image of the drive.

See the following post for a detailed explanation of dd's uses:

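In our case, something like the following command would do. The noerror option makes dd continue past read errors instead of aborting, and sync pads unreadable blocks with zeros so that the rest of the image stays aligned. Write the image to a different, healthy disk:

# dd if=/dev/sda of=/mnt/backup/failing_drive_image.iso bs=2048 conv=noerror,sync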
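Before pointing dd at a real device, it can help to rehearse on an ordinary file. The sketch below is an illustration under stated assumptions: image_drive is a hypothetical wrapper name, SRC and IMG are temporary files standing in for the failing drive and the backup image, and the conv=noerror,sync options are the ones commonly used so that dd continues past read errors and zero-pads failed blocks.

```shell
# Image a source to a file, continuing past read errors.
# image_drive is a hypothetical helper; on a real system the first argument
# would be the failing device (e.g. /dev/sda) and the second a file on a
# different, healthy disk.
image_drive() {
    # noerror: keep going after a read error instead of aborting
    # sync:    pad short or failed reads with zeros so offsets stay aligned
    dd if="$1" of="$2" bs=2048 conv=noerror,sync
}

# Rehearsal on an ordinary file standing in for the drive (32 x 2048 bytes)
SRC=$(mktemp)
IMG=$(mktemp)
dd if=/dev/urandom of="$SRC" bs=2048 count=32 2>/dev/null
image_drive "$SRC" "$IMG" 2>/dev/null

# Verify that the image is byte-for-byte identical to the source
cmp "$SRC" "$IMG" && echo "image matches source"
```

On a drive with unreadable sectors the image will not match the source exactly, since the failed blocks are replaced with zeros; a small block size keeps the amount of data lost per bad read low.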

Checks to run

Use fdisk -l to find the device path of the partition to check (e.g. /dev/sda1).
Make sure the partition is unmounted, then run:
e2fsck /dev/sd??

Check man e2fsck for more information on options.
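The "unmounted" precondition can be checked from a script before e2fsck is run. This is a minimal sketch under assumptions: is_mounted and safe_fsck are hypothetical helper names, and the check compares the device path literally against the first column of /proc/mounts, so it can miss a partition that is mounted under another name (for example a device-mapper path).

```shell
# Return 0 if the device ($1) appears as a mount source in the mounts table.
# The table defaults to /proc/mounts; the optional second argument lets the
# function be tried against a sample file.
is_mounted() {
    awk -v dev="$1" '$1 == dev { found = 1 } END { exit !found }' "${2:-/proc/mounts}"
}

# Refuse to run e2fsck on a partition that is still mounted.
safe_fsck() {
    if is_mounted "$1"; then
        echo "$1 is mounted; unmount it first" >&2
        return 1
    fi
    e2fsck -f "$1"    # -f forces a check even if the filesystem looks clean
}
```

Usage would be along the lines of safe_fsck /dev/sda1, run as root, with the device path taken from fdisk -l.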


If the partition cannot be unmounted (for example, the root filesystem), either boot from a "rescue" CD (or boot an installer and choose Rescue mode), or add the kernel boot option "forcefsck" along with the "ro" option (on both Debian and Red Hat systems).

data recovery:

