Troubleshooting a failing hard drive

If you have any questions about the content on this page, don't hesitate to open a new ticket and we'll do our best to assist you.

External documentation

Wikibooks: Minimizing Hard Disk Drive Failure and Data Loss
http://en.wikibooks.org/wiki/Minimizing_Hard_Disk_Drive_Failure_and_Data...

Self-Monitoring, Analysis, and Reporting Technology, or S.M.A.R.T.:
http://en.wikipedia.org/wiki/S.M.A.R.T.

Filesystems and Mounting
http://members.iinet.net.au/~herman546/p10.html

Set up

Install the following software and complete the initial setup as soon as possible after installing your favourite Linux distribution.

Software to install

Automatic check

If the sixth field (fs_passno) of an /etc/fstab entry is 0, fsck skips that filesystem at boot, so its mount count is never checked.
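
To see which filesystems are excluded from the boot-time check, one can read the sixth field of each fstab entry. The following is a minimal sketch, assuming a standard whitespace-separated /etc/fstab; the function name list_unchecked_filesystems is made up for this example.

```shell
#!/bin/sh
# Print the mount points whose sixth fstab field (fs_passno) is 0 or absent,
# i.e. the filesystems that the boot-time fsck skips.
# Usage: list_unchecked_filesystems [path-to-fstab]   (default: /etc/fstab)
list_unchecked_filesystems() {
    fstab="${1:-/etc/fstab}"
    # Ignore comment lines and blank lines; a missing sixth field counts as 0.
    awk '!/^[[:space:]]*#/ && NF >= 2 {
        passno = (NF >= 6) ? $6 : 0
        if (passno == 0) print $2
    }' "$fstab"
}
```

Called without an argument, the function reads /etc/fstab. For ext2/ext3/ext4 partitions, tune2fs -l /dev/sda1 (the device name is only an example) reports the current and maximum mount counts.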

Palimpsest

Palimpsest Disk Utility (gnome-disk-utility)
http://en.wikipedia.org/wiki/Palimpsest_Disk_Utility

On Ubuntu, the package name is:
gnome-disk-utility

smartmontools

http://sourceforge.net/apps/trac/smartmontools/wiki

On Ubuntu, the package name is:
smartmontools

On Ubuntu, this also installs a mail server, which the smartd daemon can use to send warning emails.
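
Once smartmontools is installed, smartctl -H asks a drive for its overall health verdict. The sketch below only extracts that verdict from the command's output; the helper name smart_verdict is hypothetical, and the exact wording of smartctl's output may differ between versions and drive types, so treat the two patterns as assumptions.

```shell
#!/bin/sh
# Read the output of `smartctl -H` on stdin and print only the verdict.
# Patterns assumed here:
#   ATA drives:  "SMART overall-health self-assessment test result: PASSED"
#   SCSI drives: "SMART Health Status: OK"
smart_verdict() {
    sed -n \
        -e 's/^SMART overall-health self-assessment test result: *//p' \
        -e 's/^SMART Health Status: *//p'
}

# Example (needs root; /dev/sda is only an example device):
#   sudo smartctl -H /dev/sda | smart_verdict
```

Anything other than PASSED or OK is a reason to back up the data immediately. smartctl -a prints the full attribute table, and smartctl -t short starts a short self-test on the drive.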

Things to do regularly

... to prevent disaster.

Troubleshooting

When things get bad...

Symptoms of a failing drive

Here are a few examples of input/output errors encountered by various programs when the disk starts to fail:

* Directory listing output is full of question marks:

ls: cannot access *** Input/output error
total 0
d????????? ? ? ? ?                ? mydirectory

* Cannot remove files:

$ sudo rm -f image.JPG
rm: cannot remove `image.JPG': Input/output error

* Cannot write to file, nor perform any operation which requires write access to the failing drive.
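
When programs report input/output errors like the ones above, the kernel log usually contains matching messages. The following is a small sketch of a filter for them; the function name filter_io_errors and the two search patterns are assumptions, and the exact kernel messages vary between kernel versions and drivers.

```shell
#!/bin/sh
# Keep only the lines on stdin that mention an I/O error (case-insensitive).
filter_io_errors() {
    grep -iE 'i/o error|input/output error'
}

# Example (dmesg may require root on some systems):
#   dmesg | filter_io_errors
```

If the filter prints lines naming the suspect device, that supports the failing-drive diagnosis. An empty result does not prove that the drive is healthy.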

First steps to take

Whatever the cause, the priority is to save as much data as possible before attempting any repair.

First of all, use dd to make an image of the drive, and store that image on a different, healthy drive.

See the following post for a detailed explanation of dd's uses:
http://www.linuxquestions.org/questions/linux-newbie-8/learn-the-dd-comm...

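In our case, something like the following command would do. The noerror option tells dd to keep going after a read error instead of stopping at the first bad sector, and sync pads each unreadable block with zeros so that offsets in the image stay aligned:

# dd if=/dev/sda of=/mnt/backup/failing_drive_image.iso bs=2048 conv=noerror,sync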
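
The imaging step can be wrapped in a small function that refuses to overwrite an existing image. This is a sketch under assumptions: the name image_drive is hypothetical, and it uses conv=noerror,sync so that dd continues past unreadable blocks and pads them with zeros.

```shell
#!/bin/sh
# Copy a drive (or any readable file) to an image file with dd.
# Usage: image_drive SOURCE DESTINATION
image_drive() {
    src="$1"
    dest="$2"
    if [ ! -r "$src" ]; then
        echo "image_drive: cannot read $src" >&2
        return 1
    fi
    # Never clobber an earlier image: it may be the only good copy.
    if [ -e "$dest" ]; then
        echo "image_drive: $dest already exists, refusing to overwrite" >&2
        return 1
    fi
    # noerror: continue after read errors; sync: pad short or failed reads
    # with zeros so the image keeps the same offsets as the source.
    dd if="$src" of="$dest" bs=2048 conv=noerror,sync
}

# Example (as root; paths taken from the command above):
#   image_drive /dev/sda /mnt/backup/failing_drive_image.iso
```

The destination must be on a different physical drive with enough free space to hold the whole source device.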

Checks to run

Use fdisk -l to find the device path (e.g. /dev/sda1).
Make sure the partition to check is unmounted, then run:
e2fsck /dev/sd??

e2fsck only handles ext2, ext3 and ext4 filesystems. Check man e2fsck for more information on options.
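
The "unmounted" precondition can be checked before e2fsck is started. The sketch below assumes that mounted devices appear by the same path in /proc/mounts. The helper names is_mounted and check_partition are hypothetical, and the -n option opens the filesystem read-only and answers "no" to every question, so nothing is modified.

```shell
#!/bin/sh
# Succeed if DEVICE appears in the first column of the mounts table.
# Usage: is_mounted DEVICE [mounts-file]   (default: /proc/mounts)
is_mounted() {
    dev="$1"
    mounts="${2:-/proc/mounts}"
    awk -v d="$dev" '$1 == d { found = 1 } END { exit !found }' "$mounts"
}

# Run a read-only e2fsck pass, but only on an unmounted partition.
check_partition() {
    if is_mounted "$1"; then
        echo "check_partition: $1 is mounted, unmount it first" >&2
        return 1
    fi
    e2fsck -n "$1"
}

# Example (as root; /dev/sda1 is only an example):
#   check_partition /dev/sda1
```

A partition mounted through another name (for example a /dev/mapper path) would not be detected by this simple comparison, so double-check with mount or findmnt when in doubt.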

Recovery

Either boot from a "rescue" CD (or boot an installer and choose Rescue mode), or add the kernel boot option "forcefsck" along with the "ro" option (on both Debian and Red Hat systems). On systemd-based systems, the equivalent kernel option is fsck.mode=force.

Data recovery:
https://help.ubuntu.com/community/DataRecovery

Issues related to this page:

Project | Summary | Status | Priority | Category | Last updated | Assigned to
Linux software | ls: cannot access *** Input/output error | active | normal | bug report | 4 years 34 weeks |
Linux software | Best use of e2fsck | active | critical | task | 7 years 17 weeks |
Linux hardware | Failing SSD drive: How to recover? | active | normal | support request | 4 years 44 weeks |
Linux software | badblocks output doesn't show percentage done | active | normal | bug report | 6 years 4 weeks |
Linux hardware | Checklist of things to keep in writting, in cas… | active | normal | feature request | 6 years 4 weeks |