Failing SSD drive: How to recover?

Project: Linux hardware
Component: Documentation
Category: support request
Priority: normal
Assigned: Unassigned
Status: active
Project wiki: Hardware
Related pages: #29: filesystem; #65: Troubleshooting a failing hard drive
Description

I have a failing SSD drive.

I was using Kubuntu Linux as usual when the system became slow and certain operations that required write access to /var/ started to fail. The root / partition was suddenly mounted read-only!

I tried to reboot the system, but it wouldn't boot anymore. Basically, my root partition was on a failing drive! I am currently using a live disk to access the internet.

I'll document as much as I can while I try to recover the data.

Comments

#1

#2

fdisk -l

$ sudo fdisk -l

Disk /dev/sda: 32.3 GB, 32296140800 bytes
255 heads, 63 sectors/track, 3926 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x6a6e434d

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *           1        3926    31535563+  83  Linux

Disk /dev/sdb: 1000.2 GB, 1000204886016 bytes
255 heads, 63 sectors/track, 121601 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x0003c958

   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1   *           1      120121   964871901   83  Linux
/dev/sdb2          120122      121601    11888100    5  Extended
/dev/sdb5          120122      121601    11888068+  82  Linux swap / Solaris

fsck:

$ sudo fsck -f -y -v /dev/sda1
fsck from util-linux-ng 2.17.2
e2fsck 1.41.12 (17-May-2010)
/dev/sda1: recovering journal
fsck.ext4: unable to set superblock flags on /dev/sda1

#3

The drive is ext4.

#4

badblocks -swv /dev/sda has been running for 5 hours and is still running...

#5

The output overwrites the percentage done, so one doesn't know where it's at:

3986880 done, 5:25:31 elapsed
3987000
3987120
3987240 done, 5:25:32 elapsed
3987360
3987480
3987600 done, 5:25:34 elapsed
3987720
3987840 done, 5:25:35 elapsed
3987960
3988080 done, 5:25:36 elapsed
3988200
3988320 done, 5:25:37 elapsed
3988440
3988560 done, 5:25:38 elapsed
3988680
3988800 done, 5:25:39 elapsed
3988920 done, 5:25:40 elapsed
3989040
3989160
3989280 done, 5:25:42 elapsed
3989400
3989520 done, 5:25:44 elapsed
3989640
^C 12.65% done, 5:25:45 elapsed

Interrupted at block 3989760
root@ubuntu:~#

#185: badblocks output doesn't show percentage done
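
Since the progress figure gets overwritten, the percentage can be worked out by hand from the last block number printed. A minimal sketch, assuming badblocks was run with its default block size of 1024 bytes and using the disk size reported by fdisk in #2. The function name badblocks_percent is made up for illustration:

```shell
# Percentage of a badblocks pass completed, given the current block number.
# $1 = last block number printed by badblocks
# $2 = size of the disk in bytes (from fdisk -l)
# $3 = badblocks block size in bytes (assumed default: 1024)
badblocks_percent() {
    awk -v block="$1" -v disk_bytes="$2" -v block_size="${3:-1024}" \
        'BEGIN { printf "%.2f\n", 100 * block * block_size / disk_bytes }'
}

# Block at which the run above was interrupted, on the 32296140800-byte disk.
badblocks_percent 3989760 32296140800   # prints 12.65
```

The result matches the 12.65% that badblocks printed when it was interrupted with ^C.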

#6

# mount /dev/sda1 mnt.root/
mount: wrong fs type, bad option, bad superblock on /dev/sda1,
       missing codepage or helper program, or other error
       In some cases useful info is found in syslog - try
       dmesg | tail  or so

#7

http://www.linuxquestions.org/questions/linux-newbie-8/learn-the-dd-comm...

# dd if=/dev/sda of=./ssd_root_drive.iso bs=2048 conv=sync,notrunc
15769600+0 records in
15769600+0 records out
32296140800 bytes (32 GB) copied, 1002.69 s, 32.2 MB/s
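
Without conv=noerror, dd stops at the first read error, which is a real risk on a failing drive. The sketch below is an assumption about what a more defensive invocation could look like, not the command that was run above: conv=noerror tells dd to continue after a read error, and conv=sync pads the failed or short block with zeros so the rest of the data keeps its offset. The function name image_device and the example paths are hypothetical.

```shell
# Copy a block device (or any file) to an image, continuing past read errors.
# $1 = source device or file, $2 = destination image
image_device() {
    # noerror: keep going after a failed read
    # sync:    pad short or failed reads with zeros up to the block size
    dd if="$1" of="$2" bs=2048 conv=noerror,sync
}

# Typical use from the live session (needs root, paths are placeholders):
#   image_device /dev/sda1 ./ssd_root_drive_sda1.img
```

Because of conv=sync, the image can end up slightly larger than the source when the last read is short, as in #8 where the final partial record was padded to a full 2048 bytes.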

#8

# dd if=/dev/sda1 of=./ssd_root_drive_sda1.iso bs=2048 conv=sync,notrunc
15767781+1 records in
15767782+0 records out
32292417536 bytes (32 GB) copied, 1000.37 s, 32.3 MB/s

#9

#  sudo e2fsck -f /dev/sda1
e2fsck 1.41.12 (17-May-2010)
/dev/sda1: recovering journal
Superblock needs_recovery flag is clear, but journal has data.
Run journal anyway<y>?

What do I reply to that?

#10

#  sudo e2fsck -f /dev/sda1
e2fsck 1.41.12 (17-May-2010)
/dev/sda1: recovering journal
Superblock needs_recovery flag is clear, but journal has data.
Run journal anyway<y>? no

Clear journal<y>? no

Truncating orphaned inode 89 (uid=1001, gid=4, mode=0100600, size=8690)
Pass 1: Checking inodes, blocks, and sizes
Deleted inode 36 has zero dtime.  Fix<y>? yes

Inodes that were part of a corrupted orphan linked list found.  Fix<y>? yes

Inode 39 was part of the orphaned inode list.  FIXED.
Inode 44 was part of the orphaned inode list.  FIXED.
Inode 45 was part of the orphaned inode list.  FIXED.
Inode 50 was part of the orphaned inode list.  FIXED.
Inode 137 was part of the orphaned inode list.  FIXED.
Inode 2652 was part of the orphaned inode list.  FIXED.
Inode 5517 was part of the orphaned inode list.  FIXED.
Inode 9769 was part of the orphaned inode list.  FIXED.
Inode 9846 was part of the orphaned inode list.  FIXED.
Inode 133304 was part of the orphaned inode list.  FIXED.
Inode 133351, i_size is 444958, should be 450560.  Fix<y>? yes

Inode 133351, i_blocks is 880, should be 888.  Fix<y>? yes

Inode 133960 was part of the orphaned inode list.  FIXED.
Inode 145030 was part of the orphaned inode list.  FIXED.
Inode 145031 was part of the orphaned inode list.  FIXED.
Inode 145184 was part of the orphaned inode list.  FIXED.
Inode 146539 was part of the orphaned inode list.  FIXED.
Inode 146580 was part of the orphaned inode list.  FIXED.
Inode 146733 was part of the orphaned inode list.  FIXED.
Inode 148407 was part of the orphaned inode list.  FIXED.
Inode 151521 was part of the orphaned inode list.  FIXED.
Inode 151946 was part of the orphaned inode list.  FIXED.
Inode 152135 was part of the orphaned inode list.  FIXED.
Inode 152371 was part of the orphaned inode list.  FIXED.
Inode 156698 was part of the orphaned inode list.  FIXED.
Inode 158212 was part of the orphaned inode list.  FIXED.
Inode 160644 was part of the orphaned inode list.  FIXED.
Inode 160661 was part of the orphaned inode list.  FIXED.
Inode 170147 was part of the orphaned inode list.  FIXED.
Inode 218712 was part of the orphaned inode list.  FIXED.
Inode 218716 was part of the orphaned inode list.  FIXED.
Inode 219900 was part of the orphaned inode list.  FIXED.
Inode 219969 was part of the orphaned inode list.  FIXED.
Inode 219992 was part of the orphaned inode list.  FIXED.
Inode 220388 was part of the orphaned inode list.  FIXED.
Inode 917508, i_size is 1460928, should be 1728512.  Fix<y>? yes

Inode 917508, i_blocks is 2864, should be 3384.  Fix<y>? yes

Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
Block bitmap differences:  +2494486 -2506752 +(2520766--2520768) +2520770 +(2520775--2520776) +(2520787--2520788) +(2520795--2520796) +(2520801--2520804) -(2520999--2521002) -(2521004--2521012) +2522658 -(2522717--2522719) -(2551808--2551832) -(2575360--2575366) -(2655233--2655249) +2755944 +(2762736--2762751) -(2780160--2780162) -(2799616--2799661) -2808832 -(2820653--2821094) -(2855689--2856130) -(3367603--3367604) -(3377152--3377163) -(3391488--3391510) -(3506249--3506686) -3517440 -(3528704--3528716) -(3608576--3608686) +(3802018--3802056) -(3802288--3802343) -(4310016--4310073) -(4311040--4311776) -(6355968--6355970) -6515712 -(6782977--6783361) -(6832508--6834799) -6945792 -(7752704--7752705) -(7753728--7753731) +(7760672--7761055) -(7821312--7821319) -(7878748--7878771)
Fix<y>? yes

Free blocks count wrong for group #1 (62, counted=70).
Fix<y>? yes

Free blocks count wrong for group #2 (272, counted=263).
Fix<y>? yes

Free blocks count wrong for group #33 (248, counted=251).
Fix<y>? yes

Free blocks count wrong for group #58 (7, counted=0).
Fix<y>? yes

Free blocks count wrong for group #59 (89, counted=87).
Fix<y>? yes

Free blocks count wrong for group #76 (11417, counted=11416).
Fix<y>? yes

Free blocks count wrong for group #77 (10439, counted=10391).
Fix<y>? yes

Free blocks count wrong for group #78 (9262, counted=9269).
Fix<y>? yes

Free blocks count wrong for group #81 (15853, counted=15852).
Fix<y>? yes

Free blocks count wrong for group #84 (11007, counted=11010).
Fix<y>? yes

Free blocks count wrong for group #85 (13455, counted=13502).
Fix<y>? yes

Free blocks count wrong for group #86 (12102, counted=12544).
Fix<y>? yes

Free blocks count wrong for group #87 (829, counted=1271).
Fix<y>? yes

Free blocks count wrong for group #102 (10895, counted=10897).
Fix<y>? yes

Free blocks count wrong for group #103 (7775, counted=7810).
Fix<y>? yes

Free blocks count wrong for group #107 (6595, counted=7047).
Fix<y>? yes

Free blocks count wrong for group #110 (4329, counted=4440).
Fix<y>? yes

Free blocks count wrong for group #131 (5009, counted=5804).
Fix<y>? yes

Free blocks count wrong for group #160 (12437, counted=12436).
Fix<y>? yes

Free blocks count wrong for group #170 (822, counted=821).
Fix<y>? yes

Free blocks count wrong for group #182 (485, counted=482).
Fix<y>? yes

Free blocks count wrong for group #193 (1645, counted=1648).
Fix<y>? yes

Free blocks count wrong for group #198 (1609, counted=1610).
Fix<y>? yes

Free blocks count wrong for group #207 (1662, counted=2047).
Fix<y>? yes

Free blocks count wrong for group #208 (11300, counted=13592).
Fix<y>? yes

Free blocks count wrong for group #211 (1070, counted=1071).
Fix<y>? yes

Free blocks count wrong for group #236 (12267, counted=11889).
Fix<y>? yes

Free blocks count wrong for group #238 (19438, counted=19446).
Fix<y>? yes

Free blocks count wrong for group #240 (16047, counted=16071).
Fix<y>? yes

Free blocks count wrong (897677, counted=902287).
Fix<y>? yes

Inode bitmap differences:  -36 -39 -(44--45) -50 -9769 -9846 -133304 -133960 -(145030--145031) -145184 -146539 -146580 -146733 -148407 -151521 -151946 -152135 -152371 -156698 -158212 -160644 -160661 -170147 -218712 -218716 -219900 -219969 -219992 -220388
Fix<y>? yes

Free inodes count wrong for group #0 (5, counted=10).
Fix<y>? yes

Free inodes count wrong for group #1 (7, counted=8).
Fix<y>? yes

Free inodes count wrong for group #3 (9, counted=11).
Fix<y>? yes

Free inodes count wrong for group #16 (1, counted=3).
Fix<y>? yes

Free inodes count wrong for group #17 (0, counted=6).
Fix<y>? yes

Free inodes count wrong for group #18 (0, counted=5).
Fix<y>? yes

Free inodes count wrong for group #19 (0, counted=4).
Fix<y>? yes

Free inodes count wrong for group #20 (0, counted=1).
Fix<y>? yes

Free inodes count wrong for group #26 (10, counted=16).
Fix<y>? yes

Free inodes count wrong for group #33 (3, counted=4).
Fix<y>? yes

Free inodes count wrong for group #41 (132, counted=133).
Fix<y>? yes

Free inodes count wrong for group #46 (4050, counted=4051).
Fix<y>? yes

Free inodes count wrong for group #49 (7, counted=1).
Fix<y>? yes

Free inodes count wrong (1498821, counted=1498850).
Fix<y>? yes


/dev/sda1: ***** FILE SYSTEM WAS MODIFIED *****
/dev/sda1: 475422/1974272 files (0.2% non-contiguous), 6981603/7883890 blocks

#11

(Same command, same answers and same output as in #10, down to the final summary line.)

#12

It still won't mount:

# mount /dev/sda1 mnt.root/
mount: wrong fs type, bad option, bad superblock on /dev/sda1,
       missing codepage or helper program, or other error
       In some cases useful info is found in syslog - try
       dmesg | tail  or so

# dmesg | tail
[26567.600285] sd 0:0:0:0: [sda] Sense Key : Medium Error [current] [descriptor]
[26567.600294] Descriptor sense data with sense descriptors (in hex):
[26567.600298]         72 03 11 04 00 00 00 0c 00 0a 80 00 00 00 00 00
[26567.600316]         03 80 00 9f
[26567.600324] sd 0:0:0:0: [sda] Add. Sense: Unrecovered read error - auto reallocate failed
[26567.600333] sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 03 80 00 9f 00 00 08 00
[26567.600349] end_request: I/O error, dev sda, sector 58720415
[26567.600385] ata1: EH complete
[26567.600390] JBD: recovery failed
[26567.600399] EXT4-fs (sda1): error loading journal
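
The failing sector in the kernel log can be translated into a filesystem block number, which helps when looking for what lives there. A rough sketch, assuming the partition starts at sector 63 (the start value GParted reports in #19), 512-byte sectors as shown by fdisk, and a 4096-byte ext4 block size, which is only the usual default and was not checked here. The function name is made up:

```shell
# Convert an absolute disk sector into a block number inside a partition.
# $1 = absolute sector from the kernel log
# $2 = first sector of the partition
# $3 = filesystem block size in bytes (assumed: 4096)
sector_to_fs_block() {
    sector=$1
    part_start=$2
    block_size=${3:-4096}
    echo $(( (sector - part_start) * 512 / block_size ))
}

# Sector from the "end_request: I/O error" line above.
sector_to_fs_block 58720415 63   # prints 7340044
```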

#13

Mounting the image works, though:
# mount -o loop ./ssd_root_drive_sda1.iso /mnt/image/

#15

Basically, my problem is similar to this fellow's:
http://forums.linuxmint.com/viewtopic.php?f=49&t=60411
who got no answer either...

Since we can make an image of the partition and actually mount the image, what is the best course of action?

#16

Now that I have an image of the drive, and at least access to the most important configuration files I wanted, I'll try to reformat the drive and copy everything back.

I install gparted on the live disk:
# apt-get install gparted

#17

Hmmmm... that's interesting. When opening gparted, it sees /dev/sda1, as well as the total size, the amount used and the amount of free space. How does it do this, since the partition is not mounted?
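
A likely explanation, offered as an assumption rather than something verified against the GParted source: the totals are stored in the ext4 superblock, and tools can read them straight from the block device without mounting it. dumpe2fs -h prints those superblock fields. The helper below (ext_usage is a made-up name) turns that output into used and free sizes:

```shell
# Summarise ext2/3/4 usage from `dumpe2fs -h` output read on stdin.
ext_usage() {
    awk -F': *' '
        /^Block count:/ { total = $2 }
        /^Free blocks:/ { free = $2 }
        /^Block size:/  { size = $2 }
        END {
            # Sizes are truncated to whole MiB.
            printf "used=%d MiB free=%d MiB\n",
                (total - free) * size / 1048576, free * size / 1048576
        }'
}

# Typical use on an unmounted partition (needs root):
#   dumpe2fs -h /dev/sda1 2>/dev/null | ext_usage
```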

#18

Using gparted to fsck the partition gives the following error (in gparted_details.htm)
e2fsck: unable to set superblock flags on /dev/sda1

#19

Creating a new partition doesn't work either!

GParted 0.6.2

Libparted 2.3

Delete /dev/sda1 (ext4, 30.07 GiB) from /dev/sda  00:00:01    ( ERROR )
   
calibrate /dev/sda1  00:00:00    ( SUCCESS )
   
path: /dev/sda1
start: 63
end: 63071189
size: 63071127 (30.07 GiB)
delete partition  00:00:01    ( ERROR )
libparted messages    ( INFO )
   
Input/output error during write on /dev/sda
Error fsyncing/closing /dev/sda: Input/output error
========================================

Create Primary Partition #1 (ext4, 30.08 GiB) on /dev/sda
========================================

#20

Bookmarking here some pages I was browsing while using the live CD:

Initramfs boot error
http://ubuntuforums.org/showthread.php?p=11056317#post11056317

HOWTO: recover lost partition after unexpected shutdown (Lucid)
http://ubuntuforums.org/showthread.php?t=1682038

fsck unable to set flags on base and backup superblocks
http://forums.linuxmint.com/viewtopic.php?f=49&t=60411

[SOLVED] Unable to reinstall GRUB
http://ubuntuforums.org/showthread.php?t=1817667

Ext4 died
http://ubuntuforums.org/showthread.php?t=1188782

fsck.ext4: unable to set superblock flags on /dev/sda1
http://www.google.com/search?client=ubuntu&channel=ks&q=fsck.ext4%3A%20u...

#21

I bought a new Intel SSD to replace the old one. Still within the live CD session, I plugged it in via a USB adapter, formatted it, and copied all the root data over from the image of the old drive I had made previously.

Then, I unplugged the computer, physically replaced the old drive with the new one, but the computer didn't boot. It's probably because I forgot to update /etc/fstab with the new UUID and update grub accordingly.

#22

Used blkid to find the UUID of the new drive and updated /etc/fstab.
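
For reference, a small sketch of that step. blkid -s UUID -o value prints just the UUID of a device. The awk helper below (a hypothetical name, fstab_with_root_uuid) rewrites the UUID on the line whose mount point is /, and writes the result to stdout so the original file is left untouched. The device name /dev/sdX1 and the /mnt mount point are placeholders.

```shell
# Print an fstab (read from stdin) with the root entry's UUID replaced.
# $1 = UUID of the new root partition
fstab_with_root_uuid() {
    awk -v uuid="$1" '
        # Only touch the entry mounted on / that is identified by UUID.
        $1 ~ /^UUID=/ && $2 == "/" { sub(/UUID=[^ \t]+/, "UUID=" uuid) }
        { print }'
}

# Typical use from the live session, with the new root mounted on /mnt:
#   new_uuid=$(blkid -s UUID -o value /dev/sdX1)
#   fstab_with_root_uuid "$new_uuid" < /mnt/etc/fstab > /tmp/fstab.new
```

Reviewing /tmp/fstab.new before copying it over the real file avoids locking in a typo that would again prevent booting.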

#23

https://help.ubuntu.com/community/Grub2#Reinstalling%20GRUB2
https://help.ubuntu.com/community/Boot-Repair

sudo add-apt-repository ppa:yannubuntu/boot-repair
sudo apt-get update && sudo apt-get install -y boot-repair && boot-repair