Failing SSD drive: How to recover?
Project: | Linux hardware |
Component: | Documentation |
Category: | support request |
Priority: | normal |
Assigned: | Unassigned |
Status: | active |
Project wiki: | Hardware |
Related pages: | #29: filesystem :-:-: #65: Troubleshooting a failing hard drive |
Description
I have a failing SSD drive.
I was using Kubuntu Linux as usual, when the system became slow and certain operations (that required write access to /var/) started to fail. The root / partition was suddenly mounted read only!
I tried to reboot the system, but it wouldn't boot anymore. Basically, my root partition was on a failing drive! I am currently using a live disk to access the internet.
I'll document as much as I can while I try to recover as much data as possible.
Comments
#1
#2
fdisk -l
$ sudo fdisk -l
Disk /dev/sda: 32.3 GB, 32296140800 bytes
255 heads, 63 sectors/track, 3926 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x6a6e434d
Device Boot Start End Blocks Id System
/dev/sda1 * 1 3926 31535563+ 83 Linux
Disk /dev/sdb: 1000.2 GB, 1000204886016 bytes
255 heads, 63 sectors/track, 121601 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x0003c958
Device Boot Start End Blocks Id System
/dev/sdb1 * 1 120121 964871901 83 Linux
/dev/sdb2 120122 121601 11888100 5 Extended
/dev/sdb5 120122 121601 11888068+ 82 Linux swap / Solaris
fsck:
$ sudo fsck -f -y -v /dev/sda1
fsck from util-linux-ng 2.17.2
e2fsck 1.41.12 (17-May-2010)
/dev/sda1: recovering journal
fsck.ext4: unable to set superblock flags on /dev/sda1
#3
The drive is ext4.
#4
badblocks -swv /dev/sda has been running for 5 hours and is still running...
#5
The output overwrites the percentage done, so one doesn't know where it's at:
3986880 done, 5:25:31 elapsed
3987000
3987120
3987240 done, 5:25:32 elapsed
3987360
3987480
3987600 done, 5:25:34 elapsed
3987720
3987840 done, 5:25:35 elapsed
3987960
3988080 done, 5:25:36 elapsed
3988200
3988320 done, 5:25:37 elapsed
3988440
3988560 done, 5:25:38 elapsed
3988680
3988800 done, 5:25:39 elapsed
3988920 done, 5:25:40 elapsed
3989040
3989160
3989280 done, 5:25:42 elapsed
3989400
3989520 done, 5:25:44 elapsed
3989640
^C 12.65% done, 5:25:45 elapsed
Interrupted at block 3989760
root@ubuntu:~#
#185: badblocks output doesn't show percentage done
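Since the percentage gets overwritten, it can be recomputed from the last block number printed. A minimal sketch, assuming badblocks is using its default block size of 1024 bytes and taking the device size from the fdisk listing above (32296140800 bytes). The function name badblocks_percent is made up for illustration.

```shell
# Percentage of the device covered so far in the current pass.
# $1 = last block number printed by badblocks
# $2 = device size in bytes
# $3 = badblocks block size in bytes (assumed 1024 if omitted)
badblocks_percent() {
    awk -v b="$1" -v t="$2" -v bs="${3:-1024}" \
        'BEGIN { printf "%.2f\n", 100 * b * bs / t }'
}

# Block number at the time of the interrupt, size of /dev/sda in bytes.
badblocks_percent 3989760 32296140800
```

For the interrupt point above this gives 12.65, which matches the figure badblocks printed on ^C.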
#6
# mount /dev/sda1 mnt.root/
mount: wrong fs type, bad option, bad superblock on /dev/sda1,
missing codepage or helper program, or other error
In some cases useful info is found in syslog - try
dmesg | tail or so
#7
http://www.linuxquestions.org/questions/linux-newbie-8/learn-the-dd-comm...
# dd if=/dev/sda of=./ssd_root_drive.iso bs=2048 conv=sync,notrunc
15769600+0 records in
15769600+0 records out
32296140800 bytes (32 GB) copied, 1002.69 s, 32.2 MB/s
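A note on imaging a drive that returns read errors: by default dd stops at the first failed read. A minimal sketch of a more tolerant copy, assuming a dd that supports conv=noerror (GNU dd does). The function name image_with_dd and the 4096-byte block size are my own choices for illustration, not what was run above. GNU ddrescue is a dedicated tool for the same job and may be a better fit.

```shell
# Copy SRC to DST and keep going past read errors.
#   noerror: continue after a failed read instead of aborting
#   sync:    pad every short or failed input block with zeros up to bs,
#            so data after a bad spot stays at the same offset in the image
# $1 = source device or file, $2 = destination image file
image_with_dd() {
    dd if="$1" of="$2" bs=4096 conv=noerror,sync
}

# Example call (not executed here; needs root and the real device):
# image_with_dd /dev/sda ./ssd_root_drive.img
```

A smaller block size loses less data around each bad spot, at the cost of a slower copy.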
#8
# dd if=/dev/sda1 of=./ssd_root_drive_sda1.iso bs=2048 conv=sync,notrunc
15767781+1 records in
15767782+0 records out
32292417536 bytes (32 GB) copied, 1000.37 s, 32.3 MB/s
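The "15767781+1 records in" line means the last input record was a partial one, and conv=sync padded it up to a full 2048 bytes. A quick arithmetic check of how much padding ended up in the image, assuming 512-byte sectors and a partition size of 63071127 sectors (the 31535563+ one-KiB blocks from the fdisk listing). The helper names are made up for illustration.

```shell
# Bytes written by dd: full output records times the block size.
# $1 = records out, $2 = bs in bytes
dd_out_bytes() { echo $(( $1 * $2 )); }

# Size of a partition in bytes. $1 = sector count (512-byte sectors assumed)
part_bytes() { echo $(( $1 * 512 )); }

image=$(dd_out_bytes 15767782 2048)   # from the dd output above
part=$(part_bytes 63071127)           # sector count of /dev/sda1
echo "image: $image bytes, partition: $part bytes"
echo "padding: $(( image - part )) bytes"
```

So the image is 512 bytes longer than the partition, all of it zero padding at the very end.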
#9
# sudo e2fsck -f /dev/sda1
e2fsck 1.41.12 (17-May-2010)
/dev/sda1: recovering journal
Superblock needs_recovery flag is clear, but journal has data.
Run journal anyway<y>?
What do I reply to that?
#10
# sudo e2fsck -f /dev/sda1
e2fsck 1.41.12 (17-May-2010)
/dev/sda1: recovering journal
Superblock needs_recovery flag is clear, but journal has data.
Run journal anyway<y>? no
Clear journal<y>? no
Truncating orphaned inode 89 (uid=1001, gid=4, mode=0100600, size=8690)
Pass 1: Checking inodes, blocks, and sizes
Deleted inode 36 has zero dtime. Fix<y>? yes
Inodes that were part of a corrupted orphan linked list found. Fix<y>? yes
Inode 39 was part of the orphaned inode list. FIXED.
Inode 44 was part of the orphaned inode list. FIXED.
Inode 45 was part of the orphaned inode list. FIXED.
Inode 50 was part of the orphaned inode list. FIXED.
Inode 137 was part of the orphaned inode list. FIXED.
Inode 2652 was part of the orphaned inode list. FIXED.
Inode 5517 was part of the orphaned inode list. FIXED.
Inode 9769 was part of the orphaned inode list. FIXED.
Inode 9846 was part of the orphaned inode list. FIXED.
Inode 133304 was part of the orphaned inode list. FIXED.
Inode 133351, i_size is 444958, should be 450560. Fix<y>? yes
Inode 133351, i_blocks is 880, should be 888. Fix<y>? yes
Inode 133960 was part of the orphaned inode list. FIXED.
Inode 145030 was part of the orphaned inode list. FIXED.
Inode 145031 was part of the orphaned inode list. FIXED.
Inode 145184 was part of the orphaned inode list. FIXED.
Inode 146539 was part of the orphaned inode list. FIXED.
Inode 146580 was part of the orphaned inode list. FIXED.
Inode 146733 was part of the orphaned inode list. FIXED.
Inode 148407 was part of the orphaned inode list. FIXED.
Inode 151521 was part of the orphaned inode list. FIXED.
Inode 151946 was part of the orphaned inode list. FIXED.
Inode 152135 was part of the orphaned inode list. FIXED.
Inode 152371 was part of the orphaned inode list. FIXED.
Inode 156698 was part of the orphaned inode list. FIXED.
Inode 158212 was part of the orphaned inode list. FIXED.
Inode 160644 was part of the orphaned inode list. FIXED.
Inode 160661 was part of the orphaned inode list. FIXED.
Inode 170147 was part of the orphaned inode list. FIXED.
Inode 218712 was part of the orphaned inode list. FIXED.
Inode 218716 was part of the orphaned inode list. FIXED.
Inode 219900 was part of the orphaned inode list. FIXED.
Inode 219969 was part of the orphaned inode list. FIXED.
Inode 219992 was part of the orphaned inode list. FIXED.
Inode 220388 was part of the orphaned inode list. FIXED.
Inode 917508, i_size is 1460928, should be 1728512. Fix<y>? yes
Inode 917508, i_blocks is 2864, should be 3384. Fix<y>? yes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
Block bitmap differences: +2494486 -2506752 +(2520766--2520768) +2520770 +(2520775--2520776) +(2520787--2520788) +(2520795--2520796) +(2520801--2520804) -(2520999--2521002) -(2521004--2521012) +2522658 -(2522717--2522719) -(2551808--2551832) -(2575360--2575366) -(2655233--2655249) +2755944 +(2762736--2762751) -(2780160--2780162) -(2799616--2799661) -2808832 -(2820653--2821094) -(2855689--2856130) -(3367603--3367604) -(3377152--3377163) -(3391488--3391510) -(3506249--3506686) -3517440 -(3528704--3528716) -(3608576--3608686) +(3802018--3802056) -(3802288--3802343) -(4310016--4310073) -(4311040--4311776) -(6355968--6355970) -6515712 -(6782977--6783361) -(6832508--6834799) -6945792 -(7752704--7752705) -(7753728--7753731) +(7760672--7761055) -(7821312--7821319) -(7878748--7878771)
Fix<y>? yes
Free blocks count wrong for group #1 (62, counted=70).
Fix<y>? yes
Free blocks count wrong for group #2 (272, counted=263).
Fix<y>? yes
Free blocks count wrong for group #33 (248, counted=251).
Fix<y>? yes
Free blocks count wrong for group #58 (7, counted=0).
Fix<y>? yes
Free blocks count wrong for group #59 (89, counted=87).
Fix<y>? yes
Free blocks count wrong for group #76 (11417, counted=11416).
Fix<y>? yes
Free blocks count wrong for group #77 (10439, counted=10391).
Fix<y>? yes
Free blocks count wrong for group #78 (9262, counted=9269).
Fix<y>? yes
Free blocks count wrong for group #81 (15853, counted=15852).
Fix<y>? yes
Free blocks count wrong for group #84 (11007, counted=11010).
Fix<y>? yes
Free blocks count wrong for group #85 (13455, counted=13502).
Fix<y>? yes
Free blocks count wrong for group #86 (12102, counted=12544).
Fix<y>? yes
Free blocks count wrong for group #87 (829, counted=1271).
Fix<y>? yes
Free blocks count wrong for group #102 (10895, counted=10897).
Fix<y>? yes
Free blocks count wrong for group #103 (7775, counted=7810).
Fix<y>? yes
Free blocks count wrong for group #107 (6595, counted=7047).
Fix<y>? yes
Free blocks count wrong for group #110 (4329, counted=4440).
Fix<y>? yes
Free blocks count wrong for group #131 (5009, counted=5804).
Fix<y>? yes
Free blocks count wrong for group #160 (12437, counted=12436).
Fix<y>? yes
Free blocks count wrong for group #170 (822, counted=821).
Fix<y>? yes
Free blocks count wrong for group #182 (485, counted=482).
Fix<y>? yes
Free blocks count wrong for group #193 (1645, counted=1648).
Fix<y>? yes
Free blocks count wrong for group #198 (1609, counted=1610).
Fix<y>? yes
Free blocks count wrong for group #207 (1662, counted=2047).
Fix<y>? yes
Free blocks count wrong for group #208 (11300, counted=13592).
Fix<y>? yes
Free blocks count wrong for group #211 (1070, counted=1071).
Fix<y>? yes
Free blocks count wrong for group #236 (12267, counted=11889).
Fix<y>? yes
Free blocks count wrong for group #238 (19438, counted=19446).
Fix<y>? yes
Free blocks count wrong for group #240 (16047, counted=16071).
Fix<y>? yes
Free blocks count wrong (897677, counted=902287).
Fix<y>? yes
Inode bitmap differences: -36 -39 -(44--45) -50 -9769 -9846 -133304 -133960 -(145030--145031) -145184 -146539 -146580 -146733 -148407 -151521 -151946 -152135 -152371 -156698 -158212 -160644 -160661 -170147 -218712 -218716 -219900 -219969 -219992 -220388
Fix<y>? yes
Free inodes count wrong for group #0 (5, counted=10).
Fix<y>? yes
Free inodes count wrong for group #1 (7, counted=8).
Fix<y>? yes
Free inodes count wrong for group #3 (9, counted=11).
Fix<y>? yes
Free inodes count wrong for group #16 (1, counted=3).
Fix<y>? yes
Free inodes count wrong for group #17 (0, counted=6).
Fix<y>? yes
Free inodes count wrong for group #18 (0, counted=5).
Fix<y>? yes
Free inodes count wrong for group #19 (0, counted=4).
Fix<y>? yes
Free inodes count wrong for group #20 (0, counted=1).
Fix<y>? yes
Free inodes count wrong for group #26 (10, counted=16).
Fix<y>? yes
Free inodes count wrong for group #33 (3, counted=4).
Fix<y>? yes
Free inodes count wrong for group #41 (132, counted=133).
Fix<y>? yes
Free inodes count wrong for group #46 (4050, counted=4051).
Fix<y>? yes
Free inodes count wrong for group #49 (7, counted=1).
Fix<y>? yes
Free inodes count wrong (1498821, counted=1498850).
Fix<y>? yes
/dev/sda1: ***** FILE SYSTEM WAS MODIFIED *****
/dev/sda1: 475422/1974272 files (0.2% non-contiguous), 6981603/7883890 blocks
#12
It still won't mount:
# mount /dev/sda1 mnt.root/
mount: wrong fs type, bad option, bad superblock on /dev/sda1,
missing codepage or helper program, or other error
In some cases useful info is found in syslog - try
dmesg | tail or so
# dmesg | tail
[26567.600285] sd 0:0:0:0: [sda] Sense Key : Medium Error [current] [descriptor]
[26567.600294] Descriptor sense data with sense descriptors (in hex):
[26567.600298] 72 03 11 04 00 00 00 0c 00 0a 80 00 00 00 00 00
[26567.600316] 03 80 00 9f
[26567.600324] sd 0:0:0:0: [sda] Add. Sense: Unrecovered read error - auto reallocate failed
[26567.600333] sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 03 80 00 9f 00 00 08 00
[26567.600349] end_request: I/O error, dev sda, sector 58720415
[26567.600385] ata1: EH complete
[26567.600390] JBD: recovery failed
[26567.600399] EXT4-fs (sda1): error loading journal
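The failing sector in the dmesg output can be translated into a filesystem block number, which helps to see what lives there. A minimal sketch, under these assumptions: 512-byte sectors (from the fdisk listing), /dev/sda1 starting at sector 63, and a 4096-byte ext4 block size. The block size is inferred from the block count e2fsck reported, not read from the superblock, and the function name is made up.

```shell
# Convert an absolute disk sector to a block number inside a partition's
# filesystem.
# $1 = failing sector reported by the kernel
# $2 = first sector of the partition
# $3 = filesystem block size in bytes
sector_to_fs_block() {
    echo $(( ($1 - $2) * 512 / $3 ))
}

# Sector from "end_request: I/O error, dev sda, sector 58720415"
sector_to_fs_block 58720415 63 4096
```

With those assumptions the bad sector falls in filesystem block 7340044. Given the "JBD: recovery failed" message right after it, that block is probably part of the journal, but I have not verified that.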
#13
Mounting the image works, though:
# mount -o loop ./ssd_root_drive_sda1.iso /mnt/image/
#15
Basically, my problem is similar to this fellow's:
http://forums.linuxmint.com/viewtopic.php?f=49&t=60411
who got no answer either...
Since we can make an image of the partition and actually mount the image, what is the best course of action?
#16
Now that I have an image of the drive, and at least access to the most important configuration files I wanted, I'll try to reformat and copy everything back.
I installed gparted on the live disk:
# apt-get install gparted
#17
Hmmmm.... that's interesting. When opening gparted, it sees /dev/sda1, as well as the total size, the amount used and the amount of free space... How does it do this, since the partition is not mounted?
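A likely answer: the ext4 superblock stores the block count, the free block count and the block size on disk, so a tool can read them without mounting anything. As far as I know gparted gets its ext2/3/4 usage figures this way (through dumpe2fs), but treat that as an assumption about its internals. A minimal sketch that derives the used space from `dumpe2fs -h` style output; the function name ext_used_bytes is made up for illustration.

```shell
# Read `dumpe2fs -h` style text on stdin and print the used space in bytes.
# Only the "Block count:", "Free blocks:" and "Block size:" lines are used.
ext_used_bytes() {
    eub_total=0 eub_free=0 eub_bsize=0
    while IFS= read -r eub_line; do
        case "$eub_line" in
            "Block count:"*) eub_total=${eub_line##* } ;;
            "Free blocks:"*) eub_free=${eub_line##* } ;;
            "Block size:"*)  eub_bsize=${eub_line##* } ;;
        esac
    done
    echo $(( (eub_total - eub_free) * eub_bsize ))
}

# Example call (not executed here; needs root and the real device):
# dumpe2fs -h /dev/sda1 2>/dev/null | ext_used_bytes
```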
#18
Using gparted to fsck the partition gives the following error (in gparted_details.htm):
e2fsck: unable to set superblock flags on /dev/sda1
#19
Creating a new partition doesn't work either!
GParted 0.6.2
Libparted 2.3
Delete /dev/sda1 (ext4, 30.07 GiB) from /dev/sda 00:00:01 ( ERROR )
calibrate /dev/sda1 00:00:00 ( SUCCESS )
path: /dev/sda1
start: 63
end: 63071189
size: 63071127 (30.07 GiB)
delete partition 00:00:01 ( ERROR )
libparted messages ( INFO )
Input/output error during write on /dev/sda
Error fsyncing/closing /dev/sda: Input/output error
========================================
Create Primary Partition #1 (ext4, 30.08 GiB) on /dev/sda
========================================
#20
Here are some pages I browsed while using the live CD:
Initramfs boot error
http://ubuntuforums.org/showthread.php?p=11056317#post11056317
HOWTO: recover lost partition after unexpected shutdown (Lucid)
http://ubuntuforums.org/showthread.php?t=1682038
fsck unable to set flags on base and backup superblocks
http://forums.linuxmint.com/viewtopic.php?f=49&t=60411
[SOLVED] Unable to reinstall GRUB
http://ubuntuforums.org/showthread.php?t=1817667
Ext4 died
http://ubuntuforums.org/showthread.php?t=1188782
fsck.ext4: unable to set superblock flags on /dev/sda1
http://www.google.com/search?client=ubuntu&channel=ks&q=fsck.ext4%3A%20u...
#21
I bought a new Intel SSD drive to replace the old one. Still within the live CD session, I plugged it in via a USB adapter, formatted it and copied all the root data over from the image of the old drive I had made previously.
Then I unplugged the computer and physically replaced the old drive with the new one, but the computer didn't boot. That's probably because I forgot to update /etc/fstab with the new UUID and update grub accordingly.
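The fstab part of that fix can be sketched as a one-line substitution. This is a minimal sketch, assuming the root entry in /etc/fstab refers to the filesystem by UUID= and that the old and new UUIDs are already known. The function name, the UUID values and the mount point in the example call are all hypothetical.

```shell
# Replace one UUID with another in an fstab-style file.
# The original file is kept next to it with a .bak suffix.
# $1 = old UUID, $2 = new UUID, $3 = path to the fstab file
replace_fstab_uuid() {
    sed -i.bak "s/UUID=$1/UUID=$2/" "$3"
}

# Example call (not executed here; the UUIDs and path are placeholders):
# replace_fstab_uuid 1111aaaa-old 2222bbbb-new /mnt/newroot/etc/fstab
```

Keeping the .bak copy makes it easy to revert if the wrong entry was changed.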
#22
Used
blkid
to find the UUID of the new drive and updated /etc/fstab.
#23
https://help.ubuntu.com/community/Grub2#Reinstalling%20GRUB2
https://help.ubuntu.com/community/Boot-Repair
sudo add-apt-repository ppa:yannubuntu/boot-repair
sudo apt-get update && sudo apt-get install -y boot-repair && boot-repair