Hard drives

From Elvanör's Technical Wiki

Latest revision as of 09:42, 9 March 2020

This article discusses the monitoring of hard drives, the best way to deal with hard drive failures, and general hard drive information. It also explains how to make a complete backup of your Gentoo system onto another hard drive.

Partitions

  • With the old MBR (BIOS) partition table format, it is impossible to create a partition larger than 2 TB. To create such a partition, you need to enable EFI GUID partition support in the kernel, then use GNU parted to create the partition (older versions of fdisk do not handle GPT):
parted /dev/sdb
mklabel gpt
mkpart primary ext4 0 -1
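
The 2 TB limit comes from the MBR format storing sector numbers in 32-bit fields. As a rough check, the sketch below computes the largest size this allows for a given sector size; the helper name mbr_max_bytes is made up for illustration.

```shell
# Largest size in bytes addressable through a 32-bit sector count.
# mbr_max_bytes is a hypothetical helper name, not a standard tool.
mbr_max_bytes() {
    echo $(( 4294967296 * $1 ))   # 2^32 sectors times the sector size
}

# Usage:
#   mbr_max_bytes 512     # 2199023255552 bytes, i.e. 2 TiB
#   mbr_max_bytes 4096    # 17592186044416 bytes, i.e. 16 TiB
```

The second call shows why a larger logical sector size pushes the MBR limit further out.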

Advanced notes

  • The GPT header lives in the second logical sector of the drive (LBA 1), right after the protective MBR, so its byte offset depends on the logical sector size. Most drives currently report a 512-byte logical sector size (the physical sector size is irrelevant here; the drive translates automatically). However, some controllers report a 4096-byte logical sector size, probably so that large drives can still be used with an MBR partition table on older OSes that don't support GPT. This is very problematic: the drive will work with such a controller (mine was in an external enclosure), but it won't be usable on another controller, since the logical sector size will change and the OS won't be able to find the partition table anymore!
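
To see which logical sector size a given controller reports, you can read the value the kernel exposes under sysfs. The sketch below does that and derives the byte offset at which the GPT header would be expected; the function names and the SYSFS_ROOT override are assumptions made for illustration and testing.

```shell
# Print the logical sector size the kernel sees for a block device (e.g. sdb).
# SYSFS_ROOT is a hypothetical override for testing; it defaults to /sys.
logical_sector_size() {
    cat "${SYSFS_ROOT:-/sys}/block/$1/queue/logical_block_size"
}

# The GPT header sits at logical block 1, so its byte offset equals
# the logical sector size reported by the controller.
gpt_header_offset() {
    echo $(( 1 * $(logical_sector_size "$1") ))
}

# Usage: gpt_header_offset sdb
```

If the same drive gives different results behind two controllers, a partition table written behind one will not be found behind the other.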

Filesystems

  • ext4 reserves a certain number of blocks for the super-user (5% by default). You can override that setting with tune2fs -m 1 /dev/sdd1 for example (this sets the reserved blocks to 1%). You can see the current parameters by running:
tune2fs -l /dev/sdd1
  • To reboot and check the filesystem (force a fsck), issue (the -F option is specific to the sysvinit version of shutdown):
shutdown -Fr now
  • Never check (fsck) a mounted filesystem.
  • To resize an ext4 filesystem (to the size of the underlying partition), issue:
resize2fs /dev/sda1

Growing the filesystem this way can be done while it is mounted; shrinking requires unmounting it first.
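
To check the reserved percentage actually in effect, the two relevant fields of the tune2fs -l output can be combined. This is a minimal sketch; reserved_percent is a hypothetical helper name, and it assumes the usual "Block count:" and "Reserved block count:" lines are present.

```shell
# Read `tune2fs -l` output on stdin and print the reserved-block percentage.
# reserved_percent is an illustrative name, not part of e2fsprogs.
reserved_percent() {
    awk -F: '
        /^Block count:/          { total = $2 }
        /^Reserved block count:/ { reserved = $2 }
        END { if (total > 0) printf "%.1f\n", 100 * reserved / total }
    '
}

# Usage (as root): tune2fs -l /dev/sdd1 | reserved_percent
```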

Monitoring your hard drives

Using SMART

  • Modern hard drives come with a self-monitoring system called SMART. It reports problems and can reportedly detect about two thirds of hard drive failures in advance. Under Linux, the smartmontools package can be used to diagnose your hard drives: run smartctl manually, or install the smartd daemon, which runs permanently and reports failures.
  • Unfortunately, SMART often does not work over USB connections, because many USB bridges do not pass the SMART commands through. With some enclosures, passing -d sat to smartctl helps.
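
A manual check with smartctl could look like the sketch below. The run and smart_check helpers are hypothetical names; by default they only print the commands, so nothing touches the drive until DRY_RUN=0 is set.

```shell
# Print a command (default) or run it; set DRY_RUN=0 to actually execute.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then echo "+ $*"; else "$@"; fi
}

# Basic SMART check of one drive, e.g. smart_check /dev/sda
smart_check() {
    run smartctl -H "$1"         # overall health verdict
    run smartctl -A "$1"         # vendor attributes (reallocated sectors, etc.)
    run smartctl -t short "$1"   # start a short self-test in the background
}

# Usage (as root): DRY_RUN=0 smart_check /dev/sda
```

The result of the self-test can be read later with smartctl -l selftest on the same device.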

Checking for bad blocks

  • You can use the badblocks binary for this. To run a read-only test (-b 4096 is needed for large drives of more than 4 TB):
badblocks -b 4096 -sv /dev/sdd
  • fsck can be combined with badblocks to map bad sectors out of the filesystem. However, the drive firmware should normally do that on its own. If there are too many bad sectors, the disk needs to be replaced.
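
The need for -b 4096 on large drives appears to come from badblocks counting blocks with a 32-bit number while using 1024-byte blocks by default. Under that assumption, the following sketch picks the smallest block size that keeps the block count below 2^32; badblocks_block_size is a made-up helper name.

```shell
# Pick a badblocks -b value for a device of the given size in bytes,
# assuming badblocks is limited to fewer than 2^32 blocks (an assumption).
badblocks_block_size() {
    for bs in 1024 2048 4096; do
        # Round the block count up so a partial last block is counted.
        if [ $(( ($1 + bs - 1) / bs )) -lt 4294967296 ]; then
            echo "$bs"
            return 0
        fi
    done
    echo "device too large for a 4096-byte block size" >&2
    return 1
}

# Usage (as root):
#   size=$(blockdev --getsize64 /dev/sdd)
#   badblocks -b "$(badblocks_block_size "$size")" -sv /dev/sdd
```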

Input / Output errors on files

  • If you have a persistent issue with the following error on a file (and the file cannot even be deleted):
cannot stat 'hobbies/comics/sample.jpg': Input/output error

This can be fixed by running fsck on the unmounted filesystem, then deleting the problematic files.
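
Before running fsck, it can help to know which files are affected. This is a minimal sketch, assuming a damaged file fails when read in full; list_unreadable is a hypothetical name. Files that cannot even be stat'ed should show up as errors printed by find itself.

```shell
# Print every regular file under a directory that cannot be read completely.
# list_unreadable is an illustrative name; reading every file can take long.
list_unreadable() {
    find "$1" -type f -exec sh -c '
        for f do
            cat -- "$f" >/dev/null 2>&1 || printf "%s\n" "$f"
        done' sh {} +
}

# Usage: list_unreadable hobbies/comics
```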

Backing up your data and system

Making a stage4

Under Linux everything is a file so it is easy to create a complete backup of your system. With Gentoo the best way is to generate a personal "stage4" file. Follow this guide on the Gentoo Wiki to do just that.

  • Warning: don't exclude "/usr/src/*" from the stage4 as stated by the guide! /usr/src contains the kernel source, used by several packages, and more importantly, it also contains the kernel's .config file. If you exclude /usr/src from your stage4, you'll have to re-edit your kernel configuration.
  • Be very careful about the /dev/null and /dev/console nodes. They are currently needed to boot, even with udev, so it is mandatory to create these nodes statically, in the "real" /dev of the new hard drive. If you just follow the guide's instructions, you'll create the nodes after /dev points to something else, so they won't reside in the real /dev directory of the hard drive. I advise creating them before chrooting and binding the /dev of the LiveCD.

Just copying everything

  • Copying everything using cp -ar will result in a working system (the -r is normally implied by -a so can be omitted). Just exclude the dev, proc, run, sys, and tmp directories. Even with a recent udev, you still need to create the /dev/null and /dev/console nodes in /dev, because they are needed *before* udev is started.
cd /
cp -a bin boot etc home lib lib64 media opt root sbin srv usr var /mnt/new-hard-drive # this is with an amd64 system without /lib32
  • When copying to a remote host, use rsync, such as:
rsync -ave ssh /bin /boot /etc /home /lib /lib64 /media /opt /root /sbin /srv /usr /var root@192.168.0.5:/home/ubuntu/gentoo # /home/ubuntu/gentoo is where the partition is mounted
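  • After copying a whole partition to another, if you want a bootable system, run the following (do this before any bind mount is set up for the chroot, so that the directories and nodes end up on the new drive itself; the paths are relative to the new drive's root):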
mkdir /dev /proc /run /sys /tmp
chmod a=rwx,o+t /tmp # you may need to chmod /tmp
cd /dev
mknod -m 660 console c 5 1 # you may need to fill /dev
mknod -m 660 null c 1 3
  • Once the previous copy is finished, you should perform the following tasks to obtain a fully working and bootable system:
    • edit /etc/conf.d/net with the correct information
    • run grub-install within the chroot
    • potentially change the DNS zone file and /etc/bind/named.conf file
  • The following command performs a byte-exact copy of a given partition, to a remote host:
dd if=/dev/sda2 bs=512 | ssh elvanor@192.168.0.1 "/bin/cat > /mnt/main/storage/laptop-image.img"
  • To restore it (run where the image file is accessible):
dd if=/mnt/main/storage/laptop-image.img of=/dev/sda2
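
After a byte-exact copy, it is worth confirming that the image really matches the partition. This is a minimal sketch using sha256sum; same_content is a hypothetical helper name, and it assumes both the partition and the image can be read from the same machine.

```shell
# Succeed when two files (or a partition and an image) have the same SHA-256 digest.
# same_content is an illustrative name, not a standard tool.
same_content() {
    a=$(sha256sum < "$1" | cut -d' ' -f1)
    b=$(sha256sum < "$2" | cut -d' ' -f1)
    # An empty digest means a file could not be opened: treat that as a mismatch.
    [ -n "$a" ] && [ "$a" = "$b" ]
}

# Usage (as root):
#   same_content /dev/sda2 /mnt/main/storage/laptop-image.img && echo "copy verified"
```

The partition should not be mounted read-write between the copy and the check, otherwise the digests may differ.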

Useful Links