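When you run LVM on production systems, make sure the LVM metadata is backed up. Back when I was at JD.com, we had a virtual machine whose disks used LVM; its qcow2 image file suddenly got corrupted, and nothing we tried could bring it back. That was a wake-up call: if you use LVM, always back up the volume layout information, otherwise there is nothing to fall back on when it breaks.
By default the most recent backup is kept in /etc/lvm/backup/, and the older versions are kept in /etc/lvm/archive/.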
Back up first
Before starting any recovery, always take a copy of the current state:
cp -pr /etc/lvm /etc/lvm_bkp
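One caveat with the plain cp above: if /etc/lvm_bkp already exists, cp -pr places the copy inside it as a subdirectory instead of replacing it. A minimal sketch of a timestamped copy that avoids this is shown below. The function name copy_lvm_config and the default destination /root are assumptions made for illustration, not part of LVM.

```shell
# copy_lvm_config is a hypothetical helper name, not an LVM command.
# Copies the LVM config directory to a new timestamped directory and
# prints the path of the copy.
copy_lvm_config() {
    src=${1:-/etc/lvm}          # directory to copy
    dest_root=${2:-/root}       # assumed destination root; pick your own
    dest="$dest_root/lvm_bkp_$(date +%Y%m%d-%H%M%S)"
    if [ -e "$dest" ]; then
        # Refuse to copy into an existing directory (cp would nest it).
        echo "destination already exists: $dest" >&2
        return 1
    fi
    cp -pr "$src" "$dest" && printf '%s\n' "$dest"
}

# Usage (as root):
#   copy_lvm_config /etc/lvm /root
```

Keeping the copy outside /etc also means it survives if /etc/lvm itself is later overwritten during the restore.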
Then look at what is available under archive:
ls -l /etc/lvm/archive/vg_storage_00*
-rw-------. 1 root root 13722 Oct 28 23:45 /etc/lvm/archive/vg_storage_00419-1760023262.vg
-rw-------. 1 root root 14571 Oct 28 23:52 /etc/lvm/archive/vg_storage_00420-94024216.vg
...
-rw-------. 1 root root 14749 Nov 23 15:11 /etc/lvm/archive/vg_storage_00676-394223172.vg
-rw-------. 1 root root 14733 Nov 23 15:29 /etc/lvm/archive/vg_storage_00677-187019982.vg
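With hundreds of archive files, picking the newest one by eye is error-prone. The sketch below selects it by the sequence number in the file name. It assumes the names follow the pattern seen in the listing, that is, the VG name, an underscore, a zero-padded 5-digit sequence number, a dash and a random suffix. latest_archive is a made-up helper name.

```shell
# latest_archive is a hypothetical helper name, not an LVM command.
# Assumes archive files are named <vg>_<5-digit seq>-<random>.vg,
# so a plain lexical sort orders them by sequence number.
latest_archive() {
    dir=$1
    vg=$2
    for f in "$dir"/"$vg"_[0-9][0-9][0-9][0-9][0-9]-*.vg; do
        # Skip the unexpanded pattern when nothing matches.
        [ -e "$f" ] && printf '%s\n' "$f"
    done | sort | tail -n 1
}

# Usage:
#   latest_archive /etc/lvm/archive vg_storage
```

The strict 5-digit pattern keeps a VG such as vg_storage_2 from being mistaken for vg_storage.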
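Worst case: restoring the PV
Warning: run this step only when the VG can no longer operate normally!!!
Start with the worst case. LVM is built up in the order pv -> vg -> lv, so suppose that even the pv (physical volume) is gone.
We pick the most recent backup, /etc/lvm/archive/vg_storage_00677-187019982.vg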
less /etc/lvm/archive/vg_storage_00677-187019982.vg
...
physical_volumes {
    pv0 {
        id = "BgR0KJ-JClh-T2gS-k6yK-9RGn-B8Ls-LYPQP0"
        ...
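Take the pv id from that file and recreate the pv. pvcreate also needs the target device as its last argument; /dev/sdX below is a placeholder for the disk or partition that originally held the PV.
pvcreate --uuid "BgR0KJ-JClh-T2gS-k6yK-9RGn-B8Ls-LYPQP0" \
  --restorefile /etc/lvm/archive/vg_storage_00677-187019982.vg \
  /dev/sdX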
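When a VG has several PVs, copying each id by hand gets tedious. The sketch below reads an archive file and prints (it does not run) one pvcreate command per PV for review. It assumes that inside each pv block the id line comes before the device line, and it uses the device value recorded in the file, which LVM itself marks as a hint only, so check every device name before running anything. print_pvcreate_cmds is a made-up helper name.

```shell
# print_pvcreate_cmds is a hypothetical helper name, not an LVM command.
# It only PRINTS commands; nothing is written to any disk.
print_pvcreate_cmds() {
    archive=$1
    awk -v archive="$archive" '
        # Start tracking when the physical_volumes section opens.
        /physical_volumes[ \t]*[{]/ { in_pv = 1; depth = 0 }
        in_pv {
            if ($1 == "id")     { uuid = $3; gsub(/"/, "", uuid) }
            if ($1 == "device") {
                dev = $3; gsub(/"/, "", dev)
                printf "pvcreate --uuid \"%s\" --restorefile %s %s\n", uuid, archive, dev
            }
            # Track brace depth so we stop at the end of the section
            # and never pick up ids from logical_volumes.
            line = $0
            depth += gsub(/[{]/, "", line) - gsub(/[}]/, "", line)
            if (depth <= 0) in_pv = 0
        }
    ' "$archive"
}

# Usage:
#   print_pvcreate_cmds /etc/lvm/archive/vg_storage_00677-187019982.vg
```

Printing instead of executing is deliberate: pvcreate on the wrong device destroys whatever is on it.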
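Restoring the VG
If the pv is still intact, the pv step above is not needed and the VG can be restored directly from a backup. First list what is available: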
vgcfgrestore --list vg1
File: /etc/lvm/archive/vg1_00000-1238318622.vg
VG name: vg1
Description: Created *before* executing 'vgcreate vg1 /dev/sda6'
Backup Time: Mon Feb 29 10:58:51 2016

File: /etc/lvm/archive/vg1_00001-285796155.vg
VG name: vg1
Description: Created *before* executing 'lvcreate -L 1G -n lv2 vg1'
Backup Time: Mon Feb 29 10:59:23 2016

File: /etc/lvm/archive/vg1_00002-1661997476.vg   ---> just before removal of volume (this is the archive we need)
VG name: vg1
Description: Created *before* executing 'lvremove /dev/vg1/lv2'
Backup Time: Mon Feb 29 13:55:08 2016

File: /etc/lvm/backup/vg1
VG name: vg1
Description: Created *after* executing 'lvremove /dev/vg1/lv2'
Backup Time: Mon Feb 29 13:55:08 2016
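The archive to restore is usually the one created before the command that caused the damage. A small sketch that filters the listing for it is shown below. It assumes the File: and Description: line layout shown above; archives_before is a made-up helper name.

```shell
# archives_before is a hypothetical helper name, not an LVM command.
# Reads `vgcfgrestore --list` output on stdin and prints the file of every
# archive whose description says it was created *before* the given command.
archives_before() {
    cmd=$1
    awk -v cmd="$cmd" '
        $1 == "File:" { file = $2 }
        $1 == "Description:" && index($0, "*before*") && index($0, cmd) { print file }
    '
}

# Usage:
#   vgcfgrestore --list vg1 | archives_before "lvremove /dev/vg1/lv2"
```

index() does a plain substring match, so the command text is not interpreted as a regular expression.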
The backups are there, so first do a dry run with --test:
vgcfgrestore vg01 --test -f /etc/lvm/archive/vg_data_00003-586203914.vg
  TEST MODE: Metadata will NOT be updated and volumes will not be (de)activated.
  Volume group vg01 has active volume: lv001.
  WARNING: Found 1 active volume(s) in volume group "vg01".
  Restoring VG with active LVs, may cause mismatch with its metadata.
Do you really want to proceed with restore of volume group "vg01", while 1 volume(s) are active? [y/n]: y
  Restored volume group vg01.
No problems reported, so go ahead and run it for real:
vgcfgrestore vg01 -f /etc/lvm/archive/vg_data_00003-586203914.vg
  Volume group vg01 has active volume: lv001.
  WARNING: Found 1 active volume(s) in volume group "vg01".
  Restoring VG with active LVs, may cause mismatch with its metadata.
Do you really want to proceed with restore of volume group "vg01", while 1 volume(s) are active? [y/n]: y
  Restored volume group vg01.
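The dry run and the real run can be tied together so that the real restore never happens after a failed test. This is a minimal sketch, assuming vgcfgrestore returns a non-zero exit status when the --test run fails; restore_vg_checked is a made-up helper name.

```shell
# restore_vg_checked is a hypothetical helper name, not an LVM command.
# Runs vgcfgrestore in --test mode first and only performs the real
# restore if the dry run exits successfully.
restore_vg_checked() {
    vg=$1
    file=$2
    if vgcfgrestore --test -f "$file" "$vg"; then
        vgcfgrestore -f "$file" "$vg"
    else
        echo "dry run failed for $vg; metadata left untouched" >&2
        return 1
    fi
}

# Usage (as root):
#   restore_vg_checked vg01 /etc/lvm/archive/vg_data_00003-586203914.vg
```

If the VG still has active volumes, both runs will ask for confirmation, as in the output above.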
Verify after the restore
# Show the VG information
vgdisplay vg1
# Activate the VG
vgchange -ay vg1
# Show the logical volumes
lvs -a -o +devices
# Rescan
lvscan
  inactive '/dev/vg1/lv2' [1.00 GiB] inherit   ### it is in the inactive state; activate it before use
  ACTIVE '/dev/vg0/lv1' [1.00 GiB] inherit
# If any volume above is inactive, activate it
lvchange -a y /dev/vg1/lv2
# Show the logical volumes again
lvs -a -o +devices
# Scan again
lvscan
  ACTIVE '/dev/vg1/lv2' [1.00 GiB] inherit
  ACTIVE '/dev/vg0/lv1' [1.00 GiB] inherit
# Mount it as a test
mount /dev/vg1/lv2 /lv2
# Check that the data is still there
ls -lh /lv2
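With many logical volumes, spotting the inactive ones in lvscan output by eye is easy to get wrong. The sketch below pulls their device paths out of the output so they can be activated in a loop. It assumes the lvscan line format shown above; inactive_lvs is a made-up helper name.

```shell
# inactive_lvs is a hypothetical helper name, not an LVM command.
# Reads lvscan output on stdin and prints the device path of each
# logical volume whose state is "inactive".
inactive_lvs() {
    tr -d "'" | awk '
        $1 == "inactive" {
            # The path is the first field that starts with /dev/.
            for (i = 2; i <= NF; i++)
                if ($i ~ /^\/dev\//) { print $i; break }
        }
    '
}

# Usage (as root):
#   lvscan | inactive_lvs | while read -r lv; do lvchange -a y "$lv"; done
```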
Backup and restore scripts
This script backs up every volume in a VG, writing one compressed snapshot file per volume: backup_snapshot_lvm.pl
#!/usr/bin/perl -w
#
# Run through a particular LVM volume group and perform a snapshot
# and compressed file backup of each volume.
#
# This script is intended for use with backing up complete system
# images of VMs, in addition to data level backups.
#

my $source_lvm_volgroup = 'vg_storage';
my $source_lvm_snapsize = '5G';
my @source_lvm_excludes = ('lv_unwanted', 'lv_tmpfiles');

my $dest_dir = '/mnt/backup/snapshots';

foreach my $volume (glob("/dev/$source_lvm_volgroup/*"))
{
    $volume =~ /\/dev\/$source_lvm_volgroup\/(\S*)$/;
    my $volume_short = $1;

    # Skip any volume named in @source_lvm_excludes
    if (grep { $_ eq $volume_short } @source_lvm_excludes)
    {
        # Excluded volume, we skip it
        print "[info] Skipping excluded volume $volume_short ($volume)\n";
        next;
    }

    print "[info] Processing volume $volume_short ($volume)\n";

    # Snapshot volume
    print "[info] Creating snapshot...\n";
    system("lvcreate -n ${volume_short}_snapshot --snapshot $volume -L $source_lvm_snapsize");

    # Write compressed backup file from snapshot, but only replace existing one once finished
    print "[info] Creating compressed snapshot file...\n";
    system("dd if=${volume}_snapshot | gzip --fast > $dest_dir/$volume_short.temp.gz");
    system("mv $dest_dir/$volume_short.temp.gz $dest_dir/$volume_short.gz");

    # Delete snapshot
    print "[info] Removing snapshot...\n";
    system("lvremove --force ${volume}_snapshot");

    print "[info] Volume $volume_short backup completed.\n";
}

print "[info] FINISHED! VolumeGroup backup completed\n";

exit 0;
The restore script, restore_snapshot_lvm.pl:
#!/usr/bin/perl -w
#
# Runs through LVM snapshots taken by backup_snapshot_lvm.pl and
# restores them to the LVM volume in question.
#

my $dest_lvm_volgroup = 'vg_storage';
my $source_dir = '/mnt/backup/snapshots';

print "[WARNING] Beginning restore process in 5 SECONDS!!\n";
sleep(5);

foreach my $volume (glob("$source_dir/*.gz"))
{
    # Skip anything that does not look like <volume>.gz
    next unless $volume =~ /\Q$source_dir\E\/(\S*)\.gz$/;
    my $volume_short = $1;

    print "[info] Processing volume $volume_short ($volume)\n";

    # Just need to decompress & write into LVM volume
    system("zcat $source_dir/$volume_short.gz > /dev/$dest_lvm_volgroup/$volume_short");

    print "[info] Volume $volume_short restore completed.\n";
}

print "[info] FINISHED! VolumeGroup restore completed\n";

exit 0;
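The restore script writes whatever it finds in the backup directory straight onto the volumes, so it may be worth checking the files first. The sketch below tests each .gz file with gzip -t and skips leftover .temp.gz files from an interrupted backup run. check_backups is a made-up helper name, and the check only confirms that the gzip stream is intact, not that the image inside is the one you want.

```shell
# check_backups is a hypothetical helper name, not part of the scripts above.
# Tests every .gz file in a directory with `gzip -t` and returns non-zero
# if any of them is corrupt.
check_backups() {
    dir=$1
    bad=0
    for f in "$dir"/*.gz; do
        [ -e "$f" ] || continue          # no .gz files at all
        case $f in
            *.temp.gz)
                # Left behind by an interrupted backup run; never restore it.
                echo "skip (incomplete): $f"
                continue
                ;;
        esac
        if gzip -t "$f" 2>/dev/null; then
            echo "ok: $f"
        else
            echo "CORRUPT: $f"
            bad=1
        fi
    done
    return "$bad"
}

# Usage:
#   check_backups /mnt/backup/snapshots && perl restore_snapshot_lvm.pl
```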
Keep in mind that this is a backup at the block-device level; it is best to also keep a separate backup of the data itself.