Allen/DRBD feature and failed-disk rebuild test

Created Tue, 30 Jan 2024 17:08:28 +0800 Modified Tue, 30 Jan 2024 17:21:08 +0800

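DRBD is a kernel-based feature that provides block-level high availability across two nodes. This post records a basic functional test.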

DRBD functional test notes

Environment

  • OS: Debian 12.4 (bookworm), kernel 6.1.0-17-amd64
  • The setup uses two Linux virtual machines, configured as follows:
Hostname  IP               CPU/Memory  Data disk
node1     192.168.230.170  2C4G        1 x 100G
node2     192.168.230.171  2C4G        1 x 100G

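Installation and configuration

  • Run the following steps on both nodes.
  1. Install dependencies with apt
# List the available DRBD packages, then install drbd-utils with apt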
root@debian12:~# apt list | grep -i drbd

WARNING: apt does not have a stable CLI interface. Use with caution in scripts.

drbd-doc/stable 8.4~20220106-1 all
drbd-utils/stable 9.22.0-1 amd64
root@debian12:~# apt install drbd-utils
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
The following additional packages will be installed:
 bsd-mailx exim4-base exim4-config exim4-daemon-light libgnutls-dane0 liblockfile1 libunbound8
Suggested packages:
 heartbeat exim4-doc-html | exim4-doc-info eximon4 spf-tools-perl swaks
The following NEW packages will be installed:
 bsd-mailx drbd-utils exim4-base exim4-config exim4-daemon-light libgnutls-dane0 liblockfile1 libunbound8
0 upgraded, 8 newly installed, 0 to remove and 0 not upgraded.
Need to get 3,769 kB of archives.
After this operation, 8,346 kB of additional disk space will be used.
Do you want to continue? [Y/n] Y
Get:1 http://mirrors.huaweicloud.com/debian bookworm/main amd64 exim4-config all 4.96-15+deb12u3 [256 kB]
Get:2 http://mirrors.huaweicloud.com/debian bookworm/main amd64 exim4-base amd64 4.96-15+deb12u3 [1,117 kB]
Get:3 http://mirrors.huaweicloud.com/debian bookworm/main amd64 libunbound8 amd64 1.17.1-2+deb12u1 [548 kB]
Get:4 http://mirrors.huaweicloud.com/debian bookworm/main amd64 libgnutls-dane0 amd64 3.7.9-2+deb12u1 [406 kB]
Get:5 http://mirrors.huaweicloud.com/debian bookworm/main amd64 exim4-daemon-light amd64 4.96-15+deb12u3 [605 kB]
Get:6 http://mirrors.huaweicloud.com/debian bookworm/main amd64 liblockfile1 amd64 1.17-1+b1 [17.0 kB]
Get:7 http://mirrors.huaweicloud.com/debian bookworm/main amd64 bsd-mailx amd64 8.1.2-0.20220412cvs-1 [90.4 kB]
Get:8 http://mirrors.huaweicloud.com/debian bookworm/main amd64 drbd-utils amd64 9.22.0-1 [731 kB]
Fetched 3,769 kB in 3s (1,404 kB/s)
Preconfiguring packages ...
Selecting previously unselected package exim4-config.
(Reading database ... 153805 files and directories currently installed.)
Preparing to unpack .../0-exim4-config_4.96-15+deb12u3_all.deb ...
Unpacking exim4-config (4.96-15+deb12u3) ...
Selecting previously unselected package exim4-base.
Preparing to unpack .../1-exim4-base_4.96-15+deb12u3_amd64.deb ...
Unpacking exim4-base (4.96-15+deb12u3) ...
Selecting previously unselected package libunbound8:amd64.
Preparing to unpack .../2-libunbound8_1.17.1-2+deb12u1_amd64.deb ...
Unpacking libunbound8:amd64 (1.17.1-2+deb12u1) ...
Selecting previously unselected package libgnutls-dane0:amd64.
Preparing to unpack .../3-libgnutls-dane0_3.7.9-2+deb12u1_amd64.deb ...
Unpacking libgnutls-dane0:amd64 (3.7.9-2+deb12u1) ...
Selecting previously unselected package exim4-daemon-light.
Preparing to unpack .../4-exim4-daemon-light_4.96-15+deb12u3_amd64.deb ...
Unpacking exim4-daemon-light (4.96-15+deb12u3) ...
Selecting previously unselected package liblockfile1:amd64.
Preparing to unpack .../5-liblockfile1_1.17-1+b1_amd64.deb ...
Unpacking liblockfile1:amd64 (1.17-1+b1) ...
Selecting previously unselected package bsd-mailx.
Preparing to unpack .../6-bsd-mailx_8.1.2-0.20220412cvs-1_amd64.deb ...
Unpacking bsd-mailx (8.1.2-0.20220412cvs-1) ...
Selecting previously unselected package drbd-utils.
Preparing to unpack .../7-drbd-utils_9.22.0-1_amd64.deb ...
Unpacking drbd-utils (9.22.0-1) ...
Setting up drbd-utils (9.22.0-1) ...
Setting up libunbound8:amd64 (1.17.1-2+deb12u1) ...
Setting up exim4-config (4.96-15+deb12u3) ...
Adding system-user for exim (v4)
Setting up liblockfile1:amd64 (1.17-1+b1) ...
Setting up libgnutls-dane0:amd64 (3.7.9-2+deb12u1) ...
Setting up exim4-base (4.96-15+deb12u3) ...
exim: DB upgrade, deleting hints-db
Created symlink /etc/systemd/system/timers.target.wants/exim4-base.timer → /lib/systemd/system/exim4-base.timer.
exim4-base.service is a disabled or a static unit, not starting it.
Setting up exim4-daemon-light (4.96-15+deb12u3) ...
Setting up bsd-mailx (8.1.2-0.20220412cvs-1) ...
update-alternatives: using /usr/bin/bsd-mailx to provide /usr/bin/mailx (mailx) in auto mode
Processing triggers for man-db (2.11.2-2) ...
Processing triggers for libc-bin (2.36-9+deb12u3) ...
  2. Edit the configuration file /etc/drbd.d/drbd-demo.res
# The file must exist on both nodes, with identical content
root@debian12:~# cat /etc/drbd.d/drbd-demo.res
resource drbddemo {
  meta-disk internal;
  device /dev/drbd1;
  syncer {
    verify-alg sha1;
  }
  net {
    allow-two-primaries;
  }
  on node1 {
    disk /dev/sdb;
    address 192.168.230.170:7789;
  }
  on node2 {
    disk /dev/sdb;
    address 192.168.230.171:7789;
  }
}
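  3. Set the hostname
# The two VMs were cloned, so each one needs a distinct hostname before DRBD is started; set the hostname on each node according to its IP
hostnamectl hostname node1   # on 192.168.230.170
hostnamectl hostname node2   # on 192.168.230.171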
  4. Initialize the resource
# Run drbdadm create-md drbddemo to initialize the DRBD metadata
root@node1:~# drbdadm create-md drbddemo
initializing activity log
initializing bitmap (3200 KB) to all zero
Writing meta data...
New drbd meta data block successfully created.
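
The 3200 KB bitmap that create-md reports can be checked by hand. Below is a rough sketch of the arithmetic, assuming the internal metadata layout that DRBD 8.4 appears to use here: one bitmap bit per 4 KiB of data, the bitmap rounded up to a 4 KiB boundary, a 32 KiB activity log and a 4 KiB metadata block at the end of the disk. The function names bitmap_kib and usable_kib are made up for this note.

```shell
#!/bin/sh
# Hypothetical helpers: estimate the DRBD 8.4 internal metadata layout.
# Assumptions (not taken from DRBD source): 1 bitmap bit per 4 KiB of data,
# bitmap rounded up to 4 KiB, 32 KiB activity log, 4 KiB metadata block.

bitmap_kib() {
    dev_kib=$1
    bits=$(( (dev_kib + 3) / 4 ))          # one bit per 4 KiB block
    bytes=$(( (bits + 7) / 8 ))            # bits -> bytes, rounded up
    echo $(( (bytes + 4095) / 4096 * 4 ))  # round up to 4 KiB, result in KiB
}

usable_kib() {
    dev_kib=$1
    # Subtract bitmap, activity log (32 KiB) and metadata block (4 KiB).
    echo $(( dev_kib - $(bitmap_kib "$dev_kib") - 32 - 4 ))
}

dev_kib=$(( 100 * 1024 * 1024 ))   # the 100G data disk from the environment table
echo "bitmap: $(bitmap_kib "$dev_kib") KB"
echo "usable: $(usable_kib "$dev_kib") KB"
```

For the 100G disk this prints a 3200 KB bitmap and 104854364 KB of usable space, which agrees with the oos:104854364 value that /proc/drbd shows before the first sync.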

Starting the service

  1. Start the service on node1 (standby node)
root@node1:~# modprobe drbd
root@node1:~# cat /proc/drbd
version: 8.4.11 (api:1/proto:86-101)
srcversion: 488C1124B879DCE7CD031DA
root@node1:~# drbd
drbdadm    drbdmeta   drbdmon    drbdsetup
root@node1:~# drbdadm up drbddemo
root@node1:~# cat /proc/drbd
version: 8.4.11 (api:1/proto:86-101)
srcversion: 488C1124B879DCE7CD031DA

 1: cs:WFConnection ro:Secondary/Unknown ds:Inconsistent/DUnknown C r----s
    ns:0 nr:0 dw:0 dr:0 al:8 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:104854364
  2. Start the service on node2 (primary node)
root@node2:~# modprobe drbd
root@node2:~# drbdadm up drbddemo
root@node2:~# drbdadm -- --overwrite-data-of-peer primary drbddemo
root@node2:~# cat /proc/drbd
version: 8.4.11 (api:1/proto:86-101)
srcversion: 488C1124B879DCE7CD031DA

 1: cs:SyncSource ro:Primary/Secondary ds:UpToDate/Inconsistent C r-----
    ns:15464 nr:0 dw:0 dr:15464 al:8 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:104838900
        [>....................] sync'ed:  0.1% (102380/102396)M
        finish: 11:12:02 speed: 2,576 (2,576) K/sec
  3. Check the sync status from either node
# watch cat /proc/drbd

version: 8.4.11 (api:1/proto:86-101)
srcversion: 488C1124B879DCE7CD031DA

 1: cs:SyncSource ro:Primary/Secondary ds:UpToDate/Inconsistent C r-----
    ns:151948 nr:0 dw:0 dr:153216 al:8 bm:0 lo:0 pe:4 ua:3 ap:0 ep:1 wo:f oos:104702708
        [>....................] sync'ed:  0.2% (102248/102396)M
        finish: 3:49:36 speed: 7,580 (7,580) K/sec

# drbdadm status drbddemo
root@node1:~# drbdadm status drbddemo
drbddemo role:Secondary
  disk:UpToDate
  peer role:Primary
    replication:Established peer-disk:UpToDate
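
Reading /proc/drbd by eye gets tedious during a long sync. The following is a small sketch, not an official tool, that pulls the connection state, roles, disk states and out-of-sync size out of a DRBD 8.4 style /proc/drbd dump, and derives a naive time estimate as oos divided by the current speed. The function name drbd_summary and the output format are invented for illustration, and it only handles a single device.

```shell
#!/bin/sh
# Hypothetical helper: summarize a DRBD 8.4 /proc/drbd dump read from stdin.
drbd_summary() {
    awk '
        / cs:/ {
            # Status line, e.g. " 1: cs:SyncSource ro:Primary/Secondary ds:..."
            for (i = 1; i <= NF; i++) {
                if ($i ~ /^cs:/) cs = substr($i, 4)
                if ($i ~ /^ro:/) ro = substr($i, 4)
                if ($i ~ /^ds:/) ds = substr($i, 4)
            }
        }
        /oos:/ {
            # Counter line; oos is the out-of-sync amount in KB.
            for (i = 1; i <= NF; i++) if ($i ~ /^oos:/) oos = substr($i, 5)
        }
        /speed:/ {
            # Progress line; the value after "speed:" is the current rate in K/sec.
            for (i = 1; i <= NF; i++) if ($i == "speed:") { spd = $(i + 1); gsub(/,/, "", spd) }
        }
        END {
            printf "cs=%s ro=%s ds=%s oos_kb=%s", cs, ro, ds, oos
            if (spd + 0 > 0) printf " eta_s=%d", oos / spd   # naive estimate
            printf "\n"
        }
    '
}

# Sample taken from the sync status shown above.
drbd_summary <<'EOF'
 1: cs:SyncSource ro:Primary/Secondary ds:UpToDate/Inconsistent C r-----
    ns:151948 nr:0 dw:0 dr:153216 al:8 bm:0 lo:0 pe:4 ua:3 ap:0 ep:1 wo:f oos:104702708
        [>....................] sync'ed:  0.2% (102248/102396)M
        finish: 3:49:36 speed: 7,580 (7,580) K/sec
EOF
```

On a live node it would be called as drbd_summary < /proc/drbd. For the sample above the estimate is 13813 seconds, roughly 3 h 50 min, close to the 3:49:36 that DRBD itself reports.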

Disk failure test

  1. Shut down both nodes and power them back on, then format the disk and write test data.

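    • On node1, systemctl start drbd.service hung with no response, probably because the init script waits for the peer to connect. Once the same start command was run on node2, the DRBD service came up normally on both nodes.
    • Service check: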
    root@node1:~# lsblk
    NAME                    MAJ:MIN RM  SIZE RO TYPE MOUNTPOINTS
    sda                       8:0    0   50G  0 disk
    ├─sda1                    8:1    0  487M  0 part /boot
    ├─sda2                    8:2    0    1K  0 part
    └─sda5                    8:5    0 49.5G  0 part
      ├─debian12--vg-root   254:0    0 16.8G  0 lvm  /
      ├─debian12--vg-swap_1 254:1    0  976M  0 lvm  [SWAP]
      └─debian12--vg-home   254:2    0 31.8G  0 lvm  /home
    sdb                       8:16   0  100G  0 disk
    sr0                      11:0    1 1024M  0 rom
    
    root@node1:~# cat /proc/drbd
    version: 8.4.11 (api:1/proto:86-101)
    srcversion: 488C1124B879DCE7CD031DA
    
    1: cs:Connected ro:Secondary/Secondary ds:UpToDate/UpToDate C r-----
        ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
    
    root@node1:~# lsblk
    NAME                    MAJ:MIN RM  SIZE RO TYPE MOUNTPOINTS
    sda                       8:0    0   50G  0 disk
    ├─sda1                    8:1    0  487M  0 part /boot
    ├─sda2                    8:2    0    1K  0 part
    └─sda5                    8:5    0 49.5G  0 part
      ├─debian12--vg-root   254:0    0 16.8G  0 lvm  /
      ├─debian12--vg-swap_1 254:1    0  976M  0 lvm  [SWAP]
      └─debian12--vg-home   254:2    0 31.8G  0 lvm  /home
    sdb                       8:16   0  100G  0 disk
    └─drbd1                 147:1    0  100G  1 disk
    sr0                      11:0    1 1024M  0 rom
    
    • Trying to partition the device right away fails:
    root@node1:~# parted /dev/drbd
    drbd/  drbd1
    root@node1:~# parted /dev/drbd1
    Error: Error opening /dev/drbd1: Wrong medium type
    Retry/Cancel? R
    Error: Error opening /dev/drbd1: Wrong medium type
    Retry/Cancel? ^C
    
    • The dmesg log shows that the block device cannot be opened:
    [Tue Jan 30 15:53:58 2024] drbd drbddemo: Starting worker thread (from drbdsetup-84 [1984])
    [Tue Jan 30 15:53:58 2024] block drbd1: disk( Diskless -> Attaching )
    [Tue Jan 30 15:53:58 2024] drbd drbddemo: Method to ensure write ordering: flush
    [Tue Jan 30 15:53:58 2024] block drbd1: max BIO size = 1048576
    [Tue Jan 30 15:53:58 2024] block drbd1: drbd_bm_resize called with capacity == 209708728
    [Tue Jan 30 15:53:58 2024] block drbd1: resync bitmap: bits=26213591 words=409588 pages=800
    [Tue Jan 30 15:53:58 2024] drbd1: detected capacity change from 0 to 209708728
    [Tue Jan 30 15:53:58 2024] block drbd1: size = 100 GB (104854364 KB)
    [Tue Jan 30 15:53:58 2024] block drbd1: recounting of set bits took additional 0 jiffies
    [Tue Jan 30 15:53:58 2024] block drbd1: 0 KB (0 bits) marked out-of-sync by on disk bit-map.
    [Tue Jan 30 15:53:58 2024] block drbd1: disk( Attaching -> UpToDate )
    [Tue Jan 30 15:53:58 2024] block drbd1: attached to UUIDs C9F197865BEA00EB:0000000000000000:1DAF0182D0940332:0000000000000004
    [Tue Jan 30 15:53:58 2024] drbd drbddemo: conn( StandAlone -> Unconnected )
    [Tue Jan 30 15:53:58 2024] drbd drbddemo: Starting receiver thread (from drbd_w_drbddemo [1985])
    [Tue Jan 30 15:53:58 2024] drbd drbddemo: receiver (re)started
    [Tue Jan 30 15:53:58 2024] drbd drbddemo: conn( Unconnected -> WFConnection )
    [Tue Jan 30 15:53:59 2024] drbd drbddemo: Handshake successful: Agreed network protocol version 101
    [Tue Jan 30 15:53:59 2024] drbd drbddemo: Feature flags enabled on protocol level: 0xf TRIM THIN_RESYNC WRITE_SAME WRITE_ZEROES.
    [Tue Jan 30 15:53:59 2024] drbd drbddemo: conn( WFConnection -> WFReportParams )
    [Tue Jan 30 15:53:59 2024] drbd drbddemo: Starting ack_recv thread (from drbd_r_drbddemo [1990])
    [Tue Jan 30 15:53:59 2024] block drbd1: drbd_sync_handshake:
    [Tue Jan 30 15:53:59 2024] block drbd1: self C9F197865BEA00EA:0000000000000000:1DAF0182D0940332:0000000000000004 bits:0 flags:0
    [Tue Jan 30 15:53:59 2024] block drbd1: peer C9F197865BEA00EA:0000000000000000:1DAF0182D0940332:0000000000000004 bits:0 flags:0
    [Tue Jan 30 15:53:59 2024] block drbd1: uuid_compare()=1 by rule 40
    [Tue Jan 30 15:53:59 2024] block drbd1: peer( Unknown -> Secondary ) conn( WFReportParams -> WFBitMapS ) pdsk( DUnknown -> Consistent )
    [Tue Jan 30 15:53:59 2024] block drbd1: send bitmap stats [Bytes(packets)]: plain 0(0), RLE 23(1), total 23; compression: 100.0%
    [Tue Jan 30 15:53:59 2024] block drbd1: receive bitmap stats [Bytes(packets)]: plain 0(0), RLE 23(1), total 23; compression: 100.0%
    [Tue Jan 30 15:53:59 2024] block drbd1: helper command: /sbin/drbdadm before-resync-source minor-1
    [Tue Jan 30 15:53:59 2024] block drbd1: helper command: /sbin/drbdadm before-resync-source minor-1 exit code 0 (0x0)
    [Tue Jan 30 15:53:59 2024] block drbd1: conn( WFBitMapS -> SyncSource ) pdsk( Consistent -> Inconsistent )
    [Tue Jan 30 15:53:59 2024] block drbd1: Began resync as SyncSource (will sync 0 KB [0 bits set]).
    [Tue Jan 30 15:53:59 2024] block drbd1: updated sync UUID C9F197865BEA00EA:8094DE27947D76A2:1DAF0182D0940332:0000000000000004
    [Tue Jan 30 15:53:59 2024] block drbd1: Resync done (total 1 sec; paused 0 sec; 0 K/sec)
    [Tue Jan 30 15:53:59 2024] block drbd1: updated UUIDs C9F197865BEA00EA:0000000000000000:8094DE27947D76A2:1DAF0182D0940332
    [Tue Jan 30 15:53:59 2024] block drbd1: conn( SyncSource -> Connected ) pdsk( Inconsistent -> UpToDate )
    [Tue Jan 30 15:59:22 2024] block drbd1: peer( Secondary -> Primary )
    [Tue Jan 30 16:05:44 2024] /dev/drbd1: Can't open blockdev
    [Tue Jan 30 16:05:44 2024] /dev/drbd1: Can't open blockdev
    
    • According to what I found, the resource has to be promoted to primary before the block device can be used:
    root@node1:~# drbdadm primary drbddemo
    root@node1:~# drbdadm role drbddemo
    Primary/Secondary
    root@node1:~# cat /proc/drbd
    version: 8.4.11 (api:1/proto:86-101)
    srcversion: 488C1124B879DCE7CD031DA
    
    1: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----
        ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
    
    root@node1:~# mkfs.ext4 /dev/drbd1
    mke2fs 1.47.0 (5-Feb-2023)
    Found a gpt partition table in /dev/drbd1
    Proceed anyway? (y,N) y
    Creating filesystem with 26213591 4k blocks and 6553600 inodes
    Filesystem UUID: 2918de26-0c2b-47bb-8a16-d1e7bffdd551
    Superblock backups stored on blocks:
            32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208,
            4096000, 7962624, 11239424, 20480000, 23887872
    
    Allocating group tables: done
    Writing inode tables: done
    Creating journal (131072 blocks): done
    Writing superblocks and filesystem accounting information: done
    
    root@node1:~# parted /dev/drbd1
    GNU Parted 3.5
    Using /dev/drbd1
    Welcome to GNU Parted! Type 'help' to view a list of commands.
    (parted) p
    Model: Unknown (unknown)
    Disk /dev/drbd1: 107GB
    Sector size (logical/physical): 512B/512B
    Partition Table: loop
    Disk Flags:
    
    Number  Start  End    Size   File system  Flags
    1      0.00B  107GB  107GB  ext4
    
    (parted) lsbl
      align-check TYPE N                       check partition N for TYPE(min|opt) alignment
      help [COMMAND]                           print general help, or help on COMMAND
      mklabel,mktable LABEL-TYPE               create a new disklabel (partition table)
      mkpart PART-TYPE [FS-TYPE] START END     make a partition
      name NUMBER NAME                         name partition NUMBER as NAME
      print [devices|free|list,all]            display the partition table, or available devices, or free space, or all found partitions
      quit                                     exit program
      rescue START END                         rescue a lost partition near START and END
      resizepart NUMBER END                    resize partition NUMBER
      rm NUMBER                                delete partition NUMBER
      select DEVICE                            choose the device to edit
      disk_set FLAG STATE                      change the FLAG on selected device
      disk_toggle [FLAG]                       toggle the state of FLAG on selected device
      set NUMBER FLAG STATE                    change the FLAG on partition NUMBER
      toggle [NUMBER [FLAG]]                   toggle the state of FLAG on partition NUMBER
      unit UNIT                                set the default unit to UNIT
      version                                  display the version number and copyright information of GNU Parted
    (parted) q
    root@node1:~# lsblk
    NAME                    MAJ:MIN RM  SIZE RO TYPE MOUNTPOINTS
    sda                       8:0    0   50G  0 disk
    ├─sda1                    8:1    0  487M  0 part /boot
    ├─sda2                    8:2    0    1K  0 part
    └─sda5                    8:5    0 49.5G  0 part
      ├─debian12--vg-root   254:0    0 16.8G  0 lvm  /
      ├─debian12--vg-swap_1 254:1    0  976M  0 lvm  [SWAP]
      └─debian12--vg-home   254:2    0 31.8G  0 lvm  /home
    sdb                       8:16   0  100G  0 disk
    └─drbd1                 147:1    0  100G  0 disk
    sr0                      11:0    1 1024M  0 rom
    root@node1:~# mount /dev/drbd1 /mnt/drbd_test
    root@node1:~# df -h
    Filesystem                     Size  Used Avail Use% Mounted on
    udev                           1.9G     0  1.9G   0% /dev
    tmpfs                          389M  1.3M  387M   1% /run
    /dev/mapper/debian12--vg-root   17G  5.0G   11G  32% /
    tmpfs                          1.9G     0  1.9G   0% /dev/shm
    tmpfs                          5.0M  8.0K  5.0M   1% /run/lock
    /dev/sda1                      455M  116M  314M  27% /boot
    /dev/mapper/debian12--vg-home   32G   23M   30G   1% /home
    tmpfs                          389M   64K  389M   1% /run/user/112
    tmpfs                          389M   48K  389M   1% /run/user/0
    /dev/drbd1                      98G   24K   93G   1% /mnt/drbd_test
    
    • Promote node2 to primary as well and write test data:
    root@node2:~# drbdadm primary drbddemo
    root@node2:~# drbdadm role drbddemo
    Primary/Primary
    root@node2:~# fdisk -l /dev/drbd1
    Disk /dev/drbd1: 100 GiB, 107370868736 bytes, 209708728 sectors
    Units: sectors of 1 * 512 = 512 bytes
    Sector size (logical/physical): 512 bytes / 512 bytes
    I/O size (minimum/optimal): 512 bytes / 512 bytes
    root@node2:~# mount /dev/drbd
    drbd/  drbd1
    root@node2:~# mount /dev/drbd
    drbd/  drbd1
    root@node2:~# mount /dev/drbd1 /mnt/drbd_test
    root@node2:~# df -h
    Filesystem                     Size  Used Avail Use% Mounted on
    udev                           1.9G     0  1.9G   0% /dev
    tmpfs                          389M  1.3M  387M   1% /run
    /dev/mapper/debian12--vg-root   17G  5.0G   11G  32% /
    tmpfs                          1.9G     0  1.9G   0% /dev/shm
    tmpfs                          5.0M  8.0K  5.0M   1% /run/lock
    /dev/sda1                      455M  116M  314M  27% /boot
    /dev/mapper/debian12--vg-home   32G   23M   30G   1% /home
    tmpfs                          389M   64K  389M   1% /run/user/112
    tmpfs                          389M   48K  389M   1% /run/user/0
    /dev/drbd1                      98G   24K   93G   1% /mnt/drbd_test
    root@node2:~# echo "hello world" > /mnt/drbd_test/test.file
    root@node2:~# ls -l /mnt/drbd_test/
    total 20
    drwx------ 2 root root 16384 Jan 30 16:03 lost+found
    -rw-r--r-- 1 root root    12 Jan 30 16:07 test.file
    root@node2:~# umount /mnt/drbd_test
    
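    • At this point, listing the directory on node1 does not show the test file written on node2. The file only appears after the filesystem has been unmounted on both nodes and mounted again: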
    root@node1:~# ls -l /mnt/drbd_test/
    total 16
    drwx------ 2 root root 16384 Jan 30 16:03 lost+found
    root@node1:~# umount /mnt/drbd_test/
    root@node1:~# ls -l
    total 0
    root@node1:~# df -h
    Filesystem                     Size  Used Avail Use% Mounted on
    udev                           1.9G     0  1.9G   0% /dev
    tmpfs                          389M  1.3M  387M   1% /run
    /dev/mapper/debian12--vg-root   17G  5.0G   11G  32% /
    tmpfs                          1.9G     0  1.9G   0% /dev/shm
    tmpfs                          5.0M  8.0K  5.0M   1% /run/lock
    /dev/sda1                      455M  116M  314M  27% /boot
    /dev/mapper/debian12--vg-home   32G   23M   30G   1% /home
    tmpfs                          389M   64K  389M   1% /run/user/112
    tmpfs                          389M   48K  389M   1% /run/user/0
    root@node1:~# lsblk
    NAME                    MAJ:MIN RM  SIZE RO TYPE MOUNTPOINTS
    sda                       8:0    0   50G  0 disk
    ├─sda1                    8:1    0  487M  0 part /boot
    ├─sda2                    8:2    0    1K  0 part
    └─sda5                    8:5    0 49.5G  0 part
      ├─debian12--vg-root   254:0    0 16.8G  0 lvm  /
      ├─debian12--vg-swap_1 254:1    0  976M  0 lvm  [SWAP]
      └─debian12--vg-home   254:2    0 31.8G  0 lvm  /home
    sdb                       8:16   0  100G  0 disk
    └─drbd1                 147:1    0  100G  0 disk
    sr0                      11:0    1 1024M  0 rom
    root@node1:~# mount /dev/drbd1 /mnt/drbd_test
    root@node1:~# ls -l /mnt/drbd_test
    total 20
    drwx------ 2 root root 16384 Jan 30 16:03 lost+found
    -rw-r--r-- 1 root root    12 Jan 30 16:07 test.file
    root@node1:~# cat /mnt/drbd_test/test.file
    hello world
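
The stale directory listing comes from mounting ext4, which is not a cluster filesystem, on two primaries at once: each node keeps its own cache and does not notice the peer's writes. A minimal guard, assuming the local/peer role string printed by drbdadm role as seen above, could refuse to mount in that situation. The function name can_mount_ext4 is hypothetical.

```shell
#!/bin/sh
# Hypothetical guard: decide from a "local/peer" role string (the format
# printed by `drbdadm role <res>` in the transcript) whether mounting a
# non-cluster filesystem such as ext4 is reasonable on this node.
can_mount_ext4() {
    case "$1" in
        Primary/Primary) echo "refuse: peer is also Primary"; return 1 ;;
        Primary/*)       echo "ok"; return 0 ;;
        *)               echo "refuse: local node is not Primary"; return 1 ;;
    esac
}

# Walk through the role combinations that appear in this test.
for role in Secondary/Secondary Primary/Secondary Primary/Primary; do
    printf '%-20s -> ' "$role"
    can_mount_ext4 "$role" || true
done
```

In practice it might be used as can_mount_ext4 "$(drbdadm role drbddemo)" && mount /dev/drbd1 /mnt/drbd_test.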
    
  2. Fault test

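    • Fault injection: on node1, overwrite the first 64 MiB of sdb with zeros directly, then try to remount the directory. The mount command returns an error, but dmesg records no error, and restarting the DRBD service shows nothing abnormal.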
    root@node1:~# dd if=/dev/zero of=/dev/sdb bs=4M count=16 oflag=direct,nonblock
    16+0 records in
    16+0 records out
    67108864 bytes (67 MB, 64 MiB) copied, 0.00892653 s, 7.5 GB/s
    root@node1:~# lsblk
    NAME                    MAJ:MIN RM  SIZE RO TYPE MOUNTPOINTS
    sda                       8:0    0   50G  0 disk
    ├─sda1                    8:1    0  487M  0 part /boot
    ├─sda2                    8:2    0    1K  0 part
    └─sda5                    8:5    0 49.5G  0 part
      ├─debian12--vg-root   254:0    0 16.8G  0 lvm  /
      ├─debian12--vg-swap_1 254:1    0  976M  0 lvm  [SWAP]
      └─debian12--vg-home   254:2    0 31.8G  0 lvm  /home
    sdb                       8:16   0  100G  0 disk
    └─drbd1                 147:1    0  100G  0 disk
    sr0                      11:0    1 1024M  0 rom
    root@node1:~# cat /proc/drbd
    version: 8.4.11 (api:1/proto:86-101)
    srcversion: 488C1124B879DCE7CD031DA
    
    1: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----
        ns:5516 nr:160 dw:2168460 dr:6529 al:408 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
    root@node1:~# mount /dev/drbd1 /mnt/drbd_test
    mount: /mnt/drbd_test: wrong fs type, bad option, bad superblock on /dev/drbd1, missing codepage or helper program, or other error.
          dmesg(1) may have more information after failed mount system call.
    root@node1:~# dmesg -T | tail -10
    [Tue Jan 30 15:53:58 2024] block drbd1: conn( SyncTarget -> Connected ) disk( Inconsistent -> UpToDate )
    [Tue Jan 30 15:53:58 2024] block drbd1: helper command: /sbin/drbdadm after-resync-target minor-1
    [Tue Jan 30 15:53:58 2024] block drbd1: helper command: /sbin/drbdadm after-resync-target minor-1 exit code 0 (0x0)
    [Tue Jan 30 15:59:21 2024] block drbd1: role( Secondary -> Primary )
    [Tue Jan 30 16:04:00 2024] EXT4-fs (drbd1): mounted filesystem with ordered data mode. Quota mode: none.
    [Tue Jan 30 16:06:37 2024] block drbd1: peer( Secondary -> Primary )
    [Tue Jan 30 16:09:10 2024] block drbd1: peer( Primary -> Secondary )
    [Tue Jan 30 16:09:27 2024] EXT4-fs (drbd1): unmounting filesystem.
    [Tue Jan 30 16:09:58 2024] EXT4-fs (drbd1): mounted filesystem with ordered data mode. Quota mode: none.
    [Tue Jan 30 16:10:56 2024] EXT4-fs (drbd1): unmounting filesystem.
    root@node1:~# systemctl status drbd.service
    ● drbd.service - DRBD -- please disable. Unless you are NOT using a cluster manager.
        Loaded: loaded (/lib/systemd/system/drbd.service; disabled; preset: enabled)
        Active: active (exited) since Tue 2024-01-30 15:53:59 CST; 20min ago
        Process: 3015 ExecStart=/lib/drbd/scripts/drbd start (code=exited, status=0/SUCCESS)
      Main PID: 3015 (code=exited, status=0/SUCCESS)
            CPU: 37ms
    
    Jan 30 15:53:45 node1 drbd[3015]: Starting DRBD resources:/lib/drbd/scripts/drbd: line 148: /var/lib/linstor/loop_device_mapping: No such file or director
    Jan 30 15:53:45 node1 drbd[3023]: [
    Jan 30 15:53:45 node1 drbd[3023]:      create res: drbddemo
    Jan 30 15:53:45 node1 drbd[3023]:    prepare disk: drbddemo
    Jan 30 15:53:45 node1 drbd[3023]:     adjust disk: drbddemo
    Jan 30 15:53:45 node1 drbd[3023]:      adjust net: drbddemo
    Jan 30 15:53:45 node1 drbd[3023]: ]
    Jan 30 15:53:59 node1 drbd[3043]: WARN: stdin/stdout is not a TTY; using /dev/console
    Jan 30 15:53:59 node1 drbd[3015]: .
    Jan 30 15:53:59 node1 systemd[1]: Finished drbd.service - DRBD -- please disable. Unless you are NOT using a cluster manager..
    root@node1:~# systemctl restart drbd.service
    root@node1:~# cat /proc/drbd
    version: 8.4.11 (api:1/proto:86-101)
    srcversion: 488C1124B879DCE7CD031DA
    
    1: cs:Connected ro:Secondary/Secondary ds:UpToDate/UpToDate C r-----
        ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
    root@node1:~# systemctl status drbd.service
    ● drbd.service - DRBD -- please disable. Unless you are NOT using a cluster manager.
        Loaded: loaded (/lib/systemd/system/drbd.service; disabled; preset: enabled)
        Active: active (exited) since Tue 2024-01-30 16:15:01 CST; 12s ago
        Process: 8760 ExecStart=/lib/drbd/scripts/drbd start (code=exited, status=0/SUCCESS)
      Main PID: 8760 (code=exited, status=0/SUCCESS)
            CPU: 39ms
    
    Jan 30 16:15:00 node1 drbd[8760]: Starting DRBD resources:/lib/drbd/scripts/drbd: line 148: /var/lib/linstor/loop_device_mapping: No such file or director
    Jan 30 16:15:00 node1 drbd[8771]: [
    Jan 30 16:15:00 node1 drbd[8771]:      create res: drbddemo
    Jan 30 16:15:00 node1 drbd[8771]:    prepare disk: drbddemo
    Jan 30 16:15:00 node1 drbd[8771]:     adjust disk: drbddemo
    Jan 30 16:15:00 node1 drbd[8771]:      adjust net: drbddemo
    Jan 30 16:15:00 node1 drbd[8771]: ]
    Jan 30 16:15:01 node1 drbd[8788]: WARN: stdin/stdout is not a TTY; using /dev/console
    Jan 30 16:15:01 node1 drbd[8760]: .
    Jan 30 16:15:01 node1 systemd[1]: Finished drbd.service - DRBD -- please disable. Unless you are NOT using a cluster manager..
    root@node1:~# cat /proc/drbd
    version: 8.4.11 (api:1/proto:86-101)
    srcversion: 488C1124B879DCE7CD031DA
    
    1: cs:Connected ro:Secondary/Secondary ds:UpToDate/UpToDate C r-----
        ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
    
    • Stop replication on node1 and rebuild the pairing:
    root@node1:~# drbdadm secondary drbddemo
    root@node1:~# drbdadm detach drbddemo
    root@node1:~# drbdadm dstate drbddemo
    Diskless/UpToDate
    root@node1:~# drbdadm create-md drbddemo
    You want me to create a v08 style flexible-size internal meta data block.
    There appears to be a v08 flexible-size internal meta data block
    already in place on /dev/sdb at byte offset 107374178304
    
    Do you really want to overwrite the existing meta-data?
    [need to type 'yes' to confirm] yes
    
    md_offset 107374178304
    al_offset 107374145536
    bm_offset 107370868736
    
    Found ext3 filesystem
      104854364 kB data area apparently used
      104854364 kB left usable by current configuration
    
    Even though it looks like this would place the new meta data into
    unused space, you still need to confirm, as this is only a guess.
    
    Do you want to proceed?
    [need to type 'yes' to confirm] yes
    
    initializing activity log
    initializing bitmap (3200 KB) to all zero
    Writing meta data...
    New drbd meta data block successfully created.
    root@node1:~# drbdadm attach drbddemo
    root@node1:~# drbdadm dstate drbddemo
    Inconsistent/UpToDate
    root@node1:~# cat /proc/drbd
    version: 8.4.11 (api:1/proto:86-101)
    srcversion: 488C1124B879DCE7CD031DA
    
    1: cs:SyncTarget ro:Secondary/Primary ds:Inconsistent/UpToDate C r-----
        ns:0 nr:115256 dw:115256 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:104739108
            [>....................] sync'ed:  0.2% (102284/102396)M
            finish: 4:16:42 speed: 6,776 (6,776) want: 13,480 K/sec
    
    • Remount the block device and test data access:
    root@node1:~# drbdadm primary drbddemo
    root@node1:~# drbdadm role all
    Primary/Primary
    root@node1:~# drbdadm dstate drbddemo
    Inconsistent/UpToDate
    root@node1:~# cat /proc/drbd
    version: 8.4.11 (api:1/proto:86-101)
    srcversion: 488C1124B879DCE7CD031DA
    
    1: cs:SyncTarget ro:Primary/Primary ds:Inconsistent/UpToDate C r-----
        ns:0 nr:8223744 dw:8223744 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:96630620
            [>...................] sync'ed:  7.9% (94364/102396)M
            finish: 0:40:42 speed: 39,544 (34,992) want: 41,000 K/sec
    root@node1:~# mount /dev/drbd1 /mnt/drbd_test
    root@node1:~# ls -l /mnt/drbd_test/
    total 20
    drwx------ 2 root root 16384 Jan 30 16:03 lost+found
    -rw-r--r-- 1 root root    12 Jan 30 16:07 test.file
    root@node1:~# cat /mnt/drbd_test/test.file
    hello world
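
The rebuild steps used above can be collected into one helper. This is only a sketch of the same sequence (secondary, detach, create-md, attach); the function name rebuild_backing_disk and the RUN switch are assumptions of this note, and by default it just prints the commands instead of running them.

```shell
#!/bin/sh
# Hypothetical wrapper around the rebuild sequence from the transcript.
# Dry-run by default: prints the commands. Set RUN=1 to execute them.
rebuild_backing_disk() {
    res=$1
    for step in secondary detach create-md attach; do
        if [ "${RUN:-0}" = 1 ]; then
            drbdadm "$step" "$res" || return 1   # stop at the first failure
        else
            echo "drbdadm $step $res"
        fi
    done
}

rebuild_backing_disk drbddemo
```

With RUN=1 the commands are executed for real; create-md still asks for the interactive yes confirmations shown in the transcript.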
    
  3. Conclusions

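    • If both nodes are promoted to primary at the same time, each node's filesystem cache can diverge from what is persisted on disk; an umount/mount cycle is needed to drop the cache and re-read the data. (There may be a better approach; this needs further testing.)
    • If the backing disk is overwritten directly, for example by writing zeros with dd, the DRBD service reports nothing abnormal, yet the data on that node is in fact inaccessible and the filesystem cannot be mounted.
    • Resynchronization is a full sync that proceeds in disk sector order. Because the test data set is small, the filesystem can be mounted normally once the leading sectors (which contain the filesystem superblock) have been synced.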
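
DRBD did not notice the overwrite because nothing compared the two disks afterwards. The resource file sets verify-alg sha1, which is the checksum DRBD's online verification (drbdadm verify) uses to compare blocks between the nodes; that command was not exercised in this test. The sketch below only illustrates the idea on two ordinary files standing in for the backing disks; region_sha1 and same_region are made-up names.

```shell
#!/bin/sh
# Illustration only: compare the SHA-1 of the first N bytes of two files,
# the way a per-block checksum would expose an out-of-band overwrite.
region_sha1() {   # usage: region_sha1 FILE BYTES
    head -c "$2" "$1" | sha1sum | cut -d' ' -f1
}

same_region() {   # usage: same_region FILE_A FILE_B BYTES
    [ "$(region_sha1 "$1" "$3")" = "$(region_sha1 "$2" "$3")" ]
}

tmp=$(mktemp -d)
printf 'superblock-and-data' > "$tmp/node1.img"
cp "$tmp/node1.img" "$tmp/node2.img"
same_region "$tmp/node1.img" "$tmp/node2.img" 16 && echo "in sync"

# Simulate the dd test: zero the start of one copy behind the replication layer.
dd if=/dev/zero of="$tmp/node1.img" bs=16 count=1 conv=notrunc 2>/dev/null
same_region "$tmp/node1.img" "$tmp/node2.img" 16 || echo "out of sync"
rm -rf "$tmp"
```

The script prints "in sync" for the identical copies and "out of sync" once the start of one copy has been zeroed.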