Version: 2.5

Remove OSD from cluster

Remove a failed or missing hard disk

Storage status

Scenario: If you discover a failed or missing hard disk, you must remove it from the cluster to restore the health of the storage pool. To proceed:

  • Check the storage status before starting.
  • As shown below, two OSDs (Object Storage Daemons) are down because of the failed or missing hard disks: osd.4 and osd.5 on the host cc1. In the table at the end of the output, their state is exists rather than exists,up.
  cc1:storage> status
    cluster:
      id:     c6e64c49-09cf-463b-9d1c-b6645b4b3b85
      health: HEALTH_WARN
              2 osds down
              Degraded data redundancy: 1611/44204 objects degraded (3.644%), 106 pgs degraded

    services:
      mon: 3 daemons, quorum cc1,cc2,cc3 (age 8d)
      mgr: cc1(active, since 8d), standbys: cc2, cc3
      mds: 1/1 daemons up, 1 standby, 1 hot standby
      osd: 18 osds: 16 up (since 2d), 18 in (since 2d)
      rgw: 3 daemons active (3 hosts, 1 zones)

    data:
      volumes: 1/1 healthy
      pools:   25 pools, 753 pgs
      objects: 149.93k objects, 785 GiB
      usage:   2.2 TiB used, 5.6 TiB / 7.9 TiB avail
      pgs:     753 active+clean

    io:
      client: 5.3 MiB/s rd, 308 KiB/s wr, 171 op/s rd, 56 op/s wr

    ID  HOST  USED   AVAIL  WR OPS  WR DATA  RD OPS  RD DATA  STATE
    0   cc1   65.7G  380G   6       40.0k    7       161k     exists,up
    1   cc1   181G   265G   8       32.7k    1       58.4k    exists,up
    2   cc1   162G   283G   0       4096     15      604k     exists,up
    3   cc1   133G   313G   0       1638     2       29.6k    exists,up
    4   cc1   91.8G  354G   14      97.5k    6       39.2k    exists
    5   cc1   130G   315G   8       39.1k    3       88.9k    exists
    6   cc2   96.0G  350G   9       50.3k    3       160k     exists,up
    7   cc2   165G   281G   0       0        1       89.6k    exists,up
    8   cc2   75.8G  370G   0       6553     1       25.6k    exists,up
    9   cc2   199G   247G   0       3276     3       172k     exists,up
    10  cc2   122G   324G   2       13.5k    9       510k     exists,up
    11  cc2   95.3G  351G   1       4096     6       126k     exists,up
    12  cc3   184G   262G   3       12.0k    1       25.6k    exists,up
    13  cc3   93.6G  353G   0       0        0       5734     exists,up
    14  cc3   67.8G  378G   12      71.1k    13      364k     exists,up
    15  cc3   92.6G  354G   0       819      0       0        exists,up
    16  cc3   142G   303G   0       819      2       24.0k    exists,up
    17  cc3   179G   267G   0       2457     5       99.2k    exists,up
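
If you also have shell access to the underlying Ceph cluster, the same facts can be cross-checked with the native Ceph CLI. A minimal sketch, assuming the standard ceph client is available on one of the nodes (the appliance's status command above is sufficient on its own):

  # List only the OSDs that are currently down, with their host placement
  ceph osd tree down

  # Show the detailed messages behind HEALTH_WARN
  ceph health detail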

Remove OSD

  cc2:storage> remove_osd
    Enter osd id to be removed:
    1: osd.4 (hdd)
    2: osd.5 (hdd)
    Enter index: 1
    Enter 'YES' to confirm: YES
    Remove osd.4 successfully.

Repeat the same procedure to remove osd.5.
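
On the Ceph side, the appliance's remove_osd command corresponds to the standard OSD removal flow. The following is a rough sketch of the manual equivalent on a plain Ceph cluster, assuming a release that supports ceph osd purge and using osd.4 from this example; on the appliance itself, remove_osd is all you need:

  # Check that destroying the OSD will not reduce data durability
  ceph osd safe-to-destroy osd.4

  # Mark the OSD out so its placement groups backfill onto the remaining OSDs
  ceph osd out osd.4

  # Remove the OSD from the CRUSH map, delete its auth key, and free its id
  ceph osd purge 4 --yes-i-really-mean-it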

Results

  • Wait a while, then check the storage status again (a sketch for watching the recovery in real time follows the output below).
  • The failed hard disks were removed successfully and the cluster health is back to HEALTH_OK: the cluster now reports 16 OSDs, all up and in, and osd.4 and osd.5 no longer appear in the table.
  cc1:storage> status
    cluster:
      id:     c6e64c49-09cf-463b-9d1c-b6645b4b3b85
      health: HEALTH_OK

    services:
      mon: 3 daemons, quorum cc1,cc2,cc3 (age 8d)
      mgr: cc1(active, since 8d), standbys: cc2, cc3
      mds: 1/1 daemons up, 1 standby, 1 hot standby
      osd: 16 osds: 16 up (since 21m), 16 in (since 21m)
      rgw: 3 daemons active (3 hosts, 1 zones)

    data:
      volumes: 1/1 healthy
      pools:   25 pools, 753 pgs
      objects: 149.99k objects, 786 GiB
      usage:   2.2 TiB used, 4.8 TiB / 7.0 TiB avail
      pgs:     753 active+clean

    io:
      client: 25 KiB/s rd, 304 KiB/s wr, 19 op/s rd, 43 op/s wr

    ID  HOST  USED   AVAIL  WR OPS  WR DATA  RD OPS  RD DATA  STATE
    0   cc1   148G   298G   0       0        0       0        exists,up
    1   cc1   176G   269G   5       23.1k    1       0        exists,up
    2   cc1   202G   243G   0       28.0k    1       0        exists,up
    3   cc1   220G   225G   0       3276     0       0        exists,up
    6   cc2   86.1G  360G   4       20.7k    0       0        exists,up
    7   cc2   180G   266G   0       0        0       0        exists,up
    8   cc2   89.2G  357G   7       49.5k    2       10.3k    exists,up
    9   cc2   201G   245G   0       819      0       0        exists,up
    10  cc2   108G   337G   1       7372     0       5734     exists,up
    11  cc2   99.1G  347G   0       12.7k    0       0        exists,up
    12  cc3   199G   247G   1       5734     1       0        exists,up
    13  cc3   112G   333G   4       22.3k    0       0        exists,up
    14  cc3   86.3G  360G   1       18.3k    2       90       exists,up
    15  cc3   98.7G  347G   0       16.0k    1       0        exists,up
    16  cc3   128G   318G   1       4915     2       9027     exists,up
    17  cc3   141G   305G   2       22.3k    0       0        exists,up
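
Ceph rebalances the surviving placement groups after each removal, which is why the raw capacity in the second status drops from 7.9 TiB to 7.0 TiB: the two removed disks no longer contribute to the pool. If you want to follow the recovery in real time rather than polling the appliance's status command, a sketch assuming direct access to the ceph client:

  # Refresh the full cluster status every ten seconds until HEALTH_OK
  watch -n 10 ceph -s

  # Or stream the cluster log, which reports recovery and backfill progress
  ceph -w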