Version: 2.5

Remove disk from a node

Storage status

Story: If you discover a failed hard disk, you need to remove it from the cluster and restore the health of the storage pool. To proceed:

  • Check the storage status before starting.
  • As shown in the output below, two OSDs (Object Storage Daemons) are down because of the failed hard disk: OSD numbers 4 and 5 on the node with hostname cc1. (A sketch of equivalent checks with the stock Ceph tools follows the output.)
    cc1:storage> status
  cluster:
    id:     c6e64c49-09cf-463b-9d1c-b6645b4b3b85
    health: HEALTH_WARN
            2 osds down
            Degraded data redundancy: 1611/44204 objects degraded (3.644%), 106 pgs degraded

  services:
    mon: 3 daemons, quorum cc1,cc2,cc3 (age 8d)
    mgr: cc1(active, since 8d), standbys: cc2, cc3
    mds: 1/1 daemons up, 1 standby, 1 hot standby
    osd: 18 osds: 16 up (since 2d), 18 in (since 2d)
    rgw: 3 daemons active (3 hosts, 1 zones)

  data:
    volumes: 1/1 healthy
    pools:   25 pools, 753 pgs
    objects: 149.93k objects, 785 GiB
    usage:   2.2 TiB used, 5.6 TiB / 7.9 TiB avail
    pgs:     753 active+clean

  io:
    client: 5.3 MiB/s rd, 308 KiB/s wr, 171 op/s rd, 56 op/s wr

ID HOST USED AVAIL WR OPS WR DATA RD OPS RD DATA STATE
0 cc1 65.7G 380G 6 40.0k 7 161k exists,up
1 cc1 181G 265G 8 32.7k 1 58.4k exists,up
2 cc1 162G 283G 0 4096 15 604k exists,up
3 cc1 133G 313G 0 1638 2 29.6k exists,up
4 cc1 91.8G 354G 14 97.5k 6 39.2k exists
5 cc1 130G 315G 8 39.1k 3 88.9k exists
6 cc2 96.0G 350G 9 50.3k 3 160k exists,up
7 cc2 165G 281G 0 0 1 89.6k exists,up
8 cc2 75.8G 370G 0 6553 1 25.6k exists,up
9 cc2 199G 247G 0 3276 3 172k exists,up
10 cc2 122G 324G 2 13.5k 9 510k exists,up
11 cc2 95.3G 351G 1 4096 6 126k exists,up
12 cc3 184G 262G 3 12.0k 1 25.6k exists,up
13 cc3 93.6G 353G 0 0 0 5734 exists,up
14 cc3 67.8G 378G 12 71.1k 13 364k exists,up
15 cc3 92.6G 354G 0 819 0 0 exists,up
16 cc3 142G 303G 0 819 2 24.0k exists,up
17 cc3 179G 267G 0 2457 5 99.2k exists,up
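
The status command above aggregates standard Ceph output. If you also have shell access to a cluster node with the Ceph admin keyring (an assumption; the appliance CLI may be the only supported interface on your system), the same facts can be cross-checked with the stock ceph tools, for example:

    ceph health detail    # names the exact OSDs that are down and lists the degraded PGs
    ceph osd tree down    # shows only the down OSDs and which host they sit on
    ceph osd find 4       # CRUSH location / host for OSD 4 (repeat for OSD 5)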

Remove disk

  • Connect to the host cc1.
  • Run the remove_disk CLI command; it shows that /dev/sde, at index 3, is associated with OSD IDs 4 and 5.
  • Remove /dev/sde from the Ceph pool. (A sketch of the equivalent manual Ceph steps follows the command output below.)
  • Physically remove the hard disk from the node.
  cc1:storage> remove_disk
index name size osd serial
--
1 /dev/sda 894.3G 0 1 S40FNA0M800607
2 /dev/sdc 894.3G 2 3 S40FNA0M800598
3 /dev/sde 894.3G 4 5 S40FNA0M800608
--
Enter the index of disk to be removed: 3
Disk removal mode (safe/force): force
force mode immediately destroys disk data without taking into accounts of
storage status so USE IT AT YOUR OWN RISK.
Enter 'YES' to confirm: YES
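
The remove_disk command drives the whole removal from the appliance CLI. For readers familiar with plain Ceph, removing the two dead OSDs corresponds roughly to the standard manual procedure sketched below; this is an illustration of what upstream Ceph provides, not a description of what remove_disk runs internally, and it assumes root shell access on cc1:

    # Mark the dead OSDs out so their placement groups are re-homed.
    ceph osd out 4 5
    # Remove them from the CRUSH map, the OSD map and the auth database.
    ceph osd purge 4 --yes-i-really-mean-it
    ceph osd purge 5 --yes-i-really-mean-it
    # Optionally wipe the old device so it can be reused elsewhere.
    ceph-volume lvm zap /dev/sde --destroy
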
  • Check the status of the storage pool; Ceph is recovering the data automatically. (A note on watching the recovery with the stock Ceph tools follows the output.)
  cc1:storage> status
  cluster:
    id:     c6e64c49-09cf-463b-9d1c-b6645b4b3b85
    health: HEALTH_WARN
            Degraded data redundancy: 6075/438706 objects degraded (1.385%), 8 pgs degraded, 8 pgs undersized

  services:
    mon: 3 daemons, quorum cc1,cc2,cc3 (age 8d)
    mgr: cc1(active, since 8d), standbys: cc2, cc3
    mds: 1/1 daemons up, 1 standby, 1 hot standby
    osd: 16 osds: 16 up (since 10m), 16 in (since 10m); 15 remapped pgs
    rgw: 3 daemons active (3 hosts, 1 zones)

  data:
    volumes: 1/1 healthy
    pools:   25 pools, 753 pgs
    objects: 149.94k objects, 785 GiB
    usage:   2.2 TiB used, 4.8 TiB / 7.0 TiB avail
    pgs:     6075/438706 objects degraded (1.385%)
             5463/438706 objects misplaced (1.245%)
             738 active+clean
             8   active+undersized+degraded+remapped+backfilling
             7   active+remapped+backfilling

  io:
    client:   4.4 MiB/s rd, 705 KiB/s wr, 87 op/s rd, 83 op/s wr
    recovery: 127 MiB/s, 27 objects/s

ID HOST USED AVAIL WR OPS WR DATA RD OPS RD DATA STATE
0 cc1 141G 305G 1 28.0k 1 42.4k exists,up
1 cc1 177G 268G 11 88.0k 3 26.3k exists,up
2 cc1 212G 233G 2 12.7k 0 0 exists,up
3 cc1 193G 253G 3 31.1k 7 634k exists,up
6 cc2 86.0G 360G 9 40.0k 2 27.1k exists,up
7 cc2 179G 267G 7 184k 2 119k exists,up
8 cc2 90.8G 355G 0 18.3k 19 1553k exists,up
9 cc2 201G 245G 8 35.1k 16 1450k exists,up
10 cc2 108G 337G 6 51.1k 11 755k exists,up
11 cc2 98.5G 348G 0 6553 2 41.6k exists,up
12 cc3 201G 245G 16 100k 3 230k exists,up
13 cc3 122G 323G 0 0 0 0 exists,up
14 cc3 88.0G 358G 15 76.0k 47 2970k exists,up
15 cc3 100G 346G 7 183k 14 1286k exists,up
16 cc3 127G 319G 5 28.0k 15 659k exists,up
17 cc3 132G 314G 23 225k 9 491k exists,up
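
Recovery continues in the background until all placement groups are active+clean. With shell access to a node that has the Ceph client configured (again an assumption), progress can be followed without repeatedly retyping the status command, e.g.:

    watch -n 10 ceph -s   # refresh the cluster summary every 10 seconds
    ceph -w               # stream health and recovery events as they happen
    ceph pg stat          # one-line summary of degraded / backfilling PGs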

Results

  • Wait a while and check the status again.
  • The failed hard disk has been removed successfully and the health status is HEALTH_OK. (A few final verification commands are sketched after the output.)
  cc1:storage> status
  cluster:
    id:     c6e64c49-09cf-463b-9d1c-b6645b4b3b85
    health: HEALTH_OK

  services:
    mon: 3 daemons, quorum cc1,cc2,cc3 (age 8d)
    mgr: cc1(active, since 8d), standbys: cc2, cc3
    mds: 1/1 daemons up, 1 standby, 1 hot standby
    osd: 16 osds: 16 up (since 21m), 16 in (since 21m)
    rgw: 3 daemons active (3 hosts, 1 zones)

  data:
    volumes: 1/1 healthy
    pools:   25 pools, 753 pgs
    objects: 149.99k objects, 786 GiB
    usage:   2.2 TiB used, 4.8 TiB / 7.0 TiB avail
    pgs:     753 active+clean

  io:
    client: 25 KiB/s rd, 304 KiB/s wr, 19 op/s rd, 43 op/s wr

ID HOST USED AVAIL WR OPS WR DATA RD OPS RD DATA STATE
0 cc1 148G 298G 0 0 0 0 exists,up
1 cc1 176G 269G 5 23.1k 1 0 exists,up
2 cc1 202G 243G 0 28.0k 1 0 exists,up
3 cc1 220G 225G 0 3276 0 0 exists,up
6 cc2 86.1G 360G 4 20.7k 0 0 exists,up
7 cc2 180G 266G 0 0 0 0 exists,up
8 cc2 89.2G 357G 7 49.5k 2 10.3k exists,up
9 cc2 201G 245G 0 819 0 0 exists,up
10 cc2 108G 337G 1 7372 0 5734 exists,up
11 cc2 99.1G 347G 0 12.7k 0 0 exists,up
12 cc3 199G 247G 1 5734 1 0 exists,up
13 cc3 112G 333G 4 22.3k 0 0 exists,up
14 cc3 86.3G 360G 1 18.3k 2 90 exists,up
15 cc3 98.7G 347G 0 16.0k 1 0 exists,up
16 cc3 128G 318G 1 4915 2 9027 exists,up
17 cc3 141G 305G 2 22.3k 0 0 exists,up
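
As a final sanity check, and again assuming direct access to the ceph CLI on a cluster node, the following confirm that the removal left nothing behind:

    ceph health detail   # should report HEALTH_OK with no extra detail lines
    ceph osd tree        # OSDs 4 and 5 should no longer appear under host cc1
    ceph osd df tree     # per-OSD utilization; data is rebalanced across the 16 remaining OSDs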