
Remove OSD from Storage Pool

Scenario: One of the nodes has failed to power up for no apparent reason. The OSDs hosted by the failed node went offline, so the storage pool must be recovered from the HEALTH_WARN status as soon as possible. The recovery can be performed from any host in the cluster.

Access the Desired Node via SSH

ssh <user>@192.168.1.x

Warning: Permanently added '192.168.1.x' (ECDSA) to the list of known hosts.
Password:
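
Because the recovery can be started from any host in the cluster, you can connect to any node that is still up, such as cc1 or cc3. If you connect to the nodes frequently, key-based authentication is an optional convenience; the commands below are a minimal sketch assuming standard OpenSSH tooling on your workstation, with <user> and 192.168.1.x as placeholders for your actual account and node address.

ssh-keygen -t ed25519
ssh-copy-id <user>@192.168.1.x
ssh <user>@192.168.1.x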

Check Storage Status

Use the storage status command.

As shown below, 1 OSD is down due to a failed hard disk: osd.7 on the node with hostname cc2.

The health status shows HEALTH_WARN.

cc1> storage status
  cluster:
    id:     c6e64c49-09cf-463b-9d1c-b6645b4b3b85
    health: HEALTH_WARN
            1 osds down
            Degraded data redundancy: 1418/251600 objects degraded (0.564%), 72 pgs degraded

  services:
    mon: 3 daemons, quorum cc1,cc2,cc3 (age 101m)
    mgr: cc1(active, since 101m), standbys: cc3, cc2
    mds: 1/1 daemons up, 1 standby, 1 hot standby
    osd: 18 osds: 17 up (since 60s), 18 in (since 2m)
    rgw: 3 daemons active (3 hosts, 1 zones)

  data:
    pools:   25 pools, 945 pgs
    objects: 87.83k objects, 449 GiB
    usage:   1.2 TiB used, 5.3 TiB / 6.5 TiB avail
    pgs:     1418/251600 objects degraded (0.564%)
             728 active+clean
             145 active+undersized
             72  active+undersized+degraded

  io:
    client: 61 KiB/s rd, 503 KiB/s wr, 18 op/s rd, 76 op/s wr

ID HOST USED AVAIL WR OPS WR DATA RD OPS RD DATA STATE
0 cc1 46.6G 325G 15 376k 0 0 exists,up
1 cc1 85.7G 286G 4 23.2k 1 20 exists,up
2 cc1 75.4G 296G 16 72.7k 1 0 exists,up
3 cc1 70.8G 301G 5 60.7k 0 0 exists,up
4 cc1 62.5G 309G 7 54.3k 1 0 exists,up
5 cc1 66.9G 305G 0 0 1 90 exists,up
6 cc2 73.2G 298G 6 46.3k 0 0 exists,up
7 cc2 5575M 366G 0 0 0 0 exists
8 cc2 77.5G 294G 0 5734 1 0 exists,up
9 cc2 97.4G 274G 13 136k 0 0 exists,up
10 cc2 80.5G 291G 6 55.1k 0 0 exists,up
11 cc2 68.4G 303G 5 39.1k 1 0 exists,up
12 cc3 84.1G 288G 0 0 1 0 exists,up
13 cc3 52.1G 320G 1 8210 1 48.0k exists,up
14 cc3 62.8G 309G 13 116k 2 44.0k exists,up
15 cc3 51.1G 321G 5 56.7k 2 0 exists,up
16 cc3 69.0G 303G 3 37.7k 2 135 exists,up
17 cc3 87.1G 285G 8 372k 0 0 exists,up
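
The storage status output closely resembles the native Ceph status views. If the underlying Ceph CLI happens to be exposed on your deployment (an assumption, not something every appliance shell allows), the same information can be cross-checked with standard Ceph commands:

# overall cluster health, services, and placement-group summary
ceph -s

# per-OSD utilization and state, similar to the table above
ceph osd status

# list the CRUSH tree and pick out OSDs reported as down
ceph osd tree | grep -i down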

Remove OSDs

Use the storage remove_osd command to remove the down OSD.

cc1> storage remove_osd
Enter osd id to be removed:
1: osd.7
Enter index: 1
Enter 'YES' to confirm: YES
cc1>
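
For reference, the storage remove_osd wrapper automates what would otherwise be a manual removal in Ceph. A rough sketch of the equivalent native steps, assuming direct access to the Ceph tooling (which may not be exposed on every deployment), would look like this for osd.7:

# stop the OSD daemon on its host, if that node is still reachable
systemctl stop ceph-osd@7

# mark the OSD out so its data is rebalanced onto the remaining OSDs
ceph osd out 7

# remove it from the CRUSH map, delete its auth key, and drop the OSD entry
ceph osd purge 7 --yes-i-really-mean-it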

Verify Storage Status

After removing the failed OSD, check the storage health again with the storage status command.

The health status should now show HEALTH_OK.

cc1> storage status
  cluster:
    id:     c6e64c49-09cf-463b-9d1c-b6645b4b3b85
    health: HEALTH_OK

  services:
    mon: 3 daemons, quorum cc1,cc2,cc3 (age 93m)
    mgr: cc1(active, since 92m), standbys: cc3, cc2
    mds: 1/1 daemons up, 1 standby, 1 hot standby
    osd: 17 osds: 17 up (since 60s), 17 in (since 2m); 10 remapped pgs
    rgw: 3 daemons active (3 hosts, 1 zones)

  data:
    volumes: 1/1 healthy
    pools:   25 pools, 945 pgs
    objects: 87.82k objects, 449 GiB
    usage:   1.2 TiB used, 5.0 TiB / 6.2 TiB avail
    pgs:     1250/251643 objects misplaced (0.497%)
             935 active+clean
             10  active+remapped+backfilling

  io:
    client:   340 KiB/s rd, 428 KiB/s wr, 49 op/s rd, 74 op/s wr
    recovery: 151 MiB/s, 21 objects/s

ID HOST USED AVAIL WR OPS WR DATA RD OPS RD DATA STATE
0 cc1 46.6G 325G 15 376k 0 0 exists,up
1 cc1 85.7G 286G 4 23.2k 1 20 exists,up
2 cc1 75.4G 296G 16 72.7k 1 0 exists,up
3 cc1 70.8G 301G 5 60.7k 0 0 exists,up
4 cc1 62.5G 309G 7 54.3k 1 0 exists,up
5 cc1 66.9G 305G 0 0 1 90 exists,up
6 cc2 73.2G 298G 6 46.3k 0 0 exists,up
8 cc2 77.5G 294G 0 5734 1 0 exists,up
9 cc2 97.4G 274G 13 136k 0 0 exists,up
10 cc2 80.5G 291G 6 55.1k 0 0 exists,up
11 cc2 68.4G 303G 5 39.1k 1 0 exists,up
12 cc3 84.1G 288G 0 0 1 0 exists,up
13 cc3 52.1G 320G 1 8210 1 48.0k exists,up
14 cc3 62.8G 309G 13 116k 2 44.0k exists,up
15 cc3 51.1G 321G 5 56.7k 2 0 exists,up
16 cc3 69.0G 303G 3 37.7k 2 135 exists,up
17 cc3 87.1G 285G 8 372k 0 0 exists,up
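
Backfilling continues in the background until the remapped placement groups return to active+clean. Re-running storage status periodically is enough to follow the recovery; if the native Ceph CLI is available (again an assumption about your deployment), progress can also be watched directly:

# one-line summary of placement-group states
ceph pg stat

# stream status and health changes as they happen
ceph -w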