Behaviour of Storage Server in case of bad disk from 6.2.30 onwards

Can someone answer the following queries in the context of the PR below?

  1. From 6.2.30 onwards, will the SS get removed and restarted when an io_timeout error is thrown (due to a commit in the SS taking more than 2 minutes)?
  2. What happens to the SS process in the case of an io_error? Will it now be left in a hung state?

Does anyone have any thoughts on my query?

I run many FoundationDB instances in our testing environment. After upgrading FoundationDB to 6.2.30, I see the instances turn unhealthy more frequently with a StorageServerFailed error. Generally, restarting the FoundationDB processes brings the cluster back to a healthy state. Can this behavior be explained by the changes in the above PR?

I see from a comment on the PR that, due to the changes in the PR, the StorageServer will not be automatically restarted in the case of an io_error. What changes in the case where the StorageServer experiences an io_timeout? Is io_timeout treated as an io_error?

Did you see IoDegraded events more often?
What investigation did you do to figure out the behavior?
IIRC, there are two behavior changes in 6.2 (maybe 6.2.30):

  1. The SS will prioritize write requests when it sees a hot shard;
  2. The SS does not automatically get recruited on io_timeout.

Neither behavior change seems to introduce a negative impact on the cluster. @Evan can comment more.

@mengxu Thanks for the reply. There were IoDegraded events while the storage failure was observed, which were not seen before. I also see other slowness-related errors at the same time.

[screenshot: trace events showing IoDegraded and other slowness-related errors]
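For reference, a minimal sketch of how these events can be pulled out of the trace logs (it assumes the default XML trace format and a log directory of /var/log/foundationdb; both are assumptions that may differ per deployment):

```python
# Sketch: count IoDegraded / StorageServerFailed events in FDB trace logs.
# Assumes the default XML trace format; the log directory is an assumption.
import glob
import xml.etree.ElementTree as ET
from collections import Counter

LOG_GLOB = "/var/log/foundationdb/trace.*.xml"   # assumption: default log dir
EVENTS_OF_INTEREST = {"IoDegraded", "StorageServerFailed"}

counts = Counter()
for path in glob.glob(LOG_GLOB):
    # Each trace file is a stream of <Event .../> elements under a <Trace> root.
    try:
        root = ET.parse(path).getroot()
    except ET.ParseError:
        continue  # the file currently being written may not be well-formed yet
    for event in root.iter("Event"):
        etype = event.get("Type")
        if etype in EVENTS_OF_INTEREST:
            counts[(event.get("Machine"), etype)] += 1

for (machine, etype), n in sorted(counts.items()):
    print(f"{machine}  {etype}: {n}")
```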

The output of status details is as follows:

Using cluster file `/etc/foundationdb/fdb.cluster'.

Configuration:
  Redundancy mode        - double
  Storage engine         - ssd-2
  Coordinators           - 3
  Desired Proxies        - 1
  Desired Logs           - 2
  Usable Regions         - 1

Cluster:
  FoundationDB processes - 10 (less 0 excluded; 2 with errors)
  Zones                  - 5
  Machines               - 5
  Memory availability    - 15.3 GB per process on machine with least available
  Fault Tolerance        - 0 machines
  Server time            - 06/01/21 22:08:36

Data:
  Replication health     - UNHEALTHY: No replicas remain of some data
  Moving data            - 102.309 GB
  Sum of key-value sizes - 143.216 GB
  Disk space used        - 219.980 GB

Operating space:
  Storage server         - 376.3 GB free on most full server
  Log server             - 396.2 GB free on most full server

Workload:
  Read rate              - 309 Hz
  Write rate             - 78 Hz
  Transactions started   - 176 Hz
  Transactions committed - 27 Hz
  Conflict rate          - 0 Hz

Backup and DR:
  Running backups        - 0
  Running DRs            - 0

Process performance details:
  172.18.0.30:4500       (  1% cpu;  1% machine; 0.007 Gbps;  0% disk IO; 3.5 GB / 15.3 GB RAM  )
    Last logged error: StorageServerFailed: io_timeout at Tue Jun  1 02:58:35 2021
  172.18.0.30:4501       (  1% cpu;  1% machine; 0.007 Gbps;  0% disk IO; 0.3 GB / 15.3 GB RAM  )
  172.18.0.31:4500       (  2% cpu;  2% machine; 0.015 Gbps;  1% disk IO; 3.6 GB / 15.7 GB RAM  )
  172.18.0.31:4501       (  3% cpu;  2% machine; 0.015 Gbps;  1% disk IO; 0.4 GB / 15.7 GB RAM  )
  172.18.0.32:4500       (  1% cpu;  1% machine; 0.006 Gbps;  0% disk IO; 3.5 GB / 16.3 GB RAM  )
  172.18.0.32:4501       (  2% cpu;  1% machine; 0.006 Gbps;  0% disk IO; 0.4 GB / 16.3 GB RAM  )
  172.18.0.35:4500       (  3% cpu;  1% machine; 0.003 Gbps; 14% disk IO; 4.6 GB / 16.8 GB RAM  )
  172.18.0.35:4501       (  0% cpu;  1% machine; 0.003 Gbps; 14% disk IO; 0.2 GB / 16.8 GB RAM  )
  172.18.0.36:4500       (  2% cpu;  1% machine; 0.007 Gbps; 21% disk IO; 4.6 GB / 17.9 GB RAM  )
    Last logged error: StorageServerFailed: io_timeout at Tue Jun  1 02:58:39 2021
  172.18.0.36:4501       (  0% cpu;  1% machine; 0.007 Gbps; 11% disk IO; 0.3 GB / 17.9 GB RAM  )

Coordination servers:
  172.18.0.30:4501  (reachable)
  172.18.0.31:4501  (reachable)
  172.18.0.32:4501  (reachable)

Client time: 06/01/21 22:08:36

WARNING: A single process is both a transaction log and a storage server.
  For best performance use dedicated disks for the transaction logs by setting process classes.
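As an aside, this is roughly how the "Last logged error" lines can be watched without eyeballing the full status output each time; a small sketch that shells out to fdbcli, where the cluster file path is an assumption:

```python
# Sketch: print only the processes reporting a "Last logged error" in
# `status details`. The cluster file path is an assumption.
import re
import subprocess

CLUSTER_FILE = "/etc/foundationdb/fdb.cluster"  # assumption

out = subprocess.run(
    ["fdbcli", "-C", CLUSTER_FILE, "--exec", "status details"],
    capture_output=True, text=True, check=True,
).stdout

current_process = None
for line in out.splitlines():
    m = re.match(r"\s+(\d+\.\d+\.\d+\.\d+:\d+)\s+\(", line)
    if m:
        current_process = m.group(1)
    elif "Last logged error" in line and current_process:
        print(f"{current_process}  {line.strip()}")
```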

The clients connecting to FDB experience a lot of transaction failures during this time, which are only resolved when the FDB processes are manually restarted.
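(For what it's worth, here is a minimal sketch of how the two affected storage processes could be bounced individually via fdbcli's `kill` command instead of restarting every FDB process. It assumes the processes are managed by fdbmonitor, which brings a killed fdbserver back up; the addresses and cluster file are placeholders taken from the status output above.)

```python
# Sketch: bounce specific fdbserver processes with fdbcli's `kill` command.
# Assumes fdbmonitor is managing the processes and will restart them.
import subprocess

CLUSTER_FILE = "/etc/foundationdb/fdb.cluster"      # placeholder
FAILED = ["172.18.0.30:4500", "172.18.0.36:4500"]   # from `status details`

# The first `kill` populates fdbcli's list of known processes;
# the second kills the listed addresses.
cmd = "kill; kill " + " ".join(FAILED)
subprocess.run(["fdbcli", "-C", CLUSTER_FILE, "--exec", cmd], check=True)
```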