Ssd-rocksdb-v1 storage engine runs out of memory

Hi, an OOM occurred while I was testing writes with the YCSB tool against the ssd-rocksdb-v1 storage engine.
The cluster deployment is as follows:

  • FDB version 7.1.25
  • Cluster size: 18 nodes in total, configured in three_datacenter (3DC) mode, with 6 nodes per DC.
  • Each node contains 12 SSDs: 8 are used for storage services, 1 for log services, and 3 for stateless services.
  • Each process is configured with memory 10GiB and cache-memory 4GiB.
  • The cluster is deployed on K8S; each fdbserver process runs in a single pod with a memory limit of 18G.

When the OOM occurs, the OS log shows that the fdbserver pod has reached about 18 GB of memory usage (the pod limit), which exceeds memory + cache-memory.

fdbcli status:

fdb> status details 

Using cluster file `/etc/foundationdb/fdb.cluster'.

Configuration:
  Redundancy mode        - three_datacenter
  Storage engine         - ssd-rocksdb-v1
  Coordinators           - 7
  Desired Logs           - 12
  Usable Regions         - 1

Cluster:
  FoundationDB processes - 216
  Zones                  - 18
  Machines               - 18
  Memory availability    - 10.0 GB per process on machine with least available
  Retransmissions rate   - 57 Hz
  Fault Tolerance        - 3 machines
  Server time            - 01/30/23 05:43:34

Data:
  Replication health     - Healthy (Rebalancing)
  Moving data            - 0.409 GB
  Sum of key-value sizes - 182.220 GB
  Disk space used        - 1.211 TB

Operating space:
  Storage server         - 840.2 GB free on most full server
  Log server             - 849.5 GB free on most full server

Workload:
  Read rate              - 1599 Hz
  Write rate             - 20 Hz
  Transactions started   - 5 Hz
  Transactions committed - 1 Hz
  Conflict rate          - 0 Hz

Backup and DR:
  Running backups        - 0
  Running DRs            - 0

Process performance details:
  10.181.159.41:5500     (  2% cpu;  3% machine; 0.081 Gbps;  0% disk IO; 2.6 GB / 10.0 GB RAM  )
  10.181.159.41:5504     (  4% cpu;  3% machine; 0.081 Gbps;  1% disk IO; 2.5 GB / 10.0 GB RAM  )
  10.181.159.41:5508     (  1% cpu;  3% machine; 0.081 Gbps;  0% disk IO; 2.4 GB / 10.0 GB RAM  )
  10.181.159.41:5512     (  1% cpu;  3% machine; 0.081 Gbps;  0% disk IO; 2.5 GB / 10.0 GB RAM  )
  10.181.159.41:5516     (  1% cpu;  3% machine; 0.081 Gbps;  0% disk IO; 2.7 GB / 10.0 GB RAM  )
  10.181.159.41:5520     (  1% cpu;  3% machine; 0.081 Gbps;  0% disk IO; 2.5 GB / 10.0 GB RAM  )
  10.181.159.41:5524     (  1% cpu;  3% machine; 0.081 Gbps;  0% disk IO; 2.5 GB / 10.0 GB RAM  )
  10.181.159.41:5528     (  1% cpu;  3% machine; 0.081 Gbps;  0% disk IO; 2.6 GB / 10.0 GB RAM  )
  10.181.159.41:6500     (  2% cpu;  3% machine; 0.081 Gbps;  0% disk IO; 0.2 GB / 10.0 GB RAM  )
  10.181.159.41:7500     (  0% cpu;  3% machine; 0.081 Gbps;  0% disk IO; 0.1 GB / 10.0 GB RAM  )
  10.181.159.41:7501     (  1% cpu;  3% machine; 0.081 Gbps;  0% disk IO; 0.1 GB / 10.0 GB RAM  )
  10.181.159.41:7502     (  2% cpu;  3% machine; 0.081 Gbps;  0% disk IO; 0.2 GB / 10.0 GB RAM  )
  10.181.159.46:5500     (  1% cpu;  6% machine; 0.152 Gbps;  0% disk IO; 2.6 GB / 10.0 GB RAM  )
  10.181.159.46:5504     (  9% cpu;  6% machine; 0.152 Gbps; 13% disk IO; 2.5 GB / 10.0 GB RAM  )
  10.181.159.46:5508     (  2% cpu;  6% machine; 0.152 Gbps;  0% disk IO; 2.4 GB / 10.0 GB RAM  )
  10.181.159.46:5512     (  4% cpu;  6% machine; 0.152 Gbps;  1% disk IO; 2.7 GB / 10.0 GB RAM  )
  10.181.159.46:5516     (  2% cpu;  6% machine; 0.152 Gbps;  1% disk IO; 2.7 GB / 10.0 GB RAM  )
  10.181.159.46:5520     ( 30% cpu;  6% machine; 0.152 Gbps; 31% disk IO; 2.6 GB / 10.0 GB RAM  )
  10.181.159.46:5524     (  9% cpu;  6% machine; 0.152 Gbps; 13% disk IO; 2.5 GB / 10.0 GB RAM  )
  10.181.159.46:5528     (  1% cpu;  6% machine; 0.152 Gbps;  0% disk IO; 2.5 GB / 10.0 GB RAM  )
  10.181.159.46:6500     (  2% cpu;  6% machine; 0.152 Gbps;  0% disk IO; 0.2 GB / 10.0 GB RAM  )
  10.181.159.46:7500     (  2% cpu;  6% machine; 0.152 Gbps;  0% disk IO; 0.2 GB / 10.0 GB RAM  )
  10.181.159.46:7501     (  0% cpu;  6% machine; 0.152 Gbps;  0% disk IO; 0.1 GB / 10.0 GB RAM  )
  10.181.159.46:7502     (  0% cpu;  6% machine; 0.152 Gbps;  0% disk IO; 0.1 GB / 10.0 GB RAM  )
  10.181.159.47:5500     (  6% cpu;  6% machine; 0.006 Gbps;  6% disk IO; 2.6 GB / 10.0 GB RAM  )
  10.181.159.47:5504     (  1% cpu;  6% machine; 0.006 Gbps;  0% disk IO; 2.6 GB / 10.0 GB RAM  )
  10.181.159.47:5508     (  1% cpu;  6% machine; 0.006 Gbps;  0% disk IO; 2.6 GB / 10.0 GB RAM  )
  10.181.159.47:5512     (  1% cpu;  6% machine; 0.006 Gbps;  0% disk IO; 2.5 GB / 10.0 GB RAM  )
  10.181.159.47:5516     (  1% cpu;  6% machine; 0.006 Gbps;  0% disk IO; 2.6 GB / 10.0 GB RAM  )
  10.181.159.47:5520     (  1% cpu;  6% machine; 0.006 Gbps;  0% disk IO; 2.5 GB / 10.0 GB RAM  )
  10.181.159.47:5524     (  1% cpu;  6% machine; 0.006 Gbps;  0% disk IO; 2.7 GB / 10.0 GB RAM  )
  10.181.159.47:5528     (  1% cpu;  6% machine; 0.006 Gbps;  0% disk IO; 2.8 GB / 10.0 GB RAM  )
  10.181.159.47:6500     (  2% cpu;  6% machine; 0.006 Gbps;  0% disk IO; 0.2 GB / 10.0 GB RAM  )
  10.181.159.47:7500     (  3% cpu;  6% machine; 0.006 Gbps;  0% disk IO; 0.1 GB / 10.0 GB RAM  )
  10.181.159.47:7501     (  0% cpu;  6% machine; 0.006 Gbps;  0% disk IO; 0.1 GB / 10.0 GB RAM  )
  10.181.159.47:7502     (  0% cpu;  6% machine; 0.006 Gbps;  0% disk IO; 0.1 GB / 10.0 GB RAM  )
  10.181.159.48:5500     ( 49% cpu;  6% machine; 0.207 Gbps; 55% disk IO; 2.6 GB / 10.0 GB RAM  )
  10.181.159.48:5504     (  5% cpu;  6% machine; 0.207 Gbps;  0% disk IO; 2.6 GB / 10.0 GB RAM  )
  10.181.159.48:5508     (  1% cpu;  6% machine; 0.207 Gbps;  0% disk IO; 2.4 GB / 10.0 GB RAM  )
  10.181.159.48:5512     (  1% cpu;  6% machine; 0.207 Gbps;  0% disk IO; 2.5 GB / 10.0 GB RAM  )
  10.181.159.48:5516     (  1% cpu;  6% machine; 0.207 Gbps;  0% disk IO; 2.6 GB / 10.0 GB RAM  )
  10.181.159.48:5520     (  2% cpu;  6% machine; 0.207 Gbps;  2% disk IO; 2.7 GB / 10.0 GB RAM  )
  10.181.159.48:5524     (  1% cpu;  6% machine; 0.207 Gbps;  0% disk IO; 2.6 GB / 10.0 GB RAM  )
  10.181.159.48:5528     (  1% cpu;  6% machine; 0.207 Gbps;  0% disk IO; 2.7 GB / 10.0 GB RAM  )
  10.181.159.48:6500     (  2% cpu;  6% machine; 0.207 Gbps;  0% disk IO; 0.2 GB / 10.0 GB RAM  )
  10.181.159.48:7500     (  8% cpu;  6% machine; 0.207 Gbps;  0% disk IO; 0.1 GB / 10.0 GB RAM  )
  10.181.159.48:7501     (  0% cpu;  6% machine; 0.207 Gbps;  0% disk IO; 0.1 GB / 10.0 GB RAM  )
  10.181.159.48:7502     (  0% cpu;  6% machine; 0.207 Gbps;  0% disk IO; 0.1 GB / 10.0 GB RAM  )
  10.181.159.64:5500     ( 33% cpu;  7% machine; 0.172 Gbps; 35% disk IO; 0.6 GB / 10.0 GB RAM  )
  10.181.159.64:5504     (  1% cpu;  7% machine; 0.172 Gbps;  0% disk IO; 2.7 GB / 10.0 GB RAM  )
  10.181.159.64:5508     ( 36% cpu;  7% machine; 0.172 Gbps; 33% disk IO; 2.4 GB / 10.0 GB RAM  )
  10.181.159.64:5512     (  1% cpu;  7% machine; 0.172 Gbps;  0% disk IO; 2.4 GB / 10.0 GB RAM  )
  10.181.159.64:5516     (  1% cpu;  7% machine; 0.172 Gbps;  0% disk IO; 2.6 GB / 10.0 GB RAM  )
  10.181.159.64:5520     (  1% cpu;  7% machine; 0.172 Gbps;  0% disk IO; 2.6 GB / 10.0 GB RAM  )
  10.181.159.64:5524     (  1% cpu;  7% machine; 0.172 Gbps;  0% disk IO; 2.6 GB / 10.0 GB RAM  )
  10.181.159.64:5528     (  1% cpu;  7% machine; 0.172 Gbps;  0% disk IO; 2.5 GB / 10.0 GB RAM  )
  10.181.159.64:6500     (  0% cpu;  7% machine; 0.172 Gbps;  0% disk IO; 0.2 GB / 10.0 GB RAM  )
  10.181.159.64:7500     (  2% cpu;  7% machine; 0.172 Gbps;  0% disk IO; 0.1 GB / 10.0 GB RAM  )
  10.181.159.64:7501     (  0% cpu;  7% machine; 0.172 Gbps;  0% disk IO; 0.1 GB / 10.0 GB RAM  )
  10.181.159.64:7502     (  0% cpu;  7% machine; 0.172 Gbps;  0% disk IO; 0.1 GB / 10.0 GB RAM  )
  10.181.159.65:5500     ( 12% cpu; 11% machine; 0.292 Gbps;  1% disk IO; 2.5 GB / 10.0 GB RAM  )
  10.181.159.65:5504     (  2% cpu; 11% machine; 0.292 Gbps;  0% disk IO; 2.6 GB / 10.0 GB RAM  )
  10.181.159.65:5508     (  3% cpu; 11% machine; 0.292 Gbps;  1% disk IO; 2.6 GB / 10.0 GB RAM  )
  10.181.159.65:5512     (  2% cpu; 11% machine; 0.292 Gbps;  0% disk IO; 2.6 GB / 10.0 GB RAM  )
  10.181.159.65:5516     (  9% cpu; 11% machine; 0.292 Gbps;  0% disk IO; 2.5 GB / 10.0 GB RAM  )
  10.181.159.65:5520     (  2% cpu; 11% machine; 0.292 Gbps;  0% disk IO; 2.5 GB / 10.0 GB RAM  )
  10.181.159.65:5524     (  2% cpu; 11% machine; 0.292 Gbps;  0% disk IO; 2.6 GB / 10.0 GB RAM  )
  10.181.159.65:5528     (  2% cpu; 11% machine; 0.292 Gbps;  0% disk IO; 2.5 GB / 10.0 GB RAM  )
  10.181.159.65:6500     (  0% cpu; 11% machine; 0.292 Gbps;  0% disk IO; 0.1 GB / 10.0 GB RAM  )
  10.181.159.65:7500     (  5% cpu; 11% machine; 0.292 Gbps;  0% disk IO; 0.2 GB / 10.0 GB RAM  )
  10.181.159.65:7501     (  1% cpu; 11% machine; 0.292 Gbps;  0% disk IO; 0.2 GB / 10.0 GB RAM  )
  10.181.159.65:7502     (  0% cpu; 11% machine; 0.292 Gbps;  0% disk IO; 0.1 GB / 10.0 GB RAM  )
  10.195.152.44:5500     (  8% cpu;  9% machine; 0.224 Gbps;  3% disk IO; 2.7 GB / 10.0 GB RAM  )
  10.195.152.44:5504     (  1% cpu;  9% machine; 0.224 Gbps;  0% disk IO; 2.5 GB / 10.0 GB RAM  )
  10.195.152.44:5508     (  1% cpu;  9% machine; 0.224 Gbps;  0% disk IO; 2.5 GB / 10.0 GB RAM  )
  10.195.152.44:5512     (  1% cpu;  9% machine; 0.224 Gbps;  0% disk IO; 2.4 GB / 10.0 GB RAM  )
  10.195.152.44:5516     (  1% cpu;  9% machine; 0.224 Gbps;  0% disk IO; 2.6 GB / 10.0 GB RAM  )
  10.195.152.44:5520     (  3% cpu;  9% machine; 0.224 Gbps;  0% disk IO; 2.5 GB / 10.0 GB RAM  )
  10.195.152.44:5524     (  4% cpu;  9% machine; 0.224 Gbps;  0% disk IO; 2.4 GB / 10.0 GB RAM  )
  10.195.152.44:5528     (  2% cpu;  9% machine; 0.224 Gbps;  1% disk IO; 2.5 GB / 10.0 GB RAM  )
  10.195.152.44:6500     (  2% cpu;  9% machine; 0.224 Gbps;  0% disk IO; 0.2 GB / 10.0 GB RAM  )
  10.195.152.44:7500     (  0% cpu;  9% machine; 0.224 Gbps;  0% disk IO; 0.1 GB / 10.0 GB RAM  )
  10.195.152.44:7501     (  0% cpu;  9% machine; 0.224 Gbps;  0% disk IO; 0.1 GB / 10.0 GB RAM  )
  10.195.152.44:7502     (  0% cpu;  9% machine; 0.224 Gbps;  0% disk IO; 0.1 GB / 10.0 GB RAM  )
  10.195.152.45:5500     ( 34% cpu; 10% machine; 0.136 Gbps; 36% disk IO; 2.5 GB / 10.0 GB RAM  )
  10.195.152.45:5504     (  1% cpu; 10% machine; 0.136 Gbps;  0% disk IO; 2.4 GB / 10.0 GB RAM  )
  10.195.152.45:5508     (  1% cpu; 10% machine; 0.136 Gbps;  0% disk IO; 2.5 GB / 10.0 GB RAM  )
  10.195.152.45:5512     (  1% cpu; 10% machine; 0.136 Gbps;  0% disk IO; 2.5 GB / 10.0 GB RAM  )
  10.195.152.45:5516     (  2% cpu; 10% machine; 0.136 Gbps;  1% disk IO; 2.6 GB / 10.0 GB RAM  )
  10.195.152.45:5520     (  1% cpu; 10% machine; 0.136 Gbps;  0% disk IO; 2.6 GB / 10.0 GB RAM  )
  10.195.152.45:5524     (  4% cpu; 10% machine; 0.136 Gbps;  0% disk IO; 2.4 GB / 10.0 GB RAM  )
  10.195.152.45:5528     (  1% cpu; 10% machine; 0.136 Gbps;  0% disk IO; 2.6 GB / 10.0 GB RAM  )
  10.195.152.45:6500     (  2% cpu; 10% machine; 0.136 Gbps;  0% disk IO; 0.2 GB / 10.0 GB RAM  )
  10.195.152.45:7500     (  0% cpu; 10% machine; 0.136 Gbps;  0% disk IO; 0.1 GB / 10.0 GB RAM  )
  10.195.152.45:7501     (  0% cpu; 10% machine; 0.136 Gbps;  0% disk IO; 0.1 GB / 10.0 GB RAM  )
  10.195.152.45:7502     (  0% cpu; 10% machine; 0.136 Gbps;  0% disk IO; 0.1 GB / 10.0 GB RAM  )
  10.195.152.47:5500     (  2% cpu;  6% machine; 0.144 Gbps;  1% disk IO; 2.6 GB / 10.0 GB RAM  )
  10.195.152.47:5504     (  4% cpu;  6% machine; 0.144 Gbps;  0% disk IO; 2.5 GB / 10.0 GB RAM  )
  10.195.152.47:5508     (  1% cpu;  6% machine; 0.144 Gbps;  0% disk IO; 2.4 GB / 10.0 GB RAM  )
  10.195.152.47:5512     (  1% cpu;  6% machine; 0.144 Gbps;  0% disk IO; 2.6 GB / 10.0 GB RAM  )
  10.195.152.47:5516     (  1% cpu;  6% machine; 0.144 Gbps;  0% disk IO; 2.6 GB / 10.0 GB RAM  )
  10.195.152.47:5520     (  1% cpu;  6% machine; 0.144 Gbps;  0% disk IO; 2.6 GB / 10.0 GB RAM  )
  10.195.152.47:5524     (  1% cpu;  6% machine; 0.144 Gbps;  0% disk IO; 2.6 GB / 10.0 GB RAM  )
  10.195.152.47:5528     (  1% cpu;  6% machine; 0.144 Gbps;  0% disk IO; 2.5 GB / 10.0 GB RAM  )
  10.195.152.47:6500     (  0% cpu;  6% machine; 0.144 Gbps;  0% disk IO; 0.2 GB / 10.0 GB RAM  )
  10.195.152.47:7500     (  0% cpu;  6% machine; 0.144 Gbps;  0% disk IO; 0.1 GB / 10.0 GB RAM  )
  10.195.152.47:7501     (  0% cpu;  6% machine; 0.144 Gbps;  0% disk IO; 0.1 GB / 10.0 GB RAM  )
  10.195.152.47:7502     (  0% cpu;  6% machine; 0.144 Gbps;  0% disk IO; 0.1 GB / 10.0 GB RAM  )
  10.195.152.48:5500     (  1% cpu; 10% machine; 0.229 Gbps;  0% disk IO; 2.5 GB / 10.0 GB RAM  )
  10.195.152.48:5504     (  2% cpu; 10% machine; 0.229 Gbps;  1% disk IO; 2.6 GB / 10.0 GB RAM  )
  10.195.152.48:5508     (  1% cpu; 10% machine; 0.229 Gbps;  0% disk IO; 2.4 GB / 10.0 GB RAM  )
  10.195.152.48:5512     (  1% cpu; 10% machine; 0.229 Gbps;  0% disk IO; 2.5 GB / 10.0 GB RAM  )
  10.195.152.48:5516     (  1% cpu; 10% machine; 0.229 Gbps;  0% disk IO; 2.5 GB / 10.0 GB RAM  )
  10.195.152.48:5520     (  4% cpu; 10% machine; 0.229 Gbps;  1% disk IO; 2.5 GB / 10.0 GB RAM  )
  10.195.152.48:5524     (  1% cpu; 10% machine; 0.229 Gbps;  0% disk IO; 2.7 GB / 10.0 GB RAM  )
  10.195.152.48:5528     (  1% cpu; 10% machine; 0.229 Gbps;  0% disk IO; 2.4 GB / 10.0 GB RAM  )
  10.195.152.48:6500     (  0% cpu; 10% machine; 0.229 Gbps;  0% disk IO; 0.1 GB / 10.0 GB RAM  )
  10.195.152.48:7500     (  0% cpu; 10% machine; 0.229 Gbps;  0% disk IO; 0.1 GB / 10.0 GB RAM  )
  10.195.152.48:7501     (  0% cpu; 10% machine; 0.229 Gbps;  0% disk IO; 0.1 GB / 10.0 GB RAM  )
  10.195.152.48:7502     (  0% cpu; 10% machine; 0.229 Gbps;  0% disk IO; 0.1 GB / 10.0 GB RAM  )
  10.195.152.50:5500     (  1% cpu; 11% machine; 0.149 Gbps;  0% disk IO; 2.6 GB / 10.0 GB RAM  )
  10.195.152.50:5504     (  4% cpu; 11% machine; 0.149 Gbps;  1% disk IO; 2.8 GB / 10.0 GB RAM  )
  10.195.152.50:5508     (  1% cpu; 11% machine; 0.149 Gbps;  0% disk IO; 2.5 GB / 10.0 GB RAM  )
  10.195.152.50:5512     (  1% cpu; 11% machine; 0.149 Gbps;  0% disk IO; 2.6 GB / 10.0 GB RAM  )
  10.195.152.50:5516     (  1% cpu; 11% machine; 0.149 Gbps;  0% disk IO; 2.6 GB / 10.0 GB RAM  )
  10.195.152.50:5520     (  1% cpu; 11% machine; 0.149 Gbps;  0% disk IO; 2.6 GB / 10.0 GB RAM  )
  10.195.152.50:5524     (  1% cpu; 11% machine; 0.149 Gbps;  0% disk IO; 2.4 GB / 10.0 GB RAM  )
  10.195.152.50:5528     (  1% cpu; 11% machine; 0.149 Gbps;  0% disk IO; 2.5 GB / 10.0 GB RAM  )
  10.195.152.50:6500     (  2% cpu; 11% machine; 0.149 Gbps;  0% disk IO; 0.2 GB / 10.0 GB RAM  )
  10.195.152.50:7500     (  0% cpu; 11% machine; 0.149 Gbps;  0% disk IO; 0.1 GB / 10.0 GB RAM  )
  10.195.152.50:7501     (  0% cpu; 11% machine; 0.149 Gbps;  0% disk IO; 0.1 GB / 10.0 GB RAM  )
  10.195.152.50:7502     (  0% cpu; 11% machine; 0.149 Gbps;  0% disk IO; 0.1 GB / 10.0 GB RAM  )
  10.195.152.51:5500     (  1% cpu; 13% machine; 0.014 Gbps;  0% disk IO; 2.6 GB / 10.0 GB RAM  )
  10.195.152.51:5504     (  1% cpu; 13% machine; 0.014 Gbps;  0% disk IO; 2.6 GB / 10.0 GB RAM  )
  10.195.152.51:5508     (  2% cpu; 13% machine; 0.014 Gbps;  1% disk IO; 2.7 GB / 10.0 GB RAM  )
  10.195.152.51:5512     (  1% cpu; 13% machine; 0.014 Gbps;  0% disk IO; 2.5 GB / 10.0 GB RAM  )
  10.195.152.51:5516     (  1% cpu; 13% machine; 0.014 Gbps;  0% disk IO; 2.6 GB / 10.0 GB RAM  )
  10.195.152.51:5520     (  1% cpu; 13% machine; 0.014 Gbps;  0% disk IO; 2.5 GB / 10.0 GB RAM  )
  10.195.152.51:5524     (  2% cpu; 13% machine; 0.014 Gbps;  1% disk IO; 2.6 GB / 10.0 GB RAM  )
  10.195.152.51:5528     (  4% cpu; 13% machine; 0.014 Gbps;  3% disk IO; 2.5 GB / 10.0 GB RAM  )
  10.195.152.51:6500     (  2% cpu; 13% machine; 0.014 Gbps;  0% disk IO; 0.2 GB / 10.0 GB RAM  )
  10.195.152.51:7500     (  0% cpu; 13% machine; 0.014 Gbps;  0% disk IO; 0.1 GB / 10.0 GB RAM  )
  10.195.152.51:7501     (  1% cpu; 13% machine; 0.014 Gbps;  0% disk IO; 0.2 GB / 10.0 GB RAM  )
  10.195.152.51:7502     (  0% cpu; 13% machine; 0.014 Gbps;  0% disk IO; 0.1 GB / 10.0 GB RAM  )
  10.195.154.44:5500     ( 49% cpu;  9% machine; 0.158 Gbps; 56% disk IO; 2.5 GB / 10.0 GB RAM  )
  10.195.154.44:5504     (  4% cpu;  9% machine; 0.158 Gbps;  0% disk IO; 2.5 GB / 10.0 GB RAM  )
  10.195.154.44:5508     (  1% cpu;  9% machine; 0.158 Gbps;  0% disk IO; 2.4 GB / 10.0 GB RAM  )
  10.195.154.44:5512     (  1% cpu;  9% machine; 0.158 Gbps;  0% disk IO; 2.8 GB / 10.0 GB RAM  )
  10.195.154.44:5516     (  1% cpu;  9% machine; 0.158 Gbps;  0% disk IO; 2.5 GB / 10.0 GB RAM  )
  10.195.154.44:5520     (  1% cpu;  9% machine; 0.158 Gbps;  0% disk IO; 2.5 GB / 10.0 GB RAM  )
  10.195.154.44:5524     (  1% cpu;  9% machine; 0.158 Gbps;  0% disk IO; 2.5 GB / 10.0 GB RAM  )
  10.195.154.44:5528     (  1% cpu;  9% machine; 0.158 Gbps;  0% disk IO; 2.5 GB / 10.0 GB RAM  )
  10.195.154.44:6500     (  2% cpu;  9% machine; 0.158 Gbps;  0% disk IO; 0.2 GB / 10.0 GB RAM  )
  10.195.154.44:7500     (  0% cpu;  9% machine; 0.158 Gbps;  0% disk IO; 0.1 GB / 10.0 GB RAM  )
  10.195.154.44:7501     (  0% cpu;  9% machine; 0.158 Gbps;  0% disk IO; 0.1 GB / 10.0 GB RAM  )
  10.195.154.44:7502     (  0% cpu;  9% machine; 0.158 Gbps;  0% disk IO; 0.1 GB / 10.0 GB RAM  )
  10.195.154.45:5500     (  1% cpu;  7% machine; 0.157 Gbps;  0% disk IO; 2.6 GB / 10.0 GB RAM  )
  10.195.154.45:5504     (  7% cpu;  7% machine; 0.157 Gbps;  3% disk IO; 2.5 GB / 10.0 GB RAM  )
  10.195.154.45:5508     (  4% cpu;  7% machine; 0.157 Gbps;  1% disk IO; 2.7 GB / 10.0 GB RAM  )
  10.195.154.45:5512     (  1% cpu;  7% machine; 0.157 Gbps;  0% disk IO; 2.6 GB / 10.0 GB RAM  )
  10.195.154.45:5516     (  1% cpu;  7% machine; 0.157 Gbps;  0% disk IO; 2.6 GB / 10.0 GB RAM  )
  10.195.154.45:5520     (  3% cpu;  7% machine; 0.157 Gbps;  0% disk IO; 2.6 GB / 10.0 GB RAM  )
  10.195.154.45:5524     (  2% cpu;  7% machine; 0.157 Gbps;  1% disk IO; 2.5 GB / 10.0 GB RAM  )
  10.195.154.45:5528     ( 13% cpu;  7% machine; 0.157 Gbps; 12% disk IO; 2.6 GB / 10.0 GB RAM  )
  10.195.154.45:6500     (  2% cpu;  7% machine; 0.157 Gbps;  0% disk IO; 0.2 GB / 10.0 GB RAM  )
  10.195.154.45:7500     (  0% cpu;  7% machine; 0.157 Gbps;  0% disk IO; 0.1 GB / 10.0 GB RAM  )
  10.195.154.45:7501     (  0% cpu;  7% machine; 0.157 Gbps;  0% disk IO; 0.1 GB / 10.0 GB RAM  )
  10.195.154.45:7502     (  0% cpu;  7% machine; 0.157 Gbps;  0% disk IO; 0.1 GB / 10.0 GB RAM  )
  10.195.154.46:5500     (  1% cpu;  4% machine; 0.270 Gbps;  0% disk IO; 2.5 GB / 10.0 GB RAM  )
  10.195.154.46:5504     (  1% cpu;  4% machine; 0.270 Gbps;  0% disk IO; 2.3 GB / 10.0 GB RAM  )
  10.195.154.46:5508     (  3% cpu;  4% machine; 0.270 Gbps;  6% disk IO; 2.5 GB / 10.0 GB RAM  )
  10.195.154.46:5512     (  1% cpu;  4% machine; 0.270 Gbps;  0% disk IO; 2.6 GB / 10.0 GB RAM  )
  10.195.154.46:5516     (  1% cpu;  4% machine; 0.270 Gbps;  0% disk IO; 2.5 GB / 10.0 GB RAM  )
  10.195.154.46:5520     (  1% cpu;  4% machine; 0.270 Gbps;  0% disk IO; 2.6 GB / 10.0 GB RAM  )
  10.195.154.46:5524     (  4% cpu;  4% machine; 0.270 Gbps;  0% disk IO; 2.5 GB / 10.0 GB RAM  )
  10.195.154.46:5528     (  4% cpu;  4% machine; 0.270 Gbps;  0% disk IO; 2.4 GB / 10.0 GB RAM  )
  10.195.154.46:6500     (  0% cpu;  4% machine; 0.270 Gbps;  0% disk IO; 0.2 GB / 10.0 GB RAM  )
  10.195.154.46:7500     (  0% cpu;  4% machine; 0.270 Gbps;  0% disk IO; 0.1 GB / 10.0 GB RAM  )
  10.195.154.46:7501     (  0% cpu;  4% machine; 0.270 Gbps;  0% disk IO; 0.1 GB / 10.0 GB RAM  )
  10.195.154.46:7502     (  0% cpu;  4% machine; 0.270 Gbps;  0% disk IO; 0.1 GB / 10.0 GB RAM  )
  10.195.154.47:5500     (  1% cpu;  9% machine; 0.013 Gbps;  0% disk IO; 2.6 GB / 10.0 GB RAM  )
  10.195.154.47:5504     ( 14% cpu;  9% machine; 0.013 Gbps; 13% disk IO; 2.5 GB / 10.0 GB RAM  )
  10.195.154.47:5508     (  1% cpu;  9% machine; 0.013 Gbps;  0% disk IO; 2.6 GB / 10.0 GB RAM  )
  10.195.154.47:5512     (  1% cpu;  9% machine; 0.013 Gbps;  0% disk IO; 2.5 GB / 10.0 GB RAM  )
  10.195.154.47:5516     (  1% cpu;  9% machine; 0.013 Gbps;  0% disk IO; 2.6 GB / 10.0 GB RAM  )
  10.195.154.47:5520     (  1% cpu;  9% machine; 0.013 Gbps;  0% disk IO; 2.5 GB / 10.0 GB RAM  )
  10.195.154.47:5524     (  1% cpu;  9% machine; 0.013 Gbps;  0% disk IO; 2.5 GB / 10.0 GB RAM  )
  10.195.154.47:5528     (  1% cpu;  9% machine; 0.013 Gbps;  0% disk IO; 2.7 GB / 10.0 GB RAM  )
  10.195.154.47:6500     (  2% cpu;  9% machine; 0.013 Gbps;  0% disk IO; 0.2 GB / 10.0 GB RAM  )
  10.195.154.47:7500     (  0% cpu;  9% machine; 0.013 Gbps;  0% disk IO; 0.1 GB / 10.0 GB RAM  )
  10.195.154.47:7501     (  0% cpu;  9% machine; 0.013 Gbps;  0% disk IO; 0.1 GB / 10.0 GB RAM  )
  10.195.154.47:7502     (  0% cpu;  9% machine; 0.013 Gbps;  0% disk IO; 0.1 GB / 10.0 GB RAM  )
  10.195.154.48:5500     (  1% cpu;  8% machine; 0.003 Gbps;  0% disk IO; 2.5 GB / 10.0 GB RAM  )
  10.195.154.48:5504     (  2% cpu;  8% machine; 0.003 Gbps;  1% disk IO; 2.6 GB / 10.0 GB RAM  )
  10.195.154.48:5508     (  1% cpu;  8% machine; 0.003 Gbps;  0% disk IO; 2.6 GB / 10.0 GB RAM  )
  10.195.154.48:5512     (  1% cpu;  8% machine; 0.003 Gbps;  0% disk IO; 2.5 GB / 10.0 GB RAM  )
  10.195.154.48:5516     (  1% cpu;  8% machine; 0.003 Gbps;  0% disk IO; 2.5 GB / 10.0 GB RAM  )
  10.195.154.48:5520     (  7% cpu;  8% machine; 0.003 Gbps; 10% disk IO; 2.6 GB / 10.0 GB RAM  )
  10.195.154.48:5524     (  1% cpu;  8% machine; 0.003 Gbps;  0% disk IO; 2.5 GB / 10.0 GB RAM  )
  10.195.154.48:5528     (  1% cpu;  8% machine; 0.003 Gbps;  0% disk IO; 2.5 GB / 10.0 GB RAM  )
  10.195.154.48:6500     (  2% cpu;  8% machine; 0.003 Gbps;  0% disk IO; 0.2 GB / 10.0 GB RAM  )
  10.195.154.48:7500     (  0% cpu;  8% machine; 0.003 Gbps;  0% disk IO; 0.1 GB / 10.0 GB RAM  )
  10.195.154.48:7501     (  0% cpu;  8% machine; 0.003 Gbps;  0% disk IO; 0.1 GB / 10.0 GB RAM  )
  10.195.154.48:7502     (  0% cpu;  8% machine; 0.003 Gbps;  0% disk IO; 0.1 GB / 10.0 GB RAM  )
  10.195.154.50:5500     (  1% cpu;  7% machine; 0.288 Gbps;  0% disk IO; 2.5 GB / 10.0 GB RAM  )
  10.195.154.50:5504     (  4% cpu;  7% machine; 0.288 Gbps;  1% disk IO; 2.5 GB / 10.0 GB RAM  )
  10.195.154.50:5508     ( 36% cpu;  7% machine; 0.288 Gbps; 32% disk IO; 2.5 GB / 10.0 GB RAM  )
  10.195.154.50:5512     (  1% cpu;  7% machine; 0.288 Gbps;  0% disk IO; 2.6 GB / 10.0 GB RAM  )
  10.195.154.50:5516     (  1% cpu;  7% machine; 0.288 Gbps;  0% disk IO; 2.5 GB / 10.0 GB RAM  )
  10.195.154.50:5520     (  1% cpu;  7% machine; 0.288 Gbps;  0% disk IO; 2.6 GB / 10.0 GB RAM  )
  10.195.154.50:5524     (  1% cpu;  7% machine; 0.288 Gbps;  0% disk IO; 2.6 GB / 10.0 GB RAM  )
  10.195.154.50:5528     (  1% cpu;  7% machine; 0.288 Gbps;  0% disk IO; 2.5 GB / 10.0 GB RAM  )
  10.195.154.50:6500     (  0% cpu;  7% machine; 0.288 Gbps;  0% disk IO; 0.1 GB / 10.0 GB RAM  )
  10.195.154.50:7500     (  0% cpu;  7% machine; 0.288 Gbps;  0% disk IO; 0.1 GB / 10.0 GB RAM  )
  10.195.154.50:7501     (  0% cpu;  7% machine; 0.288 Gbps;  0% disk IO; 0.1 GB / 10.0 GB RAM  )
  10.195.154.50:7502     (  0% cpu;  7% machine; 0.288 Gbps;  0% disk IO; 0.1 GB / 10.0 GB RAM  )

Coordination servers:
  10.181.159.41:7502  (reachable)
  10.181.159.47:6500  (reachable)
  10.181.159.65:7501  (reachable)
  10.195.152.48:5500  (reachable)
  10.195.152.50:5508  (reachable)
  10.195.152.51:7501  (reachable)
  10.195.154.46:5520  (reachable)

Client time: 01/30/23 05:44:29

How the storage server process is started:

UID         PID   PPID  C STIME TTY          TIME CMD
root          1      0  9 09:10 ?        00:04:29 fdbserver --memory 10GiB --cache-memory 4GiB --seed-connection-string docker:docker@10.181.159.41:5500 --cluster-file /etc/foundationdb/fdb.cluster --listen-address 0.0.0.0:5506 --public-address 10.195.152.51:5506 --locality-diskid sdb --datadir /var/fdb/data --logdir /var/fdb/logs --locality-machineid hostname-01 --locality-zoneid hostname-01 --class storage --locality-dcid ningbo1

OS log:

Jan 30 15:55:51 hostname-01 kernel: rocksdb:low invoked oom-killer: gfp_mask=0x6201ca(GFP_HIGHUSER_MOVABLE|__GFP_WRITE), nodemask=(null), order=0, oom_score_adj=999
Jan 30 15:55:51 hostname-01 kernel: rocksdb:low cpuset=0e78f7a7e07b1325f491dd7f2e39462ba3775d5cf9585d1a2961e828ce0a27f5 mems_allowed=0-1
Jan 30 15:55:51 hostname-01 kernel: CPU: 58 PID: 454919 Comm: rocksdb:low Kdump: loaded Not tainted 4.19.25-206.el7_6.bclinux.x86_64 #1
Jan 30 15:55:51 hostname-01 kernel: Hardware name: ZTE R5500 G4/R5500G4, BIOS 03.15.0100_70562 03/04/2020
Jan 30 15:55:51 hostname-01 kernel: Call Trace:
Jan 30 15:55:51 hostname-01 kernel:  dump_stack+0x5a/0x73
Jan 30 15:55:51 hostname-01 kernel:  dump_header+0x77/0x29c
Jan 30 15:55:51 hostname-01 kernel:  ? mem_cgroup_scan_tasks+0x8f/0xe0
Jan 30 15:55:51 hostname-01 kernel:  oom_kill_process+0x25e/0x290
Jan 30 15:55:51 hostname-01 kernel:  out_of_memory+0x134/0x4b0
Jan 30 15:55:51 hostname-01 kernel:  mem_cgroup_out_of_memory+0x49/0x80
Jan 30 15:55:51 hostname-01 kernel:  try_charge+0x6f2/0x760
Jan 30 15:55:51 hostname-01 kernel:  mem_cgroup_try_charge+0x6f/0x220
Jan 30 15:55:51 hostname-01 kernel:  __add_to_page_cache_locked+0x146/0x260
Jan 30 15:55:51 hostname-01 kernel:  add_to_page_cache_lru+0x49/0xd0
Jan 30 15:55:51 hostname-01 kernel:  pagecache_get_page+0x7e/0x270
Jan 30 15:55:51 hostname-01 kernel:  grab_cache_page_write_begin+0x1f/0x40
Jan 30 15:55:51 hostname-01 kernel:  ext4_da_write_begin+0xdf/0x4f0 [ext4]
Jan 30 15:55:51 hostname-01 kernel:  generic_perform_write+0xc2/0x1c0
Jan 30 15:55:51 hostname-01 kernel:  __generic_file_write_iter+0x184/0x1c0
Jan 30 15:55:51 hostname-01 kernel:  ext4_file_write_iter+0xc6/0x410 [ext4]
Jan 30 15:55:51 hostname-01 kernel:  ? __switch_to_asm+0x40/0x70
Jan 30 15:55:51 hostname-01 kernel:  __vfs_write+0x112/0x1a0
Jan 30 15:55:51 hostname-01 kernel:  vfs_write+0xad/0x1a0
Jan 30 15:55:51 hostname-01 kernel:  ksys_write+0x52/0xc0
Jan 30 15:55:51 hostname-01 kernel:  do_syscall_64+0x5b/0x170
Jan 30 15:55:51 hostname-01 kernel:  entry_SYSCALL_64_after_hwframe+0x44/0xa9
Jan 30 15:55:51 hostname-01 kernel: RIP: 0033:0x7fe8700726fd
Jan 30 15:55:51 hostname-01 kernel: Code: cd 20 00 00 75 10 b8 01 00 00 00 0f 05 48 3d 01 f0 ff ff 73 31 c3 48 83 ec 08 e8 4e fd ff ff 48 89 04 24 b8 01 00 00 00 0f 05 <48> 8b 3c 24 48 89 c2 e8 97 fd ff ff 48 89 d0 48 83 c4 08 48 3d 01
Jan 30 15:55:51 hostname-01 kernel: RSP: 002b:00007fe86a1f53a0 EFLAGS: 00000293 ORIG_RAX: 0000000000000001
Jan 30 15:55:51 hostname-01 kernel: RAX: ffffffffffffffda RBX: 00007fe86a1f54b0 RCX: 00007fe8700726fd
Jan 30 15:55:51 hostname-01 kernel: RDX: 00000000000ffa8d RSI: 00007fe7ba334000 RDI: 000000000000001d
Jan 30 15:55:51 hostname-01 kernel: RBP: 00007fe86a1f5400 R08: 0000000000000000 R09: 0000000000000000
Jan 30 15:55:51 hostname-01 kernel: R10: 0000000000000000 R11: 0000000000000293 R12: 00007fe7ba334000
Jan 30 15:55:51 hostname-01 kernel: R13: 00000000000ffa8d R14: 00000000000ffa8d R15: 00007fe856aa4c50
Jan 30 15:55:51 hostname-01 kernel: Task in /kubepods/burstable/podfaf6d591-a4a4-4cd1-aa8d-d4906f29ec11/0e78f7a7e07b1325f491dd7f2e39462ba3775d5cf9585d1a2961e828ce0a27f5 killed as a result of limit of /kubepods/burstable/podfaf6d591-a4a4-4cd1-aa8d-d4906f29ec11
Jan 30 15:55:51 hostname-01 kernel: memory: usage 17578124kB, limit 17578124kB, failcnt 74
Jan 30 15:55:51 hostname-01 kernel: memory+swap: usage 17578124kB, limit 9007199254740988kB, failcnt 0
Jan 30 15:55:51 hostname-01 kernel: kmem: usage 468896kB, limit 9007199254740988kB, failcnt 0
Jan 30 15:55:51 hostname-01 kernel: Memory cgroup stats for /kubepods/burstable/podfaf6d591-a4a4-4cd1-aa8d-d4906f29ec11: cache:0KB rss:0KB rss_huge:0KB shmem:0KB mapped_file:0KB dirty:0KB writeback:0KB swap:0KB inactive_anon:0KB active_anon:0KB inactive_file:0KB active_file:0KB unevictable:0KB
Jan 30 15:55:51 hostname-01 kernel: Memory cgroup stats for /kubepods/burstable/podfaf6d591-a4a4-4cd1-aa8d-d4906f29ec11/b01c03eb030730627899d164596534e1bdebeb39c0ce1f67d6bb97861ae4f10f: cache:0KB rss:0KB rss_huge:0KB shmem:0KB mapped_file:0KB dirty:0KB writeback:0KB swap:0KB inactive_anon:0KB active_anon:0KB inactive_file:0KB active_file:0KB unevictable:0KB
Jan 30 15:55:51 hostname-01 kernel: Memory cgroup stats for /kubepods/burstable/podfaf6d591-a4a4-4cd1-aa8d-d4906f29ec11/0d6a6ea3b8882339088e3b23159b54352315385c2b57b12e5fbaf41d81d7e677: cache:6408KB rss:0KB rss_huge:0KB shmem:0KB mapped_file:0KB dirty:616KB writeback:0KB swap:0KB inactive_anon:0KB active_anon:0KB inactive_file:0KB active_file:0KB unevictable:0KB
Jan 30 15:55:51 hostname-01 kernel: Memory cgroup stats for /kubepods/burstable/podfaf6d591-a4a4-4cd1-aa8d-d4906f29ec11/0e78f7a7e07b1325f491dd7f2e39462ba3775d5cf9585d1a2961e828ce0a27f5: cache:15723104KB rss:1369772KB rss_huge:0KB shmem:72KB mapped_file:0KB dirty:14388KB writeback:1452KB swap:0KB inactive_anon:0KB active_anon:0KB inactive_file:0KB active_file:0KB unevictable:0KB
Jan 30 15:55:51 hostname-01 kernel: Tasks state (memory values in pages):
Jan 30 15:55:51 hostname-01 kernel: [  pid  ]   uid  tgid total_vm      rss pgtables_bytes swapents oom_score_adj name
Jan 30 15:55:51 hostname-01 kernel: [ 492467]     0 492467      242        1    28672        0          -998 pause
Jan 30 15:55:51 hostname-01 kernel: [ 454463]     0 454463   769526   354219  5349376        0           999 fdbserver
Jan 30 15:55:51 hostname-01 kernel: Memory cgroup out of memory: Kill process 454463 (fdbserver) score 1079 or sacrifice child
Jan 30 15:55:51 hostname-01 kernel: Killed process 454463 (fdbserver) total-vm:3078104kB, anon-rss:1371748kB, file-rss:45128kB, shmem-rss:0kB
Jan 30 15:55:51 hostname-01 kernel: oom_reaper: reaped process 454463 (fdbserver), now anon-rss:0kB, file-rss:20kB, shmem-rss:0kB
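
To make the numbers in the OOM report easier to read, here is a rough sketch in Python that breaks down the cgroup stats line for the fdbserver container. The helper names (`parse_cgroup_stats`, `kb_to_gib`) are my own and not part of FDB or any tool; the values are copied from the kernel log above. By the cgroup's own accounting, page cache (`cache`) is roughly 15 GiB of the ~16.8 GiB limit, while the process RSS is only about 1.3 GiB.

```python
import re

# Values copied from the kernel log above for the fdbserver container cgroup.
STATS_LINE = (
    "cache:15723104KB rss:1369772KB rss_huge:0KB shmem:72KB "
    "mapped_file:0KB dirty:14388KB writeback:1452KB swap:0KB"
)
LIMIT_KB = 17578124  # from "memory: usage 17578124kB, limit 17578124kB"


def parse_cgroup_stats(line):
    """Return {field: kB} for every 'field:<n>KB' entry in a cgroup stats line."""
    return {name: int(value) for name, value in re.findall(r"(\w+):(\d+)KB", line)}


def kb_to_gib(kb):
    """Convert kB (as printed by the kernel, 1024-byte units) to GiB."""
    return kb / (1024 * 1024)


stats = parse_cgroup_stats(STATS_LINE)
cache_share = stats["cache"] / LIMIT_KB

print(f"limit:      {kb_to_gib(LIMIT_KB):.2f} GiB")
print(f"page cache: {kb_to_gib(stats['cache']):.2f} GiB ({cache_share:.0%} of limit)")
print(f"rss:        {kb_to_gib(stats['rss']):.2f} GiB")
```

If this reading is right, most of the memory charged to the pod is file cache from buffered writes (the call trace ends in the ext4 write path), not memory governed by the memory or cache-memory settings.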

I know the RocksDB storage engine is still an experimental feature, so what can we do to improve this?
