Hi, while using the ssd-rocksdb-v1 storage engine to test writes with the YCSB tool, an OOM occurred.
The cluster deployment is as follows:
- FDB version 7.1.25
- Cluster size: 18 nodes in total, configured in 3-DC mode (three_datacenter), with 6 nodes per DC.
- Each node contains 12 SSDs: 8 are used for storage services, 1 is used for log services, and 3 are used for stateless services.
- The memory configuration is 10GiB and the cache-memory configuration is 4GiB.
- K8S deployment is used: each pod runs a single process started from the fdbserver binary, with a memory limit of 18G.
When the OOM occurs, the OS log shows that the fdbserver process has used 18GB of memory, which exceeds memory + cache-memory.
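To make the comparison concrete, here is a small unit-conversion sketch. It assumes, as I do above, that the memory and cache-memory settings simply add up, which may not be how fdbserver actually accounts for them. The variable names are only for illustration.

```python
GIB = 1024 ** 3  # bytes in one GiB (binary unit used by the fdbserver flags)
GB = 1000 ** 3   # bytes in one GB (decimal unit, assumed for the 18GB figure)

memory_gib = 10        # --memory 10GiB
cache_memory_gib = 4   # --cache-memory 4GiB
observed_gb = 18       # usage seen in the OS log when the OOM occurred

# Assumption: the two settings are additive.
configured_bytes = (memory_gib + cache_memory_gib) * GIB
observed_bytes = observed_gb * GB

print(f"memory + cache-memory = {configured_bytes / GB:.2f} GB")
print(f"observed usage        = {observed_bytes / GIB:.2f} GiB")
print(f"excess                = {(observed_bytes - configured_bytes) / GIB:.2f} GiB")
```

Under these assumptions, the observed usage is a few GiB above the sum of the two settings.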
fdbcli status:
fdb> status details
Using cluster file `/etc/foundationdb/fdb.cluster'.
Configuration:
Redundancy mode - three_datacenter
Storage engine - ssd-rocksdb-v1
Coordinators - 7
Desired Logs - 12
Usable Regions - 1
Cluster:
FoundationDB processes - 216
Zones - 18
Machines - 18
Memory availability - 10.0 GB per process on machine with least available
Retransmissions rate - 57 Hz
Fault Tolerance - 3 machines
Server time - 01/30/23 05:43:34
Data:
Replication health - Healthy (Rebalancing)
Moving data - 0.409 GB
Sum of key-value sizes - 182.220 GB
Disk space used - 1.211 TB
Operating space:
Storage server - 840.2 GB free on most full server
Log server - 849.5 GB free on most full server
Workload:
Read rate - 1599 Hz
Write rate - 20 Hz
Transactions started - 5 Hz
Transactions committed - 1 Hz
Conflict rate - 0 Hz
Backup and DR:
Running backups - 0
Running DRs - 0
Process performance details:
10.181.159.41:5500 ( 2% cpu; 3% machine; 0.081 Gbps; 0% disk IO; 2.6 GB / 10.0 GB RAM )
10.181.159.41:5504 ( 4% cpu; 3% machine; 0.081 Gbps; 1% disk IO; 2.5 GB / 10.0 GB RAM )
10.181.159.41:5508 ( 1% cpu; 3% machine; 0.081 Gbps; 0% disk IO; 2.4 GB / 10.0 GB RAM )
10.181.159.41:5512 ( 1% cpu; 3% machine; 0.081 Gbps; 0% disk IO; 2.5 GB / 10.0 GB RAM )
10.181.159.41:5516 ( 1% cpu; 3% machine; 0.081 Gbps; 0% disk IO; 2.7 GB / 10.0 GB RAM )
10.181.159.41:5520 ( 1% cpu; 3% machine; 0.081 Gbps; 0% disk IO; 2.5 GB / 10.0 GB RAM )
10.181.159.41:5524 ( 1% cpu; 3% machine; 0.081 Gbps; 0% disk IO; 2.5 GB / 10.0 GB RAM )
10.181.159.41:5528 ( 1% cpu; 3% machine; 0.081 Gbps; 0% disk IO; 2.6 GB / 10.0 GB RAM )
10.181.159.41:6500 ( 2% cpu; 3% machine; 0.081 Gbps; 0% disk IO; 0.2 GB / 10.0 GB RAM )
10.181.159.41:7500 ( 0% cpu; 3% machine; 0.081 Gbps; 0% disk IO; 0.1 GB / 10.0 GB RAM )
10.181.159.41:7501 ( 1% cpu; 3% machine; 0.081 Gbps; 0% disk IO; 0.1 GB / 10.0 GB RAM )
10.181.159.41:7502 ( 2% cpu; 3% machine; 0.081 Gbps; 0% disk IO; 0.2 GB / 10.0 GB RAM )
10.181.159.46:5500 ( 1% cpu; 6% machine; 0.152 Gbps; 0% disk IO; 2.6 GB / 10.0 GB RAM )
10.181.159.46:5504 ( 9% cpu; 6% machine; 0.152 Gbps; 13% disk IO; 2.5 GB / 10.0 GB RAM )
10.181.159.46:5508 ( 2% cpu; 6% machine; 0.152 Gbps; 0% disk IO; 2.4 GB / 10.0 GB RAM )
10.181.159.46:5512 ( 4% cpu; 6% machine; 0.152 Gbps; 1% disk IO; 2.7 GB / 10.0 GB RAM )
10.181.159.46:5516 ( 2% cpu; 6% machine; 0.152 Gbps; 1% disk IO; 2.7 GB / 10.0 GB RAM )
10.181.159.46:5520 ( 30% cpu; 6% machine; 0.152 Gbps; 31% disk IO; 2.6 GB / 10.0 GB RAM )
10.181.159.46:5524 ( 9% cpu; 6% machine; 0.152 Gbps; 13% disk IO; 2.5 GB / 10.0 GB RAM )
10.181.159.46:5528 ( 1% cpu; 6% machine; 0.152 Gbps; 0% disk IO; 2.5 GB / 10.0 GB RAM )
10.181.159.46:6500 ( 2% cpu; 6% machine; 0.152 Gbps; 0% disk IO; 0.2 GB / 10.0 GB RAM )
10.181.159.46:7500 ( 2% cpu; 6% machine; 0.152 Gbps; 0% disk IO; 0.2 GB / 10.0 GB RAM )
10.181.159.46:7501 ( 0% cpu; 6% machine; 0.152 Gbps; 0% disk IO; 0.1 GB / 10.0 GB RAM )
10.181.159.46:7502 ( 0% cpu; 6% machine; 0.152 Gbps; 0% disk IO; 0.1 GB / 10.0 GB RAM )
10.181.159.47:5500 ( 6% cpu; 6% machine; 0.006 Gbps; 6% disk IO; 2.6 GB / 10.0 GB RAM )
10.181.159.47:5504 ( 1% cpu; 6% machine; 0.006 Gbps; 0% disk IO; 2.6 GB / 10.0 GB RAM )
10.181.159.47:5508 ( 1% cpu; 6% machine; 0.006 Gbps; 0% disk IO; 2.6 GB / 10.0 GB RAM )
10.181.159.47:5512 ( 1% cpu; 6% machine; 0.006 Gbps; 0% disk IO; 2.5 GB / 10.0 GB RAM )
10.181.159.47:5516 ( 1% cpu; 6% machine; 0.006 Gbps; 0% disk IO; 2.6 GB / 10.0 GB RAM )
10.181.159.47:5520 ( 1% cpu; 6% machine; 0.006 Gbps; 0% disk IO; 2.5 GB / 10.0 GB RAM )
10.181.159.47:5524 ( 1% cpu; 6% machine; 0.006 Gbps; 0% disk IO; 2.7 GB / 10.0 GB RAM )
10.181.159.47:5528 ( 1% cpu; 6% machine; 0.006 Gbps; 0% disk IO; 2.8 GB / 10.0 GB RAM )
10.181.159.47:6500 ( 2% cpu; 6% machine; 0.006 Gbps; 0% disk IO; 0.2 GB / 10.0 GB RAM )
10.181.159.47:7500 ( 3% cpu; 6% machine; 0.006 Gbps; 0% disk IO; 0.1 GB / 10.0 GB RAM )
10.181.159.47:7501 ( 0% cpu; 6% machine; 0.006 Gbps; 0% disk IO; 0.1 GB / 10.0 GB RAM )
10.181.159.47:7502 ( 0% cpu; 6% machine; 0.006 Gbps; 0% disk IO; 0.1 GB / 10.0 GB RAM )
10.181.159.48:5500 ( 49% cpu; 6% machine; 0.207 Gbps; 55% disk IO; 2.6 GB / 10.0 GB RAM )
10.181.159.48:5504 ( 5% cpu; 6% machine; 0.207 Gbps; 0% disk IO; 2.6 GB / 10.0 GB RAM )
10.181.159.48:5508 ( 1% cpu; 6% machine; 0.207 Gbps; 0% disk IO; 2.4 GB / 10.0 GB RAM )
10.181.159.48:5512 ( 1% cpu; 6% machine; 0.207 Gbps; 0% disk IO; 2.5 GB / 10.0 GB RAM )
10.181.159.48:5516 ( 1% cpu; 6% machine; 0.207 Gbps; 0% disk IO; 2.6 GB / 10.0 GB RAM )
10.181.159.48:5520 ( 2% cpu; 6% machine; 0.207 Gbps; 2% disk IO; 2.7 GB / 10.0 GB RAM )
10.181.159.48:5524 ( 1% cpu; 6% machine; 0.207 Gbps; 0% disk IO; 2.6 GB / 10.0 GB RAM )
10.181.159.48:5528 ( 1% cpu; 6% machine; 0.207 Gbps; 0% disk IO; 2.7 GB / 10.0 GB RAM )
10.181.159.48:6500 ( 2% cpu; 6% machine; 0.207 Gbps; 0% disk IO; 0.2 GB / 10.0 GB RAM )
10.181.159.48:7500 ( 8% cpu; 6% machine; 0.207 Gbps; 0% disk IO; 0.1 GB / 10.0 GB RAM )
10.181.159.48:7501 ( 0% cpu; 6% machine; 0.207 Gbps; 0% disk IO; 0.1 GB / 10.0 GB RAM )
10.181.159.48:7502 ( 0% cpu; 6% machine; 0.207 Gbps; 0% disk IO; 0.1 GB / 10.0 GB RAM )
10.181.159.64:5500 ( 33% cpu; 7% machine; 0.172 Gbps; 35% disk IO; 0.6 GB / 10.0 GB RAM )
10.181.159.64:5504 ( 1% cpu; 7% machine; 0.172 Gbps; 0% disk IO; 2.7 GB / 10.0 GB RAM )
10.181.159.64:5508 ( 36% cpu; 7% machine; 0.172 Gbps; 33% disk IO; 2.4 GB / 10.0 GB RAM )
10.181.159.64:5512 ( 1% cpu; 7% machine; 0.172 Gbps; 0% disk IO; 2.4 GB / 10.0 GB RAM )
10.181.159.64:5516 ( 1% cpu; 7% machine; 0.172 Gbps; 0% disk IO; 2.6 GB / 10.0 GB RAM )
10.181.159.64:5520 ( 1% cpu; 7% machine; 0.172 Gbps; 0% disk IO; 2.6 GB / 10.0 GB RAM )
10.181.159.64:5524 ( 1% cpu; 7% machine; 0.172 Gbps; 0% disk IO; 2.6 GB / 10.0 GB RAM )
10.181.159.64:5528 ( 1% cpu; 7% machine; 0.172 Gbps; 0% disk IO; 2.5 GB / 10.0 GB RAM )
10.181.159.64:6500 ( 0% cpu; 7% machine; 0.172 Gbps; 0% disk IO; 0.2 GB / 10.0 GB RAM )
10.181.159.64:7500 ( 2% cpu; 7% machine; 0.172 Gbps; 0% disk IO; 0.1 GB / 10.0 GB RAM )
10.181.159.64:7501 ( 0% cpu; 7% machine; 0.172 Gbps; 0% disk IO; 0.1 GB / 10.0 GB RAM )
10.181.159.64:7502 ( 0% cpu; 7% machine; 0.172 Gbps; 0% disk IO; 0.1 GB / 10.0 GB RAM )
10.181.159.65:5500 ( 12% cpu; 11% machine; 0.292 Gbps; 1% disk IO; 2.5 GB / 10.0 GB RAM )
10.181.159.65:5504 ( 2% cpu; 11% machine; 0.292 Gbps; 0% disk IO; 2.6 GB / 10.0 GB RAM )
10.181.159.65:5508 ( 3% cpu; 11% machine; 0.292 Gbps; 1% disk IO; 2.6 GB / 10.0 GB RAM )
10.181.159.65:5512 ( 2% cpu; 11% machine; 0.292 Gbps; 0% disk IO; 2.6 GB / 10.0 GB RAM )
10.181.159.65:5516 ( 9% cpu; 11% machine; 0.292 Gbps; 0% disk IO; 2.5 GB / 10.0 GB RAM )
10.181.159.65:5520 ( 2% cpu; 11% machine; 0.292 Gbps; 0% disk IO; 2.5 GB / 10.0 GB RAM )
10.181.159.65:5524 ( 2% cpu; 11% machine; 0.292 Gbps; 0% disk IO; 2.6 GB / 10.0 GB RAM )
10.181.159.65:5528 ( 2% cpu; 11% machine; 0.292 Gbps; 0% disk IO; 2.5 GB / 10.0 GB RAM )
10.181.159.65:6500 ( 0% cpu; 11% machine; 0.292 Gbps; 0% disk IO; 0.1 GB / 10.0 GB RAM )
10.181.159.65:7500 ( 5% cpu; 11% machine; 0.292 Gbps; 0% disk IO; 0.2 GB / 10.0 GB RAM )
10.181.159.65:7501 ( 1% cpu; 11% machine; 0.292 Gbps; 0% disk IO; 0.2 GB / 10.0 GB RAM )
10.181.159.65:7502 ( 0% cpu; 11% machine; 0.292 Gbps; 0% disk IO; 0.1 GB / 10.0 GB RAM )
10.195.152.44:5500 ( 8% cpu; 9% machine; 0.224 Gbps; 3% disk IO; 2.7 GB / 10.0 GB RAM )
10.195.152.44:5504 ( 1% cpu; 9% machine; 0.224 Gbps; 0% disk IO; 2.5 GB / 10.0 GB RAM )
10.195.152.44:5508 ( 1% cpu; 9% machine; 0.224 Gbps; 0% disk IO; 2.5 GB / 10.0 GB RAM )
10.195.152.44:5512 ( 1% cpu; 9% machine; 0.224 Gbps; 0% disk IO; 2.4 GB / 10.0 GB RAM )
10.195.152.44:5516 ( 1% cpu; 9% machine; 0.224 Gbps; 0% disk IO; 2.6 GB / 10.0 GB RAM )
10.195.152.44:5520 ( 3% cpu; 9% machine; 0.224 Gbps; 0% disk IO; 2.5 GB / 10.0 GB RAM )
10.195.152.44:5524 ( 4% cpu; 9% machine; 0.224 Gbps; 0% disk IO; 2.4 GB / 10.0 GB RAM )
10.195.152.44:5528 ( 2% cpu; 9% machine; 0.224 Gbps; 1% disk IO; 2.5 GB / 10.0 GB RAM )
10.195.152.44:6500 ( 2% cpu; 9% machine; 0.224 Gbps; 0% disk IO; 0.2 GB / 10.0 GB RAM )
10.195.152.44:7500 ( 0% cpu; 9% machine; 0.224 Gbps; 0% disk IO; 0.1 GB / 10.0 GB RAM )
10.195.152.44:7501 ( 0% cpu; 9% machine; 0.224 Gbps; 0% disk IO; 0.1 GB / 10.0 GB RAM )
10.195.152.44:7502 ( 0% cpu; 9% machine; 0.224 Gbps; 0% disk IO; 0.1 GB / 10.0 GB RAM )
10.195.152.45:5500 ( 34% cpu; 10% machine; 0.136 Gbps; 36% disk IO; 2.5 GB / 10.0 GB RAM )
10.195.152.45:5504 ( 1% cpu; 10% machine; 0.136 Gbps; 0% disk IO; 2.4 GB / 10.0 GB RAM )
10.195.152.45:5508 ( 1% cpu; 10% machine; 0.136 Gbps; 0% disk IO; 2.5 GB / 10.0 GB RAM )
10.195.152.45:5512 ( 1% cpu; 10% machine; 0.136 Gbps; 0% disk IO; 2.5 GB / 10.0 GB RAM )
10.195.152.45:5516 ( 2% cpu; 10% machine; 0.136 Gbps; 1% disk IO; 2.6 GB / 10.0 GB RAM )
10.195.152.45:5520 ( 1% cpu; 10% machine; 0.136 Gbps; 0% disk IO; 2.6 GB / 10.0 GB RAM )
10.195.152.45:5524 ( 4% cpu; 10% machine; 0.136 Gbps; 0% disk IO; 2.4 GB / 10.0 GB RAM )
10.195.152.45:5528 ( 1% cpu; 10% machine; 0.136 Gbps; 0% disk IO; 2.6 GB / 10.0 GB RAM )
10.195.152.45:6500 ( 2% cpu; 10% machine; 0.136 Gbps; 0% disk IO; 0.2 GB / 10.0 GB RAM )
10.195.152.45:7500 ( 0% cpu; 10% machine; 0.136 Gbps; 0% disk IO; 0.1 GB / 10.0 GB RAM )
10.195.152.45:7501 ( 0% cpu; 10% machine; 0.136 Gbps; 0% disk IO; 0.1 GB / 10.0 GB RAM )
10.195.152.45:7502 ( 0% cpu; 10% machine; 0.136 Gbps; 0% disk IO; 0.1 GB / 10.0 GB RAM )
10.195.152.47:5500 ( 2% cpu; 6% machine; 0.144 Gbps; 1% disk IO; 2.6 GB / 10.0 GB RAM )
10.195.152.47:5504 ( 4% cpu; 6% machine; 0.144 Gbps; 0% disk IO; 2.5 GB / 10.0 GB RAM )
10.195.152.47:5508 ( 1% cpu; 6% machine; 0.144 Gbps; 0% disk IO; 2.4 GB / 10.0 GB RAM )
10.195.152.47:5512 ( 1% cpu; 6% machine; 0.144 Gbps; 0% disk IO; 2.6 GB / 10.0 GB RAM )
10.195.152.47:5516 ( 1% cpu; 6% machine; 0.144 Gbps; 0% disk IO; 2.6 GB / 10.0 GB RAM )
10.195.152.47:5520 ( 1% cpu; 6% machine; 0.144 Gbps; 0% disk IO; 2.6 GB / 10.0 GB RAM )
10.195.152.47:5524 ( 1% cpu; 6% machine; 0.144 Gbps; 0% disk IO; 2.6 GB / 10.0 GB RAM )
10.195.152.47:5528 ( 1% cpu; 6% machine; 0.144 Gbps; 0% disk IO; 2.5 GB / 10.0 GB RAM )
10.195.152.47:6500 ( 0% cpu; 6% machine; 0.144 Gbps; 0% disk IO; 0.2 GB / 10.0 GB RAM )
10.195.152.47:7500 ( 0% cpu; 6% machine; 0.144 Gbps; 0% disk IO; 0.1 GB / 10.0 GB RAM )
10.195.152.47:7501 ( 0% cpu; 6% machine; 0.144 Gbps; 0% disk IO; 0.1 GB / 10.0 GB RAM )
10.195.152.47:7502 ( 0% cpu; 6% machine; 0.144 Gbps; 0% disk IO; 0.1 GB / 10.0 GB RAM )
10.195.152.48:5500 ( 1% cpu; 10% machine; 0.229 Gbps; 0% disk IO; 2.5 GB / 10.0 GB RAM )
10.195.152.48:5504 ( 2% cpu; 10% machine; 0.229 Gbps; 1% disk IO; 2.6 GB / 10.0 GB RAM )
10.195.152.48:5508 ( 1% cpu; 10% machine; 0.229 Gbps; 0% disk IO; 2.4 GB / 10.0 GB RAM )
10.195.152.48:5512 ( 1% cpu; 10% machine; 0.229 Gbps; 0% disk IO; 2.5 GB / 10.0 GB RAM )
10.195.152.48:5516 ( 1% cpu; 10% machine; 0.229 Gbps; 0% disk IO; 2.5 GB / 10.0 GB RAM )
10.195.152.48:5520 ( 4% cpu; 10% machine; 0.229 Gbps; 1% disk IO; 2.5 GB / 10.0 GB RAM )
10.195.152.48:5524 ( 1% cpu; 10% machine; 0.229 Gbps; 0% disk IO; 2.7 GB / 10.0 GB RAM )
10.195.152.48:5528 ( 1% cpu; 10% machine; 0.229 Gbps; 0% disk IO; 2.4 GB / 10.0 GB RAM )
10.195.152.48:6500 ( 0% cpu; 10% machine; 0.229 Gbps; 0% disk IO; 0.1 GB / 10.0 GB RAM )
10.195.152.48:7500 ( 0% cpu; 10% machine; 0.229 Gbps; 0% disk IO; 0.1 GB / 10.0 GB RAM )
10.195.152.48:7501 ( 0% cpu; 10% machine; 0.229 Gbps; 0% disk IO; 0.1 GB / 10.0 GB RAM )
10.195.152.48:7502 ( 0% cpu; 10% machine; 0.229 Gbps; 0% disk IO; 0.1 GB / 10.0 GB RAM )
10.195.152.50:5500 ( 1% cpu; 11% machine; 0.149 Gbps; 0% disk IO; 2.6 GB / 10.0 GB RAM )
10.195.152.50:5504 ( 4% cpu; 11% machine; 0.149 Gbps; 1% disk IO; 2.8 GB / 10.0 GB RAM )
10.195.152.50:5508 ( 1% cpu; 11% machine; 0.149 Gbps; 0% disk IO; 2.5 GB / 10.0 GB RAM )
10.195.152.50:5512 ( 1% cpu; 11% machine; 0.149 Gbps; 0% disk IO; 2.6 GB / 10.0 GB RAM )
10.195.152.50:5516 ( 1% cpu; 11% machine; 0.149 Gbps; 0% disk IO; 2.6 GB / 10.0 GB RAM )
10.195.152.50:5520 ( 1% cpu; 11% machine; 0.149 Gbps; 0% disk IO; 2.6 GB / 10.0 GB RAM )
10.195.152.50:5524 ( 1% cpu; 11% machine; 0.149 Gbps; 0% disk IO; 2.4 GB / 10.0 GB RAM )
10.195.152.50:5528 ( 1% cpu; 11% machine; 0.149 Gbps; 0% disk IO; 2.5 GB / 10.0 GB RAM )
10.195.152.50:6500 ( 2% cpu; 11% machine; 0.149 Gbps; 0% disk IO; 0.2 GB / 10.0 GB RAM )
10.195.152.50:7500 ( 0% cpu; 11% machine; 0.149 Gbps; 0% disk IO; 0.1 GB / 10.0 GB RAM )
10.195.152.50:7501 ( 0% cpu; 11% machine; 0.149 Gbps; 0% disk IO; 0.1 GB / 10.0 GB RAM )
10.195.152.50:7502 ( 0% cpu; 11% machine; 0.149 Gbps; 0% disk IO; 0.1 GB / 10.0 GB RAM )
10.195.152.51:5500 ( 1% cpu; 13% machine; 0.014 Gbps; 0% disk IO; 2.6 GB / 10.0 GB RAM )
10.195.152.51:5504 ( 1% cpu; 13% machine; 0.014 Gbps; 0% disk IO; 2.6 GB / 10.0 GB RAM )
10.195.152.51:5508 ( 2% cpu; 13% machine; 0.014 Gbps; 1% disk IO; 2.7 GB / 10.0 GB RAM )
10.195.152.51:5512 ( 1% cpu; 13% machine; 0.014 Gbps; 0% disk IO; 2.5 GB / 10.0 GB RAM )
10.195.152.51:5516 ( 1% cpu; 13% machine; 0.014 Gbps; 0% disk IO; 2.6 GB / 10.0 GB RAM )
10.195.152.51:5520 ( 1% cpu; 13% machine; 0.014 Gbps; 0% disk IO; 2.5 GB / 10.0 GB RAM )
10.195.152.51:5524 ( 2% cpu; 13% machine; 0.014 Gbps; 1% disk IO; 2.6 GB / 10.0 GB RAM )
10.195.152.51:5528 ( 4% cpu; 13% machine; 0.014 Gbps; 3% disk IO; 2.5 GB / 10.0 GB RAM )
10.195.152.51:6500 ( 2% cpu; 13% machine; 0.014 Gbps; 0% disk IO; 0.2 GB / 10.0 GB RAM )
10.195.152.51:7500 ( 0% cpu; 13% machine; 0.014 Gbps; 0% disk IO; 0.1 GB / 10.0 GB RAM )
10.195.152.51:7501 ( 1% cpu; 13% machine; 0.014 Gbps; 0% disk IO; 0.2 GB / 10.0 GB RAM )
10.195.152.51:7502 ( 0% cpu; 13% machine; 0.014 Gbps; 0% disk IO; 0.1 GB / 10.0 GB RAM )
10.195.154.44:5500 ( 49% cpu; 9% machine; 0.158 Gbps; 56% disk IO; 2.5 GB / 10.0 GB RAM )
10.195.154.44:5504 ( 4% cpu; 9% machine; 0.158 Gbps; 0% disk IO; 2.5 GB / 10.0 GB RAM )
10.195.154.44:5508 ( 1% cpu; 9% machine; 0.158 Gbps; 0% disk IO; 2.4 GB / 10.0 GB RAM )
10.195.154.44:5512 ( 1% cpu; 9% machine; 0.158 Gbps; 0% disk IO; 2.8 GB / 10.0 GB RAM )
10.195.154.44:5516 ( 1% cpu; 9% machine; 0.158 Gbps; 0% disk IO; 2.5 GB / 10.0 GB RAM )
10.195.154.44:5520 ( 1% cpu; 9% machine; 0.158 Gbps; 0% disk IO; 2.5 GB / 10.0 GB RAM )
10.195.154.44:5524 ( 1% cpu; 9% machine; 0.158 Gbps; 0% disk IO; 2.5 GB / 10.0 GB RAM )
10.195.154.44:5528 ( 1% cpu; 9% machine; 0.158 Gbps; 0% disk IO; 2.5 GB / 10.0 GB RAM )
10.195.154.44:6500 ( 2% cpu; 9% machine; 0.158 Gbps; 0% disk IO; 0.2 GB / 10.0 GB RAM )
10.195.154.44:7500 ( 0% cpu; 9% machine; 0.158 Gbps; 0% disk IO; 0.1 GB / 10.0 GB RAM )
10.195.154.44:7501 ( 0% cpu; 9% machine; 0.158 Gbps; 0% disk IO; 0.1 GB / 10.0 GB RAM )
10.195.154.44:7502 ( 0% cpu; 9% machine; 0.158 Gbps; 0% disk IO; 0.1 GB / 10.0 GB RAM )
10.195.154.45:5500 ( 1% cpu; 7% machine; 0.157 Gbps; 0% disk IO; 2.6 GB / 10.0 GB RAM )
10.195.154.45:5504 ( 7% cpu; 7% machine; 0.157 Gbps; 3% disk IO; 2.5 GB / 10.0 GB RAM )
10.195.154.45:5508 ( 4% cpu; 7% machine; 0.157 Gbps; 1% disk IO; 2.7 GB / 10.0 GB RAM )
10.195.154.45:5512 ( 1% cpu; 7% machine; 0.157 Gbps; 0% disk IO; 2.6 GB / 10.0 GB RAM )
10.195.154.45:5516 ( 1% cpu; 7% machine; 0.157 Gbps; 0% disk IO; 2.6 GB / 10.0 GB RAM )
10.195.154.45:5520 ( 3% cpu; 7% machine; 0.157 Gbps; 0% disk IO; 2.6 GB / 10.0 GB RAM )
10.195.154.45:5524 ( 2% cpu; 7% machine; 0.157 Gbps; 1% disk IO; 2.5 GB / 10.0 GB RAM )
10.195.154.45:5528 ( 13% cpu; 7% machine; 0.157 Gbps; 12% disk IO; 2.6 GB / 10.0 GB RAM )
10.195.154.45:6500 ( 2% cpu; 7% machine; 0.157 Gbps; 0% disk IO; 0.2 GB / 10.0 GB RAM )
10.195.154.45:7500 ( 0% cpu; 7% machine; 0.157 Gbps; 0% disk IO; 0.1 GB / 10.0 GB RAM )
10.195.154.45:7501 ( 0% cpu; 7% machine; 0.157 Gbps; 0% disk IO; 0.1 GB / 10.0 GB RAM )
10.195.154.45:7502 ( 0% cpu; 7% machine; 0.157 Gbps; 0% disk IO; 0.1 GB / 10.0 GB RAM )
10.195.154.46:5500 ( 1% cpu; 4% machine; 0.270 Gbps; 0% disk IO; 2.5 GB / 10.0 GB RAM )
10.195.154.46:5504 ( 1% cpu; 4% machine; 0.270 Gbps; 0% disk IO; 2.3 GB / 10.0 GB RAM )
10.195.154.46:5508 ( 3% cpu; 4% machine; 0.270 Gbps; 6% disk IO; 2.5 GB / 10.0 GB RAM )
10.195.154.46:5512 ( 1% cpu; 4% machine; 0.270 Gbps; 0% disk IO; 2.6 GB / 10.0 GB RAM )
10.195.154.46:5516 ( 1% cpu; 4% machine; 0.270 Gbps; 0% disk IO; 2.5 GB / 10.0 GB RAM )
10.195.154.46:5520 ( 1% cpu; 4% machine; 0.270 Gbps; 0% disk IO; 2.6 GB / 10.0 GB RAM )
10.195.154.46:5524 ( 4% cpu; 4% machine; 0.270 Gbps; 0% disk IO; 2.5 GB / 10.0 GB RAM )
10.195.154.46:5528 ( 4% cpu; 4% machine; 0.270 Gbps; 0% disk IO; 2.4 GB / 10.0 GB RAM )
10.195.154.46:6500 ( 0% cpu; 4% machine; 0.270 Gbps; 0% disk IO; 0.2 GB / 10.0 GB RAM )
10.195.154.46:7500 ( 0% cpu; 4% machine; 0.270 Gbps; 0% disk IO; 0.1 GB / 10.0 GB RAM )
10.195.154.46:7501 ( 0% cpu; 4% machine; 0.270 Gbps; 0% disk IO; 0.1 GB / 10.0 GB RAM )
10.195.154.46:7502 ( 0% cpu; 4% machine; 0.270 Gbps; 0% disk IO; 0.1 GB / 10.0 GB RAM )
10.195.154.47:5500 ( 1% cpu; 9% machine; 0.013 Gbps; 0% disk IO; 2.6 GB / 10.0 GB RAM )
10.195.154.47:5504 ( 14% cpu; 9% machine; 0.013 Gbps; 13% disk IO; 2.5 GB / 10.0 GB RAM )
10.195.154.47:5508 ( 1% cpu; 9% machine; 0.013 Gbps; 0% disk IO; 2.6 GB / 10.0 GB RAM )
10.195.154.47:5512 ( 1% cpu; 9% machine; 0.013 Gbps; 0% disk IO; 2.5 GB / 10.0 GB RAM )
10.195.154.47:5516 ( 1% cpu; 9% machine; 0.013 Gbps; 0% disk IO; 2.6 GB / 10.0 GB RAM )
10.195.154.47:5520 ( 1% cpu; 9% machine; 0.013 Gbps; 0% disk IO; 2.5 GB / 10.0 GB RAM )
10.195.154.47:5524 ( 1% cpu; 9% machine; 0.013 Gbps; 0% disk IO; 2.5 GB / 10.0 GB RAM )
10.195.154.47:5528 ( 1% cpu; 9% machine; 0.013 Gbps; 0% disk IO; 2.7 GB / 10.0 GB RAM )
10.195.154.47:6500 ( 2% cpu; 9% machine; 0.013 Gbps; 0% disk IO; 0.2 GB / 10.0 GB RAM )
10.195.154.47:7500 ( 0% cpu; 9% machine; 0.013 Gbps; 0% disk IO; 0.1 GB / 10.0 GB RAM )
10.195.154.47:7501 ( 0% cpu; 9% machine; 0.013 Gbps; 0% disk IO; 0.1 GB / 10.0 GB RAM )
10.195.154.47:7502 ( 0% cpu; 9% machine; 0.013 Gbps; 0% disk IO; 0.1 GB / 10.0 GB RAM )
10.195.154.48:5500 ( 1% cpu; 8% machine; 0.003 Gbps; 0% disk IO; 2.5 GB / 10.0 GB RAM )
10.195.154.48:5504 ( 2% cpu; 8% machine; 0.003 Gbps; 1% disk IO; 2.6 GB / 10.0 GB RAM )
10.195.154.48:5508 ( 1% cpu; 8% machine; 0.003 Gbps; 0% disk IO; 2.6 GB / 10.0 GB RAM )
10.195.154.48:5512 ( 1% cpu; 8% machine; 0.003 Gbps; 0% disk IO; 2.5 GB / 10.0 GB RAM )
10.195.154.48:5516 ( 1% cpu; 8% machine; 0.003 Gbps; 0% disk IO; 2.5 GB / 10.0 GB RAM )
10.195.154.48:5520 ( 7% cpu; 8% machine; 0.003 Gbps; 10% disk IO; 2.6 GB / 10.0 GB RAM )
10.195.154.48:5524 ( 1% cpu; 8% machine; 0.003 Gbps; 0% disk IO; 2.5 GB / 10.0 GB RAM )
10.195.154.48:5528 ( 1% cpu; 8% machine; 0.003 Gbps; 0% disk IO; 2.5 GB / 10.0 GB RAM )
10.195.154.48:6500 ( 2% cpu; 8% machine; 0.003 Gbps; 0% disk IO; 0.2 GB / 10.0 GB RAM )
10.195.154.48:7500 ( 0% cpu; 8% machine; 0.003 Gbps; 0% disk IO; 0.1 GB / 10.0 GB RAM )
10.195.154.48:7501 ( 0% cpu; 8% machine; 0.003 Gbps; 0% disk IO; 0.1 GB / 10.0 GB RAM )
10.195.154.48:7502 ( 0% cpu; 8% machine; 0.003 Gbps; 0% disk IO; 0.1 GB / 10.0 GB RAM )
10.195.154.50:5500 ( 1% cpu; 7% machine; 0.288 Gbps; 0% disk IO; 2.5 GB / 10.0 GB RAM )
10.195.154.50:5504 ( 4% cpu; 7% machine; 0.288 Gbps; 1% disk IO; 2.5 GB / 10.0 GB RAM )
10.195.154.50:5508 ( 36% cpu; 7% machine; 0.288 Gbps; 32% disk IO; 2.5 GB / 10.0 GB RAM )
10.195.154.50:5512 ( 1% cpu; 7% machine; 0.288 Gbps; 0% disk IO; 2.6 GB / 10.0 GB RAM )
10.195.154.50:5516 ( 1% cpu; 7% machine; 0.288 Gbps; 0% disk IO; 2.5 GB / 10.0 GB RAM )
10.195.154.50:5520 ( 1% cpu; 7% machine; 0.288 Gbps; 0% disk IO; 2.6 GB / 10.0 GB RAM )
10.195.154.50:5524 ( 1% cpu; 7% machine; 0.288 Gbps; 0% disk IO; 2.6 GB / 10.0 GB RAM )
10.195.154.50:5528 ( 1% cpu; 7% machine; 0.288 Gbps; 0% disk IO; 2.5 GB / 10.0 GB RAM )
10.195.154.50:6500 ( 0% cpu; 7% machine; 0.288 Gbps; 0% disk IO; 0.1 GB / 10.0 GB RAM )
10.195.154.50:7500 ( 0% cpu; 7% machine; 0.288 Gbps; 0% disk IO; 0.1 GB / 10.0 GB RAM )
10.195.154.50:7501 ( 0% cpu; 7% machine; 0.288 Gbps; 0% disk IO; 0.1 GB / 10.0 GB RAM )
10.195.154.50:7502 ( 0% cpu; 7% machine; 0.288 Gbps; 0% disk IO; 0.1 GB / 10.0 GB RAM )
Coordination servers:
10.181.159.41:7502 (reachable)
10.181.159.47:6500 (reachable)
10.181.159.65:7501 (reachable)
10.195.152.48:5500 (reachable)
10.195.152.50:5508 (reachable)
10.195.152.51:7501 (reachable)
10.195.154.46:5520 (reachable)
Client time: 01/30/23 05:44:29
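In case it helps to digest the process list above, here is a rough sketch that sums the reported RAM per machine and finds the largest single process. The regular expression is my own guess at the line layout based on the output above, not an official fdbcli format, and the helper name `ram_by_machine` is hypothetical. Only three sample lines are embedded; the full list can be pasted in their place.

```python
import re
from collections import defaultdict

# Three lines copied from the status output above.
sample = """\
10.181.159.41:5500 ( 2% cpu; 3% machine; 0.081 Gbps; 0% disk IO; 2.6 GB / 10.0 GB RAM )
10.181.159.41:6500 ( 2% cpu; 3% machine; 0.081 Gbps; 0% disk IO; 0.2 GB / 10.0 GB RAM )
10.181.159.41:7500 ( 0% cpu; 3% machine; 0.081 Gbps; 0% disk IO; 0.1 GB / 10.0 GB RAM )
"""

# Captures: machine IP, port, used RAM in GB, available RAM in GB.
LINE = re.compile(r"^(\S+):(\d+) \(.*?;\s*([\d.]+) GB / ([\d.]+) GB RAM \)$")

def ram_by_machine(text):
    """Return ({ip: total used GB}, (address, used GB) of the largest process)."""
    totals = defaultdict(float)
    largest = ("", 0.0)
    for line in text.splitlines():
        m = LINE.match(line.strip())
        if not m:
            continue  # skip headers and anything that is not a process line
        ip, port, used = m.group(1), m.group(2), float(m.group(3))
        totals[ip] += used
        if used > largest[1]:
            largest = (f"{ip}:{port}", used)
    return dict(totals), largest

totals, largest = ram_by_machine(sample)
print(totals, largest)
```

At the time of this status snapshot no process reports anything close to 10 GB, so the snapshot by itself does not show the memory growth that led to the OOM.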
How the storage server process is started:
UID PID PPID C STIME TTY TIME CMD
root 1 0 9 09:10 ? 00:04:29 fdbserver --memory 10GiB --cache-memory 4GiB --seed-connection-string docker:docker@10.181.159.41:5500 --cluster-file /etc/foundationdb/fdb.cluster --listen-address 0.0.0.0:5506 --public-address 10.195.152.51:5506 --locality-diskid sdb --datadir /var/fdb/data --logdir /var/fdb/logs --locality-machineid hostname-01 --locality-zoneid hostname-01 --class storage --locality-dcid ningbo1
OS log:
Jan 30 15:55:51 hostname-01 kernel: rocksdb:low invoked oom-killer: gfp_mask=0x6201ca(GFP_HIGHUSER_MOVABLE|__GFP_WRITE), nodemask=(null), order=0, oom_score_adj=999
Jan 30 15:55:51 hostname-01 kernel: rocksdb:low cpuset=0e78f7a7e07b1325f491dd7f2e39462ba3775d5cf9585d1a2961e828ce0a27f5 mems_allowed=0-1
Jan 30 15:55:51 hostname-01 kernel: CPU: 58 PID: 454919 Comm: rocksdb:low Kdump: loaded Not tainted 4.19.25-206.el7_6.bclinux.x86_64 #1
Jan 30 15:55:51 hostname-01 kernel: Hardware name: ZTE R5500 G4/R5500G4, BIOS 03.15.0100_70562 03/04/2020
Jan 30 15:55:51 hostname-01 kernel: Call Trace:
Jan 30 15:55:51 hostname-01 kernel: dump_stack+0x5a/0x73
Jan 30 15:55:51 hostname-01 kernel: dump_header+0x77/0x29c
Jan 30 15:55:51 hostname-01 kernel: ? mem_cgroup_scan_tasks+0x8f/0xe0
Jan 30 15:55:51 hostname-01 kernel: oom_kill_process+0x25e/0x290
Jan 30 15:55:51 hostname-01 kernel: out_of_memory+0x134/0x4b0
Jan 30 15:55:51 hostname-01 kernel: mem_cgroup_out_of_memory+0x49/0x80
Jan 30 15:55:51 hostname-01 kernel: try_charge+0x6f2/0x760
Jan 30 15:55:51 hostname-01 kernel: mem_cgroup_try_charge+0x6f/0x220
Jan 30 15:55:51 hostname-01 kernel: __add_to_page_cache_locked+0x146/0x260
Jan 30 15:55:51 hostname-01 kernel: add_to_page_cache_lru+0x49/0xd0
Jan 30 15:55:51 hostname-01 kernel: pagecache_get_page+0x7e/0x270
Jan 30 15:55:51 hostname-01 kernel: grab_cache_page_write_begin+0x1f/0x40
Jan 30 15:55:51 hostname-01 kernel: ext4_da_write_begin+0xdf/0x4f0 [ext4]
Jan 30 15:55:51 hostname-01 kernel: generic_perform_write+0xc2/0x1c0
Jan 30 15:55:51 hostname-01 kernel: __generic_file_write_iter+0x184/0x1c0
Jan 30 15:55:51 hostname-01 kernel: ext4_file_write_iter+0xc6/0x410 [ext4]
Jan 30 15:55:51 hostname-01 kernel: ? __switch_to_asm+0x40/0x70
Jan 30 15:55:51 hostname-01 kernel: __vfs_write+0x112/0x1a0
Jan 30 15:55:51 hostname-01 kernel: vfs_write+0xad/0x1a0
Jan 30 15:55:51 hostname-01 kernel: ksys_write+0x52/0xc0
Jan 30 15:55:51 hostname-01 kernel: do_syscall_64+0x5b/0x170
Jan 30 15:55:51 hostname-01 kernel: entry_SYSCALL_64_after_hwframe+0x44/0xa9
Jan 30 15:55:51 hostname-01 kernel: RIP: 0033:0x7fe8700726fd
Jan 30 15:55:51 hostname-01 kernel: Code: cd 20 00 00 75 10 b8 01 00 00 00 0f 05 48 3d 01 f0 ff ff 73 31 c3 48 83 ec 08 e8 4e fd ff ff 48 89 04 24 b8 01 00 00 00 0f 05 <48> 8b 3c 24 48 89 c2 e8 97 fd ff ff 48 89 d0 48 83 c4 08 48 3d 01
Jan 30 15:55:51 hostname-01 kernel: RSP: 002b:00007fe86a1f53a0 EFLAGS: 00000293 ORIG_RAX: 0000000000000001
Jan 30 15:55:51 hostname-01 kernel: RAX: ffffffffffffffda RBX: 00007fe86a1f54b0 RCX: 00007fe8700726fd
Jan 30 15:55:51 hostname-01 kernel: RDX: 00000000000ffa8d RSI: 00007fe7ba334000 RDI: 000000000000001d
Jan 30 15:55:51 hostname-01 kernel: RBP: 00007fe86a1f5400 R08: 0000000000000000 R09: 0000000000000000
Jan 30 15:55:51 hostname-01 kernel: R10: 0000000000000000 R11: 0000000000000293 R12: 00007fe7ba334000
Jan 30 15:55:51 hostname-01 kernel: R13: 00000000000ffa8d R14: 00000000000ffa8d R15: 00007fe856aa4c50
Jan 30 15:55:51 hostname-01 kernel: Task in /kubepods/burstable/podfaf6d591-a4a4-4cd1-aa8d-d4906f29ec11/0e78f7a7e07b1325f491dd7f2e39462ba3775d5cf9585d1a2961e828ce0a27f5 killed as a result of limit of /kubepods/burstable/podfaf6d591-a4a4-4cd1-aa8d-d4906f29ec11
Jan 30 15:55:51 hostname-01 kernel: memory: usage 17578124kB, limit 17578124kB, failcnt 74
Jan 30 15:55:51 hostname-01 kernel: memory+swap: usage 17578124kB, limit 9007199254740988kB, failcnt 0
Jan 30 15:55:51 hostname-01 kernel: kmem: usage 468896kB, limit 9007199254740988kB, failcnt 0
Jan 30 15:55:51 hostname-01 kernel: Memory cgroup stats for /kubepods/burstable/podfaf6d591-a4a4-4cd1-aa8d-d4906f29ec11: cache:0KB rss:0KB rss_huge:0KB shmem:0KB mapped_file:0KB dirty:0KB writeback:0KB swap:0KB inactive_anon:0KB active_anon:0KB inactive_file:0KB active_file:0KB unevictable:0KB
Jan 30 15:55:51 hostname-01 kernel: Memory cgroup stats for /kubepods/burstable/podfaf6d591-a4a4-4cd1-aa8d-d4906f29ec11/b01c03eb030730627899d164596534e1bdebeb39c0ce1f67d6bb97861ae4f10f: cache:0KB rss:0KB rss_huge:0KB shmem:0KB mapped_file:0KB dirty:0KB writeback:0KB swap:0KB inactive_anon:0KB active_anon:0KB inactive_file:0KB active_file:0KB unevictable:0KB
Jan 30 15:55:51 hostname-01 kernel: Memory cgroup stats for /kubepods/burstable/podfaf6d591-a4a4-4cd1-aa8d-d4906f29ec11/0d6a6ea3b8882339088e3b23159b54352315385c2b57b12e5fbaf41d81d7e677: cache:6408KB rss:0KB rss_huge:0KB shmem:0KB mapped_file:0KB dirty:616KB writeback:0KB swap:0KB inactive_anon:0KB active_anon:0KB inactive_file:0KB active_file:0KB unevictable:0KB
Jan 30 15:55:51 hostname-01 kernel: Memory cgroup stats for /kubepods/burstable/podfaf6d591-a4a4-4cd1-aa8d-d4906f29ec11/0e78f7a7e07b1325f491dd7f2e39462ba3775d5cf9585d1a2961e828ce0a27f5: cache:15723104KB rss:1369772KB rss_huge:0KB shmem:72KB mapped_file:0KB dirty:14388KB writeback:1452KB swap:0KB inactive_anon:0KB active_anon:0KB inactive_file:0KB active_file:0KB unevictable:0KB
Jan 30 15:55:51 hostname-01 kernel: Tasks state (memory values in pages):
Jan 30 15:55:51 hostname-01 kernel: [ pid ] uid tgid total_vm rss pgtables_bytes swapents oom_score_adj name
Jan 30 15:55:51 hostname-01 kernel: [ 492467] 0 492467 242 1 28672 0 -998 pause
Jan 30 15:55:51 hostname-01 kernel: [ 454463] 0 454463 769526 354219 5349376 0 999 fdbserver
Jan 30 15:55:51 hostname-01 kernel: Memory cgroup out of memory: Kill process 454463 (fdbserver) score 1079 or sacrifice child
Jan 30 15:55:51 hostname-01 kernel: Killed process 454463 (fdbserver) total-vm:3078104kB, anon-rss:1371748kB, file-rss:45128kB, shmem-rss:0kB
Jan 30 15:55:51 hostname-01 kernel: oom_reaper: reaped process 454463 (fdbserver), now anon-rss:0kB, file-rss:20kB, shmem-rss:0kB
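To see where the memory charged to the container went, here is a small sketch that breaks down the "Memory cgroup stats" line for the fdbserver container against the cgroup limit. The numbers are copied from the log above. The interpretation is only my reading of it: I am assuming that `cache` is page cache and `rss` is anonymous process memory.

```python
import re

# Copied from the "Memory cgroup stats" line of the fdbserver container.
stats_line = (
    "cache:15723104KB rss:1369772KB rss_huge:0KB shmem:72KB "
    "mapped_file:0KB dirty:14388KB writeback:1452KB swap:0KB"
)
limit_kb = 17578124  # from "memory: usage 17578124kB, limit 17578124kB"

# Turn "name:123KB" pairs into a {name: kilobytes} dictionary.
stats = {name: int(kb) for name, kb in re.findall(r"(\w+):(\d+)KB", stats_line)}

for name in ("cache", "rss"):
    gib = stats[name] / 1024 ** 2        # kB -> GiB
    share = 100 * stats[name] / limit_kb  # percentage of the cgroup limit
    print(f"{name:5s} {gib:6.2f} GiB  {share:5.1f}% of limit")
```

If that reading is right, most of the charged memory is `cache` rather than the `rss` of the fdbserver process itself, which would be consistent with the "Killed process" line reporting an anon-rss of only about 1.3 GB.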
I know the rocksdb storage engine is still an experimental feature, so what can we do to improve it?