ERROR: Out of memory with FoundationDB 7.0.0

While using the new storage engine (ssd-redwood-1-experimental), we have run into a problem where our transaction processes fail with ERROR: Out of memory. We haven’t had this problem on other clusters where we run 6.3.13 (configured with configure new single ssd).

While trying to fix this, I added new transaction processes, e.g.:

[fdbserver.4515]
class=transaction

I have noticed that all disk-related processes fail in the same way, although there is enough space on the disks:

Filesystem Size Used Avail Use% Mounted on
/dev/nvme0n1p1 7.7T 1.2T 6.1T 17% /data/nvme0n1
/dev/nvme1n1p1 7.7T 1.2T 6.1T 17% /data/nvme1n1
/dev/nvme2n1p1 7.7T 1.3T 6.1T 17% /data/nvme2n1
/dev/nvme3n1p1 7.7T 6.1G 7.3T 1% /data/nvme3n1

This is our configuration file:

[fdbserver.4500]
class=stateless
[fdbserver.4501]
class=storage
datadir=/data/nvme0n1/foundationdb/data/$ID
logdir=/data/nvme0n1/foundationdb
[fdbserver.4502]
class=storage
datadir=/data/nvme1n1/foundationdb/data/$ID
logdir=/data/nvme1n1/foundationdb
[fdbserver.4503]
class=storage
datadir=/data/nvme2n1/foundationdb/data/$ID
logdir=/data/nvme2n1/foundationdb
[fdbserver.4504]
class=transaction
datadir=/data/nvme3n1/foundationdb/data/$ID
logdir=/data/nvme3n1/foundationdb
[fdbserver.4505]
class=stateless
...
[fdbserver.4515]
class=stateless

Can the problem be with the storage engine, or is it just an incorrect configuration? As I see from the docs, one transaction process is too few and there should be at least 8:

The recommended minimum number of class=transaction (log server) processes is 8 (active) + 2 (standby) and the recommended minimum number for class=stateless processes is 1 (GRV proxy) + 3 (commit proxy) + 1 (resolver) + 1 (cluster controller) + 1 (master) + 2 (standby). It is better to spread the transaction and stateless processes across as many machines as possible.
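If I read that correctly, those minimums would translate into something like this in foundationdb.conf (port numbers here are just placeholders, not from the docs):

# 8 active + 2 standby log servers
[fdbserver.4520]
class=transaction
...
[fdbserver.4529]
class=transaction
# 1 GRV proxy + 3 commit proxies + 1 resolver + 1 cluster controller + 1 master + 2 standby
[fdbserver.4530]
class=stateless
...
[fdbserver.4538]
class=stateless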

It’s curious that using the redwood storage engine would cause transaction logs to OOM. FWIW I know that @SteavedHams has made a lot of improvements regarding memory usage in redwood since 7.0.0, but I don’t expect those to affect transaction log memory usage.

Redwood is never used on Transaction logs. Transaction logs do use a local storage engine for storing older unpopped mutations, but this is always SQLite regardless of the configuration, as Redwood has not been tested for this use case yet.

So while Redwood in 7.0 does have memory budget issues and requires a larger memory limit compared to SQLite for the same cache memory setting, this should not affect the Transaction log processes at all.

The config above does not change the memory or cache_memory settings, so increasing memory from the default, e.g. to memory = 12GiB or more, is likely the way out of this situation.
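For example, a sketch using one of the storage sections from your config (the default limit is 8GiB; the right value depends on how much RAM the machine has):

[fdbserver.4501]
class=storage
datadir=/data/nvme0n1/foundationdb/data/$ID
logdir=/data/nvme0n1/foundationdb
# per-process memory limit; the sum across all fdbserver sections must fit in the machine's RAM
memory = 12GiB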

Thank you for the response. I have played around with different settings, and this is what I came to: indeed, the problem is with the storage processes.

What’s more, they consume quite a lot of memory. I have tried up to 20 GiB per process and they managed to eat it up instantly. Looking through the documentation, I didn’t manage to find the correct way to estimate the memory required for the load our services put on FDB.

Unfortunately we don’t have enough RAM on the servers for such consumption, so we might end up switching back to the ssd storage engine. But it would be nice to have information about the correct usage of the ssd-redwood-1-experimental storage engine.

For reference:
/etc/foundationdb/foundationdb.conf

[fdbserver.4500]
class=stateless

[fdbserver.4501]
class=storage
datadir=/data/nvme0n1/foundationdb/data/$ID
logdir=/data/nvme0n1/foundationdb
memory = 20GiB
cache_memory = 14GiB
[fdbserver.4502]
class=storage
datadir=/data/nvme1n1/foundationdb/data/$ID
logdir=/data/nvme1n1/foundationdb
memory = 20GiB
cache_memory = 14GiB
[fdbserver.4503]
class=storage
datadir=/data/nvme2n1/foundationdb/data/$ID
logdir=/data/nvme2n1/foundationdb
memory = 20GiB
cache_memory = 14GiB
[fdbserver.4504]
class=storage
datadir=/data/nvme3n1/foundationdb/data/$ID
logdir=/data/nvme3n1/foundationdb
memory = 20GiB
cache_memory = 14GiB

[fdbserver.4505]
class=stateless
#memory=8GiB
[fdbserver.4506]
class=stateless
#memory=8GiB
[fdbserver.4507]
class=stateless
#memory=8GiB
[fdbserver.4508]
class=stateless
#memory=8GiB
[fdbserver.4509]
class=stateless
#memory=8GiB
[fdbserver.4510]
class=transaction
#memory=8GiB
[fdbserver.4511]
class=transaction
[fdbserver.4512]
class=transaction
[fdbserver.4513]
class=transaction
[fdbserver.4514]
class=transaction

fdb> status details (example of memory filling up, after which the process fails with “ERROR: out of memory”):

Process performance details:
:4500:tls ( 11% cpu; 37% machine; 0.789 Gbps; 22% disk IO; 0.5 GB / 5.7 GB RAM )
:4501:tls ( 71% cpu; 37% machine; 0.789 Gbps; 58% disk IO;14.5 GB / 14.1 GB RAM )
:4502:tls ( 96% cpu; 37% machine; 0.789 Gbps; 81% disk IO;14.4 GB / 14.1 GB RAM )
:4503:tls ( 86% cpu; 37% machine; 0.789 Gbps; 80% disk IO;14.7 GB / 14.1 GB RAM )
:4504:tls ( 0% cpu; 37% machine; 0.789 Gbps; 0% disk IO; 0.2 GB / 14.2 GB RAM )

Redwood has some memory issues in 7.0. Take a look at Redwood storage engine runs out of memory

7.1 with the latest patches looks to be more stable.


Indeed. 7.1 includes a reworking of both how FDB tracks memory usage and how it allocates memory for various purposes, and also a reworking of how Redwood tracks and allocates its internal memory.

The main issue within Redwood was that, aside from its page cache, it also has a per-page cache that stores decompressed key strings for each key visited while the page is in cache. This is an optimization to make seeks, iteration, and returning KV pairs faster. However, depending on the KV pattern and the read/write pattern, it can also result in substantial additional memory usage, which in many cases is not worth the cost.

Redwood in 7.1 fixes these issues by fully tracking all memory associated with the decompressed key caches and counting that against the total memory budget, in addition to proactively discarding leaf level decompressed key caches as they are not as useful as the ones for internal nodes.

@osamarin @SteavedHams thank you very much for your answers; I have looked closely at the mentioned thread and the replies there.

Getting back to you with more feedback. I have noticed the same behavior with the ssd storage engine on other servers (version 6.3). I think it may have been partly the result of overloading the database (?).

Description:
As described in the first post, I have noticed that processes fail with the “ERROR: out of memory” error.

What was tried:

  • giving the storage processes (the ones which fail) more RAM
  • decreasing the load on FoundationDB

Neither helped. Even after I increased memory to 18 GiB, the processes still managed to eat it all up. Then I decided to shut down Docker completely, so that our services don’t load the database at all. Memory still keeps overflowing, with the same error, after one day without load. Database health status is “healthy”.

I am really sorry if this thread is not the place for such issues, but it seems to be some sort of bug.

*$ fdbcli -v*
FoundationDB CLI 6.3 (v6.3.23)
source version 47b9a81d1c10897c863098fe2d66e827fed0d239
protocol fdb00b063010001
[fdbserver.4500]
class=stateless
#
[fdbserver.4501]
class=storage
datadir=/data/sdb/foundationdb/data/$ID
logdir=/data/sdb/foundationdb
memory = 18GiB
cache_memory = 12GiB
[fdbserver.4502]
class=storage
datadir=/data/sdc/foundationdb/data/$ID
logdir=/data/sdc/foundationdb  
memory = 18GiB
cache_memory = 12GiB
[fdbserver.4503]
class=storage
datadir=/data/sdd/foundationdb/data/$ID
logdir=/data/sdd/foundationdb  
memory = 18GiB
cache_memory = 12GiB
[fdbserver.4504]
class=storage
datadir=/data/sde/foundationdb/data/$ID
logdir=/data/sde/foundationdb  
memory = 18GiB
cache_memory = 12GiB
[fdbserver.4505]
class=storage
datadir=/data/sdf/foundationdb/data/$ID
logdir=/data/sdf/foundationdb  
memory = 18GiB
cache_memory = 12GiB
[fdbserver.4506]
class=storage
datadir=/data/sdg/foundationdb/data/$ID
logdir=/data/sdg/foundationdb  
memory = 18GiB
cache_memory = 12GiB
[fdbserver.4507]
class=storage
datadir=/data/sdh/foundationdb/data/$ID
logdir=/data/sdh/foundationdb  
memory = 18GiB
cache_memory = 12GiB
[fdbserver.4508]
class=storage
datadir=/data/sdi/foundationdb/data/$ID
logdir=/data/sdi/foundationdb  
memory = 18GiB
cache_memory = 12GiB
[fdbserver.4509]
class=storage
datadir=/data/sda/foundationdb/data/$ID
logdir=/data/sda/foundationdb  
memory = 18GiB
cache_memory = 12GiB
[fdbserver.4510]
class=storage
datadir=/data/sdk/foundationdb/data/$ID
logdir=/data/sdk/foundationdb  
memory = 18GiB
cache_memory = 12GiB

[fdbserver.4511]
class=transaction
datadir=/data/sdl/foundationdb/data/$ID
logdir=/data/sdl/foundationdb  

[fdbserver.4512]
class=stateless
memory = 5GiB
cache_memory = 3GiB
[fdbserver.4513]
class=stateless
memory = 5GiB
cache_memory = 3GiB
...
[fdbserver.4524]
class=stateless
memory = 5GiB
cache_memory = 3GiB

database state after one day without load:

*fdb> status details*

Using cluster file `/etc/foundationdb/fdb.cluster'.

Configuration:
  Redundancy mode        - single
  Storage engine         - ssd-2
  Coordinators           - 1
  Exclusions             - 15 (type `exclude' for details)
  Usable Regions         - 1

Cluster:
  FoundationDB processes - 24
  Zones                  - 1
  Machines               - 1
  Memory availability    - 5.0 GB per process on machine with least available
  Fault Tolerance        - 0 machines
  Server time            - 07/21/22 16:22:19

Data:
  Replication health     - Healthy
  Moving data            - 0.000 GB
  Sum of key-value sizes - 12.639 TB
  Disk space used        - 15.825 TB

Operating space:
  Storage server         - 2007.9 GB free on most full server
  Log server             - 3583.2 GB free on most full server

Workload:
  Read rate              - 7 Hz
  Write rate             - 0 Hz
  Transactions started   - 2 Hz
  Transactions committed - 0 Hz
  Conflict rate          - 0 Hz

Backup and DR:
  Running backups        - 0
  Running DRs            - 0

Process performance details:
  :4500:tls (  1% cpu;  1% machine; 0.000 Gbps;  0% disk IO; 0.4 GB / 7.9 GB RAM  )
  :4501:tls (  1% cpu;  1% machine; 0.000 Gbps;  1% disk IO; 8.6 GB / 17.8 GB RAM  )
  :4502:tls (  3% cpu;  1% machine; 0.000 Gbps;  3% disk IO;15.7 GB / 17.8 GB RAM  )
  :4503:tls (  1% cpu;  1% machine; 0.000 Gbps;  0% disk IO;13.6 GB / 17.8 GB RAM  )
  :4504:tls (  1% cpu;  1% machine; 0.000 Gbps;  0% disk IO; 9.9 GB / 17.8 GB RAM  )
  :4505:tls (  1% cpu;  1% machine; 0.000 Gbps;  0% disk IO;10.3 GB / 17.8 GB RAM  )
  :4506:tls (  1% cpu;  1% machine; 0.000 Gbps;  0% disk IO;17.3 GB / 17.8 GB RAM  )
  :4507:tls (  1% cpu;  1% machine; 0.000 Gbps;  1% disk IO;17.4 GB / 17.8 GB RAM  )
  :4508:tls (  1% cpu;  1% machine; 0.000 Gbps;  0% disk IO;16.3 GB / 17.8 GB RAM  )
  :4509:tls (  1% cpu;  1% machine; 0.000 Gbps;  0% disk IO;12.1 GB / 17.8 GB RAM  )
  :4510:tls (  1% cpu;  1% machine; 0.000 Gbps;  0% disk IO;14.0 GB / 17.8 GB RAM  )
  :4511:tls (  1% cpu;  1% machine; 0.000 Gbps;  1% disk IO; 1.8 GB / 7.9 GB RAM  )
  :4512:tls (  0% cpu;  1% machine; 0.000 Gbps;  1% disk IO; 0.4 GB / 5.0 GB RAM  )
  :4513:tls (  0% cpu;  1% machine; 0.000 Gbps;  1% disk IO; 0.4 GB / 5.0 GB RAM  )
     ...
  :4524:tls (  0% cpu;  1% machine; 0.000 Gbps;  1% disk IO; 0.8 GB / 5.0 GB RAM  )

Coordination servers:
  :4500:tls  (reachable)

Client time: 07/21/22 16:22:19

UPD: to explain the 5 GiB memory limit: this is the config after experiments. Before that, there was no memory/cache_memory in the config and each stateless process used approximately 8 GiB (with overflow).

Following up on this issue, as we managed to resolve it.

What helped:

  • increasing the number of transaction processes to at least the recommended 8; better yet, matching the number of storage processes (for 12 storage processes, 12 transaction processes)
  • giving the storage processes more memory; for our load it should be at least 12 GiB, ideally 16 GiB (see the sketch after this list)
  • as a side note, on servers with less memory available, transaction processes don’t require more than 4-5 GiB of configured RAM
  • if you have disks of 8 TB or larger, it’s better to split them into smaller 4 TB volumes, as mentioned in the documentation
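A rough sketch of what we ended up with per process (ports, paths, and exact values are only illustrative for our hardware; one storage process per disk, and one transaction process per storage process):

# storage processes, one per disk, ~16 GiB each
[fdbserver.4501]
class=storage
datadir=/data/nvme0n1/foundationdb/data/$ID
logdir=/data/nvme0n1/foundationdb
memory = 16GiB
...
# transaction (log) processes, 4-5 GiB each is enough
[fdbserver.4513]
class=transaction
memory = 5GiB
...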

What we tried but didn’t help:

  • decreasing the load on FoundationDB
  • disabling huge pages (suggested in one solution here)

If you are still using FDB 7.0, please know that you cannot upgrade to FDB 7.1 while using Redwood. If you upgrade to 7.1, your storage servers will fail with an error that they do not recognize the data format, and from there you will have to:

  • Downgrade back to FDB 7.0.*
  • Change storage engine to ssd
  • Wait for Replication Health to be Healthy with no Data Movement.
  • Upgrade to FDB 7.1
  • Change storage engine to ssd-redwood-1-experimental (see the fdbcli sketch below)
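A minimal fdbcli sketch of the storage engine changes in that sequence (the binary downgrade/upgrade itself happens outside fdbcli, and each configure should only be issued once the cluster is healthy):

fdb> configure ssd
fdb> status
  (repeat status until Replication health is Healthy and Moving data is 0.000 GB)
... upgrade fdbserver binaries to 7.1 ...
fdb> configure ssd-redwood-1-experimental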

Redwood in FDB 7.1 is the first production-ready release and its file format will be supported going forward. It is still labeled “experimental” out of caution, and the experimental tag will be removed in an upcoming release.


My performance tests show that one transaction process is sufficient for 6-8 storage processes, so there should be roughly 6-8 times more storage processes than transaction processes (e.g. two transaction processes for 12 storage processes).
