Trace log files accumulated and total size exceeded maxlogssize

In one of our dev FDB clusters, two fdb nodes (Kubernetes pods/containers) didn't purge trace log files as expected; instead they accumulated log files until the total size exceeded maxlogssize.

The log-related parameters in foundationdb.conf:

logdir = /var/log/foundationdb
logsize = 20MiB
maxlogssize = 5GiB
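These two values also imply a file-count cap. A rough sketch of the arithmetic, assuming maxlogssize bounds the total size of rolled trace files kept per process and each rolled file is about logsize:

```shell
# Rough arithmetic for the settings above (assumption: maxlogssize caps
# the total size of rolled trace files that one process keeps around).
logsize_mib=20                   # logsize = 20MiB
maxlogssize_mib=$((5 * 1024))    # maxlogssize = 5GiB
echo "retained trace files per process: about $((maxlogssize_mib / logsize_mib))"
```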

The logdir is on the container's root file system, which has a capacity of 20 GB. We got the following error:

/var/log/foundationdb# du -sh
21G	.
bash: history: /root/.bash_history: cannot create: Disk quota exceeded

The trace files had been accumulating since June 2nd.
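To see which process the accumulated files belong to, the file names can be grouped by port. A hedged sketch, assuming the usual trace-file naming of trace.&lt;ip&gt;.&lt;port&gt;.&lt;timestamp&gt;.&lt;random&gt;.xml (the logdir path is taken from the config above):

```shell
# Count trace files per fdbserver process, grouping by the port field
# embedded in the file name (naming convention is an assumption here).
logdir=${1:-/var/log/foundationdb}
ls "$logdir"/trace.* 2>/dev/null \
  | sed -E 's/.*trace\.[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+\.([0-9]+)\..*/\1/' \
  | sort | uniq -c
```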

Why did FDB not purge according to the maxlogssize limit? We are using FDB v6.2.27.

How many processes do you have running? AFAIK these limits are per process, so if you have 5 processes and each has a 5 GiB maxlogssize, they can use up to 25 GiB of disk space in total.


That’s it! We have 6 processes per node for this DEV cluster. Thank you, Markus.

Hello, what about when a pod gets restarted? In that case the IP address changes, so will it be treated as a new process, with its log size counted from zero again? Thanks!

That case is currently not well handled, and I'm not aware of a solution, since to my knowledge trace_file_identifier is only supported in the client, not in the fdbserver binary.
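For client processes (not fdbserver), one way to get a stable trace file name across restarts is the trace_file_identifier network option. A sketch, assuming the client library honors FDB_NETWORK_OPTION_-style environment variables and the option name below:

```shell
# Assumption: the FDB client library reads FDB_NETWORK_OPTION_* environment
# variables and supports the trace_file_identifier client option. This names
# trace files by a stable identifier instead of the client's IP, so restarted
# pods reuse the same file set. It does NOT help fdbserver trace files.
export FDB_NETWORK_OPTION_TRACE_ENABLE=/var/log/foundationdb
export FDB_NETWORK_OPTION_TRACE_FILE_IDENTIFIER=my-client-name  # hypothetical name
# ...then start the client application in this environment.
```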