I’ve been running FDB in production since October, and this past weekend I noticed a logqueue file that was 120GB. I’m currently running two servers on one disk, with only one being a coordinator.
Status
fdb> status
Using cluster file `/etc/foundationdb/fdb.cluster'.
Configuration:
  Redundancy mode - single
  Storage engine - ssd-2
  Coordinators - 1
Cluster:
  FoundationDB processes - 2
  Machines - 1
  Memory availability - 4.0 GB per process on machine with least available
  Fault Tolerance - 0 machines
  Server time - 01/23/19 17:36:25
Data:
  Replication health - Healthy
  Moving data - 0.000 GB
  Sum of key-value sizes - 57 MB
  Disk space used - 82 MB
Operating space:
  Storage server - 118.7 GB free on most full server
  Log server - 118.7 GB free on most full server
Workload:
  Read rate - 679 Hz
  Write rate - 9 Hz
  Transactions started - 337 Hz
  Transactions committed - 3 Hz
  Conflict rate - 6 Hz
Backup and DR:
  Running backups - 0
  Running DRs - 0
My workload is very read heavy, and the large majority of my write transactions never end up writing anything to disk. My config file is the default except that I added one more server and set the servers to listen on the public interface.
## Default parameters for individual fdbserver processes
[fdbserver]
command = /usr/sbin/fdbserver
public_address = auto:$ID
listen_address = public
datadir = /var/lib/foundationdb/data/$ID
logdir = /var/log/foundationdb
# logsize = 10MiB
# maxlogssize = 100MiB
# machine_id =
# datacenter_id =
# class =
# memory = 8GiB
# storage_memory = 1GiB
# metrics_cluster =
# metrics_prefix =
## An individual fdbserver process with id 4500
## Parameters set here override defaults from the [fdbserver] section
[fdbserver.4500]
[fdbserver.4501]
My coordinator server is 4500, but 4501 is the server with the ballooned logqueue.
Here is an ls of the problem server’s data directory:
root@fdb-1:/var/lib/foundationdb/data/4501# ls -lh
total 122G
-rw---S--- 1 foundationdb foundationdb 16K Oct 19 07:21 log-51b11ce3e25f3a29e4107aa5fb2ed583.sqlite
-rw---S--- 1 foundationdb foundationdb 104K Oct 19 07:21 log-51b11ce3e25f3a29e4107aa5fb2ed583.sqlite-wal
-rw---S--- 1 foundationdb foundationdb 122G Jan 23 17:46 logqueue-51b11ce3e25f3a29e4107aa5fb2ed583-0.fdq
-rw---S--- 1 foundationdb foundationdb 1.7M Oct 19 07:13 logqueue-51b11ce3e25f3a29e4107aa5fb2ed583-1.fdq
-rw---S--- 1 foundationdb foundationdb 4.0K Oct 7 09:19 processId
-rw---S--- 1 foundationdb foundationdb 39M Jan 23 17:46 storage-60bcc85c37134f573bd38d20b4ecc825.sqlite
-rw---S--- 1 foundationdb foundationdb 3.1M Jan 23 17:46 storage-60bcc85c37134f573bd38d20b4ecc825.sqlite-wal
And the good one (4500):
root@fdb-1:/var/lib/foundationdb/data/4500# ls -lh
total 40M
-rw---S--- 1 foundationdb foundationdb 28K Oct 19 07:21 coordination-0.fdq
-rw---S--- 1 foundationdb foundationdb 16K Oct 19 07:21 coordination-1.fdq
-rw---S--- 1 foundationdb foundationdb 76K Jan 23 17:48 log-9a7b5ec05e8401f3bf2ce59043f298f6.sqlite
-rw---S--- 1 foundationdb foundationdb 100K Jan 23 17:48 log-9a7b5ec05e8401f3bf2ce59043f298f6.sqlite-wal
-rw---S--- 1 foundationdb foundationdb 964K Jan 23 17:47 logqueue-9a7b5ec05e8401f3bf2ce59043f298f6-0.fdq
-rw---S--- 1 foundationdb foundationdb 2.1M Jan 23 17:48 logqueue-9a7b5ec05e8401f3bf2ce59043f298f6-1.fdq
-rw---S--- 1 foundationdb foundationdb 4.0K Oct 7 09:18 processId
-rw---S--- 1 foundationdb foundationdb 35M Jan 23 17:48 storage-f4607c08210c6c967767a903992c691a.sqlite
-rw---S--- 1 foundationdb foundationdb 2.2M Jan 23 17:48 storage-f4607c08210c6c967767a903992c691a.sqlite-wal
Does anyone have any thoughts on this? It seems to be growing at about 1GB a day, which doesn’t seem sustainable.
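
For reference, here is a rough sketch of how the growth figure can be sanity-checked and tracked over time. It is only an estimate under stated assumptions: the function names are made up for illustration, and the start date of 2018-10-19 is a guess taken from the last-modified time of the other log files in the 4501 directory, not a known point at which the growth began.

```python
from datetime import date
from pathlib import Path


def logqueue_sizes(data_dir):
    """Return {file name: size in bytes} for each logqueue-*.fdq file in data_dir."""
    return {
        p.name: p.stat().st_size
        for p in sorted(Path(data_dir).glob("logqueue-*.fdq"))
    }


def growth_per_day(size_gb, start, end):
    """Average growth in GB/day, assuming the file was near empty on `start`."""
    days = (end - start).days
    if days <= 0:
        raise ValueError("end must be later than start")
    return size_gb / days


def days_until_full(free_gb, rate_gb_per_day):
    """Days left before the free space is used up at a constant growth rate."""
    return free_gb / rate_gb_per_day


if __name__ == "__main__":
    # Numbers taken from the ls and status output above.
    # ASSUMPTION: growth started around 2018-10-19.
    rate = growth_per_day(122, date(2018, 10, 19), date(2019, 1, 23))
    print(f"average growth: {rate:.2f} GB/day")
    print(f"days until disk is full: {days_until_full(118.7, rate):.0f}")

    # Current sizes on the problem server (path from the listing above).
    for name, size in logqueue_sizes("/var/lib/foundationdb/data/4501").items():
        print(f"{name}: {size / 2**30:.2f} GiB")
```

With those assumptions the average works out to a bit over 1GB a day, which matches what I am seeing, and leaves roughly three months before the 118.7 GB of free space runs out. Running `logqueue_sizes` from cron once a day would give a more reliable rate than the single estimate.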