Problem: We recently started getting the PacketLimitExceeded error (Severity 40).
We have a fairly heavy load on this DB, but we had not seen this error before. Lately we have also noticed that latency when fetching data has increased, which I think results from a high conflict rate, which in turn was caused by this error.
It occurred on several databases at once, I think because of the increasing load. However, when monitoring status we get “The database is not being saturated by the workload”.
I have searched the forum and the issue tracker, but couldn’t find any similar reports.
Also, the code base has “FlowKnobs” with this option, but it cannot be set from the config, if I am right?
What is the proper way to solve it?
(!) Also worth noting: from time to time, while this error is occurring, status shows the following message (it refers to a storage server), but to me it looks more like a consequence than a cause:
Performance limited by process: Storage server MVCC memory.
Most limiting process: <IP>:4501:tls
FoundationDB version:
FoundationDB CLI 7.1 (v7.1.23)
source version dd2e68b9b3175667914673539f06b8e1071c07d6
protocol fdb00b071010000
Configuration is as follows
Configuration:
  Redundancy mode - double
  Storage engine - ssd-redwood-1-experimental
  Coordinators - 6
  Usable Regions - 2
  Regions:
    Primary -
      Datacenter - WC1
    Remote -
      Datacenter - EC1
Cluster:
  FoundationDB processes - 247 (less 0 excluded; 6 with errors)
  Zones - 6
  Machines - 6
  Memory availability - 3.5 GB per process on machine with least available
    >>>>> (WARNING: 4.0 GB recommended) <<<<<
  Retransmissions rate - 16 Hz
  Fault Tolerance - 1 machines
  Server time - 10/19/23 10:09:24
Data:
  Replication health - Healthy (Repartitioning)
  Moving data - 389.000 GB
  Sum of key-value sizes - 50.081 TB
  Disk space used - 205.194 TB
Operating space:
  Storage server - 701.4 GB free on most full server
  Log server - 3341.2 GB free on most full server
Workload:
  Read rate - 42293 Hz
  Write rate - 50489 Hz
  Transactions started - 27356 Hz
  Transactions committed - 560 Hz
  Conflict rate - 5 Hz
Backup and DR:
  Running backups - 0
  Running DRs - 0
This is what most of the stateless servers look like (status details):
<IP>:4560:tls ( 13% cpu; 20% machine; 1.916 Gbps; 45% disk IO; 2.8 GB / 3.5 GB RAM )
Last logged error: PacketLimitExceeded at Thu Oct 19 09:33:33 2023
<IP>:4561:tls ( 7% cpu; 20% machine; 1.916 Gbps; 53% disk IO; 3.0 GB / 3.5 GB RAM )
Last logged error: PacketLimitExceeded at Thu Oct 19 01:11:17 2023
<IP>:4562:tls ( 3% cpu; 20% machine; 1.916 Gbps; 46% disk IO; 2.5 GB / 3.5 GB RAM )
Last logged error: PacketLimitExceeded at Thu Oct 19 01:11:10 2023
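With 247 processes it may be easier to scan this output with a script than by eye. Below is a minimal Python sketch that parses process lines in the shape shown above and pairs each with its "Last logged error" line. The regular expressions are an assumption based only on the three sample entries in this post, and the function name `parse_status_details` is made up for illustration; other versions may format these lines differently.

```python
import re

# Pattern inferred from the sample lines above (an assumption, not an official format).
PROCESS_RE = re.compile(
    r'^(?P<addr>\S+)\s+\(\s*(?P<cpu>\d+)% cpu;\s*(?P<machine>\d+)% machine;\s*'
    r'(?P<gbps>[\d.]+) Gbps;\s*(?P<disk>\d+)% disk IO;\s*'
    r'(?P<used>[\d.]+) GB / (?P<total>[\d.]+) GB RAM\s*\)$'
)
ERROR_RE = re.compile(r'^Last logged error: (?P<type>\S+) at (?P<when>.+)$')


def parse_status_details(text):
    """Return one dict per process line, with its last logged error attached."""
    processes = []
    for line in text.splitlines():
        line = line.strip()
        m = PROCESS_RE.match(line)
        if m:
            processes.append({
                "address": m["addr"],
                "cpu_pct": int(m["cpu"]),
                "disk_io_pct": int(m["disk"]),
                "ram_used_gb": float(m["used"]),
                "ram_total_gb": float(m["total"]),
                "last_error": None,
            })
            continue
        e = ERROR_RE.match(line)
        if e and processes:
            # An error line belongs to the process line directly before it.
            processes[-1]["last_error"] = (e["type"], e["when"])
    return processes


SAMPLE = """\
<IP>:4560:tls ( 13% cpu; 20% machine; 1.916 Gbps; 45% disk IO; 2.8 GB / 3.5 GB RAM )
Last logged error: PacketLimitExceeded at Thu Oct 19 09:33:33 2023
<IP>:4561:tls ( 7% cpu; 20% machine; 1.916 Gbps; 53% disk IO; 3.0 GB / 3.5 GB RAM )
Last logged error: PacketLimitExceeded at Thu Oct 19 01:11:17 2023
<IP>:4562:tls ( 3% cpu; 20% machine; 1.916 Gbps; 46% disk IO; 2.5 GB / 3.5 GB RAM )
Last logged error: PacketLimitExceeded at Thu Oct 19 01:11:10 2023
"""

processes = parse_status_details(SAMPLE)
for p in processes:
    if p["last_error"] and p["last_error"][0] == "PacketLimitExceeded":
        ram_fraction = p["ram_used_gb"] / p["ram_total_gb"]
        print(p["address"], p["last_error"][1], f"RAM used: {ram_fraction:.0%}")
```

Filtering on the error type this way gives a quick list of the affected processes and how close each one is to its memory limit.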
Related error description from status json:
"messages": [
  {
    "description": "PacketLimitExceeded at Thu Oct 19 09:33:33 2023",
    "name": "process_error",
    "raw_log_message": "\"Severity\"=\"40\", \"ErrorKind\"=\"Unset\", \"Time\"=\"1697708013.974044\", \"DateTime\"=\"2023-10-19T09:33:33Z\", \"Type\"=\"PacketLimitExceeded\", \"ID\"=\"0000000000000000\", \"ToPeer\"=\"67.213.220.39:36284:tls\", \"Length\"=\"129276072\", \"ThreadID\"=\"12729161780852052616\", \"Backtrace\"=\"addr2line -e fdbserver.debug -p -C -f -i 0x366fadc 0x366e6f0 0x366eaee 0x3474c8e 0x347ed25 0xb924a8 0xba0603 0x10a5ca0 0x10a67e5 0x10a72d0 0x11ad550 0x360a35e 0xa574d9 0x7f092816b083\", \"Machine\"=\"<IP>:4560\", \"LogGroup\"=\"default\", \"Roles\"=\"CP\"",
    "time": 1697710000,
    "type": "PacketLimitExceeded"
  }
],
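The interesting fields (ToPeer, Length, Roles) are buried inside raw_log_message as quoted key=value pairs. Here is a small Python sketch for pulling them out and converting the packet length to MiB. It assumes the "messages" array has already been extracted from the status json, and that values never contain a double quote, which holds for the message above. The function names and the trimmed sample (backtrace and a few other fields omitted) are for illustration only.

```python
import json
import re

# Matches "Key"="Value" pairs as they appear in raw_log_message after JSON decoding.
PAIR_RE = re.compile(r'"([^"]+)"="([^"]*)"')


def parse_raw_log_message(raw):
    """Turn the quoted key=value string into a plain dict of strings."""
    return dict(PAIR_RE.findall(raw))


def packet_limit_errors(messages):
    """Summarize every PacketLimitExceeded entry in a 'messages' list."""
    summary = []
    for msg in messages:
        if msg.get("type") != "PacketLimitExceeded":
            continue
        fields = parse_raw_log_message(msg.get("raw_log_message", ""))
        length = int(fields["Length"])
        summary.append({
            "to_peer": fields.get("ToPeer"),
            "roles": fields.get("Roles"),
            "length_bytes": length,
            "length_mib": length / (1024 * 1024),
        })
    return summary


# Trimmed copy of the message above; only the fields used here are kept.
SAMPLE = r'''
[
  {
    "description": "PacketLimitExceeded at Thu Oct 19 09:33:33 2023",
    "name": "process_error",
    "raw_log_message": "\"Severity\"=\"40\", \"Type\"=\"PacketLimitExceeded\", \"ToPeer\"=\"67.213.220.39:36284:tls\", \"Length\"=\"129276072\", \"Machine\"=\"<IP>:4560\", \"Roles\"=\"CP\"",
    "time": 1697710000,
    "type": "PacketLimitExceeded"
  }
]
'''

errors = packet_limit_errors(json.loads(SAMPLE))
for err in errors:
    print(f'{err["roles"]} -> {err["to_peer"]}: {err["length_mib"]:.1f} MiB')
```

For the message in this post, that is a single packet of 129276072 bytes (roughly 123 MiB) sent by a process with role CP to the peer at 67.213.220.39:36284.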
sudo systemctl status foundationdb.service does not report any errors.