Urgent help needed on transaction not committed error 1020

We are using FDB version 620, and in the last 3-4 days we have seen increased transaction activity and the following error.
Current configuration is as follows.

Can someone please help/reply on what can be done to fix this issue, e.g. whether to add more transaction processes or tune any DB parameters?

Currently we have 165 storage processes (at most 10 on a single server) and 15 transaction processes (at most 1 per server).
The transaction count has also increased (see attached), which I think is causing the issue.



2023-02-17T19:12:05,565 [grpc-default-executor-219494] WARN com.apple.foundationdb.record.provider.foundationdb.FDBDatabaseRunner - Retrying FDB Exception code="1020" delay="1" max_attempts="10" message="Transaction not committed due to conflict with another transaction" tries="0"
com.apple.foundationdb.record.provider.foundationdb.FDBExceptions$FDBStoreTransactionConflictException: Transaction not committed due to conflict with another transaction
at com.apple.foundationdb.record.provider.foundationdb.FDBExceptions.wrapException(FDBExceptions.java:144) ~[fdb-record-layer-core-pb3-2.7.74.0.jar:2.7.74.0]
at com.apple.foundationdb.record.provider.foundationdb.FDBDatabase.lambda$new$0(FDBDatabase.java:161) ~[fdb-record-layer-core-pb3-2.7.74.0.jar:2.7.74.0]
at ...

fdb> status

Using cluster file `/etc/foundationdb/fdb.cluster'.

Configuration:
Redundancy mode - three_datacenter
Storage engine - ssd-2
Coordinators - 7
Exclusions - 4 (type `exclude' for details)
Desired Proxies - 5
Desired Resolvers - 7
Desired Logs - 12
Usable Regions - 1

Cluster:
FoundationDB processes - 244
Zones - 46
Machines - 46
Memory availability - 6.3 GB per process on machine with least available
Retransmissions rate - 353 Hz
Fault Tolerance - 3 machines
Server time - 02/17/23 20:46:11

Data:
Replication health - Healthy (Repartitioning)
Moving data - 0.329 GB
Sum of key-value sizes - 1.473 TB
Disk space used - 10.615 TB

Operating space:
Storage server - 5982.3 GB free on most full server
Log server - 59.4 GB free on most full server

Workload:
Read rate - 82119 Hz
Write rate - 128599 Hz
Transactions started - 3160 Hz
Transactions committed - 72 Hz
Conflict rate - 0 Hz

Backup and DR:
Running backups - 0
Running DRs - 0

Client time: 02/17/23 20:46:11

Is your workload fundamentally conflicting? That is, does every transaction update the same place, and is your goal to get as many of those done as possible? Or would some attention to why it is conflicting help? If it can be done, getting rid of the conflicts is almost certainly a better solution than just scaling up the cluster in the hope that transactions will complete fast enough, because, just as you saw, that eventually reaches a point where it breaks hard.

I know nothing about your workload, so it is hard to give more specific advice. You seem to be using the Record Layer. So, if you are doing more than just saving a single record per transaction, think carefully about the overlap in what you do when it happens multiple times at once.

I don’t have the full application details, but in short the application has the following issue: we have a public API that is frequently queried, and it updates a lastUsedTimestamp to show that it has been queried. This API’s traffic is highly correlated with our concurrent users/players.

I have shared the FoundationDB doc on conflict resolution with the application team, but from an infrastructure point of view: if I add more transaction processes (we have 15, plus 165 storage processes), will that help, or will increasing transaction servers/processes not help at this time?

fdb.write(recordA, timestampA)
fdb.write(recordA, timestampB)
// If the first FDB write doesn't succeed/commit before the second call, they conflict, and our service retries by default.

@chetan_pg Take the following with a grain of salt since I’ve never actually used either FoundationDB or the record layer.

FoundationDB has the concept of atomic operations. One of the operations available is max. So, you could change your timestamps to be stored as unix timestamps, and use max to avoid the contention.

The Record Layer may not support this, so you might have to use the base API directly.
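As a rough sketch (not tested against your setup) of what this could look like with the base Java API, where the key name and layout are purely illustrative:

import com.apple.foundationdb.Database;
import com.apple.foundationdb.FDB;
import com.apple.foundationdb.MutationType;
import com.apple.foundationdb.tuple.Tuple;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class TouchTimestamp {
    public static void main(String[] args) {
        FDB fdb = FDB.selectAPIVersion(620);
        try (Database db = fdb.open()) {
            // Illustrative key; real code would derive this from your subspace.
            byte[] key = Tuple.from("lastUsedTimestamp").pack();
            // MutationType.MAX treats the value as a little-endian unsigned
            // integer and keeps the larger of the old and new values.
            byte[] now = ByteBuffer.allocate(8)
                    .order(ByteOrder.LITTLE_ENDIAN)
                    .putLong(System.currentTimeMillis())
                    .array();
            db.run(tr -> {
                tr.mutate(MutationType.MAX, key, now);
                return null;
            });
        }
    }
}

Since the transaction never reads the key, concurrent touches add no read conflict ranges and can never conflict with each other.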

Longer term, I’d recommend digging into the use case and trying to decouple the querying logic from the “looking” logic. Hard to say the best way forward for that without knowing more, though.

That probably won’t help, no. Increasing the number of transaction processes can help with increasing the throughput of a cluster (by giving the write pipeline more drives to write to) up to a point, but that’s not generally correlated with transaction conflict problems.

Usually, the only way to address transaction conflict problems is to change the application, because the fundamental problem is in the data model (or in optimistic concurrency, if you prefer) and not in anything you can tune in FDB itself.

If you can’t change the application code to adjust its data model, the only other thing I could think of is to adjust how requests are routed so that you can serialize requests to the lastUsedTimestamp data outside of the system (e.g., routing requests to some queue, and then processing the queue in order, potentially with batching if that’s possible).
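
As a very rough sketch of that routing idea (every name here is made up, and the actual FDB write path is hidden behind an assumed TimestampStore interface):

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class TimestampBatcher {
    /** Assumed application-side wrapper over the actual FDB write. */
    public interface TimestampStore {
        void writeAll(Map<String, Long> latestByRecord); // one transaction per call
    }

    private record Update(String recordId, long timestampMillis) {}

    private final BlockingQueue<Update> queue = new LinkedBlockingQueue<>();

    /** Called from the hot API path: cheap, never touches FDB directly. */
    public void touch(String recordId) {
        queue.add(new Update(recordId, System.currentTimeMillis()));
    }

    /** A single consumer thread serializes and batches all timestamp writes. */
    public void runConsumer(TimestampStore store) throws InterruptedException {
        List<Update> batch = new ArrayList<>();
        while (true) {
            batch.add(queue.take());     // block until there is work
            queue.drainTo(batch, 999);   // then grab up to 999 more
            // Keep only the newest timestamp per record, so each record is
            // written at most once per transaction.
            Map<String, Long> latest = new HashMap<>();
            for (Update u : batch) {
                latest.merge(u.recordId(), u.timestampMillis(), Math::max);
            }
            store.writeAll(latest);
            batch.clear();
        }
    }
}

With a single consumer there is only ever one writer per record, so the conflicts disappear by construction, at the cost of the timestamp being written with some delay.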

I’ll say that FDB allows two transactions writing the same keys to both succeed (last write wins), which sometimes surprises people. The only conflict FDB will fail is a read-write conflict: if one transaction reads data that another transaction writes, the reading transaction will fail. However, the Record Layer’s saveRecord method always reads the data it is about to update (because it needs to know the existing data to ensure it updates indexes correctly, amongst other reasons), which means that any two transactions trying to save the same record at the same time will conflict. We’ve given some thought to how we could change the data model to support multiple concurrent write operations on the same record (for example, how we could implement an “update counter” method on a record), but nothing is implemented.
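
To make that distinction concrete, here is a minimal sketch against the base Java API (the key and value are illustrative):

import com.apple.foundationdb.Database;
import com.apple.foundationdb.FDB;
import com.apple.foundationdb.tuple.Tuple;

public class ConflictDemo {
    public static void main(String[] args) {
        FDB fdb = FDB.selectAPIVersion(620);
        try (Database db = fdb.open()) {
            byte[] key = Tuple.from("lastUsedTimestamp").pack();
            byte[] value = Tuple.from(System.currentTimeMillis()).pack();

            // Blind write: no read, so this can never fail with 1020, even if
            // another transaction writes the same key concurrently
            // (last write wins).
            db.run(tr -> {
                tr.set(key, value);
                return null;
            });

            // Read-modify-write: get() adds the key to the read conflict
            // range, so a concurrent commit that writes this key makes this
            // transaction fail with 1020 (and db.run() retries it). This is
            // effectively the shape of saveRecord.
            db.run(tr -> {
                byte[] current = tr.get(key).join();
                tr.set(key, value); // the new value could depend on `current`
                return null;
            });
        }
    }
}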

Alec / Maven,
Thanks for the reply. Alec, from your reply it sounds like an infra capacity increase may not solve the issue, but I hope that if I try to increase the transaction processes, at least it will not have an adverse impact.

I am still not clear on what adjusting the data model means exactly: what options/methods are available to solve such an issue? (I understand the external queue solution, but that is outside FDB.) Is there any way to avoid the saveRecord method and use something else instead?

Well, if you want to avoid using saveRecord, there are a few things you could do.

  1. The simplest would be to remove the write entirely from your frequently queried API, if you can. That would mean you don’t update lastUsedTimestamp, and you then have to work around that limitation. Whether that’s viable depends on how the data is used, but this option is probably simpler than any of the alternatives that keep updating this field.
  2. Next would be to create a non-Record-Layer-backed thing that you update instead of saving a record. It sounds like this could be pretty simple: something like a small layer that takes a Subspace and writes a key with the new timestamp, e.g. tr.mutate(MutationType.BYTES_MAX, key, Tuple.from(System.currentTimeMillis()).pack()). (This will update key to be a tuple containing the current timestamp in milliseconds since the epoch, though you could pick whatever format you’d like. The use of BYTES_MAX there ensures that if multiple people update the timestamp at once, the largest, i.e. newest, timestamp wins.) You’d then need to make sure to call this update code outside of the RL code; see the sketch after this list.
  3. You could use a MAX_EVER index within the Record Layer. This can’t be used on its own, but if you have an update_time field on a record, you could create a MAX_EVER index on the field that will get updated whenever a record is updated. If you care about having this data available for both reads and writes, you could introduce a record that you insert on read, and then (in a separate transaction) clean those records up. This is pretty expensive, though, for just a “last accessed timestamp” field.
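
For option 2, a sketch of what that small layer might look like (the class and method names are invented; the mutate call is the one shown above):

import com.apple.foundationdb.Database;
import com.apple.foundationdb.MutationType;
import com.apple.foundationdb.subspace.Subspace;
import com.apple.foundationdb.tuple.Tuple;

public class LastUsedLayer {
    private final Subspace subspace;

    public LastUsedLayer(Subspace subspace) {
        this.subspace = subspace;
    }

    /** Blindly records that entityId was used now; safe to call concurrently. */
    public void touch(Database db, String entityId) {
        byte[] key = subspace.pack(Tuple.from(entityId));
        byte[] value = Tuple.from(System.currentTimeMillis()).pack();
        db.run(tr -> {
            // BYTES_MAX keeps the lexicographically largest value. Packed
            // tuples of non-negative longs sort in numeric order, so the
            // newest timestamp wins and concurrent touches never conflict.
            tr.mutate(MutationType.BYTES_MAX, key, value);
            return null;
        });
    }
}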

You could also use FDB’s versionstamp operations (exposed through the Record Layer’s FDBRecordVersion class) to have FDB write its internal commit version into the database instead of wall-clock time, if what you’re really after is a monotonically increasing value and you don’t care as much about wall-clock time.
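
Through the base API that could look something like this (the key name is made up; the Record Layer’s FDBRecordVersion exposes the same versionstamp mechanism):

import com.apple.foundationdb.Database;
import com.apple.foundationdb.FDB;
import com.apple.foundationdb.MutationType;
import com.apple.foundationdb.tuple.Tuple;
import com.apple.foundationdb.tuple.Versionstamp;

public class VersionstampTouch {
    public static void main(String[] args) {
        FDB fdb = FDB.selectAPIVersion(620);
        try (Database db = fdb.open()) {
            byte[] key = Tuple.from("lastUsedVersion").pack();
            // packWithVersionstamp() leaves a placeholder that FDB fills in
            // with the 10-byte commit version at commit time, so the stored
            // value increases monotonically across transactions.
            byte[] value = Tuple.from(Versionstamp.incomplete()).packWithVersionstamp();
            db.run(tr -> {
                tr.mutate(MutationType.SET_VERSIONSTAMPED_VALUE, key, value);
                return null;
            });
        }
    }
}

Like the atomic-max approach, this is a blind write, so it also avoids the read-write conflicts.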

Alec,
Thanks for the reply, and yes, these suggestions are helpful. Thanks again. In the meantime I am going to increase the transaction servers from 15 to 20, as the code changes will take a while.