It seems like an API for transaction size would be generally useful, especially as there are a few pathologies that FDB runs into with larger transactions, so it’s often useful for clients to know how large their transactions are. (See: Issue #1466: Better handling of large transactions.)
There are a few proposals for work that could help clients with that. They are primarily designed around improving how one does a bunch of work in a series of transactions (like, say, index building):
- A generic “get transaction size” API that can be called at any time. It would hopefully use the same metric as transactions do upon commit so that the user could do something like `while (tr.getSize() < some_threshold) add_more_work(tr);`. Then their loop stops when they have done as much as they are willing to do in a single transaction. (Say, 1 MB or 750 kB.) The difficulty (as I understand it) with adding this is that such an API would need to update its understanding of the transaction size every time a clear range or even a get range happens. It could instead recalculate the value from the transaction’s data structures on each call, but that could be expensive if it’s done after, say, each time a significant data segment is added.
- A “get committed transaction size” API that can only be called after a transaction has been committed. This would allow someone who’s doing a bunch of work to inspect how big their already committed transaction was. Then if the transaction exceeds some value, they can apply a limit themselves and do less work in the next transaction. This is obviously less good than knowing before one commits (or before one adds more work), but it might be easier to implement, as the transaction already has to calculate its size before it commits, so it would be as simple as just “remembering” what the value was.
- An option for artificially limiting the size of a transaction to something less than the official limit. (This was proposed by @ryanworl in Issue #1466.) This is probably easier to implement than the other two options insofar as it would be taking a constant and making it something that the user can configure. (Possibly at a database-wide level, a transaction level, or both?) Oversized transactions would then show up as client-side errors, which would keep them from hurting the server.
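
To make the three proposals a bit more concrete, here is a rough Python sketch of how client code might use each of them. Everything in it is hypothetical: `MockTransaction`, `get_size`, `committed_size`, `size_limit`, and `TransactionTooLarge` are toy stand-ins invented for illustration, not anything in the actual bindings, and the size metric (key bytes plus value bytes) is only an assumption about what the real metric might look like.

```python
class TransactionTooLarge(Exception):
    """Hypothetical client-side error for the third proposal."""


class MockTransaction:
    """Toy stand-in for a transaction; NOT the real FoundationDB binding."""

    def __init__(self, size_limit=None):
        self.writes = {}
        self.size_limit = size_limit   # hypothetical configurable cap (proposal 3)
        self.committed_size = None     # hypothetical remembered size (proposal 2)

    def set(self, key, value):
        self.writes[key] = value

    def get_size(self):
        # Hypothetical proposal 1. Assumed metric: key bytes + value bytes.
        # This recomputes from scratch, so each call is linear in the number
        # of writes, which is the cost concern raised in the first bullet.
        return sum(len(k) + len(v) for k, v in self.writes.items())

    def commit(self):
        size = self.get_size()
        if self.size_limit is not None and size > self.size_limit:
            # Rejected on the client, before anything is sent to a server.
            raise TransactionTooLarge(f"{size} bytes > limit {self.size_limit}")
        self.committed_size = size


def build_with_size_check(items, threshold):
    """Proposal 1: keep adding work until the in-flight size hits threshold."""
    pending = list(items)
    sizes = []
    while pending:
        tr = MockTransaction()
        while pending and tr.get_size() < threshold:
            key, value = pending.pop(0)
            tr.set(key, value)
        tr.commit()
        sizes.append(tr.committed_size)
    return sizes


def build_with_feedback(items, target, batch=64):
    """Proposal 2: halve the next batch when the last commit exceeded target."""
    sizes = []
    pos = 0
    while pos < len(items):
        tr = MockTransaction()
        for key, value in items[pos:pos + batch]:
            tr.set(key, value)
        tr.commit()
        pos += batch  # advance by the batch size that was just committed
        sizes.append(tr.committed_size)
        if tr.committed_size > target:
            batch = max(1, batch // 2)
    return sizes
```

With the first approach the caller never overshoots the threshold by more than one write, whereas the feedback approach only finds out after the fact and may commit a few oversized transactions before it settles. The third approach needs no loop changes at all: constructing `MockTransaction(size_limit=...)` just makes `commit()` fail locally once the cap is exceeded.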
If people have thoughts on these proposals, that would be good to know. I think the first two of these would require an update to the libfdb_c API and therefore also all of the bindings. The third can be accomplished solely through adding more options in vexillographer (from an API point of view).