When using the Java client libraries to access an FDB cluster, what is the maximum amount of memory that the native C library (which the Java client library calls into) can allocate at any given time? Does it depend on how much data is returned in a single read/range iteration? Does it also depend on the number of concurrent server requests at a given point in time?
What is the strategy for releasing allocated memory back to the OS? Is there any way to influence the client to release memory back when allocations rise above a threshold?
If we want to cap this memory (allocated by the C client), are there any knobs or other suggestions to limit the memory usage (like using a particular iteration strategy to limit the maximum number of rows returned in a single RPC)?
Additionally, how can I get visibility into the amount of memory allocated by the C client library at any given point (logs, tools…)? The Java heap will not show that allocation, as it is outside the heap, and I am not very familiar with lower-level tools for measuring a JVM's off-heap memory.
Hi, just wanted to check again whether anyone has thoughts on this. This consideration is important for sizing the machines our clients will run on, and we also want to control the memory requirements of FDB clients due to a tight memory situation.
I’ll admit, I’m not the most knowledgeable on all of these topics, but I’ll try my best to answer what I know.
I’m not aware of any way to have the C client purposefully decrease its memory usage (by, say, using smaller buffers), but I could be wrong. Usually, I think you would do this by restricting the process’s memory usage or something similar.
Not too much. I suppose when you do a read, there will be slightly more memory used for the lifetime of that request, but the bigger thing is that transactions keep around a cache of read keys and read values for the lifetime of the transaction. So the easy way to fill up memory accidentally is to start a transaction, read a bunch of data, and then never close it. This is a big reason why it’s important to call close on any transaction you start.
I guess? There’s some amount of space used to store, say, the fact that you are making a request. I don’t think that’s where most of the memory is going most of the time.
All of the native objects are AutoCloseable and, upon close, free their underlying data in the C client (or release their reference). The big ones to watch out for are Transactions. These are automatically closed by the retry loops in the Java bindings, but it’s not impossible to write client code that doesn’t properly close them. They also release their references on object finalization (for now), but that’s not reliable in that the object might not be finalized at any predictable time. The other problem with relying on the finalizers is that they are invoked when the JVM is under memory pressure, which might be well after the native memory is under pressure. (For example, because transactions keep a large amount of state associated with what’s happened so far but have a fairly minimal representation in the JVM, the native memory might fill up with data from Transaction objects well before the JVM memory is used to any significant degree.)
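For illustration, closing transactions promptly might look like this (a rough sketch assuming the standard Java bindings; the API version is just a placeholder for whatever your client uses):

```java
import com.apple.foundationdb.Database;
import com.apple.foundationdb.FDB;
import com.apple.foundationdb.Transaction;

import java.nio.charset.StandardCharsets;

public class CloseTransactionsExample {
    public static void main(String[] args) {
        // Select the API version matching your installed client (placeholder here).
        FDB fdb = FDB.selectAPIVersion(600);
        try (Database db = fdb.open()) {
            // Preferred: the retry loop closes the Transaction for you.
            db.run(tr -> {
                tr.set("hello".getBytes(StandardCharsets.UTF_8),
                       "world".getBytes(StandardCharsets.UTF_8));
                return null;
            });

            // If you manage the transaction yourself, try-with-resources
            // guarantees the native memory behind it is released promptly
            // instead of waiting for finalization.
            try (Transaction tr = db.createTransaction()) {
                byte[] value = tr.get("hello".getBytes(StandardCharsets.UTF_8)).join();
            }
        }
    }
}
```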
Not to my knowledge. For the most part, you should be releasing references to native objects as soon as possible, and then there is less need to worry about forcing memory to be released.
I don’t think there are any knobs that are, like, “don’t allocate more than this much memory”, but there are a few related to the sizes of caches that might be useful to set (location cache size, for example). But fiddling with knobs in general is somewhat dangerous, and we don’t really document which ones are dangerous to set and which ones aren’t.
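For example, setting the location cache size looks something like this (a sketch; the value is an arbitrary example, not a tuning recommendation):

```java
import com.apple.foundationdb.Database;
import com.apple.foundationdb.FDB;

public class LocationCacheExample {
    public static void main(String[] args) {
        FDB fdb = FDB.selectAPIVersion(600);  // placeholder API version
        try (Database db = fdb.open()) {
            // Caps the number of key-range-to-storage-server mappings cached client-side.
            db.options().setLocationCacheSize(100_000);
            // ... use db ...
        }
    }
}
```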
I think all of the iteration strategies will be roughly equivalent. Decreasing the number of rows read per transaction might also help.
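If you want to bound how much a single range read pulls into the client, you can pass an explicit row limit and a conservative streaming mode, roughly like this (a sketch; the limit of 1,000 rows is arbitrary):

```java
import com.apple.foundationdb.KeyValue;
import com.apple.foundationdb.Range;
import com.apple.foundationdb.StreamingMode;
import com.apple.foundationdb.Transaction;

import java.util.List;

public class LimitedRangeRead {
    // Reads at most one bounded page of a range within an existing transaction.
    static List<KeyValue> readOnePage(Transaction tr, byte[] prefix) {
        return tr.getRange(
                    Range.startsWith(prefix),
                    1_000,               // at most 1,000 rows buffered for this call
                    false,               // forward iteration
                    StreamingMode.SMALL) // fetch in small batches
                .asList()
                .join();
    }
}
```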
You can also disable the read-your-writes cache through a transaction option: TransactionOptions (FoundationDB Java Client API). This will decrease the amount of memory used, and for read-only transactions, it doesn’t change semantics. (It will change the semantics of read/write transactions, so use with care.) It also means that, for read-only transactions, reading the same key twice will require two successive network requests, so my suggestion would be to try it after you’ve explored other options.
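Something like this, for a read-only transaction (a sketch):

```java
import com.apple.foundationdb.Database;
import com.apple.foundationdb.Transaction;

public class ReadWithoutRywCache {
    // Disables the read-your-writes cache for a read-only transaction.
    // Safe for read-only work, but it changes semantics for read/write
    // transactions, so apply it deliberately.
    static byte[] read(Database db, byte[] key) {
        try (Transaction tr = db.createTransaction()) {
            tr.options().setReadYourWritesDisable();
            return tr.get(key).join();
        }
    }
}
```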
Not sure. I think tools that work for profiling any C++ program would generally work for the FDB client (or I don’t see why they wouldn’t), but I haven’t actually tried.
Maybe @killertypo or @panghy have insight into what they’ve done for this kind of thing?
So to control memory use, you cannot rely on finalizers; you should just dispose native objects as aggressively as possible. Since that’s not always doable, what we do (in 3.x at least, but this is being ported to 5.x/6.x) is to only allow objects to live during the life of the transaction itself. We track every call to Transaction and collect the results in a map from transaction to native objects. When we dispose the transaction, we also dispose every object. We also print a message in the finalizer if we forgot to dispose an object and are now doing it in the finalizer.
With AutoCloseable I’d imagine it would be easier, but the chance that someone might forget is still high (I assume it’s easier to control how you start and end transactions, since I don’t think we ever start and commit/abort transactions manually, but YMMV).
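As a rough sketch of what that bookkeeping can look like (the class and method names here are made up, not part of the bindings):

```java
import com.apple.foundationdb.Transaction;

import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical tracker: every native object created on behalf of a transaction
// is registered against it, and closing the transaction closes everything it
// produced.
public final class NativeTracker {
    private final Map<Transaction, List<AutoCloseable>> owned = new ConcurrentHashMap<>();

    public <T extends AutoCloseable> T register(Transaction tr, T resource) {
        owned.computeIfAbsent(tr, k -> Collections.synchronizedList(new ArrayList<>()))
             .add(resource);
        return resource;
    }

    public void closeAll(Transaction tr) {
        List<AutoCloseable> resources = owned.remove(tr);
        if (resources != null) {
            for (AutoCloseable resource : resources) {
                try {
                    resource.close();
                } catch (Exception e) {
                    // Log and continue; one failed close shouldn't leak the rest.
                }
            }
        }
        tr.close();
    }
}
```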
Thank you @alloc and @panghy for the helpful suggestions!
In this post, I am less concerned about the JVM freeing up references to native pointers; rather, I am trying to get a better understanding of the fast-allocator pool size maintained in the C library and any memory held by it over a long time (even after the transactions using that memory have been closed).
Please refer to this post by @SteavedHams: fast-alloc. It talks about a long-term memory pool maintained by the fast allocator; transaction arenas borrow memory from this pool and return it to the pool when they are closed.
It also clarifies that, given a transaction limit of 10 MB and a duration of 5 seconds, there should be very little chance of the fast-allocator pool growing very large.
However, I wanted to get a better understanding of this and confirm the worst-case size of this common pool in the client process. Specifically:

1. Is the max pool size proportional to the number of concurrent write transactions (each transaction may need up to 10 MB, and there can be hundreds of such concurrent transactions)?
2. Does the max pool size depend on the number of concurrent read operations? Each range-read operation can retrieve a large amount of data in a single network call, and if the data from each call needs to be buffered in this pool, then the pool needs to grow to accommodate the reads.
3. If there is no cap on the worst-case size of this pool by default, are there any suggestions to limit it to a threshold, or otherwise keep it minimal?
4. How is the lifecycle of this pool managed? How soon does it shrink back when unused?
It may be that I have misunderstood the behavior of the fast-allocator pool in the C library, in which case the entire post is irrelevant.
Transactions do use memory in proportion to the work they are doing, so if a transaction does a lot of reads or writes, it will use more memory. Also, if a transaction isn’t able to get any free memory from the pool, it will instead allocate more. That means that if you have a lot of concurrent transactions, there will need to be enough memory allocated for all of them.
Limiting concurrency or the size of your transactions can help. Being timely with your destruction of relevant objects (e.g. Transaction) like Alec mentioned is also important, as failing to do so can effectively increase your concurrency.
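For example, bounding concurrency from the Java side might look roughly like this (just a sketch; the class and the limit are illustrative, not a recommendation):

```java
import com.apple.foundationdb.Database;
import com.apple.foundationdb.Transaction;

import java.util.concurrent.Semaphore;
import java.util.function.Function;

// Caps the number of transactions in flight so native memory stays roughly
// bounded by (max concurrent transactions) x (per-transaction memory limit).
public final class BoundedRunner {
    private static final int MAX_IN_FLIGHT = 32;  // illustrative value
    private final Semaphore inFlight = new Semaphore(MAX_IN_FLIGHT);

    public <T> T runBounded(Database db, Function<Transaction, T> body) throws InterruptedException {
        inFlight.acquire();
        try {
            return db.run(body);
        } finally {
            inFlight.release();
        }
    }
}
```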
I don’t believe this memory is ever returned back from the allocation pools. If you allocate a lot of it and stop using it, it will persist for the lifetime of the program waiting for someone else who needs it.
I would have to do some profiling of my client code over a longer term to see how much memory it is allocating off-heap, but do you think there should be some more control given to clients for limiting or reducing the memory footprint?
I think it will be very tricky to reason about long-term memory usage based on point-in-time concurrency and the size of operations (especially with the size of range reads).
If you’re interested, could you summarize your requirements for client memory management as a GitHub issue for consideration by the development team?
I’m not sure how easy the features will be to add without compromising performance as I believe our fast allocator was designed with speed as a top priority. I think the current design would make it difficult to reclaim memory, for example, but I haven’t given it a ton of thought.