Here is the page describing storage of blobs in Java. You can see the two lines for determining the chunk size below:

```java
int numChunks = (data.length() + CHUNK_SIZE - 1) / CHUNK_SIZE;
int chunkSize = (data.length() + numChunks) / numChunks;
```
What is the purpose of the chunkSize variable? Why not simply use the CHUNK_SIZE constant directly?
I think it’s trying to equalize the size of all chunks. With CHUNK_SIZE, the last chunk may be smaller than the others, whereas here they all end up roughly the same size. I’m not sure this is a real concern, though: I don’t see any performance reason to go one way or the other, and equalizing the size of all chunks makes it a bit difficult to support append semantics on your blobs.
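To make the difference concrete, here is a small standalone sketch; the 1024-byte CHUNK_SIZE and the 10 000-byte blob length are made-up values for illustration, not taken from the page:

```java
public class ChunkSizes {
    static final int CHUNK_SIZE = 1024; // assumed value, just for the example

    // Ceiling division: the minimum number of chunks of at most CHUNK_SIZE bytes.
    static int numChunks(int length) {
        return (length + CHUNK_SIZE - 1) / CHUNK_SIZE;
    }

    // With fixed CHUNK_SIZE chunks, the last chunk holds whatever is left over.
    static int lastChunkFixed(int length) {
        return length - (numChunks(length) - 1) * CHUNK_SIZE;
    }

    // The equalized size from the quoted code: spreads the bytes over the same
    // number of chunks, so all chunks come out close to length / numChunks.
    static int equalizedChunkSize(int length) {
        int n = numChunks(length);
        return (length + n) / n;
    }

    public static void main(String[] args) {
        // For a 10 000-byte blob there are 10 chunks either way, but the fixed
        // scheme gives nine 1024-byte chunks plus a 784-byte tail, while the
        // equalized scheme gives nine 1001-byte chunks plus a 991-byte tail.
        System.out.println(numChunks(10_000) + " "
                + lastChunkFixed(10_000) + " "
                + equalizedChunkSize(10_000));
    }
}
```

This also shows why appends are awkward with equalized chunks: growing the blob changes length, which changes chunkSize, which would force rewriting every existing chunk rather than just adding one at the end.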
Another thing: the referenced code uses the Tuple encoding to encode the value, which is a bit wasteful. I would have simply split the bytes obtained from the UTF-8 representation of the string into smaller chunks. This is even simpler when storing plain bytes.
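A minimal sketch of that simpler approach might look like this (the split helper and the 8-byte chunk size are hypothetical, purely for illustration):

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class RawChunks {
    // Split a value's raw UTF-8 bytes into chunks of at most maxChunkSize bytes,
    // with no Tuple encoding around the payload. A chunk boundary may fall in
    // the middle of a multi-byte code point, which is fine for storage as long
    // as readers concatenate all chunks before decoding back to a string.
    static List<byte[]> split(String value, int maxChunkSize) {
        byte[] bytes = value.getBytes(StandardCharsets.UTF_8);
        List<byte[]> chunks = new ArrayList<>();
        for (int offset = 0; offset < bytes.length; offset += maxChunkSize) {
            int end = Math.min(offset + maxChunkSize, bytes.length);
            chunks.add(Arrays.copyOfRange(bytes, offset, end));
        }
        return chunks;
    }

    public static void main(String[] args) {
        // A 19-byte string split into 8-byte chunks yields sizes 8, 8, 3.
        for (byte[] c : split("hello, blob storage", 8)) {
            System.out.println(c.length);
        }
    }
}
```

Each byte[] would then be written as-is under its chunk key, so nothing is spent on encoding overhead for the value.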
If you are looking for a “full featured” blob layer implementation, there was an original Blob Layer sample written in Python, but it is probably not available anymore.
I did port it to C# a long time ago, it’s still available here: https://github.com/Doxense/foundationdb-dotnet-client/blob/master/FoundationDB.Layers.Common/Blobs/FdbBlob.cs
Disclaimer: In practice, I never used this implementation, because I don’t really need to support sparse files or shrink/truncate blobs, and the representation of the keys is a bit weird. The implementation we use is a lot more straightforward, and adds support for compression, deduplication, and attribute indexing.