Variable chunk size for blobs

janderland · February 21, 2019, 1:20am

Here is the page describing storage of blobs in Java. You can see the two lines for determining the chunk size below:

int numChunks = (data.length() + CHUNK_SIZE - 1)/CHUNK_SIZE;
int chunkSize = (data.length() + numChunks)/numChunks;

What is the purpose of the chunkSize variable? Why not simply use the CHUNK_SIZE constant?

KrzysFR · February 21, 2019, 9:39pm

I think it’s trying to equalize the size of all chunks. With a fixed CHUNK_SIZE, the last chunk may be much smaller than the others, while here they all end up roughly the same size (the last one can still be a bit shorter)… though I’m not sure if this is a real concern: I don’t see any perf reason to go one way or the other, and equalizing the size of all chunks makes it harder to support append semantics on your blobs, because the chunk size then depends on the total length of the data.
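To make the difference concrete, here is a small sketch that applies the two lines from the question to a plain length and lists the resulting chunk sizes next to what a fixed CHUNK_SIZE would give. The class and method names (ChunkSizes, equalized, fixed) are made up for illustration, and the guard for empty input is an addition, since the original formula would divide by zero when the length is 0.

```java
import java.util.ArrayList;
import java.util.List;

class ChunkSizes {
    // Chunk sizes produced by the two-line formula quoted in the question.
    static List<Integer> equalized(int length, int maxChunk) {
        List<Integer> sizes = new ArrayList<>();
        if (length == 0) {
            return sizes; // assumption: empty data yields no chunks
        }
        int numChunks = (length + maxChunk - 1) / maxChunk; // ceil(length / maxChunk)
        int chunkSize = (length + numChunks) / numChunks;   // floor(length / numChunks) + 1
        for (int start = 0; start < length; start += chunkSize) {
            sizes.add(Math.min(chunkSize, length - start));
        }
        return sizes;
    }

    // Chunk sizes when every chunk except the last is exactly maxChunk long.
    static List<Integer> fixed(int length, int maxChunk) {
        List<Integer> sizes = new ArrayList<>();
        for (int start = 0; start < length; start += maxChunk) {
            sizes.add(Math.min(maxChunk, length - start));
        }
        return sizes;
    }

    public static void main(String[] args) {
        System.out.println(equalized(21, 10)); // [8, 8, 5]
        System.out.println(fixed(21, 10));     // [10, 10, 1]
        System.out.println(equalized(20, 10)); // [11, 9]
    }
}
```

One thing the last line shows: when the length is an exact multiple of the limit, the formula as written yields a chunk size one larger than the limit (11 for a limit of 10), so the constant behaves as a target rather than a strict upper bound.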

Another thing: the code referenced uses the Tuple encoding to encode the value… which is a bit wasteful. I would have simply split the bytes of the string’s UTF-8 representation into smaller chunks. This is even simpler when storing plain bytes.
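A minimal sketch of that alternative, assuming the blob is a String and that each resulting byte array would be stored as a raw value under its own key. ByteChunker, split and join are hypothetical names, and no FoundationDB calls are shown.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ByteChunker {
    // Cut the raw bytes into pieces of at most chunkSize bytes each.
    static List<byte[]> split(byte[] data, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<byte[]> chunks = new ArrayList<>();
        for (int start = 0; start < data.length; start += chunkSize) {
            int end = Math.min(start + chunkSize, data.length);
            chunks.add(Arrays.copyOfRange(data, start, end));
        }
        return chunks;
    }

    // Concatenate the chunks back into a single byte array, in order.
    static byte[] join(List<byte[]> chunks) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (byte[] chunk : chunks) {
            out.write(chunk, 0, chunk.length);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] bytes = "h\u00e9llo w\u00f6rld".getBytes(StandardCharsets.UTF_8);
        List<byte[]> chunks = split(bytes, 4);
        // Decode only after all chunks are joined: a chunk boundary may fall
        // in the middle of a multi-byte UTF-8 character.
        String roundTrip = new String(join(chunks), StandardCharsets.UTF_8);
        System.out.println(chunks.size() + " chunks, round trip ok: "
                + roundTrip.equals("h\u00e9llo w\u00f6rld"));
    }
}
```

Because the split happens on bytes rather than characters, individual chunks are not necessarily valid UTF-8 on their own. That is harmless as long as the reader concatenates every chunk before decoding.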

KrzysFR · February 21, 2019, 9:47pm

If you are looking for a “full featured” blob layer implementation, there was an original Blob Layer sample written in Python, but it is probably not available anymore.

I ported it to C# a long time ago, and it’s still available here: https://github.com/Doxense/foundationdb-dotnet-client/blob/master/FoundationDB.Layers.Common/Blobs/FdbBlob.cs

Disclaimer: In practice, I never used this implementation because I don’t really need to support sparse files or shrink/truncate blobs, and the representation of the keys is a bit weird. The implementation we use is a lot more straightforward, and adds support for compression, deduplication, and attribute indexing.
