Intent/roadmap to handle larger value sizes?

ringods · April 20, 2018, 1:08pm

Hello,

The Known Limitations mention value sizes up to 100KB at the time of writing. Is there anything on the roadmap for handling larger value sizes?

Let’s assume for a minute FoundationDB could handle values in the [5-10]MB range. This could eventually make FoundationDB a valid solution for an on-premise S3 replacement (cf. multi-part upload). Even better, I could join S3 and DynamoDB in a single storage solution using different “layers” to store data and metadata in a single transaction.

Also, handling larger chunks would be a perfect fit for this HTTP extension: tus.io

Ringo

pineapple · April 20, 2018, 1:18pm

I believe Dave talked about this in an earlier post, and it should probably be in the FAQ since the question comes up so often.

It’s super easy to store huge values split over multiple keys; that’s not a problem. You just create a virtual table of keys where each successive key stores the next block. Then you can use a range query to get the entire blob back out in a streaming fashion.
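One detail worth spelling out: the range query only returns blocks in the right order if the keys sort in block order. Here is a minimal sketch of that, assuming a key made of two fixed-width big-endian integers. The names `block_key` and `doc_prefix` are made up for illustration; in practice the FoundationDB tuple layer offers an order-preserving encoding, and `struct` is used here only to keep the example self-contained.

```python
import struct

def block_key(doc_id: int, block_no: int) -> bytes:
    """Hypothetical key layout: two unsigned 64-bit big-endian integers.

    Big-endian, fixed-width integers compare byte-wise in numeric order,
    so block 10 sorts after block 2.
    """
    return struct.pack(">QQ", doc_id, block_no)

def doc_prefix(doc_id: int) -> bytes:
    """Prefix shared by every block key of one document."""
    return struct.pack(">Q", doc_id)

# Keys created out of order still sort into block order.
keys = sorted(block_key(7, n) for n in (10, 2, 0, 1))
order = [struct.unpack(">QQ", k)[1] for k in keys]

# A naive text key does not: "7/10" sorts before "7/2".
naive = sorted(["7/2", "7/10"])
```

With the packed keys, every block of document 7 shares `doc_prefix(7)`, so a single prefix range read covers the whole blob.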

The issue is doing it in a way that works generically and makes sense as a layer that works across different user applications, since your key space might be mixed up with all kinds of other information.

That said, if that were your primary use case for FDB this would not be a problem.

This is also closely related to the FoundationDB as a Document Store question that also comes up a lot. It’s pretty much the same question.

In any case, yes you can do it, with very little effort and don’t need to wait until there is an official layer that provides this functionality.

ringods · April 20, 2018, 2:39pm

@pineapple,

I understand that I can do it now and it was clear to me that I need to manage the list of the chunk keys. I am currently not after a generic way to handle this.

But with a max value (chunk) size of 100KB, it’s not an easy mapping from HTTP client uploads, which offer chunks of e.g. 5MB. I would still have to chop those up further into chunks of 100KB, and reassemble them on retrieval. I would rather have the 1-1 mapping in place.

pineapple · April 20, 2018, 4:40pm

Sure, I understand the desire for a single API that you just give a byte array of whatever size and it simply stores or retrieves it, but you can build that yourself.

It’s really not hard, and if you need help please don’t hesitate to PM me, though I’m certain that if you are willing to take on FDB then you can do it.

The mapping is absolutely no worse than reading or writing a block at a time from disk.

pineapple · April 20, 2018, 6:26pm

Just to go further here:

An HTTP post request comes in for a 133K upload (See Content-Length)
Generate a document ID for it and a content frame ID of 0
Start receiving the content frames asynchronously (maintain an accumulator buffer)
When data is received add it to the accumulator
If the accumulator is more than 100KB, chop off the first 100KB and write it with key tuple [ documentID contentFrameId ]
… Increment the content frame ID
If you reached the end and accumulator size > 0 then write the last block
Otherwise go back to receiving more data

So you get

Upload initiated - generate document ID 0 for document of size 133K
You receive 50K from client and accumulate
You receive 60K from client and accumulate
Accumulator is now 110K which is bigger than 100K
Write 100KB to key [ 0 0 ] (Document ID 0, Block 0)
Set accumulator to remaining 10K from the read overshoot (50K + 60K > 100K)
You receive another 23K
Append to Accumulator which is now 10K + 23K => 33K
Because you are done, write 33K to key [ 0 1 ] (Document ID 0, Block 1)
And you are done saving the asynchronously uploaded document
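
The steps and walk-through above can be sketched roughly as follows. This is only an illustration under some assumptions: `chunk_stream` and `BLOCK_SIZE` are made-up names, "100K" is taken to mean 100,000 bytes, and the generator just yields `(key, value)` pairs instead of writing to a database. It also loops while the accumulator holds a full block, so a single frame larger than one block is still split correctly.

```python
from typing import Iterable, Iterator, Tuple

BLOCK_SIZE = 100_000  # assumption: "100K" read as 100,000 bytes

def chunk_stream(
    doc_id: int,
    frames: Iterable[bytes],
    block_size: int = BLOCK_SIZE,
) -> Iterator[Tuple[Tuple[int, int], bytes]]:
    """Yield ((doc_id, frame_id), block) pairs from incoming frames."""
    acc = bytearray()   # the accumulator buffer
    frame_id = 0        # content frame ID, starts at 0
    for frame in frames:
        acc += frame
        # Emit full blocks; the remainder (the "overshoot") stays buffered.
        while len(acc) >= block_size:
            yield (doc_id, frame_id), bytes(acc[:block_size])
            del acc[:block_size]
            frame_id += 1
    if acc:
        # End of upload: write whatever is left as the last block.
        yield (doc_id, frame_id), bytes(acc)

# The 133K upload from the walk-through: 50K, then 60K, then 23K.
frames = [b"a" * 50_000, b"b" * 60_000, b"c" * 23_000]
blocks = list(chunk_stream(0, frames))
sizes = [(key, len(value)) for key, value in blocks]
```

Here `sizes` holds one 100,000-byte block under key (0, 0) and one 33,000-byte block under key (0, 1), matching the walk-through.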

When a download is requested for Document 0, you do a range query for keys with prefix [ 0 ] (Document ID 0)

That will get you [ 0 0 ] and [ 0 1 ] sequentially

You get the first block [ 0 0 ] (100K) and send to client
You get the second block [ 0 1 ] (33K) and send to client
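
The read path can be sketched the same way. This is a stand-in, not real FoundationDB code: `store` is a plain dict keyed by `(doc_id, block_no)` tuples, sorting its keys imitates the database's key order, and `read_document` is a hypothetical name. Block numbers are assumed to be non-negative.

```python
from bisect import bisect_left
from typing import Dict, Iterator, Tuple

Key = Tuple[int, int]  # (document ID, block number)

def read_document(store: Dict[Key, bytes], doc_id: int) -> Iterator[bytes]:
    """Yield the blocks of one document in key order, one at a time."""
    keys = sorted(store)                    # stand-in for the ordered key space
    start = bisect_left(keys, (doc_id, 0))  # first key with this prefix
    for key in keys[start:]:
        if key[0] != doc_id:                # left the prefix range: stop
            break
        yield store[key]                    # could be sent to the client here

store = {
    (0, 0): b"x" * 100_000,
    (0, 1): b"y" * 33_000,
    (1, 0): b"other",   # another document sharing the key space
}
body = b"".join(read_document(store, 0))
```

Because the function is a generator, each block can be forwarded to the client as soon as it is read, without holding the whole document in memory.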

Hope this makes sense, I mean despite my lame flowchart you get the idea

[Edited for my bad math etc ]

KrzysFR · April 20, 2018, 7:19pm

Splitting a value into smaller chunks is “easy” enough, but the less visible gotcha is that you also have a 10MB per-transaction limit!

So if you want to store a “file” larger than 10MB (even chunked into 100KB values), it will not be possible to do it in a single transaction; you need two or more!
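
To make the "two or more transactions" point concrete, here is a rough sketch that groups blocks into batches that each stay under a byte budget. Everything here is an assumption for illustration: `plan_transactions` is a made-up helper, the 9,000,000-byte budget is an arbitrary margin below the 10MB limit, and only key and value bytes are counted, whereas the database's own accounting may count more.

```python
import struct
from typing import Iterable, Iterator, List, Tuple

KV = Tuple[bytes, bytes]
TXN_BUDGET = 9_000_000  # assumed safety margin under the 10MB limit

def plan_transactions(
    blocks: Iterable[KV], budget: int = TXN_BUDGET
) -> Iterator[List[KV]]:
    """Group (key, value) pairs into batches, one batch per transaction."""
    batch: List[KV] = []
    used = 0
    for key, value in blocks:
        cost = len(key) + len(value)
        # Close the current batch if this block would exceed the budget.
        if batch and used + cost > budget:
            yield batch
            batch, used = [], 0
        batch.append((key, value))
        used += cost
    if batch:
        yield batch

# A 25MB "file" already split into 250 blocks of 100,000 bytes each.
blocks = [(struct.pack(">QQ", 0, n), bytes(100_000)) for n in range(250)]
batches = list(plan_transactions(blocks))
batch_sizes = [len(b) for b in batches]
```

Each batch would be committed in its own transaction, which is exactly why the upload is no longer atomic.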

But now:

a concurrent reader could observe a partially uploaded file, so you need additional signalling with a “status” key, and so on;
what happens if the upload crashes mid-flight, or if the web worker process dies? You need to garbage-collect fragmented or incomplete documents at some point. But which process will do it, and how often?
if you want to support resuming by the user, you will need some sort of index to map the “random” document ID to a specific file or user session.
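
A minimal sketch of the first two points, using an in-memory stand-in where each method models one transaction. `BlobStore`, its method names, the "pending"/"complete" states, and the age-based cleanup rule are all assumptions for illustration; resumable uploads are not covered.

```python
from typing import Dict, List, Optional, Tuple

class BlobStore:
    """In-memory stand-in; each method models one database transaction."""

    def __init__(self) -> None:
        self.blocks: Dict[Tuple[int, int], bytes] = {}  # (doc, block) -> data
        self.status: Dict[int, Tuple[str, float]] = {}  # doc -> (state, start)

    def begin(self, doc_id: int, now: float) -> None:
        self.status[doc_id] = ("pending", now)

    def write_block(self, doc_id: int, block_no: int, data: bytes) -> None:
        self.blocks[(doc_id, block_no)] = data

    def commit(self, doc_id: int) -> None:
        _, started = self.status[doc_id]
        self.status[doc_id] = ("complete", started)

    def read(self, doc_id: int) -> Optional[bytes]:
        # Readers ignore anything not marked complete.
        if self.status.get(doc_id, ("missing", 0.0))[0] != "complete":
            return None
        keys = sorted(k for k in self.blocks if k[0] == doc_id)
        return b"".join(self.blocks[k] for k in keys)

    def collect_garbage(self, now: float, max_age: float) -> List[int]:
        # Drop uploads that have been pending for longer than max_age.
        stale = [d for d, (state, t) in self.status.items()
                 if state == "pending" and now - t > max_age]
        for d in stale:
            for k in [k for k in self.blocks if k[0] == d]:
                del self.blocks[k]
            del self.status[d]
        return stale

store = BlobStore()
store.begin(1, now=0.0)
store.write_block(1, 0, b"abc")
partial = store.read(1)      # upload still pending
store.commit(1)
complete = store.read(1)     # now visible to readers

store.begin(2, now=0.0)      # an upload that dies mid-flight
store.write_block(2, 0, b"zzz")
removed = store.collect_garbage(now=100.0, max_age=60.0)
```

Which process runs the cleanup, and how often, is left open here just as it is in the questions above.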

It can grow in complexity pretty fast!
