FoundationDB

Intent/roadmap to handle larger value sizes?


(Ringo De Smet) #1

Hello,

The Known Limitations mention value sizes up to 100KB at the time of writing. Is there anything on the roadmap for handling larger value sizes?

Let’s assume for a minute FoundationDB could handle values in the [5-10]MB range. This could eventually lead to FoundationDB being a valid solution for an on-premise S3 replacement (cf. multi-part upload). Even better, I could join S3 and DynamoDB in a single storage solution, using different “layers” to store data and metadata in a single transaction.

Also, handling larger chunks would be a perfect fit for this HTTP extension: tus.io

Ringo


(Brian Haslet) #2

I believe Dave talked about this in an earlier post, and it should probably be in the FAQ since the question comes up so often.

It’s super easy to store huge values split over multiple keys; that’s not a problem. You just create a virtual table of keys where each successive key stores the next block. Then you can use a range query to get the entire blob back out in a streaming fashion.
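As a rough illustration of that layout, here is a minimal Python sketch. It uses a plain dict as a stand-in for the database and a sorted scan as a stand-in for a range query. The names `write_blob` and `read_blob` and the 100,000-byte `CHUNK_SIZE` are assumptions made for the example, not FoundationDB API.

```python
CHUNK_SIZE = 100_000  # assumed per-value limit in bytes (the "100KB" from the thread)

def write_blob(store, blob_id, data, chunk_size=CHUNK_SIZE):
    """Store data as successive chunks under keys (blob_id, 0), (blob_id, 1), ..."""
    for index, offset in enumerate(range(0, len(data), chunk_size)):
        store[(blob_id, index)] = data[offset:offset + chunk_size]

def read_blob(store, blob_id):
    """Yield chunks in key order, one at a time, like a streaming range read."""
    for key in sorted(k for k in store if k[0] == blob_id):
        yield store[key]

# Stand-in for the database: a dict keyed by (blob_id, chunk_index) tuples.
store = {}
payload = bytes(250_000)
write_blob(store, "doc-1", payload)
restored = b"".join(read_blob(store, "doc-1"))
```

Because `read_blob` is a generator, a caller can forward each chunk to the client as it arrives instead of holding the whole blob in memory.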

The issue is doing it in a way that works generically and makes sense as a layer that works across different user applications, since your key space might be mixed up with all kinds of other information.

That said, if that were your primary use case for FDB this would not be a problem.

This is also closely related to the FoundationDB as a Document Store question that also comes up a lot. It’s pretty much the same question.

In any case, yes you can do it, with very little effort and don’t need to wait until there is an official layer that provides this functionality.


(Ringo De Smet) #3

@pineapple,

I understand that I can do it now and it was clear to me that I need to manage the list of the chunk keys. I am currently not after a generic way to handle this.

But with a max value (chunk) size of 100KB, it’s not an easy mapping from HTTP client uploads, which offer chunks of e.g. 5MB. I would still have to chop that up even further into chunks of 100KB, and do the reverse on retrieval. I would rather have the 1-1 mapping in place.


(Brian Haslet) #4

Sure, I understand the desire for a single API where you just give it a byte array of whatever size and it simply stores or retrieves it, but you can do that.

It’s really not hard, and if you need help please don’t hesitate to PM me, though I’m certain that if you are willing to take on FDB then you can do it.

The mapping is absolutely no worse than reading or writing a block at a time from disk.
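To make the block-device comparison concrete, here is a minimal Python sketch of reading an arbitrary byte range (say, one 5MB client part) from a document stored as fixed-size chunks. The dict `store`, the function `read_range`, and the 100,000-byte `VALUE_LIMIT` are assumptions for illustration. The sketch also assumes every chunk except the last is exactly `VALUE_LIMIT` bytes and that the requested range lies inside the document.

```python
VALUE_LIMIT = 100_000  # assumed chunk size in bytes

def read_range(store, doc_id, offset, length):
    """Return bytes [offset, offset + length) of a document stored in fixed-size chunks."""
    if length <= 0:
        return b""
    first = offset // VALUE_LIMIT                # chunk holding the first requested byte
    last = (offset + length - 1) // VALUE_LIMIT  # chunk holding the last requested byte
    data = b"".join(store[(doc_id, i)] for i in range(first, last + 1))
    start = offset - first * VALUE_LIMIT         # position of `offset` inside `data`
    return data[start:start + length]

# Build a 350,000-byte document as four chunks, then read a range that spans three of them.
payload = bytes(i % 251 for i in range(350_000))
store = {
    ("doc", index): payload[o:o + VALUE_LIMIT]
    for index, o in enumerate(range(0, len(payload), VALUE_LIMIT))
}
part = read_range(store, "doc", 95_000, 120_000)
```

The arithmetic is the same as translating a file offset into block numbers: only the chunks that overlap the requested range are fetched.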


(Brian Haslet) #5

Just to go further here:

  1. An HTTP POST request comes in for a 133K upload (see Content-Length)
  2. Generate a document ID for it and a content frame ID of 0
  3. Start receiving the content frames asynchronously (maintain an accumulator buffer)
  4. When data is received add it to the accumulator
  5. If the accumulator holds more than 100KB, chop off the first 100KB and write it with key tuple [ documentID contentFrameId ]
    … Increment the content frame ID
  6. If you reached the end and accumulator size > 0 then write the last block
  7. Otherwise go back to receiving more data

So you get

  1. Upload initiated - generate document ID 0 for document of size 133K
  2. You receive 50K from client and accumulate
  3. You receive 60K from the client and accumulate
  4. Accumulator is now 110K which is bigger than 100K
  5. Write 100KB to key [ 0 0 ] (Document ID 0, Block 0)
  6. Set accumulator to remaining 10K from the read overshoot (50K + 60K > 100K)
  7. You receive another 23K
  8. Append to Accumulator which is now 10K + 23K => 33K
  9. Because you are done, write 33K to key [ 0 1 ] (Document ID 0, Block 1)
  10. And you are done saving the asynchronously uploaded document
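The steps and the worked trace above can be sketched in Python roughly as follows. This is an illustration under assumptions: `chunk_stream` is a hypothetical helper, a dict stands in for the database, "K" is taken as 1,000 bytes, and the inner `while` loop (rather than a single `if`) is a small addition so that one large frame can produce several blocks.

```python
BLOCK_SIZE = 100_000  # assumed maximum value size in bytes

def chunk_stream(frames, block_size=BLOCK_SIZE):
    """Yield (content_frame_id, block) pairs from an iterable of incoming byte frames."""
    accumulator = bytearray()
    frame_id = 0
    for frame in frames:
        accumulator += frame
        # Flush full blocks; whatever overshoots stays in the accumulator.
        while len(accumulator) >= block_size:
            yield frame_id, bytes(accumulator[:block_size])
            del accumulator[:block_size]
            frame_id += 1
    # End of upload: write whatever is left as the last (short) block.
    if accumulator:
        yield frame_id, bytes(accumulator)

# Replay the trace: the client sends 50K, then 60K, then 23K (133K total).
store = {}
document_id = 0
incoming = [bytes(50_000), bytes(60_000), bytes(23_000)]
for frame_id, block in chunk_stream(incoming):
    store[(document_id, frame_id)] = block
```

After the loop, `store` holds a 100,000-byte value under `(0, 0)` and a 33,000-byte value under `(0, 1)`, matching the trace.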

When a download is requested for Document 0, you do a range query for keys with prefix [ 0 ] (Document ID 0)

That will get you [ 0 0 ] and [ 0 1 ] sequentially

You get the first block [ 0 0 ] (100K) and send to client
You get the second block [ 0 1 ] (33K) and send to client
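The prefix range read only returns blocks in the right order if the encoded keys sort the same way as the block numbers do. Here is a minimal Python sketch of one order-preserving encoding, using fixed-width big-endian integers from the standard `struct` module. The helpers `block_key`, `prefix_range`, and `read_document` are hypothetical, a sorted dict scan stands in for the range query, and this encoding is only a stand-in for whatever tuple encoding the real key layer uses.

```python
import struct

def block_key(document_id, block_index):
    """Pack both numbers as big-endian 64-bit ints so byte order equals numeric order."""
    return struct.pack(">QQ", document_id, block_index)

def prefix_range(document_id):
    """Return (begin, end) byte keys covering every block of one document.

    Assumes document_id is below 2**64 - 1 so that document_id + 1 still fits.
    """
    return struct.pack(">Q", document_id), struct.pack(">Q", document_id + 1)

def read_document(store, document_id):
    """Yield the document's blocks in key order, like a range query over the prefix."""
    begin, end = prefix_range(document_id)
    for key in sorted(store):
        if begin <= key < end:
            yield store[key]

# Twelve blocks for document 0, plus one block of an unrelated document 1.
store = {block_key(0, i): b"block-%d;" % i for i in range(12)}
store[block_key(1, 0)] = b"other document"
document = b"".join(read_document(store, 0))
```

A naive textual key would not work here: as strings, `"0/10"` sorts before `"0/2"`, so block 10 would be streamed ahead of block 2.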

Hope this makes sense; despite my lame flowchart, you get the idea.

[Edited for my bad math etc. :) ]


(Christophe Chevalier) #6

Splitting a value into smaller chunks is “easy” enough, but the less visible gotcha is that you also have a 10MB per-transaction limit!

So if you want to store a “file” larger than 10MB (even chunked into 100KB values), it will not be possible to do it in a single transaction; you need two or more!
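A minimal Python sketch of how the chunks of a large file might be grouped into several transactions follows. Everything here is an assumption for illustration: `plan_transactions` is a hypothetical helper, the limits are taken as 10,000,000 and 100,000 bytes, the 32-byte key size is a guess, and the default budget of half the limit is an arbitrary safety margin rather than a recommended value.

```python
TRANSACTION_LIMIT = 10_000_000  # assumed per-transaction size limit in bytes
VALUE_LIMIT = 100_000           # assumed per-value size limit in bytes

def plan_transactions(chunk_sizes, key_size=32, budget=TRANSACTION_LIMIT // 2):
    """Group chunk indices into batches whose keys plus values fit within budget."""
    batches, current, used = [], [], 0
    for index, size in enumerate(chunk_sizes):
        cost = key_size + size
        # Start a new batch (one transaction) when this chunk would exceed the budget.
        if current and used + cost > budget:
            batches.append(current)
            current, used = [], 0
        current.append(index)
        used += cost
    if current:
        batches.append(current)
    return batches

# A 25MB file split into 250 chunks of 100,000 bytes each.
chunk_sizes = [VALUE_LIMIT] * 250
batches = plan_transactions(chunk_sizes)
```

With these numbers each batch holds at most 49 chunks, so the 25MB file needs 6 transactions, and nothing ties those transactions together atomically.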

But now:

  1. a concurrent reader could observe a partially uploaded file, so you need additional signalling with a “status” key, and so on,
  2. what happens if the upload crashes mid-flight, or if the web worker process dies? You need to do garbage collection of fragmented or incomplete documents at some point. But which process will do it? How often?
  3. if you want to support resuming by the user, you will need some sort of index to map the “random” document ID into a specific file or user session.

It can grow in complexity pretty fast!
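To give a feel for points 1 and 2, here is a minimal Python sketch of a “status” key plus a garbage collector. It is a simplification under stated assumptions: a dict stands in for the database, each function call stands in for one transaction, all function names are hypothetical, and an upload is treated as abandoned once it has been in the "uploading" state for longer than a caller-chosen `max_age`. Resumable uploads (point 3) are not covered.

```python
def begin_upload(store, doc_id, now):
    """Mark the document as in progress before any chunk is written."""
    store[("status", doc_id)] = ("uploading", now)

def write_batch(store, doc_id, first_index, chunks):
    """Write one batch of chunks; one call stands in for one transaction."""
    for offset, chunk in enumerate(chunks):
        store[("data", doc_id, first_index + offset)] = chunk

def finish_upload(store, doc_id, now):
    """Flip the status key last, so readers never see a partial document."""
    store[("status", doc_id)] = ("complete", now)

def read_document(store, doc_id):
    """Return the document bytes, or None if it is missing or not yet complete."""
    status = store.get(("status", doc_id))
    if status is None or status[0] != "complete":
        return None
    keys = sorted(k for k in store if k[0] == "data" and k[1] == doc_id)
    return b"".join(store[k] for k in keys)

def collect_garbage(store, now, max_age):
    """Delete every key of uploads stuck in 'uploading' for longer than max_age."""
    stale = [key[1] for key, value in list(store.items())
             if key[0] == "status" and value[0] == "uploading" and now - value[1] > max_age]
    for doc_id in stale:
        for key in [k for k in store if k[1] == doc_id]:
            del store[key]
    return stale

store = {}
begin_upload(store, "a", now=0)
write_batch(store, "a", 0, [b"hello ", b"world"])
finish_upload(store, "a", now=5)
begin_upload(store, "b", now=10)
write_batch(store, "b", 0, [b"partial"])  # the worker dies here, so "b" is never finished
removed = collect_garbage(store, now=100, max_age=60)
```

Even this toy version needs a status record, a timestamp, and a separate sweeper, and it still leaves open which process runs the sweeper and how often.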