Hi,
I want to better understand how key and value sizes determine the performance of various read/write operations. Would a high-level description of how keys and values are laid out in the storage system help here? The documentation says:
> Maintaining efficient key and value sizes is essential to getting maximum performance with FoundationDB. As a rule, smaller key sizes are better, as they can be more efficiently transmitted over the network and compared. In concrete terms, the highest performance applications will keep key sizes below 32 bytes. Key sizes above 10 kB are not allowed, and sizes above 1 kB should be avoided; store the data in the value if possible.
>
> Value sizes are more flexible, with 0-10 kB normal. Value sizes cannot exceed 100 kB. Like any similar system, FoundationDB has a "characteristic value size" where the fixed costs of the random read roughly equal the marginal costs of the actual bytes. For FoundationDB running on SSD hardware, this characteristic size is roughly 1 kB for randomly accessed data and roughly 100 bytes for frequently accessed (cached) data.
(Could someone explain the reference to "characteristic value size" above in simpler terms?)
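My rough reading of that paragraph, in case it helps frame the question: a random read has a fixed cost (request overhead, seek) plus a marginal cost per byte returned, and the characteristic value size is the point where the two are equal. A toy model, with made-up constants chosen only so the break-even lands near 1 kB:

```python
# Toy cost model for one random read: fixed overhead plus a per-byte cost.
# Both constants are made up for illustration; they are not FDB measurements.
FIXED_COST_US = 100.0   # hypothetical fixed cost per random read (microseconds)
PER_BYTE_US = 0.1       # hypothetical marginal cost per byte (microseconds)

def read_cost_us(value_bytes: int) -> float:
    return FIXED_COST_US + PER_BYTE_US * value_bytes

# Characteristic size: where the byte cost equals the fixed cost.
print(FIXED_COST_US / PER_BYTE_US)   # 1000 bytes, i.e. ~1 kB

for size in (100, 1_000, 10_000):
    print(f"{size:>6} B -> {read_cost_us(size):.0f} us")
# At 100 B the cost is ~90% fixed overhead (reading a bigger value would be
# almost free); at 10 kB it is ~90% byte transfer, so cost scales with size.
```

If that reading is wrong, I'd appreciate a correction.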
For instance, suppose I need to store rows where keys are 64 bytes and values are roughly 10 kB. I can think of two ways to store these:
Option 1 (value stored directly):

    key -> value

Option 2 (indirection through a small index):

    keys/key -> pointer
    data/pointer -> value
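To make those concrete, here is a minimal sketch of the two write paths using the Python bindings (the prefixes `kv/`, `keys/`, `data/` and the API version are my own placeholder choices):

```python
import fdb

fdb.api_version(630)  # assumption: pick the version matching your cluster
db = fdb.open()

KV = b'kv/'      # option 1: value lives directly under the key
IDX = b'keys/'   # option 2: small index row ...
DATA = b'data/'  # ... pointing at a separate data row

@fdb.transactional
def write_option1(tr, key, value):
    # One row touched: the ~10 kB value sits under the 64-byte key.
    tr[KV + key] = value

@fdb.transactional
def write_option2(tr, key, pointer, value):
    # Two rows touched: a tiny index entry, plus the large value under
    # a surrogate key (the "pointer").
    tr[IDX + key] = pointer
    tr[DATA + pointer] = value
```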
I would need to do a mixture of key lookups (using KeySelectors, as I do not know the exact key) and range queries over ranges of 10-100 keys. There would be hundreds of reads for every write operation.
In option 1, fewer rows are touched at both read and write time; in option 2, the index rows are much smaller, which would hypothetically mean faster lookups and fewer disk page reads at read time.
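Continuing the sketch above, here is how I imagine the two read paths would differ (again hypothetical helper names; using `b'\xff'` as the range end assumes keys contain no 0xff bytes):

```python
@fdb.transactional
def read_range_option1(tr, start_key, n):
    # One range read; every returned pair already carries the full ~10 kB value.
    begin = fdb.KeySelector.first_greater_or_equal(KV + start_key)
    end = fdb.KeySelector.first_greater_or_equal(KV + b'\xff')
    return [(kv.key, kv.value) for kv in tr.get_range(begin, end, limit=n)]

@fdb.transactional
def read_range_option2(tr, start_key, n):
    # Step 1: range read over the small index rows (few bytes, few pages).
    begin = fdb.KeySelector.first_greater_or_equal(IDX + start_key)
    end = fdb.KeySelector.first_greater_or_equal(IDX + b'\xff')
    pointers = [kv.value for kv in tr.get_range(begin, end, limit=n)]
    # Step 2: point reads for the values. tr[...] returns a future, so the
    # n reads are issued in parallel and only block when resolved below.
    futures = [tr[DATA + p] for p in pointers]
    return [f.wait() for f in futures]
```

If I'm reasoning about this correctly, option 1 moves all n × ~10 kB values in a single range read, while option 2 reads n small index rows plus n parallel point reads, so option 2 would only pay off when many operations can be answered from the index alone. Does that match how others here think about it?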
How do I go about reasoning through such questions when designing the data model?