FoundationDB cluster performance issue - Periods of high disk I/O and sustained high latency

I can think of a few things:

  1. First of all, we were doing a get() and getRange(), not a getKey(). This means the whole value must be read from disk and transported all the way back to the client (see the sketch after this list).
    – As for the deletes themselves, I haven’t read the code, but I am sure the SS (storage server) can do this in a more optimized way.
  2. I verified that the cache stats were different before and after. When an old value is read, I could see the cacheMiss and cacheEviction stats in the logs. This means the newly read value is loaded into the cache and something else is evicted.
    – If the key that’s being deleted is not in the cache, then the cache remains untouched.
  3. The actual commit latency (the time taken for commit().join() to return) didn’t change much before and after. This means that whether or not the key had been read recently made little difference when deleting it (see the timing sketch below).
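
To make point 1 concrete, here is a minimal sketch of the two delete paths using the Java bindings (the `batch` key prefix, API version, and class name are illustrative, not our actual code): the old path reads the key-value pairs back to the client before clearing them, while the new path issues blind clears.

```java
import com.apple.foundationdb.Database;
import com.apple.foundationdb.FDB;
import com.apple.foundationdb.KeyValue;
import com.apple.foundationdb.Range;
import com.apple.foundationdb.Transaction;
import com.apple.foundationdb.tuple.Tuple;

import java.util.List;

public class DeletePaths {
    public static void main(String[] args) {
        FDB fdb = FDB.selectAPIVersion(710);           // API version is illustrative
        try (Database db = fdb.open()) {
            Range batch = Tuple.from("batch").range(); // "batch" prefix is illustrative

            // Old path: getRange() ships every key AND value back to the client,
            // then the keys are cleared one by one.
            try (Transaction tr = db.createTransaction()) {
                List<KeyValue> kvs = tr.getRange(batch).asList().join();
                for (KeyValue kv : kvs) {
                    tr.clear(kv.getKey());
                }
                tr.commit().join();
            }

            // New path: blind clears; no value is ever returned to the client.
            try (Transaction tr = db.createTransaction()) {
                tr.clear(batch);   // or tr.clear(key) per key when the key set is known
                tr.commit().join();
            }
        }
    }
}
```

The question raised further down the thread is whether the storage server still has to read the same B-tree pages internally to apply those clears.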

For context, avoiding the read shaved roughly 40-50 ms off the time to delete 5000 keys. Disk busy % used to go above 70-80% and stay there. I don’t know how much it has come down, primarily because we now face a different issue: constant repartitioning.
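
Regarding point 3: the commit latency was just the wall-clock time for commit().join() to return. Here is a minimal sketch of that kind of measurement (the CommitTimer class and timedCommit helper are hypothetical names, not part of the FDB Java bindings, and there is no retry loop since this is only for measuring):

```java
import com.apple.foundationdb.Database;
import com.apple.foundationdb.Transaction;

import java.util.function.Consumer;

public final class CommitTimer {
    /** Applies `body` to a fresh transaction and returns the commit().join() time in ms. */
    public static long timedCommit(Database db, Consumer<Transaction> body) {
        try (Transaction tr = db.createTransaction()) {
            body.accept(tr);                    // reads/clears are queued here
            long start = System.nanoTime();
            tr.commit().join();                 // only the commit itself is timed
            return (System.nanoTime() - start) / 1_000_000;
        }
    }
}
```

Calling something like `timedCommit(db, tr -> tr.clear(key))` with and without a preceding `tr.get(key).join()` gives a way to compare commit latency for the two paths.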

Thanks for the detailed answer. I would request @ajbeamon or @SteavedHams to help reason about the change in performance based on the changes described above.

Keys and values are stored in the same B-tree disk block, so it should not matter. For deletes, the SS would have to read the same disk blocks (logically speaking).

I suspect that the implicit disk block reads are not being counted as cache misses. However, I am not sure why these implicit block reads are not polluting the cache and causing more misses for normal gets.