I can think of a few things:
- First of all, we were doing a get() and getRange(), not getKey(). This means the whole value had to be read from disk and transported all the way back to the client.
  – Now for deletes, I haven't read the code, but I am sure the storage server (SS) can do this in a more optimized way.
- I verified that the cache stats were different before and after. When an old value was read, I could see the cacheMiss and cacheEviction stats in the logs. This means the newly read value was loaded into the cache and something else was evicted.
  – If the key that's being deleted is not in the cache, then the cache remains untouched.
- The actual commit latency (the time taken for commit().join() to return) didn't change much before and after. This means that whether the key had been read recently or not made little difference when deleting it.
For context, we shaved about 40-50 ms off deleting 5000 keys by avoiding the read. Disk busy % used to go above 70-80% and stay there. I don't know how much it has come down, primarily because we now face a different issue: constant repartitioning.
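
To make the cache effect concrete, here is a toy model of the two delete paths. It is only a sketch under loud assumptions: `BlindClearSketch`, its tiny LRU cache, the key names, and the value size are all made up for illustration, and none of it is the FoundationDB API or how the storage server is actually implemented. It just mimics the behaviour I described above: a read pulls the whole value into the cache and evicts something else, while a clear with no preceding read leaves the cache alone.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Toy model only: NOT the FoundationDB API or its storage server internals.
public class BlindClearSketch {
    static final int CACHE_CAPACITY = 2;   // hypothetical, tiny on purpose
    static final int VALUE_SIZE = 1000;    // hypothetical value size in bytes

    final Map<String, byte[]> disk = new HashMap<>();
    // Access-ordered LinkedHashMap: iteration starts at the least recently used entry.
    final LinkedHashMap<String, byte[]> cache = new LinkedHashMap<>(16, 0.75f, true);
    int cacheMiss = 0;
    int cacheEviction = 0;
    long bytesSentToClient = 0;

    // Models get(): the whole value is read and shipped back to the caller.
    byte[] get(String key) {
        byte[] value = cache.get(key);
        if (value == null) {
            value = disk.get(key);
            if (value == null) return null;      // key does not exist
            cacheMiss++;
            cache.put(key, value);
            if (cache.size() > CACHE_CAPACITY) { // evict the least recently used entry
                String eldest = cache.keySet().iterator().next();
                cache.remove(eldest);
                cacheEviction++;
            }
        }
        bytesSentToClient += value.length;
        return value;
    }

    // Models a clear with no preceding read: no value is loaded or returned.
    void clear(String key) {
        disk.remove(key);
        cache.remove(key); // no-op when the key was never cached
    }

    // Two "hot" keys already in the cache, five "old" keys only on disk.
    static BlindClearSketch seeded() {
        BlindClearSketch s = new BlindClearSketch();
        for (int i = 0; i < 2; i++) s.disk.put("h" + i, new byte[VALUE_SIZE]);
        for (int i = 0; i < 5; i++) s.disk.put("k" + i, new byte[VALUE_SIZE]);
        s.get("h0");
        s.get("h1");
        return s;
    }

    // Read every old key, then delete it (what we used to do).
    static BlindClearSketch readThenClear() {
        BlindClearSketch s = seeded();
        for (int i = 0; i < 5; i++) s.get("k" + i);
        for (int i = 0; i < 5; i++) s.clear("k" + i);
        return s;
    }

    // Delete the old keys without reading them first.
    static BlindClearSketch blindClear() {
        BlindClearSketch s = seeded();
        for (int i = 0; i < 5; i++) s.clear("k" + i);
        return s;
    }

    public static void main(String[] args) {
        BlindClearSketch a = readThenClear();
        BlindClearSketch b = blindClear();
        System.out.println("read+clear: miss=" + a.cacheMiss + " evict=" + a.cacheEviction
                + " bytes=" + a.bytesSentToClient);
        System.out.println("blind clear: miss=" + b.cacheMiss + " evict=" + b.cacheEviction
                + " bytes=" + b.bytesSentToClient);
    }
}
```

In this model both paths leave the same keys on disk, but only the read-then-clear path pushes the hot keys out of the cache and sends value bytes back to the client. Commit latency is not modelled at all, so it says nothing about the last bullet.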