I’m trying to reason about the bandwidth consumption of a single fdb client and the fdbserver processes under different replication factors. Here is what my experiment looks like.
I have a single fdbclient and 4 fdbserver processes on 4 machines (i.e. one process per machine) with redundancy mode ‘double’. Using the go-ycsb benchmark I can do 1K updates per second with a value size of 10 KB (keys are under 100 bytes). I understand that the actual bandwidth would be larger than 10 MB/sec, but the observed outgoing bandwidth on the client machine is about 100 MB/sec according to iftop. I repeated the experiment multiple times and made sure there were no other processes consuming significant bandwidth during the experiment.
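For reference, a go-ycsb core workload roughly like the one below reproduces this traffic shape (the property values are illustrative, not a copy of my exact workload file):

```
# Illustrative go-ycsb core workload: update-only, one ~10 KB field per
# record, short keys. Not my exact file.
workload=core
recordcount=100000
operationcount=600000

readproportion=0
updateproportion=1.0
scanproportion=0
insertproportion=0

fieldcount=1
fieldlength=10000
requestdistribution=uniform
```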
Now I’m trying to reason about this 10x gap between the raw data being updated (10 KB * 1K updates/sec == 10 MB/sec) and the bandwidth actually consumed.
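To spell out the arithmetic (the 100 MB/sec figure is just the iftop reading; everything else follows from the workload parameters):

```go
package main

import "fmt"

func main() {
	const (
		updatesPerSec = 1000.0   // go-ycsb update rate
		valueBytes    = 10_000.0 // ~10 KB value per update
		keyBytes      = 100.0    // upper bound on key size
		observedMBps  = 100.0    // outgoing bandwidth on the client machine (iftop)
	)

	rawMBps := updatesPerSec * (valueBytes + keyBytes) / 1e6
	fmt.Printf("raw payload rate: %.1f MB/sec\n", rawMBps)        // ~10.1 MB/sec
	fmt.Printf("amplification:    %.1fx\n", observedMBps/rawMBps) // ~9.9x
}
```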
As I read the write path diagram, the client sends the mutation to the proxy (which then forwards it to the tLog servers), and the client also talks to the storage servers directly. Since the replication factor is 2 in my case, that would give 2 + 1 == 3x amplification. On top of that, the encoding of the key-value pairs has roughly a 2x overhead, which brings the estimate to 6x, but that still doesn’t match the observed 10x.
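Putting that model into numbers (the 2x encoding overhead is my own rough assumption, not something I measured):

```go
package main

import "fmt"

func main() {
	const (
		copiesToProxy     = 1.0  // client -> proxy
		replicationFactor = 2.0  // redundancy mode 'double'
		encodingOverhead  = 2.0  // assumed overhead of the key-value encoding
		observedFactor    = 10.0 // from the measurement above
	)

	// My model: one copy of each mutation to the proxy plus one copy per
	// storage replica, each inflated by the encoding overhead.
	expected := (copiesToProxy + replicationFactor) * encodingOverhead
	fmt.Printf("expected amplification: %.0fx\n", expected)                // 6x
	fmt.Printf("unexplained factor:     %.2fx\n", observedFactor/expected) // ~1.67x
}
```

So even under this model there is still roughly a 1.7x factor left unaccounted for.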
(I also repeated the experiment with a 1 KB value size and still observed ~10x bandwidth amplification.)
Any pointers on how to explain the relationship between the actual data being written and the bandwidth actually consumed would be appreciated.