Best practices for bulk load

Are you inserting keys in a sequential order?

If so, try randomizing the order so the load spreads across the cluster instead of all hitting one machine at the beginning or end of the key space you’re writing to.
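A minimal sketch of what I mean by randomizing, assuming your keys fit in memory as a slice of strings. The name `shuffledKeys` and the seed parameter are made up for illustration and are not part of any client library.

```go
package main

import "math/rand"

// shuffledKeys returns a copy of keys in random order, so consecutive writes
// land in different parts of the key space. The input slice is not modified.
// (Hypothetical helper; the seed is only there to make runs repeatable.)
func shuffledKeys(keys []string, seed int64) []string {
	rng := rand.New(rand.NewSource(seed))
	out := make([]string, len(keys))
	copy(out, keys)
	rng.Shuffle(len(out), func(i, j int) {
		out[i], out[j] = out[j], out[i]
	})
	return out
}
```

You would then iterate over the shuffled slice when writing instead of the sorted one. If the data set is too large to hold in memory, shuffling within fixed-size chunks is a rougher approximation of the same idea.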

I don’t think 5 MB per transaction is optimal; it’s probably too high. I’ve never written that much in a single transaction in my testing over the last few weeks.
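One way to keep transactions smaller is to group the writes by a byte budget and commit one group per transaction. This is only a sketch: the `KV` type and `batchBySize` are hypothetical names, and the right budget is something you would have to measure for your own cluster, since I can only say that 5 MB seems too high.

```go
package main

// KV is a hypothetical key-value pair waiting to be loaded.
type KV struct {
	Key, Value []byte
}

// batchBySize groups kvs into consecutive batches whose combined key and
// value bytes stay at or below maxBytes. A pair that exceeds maxBytes on its
// own is placed in a batch by itself.
func batchBySize(kvs []KV, maxBytes int) [][]KV {
	var batches [][]KV
	var current []KV
	size := 0
	for _, kv := range kvs {
		n := len(kv.Key) + len(kv.Value)
		// Close the current batch if adding this pair would exceed the budget.
		if len(current) > 0 && size+n > maxBytes {
			batches = append(batches, current)
			current, size = nil, 0
		}
		current = append(current, kv)
		size += n
	}
	if len(current) > 0 {
		batches = append(batches, current)
	}
	return batches
}
```

Each returned batch would be written in its own transaction. Try a few different values for `maxBytes` and compare throughput rather than trusting any single number.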

As far as I know, the client library is single-threaded regardless of how many goroutines you start, so splitting the load across multiple processes should also increase parallelism, which you need to fully use the cluster. This only works if you also split the keyspace you’re writing to, so that each process handles its own portion.
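Here is a rough sketch of one way to divide the keys among several loader processes, assuming each process is started with its own index and the total process count (for example via command-line flags). `ownerOf` and `ownsKey` are hypothetical helpers. Hashing is just one option; giving each process a contiguous key range would also work.

```go
package main

import "hash/fnv"

// ownerOf maps a key to one of `workers` loader processes using an FNV-1a
// hash, so every key belongs to exactly one process.
func ownerOf(key string, workers int) int {
	h := fnv.New32a()
	h.Write([]byte(key)) // Write on a hash.Hash never returns an error
	return int(h.Sum32() % uint32(workers))
}

// ownsKey reports whether the process with index workerIndex should write key.
func ownsKey(key string, workerIndex, workers int) bool {
	return ownerOf(key, workers) == workerIndex
}
```

Each process reads the same input and skips any key for which `ownsKey` returns false, so no two processes write the same key.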
