Hello all! This is Kyle over at IBM Cloudant.
We’re scaling out the storage on an upcoming
three_data_hall cluster featuring ~30.4TB of 3x replicated storage (so ~91.2TB raw disk space).
Now, we’ve been using a single SSD model after shaking down the performance of a few; it happens to be the smallest, cheapest, and fastest we can purchase, at 960GB. For the scale-up we’ve chosen denser disks that are more expensive but just as fast.
Naively, I scaled up the tlog disks too, but Adam Kocoloski got me thinking: generally speaking, do we need much tlog disk space at all?
We have two good options for moving forward:
- scale up the tlogs: ~6.7TB of 4x replicated tlog space
- keep the 960GB base SSDs: ~1.7TB of 4x replicated tlog space
We run 12 tlog processes, 7 active and 5 for redundancy. The sizes above are active sizes. Each process gets its own disk.
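For reference, here’s the back-of-the-envelope math behind the numbers above as a quick Python sketch. The formulas (raw = logical × replication; active tlog space = disk size × active tlogs ÷ replication) are just my reading of the figures in this post, not anything FDB-specific.

```python
# Back-of-the-envelope sizing to show where the numbers above come from.
# Assumption: active (logical) space is simply raw disk divided by the
# replication factor; disk sizes and process counts are the ones quoted above.

def raw_from_logical(logical_tb: float, replication: int) -> float:
    """Raw disk needed to hold logical_tb of data at a given replication factor."""
    return logical_tb * replication

def tlog_active_space(disk_tb: float, active_logs: int, replication: int) -> float:
    """Active tlog space, assuming one disk per active tlog process."""
    return disk_tb * active_logs / replication

# ~30.4TB of 3x replicated storage -> ~91.2TB of raw disk
print(round(raw_from_logical(30.4, 3), 1))        # 91.2

# 7 active tlogs on 960GB disks, 4x replicated -> ~1.7TB of active tlog space
print(round(tlog_active_space(0.96, 7, 4), 2))    # 1.68

# Plugging the denser disk size into the same formula reproduces the ~6.7TB option.
```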
I looked through our historical metrics on test servers and haven’t seen the logs using much disk space, though that could be an artifact of how we test. I’m not seeing evidence pointing me toward the larger option, so any guidance on this topic is welcome.
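If it helps frame an answer, here’s a rough sketch of one way to spot-check tlog disk usage on a live cluster: pull `status json` from fdbcli and sum the per-log-role disk counters. The `kvstore_used_bytes` and `queue_disk_used_bytes` field names are an assumption based on the versions I’ve looked at and may differ on yours.

```python
# Rough spot-check of tlog disk usage from `status json`.
# Assumes fdbcli is on PATH and that log roles expose kvstore_used_bytes and
# queue_disk_used_bytes (worth double-checking against your FDB version).
import json
import subprocess

status = json.loads(
    subprocess.check_output(["fdbcli", "--exec", "status json"])
)

for proc_id, proc in status["cluster"]["processes"].items():
    for role in proc.get("roles", []):
        if role.get("role") != "log":
            continue
        used = role.get("kvstore_used_bytes", 0) + role.get("queue_disk_used_bytes", 0)
        print(f"{proc_id[:8]} tlog disk used: {used / 1e9:.1f} GB")
```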