Understand FDB read/write with disk IOPS/throughput/block size

Hi all,
I set up an FDB cluster with the fdb-kubernetes-operator on 8 m6a.2xlarge instances (8 vCPUs, 32 GB RAM each).
I have 8 storage pods, and each pod runs 3 storage processes.

apiVersion: apps.foundationdb.org/v1beta2
kind: FoundationDBCluster
metadata:
  labels:
    cluster-group: foundationdb-cluster
  name: foundationdb-cluster
spec:
  version: 7.1.27
  faultDomain:
    key: foundationdb.org/none
  storageServersPerPod: 3
  processCounts:
    cluster_controller: 1
    stateless: 20
    log: 8
    storage: 8
    coordinator: 3
    test: 2
  databaseConfiguration:
    redundancy_mode: "double"
    commit_proxies: 6
    grv_proxies: 2
    logs: 8
    resolvers: 4
  processes:
      volumeClaimTemplate:
        spec:
          resources:
            requests:
              storage: "100G"
      podTemplate:
        spec:
          containers:
            - name: foundationdb
              resources:
                requests:
                  cpu: 500m
                  memory: 1Gi
                limits:
                  cpu: 3000m
                  memory: 24Gi

Then I run a RandomReadWriteTest:

testTitle=RandomReadWriteTest
    testName=ReadWrite
    testDuration=60.0
    transactionsPerSecond=1000000
    readsPerTransactionA=1
    rangeReads=true
    valueBytes=2000
    nodeCount=500000000
    discardEdgeMeasurements=false

This always makes the storage queue reach its limit (1.4 GB), and then cluster performance drops.
I have read almost every related question on the FDB forums and got some advice from the links below:

https://forums.foundationdb.org/t/reasons-for-not-co-locating-tlog-and-ss-io-characteristics-of-ss/
https://forums.foundationdb.org/t/how-to-troubleshoot-throughput-performance-degrade

I changed some configuration, like:

  • storageServersPerPod
  • knob_max_outstanding (from 64 to 128)
  • changed EBS gp3 to maximum IOPS (16000) and throughput (1000 MB/s)
  • added some more TLog and storage pods …

But the storage queue problem was still there.
Then I changed only one setting on the EBS gp3 volumes, and that had an effect; the queue no longer has the problem:

  • blockSize=64 (this is the maximum block size for EBS gp3; the default is 4 KB)

Can anyone who understands this explain to me why it makes the cluster load data for the test workload faster?
Does it consume more space on disk?
Is this configuration good for the cluster?

Another question: I have read a lot of topics about writes in FDB, but I still don't understand the disk IOPS and IO queue depth of FDB writes, or what the knob_max_outstanding parameter does…

You will have to post more details about what you have done.

The test spec you posted is going to first build a 500M * 2000 byte data set, which is about 1 TB of key-value data (roughly 2 TB stored with double replication), before it executes the test workload. During this time, given your log and proxy configuration, the Storage Queue will be full because the storage servers persisting the loaded data to disk will be the bottleneck.
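As a quick back-of-the-envelope check (the numbers below just restate the test spec above; the 2x factor assumes double replication):

# Rough sizing of the data set the ReadWrite workload builds first.
# Figures come from the test spec; the replication factor assumes
# redundancy_mode = "double". Keys add a little more on top of this.
node_count  = 500_000_000   # nodeCount
value_bytes = 2_000         # valueBytes
replication = 2             # double redundancy

logical_bytes    = node_count * value_bytes      # ~1.0 TB of values
replicated_bytes = logical_bytes * replication   # ~2.0 TB written to storage servers

print(f"logical data:     {logical_bytes / 1e12:.1f} TB")
print(f"with replication: {replicated_bytes / 1e12:.1f} TB")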

After the build step, this workload will likely be IO bound at a low enough transaction rate that the Storage Queue will be smaller. The storage queue holds 5s worth of mutations plus a very large amount of overhead, normally larger than the KV records changed in that 5s window.
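A minimal sketch of that sizing, with purely hypothetical ingest and overhead numbers (the real per-mutation overhead depends on the workload and storage engine):

# Hypothetical steady-state estimate of storage queue size. None of these
# rates are measured from the cluster above; they only illustrate why a
# sustained ingest keeps the queue large.
mutation_bytes_per_sec_per_ss = 50e6   # assumed ingest per storage server
window_seconds                = 5      # the queue keeps roughly 5s of mutations
overhead_factor               = 3      # assumed in-memory overhead per mutation byte
queue_limit_bytes             = 1.4e9  # the limit observed in the question

queue_bytes = mutation_bytes_per_sec_per_ss * window_seconds * overhead_factor
print(f"estimated queue: {queue_bytes / 1e9:.2f} GB vs limit {queue_limit_bytes / 1e9:.1f} GB")
# If the disk cannot drain mutations as fast as they arrive, the queue grows
# past this steady-state estimate until it hits the limit and the workload
# gets throttled.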

The SQLite engine is using 4k pages by default. Redwood uses 8k. These are the page sizes that FDB is writing, so there is no reason I can think of that making your GP3 volume block size larger would increase performance. If that is truly occurring, I would be curious to know why.

The SQLite engine is using 4k pages by default. Redwood uses 8k.

In the case where the disk has a 4 KB block size, am I right to understand that if the application writes one key with a 4 KB value, with the ssd-2 engine it's 1 IO, but with Redwood it's 2 IOs?

From FDB and the kernel’s perspective it’s still a single IO operation in either case regardless of the page size. The disk’s internal physical block size does not change the disk interface.
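A small sketch of that accounting, assuming each dirty page is flushed as one IO request; the 4 KiB and 8 KiB figures are the engine page sizes mentioned above, the page-write rate is made up, and the IOPS budget is the gp3 setting from the question:

# One page write = one IO request, whether the page is 4 KiB (ssd-2/SQLite)
# or 8 KiB (Redwood). Page size changes bytes per IO, not the IO count.
page_writes_per_sec = 10_000   # hypothetical rate of dirty pages flushed
provisioned_iops    = 16_000   # gp3 IOPS from the question

for engine, page_bytes in [("ssd-2 (4 KiB pages)", 4096), ("redwood (8 KiB pages)", 8192)]:
    iops       = page_writes_per_sec               # 1 IO per page write
    throughput = page_writes_per_sec * page_bytes  # bytes/s differs with page size
    print(f"{engine}: {iops} IOPS ({iops / provisioned_iops:.0%} of quota), "
          f"{throughput / 1e6:.0f} MB/s")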

Also, the kernel will actually merge simultaneously outstanding adjacent writes into a single IO operation sent to the disk, which happens frequently when a storage file is growing as many of the writes are contiguous at the end of the file. This is particularly useful when using block device services like EBS because it reduces the number of IO operations which count against the provisioned quota.
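A hedged illustration of why merging matters against a provisioned-IOPS quota; the 256 KiB merged-request size and the write rate are assumptions here, and real merging depends on the IO scheduler and on how contiguous the writes actually are:

# When a storage file grows, many 4 KiB page writes land at adjacent offsets,
# so the kernel can coalesce them before submitting requests to the volume.
page_bytes          = 4096
page_writes_per_sec = 10_000      # hypothetical flush rate
max_merged_io_bytes = 256 * 1024  # assumed largest merged request the stack issues

pages_per_merged_io = max_merged_io_bytes // page_bytes    # 64 pages per request
unmerged_iops = page_writes_per_sec                        # worst case: no merging
merged_iops   = page_writes_per_sec / pages_per_merged_io  # best case: fully contiguous

print(f"no merging:   {unmerged_iops:.0f} IOPS against the quota")
print(f"full merging: {merged_iops:.0f} IOPS against the quota")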
