Clj-foundationdb - Clojure bindings over Java wrapper

Hi,

I have been developing Clojure bindings for FoundationDB. The API is shaping up nicely. It’s not yet complete: directories, watches, etc. are still to be implemented. I have implemented basic key-value operations, ranges, key selectors, and subspaces. I have added examples to the README, including a class-scheduling program written using the library.

GitHub repo : https://github.com/tirkarthi/clj-foundationdb
API docs : https://tirkarthi.github.io/clj-foundationdb/

I am writing more test cases and docs, and refactoring. PRs and comments are welcome.

It uses macros to do cool stuff, like implicit transaction handling with tr! and automatic subspace prefixing inside blocks. Setting and getting values with tuple keys:

(let [fd      (select-api-version 510)
      classes [["class" "intro"] ["class" "algebra"] ["class" "maths"] ["class" "bio"]]
      time    "10:00"]
  (with-open [db (open fd)]
    (tr! db
         (doall (map #(set-val tr %1 time) classes))
         (get-val tr ["class" "algebra"]))))

"10:00"

Automatic subspace prefixing within a with-subspace block:

(let [fd      (select-api-version 510)
      classes ["intro" "algebra" "maths" "bio"]
      time    "10:00"]
  (with-open [db (open fd)]
    (tr! db
         (with-subspace "class"
           (doall (map #(set-val tr %1 time) classes))
           (get-val tr "algebra")))))

"10:00"

10 parallel clients, each setting 100k keys, take about 23.7 s on my machine. That’s roughly 42k keys per second for 1M keys in total.

(let [fd (select-api-version 510)
      kv (map #(vector (str %1) %1) (range 100000))]
  (time (let [clients (repeatedly 10 #(future 
                                        (with-open [db (open fd)]
                                          (tr! db
                                               (doseq [[k v] kv]
                                                 (set-val tr k v))))))]
          (doall (map deref clients))
          "Finished")))
"Elapsed time: 23740.242199 msecs"
"Finished"
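Sanity-checking that figure (10 clients × 100k keys over the elapsed time printed above):

```clojure
;; 10 clients x 100k keys = 1M writes in ~23.74 s
(let [total-keys (* 10 100000)
      elapsed-s  23.740242199]
  (long (/ total-keys elapsed-s)))
;; => 42122, i.e. ~42k writes per second
```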

Output of status in fdbcli:

Workload:
  Read rate              - 14 Hz
  Write rate             - 39999 Hz
  Transactions started   - 6 Hz
  Transactions committed - 1 Hz
  Conflict rate          - 0 Hz
Hardware Overview:

      Model Name: MacBook Air
      Model Identifier: MacBookAir7,2
      Processor Name: Intel Core i5
      Processor Speed: 1.6 GHz
      Number of Processors: 1
      Total Number of Cores: 2
      L2 Cache (per Core): 256 KB
      L3 Cache: 3 MB
      Memory: 4 GB
      Boot ROM Version: MBA71.0166.B06
      SMC Version (system): 2.27f2

I replicated a similar benchmark on a Core i7-6700K @ 4.00 GHz with a Samsung 951 NVMe SSD, and got 145k writes/sec (it took 6.8 sec to insert 1M small keys with 10 concurrent clients), using the .NET binding on Windows 10.

That’s 3.3x faster on a CPU with a 2.5x higher clock speed and 4x more cores (though not all were used). Looks like performance scales nicely with compute power! :slight_smile:

Though, looking at the results, each thread commits a single transaction that is almost 5 MB, and the latency is 3.5 sec for the first one to commit and ~6.7 sec for the 9 others… It looks like there is some serious contention when multiple concurrent bulk inserters use jumbo transactions. This probably means that nobody else can commit for about 4-5 sec while this is happening!
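The ~5 MB figure is consistent with a quick estimate, assuming roughly 50 bytes of packed key plus value per pair (a guess for these small tuple-encoded keys, not a measured number):

```clojure
;; 100k pairs per transaction at an assumed ~50 bytes each
(let [keys-per-tx  100000
      bytes-per-kv 50]
  (/ (* keys-per-tx bytes-per-kv) 1024.0 1024.0))
;; => ~4.77 (MB per transaction)
```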

I changed the test to 1,000 concurrent inserters with 1,000 keys each (about 50 KB per transaction), and it only takes 8.7 sec, or ~114k writes/sec. That is only 2 sec slower, and the latency per transaction is a lot lower. All my cores were used, though, so I’m probably bottlenecked by the CPU.
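The same arithmetic for this run (again assuming ~50 bytes per packed pair, which is a guess):

```clojure
;; 1,000 transactions x 1,000 keys in ~8.7 s
(let [transactions 1000
      keys-per-tx  1000
      elapsed-s    8.7]
  {:writes-per-sec (long (/ (* transactions keys-per-tx) elapsed-s))
   :bytes-per-tx   (* keys-per-tx 50)})
;; => {:writes-per-sec 114942, :bytes-per-tx 50000}
```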


Interesting one. I guess my library is also limited by the fact that it does tuple serialization on every call, which adds to the overhead. I am also curious whether FoundationDB’s performance is comparable to Redis; any thoughts on this?

I tried to replicate the Redis benchmark for 100k keys with 50 parallel clients (2,000 keys per client), with the results below. When I add pipelining, Redis is insanely fast. I am just wondering whether I am comparing apples and oranges here, or benchmarking incorrectly.

FoundationDB 100k keys with 50 parallel clients (25258.91 requests per second)

(let [fd (select-api-version 510)
      kv (map #(vector (str %1) %1) (range 2000))]
  (time (let [clients (repeatedly 50 #(future
                                        (with-open [db (open fd)]
                                          (tr! db
                                               (doseq [[k v] kv]
                                                 (set-val tr k v))))))]
          (doall (map deref clients))
          "Finished")))
"Elapsed time: 3959.786679 msecs"
"Finished"

Redis plain SET with parallel clients (63938.62 requests per second)

➜  src git:(unstable) ./redis-benchmark -t set
====== SET ======
  100000 requests completed in 1.56 seconds
  50 parallel clients
  3 bytes payload
  keep alive: 1

99.24% <= 1 milliseconds
100.00% <= 2 milliseconds
100.00% <= 2 milliseconds
63938.62 requests per second

Redis with pipeline (578034.69 requests per second)

➜  src git:(unstable) redis-benchmark -t set -P 160
====== SET ======
  100000 requests completed in 0.17 seconds
  50 parallel clients
  3 bytes payload
  keep alive: 1

0.00% <= 7 milliseconds
1.60% <= 8 milliseconds
3.68% <= 10 milliseconds
11.20% <= 11 milliseconds
47.04% <= 12 milliseconds
75.04% <= 13 milliseconds
91.68% <= 14 milliseconds
95.68% <= 15 milliseconds
97.44% <= 16 milliseconds
99.04% <= 17 milliseconds
100.00% <= 17 milliseconds
578034.69 requests per second

Edit:

I tried with keys and values converted to byte arrays, which saves roughly another 800 ms (3.96 s → 3.13 s):

(let [fd (select-api-version 510)
      kv (map #(vector (key->packed-tuple %1) 
                       (key->packed-tuple %1)) (range 2000))]
  (time (let [clients (repeatedly 50 #(future
                                        (with-open [db (open fd)]
                                          (tr! db
                                               (doseq [[k v] kv]
                                                 (.set tr k v))))))]
          (doall (map deref clients))
          "Finished")))
"Elapsed time: 3128.477095 msecs"
"Finished"

I think the issue with this kind of benchmark is that you have lots of write-only transactions that each insert thousands of sequential keys at once. This is more like benchmarking a backup/restore tool or a bulk loader. You’d need a workload that emulates what a typical REST API would do to the database (some writes, lots of reads?).

Since FDB has to serialize all transactions at some point (process them in sequence), a long series of very fat transactions is like a convoy of big trucks hogging the single-lane toll booth at the end of a 1000-lane mega-highway. If the only thing going on is a full db restore and you want it to go as fast as possible, then you should make the transactions as large as possible (you care about throughput, not latency). But in normal operation, smallish transactions (30-40 KB?) look like the sweet spot.
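Translating that sweet spot into keys per transaction, assuming the same rough ~50 bytes per packed pair as above:

```clojure
;; how many ~50-byte pairs fit in a 30-40 KB transaction
(let [bytes-per-kv 50]
  [(quot (* 30 1024) bytes-per-kv)
   (quot (* 40 1024) bytes-per-kv)])
;; => [614 819], i.e. roughly 600-800 keys per transaction
```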

Also, if you are testing with localhost, you are cheating a bit, because a 5-10 MB transaction needs about 50-100 ms additional latency to go over the wire. This creates a hard limit of 10 to 20 transactions per second, or about 100 MB/sec per client max!
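The latency math, spelled out (the ~100 MB/s link speed is the assumption from the paragraph above):

```clojure
;; a 5-10 MB commit over a ~100 MB/s link adds 50-100 ms of
;; transfer time, capping one client at 10-20 such commits/sec
(let [link-mb-per-s 100]
  (for [tx-mb [5 10]]
    {:tx-mb            tx-mb
     :extra-latency-ms (* 1000 (/ tx-mb link-mb-per-s))
     :max-tx-per-s     (/ link-mb-per-s tx-mb)}))
;; => ({:tx-mb 5, :extra-latency-ms 50, :max-tx-per-s 20}
;;     {:tx-mb 10, :extra-latency-ms 100, :max-tx-per-s 10})
```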

Also beware that transactions of that size have other negative consequences. In particular, processing a transaction that large briefly starves other jobs on the proxies (e.g. getting read versions, committing), so you will see latencies for those operations go up across the cluster while such a transaction is being committed. We recommend not exceeding 1 MB per transaction. See also Known Limitations — FoundationDB 7.1.
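One simple way to stay under that limit in a bulk load like the benchmarks above is to split the key sequence into batches and commit each batch in its own tr! block. A sketch of the batch sizing (the ~50 bytes per packed pair is still an assumption):

```clojure
;; aim for <= 1 MB per transaction at an assumed ~50 bytes per pair
(let [kv         (map #(vector (str %) %) (range 100000))
      batch-size (quot (* 1024 1024) 50)]   ; ~20k pairs per batch
  ;; each batch here would be written in its own tr! transaction
  (count (partition-all batch-size kv)))
;; => 5
```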
