From Build to Running Tests

Hi there,

Great to see FDB open sourced; I look forward to contributing.

I successfully managed to build it for Linux using the instructions in the repo; however, I'm a little unclear on what to do next :)

There looks to be a wonderful tests directory, but I'm not sure how to run any of the tests.

Any tips on how to go from the end of the Docker compilation phase to having a server up and running some tests, please?

I know I can just download the pre-built binaries, but those probably don't include the test harness, tests, etc. (I haven't checked).

Thanks!

The good news is that if you have an fdbserver, whether you built it from source or not, and you are in the source directory, you can run a particular simulation test by just typing something like

bin/fdbserver -r simulation -f tests/fast/Sideband.txt -s 12345

and among other junk it will print out something like

1 tests passed; 0 tests failed; waiting for DD to end…

Unseed: 81362
Elapsed: 118.350939 simsec, 6.486105 real seconds

The bad news is that I don’t think you can really trust that “1 tests passed” business. To trust that a test passed without errors you also need to examine the megabytes of XML log file that it has spit out. Unfortunately, it doesn’t look like the tools for doing that (let alone the tools for diagnosing failed runs, or for running millions of these tests and examining the ensemble of results) have been released, so it might be rough on community developers until we can build some tooling and/or give fdbserver itself some of the key functionality.

Apple folks, do you have a plan for this? Or can you start by taking a look at your tools and tell us what they currently check to determine if a simulation test is successful?

As a side note, to make simulation tests work, the user will also have to:

  1. Build the TLS plugin by running: make FDBLibTLS
  2. Set the TLS plugin environment variable: export FDB_TLS_PLUGIN=lib/libFDBLibTLS.so

Then you should be able to run simulation successfully. Building the TLS plugin also requires having LibreSSL available, which is already the case if you are using the Dockerfile; there is a GitHub issue (#160) to add instructions on what's needed on other platforms.
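
Putting those steps together with the command from earlier in the thread, the whole flow from a finished build to one simulation run looks roughly like this (a sketch, assuming you are in the root of the source checkout and built via the Docker setup):

make FDBLibTLS
export FDB_TLS_PLUGIN=lib/libFDBLibTLS.so
bin/fdbserver -r simulation -f tests/fast/Sideband.txt -s 12345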

Tests conveniently print a message if an error occurred during the run, which is separate from the "tests passed" line.

Most developers rely on grepping the log files for the Severity=40 event that caused the test to fail. Debugging the root cause of a failed test is harder; how much harder depends on the scope of the changes being tested.
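
For example, assuming the XML trace files land in the working directory as trace.*.xml (which is what I see by default), a quick check for errors might be:

grep 'Severity="40"' trace.*.xml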

You will want to run the tests in the tests/fast, tests/slow, and tests/rare folders (not the ones in the root tests directory or the restarting tests). You can add "-b on" to turn buggification on.
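
For instance, a rough loop over one of those folders (assuming a bash shell), with buggification on and a random seed per run, could look like:

for t in tests/fast/*.txt; do
    bin/fdbserver -r simulation -f "$t" -s $RANDOM -b on
done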

After getting simulation working, you can try breaking things and seeing what happens. One interesting thing to try would be to modify the client code so that it adds 10000 to the read version received by the client. The database will mostly work, but there is a possibility that a client will read data from transactions that did not successfully commit. You may need to try multiple random seeds and multiple tests, but this should definitely cause something to fail.
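
To hunt for the failure such a change should introduce, one rough approach (just a sketch) is to re-run a single test under several seeds and watch for a run that reports errors:

for seed in 1 2 3 4 5; do
    bin/fdbserver -r simulation -f tests/fast/Sideband.txt -s $seed
done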

I got annoyed with this and submitted this change, which adds a "%d SevError events logged" message at the end to indicate any actual errors, and sets the return code to non-zero.
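
With that change in your build (an assumption; it may not be in the binaries you have), a shell wrapper can key off the exit status, e.g.:

if bin/fdbserver -r simulation -f tests/fast/Sideband.txt -s 12345; then
    echo "test passed"
else
    echo "test FAILED (non-zero exit code)"
fi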

So a zero exit code reliably means a passed test now? That’s good.

I still think the community needs some kind of tooling. My thought is to rely on something like hyper.sh, where you can launch containers quickly and cheaply (they have a 10-second minimum). My goal would be to make a level of testing sufficient to iterate on broken code both cheap and quick, so that for around $1 and in a few minutes you can do enough testing to feel good about submitting a PR. Then Apple can test the hell out of it, of course :)

What is the average CPU time for a single simulated test with the distribution of tests Apple uses?

Pricing is per core-hour, assuming 1 core = 2 "vCPUs" and 4 GiB of RAM.

Container services
hyper.sh - $0.055, 10 second minimum
Amazon Fargate - $0.152, 1 minute minimum

VM services
GCE preemptible VM - $0.01772, 1 minute minimum
Amazon m5.large spot - $0.0163, 1 minute minimum

So far it looks like the preemptible/spot VM services are much cheaper, and they offer fine-grained billing these days. Is there a "preemptible container service" anywhere? How long does it take to boot VMs on the latter services?
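
As a back-of-envelope sketch, since we don't know the real per-test CPU time yet, the number below is a made-up placeholder:

# secs: assumed core-seconds per simulated test (hypothetical placeholder)
# rate: dollars per core-hour (the m5.large spot figure above)
awk -v secs=300 -v rate=0.0163 'BEGIN { printf "%.0f tests per dollar\n", 3600 / (rate * secs) }'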