To get this file off of my local computer…
We held a meeting where @Evan went through the existing workloads and gave a brief description of each, with optional commentary. Below are my notes from it. Some workloads are marked ??? where no one actually knew or remembered what the workload did or why it exists.
If any enterprising person wishes to dig through some of the ??? workloads to provide the explanation/commentary instead, your help would be greatly appreciated. Otherwise, I probably eventually will, and then add it as comments to the files.
ApiCorrectness:
Runs a set of operations against the database and against an in-memory copy, and then compares the two
Multiple clients can run, but each has their own prefix
Uses more transaction API than ReadsDuringWrites
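A rough sketch of the compare-against-a-model idea, for anyone unfamiliar with it. This is a toy illustration and not the workload's code: `ToyStore` and `run_client` are hypothetical names, and `ToyStore` only stands in for the database (it is not the FDB API).

```python
import random

class ToyStore:
    """Hypothetical stand-in for the database under test (not the FDB API)."""
    def __init__(self):
        self._data = {}

    def set(self, key, value):
        self._data[key] = value

    def clear(self, key):
        self._data.pop(key, None)

    def get_range(self, begin, end):
        # Sorted (key, value) pairs with begin <= key < end.
        return sorted((k, v) for k, v in self._data.items() if begin <= k < end)

def run_client(store, prefix, ops, seed):
    """Apply random ops to the store and to a dict model, then compare."""
    rng = random.Random(seed)
    model = {}
    for _ in range(ops):
        key = prefix + b"%03d" % rng.randrange(20)
        if rng.random() < 0.7:
            value = b"%d" % rng.randrange(1000)
            store.set(key, value)
            model[key] = value
        else:
            store.clear(key)
            model.pop(key, None)
    # Only this client's prefix is compared, so clients do not interfere.
    observed = store.get_range(prefix, prefix + b"\xff")
    return observed == sorted(model.items())
```

Two clients with different prefixes can share one `ToyStore` and each still passes its own comparison, which is the point of the per-client prefix.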
AsyncFile:
???
Probably related to tests/AsyncFile
But that’s also a broken test
AtomicOps:
Issues atomic operations into one keyspace, and a log of the operations into another
Verifies that the atomic operation results match the log of committed operations
Only does one atomic op type per run
doesn’t check durability
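The ops-plus-log check can be sketched roughly as below. This is a toy model under stated assumptions, not the workload itself: two Python containers stand in for the two keyspaces, ADD is used as the single op type, and the function names are made up for illustration.

```python
import random

def run_atomic_ops(num_ops, seed):
    """Toy model: apply ADD ops to one keyspace and log each op in another."""
    rng = random.Random(seed)
    ops_space = {}   # key -> accumulated value
    log_space = []   # (key, operand) for every committed op
    for _ in range(num_ops):
        key = "ops/%d" % rng.randrange(5)
        operand = rng.randrange(1, 100)
        # Both writes happen together here, standing in for one transaction.
        ops_space[key] = ops_space.get(key, 0) + operand
        log_space.append((key, operand))
    return ops_space, log_space

def verify(ops_space, log_space):
    """Replay the log and check that it reproduces the ops keyspace."""
    expected = {}
    for key, operand in log_space:
        expected[key] = expected.get(key, 0) + operand
    return expected == ops_space
```

If an op were applied without its log entry (or the other way around), `verify` would return False.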
AtomicOpsApiCorrectness:
Tests results of the atomic ops
AtomicRestore:
starts database, locks database, restores
because of the lock, no data should be lost, so it can be paired with tests that don’t expect data loss
AtomicSwitchover:
Similar to above, but switches between two databases instead of doing a restore
BackgroundSelectors:
pretty useless
issues key selectors in the background during other workloads
None of the other tests cover this, so it was added, but it has never found any issues.
doesn’t verify result, only that reads happen
BackupCorrectness:
Primary tests for backups.
Heavily exercises API.
BackupToDBAbort:
Test that aborts an in-progress DR
Not really sure why here, but it is
BackupToDBCorrectness:
Same as BackupCorrectness, but with DR
BackupToDBUpgrade:
DR test done in combination with an upgrade/restarting test
start backup with older version
finish backup with newer version
verifies that backup is backwards compatible
BulkLoad:
Loads data into the database
BulkSetup:
Performance tests use this to load data
Checks if data is already present, and then doesn’t load
Not used in simulation
ChangeConfig:
Used at the start of every simulation to change configuration into desired one
for multi-region, waits for data to be fully replicated before continuing
ClientTransactionProfileCorrectness:
Client transaction profiling test
CommitBugCheck:
“Regression test for 2 commit related bugs”
no memory of what they were
ConfigureDatabase:
Changes database configuration many times during a test
Not mixable with attrition
Also changes region configurations
Anything toggleable via configure should be added to this test
ConflictRange:
designed to ensure that conflict ranges added to transaction are as minimal as possible
checks if two transactions should conflict, runs them, sees if they do
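The "should these conflict" prediction comes down to range overlap. A minimal sketch, assuming half-open `(begin, end)` ranges and that a conflict means one transaction's read range intersects the other's write range. The function names are hypothetical, and only the prediction side is shown; the workload would then run the transactions and compare against what actually happened.

```python
def ranges_overlap(a, b):
    """Half-open [begin, end) ranges overlap when each starts before the other ends."""
    return a[0] < b[1] and b[0] < a[1]

def should_conflict(read_ranges, write_ranges):
    """Predict a conflict if any read range intersects any concurrent write range."""
    return any(ranges_overlap(r, w) for r in read_ranges for w in write_ranges)
```

For example, a read of `(b"a", b"c")` is predicted to conflict with a write to the single key `b"b"`, expressed as `(b"b", b"b\x00")`, while a read of `(b"a", b"b")` is not.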
ConsistencyCheck:
Run at the end of every simulation test
checks a large number of properties
Makes sure data in replicas match
also run in real life
CpuProfiler:
Does flow profiling for a duration
Cycle:
builds a cycle of keys, and then repeatedly swaps nodes within it
good at detecting ACI violations
doesn’t check durability
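A toy version of the cycle invariant, as a sketch only: a dict stands in for the keyspace and each `swap_step` call stands in for one transaction. The names are made up, and the exact swap the workload performs is assumed here to be a reordering of three consecutive nodes.

```python
import random

def build_cycle(n):
    """Node i points at node (i + 1) % n, forming a single ring (needs n >= 3)."""
    assert n >= 3
    return {i: (i + 1) % n for i in range(n)}

def swap_step(ring, rng):
    """One 'transaction': turn a->b->c->d into a->c->b->d."""
    a = rng.randrange(len(ring))
    b = ring[a]
    c = ring[b]
    d = ring[c]
    ring[a] = c
    ring[c] = b
    ring[b] = d

def is_single_cycle(ring):
    """Walk from node 0; a valid ring returns to 0 after exactly n hops."""
    node, hops = 0, 0
    while True:
        node = ring[node]
        hops += 1
        if node == 0 or hops > len(ring):
            break
    return node == 0 and hops == len(ring)
```

If a swap were only partially applied, or two swaps interleaved on the same nodes, the ring would split or lose nodes and `is_single_cycle` would fail. That is why this style of check is sensitive to atomicity, consistency, and isolation problems.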
DDBalance:
aggressively tries to cause data movement
moves large amounts of data between different key space prefixes
DDMetrics:
???
Maybe used as part of a circus test?
DiskDurability{,Test}:
Another async file test
may or may not be broken
DummyWorkload:
Example workload/test of workload framework
FastTriggeredWatches:
Ensures there is a time bound between a key being set and its watch firing
Because simulation has lots of errors, doesn’t enforce a tight bound (~12s)
FileSystem:
???
A workload that pretends to be a filesystem and do “operations” on the filesystem
Fuzz:
Made when the actor compiler was written, to test that it works
Python program makes actors that should have a known output
This workload runs that, and verifies the correctness
FuzzApiCorrectness:
Randomly uses API with an expected list of exceptions
Verifies that random values don’t produce crashes or unexpected errors
Increment:
Increments keys and then checks the total sums are equal.
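One way to picture the sum check, as a sketch under an assumption the notes do not confirm: each transaction increments one key in each of two groups, so the group totals must stay equal. The names below are hypothetical and plain lists stand in for the keys.

```python
import random

def run_increments(num_keys, num_txns, seed):
    """Each toy 'transaction' bumps one key in each of two groups."""
    rng = random.Random(seed)
    group_a = [0] * num_keys
    group_b = [0] * num_keys
    for _ in range(num_txns):
        group_a[rng.randrange(num_keys)] += 1
        group_b[rng.randrange(num_keys)] += 1
    return group_a, group_b

def sums_match(group_a, group_b):
    """The invariant: both groups received the same number of increments."""
    return sum(group_a) == sum(group_b)
```

A transaction that applied only one of its two increments would leave the totals unequal.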
IndexScan:
???
Inventory:
Attempting to datamodel a warehouse (pseudo TPC-C)
Doesn’t keep memory copy, so no durability verification
KillRegion:
Sets up a multi-region cluster
Kills a region including satellites
Runs a forced ACI-only recovery
Has data loss, but should be consistent data loss
KVStoreTest:
Directly opens storage engine and does reads/writes
LockDatabase:
Just locks and unlocks a database
Mostly a test of database locking
LogMetrics:
???
LowLatency:
Verifies that recoveries happen in a “timely” manner
With buggification disabled, has a tight bound
With buggification has a long bound
MachineAttrition:
Kills or reboots processes, machines, datacenters, etc.
Super important workload
MemoryKeyValueStore:
???
MemoryLifetime:
Checks some properties of RYW
???
MetricLogging:
tests TDMetric things
Performance:
???
Ping:
Sending interfaces to workloads and sending messages between them
???
Maybe a network performance test
replaced by fdbserver -r networktestsender
PubSubMultiples:
???
QueuePush:
Sequential insert load for performance test
RandomClogging:
Nemesis behavior test
blocks network communication between two processes for some time
RandomMoveKeys:
turns off data distribution, and does its own
moves shards to random places constantly
RandomSelector:
tests correctness of keyselectors
ensures they resolve to the right thing
Builds two different selectors that should resolve to the same key, and verifies that they do
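To make "different selectors that should be equal" concrete, here is a small sketch that resolves a `(key, or_equal, offset)` selector against a sorted list of keys. It is a toy model, not the client library: `resolve` is a made-up helper, and returning `None` for a selector that runs off either end is a simplification chosen for this example.

```python
import bisect

def resolve(keys, key, or_equal, offset):
    """Resolve a (key, or_equal, offset) selector against a sorted key list.

    Returns None when the selector runs off either end of the list.
    """
    if or_equal:
        base = bisect.bisect_right(keys, key) - 1  # last key <= reference
    else:
        base = bisect.bisect_left(keys, key) - 1   # last key < reference
    index = base + offset
    return keys[index] if 0 <= index < len(keys) else None
```

With keys `[b"a", b"c", b"e"]`, the selectors `(b"a", True, 1)`, `(b"b", False, 1)`, and `(b"e", False, 0)` are written differently but all land on `b"c"`.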
ReadWrite:
Lots of parameters
Almost all performance tests are this workload
RemoveServersSafely:
Tests excludes
Because of this, can kill many more machines than attrition
Rollback:
Tries to trigger storage server rollbacks
RyowCorrectness:
???
RYWDisable:
???
RYWPerformance:
???
SaveAndKill:
For restarting tests
saves the state of simulation so that it can be restarted
produces a restartInfo file
SelectorCorrectness:
Uses getrange to verify that a keyselector is correct
Serializability:
???
Sideband:
Sends messages between processes while doing commits to find causal errors
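The causal check can be sketched as follows. This is a toy model with made-up names, not the workload's code: one side commits a key and then announces it out of band, and the other side must be able to read that key at any read version it obtains after receiving the message.

```python
class ToyVersionedStore:
    """Hypothetical versioned store; every commit gets the next version."""
    def __init__(self):
        self.committed = []  # (version, key), in commit order

    def commit(self, key):
        version = len(self.committed) + 1
        self.committed.append((version, key))
        return version

    def latest_version(self):
        return len(self.committed)

    def visible(self, key, read_version):
        # A key is visible if it was committed at or before the read version.
        return any(v <= read_version and k == key for v, k in self.committed)

def check_message(store, key, read_version):
    """Checker side: a key announced over the sideband must already be readable."""
    return store.visible(key, read_version)
```

A checker that is handed a read version older than the announced commit would not see the key, and that is the kind of causal error this style of test is meant to surface.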
SlowTaskWorkload:
???
StatusWorkload:
Runs status along with a workload
All status tests are nondeterministic
Storefront:
Datamodeling a store
StreamingRead:
Performance test workload
TargetedKill:
Alternative to attrition to kill something of a specific role
Used for recovery performance tests
TaskBucketCorrectness:
Tests taskbucket, which is used by backup and DR
Not comprehensive for the API
ThreadSafety:
???
Throttling:
???
Throughput:
Performance test workload
Tries to maintain a certain latency while doing as many writes as possible
TimekeeperCorrectness:
Tests timekeeper
TriggerRecovery:
???
UnitPerf:
???
UnitTests:
Runs UNIT_TESTs that don’t start with an exclamation point
Unreadable:
When doing a versionstamp operation, you can’t read that key with a RYW
This test makes sure that you can’t
VersionStamp:
Verifies everything about versionstamps
Including DR, non durability verification
WatchAndWait:
???
Watches:
???
WorkerErrors:
???
WriteBandwidth:
Performance test to do as many writes as possible
WriteDuringRead:
“most random test”
Keeps an in-memory copy of the DB while doing things to the DB
verifies that FDB matches the in-memory state at the end
Sometimes very memory intensive and OOMs