Performance characteristics of using Watches for Distributed Task Scheduling


(Christophe Chevalier) #21

Up till now, I used a local 4-process fdb cluster when running the tests, and I wanted to check the difference that going to the network introduce.

Unfortunately, I had to change multiple factors at the same time, so it is difficult to pin point what is going on, but the results where exactly the opposite to what I was expecting:

The client and local cluster run on this machine:

  • i7-6700K @ 4.0 Ghz, 32 GB ram
  • Windows 10, .NET 4.7.2
  • FDB 5.1.7, .NET binding
  • Samsung SM951 (MZHPV512HDGL)

I have also set up 2 VMs running 2 process each, on Ubuntu 18.4, hosted by Hyper-V, via 1Gbps ethernet in between for comparison.

The test simulates a fixed throughput (24,000 messages per minute) and measure the end-to-end latency of a message dispatch. It runs for 2 minutes. The request rate is fixed so I’m not running the cluster above its limits (I think).

Latency over time

The test running entirely localhost (Win -> Win):


The same test but using the remote cluster (Win -> Linux):

Looking at the distribution of latency for both, it is clear that:

  1. The linux server (and/or the network?) is more consistent with timings than the Windows server
  2. The Windows server implementation seems to be all other the place with timings.

Latency distributions

LOCALHOST (Win->Win)

- Min/Max: 27.7ms .. 566.2ms
- Average: 77.9ms (+/-29.2ms)
- Median : 73.3ms (+/-10.4ms)
- Distrib: (48.2ms) - 61.4ms =[ 73.3ms ]= 88.4ms - (118.2ms)
   ___________________________________________________________________________________________________________________________________ 
  |____[ Min , Max )____|___Count____|__Percent____________________________________________________|___Cumulative________|__Weight____|
  |   25.0ms - 30.0ms   |         14 |   0.037% `                                                  |   0.037% `--------- |    385.0ms |
  |   30.0ms - 35.0ms   |         62 |   0.162% .                                                  |   0.199% `--------- |      2.02s |
  |   35.0ms - 40.0ms   |        207 |   0.541% ;                                                  |   0.740% [--------- |      7.76s |
  |   40.0ms - 45.0ms   |        594 |   1.553% $                                                  |   2.292% [--------- |      25.2s |
  |   45.0ms - 50.0ms   |      1,602 |   4.187% ##.                                                |   6.480% )--------- |        76s |
  |   50.0ms - 60.0ms   |      5,889 |  15.392% #######X                                           |  21.872% --[------- |       324s |
  |   60.0ms - 70.0ms   |      8,403 |  21.963% ###########                                        |  43.835% ----(----- |       546s |
  |   70.0ms - 80.0ms   |      7,050 |  18.427% #########:                                         |  62.262% ------[--- |       529s |
  |   80.0ms - 90.0ms   |      5,796 |  15.149% #######x                                           |  77.412% -------)-- |       493s |
  |   90.0ms - 100.0ms  |      3,568 |   9.326% ####X                                              |  86.738% --------)- |       339s |
  |  100.0ms - 120.0ms  |      3,479 |   9.093% ####=                                              |  95.831% ---------) |       383s |
  |  120.0ms - 140.0ms  |        905 |   2.365% #:                                                 |  98.197% ---------] |       118s |
  |  140.0ms - 160.0ms  |        257 |   0.672% ;                                                  |  98.868% ---------] |      38.6s |
  |  160.0ms - 180.0ms  |        123 |   0.321% :                                                  |  99.190% ---------] |      20.9s |
  |  180.0ms - 200.0ms  |         64 |   0.167% .                                                  |  99.357% ---------] |      12.2s |
  |  200.0ms - 250.0ms  |         49 |   0.128% .                                                  |  99.485% ---------] |      11.0s |
  |  250.0ms - 300.0ms  |         91 |   0.238% .                                                  |  99.723% ---------> |      25.0s |
  |  300.0ms - 350.0ms  |         51 |   0.133% .                                                  |  99.856% ---------> |      16.6s |
  |  350.0ms - 400.0ms  |         26 |   0.068% `                                                  |  99.924% ---------> |      9.75s |
  |  400.0ms - 450.0ms  |          6 |   0.016% `                                                  |  99.940% ---------> |      2.55s |
  |  450.0ms - 500.0ms  |          7 |   0.018% `                                                  |  99.958% ---------> |      3.33s |
  |  500.0ms - 600.0ms  |         16 |   0.042% `                                                  | 100.000% ---------@ |      8.80s |
  |_____________________|____________|_____________________________________________________________|_____________________|____________|

REMOTE (Win->Linux)

- Min/Max: 9.0ms .. 345.8ms
- Average: 49.0ms (+/-16.6ms)
- Median : 47.7ms (+/-7.3ms)
- Distrib: (31.3ms) - 38.2ms =[ 47.7ms ]= 56.6ms - (71.0ms)
   ___________________________________________________________________________________________________________________________________ 
  |____[ Min , Max )____|___Count____|__Percent____________________________________________________|___Cumulative________|__Weight____|
  |    8.0ms - 9.0ms    |          1 |   0.003% `                                                  |   0.003% `--------- |      8.5ms |
  |    9.0ms - 10.0ms   |         17 |   0.044% `                                                  |   0.046% `--------- |    161.5ms |
  |   10.0ms - 12.0ms   |         57 |   0.147% .                                                  |   0.193% `--------- |    627.0ms |
  |   12.0ms - 14.0ms   |         70 |   0.180% .                                                  |   0.373% `--------- |    910.0ms |
  |   14.0ms - 16.0ms   |         43 |   0.111% .                                                  |   0.484% `--------- |    645.0ms |
  |   16.0ms - 18.0ms   |         52 |   0.134% .                                                  |   0.618% [--------- |    884.0ms |
  |   18.0ms - 20.0ms   |         47 |   0.121% .                                                  |   0.739% [--------- |    893.0ms |
  |   20.0ms - 25.0ms   |        210 |   0.541% ;                                                  |   1.279% [--------- |      4.73s |
  |   25.0ms - 30.0ms   |        358 |   0.922% =                                                  |   2.201% [--------- |      9.85s |
  |   30.0ms - 35.0ms   |      4,331 |  11.149% #####x                                             |  13.350% -(-------- |       141s |
  |   35.0ms - 40.0ms   |      7,020 |  18.071% #########                                          |  31.421% ---[------ |       263s |
  |   40.0ms - 45.0ms   |      1,846 |   4.752% ##+                                                |  36.173% ---)------ |        78s |
  |   45.0ms - 50.0ms   |     10,091 |  25.976% #############                                      |  62.149% ------[--- |       479s |
  |   50.0ms - 60.0ms   |      7,561 |  19.464% #########X                                         |  81.612% --------[- |       416s |
  |   60.0ms - 70.0ms   |      5,123 |  13.188% ######x                                            |  94.800% ---------| |       333s |
  |   70.0ms - 80.0ms   |        785 |   2.021% #                                                  |  96.821% ---------) |      58.9s |
  |   80.0ms - 90.0ms   |        605 |   1.557% $                                                  |  98.378% ---------] |      51.4s |
  |   90.0ms - 100.0ms  |        284 |   0.731% +                                                  |  99.109% ---------] |      27.0s |
  |  100.0ms - 120.0ms  |        177 |   0.456% :                                                  |  99.565% ---------> |      19.5s |
  |  120.0ms - 140.0ms  |         75 |   0.193% .                                                  |  99.758% ---------> |      9.75s |
  |  140.0ms - 160.0ms  |         40 |   0.103% .                                                  |  99.861% ---------> |      6.00s |
  |  160.0ms - 180.0ms  |          3 |   0.008% `                                                  |  99.869% ---------> |    510.0ms |
  |  250.0ms - 300.0ms  |         35 |   0.090% `                                                  |  99.959% ---------> |      9.63s |
  |  300.0ms - 350.0ms  |         16 |   0.041% `                                                  | 100.000% ---------@ |      5.20s |
  |_____________________|____________|_____________________________________________________________|_____________________|____________|

But then, looking at the peaks of the Linux test, at~30ms, ~45ms and ~60ms, it looks like it is always a multiple of 15ms.

My spider sense is tingling because 15ms is the default timeslice of the Windows scheduler (and system time resolution). I’ve seen so many times 15.x and 31ms come up in naive benchmarking using the system clock on Windows, that I always reject these values by default.

My tests are already using the QPC as the source, so the issue does not come from the benchmark framework, but maybe by some other component. There are other “media” application running so the OS timeslice should already be at 1000hz (1ms resolution for timers), but the system clock will still be at ~15ms resolution.

So:

  • either this is a coincidence and 15ms seem to be a natural cycle length for my test
  • or the client windows implementation uses the system clock somewhere as a source for timers?

Unfortunately the test app is not running on .NET Core yet, so I can’t replicate the tests with a Linux client to disprove this hypothesis…

Does anyone have any idea what would cause these N x 15ms latency modes with the Linux client?


(Christophe Chevalier) #22

New hardware, new tests!

  • AMD Ryzen 1950X “ThreadRipper” with 64 GB RAM
  • 3 x Samsung 960 EVO 1TB
  • 3 x Ubuntu 18.04 VM, 8 GB RAM, 2 process, each with its own disk
  • Cluster configured in single ssd-2

note: the little spike at 56 sec is probably .NET GC triggering on the bench client.

- Min/Max: 12.4ms .. 279.6ms
- Average: 30.3ms (+/-13.1ms)
- Median : 28.4ms (+/-4.2ms)
- Distrib: (17.8ms) - 23.7ms =[ 28.4ms ]= 34.1ms - (47.8ms)
   ___________________________________________________________________________________________________________________________________ 
  |____[ Min , Max )____|___Count____|__Percent____________________________________________________|___Cumulative________|__Weight____|
  |   12.0ms - 14.0ms   |         37 |   0.092% `                                                  |   0.092% `--------- |    481.0ms |
  |   14.0ms - 16.0ms   |        479 |   1.188% x                                                  |   1.280% [--------- |      7.19s |
  |   16.0ms - 18.0ms   |      1,675 |   4.156% ##.                                                |   5.436% |--------- |      28.5s |
  |   18.0ms - 20.0ms   |      2,707 |   6.716% ###+                                               |  12.151% -[-------- |      51.4s |
  |   20.0ms - 25.0ms   |      6,921 |  17.170% ########x                                          |  29.322% --]------- |       156s |
  |   25.0ms - 30.0ms   |     12,180 |  30.217% ###############.                                   |  59.539% ----->---- |       335s |
  |   30.0ms - 35.0ms   |      7,645 |  18.966% #########=                                         |  78.506% -------]-- |       248s |
  |   35.0ms - 40.0ms   |      3,681 |   9.132% ####x                                              |  87.638% --------]- |       138s |
  |   40.0ms - 45.0ms   |      2,251 |   5.584% ##$                                                |  93.222% ---------( |        96s |
  |   45.0ms - 50.0ms   |      1,263 |   3.133% #x                                                 |  96.356% ---------) |      60.0s |
  |   50.0ms - 60.0ms   |        968 |   2.402% #:                                                 |  98.757% ---------] |      53.2s |
  |   60.0ms - 70.0ms   |        231 |   0.573% ;                                                  |  99.330% ---------] |      15.0s |
  |   70.0ms - 80.0ms   |         77 |   0.191% .                                                  |  99.521% ---------> |      5.78s |
  |   80.0ms - 90.0ms   |         25 |   0.062% `                                                  |  99.583% ---------> |      2.13s |
  |   90.0ms - 100.0ms  |         12 |   0.030% `                                                  |  99.613% ---------> |      1.14s |
  |  100.0ms - 120.0ms  |         20 |   0.050% `                                                  |  99.663% ---------> |      2.20s |
  |  120.0ms - 140.0ms  |         30 |   0.074% `                                                  |  99.737% ---------> |      3.90s |
  |  140.0ms - 160.0ms  |         30 |   0.074% `                                                  |  99.811% ---------> |      4.50s |
  |  160.0ms - 180.0ms  |         17 |   0.042% `                                                  |  99.854% ---------> |      2.89s |
  |  180.0ms - 200.0ms  |          9 |   0.022% `                                                  |  99.876% ---------> |      1.71s |
  |  200.0ms - 250.0ms  |         39 |   0.097% `                                                  |  99.973% ---------> |      8.78s |
  |  250.0ms - 300.0ms  |         11 |   0.027% `                                                  | 100.000% ---------@ |      3.03s |
  |_____________________|____________|_____________________________________________________________|_____________________|____________|

Zooming in, I can still kind of see the boundaries of 1/60th and 1/30th of a second, with a white band right in between:

Testing while the cluster is moving data around?

I tried switching to double redundancy, but probably ran the test while the cluster was still repartitionning: we can see that there are moments when the performance is degraded:

Looks like there is a 10 sec cycle in the process.


(Christophe Chevalier) #23

Testing impact of server crash on the latency.

The test runs for 120 sec with a steady load. The cluster is composed of 3 x VM with 2 process each, configured in double ssd.

At T+20 sec, I kill one of the 3 VM (so only 2 remains), and then at T+60 sec I restart the VM.

Remarks:

  • Before the crash, the typical latency is 30ms.
  • The application takes about 10 seconds to recover from the crash (new transactions start showing at T+30sec)
  • While the other 2 nodes are reacting to the crash, latency goes to about 40ms.
  • When the VM restarts at T+60sec, the latency get a tiny bit more erratic (not by much)
  • 90 sec after the VM restart, the cluster goes back down to the typical 30ms latency.

The 10 seconds reaction delay seems a bit long, but maybe this can be configured somewhere?

Repeating the test again, but this time observing that VM #3 only has storage nodes (no log, no master, …)

This time there is still a 10 sec period where things are funky, but some transactions do make progress during this time. Also, the time to restore the typical performance after the VM restart is about 10 sec (including the boot time of the RHEL that is about 5sec?).

It looks like for the first test, the VM that was killed may have been either the cluster Master, or had one or two logger.

Remarks:

  • The impact of loosing a VM depends on what roles were running on that VM.

(Christophe Chevalier) #24

After fiddling with the hardware and the drivers, plus some optimizations here and there, I’m now getting very stable latency in the steady case.

Typical latency is now either ~15ms or ~25ms.

I’m thinking that the two latency modes correspond to two cases:

  • when there is already one or more commands available in a server’s queue when it checks again, and can proceed immediately to the next
  • when there is no commands, and it must setup a watch an wait to be notified.

If this is true, it would mean that the overhead of a watch is ~10ms ?

- Min/Max: 11.3ms .. 280.9ms
- Average: 24.5ms (+/-13.3ms)
- Median : 25.1ms (+/-2.5ms)
- Distrib: (14.8ms) - 18.8ms =[ 25.1ms ]= 28.0ms - (32.5ms)
   ______________________________________________________________________________________________________________________ 
  |____[ Min , Max )____|___Count____|__Percent____________________________________________________|___Cumulative________|
  |   10.0ms - 12.0ms   |         33 |   0.081% `                                                  |   0.081% `--------- |
  |   12.0ms - 14.0ms   |        784 |   1.925% #                                                  |   2.006% [--------- |
  |   14.0ms - 16.0ms   |      3,071 |   7.540% ###$                                               |   9.545% >--------- |
  |   16.0ms - 18.0ms   |      4,688 |  11.509% #####$                                             |  21.055% --[------- |
  |   18.0ms - 20.0ms   |      3,818 |   9.373% ####X                                              |  30.428% -->------- |
  |   20.0ms - 25.0ms   |      7,760 |  19.051% #########=                                         |  49.480% ----]----- |
  |   25.0ms - 30.0ms   |     17,335 |  42.559% #####################;                             |  92.038% ---------[ |
  |   30.0ms - 35.0ms   |      2,391 |   5.870% ##&                                                |  97.908% ---------] |
  |   35.0ms - 40.0ms   |        372 |   0.913% =                                                  |  98.822% ---------] |
  |   40.0ms - 45.0ms   |        146 |   0.358% :                                                  |  99.180% ---------] |
  |   45.0ms - 50.0ms   |         47 |   0.115% .                                                  |  99.295% ---------] |
  |   50.0ms - 60.0ms   |         69 |   0.169% .                                                  |  99.465% ---------] |
  |   60.0ms - 70.0ms   |         18 |   0.044% `                                                  |  99.509% ---------> |
  |  100.0ms - 120.0ms  |          1 |   0.002% `                                                  |  99.511% ---------> |
  |  120.0ms - 140.0ms  |          3 |   0.007% `                                                  |  99.519% ---------> |
  |  140.0ms - 160.0ms  |         18 |   0.044% `                                                  |  99.563% ---------> |
  |  160.0ms - 180.0ms  |         48 |   0.118% .                                                  |  99.681% ---------> |
  |  180.0ms - 200.0ms  |         65 |   0.160% .                                                  |  99.840% ---------> |
  |  200.0ms - 250.0ms  |         59 |   0.145% .                                                  |  99.985% ---------> |
  |  250.0ms - 300.0ms  |          6 |   0.015% `                                                  | 100.000% ---------@ |
  |_____________________|____________|_____________________________________________________________|_____________________|


(Christophe Chevalier) #25

Looking at the transaction log (the test generates 400 MB of logs on the client side!), It looks like this is indeed the case:

Case when a command is already present:

Only one transaction required to get next message.

Transaction #190013 (read/write, 5 operations, '#' = 0.5 ms, started 12:43:28.6200999Z, ended 12:43:28.6341425Z)
┌  oper. ┬──────────────────────────────┬──── start ──── end ── duration ──┬─ sent  recv ┐
│ 0   // │                              │ T+  0.003                        │     -     - │ // WaitForNextCommands(SRV000, 10)
│:0   op │                              │ T+  0.003                        │     -     - │ SetOption PrioritySystemImmediate
│:0   R °│ ##&                          │ T+  0.006 ~   1.481 (  1,475 µs) │    60   163 │ Snapshot.GetRange fGT{(58, 10, "server_commands", "SRV000").<00>} <= k < fGE{(58, 10, "server_commands", "SRV000").<FF>}, limit(10), Exact => 1 result(s)
│ 1   c  │ __`                          │ T+  1.483 ~   1.485 (      2 µs) │    42       │ Clear (58, 10, "server_commands", "SRV000", @164222371293-0#1)
│ 2   Co*│ __`######################### │ T+  1.487 ~  14.059 ( 12,572 µs) │             │ Commit
└────────┴──────────────────────────────┴──────────────────────────────────┴─────────────┘
> Read 163 bytes and Committed 113 bytes in 14.064 ms and 1 attempt(s)

The transaction takes 14ms to read and consume one command from the queue.

Case when a watch must be set up:

First transaction sees nothing, create the watch. When it fires, the transaction is retried and now sees 3 messages.

Transaction #190055 (read/write, 5 operations, '#' = 0.5 ms, started 12:43:28.6451670Z, ended 12:43:28.6471734Z)
┌  oper. ┬──────┬──── start ──── end ── duration ──┬─ sent  recv ┐
│ 0   // │      │ T+  0.002                        │     -     - │ // WaitForNextCommands(SRV000, 10)
│:0   op │      │ T+  0.002                        │     -     - │ SetOption PrioritySystemImmediate
│:0   R °│ ###: │ T+  0.005 ~   1.808 (  1,804 µs) │    60     0 │ Snapshot.GetRange fGT{(58, 10, "server_commands", "SRV000").<00>} <= k < fGE{(58, 10, "server_commands", "SRV000").<FF>}, limit(10), Exact => 0 result(s)
│ 1   W  │ ___` │ T+  1.810                        │     -     - │ Watch (58, 10, "server_commands", "SRV000")
│:1   Co │ ___X │ T+  1.819 ~   2.227 (    409 µs) │             │ Commit
└────────┴──────┴──────────────────────────────────┴─────────────┘
> Completed in 2.231 ms and 1 attempt(s)

// waiting for watch to fire...

Transaction #190087 (read/write, 7 operations, '#' = 0.5 ms, started 12:43:28.6602071Z, ended 12:43:28.6682276Z)
┌  oper. ┬──────────────────┬──── start ──── end ── duration ──┬─ sent  recv ┐
│ 0   // │                  │ T+  0.003                        │     -     - │ // WaitForNextCommands(SRV000, 10)
│:0   op │                  │ T+  0.003                        │     -     - │ SetOption PrioritySystemImmediate
│:0   R °│ ##&              │ T+  0.005 ~   1.551 (  1,546 µs) │    60   489 │ Snapshot.GetRange fGT{(58, 10, "server_commands", "SRV000").<00>} <= k < fGE{(58, 10, "server_commands", "SRV000").<FF>}, limit(10), Exact => 3 result(s)
│ 1   c  │ __`              │ T+  1.553 ~   1.555 (      2 µs) │    42       │ Clear (58, 10, "server_commands", "SRV000", @164222401665-0#1)
│ 2   c  │ __`              │ T+  1.555 ~   1.555 (      1 µs) │    42       │ Clear (58, 10, "server_commands", "SRV000", @164222401753-1#1)
│ 3   c  │ __`              │ T+  1.555 ~   1.556 (      1 µs) │    42       │ Clear (58, 10, "server_commands", "SRV000", @164222401753-2#1)
│ 4   Co°│ __.############# │ T+  1.558 ~   8.449 (  6,892 µs) │             │ Commit
└────────┴──────────────────┴──────────────────────────────────┴─────────────┘
> Read 489 bytes and Committed 339 bytes in 8.455 ms and 1 attempt(s)

For the second case, the transaction that sets up the watch start at xx.6451670. The watch fires at xx.6602071 (15ms later), and the second transaction completes at xx…6682276 (23ms).

These numbers look seems to confirm that indeed the two modes are with and without watch.


Logging of transactions for debugging purposes
(Christophe Chevalier) #26

Of the importance of having an idem-potent algorithm to deal with commit_unknown_result

When doing crash tests (killing a random VM during a benchmark test) I observed some crash that at first pointed to an inconsistency in the data, which was not supposed to be possible.

Looking closely at the logs, I see that this is caused by commit_unknown_result that happen right when the VM crash: the transaction has been durably stored on the disk, but the client did not receive a success notification, and retried the transaction as if it never happened.

And since the transaction selects a random GUID to create the new key, and also uses versionstamps for the scheduling, I ended up with creating two work orders for the same item. Once the first once completes, the original item is deleted. And then the second work order fails because it cannot find the original item anymore (which is not supposed to be possible with ACID transactions).

So the code has to handle this case, by testing for commit_unknown_result errors, and doing a preliminary check when it retries (look for signs that the work item has already been scheduled).

Example log of a transaction that was committed just when the cluster node it was talking to failed, and created a duplicate work order for the same original item.

Transaction #16608 (read/write, 29 operations, '#' = 50.0 ms, started 09:40:32.8652885Z, ended 09:40:34.8247080Z)
┌  oper. ┬─────────────────────────────────────────┬──── start ──── end ── duration ──┬─ sent  recv ┐
│ 0   // │                                         │ T+  0.001                        │     -     - │ // ScheduleJobs([1])
│:0   G*°│ `                                       │ T+  0.003 ~   1.469 (  1,466 µs) │    35   692 │ GetValues [1] {(58, 10, "tickets_meta", {2a10d28d-5bd1-4f5a-9d11-36a4df9c54c2}) } => [1] {{"id":"2a10d28d-5bd1-4f5a-9d11-36a4df9c54c2","usr":"DOMACME\\user060","q":"VIRTUALQ","a":"SRV000","c":"2018-06-18T09:40:32.7549962Z","st":0,"m":{"Title":"Job_19_User060","Type":"unknown","UserDomain":"DOMACME","UserLogin":"User060","UserFullName":"John USER060","Date":"1970-01-01T00:00:00Z","Sheets":3425,"Pages":3425,"ColorPages":0,"Copies":0,"IsColor":false,"IsDuplex":false,"IsLargeFormat":false,"NUp":0,"IsLandscape":false,"HasSeparatorPage":false,"IsStapled":false,"IsFolded":false,"IsEconomode":false,"IsHighQuality":false,"PageRatio":0,"Priority":0,"StationName":"PC123","StationIp":"192.168.1.23"},"ft":0,"loc":{"uri":"wd://srv000/Queues/VIRTUALQ/3425/TheKey_C36EB6D7","sz":4226450}} }
│ 1   // │ `                                       │ T+  1.508                        │     -     - │ // Create Schedule 46fc20d9-9875-4e8b-83b0-a57fece6d50d for Ticket 2a10d28d-5bd1-4f5a-9d11-36a4df9c54c2 on device PRN0987
│:1   s  │ `                                       │ T+  1.519 ~   1.520 (      2 µs) │   278       │ Set (58, 10, "schedules", {46fc20d9-9875-4e8b-83b0-a57fece6d50d}, "") = {"id":"46fc20d9-9875-4e8b-83b0-a57fece6d50d","usr":"DOMACME\\User060","tid":"2a10d28d-5bd1-4f5a-9d11-36a4df9c54c2","dev":"PRN0987","act":"SRV000","dt":"2018-06-18T09:40:32.8652885Z","ix":0,"st":0,"ka":false,"tag":"W#0000060:0:3","x":10.3622772}
│ 2   s  │ `                                       │ T+  1.521 ~   1.522 (      1 µs) │    56       │ Set (58, 10, "schedules", {46fc20d9-9875-4e8b-83b0-a57fece6d50d}, "ticket") = *<10><D2><8D>[<D1>OZ<9D><11>6<A4><DF><9C>T<C2>
│ 3   s  │ `                                       │ T+  1.522 ~   1.523 (      1 µs) │    47       │ Set (58, 10, "schedules", {46fc20d9-9875-4e8b-83b0-a57fece6d50d}, "device") = 'PRN0987'
│ 4   s  │ `                                       │ T+  1.526 ~   1.527 (      1 µs) │    43       │ Set (58, 10, "schedules", {46fc20d9-9875-4e8b-83b0-a57fece6d50d}, "state") = 'None'
│ 5   s  │ `                                       │ T+  1.527 ~   1.528 (      1 µs) │    59       │ Set (58, 10, "schedules_by_ticket", {2a10d28d-5bd1-4f5a-9d11-36a4df9c54c2}, {46fc20d9-9875-4e8b-83b0-a57fece6d50d}) = ''
│ 6   R °│ `                                       │ T+  1.530 ~   2.647 (  1,118 µs) │    60     0 │ GetRange fGE{(58, 10, "devices_queues", "PRN0987").<00>} <= k < fGE{(58, 10, "devices_queues", "PRN0987").<FF>}, limit(1) => 0 result(s)
│ 7   // │ `                                       │ T+  2.654                        │     -     - │ // First job in queue PRN0987, will notify actor SRV000!
│:7   a  │ `                                       │ T+  2.656 ~   2.658 (      2 µs) │    33       │ Atomic_Add (58, 10, "server_commands", "SRV000"), <01 00 00 00>
│ 8   a  │ `                                       │ T+  2.666 ~   2.667 (      2 µs) │   165       │ Atomic_VersionStampedKey (58, 10, "server_commands", "SRV000", @?#1), <7B 22 63 6D 64 22 3A 22 70 72 69 6E 74 22 2C 22 64 65 76 22 3A 22 50 52 4E 30 39 38 37 22 2C 22 73 69 64 22 3A 22 34 36 66 63 32 30 64 39 2D 39 38 37 35 2D 34 65 38 62 2D 38 33 62 30 2D 61 35 37 66 65 63 65 36 64 35 30 64 22 2C 22 74 69 64 22 3A 22 32 61 31 30 64 32 38 64 2D 35 62 64 31 2D 34 66 35 61 2D 39 64 31 31 2D 33 36 61 34 64 66 39 63 35 34 63 32 22 7D>
│ 9   a  │ `                                       │ T+  2.669 ~   2.670 (      1 µs) │    60       │ Atomic_VersionStampedKey (58, 10, "devices_queues", "PRN0987", @?#1), <46 FC 20 D9 98 75 4E 8B 83 B0 A5 7F EC E6 D5 0D>
│ 10  a  │ `                                       │ T+  2.671 ~   2.672 (      1 µs) │    50       │ Atomic_VersionStampedValue (58, 10, "schedules", {46fc20d9-9875-4e8b-83b0-a57fece6d50d}, "slot"), <FF F7 57 19 F4 B0 77 19 F0 00 00 01>
│ 11 !Co*│ !#####################################x │ T+  2.677 ~ 1939.873 (1,937,196 µs) │             │ Commit => [CommitUnknownResult] Transaction may or may not have committed
│ 12  Er°│ ______________________________________. │ T+1939.914 ~ 1947.272 (  7,358 µs) │             │ OnError CommitUnknownResult (1021)
├────────┼─────────────────────────────────────────┼──────────────────────────────────┼─────────────┤ == Attempt #2 ==
│ 13  // │ °°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°` │ T+1947.276                        │     -     - │ // ScheduleJobs([1])
│:13  G*°│ °°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°. │ T+1947.280 ~ 1950.752 (  3,472 µs) │    35   692 │ GetValues [1] {(58, 10, "tickets_meta", {2a10d28d-5bd1-4f5a-9d11-36a4df9c54c2}) } => [1] {{"id":"2a10d28d-5bd1-4f5a-9d11-36a4df9c54c2","usr":"DOMACME\\user060","q":"VIRTUALQ","a":"SRV000","c":"2018-06-18T09:40:32.7549962Z","st":0,"m":{"Title":"Job_19_User060","Type":"unknown","UserDomain":"DOMACME","UserLogin":"User060","UserFullName":"John USER060","Date":"1970-01-01T00:00:00Z","Sheets":3425,"Pages":3425,"ColorPages":0,"Copies":0,"IsColor":false,"IsDuplex":false,"IsLargeFormat":false,"NUp":0,"IsLandscape":false,"HasSeparatorPage":false,"IsStapled":false,"IsFolded":false,"IsEconomode":false,"IsHighQuality":false,"PageRatio":0,"Priority":0,"StationName":"PC123","StationIp":"192.168.1.23"},"ft":0,"loc":{"uri":"wd://srv000/Queues/VIRTUALQ/3425/TheKey_C36EB6D7","sz":4226450}} }
│ 14  // │ °°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°` │ T+1950.810                        │     -     - │ // Create Schedule f6af2968-66f9-4f1f-96c5-75df43c04f10 for Ticket 2a10d28d-5bd1-4f5a-9d11-36a4df9c54c2 on device PRN0987
│:14  s  │ °°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°` │ T+1950.837 ~ 1950.843 (      6 µs) │   278       │ Set (58, 10, "schedules", {f6af2968-66f9-4f1f-96c5-75df43c04f10}, "") = {"id":"f6af2968-66f9-4f1f-96c5-75df43c04f10","usr":"DOMACME\\User060","tid":"2a10d28d-5bd1-4f5a-9d11-36a4df9c54c2","dev":"PRN0987","act":"SRV000","dt":"2018-06-18T09:40:32.8652885Z","ix":0,"st":0,"ka":false,"tag":"W#0000060:0:3","x":10.3622772}
│ 15  s  │ °°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°` │ T+1950.844 ~ 1950.846 (      1 µs) │    56       │ Set (58, 10, "schedules", {f6af2968-66f9-4f1f-96c5-75df43c04f10}, "ticket") = *<10><D2><8D>[<D1>OZ<9D><11>6<A4><DF><9C>T<C2>
│ 16  s  │ °°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°` │ T+1950.847 ~ 1950.850 (      3 µs) │    47       │ Set (58, 10, "schedules", {f6af2968-66f9-4f1f-96c5-75df43c04f10}, "device") = 'PRN0987'
│ 17  s  │ °°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°` │ T+1950.872 ~ 1950.874 (      3 µs) │    43       │ Set (58, 10, "schedules", {f6af2968-66f9-4f1f-96c5-75df43c04f10}, "state") = 'None'
│ 18  s  │ °°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°` │ T+1950.876 ~ 1950.877 (      1 µs) │    59       │ Set (58, 10, "schedules_by_ticket", {2a10d28d-5bd1-4f5a-9d11-36a4df9c54c2}, {f6af2968-66f9-4f1f-96c5-75df43c04f10}) = ''
│ 19  R  │ °°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°` │ T+1950.883 ~ 1951.553 (    670 µs) │    60    58 │ GetRange fGE{(58, 10, "devices_queues", "PRN0987").<00>} <= k < fGE{(58, 10, "devices_queues", "PRN0987").<FF>}, limit(1) => 1 result(s), has_more
│ 20  // │ °°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°` │ T+1951.556                        │     -     - │ // There is already at least one schedule for device PRN0987
│:20  a  │ °°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°` │ T+1951.560 ~ 1951.562 (      3 µs) │    60       │ Atomic_VersionStampedKey (58, 10, "devices_queues", "PRN0987", @?#1), <F6 AF 29 68 66 F9 4F 1F 96 C5 75 DF 43 C0 4F 10>
│ 21  a  │ °°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°` │ T+1951.563 ~ 1951.564 (      1 µs) │    50       │ Atomic_VersionStampedValue (58, 10, "schedules", {f6af2968-66f9-4f1f-96c5-75df43c04f10}, "slot"), <FF 5F 9A C9 C9 FA 56 A4 F0 01 00 01>
│ 22  Co°│ °°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°: │ T+1951.567 ~ 1959.953 (  8,386 µs) │             │ Commit
└────────┴─────────────────────────────────────────┴──────────────────────────────────┴─────────────┘
> Read 1,442 bytes and Committed 733 bytes in 1,959.958 ms and 2 attempt(s)

(Christophe Chevalier) #27

I can attest that randomly killing VMs here and there, is a good way to test for all variants of error codes that can happen at different locations :slight_smile:

I had a few future_version errors in the Watch that were not really expected, and had to specifically handle the commit_unknown_result in a few places (go read the database to see if the work has already been done or not) to survive all the failures.

The last test was starting with 5 VMs, and killing them all one after the other, until the application stops responding, then restarting them all, and waiting for everything to resume without any funky errors.

note: Y axis is in log scale!

With 5 VM, killing the first 2 VMs is quickly handled by the cluster (between 1.5 and 2 sec latency), but the application keeps going (with 3 VM remaining).

Then killing the third VM makes the cluster stop responding (last 2 VMs are not enough to have a quorum).

At T+80sec, I quickly restart all 3 VM in quick succession, and about 10 sec later, the application resume processing, wityh one or two micro-spikes (probably when the cluster starts rebalancing data).

I should make the test run longer to measure the impulse response of such an event before the latency return to normal…

This is what it looks like for a transaction that spanned the crash:

Transaction #71455 (read/write, 31 operations, '#' = 500.0 ms, started 15:29:11.1712164Z, ended 15:29:50.1686888Z)
┌  oper. ┬───────────────────────────────────────────────────────────────────────────────┬────── start ────── end ── duration ──┬─ sent  recv ┐
│ 0   // │                                                                               │ T+  0.005                            │     -     - │ // ScheduleJobs([ebff15b8-0ffd-4bcc-9a2a-c3e032bfb9f5])
│:0   G**│ ############################################################################$ │ T+  0.011 ~ 38899.690 (38,899,680 µs)│    35   696 │ GetValues [1] {(58, 10, "tickets_meta", {ebff15b8-0ffd-4bcc-9a2a-c3e032bfb9f5}) } => [1] {{"id":"ebff15b8-0ffd-4bcc-9a2a-c3e032bfb9f5","usr":"DOMACME\\user074","q":"VIRTUALQ","a":"SRV003","c":"2018-06-18T15:29:11.0980215Z","st":0,"m":{"Title":"Job_88_User074","Type":"unknown","UserDomain":"DOMACME","UserLogin":"User074","UserFullName":"John USER074","Date":"1970-01-01T00:00:00Z","Sheets":14746,"Pages":14746,"ColorPages":0,"Copies":0,"IsColor":false,"IsDuplex":false,"IsLargeFormat":false,"NUp":0,"IsLandscape":false,"HasSeparatorPage":false,"IsStapled":false,"IsFolded":false,"IsEconomode":false,"IsHighQuality":false,"PageRatio":0,"Priority":0,"StationName":"PC123","StationIp":"192.168.1.23"},"ft":0,"loc":{"uri":"wd://srv003/Queues/VIRTUALQ/14746/TheKey_851BBAFF","sz":18196564}} }
│ 1   // │ ____________________________________________________________________________` │ T+38899.724                          │     -     - │ // Create Schedule 1b78b73e-4fb0-4747-acdc-a95f3bcb724f for Ticket ebff15b8-0ffd-4bcc-9a2a-c3e032bfb9f5 on device PRN0527
│:1   s  │ ____________________________________________________________________________` │ T+38899.734 ~ 38899.735 (      2 µs) │   278       │ Set (58, 10, "schedules", {1b78b73e-4fb0-4747-acdc-a95f3bcb724f}, "") = {"id":"1b78b73e-4fb0-4747-acdc-a95f3bcb724f","usr":"DOMACME\\User074","tid":"ebff15b8-0ffd-4bcc-9a2a-c3e032bfb9f5","dev":"PRN0527","act":"SRV003","dt":"2018-06-18T15:29:11.1712164Z","ix":0,"st":0,"ka":false,"tag":"W#0000074:3:3","x":53.7603924}
│ 2   s  │ ____________________________________________________________________________` │ T+38899.736 ~ 38899.737 (      1 µs) │    56       │ Set (58, 10, "schedules", {1b78b73e-4fb0-4747-acdc-a95f3bcb724f}, "ticket") = <EB><FF><15><B8><0F><FD>K<CC><9A>*<C3><E0>2<BF><B9><F5>
│ 3   s  │ ____________________________________________________________________________` │ T+38899.738 ~ 38899.739 (      1 µs) │    47       │ Set (58, 10, "schedules", {1b78b73e-4fb0-4747-acdc-a95f3bcb724f}, "device") = 'PRN0527'
│ 4   s  │ ____________________________________________________________________________` │ T+38899.741 ~ 38899.742 (      1 µs) │    43       │ Set (58, 10, "schedules", {1b78b73e-4fb0-4747-acdc-a95f3bcb724f}, "state") = 'None'
│ 5   s  │ ____________________________________________________________________________` │ T+38899.742 ~ 38899.745 (      3 µs) │    59       │ Set (58, 10, "schedules_by_ticket", {ebff15b8-0ffd-4bcc-9a2a-c3e032bfb9f5}, {1b78b73e-4fb0-4747-acdc-a95f3bcb724f}) = ''
│ 6   R °│ ____________________________________________________________________________` │ T+38899.747 ~ 38901.298 (  1,551 µs) │    60     0 │ GetRange fGE{(58, 10, "devices_queues", "PRN0527").<00>} <= k < fGE{(58, 10, "devices_queues", "PRN0527").<FF>}, limit(1) => 0 result(s)
│ 7   // │ ____________________________________________________________________________` │ T+38901.300                          │     -     - │ // First job in queue PRN0527, will notify actor SRV003!
│:7   a  │ ____________________________________________________________________________` │ T+38901.301 ~ 38901.302 (      1 µs) │    33       │ Atomic_Add (58, 10, "server_commands", "SRV003"), <01 00 00 00>
│ 8   a  │ ____________________________________________________________________________` │ T+38901.306 ~ 38901.307 (      1 µs) │   165       │ Atomic_VersionStampedKey (58, 10, "server_commands", "SRV003", @?#1), <7B 22 63 6D 64 22 3A 22 70 72 69 6E 74 22 2C 22 64 65 76 22 3A 22 50 52 4E 30 35 32 37 22 2C 22 73 69 64 22 3A 22 31 62 37 38 62 37 33 65 2D 34 66 62 30 2D 34 37 34 37 2D 61 63 64 63 2D 61 39 35 66 33 62 63 62 37 32 34 66 22 2C 22 74 69 64 22 3A 22 65 62 66 66 31 35 62 38 2D 30 66 66 64 2D 34 62 63 63 2D 39 61 32 61 2D 63 33 65 30 33 32 62 66 62 39 66 35 22 7D>
│ 9   a  │ ____________________________________________________________________________` │ T+38901.307 ~ 38901.308 (      1 µs) │    60       │ Atomic_VersionStampedKey (58, 10, "devices_queues", "PRN0527", @?#1), <1B 78 B7 3E 4F B0 47 47 AC DC A9 5F 3B CB 72 4F>
│ 10  a  │ ____________________________________________________________________________` │ T+38901.309 ~ 38901.310 (      1 µs) │    50       │ Atomic_VersionStampedValue (58, 10, "schedules", {1b78b73e-4fb0-4747-acdc-a95f3bcb724f}, "slot"), <FF 59 FD 8C 4B 32 B3 74 F0 00 00 01>
│ 11 !Co*│ ____________________________________________________________________________. │ T+38901.312 ~ 38947.152 ( 45,840 µs) │             │ Commit => [PastVersion] Transaction is too old to perform reads or be committed
│ 12  Er*│ ____________________________________________________________________________` │ T+38947.196 ~ 38962.783 ( 15,587 µs) │             │ OnError PastVersion (1007)
├────────┼───────────────────────────────────────────────────────────────────────────────┼──────────────────────────────────────┼─────────────┤ == Attempt #2 ==
│ 13  // │ °°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°` │ T+38962.788                          │     -   -   │ // ScheduleJobs([ebff15b8-0ffd-4bcc-9a2a-c3e032bfb9f5])
│:13  G*°│ °°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°` │ T+38962.791 ~ 38964.467 (  1,676 µs) │    35   696 │ GetValues [1] {(58, 10, "tickets_meta", {ebff15b8-0ffd-4bcc-9a2a-c3e032bfb9f5}) } => [1] {{"id":"ebff15b8-0ffd-4bcc-9a2a-c3e032bfb9f5","usr":"DOMACME\\user074","q":"VIRTUALQ","a":"SRV003","c":"2018-06-18T15:29:11.0980215Z","st":0,"m":{"Title":"Job_88_User074","Type":"unknown","UserDomain":"DOMACME","UserLogin":"User074","UserFullName":"John USER074","Date":"1970-01-01T00:00:00Z","Sheets":14746,"Pages":14746,"ColorPages":0,"Copies":0,"IsColor":false,"IsDuplex":false,"IsLargeFormat":false,"NUp":0,"IsLandscape":false,"HasSeparatorPage":false,"IsStapled":false,"IsFolded":false,"IsEconomode":false,"IsHighQuality":false,"PageRatio":0,"Priority":0,"StationName":"PC123","StationIp":"192.168.1.23"},"ft":0,"loc":{"uri":"wd://srv003/Queues/VIRTUALQ/14746/TheKey_851BBAFF","sz":18196564}} }
│ 14  // │ °°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°` │ T+38964.500                          │     -   -   │ // Create Schedule 1b78b73e-4fb0-4747-acdc-a95f3bcb724f for Ticket ebff15b8-0ffd-4bcc-9a2a-c3e032bfb9f5 on device PRN0527
│:14  s  │ °°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°` │ T+38964.508 ~ 38964.509 (      2 µs) │   278       │ Set (58, 10, "schedules", {1b78b73e-4fb0-4747-acdc-a95f3bcb724f}, "") = {"id":"1b78b73e-4fb0-4747-acdc-a95f3bcb724f","usr":"DOMACME\\User074","tid":"ebff15b8-0ffd-4bcc-9a2a-c3e032bfb9f5","dev":"PRN0527","act":"SRV003","dt":"2018-06-18T15:29:11.1712164Z","ix":0,"st":0,"ka":false,"tag":"W#0000074:3:3","x":53.7603924}
│ 15  s  │ °°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°` │ T+38964.510 ~ 38964.511 (      1 µs) │    56       │ Set (58, 10, "schedules", {1b78b73e-4fb0-4747-acdc-a95f3bcb724f}, "ticket") = <EB><FF><15><B8><0F><FD>K<CC><9A>*<C3><E0>2<BF><B9><F5>
│ 16  s  │ °°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°` │ T+38964.511 ~ 38964.512 (      1 µs) │    47       │ Set (58, 10, "schedules", {1b78b73e-4fb0-4747-acdc-a95f3bcb724f}, "device") = 'PRN0527'
│ 17  s  │ °°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°` │ T+38964.514 ~ 38964.515 (      1 µs) │    43       │ Set (58, 10, "schedules", {1b78b73e-4fb0-4747-acdc-a95f3bcb724f}, "state") = 'None'
│ 18  s  │ °°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°` │ T+38964.516 ~ 38964.517 (      1 µs) │    59       │ Set (58, 10, "schedules_by_ticket", {ebff15b8-0ffd-4bcc-9a2a-c3e032bfb9f5}, {1b78b73e-4fb0-4747-acdc-a95f3bcb724f}) = ''
│ 19  R °│ °°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°` │ T+38964.518 ~ 38972.793 (  8,275 µs) │    60     0 │ GetRange fGE{(58, 10, "devices_queues", "PRN0527").<00>} <= k < fGE{(58, 10, "devices_queues", "PRN0527").<FF>}, limit(1) => 0 result(s)
│ 20  // │ °°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°` │ T+38972.794                          │     -     - │ // First job in queue PRN0527, will notify actor SRV003!
│:20  a  │ °°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°` │ T+38972.795 ~ 38972.797 (      2 µs) │    33       │ Atomic_Add (58, 10, "server_commands", "SRV003"), <01 00 00 00>
│ 21  a  │ °°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°` │ T+38972.801 ~ 38972.802 (      1 µs) │   165       │ Atomic_VersionStampedKey (58, 10, "server_commands", "SRV003", @?#1), <7B 22 63 6D 64 22 3A 22 70 72 69 6E 74 22 2C 22 64 65 76 22 3A 22 50 52 4E 30 35 32 37 22 2C 22 73 69 64 22 3A 22 31 62 37 38 62 37 33 65 2D 34 66 62 30 2D 34 37 34 37 2D 61 63 64 63 2D 61 39 35 66 33 62 63 62 37 32 34 66 22 2C 22 74 69 64 22 3A 22 65 62 66 66 31 35 62 38 2D 30 66 66 64 2D 34 62 63 63 2D 39 61 32 61 2D 63 33 65 30 33 32 62 66 62 39 66 35 22 7D>
│ 22  a  │ °°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°` │ T+38972.804 ~ 38972.807 (      4 µs) │    60       │ Atomic_VersionStampedKey (58, 10, "devices_queues", "PRN0527", @?#1), <1B 78 B7 3E 4F B0 47 47 AC DC A9 5F 3B CB 72 4F>
│ 23  a  │ °°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°` │ T+38972.808 ~ 38972.810 (      2 µs) │    50       │ Atomic_VersionStampedValue (58, 10, "schedules", {1b78b73e-4fb0-4747-acdc-a95f3bcb724f}, "slot"), <FF 8F 03 BD 54 3F 16 3A F0 01 00 01>
│ 24  Co*│ °°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°` │ T+38972.814 ~ 38997.849 ( 25,035 µs) │             │ Commit
└────────┴───────────────────────────────────────────────────────────────────────────────┴──────────────────────────────────────┴─────────────┘

What is weird, is that it dies with a past_version caused by a read… It stalled on the first read, be we ended up getting the result 40 sec later. Then it continue setting some keys, before falling on the commit.

Transaction that did not read anything don’t fail on the commit:

Transaction #72179 (read/write, 4 operations, '#' = 500.0 ms, started 15:29:18.1897053Z, ended 15:29:49.8598697Z)
┌  oper. ┬─────────────────────────────────────────────────────────────────┬──── start ──── end ── duration ──┬─ sent  recv ┐
│ 0   s  │ `                                                               │ T+  0.050 ~   0.056 (      6 µs) │   731       │ Set (58, 10, "tickets_meta", {b6ef118d-e4f3-4dc5-a51f-24ec816e5e7c}) = {"id":"b6ef118d-e4f3-4dc5-a51f-24ec816e5e7c","usr":"DOMACME\\user000","q":"VIRTUALQ","a":"SRV002","c":"2018-06-18T15:29:18.1897053Z","st":0,"m":{"Title":"Job_98_User000","Type":"unknown","UserDomain":"DOMACME","UserLogin":"User000","UserFullName":"John USER000","Date":"1970-01-01T00:00:00Z","Sheets":14885,"Pages":14885,"ColorPages":0,"Copies":0,"IsColor":false,"IsDuplex":false,"IsLargeFormat":false,"NUp":0,"IsLandscape":false,"HasSeparatorPage":false,"IsStapled":false,"IsFolded":false,"IsEconomode":false,"IsHighQuality":false,"PageRatio":0,"Priority":0,"StationName":"PC123","StationIp":"192.168.1.23"},"ft":0,"loc":{"uri":"wd://srv002/Queues/VIRTUALQ/14885/TheKey_F4DF6B7C","sz":18368090}}
│ 1   s  │ `                                                               │ T+  0.058 ~   0.059 (      1 µs) │    55       │ Set (58, 10, "tickets_by_user", "DOMACME\user000", {b6ef118d-e4f3-4dc5-a51f-24ec816e5e7c}) = ''
│ 2   s  │ `                                                               │ T+  0.059 ~   0.060 (      1 µs) │    49       │ Set (58, 10, "tickets_by_queue", "VIRTUALQ", {b6ef118d-e4f3-4dc5-a51f-24ec816e5e7c}) = ''
│ 3   Co*│ ############################################################### │ T+  0.061 ~ 31670.582 (31,670,521 µs) │             │ Commit
└────────┴─────────────────────────────────────────────────────────────────┴──────────────────────────────────┴─────────────┘
> Committed 919 bytes in 31,670.589 ms and 1 attempt(s)

I even have one transaction that was able to read two keys 50 sec apart!

Transaction #63849 (read-only, 2 operations, '#' = 1,000.0 ms, started 15:28:57.9100775Z, ended 15:29:49.9822494Z)
┌  oper. ┬──────────────────────────────────────────────────────┬──── start ──── end ── duration ──┬─ sent  recv ┐
│ 0   G *│ #x                                                   │ T+  0.009 ~ 1587.633 (1,587,624 µs) │    34   243 │ Get (58, 10, "schedules", {1386bdf5-4c54-450c-8123-20e7f544bd53}, "") => {"id":"1386bdf5-4c54-450c-8123-20e7f544bd53","usr":"DOMACME\\User096","tid":"e60...15:28:57.8910264Z","ix":0,"st":0,"ka":false,"tag":"W#0000096:1:0","x":40.480004}
│:0   G *│ #################################################### │ T+  0.023 ~ 52072.983 (52,072,960 µs) │    35   695 │ Get (58, 10, "tickets_meta", {e6036194-90f1-4fff-a145-99fdb60fa7e6}) => {"id":"e6036194-90f1-4fff-a145-99fdb60fa7e6","usr":"DOMACME\\user096","q":"VIRTU..."loc":{"uri":"wd://srv001/Queues/VIRTUALQ/13180/TheKey_F3649B95","sz":16264120}}
└────────┴──────────────────────────────────────────────────────┴──────────────────────────────────┴─────────────┘
> Read 938 bytes in 52,073.068 ms and 1 attempt(s)

I guess even though there is 52 seconds of wall clock between both reads, the cluster was not alive during that time, and less than 5 seconds of time has elapsed, as seen from the cluster. But then by the time the commit happens, the cluster can finally close the previous 5 second window…


(Dave Rosenthal) #28

Just wanted to say that this is an awesome thread!

I can attest that randomly killing VMs here and there, is a good way to test for all variants of error codes that can happen at different locations!

Sounds like a client mode that randomly threw errors would be a nice feature to let FDB developers build tests to ensure that their code handles all cases gracefully. (Does this already exist?)