FoundationDB

A few queries on status json


(gaurav) #1

Hi,

I am trying to understand the fields in status json for monitoring purposes. There are a few fields that I could not understand, and it would be great if someone could help out with these:

  1. cluster->machines->id->memory->committed_bytes => What does this field convey? And what operating system counter/metric is it using to get its value?

  2. cluster->processes->id->memory->used_bytes => What does this field convey? And what operating system counter/metric is it using to get its value (RES memory or something else)?

  3. cluster->workload->keys->read => What is this field for and how is it different from cluster->workload->operations->read?

  4. cluster->workload->bytes->read/write => what are these fields for? Are they indicating data exchange rate (read and written) at the interface of fdbserver, or does it indicate data written out by fdbservers to disk? How is the field useful for monitoring?

Is there any field that tells me the per-process or per-machine disk read and write rate in bytes?


(A.J. Beamon) #2

This is the memory on the machine that is not regarded as available. On Linux it comes from /proc/meminfo, and you can see its computation here:

This is the virtual memory size of the process, and it comes from /proc/self/statm on Linux. You can see it being collected here:

This is the number of keys read, expressed as a triple of rate, roughness, and total. The rate is measured per second and the total is the sum over all storage processes for their current lifetimes. Roughness is a measure of how evenly spaced or bursty the metric is, with a roughness of 1 corresponding to evenly spaced events and a higher roughness representing events that come in spurts.

The operations metric measures the number of operations, ignoring how many keys each returns. For example, a get range that returns 10 keys counts as 10 in the first metric and 1 in the second.
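To make the keys-vs-operations distinction concrete, a monitoring script can divide the two rates to get the average number of keys returned per read operation. This is a minimal sketch: the field paths are the ones discussed in this thread, and the sample values are hypothetical.

```python
# Hypothetical status json fragment; only the fields relevant here.
status = {
    "cluster": {
        "workload": {
            "keys": {"read": {"hz": 186.4}},
            "operations": {"reads": {"hz": 65.3}},
        }
    }
}

workload = status["cluster"]["workload"]
keys_per_sec = workload["keys"]["read"]["hz"]
ops_per_sec = workload["operations"]["reads"]["hz"]

# A ratio well above 1 suggests clients are mostly issuing range reads:
# e.g. a get range returning 10 keys counts 10 in keys.read but 1 in
# operations.reads.
avg_keys_per_read = keys_per_sec / ops_per_sec
```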

For reads, this is the same as above except that instead of reporting the number of keys returned, it reports the size of the data returned.

For writes, this is reporting the logical size of the mutations being written to the database (i.e. key+value size, ignoring replication, disk overhead, etc.).

These fields give you an idea how much work the clients of a database are doing in aggregate.
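A monitoring script can pull these aggregate client-workload rates out of the status document in one pass. The sketch below assumes `fdbcli --exec "status json"` is available (the cluster-file path shown is a common default and may differ in your environment); `client_workload` itself just walks the field paths named in this thread.

```python
import json
import subprocess

def fetch_status(cluster_file="/etc/foundationdb/fdb.cluster"):
    # "status json" prints the machine-readable status document.
    # The cluster-file path is an assumption; adjust for your setup.
    out = subprocess.check_output(
        ["fdbcli", "-C", cluster_file, "--exec", "status json"]
    )
    return json.loads(out)

def client_workload(status):
    # Extract the aggregate client workload rates discussed above.
    w = status["cluster"]["workload"]
    return {
        "read_bytes_per_sec": w["bytes"]["read"]["hz"],
        "written_bytes_per_sec": w["bytes"]["written"]["hz"],
        "read_ops_per_sec": w["operations"]["reads"]["hz"],
        "write_ops_per_sec": w["operations"]["writes"]["hz"],
    }
```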

cluster->processes->id->disk->reads/writes->sectors will tell you the number of sectors being read or written, which are typically 512 bytes in size (though you may want to confirm that’s the case in your environment).
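Converting those sector counts to bytes is then a single multiplication. A small sketch, assuming the typical 512-byte sector size mentioned above (confirm it for your environment):

```python
SECTOR_BYTES = 512  # typical sector size; confirm for your environment

def sectors_to_bytes(process_disk):
    # process_disk is the "disk" section for one process in status json.
    # Multiplies the reads/writes sector counts by the assumed sector size.
    return {
        "read_bytes": process_disk["reads"]["sectors"] * SECTOR_BYTES,
        "write_bytes": process_disk["writes"]["sectors"] * SECTOR_BYTES,
    }
```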


(gaurav) #3

Thank you for the detailed explanation. This is very helpful.


(gaurav) #4

Hi, while writing the monitoring scripts, I could not understand the precise semantics of the process.disk counters:

"disk" : {
    "busy" : 0.047199600000000001,
    "free_bytes" : 827432890368,
    "reads" : {
        "counter" : 8447124,
        "hz" : 17.1999,
        "sectors" : 688
    },
    "total_bytes" : 914706128896,
    "writes" : {
        "counter" : 188700975,
        "hz" : 183.19900000000001,
        "sectors" : 48352
    }
}

Specifically:

  • busy: what does this indicate? I tried making the disk perform a lot of IO using dd commands in parallel; the iostat command showed %util close to 98%, whereas the busy counter remained in the low single digits.
  • reads/writes.hz: do these indicate the number of read/write IOPS done by the specific fdbserver process (and not the total IOPS being issued against the disk, including those done by any non-fdb processes)? If so, how are they related to (or different from) the sectors counter?

And a few more questions on workload counters:

"workload" : {
    "bytes" : {
        "read" : {
            "counter" : 2613703043638,
            "hz" : 23296.599999999999,
            "roughness" : 9629.0699999999997
        },
        "written" : {
            "counter" : 3954328153,
            "hz" : 968.18499999999995,
            "roughness" : 901.48900000000003
        }
    },
    "keys" : {
        "read" : {
            "counter" : 41657599527,
            "hz" : 186.39699999999999,
            "roughness" : 114.771
        }
    },
    "operations" : {
        "read_requests" : {
            "counter" : 1110279557,
            "hz" : 652.99000000000001,
            "roughness" : 4.3613800000000005
        },
        "reads" : {
            "counter" : 1110279557,
            "hz" : 652.99000000000001,
            "roughness" : 4.3613800000000005
        },
        "writes" : {
            "counter" : 31491107,
            "hz" : 4.5999300000000005,
            "roughness" : 4.2830500000000002
        }
    }
}

There are three seemingly similar counters:

  1. keys.read
  2. operations.read_requests
  3. operations.reads

@ajbeamon explained above that keys->read is the number of keys read by all storage servers, whereas operations->reads is the number of read operations (including range reads) issued by clients.
What does operations.read_requests indicate (assuming my understanding of the other two counters is correct)?


(A.J. Beamon) #5

This gives a fraction of time the disk associated with a process’s data directory is doing IOs over what is typically a 5 second window. On Linux, it comes from reading /proc/diskstats. I don’t have a good explanation for why you saw what you did unless you happened to be doing your IOs on a different drive.

This is the rate that reads and writes are being performed on the disk and is not specific to a single process. The sectors counter tells you how many sectors are being read or written to by those reads or writes, as each read or write may consist of multiple sectors. This comes from the same code linked above. It seems that in status, though, we report the read/write hz as a rate per second (as implied by the name), but sectors is reported as the number that occurred over a ~5 second window (the exact window size is not reported). That seems wrong to me, so we should probably change that to be a rate and/or a counter.
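Given that the window behind the reported sectors value is not exposed, a monitoring script can sidestep it by differencing the cumulative `counter` values between two of its own polls, which yields an IOPS rate over a window it controls. A sketch, where `get_status` is assumed to be any callable returning a parsed status json document and the process id is hypothetical:

```python
import time

def read_iops(get_status, process_id, interval_sec=5.0):
    # Compute read IOPS for the disk behind one process by differencing
    # the cumulative reads counter between two polls, rather than relying
    # on the reported hz/sectors pair with its unspecified window.
    def read_counter(status):
        proc = status["cluster"]["processes"][process_id]
        return proc["disk"]["reads"]["counter"]

    before = read_counter(get_status())
    time.sleep(interval_sec)
    after = read_counter(get_status())
    return (after - before) / interval_sec
```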

Read requests measure incoming requests, the other two measure metrics for completed requests. The former may differ from the latter if the cluster can’t keep up with all of the reads.
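This distinction suggests a simple saturation signal for a monitoring script: the gap between the incoming and completed read rates. A minimal sketch, using only the field paths shown earlier in this thread:

```python
def read_backlog_rate(workload):
    # read_requests counts incoming reads; reads counts completed ones.
    # A persistently positive difference suggests the cluster is not
    # keeping up with the read load.
    ops = workload["operations"]
    return ops["read_requests"]["hz"] - ops["reads"]["hz"]
```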


(gaurav) #6

Thanks AJ! I was mistakenly running the dd command against a different disk. Your explanation matches the corrected test results.