I have started working with FDB recently. Could anyone please expand on the following:
What are the different event types recorded in FoundationDB trace files, such as “GetValueDebug”?
Where are the transaction log files located? Are they stored in the data directory set in the configuration file, and do the log files have any prefix? Also, could you explain the format of the data stored in the transaction logs?
I also have a few questions about three_datacenter_mode:
The docs say that TLog processes run in two datacenters and that the third datacenter has only storage server processes. How is the data replicated to the second and third datacenters (assuming the first and second datacenters both have TLogs and the client started its transaction in the first datacenter)?
Since three_datacenter_mode is based on synchronous replication, commit latencies include the round-trip latency between datacenters. I would like to understand how the replication works here, but I couldn’t find a detailed article.
What are the different event types recorded in FoundationDB trace files, such as “GetValueDebug”?
We don’t really have much documentation for the contents of trace files. Probably the best we have are a couple of partially completed wiki pages:
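Separately from those pages, one practical way to see which event types appear in your own trace files is to count them. Here is a minimal Python sketch, assuming the trace files are in the default XML format with one `<Event ... />` element per line and the event name in a `Type` attribute. The function name `count_event_types` and the sample lines are made up for illustration; real events carry more attributes than shown.

```python
import re
from collections import Counter

# Assumption: default XML trace format, one <Event .../> per line,
# with the event name stored in a Type="..." attribute.
TYPE_ATTR = re.compile(r'\bType="([^"]*)"')

def count_event_types(lines):
    """Count how often each event Type appears in an iterable of trace lines."""
    counts = Counter()
    for line in lines:
        if "<Event" not in line:
            continue  # skip the XML header, the <Trace> wrapper, blank lines
        match = TYPE_ATTR.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts

# Made-up sample lines, not copied from a real trace file.
sample = [
    '<?xml version="1.0"?>',
    '<Trace>',
    '<Event Severity="10" Time="1.0" Type="GetValueDebug" ID="0" />',
    '<Event Severity="10" Time="1.1" Type="GetValueDebug" ID="1" />',
    '<Event Severity="10" Time="1.2" Type="CommitDebug" ID="2" />',
]

for event_type, n in count_event_types(sample).most_common():
    print(event_type, n)
```

To run this over a real file, pass an open file handle instead of `sample`, since the function only needs an iterable of lines.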
The transaction logs’ data files (rather than trace logs) are stored in the data directory. There are two sets of files here:
Disk queue - These files have the form logqueue-*.fdq. All mutations sent to a transaction log get appended to one of these, and storage servers gradually pop data off the front.
Persistent data - These files are usually b-tree files using the ssd-2 storage engine, and in that case will have the form log*.sqlite and log*.sqlite-wal. If the disk queue gets “full”, then data gets spilled to this data structure from the queue until it can be popped or is no longer needed.
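If you want to pick these files out of a data directory listing, the two naming patterns above can be matched with simple globs. This is only a sketch: the helper name `classify` and the example filenames are hypothetical, and the patterns are just the ones described in this post.

```python
from fnmatch import fnmatchcase

# Patterns taken from the description above; checked in order.
PATTERNS = [
    ("disk queue", "logqueue-*.fdq"),
    ("persistent data (write-ahead log)", "log*.sqlite-wal"),
    ("persistent data", "log*.sqlite"),
]

def classify(filename):
    """Return a label for a transaction log file, or None for other files."""
    for label, pattern in PATTERNS:
        if fnmatchcase(filename, pattern):
            return label
    return None

# Hypothetical filenames for illustration only.
for name in ["logqueue-example-0.fdq", "log-example.sqlite",
             "log-example.sqlite-wal", "storage-example.sqlite"]:
    print(name, "->", classify(name))
```

On a real machine you could feed it the names returned by `os.listdir()` on the configured data directory.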
The filenames of both of these also contain some version information about the log system used to generate them. There are some good write-ups of the design of this in these documents:
I don’t have a lot of experience with three_datacenter mode, but I think this works fairly similarly to normal configurations. Commits must synchronously write data to all the logs in the two datacenters, and storage servers asynchronously grab data from the logs.
A commit from the primary datacenter would have to interact with that datacenter to initiate the commit, the data would get written to the first and second datacenter transaction logs, and then the commit would succeed. Asynchronously, the storage servers in all datacenters would grab the updated data from their transaction logs to ensure that it ends up everywhere.
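To make the ordering concrete, here is a toy Python model of that flow. It is not FoundationDB code: the `TLog` and `StorageServer` classes, the `commit` function, and the choice of which log each storage server reads from are all assumptions made for illustration. It only shows that the commit returns once every log in the two datacenters has the mutation, while storage servers catch up afterwards.

```python
class TLog:
    """Toy transaction log: an append-only list of (version, key, value)."""
    def __init__(self, dc):
        self.dc = dc
        self.entries = []

    def append(self, version, key, value):
        self.entries.append((version, key, value))
        return True  # acknowledgement back to the committer

class StorageServer:
    """Toy storage server that pulls mutations from one assigned log."""
    def __init__(self, dc, source):
        self.dc = dc
        self.source = source  # assumption: each server reads from one log
        self.version = 0
        self.data = {}

    def pull(self):
        start = self.version
        for version, key, value in self.source.entries:
            if version > start:  # apply only mutations not yet seen
                self.data[key] = value
                self.version = max(self.version, version)

def commit(tlogs, version, key, value):
    """Synchronous part: succeed only if every log acknowledges the write."""
    acks = [t.append(version, key, value) for t in tlogs]
    return all(acks)

tlogs = [TLog("dc1"), TLog("dc2")]
storage = [
    StorageServer("dc1", tlogs[0]),
    StorageServer("dc2", tlogs[1]),
    StorageServer("dc3", tlogs[0]),  # dc3 has no logs, so it reads remotely
]

ok = commit(tlogs, 1, "k", "v")
before = [s.data.get("k") for s in storage]  # nothing applied yet
for s in storage:
    s.pull()  # asynchronous in reality; run explicitly here
after = [s.data.get("k") for s in storage]
print(ok, before, after)
```

In this model the commit latency depends only on the log writes in the two datacenters; the storage servers, including the one in the third datacenter, do not sit on the commit path.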