FoundationDB

KAIO and EIO: how are these structured?


(Jesse Bennett) #1

I’ve noticed that Linux uses something called KAIO, which seems to wrap functions derived from the EIO library. For instance, in fdbrpc/Net2Filesystem.cpp KAIO is initialized here:

#ifdef __linux__
AsyncFileKAIO::init( Reference<IEventFD>(N2::ASIOReactor::getEventFD()), ioTimeout );
#endif

Well, I guess the first thing is: what does KAIO stand for? How is it different from using EIO directly? It looks like EIO can actually be initialized outside of KAIO altogether. I’m interested in how it works because FreeBSD is not using KAIO currently, but EIO has hooks into sendfile, and possibly then sendfile(2), via some customizations. I have not been able to get the Async File tests (AsyncFileRead, AsyncFileWrite) to complete on Linux; it turns out this is due to the “.fdb-lock” file not being present. Now I’m starting to think that is a setup issue. I hit this assertion in both Linux Docker and FreeBSD here: https://github.com/wolfspider/foundationdb/blob/4ffbb776ceab8fc01be30a3dad211b6585a573c9/fdbrpc/sim2.actor.cpp#L1703

If anyone has any info on this I’d really appreciate it. Being able to run these tests on any platform would allow me to make the changes necessary so that async file handling can be supported, even if it has to be platform specific. I’m also a few days behind on merging the latest, so if there have been any recent changes here, let me know. Thanks!


(A.J. Beamon) #2

KAIO uses Linux’s native asynchronous IO support (I think KAIO stands for kernel asynchronous IO) for reads and writes. It’s not really a wrapper around our libeio file implementation, but it does use AsyncFileEIO for fsync calls. It’s not necessary to use KAIO (we don’t on macOS, for example), but where it’s available (Linux), we prefer it.
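To make the split above concrete, here is a minimal sketch of the routing it describes: reads and writes go to kernel AIO where it exists, fsync always goes to the EIO-style thread pool, and everything falls back to EIO elsewhere. The names `Backend`, `Op`, `kHasKernelAIO`, and `backendFor` are hypothetical and only for illustration; this is not FoundationDB's actual class structure.

```cpp
// Hypothetical model of the backend choice described above (not FoundationDB code).
enum class Backend { KernelAIO, ThreadPoolEIO };
enum class Op { Read, Write, Sync };

// Assumption: kernel AIO is treated as available only on Linux.
#ifdef __linux__
constexpr bool kHasKernelAIO = true;
#else
constexpr bool kHasKernelAIO = false;
#endif

// Pick the backend for one operation.
constexpr Backend backendFor(Op op) {
    if (!kHasKernelAIO) {
        return Backend::ThreadPoolEIO;  // e.g. macOS: everything goes through EIO
    }
    // On Linux, fsync is still handed to the EIO-style backend.
    return op == Op::Sync ? Backend::ThreadPoolEIO : Backend::KernelAIO;
}
```

On a Linux build, `backendFor(Op::Read)` yields `Backend::KernelAIO` while `backendFor(Op::Sync)` yields `Backend::ThreadPoolEIO`; on other platforms both yield `Backend::ThreadPoolEIO`.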

The assertion you are hitting is trying to say that all calls to open that have the OPEN_CREATE flag also have the OPEN_ATOMIC_WRITE_AND_CREATE flag (with the exception of .fdb-lock, which actually no longer exists).
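As a rough illustration of that rule, the check can be written as a small predicate. This is a sketch only: the flag names come from the description above, but their numeric values, the `endsWith` helper, and the function name `openFlagsAllowedInSimulation` are assumptions made up for the example, not the simulator's real code.

```cpp
#include <cstdint>
#include <string>

// Hypothetical flag values for illustration; the real constants are defined by FoundationDB.
enum OpenFlags : uint32_t {
    OPEN_CREATE = 1u << 0,
    OPEN_ATOMIC_WRITE_AND_CREATE = 1u << 1,
};

// Helper (hypothetical): does `s` end with `suffix`?
bool endsWith(const std::string& s, const std::string& suffix) {
    return s.size() >= suffix.size() &&
           s.compare(s.size() - suffix.size(), suffix.size(), suffix) == 0;
}

// True when an open() call satisfies the rule: any create must also be an
// atomic write-and-create, unless the file is the .fdb-lock exception.
bool openFlagsAllowedInSimulation(const std::string& filename, uint32_t flags) {
    const bool creates = (flags & OPEN_CREATE) != 0;
    const bool atomic = (flags & OPEN_ATOMIC_WRITE_AND_CREATE) != 0;
    return !creates || atomic || endsWith(filename, ".fdb-lock");
}
```

Under this sketch, a plain `OPEN_CREATE` on a randomly named file returns false, which corresponds to the case where the simulator raises its internal error.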


(Jesse Bennett) #3

Alright, thanks for the info. So far I haven’t gone through the FDB docs much, nor the code outside the platform stuff for flow. I’m going to have to sit down and run through the examples and admin guides now that Python works on FreeBSD; I think that will help with getting a proper setup. Also, now that I’m actually using the macOS side as the client, it will be possible to compare with the Linux Docker image. Anyhow, the trace files say something like “cannot read/write to filesystem” on all platforms with the Async tests. I assumed this had to do with the .fdb-lock file operation; even though it’s excluded, maybe that’s what it’s trying to use. I’ll look for which part of the FS it’s actually trying to open. Maybe I missed something in the Admin guide, so I’ll look at that again too. When I leave my office today I’ll get the exact error from the trace file on Linux after running the test again.


(Jesse Bennett) #4

Alright, here is the information from the AsyncFileWrite test (/tests/AsyncFileWrite.txt) and, step by step, what I did:

  1. Checked out latest from apple/foundationdb on github.com

  2. Built Docker image with the following commands:

sudo docker build .
sudo docker volume create applefdbvol
sudo docker run -it -v '/Users/jessebennett/Documents/GitHub/foundationdbapple:/home/applefdbvol' 9e1288a7d2c2 /bin/bash

  3. Built from source in Linux Docker VM with “make”

  4. When that was done, built FDBLibTLS with “make FDBLibTLS”

  5. Exported global var with “export FDB_TLS_PLUGIN=lib/libFDBLibTLS.so”

  6. Ran AsyncFileWrite test with command “bin/fdbserver -r simulation -f tests/AsyncFileWrite.txt -s 12345”

  7. Output generates familiar error here:

setting up test (AsyncFileWriteTest)…
Test received trigger for setup…
Test received trigger for setup…
Test received trigger for setup…
Internal Error @ fdbrpc/sim2.actor.cpp 1703:
addr2line -e fdbserver.debug -p -C -f -i 0x13dce88 0x1384758 0xae4d52 0xae5245 0xaf6aa5 0xaf6d60 0xa3e7cf 0xa3eaa7 0x4d5229 0x1302590 0x1302818 0x482988 0x1389a17 0x13b06bd 0x13b099b 0x13b070d 0x13b0e97 0x482988 0x1404113 0x13b0b73 0x13b0e1f 0x42b5b4 0x7f2d7d55ca40

  8. Checked the trace file generated for FS error which has the same output and this additional error message:

Event Severity="40" Time="6.600973" Type="TestFailure" Machine="3.4.3.3:1" ID="0000000000000000" Reason="Could not open file" logGroup="default" Backtrace="addr2line -e fdbserver.debug -p -C -f -i 0x143219c 0x143130a 0xade575 0xae503c 0xae5245 0xaf6aa5 0xaf6d60 0xa3e7cf 0xa3eaa7 0x4d5229 0x1302590 0x1302818 0x482988 0x1389a17 0x13b06bd 0x13b099b 0x13b070d 0x13b0e97 0x482988 0x1404113 0x13b0b73 0x13b0e1f 0x42b5b4 0x7f2d7d55ca40"

The rest of the messages are “An internal error occurred” and error code 4100.

Maybe it’s because these tests are not inside the /fast or /slow folders? I’m not sure but get the feeling I’m doing something wrong here.


(David Scherer) #5

The AsyncFileWrite test is not typically run in simulation. When it is not configured with a specific filename, it apparently creates a randomly named file in a straightforward way. The simulator is not prepared to do thorough testing of the potential durability impact of creating files without atomic rename, so it throws an error (since if something in the database itself started doing that, we would have a hole in our tests).
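For readers unfamiliar with the "atomic rename" idea mentioned above, here is a minimal POSIX sketch of the general pattern: write to a temporary name, fsync, then rename over the final name. The function name `atomicWriteFile` and the `.part` suffix are assumptions for illustration; this is the generic technique, not a copy of how FoundationDB implements OPEN_ATOMIC_WRITE_AND_CREATE.

```cpp
#include <cstdio>
#include <fcntl.h>
#include <string>
#include <unistd.h>

// Writes `data` to `path` so that the final name only ever refers to a
// complete file. Returns true on success. POSIX only.
bool atomicWriteFile(const std::string& path, const std::string& data) {
    const std::string tmp = path + ".part";  // hypothetical temp-name convention
    int fd = ::open(tmp.c_str(), O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd < 0) return false;

    size_t written = 0;
    while (written < data.size()) {
        ssize_t n = ::write(fd, data.data() + written, data.size() - written);
        if (n <= 0) {  // treat any short-circuit as failure in this sketch
            ::close(fd);
            ::unlink(tmp.c_str());
            return false;
        }
        written += static_cast<size_t>(n);
    }

    // Flush file contents to stable storage before publishing the final name.
    if (::fsync(fd) != 0) {
        ::close(fd);
        ::unlink(tmp.c_str());
        return false;
    }
    if (::close(fd) != 0) {
        ::unlink(tmp.c_str());
        return false;
    }

    // rename() replaces the destination atomically on POSIX filesystems.
    if (std::rename(tmp.c_str(), path.c_str()) != 0) {
        ::unlink(tmp.c_str());
        return false;
    }
    return true;
}
```

A fully durable version would typically also fsync the parent directory after the rename; that step is left out here to keep the sketch short.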

“Maybe it’s because these tests are not inside the /fast or /slow folders?”

Yeah, that’s pretty much right. Everything in those folders (and rare/, I guess) should work in simulation; outside that there is a mix of stuff with different purposes and some of it may have to be run in specific ways.


(Jesse Bennett) #6

Well, that is actually really good news and glad that’s all that was. Thanks again.