FoundationDB

FoundationDB does not run on Windows Subsystem for Linux (WSL)


(Christophe Chevalier) #1

I’m not sure why you’d want to do this, but in case you do: installing FoundationDB 5.2.5 on Windows Subsystem for Linux (WSL) with an Ubuntu userland does not work (Windows 10, version 1803).

For those wondering what the hell I’m talking about: yes, this is a real thing https://docs.microsoft.com/en-us/windows/wsl/faq

Installing the server package fails

~/fdb$ sudo dpkg -i foundationdb-server_5.2.5-1_amd64.deb
Selecting previously unselected package foundationdb-server.
Preparing to unpack foundationdb-server_5.2.5-1_amd64.deb ...
Unpacking foundationdb-server (5.2.5-1) ...
Setting up foundationdb-clients (5.2.5-1) ...
Adding group `foundationdb' (GID 115) ...
Done.
Adding system user `foundationdb' (UID 111) ...
Adding new user `foundationdb' (UID 111) with group `foundationdb' ...
Not creating home directory `/var/lib/foundationdb'.
...
Setting up foundationdb-server (5.2.5-1) ...
ERROR: Disk i/o operation failed (1510)
dpkg: error processing package foundationdb-server (--configure):
 installed foundationdb-server package post-installation script subprocess returned error exit status 1
dmesg: read kernel buffer failed: Function not implemented
                                                          
Processing triggers for ureadahead (0.100.0-20) ...
Processing triggers for systemd (237-3ubuntu10.3) ...
Errors were encountered while processing:
 foundationdb-server
E: Sub-process /usr/bin/dpkg returned an error code (1)

Installing the client package does not fail, but running fdbcli fails immediately:

~/fdb$ fdbcli
ERROR: Disk i/o operation failed (1510)

The log files all look identical to this (this is the whole file! nothing more)

<?xml version="1.0"?>
<Trace>
<Event Severity="10" Time="1540663766.702862" Type="Binding" Machine="127.0.0.1:4500" ID="0000000000000000" PublicAddress="127.0.0.1:4500" ListenAddress="127.0.0.1:4500" logGroup="default"/>
<Event Severity="10" Time="1540663766.702862" Type="IOSetupError" Machine="127.0.0.1:4500" ID="0000000000000000" UnixErrorCode="26" UnixError="Function not implemented" logGroup="default"/>
<Event Severity="40" Time="1540663766.702862" Type="MainError" Machine="127.0.0.1:4500" ID="0000000000000000" Error="io_error" ErrorDescription="Disk i/o operation failed" ErrorCode="1510" logGroup="default" Backtrace="addr2line -e fdbserver.debug -p -C -f -i 0x12abcc4 0x12aad32 0x432900 0x7fbfc1241b97"/>

I guess fdb must be calling into some syscalls that are not emulated by the Windows kernel, so it crashes, gets restarted, rinse, repeat…
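One detail in the trace supports that guess: the IOSetupError event pairs UnixErrorCode="26" with "Function not implemented". On Linux that message belongs to ENOSYS, which is 38 in decimal on x86-64, so the 26 appears to be printed in hexadecimal. That reading is inferred from the log, not confirmed anywhere in this thread. A minimal sketch of the decoding, where describeTraceErrno and isEnosys are hypothetical helper names and not part of FoundationDB:

```cpp
#include <cerrno>
#include <cstring>
#include <string>

// Hypothetical helper: treats a trace UnixErrorCode value as hexadecimal
// (an assumption inferred from the log) and returns the errno message text.
std::string describeTraceErrno(const std::string& hexCode) {
    int code = std::stoi(hexCode, nullptr, 16);
    return std::strerror(code);
}

// Hypothetical helper: true when the hex-encoded code is ENOSYS, the errno
// the kernel returns for a system call it does not implement.
bool isEnosys(const std::string& hexCode) {
    return std::stoi(hexCode, nullptr, 16) == ENOSYS;
}
```

If the hexadecimal assumption holds, the failure is a missing system call rather than a bad disk or a permissions problem.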

Running fdbserver would have been cool, but at least I can install it on the Windows host itself. What I’d really like is to use the client and fdbcli from a Linux environment, to test Linux-native apps (in my case, running .NET Core on Linux on my Windows laptop, without paying the cost of a VM).


(Alec Grieser) #2

I’m not an expert on WSL, but could this be related to the fact that FoundationDB uses O_DIRECT for I/O? That same problem has caused issues for those trying to run FDB within a Docker for Mac container with a mounted volume as the data directory: https://github.com/apple/foundationdb/issues/842

I’m suggesting this mainly because it’s failing with an I/O error on startup and, absent something like a bad disk, the issue above is where I think I’ve seen that before.
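One way to check the O_DIRECT theory on a given machine is to try opening a scratch file with that flag in the intended data directory. The following is a minimal sketch, not FoundationDB code: probeDirectIo and DirectIoProbe are hypothetical names, and a successful open() is only a first check, since reads and writes can still fail later on alignment requirements.

```cpp
#include <cerrno>
#include <cstdlib>
#include <string>
#include <fcntl.h>
#include <unistd.h>

// Outcome of the probe: whether open() with O_DIRECT succeeded, plus the
// errno value observed on failure (0 on success).
struct DirectIoProbe {
    bool supported;
    int error;
};

// Hypothetical helper: creates a scratch file in `dir`, reopens it with
// O_DIRECT, then removes it again.
DirectIoProbe probeDirectIo(const std::string& dir) {
    std::string path = dir + "/odirect_probe_XXXXXX";
    int fd = mkstemp(&path[0]);  // fills in the XXXXXX suffix in place
    if (fd < 0) return {false, errno};
    close(fd);

    int directFd = open(path.c_str(), O_RDWR | O_DIRECT);
    int err = (directFd < 0) ? errno : 0;
    if (directFd >= 0) close(directFd);
    unlink(path.c_str());
    return {directFd >= 0, err};
}
```

Calling probeDirectIo with the data directory and printing the error field would show whether the filesystem rejects the flag at open time (typically EINVAL when it does).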


(Christophe Chevalier) #3

It’s possible. The way I understand it, WSL works by translating Linux syscalls into their equivalent Windows kernel calls, but not everything is implemented, or even possible. O_DIRECT I/O may be one of the unsupported features.

If you implement a workaround for platforms that do not support O_DIRECT, it’s possible that WSL may be fixed as well.

Now to be fair, I don’t expect the server to be fully working, but the client part would be nice. And my suspicion is that it’s only the code that deals with the fdb.cluster file that fails (log files are written to disk without issues).


(A.J. Beamon) #4

IOSetupError is caused by a failure to set up Linux kernel AIO. I take it that’s not supported in WSL?

Maybe using AsyncFileEIO instead of AsyncFileKAIO in this case would work? I think this would require a code change to test.


(Christophe Chevalier) #5

It looks like AIO is not supported by WSL, according to these issues (the first of which affects MySQL as well): https://github.com/Microsoft/WSL/issues/3631 and https://github.com/Microsoft/WSL/issues/2113

Are there other Linux platforms that don’t support AIO? If WSL is recognized as a Linux platform by the client, then “if (platform == LINUX) …” is not a sufficient test, and it would require either a runtime check for AIO support (with a fallback to something else) or a custom package built for WSL. That’s starting to look like a lot of work…
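For what it’s worth, a runtime check of this kind can be fairly small. The sketch below calls the io_setup system call directly and reports whether the kernel accepted it. It is an illustration under assumptions, not FoundationDB’s actual setup code: probeKernelAio and KernelAioProbe are hypothetical names, and it needs the Linux kernel headers to compile.

```cpp
#include <cerrno>
#include <linux/aio_abi.h>
#include <sys/syscall.h>
#include <unistd.h>

// Outcome of the probe: whether io_setup succeeded, plus the errno value
// observed on failure (0 on success).
struct KernelAioProbe {
    bool supported;
    int error;
};

// Hypothetical helper: asks the kernel for a tiny AIO context and releases
// it again. A kernel without AIO is expected to fail with ENOSYS.
KernelAioProbe probeKernelAio() {
    aio_context_t ctx = 0;
    if (syscall(SYS_io_setup, 1L, &ctx) < 0) {
        return {false, errno};
    }
    syscall(SYS_io_destroy, ctx);
    return {true, 0};
}
```

A caller could run this once at startup and fall back to a non-AIO file implementation when supported is false, instead of relying on a compile-time platform test.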


(A.J. Beamon) #6

I haven’t really looked into it, so I’m not sure. I don’t think I’ve heard anyone else mention that they had trouble with AIO on a Linux platform before, though.

Another option is to make this behavior toggleable at runtime, perhaps using knobs. I haven’t attempted to do this, but this may be the only place where you need to modify the behavior: https://github.com/apple/foundationdb/blob/51afb29e3b23e41e55c1aec1a2865b29baf6073b/fdbrpc/Net2FileSystem.cpp#L61.
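To make the idea concrete, here is a rough sketch of what such a toggle could look like. Everything in it is hypothetical: the IoKnobs struct, its disableKernelAio field, and chooseFileBackend do not exist in FoundationDB, and the two enum values only stand in for the AsyncFileKAIO and AsyncFileEIO choices mentioned earlier in the thread.

```cpp
#include <functional>

// Stand-ins for the two file implementations discussed above.
enum class FileBackend {
    KernelAio,   // corresponds to choosing AsyncFileKAIO
    ThreadPool   // corresponds to choosing AsyncFileEIO
};

// Hypothetical knob set; the field name is an assumption for illustration.
struct IoKnobs {
    bool disableKernelAio = false;
};

// Picks a backend: the knob wins when set, otherwise the supplied
// availability check decides. The check is skipped when the knob is set.
FileBackend chooseFileBackend(const IoKnobs& knobs,
                              const std::function<bool()>& kernelAioAvailable) {
    if (knobs.disableKernelAio) {
        return FileBackend::ThreadPool;
    }
    return kernelAioAvailable() ? FileBackend::KernelAio
                                : FileBackend::ThreadPool;
}
```

Passing the availability check in as a function keeps the decision testable and lets the knob act as a manual override on platforms where a probe gives the wrong answer.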

As a side note, maybe avoiding AIO in this way is a better option for those who can’t use O_DIRECT than reverting to synchronous behavior (see https://github.com/apple/foundationdb/pull/859).