I am starting to look at porting the FoundationDB code to work on Power (ppc64le), and it seems that there is a lot of architecture-specific code (dependencies on x86_64).
I wanted to check whether there is any general guidance on how to approach this, how complex it might be, and whether FoundationDB has been ported (or a port attempted) to any other architecture that could serve as a reference.
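One rough way to size the effort is to scan the source tree for x86-specific markers before touching any code. The sketch below is only a starting point: the marker list (compiler macros, intrinsic headers, inline assembly) and the file extensions are assumptions, not an inventory of what FoundationDB actually uses, and `find_x86_markers` is a hypothetical helper name.

```python
import os
import re

# Assumed, non-exhaustive markers of x86-specific code.
X86_MARKERS = re.compile(
    r"__x86_64__|__i386__|_M_X64"      # compiler-defined architecture macros
    r"|<[a-z]*mmintrin\.h>"            # SSE/AVX intrinsic headers
    r"|\b_mm_\w+"                      # SSE intrinsic calls
    r"|\b__asm__\b|\basm\s+volatile\b" # inline assembly
)
SOURCE_EXTENSIONS = (".h", ".hpp", ".c", ".cpp", ".S")


def find_x86_markers(root):
    """Return {relative_path: [(line_number, line_text), ...]} for matching lines."""
    hits = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(SOURCE_EXTENSIONS):
                continue
            path = os.path.join(dirpath, name)
            with open(path, encoding="utf-8", errors="replace") as handle:
                for number, line in enumerate(handle, start=1):
                    if X86_MARKERS.search(line):
                        rel = os.path.relpath(path, root)
                        hits.setdefault(rel, []).append((number, line.strip()))
    return hits
```

Calling `find_x86_markers("path/to/checkout")` and sorting the result by number of hits per file gives a first list of places to review. It will miss architecture assumptions that leave no textual marker, such as memory-ordering or page-size assumptions.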
Heya @alexmiller, I can provide some color. We have a number of ideas for using FoundationDB at IBM. Many of those use cases are motivated by the needs of services delivered in IBM Cloud. Those services often have an on-premises counterpart with a consistent API to help our clients build portable applications, and many of our clients run their on-prem “private cloud” environments on Power hardware. So we’re looking ahead to make sure that we don’t find ourselves scrambling to port something here. I can think of at least two customers where FoundationDB and Power could intersect in the coming year.
Priya’s team has the broader mission to support a robust open source ecosystem on Power, and I’ve asked for their help to figure out what we might be up against here.
If you want to build and maintain a port of FDB to a new architecture, I would recommend budgeting for ongoing compute and human time to run a full spectrum of simulation and other tests on each release. Otherwise the port may be significantly less trustworthy than the official releases.
It would be nice if there were a document somewhere describing a “baseline” testing routine for FoundationDB porters. I don’t know exactly what integration strategies Apple uses, for example. (For instance, NixOS users can’t use the foundationdb.org binaries, so we have to package our own.) I ran many of the simulation tests under tests/ with our builds, but only after figuring out how to do that by reading forum threads and the like. It would be nice to have a document saying something like, “Here are the tests you need to run manually, and here are the tests that can be fully automated by running fdbserver in simulation mode”, and how to do so.
It would also be nice to know about the memory/compute requirements for this. For instance, I can write a test for NixOS that can execute a set of test routes running FoundationDB, including things like simulation runs; this is how we integrate and test other databases, such as CockroachDB, too (this link contains a full end-to-end cluster test). While this might not be enabled by default (it could take hours), if the requirements are known, I can certainly run tests myself regularly, etc.
Perhaps this already exists somewhere in the source code? I still think this would be good information to have in the documentation so it’s easier to find.
On a related note, it would probably be nice on top of this if the GitHub repository for FoundationDB could be set up to use a CI system that ran such tests somehow. This is not only very useful for you but also important for contributors, etc. (Right now I would simply have no clue how to test my changes before submitting them for review). I’d guess that a public CI system for GitHub would probably run a smaller set of tests to keep iteration times lower.
(A useful CI system to examine might be Azure Pipelines, since it’s got a free version and does actually support Windows/Linux/Mac natively, and it’s the only one I can think of that does support all 3 out of the box.)
I don’t see a PowerPC port for libcoroutine that anyone has already done. FWIW, I wouldn’t be opposed to a PR switching from coroutine to Boost.Context instead.
I think this is going to, somewhat by necessity, fall out of the “port FDB to FreeBSD” PR, because part of that involves a discussion of “when would this be marked as stable” and the list of requirements that would entail.
Porting to a new architecture on the same platform would likely involve a separate list, since there would be no need to verify that the networking stack or file system doesn’t behave differently in some harmful way. I’m happy to sit down, think hard, and chat with folks if someone shows up with an FDB port to a new arch, like PowerPC.
A CMake build system has been added, and it also has CTest support. I’d recommend using that, because a lot of work went into making ctest able to run simulation tests locally in parallel. If ctest passes for you locally, that’s a reasonable time to open a PR. That still only gives you O(100) runs of simulation versus the O(50,000) runs that I’d generally aim for on a serious change to the core, but if there’s any concern, your reviewer will pull the PR and kick off more simulation runs on a cluster of machines to close that testing gap.
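To put the O(100) versus O(50,000) gap in perspective, here is a back-of-the-envelope sketch. It assumes each simulation run uses an independent random seed and that a given bug shows up in a fixed fraction of runs. The 1-in-10,000 rate is a made-up illustration, not a measured FoundationDB number.

```python
def chance_of_catching(bug_rate, runs):
    """Probability that at least one of `runs` independent runs hits the bug."""
    return 1.0 - (1.0 - bug_rate) ** runs


# Hypothetical bug that appears in 1 of every 10,000 simulation runs.
HYPOTHETICAL_BUG_RATE = 1e-4

for runs in (100, 50_000):
    print(f"{runs:>6} runs: {chance_of_catching(HYPOTHETICAL_BUG_RATE, runs):.1%}")
```

Under those assumptions, 100 runs catch such a bug roughly 1% of the time, while 50,000 runs catch it roughly 99% of the time. That is why a local ctest pass is a good gate for opening a PR but not a substitute for the larger run.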
I… don’t see that in the README, which is a pretty good point.
It currently can! That support was only added a week ago or so, though. (See the Correctness check on this PR, for example.) It runs ctest -L fast, and there are commands that reviewers can issue to get it to run more comprehensive testing if requested.
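For anyone who wants to reproduce that check locally, a minimal wrapper might look like the sketch below. The `fast` label comes from the command above, and `-L`, `-j` and `--output-on-failure` are standard ctest options. The helper names and the build directory are hypothetical, and this assumes the tree has already been configured and built with CMake.

```python
import os
import subprocess


def ctest_command(label=None, jobs=None):
    """Build a ctest argument list; -L filters by label, -j sets parallelism."""
    command = ["ctest", "--output-on-failure"]
    if label is not None:
        command += ["-L", label]
    if jobs is not None:
        command += ["-j", str(jobs)]
    return command


def run_ctest(build_dir, label="fast", jobs=None):
    """Run ctest inside an already configured and built CMake build directory."""
    jobs = jobs or os.cpu_count() or 1
    return subprocess.run(ctest_command(label, jobs), cwd=build_dir).returncode
```

`run_ctest("path/to/build")` returns ctest's exit code, which is zero when every selected test passes. Passing `label=None` drops the filter and runs everything ctest knows about, which will take considerably longer.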