Improving (Perceived) Complexity

Full disclosure: we (Wavefront) started using FoundationDB around 2014 (before Apple acquired it), and I believe it is one of the foundational pieces of technology that allowed us to scale to the moon and back. It pains me that the perceived complexity of the project is still rather high even after it became open source in 2018.

Someone pointed me to this article:

Freedium Link

As an open-source project, I can understand how a newcomer would feel a sense of “Where do I begin?” given the “scattered-ness” of documentation, talks, videos, PDFs, etc., that one would need to read (and figure out if they are still correct/relevant). You want the documentation for the Redwood Storage Engine, you need to find a recorded talk. Is RocksDB production ready? When and why should you use it? Maybe you need to talk to someone at a meet-up next time (if you are in San Jose, that is). Is rangeconfig usable yet? Maybe try it in your dev cluster and see?

It’s not too surprising that when you look at something like https://neon.com/ (also open-source), there’s a general feeling that the developer experience would be great (not least because it shows you testimonials and the logos of big companies using it).

The perception that the project lacks a mature ecosystem, that it is only feasible for early adopters, that it is not battle-tested in production, or that it is still a young project is simply not true. At the same time, a lot of the counter-arguments live in people’s heads and are not spelled out in writing or organized in a digestible fashion (the academia-ness of the docs, per the blog).

Given that there are significant resources behind improving FoundationDB and the planet-scale demonstrations of its scale and resiliency, I think the community (or concerned citizens?) can help with the DevEx of the project to help make it feel a bit more loved.


Like, this literally should not happen: “How to actually use FDB OpenTelemetry tracing?”

There needs to be a better way than people banging their heads against a wall for hours, coming to the forums, and being told that something isn’t finished yet but is somehow in the codebase, tempting you to try it. There needs to be something like a roadmap so people can know whether something is planned, coming soon, stalled (help wanted), or shipped.


And it was only by happenstance that I found GitHub - deepseek-ai/3FS (a high-performance distributed file system designed to address the challenges of AI training and inference workloads), which mentions that its metadata layer is built on FoundationDB. Any other project would have had a banner at the top of the website mentioning it, with a link to a blog post on how it was used. :slight_smile:

This is one of the biggest problems I faced when adopting FoundationDB. Many features have no documentation, and some do not work because they have been superseded, or are experimental and abandoned.

I feel like the best course of action right now would be to create a separate community repository for user-created documentation and guides.

The way I process the status quo is the following:
FoundationDB is a great complex distributed DB that is used by large corporations that use it internally and allocate human capital to operate it as a platform.

These companies pick it DESPITE the horrible dev experience and operational burden.

The development of FoundationDB happens behind closed doors. The large corps that use it successfully run their own internal forks and probably have multiple teams working on different dimensions of the problem of running and operating FoundationDB deployments.

You can’t compile it without compilers and SDKs for 5+ programming languages, and there aren’t even published containers for Linux/arm64 (which are needed even for the macOS dev experience).

It is truly a PITA to operate… There isn’t any economic incentive for this to improve. The current status quo is okay-ish for the large users.

What I don’t know is why Apple continues to maintain and develop this in a semi-open format.

If there’s enough interest, I suppose it’s possible to get Snowflake, Adobe, DeepSeek, and even Apple to sponsor an independent open-source body to manage the forums, public website, documentation, etc. (and even to do small features that make life easier for first-timers: 1) sign the Mac binaries, 2) default to the right storage engine, 3) include purgeable space on Mac, etc. None of these are relevant to the established players, but all are very relevant to new folks).

That and there needs to be a single, up-to-date guide on how to scale this from your laptop to a multi-region, low-latency, planet-scale cluster :wink:

That, or FDB is forever the reason why some of us, after figuring out how to make it sing, can build companies out of it. Oh, and it also lets outsiders get a glimpse of how Apple writes software… (j/k, but if they had just one person from their docs team or marketing team help, the website and docs would be eons better…)


It is!


I think the best option as I stated is to start a community repository, to start collecting resources scattered across blog posts, forum posts, random threads and talks.

Documentation is the biggest hurdle to FoundationDB’s adoption in my opinion, from experience.

It would be best if this was put under the FoundationDB GitHub organization as it would make it easier to find, but I could start a temporary repository as well if needed.

The documentation at https://apple.github.io/foundationdb/ is built from the gh-pages branch of apple/foundationdb, which in turn is generated from the foundationdb/documentation/sphinx/source directory on main. The documentation is only updated when a new release is created, so it can be a little outdated. I suppose it’s easier to consolidate the effort by just updating the docs in that directory.


Much of the documentation is still from 7 years ago, when the FoundationDB source was first open-sourced (fun story: I had to run OSX Lion in a VM when we got the source code after FDB was acquired by Apple – it was in our contract back then – and we had absolutely no help figuring out how to compile it, so I had to figure it out myself =p). It’s obviously better now, but the issue is largely still there – the docs are targeted at someone who’s trying to build apps on top of it, but they do not necessarily help you understand why certain decisions were made, the roadmap of upcoming features (incl. where help is needed), performance test results (see CockroachDB Performance Visualization), etc.

It’s possible that most people only really want to try running the database (and not to build it), but most of the docs are indeed more akin to design docs than step-by-step guides. Then there’s the scattering of docs about the operator and the record layer, each in its own repo too. The docs are also a bit lacking in the graphics department…

I think just asking folks to update the docs isn’t going to cut it. I personally have had PRs abandoned because “that’s not a direction the team is going to take”, so any intelligent attempt to steer the docs in a particular direction will need the team’s blessing, and for that, you’ll need someone to articulate that direction clearly before people could try. It could be something like: aggressively deprecate all obsolete storage engines by this release, gRPC by another, remove fdbdr completely by another (or at least suggest three_datacenter is what you should do in 2025), etc. so people know not to even bother with improving the documentation on some things. Just imagine if someone were to suggest a deployment guide that they wrote in a couple days with fdbdr in a PR, it would probably get shot down right away: “that’s not the best approach now… go read this design doc, that’s the replacement”

Essentially, the lack of top-level, communicated direction from the team is the primary reason why people shy away from making basic documentation changes – let alone feature/bug fixes.


What is wrong with fdbdr?

I guess, with the recent “changefeed” feature it could be implemented in an application-specific way.

That feature has no documentation, and one thing I learned the hard way is that if you leave a changefeed running and don’t consume it, data distribution will grind to a halt at <1MB/s, compared to the up to 500MB/s I normally got.

It feels like:

  • Development is too closed off and external contributions/issues are often ignored.
  • There is no communication from developers, which makes it really hard to debug issues or to start contributing if you don’t know how exactly it works.
  • Many things are experimental, some features are completely abandoned (like encrypted backups), and there is no documentation anywhere about this.

I feel like it would take significant effort to get core contributors to agree on this, and it is something they are not particularly incentivized to do right now (it works for them, so who cares?)

Until there seems to be interest in having a more active FoundationDB community and making it more approachable to non-core users, I think the only way this can be solved is with separate community-based documentation.

For new FoundationDB deployments, region configuration should be your default choice, not fdbdr.

Region configuration gives you:

  • Automatic failover with zero data loss
  • 2x storage instead of 6x (old three_datacenter) or 4x (two separate clusters with fdbdr)
  • Efficient WAN usage - only replicates what’s necessary between regions
  • Single cluster to manage - much simpler operations
  • Automatic healing - when a failed datacenter comes back online (fiber cut repaired, hardware replaced), region configuration automatically resyncs without intervention
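
To make the "single cluster, two regions" idea a bit more concrete, here is a minimal sketch that builds a region layout as JSON. The datacenter ids (`dc1`, `dc2`, `dc1-sat`) and the helper name are hypothetical, and the field names follow my reading of the official configuration docs, so please verify them against the FDB version you actually run.

```python
import json


def build_region_config(primary_dc, remote_dc, satellite_dc=None):
    """Build a two-region layout as a plain dict.

    Field names ("regions", "datacenters", "id", "priority", "satellite",
    "satellite_redundancy_mode") are assumptions based on the FoundationDB
    configuration docs; check them against your version before use.
    """
    primary_dcs = [{"id": primary_dc, "priority": 1}]
    primary = {"datacenters": primary_dcs}
    if satellite_dc is not None:
        # Optional satellite datacenter attached to the primary region.
        primary_dcs.append({"id": satellite_dc, "priority": 1, "satellite": 1})
        primary["satellite_redundancy_mode"] = "one_satellite_double"
    # The remote region gets a lower priority than the primary.
    remote = {"datacenters": [{"id": remote_dc, "priority": 0}]}
    return {"regions": [primary, remote]}


if __name__ == "__main__":
    # "dc1", "dc2" and "dc1-sat" are hypothetical datacenter ids.
    config = build_region_config("dc1", "dc2", satellite_dc="dc1-sat")
    print(json.dumps(config, indent=2))
```

As far as I know, you would save the output to a file, load it in fdbcli with `fileconfigure`, and then enable the second region with `configure usable_regions=2`. Treat that sequence as an assumption to check against the docs rather than a tested runbook.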

With fdbdr, recovering from a datacenter failure requires manual work: reverse the replication direction, wait for catch-up, then switch back.

The only reasons to consider fdbdr:

  1. Version migrations - Running different FDB versions simultaneously during upgrades (like Snowflake does)
  2. Regulatory/compliance - When you legally need completely separate, isolated clusters
  3. One-way replication - When you need a read-only copy for analytics/reporting that won’t affect production

I don’t think just starting a community page is the solution - it’ll just create yet another place for people to look for information that may or may not be up to date.

The problem isn’t that we need more documentation sources. The problem is that nobody’s making decisions about what the canonical path should be. Like, why are we still documenting fdbdr prominently when region configuration has been the better choice for years? Why is changefeed in the codebase if it can silently break your cluster?

What we need is someone who can just make the call: “This is deprecated. This is experimental. This is what you should use.” No more of this “try it in your dev cluster and see” nonsense.

The companies running FDB successfully have entire teams who’ve already figured all this out. But that tribal knowledge is locked up inside those companies. If DeepSeek can build their entire AI filesystem on FDB, if Snowflake can run their metadata layer on it - clearly it works. But good luck figuring out how to replicate their success from the current docs.

I think the only real solution is getting one of the big users (Apple, Snowflake, whoever) to fund someone to just own the documentation problem for 6 months. Not to contribute to yet another community effort, but to fix the official docs once and for all.


Ah, ok. I was considering using it for a one-way mirror or creating copies.

For example, some features like encrypted backups don’t work even though the option is listed with no warnings. I feel like any experimental or to-be-deprecated feature should require a knob to be set (unless it was already in use before the deprecation).

Same thing with knobs: some people may want to increase the allowed duration of a read transaction, but under certain circumstances that can actually lead to issues in recovery.

Another thing I noticed is duplication of documentation. For example, there are four different pages that try to describe the transaction system, some in detail, some not.

It would be great to contribute to the documentation, but I currently don’t have the necessary understanding of the codebase to do that and it might just get stuck in the PR list forever.

Thanks for all the discussion and feedback above! We know there are many difficulties and complexities in understanding features and running FDB as a service. Here is what we are thinking:

  • As a first step, we’ll create documentation listing features and their status (in production, experimental, deprecated, or in need of community support).
  • For the features we are working on, we’ll communicate better with the community and write documentation on their design and operation.
  • We’ll need feedback from the community about usage of different features so that we can gradually remove unused ones.

Thanks Jingyu. We (VMware) had really good doc writers who could churn out documentation for external technical folks rapidly. I assume the same exists at Apple (or Snowflake or…), so it’s really just a matter of finding a mercenary who’s willing to do this. It’s good PR and gives a general cozy feeling for companies who do technical documentation well :slight_smile:

I also hope that the repository would be open for external documentation contributions.

I would like to contribute some things I have gotten experience with from running FoundationDB, and perhaps there can also be a page linking to external subjective content, write-ups or talks.

I also hope that the repository would be open for external documentation contributions.

It is, I believe.

I would like to contribute some things I have gotten experience with from running FoundationDB,

I’d like to help land these contributions.

Michael