One fdb per account/store?

Hypothetically, say you are building a Shopify/WordPress clone. Instead of an “eBay-style search over all items”, there is a concept of a “store”: each store has its own list of items, there are very few interactions between stores, and there is no way to search multiple stores at once.

In such a hypothetical world, would it make sense to have one fdb per user/store? If so, is there an easy way to do this? One sqlite per user/store would be trivial, but fdb feels a bit “heavier” to run one per user/store.

The motivation behind the split is: there’s never going to be a transaction hitting multiple stores at once, so we might as well split it. This also ensures that high activity on one store does not degrade the performance of other stores.

Thanks!

I would use a single cluster. FoundationDB is meant to be scaled up. I would have a separate directory for each user/store and build an API in front of FDB which controls how the data is accessed.
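To make the "one directory per store behind an API" idea concrete, here is a minimal sketch. It does not talk to FoundationDB at all: a plain dict stands in for the shared cluster, and `StoreKeyspace` is a hypothetical name for the wrapper that prefixes every key with the store it belongs to. In a real deployment the FDB directory layer would presumably play the role of the prefix.

```python
from typing import Dict, Iterator, Optional, Tuple

Key = Tuple[str, ...]


class StoreKeyspace:
    """Hypothetical per-store view over one shared keyspace.

    The dict passed in is an in-memory stand-in for a single shared cluster.
    """

    def __init__(self, backing: Dict[Key, bytes], store_id: str) -> None:
        self._backing = backing
        self._prefix: Key = ("stores", store_id)

    def put(self, key: Key, value: bytes) -> None:
        # Every write lands under this store's prefix.
        self._backing[self._prefix + key] = value

    def get(self, key: Key) -> Optional[bytes]:
        return self._backing.get(self._prefix + key)

    def items(self) -> Iterator[Tuple[Key, bytes]]:
        # Range read restricted to this store's prefix, in key order.
        n = len(self._prefix)
        for full_key in sorted(self._backing):
            if full_key[:n] == self._prefix:
                yield full_key[n:], self._backing[full_key]


shared: Dict[Key, bytes] = {}
pizza = StoreKeyspace(shared, "pizza")
burger = StoreKeyspace(shared, "hamburger")
pizza.put(("items", "margherita"), b"9.50")
burger.put(("items", "cheeseburger"), b"7.00")
```

Both stores live in the same backing keyspace, but code that only holds `pizza` has no way to name a key under the hamburger prefix.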

I don’t see a reason why you’d come to this conclusion. Can you explain your reasoning?

Unfortunately, I only have intuition, not an actual argument.

Suppose you have a Pizza store and I have a Hamburger store. It is not obvious to me at all why your store and my store should hit the same fdb cluster. In particular, if we have different clusters, your fdb cluster could be in Texas, while my fdb cluster could be in CA. It seems having both our stores connect to the same fdb is an “unnecessary coupling/constraint”.

Yeah, fair. It’s good to keep things decoupled. At my job we use a single FDB cluster for multiple applications so that we don’t need to manage multiple deployments. That’s the trade-off we chose.

What you want is multi-tenancy. Sadly, FDB doesn’t have great native support for this.

The reasons you probably don’t want one cluster per customer are cost and operational complexity: a reasonably robust and highly available FDB cluster needs 3-9 machines (depending on how reliable you want it to be). This can be very costly if you’re dealing with many very small databases. You can, of course, run two FDB clusters on the same set of machines, but this will create operational headaches.

What you probably want is something like a logical database and some control layer above FDB which decides which database runs on which physical FDB cluster. Now if you have two customers in Texas, they can both share the same hardware. You could even move a database from one cluster to another. This is a bit tricky to implement without FDB support, but it’s not impossible. I believe Apple’s Record Layer has some features which would help with that.
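A rough sketch of what such a control layer's bookkeeping could look like, assuming nothing about FDB itself: `Placement`, `assign`, `move`, and the cluster names are all made up for illustration, and the actual data copy during a move (the tricky part) is deliberately left out.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Placement:
    """Hypothetical control-layer table: logical database -> physical cluster."""

    clusters: Dict[str, str]  # cluster name -> region
    assignments: Dict[str, str] = field(default_factory=dict)  # database -> cluster

    def _load(self, cluster: str) -> int:
        # Number of logical databases currently placed on this cluster.
        return sum(1 for c in self.assignments.values() if c == cluster)

    def assign(self, database: str, region: str) -> str:
        """Place a logical database on the least-loaded cluster in a region."""
        candidates = [c for c, r in self.clusters.items() if r == region]
        if not candidates:
            raise LookupError(f"no cluster in region {region!r}")
        chosen = min(candidates, key=lambda c: (self._load(c), c))
        self.assignments[database] = chosen
        return chosen

    def move(self, database: str, target: str) -> None:
        """Repoint a database to another cluster (copying the data is not shown)."""
        if target not in self.clusters:
            raise LookupError(f"unknown cluster {target!r}")
        if database not in self.assignments:
            raise KeyError(database)
        self.assignments[database] = target

    def cluster_for(self, database: str) -> str:
        return self.assignments[database]


placement = Placement(clusters={"texas-1": "texas", "california-1": "california"})
placement.assign("pizza-store", "texas")
placement.assign("hamburger-store", "texas")

# Later: a second Texas cluster is added and one customer is moved to it.
placement.clusters["texas-2"] = "texas"
placement.move("hamburger-store", "texas-2")
```

Clients would ask the control layer which cluster to connect to, so the two Texas customers share hardware until one of them is moved.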

Another reason is scaling. Say you have a cluster in Paris. Your product becomes very popular in France and your Paris cluster starts hosting hundreds of databases. Initially this won’t be an issue because you can simply grow your cluster. But at some point you will want more isolation and better control over cost (two small clusters are often cheaper than one large cluster). So you spin up a second cluster in Paris and move some of your customers there.

One challenge of that approach is security: you wouldn’t want to give your customers direct access to the database, as this would now allow them to access data of other customers. Instead you’d build a proxy which now needs to do authentication and authorization.
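For illustration, a proxy along those lines might look roughly like the sketch below. Everything here is an assumption: `TenantProxy`, the token-to-store table, and the in-memory dict standing in for the database are hypothetical, and a real proxy would use a proper credential system rather than static tokens.

```python
import hmac
from typing import Dict, Optional, Tuple


class TenantProxy:
    """Hypothetical proxy: authenticates a caller, then authorizes per store."""

    def __init__(self, tokens: Dict[str, str]) -> None:
        self._tokens = tokens  # token -> store id the token belongs to
        self._data: Dict[Tuple[str, str], bytes] = {}  # stand-in for the database

    def _authenticate(self, token: str) -> str:
        # Constant-time comparison so timing does not leak token contents.
        for known, store_id in self._tokens.items():
            if hmac.compare_digest(known.encode(), token.encode()):
                return store_id
        raise PermissionError("unknown token")

    def _authorize(self, token: str, store_id: str) -> None:
        caller = self._authenticate(token)
        if caller != store_id:
            raise PermissionError(f"{caller!r} may not access {store_id!r}")

    def put(self, token: str, store_id: str, key: str, value: bytes) -> None:
        self._authorize(token, store_id)
        self._data[(store_id, key)] = value

    def get(self, token: str, store_id: str, key: str) -> Optional[bytes]:
        self._authorize(token, store_id)
        return self._data.get((store_id, key))


proxy = TenantProxy({"pizza-secret": "pizza", "burger-secret": "hamburger"})
proxy.put("pizza-secret", "pizza", "menu", b"margherita")
```

Customers only ever hold a token for the proxy, never a connection to the database, so a request for another store's data is rejected before any read happens.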