(Disclaimer: This is not a well-written post, and things in here are probably described badly. My main intention is to make people aware of a new feature. Please ask questions if anything is unclear.)
At Snowflake we’re currently working on a simple authorization feature. I wanted to announce it here in case others in the community find it useful and want to start preparing to use it (or want to start testing it as soon as it is code-complete in the main branch).
Current State
Currently there is almost no security in FDB. Clients can, however, be authenticated via mTLS, which is, as of today, the only way to use TLS in FDB. The rationale is that clients are treated in a binary way: either they are trusted entities and therefore get full access to FDB, or they are not and therefore can’t establish a connection.
Small Tangent: Multi-tenancy (new 7.1 feature)
FDB 7.1 will introduce the concept of tenants. This feature is still in development, but the basic functionality is working (other things, like workload isolation, automatic movement of tenants across clusters, meta-cluster management, etc., will be added in later versions of FDB).
A tenant will provide its own transactional subspace. Running transactions that touch multiple tenants is not something we support. Instead, tenants should be thought of as independent databases. So instead of running 10 clusters for 10 applications, we can now run 1 cluster (or more, up to 10, depending on the load requirements).
Authorization Model
The feature we’re currently implementing is very simple: instead of all-or-nothing access, a client can be given access to only a limited number of tenants.
The model we’re using is the following:
Client Machine (untrusted) -> Authentication Service -> Application -> FDB
- We assume there will be some client (a machine not controlled by the organization that runs the FDB application – so this could be an iPhone App or some web browser). It will send requests to some service.
- Before anything else happens, the request will go through some authentication service.
- Then the actual application (which runs the FDB client) will receive this request. This machine will then only read/write data of a specific tenant or a small set of tenants (a tenant could be a user or a specific service – ultimately the application will decide what the meaning of tenant is).
In this new world, the application will connect to FDB using TLS, but it won’t provide a certificate (so mTLS won’t be a requirement anymore; only FDB → FDB connections will have to use mTLS). FDB will accept the connection, but it won’t allow the client to do anything useful. That means that, by default, the client won’t be able to read or write any data.
If it wants to do anything, it has to send an access token to FDB. This token will basically just contain a list of tenants the client is allowed to access (raw key access will be denied), and it has to be signed with some private key. FDB will then need to know the corresponding public key (distributing public keys to the FDB nodes will, again, be the responsibility of the user).
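To make the rules in this paragraph a bit more concrete, here is a minimal sketch in Python of the kind of check a server could apply per request. This is not FDB code and not the final API: the token layout (a JSON payload with `tenants` and `expires_at` fields), the `authorize` function, and the `verify_signature` callable are all made up for illustration. Signature verification is passed in as a function because the concrete signing scheme isn't specified in this post.

```python
import json
import time
from typing import Callable, Optional

# Hypothetical token layout: {"payload": <JSON bytes>, "signature": <bytes>}.
# The field names "tenants" and "expires_at" are assumptions, not FDB's format.


def authorize(token: Optional[dict],
              tenant: Optional[str],
              verify_signature: Callable[[bytes, bytes], bool],
              now: Optional[float] = None) -> bool:
    """Return True only if the token permits access to `tenant`."""
    now = time.time() if now is None else now
    if token is None:
        return False  # no token: the connection is accepted but can't do anything
    if tenant is None:
        return False  # raw key access (outside any tenant) is denied
    if not verify_signature(token["payload"], token["signature"]):
        return False  # not signed by a key the server knows about
    claims = json.loads(token["payload"])
    if now >= claims["expires_at"]:
        return False  # the TTL has run out
    return tenant in claims["tenants"]
```

With a token that lists only `tenant_a`, a request for `tenant_a` passes while a request for `tenant_b`, a request without a tenant, or a request after the expiry time is rejected.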
One of the difficulties in using this feature is that the user will have to figure out how to generate the tokens and deliver them to the application. Technically, FDB doesn’t do authorization; it just enforces it. The token can be generated by some service and then passed to the application.
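As a rough illustration of the token-generating side, a service outside of FDB could do something like the following. Again, this is only a sketch under assumptions: `issue_token`, the JSON payload, and the field names `tenants` and `expires_at` are hypothetical, and the `sign` callable stands in for whatever private-key signing the user's infrastructure provides. It is not the helper the FDB client library will eventually offer.

```python
import json
import time
from typing import Callable, Iterable, Optional


def issue_token(tenants: Iterable[str],
                ttl_seconds: int,
                sign: Callable[[bytes], bytes],
                now: Optional[float] = None) -> dict:
    """Build a signed token that grants access to `tenants` for `ttl_seconds`."""
    now = time.time() if now is None else now
    # Serialize deterministically so the signature always covers the same bytes.
    payload = json.dumps(
        {"tenants": sorted(tenants), "expires_at": now + ttl_seconds},
        sort_keys=True,
    ).encode("utf-8")
    # `sign` is expected to wrap the private key; the key never appears here.
    return {"payload": payload, "signature": sign(payload)}
```

The application would receive this token from the service and send it along to FDB; keeping `ttl_seconds` short limits how long a leaked token stays usable.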
Limitations
This first implementation will have some limitations:
- Apart from a TTL in the token, there’s nothing that will prevent a replay attack. So if a token is leaked, anyone who can make a connection to the FDB cluster will be able to use it until it expires. This has to be addressed later, but for now it’s the responsibility of the application to keep tokens safe.
- There’s a relatively high operational overhead: public keys need to be distributed to all FDB hosts, and key rotation has to be solved outside of FDB. The FDB client library will provide helper functions for creating tokens, but ultimately most of the burden will be with the user. We don’t plan to change this. We expect most FDB users will already have some infrastructure for these kinds of operations.
- Authorization is still very coarse: instead of being granted for a whole cluster, access will simply be granted on a tenant level. We don’t yet have plans to change this.
Timeline
We’re planning to release this feature in FDB 7.2, which we hope to release in fall 2022. The code is currently being written and tested. As soon as the APIs are finalized, we will write some documentation so people can test this feature on a prerelease version.