Setting up remote access to FDB Server hosted on EC2, with AWS Private IP Addresses

I am trying to set up FoundationDB on an AWS EC2 server (Amazon Linux 2, t3a.medium). I can get the database running and accessible locally. However, I’m trying to make it accessible from a different webserver, and that is not working.

For the server running fdb (not the webserver), my publicly accessible IP address is 3.18.104.196, but AWS uses private IP addresses, so if I for example run make_public.py, the “public” IP address is 172.31.38.132. I’ve tried different combinations of those two addresses in the command to run fdbserver and in the cluster file used to run the client, but I can’t get remote access to work.

If I do:
sudo fdbserver -p 172.31.38.132:4500
and fdbcli with a fdb.cluster of: abc:abc@172.31.38.132:4500 from the server running fdbserver
then fdbcli works and can read and write to the database

but fdbcli with a fdb.cluster of: abc:abc@172.31.38.132:4500 from a different webserver gets:

The database is unavailable; type `status' for more information.
Could not communicate with a quorum of coordination servers:
172.31.38.132:4500 (unreachable)

This makes sense, but I wanted to include it for completeness and to show that the database does work with local access.

It seems like I should be able to use the -l and -p arguments to make it work, but that hasn’t fixed it.
When I do:
sudo fdbserver -l 172.31.38.132:4500 -p 3.18.104.196:4500
and fdbcli with a fdb.cluster of: abc:abc@172.31.38.132:4500 from the server running fdbserver
I get

Unable to communicate with the cluster controller at 3.18.104.196:4500 to get
status.
Coordination servers:
172.31.38.132:4500 (reachable)

-------

When I do:
sudo fdbserver -l 172.31.38.132:4500 -p 3.18.104.196:4500
and fdbcli with a fdb.cluster of: abc:abc@3.18.104.196:4500 from the server running fdbserver
I get

Could not communicate with a quorum of coordination servers:
3.18.104.196:4500 (unreachable)

-------

When I do:
(while the fdb server is running with: sudo fdbserver -l 172.31.38.132:4500 -p 3.18.104.196:4500)

running fdbcli with a fdb.cluster of: abc:abc@172.31.38.132:4500 from the webserver
I get

Could not communicate with a quorum of coordination servers:
172.31.38.132:4500 (unreachable)

running fdbcli with a fdb.cluster of: abc:abc@3.18.104.196:4500 from the webserver
I get

Could not communicate with a quorum of coordination servers:
3.18.104.196:4500 (unreachable)

Any help I can get in running fdb in a way that I can connect to it would be appreciated. Or at least some tips for where to look into what’s going wrong.

FDB is overall not good with NAT rewriting IPs. We store IPs into the database, notably as a way for clients to discover and connect to storage servers, so if FDB thinks the IP for a storage server is 10.0.0.1, then a client needs to also be able to connect to 10.0.0.1. It sounds like either way you do this, either FDB won’t be able to connect to FDB (if you use public IPs), or clients won’t be able to connect to FDB (if you use internal IPs).

If there’s any way to get your webserver to be able to have an interface on both public and private networks, then that would be one solution. If I were to be in this situation, I’d probably just set up nginx as a load balancer on your public side, to forward requests into webservers running in your private network.

And just in case it is of concern, you shouldn’t expose your FDB to the public internet.

Ok, thanks for the info and sorry if these questions are too basic.

How would I set up the fdb server and webserver in the same private network? They already are in the same VPC and the same subnet.

I had understood from your description that your setup looked something like

If you instead have your webservers and your fdbservers within the same private subnet, then you should be able to just use the 172.* IP addresses on both sides. Start fdbserver with -p 172.*, and have 172.* in the cluster file that you give to clients. The fact that this didn’t work for you is why I assumed that your webservers were in a different private network than your database.
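As a quick sanity check on the advice above, here is a minimal sketch that parses a cluster file string and flags any coordinator address that is not in a private range. The helper names (`parse_cluster_file`, `coordinators_not_private`) are hypothetical and not part of FoundationDB, and the parsing assumes the simple `description:id@ip:port` form with plain IPv4 addresses used in this thread (no `:tls` suffix).

```python
import ipaddress


def parse_cluster_file(contents):
    """Split 'description:id@ip:port[,ip:port...]' into its parts.

    Assumes plain IPv4 coordinators without a ':tls' suffix.
    """
    prefix, _, addresses = contents.strip().partition("@")
    description, _, cluster_id = prefix.partition(":")
    coordinators = []
    for entry in addresses.split(","):
        host, _, port = entry.strip().rpartition(":")
        coordinators.append((host, int(port)))
    return description, cluster_id, coordinators


def coordinators_not_private(contents):
    """Return the coordinators whose IP is outside the private ranges."""
    _, _, coordinators = parse_cluster_file(contents)
    return [
        (host, port)
        for host, port in coordinators
        if not ipaddress.ip_address(host).is_private
    ]
```

With the addresses from this thread, `coordinators_not_private("abc:abc@172.31.38.132:4500")` returns an empty list, while the same call with `3.18.104.196` returns that coordinator, which is a hint that the cluster file points at the NAT-translated public address.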

Before today I didn’t realize they were on the same subnet, but they in fact are. The subnet ID for both servers is the same.

I tried it with 127.0.0.1 in the cluster files. It works running fdbcli on the server running fdb, but not the other server in the subnet. That one shows 127.0.0.1:4500 (unreachable)

Sorry, that was bad muscle memory invocation. Let me go edit my post, but I meant the 172.* private subnet instead of 127.0.0.1 localhost. Localhost definitely wouldn’t work across hosts, being localhost and all :sweat_smile:

Which I’ll point out that you did already do:

The fact that your other webserver thinks 172.31.38.132:4500 is unreachable is why I suspect that you either have servers on different private networks, or some weird firewall rules that prevent some set of hosts from talking to other hosts.

It might be easier to remove FDB from this, and just test your network connections using netcat first. If you have netcat listen on one webserver on 172.*, and try to connect to it from a different host, you should have the same issue of the connection not working. It’s sounding like this is a networking/how to use Amazon VPCs problem, and not specifically an FDB problem.
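If netcat is not installed on the client, the same check can be sketched with Python's standard `socket` module. This is only an illustration of the test described above, not an FDB tool; the function name `can_connect` and the 3-second default timeout are assumptions.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened.

    A refused connection, a timeout, or an unroutable address all
    surface as OSError and are reported as False.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Running `can_connect("172.31.38.132", 4500)` on the webserver while something is listening on that port on the database host should return True. If it returns False, the problem is in the network path (routing, ACLs, security groups, host firewall) rather than in FDB.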

Ok, it’s working! Thanks so much for your help. Trying to get netcat working between the two servers was a great tip.

A bunch of things were set up correctly, such as:
The two EC2 instances are in the same VPC
The VPC route table was correctly routing local traffic locally
The two instances were in the same subnet
The subnet Network ACL was allowing inbound and outbound access
There were no firewalls on the host or client server blocking access

The issue was with security groups:
The two instances were in different security groups, and the security group of the fdb server needed to allow inbound access from the security group of the client.
To do so, in the AWS console, you can go to the VPC Dashboard and select Security Groups (next to Network ACLs in the Security header).
Choose the security group for the fdb server and add an Inbound Rule. I did Custom TCP Rule for port 4500, with the source being the security group that my webserver is a part of.
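For anyone who prefers to script this rather than use the console, the rule could be sketched with boto3 roughly as follows. This is an assumption-laden sketch, not what was done in this thread: it needs the third-party `boto3` package and AWS credentials, the security group IDs are placeholders you would supply yourself, and the helper names are hypothetical.

```python
def build_ingress_rule(source_group_id, port=4500):
    """Describe a TCP rule on one port whose source is another security group."""
    return {
        "IpProtocol": "tcp",
        "FromPort": port,
        "ToPort": port,
        # The source is a security group rather than a CIDR range.
        "UserIdGroupPairs": [{"GroupId": source_group_id}],
    }


def allow_fdb_from_group(fdb_group_id, source_group_id, port=4500):
    """Add the inbound rule to the fdb server's security group.

    boto3 is imported lazily so the rule builder above can be used
    (and inspected) without the package or credentials available.
    """
    import boto3  # third-party dependency, assumed to be installed

    ec2 = boto3.client("ec2")
    return ec2.authorize_security_group_ingress(
        GroupId=fdb_group_id,
        IpPermissions=[build_ingress_rule(source_group_id, port)],
    )
```

Here `fdb_group_id` would be the security group of the fdb server and `source_group_id` the group the webserver belongs to, matching the Custom TCP Rule for port 4500 described above.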

Once I did that netcat worked between the two (nc -l 4500 on fdb server and nc -n 172.31.38.132 4500 on webserver, then typing something and hitting enter on webserver shows up on fdb server). And then fdbcli with cluster file of abc:abc@172.31.38.132:4500 on both fdb server and webserver made it work.
