Golang fdb.MustOpenDefault() does not fail when fdb.cluster content points to invalid host

I was attempting a simple lookup and set, but my code was hanging and I could not figure out why. It turns out fdb.cluster was pointed at the wrong IP address.

I’d expect fdb.MustOpenDefault() to fail if it could not connect to the FoundationDB server. However, it does not, and tr.Get() just hangs in this instance. Nothing is running on port 4500 on the machine.

root@947ede3bb5eb:fdbtest# cat fdb.cluster && echo ""
fdb:fdb@127.0.0.1:4500
root@947ede3bb5eb:fdbtest# go run main.go
2018/09/18 20:32:51 [MAIN] Connected to fdb
2018/09/18 20:32:51 transaction started
^Csignal: interrupt
root@947ede3bb5eb:fdbtest# echo 'fdb:fdb@172.21.0.8:4500' > fdb.cluster
root@947ede3bb5eb:fdbtest# cat fdb.cluster && echo ""
fdb:fdb@172.21.0.8:4500
root@947ede3bb5eb:fdbtest# go run main.go
2018/09/18 20:33:29 [MAIN] Connected to fdb
2018/09/18 20:33:29 transaction started
2018/09/18 20:33:29 key does not exist
2018/09/18 20:33:29 key set
root@947ede3bb5eb:fdbtest# telnet 127.0.0.1 4500
Trying 127.0.0.1...
telnet: Unable to connect to remote host: Connection refused
root@947ede3bb5eb:fdbtest# netstat -tulpn
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp        0      0 127.0.0.11:44487        0.0.0.0:*               LISTEN      -
udp        0      0 127.0.0.11:38569        0.0.0.0:*                           -

Source:

package main

import (
    "github.com/apple/foundationdb/bindings/go/src/fdb"
    "log"
)

func main() {
    fdb.MustAPIVersion(200)
    db := fdb.MustOpenDefault()
    log.Printf("[MAIN] Connected to fdb")

    _, err := db.Transact(func(tr fdb.Transaction) (interface{}, error) {
         log.Printf("transaction started")
         key := "hello"
         if tr.Get(fdb.Key(key)).MustGet() != nil {
             log.Printf("got key")
             return true, nil
         }

         log.Printf("key does not exist")

         tr.Set(fdb.Key(key), []byte{})
         log.Printf("key set")
         return false, nil
    })

    if err != nil {
         log.Fatalln(err)
    }
}

Why wouldn’t this fail if it cannot connect to FoundationDB?

This is intended behavior, though it’s not well documented here. We chose not to treat an inability to communicate with the database as a failure in general, so “opening” the database on the client does not depend on the database being up.

The reason we don’t consider this to be a failure is to help manage cases where the reasons that a client can’t connect to a cluster are temporary. For example, maybe the cluster you are trying to connect to does exist but a temporary network issue is preventing you from reaching it. Or perhaps the process is being restarted and will be back shortly. There are a wide variety of cases where you may be unable to talk to the cluster, and it’s in general difficult to determine which of these cases are permanent.

Our philosophy has been to treat all such situations where you can’t communicate with the cluster as potentially temporary, having the client continue to attempt a connection indefinitely. If you want to manage how long clients will try to perform operations before giving up, then you can do so explicitly via transaction timeouts (https://godoc.org/github.com/apple/foundationdb/bindings/go/src/fdb#TransactionOptions.SetTimeout).

One other side note: I notice your code is using API version 200. I can see that we’re also using API version 200 in our example in the documentation, which I intend to fix, but if you are writing a new program and don’t have a specific reason to use an old API version, I’d recommend using the latest version supported by your bindings (520 in the 5.2 bindings).

Thanks @ajbeamon. I’ll set a transaction timeout and try to get a known key as a check for the ability to reach the database when my application starts up. For me, the app doesn’t work if the database isn’t immediately accessible, so I just needed a solution for that.

Yeah, on the 200, I just wanted the code to be as close to the examples on the site as possible; I’m using 520 in my production code.

Good to know that the default timeout is infinite. I’d suggest changing the default to some smaller value, since almost no one would want to wait indefinitely for a transaction to complete in the real world. Alternatively, the examples could be updated to set a timeout, as not everyone will read the full godoc before trying some simple use cases to see if FoundationDB is in the right ballpark for their requirements.

Thanks for the thoughtful response!

I meant to reply to this, but it seems it fell off my radar. There’s been other discussion about timeouts on this thread, so if you’re interested you could look through the various thoughts expressed there and weigh in if there’s anything you want to add. I do think that illustrating the usage of timeouts better is a good idea, so I raised an issue for it: “Consider setting timeouts in example code” (apple/foundationdb issue #882 on GitHub).

Thanks for the issue and pointing me to that thread. No worries.