Core dump when calling fdb_database_create_transaction in an atexit function

The problem: :upside_down_face:
We call fdb_database_create_transaction from a function registered with atexit, so the transaction is created while the process is exiting.
After creating the transaction, we destroy the database and stop the network.

But the process core dumps when fdb_database_create_transaction executes.

(gdb) bt
#0  0x0000000000000000 in ?? ()
#1  0x00007ffff7873610 in addref<IRandom> (ptr=0x5555555747c0) at /home/gpadmin/pie-db/fdb-dpkg/foundationdb/flow/FastRef.h:88
#2  Reference<IRandom>::Reference (r=..., this=0x7fffffffda50) at /home/gpadmin/pie-db/fdb-dpkg/foundationdb/flow/FastRef.h:109
#3  deterministicRandom () at /home/gpadmin/pie-db/fdb-dpkg/foundationdb/flow/flow.cpp:110
#4  0x00007ffff6fbf879 in generateSpanID (transactionTracingSample=<optimized out>, parentContext=...) at /home/gpadmin/pie-db/fdb-dpkg/foundationdb/fdbclient/NativeAPI.actor.cpp:3132
#5  0x00007ffff7013d02 in Transaction::Transaction (this=0x5555555c4a18) at /home/gpadmin/pie-db/fdb-dpkg/foundationdb/flow/IRandom.h:69
#6  0x00007ffff71ae7b5 in ReadYourWritesTransaction::ReadYourWritesTransaction (this=0x5555555c4a00) at /home/gpadmin/pie-db/fdb-dpkg/foundationdb/flow/Error.h:59
#7  0x00007ffff7640695 in ISingleThreadTransaction::allocateOnForeignThread (type=type@entry=ISingleThreadTransaction::Type::RYW) at /home/gpadmin/pie-db/fdb-dpkg/foundationdb/flow/FastAlloc.h:211
#8  0x00007ffff7611c37 in ThreadSafeTransaction::ThreadSafeTransaction (this=0x5555555c27d0, cx=0x5555555c6030, type=ISingleThreadTransaction::Type::RYW, tenant=...) at /home/gpadmin/pie-db/fdb-dpkg/foundationdb/fdbclient/ThreadSafeTransaction.cpp:190
#9  0x00007ffff7611f52 in ThreadSafeDatabase::createTransaction (this=0x5555555c2dd0) at /usr/include/c++/9/optional:688
#10 0x00007ffff6f3df11 in fdb_database_create_transaction (d=<optimized out>, out_transaction=0x555555558038 <tr>) at /home/gpadmin/pie-db/fdb-dpkg/foundationdb/bindings/c/fdb_c.cpp:406
#11 0x000055555555566c in atexit_callback () at reproduce_problem.c:66
#12 0x00007ffff6a4a8a7 in __run_exit_handlers (status=0, listp=0x7ffff6bf0718 <__exit_funcs>, run_list_atexit=run_list_atexit@entry=true, run_dtors=run_dtors@entry=true) at exit.c:108
#13 0x00007ffff6a4aa60 in __GI_exit (status=<optimized out>) at exit.c:139
#14 0x00005555555558fd in main () at reproduce_problem.c:132

Environment:

os : 5.13.0-51-generic #58~20.04.1-Ubuntu
fdb-version : 7.1.9

Test code:

// gcc reproduce_problem.c -lfdb_c -lpthread -I /usr/include/foundationdb/ -ggdb3 -O0 -g3 -o reproduce_problem
#define FDB_API_VERSION 710

#include <string.h>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

#include "fdb_c.h"

FDBTenant *fdb_tenant = NULL;
FDBTransaction *tr = NULL;
FDBDatabase *db = NULL;
pthread_t netThread;
// bool fdb_create_network = false;

static void checkError(fdb_error_t errorNum) {
  if (errorNum) {
    fprintf(stderr, "Error (%d): %s\n", errorNum, fdb_get_error(errorNum));
    exit(errorNum);
  }
}

static void waitAndCheckError(FDBFuture *future) {
  checkError(fdb_future_block_until_ready(future));
  if (fdb_future_get_error(future) != 0) {
    checkError(fdb_future_get_error(future));
  }
}

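/* Thread entry point; uses the signature pthread_create expects. */
static void *runNetwork(void *arg) {
  (void)arg;
  checkError(fdb_run_network());
  return NULL;
}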

void createData(FDBDatabase *db) {
  int committed = 0;
  /*  Create transaction. */
  checkError(fdb_database_create_transaction(db, &tr));

  while (!committed) {
    /* Create data */
    char *key1 = "Test Key1";
    char *val1 = "Test Value1";
    fdb_transaction_set(tr, key1, (int)strlen(key1), val1, (int)strlen(val1));

    /* Commit to database.*/
    printf("committing key/value.\n");
    FDBFuture *commitFuture = fdb_transaction_commit(tr);
    checkError(fdb_future_block_until_ready(commitFuture));
    if (fdb_future_get_error(commitFuture) != 0) {
      waitAndCheckError(
          fdb_transaction_on_error(tr, fdb_future_get_error(commitFuture)));
    } else {
      committed = 1;
    }
    fdb_future_destroy(commitFuture);
  }
  /* Destroy transaction. */
  fdb_transaction_destroy(tr);
}

/* In our application, this function cleans up buffered data:
 * it commits the buffered data to FDB when the process exits.
 */
static void atexit_callback(void) {
  /* Create transaction and do nothing.*/
  puts("exit callback.\n");
  // checkError(fdb_tenant_create_transaction(fdb_tenant, &tr));
  checkError(fdb_database_create_transaction(db, &tr));
  if (tr) {
    puts("before destroy transaction.\n");
    fdb_transaction_destroy(tr);
  }
  fdb_database_destroy(db);
  db = NULL;

  //pthread_join(netThread, NULL);
  checkError(fdb_stop_network());
}

void createTenant(FDBDatabase *db) {
  char *tenant_name = "example";
  checkError(fdb_database_open_tenant(db, (uint8_t const *)tenant_name,
                                      strlen(tenant_name), &fdb_tenant));
}

void readData(FDBDatabase *db) {
  FDBTransaction *tr;
  checkError(fdb_database_create_transaction(db, &tr));
  char *key = "Test Key1";
  FDBFuture *getFuture = fdb_transaction_get(tr, key, (int)strlen(key), 0);
  waitAndCheckError(getFuture);

  fdb_bool_t valuePresent;
  const uint8_t *value;
  int valueLength;
  checkError(
      fdb_future_get_value(getFuture, &valuePresent, &value, &valueLength));

  printf("Got Value : %s: '%.*s'\n", key, valueLength, value);
  fdb_transaction_destroy(tr);
  fdb_future_destroy(getFuture);
}

int main() {
  puts("Starting FoundationDB.");

  char *cluster_file = "/etc/foundationdb/fdb.cluster";

  /* Setup network. */
  checkError(fdb_select_api_version(FDB_API_VERSION));
  checkError(fdb_setup_network());
  puts("Got network");

  /* Run network. */
  pthread_create(&netThread, NULL, (void *)runNetwork, NULL);

  /* Run database. */
  checkError(fdb_create_database(cluster_file, &db));

  puts("Got database");

  /*Create tenant and do nothing.*/
  createTenant(db);
  createData(db);
  readData(db);

  puts("Program done. Now exiting...");
  int res = atexit(atexit_callback);
  if (res) {
    printf("Set exit function failed.\n");
    exit(EXIT_FAILURE);
  }

  exit(0);
}

Could anyone please take a quick look at this problem?
The failing call is made inside the atexit callback, but that callback should run normally when the C library's exit() is called. At that point the process's memory, task struct, and so on have not been destroyed yet; that only happens later, in do_exit in the kernel.

So I think the callback registered with atexit runs at the correct point, and something unexpected is happening inside FoundationDB.

I think this may be the result of the FDB code you are calling relying on thread-local state that is destroyed by the exit call before the atexit handler runs. Rather than calling exit in this way, would it be feasible to call the logic in your handler directly and terminate from there?

@ZhangHuiGui did you solve this? I am experiencing a segfault from fdb_database_create_transaction() also.

Yes. Our problem was that we shut down FDB inside the exit callback and also did the FDB commit in that callback.
After we moved the FDB shutdown logic out of the exit callback, everything works normally.
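
For anyone landing here later, the restructuring described above might look roughly like the sketch below. It is only an illustration of the ordering, not FDB code: step_log, record, client_shutdown and the four step functions are hypothetical names, and each step is a stub that logs its name. The comments show which call from the test code each stub stands in for. The pthread_join after fdb_stop_network is an assumption based on the commented-out line in the test code.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-ins for the FDB calls in the test code. Each stub only
 * appends its name to step_log so the order of the steps can be inspected. */
char step_log[128];
int shut_down = 0;

static void record(const char *step) {
  if (step_log[0] != '\0')
    strncat(step_log, ",", sizeof(step_log) - strlen(step_log) - 1);
  strncat(step_log, step, sizeof(step_log) - strlen(step_log) - 1);
}

/* Create a transaction, commit the buffered data, destroy the transaction. */
void flush_buffered_data(void) { record("flush"); }
/* Stands in for fdb_database_destroy(db). */
void destroy_database(void) { record("destroy_db"); }
/* Stands in for fdb_stop_network(). */
void stop_network(void) { record("stop_network"); }
/* Stands in for pthread_join(netThread, NULL). */
void join_network_thread(void) { record("join"); }

/* Explicit shutdown, meant to be called from main() while the process is
 * still fully alive, instead of from a handler registered with atexit(). */
void client_shutdown(void) {
  if (shut_down)
    return; /* already shut down; a second call does nothing */
  shut_down = 1;
  flush_buffered_data();
  destroy_database();
  stop_network();
  join_network_thread();
}

/* Usage at the end of main():
 *   client_shutdown();
 *   return 0;
 */
```

The flag makes the function safe to call from more than one exit path. The key point is that every FDB call happens before exit() starts running, so nothing in the handler list touches the client afterwards.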

Yes @ZhangHuiGui I discovered the same. Mine works now as well. This is indeed not an issue with FoundationDB. A C API example would be nice to have in the official documentation, though!