I suspect the issue has something to do with clear_range. The database is flushed before every test. A reduced sample of the test code:
def flushdb do
  t = new_transaction()
  :ok = Transaction.clear_range(t, "", <<0xFF>>)
  Transaction.commit(t)
end

setup do
  flushdb()
end

test "resource early garbage collection" do
  parent = self()
  # A temp process is used to trigger garbage collection of cluster
  # & database.
  spawn_link(fn ->
    send(parent, new_transaction())
  end)

  receive do
    t ->
      :ok = Transaction.set_option(t, FDB.Option.transaction_option_access_system_keys())
      assert Transaction.get(t, "\xff\xff/status/json")
      assert Transaction.get(t, "\xff\xff/cluster_file_path")
  end
end
==2642== Memcheck, a memory error detector
==2642== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
==2642== Using Valgrind-3.10.1 and LibVEX; rerun with -h for copyright info
==2642== Command: /otp_src_20.3/bin/x86_64-unknown-linux-gnu/beam.valgrind.smp -S4:4 -- -root /otp_src_20.3 -progname /otp_src_20.3/bin/cerl\ -valgrind -- -home /root -- -pa /elixir/bin/../lib/eex/ebin /elixir/bin/../lib/elixir/ebin /elixir/bin/../lib/ex_unit/ebin /elixir/bin/../lib/iex/ebin /elixir/bin/../lib/logger/ebin /elixir/bin/../lib/mix/ebin -noshell -s elixir start_cli -extra /elixir/bin/mix test test/fdb_stress_test.exs:52
==2642==
==2642== Warning: set address range perms: large range [0x3a056000, 0x7a056000) (noaccess)
make: Nothing to be done for 'all'.
Including tags: [line: "52"]
Excluding tags: [:test]
==2642== Thread 33 fdb:
==2642== Invalid read of size 4
==2642== at 0xC04AE97: addref (FastRef.h:71)
==2642== by 0xC04AE97: operator() (ThreadSafeTransaction.actor.cpp:120)
==2642== by 0xC04AE97: a_body1cont1 (ThreadHelper.actor.h:574)
==2642== by 0xC04AE97: a_body1when1 (ThreadHelper.actor.g.h:896)
==2642== by 0xC04AE97: a_callback_fire (ThreadHelper.actor.g.h:910)
==2642== by 0xC04AE97: ActorCallback<(anonymous namespace)::DoOnMainThreadVoidActor<ThreadSafeTransaction::ThreadSafeTransaction(ThreadSafeDatabase*)::{lambda()#1}>, 0, Void>::fire((anonymous namespace)::DoOnMainThreadVoidActor<ThreadSafeTransaction::ThreadSafeTransaction(ThreadSafeDatabase*)::{lambda()#1}> const&) (flow.h:928)
==2642== by 0xBFD7007: void SAV<Void>::send<Void>(Void&&) (flow.h:382)
==2642== by 0xC1E6662: send<Void> (flow.h:708)
==2642== by 0xC1E6662: operator() (Net2.actor.cpp:473)
==2642== by 0xC1E6662: N2::Net2::run() (Net2.actor.cpp:628)
==2642== by 0xBF43627: runNetwork() (NativeAPI.actor.cpp:863)
==2642== by 0xBF1731B: MultiVersionApi::runNetwork() (MultiVersionTransaction.actor.cpp:1197)
==2642== by 0xBEF5B98: fdb_run_network (fdb_c.cpp:119)
==2642== by 0xB941CAA: run_network_wrapper (fdb_nif.c:115)
==2642== by 0x5E16E1: erl_drv_thread_wrapper (erl_drv_thread.c:122)
==2642== by 0x6BFA99: thr_wrapper (ethread.c:118)
==2642== by 0x578F6A9: start_thread (pthread_create.c:333)
==2642== Address 0x95b5dc0 is 0 bytes inside a block of size 968 free'd
==2642== at 0x4C2CE10: free (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==2642== by 0xC04A377: delref (FastRef.h:74)
==2642== by 0xC04A377: operator() (ThreadSafeTransaction.actor.cpp:106)
==2642== by 0xC04A377: a_body1cont1 (ThreadHelper.actor.h:574)
==2642== by 0xC04A377: a_body1when1 (ThreadHelper.actor.g.h:896)
==2642== by 0xC04A377: a_callback_fire (ThreadHelper.actor.g.h:910)
==2642== by 0xC04A377: ActorCallback<(anonymous namespace)::DoOnMainThreadVoidActor<ThreadSafeDatabase::~ThreadSafeDatabase()::{lambda()#1}>, 0, Void>::fire((anonymous namespace)::DoOnMainThreadVoidActor<ThreadSafeDatabase::~ThreadSafeDatabase()::{lambda()#1}> const&) (flow.h:928)
==2642== by 0xBFD7007: void SAV<Void>::send<Void>(Void&&) (flow.h:382)
==2642== by 0xC1E6662: send<Void> (flow.h:708)
==2642== by 0xC1E6662: operator() (Net2.actor.cpp:473)
==2642== by 0xC1E6662: N2::Net2::run() (Net2.actor.cpp:628)
==2642== by 0xBF43627: runNetwork() (NativeAPI.actor.cpp:863)
==2642== by 0xBF1731B: MultiVersionApi::runNetwork() (MultiVersionTransaction.actor.cpp:1197)
==2642== by 0xBEF5B98: fdb_run_network (fdb_c.cpp:119)
==2642== by 0xB941CAA: run_network_wrapper (fdb_nif.c:115)
==2642== by 0x5E16E1: erl_drv_thread_wrapper (erl_drv_thread.c:122)
==2642== by 0x6BFA99: thr_wrapper (ethread.c:118)
==2642== by 0x578F6A9: start_thread (pthread_create.c:333)
==2642==
The full log can be found here: https://gist.github.com/ananthakumaran/9b8a9f525c522ba5b3f08543837294db
The error goes away when:
- the flushdb call before the test is removed; all the tests then run without errors from valgrind.
- I try to trace the exact call sequence. I guess the added fprintf calls cause enough delay to mask the issue:
==2779== Memcheck, a memory error detector
==2779== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
==2779== Using Valgrind-3.10.1 and LibVEX; rerun with -h for copyright info
==2779== Command: /otp_src_20.3/bin/x86_64-unknown-linux-gnu/beam.valgrind.smp -S4:4 -- -root /otp_src_20.3 -progname /otp_src_20.3/bin/cerl\ -valgrind -- -home /root -- -pa /elixir/bin/../lib/eex/ebin /elixir/bin/../lib/elixir/ebin /elixir/bin/../lib/ex_unit/ebin /elixir/bin/../lib/iex/ebin /elixir/bin/../lib/logger/ebin /elixir/bin/../lib/mix/ebin -noshell -s elixir start_cli -extra /elixir/bin/mix test test/fdb_stress_test.exs:52
==2779==
==2779== Warning: set address range perms: large range [0x3a056000, 0x7a056000) (noaccess)
make: Nothing to be done for 'all'.
Including tags: [line: "52"]
Excluding tags: [:test]
fdb_cluster_create
fdb_future_get_*
fdb_cluster_create_database
fdb_future_get_*
fdb_future_destroy
fdb_future_destroy
fdb_cluster_destroy
fdb_database_create_transaction
fdb_transaction_commit
fdb_future_get_*
fdb_cluster_create
fdb_future_get_*
fdb_cluster_create_database
fdb_future_get_*
fdb_database_create_transaction
fdb_database_destroy
fdb_future_destroy
fdb_cluster_destroy
fdb_future_destroy
fdb_transaction_get
fdb_future_get_*
fdb_transaction_get
fdb_future_get_*
fdb_future_destroy
fdb_future_destroy
fdb_transaction_destroy
fdb_database_destroy
.fdb_future_destroy
fdb_transaction_destroy
Finished in 3.5 seconds
3 tests, 0 failures, 2 skipped
Randomized with seed 895969
==2779==
==2779== HEAP SUMMARY:
==2779== in use at exit: 33,210,142 bytes in 28,684 blocks
==2779== total heap usage: 203,607 allocs, 174,923 frees, 268,317,204 bytes allocated
==2779==
==2779== 480 bytes in 10 blocks are definitely lost in loss record 628 of 820
==2779== at 0x4C2BBA0: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==2779== by 0x626648: erts_sys_alloc (sys.c:1206)
==2779== by 0x59796F: erts_alloc (erl_alloc.h:232)
==2779== by 0x598524: erts_thr_progress_register_unmanaged_thread (erl_thr_progress.c:531)
==2779== by 0x5AE7B8: async_thread_init (erl_async.c:497)
==2779== by 0x5AE89C: async_main (erl_async.c:524)
==2779== by 0x6BFA99: thr_wrapper (ethread.c:118)
==2779== by 0x578F6A9: start_thread (pthread_create.c:333)
==2779==
==2779== LEAK SUMMARY:
==2779== definitely lost: 480 bytes in 10 blocks
==2779== indirectly lost: 0 bytes in 0 blocks
==2779== possibly lost: 919,274 bytes in 4,517 blocks
==2779== still reachable: 32,290,388 bytes in 24,157 blocks
==2779== suppressed: 0 bytes in 0 blocks
==2779== Reachable blocks (those to which a pointer was found) are not shown.
==2779== To see them, rerun with: --leak-check=full --show-leak-kinds=all
==2779==
==2779== For counts of detected and suppressed errors, rerun with: -v
==2779== ERROR SUMMARY: 220 errors from 220 contexts (suppressed: 0 from 0)
- I keep a reference to the database and cluster in the transaction, thereby preventing the VM from calling database_destroy and/or cluster_destroy before transaction_destroy.
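To make the last workaround concrete, here is a minimal self-contained C model of the idea. It is not the actual fdb_nif.c code: resource_t, resource_new, resource_release and demo_destroy_order are made-up names, and the reference counting is done by hand so the example has no dependency on erl_nif.h. In a real NIF the same pinning would presumably be done with enif_keep_resource when the child resource is created and enif_release_resource in its destructor.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a NIF resource: a refcount, the parent it pins,
 * and a name used only to record the destroy order. */
typedef struct resource {
    int refs;
    struct resource *parent; /* kept alive until this resource is destroyed */
    const char *name;
} resource_t;

static char destroy_log[128]; /* destroy order, e.g. "a;b;" */

/* Create a resource with one reference (the VM's handle). A child takes an
 * extra reference on its parent, so the parent cannot be destroyed first. */
static resource_t *resource_new(const char *name, resource_t *parent) {
    resource_t *r = malloc(sizeof *r);
    if (r == NULL) abort();
    r->refs = 1;
    r->parent = parent;
    r->name = name;
    if (parent != NULL) parent->refs++;
    return r;
}

/* Drop one reference. On the last one, "destroy" the resource (this is where
 * fdb_*_destroy would be called) and only then let go of the parent. */
static void resource_release(resource_t *r) {
    if (--r->refs > 0) return;
    strcat(destroy_log, r->name);
    strcat(destroy_log, ";");
    resource_t *parent = r->parent;
    free(r);
    if (parent != NULL) resource_release(parent);
}

/* Replays the problematic order from the trace: the VM collects the cluster
 * and database handles while the transaction is still in use. */
const char *demo_destroy_order(void) {
    destroy_log[0] = '\0';
    resource_t *cluster = resource_new("cluster", NULL);
    resource_t *database = resource_new("database", cluster);
    resource_t *transaction = resource_new("transaction", database);

    resource_release(cluster);     /* GC of the cluster handle: nothing destroyed */
    resource_release(database);    /* GC of the database handle: nothing destroyed */
    resource_release(transaction); /* now everything is torn down, child first */
    return destroy_log;
}
```

Calling demo_destroy_order() returns "transaction;database;cluster;", i.e. the destroy calls are issued child first regardless of the order in which the VM garbage collects the handles.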
I am using library version 5.1.7. Let me know if this gives enough clues. Otherwise, I will try to track this down further when I get some free time.