Segfault in version 6.2.11

gaurav · March 4, 2020, 2:32pm

Hi, we are seeing unexplained segfaults with fdb version 6.2.11.

Below are the logs printed by fdbmonitor. I do not have fdbserver logs from the times of the segfaults, as they have rolled over, but we track all Sev 30/40 events using Wavefront telemetry and no such events were recorded.

Is this something that is known or fixed in a more recent patch release?

fdbserver --version
FoundationDB 6.2 (v6.2.11)
source version 66ac949e6ae0dcd9e8f1b59d91830b0163aa1ac0
protocol fdb00b062010001

Feb 26 21:49:56 platform2 fdbmonitor[9672]: LogGroup="default" Process="fdbserver.4500": SIGNAL: Segmentation fault (11)
Feb 26 21:49:56 platform2 fdbmonitor[9672]: LogGroup="default" Process="fdbserver.4500": Trace: addr2line -e fdbserver.debug -p -C -f -i 0x1928205 0x7f488c88f390 0x1893eb6 0x18942c8 0x18946fa 0x1894f08 0x188f5cb 0x1890733 0x18907b1 0x189093a 0x1898abd 0x7fca30 0x197fa10 0x67dbee 0x7f488bfc3830
Feb 26 21:49:56 platform2 fdbmonitor[9672]: LogGroup="default" Process="fdbserver.4500": Process 9675 exited 139, restarting in 0 seconds
--
Feb 27 17:56:39 platform2 fdbmonitor[9672]: LogGroup="default" Process="fdbserver.4500": SIGNAL: Segmentation fault (11)
Feb 27 17:56:39 platform2 fdbmonitor[9672]: LogGroup="default" Process="fdbserver.4500": Trace: addr2line -e fdbserver.debug -p -C -f -i 0x1928205 0x7f5fdc3f9390 0x1893eb6 0x18942c8 0x18946fa 0x1894f08 0x188f5cb 0x1890733 0x18907b1 0x189093a 0x1898abd 0x7fca30 0x197fa10 0x67dbee 0x7f5fdbb2d830
Feb 27 17:56:39 platform2 fdbmonitor[9672]: LogGroup="default" Process="fdbserver.4500": Process 43063 exited 139, restarting in 0 seconds
--
Mar 03 06:32:11 platform2 fdbmonitor[9672]: LogGroup="default" Process="fdbserver.4500": SIGNAL: Segmentation fault (11)
Mar 03 06:32:11 platform2 fdbmonitor[9672]: LogGroup="default" Process="fdbserver.4500": Trace: addr2line -e fdbserver.debug -p -C -f -i 0x1928205 0x7fa64348d390 0x1893eb6 0x18942c8 0x18946fa 0x1894f08 0x188f5cb 0x1890733 0x18907b1 0x189093a 0x1898abd 0x7fca30 0x197fa10 0x67dbee 0x7fa642bc1830
Mar 03 06:32:11 platform2 fdbmonitor[9672]: LogGroup="default" Process="fdbserver.4500": Process 40381 exited 139, restarting in 0 seconds
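
One way to check whether the three crashes share a call stack is to compare the frame addresses in the `Trace:` lines. The sketch below is only an illustration, not part of fdbmonitor or any FoundationDB tooling. It assumes that addresses at or above `0x7f0000000000` belong to shared libraries loaded at randomized bases, so only their low 12 bits (the page offset) are compared. The threshold and the helper names (`extract_frames`, `normalize`, `distinct_stacks`) are hypothetical.

```python
import re

HEX_RE = re.compile(r"0x[0-9a-fA-F]+")
# Assumption: addresses at or above this value are in shared libraries mapped
# at randomized bases (typical on x86-64 Linux). This is a heuristic only.
SHARED_LIB_FLOOR = 0x7F0000000000


def extract_frames(line):
    """Return frame addresses from an fdbmonitor 'Trace:' line, else None."""
    marker = "Trace: addr2line"
    if marker not in line:
        return None
    return [int(h, 16) for h in HEX_RE.findall(line.split(marker, 1)[1])]


def normalize(frames):
    """Keep binary addresses as-is; reduce library addresses to page offsets."""
    out = []
    for addr in frames:
        if addr >= SHARED_LIB_FLOOR:
            out.append("lib+0x%x" % (addr & 0xFFF))
        else:
            out.append("0x%x" % addr)
    return tuple(out)


def distinct_stacks(log_lines):
    """Collect the set of normalized stacks found in the given log lines."""
    stacks = set()
    for line in log_lines:
        frames = extract_frames(line)
        if frames is not None:
            stacks.add(normalize(frames))
    return stacks


# The three Trace lines from the logs above (syslog prefix omitted).
SAMPLE = [
    'Process="fdbserver.4500": Trace: addr2line -e fdbserver.debug -p -C -f -i '
    "0x1928205 0x7f488c88f390 0x1893eb6 0x18942c8 0x18946fa 0x1894f08 0x188f5cb "
    "0x1890733 0x18907b1 0x189093a 0x1898abd 0x7fca30 0x197fa10 0x67dbee 0x7f488bfc3830",
    'Process="fdbserver.4500": Trace: addr2line -e fdbserver.debug -p -C -f -i '
    "0x1928205 0x7f5fdc3f9390 0x1893eb6 0x18942c8 0x18946fa 0x1894f08 0x188f5cb "
    "0x1890733 0x18907b1 0x189093a 0x1898abd 0x7fca30 0x197fa10 0x67dbee 0x7f5fdbb2d830",
    'Process="fdbserver.4500": Trace: addr2line -e fdbserver.debug -p -C -f -i '
    "0x1928205 0x7fa64348d390 0x1893eb6 0x18942c8 0x18946fa 0x1894f08 0x188f5cb "
    "0x1890733 0x18907b1 0x189093a 0x1898abd 0x7fca30 0x197fa10 0x67dbee 0x7fa642bc1830",
]

print(len(distinct_stacks(SAMPLE)))  # 1
```

Under that assumption, all three traces reduce to the same normalized stack, which suggests a single crash site. Resolving the frames to function names still requires running the printed `addr2line` command against the matching `fdbserver.debug` symbols.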

ajbeamon · March 4, 2020, 4:16pm

We have seen this crash (see apple/foundationdb issue #2463 on GitHub, "Unexplained crash in 6.2.11"), but the cause is not yet certain and is still being investigated. There is a pending change which may resolve it, though we haven’t been able to prove that it would.

gaurav · March 5, 2020, 4:28am

Thanks @ajbeamon.
Could these crashes result in data corruption on a single-node or multi-node fdb cluster? We have a product release around the corner that upgrades fdb from 6.1 to 6.2, and I want to be certain that this issue will not cause any data corruption or any kind of irrecoverable failure.

If there is even a small chance of that, we would want to stick with 6.1 until this is resolved.

Another question: some of our internal canary setups have been upgraded to 6.2. If we decide to stick to 6.1, can we just revert to 6.1 server and client binaries and work against existing fdb data files?

ajbeamon · March 5, 2020, 6:56pm

I’m hesitant to make any guarantees involving a crash that hasn’t been diagnosed, but we haven’t observed any corruption.

As for reverting to the 6.1 binaries against the existing data files: I don’t believe this is currently possible.

gaurav · March 6, 2020, 3:37am

@ajbeamon thank you for the quick reply. We will stick with 6.1 for now, as many of our deployments run single-process fdb and I don’t want to risk those setups.
