Multitest QuietDatabaseFailure

My Kubernetes cluster has a process whose process class is test. When I run fdbserver -r multitest -f test.txt, I get errors like this:

<Event Severity="10" Time="1670294638.021585" DateTime="2022-12-06T02:43:58Z" Type="GetVersionOffset" ID="0000000000000000" Stage="ReadingVersionEpoch" ThreadID="8344889570567331082" Machine="0.0.0.0:0" LogGroup="default" />
<Event Severity="10" Time="1670294638.058358" DateTime="2022-12-06T02:43:58Z" Type="DataDistributionQueueSize" ID="0000000000000000" Stage="GotString" ThreadID="8344889570567331082" Machine="0.0.0.0:0" LogGroup="default" />
<Event Severity="20" Time="1670294638.058358" DateTime="2022-12-06T02:43:58Z" Type="TraceEventFieldNotFound" ID="0000000000000000" SuppressedEventCount="0" FieldName="InQueue" ThreadID="8344889570567331082" Machine="0.0.0.0:0" LogGroup="default" />
<Event Severity="10" Time="1670294638.058358" DateTime="2022-12-06T02:43:58Z" Type="QuietDatabaseFailure" ID="5eda12b35aab663d" Reason="Failed to extract DataDistributionQueueSize" ThreadID="8344889570567331082" Machine="0.0.0.0:0" LogGroup="default" />
<Event Severity="10" Time="1670294638.058358" DateTime="2022-12-06T02:43:58Z" Type="QuietDatabaseStartError" ID="0000000000000000" Error="attribute_not_found" ErrorDescription="Attribute not found" ErrorCode="2014" ThreadID="8344889570567331082" Machine="0.0.0.0:0" LogGroup="default" />
<Event Severity="10" Time="1670294638.058358" DateTime="2022-12-06T02:43:58Z" Type="QuietDatabaseStartRetry" ID="0000000000000000" Error="attribute_not_found" ErrorDescription="Attribute not found" ErrorCode="2014" NotReady0="dataDistributionQueueSize" ThreadID="8344889570567331082" Machine="0.0.0.0:0" LogGroup="default" />
<Event Severity="10" Time="1670294638.058358" DateTime="2022-12-06T02:43:58Z" Type="StorageServersRecruiting" ID="0000000000000000" Message="&quot;Severity&quot;=&quot;10&quot;, &quot;Time&quot;=&quot;1670294608.277340&quot;, &quot;DateTime&quot;=&quot;2022-12-06T02:43:28Z&quot;, &quot;Type&quot;=&quot;StorageServerRecruitment&quot;, &quot;ID&quot;=&quot;8ce531249de1efd9&quot;, &quot;State&quot;=&quot;Idle&quot;, &quot;ThreadID&quot;=&quot;1107443169146200672&quot;, &quot;Machine&quot;=&quot;10.181.159.46:7502&quot;, &quot;LogGroup&quot;=&quot;default&quot;, &quot;Roles&quot;=&quot;DD&quot;, &quot;TrackLatestType&quot;=&quot;Original&quot;" ThreadID="8344889570567331082" Machine="0.0.0.0:0" LogGroup="default" />

I cannot write data into FDB.

If you're using the FDB Kubernetes operator, you should check whether your test processes are up and running. There was a bug in the operator in how those processes are created; see "Add test process count" by iyuroch (Pull Request #1362, FoundationDB/fdb-kubernetes-operator on GitHub).

Can you share some more details about your setup and the status of the FDB cluster?

Thank you very much. The cluster is now on FDB 7.1.25, and the trace always repeats this sequence:

PingingDatabaseLiveness_QuietDatabaseStart → SomewhatSlowRunLoopTop → PingLatency → LBDistant → TransactionMetrics → PingingDatabaseLivenessDone_QuietDatabaseStart

Searching the trace file for events with Severity="20":

grep -n 'Severity="20"' trace.0.0.0.0.0.1670394484.8Wcdbb.0.6.xml
59:<Event Severity="20" Time="1670459432.870896" DateTime="2022-12-08T00:30:32Z" Type="Net2RunLoopTrace" ID="0000000000000000" TraceTime="1670459432.856632" Trace="addr2line -e fdbserver.debug -p -C -f -i 0x7fce6bc45630 0x7fce6bc42aa1 0x364ef8d 0x365121d 0xa67517 0x7fce6b88a555 0xaca7d2" ThreadID="10348090079953542892" Machine="0.0.0.0:0" LogGroup="default" />
89:<Event Severity="20" Time="1670459452.624212" DateTime="2022-12-08T00:30:52Z" Type="Net2RunLoopTrace" ID="0000000000000000" TraceTime="1670459452.616479" Trace="addr2line -e fdbserver.debug -p -C -f -i 0x7fce6bc45630 0x7fce6bc42aa1 0x364ef8d 0x365121d 0xa67517 0x7fce6b88a555 0xaca7d2" ThreadID="10348090079953542892" Machine="0.0.0.0:0" LogGroup="default" />
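
The same filtering can be done per event type instead of per line. Below is a minimal sketch, assuming the trace file holds one self-closing <Event ... /> element per line as in the excerpts above; the function names parse_events and count_by_type are hypothetical helpers, not part of FoundationDB.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def parse_events(lines):
    """Yield the attribute dict of every well-formed <Event ... /> line."""
    for line in lines:
        line = line.strip()
        if not line.startswith("<Event"):
            continue  # skip the XML header, <Trace> wrapper, blank lines
        try:
            yield ET.fromstring(line).attrib
        except ET.ParseError:
            continue  # skip truncated or partially written lines


def count_by_type(lines, min_severity=20):
    """Count events at or above min_severity, grouped by their Type."""
    counts = Counter()
    for attrs in parse_events(lines):
        if int(attrs.get("Severity", "0")) >= min_severity:
            counts[attrs.get("Type", "?")] += 1
    return counts


# Example use (the path is whatever trace file you are inspecting):
# with open("trace.xml") as f:
#     for event_type, n in count_by_type(f).most_common():
#         print(n, event_type)
```

Grouping by Type makes it easier to see whether the warnings are all of one kind, such as Net2RunLoopTrace here, or whether something else is mixed in.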

Resolving the backtrace with the debug symbols:

addr2line -e fdbserver.debug -p -C -f -i 0x7fce6bc45630 0x7fce6bc42aa1 0x364ef8d 0x365121d 0xa67517 0x7fce6b88a555 0xaca7d2

?? ??:0
?? ??:0
boost::asio::detail::deadline_timer_service<boost::asio::time_traits<boost::posix_time::ptime> >::cancel(boost::asio::detail::deadline_timer_service<boost::asio::time_traits<boost::posix_time::ptime> >::implementation_type&, boost::system::error_code&) at /opt/boost_1_78_0/include/boost/asio/detail/deadline_timer_service.hpp:153
 (inlined by) boost::asio::basic_deadline_timer<boost::posix_time::ptime, boost::asio::time_traits<boost::posix_time::ptime>, boost::asio::any_io_executor>::cancel() at /opt/boost_1_78_0/include/boost/asio/basic_deadline_timer.hpp:348
 (inlined by) N2::ASIOReactor::sleep(double) at /home/foundationdb_ci/src/oOoOoOoOoOoOoOoOoOoOoOoOoOoOoOoOoOoOoOoOo/foundationdb/flow/Net2.actor.cpp:2081
MetricHandle<ContinuousMetric<bool> >::operator=(bool const&) at /home/foundationdb_ci/src/oOoOoOoOoOoOoOoOoOoOoOoOoOoOoOoOoOoOoOoOo/foundationdb/flow/TDMetric.actor.h:1374
 (inlined by) N2::Net2::run() at /home/foundationdb_ci/src/oOoOoOoOoOoOoOoOoOoOoOoOoOoOoOoOoOoOoOoOo/foundationdb/flow/Net2.actor.cpp:1508
main at /home/foundationdb_ci/src/oOoOoOoOoOoOoOoOoOoOoOoOoOoOoOoOoOoOoOoOo/foundationdb/fdbserver/fdbserver.actor.cpp:2245
?? ??:0
_start at ??:?
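
Both Net2RunLoopTrace events above carry the same address list, so one addr2line run covers them. To check that across a whole trace file, here is a rough sketch that groups these events by their backtrace addresses. It assumes the Trace attribute always has the form shown above (an addr2line command followed by 0x-prefixed addresses); distinct_backtraces is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def distinct_backtraces(lines, event_type="Net2RunLoopTrace"):
    """Count how often each distinct address list occurs for event_type."""
    stacks = Counter()
    for line in lines:
        line = line.strip()
        if not line.startswith("<Event"):
            continue
        try:
            attrs = ET.fromstring(line).attrib
        except ET.ParseError:
            continue  # ignore truncated lines
        if attrs.get("Type") != event_type:
            continue
        # Keep only the hex addresses, dropping the addr2line command and flags.
        addrs = tuple(
            tok for tok in attrs.get("Trace", "").split() if tok.startswith("0x")
        )
        stacks[addrs] += 1
    return stacks


# Example use (the path is whatever trace file you are inspecting):
# with open("trace.xml") as f:
#     for addrs, n in distinct_backtraces(f).most_common():
#         print(n, " ".join(addrs))
```

Each distinct address list then needs to be passed to addr2line only once.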