Hi experts,
Our FDB user encountered the following error messages in the log:
…
Nov 25 11:46:52 mdm-job-b15619-mdm-job-5bf88bc674-rpc47 mdm-job FINEST Closing FDB database object…
Nov 25 11:46:52 mdm-job-b15619-mdm-job-5bf88bc674-rpc47 mdm-job SEVERE Commit encountered exception: com.apple.foundationdb.FDBException: Broken promise, i: 0, maxRuns: 3, to be restarted with inserts: 0 and deletes: 1
Nov 25 11:46:52 mdm-job-b15619-mdm-job-5bf88bc674-rpc47 mdm-job WARNING In execution mode: READ_COMMITTED_WITH_WRITE and when restart transaction, encountered FDB transaction object being null
…
After several retries the JVM crashed. Does anyone know the root cause? Thanks!
Are you closing the FDB database and then trying to use it? I’m not certain that would result in a broken promise, but you generally shouldn’t close a Database object unless you’re shutting down your process, as FDB databases aren’t really meant to be closed. (They’re not quite like traditional database “connections” in that way.)
I’m not sure, but I think that closing the Database (i.e., calling database.close()) is safe, as long as you aren’t using it or its transactions any more, though it won’t actually do all that much in terms of, say, the TCP connections your client is creating to the FDB servers. Calling FDB::shutdown and then trying to use any of the created databases is an error.
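To illustrate the lifecycle described above, here is a minimal Java sketch of keeping one long-lived handle per process and closing it only at shutdown. SharedHandle is a hypothetical helper written for this post, not part of the FDB bindings, and it uses only the standard library so it can be tried without a cluster. It fails fast if something asks for the handle after it has been closed, rather than letting a later operation run against a dead object.

```java
import java.util.function.Supplier;

/** Hypothetical holder (not part of any FDB API): one long-lived handle per process. */
final class SharedHandle<T extends AutoCloseable> implements AutoCloseable {
    private final Supplier<T> opener;
    private T handle;
    private boolean closed;

    SharedHandle(Supplier<T> opener) {
        this.opener = opener;
    }

    /** Returns the shared handle, opening it on first use; fails fast after close(). */
    synchronized T get() {
        if (closed) {
            throw new IllegalStateException("handle was closed; it must not be used after shutdown");
        }
        if (handle == null) {
            handle = opener.get();
        }
        return handle;
    }

    /** Call once, when the process is shutting down and no work is in flight. */
    @Override
    public synchronized void close() throws Exception {
        closed = true;
        if (handle != null) {
            handle.close();
            handle = null;
        }
    }
}
```

In an FDB application the opener would presumably be something like () -> FDB.selectAPIVersion(710).open() (the API version is an assumption; use whichever your client library supports), with close() called only from the process shutdown path and never between retries of a job.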
I found this error in the storage pod’s log for the issue above, but I am not sure whether it is the root cause. Any comments? Thanks!
<Event Severity="20" Time="1669624724.417892" DateTime="2022-11-28T08:38:44Z" Type="N2_ReadProbeError" ID="7f2a226b0b5ff14d" SuppressedEventCount="0" ErrorCode="125" Message="Operation canceled" Machine="52.117.8.165:4500" LogGroup="sample-cluster" Roles="CD,SS" />
The above error message was truncated. Before the operation canceled message, there were some connection errors:
<Event Severity="10" Time="1669624724.417892" DateTime="2022-11-28T08:38:44Z" Type="ConnectionTimeout" ID="0000000000000000" SuppressedEventCount="0" WithAddr="67.228.123.183:30549:tls" Machine="52.117.8.165:4500" LogGroup="sample-cluster" Roles="CD,SS" />
and then
<Event Severity="10" Time="1669624724.417892" DateTime="2022-11-28T08:38:44Z" Type="ConnectionClosed" ID="7f2a226b0b5ff14d" Error="connection_failed" ErrorDescription="Network connection failed" ErrorCode="1026" SuppressedEventCount="2" PeerAddr="67.228.123.183:30549:tls" Machine="52.117.8.165:4500" LogGroup="sample-cluster" Roles="CD,SS" />
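If it helps with lining these events up against the client-side failures, here is a minimal Java sketch that pulls the attributes out of one trace event line and picks out the peer address. It assumes one event per line with plain ASCII double quotes around attribute values and no escaped quotes inside them. TraceEventAttrs and its method names are made up for this post, and the set of event types it treats as network-related is just the three that appear in this thread.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper for reading attributes from a single trace event line. */
final class TraceEventAttrs {
    // Matches name="value" pairs; assumes no escaped quotes inside values.
    private static final Pattern ATTR = Pattern.compile("(\\w+)=\"([^\"]*)\"");

    /** Returns the attributes in the order they appear, with values trimmed. */
    static Map<String, String> parse(String eventLine) {
        Map<String, String> attrs = new LinkedHashMap<>();
        Matcher m = ATTR.matcher(eventLine);
        while (m.find()) {
            attrs.put(m.group(1), m.group(2).trim());
        }
        return attrs;
    }

    /** True for the event types seen in this thread (an illustrative selection only). */
    static boolean isNetworkEvent(Map<String, String> attrs) {
        String type = attrs.getOrDefault("Type", "");
        return type.equals("ConnectionTimeout")
                || type.equals("ConnectionClosed")
                || type.equals("N2_ReadProbeError");
    }

    /** The remote address, which these events carry as PeerAddr or WithAddr. */
    static String peer(Map<String, String> attrs) {
        return attrs.getOrDefault("PeerAddr", attrs.get("WithAddr"));
    }

    public static void main(String[] args) {
        String line = "<Event Severity=\"10\" Type=\"ConnectionClosed\" ErrorCode=\"1026\" "
                + "PeerAddr=\"67.228.123.183:30549:tls\" Machine=\"52.117.8.165:4500\" />";
        Map<String, String> attrs = parse(line);
        if (isNetworkEvent(attrs)) {
            System.out.println(attrs.get("Type") + " peer=" + peer(attrs)
                    + " code=" + attrs.get("ErrorCode"));
        }
    }
}
```

Grouping the matching events by peer address and timestamp should show whether the connection failures to 67.228.123.183:30549 coincide with the client-side exceptions, though that correlation alone would not prove the root cause.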