While reading the code to better understand internal recovery, this line confused me:
if (trs.size() ||
(commitData.db->get().recoveryState >= RecoveryState::ACCEPTING_COMMITS &&
masterLifetime.isEqual(commitData.db->get().masterLifetime) && lastCommitComplete.isReady())) {…}
Why is the condition (commitData.db->get().recoveryState >= RecoveryState::ACCEPTING_COMMITS) allowed to be bypassed?
After checking the older code, in which MasterProxyServer handled every type of transaction request (GRV and Commit), I think the RecoveryState check for client transactions is in the function transactionStarter, which in the current version lives in GrvProxyServer:
ASSERT(db->get().recoveryState >= RecoveryState::ACCEPTING_COMMITS);
and there is a wait in grvProxyServerCore before it calls transactionStarter:
while (!(masterLifetime.isEqual(grvProxyData.db->get().masterLifetime) &&
grvProxyData.db->get().recoveryState >= RecoveryState::ACCEPTING_COMMITS)) {
wait(grvProxyData.db->onChange());
}
In CommitProxyServer's commitProxyServerCore, the purpose of the recovery-state check in my original question is not to decide whether it is OK to accept client transactions, but whether an empty transaction batch is allowed, so that makes sense. Also, the ClusterController's clusterRecoveryCore sends a CommitTransactionRequest to proxy[0] and TxnStateRequests to the proxies, so CommitProxy needs to handle them in order for the RecoveryState in the CC to move forward to ACCEPTING_COMMITS. But that specific CommitTransactionRequest is not from a client. This aligns with the documentation.
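To restate my understanding in code, here is a minimal sketch of the batching decision quoted at the top; the function and parameter names are mine for illustration, not taken from FDB:

bool shouldCommitBatch(int numTransactions,          // trs.size()
                       RecoveryState recoveryState,  // commitData.db->get().recoveryState
                       bool sameMasterLifetime,      // masterLifetime.isEqual(...)
                       bool lastCommitComplete) {    // lastCommitComplete.isReady()
    // A non-empty batch always proceeds; this is how the recovery
    // transaction from the ClusterController gets committed before the
    // cluster reaches ACCEPTING_COMMITS.
    if (numTransactions > 0)
        return true;
    // An empty batch (used only to advance the committed version) is
    // deferred until this generation has fully recovered.
    return recoveryState >= RecoveryState::ACCEPTING_COMMITS && sameMasterLifetime && lastCommitComplete;
}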
There is one more issue I couldn't find an answer to: it seems that GrvProxyServer will reply to read-version requests without checking the current RecoveryState when a GrvProxy stays in the proxy list of the recovered cluster.
When a GrvProxyServer is recruited for the first time there is no such issue, because when it runs grvProxyServerCore it hits the blocking wait above; once the recovery state lets it past that wait, it starts the transactionStarter actor to serve GetReadVersionRequests. So far so good.
The function grvProxyServer monitors dbInfo changes, so when the current proxy is removed from the proxy interfaces in dbInfo, the proxy terminates. This is good.
But if it is still in the current dbInfo's proxy list after a recovery, then the loop in transactionStarter will keep replying to GetReadVersionRequests no matter what the recoveryState is.
I searched the code for any other place that checks recoveryState to block client transactions during recovery, but found nothing.
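To make the shape concrete, here is a heavily condensed sketch of how I read transactionStarter; everything except the placement of the ASSERT is elided or invented:

ACTOR Future<Void> transactionStarterSketch(Reference<AsyncVar<ServerDBInfo> const> db) {
    // The recovery-state check runs exactly once, before the serving loop.
    ASSERT(db->get().recoveryState >= RecoveryState::ACCEPTING_COMMITS);
    loop {
        // ... batch pending GetReadVersionRequests and reply with a read
        // version; no per-request recoveryState check happens in here ...
        wait(delay(0)); // stands in for the real batching/timing logic
    }
}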
Could anyone share ideas on the steps to create a workload that tests the scenario I described? Thanks.
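Concretely, what I imagine is a skeleton like the following, assuming the TestWorkload interface from fdbserver/workloads (the workload name and everything inside start() are made up, and the exact virtual signatures vary between FDB versions):

struct GrvDuringRecoveryWorkload : TestWorkload {
    GrvDuringRecoveryWorkload(WorkloadContext const& wcx) : TestWorkload(wcx) {}
    std::string description() const override { return "GrvDuringRecovery"; }
    Future<Void> start(Database const& cx) override {
        // 1. Spawn an actor that calls Transaction::getReadVersion() in a loop.
        // 2. Spawn an actor that repeatedly forces recoveries (e.g. by
        //    killing the cluster controller or master in simulation).
        // 3. Record any read-version reply that arrives after the old
        //    generation's tlogs are locked but before the new generation
        //    reaches ACCEPTING_COMMITS.
        return Void();
    }
    Future<bool> check(Database const& cx) override { return true; }
    void getMetrics(std::vector<PerfMetric>& m) override {}
};
WorkloadFactory<GrvDuringRecoveryWorkload> GrvDuringRecoveryWorkloadFactory("GrvDuringRecovery");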
> But if it is still in the current dbInfo's proxy list after a recovery, then the loop in transactionStarter will keep replying to GetReadVersionRequests no matter what the recoveryState is.
After a recovery happens, in the new generation the ClusterController recruits a whole new transaction system, including new GRV proxies, new commit proxies, and new tlogs. So it is not possible for an old GRV proxy to be part of the new transaction system.
Additionally, before that recruitment the old tlogs have been locked, which prevents transactions from the previous generation from committing. That is, even though the old GRV proxies may linger for a short period of time, their transactions can't be committed.
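To illustrate the invariant, here is a toy model (not FDB code, just the idea): once the recovering generation locks a tlog at a newer epoch, any commit stamped with an older epoch is refused, so a lingering old-generation proxy can never complete a commit.

#include <algorithm>
#include <cstdint>

struct ToyTLog {
    uint64_t lockedEpoch = 0; // highest epoch that has locked this tlog

    void lock(uint64_t newEpoch) { lockedEpoch = std::max(lockedEpoch, newEpoch); }

    bool tryCommit(uint64_t proxyEpoch) const {
        return proxyEpoch >= lockedEpoch; // old-generation commits are rejected
    }
};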
For details on how recovery works, see the detailed documentation.
Thanks for the confirmation that all the proxies are new. My confusion comes from this piece of code:
ACTOR Future<Void> checkRemoved(Reference<AsyncVar<ServerDBInfo> const> db,
                                uint64_t recoveryCount,
                                GrvProxyInterface myInterface) {
    loop {
        // Terminate only when a newer generation exists (recoveryCount has
        // advanced past the one this proxy was recruited in) AND this
        // proxy's interface is no longer in dbInfo's GRV proxy list.
        if (db->get().recoveryCount >= recoveryCount &&
            std::find(db->get().client.grvProxies.begin(), db->get().client.grvProxies.end(), myInterface) ==
                db->get().client.grvProxies.end()) {
            throw worker_removed();
        }
        wait(db->onChange());
    }
}
After checking the code that creates new proxies, it seems a new proxy's interface can never equal an old one. So my guess is that the timing is the point: a freshly recruited proxy may briefly see a stale dbInfo that does not list it yet, so checkRemoved only terminates a proxy once recoveryCount has advanced AND its interface is gone from dbInfo, at which point the old proxy is definitely obsolete.