I getting the same kind of errors again, on Windows, with v6.0.15, but this time it seems to happen while the files are being closed and then re-opened (right when doing a “configure double” after adding more nodes to a cluster)
see XML log below
The code that trigger the “must be unbuffered” error is in AsynFileWinASIO::open() (-> here)
The corresponding code for linux KAIO does an ASSERT(...)
(-> here) while the Windows version does an actual if (..) { ... return io_error(); }
, so maybe the bug is not visible with Linux release builds, but will blow up on Windows release builds?
EDIT: the configure double failure is unrelated, it’s a different issue that has been solved, see Unable to fake multiple "machine_id" on the same host for testing ). Though maybe the I/O error sometimes happening at the same time are a contributing factors…
Example log from one of the nodes:
<Event Severity="10" Time="1545415114.576911" Type="DQDestroy" ID="6a277c88dd62d671" LastPoppedSeq="745557" PoppedSeq="745557" NextPageSeq="815104" File0Name="C:\ProgramData\foundationdb\data\4501\logqueue-6a277c88dd62d6719e1e3995266d60ba-0.fdq" Machine="10.10.0.173:4501" LogGroup="default" Roles="RV,SS" />
<Event Severity="10" Time="1545415114.576911" Type="DiskQueueShutdownDeleting" ID="6a277c88dd62d671" File0="C:\ProgramData\foundationdb\data\4501\logqueue-6a277c88dd62d6719e1e3995266d60ba-0.fdq" File1="C:\ProgramData\foundationdb\data\4501\logqueue-6a277c88dd62d6719e1e3995266d60ba-1.fdq" Machine="10.10.0.173:4501" LogGroup="default" Roles="RV,SS" />
<Event Severity="40" Time="1545415114.576911" Type="FileOpenError" ID="0000000000000000" Reason="Must be unbuffered" Flags="131074" File="C:\ProgramData\foundationdb\data\4501\logqueue-6a277c88dd62d6719e1e3995266d60ba-0.fdq" Backtrace="" Machine="10.10.0.173:4501" LogGroup="default" Roles="RV,SS" />
<Event Severity="40" Time="1545415114.576911" Type="DiskQueueShutdownError" ID="6a277c88dd62d671" Error="io_error" ErrorDescription="Disk i/o operation failed" ErrorCode="1510" Reason="unknown" Backtrace="" Machine="10.10.0.173:4501" LogGroup="default" Roles="RV,SS" />
<Event Severity="10" Time="1545415114.576911" Type="TLogTerminated" ID="6a277c88dd62d671" Error="worker_removed" ErrorDescription="Normal worker shut down" ErrorCode="1202" Machine="10.10.0.173:4501" LogGroup="default" Roles="RV,SS" />
...
<Event Severity="10" Time="1545415114.588058" Type="KVWriterDestroying" ID="6a277c88dd62d671" Machine="10.10.0.173:4501" LogGroup="default" Roles="RV,SS" />
<Event Severity="10" Time="1545415114.588058" Type="KVWriterDestroyed" ID="6a277c88dd62d671" Machine="10.10.0.173:4501" LogGroup="default" Roles="RV,SS" />
<Event Severity="40" Time="1545415114.595561" Type="FileOpenError" ID="0000000000000000" Reason="Must be unbuffered" Flags="131074" File="C:\ProgramData\foundationdb\data\4501\log-6a277c88dd62d6719e1e3995266d60ba.sqlite" Backtrace="" Machine="10.10.0.173:4501" LogGroup="default" Roles="RV,SS" />
<Event Severity="40" Time="1545415114.595561" Type="KVDoCloseError" ID="6a277c88dd62d671" Error="io_error" ErrorDescription="Disk i/o operation failed" ErrorCode="1510" Reason="unknown" Backtrace="" Machine="10.10.0.173:4501" LogGroup="default" Roles="RV,SS" />
<Event Severity="10" Time="1545415114.595561" Type="KVClosed" ID="6a277c88dd62d671" Machine="10.10.0.173:4501" LogGroup="default" Roles="RV,SS" />
I see the same things on several of the processes, and they are happening for log-*
, logqueue-*
, storage-*
files, so bascially any file.