Hello,
I am working on a utility for a full export (inconsistent, non-transactional) for subsequent data processing elsewhere. I would appreciate your feedback on the approach and any concerns.
(For now, please ignore the fact that the export runs from a single client node and is not scalable.)
The first hurdle was batching the reads to stay within the 5s transaction limit. Based on your feedback in other threads, I have the following code, which is the crux of the loop that fetches a given range of keys.
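For readers without access to the prototype, here is a minimal, self-contained sketch of the pattern in question. It is not the actual code from the project. `RangeReader`, `errTxnTooOld`, and `memStore` are hypothetical stand-ins for the real database bindings, with each `ReadBatch` call assumed to run in a fresh transaction. The idea is to read in limited batches, and after a failure to start a new transaction that resumes just past the last key already exported.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
)

// KV is one exported key-value pair.
type KV struct{ Key, Value []byte }

// errTxnTooOld is a hypothetical stand-in for the error a real store
// returns when a transaction exceeds its time limit.
var errTxnTooOld = errors.New("transaction too old")

// RangeReader is a hypothetical abstraction: every ReadBatch call is assumed
// to open a fresh transaction and read up to limit pairs from [begin, end).
type RangeReader interface {
	ReadBatch(begin, end []byte, limit int) ([]KV, error)
}

// exportRange walks [begin, end) in batches. After a failed batch it retries
// from the same cursor, so no key is skipped or emitted twice.
func exportRange(r RangeReader, begin, end []byte, limit, maxRetries int, emit func(KV)) error {
	if limit <= 0 {
		return errors.New("limit must be positive")
	}
	cursor := append([]byte(nil), begin...)
	retries := 0
	for {
		batch, err := r.ReadBatch(cursor, end, limit)
		if err != nil {
			if errors.Is(err, errTxnTooOld) && retries < maxRetries {
				retries++
				continue // the next ReadBatch call uses a new transaction
			}
			return err
		}
		retries = 0
		for _, kv := range batch {
			emit(kv)
		}
		if len(batch) < limit {
			return nil // a short batch means the range is exhausted
		}
		// Resume strictly after the last key: key+0x00 is its immediate successor.
		last := batch[len(batch)-1].Key
		cursor = append(append([]byte(nil), last...), 0x00)
	}
}

// memStore is an in-memory fake that fails every failEvery-th call.
type memStore struct {
	data      []KV // kept sorted by key
	calls     int
	failEvery int
}

func newMemStore(n int) *memStore {
	m := &memStore{}
	for i := 0; i < n; i++ {
		m.data = append(m.data, KV{Key: []byte(fmt.Sprintf("key%03d", i)), Value: []byte("v")})
	}
	return m
}

func (m *memStore) ReadBatch(begin, end []byte, limit int) ([]KV, error) {
	m.calls++
	if m.failEvery > 0 && m.calls%m.failEvery == 0 {
		return nil, errTxnTooOld
	}
	var out []KV
	for _, kv := range m.data {
		if bytes.Compare(kv.Key, begin) >= 0 && bytes.Compare(kv.Key, end) < 0 {
			out = append(out, kv)
			if len(out) == limit {
				break
			}
		}
	}
	return out, nil
}

func main() {
	store := newMemStore(10)
	store.failEvery = 2 // every second batch fails
	count := 0
	err := exportRange(store, []byte("key"), []byte("kez"), 3, 3, func(KV) { count++ })
	fmt.Println(count, err)
}
```

The fake fails a whole batch at a time. With a real streaming range read the failure can arrive mid-batch, in which case the cursor would have to advance past the last key actually emitted rather than the last key of a completed batch.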
A couple of questions on this:
- I know there is a newer split-key-ranges-by-size API coming up in 7.0. But for now, is ‘creating a new txn after a failure’ the best approach?
- I feel a bit uneasy about the byte slice returned by .Get(): how long is that slice valid? Should I clone it before assigning it to a variable (see line 178) which is referenced in the next loop?
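To make the second question concrete, here is a small sketch of the hazard I am worried about and of the defensive clone. Whether the bindings actually behave this way is exactly what I am asking. `reusingSource` is a hypothetical reader that deliberately reuses its buffer, and `cloneBytes` is a helper name made up for this example.

```go
package main

import "fmt"

// cloneBytes returns an independent copy of b, so later writes to b's
// backing array cannot change the copy.
func cloneBytes(b []byte) []byte {
	if b == nil {
		return nil
	}
	out := make([]byte, len(b))
	copy(out, b)
	return out
}

// reusingSource is a hypothetical reader that hands out slices pointing
// into one shared buffer, which it overwrites on every call to Next.
type reusingSource struct {
	buf  [8]byte
	keys []string
	pos  int
}

// Next copies the next key into the shared buffer and returns a slice of it.
func (s *reusingSource) Next() ([]byte, bool) {
	if s.pos >= len(s.keys) {
		return nil, false
	}
	n := copy(s.buf[:], s.keys[s.pos])
	s.pos++
	return s.buf[:n], true
}

func main() {
	src := &reusingSource{keys: []string{"a1", "b2"}}

	first, _ := src.Next()
	aliased := first            // shares the source's buffer
	cloned := cloneBytes(first) // owns its own memory

	src.Next() // overwrites the shared buffer with "b2"

	fmt.Printf("aliased=%s cloned=%s\n", aliased, cloned) // aliased=b2 cloned=a1
}
```

If the slice returned by .Get() is already a private copy, the clone is just one extra allocation per key. If it is not, the clone is what keeps the saved last key intact across loop iterations.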
Future plans
The ideal setup, if it is possible, would be to narrow each key range down to a particular host and make the read local (by running an agent on that node), filter some data locally, and then return the data to the caller. The challenge, of course, is to test whether the filtering adds any value and to make it generic enough to be useful in an open-source project. I am working on it, so let me know if you have any thoughts. A lot of this is based on hunches; only tests will show whether there is any value in doing this.
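As a rough illustration of the planning step only, the sketch below groups key ranges by the host assumed to own them, so that each group could be handed to an agent on that host. Everything here is hypothetical: `KeyRange`, `planByHost`, and the `addressesFor` callback are made-up names, and the boundary keys and host addresses are hard-coded instead of coming from the cluster's locality information.

```go
package main

import (
	"fmt"
	"sort"
)

// KeyRange is a half-open range [Begin, End).
type KeyRange struct{ Begin, End string }

// planByHost turns sorted boundary keys into consecutive ranges and assigns
// each range to the first host reported for its begin key. Ranges with no
// known host are collected under the empty string for a non-local reader.
func planByHost(boundaries []string, addressesFor func(key string) []string) map[string][]KeyRange {
	plan := make(map[string][]KeyRange)
	for i := 0; i+1 < len(boundaries); i++ {
		r := KeyRange{Begin: boundaries[i], End: boundaries[i+1]}
		host := ""
		if hosts := addressesFor(r.Begin); len(hosts) > 0 {
			host = hosts[0]
		}
		plan[host] = append(plan[host], r)
	}
	return plan
}

func main() {
	// Hard-coded ownership table standing in for a real locality lookup.
	owners := map[string][]string{
		"a": {"10.0.0.1"},
		"g": {"10.0.0.2"},
		"n": {"10.0.0.1"},
	}
	plan := planByHost([]string{"a", "g", "n", "z"}, func(k string) []string { return owners[k] })

	hosts := make([]string, 0, len(plan))
	for h := range plan {
		hosts = append(hosts, h)
	}
	sort.Strings(hosts)
	for _, h := range hosts {
		fmt.Println(h, plan[h])
	}
}
```

Picking the first reported host is an arbitrary choice for the sketch. With replicated data, a real planner would probably balance ranges across the replicas and would need to cope with ownership changing while the export runs.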
Any feedback is appreciated. The WIP code is open source, although it is just a prototype for now.
Thank you for your time
Hari