Request for feedback: Full export, Go binding usage, Future plans

harikb · March 22, 2021, 4:46pm

Hello,

I am working on a utility for a full export (inconsistent, non-transactional) for subsequent data processing elsewhere. I would appreciate your feedback on the approach and any concerns.

(For now, ignore the fact that the export runs from only one client node and is not scalable.)

The first hurdle was batching the reads so that each transaction stays within the 5-second limit. Based on your feedback in other threads, the following code is the crux of the loop that fetches a given range of keys:

github.com/adobe/ferry/blob/86be2b5487db505efb24dcd63cd9a45337211f6e/lib/exporter/exporter.go#L133:L206

		txn, err := exp.db.CreateTransaction()
		if err != nil {
			return errors.Wrapf(err, "Unable to create fdb transaction")
		}

		keysRead := 0
		keysReadInOneTxn := 0
		keysReadInOneBatch := 0
		bytesRead := int64(0)
		lastReadKey, endKey := keyRange.FDBRangeKeys()
		batchReadLimit := 1_000_000
	Fetch:
		for {
			fKey := txn.GetRange(keyRange, fdb.RangeOptions{Limit: batchReadLimit, Mode: fdb.StreamingModeSerial})
			it := fKey.Iterator()
			for it.Advance() {
				// ---------------------------------------------------------
				// uncomment line below for testing only
				// time.Sleep(time.Millisecond * 1)
				// This is to artificially create the 5 second txn limit test
				// ---------------------------------------------------------

A couple of questions on this:

1. I know a newer API for splitting key ranges by size is coming in 7.0. For now, is creating a new transaction after a failure and resuming from the last key read the best approach?
2. I feel a bit uneasy about the byte slice returned by .Get(). How long is that slice valid? Should I clone it before assigning it to a variable (see line 178 of the linked file) that is referenced in the next loop iteration?
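To make the first question concrete, here is a minimal sketch of the "new transaction after a failure" pattern, using only the standard library. Everything in it is hypothetical: `fakeTxn`, `readFrom`, `exportRange`, and `errTxnTooOld` are stand-ins, not part of the FoundationDB Go binding. The fake transaction fails after serving a fixed number of keys, which imitates the 5-second limit; the loop then opens a fresh transaction and resumes just past the last key it read.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
)

// errTxnTooOld is a stand-in for FoundationDB's transaction_too_old error.
var errTxnTooOld = errors.New("transaction too old")

type kv struct{ Key, Value string }

// fakeTxn is a hypothetical transaction over sorted data. It serves at most
// budget keys before failing, imitating the transaction lifetime limit.
type fakeTxn struct {
	data   []kv // must be sorted by Key
	budget int
}

// readFrom returns up to limit pairs with begin <= Key < end. On failure it
// returns the pairs read so far together with errTxnTooOld.
func (t *fakeTxn) readFrom(begin, end string, limit int) ([]kv, error) {
	i := sort.Search(len(t.data), func(i int) bool { return t.data[i].Key >= begin })
	var out []kv
	for ; i < len(t.data) && t.data[i].Key < end && len(out) < limit; i++ {
		if t.budget == 0 {
			return out, errTxnTooOld
		}
		t.budget--
		out = append(out, t.data[i])
	}
	return out, nil
}

// exportRange reads [begin, end) in batches. When a transaction expires it
// starts a new one and resumes after the last key read. It returns the pairs
// read and the number of transactions used. The result is not a consistent
// snapshot, because each transaction may see a different version of the data.
func exportRange(data []kv, begin, end string, batch, budget int) ([]kv, int) {
	if batch <= 0 || budget <= 0 {
		return nil, 0 // guard: the loop below needs both to make progress
	}
	var out []kv
	txns := 1
	txn := &fakeTxn{data: data, budget: budget}
	next := begin
	for {
		got, err := txn.readFrom(next, end, batch)
		out = append(out, got...)
		if len(got) > 0 {
			// Appending a zero byte gives the smallest key strictly greater
			// than the last key read, so no key is read twice.
			next = got[len(got)-1].Key + "\x00"
		}
		if errors.Is(err, errTxnTooOld) {
			txn = &fakeTxn{data: data, budget: budget}
			txns++
			continue
		}
		if len(got) < batch {
			return out, txns // a short batch means the range is exhausted
		}
	}
}

func main() {
	data := []kv{{"a", "1"}, {"b", "2"}, {"c", "3"}, {"d", "4"}, {"e", "5"}}
	out, txns := exportRange(data, "a", "z", 2, 3)
	fmt.Println(len(out), "keys read using", txns, "transactions")
}
```

The point of the sketch is the bookkeeping: keep whatever was read before the failure, advance the start of the range past the last key, and only then retry.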

Future plans

The ideal setup, if it is possible, would be to narrow each key range down to a particular host and make the read local (by running an agent on that node), filter some data locally, and then return the data to the caller. The challenge, of course, is to test whether filtering adds any value and how to make it generic enough to be useful in an open-source project. I am working on it. Let me know if you have any thoughts. A lot of this is based on hunches; only tests will show whether there is any value in doing this.

Any feedback is appreciated. The WIP code is open source, although it is just a prototype for now.
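As a rough illustration of the first step in that plan, the sketch below splits one key range into sub-ranges at a list of boundary keys, so that each sub-range could be handed to a different agent. It is only a sketch under assumptions: `keyRange` and `splitRange` are hypothetical names, keys are plain strings, and the boundary keys are assumed to arrive already sorted (for example from whatever shard-boundary information the cluster exposes; obtaining them is outside this example).

```go
package main

import "fmt"

// keyRange is a hypothetical half-open range [Begin, End).
type keyRange struct{ Begin, End string }

// splitRange cuts [begin, end) at every boundary that falls strictly inside
// it. The boundaries slice is assumed to be sorted in ascending order.
func splitRange(begin, end string, boundaries []string) []keyRange {
	var out []keyRange
	cur := begin
	for _, b := range boundaries {
		if b <= cur || b >= end {
			continue // boundary is outside the remaining range
		}
		out = append(out, keyRange{cur, b})
		cur = b
	}
	return append(out, keyRange{cur, end})
}

func main() {
	for _, r := range splitRange("a", "z", []string{"a", "g", "m", "z"}) {
		fmt.Printf("[%q, %q)\n", r.Begin, r.End)
	}
}
```

Each resulting sub-range could then be read independently, which is also what would let a single-node export be spread over several workers.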

Thank you for your time
Hari

harikb · March 29, 2021, 7:12pm

I would appreciate any input on this from anyone who knows the Go binding.

I see that this code creates a new slice of KeyValue structs, but since .Key and .Value are filled in through unsafe pointer arithmetic, I am not quite sure what is going on:

github.com/apple/foundationdb/blob/main/bindings/go/src/fdb/futures.go#L297

      
	f.BlockUntilReady()

	var kvs *C.FDBKeyValue
	var count C.int
	var more C.fdb_bool_t

	if err := C.fdb_future_get_keyvalue_array(f.ptr, &kvs, &count, &more); err != 0 {
		return nil, false, Error{int(err)}
	}

	ret := make([]KeyValue, int(count))

	for i := 0; i < int(count); i++ {
		kvptr := unsafe.Pointer(uintptr(unsafe.Pointer(kvs)) + uintptr(i*24))

		ret[i].Key = stringRefToSlice(kvptr)
		ret[i].Value = stringRefToSlice(unsafe.Pointer(uintptr(kvptr) + 12))
	}

	return ret, (more != 0), nil
}

ajbeamon · March 29, 2021, 11:49pm

I am not a Go expert, but looking at this, it appears that stringRefToSlice is using C.GoBytes to copy the data from the C buffer into a Go buffer. My understanding, then, is that the memory lifetime is not tied to any underlying native object.
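To illustrate the difference this makes, here is a small pure-Go sketch (no cgo, so it does not call the binding itself). `copyBytes` is a hypothetical helper that does for a Go slice what C.GoBytes does for a C buffer: it returns a freshly allocated, Go-managed copy. The `native` slice stands in for the buffer owned by the future; this is an assumption made for the demonstration only.

```go
package main

import "fmt"

// copyBytes returns a new Go-managed slice holding a copy of src, which is
// the same guarantee C.GoBytes gives for data copied out of C memory.
func copyBytes(src []byte) []byte {
	dst := make([]byte, len(src))
	copy(dst, src)
	return dst
}

func main() {
	native := []byte("value-1") // stands in for the buffer owned by the future
	aliased := native[:]        // shares the same backing memory
	copied := copyBytes(native) // independent of the original buffer

	// Simulate the native buffer being reused after the future is released.
	copy(native, "XXXXXXX")

	fmt.Printf("aliased=%s copied=%s\n", aliased, copied)
	// prints: aliased=XXXXXXX copied=value-1
}
```

If the binding does copy in this way, a key or value kept across loop iterations or across transactions stays valid without an extra clone.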
