Go bindings crash every day

We have a reproducible error:

Internal Error @ ./flow/ThreadPrimitives.h 78:
addr2line -e libfdb_c.so-debug -p -C -f -i 0x3ac2c9 0xd63c0 0xd6509 0xd5415 0xffff80c6f305fd10
terminate called after throwing an instance of 'Error'
SIGABRT: abort
PC=0x7f390ce49428 m=12 sigcode=18446744073709551610

It happens in both the write and the read processes.

goroutine 0 [idle]:
runtime: unknown pc 0x7f390ce49428
stack: frame={sp:0x7f3900ff84c8, fp:0x0} stack=[0x7f39007f9168,0x7f3900ff8d68)
00007f3900ff83c8: 0000000000923608 0000000000000000
00007f3900ff83d8: 0000000000000000 0000000600000001
00007f3900ff83e8: 00007f3900000000 00007f3900ff8270
00007f3900ff83f8: 0000000000000000 00007f3900000000
00007f3900ff8408: 000000726f727245 00007f3900ff8588
00007f3900ff8418: 00007f390ddbab1f 0000000000000003
00007f3900ff8428: 00007f3900ff85e0 00007f3900ff85e0
00007f3900ff8438: 00007f3900ff8670 0000000000000002
00007f3900ff8448: 800000000000000e 0000000000000000
00007f3900ff8458: 0000000000000000 0000000000000000
00007f3900ff8468: 0000000000000000 0000000000000000
00007f3900ff8478: 0000000000000000 0000000000000000
00007f3900ff8488: 00007f38d00039b0 00007f38d00039b8
00007f3900ff8498: 00007f390d96f481 00007f390d1d9700
00007f3900ff84a8: 00007f38d000a690 00007f3900ff8740
00007f3900ff84b8: 0000000000000000 00007f390dac9a76
00007f3900ff84c8: <00007f390ce4b02a 0000000000000020
00007f3900ff84d8: 0000000000000000 0000000000000000
00007f3900ff84e8: 0000000000000000 0000000000000000
00007f3900ff84f8: 0000000000000000 0000000000000000
00007f3900ff8508: 0000000000000000 0000000000000000
00007f3900ff8518: 0000000000000000 0000000000000000
00007f3900ff8528: 0000000000000000 0000000000000000
00007f3900ff8538: 0000000000000000 0000000000000000
00007f3900ff8548: 0000000000000000 0000000000000000
00007f3900ff8558: 00007f390cf0b2e9 0000000000000000
00007f3900ff8568: 00007f390ce8cbff 00007f390d1d9540
00007f3900ff8578: 00007f390d96f481 00007f390d1d9700
00007f3900ff8588: 00007f38d000a690 00007f3900ff8740
00007f3900ff8598: 0000000000000000 00007f390dac9a76
00007f3900ff85a8: 00007f38d0012e10 0000000000000002
00007f3900ff85b8: 00007f38d0012e10 00007f38d000a690
runtime: unknown pc 0x7f390ce49428
stack: frame={sp:0x7f3900ff84c8, fp:0x0} stack=[0x7f39007f9168,0x7f3900ff8d68)
00007f3900ff83c8: 0000000000923608 0000000000000000
00007f3900ff83d8: 0000000000000000 0000000600000001
00007f3900ff83e8: 00007f3900000000 00007f3900ff8270
00007f3900ff83f8: 0000000000000000 00007f3900000000
00007f3900ff8408: 000000726f727245 00007f3900ff8588
00007f3900ff8418: 00007f390ddbab1f 0000000000000003
00007f3900ff8428: 00007f3900ff85e0 00007f3900ff85e0
00007f3900ff8438: 00007f3900ff8670 0000000000000002
00007f3900ff8448: 800000000000000e 0000000000000000
00007f3900ff8458: 0000000000000000 0000000000000000
00007f3900ff8468: 0000000000000000 0000000000000000
00007f3900ff8478: 0000000000000000 0000000000000000
00007f3900ff8488: 00007f38d00039b0 00007f38d00039b8
00007f3900ff8498: 00007f390d96f481 00007f390d1d9700
00007f3900ff84a8: 00007f38d000a690 00007f3900ff8740
00007f3900ff84b8: 0000000000000000 00007f390dac9a76
00007f3900ff84c8: <00007f390ce4b02a 0000000000000020
00007f3900ff84d8: 0000000000000000 0000000000000000
00007f3900ff84e8: 0000000000000000 0000000000000000
00007f3900ff84f8: 0000000000000000 0000000000000000
00007f3900ff8508: 0000000000000000 0000000000000000
00007f3900ff8518: 0000000000000000 0000000000000000
00007f3900ff8528: 0000000000000000 0000000000000000
00007f3900ff8538: 0000000000000000 0000000000000000
00007f3900ff8548: 0000000000000000 0000000000000000
00007f3900ff8558: 00007f390cf0b2e9 0000000000000000
00007f3900ff8568: 00007f390ce8cbff 00007f390d1d9540
00007f3900ff8578: 00007f390d96f481 00007f390d1d9700
00007f3900ff8588: 00007f38d000a690 00007f3900ff8740
00007f3900ff8598: 0000000000000000 00007f390dac9a76
00007f3900ff85a8: 00007f38d0012e10 0000000000000002
00007f3900ff85b8: 00007f38d0012e10 00007f38d000a690

goroutine 4 [syscall]:
runtime.cgocall(0x55a5f0, 0xc000039ed0, 0x13c6eb0)
/opt/go/src/runtime/cgocall.go:128 +0x5b fp=0xc000039ea0 sp=0xc000039e68 pc=0x40631b
github.com/apple/foundationdb/bindings/go/src/fdb._Cfunc_fdb_future_destroy(0x13caf70)
_cgo_gotypes.go:219 +0x41 fp=0xc000039ed0 sp=0xc000039ea0 pc=0x51ebd1
github.com/apple/foundationdb/bindings/go/src/fdb.newFuture.func1.1(0xc000010780)
/home/pvv/GOPKG/pkg/mod/github.com/apple/foundationdb/bindings/go@v0.0.0-20190311170436-f2d582ffa197/src/fdb/futures.go:81 +0x5e fp=0xc000039f10 sp=0xc000039ed0 pc=0x524c0e
github.com/apple/foundationdb/bindings/go/src/fdb.newFuture.func1(0xc000010780)
/home/pvv/GOPKG/pkg/mod/github.com/apple/foundationdb/bindings/go@v0.0.0-20190311170436-f2d582ffa197/src/fdb/futures.go:81 +0x2b fp=0xc000039f28 sp=0xc000039f10 pc=0x524c4b
runtime.call32(0x0, 0x5c02b0, 0xc00084a000, 0x1000000010)
/opt/go/src/runtime/asm_amd64.s:519 +0x3b fp=0xc000039f58 sp=0xc000039f28 pc=0x45987b
runtime.runfinq()
/opt/go/src/runtime/mfinal.go:222 +0x1e2 fp=0xc000039fe0 sp=0xc000039f58 pc=0x41a592
runtime.goexit()
/opt/go/src/runtime/asm_amd64.s:1337 +0x1 fp=0xc000039fe8 sp=0xc000039fe0 pc=0x45b591
created by runtime.createfing
/opt/go/src/runtime/mfinal.go:156 +0x61

We now run 2 services behind HAProxy to work around the crashes,
but it still crashes every day.

Configuration:
Redundancy mode - double
Storage engine - ssd-2
Coordinators - 3

Cluster:
FoundationDB processes - 45
Machines - 3
Memory availability - 25.0 GB per process on machine with least available
Retransmissions rate - 2 Hz
Fault Tolerance - 1 machine
Server time - 03/13/19 10:37:00

Data:
Replication health - Healthy (Rebalancing)
Moving data - 0.087 GB
Sum of key-value sizes - 936.280 GB
Disk space used - 2.735 TB

Operating space:
Storage server - 734.1 GB free on most full server
Log server - 726.8 GB free on most full server

Workload:
Read rate - 4112 Hz
Write rate - 5696 Hz
Transactions started - 994 Hz
Transactions committed - 947 Hz
Conflict rate - 0 Hz

Backup and DR:
Running backups - 0
Running DRs - 1 as primary

Running DR tags (as primary):
default - a32b529dcd247588333ecea113404e82

Process performance details:
172.20.130.213:4500 ( 9% cpu; 11% machine; 0.093 Gbps; 61% disk IO; 3.1 GB / 25.1 GB RAM )
172.20.130.213:4501 ( 9% cpu; 11% machine; 0.093 Gbps; 60% disk IO; 3.3 GB / 25.1 GB RAM )
172.20.130.213:4502 ( 9% cpu; 11% machine; 0.093 Gbps; 61% disk IO; 3.0 GB / 25.1 GB RAM )
172.20.130.213:4503 ( 9% cpu; 11% machine; 0.093 Gbps; 61% disk IO; 3.0 GB / 25.1 GB RAM )
172.20.130.213:4504 ( 9% cpu; 11% machine; 0.093 Gbps; 61% disk IO; 3.1 GB / 25.1 GB RAM )
172.20.130.213:4505 ( 9% cpu; 11% machine; 0.093 Gbps; 61% disk IO; 3.1 GB / 25.1 GB RAM )
172.20.130.213:4506 ( 34% cpu; 11% machine; 0.093 Gbps; 59% disk IO; 3.4 GB / 25.1 GB RAM )
172.20.130.213:4507 ( 8% cpu; 11% machine; 0.093 Gbps; 61% disk IO; 3.1 GB / 25.1 GB RAM )
172.20.130.213:4508 ( 18% cpu; 11% machine; 0.093 Gbps; 61% disk IO; 3.2 GB / 25.1 GB RAM )
172.20.130.213:4509 ( 14% cpu; 11% machine; 0.093 Gbps; 60% disk IO; 3.1 GB / 25.1 GB RAM )
172.20.130.213:4510 ( 8% cpu; 11% machine; 0.093 Gbps; 61% disk IO; 3.1 GB / 25.1 GB RAM )
172.20.130.213:4511 ( 22% cpu; 11% machine; 0.093 Gbps; 60% disk IO; 3.1 GB / 25.1 GB RAM )
172.20.130.213:4512 ( 9% cpu; 11% machine; 0.093 Gbps; 61% disk IO; 3.1 GB / 25.1 GB RAM )
172.20.130.213:4513 ( 9% cpu; 11% machine; 0.093 Gbps; 61% disk IO; 3.2 GB / 25.1 GB RAM )
172.20.130.213:4514 ( 9% cpu; 11% machine; 0.093 Gbps; 61% disk IO; 3.2 GB / 25.1 GB RAM )
172.20.130.214:4500 ( 16% cpu; 17% machine; 0.067 Gbps; 72% disk IO; 3.2 GB / 25.1 GB RAM )
172.20.130.214:4501 ( 8% cpu; 17% machine; 0.067 Gbps; 73% disk IO; 3.0 GB / 25.1 GB RAM )
172.20.130.214:4502 ( 9% cpu; 17% machine; 0.067 Gbps; 73% disk IO; 3.0 GB / 25.1 GB RAM )
172.20.130.214:4503 ( 8% cpu; 17% machine; 0.067 Gbps; 74% disk IO; 3.0 GB / 25.1 GB RAM )
172.20.130.214:4504 ( 8% cpu; 17% machine; 0.067 Gbps; 73% disk IO; 2.9 GB / 25.1 GB RAM )
172.20.130.214:4505 ( 9% cpu; 17% machine; 0.067 Gbps; 73% disk IO; 3.1 GB / 25.1 GB RAM )
172.20.130.214:4506 ( 16% cpu; 17% machine; 0.067 Gbps; 73% disk IO; 3.1 GB / 25.1 GB RAM )
172.20.130.214:4507 ( 32% cpu; 17% machine; 0.067 Gbps; 73% disk IO; 3.4 GB / 25.1 GB RAM )
172.20.130.214:4508 ( 12% cpu; 17% machine; 0.067 Gbps; 73% disk IO; 3.0 GB / 25.1 GB RAM )
172.20.130.214:4509 ( 9% cpu; 17% machine; 0.067 Gbps; 73% disk IO; 3.1 GB / 25.1 GB RAM )
172.20.130.214:4510 ( 32% cpu; 17% machine; 0.067 Gbps; 74% disk IO; 3.4 GB / 25.1 GB RAM )
172.20.130.214:4511 ( 8% cpu; 17% machine; 0.067 Gbps; 73% disk IO; 3.1 GB / 25.1 GB RAM )
172.20.130.214:4512 ( 22% cpu; 17% machine; 0.067 Gbps; 73% disk IO; 3.1 GB / 25.1 GB RAM )
172.20.130.214:4513 ( 9% cpu; 17% machine; 0.067 Gbps; 73% disk IO; 3.2 GB / 25.1 GB RAM )
172.20.130.214:4514 ( 84% cpu; 17% machine; 0.067 Gbps; 73% disk IO; 3.1 GB / 25.1 GB RAM )
172.20.130.215:4500 ( 8% cpu; 13% machine; 0.067 Gbps; 70% disk IO; 3.0 GB / 25.0 GB RAM )
172.20.130.215:4501 ( 82% cpu; 13% machine; 0.067 Gbps; 69% disk IO; 2.8 GB / 25.0 GB RAM )
172.20.130.215:4502 ( 8% cpu; 13% machine; 0.067 Gbps; 70% disk IO; 3.0 GB / 25.0 GB RAM )
172.20.130.215:4503 ( 9% cpu; 13% machine; 0.067 Gbps; 70% disk IO; 3.0 GB / 25.0 GB RAM )
172.20.130.215:4504 ( 8% cpu; 13% machine; 0.067 Gbps; 70% disk IO; 2.9 GB / 25.0 GB RAM )
172.20.130.215:4505 ( 8% cpu; 13% machine; 0.067 Gbps; 70% disk IO; 2.9 GB / 25.0 GB RAM )
172.20.130.215:4506 ( 8% cpu; 13% machine; 0.067 Gbps; 70% disk IO; 2.9 GB / 25.0 GB RAM )
172.20.130.215:4507 ( 8% cpu; 13% machine; 0.067 Gbps; 69% disk IO; 3.2 GB / 25.0 GB RAM )
172.20.130.215:4508 ( 14% cpu; 13% machine; 0.067 Gbps; 70% disk IO; 2.9 GB / 25.0 GB RAM )
172.20.130.215:4509 ( 8% cpu; 13% machine; 0.067 Gbps; 70% disk IO; 3.0 GB / 25.0 GB RAM )
172.20.130.215:4510 ( 9% cpu; 13% machine; 0.067 Gbps; 70% disk IO; 6.5 GB / 25.0 GB RAM )
172.20.130.215:4511 ( 8% cpu; 13% machine; 0.067 Gbps; 70% disk IO; 3.0 GB / 25.0 GB RAM )
172.20.130.215:4512 ( 9% cpu; 13% machine; 0.067 Gbps; 70% disk IO; 3.0 GB / 25.0 GB RAM )
172.20.130.215:4513 ( 8% cpu; 13% machine; 0.067 Gbps; 69% disk IO; 3.1 GB / 25.0 GB RAM )
172.20.130.215:4514 ( 9% cpu; 13% machine; 0.067 Gbps; 70% disk IO; 3.0 GB / 25.0 GB RAM )

Coordination servers:
172.20.130.213:4500 (reachable)
172.20.130.214:4500 (reachable)
172.20.130.215:4500 (reachable)

I was thinking this was resolved when you switched to using the retry loops and not deferring tr.commit. Is that not the case?

It wasn't resolved.
We still have crashes.

We changed the code to:

aStr, err := fdb.Transact(func(tr fdb.Transaction) (result interface{}, err error) {
		h1 := program.GetHkey1(ttlpart)
		titles, err := directory.CreateOrOpen(tr, []string{`xxx`}, nil)
		if err != nil {
			return
		}

		listHK1 := titles.Sub(`hkey1`)
		listId := titles.Sub(`list`)

		rr := tr.GetRange(listHK1.Sub(*h1), fdb.RangeOptions{})
		ri := rr.Iterator()
		ttlCnt := 0
		rezz := []string{}
		for ri.Advance() {
			var kv fdb.KeyValue
			kv, err = ri.Get()
			if err != nil {
				fmt.Printf("Unable to read next value: %v\n", err)
				return
			}
			ttlCnt++

			var unkey tuple.Tuple
			unkey, err = listHK1.Unpack(kv.Key)

			if err != nil {
				log.Println(err)
				return
			}

			keyId := listId.Pack(tuple.Tuple{unkey[1].(string)})

			fbyte := tr.Get(keyId)

			var data []byte
			data, err = fbyte.Get()
			if err != nil {
				return
			}

			rezz = append(rezz, string(data))
		}
		return rezz, err
	})
2019/03/14 03:32:22 FdbTest.go:24: All 32810441 correct 32802351 In second  977
Internal Error @ ./flow/ThreadPrimitives.h 78:
  addr2line -e libfdb_c.so-debug -p -C -f -i 0x3ac2c9 0xd63c0 0xd6509 0xd5415 0xffff80d44a351d10
terminate called after throwing an instance of 'Error'
SIGABRT: abort
PC=0x7f2bb5b57428 m=13 sigcode=18446744073709551610

goroutine 0 [idle]:
runtime: unknown pc 0x7f2bb5b57428
stack: frame={sp:0x7f2ba97694c8, fp:0x0} stack=[0x7f2ba8f6a168,0x7f2ba9769d68)
00007f2ba97693c8:  0000000000923608  0000000000000000 
00007f2ba97693d8:  0000000000000000  0000000600000001 
00007f2ba97693e8:  00007f2b00000000  00007f2ba9769270 
00007f2ba97693f8:  0000000000000000  00007f2b00000000 
00007f2ba9769408:  000000726f727245  00007f2ba9769588 
00007f2ba9769418:  00007f2bb6ac8b1f  0000000000000003 
00007f2ba9769428:  00007f2ba97695e0  00007f2ba97695e0 
00007f2ba9769438:  00007f2ba9769670  0000000000000002 
00007f2ba9769448:  800000000000000e  0000000000000000 
00007f2ba9769458:  0000000000000000  0000000000000000 
00007f2ba9769468:  0000000000000000  0000000000000000 
00007f2ba9769478:  0000000000000000  0000000000000000 
00007f2ba9769488:  00007f2b84000b70  00007f2b84000b78 
00007f2ba9769498:  00007f2bb667d481  00007f2bb5ee7700 
00007f2ba97694a8:  00007f2b84011be0  00007f2ba9769740 
00007f2ba97694b8:  0000000000000000  00007f2bb67d7a76 
00007f2ba97694c8: <00007f2bb5b5902a  0000000000000020 
00007f2ba97694d8:  0000000000000000  0000000000000000 
00007f2ba97694e8:  0000000000000000  0000000000000000 
00007f2ba97694f8:  0000000000000000  0000000000000000 
00007f2ba9769508:  0000000000000000  0000000000000000 
00007f2ba9769518:  0000000000000000  0000000000000000 
00007f2ba9769528:  0000000000000000  0000000000000000 
00007f2ba9769538:  0000000000000000  0000000000000000 
00007f2ba9769548:  0000000000000000  0000000000000000 
00007f2ba9769558:  00007f2bb5c192e9  0000000000000000 
00007f2ba9769568:  00007f2bb5b9abff  00007f2bb5ee7540 
00007f2ba9769578:  00007f2bb667d481  00007f2bb5ee7700 
00007f2ba9769588:  00007f2b84011be0  00007f2ba9769740 
00007f2ba9769598:  0000000000000000  00007f2bb67d7a76 
00007f2ba97695a8:  00007f2b84012b90  0000000000000002 
00007f2ba97695b8:  00007f2b84012b90  00007f2b84011be0 
runtime: unknown pc 0x7f2bb5b57428
stack: frame={sp:0x7f2ba97694c8, fp:0x0} stack=[0x7f2ba8f6a168,0x7f2ba9769d68)
00007f2ba97693c8:  0000000000923608  0000000000000000 
00007f2ba97693d8:  0000000000000000  0000000600000001 
00007f2ba97693e8:  00007f2b00000000  00007f2ba9769270 
00007f2ba97693f8:  0000000000000000  00007f2b00000000 
00007f2ba9769408:  000000726f727245  00007f2ba9769588 
00007f2ba9769418:  00007f2bb6ac8b1f  0000000000000003 
00007f2ba9769428:  00007f2ba97695e0  00007f2ba97695e0 
00007f2ba9769438:  00007f2ba9769670  0000000000000002 
00007f2ba9769448:  800000000000000e  0000000000000000 
00007f2ba9769458:  0000000000000000  0000000000000000 
00007f2ba9769468:  0000000000000000  0000000000000000 
00007f2ba9769478:  0000000000000000  0000000000000000 
00007f2ba9769488:  00007f2b84000b70  00007f2b84000b78 
00007f2ba9769498:  00007f2bb667d481  00007f2bb5ee7700 
00007f2ba97694a8:  00007f2b84011be0  00007f2ba9769740 
00007f2ba97694b8:  0000000000000000  00007f2bb67d7a76 
00007f2ba97694c8: <00007f2bb5b5902a  0000000000000020 
00007f2ba97694d8:  0000000000000000  0000000000000000 
00007f2ba97694e8:  0000000000000000  0000000000000000 
00007f2ba97694f8:  0000000000000000  0000000000000000 
00007f2ba9769508:  0000000000000000  0000000000000000 
00007f2ba9769518:  0000000000000000  0000000000000000 
00007f2ba9769528:  0000000000000000  0000000000000000 
00007f2ba9769538:  0000000000000000  0000000000000000 
00007f2ba9769548:  0000000000000000  0000000000000000 
00007f2ba9769558:  00007f2bb5c192e9  0000000000000000 
00007f2ba9769568:  00007f2bb5b9abff  00007f2bb5ee7540 
00007f2ba9769578:  00007f2bb667d481  00007f2bb5ee7700 
00007f2ba9769588:  00007f2b84011be0  00007f2ba9769740 
00007f2ba9769598:  0000000000000000  00007f2bb67d7a76 
00007f2ba97695a8:  00007f2b84012b90  0000000000000002 
00007f2ba97695b8:  00007f2b84012b90  00007f2b84011be0 

goroutine 4 [syscall]:
runtime.cgocall(0x55a940, 0xc000039ed0, 0x8c00c820)
	/opt/go/src/runtime/cgocall.go:128 +0x5b fp=0xc000039ea0 sp=0xc000039e68 pc=0x40631b
github.com/apple/foundationdb/bindings/go/src/fdb._Cfunc_fdb_future_destroy(0x7f2bac008cb0)
	_cgo_gotypes.go:219 +0x41 fp=0xc000039ed0 sp=0xc000039ea0 pc=0x51ebd1
github.com/apple/foundationdb/bindings/go/src/fdb.newFuture.func1.1(0xc0002eec08)
	/home/pvv/GOPKG/pkg/mod/github.com/apple/foundationdb/bindings/go@v0.0.0-20190311170436-f2d582ffa197/src/fdb/futures.go:81 +0x5e fp=0xc000039f10 sp=0xc000039ed0 pc=0x524c0e
github.com/apple/foundationdb/bindings/go/src/fdb.newFuture.func1(0xc0002eec08)
	/home/pvv/GOPKG/pkg/mod/github.com/apple/foundationdb/bindings/go@v0.0.0-20190311170436-f2d582ffa197/src/fdb/futures.go:81 +0x2b fp=0xc000039f28 sp=0xc000039f10 pc=0x524c4b
runtime.call32(0x0, 0x5c06f8, 0xc000532000, 0x1000000010)
	/opt/go/src/runtime/asm_amd64.s:519 +0x3b fp=0xc000039f58 sp=0xc000039f28 pc=0x45987b
runtime.runfinq()
	/opt/go/src/runtime/mfinal.go:222 +0x1e2 fp=0xc000039fe0 sp=0xc000039f58 pc=0x41a592
runtime.goexit()
	/opt/go/src/runtime/asm_amd64.s:1337 +0x1 fp=0xc000039fe8 sp=0xc000039fe0 pc=0x45b591
created by runtime.createfing
	/opt/go/src/runtime/mfinal.go:156 +0x61

Every day, after 1-12 hours, we get a crash.

Our testing time is over. Our test stand has to go into production in a different configuration.

Do you mind filing a bug on the foundationdb github issues with all this detail? It’s more likely to be picked up there.

Really? Are you kidding?

ajbeamon commented [2 days ago]

If you don’t mind, could you raise some of these questions in posts on the forums? We usually reserve GitHub issues for specific tasks, bugs, or features rather than discussions about using FoundationDB.

Hmm ok, my mistake then. The error you’re getting suggests that a future is being destroyed while it’s being used, but I’m not spotting any obvious reason why your code should be causing that to happen. In that case, it may very well be something going on inside the binding itself. The only place from the snippet where I would expect an outstanding future is within the range iterator, so perhaps something is going wrong there. We’ll have to dig in to confirm, though.
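For readers unfamiliar with how a finalizer can destroy an object that is still in use, here is a minimal sketch of the general Go hazard. It is not taken from the binding: `nativeFuture`, `future`, `newFuture`, and `get` are hypothetical names, and whether this pattern is what is going wrong here is unconfirmed. The sketch only shows that once a method has copied the raw pointer out of its receiver, the receiver may become unreachable, and that `runtime.KeepAlive` is the standard way to keep it reachable until the pointer is no longer needed.

```go
package main

import (
	"errors"
	"fmt"
	"runtime"
	"sync/atomic"
)

// nativeFuture is a hypothetical stand-in for an object owned by a C library.
type nativeFuture struct {
	destroyed int32 // set to 1 once the object has been "freed"
	value     string
}

// future is a hypothetical Go wrapper whose finalizer destroys the native object.
type future struct {
	ptr *nativeFuture
}

func newFuture(value string) *future {
	f := &future{ptr: &nativeFuture{value: value}}
	// The garbage collector may run this once f is unreachable.
	runtime.SetFinalizer(f, func(f *future) {
		atomic.StoreInt32(&f.ptr.destroyed, 1)
	})
	return f
}

// get reads through the raw pointer. After p is copied out, nothing else in
// this function refers to f, so without the KeepAlive call the collector
// would be allowed to finalize f while p is still being used.
func (f *future) get() (string, error) {
	p := f.ptr
	if atomic.LoadInt32(&p.destroyed) == 1 {
		return "", errors.New("future used after destroy")
	}
	v := p.value
	runtime.KeepAlive(f) // f stays reachable at least until this point
	return v, nil
}

func main() {
	f := newFuture("hello")
	v, err := f.get()
	fmt.Println(v, err)
}
```

In a real cgo wrapper the "destroy" step would free native memory rather than set a flag, so a premature finalizer run would show up as a crash in native code rather than as a Go error.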

As a side note, if you get an error within the Transact function, you should return it. This is how the error gets checked for being retryable, and whether or not it is, the call to commit will be skipped. See the fdb package documentation (github.com/apple/foundationdb/bindings/go/src/fdb).

Also since it came up earlier, if you want to limit the number of retries that a transaction will do and be able to see the last error that occurred if it exceeds the limit, use the SetRetryLimit transaction option.
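To make the two notes above concrete, here is a simplified sketch of a retry loop written with the standard library only. It is not the binding's implementation: `transact`, `errRetryable`, and the `retryLimit` parameter are hypothetical stand-ins, and the sketch assumes that a limit of N allows N retries after the first attempt. It shows that a returned error skips the commit, that only retryable errors cause another attempt, and that the last error is handed back once the limit is exceeded.

```go
package main

import (
	"errors"
	"fmt"
)

// errRetryable is a hypothetical marker for failures that are safe to retry.
var errRetryable = errors.New("retryable")

// transact is a simplified, hypothetical retry loop. It runs body, calls
// commit only when body returned no error, retries retryable errors up to
// retryLimit times, and otherwise returns the last error it saw.
func transact(retryLimit int, body func() (interface{}, error), commit func() error) (interface{}, error) {
	var lastErr error
	for attempt := 0; attempt <= retryLimit; attempt++ {
		result, err := body()
		if err == nil {
			err = commit() // reached only when body reported no error
		}
		if err == nil {
			return result, nil
		}
		lastErr = err
		if !errors.Is(err, errRetryable) {
			break // not retryable: give up immediately
		}
	}
	return nil, lastErr
}

func main() {
	calls, commits := 0, 0
	body := func() (interface{}, error) {
		calls++
		if calls < 3 {
			// Returning the error is what lets the loop decide to retry.
			return nil, fmt.Errorf("attempt %d: %w", calls, errRetryable)
		}
		return "ok", nil
	}
	commit := func() error { commits++; return nil }

	res, err := transact(5, body, commit)
	fmt.Println(res, err, calls, commits) // prints "ok <nil> 3 1"
}
```

If the body logged the error and returned nil instead, this loop would go on to commit and report success, which is why the error should be returned.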

Yes, we understand how errors are checked for retryability.

We get this error every 1-12 hours under a test load of 1200 rps and 1000 wps.
For now we use an nginx upstream proxy in front of 2 services to serve the test, and all is OK.

Sorry, we can't continue the tests; these servers are needed for other jobs.

You can find the code here.
The traces are here:
debug_trace.log
debug_trace2.log

The test data is here: test.log
Our test log is about 300 GB in size.

Writer test:
main.go

Read test:
DataLog_test.go

I added the latest error to my repo.
Look at:
write_1000.log
debug_trace_3.log

These are the latest crashes with the latest version of the code.

Do you have any ideas on how to make the application stable?
Can it return an error instead of crashing?

I haven’t yet been able to identify what’s causing the error, so I can’t really offer much in the way of advice yet. I put together a little program of my own to try to reproduce, but it was unsuccessful. Let me try running the test you put together to see if it reproduces the problem for me.

Hi Folks,

We too have been occasionally hitting panics with the go bindings on

fdb._Cfunc_fdb_future_destroy(0x7f2bac008cb0)

My co-worker just submitted a PR that might be of use. So far we haven’t hit panics with this patch. Currently waiting on review. https://github.com/apple/foundationdb/pull/1451

Good find, thanks for the PR. @Vasilii if you are able to test this change, does it make things better for you?

Related to this, we are planning to eliminate use of finalizers and require explicit calls to close objects with native resources. See this issue for details:

Hi,
I am experiencing the same panics and I am interested in trying out your patch.
Since I am fairly new to go, can you refer me to any resources that explain how to deploy your patch?

I copied your modified files in the binding/go folder of my foundationdb cloned repository, and I can compile with make fdb_go (on Linux, using the docker image).

What are the next steps to instruct my go application to use this new version of the bindings? Apparently, my app is still using the old version (I added some println in the new bindings and I cannot see those prints).

Thank you.

Hi, we installed the new version and ran some tests.

$ fdbcli --exec "status"
Using cluster file `/etc/foundationdb/fdb.cluster'.

Configuration:
Redundancy mode - triple
Storage engine - ssd-2
Coordinators - 5

Cluster:
FoundationDB processes - 300
Machines - 10
Memory availability - 8.5 GB per process on machine with least available
Retransmissions rate - 4 Hz
Fault Tolerance - 2 machines
Server time - 07/05/19 18:11:00

Data:
Replication health - Healthy
Moving data - 0.000 GB
Sum of key-value sizes - 402 MB
Disk space used - 39.190 GB

Operating space:
Storage server - 1695.8 GB free on most full server
Log server - 1695.8 GB free on most full server

Workload:
Read rate - 6569 Hz
Write rate - 1268 Hz
Transactions started - 301 Hz
Transactions committed - 283 Hz
Conflict rate - 14 Hz

Backup and DR:
Running backups - 0
Running DRs - 0

Client time: 07/05/19 18:10:43

The test used about 300 GB of data and ran for one week (we have already cleared the data).

No problems arose.
