Vasilii
(Vasiliy Popkov)
March 13, 2019, 7:03am
1
We have a reproducible error in both the write and the read process:
Internal Error @ ./flow/ThreadPrimitives.h 78:
addr2line -e libfdb_c.so-debug -p -C -f -i 0x3ac2c9 0xd63c0 0xd6509 0xd5415 0xffff80c6f305fd10
terminate called after throwing an instance of 'Error'
SIGABRT: abort
PC=0x7f390ce49428 m=12 sigcode=18446744073709551610
goroutine 0 [idle]:
runtime: unknown pc 0x7f390ce49428
stack: frame={sp:0x7f3900ff84c8, fp:0x0} stack=[0x7f39007f9168,0x7f3900ff8d68)
00007f3900ff83c8: 0000000000923608 0000000000000000
00007f3900ff83d8: 0000000000000000 0000000600000001
00007f3900ff83e8: 00007f3900000000 00007f3900ff8270
00007f3900ff83f8: 0000000000000000 00007f3900000000
00007f3900ff8408: 000000726f727245 00007f3900ff8588
00007f3900ff8418: 00007f390ddbab1f 0000000000000003
00007f3900ff8428: 00007f3900ff85e0 00007f3900ff85e0
00007f3900ff8438: 00007f3900ff8670 0000000000000002
00007f3900ff8448: 800000000000000e 0000000000000000
00007f3900ff8458: 0000000000000000 0000000000000000
00007f3900ff8468: 0000000000000000 0000000000000000
00007f3900ff8478: 0000000000000000 0000000000000000
00007f3900ff8488: 00007f38d00039b0 00007f38d00039b8
00007f3900ff8498: 00007f390d96f481 00007f390d1d9700
00007f3900ff84a8: 00007f38d000a690 00007f3900ff8740
00007f3900ff84b8: 0000000000000000 00007f390dac9a76
00007f3900ff84c8: <00007f390ce4b02a 0000000000000020
00007f3900ff84d8: 0000000000000000 0000000000000000
00007f3900ff84e8: 0000000000000000 0000000000000000
00007f3900ff84f8: 0000000000000000 0000000000000000
00007f3900ff8508: 0000000000000000 0000000000000000
00007f3900ff8518: 0000000000000000 0000000000000000
00007f3900ff8528: 0000000000000000 0000000000000000
00007f3900ff8538: 0000000000000000 0000000000000000
00007f3900ff8548: 0000000000000000 0000000000000000
00007f3900ff8558: 00007f390cf0b2e9 0000000000000000
00007f3900ff8568: 00007f390ce8cbff 00007f390d1d9540
00007f3900ff8578: 00007f390d96f481 00007f390d1d9700
00007f3900ff8588: 00007f38d000a690 00007f3900ff8740
00007f3900ff8598: 0000000000000000 00007f390dac9a76
00007f3900ff85a8: 00007f38d0012e10 0000000000000002
00007f3900ff85b8: 00007f38d0012e10 00007f38d000a690
runtime: unknown pc 0x7f390ce49428
stack: frame={sp:0x7f3900ff84c8, fp:0x0} stack=[0x7f39007f9168,0x7f3900ff8d68)
00007f3900ff83c8: 0000000000923608 0000000000000000
00007f3900ff83d8: 0000000000000000 0000000600000001
00007f3900ff83e8: 00007f3900000000 00007f3900ff8270
00007f3900ff83f8: 0000000000000000 00007f3900000000
00007f3900ff8408: 000000726f727245 00007f3900ff8588
00007f3900ff8418: 00007f390ddbab1f 0000000000000003
00007f3900ff8428: 00007f3900ff85e0 00007f3900ff85e0
00007f3900ff8438: 00007f3900ff8670 0000000000000002
00007f3900ff8448: 800000000000000e 0000000000000000
00007f3900ff8458: 0000000000000000 0000000000000000
00007f3900ff8468: 0000000000000000 0000000000000000
00007f3900ff8478: 0000000000000000 0000000000000000
00007f3900ff8488: 00007f38d00039b0 00007f38d00039b8
00007f3900ff8498: 00007f390d96f481 00007f390d1d9700
00007f3900ff84a8: 00007f38d000a690 00007f3900ff8740
00007f3900ff84b8: 0000000000000000 00007f390dac9a76
00007f3900ff84c8: <00007f390ce4b02a 0000000000000020
00007f3900ff84d8: 0000000000000000 0000000000000000
00007f3900ff84e8: 0000000000000000 0000000000000000
00007f3900ff84f8: 0000000000000000 0000000000000000
00007f3900ff8508: 0000000000000000 0000000000000000
00007f3900ff8518: 0000000000000000 0000000000000000
00007f3900ff8528: 0000000000000000 0000000000000000
00007f3900ff8538: 0000000000000000 0000000000000000
00007f3900ff8548: 0000000000000000 0000000000000000
00007f3900ff8558: 00007f390cf0b2e9 0000000000000000
00007f3900ff8568: 00007f390ce8cbff 00007f390d1d9540
00007f3900ff8578: 00007f390d96f481 00007f390d1d9700
00007f3900ff8588: 00007f38d000a690 00007f3900ff8740
00007f3900ff8598: 0000000000000000 00007f390dac9a76
00007f3900ff85a8: 00007f38d0012e10 0000000000000002
00007f3900ff85b8: 00007f38d0012e10 00007f38d000a690
goroutine 4 [syscall]:
runtime.cgocall(0x55a5f0, 0xc000039ed0, 0x13c6eb0)
/opt/go/src/runtime/cgocall.go:128 +0x5b fp=0xc000039ea0 sp=0xc000039e68 pc=0x40631b
github.com/apple/foundationdb/bindings/go/src/fdb._Cfunc_fdb_future_destroy(0x13caf70)
_cgo_gotypes.go:219 +0x41 fp=0xc000039ed0 sp=0xc000039ea0 pc=0x51ebd1
github.com/apple/foundationdb/bindings/go/src/fdb.newFuture.func1.1(0xc000010780)
/home/pvv/GOPKG/pkg/mod/github.com/apple/foundationdb/bindings/go@v0.0.0-20190311170436-f2d582ffa197/src/fdb/futures.go:81 +0x5e fp=0xc000039f10 sp=0xc000039ed0 pc=0x524c0e
github.com/apple/foundationdb/bindings/go/src/fdb.newFuture.func1(0xc000010780)
/home/pvv/GOPKG/pkg/mod/github.com/apple/foundationdb/bindings/go@v0.0.0-20190311170436-f2d582ffa197/src/fdb/futures.go:81 +0x2b fp=0xc000039f28 sp=0xc000039f10 pc=0x524c4b
runtime.call32(0x0, 0x5c02b0, 0xc00084a000, 0x1000000010)
/opt/go/src/runtime/asm_amd64.s:519 +0x3b fp=0xc000039f58 sp=0xc000039f28 pc=0x45987b
runtime.runfinq()
/opt/go/src/runtime/mfinal.go:222 +0x1e2 fp=0xc000039fe0 sp=0xc000039f58 pc=0x41a592
runtime.goexit()
/opt/go/src/runtime/asm_amd64.s:1337 +0x1 fp=0xc000039fe8 sp=0xc000039fe0 pc=0x45b591
created by runtime.createfing
/opt/go/src/runtime/mfinal.go:156 +0x61
Vasilii
(Vasiliy Popkov)
March 13, 2019, 7:12am
2
We now run 2 service instances behind HAProxy to work around the crashes,
but it still crashes every day.
Vasilii
(Vasiliy Popkov)
March 13, 2019, 7:40am
3
Configuration:
Redundancy mode - double
Storage engine - ssd-2
Coordinators - 3
Cluster:
FoundationDB processes - 45
Machines - 3
Memory availability - 25.0 GB per process on machine with least available
Retransmissions rate - 2 Hz
Fault Tolerance - 1 machine
Server time - 03/13/19 10:37:00
Data:
Replication health - Healthy (Rebalancing)
Moving data - 0.087 GB
Sum of key-value sizes - 936.280 GB
Disk space used - 2.735 TB
Operating space:
Storage server - 734.1 GB free on most full server
Log server - 726.8 GB free on most full server
Workload:
Read rate - 4112 Hz
Write rate - 5696 Hz
Transactions started - 994 Hz
Transactions committed - 947 Hz
Conflict rate - 0 Hz
Backup and DR:
Running backups - 0
Running DRs - 1 as primary
Running DR tags (as primary):
default - a32b529dcd247588333ecea113404e82
Process performance details:
172.20.130.213:4500 ( 9% cpu; 11% machine; 0.093 Gbps; 61% disk IO; 3.1 GB / 25.1 GB RAM )
172.20.130.213:4501 ( 9% cpu; 11% machine; 0.093 Gbps; 60% disk IO; 3.3 GB / 25.1 GB RAM )
172.20.130.213:4502 ( 9% cpu; 11% machine; 0.093 Gbps; 61% disk IO; 3.0 GB / 25.1 GB RAM )
172.20.130.213:4503 ( 9% cpu; 11% machine; 0.093 Gbps; 61% disk IO; 3.0 GB / 25.1 GB RAM )
172.20.130.213:4504 ( 9% cpu; 11% machine; 0.093 Gbps; 61% disk IO; 3.1 GB / 25.1 GB RAM )
172.20.130.213:4505 ( 9% cpu; 11% machine; 0.093 Gbps; 61% disk IO; 3.1 GB / 25.1 GB RAM )
172.20.130.213:4506 ( 34% cpu; 11% machine; 0.093 Gbps; 59% disk IO; 3.4 GB / 25.1 GB RAM )
172.20.130.213:4507 ( 8% cpu; 11% machine; 0.093 Gbps; 61% disk IO; 3.1 GB / 25.1 GB RAM )
172.20.130.213:4508 ( 18% cpu; 11% machine; 0.093 Gbps; 61% disk IO; 3.2 GB / 25.1 GB RAM )
172.20.130.213:4509 ( 14% cpu; 11% machine; 0.093 Gbps; 60% disk IO; 3.1 GB / 25.1 GB RAM )
172.20.130.213:4510 ( 8% cpu; 11% machine; 0.093 Gbps; 61% disk IO; 3.1 GB / 25.1 GB RAM )
172.20.130.213:4511 ( 22% cpu; 11% machine; 0.093 Gbps; 60% disk IO; 3.1 GB / 25.1 GB RAM )
172.20.130.213:4512 ( 9% cpu; 11% machine; 0.093 Gbps; 61% disk IO; 3.1 GB / 25.1 GB RAM )
172.20.130.213:4513 ( 9% cpu; 11% machine; 0.093 Gbps; 61% disk IO; 3.2 GB / 25.1 GB RAM )
172.20.130.213:4514 ( 9% cpu; 11% machine; 0.093 Gbps; 61% disk IO; 3.2 GB / 25.1 GB RAM )
172.20.130.214:4500 ( 16% cpu; 17% machine; 0.067 Gbps; 72% disk IO; 3.2 GB / 25.1 GB RAM )
172.20.130.214:4501 ( 8% cpu; 17% machine; 0.067 Gbps; 73% disk IO; 3.0 GB / 25.1 GB RAM )
172.20.130.214:4502 ( 9% cpu; 17% machine; 0.067 Gbps; 73% disk IO; 3.0 GB / 25.1 GB RAM )
172.20.130.214:4503 ( 8% cpu; 17% machine; 0.067 Gbps; 74% disk IO; 3.0 GB / 25.1 GB RAM )
172.20.130.214:4504 ( 8% cpu; 17% machine; 0.067 Gbps; 73% disk IO; 2.9 GB / 25.1 GB RAM )
172.20.130.214:4505 ( 9% cpu; 17% machine; 0.067 Gbps; 73% disk IO; 3.1 GB / 25.1 GB RAM )
172.20.130.214:4506 ( 16% cpu; 17% machine; 0.067 Gbps; 73% disk IO; 3.1 GB / 25.1 GB RAM )
172.20.130.214:4507 ( 32% cpu; 17% machine; 0.067 Gbps; 73% disk IO; 3.4 GB / 25.1 GB RAM )
172.20.130.214:4508 ( 12% cpu; 17% machine; 0.067 Gbps; 73% disk IO; 3.0 GB / 25.1 GB RAM )
172.20.130.214:4509 ( 9% cpu; 17% machine; 0.067 Gbps; 73% disk IO; 3.1 GB / 25.1 GB RAM )
172.20.130.214:4510 ( 32% cpu; 17% machine; 0.067 Gbps; 74% disk IO; 3.4 GB / 25.1 GB RAM )
172.20.130.214:4511 ( 8% cpu; 17% machine; 0.067 Gbps; 73% disk IO; 3.1 GB / 25.1 GB RAM )
172.20.130.214:4512 ( 22% cpu; 17% machine; 0.067 Gbps; 73% disk IO; 3.1 GB / 25.1 GB RAM )
172.20.130.214:4513 ( 9% cpu; 17% machine; 0.067 Gbps; 73% disk IO; 3.2 GB / 25.1 GB RAM )
172.20.130.214:4514 ( 84% cpu; 17% machine; 0.067 Gbps; 73% disk IO; 3.1 GB / 25.1 GB RAM )
172.20.130.215:4500 ( 8% cpu; 13% machine; 0.067 Gbps; 70% disk IO; 3.0 GB / 25.0 GB RAM )
172.20.130.215:4501 ( 82% cpu; 13% machine; 0.067 Gbps; 69% disk IO; 2.8 GB / 25.0 GB RAM )
172.20.130.215:4502 ( 8% cpu; 13% machine; 0.067 Gbps; 70% disk IO; 3.0 GB / 25.0 GB RAM )
172.20.130.215:4503 ( 9% cpu; 13% machine; 0.067 Gbps; 70% disk IO; 3.0 GB / 25.0 GB RAM )
172.20.130.215:4504 ( 8% cpu; 13% machine; 0.067 Gbps; 70% disk IO; 2.9 GB / 25.0 GB RAM )
172.20.130.215:4505 ( 8% cpu; 13% machine; 0.067 Gbps; 70% disk IO; 2.9 GB / 25.0 GB RAM )
172.20.130.215:4506 ( 8% cpu; 13% machine; 0.067 Gbps; 70% disk IO; 2.9 GB / 25.0 GB RAM )
172.20.130.215:4507 ( 8% cpu; 13% machine; 0.067 Gbps; 69% disk IO; 3.2 GB / 25.0 GB RAM )
172.20.130.215:4508 ( 14% cpu; 13% machine; 0.067 Gbps; 70% disk IO; 2.9 GB / 25.0 GB RAM )
172.20.130.215:4509 ( 8% cpu; 13% machine; 0.067 Gbps; 70% disk IO; 3.0 GB / 25.0 GB RAM )
172.20.130.215:4510 ( 9% cpu; 13% machine; 0.067 Gbps; 70% disk IO; 6.5 GB / 25.0 GB RAM )
172.20.130.215:4511 ( 8% cpu; 13% machine; 0.067 Gbps; 70% disk IO; 3.0 GB / 25.0 GB RAM )
172.20.130.215:4512 ( 9% cpu; 13% machine; 0.067 Gbps; 70% disk IO; 3.0 GB / 25.0 GB RAM )
172.20.130.215:4513 ( 8% cpu; 13% machine; 0.067 Gbps; 69% disk IO; 3.1 GB / 25.0 GB RAM )
172.20.130.215:4514 ( 9% cpu; 13% machine; 0.067 Gbps; 70% disk IO; 3.0 GB / 25.0 GB RAM )
Coordination servers:
172.20.130.213:4500 (reachable)
172.20.130.214:4500 (reachable)
172.20.130.215:4500 (reachable)
ajbeamon
(A.J. Beamon)
March 13, 2019, 9:49pm
4
I was thinking this was resolved when you switched to using the retry loops and not deferring tr.commit. Is that not the case?
Vasilii
(Vasiliy Popkov)
March 14, 2019, 7:50am
5
It wasn't resolved.
We still have crashes.
Vasilii
(Vasiliy Popkov)
March 14, 2019, 7:52am
6
We changed the code to:
aStr, err := fdb.Transact(func(tr fdb.Transaction) (result interface{}, err error) {
    h1 := program.GetHkey1(ttlpart)
    titles, err := directory.CreateOrOpen(tr, []string{`xxx`}, nil)
    if err != nil {
        return
    }
    listHK1 := titles.Sub(`hkey1`)
    listId := titles.Sub(`list`)
    // Range read over every key under hkey1/<h1>.
    rr := tr.GetRange(listHK1.Sub(*h1), fdb.RangeOptions{})
    ri := rr.Iterator()
    ttlCnt := 0
    rezz := []string{}
    for ri.Advance() {
        var kv fdb.KeyValue
        kv, err = ri.Get()
        if err != nil {
            fmt.Printf("Unable to read next value: %v\n", err)
            return
        }
        ttlCnt++
        var unkey tuple.Tuple
        unkey, err = listHK1.Unpack(kv.Key)
        if err != nil {
            log.Println(err)
            return
        }
        // The second tuple element is the id used to look up the record under list/<id>.
        keyId := listId.Pack(tuple.Tuple{unkey[1].(string)})
        fbyte := tr.Get(keyId)
        var data []byte
        data, err = fbyte.Get()
        if err != nil {
            return
        }
        rezz = append(rezz, string(data))
    }
    return rezz, err
})
Vasilii
(Vasiliy Popkov)
March 14, 2019, 7:53am
7
2019/03/14 03:32:22 FdbTest.go:24: All 32810441 correct 32802351 In second 977
Internal Error @ ./flow/ThreadPrimitives.h 78:
addr2line -e libfdb_c.so-debug -p -C -f -i 0x3ac2c9 0xd63c0 0xd6509 0xd5415 0xffff80d44a351d10
terminate called after throwing an instance of 'Error'
SIGABRT: abort
PC=0x7f2bb5b57428 m=13 sigcode=18446744073709551610
goroutine 0 [idle]:
runtime: unknown pc 0x7f2bb5b57428
stack: frame={sp:0x7f2ba97694c8, fp:0x0} stack=[0x7f2ba8f6a168,0x7f2ba9769d68)
00007f2ba97693c8: 0000000000923608 0000000000000000
00007f2ba97693d8: 0000000000000000 0000000600000001
00007f2ba97693e8: 00007f2b00000000 00007f2ba9769270
00007f2ba97693f8: 0000000000000000 00007f2b00000000
00007f2ba9769408: 000000726f727245 00007f2ba9769588
00007f2ba9769418: 00007f2bb6ac8b1f 0000000000000003
00007f2ba9769428: 00007f2ba97695e0 00007f2ba97695e0
00007f2ba9769438: 00007f2ba9769670 0000000000000002
00007f2ba9769448: 800000000000000e 0000000000000000
00007f2ba9769458: 0000000000000000 0000000000000000
00007f2ba9769468: 0000000000000000 0000000000000000
00007f2ba9769478: 0000000000000000 0000000000000000
00007f2ba9769488: 00007f2b84000b70 00007f2b84000b78
00007f2ba9769498: 00007f2bb667d481 00007f2bb5ee7700
00007f2ba97694a8: 00007f2b84011be0 00007f2ba9769740
00007f2ba97694b8: 0000000000000000 00007f2bb67d7a76
00007f2ba97694c8: <00007f2bb5b5902a 0000000000000020
00007f2ba97694d8: 0000000000000000 0000000000000000
00007f2ba97694e8: 0000000000000000 0000000000000000
00007f2ba97694f8: 0000000000000000 0000000000000000
00007f2ba9769508: 0000000000000000 0000000000000000
00007f2ba9769518: 0000000000000000 0000000000000000
00007f2ba9769528: 0000000000000000 0000000000000000
00007f2ba9769538: 0000000000000000 0000000000000000
00007f2ba9769548: 0000000000000000 0000000000000000
00007f2ba9769558: 00007f2bb5c192e9 0000000000000000
00007f2ba9769568: 00007f2bb5b9abff 00007f2bb5ee7540
00007f2ba9769578: 00007f2bb667d481 00007f2bb5ee7700
00007f2ba9769588: 00007f2b84011be0 00007f2ba9769740
00007f2ba9769598: 0000000000000000 00007f2bb67d7a76
00007f2ba97695a8: 00007f2b84012b90 0000000000000002
00007f2ba97695b8: 00007f2b84012b90 00007f2b84011be0
runtime: unknown pc 0x7f2bb5b57428
stack: frame={sp:0x7f2ba97694c8, fp:0x0} stack=[0x7f2ba8f6a168,0x7f2ba9769d68)
00007f2ba97693c8: 0000000000923608 0000000000000000
00007f2ba97693d8: 0000000000000000 0000000600000001
00007f2ba97693e8: 00007f2b00000000 00007f2ba9769270
00007f2ba97693f8: 0000000000000000 00007f2b00000000
00007f2ba9769408: 000000726f727245 00007f2ba9769588
00007f2ba9769418: 00007f2bb6ac8b1f 0000000000000003
00007f2ba9769428: 00007f2ba97695e0 00007f2ba97695e0
00007f2ba9769438: 00007f2ba9769670 0000000000000002
00007f2ba9769448: 800000000000000e 0000000000000000
00007f2ba9769458: 0000000000000000 0000000000000000
00007f2ba9769468: 0000000000000000 0000000000000000
00007f2ba9769478: 0000000000000000 0000000000000000
00007f2ba9769488: 00007f2b84000b70 00007f2b84000b78
00007f2ba9769498: 00007f2bb667d481 00007f2bb5ee7700
00007f2ba97694a8: 00007f2b84011be0 00007f2ba9769740
00007f2ba97694b8: 0000000000000000 00007f2bb67d7a76
00007f2ba97694c8: <00007f2bb5b5902a 0000000000000020
00007f2ba97694d8: 0000000000000000 0000000000000000
00007f2ba97694e8: 0000000000000000 0000000000000000
00007f2ba97694f8: 0000000000000000 0000000000000000
00007f2ba9769508: 0000000000000000 0000000000000000
00007f2ba9769518: 0000000000000000 0000000000000000
00007f2ba9769528: 0000000000000000 0000000000000000
00007f2ba9769538: 0000000000000000 0000000000000000
00007f2ba9769548: 0000000000000000 0000000000000000
00007f2ba9769558: 00007f2bb5c192e9 0000000000000000
00007f2ba9769568: 00007f2bb5b9abff 00007f2bb5ee7540
00007f2ba9769578: 00007f2bb667d481 00007f2bb5ee7700
00007f2ba9769588: 00007f2b84011be0 00007f2ba9769740
00007f2ba9769598: 0000000000000000 00007f2bb67d7a76
00007f2ba97695a8: 00007f2b84012b90 0000000000000002
00007f2ba97695b8: 00007f2b84012b90 00007f2b84011be0
goroutine 4 [syscall]:
runtime.cgocall(0x55a940, 0xc000039ed0, 0x8c00c820)
/opt/go/src/runtime/cgocall.go:128 +0x5b fp=0xc000039ea0 sp=0xc000039e68 pc=0x40631b
github.com/apple/foundationdb/bindings/go/src/fdb._Cfunc_fdb_future_destroy(0x7f2bac008cb0)
_cgo_gotypes.go:219 +0x41 fp=0xc000039ed0 sp=0xc000039ea0 pc=0x51ebd1
github.com/apple/foundationdb/bindings/go/src/fdb.newFuture.func1.1(0xc0002eec08)
/home/pvv/GOPKG/pkg/mod/github.com/apple/foundationdb/bindings/go@v0.0.0-20190311170436-f2d582ffa197/src/fdb/futures.go:81 +0x5e fp=0xc000039f10 sp=0xc000039ed0 pc=0x524c0e
github.com/apple/foundationdb/bindings/go/src/fdb.newFuture.func1(0xc0002eec08)
/home/pvv/GOPKG/pkg/mod/github.com/apple/foundationdb/bindings/go@v0.0.0-20190311170436-f2d582ffa197/src/fdb/futures.go:81 +0x2b fp=0xc000039f28 sp=0xc000039f10 pc=0x524c4b
runtime.call32(0x0, 0x5c06f8, 0xc000532000, 0x1000000010)
/opt/go/src/runtime/asm_amd64.s:519 +0x3b fp=0xc000039f58 sp=0xc000039f28 pc=0x45987b
runtime.runfinq()
/opt/go/src/runtime/mfinal.go:222 +0x1e2 fp=0xc000039fe0 sp=0xc000039f58 pc=0x41a592
runtime.goexit()
/opt/go/src/runtime/asm_amd64.s:1337 +0x1 fp=0xc000039fe8 sp=0xc000039fe0 pc=0x45b591
created by runtime.createfing
/opt/go/src/runtime/mfinal.go:156 +0x61
Vasilii
(Vasiliy Popkov)
March 14, 2019, 8:00am
8
Every day we get a crash after 1-12 hours.
Our testing time is over. The test bench has to go into production use in a different configuration.
josephg
(Seph Gentle)
March 14, 2019, 11:51am
9
Do you mind filing a bug on the foundationdb github issues with all this detail? It’s more likely to be picked up there.
Vasilii
(Vasiliy Popkov)
March 14, 2019, 1:22pm
10
Really? Are you kidding?
ajbeamon commented [2 days ago]
If you don’t mind, could you raise some of these questions in posts on the forums? We usually reserve GitHub issues for specific tasks, bugs, or features rather than discussions about using FoundationDB.
opened 03:11PM - 06 Mar 19 UTC
closed 03:38PM - 12 Mar 19 UTC
We have a repeatable error at 1200 wps
```
Internal Error @ ./flow/ThreadPrimi… tives.h 78:
addr2line -e libfdb_c.so-debug -p -C -f -i 0x3abf89 0xd6250 0xd6399 0xd52a5 0xffff80578481ca40
terminate called after throwing an instance of 'Error'
SIGABRT: abort
PC=0x7fa87b90953f m=4 sigcode=18446744073709551610
goroutine 0 [idle]:
runtime: unknown pc 0x7fa87b90953f
stack: frame={sp:0x7fa878e383d0, fp:0x0} stack=[0x7fa878639188,0x7fa878e38d88)
00007fa878e382d0: fffffffffffffff0 0000000000000001
00007fa878e382e0: 0000000000000000 0000000000000000
00007fa878e382f0: 0000000000000000 0000000000000000
00007fa878e38300: 0000000000000000 0000000000000000
00007fa878e38310: fffffffffffffff8 0000000000000001
00007fa878e38320: 0000000000000000 0000000000000000
00007fa878e38330: ffffffffffffffff 0000ff00000000ff
00007fa878e38340: 6468705f65746172 34646e6172730072
00007fa878e38350: 0000705f65746172 0000000000000000
00007fa878e38360: 6e61727300726468 65735f5f00383464
00007fa878e38370: 7830203939333664 7830203561323564
00007fa878e38380: 7830206531316361 7830203035323664
00007fa878e38390: 0000000000000000 0000000000000000
00007fa878e383a0: 5000687361724300 6554737365636f72
00007fa878e383b0: 0000000000000000 0000000000000000
00007fa878e383c0: 000000c00000e200 00000000006d58e0
00007fa878e383d0: <0000000000000000 0000000000000000
00007fa878e383e0: 0000000000000000 0000000000000000
00007fa878e383f0: 000000c00010c3a0 0000000000000001
00007fa878e38400: 0000000000000001 00000000008ac920
00007fa878e38410: 0000000000000000 0000000000000000
00007fa878e38420: 00007fa800000000 000000726f727245
00007fa878e38430: 00007fa86c00a3f8 00007fa86c00a3f0
00007fa878e38440: 00007fa878e38690 00007fa878e38af0
00007fa878e38450: fffffffe7fffffff ffffffffffffffff
00007fa878e38460: ffffffffffffffff ffffffffffffffff
00007fa878e38470: ffffffffffffffff ffffffffffffffff
00007fa878e38480: ffffffffffffffff ffffffffffffffff
00007fa878e38490: ffffffffffffffff ffffffffffffffff
00007fa878e384a0: ffffffffffffffff ffffffffffffffff
00007fa878e384b0: ffffffffffffffff ffffffffffffffff
00007fa878e384c0: ffffffffffffffff ffffffffffffffff
runtime: unknown pc 0x7fa87b90953f
stack: frame={sp:0x7fa878e383d0, fp:0x0} stack=[0x7fa878639188,0x7fa878e38d88)
00007fa878e382d0: fffffffffffffff0 0000000000000001
00007fa878e382e0: 0000000000000000 0000000000000000
00007fa878e382f0: 0000000000000000 0000000000000000
00007fa878e38300: 0000000000000000 0000000000000000
00007fa878e38310: fffffffffffffff8 0000000000000001
00007fa878e38320: 0000000000000000 0000000000000000
00007fa878e38330: ffffffffffffffff 0000ff00000000ff
00007fa878e38340: 6468705f65746172 34646e6172730072
00007fa878e38350: 0000705f65746172 0000000000000000
00007fa878e38360: 6e61727300726468 65735f5f00383464
00007fa878e38370: 7830203939333664 7830203561323564
00007fa878e38380: 7830206531316361 7830203035323664
00007fa878e38390: 0000000000000000 0000000000000000
00007fa878e383a0: 5000687361724300 6554737365636f72
00007fa878e383b0: 0000000000000000 0000000000000000
00007fa878e383c0: 000000c00000e200 00000000006d58e0
00007fa878e383d0: <0000000000000000 0000000000000000
00007fa878e383e0: 0000000000000000 0000000000000000
00007fa878e383f0: 000000c00010c3a0 0000000000000001
00007fa878e38400: 0000000000000001 00000000008ac920
00007fa878e38410: 0000000000000000 0000000000000000
00007fa878e38420: 00007fa800000000 000000726f727245
00007fa878e38430: 00007fa86c00a3f8 00007fa86c00a3f0
00007fa878e38440: 00007fa878e38690 00007fa878e38af0
00007fa878e38450: fffffffe7fffffff ffffffffffffffff
00007fa878e38460: ffffffffffffffff ffffffffffffffff
00007fa878e38470: ffffffffffffffff ffffffffffffffff
00007fa878e38480: ffffffffffffffff ffffffffffffffff
00007fa878e38490: ffffffffffffffff ffffffffffffffff
00007fa878e384a0: ffffffffffffffff ffffffffffffffff
00007fa878e384b0: ffffffffffffffff ffffffffffffffff
00007fa878e384c0: ffffffffffffffff ffffffffffffffff
goroutine 4 [syscall]:
runtime.cgocall(0x606290, 0xc000037ed0, 0x0)
/usr/lib/golang/src/runtime/cgocall.go:128 +0x5b fp=0xc000037ea0 sp=0xc000037e68 pc=0x4067bb
github.com/apple/foundationdb/bindings/go/src/fdb._Cfunc_fdb_future_destroy(0x7fa86c002ad0)
_cgo_gotypes.go:223 +0x41 fp=0xc000037ed0 sp=0xc000037ea0 pc=0x4d5321
```
ajbeamon
(A.J. Beamon)
March 14, 2019, 3:36pm
11
Hmm ok, my mistake then. The error you’re getting suggests that a future is being destroyed while it’s being used, but I’m not spotting any obvious reason why your code should be causing that to happen. In that case, it may very well be something going on inside the binding itself. The only place from the snippet where I would expect an outstanding future is within the range iterator, so perhaps something is going wrong there. We’ll have to dig in to confirm, though.
As a side note, if you get an error within the Transact function, you should return it. This is how the error gets checked for being retryable, and whether or not it is, the call to commit will be skipped. See the fdb package documentation (github.com/apple/foundationdb/bindings/go/src/fdb on Go Packages).
Also, since it came up earlier, if you want to limit the number of retries that a transaction will do and be able to see the last error that occurred if it exceeds the limit, use the SetRetryLimit transaction option.
Vasilii
(Vasiliy Popkov)
March 14, 2019, 7:03pm
12
Yes, we understand how errors are checked for retryability.
We hit this error every 1-12 hours under a test load of 1200 rps and 1000 wps.
For now we run 2 service instances behind an nginx upstream proxy to serve the test, and everything is OK.
Sorry, we can't continue the tests; these servers are needed for other jobs.
Vasilii
(Vasiliy Popkov)
March 15, 2019, 1:56pm
13
You can find the code here.
Traces are here:
debug_trace.log
debug_trace2.log
Test data is here: test.log
Our test log is about 300 GB in size.
Writer test:
main.go
Reader test:
DataLog_test.go
Vasilii
(Vasiliy Popkov)
March 18, 2019, 7:35am
14
I added the latest error to my repo.
Look at:
write_1000.log
debug_trace_3.log
These are the latest crashes with the latest version of the code.
Do you have any ideas on how to make the application stable?
Can it return an error instead of crashing?
ajbeamon
(A.J. Beamon)
March 18, 2019, 3:09pm
15
I haven’t yet been able to identify what’s causing the error, so I can’t really offer much in the way of advice yet. I put together a little program of my own to try to reproduce, but it was unsuccessful. Let me try running the test you put together to see if it reproduces the problem for me.
rjenkins
(Ray Jenkins)
April 10, 2019, 5:28pm
16
Hi Folks,
We too have been occasionally hitting panics with the go bindings on
fdb._Cfunc_fdb_future_destroy(0x7f2bac008cb0)
My co-worker just submitted a PR that might be of use. So far we haven’t hit panics with this patch. Currently waiting on review. https://github.com/apple/foundationdb/pull/1451
ajbeamon
(A.J. Beamon)
April 10, 2019, 5:49pm
17
Good find, thanks for the PR. @Vasilii if you are able to test this change, does it make things better for you?
Related to this, we are planning to eliminate use of finalizers and require explicit calls to close objects with native resources. See this issue for details:
ddi
May 15, 2019, 6:54am
18
Hi,
I am experiencing the same panics and I am interested in trying out your patch.
Since I am fairly new to go, can you refer me to any resources that explain how to deploy your patch?
I copied your modified files in the binding/go folder of my foundationdb cloned repository, and I can compile with make fdb_go (on Linux, using the docker image).
What are the next steps to instruct my go application to use this new version of the bindings? Apparently, my app is still using the old version (I added some println in the new bindings and I cannot see those prints).
Thank you.
Vasilii
(Vasiliy Popkov)
July 5, 2019, 3:14pm
19
Hi, we installed the new version and ran some tests.
$ fdbcli --exec "status"
Using cluster file `/etc/foundationdb/fdb.cluster'.
Configuration:
Redundancy mode - triple
Storage engine - ssd-2
Coordinators - 5
Cluster:
FoundationDB processes - 300
Machines - 10
Memory availability - 8.5 GB per process on machine with least available
Retransmissions rate - 4 Hz
Fault Tolerance - 2 machines
Server time - 07/05/19 18:11:00
Data:
Replication health - Healthy
Moving data - 0.000 GB
Sum of key-value sizes - 402 MB
Disk space used - 39.190 GB
Operating space:
Storage server - 1695.8 GB free on most full server
Log server - 1695.8 GB free on most full server
Workload:
Read rate - 6569 Hz
Write rate - 1268 Hz
Transactions started - 301 Hz
Transactions committed - 283 Hz
Conflict rate - 14 Hz
Backup and DR:
Running backups - 0
Running DRs - 0
Client time: 07/05/19 18:10:43
The test ran for one week with about 300 GB of test data (we have already cleared the data).
No problems came up.