Backup to S3 with Signature Version 4 works in AWS but gets rejected in GCP with SignatureDoesNotMatch

We want to deploy a FoundationDB cluster on Google Cloud, but we noticed fdbbackup's issue with backing up to GCS over the S3-compatible protocol and decided to investigate. All our testing has been performed with Signature Version 4 enabled.

We use fdbbackup with the option knob_http_request_aws_v4_header=true set. It works for AWS S3 but not for GCP Storage. Below are the responses from AWS and GCP respectively:

AWS:

root@486fa579b364:/tmp# ./backup_agent -C /var/fdb/fdb.cluster --knob_http_verbose_level=4 --knob_http_request_aws_v4_header=true
[cb680bf8591113d81de1b1ef05d371e3] HTTP starting PUT /ci-cd-code-deploy/data/fdb-service/kvranges/snapshot.000000533810482490/0/range,533810528742,bc9ed0fd90c96313a555204c0aeae1a6,1048576 ContentLen:47
Request Header: Accept: application/xml
Request Header: Authorization: AWS4-HMAC-SHA256 Credential=<access_key_id>/20220414/us-east-2/s3/aws4_request, SignedHeaders=content-md5;host;x-amz-content-sha256;x-amz-date, Signature=b3718d27efc55802aabbdf576734eaefe3c5d81e78ec6535c2201e9b41c101c6
Request Header: Content-Length: 47
Request Header: Content-MD5: v286vjzzSoN6yVcznbCqDQ==
Request Header: Host: s3.us-east-2.amazonaws.com
Request Header: x-amz-content-sha256: UNSIGNED-PAYLOAD
Request Header: x-amz-date: 20220414T034206Z
[cb680bf8591113d81de1b1ef05d371e3] HTTP code=200 early=0, time=0.336246s PUT /ci-cd-code-deploy/data/fdb-service/kvranges/snapshot.000000533810482490/0/range,533810528742,bc9ed0fd90c96313a555204c0aeae1a6,1048576 contentLen=47 [626 out, response content len 0]
[cb680bf8591113d81de1b1ef05d371e3] HTTP RESPONSE:  PUT /ci-cd-code-deploy/data/fdb-service/kvranges/snapshot.000000533810482490/0/range,533810528742,bc9ed0fd90c96313a555204c0aeae1a6,1048576
Response Code: 200
Response ContentLen: 0
Reponse Header: Content-Length: 0
Reponse Header: Date: Thu, 14 Apr 2022 03:42:07 GMT
Reponse Header: ETag: "bf6f3abe3cf34a837ac957339db0aa0d"
Reponse Header: Server: AmazonS3
Reponse Header: x-amz-id-2: GTbmvgCORy1FxDk9JiO9P/ClOafXi1yTlEqRyFpGJ4EYVazusbQHbajOzgfCpIPSFtI6ZHAMgdE=
Reponse Header: x-amz-request-id: 797DZNPQ6Z5AQ3W2
-- RESPONSE CONTENT--

--------

[cb680bf8591113d81de1b1ef05d371e3] HTTP starting PUT /ci-cd-code-deploy/data/fdb-service/snapshots/snapshot,533810528742,533810528742,47 ContentLen:338
Request Header: Accept: application/xml
Request Header: Authorization: AWS4-HMAC-SHA256 Credential=<access_key_id>/20220414/us-east-2/s3/aws4_request, SignedHeaders=content-md5;host;x-amz-content-sha256;x-amz-date, Signature=d84baabbfaa2db988cb669477d698b9d5433135f9e1d972360cf728e4124960b
Request Header: Content-Length: 338
Request Header: Content-MD5: Rr91S5wRHm0FPf1xW7N+6g==
Request Header: Host: s3.us-east-2.amazonaws.com
Request Header: x-amz-content-sha256: UNSIGNED-PAYLOAD
Request Header: x-amz-date: 20220414T034206Z
[cb680bf8591113d81de1b1ef05d371e3] HTTP code=200 early=0, time=0.310763s PUT /ci-cd-code-deploy/data/fdb-service/snapshots/snapshot,533810528742,533810528742,47 contentLen=338 [867 out, response content len 0]
[cb680bf8591113d81de1b1ef05d371e3] HTTP RESPONSE:  PUT /ci-cd-code-deploy/data/fdb-service/snapshots/snapshot,533810528742,533810528742,47
Response Code: 200
Response ContentLen: 0
Reponse Header: Content-Length: 0
Reponse Header: Date: Thu, 14 Apr 2022 03:42:07 GMT
Reponse Header: ETag: "46bf754b9c111e6d053dfd715bb37eea"
Reponse Header: Server: AmazonS3
Reponse Header: x-amz-id-2: zabU13fM4P58xUg8110OubKffP2GiUviXWTNgdsKaReQzlM08od6zh3d+6NrhSwBQFQKj2Vlh9o=
Reponse Header: x-amz-request-id: 7972BJ07KG914DNY
-- RESPONSE CONTENT--

--------

[5621d922b32d841e5d94f76be727dc3c] HTTP starting PUT /ci-cd-code-deploy/data/fdb-service/logs/0000/0005/log,533810482490,533830482490,cf235f6e28e9578bb12e8acce0a44d45,1048576 ContentLen:0
Request Header: Accept: application/xml
Request Header: Authorization: AWS4-HMAC-SHA256 Credential=<access_key_id>/20220414/us-east-2/s3/aws4_request, SignedHeaders=content-md5;host;x-amz-content-sha256;x-amz-date, Signature=0155973d12b2f2d9e190c4e4a916a8f8c67824e67c54ab8eb18c8fc3a2002873
Request Header: Content-Length: 0
Request Header: Content-MD5: 1B2M2Y8AsgTpgAmY7PhCfg==
Request Header: Host: s3.us-east-2.amazonaws.com
Request Header: x-amz-content-sha256: UNSIGNED-PAYLOAD
Request Header: x-amz-date: 20220414T034243Z
[5621d922b32d841e5d94f76be727dc3c] HTTP code=200 early=0, time=0.314697s PUT /ci-cd-code-deploy/data/fdb-service/logs/0000/0005/log,533810482490,533830482490,cf235f6e28e9578bb12e8acce0a44d45,1048576 contentLen=0 [565 out, response content len 0]
[5621d922b32d841e5d94f76be727dc3c] HTTP RESPONSE:  PUT /ci-cd-code-deploy/data/fdb-service/logs/0000/0005/log,533810482490,533830482490,cf235f6e28e9578bb12e8acce0a44d45,1048576
Response Code: 200
Response ContentLen: 0
Reponse Header: Content-Length: 0
Reponse Header: Date: Thu, 14 Apr 2022 03:42:44 GMT
Reponse Header: ETag: "d41d8cd98f00b204e9800998ecf8427e"
Reponse Header: Server: AmazonS3
Reponse Header: x-amz-id-2: S4iuq609dYqv4uKM0DWVc4fcXholSBGojtFkO381RJkvtfVigJCeqXgNCQUD4VxBThg1rSL4Ymg=
Reponse Header: x-amz-request-id: MT0FTXST0ZFHQ6K4
-- RESPONSE CONTENT--

--------

GCP:

root@486fa579b364:/tmp# ./backup_agent -C /var/fdb/fdb.cluster --knob_http_verbose_level=4 --knob_http_request_aws_v4_header=true
[4c81593e7235562a0ea405ad6ac3ddbc] HTTP starting PUT /fdb-backup/data/fdb/kvranges/snapshot.000000534297541501/0/range,534297584836,0a2302e1c16dad06c9a0d4809cbd8814,1048576 ContentLen:47
Request Header: Accept: application/xml
Request Header: Authorization: AWS4-HMAC-SHA256 Credential=<access_key_id>/20220414/googleapis/storage/aws4_request, SignedHeaders=content-md5;host;x-amz-content-sha256;x-amz-date, Signature=91dde1438bbf69f25994558b5d1fee70855138c05b78c909297e3ad2e9c29b47
Request Header: Content-Length: 47
Request Header: Content-MD5: v286vjzzSoN6yVcznbCqDQ==
Request Header: Host: storage.googleapis.com
Request Header: x-amz-content-sha256: UNSIGNED-PAYLOAD
Request Header: x-amz-date: 20220414T035013Z
[4c81593e7235562a0ea405ad6ac3ddbc] HTTP code=403 early=0, time=0.088937s PUT /fdb-backup/data/fdb/kvranges/snapshot.000000534297541501/0/range,534297584836,0a2302e1c16dad06c9a0d4809cbd8814,1048576 contentLen=47 [654 out, response content len 760]
[4c81593e7235562a0ea405ad6ac3ddbc] HTTP RESPONSE:  PUT /fdb-backup/data/fdb/kvranges/snapshot.000000534297541501/0/range,534297584836,0a2302e1c16dad06c9a0d4809cbd8814,1048576
Response Code: 403
Response ContentLen: 760
Reponse Header: Content-Length: 760
Reponse Header: Content-Type: application/xml; charset=UTF-8
Reponse Header: Date: Thu, 14 Apr 2022 03:50:13 GMT
Reponse Header: Server: UploadServer
Reponse Header: X-GUploader-UploadID: ADPycdtHUmG3YpVi2jJxYR3diEKsWpq9ClFpSPm4isGO4-eDEh26CMo3F9HONTuBenx3wf1dgNIZ2oZnCWt8B5nybC3Mfw
-- RESPONSE CONTENT--
<?xml version='1.0' encoding='UTF-8'?><Error><Code>SignatureDoesNotMatch</Code><Message>The request signature we calculated does not match the signature you provided. Check your Google secret key and signing method.</Message><StringToSign>AWS4-HMAC-SHA256
20220414T035013Z
20220414/googleapis/storage/aws4_request
d412db61c6a44ed649acc4b68ce2f5f4628321bb535b4d56efbb73cd032e3a2d</StringToSign><CanonicalRequest>PUT
/fdb-backup/data/fdb/kvranges/snapshot.000000534297541501/0/range,534297584836,0a2302e1c16dad06c9a0d4809cbd8814,1048576

content-md5:v286vjzzSoN6yVcznbCqDQ==
host:storage.googleapis.com
x-amz-content-sha256:UNSIGNED-PAYLOAD
x-amz-date:20220414T035013Z

content-md5;host;x-amz-content-sha256;x-amz-date
UNSIGNED-PAYLOAD</CanonicalRequest></Error>
--------

When fdbbackup signs a request, it seems to first build a canonical request in which the resource path is
percent-encoded. When fdbbackup sends the request, however, the resource path is not percent-encoded. AWS and GCP
seem to handle Signature Version 4 differently here: AWS converts the HTTP resource path to a percent-encoded one
before validating the signature, whereas GCP takes the HTTP resource path as is (not percent-encoded, as the CanonicalRequest in the error response above shows), which causes the signature mismatch.
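
To make the mismatch concrete, here is a minimal sketch of how a canonical request with the layout shown in GCP's error response can be assembled. The function name canonicalRequest and its parameters are hypothetical and are not taken from the FoundationDB source. It assumes an empty query string and header names that are already lowercase.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical helper, not FoundationDB code: builds a SigV4 canonical
// request for a request without query parameters. `headers` must use
// lowercase names; std::map iterates them in sorted order, which is the
// order the canonical request needs.
std::string canonicalRequest(const std::string& method,
                             const std::string& path,
                             const std::map<std::string, std::string>& headers,
                             const std::string& payloadHash) {
    assert(!headers.empty()); // at least the host header is expected to be signed
    std::string canonicalHeaders;
    std::string signedHeaders;
    for (const auto& h : headers) {
        canonicalHeaders += h.first + ":" + h.second + "\n";
        if (!signedHeaders.empty()) {
            signedHeaders += ";";
        }
        signedHeaders += h.first;
    }
    // method, path, (empty) query string, headers, blank line,
    // signed header list, payload hash
    return method + "\n" + path + "\n\n" + canonicalHeaders + "\n" +
           signedHeaders + "\n" + payloadHash;
}
```

Calling this with the raw path (containing commas) and again with the percent-encoded path (containing %2C) gives two different strings, so their hashes and the resulting signatures differ. That is consistent with the 403 above if the client signs one form and the server reconstructs the other.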

The document "Signature Version 4: Calculating a Signature" (https://docs.aws.amazon.com/AmazonS3/latest/API/sigv4-query-string-auth.html)
specifies that a resource path should be encoded before calculating the signature, which makes the behavior
of fdbbackup correct. GCP does not seem to have implemented this part correctly since, in GCP, the resource path in the URL is required to be encoded according to this document:

https://cloud.google.com/storage/docs/request-endpoints#encoding

So GCP expects an encoded URL. Assuming that AWS accepts both a non-encoded resource path (we know this
already) and an encoded one, I added a line to the doRequest() method in HTTP.actor.cpp (https://github.com/apple/foundationdb/blob/release-6.3/fdbclient/HTTP.actor.cpp#L356) which percent-encodes the resource path:

    // line 356: percent-encode the resource path before the request is sent
    resource = HTTP::awsV4URIEncode(resource, false);

With this line, it works in AWS and GCP.
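
For readers unfamiliar with the encoding step, the following is a rough, self-contained sketch of SigV4-style URI encoding. The name sigV4UriEncode is hypothetical and this is not the FoundationDB implementation of awsV4URIEncode. It only assumes the rules from the AWS documentation: unreserved characters are kept, '/' is kept unless asked otherwise, and everything else becomes %XX with uppercase hex digits.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for a SigV4 URI encoder. Unreserved characters
// (A-Z a-z 0-9 - _ . ~) pass through unchanged; '/' passes through unless
// encodeSlash is true; every other byte is written as %XX (uppercase hex).
std::string sigV4UriEncode(const std::string& s, bool encodeSlash) {
    static const char hex[] = "0123456789ABCDEF";
    std::string out;
    out.reserve(s.size() * 3);
    for (unsigned char c : s) {
        const bool unreserved =
            (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') ||
            (c >= '0' && c <= '9') || c == '-' || c == '_' ||
            c == '.' || c == '~';
        if (unreserved || (c == '/' && !encodeSlash)) {
            out.push_back(static_cast<char>(c));
        } else {
            out.push_back('%');
            out.push_back(hex[c >> 4]);
            out.push_back(hex[c & 0x0F]);
        }
    }
    return out;
}

// Small check: commas become %2C, slashes stay as they are.
void checkEncodeExample() {
    assert(sigV4UriEncode("/snapshots/snapshot,47", false) ==
           "/snapshots/snapshot%2C47");
}
```

With encodeSlash set to false, commas become %2C while the slashes between path segments survive, which is the same shape as the request paths in the successful GCP log in this post.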

Here is the response from GCP with this change:

[7d0dcd1888650dd25a704c510ad7a07d] HTTP starting PUT /fdb-backup/data/fdb/kvranges/snapshot.000000544983018581/0/range%2C544983051027%2Ca09cad710edd522526be3dcfa7d39f0f%2C1048576 ContentLen:47
Request Header: Accept: application/xml
Request Header: Authorization: AWS4-HMAC-SHA256 Credential=<access_key_id>/20220414/googleapis/storage/aws4_request, SignedHeaders=content-md5;host;x-amz-content-sha256;x-amz-date, Signature=834a95661d0d3aa20441c2102ee9e1b78fa8e49327e3863d8ceffe7ea882de28
Request Header: Content-Length: 47
Request Header: Content-MD5: v286vjzzSoN6yVcznbCqDQ==
Request Header: Host: storage.googleapis.com
Request Header: x-amz-content-sha256: UNSIGNED-PAYLOAD
Request Header: x-amz-date: 20220414T064818Z
[7d0dcd1888650dd25a704c510ad7a07d] HTTP code=200 early=0, time=0.131040s PUT /fdb-backup/data/fdb/kvranges/snapshot.000000544983018581/0/range%2C544983051027%2Ca09cad710edd522526be3dcfa7d39f0f%2C1048576 contentLen=47 [660 out, response content len 0]
[7d0dcd1888650dd25a704c510ad7a07d] HTTP RESPONSE:  PUT /fdb-backup/data/fdb/kvranges/snapshot.000000544983018581/0/range%2C544983051027%2Ca09cad710edd522526be3dcfa7d39f0f%2C1048576
Response Code: 200
Response ContentLen: 0
Reponse Header: Content-Length: 0
Reponse Header: Content-Type: text/html; charset=UTF-8
Reponse Header: Date: Thu, 14 Apr 2022 06:48:18 GMT
Reponse Header: ETag: "bf6f3abe3cf34a837ac957339db0aa0d"
Reponse Header: Server: UploadServer
Reponse Header: Vary: Origin
Reponse Header: x-goog-generation: 1649918898842380
Reponse Header: x-goog-hash: md5=v286vjzzSoN6yVcznbCqDQ==
Reponse Header: x-goog-metageneration: 1
Reponse Header: x-goog-stored-content-encoding: identity
Reponse Header: x-goog-stored-content-length: 47
Reponse Header: X-GUploader-UploadID: ADPycduGfm9eg_zj8eKs3sfg52LArbXzlbCH6vNqH0iH9DhMId0I9F3MBSIhLcJ5QtIsZw9GfH1e0I7SjykxqMg3CtmZfA
-- RESPONSE CONTENT--

--------
[7d0dcd1888650dd25a704c510ad7a07d] HTTP starting PUT /fdb-backup/data/fdb/snapshots/snapshot%2C544983051027%2C544983051027%2C47 ContentLen:338
Request Header: Accept: application/xml
Request Header: Authorization: AWS4-HMAC-SHA256 Credential=<access_key_id>/20220414/googleapis/storage/aws4_request, SignedHeaders=content-md5;host;x-amz-content-sha256;x-amz-date, Signature=827be71337c275385d3638df2c9987f5da35643e6ab055614e8586a2778382eb
Request Header: Content-Length: 338
Request Header: Content-MD5: JuFDrvkLy6pr2ko+MW49Cw==
Request Header: Host: storage.googleapis.com
Request Header: x-amz-content-sha256: UNSIGNED-PAYLOAD
Request Header: x-amz-date: 20220414T064818Z
[7d0dcd1888650dd25a704c510ad7a07d] HTTP code=200 early=0, time=0.130198s PUT /fdb-backup/data/fdb/snapshots/snapshot%2C544983051027%2C544983051027%2C47 contentLen=338 [901 out, response content len 0]
[7d0dcd1888650dd25a704c510ad7a07d] HTTP RESPONSE:  PUT /fdb-backup/data/fdb/snapshots/snapshot%2C544983051027%2C544983051027%2C47
Response Code: 200
Response ContentLen: 0
Reponse Header: Content-Length: 0
Reponse Header: Content-Type: text/html; charset=UTF-8
Reponse Header: Date: Thu, 14 Apr 2022 06:48:18 GMT
Reponse Header: ETag: "26e143aef90bcbaa6bda4a3e316e3d0b"
Reponse Header: Server: UploadServer
Reponse Header: Vary: Origin
Reponse Header: x-goog-generation: 1649918898989197
Reponse Header: x-goog-hash: md5=JuFDrvkLy6pr2ko+MW49Cw==
Reponse Header: x-goog-metageneration: 1
Reponse Header: x-goog-stored-content-encoding: identity
Reponse Header: x-goog-stored-content-length: 338
Reponse Header: X-GUploader-UploadID: ADPycdvwBT7_tNPLIRugFypW_4mZ9jho9qKI5HlSXSB4XPzyRv3z1nt3SFdshH7uLJU3O1EvH4ihGiWg-qHBhYdOk9VrNQ
-- RESPONSE CONTENT--

--------
[7d0dcd1888650dd25a704c510ad7a07d] HTTP starting PUT /fdb-backup/data/fdb/logs/0000/0005/log%2C544983018581%2C545003018581%2Cae569c838a7cbaf666f55ba8a1566b15%2C1048576 ContentLen:0
Request Header: Accept: application/xml
Request Header: Authorization: AWS4-HMAC-SHA256 Credential=<access_key_id>/20220414/googleapis/storage/aws4_request, SignedHeaders=content-md5;host;x-amz-content-sha256;x-amz-date, Signature=8c40d50e382bd963607ee9a7585f957f9844a3237984a99873618790ef23ba4e
Request Header: Content-Length: 0
Request Header: Content-MD5: 1B2M2Y8AsgTpgAmY7PhCfg==
Request Header: Host: storage.googleapis.com
Request Header: x-amz-content-sha256: UNSIGNED-PAYLOAD
Request Header: x-amz-date: 20220414T064853Z
[7d0dcd1888650dd25a704c510ad7a07d] HTTP code=200 early=0, time=0.109839s PUT /fdb-backup/data/fdb/logs/0000/0005/log%2C544983018581%2C545003018581%2Cae569c838a7cbaf666f55ba8a1566b15%2C1048576 contentLen=0 [601 out, response content len 0]
[7d0dcd1888650dd25a704c510ad7a07d] HTTP RESPONSE:  PUT /fdb-backup/data/fdb/logs/0000/0005/log%2C544983018581%2C545003018581%2Cae569c838a7cbaf666f55ba8a1566b15%2C1048576
Response Code: 200
Response ContentLen: 0
Reponse Header: Content-Length: 0
Reponse Header: Content-Type: text/html; charset=UTF-8
Reponse Header: Date: Thu, 14 Apr 2022 06:48:53 GMT
Reponse Header: ETag: "d41d8cd98f00b204e9800998ecf8427e"
Reponse Header: Server: UploadServer
Reponse Header: Vary: Origin
Reponse Header: x-goog-generation: 1649918933696708
Reponse Header: x-goog-hash: md5=1B2M2Y8AsgTpgAmY7PhCfg==
Reponse Header: x-goog-metageneration: 1
Reponse Header: x-goog-stored-content-encoding: identity
Reponse Header: x-goog-stored-content-length: 0
Reponse Header: X-GUploader-UploadID: ADPycdsuAGYnL2s1b0TlYV35JeFquezqVfXnl2l1ste3TssCOCa73oOh1INkZq4NKT0pst35xFwokt7kRAUvsKzatrCMTA
-- RESPONSE CONTENT--

--------

Would it be possible to introduce this change in the code to make the fdbbackup S3 backup process work in both
AWS and GCP?

There are currently a few issues with how the signature is created and how some parameters are interpreted, and there is an outstanding PR that fixes some of these issues: Fix compatibility issue of s3 backup by wangzw (Pull Request #6354, apple/foundationdb). I'm not aware of any timeline for fixing all of those issues, and I would say that the FDB community is happy to see contributions from the community.