How do I use `fdbbackup` with Google Cloud Storage?

It seems that the backup system is hard-coded to use S3 buckets right now. To be more specific, HTTP requests are sent with an Authorization header of the form `Authorization: AWS access:secret`.

Google Cloud Storage is technically compatible with S3-style requests, but it seems to require an HMAC signature that is currently not being sent.

Anyway, has anybody else had any success with this? Any recommendations? And is this a feature that would be worth a future pull request?


This is not correct; can you explain what is making it seem that way?

The Authorization header value sent is the access key plus an hmac_sha1 signature calculated from several other parts of the request as required by Amazon’s S3 auth scheme.
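For reference, here is a minimal sketch of that v2 signing step in Python (illustrative values only, not taken from the FDB source):

```python
import base64, hmac, hashlib

# StringToSign for S3 v2 auth (no Content-MD5, Content-Type, or amz headers in this example):
#   HTTP-Verb \n Content-MD5 \n Content-Type \n Date \n CanonicalizedResource
def sign_v2(secret_key, verb, date, canonical_resource):
    string_to_sign = "\n".join([verb, "", "", date, canonical_resource])
    digest = hmac.new(secret_key.encode(), string_to_sign.encode(), hashlib.sha1).digest()
    return base64.b64encode(digest).decode()

# "access"/"secret" stand in for real credentials.
signature = sign_v2("secret", "HEAD", "20190620T231316Z", "/backups")
print("Authorization: AWS access:" + signature)
```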

I have tested our client against Amazon S3 and Minio and both work.

I do not know for certain that anyone has used it with Google Cloud Storage, so you may be the first. If there is an incompatibility I’m sure it can be remedied easily. Can you paste the error messages you are seeing (sanitized of course)? Also, if you add --knob_http_verbose_level=3 to the command line of fdbbackup commands or the backup agent you will see a lot of HTTP/HTTPS detail printed to standard output including the full responses. GCS might be providing response content that gives more error detail.

Ah! --knob_http_verbose_level is so helpful! Thanks.

Here is the output of fdbbackup:

[c408a9b68635a049d0769fb6ea83aa9c] HTTP starting HEAD /backups ContentLen:0
Request Header: Accept: application/xml
Request Header: Authorization: AWS access:secret
Request Header: Content-Length: 0
Request Header: Date: 20190620T231316Z
Request Header: Host: storage.googleapis.com
[c408a9b68635a049d0769fb6ea83aa9c] HTTP code=400 early=0, time=0.007038s HEAD /backups contentLen=0 [204 out, response content len 179]
[c408a9b68635a049d0769fb6ea83aa9c] HTTP RESPONSE:  HEAD /backups
Response Code: 400
Response ContentLen: 179
Reponse Header: Cache-Control: private, max-age=0
Reponse Header: Content-Length: 179
Reponse Header: Content-Type: application/xml; charset=UTF-8
Reponse Header: Date: Thu, 20 Jun 2019 23:13:16 GMT
Reponse Header: Expires: Thu, 20 Jun 2019 23:13:16 GMT
Reponse Header: Server: UploadServer
Reponse Header: X-GUploader-UploadID: xxxxxxxxxxxxxxxxxxxxxx
-- RESPONSE CONTENT--

When I try a similar GET request with httpie:

GET /backups/test_route HTTP/1.1
Accept: application/xml
Accept-Encoding: gzip, deflate
Authorization: AWS access/secret
Connection: keep-alive
Content-Length: 0
Date: 20190620T232744Z
Host: storage.googleapis.com
User-Agent: HTTPie/0.9.8



HTTP/1.1 400 Bad Request
Cache-Control: private, max-age=0
Content-Length: 179
Content-Type: application/xml; charset=UTF-8
Date: Thu, 20 Jun 2019 23:43:03 GMT
Expires: Thu, 20 Jun 2019 23:43:03 GMT
Server: UploadServer
X-GUploader-UploadID: xxxxxxxxxx

<?xml version='1.0' encoding='UTF-8'?><Error><Code>MalformedSecurityHeader</Code><Message>Your request has a malformed header.</Message><ParameterName>Date</ParameterName></Error>

So then I changed the date format until it stopped complaining:

GET /backups/test_route HTTP/1.1
Accept: application/xml
Accept-Encoding: gzip, deflate
Authorization: AWS access/secret
Connection: keep-alive
Content-Length: 0
Date: Thu, 20 Jun 2019 23:35:59 +0000
Host: storage.googleapis.com
User-Agent: HTTPie/0.9.8



HTTP/1.1 400 Bad Request
Cache-Control: private, max-age=0
Content-Length: 179
Content-Type: application/xml; charset=UTF-8
Date: Thu, 20 Jun 2019 23:43:03 GMT
Expires: Thu, 20 Jun 2019 23:43:03 GMT
Server: UploadServer
X-GUploader-UploadID: xxxxxxxxxx

<?xml version='1.0' encoding='UTF-8'?><Error><Code>SignatureDoesNotMatch</Code><Message>The request signature we calculated does not match the signature you provided. Check your Google secret key and signing method.</Message><StringToSign>GET


Thu, 20 Jun 2019 23:35:59 +0000
/test_route</StringToSign></Error>

But the date is part of the string that gets signed, so changing only the header format breaks the signature.

That’s unfortunate.

Can you build FDB locally from source but change "%Y%m%dT%H%M%SZ" to "%a, %d %b %Y %H:%M:%S GMT" at this line?

That will produce the date format that you found to work and use it in the signature. If that works everywhere then I’ll make a PR for it. If not, we’ll have to find something that does.
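For anyone trying that change, here is a quick illustration (in Python rather than FDB's own source) of what the two format strings produce:

```python
from datetime import datetime, timezone

now = datetime.now(timezone.utc)

# Format the blobstore client currently puts in the Date header, per the post above:
print(now.strftime("%Y%m%dT%H%M%SZ"))             # e.g. 20190620T233559Z

# RFC 1123-style format that GCS accepted in the test above:
print(now.strftime("%a, %d %b %Y %H:%M:%S GMT"))  # e.g. Thu, 20 Jun 2019 23:35:59 GMT
```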


Did you find any solution? I'm facing the same issue.

I haven’t tested this, but if no one gets a patch in to fix this date formatting issue soon, you could try using MinIO in pass-through mode to GCS.

https://docs.min.io/docs/minio-gateway-for-gcs.html

To be clear, fdbbackup’s S3 client is using exactly the date format required by Amazon’s v4 signature scheme.

You can include the date as part of your request in several ways. You can use a date header, an x-amz-date header or include x-amz-date as a query parameter.

The time stamp must be in UTC and in the following ISO 8601 format: YYYYMMDD’T’HHMMSS’Z’. For example, 20150830T123600Z is a valid time stamp.

It seems strange that other services would use an S3-like interface but change the date format used in the signature, but maybe that is the case.

If someone can confirm that the patch I posted above works against GCS then we can add a blobstore:// URL parameter for using the alternate date format.

Old topic, but I can confirm that I got fdbbackup to work with minio in gcs gateway mode, with decent performance

Kubernetes manifest:

apiVersion: v1
kind: Service
metadata:
  name: minio-service
spec:
  ports:
    - port: 9000
      targetPort: 9000
  selector:
    app: foundationdb-minio-gcs-gateway
---
apiVersion: apps/v1
kind: Deployment
metadata:
  # This name uniquely identifies the Deployment
  name: foundationdb-minio-gcs-gateway
spec:
  selector:
    matchLabels:
      app: foundationdb-minio-gcs-gateway # has to match .spec.template.metadata.labels
  template:
    metadata:
      labels:
        # Label is used as selector in the service.
        app: foundationdb-minio-gcs-gateway
    spec:
      # Refer to the secret created earlier
      volumes:
        - name: google-cloud-key
          secret:
            secretName: fdb-backups
      containers:
        - name: minio
          # Pulls the default Minio image from Docker Hub
          image: minio/minio:RELEASE.2020-03-09T18-26-53Z
          args:
            - gateway
            - gcs
            - <GOOGLE_PROJECT_NAME>
          env:
            # MinIO access key and secret key
            - name: MINIO_ACCESS_KEY
              value: "minio"
            - name: MINIO_SECRET_KEY
              valueFrom:
                secretKeyRef:
                  name: minio-fdb-gateway-secret
                  key: secret
            # Google Cloud Service uses this variable
            # Should have: resourcemanager.projects.get, storage.buckets.get,
            # Also permissions for bucket to use.
            # If you want to browse files with minio, also storage.buckets.list
            - name: GOOGLE_APPLICATION_CREDENTIALS
              value: "/etc/credentials/key.json"
          ports:
            - containerPort: 9000
          volumeMounts:
            - name: google-cloud-key
              mountPath: "/etc/credentials"
              readOnly: true

With the service up and running, I run

fdbbackup start -d "blobstore://minio@minio-service:9000/fdb_backups/$(date +%FT%H-%M-%S)?sc=0&bucket=<BUCKET_NAME>" 

from a foundationdb pod in the same kubernetes cluster

You’d have to create google credentials with the proper permissions and add them in a kubernetes secret (I used a secret called fdb-backups with a field called key.json, and another secret called minio-fdb-gateway-secret with the password used to access minio). That password is needed for the fdbbackup command.

It would be great if you could do the bit of testing mentioned in Steve’s post above, so that FDB could just natively support GCS instead of using minio to work around it.

OK so I think there are a few things here:

  1. The SigV4 docs were referenced above for the date format, but the client uses SigV2 auth; AWS probably handles that one way, and it appears GCS handles it another.
    To get around this, the x-amz-date header could probably be set instead of the Date header; see some examples here (and the sketch after this list):
    https://docs.aws.amazon.com/AmazonS3/latest/dev/RESTAuthentication.html#RESTAuthenticationExamples
    https://cloud.google.com/storage/docs/migrating uses the same format in the x-amz-date header, so it looks like it can be handled.

  2. SigV2 will stop being supported on buckets created after June 24, 2020, so SigV4 should probably be implemented; it has been supported since 2012.
    aws.amazon.com/blogs/aws/amazon-s3-update-sigv2-deprecation-period-extended-modified/
    docs.aws.amazon.com/AmazonS3/latest/API/sig-v4-authenticating-requests.html
    cloud.google.com/storage/docs/migrating#authentication

  3. There is probably (obviously?) a difference in how GCS and S3 handle the date header. I imagine that since GCS was going for compatibility, this should be supported, though given the upcoming deprecation, I’m not sure they will want to make the effort to change it. I’ll file a bug on Monday to see if the change can be made.
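On point 1, here is a sketch of what setting x-amz-date instead of Date might look like under the v2 scheme (assuming GCS accepts the ISO-style timestamp in x-amz-date as its migration doc suggests; the header name and values here are illustrative and not tested against GCS):

```python
import base64, hmac, hashlib
from datetime import datetime, timezone

# Per Amazon's v2 REST auth docs: when x-amz-date is sent, the Date slot in the
# StringToSign is left empty and x-amz-date is signed as a canonicalized amz header.
amz_date = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")  # format assumed per the GCS migration doc
string_to_sign = "\n".join([
    "HEAD",                    # HTTP verb
    "",                        # Content-MD5
    "",                        # Content-Type
    "",                        # Date (empty because x-amz-date is used instead)
    "x-amz-date:" + amz_date,  # canonicalized amz headers
    "/backups",                # canonicalized resource (illustrative)
])
signature = base64.b64encode(
    hmac.new(b"secret", string_to_sign.encode(), hashlib.sha1).digest()
).decode()
headers = {
    "x-amz-date": amz_date,
    "Authorization": "AWS access:" + signature,
}
print(headers)
```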

Sorry about the broken links… new users have a limit of 2 per post.
EDIT: Used admin powers to linkify your links. Sorry about forum restrictions.

Thank you so much for this explanation!

I had no idea we were using an old authentication method. The first Google result for “S3 authentication” is still https://docs.aws.amazon.com/AmazonS3/latest/dev/RESTAuthentication.html and I completely missed the note that the page covers v2 while v4 is the latest.

Clearly we should update, so I created an issue for it:

With V4 authentication, fdbbackup still does not work with Google Cloud Storage (GCS). The problem comes from a mismatch between the signature computed by fdbbackup and the one computed on the GCS side. This is described in the GitHub issue along with a possible fix.
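For anyone debugging that mismatch: in v4 the signature is a hex HMAC-SHA256 over the string-to-sign, using a key derived from the secret as sketched below (a generic illustration of AWS's documented scheme, not fdbbackup's code). Comparing the string-to-sign and canonical request produced by each side is usually where the difference shows up.

```python
import hmac, hashlib

def _hmac(key, msg):
    return hmac.new(key, msg.encode(), hashlib.sha256).digest()

def sigv4_signature(secret_key, date_stamp, region, service, string_to_sign):
    # Signing key derivation chain per AWS Signature Version 4.
    k_date = _hmac(("AWS4" + secret_key).encode(), date_stamp)  # e.g. "20190620"
    k_region = _hmac(k_date, region)                            # e.g. "us-east-1"
    k_service = _hmac(k_region, service)                        # e.g. "s3"
    k_signing = _hmac(k_service, "aws4_request")
    return hmac.new(k_signing, string_to_sign.encode(), hashlib.sha256).hexdigest()
```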