Monday, April 26, 2021

Rsync between volumes on two different OpenShift clusters

This is a short HOWTO about rsync-ing data between 2 distinct OpenShift clusters.

You always have the option to oc rsync the data from source OpenShift cluster to your local workstation and then oc rsync from your workstation to target cluster. But if you have halt a terabyte of data you may not have enough space or it may take several days because of network bandwidth limitation.

The method I describe below avoids any such inefficiencies as well the rsync process is restarted in case some network or system glitch kills it.

It basically works by having:

  • a kubeconfig file with access to the target OpenShift cluster inside a secret on the source OpenShift cluster
  • a pod on target OpenShift cluster with target volume mounted
  • a pod on source OpenShift cluster with source volume and kubeconfig secret mountedand an entrypoint running oc rsync

 So lets start with generating a proper kubeconfig secret.

$ touch /tmp/kubeconfig
$ chmod 600 /tmp/kubeconfig
$ oc login --config=/tmp/kubeconfig # make sure to use target cluster API endpoint
$ oc project my-target-cluster-namespace --config=/tmp/kubeconfig 
Note that command below will run against source OpenShift cluster.
$ oc login # use source cluster API endpoint
$ oc create secret generic kubeconfig --from-file=config=/tmp/kubeconfig

I will assume that you have your target pod already running inside target cluster. Otherwise you can create one similar to the pod in source cluster below, just use some entrypoint command to keep it permanently running. For example /bin/sleep 1000000000000000000.

Now all we need to do is run a proper pod in source cluster to do the rsync task. Here is an example pod YAML with comments to make clear what to use in your situation:

apiVersion: v1
kind: Pod
metadata:
  name: rsync-pod
  namespace: my-namespace-on-source-cluster
spec:
  containers:
    # use client version ±1 of target OpenShift cluster version
    - image: quay.io/openshift/origin-cli:4.6
      name: rsync
      command:
      - "oc"
      args:
      - "--namespace=my-target-cluster-namespace"
      - "--kubeconfig=/run/secrets/kube/config"
      # insecure TLS is not recommended but is a quick hack to get you going
      - "--insecure-skip-tls-verify=true"
      - "rsync"
      - "--compress=true"
      - "--progress=true"
      - "--strategy=rsync"
      - "/path/to/data/dir/"
      - "target-pod-name:/path/to/data/dir/"
      volumeMounts:
        - mountPath: /path/to/data/dir
          name: source-data-volume
        - mountPath: /run/secrets/kube
          name: kubeconfig
          readOnly: true
  # restart policy will keep restarting your pod until rsync completes successfully
  restartPolicy: OnFailure
  terminationGracePeriodSeconds: 30
  volumes:
    - name: source-data-volume
      persistentVolumeClaim:
        claimName: source-persistant-volume-claim-name
    - name: kubeconfig
      secret:
        defaultMode: 420
        secretName: kubeconfig
And last needed command is to run this pod inside the source cluster:
$ oc create -f rsync-pod.yaml
Now check what state is your pod in:
$ oc describe pod rsync-pod
If it start properly, then monitor your progress:
$ oc logs -f rsync-pod

No comments:

Post a Comment