Using a CronJob to Automatically Refresh the AWS ECR Token in a K8S Cluster

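AWS ECR provides private image registries. You have to authenticate with ECR before you can do anything against a registry. With plain docker, authentication looks like this: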

aws ecr get-login-password --region cn-northwest-1 | docker login --username AWS --password-stdin 123456.dkr.ecr.cn-northwest-1.amazonaws.com.cn

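After that, docker pull and docker push work.
With K8S we do not drive docker directly; kubelet manages the containers. For authentication we can use a K8S secret, which can store docker registry credentials.
However, an ECR credential is only valid for 12 hours, so to keep image pulls working over the long run the credential has to be re-fetched periodically. A cronjob fits this task, which means we need an image with both the aws cli and kubectl installed.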
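
To make "a secret that stores docker registry credentials" more concrete, here is a minimal sketch of the JSON payload that a docker-registry secret holds under its .dockerconfigjson key. The helper name build_dockerconfigjson and the dummy password are illustrative assumptions, and the sketch assumes the password contains no characters that need JSON escaping (an ECR token is a long base64 string, so this normally holds).

```shell
#!/bin/sh
# Sketch: build the JSON payload that a docker-registry secret stores
# under the ".dockerconfigjson" key. All names here are illustrative.

# build_dockerconfigjson SERVER USERNAME PASSWORD
build_dockerconfigjson() {
  server=$1; username=$2; password=$3
  # "auth" is base64("username:password"); tr strips the line wraps that
  # some base64 implementations add for long input.
  auth=$(printf '%s:%s' "$username" "$password" | base64 | tr -d '\n')
  printf '{"auths":{"%s":{"username":"%s","password":"%s","auth":"%s"}}}\n' \
    "$server" "$username" "$password" "$auth"
}

# Example with a dummy password (a real ECR token is much longer):
build_dockerconfigjson 123456789.dkr.ecr.cn-northwest-1.amazonaws.com.cn AWS dummy-token
```

In practice kubectl create secret docker-registry generates this payload for you; the sketch only shows what ends up inside the secret and why it must be regenerated once the token expires.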

Building an image with the aws cli and kubectl

The Dockerfile is as follows:

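FROM alpine:latest
# Certificate and private key used by IAM Roles Anywhere (copied to /)
COPY server.crt server.key /
# Install the aws cli, then download kubectl and aws_signing_helper.
# gcompat adds a glibc compatibility layer on Alpine (musl) for downloaded binaries.
RUN apk --no-cache add aws-cli wget gcompat \
    && wget "https://dl.k8s.io/release/v1.23.9/bin/linux/amd64/kubectl" \
    && wget "https://rolesanywhere.amazonaws.com/releases/1.3.0/X86_64/Linux/aws_signing_helper" \
    && mv kubectl /usr/local/bin/kubectl \
    && chmod +x /usr/local/bin/kubectl aws_signing_helper \
    && apk del wget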

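Note that I built this image overseas, because I pull the alpine image straight from Docker Hub and apk add is also faster there.
Fetching an ECR credential requires AWS credentials such as an AK/SK pair, but I do not want to keep an AK/SK in the YAML. I chose IAM Roles Anywhere, which issues temporary credentials based on a certificate. That is why the image also contains aws_signing_helper, which Roles Anywhere requires, along with the certificate file and its private key.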
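
The AWS CLI can call aws_signing_helper through the credential_process setting in its config file. Below is a minimal sketch of writing such a config with a heredoc. The helper name write_aws_config, the demo output path, and the EXAMPLE ARN IDs are assumptions for illustration. The absolute paths /aws_signing_helper, /server.crt and /server.key assume the files sit in / as in the Dockerfile above.

```shell
#!/bin/sh
# Sketch: write an AWS CLI config whose default profile obtains
# credentials through aws_signing_helper (IAM Roles Anywhere).
# write_aws_config is a hypothetical helper; the ARNs are placeholders.

# write_aws_config FILE REGION TRUST_ANCHOR_ARN PROFILE_ARN ROLE_ARN
write_aws_config() {
  mkdir -p "$(dirname "$1")"
  cat > "$1" <<EOF
[default]
region = $2
credential_process = /aws_signing_helper credential-process --certificate /server.crt --private-key /server.key --trust-anchor-arn $3 --profile-arn $4 --role-arn $5
EOF
}

# Write to a demo path instead of ~/.aws/config so nothing is overwritten.
demo_config="${TMPDIR:-/tmp}/aws-config-demo"
write_aws_config "$demo_config" cn-northwest-1 \
  "arn:aws-cn:rolesanywhere:cn-northwest-1:123456789:trust-anchor/EXAMPLE" \
  "arn:aws-cn:rolesanywhere:cn-northwest-1:123456789:profile/EXAMPLE" \
  "arn:aws-cn:iam::123456789:role/test"
cat "$demo_config"
```

To have the AWS CLI read a config from a non-default location, point the AWS_CONFIG_FILE environment variable at it.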

Creating the cronjob

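First, the steps involved:

  1. The secret has to be created in a specific namespace, so get the name of that namespace.
  2. Once the container starts, use Roles Anywhere to obtain the credentials needed to run the AWS CLI.
  3. Run the aws ecr command to get the ECR credential.
  4. Run kubectl to create the secret.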
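
Steps 3 and 4 above can be sketched as a single shell function. This is a hedged variation, not the script used in the YAML further down. The function name refresh_ecr_secret is hypothetical, and it swaps in `--dry-run=client -o yaml | kubectl apply -f -` so that an existing secret is updated in place rather than deleted and recreated. It assumes aws and kubectl are on PATH, AWS credentials are already available (step 2), and the serviceaccount may get, create and patch secrets in the target namespace.

```shell
#!/bin/sh
# Sketch of steps 3 and 4 as one function. refresh_ecr_secret is a
# hypothetical name; it expects aws and kubectl on PATH and AWS
# credentials already configured (step 2).

# refresh_ecr_secret REGION REGISTRY SECRET_NAME NAMESPACE
refresh_ecr_secret() {
  region=$1; registry=$2; secret=$3; ns=$4

  # Step 3: fetch the ECR password; stop if the call fails or returns nothing.
  token=$(aws ecr get-login-password --region "$region") || return 1
  [ -n "$token" ] || { echo "empty ECR token" >&2; return 1; }

  # Step 4: render the secret locally and apply it, so an existing
  # secret is updated in place instead of being deleted first.
  kubectl create secret docker-registry "$secret" \
    --docker-server="$registry" \
    --docker-username=AWS \
    --docker-password="$token" \
    -n "$ns" --dry-run=client -o yaml | kubectl apply -f -
}

# Example call (commented out because it needs a cluster and AWS access):
# refresh_ecr_secret cn-northwest-1 123456789.dkr.ecr.cn-northwest-1.amazonaws.com.cn ecr ingress-nginx
```

Checking the token before touching the secret means a failed AWS call leaves the previous secret untouched instead of replacing it with an empty password.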

Here I create the secret for the ingress-nginx namespace so that the ingress nginx controller can run, because I put the controller-related images in ECR. The YAML is as follows:

apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: ingress-nginx
  name: ecr-auth
rules:
- apiGroups: [""]
  resources: ["secrets","serviceaccounts"]
  verbs: ["delete","create","get","patch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: bind-default-secret-deleter
  namespace: ingress-nginx
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: ecr-auth
subjects:
- kind: ServiceAccount
  name: default
  namespace: default
---
apiVersion: batch/v1
kind: Job
metadata:
  name: job-ecr-auth
  namespace: default
spec:
  template:
    metadata:
      labels:
        app: job-ecr-auth
    spec:
      imagePullSecrets:
      - name: ecr
      containers:
      - command:
        - /bin/sh
        - -c
        - |-
          REGION=cn-northwest-1
          ECRENDPOINT=123456789.dkr.ecr.cn-northwest-1.amazonaws.com.cn
          SECRET_NAME=ecr
          TARGET_NS=ingress-nginx
          mkdir ~/.aws && touch ~/.aws/config && echo -e "[default]\n region = cn-northwest-1 \n credential_process = ./aws_signing_helper credential-process --certificate server.crt --private-key server.key --trust-anchor-arn arn:aws-cn:rolesanywhere:cn-northwest-1:123456789:trust-anchor/6a6966c2-912d-44fb-a468-d6faf8437334 --profile-arn arn:aws-cn:rolesanywhere:cn-northwest-1:123456789:profile/b174ad67-e035-4f86-ac34-c68bb313e785 --role-arn arn:aws-cn:iam::123456789:role/test" > ~/.aws/config
          TOKEN=`aws ecr get-login-password --region ${REGION}`
          echo "ENV variables setup done."
          /usr/local/bin/kubectl delete secret ${SECRET_NAME} -n ${TARGET_NS} --ignore-not-found
          /usr/local/bin/kubectl create secret docker-registry ${SECRET_NAME} \
          --docker-server=${ECRENDPOINT} \
          --docker-username=AWS \
          --docker-password="${TOKEN}" -n ${TARGET_NS}
          echo "Secret created by name. $SECRET_NAME"
          #/usr/local/bin/kubectl -n ${TARGET_NS} patch serviceaccount ingress-nginx,ingress-nginx-admission -p '{"imagePullSecrets": [{"name": "'$SECRET_NAME'"}]}'
          echo "All done."
        env:
        - name: POD_NAMESPACE
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
        image: 123456789.dkr.ecr.cn-northwest-1.amazonaws.com.cn/test:ecrauth
        imagePullPolicy: IfNotPresent
        name: ecr-auth
        resources: {}
        securityContext:
          capabilities: {}
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
      nodeSelector:
        node-role.kubernetes.io/master: ""
      tolerations:
      - key: "node-role.kubernetes.io/master"
        value:
      dnsPolicy: Default
      hostNetwork: true
      restartPolicy: Never
      schedulerName: default-scheduler
      securityContext: {}
      terminationGracePeriodSeconds: 30
  backoffLimit: 4
---
apiVersion: batch/v1
kind: CronJob
metadata:
  annotations:
  name: ecr-auth
  namespace: default
spec:
  concurrencyPolicy: Allow
  schedule: "0 */11 * * *"
  failedJobsHistoryLimit: 1
  successfulJobsHistoryLimit: 3
  suspend: false
  jobTemplate:
    metadata:
      creationTimestamp: null
    spec:
      template:
        metadata:
          creationTimestamp: null
        spec:
          imagePullSecrets:
          - name: ecr
          containers:
          - command:
            - /bin/sh
            - -c
            - |-
              REGION=cn-northwest-1
              ECRENDPOINT=123456789.dkr.ecr.cn-northwest-1.amazonaws.com.cn
              SECRET_NAME=ecr
              TARGET_NS=ingress-nginx
              mkdir ~/.aws && touch ~/.aws/config && echo -e "[default]\n region = cn-northwest-1 \n credential_process = ./aws_signing_helper credential-process --certificate server.crt --private-key server.key --trust-anchor-arn arn:aws-cn:rolesanywhere:cn-northwest-1:123456789:trust-anchor/6a6966c2-912d-44fb-a468-d6faf8437334 --profile-arn arn:aws-cn:rolesanywhere:cn-northwest-1:123456789:profile/b174ad67-e035-4f86-ac34-c68bb313e785 --role-arn arn:aws-cn:iam::123456789:role/test" > ~/.aws/config
              TOKEN=`aws ecr get-login-password --region ${REGION}`
              echo "ENV variables setup done."
              /usr/local/bin/kubectl delete secret ${SECRET_NAME} -n ${TARGET_NS} --ignore-not-found
              /usr/local/bin/kubectl create secret docker-registry ${SECRET_NAME} \
              --docker-server=${ECRENDPOINT} \
              --docker-username=AWS \
              --docker-password="${TOKEN}" -n ${TARGET_NS}
              echo "Secret created by name. $SECRET_NAME"
              #/usr/local/bin/kubectl -n ${TARGET_NS} patch serviceaccount ingress-nginx,ingress-nginx-admission -p '{"imagePullSecrets": [{"name": "'$SECRET_NAME'"}]}'
              echo "All done."
            env:
            - name: POD_NAMESPACE
              valueFrom:
                fieldRef:
                  fieldPath: metadata.namespace
            image: 123456789.dkr.ecr.cn-northwest-1.amazonaws.com.cn/test:ecrauth
            imagePullPolicy: IfNotPresent
            name: ecr-auth
            resources: {}
            securityContext:
              capabilities: {}
            terminationMessagePath: /dev/termination-log
            terminationMessagePolicy: File
          nodeSelector:
            node-role.kubernetes.io/master: ""
          tolerations:
          - key: "node-role.kubernetes.io/master"
            value:
          dnsPolicy: Default
          hostNetwork: true
          restartPolicy: Never
          schedulerName: default-scheduler
          securityContext: {}
          terminationGracePeriodSeconds: 30

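A few notes on the YAML:

  1. My environment is a bit unusual: the image built above is also stored in ECR, so pulling that image needs a secret as well, which has to exist beforehand.
  2. Because of #1, I put the cronjob in the default namespace and let it create the secret for other namespaces from there, which keeps the target namespace clean.
  3. #2 in turn means the default serviceaccount cannot operate on resources in other namespaces out of the box, so a Role has to be created and bound to that serviceaccount.
  4. A cronjob only runs on its defined schedule. To see the result right away, I added a one-off job so that the secret is created at the start.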
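
For note 3, one way to confirm that the Role and RoleBinding took effect is kubectl auth can-i with impersonation. The sketch below is only an illustration: sa_user and check_sa_access are hypothetical helper names, and the caller needs permission to impersonate serviceaccounts (cluster admins normally have it).

```shell
#!/bin/sh
# Sketch: check whether a ServiceAccount may perform an action in another
# namespace by impersonating it. sa_user and check_sa_access are
# hypothetical helpers.

# sa_user NAMESPACE NAME -> the user name Kubernetes assigns to a ServiceAccount
sa_user() {
  printf 'system:serviceaccount:%s:%s\n' "$1" "$2"
}

# check_sa_access SA_NAMESPACE SA_NAME VERB RESOURCE TARGET_NAMESPACE
check_sa_access() {
  kubectl auth can-i "$3" "$4" -n "$5" --as="$(sa_user "$1" "$2")"
}

# Example call (commented out because it needs a cluster):
# check_sa_access default default create secrets ingress-nginx
```

kubectl auth can-i answers yes or no, so running it for each verb the job uses shows quickly whether the binding covers it.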

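So this environment is somewhat messy, but it conveys the idea and shows how several kinds of resources are used. If a similar need really comes up later, it can be adjusted.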
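
One detail worth checking is what the schedule "0 */11 * * *" actually means. In cron syntax a step such as */11 in the hour field counts from the start of the 0-23 range, so it does not mean "every 11 hours since the last run". The small sketch below lists the matching hours; hours_for_step is a hypothetical helper written only for this illustration.

```shell
#!/bin/sh
# Sketch: list the hours matched by a cron hour field of the form "*/STEP".
# hours_for_step is a hypothetical helper.

hours_for_step() {
  step=$1
  h=0
  out=""
  while [ "$h" -le 23 ]; do
    out="$out${out:+ }$h"   # append the hour, space-separated
    h=$((h + step))
  done
  echo "$out"
}

hours_for_step 11   # prints: 0 11 22
```

The job therefore fires at 00:00, 11:00 and 22:00. The gaps are 11, 11 and 2 hours, all within the 12-hour lifetime of the ECR credential.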

Results

In the ingress-nginx YAML I had already configured the pods to pull images with the secret.

imagePullSecrets:
- name: ecr

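But the ecr secret has not been created yet, so the error below appears; it is the same error you get when pulling an image with no credentials at all.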

[root@master ec2-user]# kubectl -n ingress-nginx get pod
NAME                                        READY   STATUS             RESTARTS   AGE
ingress-nginx-admission-create-2hkj6        0/1     ImagePullBackOff   0          27s
ingress-nginx-admission-patch-c6d87         0/1     ImagePullBackOff   0          27s
ingress-nginx-controller-5fdf68b6bc-nnlph   0/1     ImagePullBackOff   0          27s
*****
  Warning  Failed     6s (x2 over 19s)   kubelet            Failed to pull image "123456789.dkr.ecr.cn-northwest-1.amazonaws.com.cn/test:controller": rpc error: code = Unknown desc = Error response from daemon: Head "https://123456789.dkr.ecr.cn-northwest-1.amazonaws.com.cn/v2/test/manifests/controller": no basic auth credentials

Now deploy the credential-fetching YAML. The job has completed, the pod log shows the secret was created, and listing the secrets shows one named ecr.

[root@master ec2-user]# kubectl -n ingress-nginx get cj
No resources found in ingress-nginx namespace.
[root@master ec2-user]# kubectl get cronjob
NAME       SCHEDULE       SUSPEND   ACTIVE   LAST SCHEDULE   AGE
ecr-auth   0 */11 * * *   False     0        <none>          13s
[root@master ec2-user]# 
[root@master ec2-user]# kubectl get job
NAME           COMPLETIONS   DURATION   AGE
job-ecr-auth   1/1           5s         15s
[root@master ec2-user]# kubectl get pod
NAME                    READY   STATUS      RESTARTS   AGE
job-ecr-auth-kzz69      0/1     Completed   0          19s
[root@master ec2-user]# kubectl logs job-ecr-auth-kzz69
ENV variables setup done.
secret/ecr created
Secret created by name. ecr
All done.
[root@master ec2-user]# kubectl -n ingress-nginx get secrets 
NAME                      TYPE                                  DATA   AGE
default-token-8t7sg       kubernetes.io/service-account-token   3      44h
ecr                       kubernetes.io/dockerconfigjson        1      18m

Deploy ingress-nginx again afterwards, and this time every image is pulled successfully.

[root@master ec2-user]# kubectl -n ingress-nginx get pod
NAME                                        READY   STATUS      RESTARTS   AGE
ingress-nginx-admission-create-75p46        0/1     Completed   0          36s
ingress-nginx-admission-patch-6hgvp         0/1     Completed   0          36s
ingress-nginx-controller-5fdf68b6bc-k9lwg   1/1     Running     0          36s

