In this article, I will show how the Ansible flavor of the operator-sdk can be used to create a rabbitmq operator. The main goal is to find out whether an ops engineer can use it, or whether it is a developer-oriented tool.
Target audience: k8s users and ansible users
By Jacques Roussel, Cloud Consultant @Objectif Libre
Definition
In kubernetes, an operator is a piece of code (a controller) that watches specific resources through the k8s api. These resources are defined by a Custom Resource Definition (CRD), which adds new resource types to the k8s api.
In this article, we will add a Custom Resource called rabbitmq. This CRD will allow us to create a rabbitmq cluster, which will then be managed by our operator.
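To make the definition more concrete, here is a sketch of what a CRD manifest for this resource could look like. The group rabbitmq.olibre.io, the version v1alpha1 and the kind Rabbitmq come from the sdk command used later in this article. The plural name, the scope and the apiextensions.k8s.io/v1beta1 api version are assumptions based on what the sdk typically generated at the time, so the real file in deploy/crds/ may differ.

```yaml
# Sketch of a CRD for the Rabbitmq kind (field values partly assumed)
apiVersion: apiextensions.k8s.io/v1beta1   # assumption: pre-v1 CRD api
kind: CustomResourceDefinition
metadata:
  name: rabbitmqs.rabbitmq.olibre.io       # assumption: <plural>.<group>
spec:
  group: rabbitmq.olibre.io
  names:
    kind: Rabbitmq
    listKind: RabbitmqList
    plural: rabbitmqs                      # assumption
    singular: rabbitmq
  scope: Namespaced                        # matches a namespace-scoped operator
  versions:
  - name: v1alpha1
    served: true
    storage: true
```

Once such a CRD is registered, the k8s api accepts objects of kind Rabbitmq, and the operator can watch them.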
Configuration
Prepare your environment
To test our operator, we need a k8s cluster. We use minikube for its simplicity.
$ sudo apt update && sudo apt install virtualbox -y
$ curl -Lo minikube https://storage.googleapis.com/minikube/releases/latest/minikube-linux-amd64 && chmod +x minikube
$ sudo mv minikube /usr/local/bin/
We need to install the operator-sdk too:
$ sudo add-apt-repository ppa:longsleep/golang-backports
$ sudo apt update && sudo apt install golang-go -y
$ mkdir ~/go
$ export GOPATH="$HOME/go"
$ mkdir -p $GOPATH/src/github.com/operator-framework
$ cd $GOPATH/src/github.com/operator-framework
$ git clone https://github.com/operator-framework/operator-sdk
$ cd operator-sdk
$ git checkout master
$ make install
$ export PATH=$PATH:$(go env GOPATH)/bin
Now we check that everything is ok before we take the next step:
$ operator-sdk --version
operator-sdk version v0.7.0+git
$ minikube version
minikube version: v1.0.0
Generate the file tree
We create a namespace-scoped operator, which means the operator must be deployed in the namespace where we want to use it. Let’s start by generating the project folder with all its files:
$ operator-sdk new rabbitmq-operator --api-version=rabbitmq.olibre.io/v1alpha1 --kind=Rabbitmq --type=ansible
INFO[0000] Creating new Ansible operator 'rabbitmq-operator'.
INFO[0000] Created deploy/service_account.yaml
INFO[0000] Created deploy/role.yaml
INFO[0000] Created deploy/role_binding.yaml
INFO[0000] Created deploy/crds/rabbitmq_v1alpha1_rabbitmq_crd.yaml
INFO[0000] Created deploy/crds/rabbitmq_v1alpha1_rabbitmq_cr.yaml
INFO[0000] Created build/Dockerfile
INFO[0000] Created roles/rabbitmq/README.md
INFO[0000] Created roles/rabbitmq/meta/main.yml
INFO[0000] Created roles/rabbitmq/files/.placeholder
INFO[0000] Created roles/rabbitmq/templates/.placeholder
INFO[0000] Created roles/rabbitmq/vars/main.yml
INFO[0000] Created molecule/test-local/playbook.yml
INFO[0000] Created roles/rabbitmq/defaults/main.yml
INFO[0000] Created roles/rabbitmq/tasks/main.yml
INFO[0000] Created molecule/default/molecule.yml
INFO[0000] Created build/test-framework/Dockerfile
INFO[0000] Created molecule/test-cluster/molecule.yml
INFO[0000] Created molecule/default/prepare.yml
INFO[0000] Created molecule/default/playbook.yml
INFO[0000] Created build/test-framework/ansible-test.sh
INFO[0000] Created molecule/default/asserts.yml
INFO[0000] Created molecule/test-cluster/playbook.yml
INFO[0000] Created roles/rabbitmq/handlers/main.yml
INFO[0000] Created watches.yaml
INFO[0000] Created deploy/operator.yaml
INFO[0000] Created .travis.yml
INFO[0000] Created molecule/test-local/molecule.yml
INFO[0000] Created molecule/test-local/prepare.yml
INFO[0000] Run git init ...
Initialized empty Git repository in /home/jroussel/Documents/Clients/Objectif-libre/blog/article-operator/rabbitmq-operator/.git/
INFO[0000] Run git init done
INFO[0000] Project creation complete.
$ cd rabbitmq-operator/
As you can see, the sdk creates a lot of files. We will modify only a few of them. The main one is roles/rabbitmq/tasks/main.yml: this is the file where you list the k8s objects that you want your operator to deploy.
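The link between the custom resource and the Ansible role is declared in the generated watches.yaml file. Below is a sketch of what that file probably contains. The group, version and kind come from the command above, while the role path inside the operator image is an assumption based on the sdk defaults and is not shown in this article.

```yaml
---
# watches.yaml: tells the operator which role to run for which resource kind
- version: v1alpha1
  group: rabbitmq.olibre.io
  kind: Rabbitmq
  # assumption: default location where the sdk image copies the roles
  role: /opt/ansible/roles/rabbitmq
```

With this mapping, every change on a Rabbitmq object triggers a run of the rabbitmq role, and therefore of roles/rabbitmq/tasks/main.yml.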
Customize your operator
The operator that we will create is based on this statefulset, with a few modifications:
---
# tasks file for rabbitmq
- name: create rabbitmq
  k8s:
    definition:
      apiVersion: v1
      kind: ServiceAccount
      metadata:
        name: '{{ meta.name }}-rabbitmq'
        namespace: '{{ meta.namespace }}'

- name: Add rbac
  k8s:
    definition:
      kind: Role
      apiVersion: rbac.authorization.k8s.io/v1beta1
      metadata:
        name: endpoint-reader
        namespace: '{{ meta.namespace }}'
      rules:
      - apiGroups: [""]
        resources: ["endpoints"]
        verbs: ["get"]

- name: Add rolebinding
  k8s:
    definition:
      kind: RoleBinding
      apiVersion: rbac.authorization.k8s.io/v1beta1
      metadata:
        name: endpoint-reader
        namespace: '{{ meta.namespace }}'
      subjects:
      - kind: ServiceAccount
        name: '{{ meta.name }}-rabbitmq'
      roleRef:
        apiGroup: rbac.authorization.k8s.io
        kind: Role
        name: endpoint-reader

- name: Add service
  k8s:
    definition:
      kind: Service
      apiVersion: v1
      metadata:
        namespace: '{{ meta.namespace }}'
        name: rabbitmq
        labels:
          app: rabbitmq
          type: LoadBalancer
      spec:
        type: ClusterIP
        ports:
        - name: http
          protocol: TCP
          port: 15672
          targetPort: 15672
        - name: amqp
          protocol: TCP
          port: 5672
          targetPort: 5672
        selector:
          app: rabbitmq

- name: Add configmap
  k8s:
    definition:
      apiVersion: v1
      kind: ConfigMap
      metadata:
        name: rabbitmq-config
        namespace: '{{ meta.namespace }}'
      data:
        enabled_plugins: |
          [rabbitmq_management,rabbitmq_peer_discovery_k8s].
        rabbitmq.conf: |
          ## Cluster formation. See http://www.rabbitmq.com/cluster-formation.html to learn more.
          cluster_formation.peer_discovery_backend = rabbit_peer_discovery_k8s
          cluster_formation.k8s.host = kubernetes.default.svc.cluster.local
          ## Should RabbitMQ node name be computed from the pod's hostname or IP address?
          ## IP addresses are not stable, so using [stable] hostnames is recommended when possible.
          ## Set to "hostname" to use pod hostnames.
          ## When this value is changed, so should the variable used to set the RABBITMQ_NODENAME
          ## environment variable.
          cluster_formation.k8s.address_type = ip
          ## How often should node cleanup checks run?
          cluster_formation.node_cleanup.interval = 30
          ## Set to false if automatic removal of unknown/absent nodes
          ## is desired. This can be dangerous, see
          ## * http://www.rabbitmq.com/cluster-formation.html#node-health-checks-and-cleanup
          ## * https://groups.google.com/forum/#!msg/rabbitmq-users/wuOfzEywHXo/k8z_HWIkBgAJ
          cluster_formation.node_cleanup.only_log_warning = true
          cluster_partition_handling = autoheal
          ## See http://www.rabbitmq.com/ha.html#master-migration-data-locality
          queue_master_locator=min-masters
          ## See http://www.rabbitmq.com/access-control.html#loopback-users
          loopback_users.guest = false

- name: Add sts
  k8s:
    definition:
      apiVersion: apps/v1beta1
      kind: StatefulSet
      metadata:
        name: rabbitmq
        namespace: '{{ meta.namespace }}'
      spec:
        serviceName: rabbitmq
        replicas: "{{size}}"
        template:
          metadata:
            labels:
              app: rabbitmq
          spec:
            serviceAccountName: '{{ meta.name }}-rabbitmq'
            terminationGracePeriodSeconds: 10
            containers:
            - name: rabbitmq-k8s
              image: rabbitmq:3.7
              volumeMounts:
              - name: config-volume
                mountPath: /etc/rabbitmq
              ports:
              - name: http
                protocol: TCP
                containerPort: 15672
              - name: amqp
                protocol: TCP
                containerPort: 5672
              livenessProbe:
                exec:
                  command: ["rabbitmqctl", "status"]
                initialDelaySeconds: 60
                # See https://www.rabbitmq.com/monitoring.html for monitoring frequency recommendations.
                periodSeconds: 60
                timeoutSeconds: 15
              readinessProbe:
                exec:
                  command: ["rabbitmqctl", "status"]
                initialDelaySeconds: 20
                periodSeconds: 60
                timeoutSeconds: 10
              imagePullPolicy: Always
              env:
              - name: MY_POD_IP
                valueFrom:
                  fieldRef:
                    fieldPath: status.podIP
              - name: RABBITMQ_USE_LONGNAME
                value: "true"
              # See a note on cluster_formation.k8s.address_type in the config file section
              - name: RABBITMQ_NODENAME
                value: "rabbit@$(MY_POD_IP)"
              - name: K8S_SERVICE_NAME
                value: "rabbitmq"
              - name: RABBITMQ_ERLANG_COOKIE
                value: "mycookie"
            volumes:
            - name: config-volume
              configMap:
                name: rabbitmq-config
                items:
                - key: rabbitmq.conf
                  path: rabbitmq.conf
                - key: enabled_plugins
                  path: enabled_plugins
If you are used to manipulating k8s objects, there is nothing special here. We use some variables like {{ meta.namespace }} or {{ size }}. These variables reference parameters that you define when you create an instance of the CRD: meta.* comes from the metadata section and size comes from the spec section.
Then we need to modify deploy/role.yaml because our operator needs more permissions on the k8s api than the initial example:
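To illustrate where these variables come from, here is a sketch of a Rabbitmq instance. The name example-rabbitmq is the one used later in this article. The size of 3 is an assumption, chosen to match the three nodes that appear in the cluster status output further down.

```yaml
# Sketch of a custom resource instance (size value assumed)
apiVersion: rabbitmq.olibre.io/v1alpha1
kind: Rabbitmq
metadata:
  name: example-rabbitmq   # available in the role as {{ meta.name }}
  # the namespace the object is created in becomes {{ meta.namespace }}
spec:
  size: 3                  # available in the role as {{ size }}
```

Created in the test namespace, this object would give a service account named example-rabbitmq-rabbitmq and a statefulset with 3 replicas.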
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  creationTimestamp: null
  name: rabbitmq-operator
rules:
- apiGroups:
  - ""
  resources:
  - pods
  - services
  - serviceaccounts
  - endpoints
  - persistentvolumeclaims
  - events
  - configmaps
  - secrets
  verbs:
  - '*'
- apiGroups:
  - ""
  resources:
  - namespaces
  verbs:
  - get
- apiGroups:
  - apps
  resources:
  - deployments
  - daemonsets
  - replicasets
  - statefulsets
  verbs:
  - '*'
- apiGroups:
  - monitoring.coreos.com
  resources:
  - servicemonitors
  verbs:
  - get
  - create
- apiGroups:
  - apps
  resourceNames:
  - rabbitmq-operator
  resources:
  - deployments/finalizers
  verbs:
  - update
- apiGroups:
  - rabbitmq.olibre.io
  resources:
  - '*'
  verbs:
  - '*'
- apiGroups:
  - rbac.authorization.k8s.io
  resources:
  - roles
  - rolebindings
  verbs:
  - '*'
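For reference, this Role only takes effect through the binding in deploy/role_binding.yaml, which we deploy unchanged. Here is a sketch of what that generated file likely contains, assuming the sdk names the Role, the RoleBinding and the ServiceAccount after the operator (the service account name rabbitmq-operator is the one referenced in deploy/operator.yaml).

```yaml
# Sketch of deploy/role_binding.yaml (content assumed from sdk defaults)
kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: rabbitmq-operator
subjects:
- kind: ServiceAccount
  name: rabbitmq-operator    # service account used by the operator pod
roleRef:
  kind: Role                 # a Role, not a ClusterRole: namespace-scoped
  name: rabbitmq-operator
  apiGroup: rbac.authorization.k8s.io
```

Because the binding targets a Role rather than a ClusterRole, the permissions apply only in the namespace where the operator is deployed.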
That’s it. Now we have to build the docker image that will embed our operator.
Build the docker image
In order to deploy our operator in the cluster, we have to build and publish a docker image that contains our operator logic.
$ operator-sdk build jarou/rabbitmq-operator:v0.0.3 # change the tag to match your own registry, or use this one
$ docker push jarou/rabbitmq-operator:v0.0.3
Configure the docker image in the manifest
Modify deploy/operator.yaml to set the image to jarou/rabbitmq-operator:v0.0.3 and imagePullPolicy to Always:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: rabbitmq-operator
spec:
  replicas: 1
  selector:
    matchLabels:
      name: rabbitmq-operator
  template:
    metadata:
      labels:
        name: rabbitmq-operator
    spec:
      serviceAccountName: rabbitmq-operator
      containers:
        - name: ansible
          command:
          - /usr/local/bin/ao-logs
          - /tmp/ansible-operator/runner
          - stdout
          # Replace this with the built image name
          image: "jarou/rabbitmq-operator:v0.0.3"
          imagePullPolicy: "Always"
          volumeMounts:
          - mountPath: /tmp/ansible-operator/runner
            name: runner
            readOnly: true
        - name: operator
          # Replace this with the built image name
          image: "jarou/rabbitmq-operator:v0.0.3"
          imagePullPolicy: "Always"
          volumeMounts:
          - mountPath: /tmp/ansible-operator/runner
            name: runner
          env:
            - name: WATCH_NAMESPACE
              valueFrom:
                fieldRef:
                  fieldPath: metadata.namespace
            - name: POD_NAME
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
            - name: OPERATOR_NAME
              value: "rabbitmq-operator"
      volumes:
        - name: runner
          emptyDir: {}
We are ready to deploy our operator and create an instance of our CRD.
Test of the operator
Deploy the operator
$ minikube start --memory 4096
$ kubectl create ns test
$ kubectl create -f deploy/crds/rabbitmq_v1alpha1_rabbitmq_crd.yaml -n test
$ kubectl create -f deploy/service_account.yaml -n test
$ kubectl create -f deploy/role.yaml -n test
$ kubectl create -f deploy/role_binding.yaml -n test
$ kubectl create -f deploy/operator.yaml -n test
Now we can create a rabbitmq instance.
Use the operator
$ kubectl create -f deploy/crds/rabbitmq_v1alpha1_rabbitmq_cr.yaml -n test
Done. Our operator is working. To check that the reconciliation works, you can edit the rabbitmq object to change the size of the cluster and verify that the change is applied.
$ kubectl exec -it rabbitmq-1 bash -n test
root@rabbitmq-1:/# rabbitmqctl cluster_status
Cluster status of node rabbit@172.17.0.5 ...
[{nodes,[{disc,['rabbit@172.17.0.3','rabbit@172.17.0.5',
                'rabbit@172.17.0.8']}]},
 {running_nodes,['rabbit@172.17.0.8','rabbit@172.17.0.3','rabbit@172.17.0.5']},
 {cluster_name,<<"rabbit@rabbitmq-0.rabbitmq.test.svc.cluster.local">>},
 {partitions,[]},
 {alarms,[{'rabbit@172.17.0.8',[]},
          {'rabbit@172.17.0.3',[]},
          {'rabbit@172.17.0.5',[]}]}]
root@rabbitmq-1:/# exit
$ kubectl edit rabbitmq example-rabbitmq -n test # Change the size to 2
rabbitmq.rabbitmq.olibre.io/example-rabbitmq edited
$ kubectl exec -it rabbitmq-1 bash -n test
root@rabbitmq-1:/# rabbitmqctl cluster_status
Cluster status of node rabbit@172.17.0.5 ...
[{nodes,[{disc,['rabbit@172.17.0.3','rabbit@172.17.0.5',
                'rabbit@172.17.0.8']}]},
 {running_nodes,['rabbit@172.17.0.3','rabbit@172.17.0.5']},
 {cluster_name,<<"rabbit@rabbitmq-0.rabbitmq.test.svc.cluster.local">>},
 {partitions,[]},
 {alarms,[{'rabbit@172.17.0.3',[]},{'rabbit@172.17.0.5',[]}]}]
root@rabbitmq-1:/# exit
Conclusion
So it’s possible for an ops engineer to create an operator without writing a single line of code, and it stays simple.
In this example, I used the sdk (that I just discovered) and a rabbitmq statefulset (that I found for this article). The integration of both was simple and efficient. We can say this sdk seems to be a good tool.
Nevertheless, the Mesos team released a similar tool called kudo. It would be interesting to test it, to see whether the two are equivalent or whether one is better suited than the other depending on the use case.