Growing Your Kube Workforce

The Struggles of a Growing Business

Note: If this is your first visit, start with the Kubernetes Home Lab tutorial to get the initial lab up and running.

So, business is booming. The three of you just can’t seem to keep up with the load and have a life outside of work…

Time to hire some help…

Today we are going to add three more nodes to our Kubernetes cluster.

You will need another NUC like the one that you used to build the initial lab. Like before, it will need at least 4 cores, 1TB NVMe, and 64GB of RAM.

  1. First, make sure that you have DNS A and PTR records for the new host. The DNS configuration that we set up previously included three KVM hosts, kvm-host01, kvm-host02, and kvm-host03. So, let’s assume that this host is going to be kvm-host02.
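
    If you want to confirm the records before moving on, a quick forward and reverse lookup is enough. The address in the reverse lookup below is only a placeholder; use whatever IP you assigned to kvm-host02 in your zone:

    # Forward (A) record
    dig +short kvm-host02.dc1.${LAB_DOMAIN}
    # Reverse (PTR) record - substitute the IP returned by the lookup above
    dig +short -x 10.11.12.20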

  2. Update the helper scripts for this project:

    cd ${OKD_LAB_PATH}/okd-home-lab
    git fetch
    git pull
    cp ./bin/*.sh ${OKD_LAB_PATH}/bin
    chmod 700 ${OKD_LAB_PATH}/bin/*.sh
    
  3. Read the MAC address off the bottom of the NUC, then create the iPXE and kickstart files with the helper script:

    ${OKD_LAB_PATH}/bin/deployKvmHost.sh -c=1 -h=kvm-host02 -m=<MAC Address Here> -d=nvme0n1
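
    For reference, a filled-in invocation looks like this; the MAC address is just an example, substitute the one printed on your NUC:

    ${OKD_LAB_PATH}/bin/deployKvmHost.sh -c=1 -h=kvm-host02 -m=1c:69:7a:00:11:22 -d=nvme0n1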
    
  4. Now, connect the NUC to the remaining LAN port on the internal router and power it on. After a few minutes, it should be up and running.

  5. Verify that everything looks good on the new host:

    ssh root@kvm-host02.dc1.${LAB_DOMAIN}
    # Take a look around
    exit
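
    For example, from inside that SSH session a few generic checks will confirm that the host came up the way we expect (virsh assumes libvirt was installed by the KVM host build):

    lscpu | grep "^CPU(s):"   # expected core/thread count
    free -h                   # roughly 64GB of RAM
    lsblk /dev/nvme0n1        # the 1TB NVMe drive
    virsh list --all          # no VMs defined yet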
    
  6. Now, back to the business of doubling our workforce.

    Create the inventory file for the new worker nodes:

    cat << EOF > ${OKD_LAB_PATH}/worker-inventory
    kvm-host02,okd4-worker-0,20480,6,100,200,worker
    kvm-host02,okd4-worker-1,20480,6,100,200,worker
    kvm-host02,okd4-worker-2,20480,6,100,200,worker
    EOF
    

    Note: We added an extra disk to these nodes. Next week we’ll install Ceph storage using those disks.
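
    The inventory fields are positional. Judging from the values above and the extra disk called out in the note, they are the KVM host, the VM name, RAM in MB, vCPUs, two disk sizes in GB, and the node role; this reading is inferred from the values rather than taken from the script's documentation:

    # kvm-host,vm-name,ram-mb,vcpus,disk1-gb,disk2-gb,role   (field meanings inferred)
    kvm-host02,okd4-worker-0,20480,6,100,200,worker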

  7. Initialize the ignition files and iPXE boot files for the new worker nodes:

    ${OKD_LAB_PATH}/bin/initWorker.sh -i=${OKD_LAB_PATH}/worker-inventory -c=1
    
  8. Start the nodes:

    ${OKD_LAB_PATH}/bin/startNodes.sh -i=${OKD_LAB_PATH}/worker-inventory -c=1
    
  9. Now, you need to monitor the cluster Certificate Signing Requests. You are looking for requests in a Pending state.

    watch oc get csr
    
  10. When you see Certificate Signing Requests in a Pending state, you need to approve them:

    oc get csr -ojson | jq -r '.items[] | select(.status == {} ) | .metadata.name' | xargs oc adm certificate approve
    

    There will be 6 CSRs in total: each node first submits a client CSR, and once that is approved it submits a second CSR for its serving certificate, so run the approval command again when the second batch shows up as Pending.
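
    Once that second round is approved, the new workers finish joining the cluster. Verify that all three eventually report Ready:

    oc get nodes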

Designate Master Nodes as Infrastructure Nodes

Since we now have three dedicated worker nodes for our applications, let’s move the infrastructure functions to the control plane.

  1. Add the infra label to your master nodes:

    for i in 0 1 2
    do
      oc label nodes okd4-master-${i}.dc1.${LAB_DOMAIN} node-role.kubernetes.io/infra=""
    done
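
    You can confirm that the label landed on all three masters:

    oc get nodes -l node-role.kubernetes.io/infra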
    
  2. Remove the worker role from the master nodes by marking them as not schedulable for regular workloads:

    oc patch scheduler cluster --patch '{"spec":{"mastersSchedulable":false}}' --type=merge
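
    You can confirm that the patch took effect; this should print false:

    oc get scheduler cluster -o jsonpath='{.spec.mastersSchedulable}{"\n"}'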
    
  3. Add nodePlacement and taint tolerations to the Ingress Controller:

    oc patch -n openshift-ingress-operator ingresscontroller default --patch '{"spec":{"nodePlacement":{"nodeSelector":{"matchLabels":{"node-role.kubernetes.io/infra":""}},"tolerations":[{"key":"node.kubernetes.io/unschedulable","effect":"NoSchedule"},{"key":"node-role.kubernetes.io/master","effect":"NoSchedule"}]}}}' --type=merge
    
  4. Verify that your Ingress pods get provisioned onto the master nodes:

    oc get pod -n openshift-ingress -o wide
    
  5. Reset the Ingress Canary pods:

    Deleting these pods forces them to be recreated, which eliminates an annoying message about the ingress canaries running on the wrong nodes.

    for i in $(oc get pods -n openshift-ingress-canary | grep -v NAME | cut -d" " -f1)
    do
      oc delete pod ${i} -n openshift-ingress-canary
    done
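
    If you prefer a one-liner, deleting every pod in the namespace is equivalent here, since the canary pods are the only ones running in it:

    oc delete pods --all -n openshift-ingress-canary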
    
  6. Repeat for the ImageRegistry:

    oc patch configs.imageregistry.operator.openshift.io cluster --patch '{"spec":{"nodeSelector":{"node-role.kubernetes.io/infra":""},"tolerations":[{"key":"node.kubernetes.io/unschedulable","effect":"NoSchedule"},{"key":"node-role.kubernetes.io/master","effect":"NoSchedule"}]}}' --type=merge
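
    Verify that the image registry pods get rescheduled onto the master nodes:

    oc get pod -n openshift-image-registry -o wide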
    
  7. Finally, do the same for Cluster Monitoring:

    Create a config map that pins each monitoring component to the infra-labeled nodes:

    cat << EOF | oc apply -f -
    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: cluster-monitoring-config
      namespace: openshift-monitoring
    data:
      config.yaml: |
        prometheusOperator:
          nodeSelector:
            node-role.kubernetes.io/infra: ""
          tolerations:
          - key: "node-role.kubernetes.io/master"
            operator: "Equal"
            value: ""
            effect: "NoSchedule"
        prometheusK8s:
          nodeSelector:
            node-role.kubernetes.io/infra: ""
          tolerations:
          - key: "node-role.kubernetes.io/master"
            operator: "Equal"
            value: ""
            effect: "NoSchedule"
        alertmanagerMain:
          nodeSelector:
            node-role.kubernetes.io/infra: ""
          tolerations:
          - key: "node-role.kubernetes.io/master"
            operator: "Equal"
            value: ""
            effect: "NoSchedule"
        kubeStateMetrics:
          nodeSelector:
            node-role.kubernetes.io/infra: ""
          tolerations:
          - key: "node-role.kubernetes.io/master"
            operator: "Equal"
            value: ""
            effect: "NoSchedule"
        grafana:
          nodeSelector:
            node-role.kubernetes.io/infra: ""
          tolerations:
          - key: "node-role.kubernetes.io/master"
            operator: "Equal"
            value: ""
            effect: "NoSchedule"
        telemeterClient:
          nodeSelector:
            node-role.kubernetes.io/infra: ""
          tolerations:
          - key: "node-role.kubernetes.io/master"
            operator: "Equal"
            value: ""
            effect: "NoSchedule"
        k8sPrometheusAdapter:
          nodeSelector:
            node-role.kubernetes.io/infra: ""
          tolerations:
          - key: "node-role.kubernetes.io/master"
            operator: "Equal"
            value: ""
            effect: "NoSchedule"
        openshiftStateMetrics:
          nodeSelector:
            node-role.kubernetes.io/infra: ""
          tolerations:
          - key: "node-role.kubernetes.io/master"
            operator: "Equal"
            value: ""
            effect: "NoSchedule"
        thanosQuerier:
          nodeSelector:
            node-role.kubernetes.io/infra: ""
          tolerations:
          - key: "node-role.kubernetes.io/master"
            operator: "Equal"
            value: ""
            effect: "NoSchedule"
    EOF
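
    The monitoring operator will roll the components out again with the new placement. Verify that the pods land on the master nodes:

    oc get pods -n openshift-monitoring -o wide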
    

That’s it! We now have three more nodes in our cluster.

Next week we’ll add Ceph storage.
