CKAD

It’s taken me nearly a year, but I finally figured out one of the questions that stumped me in my CKAD exam (writeup: https://blenderfox.com/2019/12/01/ckad-writeup/).

In the exam, the question was to terminate a cronjob if it lasts longer than 17 seconds. There’s a startup deadline but not a duration deadline. It could be implemented within the command of the application itself, or by telling the cronjob to replace any previous run that is still going.

Well, I finally had that situation recently at work and wanted to terminate a cronjob if it was active for more than 5 minutes, since the job shouldn’t take that long. I finally found out that the answer was not in the CronJob documentation, but in the Job documentation.

A CronJob spawns a Job resource for each run, and the Job specification accepts spec.activeDeadlineSeconds (from a CronJob, that is spec.jobTemplate.spec.activeDeadlineSeconds). Once the Job has been active for that many seconds, its running pods are terminated and the Job is marked as failed.
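
As a rough sketch of where the field sits, the snippet below writes a CronJob manifest with a 5 minute deadline to a file. The resource name, schedule, image and command are made up for illustration, and batch/v1 assumes a reasonably recent cluster (older ones served CronJob from batch/v1beta1).

```shell
# Write a CronJob manifest whose Jobs are terminated after 300 seconds.
# All names and the schedule are hypothetical placeholders.
cat > cronjob-deadline.yaml <<'EOF'
apiVersion: batch/v1
kind: CronJob
metadata:
  name: example-job
spec:
  schedule: "*/10 * * * *"
  jobTemplate:
    spec:
      # Job-level deadline: the Job is failed once it has been active this long.
      activeDeadlineSeconds: 300
      template:
        spec:
          restartPolicy: Never
          containers:
          - name: worker
            image: busybox
            command: ["sh", "-c", "sleep 600"]
EOF
# Apply it against a cluster with:
#   kubectl apply -f cronjob-deadline.yaml
```

With the sleep set longer than the deadline, each run should end up as a failed Job rather than completing.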

CKA Exam Passed

5 questions I could not answer, and one I could, but arguably that question was ambiguous

  1. Fix a broken cluster — kubelet was started but couldn’t connect to itself.
  2. Add node to cluster. Nodes do not have kubeadm installed.
  3. Static pod. Couldn’t find the path where the manifest yaml needed to go.

4 and 5 I can’t remember the questions but will update if I remember

Ambiguous Question:

  1. Create a pod with a persistent volume, that isn’t persistent, and doesn’t tell you how big to make the PV. I used emptyDir, but that’s not really a PV (didn’t create a PV or a PVC)

CKAD Writeup

So I did the CKAD exam and it was one of the latest exams I’ve done, starting at 22:45 and finishing at 00:45. The CKAD exam is 2 hours versus the CKA’s 3 hours

And I went into the exam feeling relatively confident. But, damn, the 2 hours goes by really quickly.

Had several questions I wasn’t able to complete or only partially complete.

Liveness and Readiness Probes

This question wanted a pod to be restarted if an endpoint returns 500. Simple enough, but there was a catch: if another endpoint returns 500, then the application is still starting, so the check should be disregarded.

I’ve done something similar in a real-life scenario by implementing the check as a curl command (I should write a blog entry on that some time).

So in the exam, I set both the liveness and readiness checks to chain two curl commands together: if the first endpoint (/starting in this case) returned 200, it would then call the next endpoint (/healthz) and fail if that gave a 500.

Buuuuut, the image didn’t have curl installed so the probes failed. I could use the hack I’ve used in my image and install curl as part of the check, but time constraints wouldn’t let me.
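
For reference, the chained check might look roughly like the function below. This is a sketch of the idea rather than what I typed in the exam; the function name and the base URL are assumptions, and it only works in an image that actually ships curl.

```shell
# Probe logic sketch: succeed while the app is still starting,
# otherwise succeed only if /healthz answers without an HTTP error.
check_health() {
  base_url="$1"   # hypothetical, e.g. http://localhost:8080
  # curl -f exits non-zero on HTTP 4xx/5xx, so a 500 from /starting lands here.
  if ! curl -fs -o /dev/null "$base_url/starting"; then
    return 0      # still starting: disregard the health check
  fi
  # Started: the exit status of this curl becomes the probe result.
  curl -fs -o /dev/null "$base_url/healthz"
}
```

In a pod spec the same logic would sit behind an exec probe running sh -c, since the kubelet treats a non-zero exit code from the command as a failed probe.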

Persistent Volumes

Similar to the CKA question, there was a quirkily worded question here which wanted me to add a file to a node, create a pod that used hostPath and reserve a 1Gi PV. The documentation does not provide an example of that, just a pod with a hostPath as an internal volume: https://kubernetes.io/docs/concepts/storage/volumes/#hostpath
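
My best guess at what the question was after is a hostPath-backed PV, a claim against it, and a pod that mounts the claim. The sketch below writes all three to one file; every name, the host path and the "manual" storage class are placeholders I picked, not anything from the exam.

```shell
# Write a hostPath PersistentVolume (1Gi), a matching claim, and a pod using it.
# Names, the host path and the storage class are hypothetical.
cat > hostpath-pv.yaml <<'EOF'
apiVersion: v1
kind: PersistentVolume
metadata:
  name: task-pv
spec:
  storageClassName: manual
  capacity:
    storage: 1Gi
  accessModes:
  - ReadWriteOnce
  hostPath:
    path: /data/exam
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: task-pvc
spec:
  storageClassName: manual
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: task-pod
spec:
  containers:
  - name: app
    image: busybox
    command: ["sh", "-c", "ls /mnt/data; sleep 3600"]
    volumeMounts:
    - name: data
      mountPath: /mnt/data
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: task-pvc
EOF
# Apply it against a cluster with:
#   kubectl apply -f hostpath-pv.yaml
```

The file added to the node would go under the hostPath directory, so it shows up inside the pod at the mount path.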

Network Policies

A technology I haven’t used in Kubernetes yet. They gave several policies, one that allowed “app:proxy” and one that allowed “app:db”, and wanted us to edit a pod so that it was only allowed to talk to those.

We were not allowed to modify the policies. I can’t remember whether we were allowed to create new policies for this question.

But both those policies use the app label. And the pod can’t have the same label with two values (I did try).

Though thinking about it now, and after a few checks, the NetworkPolicy object describes how to restrict traffic to the pods in question — so those selectors may be related to the pods the policy is restricting. I think I should have looked inside the policies more carefully to see what it was saying on the ingress rule and see if it was saying something like “app:frontend”, and then making sure the pod was labelled accordingly.
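
To make that concrete, here is a made-up policy (not one from the exam) showing the two different selectors: the top-level podSelector picks the pods the policy applies to, and the one under ingress.from picks who is allowed to connect to them.

```shell
# Write a hypothetical NetworkPolicy to show which selector does what.
cat > allow-from-frontend.yaml <<'EOF'
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-from-frontend
spec:
  # Pods this policy applies to (the ones being protected).
  podSelector:
    matchLabels:
      app: db
  policyTypes:
  - Ingress
  ingress:
  - from:
    # Pods allowed to connect to the protected pods.
    - podSelector:
        matchLabels:
          app: frontend
EOF
# A pod would then be let in by labelling it to match the "from" selector,
# e.g. (my-pod is a placeholder name):
#   kubectl label pod my-pod app=frontend --overwrite
```

If the exam policies were shaped like this, the label to put on the pod would have been whatever appeared under "from", not the label in the top-level selector.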

“Ambassador” Sidecar Pattern

A big chunk of the exam time was taken up by the sidecar questions — far more time than I would have liked, to be honest.

They had a question on the adapter pattern, using fluentd, which was fine and I got that to work. But there was another where I had to use HAProxy to proxy requests to a different port (the ambassador pattern). A useful use case, but I ran out of time to finish it. I wanted to come back and revisit it if I had time, but didn’t.

CronJobs

Terminate a cronjob if it lasts longer than 17 seconds. There’s a startup deadline but not a duration deadline. It could be implemented within the command of the application itself, or by telling the cronjob to replace any previous run that is still going.

Thoughts

I don’t think I passed this. Having so many issues is probably going to take me into the 60s mark.

CKA Exam: Strike #2

I took my CKA exam for the second time, and failed again. This time, however, I got much closer to the pass mark than my first time.

Things I think I fluffed on:

Cluster DNS

Pods, services and how they show up using nslookup. I got caught up in trying to figure out why my DNS wasn’t working, and I think it’s because I was trying to nslookup from outside the cluster, which obviously would not resolve the “.cluster.local” domain correctly. I forgot that you can get an interactive, in-cluster shell using:

kubectl run -i --tty busybox --image=busybox -- sh

Not to mention that doing nslookup {service}.svc.cluster.local won’t work: the full name includes the namespace ({service}.{namespace}.svc.cluster.local), and I had to pass -type=a to nslookup to get the IP address of the service and confirm it was resolving.
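
Putting those two together, a one-shot lookup from inside the cluster could be wrapped up like this. The function name, the pod name dns-check and the default namespace are my own placeholders, and cluster.local assumes the cluster uses the default domain.

```shell
# Resolve a Service's A record from a throwaway busybox pod inside the cluster.
dns_check() {
  service="$1"
  namespace="${2:-default}"   # assumed default namespace
  # --rm deletes the pod once nslookup exits; -i keeps us attached to see output.
  kubectl run dns-check --rm -i --restart=Never --image=busybox -- \
    nslookup -type=a "${service}.${namespace}.svc.cluster.local"
}
```

For example, dns_check web prod would look up web.prod.svc.cluster.local and print whatever address the cluster DNS returns.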

etcd Snapshots

This got me both times. The first time I had no idea why doing a snapshot command was failing. The second time I figured out how to do the backup and how to invoke it from the pod, but still got it wrong. Now I’ve figured it out (and it was right in front of my face):

WARNING:
Environment variable ETCDCTL_API is not set; defaults to etcdctl v2.
Set environment variable ETCDCTL_API=3 to use v3 API or ETCDCTL_API=2 to use v2 API.

USAGE:
etcdctl [global options] command [command options] [arguments...]

VERSION:
3.2.18

I wasn’t setting the ETCDCTL_API variable beforehand, so it was falling back to the v2 API, which doesn’t have the snapshot command:

# etcdctl
NAME:
etcdctl - A simple command line client for etcd.

WARNING:
Environment variable ETCDCTL_API is not set; defaults to etcdctl v2.
Set environment variable ETCDCTL_API=3 to use v3 API or ETCDCTL_API=2 to use v2 API.

USAGE:
etcdctl [global options] command [command options] [arguments...]

VERSION:
3.2.18

COMMANDS:
backup backup an etcd directory
cluster-health check the health of the etcd cluster
mk make a new key with a given value
mkdir make a new directory
rm remove a key or a directory
rmdir removes the key if it is an empty directory or a key-value pair
get retrieve the value of a key
ls retrieve a directory
set set the value of a key
setdir create a new directory or update an existing directory TTL
update update an existing key with a given value
updatedir update an existing directory
watch watch a key for changes
exec-watch watch a key for changes and exec an executable
member member add, remove and list subcommands
user user add, grant and revoke subcommands
role role add, grant and revoke subcommands
auth overall auth controls
help, h Shows a list of commands or help for one command

GLOBAL OPTIONS:
--debug output cURL commands which can be used to reproduce the request
--no-sync don't synchronize cluster information before sending request
--output simple, -o simple output response in the given format (simple, `extended` or `json`) (default: "simple")
--discovery-srv value, -D value domain name to query for SRV records describing cluster endpoints
--insecure-discovery accept insecure SRV records describing cluster endpoints
--peers value, -C value DEPRECATED - "--endpoints" should be used instead
--endpoint value DEPRECATED - "--endpoints" should be used instead
--endpoints value a comma-delimited list of machine addresses in the cluster (default: "http://127.0.0.1:2379,http://127.0.0.1:4001")
--cert-file value identify HTTPS client using this SSL certificate file
--key-file value identify HTTPS client using this SSL key file
--ca-file value verify certificates of HTTPS-enabled servers using this CA bundle
--username value, -u value provide username[:password] and prompt if password is not supplied.
--timeout value connection timeout per request (default: 2s)
--total-timeout value timeout for the command execution (except watch) (default: 5s)
--help, -h show help
--version, -v print the version

# ETCDCTL_API=3 etcdctl
NAME:
etcdctl - A simple command line client for etcd3.

USAGE:
etcdctl

VERSION:
3.2.18

API VERSION:
3.2

COMMANDS:
get Gets the key or a range of keys
put Puts the given key into the store
del Removes the specified key or range of keys [key, range_end)
txn Txn processes all the requests in one transaction
compaction Compacts the event history in etcd
alarm disarm Disarms all alarms
alarm list Lists all alarms
defrag Defragments the storage of the etcd members with given endpoints
endpoint health Checks the healthiness of endpoints specified in `--endpoints` flag
endpoint status Prints out the status of endpoints specified in `--endpoints` flag
watch Watches events stream on keys or prefixes
version Prints the version of etcdctl
lease grant Creates leases
lease revoke Revokes leases
lease timetolive Get lease information
lease keep-alive Keeps leases alive (renew)
member add Adds a member into the cluster
member remove Removes a member from the cluster
member update Updates a member in the cluster
member list Lists all members in the cluster
snapshot save Stores an etcd node backend snapshot to a given file
snapshot restore Restores an etcd member snapshot to an etcd directory
snapshot status Gets backend snapshot status of a given file
make-mirror Makes a mirror at the destination etcd cluster
migrate Migrates keys in a v2 store to a mvcc store
lock Acquires a named lock
elect Observes and participates in leader election
auth enable Enables authentication
auth disable Disables authentication
user add Adds a new user
user delete Deletes a user
user get Gets detailed information of a user
user list Lists all users
user passwd Changes password of user
user grant-role Grants a role to a user
user revoke-role Revokes a role from a user
role add Adds a new role
role delete Deletes a role
role get Gets detailed information of a role
role list Lists all roles
role grant-permission Grants a key to a role
role revoke-permission Revokes a key from a role
check perf Check the performance of the etcd cluster
help Help about any command

OPTIONS:
--cacert="" verify certificates of TLS-enabled secure servers using this CA bundle
--cert="" identify secure client using this TLS certificate file
--command-timeout=5s timeout for short running command (excluding dial timeout)
--debug[=false] enable client-side debug logging
--dial-timeout=2s dial timeout for client connections
--endpoints=[127.0.0.1:2379] gRPC endpoints
-h, --help[=false] help for etcdctl
--hex[=false] print byte strings as hex encoded strings
--insecure-skip-tls-verify[=false] skip server certificate verification
--insecure-transport[=true] disable transport security for client connections
--key="" identify secure client using this TLS key file
--user="" username[:password] for authentication (prompt if password is not supplied)
-w, --write-out="simple" set the output format (fields, json, protobuf, simple, table)

And then I can run

ETCDCTL_API=3 etcdctl snapshot save snapshot.db --cacert=/etc/kubernetes/pki/etcd/ca.crt --cert=/etc/kubernetes/pki/etcd/healthcheck-client.crt --key=/etc/kubernetes/pki/etcd/healthcheck-client.key

To create the snapshot.
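
The v3 command list above also includes snapshot status, which looks like a quick way to sanity-check the file afterwards. A small sketch, with a function name of my own choosing:

```shell
# Print the status of a saved etcd snapshot file as a table.
# Uses the v3 API explicitly, since the v2 API has no snapshot command.
verify_snapshot() {
  ETCDCTL_API=3 etcdctl snapshot status "$1" --write-out=table
}
```

Calling verify_snapshot snapshot.db reads the local file, so it should not need the certificate flags that snapshot save does.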

Certificate Rotation

I need to look this one up — I had no idea how to rotate the certificates

Static Pods

I’d never directly dealt with static pods before this exam, and I don’t think I had this question in my first run, so it was one I didn’t know the answer to. A bit of hunting on the k8s site led me to figure out it was a static pod question, but I couldn’t find out where the exam cluster was looking for its static pod manifests. The question told me a directory, but my yaml didn’t seem to be picked up by the kubelet.
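
For next time: the kubelet takes its static pod directory from the staticPodPath setting in its config file (or the older --pod-manifest-path flag). The sketch below pulls that value out of the config; the default file location is where kubeadm puts it, which is an assumption about how the exam cluster was built, and the function name is mine.

```shell
# Print the static pod directory configured for the kubelet.
# Default config location assumes a kubeadm-style node.
static_pod_path() {
  config="${1:-/var/lib/kubelet/config.yaml}"
  # Match the top-level "staticPodPath: <dir>" line and print the directory.
  awk '$1 == "staticPodPath:" { print $2 }' "$config"
}
```

On a kubeadm node this usually prints /etc/kubernetes/manifests. If it prints nothing, the path may be coming from the kubelet's command-line flags instead, which show up in the kubelet process arguments.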


Final note

Generally, a lot of the questions from my first exam run showed up again in this run, which let me run through over half of the exam fairly quickly. I thought I was going to do better than my first run, and I did, but not by much.

LPIC-1

Linux Professional Institute

I’ve finished studying for the first of two exams for the LPIC-1 certification, and I have found some exam questions (about 600 of them), and have started to go through them.

The first thing that struck me about these questions is that either I’ve not been studying all the topics, or some topics have been removed from the exam. For example, some of the questions reference LILO, but according to the LPI page on the 101 exam, there’s no mention of LILO (but there is mention of Grub 2 and Grub Legacy). Then again, LILO and Grub Legacy are quite limited by today’s standards, so it could be that they really have been removed from the exam. Guess I’ll have to take that chance.
