How does Kubernetes work?

How does Kubernetes work, and how does cloud deployment work? Today we have infrastructure as code, meaning that we work in a declarative environment where we specify what resources we need and how they interact. So how does that all work? How does Kubernetes know how to get our application up and running?

Kubernetes works through API objects and resources defined in YAML/JSON files. These can be, for example, definitions of the resources a container needs, the commands to execute the app in that container, or the ports that must be open for users to connect to it. These files can define an application namespace (where all the components pertaining to an application will live) that a set of resources belongs to. They can also define behaviour or events, for example sync waves and pre-sync hooks: steps, such as integration testing, that happen before a sync. So, let's say, before you deploy something, set up a database and then deploy the application, so that the database is already there. You can have ConfigMaps, which are very important to override default settings in an application, for example for a specific environment or for specific users.

Once we have these API objects, where do they go? Kubernetes has a control plane, and the control plane is just a machine. It could be a virtual machine or an on-premises server, reachable via kubectl, that can be queried and pushed to in order to create these objects. How do we actually create these objects? We take a declarative approach, because Kubernetes objects are declarative. We specify information in a Dockerfile and in config files (deployment files), like how much memory a container needs, or which applications run on which nodes; these files describe an intent. So Kubernetes gives you a declarative language to express an intent in a way that creates the resources and objects for an application to run. These files are written in YAML or JSON, and they also specify the resources allotted to the application.
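As a minimal sketch of what two such API objects could look like (the name video-app and the settings are hypothetical, chosen just for illustration): a namespace grouping the application's components, and a ConfigMap overriding defaults for one environment.

```yaml
# Hypothetical example: a namespace where all components of an application live,
# plus a ConfigMap that overrides default settings for a specific environment.
apiVersion: v1
kind: Namespace
metadata:
  name: video-app                      # everything belonging to this app lives here
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: video-app-config
  namespace: video-app
data:
  LOG_LEVEL: "debug"                   # override the application's default log level
  DB_HOST: "postgres.video-app.svc"    # e.g. the database set up before the sync
```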

Other Kubernetes/cloud paradigms are policies, for example restart after an update. So, if I get an out-of-memory exception, should the container restart right away, or should it be left off? The container carries status information describing the state the application is in and the state it needs to be in, and Kubernetes can check that with a readiness probe or a liveness probe, which is essentially a health check: is the application on? All of these properties can be defined and overridden in config files, which describe the container in conjunction with a Dockerfile; then the application can be instantiated. So, to create a container running your application, you need a set of specs and a Dockerfile to deploy it on a cluster. Once you have these parts together, you use kubectl and deploy the application via apply. Kubectl is the interface you are given to operate the Kubernetes cluster. The resource/application is defined in a kustomization.yaml file; you specify what container the application needs through a deployment file, which holds this information, and a service file, which, for example, exposes the port needed by other pods in the same application to reach it. Once you have this set of files together, you have your object to deploy on a node. The resulting structure is pyramidal: you have a pod inside a node, inside the pod there is a container, and inside the container there is your application; usually containers run a single process.
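To make that concrete, here is a hedged sketch of a deployment file for the hypothetical video-app above; the image name, port, probe paths, and numbers are assumptions, not taken from any real application.

```yaml
# Hypothetical deployment file: declares the intent "run one replica of this
# container with this much memory, and check its health via HTTP probes".
apiVersion: apps/v1
kind: Deployment
metadata:
  name: video-app
  namespace: video-app
spec:
  replicas: 1
  selector:
    matchLabels:
      app: video-app
  template:
    metadata:
      labels:
        app: video-app
    spec:
      containers:
        - name: video-app
          image: registry.example.com/video-app:1.0   # assumed image name
          command: ["/app/server"]                    # overrides the Dockerfile's default command
          ports:
            - containerPort: 8080
          resources:
            requests:
              memory: "256Mi"   # how much memory the container needs
              cpu: "250m"
            limits:
              memory: "512Mi"   # exceeding this can trigger an out-of-memory kill
          livenessProbe:        # "is the application on?" -> restart it if not
            httpGet:
              path: /healthz
              port: 8080
          readinessProbe:       # "is it ready for traffic?" -> gate the service
            httpGet:
              path: /ready
              port: 8080
```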

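The matching service file and kustomization.yaml could then look like this (again a sketch with assumed names); the kustomization ties the files together so a single apply deploys them all.

```yaml
# service.yaml (hypothetical): exposes a stable port inside the cluster and
# routes it to the container port, so other pods in the application can reach it.
apiVersion: v1
kind: Service
metadata:
  name: video-app
  namespace: video-app
spec:
  selector:
    app: video-app       # matches the pods created by the deployment above
  ports:
    - port: 80           # port exposed to other pods
      targetPort: 8080   # containerPort of the application
```

```yaml
# kustomization.yaml (hypothetical): references the other files, so that
# "kubectl apply -k ." deploys the whole set as one application.
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - namespace.yaml
  - configmap.yaml
  - deployment.yaml
  - service.yaml
```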
Kubernetes in a video streaming workflow

So once you run kubectl apply, jobs are created to deploy the application. The scheduler, which lives in the control plane, assigns pods to nodes: the jobs to create the application run, and the pods are scheduled to be deployed on a node within the Kubernetes cluster. Finally, your application is started with the command specified in the Dockerfile (it can also be specified in your deployment file). Your application is now running, and users connect to it through an Ingress. Ingresses are another type of Kubernetes object, and they make the application reachable from the outside world.

How does the network work? You basically have a pod running the Kubernetes DNS server, which allows the pods to find each other across nodes, and each node runs a proxy (kube-proxy) to route requests. That way, the Ingress controller knows how to route, through the kube-proxy, to a specific application running in a container: the DNS server runs in a pod, and a request from a user is passed through the kube-proxy on the node.

Another example of kubectl: you can say, okay, delete this pod, and the pod is deleted through kubectl; if a restart policy applies to it, it comes back up by itself. The nodes of a Kubernetes cluster can be virtual or physical machines, and if you want to take a node out of service, you cordon it. Cordoning a node means making it unschedulable, so the scheduler knows not to assign any pods to it.

To reiterate, in Kubernetes we have API objects, which are submitted to the Kubernetes control plane. The API objects are defined in a declarative approach and contain the information needed to run the application, specified in a kustomization file that references a deployment file and a service file to define the application and the resources needed to run it. When you apply the configuration to the control plane, the controller runs a job that says: okay, we need this kind of pod with these resources to run this application. Then the kube-scheduler assigns the pods to nodes, the containers are created within the pods, and the application is started up within the containers. Now your application is live on a Kubernetes cluster!
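An Ingress for the hypothetical application above might look like this sketch (the hostname and path are assumptions); it tells the Ingress controller how to route outside traffic to the service.

```yaml
# Hypothetical Ingress: makes the application reachable from the outside world
# by routing requests for an assumed hostname to the service defined earlier.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: video-app
  namespace: video-app
spec:
  rules:
    - host: video-app.example.com    # assumed public hostname
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: video-app      # the service file above
                port:
                  number: 80
```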

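And the kubectl side of the same story, as a short sketch (the pod and node names are placeholders):

```sh
# Deploy the application described by the kustomization file in the current directory.
kubectl apply -k .

# Delete a pod; with a restart policy in place, the deployment brings it back up.
kubectl delete pod video-app-6b7f9c5d8-abcde -n video-app

# Cordon a node: mark it unschedulable so the scheduler assigns no new pods to it.
kubectl cordon worker-node-1

# Undo it when the node is back in service.
kubectl uncordon worker-node-1
```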
In the realm of video, Kubernetes clusters are used, for example, to run distributed encoder software, where a manager controls pods running the encoding engine. These can take, say, a multicast input and encode or transcode it into an HLS playlist, always optimizing for quality while reducing the disk space needed to host the video. These video encoding clusters can run either on premises or in the cloud. Multiple video channels can run on a node, depending on the node's resources and the type of channel: for example, a 4K channel requires much more bandwidth and processing power to encode from a mezzanine or SDI live source than an SD stream.
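As a rough sketch of how that sizing could be expressed (the image and numbers are made up for illustration, not real encoder requirements), a 4K channel's pod would simply declare much larger resource requests than an SD one, so the scheduler packs fewer such pods onto each node.

```yaml
# Hypothetical pod for one 4K encoding channel; the figures are illustrative only.
apiVersion: v1
kind: Pod
metadata:
  name: encoder-4k-channel
spec:
  containers:
    - name: encoder
      image: registry.example.com/encoding-engine:1.0  # assumed image
      resources:
        requests:
          cpu: "8"          # e.g. a 4K live transcode needs many cores
          memory: "16Gi"
        limits:
          cpu: "12"
          memory: "24Gi"    # an SD channel would request a fraction of this
```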