The Non-Technical Guide to Kubernetes (That’s Admittedly a Bit Technical)
Despite its meteoric rise in popularity, Kubernetes (also known as k8s) is notoriously difficult to describe in non-technical terms.
Any discussion of it relies on some knowledge of containerization, which itself is an extension of virtualization.
That’s a lot of multi-syllabic words to introduce at once, so let’s break all of that down.
What is Virtualization?
According to VMware, a leading provider of virtualization technology, virtualization is “the process of creating a software-based, or virtual, representation of something, such as virtual applications, servers, storage, and networks.”
Virtualization, within the context of software engineering, refers to the use of virtual machines (VMs), which are software representations of computer hardware, firmware, operating systems, and software. A VM is typically stored as a set of files and requires a hypervisor to run.
Once active, they act as computers within computers, allowing a user to install software, run a server, or do practically any function that a standard computer can fulfill.
Fully virtual machines have existed since 1966 (IBM's CP-40 and CP-67), but they were not widely used for production applications until the 2000s. Before then, if you were developing an application that required a database, API, web server, etc., you would likely deploy all of these services together on one physical server, or each on a separate server (if scalability was a priority).
Since virtualization is not tied directly to hardware, it provides a lot of flexibility in application deployment. The underlying hypervisor can manage the actual hardware, while virtual machines can be spun up and down as necessary. This process can even be automated to scale with demand, and there are many tools that do this including hypervisor-specific tools as well as Chef, Ansible, Terraform, and Puppet.
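The scale-with-demand loop those tools implement can be sketched in a few lines. This is a toy illustration with hypothetical thresholds and function names; real tools like Terraform or hypervisor autoscalers are declarative, but the underlying decision looks roughly like this:

```python
# Toy sketch of a demand-based VM autoscaling decision.
# All thresholds and names here are hypothetical illustrations.

def desired_vm_count(current_vms: int, avg_cpu_percent: float,
                     scale_up_at: float = 75.0, scale_down_at: float = 25.0,
                     min_vms: int = 2, max_vms: int = 10) -> int:
    """Return how many VMs the pool should have, given average CPU load."""
    if avg_cpu_percent > scale_up_at and current_vms < max_vms:
        return current_vms + 1   # demand is high: add a VM
    if avg_cpu_percent < scale_down_at and current_vms > min_vms:
        return current_vms - 1   # demand is low: remove a VM
    return current_vms           # load is in the comfortable band

print(desired_vm_count(3, 90.0))  # high load  -> 4
print(desired_vm_count(3, 10.0))  # low load   -> 2
```

A real deployment would run a loop like this continuously against live metrics, and ask the hypervisor (or cloud API) to create or destroy VMs to match the desired count.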
There are many production systems relying on virtual machines, and many valid uses for them, but there are certain inefficiencies.
To address these inefficiencies, you need containerization.
What is Containerization?
A container is an encapsulation of an application with the environment in which it is designed to operate. Much like virtualization, this promises the ability to deploy a fully-functional and self-contained application on any capable system. There are a few key differences, however.
Containers are abstracted at the application layer, whereas virtual machines are abstracted at the hardware layer.
Containers can share system and library resources that virtual machines would otherwise have to duplicate. This makes resource usage more efficient over standard virtualization, translating directly to faster management operations, more cost-effective uptime, and less total downtime.
Containerization is not a novel concept. As early as 1982, most Unix-like file systems could achieve partial file system isolation with chroot. LXC, which was a core piece of early implementations of Docker, has been in use since 2008. A smattering of other container systems has been developed between 2000 and the present day.
Thanks largely to the ubiquity of Docker, containerization stands among the most widely embraced abstractions in software engineering. Its impact on development, integration, deployment, and distribution is near impossible to overstate.
To summarize so far: virtualization decouples software from physical hardware, and containerization decouples applications from their operating environments. The next layer up is where Kubernetes lives.
What is Kubernetes?
Kubernetes is a “container orchestrator” – i.e., a system for deploying and managing containerized applications.
The primary focus is on automation and scalability.
To enforce redundancy and promote high availability, Kubernetes is designed to operate across a cluster of physical or virtual machines. A highly available cluster typically has one master server and at least two node servers; the master centralizes the logic for managing the containerized application's workload, while the node servers run the workloads the master distributes to them.
The master server has three primary components: the API Server, Scheduler, and Controller Manager.
- The Kube API server is the front end of the Kubernetes control plane – this is how your ops team/developers will interact with the cluster.
- The Scheduler watches for newly created pods (the basic unit of deployment in Kubernetes, wrapping one or more containers) and assigns each one to run on a specific node.
- The Controller Manager runs background tasks, like maintaining the correct number of pods within a cluster.
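The Scheduler's core job, picking a node for each new pod, can be sketched as a "least-loaded" placement. This is a deliberate simplification with made-up names; the real kube-scheduler weighs many filtering and scoring rules (resource requests, affinity, taints, and more):

```python
# Toy "least-loaded" scheduler. Node and pod names are hypothetical;
# the real kube-scheduler applies many more filters and scoring rules.

def schedule(pod: str, nodes: dict) -> str:
    """Assign the pod to the node currently running the fewest pods.
    `nodes` maps node name -> number of pods already placed there."""
    target = min(nodes, key=nodes.get)   # node with the lowest pod count
    nodes[target] += 1                   # record the placement
    return target

nodes = {"node-1": 2, "node-2": 0}
print(schedule("web-1", nodes))  # -> node-2 (had 0 pods)
print(schedule("web-2", nodes))  # -> node-2 (1 pod vs. node-1's 2)
print(schedule("web-3", nodes))  # -> node-1 (tie at 2; min returns the first)
```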
Two more critical services within Kubernetes are called etcd and kube-proxy.
etcd is the cluster's configuration and state store, implemented as a distributed key-value store.
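In spirit, etcd behaves like a key-value store whose clients can watch keys for changes, which is how control-plane components learn about new state. Here is a toy, single-process stand-in (keys and values are hypothetical; the real etcd is distributed and replicates writes via the Raft consensus protocol):

```python
# Toy in-memory stand-in for a watched key-value store.
# The real etcd is distributed, persistent, and Raft-replicated.
from collections import defaultdict

class ToyStore:
    def __init__(self):
        self._data = {}
        self._watchers = defaultdict(list)  # key -> list of callbacks

    def put(self, key, value):
        self._data[key] = value
        for callback in self._watchers[key]:  # notify interested components
            callback(key, value)

    def get(self, key):
        return self._data.get(key)

    def watch(self, key, callback):
        self._watchers[key].append(callback)

store = ToyStore()
store.watch("/pods/web", lambda k, v: print(f"changed: {k} -> {v}"))
store.put("/pods/web", "node-2")   # prints: changed: /pods/web -> node-2
print(store.get("/pods/web"))      # node-2
```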
kube-proxy is a network proxy and load balancer that exists on each of the worker nodes, forwarding incoming requests to the relevant pods (this is how the end-user would interact with Kubernetes).
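kube-proxy's forwarding behavior amounts to choosing a backend pod for each incoming request. A round-robin sketch captures the idea (addresses are hypothetical; real kube-proxy programs iptables or IPVS rules in the kernel rather than proxying in userspace like this):

```python
# Toy round-robin load balancer in the spirit of kube-proxy.
# Backend addresses are hypothetical illustrations.
import itertools

class RoundRobinProxy:
    def __init__(self, backends):
        self._cycle = itertools.cycle(backends)  # endless rotation of pods

    def route(self, request):
        """Pick the next backend pod for this request."""
        return next(self._cycle)

proxy = RoundRobinProxy(["10.0.0.1", "10.0.0.2", "10.0.0.3"])
print([proxy.route(f"req-{i}") for i in range(4)])
# ['10.0.0.1', '10.0.0.2', '10.0.0.3', '10.0.0.1']
```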
Why Your Dev Team Likes Kubernetes
Load Balancing – The Kubernetes cluster distributes user requests and workload across nodes. Traditionally, load balancing might necessitate a dedicated service. Kubernetes allows developers to keep their load balancing platform-agnostic and relatively low touch.
Namespaces – Each physical Kubernetes cluster is further abstracted by an arbitrary number of virtual clusters called namespaces. Namespaces have significant utility for large enterprises, where many separate applications may be installed on one cluster – roles and responsibilities can be separated by namespace, keeping development of different parts of an enterprise’s software suite appropriately isolated. Multi-tenant requirements can also be satisfied with clever namespacing configuration. Another common use case is namespaces per environment, allowing isolated deployments of development, staging, production, etc. environments on a single cluster. Namespaces can also be used to partition several different parts within a larger application, which affords some flexibility and reliability for microservice architectures.
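The essential property of namespaces, that the same resource name can exist independently in different namespaces, can be illustrated with a toy model (all names and image tags below are hypothetical):

```python
# Toy illustration of namespace scoping: resources are keyed by
# (namespace, name), so "web" in staging and "web" in production
# are distinct objects. All names here are hypothetical.
cluster = {}

def create(namespace, name, spec):
    cluster[(namespace, name)] = spec

create("staging", "web", "image: myapp:1.4-rc")
create("production", "web", "image: myapp:1.3")

print(cluster[("staging", "web")])     # image: myapp:1.4-rc
print(cluster[("production", "web")])  # image: myapp:1.3
```

Roles and quotas in a real cluster are scoped the same way, which is what makes per-team and per-environment isolation practical.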
Monitoring and Logging – Kubernetes, like Docker, provides unified tooling for capturing the log output of each pod. While this has practical development utility, it really shines for production applications when rolling other services into the mix.
Some Drawbacks of Kubernetes
Developer Spin-Up Time – Kubernetes is a notoriously complex topic, and especially difficult to pick up for those not already deeply familiar with containerization. Larger enterprises can usually mitigate this by having specialized teams where many developers don’t need to have any understanding of Kubernetes at all. Smaller operations usually need to have more generalists, so they will feel the time sink more acutely. This is a significant reason why Kubernetes is more popular for projects with many moving parts.
Small Applications / Few Users – Simpler projects or lower-load applications will see less overall benefit from container orchestration. If your project is a brochure website with a handful of assets, for example, there’s little to gain from features like load balancing and namespacing. Even if your project is large in scope, but you only have a small team working on it, it may not be worth the effort to roll in Kubernetes. The main benefits of orchestration serve the purpose of scalability.
Effort / Cost – Kubernetes is infamous for taking a long time to install and configure on fresh hardware, even for the initiated. Admittedly, this may only be an issue if you're running your own servers; it is common these days to use clusters pre-configured on AWS or other cloud hosts. Another consideration is cost – on DigitalOcean, the most cost-effective droplet (VPS) currently stands at $5/mo, whereas a Kubernetes cluster has a $10/mo minimum. In terms of computing resources, Kubernetes incurs overhead as well, requiring administrators to spin up extra servers or increase server resources to match the performance of a bare-metal installation.
Kubernetes Use Cases
The Kubernetes project publishes case studies from many well-known organizations on its website (https://kubernetes.io/case-studies/). A particularly interesting one is Pinterest, which scaled to over 1,000 microservices before moving to Docker and Kubernetes. The move allowed them to reclaim up to 80% of capacity during non-peak hours and reduce total instance-hour usage by 30% per day.
That’s a massive reduction in operating costs.
The New York Times also has a case study where they highlight the efficiency of container deployments vs. VM deployments:
VM-based deployments took 45 minutes; with Kubernetes, that time was “just a few seconds to a couple of minutes.”
Why Should You Care?
If you’re not a developer, why should you know about Kubernetes?
Because at the end of the day, Kubernetes lives within the world of agile development, digital transformation, and cloud-based software, all of which are changing rapidly.
These concepts are accelerating digital products, creating environments where the winner takes all. If you’re managing a digital product, it’s essential to understand how technology is evolving so you can continue to plan for the future.
Understanding how new technology, like Kubernetes, affects software and business decisions gives you an edge when you’re forced to cut costs or prove ROI.