#99 April 14, 2020

kpt, with Morten Torkildsen

Hosts: Craig Box, Adam Glick

kpt (“kept”) is a new open-source tool for Kubernetes packaging built by Google Cloud. Morten Torkildsen is an engineer at Google, focusing on configuration management and the workloads APIs, and he worked on kpt. He explains it to Adam, while Craig fills his mind with penguins.

ADAM GLICK: Hi, and welcome to the "Kubernetes Podcast" from Google. I'm Adam Glick.

CRAIG BOX: And I'm Craig Box.


ADAM GLICK: So as we are still under a stay-at-home order here, there's a lot of discussion about who is an essential worker. But I'm glad to see that the nation of New Zealand has not sat on their hind legs on this one and has indeed made sure that Easter will go off as planned. The Easter Bunny has been identified as an essential worker.

CRAIG BOX: And the tooth fairy, too.

ADAM GLICK: I sometimes love the news that we get out of your homeland. It brings a bright spot to the day.

CRAIG BOX: Yes, it's always nice to hear what's going on. I trust that everyone had a good Easter and that deliveries were made on time. But the lockdown in New Zealand did go in place quite early, and as a result, they are doing quite well over there. So my best wishes go out to everyone back home in New Zealand. And hopefully, one day the planes will fly again and we'll be able to go back and visit.

ADAM GLICK: If not, perhaps we can get a hold of the dragons that they were working on funding.


ADAM GLICK: It does seem interesting to me that the times that we most cover New Zealand is either your family or imaginary animals.


CRAIG BOX: On the subject of real animals, there's been some delightful videos I've seen with zoo animals. Their keepers have found now that there's no people in the zoo, they need something to visit. And I saw a video today from the Oregon Zoo where one penguin was taken for a walk to the seal enclosure. And they were just hanging out, looking at each other. I don't know if the seal and the penguin are naturally friends or one is a predator of the other, but it's delightful to see these animals being walked around the zoo by their keepers, also essential workers, and just allowed to have a little bit of a party.

ADAM GLICK: I've seen some of those with other zoos having the penguins walk around, and it's pretty adorable. Falls into the kind of cute animal category of put a smile on your face.

CRAIG BOX: Well, it's always good to have a smile on your face in these tough times. It's also good to get to the news.

ADAM GLICK: Let's get to the news.


CRAIG BOX: The CNCF has made three changes to its stable of projects this week, welcoming Volcano into the sandbox and moving Argo and Dragonfly to incubation. Volcano is a batch system for Kubernetes, originated at Huawei.

Dragonfly is a peer-to-peer image and file distribution system originally built by Alibaba Cloud. Argo is a suite of Kubernetes native workflow, events, and CI/CD tools, which originated at Intuit. In case you are already using Argo, Matt Hamilton, a security researcher at Soluble, recently found and reported five security issues, which have been disclosed this week. Credit to the Argo team for patching the most serious one in 30 minutes.

ADAM GLICK: Docker has announced that they are creating a specification for Docker Compose and a corresponding community with open governance. Previously only defined by its implementation, the new spec will provide an environment-agnostic way to define multi-container applications. Docker is working with vendors, including Microsoft and Amazon, to support Kubernetes and ECS environments. Drafts have been posted to GitHub and Docker is actively looking for contributors.

If you're using Docker Compose, you may also be interested in the Nautilus project, a new tool built by a startup of the same name to visualize Compose files in chart format. Nautilus is open source and available on Windows, Mac, and Linux.

CRAIG BOX: Microsoft has announced a new cloud native project under the Deis Labs brand. Krustlet, with a K, is a runtime for orchestrating WebAssembly modules in Kubernetes. The project is written in Rust and implemented as a kubelet, which should allow you to decode the name. It is written by core maintainers of Helm as both an exploration of using Rust as a development language on the Kubernetes API and a way to run WebAssembly code in a cluster.

The Container Runtime Interface is currently too container-specific to support WebAssembly. But the Microsoft team suggests that this work could lead to a V2 of the CRI. This announcement comes three years after Deis was acquired by Microsoft, and follows the recent announcement of WebAssembly support for Envoy, contributed by the Istio team.

ADAM GLICK: The Continuous Delivery Foundation has announced that Tekton Pipelines, formerly Knative Build, has reached beta. Other Tekton features remain in alpha status. Learn more about the CD Foundation in episode 44, and Tekton in episode 47.

CRAIG BOX: The MITRE ATT&CK framework, with a trademark-friendly ampersand in the middle, is a knowledge base of known tactics and techniques that are involved in cyber attacks. Microsoft Security has released a no-ampersand, no-trademark Attack matrix for classifying and understanding security threats in Kubernetes. It lists things you should look for in terms of how someone might access your cluster, and what they might do when they have access to it. A separate blog post shows an example of the matrix in action with a case study of a cryptocurrency mining attack.

ADAM GLICK: Huawei has announced MindSpore, an open source deep learning training and inference framework. MindSpore can be run on top of Kubeflow, and is optimized for software and hardware co-optimization. A CRD and operator are available alongside the code.

CRAIG BOX: Solo.io, our guest in episode 55, have released Service Mesh Hub, an open source, multi-cluster, multi-service mesh management plane. Not content with the API provided by each service mesh or even their own service mesh interface abstraction, the new hub API allows users to configure a set of different service meshes with a single API, as well as providing an operational management tool with graphical and command line UIs, observability features, and debugging tools.

ADAM GLICK: Another technology for doing multi-cluster service mesh is Backyards, the Istio distribution from Banzai Cloud. This week, Zsolt Varga, with a Z, from the Backyards team has posted a deep dive into the Istio telemetry system. Moving from the central Mixer component to metrics coming from each proxy has cut the latency substantially, but some work is required to map statistics into multi-cluster environments. Thankfully, Backyards sets this all up for you. But Varga explains how you would do it yourself, and how it's being addressed in future Istio releases.

CRAIG BOX: Amazon has announced version 1.4 of its Fargate container platform. This enables containers to talk to their Elastic File System, adds network performance metrics, and allows the ptrace capability to aid debugging. This version also replaces Docker Engine with containerd as Fargate's container execution engine. And a deep dive into the new release explains the implications of that change.

ADAM GLICK: The Rook project, which we covered in episode 36, has released version 1.3. This version adds enhanced timezone support, improvements to SMB, EdgeFS and Ceph, and increased test coverage. The team also moved to a CSI V2 driver, which added a number of features, including volume resizing. The announcement also stated that Rook is in the final stages of preparation for reaching the graduated status in the CNCF.

CRAIG BOX: Red Hat's OpenShift Commons Gathering, originally slated to happen before KubeCon EU, has now become a pre-day to the upcoming Red Hat Summit Virtual Experience, to be held on April 27. Speakers will include Jim Whitehurst, now president of IBM, as well as members of the OpenShift team and customers. This week, Red Hat also announced a new training course for building resilient microservices with OpenShift Service Mesh, their Istio product, and guidance on avoiding security problems when pulling container images by short name alone.

ADAM GLICK: Canonical, publishers of Ubuntu and guests of episode 60, announced they are now providing managed support for several open source databases, as well as logging, monitoring, and alerting tools. The apps will be supported on Kubernetes as well as VMs and bare metal across on-prem and cloud deployments. Canonical offers an unspecified SLA with 24/7 break-fix response and monitoring provided by Grafana, Prometheus, and Graylog. Pricing has yet to be announced.

CRAIG BOX: Helm creator and occasional children's book author, Matt Butcher from Microsoft, is continually asked, "what will win, Helm or operators?" He has written a blog post to explain that they are different technologies meeting different needs, but with some overlap.

Butcher says Helm is a package manager meant to help with deployment, akin to apt or yum. Operators, on the other hand, offer a cloud native way to manage software. The former lets you avoid YAML, the latter lets you use more of it to do anything. He suggests we should be careful about using the same word to mean different things, but that there is plenty of space for us all to get along.

ADAM GLICK: Monzo has written up their journey to build an egress firewall to protect from security exploits. Authors Jack Kleeman and Chongyang Shi explained that even if a service is compromised, its attack potential is limited if it can't communicate with an external service. To that end, they evaluated a number of products, and finally decided to build their own egress firewall. The post covers the gaps they found with existing technologies, and how they identified which services, both internal and external, needed access to which other services. The solution is a clever way to look at minimizing the blast radius of a potential exploit that gets into your clusters.

CRAIG BOX: Deep dive blog posts on Kubernetes 1.18 continue this week, covering a new alpha feature called API Priority and Fairness. This feature prevents particularly noisy containers from flooding the API server with requests and causing control plane unavailability. You can define priority levels and the amount of concurrency shares that each level receives. This gives you control to ensure that each service gets its specified share.

ADAM GLICK: Ever wonder what it would take to move your ZooKeeper instances into Kubernetes? If so, you'll want to check out the blog post this week by HubSpot, where they discuss how they did exactly that. In particular, they talk about how they made the conversion without downtime. The post also provides links to some open source projects to help with the networking pieces that HubSpot relies on an internal tool for.

CRAIG BOX: When a pod is terminated, it's given some time, usually 30 seconds, to shut down and process any requests that were in flight. Not all applications are as graceful as each other. Ilya Andreev from Flant looks at two popular app servers, Nginx and PHP-FPM, in cases where the pre-stop signal sent to those applications may not do what you expect. This information, which comes with some suggestions on load testing, could stop you serving 500 errors when you scale.

ADAM GLICK: For those of us that build slides as well as code, the Open Container Initiative has released an icon artwork library under the Apache license. It's a nice list of standard icons that anyone building an architecture diagram will need. And it's great to have a standard library of these available.

CRAIG BOX: Finally, if you're running a hands-on Kubernetes training in a trusted environment, you might want to check out Pascal Widdershoven's work to create environments for students within a single cluster. He explains how he built this with Kind, as discussed on episode 69, and provides scripts for setup as well as cost analysis.

ADAM GLICK: And that's the news.


ADAM GLICK: Morten Torkildsen is a software engineer at Google, where he works on Kubernetes focusing on configuration management and the workloads API. Welcome to the show, Morten.

MORTEN TORKILDSEN: Thanks for having me.

ADAM GLICK: Congratulations on the launch of the kpt project.


ADAM GLICK: Can you explain what kpt is?

MORTEN TORKILDSEN: kpt is a tool for building declarative workflows on top of Kubernetes resource configuration. The functionality of kpt can be split into roughly four buckets. Publishing and consuming kpt packages, each of which is really just a set of Kubernetes resources in Git. It has a CLI to display and modify multiple configuration files at once. It has an extensible framework for writing and running validators and transformers on Kubernetes manifests, and for chaining them together into pipelines. Finally, kpt has an apply command that is similar to the one we know from kubectl, but it adds support for pruning and waiting for resources to reconcile after an apply operation.

ADAM GLICK: What does kpt stand for?

MORTEN TORKILDSEN: kpt is not an acronym, so it doesn't actually stand for anything. It's pronounced "kept". We wanted a name that starts with a K in the tradition of tools in the Kubernetes ecosystem. And we wanted a name that is short enough that you don't have to alias it.

ADAM GLICK: Is it related to APT, as in apt-get?

MORTEN TORKILDSEN: It provides somewhat similar functionality. So APT has been an inspiration. But that's not necessarily where the name came from.

ADAM GLICK: Speaking of where it came from, why did you choose to build kpt?

MORTEN TORKILDSEN: One of the key pain points that kpt aims to address is non-composable tools in the Kubernetes ecosystem. So, for example, if you want to build an application platform on top of Kubernetes, it'll require you to put together a lot of different tools. And you need to write all the glue code to make it all work together. And that can be quite a bit of work.

kpt adopts this Unix philosophy of small, focused, modular components that allow composition. And to do this, we need a shared data format that all these tools can read, understand, and emit. And we have this data format today. It's the Kubernetes resource model or manifests. And it is not a new idea.

So let's say you are using Helm today, and you want to run something like kubeval against your Helm chart. A way to do that is just to run the "helm template" command, which will generate and output the Kubernetes manifests in the chart. And you can just pipe that into kubeval, and it will just work.

So kpt is built around configuration as data, which is expressed as Kubernetes resource manifests. So for kpt, all tools and functions have Kubernetes manifests going in and Kubernetes manifests coming out. This is actually similar to how Kubernetes works internally, where you have all these resources in the API server. And you have these controllers that watch their state and know how to update them, read them, and delete them, and just constantly try to reconcile the desired state expressed in resources with the observed state in a cluster.
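
To make the "manifests in, manifests out" idea concrete, here is a minimal sketch in Python. It is an illustration only, not kpt's implementation: resources are plain dictionaries, and the names `set_label` and `run_pipeline` are hypothetical helpers invented for this example.

```python
import copy
from typing import Callable, Dict, List

Resource = Dict[str, object]
# In this sketch a "function" takes a list of resources and returns a list of resources.
ResourceFunction = Callable[[List[Resource]], List[Resource]]


def set_label(key: str, value: str) -> ResourceFunction:
    """Build a transformer that adds one label to every resource (hypothetical helper)."""
    def transform(resources: List[Resource]) -> List[Resource]:
        out = []
        for resource in resources:
            updated = copy.deepcopy(resource)  # leave the input untouched
            metadata = updated.setdefault("metadata", {})
            metadata.setdefault("labels", {})[key] = value
            out.append(updated)
        return out
    return transform


def run_pipeline(resources: List[Resource], functions: List[ResourceFunction]) -> List[Resource]:
    """Feed the output of each function into the next one."""
    for function in functions:
        resources = function(resources)
    return resources


manifests = [
    {"apiVersion": "apps/v1", "kind": "Deployment", "metadata": {"name": "web"}},
    {"apiVersion": "v1", "kind": "Service", "metadata": {"name": "web"}},
]
result = run_pipeline(manifests, [set_label("team", "payments"), set_label("env", "staging")])
```

Because every step has the same shape, steps written by different people can be chained in any order without glue code between them.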

ADAM GLICK: A couple of times you've mentioned workflows and pipelines. You mentioned chaining tools together. You were just talking about kind of taking outputs from Helm on that. How do you think about putting these together? Where does that fit in terms of a tool chain that someone is putting together to stand up their Kubernetes resources?

MORTEN TORKILDSEN: You can start out with a set of resources that in some way group together. And kpt provides kpt packages as a way to take a set of resources that has some meaning as an aggregate and put them together. And often, as part of a pipeline, you want to take these resources. You may want to do some kind of manipulation of them. That might be adding specific labels. You can maybe generate variants if you have one base resource that you want to create slight differences of for different environments and different clusters.

And then you might want to do validation to make sure that these are valid resources. You might have policies you need to follow. And with all these things, if you accept the idea of configuration as data, you have this flow of resources that just goes through all these functions or tools, each of which can provide one piece of the functionality here.

So you might have one tool that validates that all your resources, for example, follow the best practices for RBAC rules, for example. And in such a pipeline, maybe the last step is that you pipe the resources into either kubectl apply or the kpt apply function that is part of kpt.
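
The two pipeline steps described here, generating variants and validating against a policy, might look roughly like the Python sketch below. The helper names are hypothetical, and the policy (flagging RBAC rules that grant the wildcard verb) is just one assumed example of a "best practice" check, not a rule taken from kpt.

```python
import copy


def make_variant(base, environment, replicas):
    """Return a copy of a base Deployment adjusted for one environment."""
    variant = copy.deepcopy(base)
    variant["metadata"]["name"] = f'{base["metadata"]["name"]}-{environment}'
    variant["metadata"].setdefault("labels", {})["env"] = environment
    variant["spec"]["replicas"] = replicas
    return variant


def find_wildcard_rbac(resources):
    """Report Roles and ClusterRoles whose rules allow every verb ('*')."""
    problems = []
    for resource in resources:
        if resource.get("kind") not in ("Role", "ClusterRole"):
            continue
        for rule in resource.get("rules", []):
            if "*" in rule.get("verbs", []):
                name = resource["metadata"]["name"]
                problems.append(f'{resource["kind"]}/{name} allows all verbs')
    return problems


base = {
    "apiVersion": "apps/v1",
    "kind": "Deployment",
    "metadata": {"name": "web"},
    "spec": {"replicas": 1},
}
variants = [make_variant(base, "staging", 2), make_variant(base, "prod", 5)]

role = {
    "apiVersion": "rbac.authorization.k8s.io/v1",
    "kind": "Role",
    "metadata": {"name": "admin-everything"},
    "rules": [{"apiGroups": [""], "resources": ["pods"], "verbs": ["*"]}],
}
problems = find_wildcard_rbac(variants + [role])
```

A pipeline would stop before the apply step whenever `problems` is non-empty, so the cluster never sees the offending resources.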

ADAM GLICK: You mentioned an interesting phrase there. You said configuration as data. And that sounds curiously similar, but slightly different to a phrase that I used to hear a lot, configuration as code. What's the difference between configuration as code, as people have kind of thought about it, and what you're talking about with configuration as data?

MORTEN TORKILDSEN: So I think it's similar. Configuration as data is a slight difference, I would say, from configuration as code. When we say configuration as code, people often think about tools like Terraform, which has a DSL that allows you to express your configuration. And that DSL allows for things like loops and conditionals. When we talk about configuration as data, especially in the context of kpt, we think of it as data in the form of Kubernetes resources or resource manifests.

So there's no code involved. It's only data. So there's no conditionals. There's no loops. Nothing like that. So the resources you see on your client side is what will end up on the server side. There's no additional step, which will turn your configuration into something else once it gets into your cluster.
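
A small Python sketch can illustrate the contrast. Python stands in here for a configuration DSL (it is not Terraform's language), and the ConfigMap contents are made up for the example. In the first form you have to run the loop and the conditional in your head to know what reaches the cluster; in the second, what you read is what gets applied.

```python
import json


def generate_config_maps(environments):
    """Configuration as code: the result is only known after running this."""
    resources = []
    for env in environments:  # a loop
        log_level = "debug" if env == "dev" else "info"  # a conditional
        resources.append({
            "apiVersion": "v1",
            "kind": "ConfigMap",
            "metadata": {"name": f"app-config-{env}"},
            "data": {"LOG_LEVEL": log_level},
        })
    return resources


# Configuration as data: the same two resources, written out in full.
CONFIG_AS_DATA = [
    {
        "apiVersion": "v1",
        "kind": "ConfigMap",
        "metadata": {"name": "app-config-dev"},
        "data": {"LOG_LEVEL": "debug"},
    },
    {
        "apiVersion": "v1",
        "kind": "ConfigMap",
        "metadata": {"name": "app-config-prod"},
        "data": {"LOG_LEVEL": "info"},
    },
]

if __name__ == "__main__":
    print(json.dumps(CONFIG_AS_DATA, indent=2))
```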

ADAM GLICK: Does that mean that it's a lot easier for someone who needs to read through the file to look for things?

MORTEN TORKILDSEN: It at least, I think, lowers the barrier. If you're an experienced developer with, let's say, Terraform, then you can read Terraform configuration pretty well. But for most people, it'll be easier to map from what is the configuration that I have in my source control to what is actually going to happen once I apply this to my cluster.

ADAM GLICK: Would it be fair to create the analogy that says, it's kind of the difference between, say, looking at a bash script versus looking at a CSV file?

MORTEN TORKILDSEN: Yeah, I think so. That sounds about right.

ADAM GLICK: Speaking of which, what is the format that this is all stored in? You mentioned that it's native Kubernetes resources. Is it a familiar file format?

MORTEN TORKILDSEN: We're using the Kubernetes resource model. And this is the model that-- even people that are not that experienced with Kubernetes would recognize this. These are the YAML files. Most people use YAML at least, which will have the API version and kind. You'll have a metadata section where you'll probably provide a name of the resource. And you'll have a spec object, which will declare the desired state of this resource in the cluster. And then it'll have a status object, which reflects the observed state in the cluster. And these manifests are the fundamental unit of data we pass through the pipeline with kpt.
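
As a rough picture of that shape, here is the same structure as a Python dictionary, with a small check for the identifying fields. The field values and the `missing_fields` helper are made up for illustration. In practice the status section is filled in by the cluster rather than written by hand.

```python
deployment = {
    "apiVersion": "apps/v1",
    "kind": "Deployment",
    "metadata": {"name": "web"},
    "spec": {"replicas": 3},         # desired state, declared by the user
    "status": {"readyReplicas": 2},  # observed state, reported by the cluster
}


def missing_fields(resource):
    """Return the identifying fields (dotted paths) that a resource lacks."""
    missing = []
    for path in ("apiVersion", "kind", "metadata.name"):
        node = resource
        for key in path.split("."):
            if not isinstance(node, dict) or key not in node:
                missing.append(path)
                break
            node = node[key]
    return missing
```

For example, `missing_fields(deployment)` returns an empty list, while a resource without an `apiVersion` or a name would be reported before it goes any further down a pipeline.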

ADAM GLICK: So if it's in YAML, which people working with Kubernetes should be fairly familiar with, how is this tool different than just having your YAML files and checking those in to Git?

MORTEN TORKILDSEN: kpt definitely uses both YAML and Git as sort of the foundation. But kpt adds a lot of value on top of that. So kpt provides a nicer interface for Git. So if you have many kpt packages, you don't have to put all of them in different repos. You can put all of them in one repo in different subdirectories, and then you can use the kpt tools to download a single kpt package from that repo.

And maybe you download a package. You make some local changes. Eventually, you want to pull in the changes from the upstream. And since kpt knows that what you are doing here is merging Kubernetes resources, it can make some good decisions when it tries to merge local changes with the upstream. It can rely on a three-way merge, like apply, which makes a successful merge more likely than if you relied on normal merging of files, like with Git.
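
The idea of a structure-aware three-way merge can be sketched in a few lines of Python. This is a deliberately simplified illustration that only handles nested dictionaries and treats lists as opaque values. It is not kpt's merge algorithm, and `three_way_merge` is a name chosen for this example.

```python
_MISSING = object()  # marks a field that is absent on one side


def three_way_merge(original, local, upstream):
    """Merge field by field, using `original` as the common ancestor."""
    merged = {}
    for key in sorted(set(original) | set(local) | set(upstream)):
        base_value = original.get(key, _MISSING)
        local_value = local.get(key, _MISSING)
        upstream_value = upstream.get(key, _MISSING)
        if isinstance(local_value, dict) and isinstance(upstream_value, dict):
            base = base_value if isinstance(base_value, dict) else {}
            value = three_way_merge(base, local_value, upstream_value)
        elif local_value == upstream_value:
            value = local_value      # both sides agree (or both deleted it)
        elif local_value == base_value:
            value = upstream_value   # only upstream changed this field
        elif upstream_value == base_value:
            value = local_value      # only the local copy changed this field
        else:
            raise ValueError(f"conflict on field {key!r}")
        if value is not _MISSING:
            merged[key] = value
    return merged


original = {"spec": {"replicas": 1, "image": "web:1.0"}}
local = {"spec": {"replicas": 3, "image": "web:1.0"}}     # user scaled up
upstream = {"spec": {"replicas": 1, "image": "web:1.1"}}  # publisher bumped the image
merged = three_way_merge(original, local, upstream)
```

Here the local replica count and the upstream image both survive, because the merge looks at individual fields rather than at lines of text. A line-based merge could report a conflict if the two edits sat on adjacent lines.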

kpt also provides a way of versioning each kpt package individually. And this is a strategy that is similar to Go modules, which people might be familiar with.


MORTEN TORKILDSEN: kpt also provides tools for working with YAML. If you have a kpt package with a lot of files, maybe a lot of resources in each file, there's tools for listing out all the resources. There's tools for changing all resources in a package with one command. For example, you might want to add an annotation to all the resources in your package.

kpt also has something that's pretty powerful which is called kpt setters. And this is a way for a producer of a package to define what are the recommended knobs for customizing a package. And if a producer defines setters and then publishes their package, then consumers are able to discover these setters with the kpt tools, and also make changes to the fields covered by the setters, simply by using the kpt tools.

But these are not the only knobs. You have the full fidelity of the underlying APIs. So if you consume a package, you can choose to use the existing setters that are already defined, or you can choose to define your own.

And if I download a package, and I think, the setters defined on this one, it's not the right ones, I think they should be different, I can remove the existing setters. I can define new ones. And I can publish it for others. And publishing a package with kpt is just committing to Git.
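
The setter concept can be illustrated with a short Python sketch. The `SETTERS` table, its format, and the helper functions are all hypothetical and chosen for this example. The conversation does not describe how kpt actually stores setter definitions, so this only shows the idea of named, discoverable knobs that map onto specific fields.

```python
import copy

# Hypothetical setter definitions shipped with a package: each setter name
# maps to the resource kind and the field path it is meant to change.
SETTERS = {
    "replicas": ("Deployment", ("spec", "replicas")),
    "service-type": ("Service", ("spec", "type")),
}


def list_setters(setters):
    """Let a consumer discover the knobs the producer recommends."""
    return sorted(setters)


def apply_setter(resources, setters, name, value):
    """Return a copy of the resources with one setter applied."""
    kind, path = setters[name]
    updated = copy.deepcopy(resources)
    for resource in updated:
        if resource.get("kind") != kind:
            continue
        node = resource
        for key in path[:-1]:
            node = node.setdefault(key, {})
        node[path[-1]] = value
    return updated


package = [
    {"apiVersion": "apps/v1", "kind": "Deployment",
     "metadata": {"name": "web"}, "spec": {"replicas": 1}},
    {"apiVersion": "v1", "kind": "Service",
     "metadata": {"name": "web"}, "spec": {"type": "ClusterIP"}},
]
customized = apply_setter(package, SETTERS, "replicas", 4)
```

A consumer who disagrees with the published knobs could replace the table with their own definitions, and any field not covered by a setter can still be edited directly in the resource.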

ADAM GLICK: You mentioned that this is kind of a Kubernetes-focused set of resources and tooling. Why does Kubernetes need its own configuration and deployment tool?

MORTEN TORKILDSEN: Managing configuration is a challenge for users of Kubernetes. You might have multiple environments, like a couple of pre-prod environments, a production environment. You might have multiple clusters in each environment, maybe in different regions and zones. And even for a single application, the number of different configurations might be overwhelming. And then you might have hundreds or thousands of applications.

In addition, if you want to do like progressive rollouts, the configuration for a single app might differ between different clusters at any point in time. And when we make changes to a system with this kind of complexity, we need to be able to verify that all changes have passed validation. They don't conflict with any policies. And we just think that declarative configuration as data is the best way to handle this complexity. And kpt is just a tool to help build pipelines and platforms based on this philosophy.

ADAM GLICK: We've talked a bit about configuration and deployment. And you've, indeed, mentioned a number of other tools. I'm curious how kpt relates to those other tools. I'm thinking specifically of things like Terraform that people might be using to set up their clusters, or Helm, that people might be using to deploy applications within their clusters.

MORTEN TORKILDSEN: Those are both great tools. And there's certainly some overlap between the functionality of kpt with those. I think the difference, again, comes back to the basic philosophy of kpt, that configuration should be data expressed as Kubernetes resources.

Like I mentioned, Terraform uses a DSL. Helm relies on templating. So kpt is, in some ways, a different approach. And we think this has some advantages.

Like we discussed, it clearly represents intended state since there's no conditionals or loops. It supports static analysis and validation, which can allow us to shift many operations to the left in the CI/CD pipeline. So we can do validation and policy enforcement, and any other checks we want to do, earlier in the pipeline. So maybe we can detect any issues before we even need to talk to the cluster.

ADAM GLICK: When you say DSL, you're referring to a domain specific language?

MORTEN TORKILDSEN: Yeah. That is correct.

ADAM GLICK: So you're saying that this allows you to work in a language, like we said before, YAML, that's a little more familiar to people who may not know those particular languages, but certainly understand YAML if you're working in Kubernetes?

MORTEN TORKILDSEN: Yeah. I would say YAML is a data format, while a DSL tends to be more of a language that you execute.


MORTEN TORKILDSEN: But kpt also works pretty well with some of these other tools. Like I mentioned, you can use the "helm template" command to generate a resource manifest for a Helm chart. And the output there is essentially the same as a kpt package.

So if you do that, you can take this output and feed that into a kpt pipeline. Or you can use kustomize to generate variations. There's a lot of options.

ADAM GLICK: As with any tool, there are often right ways to use it and wrong ways to use it, sometimes called patterns and anti-patterns. What would you say are the right patterns? And what are the anti-patterns for using kpt?

MORTEN TORKILDSEN: Like I mentioned, kpt works reasonably well with existing tools. Also, the different pieces of functionality in kpt are pretty independent of each other. So it's not really an all or nothing decision here.

If you need a packaging format, you can use kpt packages. And you can apply them to your cluster using kubectl. Or, on the other hand, if you have another way of generating your resource manifests, but maybe you want to take advantage of the pruning support or resource status support that comes with kpt, you can use the kpt apply command by itself.
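
Pruning means deleting resources that were applied previously but are no longer part of the configuration. One way to picture it is as a set difference between what was applied last time and what is being applied now, as in this Python sketch. The idea of a recorded inventory and the `plan_apply` helper are assumptions made for illustration. The conversation does not say how kpt keeps track of earlier applies.

```python
def resource_id(resource):
    """Identify a resource by API version, kind, namespace, and name."""
    metadata = resource["metadata"]
    return (
        resource["apiVersion"],
        resource["kind"],
        metadata.get("namespace", ""),
        metadata["name"],
    )


def plan_apply(previous_inventory, resources):
    """Return (ids to apply, ids to prune) for this apply operation."""
    current = {resource_id(resource) for resource in resources}
    to_prune = sorted(set(previous_inventory) - current)
    return sorted(current), to_prune


# Hypothetical record of what the previous apply created.
previous_inventory = [
    ("apps/v1", "Deployment", "shop", "web"),
    ("v1", "ConfigMap", "shop", "old-config"),
]
resources = [
    {"apiVersion": "apps/v1", "kind": "Deployment",
     "metadata": {"name": "web", "namespace": "shop"}},
    {"apiVersion": "v1", "kind": "Service",
     "metadata": {"name": "web", "namespace": "shop"}},
]
to_apply, to_prune = plan_apply(previous_inventory, resources)
```

In this example the ConfigMap was removed from the configuration, so it ends up in the prune list, while the Deployment and the new Service are applied.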

But in general, I think kpt was designed based on a certain idea about how things should be done. And it probably works best in a situation where the whole system is aligned around that idea.

ADAM GLICK: At what layer is kpt meant to help set things up? Do you use it to set up your nodes themselves? Do you use it to set up the applications that are running within your cluster or on top of the nodes? Where is the right place to use it in terms of what you're deploying? And what are the places where it wouldn't make sense?

MORTEN TORKILDSEN: kpt works with Kubernetes resources. So anything that you define as a Kubernetes resource could be a good fit for kpt. The obvious fit is things running inside Kubernetes. But it's becoming more popular to define other types of infrastructure as well using the Kubernetes resource model. And if that's a solution you're using, then kpt could be a good tool for not only configuring your applications running in Kubernetes, but a bigger part of your infrastructure.

ADAM GLICK: A number of tools use centralized repositories to have default templates for things. I'm thinking like Helm chart repositories, for instance. Is there a default repository for kpt? Or is it something that people do for their configurations of their own clusters and their own applications, and those are separate?

MORTEN TORKILDSEN: There isn't currently any centralized repository of kpt packages. Like I mentioned, you could leverage Helm charts with kpt if you want to.

And also, since every kpt package is really just Kubernetes resources in Git, there's actually a lot of valid kpt packages available already. As an example, the examples that are available in the Kubernetes repo on GitHub can be used with kpt. You can use kpt to fetch those packages. And that will just work.

ADAM GLICK: So those are already out there. That's great. How does kpt work with operators?

MORTEN TORKILDSEN: kpt is a client-side tool, and doesn't really care which controller is listening for changes to the resources, which are part of the package. But we've seen some examples where operators are used kind of like a deployer. They tend to be a controller that, instead of relying on the more traditional reconciliation loop, will only work for a single resource type that is typically only updated by users or through a CD tool. And when the controller sees changes to a custom resource of that type, it'll generate built-in resources based on that content and just apply them to the cluster.

With kpt and kpt functions, you can still define the same CRDs. But instead of having a controller running in the cluster that basically turns the CRD into other resources, you can move that out into your CD pipeline. And that gives you the fully hydrated configuration in your CD pipeline so you can run validations before you apply them. And it also means one less controller running in your cluster.
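
Moving that expansion out of the cluster could look something like the Python sketch below. The `WebApp` kind, its `example.com/v1` API group, and its fields are invented for illustration. The point is only that the custom resource is turned into built-in resources on the client side, where the fully hydrated output can be validated before anything is applied.

```python
def expand_web_app(custom_resource):
    """Expand a hypothetical 'WebApp' custom resource into built-in resources."""
    name = custom_resource["metadata"]["name"]
    spec = custom_resource["spec"]
    labels = {"app": name}
    deployment = {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            "replicas": spec.get("replicas", 1),
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {"containers": [{"name": name, "image": spec["image"]}]},
            },
        },
    }
    service = {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": name, "labels": labels},
        "spec": {"selector": labels, "ports": [{"port": spec.get("port", 80)}]},
    }
    return [deployment, service]


def hydrate(resources):
    """Replace every WebApp with the built-in resources it stands for."""
    hydrated = []
    for resource in resources:
        if resource.get("kind") == "WebApp":
            hydrated.extend(expand_web_app(resource))
        else:
            hydrated.append(resource)
    return hydrated


web_app = {
    "apiVersion": "example.com/v1",
    "kind": "WebApp",
    "metadata": {"name": "shop"},
    "spec": {"image": "shop:1.2", "replicas": 2},
}
config_map = {"apiVersion": "v1", "kind": "ConfigMap", "metadata": {"name": "settings"}}
hydrated = hydrate([web_app, config_map])
```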

ADAM GLICK: You also mentioned there that kpt is a client-side tool so that it's not something you have to install into your cluster. It's something that just sits on the machine that you're working on. How do you install kpt on that machine? Is it a package you go install? Or how do you get it there?

MORTEN TORKILDSEN: kpt is provided as part of gcloud. So you can get it through that path. It's also available as a Homebrew package. Or you can just download it from GitHub.

ADAM GLICK: Put it in a container, and then just run it locally on the container?

MORTEN TORKILDSEN: Yeah. You can do that too. And obviously, you can always just build it from source.

ADAM GLICK: Speaking of source, where would people go to get involved in it? It's an open-source project. Right?

MORTEN TORKILDSEN: Yes, it is. So the kpt project is on GitHub.

ADAM GLICK: GoogleContainerTools?


ADAM GLICK: And are there any other organizations that are involved in this project at this point? I realize you just hit version 0.1 so you're fairly early on. Are there others involved in this?

MORTEN TORKILDSEN: The kpt project is a Google project. But it relies on several other upstream projects, many of which are part of the CNCF. It relies on some code from SIG-CLI for the apply functionality, status, and pruning.

ADAM GLICK: So now that you've got version 0.1 out there, what's next for kpt?

MORTEN TORKILDSEN: Our big goal now is working on maturity. We know there are some rough edges in kpt. And we're working to address those. There's also features we know are missing. We have a long list of issues and features that we want to get to, some obviously more urgent than others.

The roadmap and all the issues we're working on are all available in the GitHub repo. So it's open for everyone. So yeah. We want people to use it, give us feedback, and help us make kpt as useful as possible.

ADAM GLICK: Wonderful. Well, thanks for joining us today, Morten.


ADAM GLICK: You can find Morten on Twitter at @mortenjt, and you can find the kpt project at googlecontainertools.github.io/kpt.


Thanks for listening. As always, if you've enjoyed the show, please help us spread the word and tell a friend. If you have any feedback for us, you can find us on Twitter @KubernetesPod, or reach us by email at kubernetespodcast@google.com.

CRAIG BOX: You can also check out our website at kubernetespodcast.com, where you will find show notes and transcripts. Until next time, take care.

ADAM GLICK: Catch you next week.