#181 June 1, 2022

Configuration as Data, with Justin Santa Barbara

Host: Craig Box

What is configuration as data, how is it different from infrastructure as code, and why can’t anything just be itself anymore? We posed these questions and more to long-time Kubernetes contributor Justin Santa Barbara at KubeCon EU, and this episode is the result. Justin created the kOps project and now leads the team at Google that makes Kubernetes easier to consume.

Do you have something cool to share? Some questions? Let us know:

Chatter of the week

News of the week

CRAIG BOX: Hi, and welcome to the Kubernetes Podcast from Google. I'm your host, Craig Box.


CRAIG BOX: It was a safe assumption there would not be a lot of Kubernetes news announced the week after a KubeCon. As I was on holiday, I polled the faithful, and you very kindly let me take the week off the show. The news that did follow KubeCon EU was mostly that much of the catering staff, and some of the more huggy community members, got COVID. The numbers may well end up in line with the general Spanish infection rate for the week, but it's a stark reminder that the pandemic isn't over.

Many years ago, a friend gave me a board game called Alhambra. I must admit that I had no idea what an Alhambra was or, as it turns out, how to pronounce it. It's a Spanish word, so the H is silent. Alhambra. But it's la Alhambra, which translates to "the the red one."

Anyway, the Alhambra is a palace complex in Granada in Spain, founded in the 13th century and home to some of the most famous and best preserved Islamic architecture. One of the Airbnbs we stayed at in New Zealand, the one with the irritating fridge if anyone is keeping track, had pictures and plans of palaces of the Alhambra on the walls, which pretty much cemented it as our post-KubeCon destination.

We're now back in London for a while. God Save the Queen and all that. Keep an eye out for a meetup appearance in June and perhaps some gig videos, the best reason to follow me on Twitter. Let's get to the news.


CRAIG BOX: Microsoft held their annual Build conference one week after KubeCon, saving their best container-related news for it. Announcements included the general availability of Azure Container Apps, a Google Cloud Run-like service for running serverless containers. A number of new cluster extensions for AKS are available, including support for Dapr, Azure ML, and GitOps with Flux v2. Extensions in preview include support for KEDA and web application routing. Open source developer tool Draft saw a v2 release with new AKS and VS Code integrations.

Docker has acquired developer tool startup Tilt. Tilt was founded in 2017 by two Googlers who saw how tools like Google Sheets changed collaboration and wanted to do the same for developers. Their open source project, also called Tilt, automates the steps from a code change to a new process: watching files, building container images, and bringing your environment up to date. The teams are now taking some time to figure out how best to integrate their technologies.

Semiconductor company Broadcom announced their intention to acquire VMware for $69 billion. With further acquisitions in the semiconductor space having been blocked by regulators, Broadcom has turned to software vendors, having acquired companies like CA Technologies and Symantec. Broadcom's reputation for trimming costs and focusing only on a small number of regulated customers has raised eyebrows among VMware staff and users, according to multiple press reports, with a number of other Kubernetes companies coincidentally commenting on Twitter that they are currently hiring.

Version 1.14 of the Istio service mesh has been released. The headline feature is support for the SPIRE runtime, contributed by its maintainers at Hewlett Packard Enterprise. This adds another option alongside Istio's own SPIFFE implementation, offering pluggable authentication and identity federation. Other features include support for automatically using SNI on outbound TLS requests and configuring minimum TLS versions.

Google Cloud has launched a new GKE cost estimator tool. The new capability enables users to discover the estimated cost as they create a cluster from the GCP console. It helps users quickly understand how their choices of cluster parameters affect the monthly compute bill, and gives them controls for separate insights into each node pool, as well as offering upper and lower bounds based on cluster autoscaling and node auto provisioning. The feature is available in preview.

We pour one out today for Katacoda, the online training and playground tool acquired by O'Reilly in 2019. Once popular with cloud-native vendors, Katacoda powers the interactive training on Kubernetes.io, but only until June 15, when it will close to the public. The company claims that abuse vectors like crypto mining are behind the decision. And the technology lives on behind the O'Reilly paywall.

Finally, a chance to make your voice heard. For the past nine years, engineers around the world have told the researchers at DevOps Research and Assessment, commonly known as DORA, how they make software. These insights help the industry make software better, smarter, and safer.

DORA has now launched the survey for their 2022 report. The Accelerate State of DevOps survey includes focus areas like deployment tool chains, use of the cloud, disaster recovery, how you work, and more. You can find the link and a link to the 2021 report in our show notes. We're excited to help our friends at DORA with this report and represent the voice of the Kubernetes community as we trust our listeners are much more likely to be on the cutting edge of DevOps.

I've mentioned on the show before that Johnny Carson used to send in jokes for David Letterman to read in his monologue. I'd like it on the record that Adam wrote this news item for episode 44 and I just changed the dates.

And that's the news.


CRAIG BOX: Justin Santa Barbara is a longtime contributor to Kubernetes and now leads the team at Google that aims to make Kubernetes easier for users to consume. Welcome to the show, Justin.

JUSTIN SANTA BARBARA: Hi, Craig. It's great to be here. Thank you.

CRAIG BOX: I'm trying to pick where your accent is from.

JUSTIN SANTA BARBARA: I grew up in London. My parents are American. But I moved to California in about 2008 so have been stateside for, I guess, 14 years now.

CRAIG BOX: Now, you went to some nice schools. Got yourself a job in finance, as one might expect. How did you end up in Kubernetes?

JUSTIN SANTA BARBARA: Well, I'm not entirely sure I agree with the premise of the question, which sort of feels like it's saying that if you want to work on Kubernetes, you should have a computer science PhD and have worked at Google since grad school. I think one of the things that makes the Kubernetes community so wonderful is that everyone is super talented, but they didn't all pursue that same path. We have people that were working on physics problems but ended up having to solve problems in big data and got involved.

We have people that started out in IT support and followed the ops path. And they really understand how systems fail and how to make them observable and fixable. We have people that could have been artists but ended up deciding that they love to code. And we have people that understand how to bring order to the social consequences of trying to get 70,000 people to agree on anything.

CRAIG BOX: There are a lot of people with a musical background. Is that something that you have any history in?

JUSTIN SANTA BARBARA: I was forced to learn a musical instrument, but I would not say I have a musical background. My brother is definitely the musician.

CRAIG BOX: You don't even want to name which instrument it was?

JUSTIN SANTA BARBARA: I would not be doing justice to those instruments.

CRAIG BOX: Now, like some people in the Kubernetes community, you have had a little bit of startup experience. Tell me about FathomDB.

JUSTIN SANTA BARBARA: Absolutely. FathomDB was a company I started back in about 2007. How I got there was I was working in London, and I had the startup bug. But I made the mistake of trying to pursue developer tools. That worked probably about as well as you might expect it to work. And so I was doing some contracting work on the side to make ends meet and got to see what the problems really were in terms of running websites and how to make them work. And certainly, at the time-- maybe it still is the case, but at the time, the big problem was databases and how you run your databases.

CRAIG BOX: That, I guess, works in with the timeline of you moving to California in 2008.

JUSTIN SANTA BARBARA: Yeah. So this was 2007. It was before cloud and before PaaS. There were some shared MySQL services, but they were really about low cost and not about being great databases. And so I started a company to try to build those great databases. We called it databases as a service. The company was FathomDB. I got into Y Combinator and moved to San Francisco to pursue that.

CRAIG BOX: 2008 was around about when I got to the UK. But I think the startup scene probably picked up a little bit later than that. Was there a startup scene in London at the time?

JUSTIN SANTA BARBARA: It's definitely picked up since then. There was a bit of startup activity at the time. It obviously took a bit of a beating in 2008 afterwards. But it certainly was not as it is today.

CRAIG BOX: Did you have the prototypical sleeping-on-a-couch Y Combinator experience?

JUSTIN SANTA BARBARA: We definitely had elements of that. I shared a house with a couple of other Y Combinator companies. And there were probably too many of us in that house, or at least more of us than there should have been.

CRAIG BOX: What was the ultimate outcome of that company?

JUSTIN SANTA BARBARA: What we were doing was we were offering MySQL as a service on various clouds. And we started on AWS. We did a few interesting things, which I think were good lessons. First of all, we didn't try to offer more than databases. We didn't offer PHP hosting. We didn't offer memcached, any of those services that were sort of popular at the time.

Maybe one day we wanted to, but the vision was that customers wouldn't get all their services from one company. Rather, they'd pick from various service providers or, if you wanted to do something special with one of those, you'd run your own. So it would be more like food trucks than going to a restaurant and picking off a menu from one company. That was very different from the style of the day, which was platform as a service, where you would get a fixed menu and you really wouldn't be able to make any changes to that.

CRAIG BOX: That said, I remember when I got to London, one of the first things I was doing was working on a migration of some web hosting to Amazon. It was just on the cusp of when they released EBS. We were going to have the problem of how are we going to host MySQL. Amazon came along and solved that for us. What did that do to a startup whose job was also trying to solve that problem on that same platform?

JUSTIN SANTA BARBARA: I believe the traditional thing is to say that it validated the market space, but it definitely didn't help.

CRAIG BOX: Yeah. I think they call it Sherlocking in the Apple space.

JUSTIN SANTA BARBARA: There we go. I think we made the right bet on the idea of individual services, but what we got wrong there — or I got very wrong — was the idea that customers prefer to get all their services from their cloud provider. And so when Amazon came along with RDS, which was offering a very similar service to what we were offering, it made it really hard to compete directly on Amazon.

And where we went from that was we thought about, well, OK, are there other clouds that don't have database as a service products, and can we get involved in those? And that's how I got involved in OpenStack because, at the time, Rackspace was a main alternative to AWS. And Rackspace was moving to OpenStack. So you know, the thought was, well, maybe the market will look different if there is this open source cloud offering at the center of it all. Maybe FathomDB can be part of that.

CRAIG BOX: This is an interesting journey back through history because I remember when OpenStack did its big launch back in the UK as well. Was the market any different there? Were you able to bring the product to users in a different way or was the same problem just repeating itself a few years later with OpenStack?

JUSTIN SANTA BARBARA: I think there were different problems with OpenStack there. OpenStack did eventually start a product to do databases as a service. I actually can't remember what it's called, but they did start one. And there would have been a similar channel conflict. I think we probably would have figured out where we differentiated better in that case.

I think the real problem with OpenStack was that it simply didn't get enough momentum. And there wasn't enough adoption of it for it to actually be the big driver of sales that we and everyone else had hoped for.

CRAIG BOX: Kubernetes came along in 2014. And that, obviously, should we say hastened the decline of OpenStack as a technology. When did you first come across Kubernetes?

JUSTIN SANTA BARBARA: I think I was at one of the Google events where they announced it. I wasn't at the DockerCon, which I think was a couple of weeks before. But a couple of weeks after that in San Francisco, there was an event where Google announced it. And Brendan Burns was on stage, I think, and talked about it there. It was immediately interesting to me.

I think FathomDB was slowing down. OpenStack, in my opinion, had already slowed down. And Kubernetes was spinning up. And a lot of it felt very familiar. Kubernetes felt very similar to the sort of operational system that we'd ended up building at FathomDB to run these databases. Perhaps I had a bit of a head start there in that I understood the potential and what we could build with these things.

Today, when I look at some of the MySQL operators that are available on Kubernetes, they have been able to achieve a lot more than we were able to achieve at FathomDB. And I think that's a combination of both building in the open with a community and leveraging Kubernetes as the operational system.

CRAIG BOX: Many listeners will have first come across you as the author of the Kops or kOps platform. Do you have a preference on the pronunciation, first of all?

JUSTIN SANTA BARBARA: We didn't. And we are now recommending kOps. It better communicates what we think our value proposition is.

CRAIG BOX: You talk there about the change from platform companies and vendors doing things to open source. kOps was an open source platform from the start. Was that a conscious choice in the work that you were doing?

JUSTIN SANTA BARBARA: Yes, kOps has always been open source. And I think when I decided to get involved in Kubernetes, as did a lot of other people from the OpenStack community, I at least wanted to be very intentional about thinking about how would this be different? How would Kubernetes succeed or at least not fail?

And what were those areas? And could I get involved in them and help push it in the right direction in my little way? The two areas were, first off, I thought it had to work well with AWS. OpenStack and AWS always had a sort of complicated relationship. The people that were early adopters of cloud had ended up on AWS. And OpenStack was sort of telling them, OK, you have to throw all that away and start again.

And that was a difficult proposition for people that were the early adopters that we needed. With Kubernetes, I wanted to be sure that it worked on AWS, worked well on AWS, and that those users could adopt Kubernetes without having to effectively just start again. That's why I got involved in the AWS cloud provider.

At the time, it was the bash script kube-up. It still exists. But we basically hit the limits of what you could achieve with bash and Salt. And I started a project which was initially called kube-up v2 and rapidly became kOps, which, hopefully, users know and love.

CRAIG BOX: There may be a generational gap in the listeners of the show. And these are, obviously, very, very short generations. Think fruit flies, perhaps. But nowadays, people getting involved in Kubernetes simply go to their cloud provider and click a button, and one is delivered to them. There will be people who remember the introduction of kubeadm, which was tooling designed to make it easier for other tools to build clusters.

kOps predates all of this. We're going all the way back, possibly around the time of the launch of GKE. But like you say, the state of the art for installing Kubernetes for a period of time was a combination of bash scripts and Salt in the directory called hack, if memory serves.


CRAIG BOX: Which speaks well to the technology behind it, shall we say? How did the kOps tool change over time? You mentioned there was the v2 of this thing. Like has it evolved to be something that's still relevant today? Or is it something that was an artifact of its time and we don't really need to consider today?

JUSTIN SANTA BARBARA: Getting back to the thinking about how can Kubernetes not fail, I think one of the lessons I really learned from OpenStack was that Kubernetes has to work out of the box in open source. I felt that some of the OpenStack vendors had decided to make it their business model to make OpenStack easier to run or to work at all. And that set up a sort of negative dynamic where they were not incentivized to improve the core OpenStack experience.

I wanted to be very sure that Kubernetes itself would never become SaaS-only or consulting-ware because then, in my opinion, it would have been more source-available than a true open source community. The work I did in kube-up leading to kOps, I think, was around making sure that Kubernetes worked out of the box. I think kubeadm is a great tool. It is a building block tool.

kOps remains very relevant because it is one of the best open source tools, where you can go and get a fully working Kubernetes installation with the cloud provider, with all the pieces that work, and is tested by the community. And it does that on AWS, on GCE. Ciprian Hacman in the community is adding support for Hetzner Cloud as we speak.

You know, it works across multiple clouds. Even if you never use kOps, I think you benefit from the idea that the baseline that is set by open source is stronger. I think the community is also stronger for the existence of kOps. And there are people that want to use it, and that's great. And there are people that want to use the managed services, and that's great as well.

CRAIG BOX: Yeah, you can almost think of it as an insurance policy to some degree. You want to make sure that the ability to do this is there, even if nine times out of 10 you don't have that need.

JUSTIN SANTA BARBARA: Exactly. And there are people who have evaluated kOps and other open source tooling before deciding to adopt the managed services because they want to be sure that should something go wrong, that they have that insurance policy and can go back to self-hosting if they have to and it's not a huge disaster for them.

CRAIG BOX: Is this something that you considered commercializing yourself at the time?

JUSTIN SANTA BARBARA: The idea was always there. I think I'm happy with where we ended up, with it being a pure community product. There was always the worry about conflicts with OpenCore and all of those sorts of things. I'm grateful to Google for employing me. And the other contributors have also found employment. And I think throughout the entire Kubernetes community, we've given a lot of work away for free. But we have also found adequate compensation for that.

CRAIG BOX: Now you are working in the configuration management space. We have a lot of different terms which, I think, people will have different uses for and, perhaps, different understandings of. So first of all, what does configuration management mean to you?

JUSTIN SANTA BARBARA: You might say that Kubernetes is the tool for configuration management and that you write your configuration, you give it to Kubernetes, and Kubernetes makes it happen. But I think there's a lot of work being done there by "you write your configuration." The distinction I would draw is that Kubernetes is great for actuation, for actually running your configured pods, or disks, or load balancers, or toasters, or washing machines. Whatever it is you're controlling, you declare what you want, and Kubernetes makes it happen.

But now we have this incredibly powerful declarative language. We need to be a little bit more careful about what it is that we declare. To me, configuration management is about managing that process. We help you write your configuration. We want you to share those best practices, share your changes with the community. We want you to write rules to enforce policies and be sure that those rules are always applied.

Generally, we're trying to bring automation to tame all this YAML. This is an area that has previously been very artisanal. Like everyone has handcrafted their YAML. There has not been a lot of automation or re-use of the work. We want to bring the productivity and safety benefits of automation to the configuration space.

CRAIG BOX: If we think back to our example of Amazon deployment, and OpenStack in 2008, and so on, at that time, configuration management to me was largely Puppet – and there were a couple of other tools that worked very similarly. That's the one I happened to use.

There was a domain-specific language in which you could describe the things that you wanted to deploy. And a lot of people then took the ideas from that and came up with the term infrastructure as code. We're going to talk a lot about X as Y throughout this.

Why can't anything just be itself? Why isn't infrastructure infrastructure, and why isn't code code? Why does everything have to be something different?

JUSTIN SANTA BARBARA: I think that is exactly what we are talking about here. You need some representation for what things are. You have to have some language for it.

I think KRM, the Kubernetes Resource Model, is a great language that 70,000 people have convened to define together. So we have actually agreed on and defined that language. I think our objection, as it were, is that people are building tooling on top that hides the underlying configuration that we've all agreed on, and that that isn't necessarily helpful.

CRAIG BOX: Is it helpful to think of these things as being configuration when there are a lot of people out there who say, I need to be able to express in a more Turing-complete fashion, and I would really rather it was a programming language, and I was able to declare things through code perhaps?

JUSTIN SANTA BARBARA: I would say to some extent it's a false choice. You, obviously, should be able to use code to manipulate configuration if you want to. But we have this universal language.

It doesn't mean that you can't use programs to write that universal language. What it means is that when you and I communicate, we should communicate in that universal language. So when you write your code, you should be manipulating that universal language that we've all agreed. That universal language is the Kubernetes Resource Model, KRM, and those APIs we've defined.

CRAIG BOX: We talk a lot about YAML, which is really just a shorthand for a code format. I used to express configuration for Windows 3.1 in the INI file, for example. There is a format there. Is there a huge distinction at a meta level between the INI file and YAML in that they are just structured ways of referring to things, not talking about what the structure is?

JUSTIN SANTA BARBARA: There is no such distinction. Kubernetes itself supports JSON and YAML formats. We could imagine other representations of the data. Oh, it also supports proto, of course. The format itself is less important, other than that we should all agree on it. But it's certainly acceptable to convert between JSON and YAML.

YAML has some advantages. Particularly, as compared to proto, it's human-readable. As compared to JSON, it tends to diff better and tends to be more editable. Those are sort of the reasons why I think YAML has stuck, particularly with the GitOps community.

There could be another language. I don't think that it would be the INI format. But you can map YAML to keys and values if that's what you want to do.

CRAIG BOX: You can, I guess, relate the idea of a language and an alphabet, in the spoken sense. It's like, there are languages, but you could perhaps write in the Latin alphabet, write in Cyrillic and so on. So the format, whether it be YAML, or JSON, or INI, or whatever, doesn't really matter so much as the structure we've agreed upon as to how we will relate all these concepts together.

JUSTIN SANTA BARBARA: I think that's fair. And it's certainly easier if we all write with the same alphabet. We're in Spain this week. And it's certainly easier to speak in English as the universal language. But we can get by with translation tools as well.
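
To make the point concrete, here is a minimal sketch in Python, using only the standard library, of treating one KRM object as plain data. The Pod shown is a made-up example; the point is that the in-memory structure and its JSON rendering describe the same object, and the JSON rendering is itself valid YAML, since YAML 1.2 is a superset of JSON.

```python
import json

# One KRM object, held as plain data -- the "universal language" is the
# agreed schema (apiVersion/kind/metadata/spec), not the file format.
pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "example"},
    "spec": {"containers": [{"name": "app", "image": "nginx:1.21"}]},
}

# Render to JSON (also parseable as YAML) and read it back.
as_json = json.dumps(pod, indent=2)
round_tripped = json.loads(as_json)

# The data survives the format change unchanged.
assert round_tripped == pod
```

The same round trip would hold through a YAML library; the serialization is interchangeable "alphabet", while the KRM schema is the shared language.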

CRAIG BOX: A lot of people are writing their Kubernetes manifests using some sort of templating language. And Helm is a packaging tool which includes templating capabilities. The idea is that you want to say, here is a different piece of configuration for my configuration.

I've defined what it means to be an app. But it might be that there are certain values that I want if I'm in this environment and certain things from somewhere else. That feels like the way a lot of people treat this problem. But it also feels like the amount of tooling that's come on from there suggests that that is not in itself enough to meet the needs of the community.

JUSTIN SANTA BARBARA: I mean, Helm is very popular. Like Helm is meeting needs of the community. I don't want to knock Helm in any way. And we're very grateful for what Helm has done. I think when we are talking about how we can take it to the next level, that's when we have to look at the downsides of Helm as well. We look at the tooling that's out there. People use a huge variety of tooling, mostly according to their personal tastes. You know, they are artisans. And they will use their own tooling.

There are the templating languages like Terraform, or Go templating, or Helm templating. There are infrastructure as code in real programming languages. I do think they're all code in the end, just with different levels of abstraction.

But there are these different types of tooling. And I think the problem with them all is that you have to switch to specifying everything in this different language. When a new parameter comes up, you have to add it to the values.yaml and plumb it through that machinery.

Another problem is you can't just take a change in the output syntax, and sort of reverse the machinery, and know what the input values should be. So when you're talking about making a change across packages, like, for example, to apply a security fix, you have to figure out how to do that reverse mapping for every single package individually one by one. And if you're using multiple tools, the problem just gets worse. And so this is a particular instance where I think the introduction of another layer of tooling, another language, if you will, has made it harder for us to proceed and move to the next level of configuration management.

CRAIG BOX: So if something like Helm generates some YAML for me, and then I look at that and say, this is the thing that gets sent to my cluster, but as you say, there is a security vulnerability with some value that needs to change, I'm taking from what you say that there's no way for me to be able to find the value in the YAML output and then figure out which of the various Helm input files I need to put that into.

JUSTIN SANTA BARBARA: It's certainly a very difficult CS problem. It would be a fun CS problem to solve. But it's certainly not solved by the Helm tool today. And it can't be solved in general.

We have ways to make it better. If you're just validating the output of a Helm chart, that's fairly straightforward. You do have to write an adapter to be able to render the Helm chart or to be able to run the JSON and produce that YAML. But you can run your validation consistently.

At the end of the day, though, when you write that validation, you're going to be validating Kubernetes objects. You're not going to be validating the Helm template. And so effectively, you've just introduced this layer of complexity on top of what you're doing by introducing Helm here.

What Helm doesn't allow is you can't write back. So we're giving you sort of problems and not solutions. So we say, oh, look, this service has a public IP. You have this rule that says you can't have public IPs.

You can't then say, oh, let's feed that back and find the right values.yaml entry. Either you go and you change that Helm chart to add that entry to values.yaml and plumb it all the way through, or you forever maintain a sort of patch on the end, where you do fix it up.

And at that point, are you really using Helm? Probably not. But I mean, I think actually that begins to go to where we expect the actual future to be. It's not a bad outcome if you take some YAML objects generated by some code or by some templating language, you validate them, you manipulate them if you want to, and then you apply them.
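
As a sketch of that render-validate-mutate-apply flow, here is a hypothetical Python example that treats rendered manifests as plain data. The rule (no public IPs, meaning no Services of type LoadBalancer) and the Service names are invented for illustration; a real pipeline would load these dicts from the YAML a tool like Helm emits, and would apply the result to a cluster afterwards.

```python
def find_public_services(manifests):
    """Flag Services that would get a public IP (type LoadBalancer)."""
    return [
        m["metadata"]["name"]
        for m in manifests
        if m.get("kind") == "Service"
        and m.get("spec", {}).get("type") == "LoadBalancer"
    ]

def make_internal(manifests):
    """Fix violations by mutating the rendered data directly,
    instead of reverse-engineering a values.yaml entry."""
    for m in manifests:
        if m.get("kind") == "Service" and m.get("spec", {}).get("type") == "LoadBalancer":
            m["spec"]["type"] = "ClusterIP"
    return manifests

# Stand-in for the output of a templating tool.
rendered = [
    {"apiVersion": "v1", "kind": "Service",
     "metadata": {"name": "frontend"},
     "spec": {"type": "LoadBalancer", "ports": [{"port": 80}]}},
    {"apiVersion": "v1", "kind": "Service",
     "metadata": {"name": "backend"},
     "spec": {"type": "ClusterIP", "ports": [{"port": 8080}]}},
]

assert find_public_services(rendered) == ["frontend"]
fixed = make_internal(rendered)
assert find_public_services(fixed) == []
```

Because the check and the fix both operate on the agreed KRM shape, they work the same regardless of which template engine produced the objects.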

CRAIG BOX: We had this concept of infrastructure as code. And the thing isn't actually code, really. It's configuration. It's a domain-specific language that some piece of code takes and actuates.

Now we're looking at that future being what you are thinking of as configuration as data. How do we relate these product categories to each other? Are they the same thing? Is X as Y the same as Y as Z?

JUSTIN SANTA BARBARA: I think configuration as data is sort of the insight of removing unnecessary layers of complexity. The fallacy, I think, in infrastructure as code is that you do end up applying data to the Kubernetes API. You can put it through a lot of layers of abstraction. But at the end of the day, there is always some data there.


JUSTIN SANTA BARBARA: When you're using infrastructure as code, you're still ending up with your configuration as data. And our argument is let's acknowledge that, and let's be mindful of that, and let's start trying to get some benefits from that insight.

Let's start manipulating that data and treating it as a first-class citizen that we can analyze, that we can apply changes to, that we don't have to necessarily push on the piece of string to reverse the machinery. Let's manipulate the data directly.

CRAIG BOX: So you've had to build a code as configuration layer so we can have infrastructure as code, as code as configuration, as configuration as data. Am I getting that right?

JUSTIN SANTA BARBARA: I think there are definitely too many layers going on there, which, I think, demonstrates the point very well.

CRAIG BOX: All right, so let's strip all those layers back and say, for someone, perhaps, who is familiar with the idea of dealing with the templating language, perhaps someone who's using Jenkins to put the values in, what exactly are we talking about when we talk about configuration as data?

JUSTIN SANTA BARBARA: Let's get very concrete about what this actually means. We don't need to throw away all your work. You know, what you've done is great. It's working, probably. And it's a good start. But we want to basically keep going from there.

When we think about what doesn't work today with that scenario, it's the problem we described where if you want to make a change, you have to go back and insert a parameter into values.yaml. Hopefully, it exists already. But if it doesn't, you have to add it and get that upstream or into your private fork of the Helm chart.

And this is a problem that Brian Grant at Google calls overparameterization, which is the observation that as a Helm chart matures, you end up with every value in your manifest being a parameter that is exposed in values.yaml. We have this community of 70,000 people debating every nuance of API design in Kubernetes. And then we just flatten all of those into a key-value map.
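
To make overparameterization concrete, here is a sketch of what a mature chart's values file tends to look like. The keys are hypothetical, not taken from any real chart, but each one shadows a field the Kubernetes API already defines.

```yaml
# Hypothetical values.yaml from a mature Helm chart. Every entry
# mirrors a field that the underlying Deployment manifest already has.
replicaCount: 3
image:
  repository: mysql
  tag: "8.0"
  pullPolicy: IfNotPresent
resources:
  requests:
    cpu: 250m
    memory: 512Mi
podAnnotations: {}
nodeSelector: {}
tolerations: []
affinity: {}
# ...and so on, until the flat key-value map shadows most of the manifest.
```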

That makes me sad as a member of the Kubernetes community that worked hard to build those. But also, as a consumer of the Helm chart, you're not particularly well served either because you have to learn that chart API in addition to the Kubernetes API.

And you're not building your intuition about the Kubernetes API. When people are looking at the complexity of their Kubernetes YAML, I feel like part of that is an objection to this problem: we now have these two languages that we have to know and understand. And the one that I write isn't the one that I'm working with when something goes wrong.

CRAIG BOX: One of the earlier attempts to break those languages down and only be working in one place was the kustomize project, where we have the idea of subsets of YAML that are merged together with other ones, possibly by way of labels, to generate the YAML objects at the end. You are now dealing with the same configuration language, if you will, all the way through. What are the downsides to that approach?

JUSTIN SANTA BARBARA: I don't think it's a bad approach at all. The configuration-as-data team is actually the same team within Google that is working on the kustomize project. And we're building on everything that we've learned from kustomize. kustomize is a great tool, and we love it.

And we're trying to carry on the utility of kustomize. So yes, you can do a lot of this with kustomize. kpt is one of the tools in this configuration-as-data toolkit, and kpt and kustomize actually share a lot of code.

We also want to support the same sort of functionality in the kustomize pipeline. What we do want to do, which is going further, is to build a more integrated experience that encompasses more of the lifecycle of packages. And that isn't something that is in scope of the kustomize project itself.
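
The kustomize model Craig describes -- plain YAML bases merged with overlays -- can be sketched in a minimal kustomization.yaml; the paths and label here are illustrative.

```yaml
# Hypothetical kustomization.yaml: merges the base manifests and
# stamps a common label onto every object it generates.
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - ../base                    # plain Kubernetes YAML, no templating
commonLabels:
  app.kubernetes.io/part-of: my-app
patches:
  - path: replica-count.yaml   # a small strategic-merge patch
```

Running `kustomize build` (or `kubectl apply -k`) then produces the final objects.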

Specifically, what we're doing is building in this notion of the change workflow: you edit packages, you validate them, you approve them. And we're building in a knowledge of rollouts.

So today, you can absolutely build your own process for this. You can build a PR process in your GitHub, or GitLab environment, or whatever it might be. You can build CI pipelines to run that validation. You can use Jenkins to roll out the configuration once it's approved.

And if you've built that, I'm sure you can attest that it is a lot of work. And I have been talking to people this week at KubeCon. And I have yet to meet anyone that loves the process that they've built. It ends up sort of feeling bolted together.

And what we're trying to do is put together the framework and give you that process that works out of the box. You'll be able to propose a change. Your validation and policy logic is guaranteed to run against that change. Reviewers can review and approve or reject the change. And then approved changes roll out to your environments however you have that configured.

I'd mention that this is stuff we are still building out. So it isn't the case that you can download this from our GitHub repo and just start going. But my hope is that if you're building this, if we build it together in open source, then everyone is better off. We get a better product in less time.

CRAIG BOX: Let's step back a bit and perhaps talk about the concept of a package. I know that people have two distinct use cases for software like Helm. There are a lot of people who are building something internally. And they want to do templating and roll out the thing that they're building themselves.

And then there are people who want something off the shelf. And let's pretend it's MySQL in this case, as something you have some experience with. They will find a Helm chart that defines that thing. And they will take that in the same way that they might previously have used apt-get to download a packaged version from someone else.

So the idea is someone has bundled that up for them and made a thing of it. Are people who are doing their own internal thing, are they building packages as well? Or is there a distinction between what people do with their own software and what they do with something that they get from a different team or a different vendor?

JUSTIN SANTA BARBARA: I think what we'd like to be the case is that there is no distinction, that packages are packages, that the big difference is whether it's sort of a blueprint package that is intended to be instantiated, or whether it's an actual instance of a package ready to be deployed. I think one of the problems with the tooling ecosystem as we've seen it today is that that isn't necessarily the case, that there is a barrier to building these packages internally, that there is a barrier to contribution when you make changes.

It's sort of hard to get them back upstream. And what tends to happen is that people tend to fork the packages, make their own changes locally, and deviate from the upstream package. And the problem there is, of course, if there's a security fix or some urgent thing like that, you're not getting the benefits.

One of the principles of the Kubernetes Resource Model is that the format that you work with declaratively on disk, the format that you write, is the same as the live-state format. In other words, when I kubectl get, it looks a lot like what I kubectl applied. And that is not the case with, actually, most APIs. It's sort of the linchpin of declarative. But it is a deliberate and difficult design decision.

And what that means is when we're talking about building those internal packages, I can now take the live state with a kubectl get. And that is ready to go in the format that I need to create a package. And if we remove those layers of templating and remove all that machinery, we're much more able to take the state that we see.
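
The round-trip property described here can be seen with any core object: the manifest you apply is the same shape as the live state you read back. A minimal sketch:

```yaml
# Apply this, and `kubectl get deployment web -o yaml` returns the
# same structure, plus server-populated fields such as status,
# resourceVersion, and defaulted values.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 2
  selector:
    matchLabels: {app: web}
  template:
    metadata:
      labels: {app: web}
    spec:
      containers:
        - name: web
          image: nginx:1.21   # illustrative image
```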

Maybe it's a development environment that you built up by hand. Maybe it's a prod environment that you want to duplicate to run a sort of QA environment. But we're able to take that environment and immediately convert it into the package format in a way that hasn't been the case elsewhere, where we often see customers and users complaining that what takes them a couple of minutes to do through ClickOps can take days or weeks to productize or package up into something they can then deploy across their fleet.

CRAIG BOX: How much of this already existed in the kpt toolkit? And how much of it is new, just released this week?

JUSTIN SANTA BARBARA: This week, we announced a few things. We announced Config Sync being open sourced. We announced Config Connector being open sourced.

kpt has always been open source and developed in the open. We have actually been developing a lot of this work in the open already. But we are sort of announcing it now.

What we're adding this week is the server-side version of a lot of the kpt functionality, which we're calling Porch. It is in the repo. And what it does is effectively give you an API where previously you had to call into the CLI.

So if you go back into Kubernetes history, a long time ago we moved deployments from being a client-side thing implemented in kubectl to running on the server instead. kubectl definitely worked for deployments. But it is a lot easier to work with the server-side version. It's a lot more extensible. It's a lot more composable.

CRAIG BOX: You say Porch there. That's porch as in where we sit with our rocking chairs?

JUSTIN SANTA BARBARA: Yes, it is spelled that way. But perhaps Porch more as a great place for community, particularly in these COVID-aware times. It is short for package orchestration.

CRAIG BOX: Orchestration is, obviously, a theme of the Kubernetes environment. Is it feasible to think of packages as becoming a first-class citizen? There's a lot of work being done by various different groups, none of whom seem to agree on the concept of an application being a thing inside Kubernetes.

You mentioned deployments there. Deployments are very much what the software teams might think about. But when you think about a package, it might relate to multiple binaries on the disk-- the DB example again. It's a higher-level abstraction again. Is there a place in the Kubernetes Resource Model for packages as a thing, a concept with a name?

JUSTIN SANTA BARBARA: We believe there is. Package is one of the API resources in Porch. One of the things that you get if you use Porch is a view, an API onto all your packages in their various states. Be it deployed, be it proposals that are in progress, you get that single pane of glass across all your packages.

CRAIG BOX: You've talked there about approval flows. That's something that people do, like you mentioned, with GitLab, GitHub, et cetera, GitOps as a concept. Does this work in that model? Or are you proposing something completely different?

JUSTIN SANTA BARBARA: Porch absolutely has to work with the existing stores of YAML and packages as they are found today. The idea here is that we'll still have backing storage where that YAML configuration is stored. And we support two backing stores today.

We have Git, and we have OCI. OCI is the 2022 name for Docker container images, now that nobody thinks that shipping containers are quite so amazing anymore as we've watched that system break down. Anyway, if you're using Git, we want to interoperate with the PR workflow that you're probably already using today. And I'll be honest, there are some tricky cases here.

Like what if the user force pushes something to the PR that isn't valid YAML, for example? We'll figure those out together. One of the challenges both for us and for users is that there's no consistent API today for viewing the state of those proposed packages in the various Git forges or for automatically approving them. And so that's one of the things that you'll get by going through the Porch API. But I'd emphasize that we want the existing PR flows to keep working.

CRAIG BOX: When I'm downloading and running one of these packages, how am I making the configuration changes for my own environment? Is that by way of extra pieces of YAML or by way of pieces of code?

JUSTIN SANTA BARBARA: One of the things we're trying to do is we're trying to simplify that base package. We're trying to say, don't have a complex machine that produces your YAML. Ideally, start with just static YAML. The problem is, of course, you often don't want to stop with static YAML. You want to do other things to it.

The first mechanism we have is called functions. So we acknowledge the need for code to run to make changes to that YAML. So the canonical example here is setting labels or setting annotations on your YAML. That's something you want to apply to a whole bunch of objects. And you want that to be pretty consistent.

So we have KRM functions. Those exist today in kpt. We are proposing to distribute them as container images. What that means is that a bunch of problems are sort of solved by the community already. The problem of versioning is solved. Distribution, signing, and attestation-- those are all problems we know how to deal with at this point.

You can basically put any code you want to into one of those functions. But we put a sort of name on it. And we pass some parameters to it. And those are a nice way to manipulate your YAML in bulk.
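
In kpt, the set-labels function Justin mentions is typically declared in the package's Kptfile pipeline. This is a sketch; the package name and function image tag are assumptions.

```yaml
# Kptfile: declares a pipeline of KRM functions, each distributed as
# a container image, that run over every object in the package.
apiVersion: kpt.dev/v1
kind: Kptfile
metadata:
  name: my-app                # hypothetical package name
pipeline:
  mutators:
    - image: gcr.io/kpt-fn/set-labels:v0.2.0   # tag is illustrative
      configMap:
        env: staging          # label applied to all objects in bulk
```

Running `kpt fn render` then executes each mutator in order against the package's YAML.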

CRAIG BOX: How much do we really want to have people developing software touching YAML at that level? We think about it as being the assembly language of the cluster. Surely, the platform teams want to just define some way of having developers say what they want and then not touch the YAML, have some system generate it for them. But the benefits of this apply when you have one team going all the way through. When you've got two different teams, what's the right mix there?

JUSTIN SANTA BARBARA: I think you've hit on an interesting point about the philosophy of the tools that we use today. We may have made our lives a lot harder by abstracting away the Kubernetes resources. So you take a complex template. There's going to be a ton of parameters in there. And some of those, you want to allow, and some of those, you don't.

And today, you basically go through that Helm chart or whatever it might be, and you say, well, I want to allow these, and I don't want to allow these. So you block off the ones you don't want to allow in some way. And then the Helm chart updates. And you have to go through and look for all the new ones and pray you don't make a mistake and let something through. It's a lot of work. And often, what happens is you're not going to update that Helm chart as often.

And instead, in this approach, we say, you should just specify the actual constraints that you care about directly on the Kubernetes resources. Those constraints, once you specify them, they're going to work across all the packages. So you don't have to go through Helm chart by Helm chart. You just say, I don't want any public IPs. Write that once, and it works everywhere.

The constraints work across tools. We can share those constraints across companies and across teams. So we have some work to do to write those constraints. But we can do it together. And we can share it as a community.
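
A shared constraint like "I don't want any public IPs" is commonly written once as a policy resource and enforced everywhere. This Gatekeeper-style sketch is illustrative; the constraint kind would be defined by a corresponding ConstraintTemplate.

```yaml
# Hypothetical constraint: written once, enforced across every
# package and tool, rather than per Helm chart.
apiVersion: constraints.gatekeeper.sh/v1beta1
kind: K8sNoPublicIPs          # kind comes from a ConstraintTemplate
metadata:
  name: deny-public-ips
spec:
  match:
    kinds:
      - apiGroups: [""]
        kinds: ["Service"]    # e.g. disallow public LoadBalancer services
```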

CRAIG BOX: Is this a case where it would be beneficial to be able to change the Kubernetes API to have a field on each object now which lists constraints and so on so that the data is able to be there across the tools without having to have languages that are different for each set of tooling?

JUSTIN SANTA BARBARA: Kubernetes already has a schema definition, the OpenAPI. So the objects will have a structure. There are some schema-level validations on those.

So you can say the number of replicas has to be a positive integer. I actually don't know if that's specified. But that could be specified there.

I think one of the challenges with that API is that it is shared across all the users of the Kubernetes API. And so you might say, I don't want any public IPs. But someone else might say, actually, I really need public IPs. Or I want public IPs in my development environment but not in my production environment.

Here's where I think we end up. We end up with each company sort of consuming some common constraints from a shared library of them, maybe writing a few of their own but not too many. And that augments the Kubernetes OpenAPI. And you end up with those constraints. And you're able to enforce all of those constraints across your organization.
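
Schema-level validation of the kind mentioned here lives in a resource's OpenAPI schema. For a custom resource, the replicas rule might look like this excerpt of a CustomResourceDefinition.

```yaml
# Excerpt of a CRD (apiextensions.k8s.io/v1): the API server itself
# rejects objects whose replica count is not a positive integer.
spec:
  versions:
    - name: v1
      schema:
        openAPIV3Schema:
          type: object
          properties:
            spec:
              type: object
              properties:
                replicas:
                  type: integer
                  minimum: 1
```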

CRAIG BOX: You mentioned ClickOps before. And one of the things that I saw in the announcement was a plugin for the Backstage developer portal to be able to do a lot of this work with the mouse, if you will. Back to our Windows example, perhaps: you can configure your INI files, or you can go click a bunch of checkboxes in the user interface. What are you able to do with the Backstage plugin? And when is the right time for people to click through things versus write them in configuration or code?

JUSTIN SANTA BARBARA: UIs are super convenient. You'll find me using UIs sometimes, even though I could probably muddle through the YAML. The distinction is, Kubernetes is great, and it doesn't feel complicated if you use it every day.

But if you're dipping into it occasionally, that's when it can be a bit frustrating. YAML isn't particularly discoverable. And that's the problem that we typically solve with a nice UI. So if you're using something occasionally, that's where a nice UI is wonderful.

The problem we have with the templating approach is it's hard to build a reusable UI. You're building a UI on top of the Helm values.yaml. And those are different for each package. If you fork that package, they're different for your company as well.

Instead, what we end up doing if we make the Kubernetes API the center of attention here is we end up building UI components on each object, on each object kind. We end up building UI components for each function. What we end up with is a UI where you start off with your base package. And then UI components light up based on what is in that package.

So we see a deployment in there, and we light up a nice UI which enables you to tweak the number of replicas, for example. We can do things like progressive disclosure to make the UI a bit easier-- something that still makes the simple things easy but the hard things possible.

And importantly, you get sort of seamless validation across all your in-house rules, even in the UI. And when you do click that button, we don't immediately apply it. But rather, it goes through the approval process that you've carefully defined.

CRAIG BOX: I was going to ask about that because in the standard GitOps approach, it then goes through a system where there is a history, which you can go back and validate each change if you need to. Is the approval process that Porch uses similar to that? Is it the same in effect?

JUSTIN SANTA BARBARA: I mean, we want to make it as similar as possible. And ideally, if you're using GitHub, it will be an actual GitHub PR under the covers. One of the things we're bringing to the table here is we're making it sort of both technically and economically feasible to build these UIs.

If you're having to build UIs in your own company across each one of your packages, that's a lot of work you have to do. And maybe if you have a lot of users, it makes economic sense. But when we, as a community, can write a handful of reusable UI components that work across the relatively small number of Kubernetes API objects, then we sort of change the equation there.

When we've built this API that exposes the approval and review process, you can plug that into your own approval and review process in a way that-- of course, you could do that with a UI that you hand-crafted yourself. But it's a lot more work. And it probably is so much work that you're not going to get to it.

CRAIG BOX: How do we go from where we are today, which, in many people's cases, is simply generating the barest minimum-effort template to get something running, to a world where we have configuration as data, which allows us to have our UIs run on it without any extra thought? It does feel like there is going to be a big leap. And we are also only dealing with the 5% to 10% of the IT world who have already made that big leap. How do we then think about the other 90% to 95% of people who have never started the Kubernetes journey? Is it worth them doing things the wrong way and moving to the right way? Or is there a way to jump straight into what we're now saying is state of the art?

JUSTIN SANTA BARBARA: One of the things about my personal journey is the importance of meeting people where they are and growing with them. If you're using Helm charts today, we're going to meet you where you are. And if you want to, you can use all this tooling as a fancy single pane of glass across all your Helm chart instances.

You don't have to do much more than that. You can use it and get approval processes. You can do things like query for all the Helm charts that might be out of date, whatever it might be.

From there, though, you can incrementally add functions. You can incrementally add direct editing. You can add validation. You can add change approvals and workflows to that.

If you're already using some of that in GitHub, we'll try to make sure that it works exactly the same. It might not work exactly the same in every detail. But we'll get as close as we can, such that it won't be disruptive.

If you're in the 95% that hasn't even started, hopefully we'll have a library of packages that are easy to consume, that make a lot of sense, and that aren't expensive for you to adopt in terms of being able to apply all these things. You don't have to build all these things that other people have spent a lot of time building. You can just consume what is, effectively, our reference implementation.

It's going to be open source. So if you don't like something, you don't have to throw it all out and start again. You can just tweak it. We'd love for you to tweak it and submit those changes upstream. But you don't have to. You can keep it private.

CRAIG BOX: One of the things I used to do was build Debian packages for internal things. And there is a question again about the effort required to build something to a standard where it can be distributed outside the company, or possibly to anyone on the internet, versus just SSH-ing to each machine and copying the stuff over. Do you think that the work you've done here will make it easy enough for people to do the correct thing, so that they don't want to do the easy thing?

JUSTIN SANTA BARBARA: I hope it will make it easier for people to do the correct thing, yes. The reason being this principle of KRM that you can take the live state, and you get a ready-to-go package of YAML. I will say the art here will be that we want those redistributable packages to be minimal.

One of the problems with templates is that they are very much not minimal. When it's the only tool you've got, you've got to put everything in that tool. And so they end up expressing every option. It's sort of the opposite of minimal.

And so the question is, when you have a package that contains some YAML, how much of that should be the base, and how much of it should you extract into, say, a set labels function? How much of it should you extract into a different function? How much should you allow users to change directly?

CRAIG BOX: You mentioned Config Sync and Config Connector there. We've talked to some of the authors of those tools in the past, but do you want to give us a brief rundown on what those two tools do and why they've become open source?

JUSTIN SANTA BARBARA: Absolutely. So they are part of the end-to-end story here. Config Sync is a tool which is able to pull YAML currently from Git and apply it to your Kubernetes cluster and is able to report the state of that apply back.

In other words, if we have a package of YAML that's sitting in a Git repo, we're able to apply it to the cluster and reflect back some state about whether it applied correctly. So you know, when you're applying that package to 1,000 clusters, you want to know that it applied correctly to all 1,000 clusters. And that's something that Config Sync enables.
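
Config Sync is typically pointed at a repository with a RootSync object; a sketch, with a placeholder repository URL:

```yaml
# RootSync: tells Config Sync which Git repo, branch, and directory
# to pull YAML from; sync status is reported back on this object.
apiVersion: configsync.gke.io/v1beta1
kind: RootSync
metadata:
  name: root-sync
  namespace: config-management-system
spec:
  sourceFormat: unstructured
  git:
    repo: https://github.com/example/config   # placeholder
    branch: main
    dir: /deploy
    auth: none
```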

The other piece, Config Connector, is about expanding the reach of the Kubernetes Resource Model. Config Connector is a set of operators that enables you to provision GCP resources, Google Cloud Platform resources. So for example, you can create a GCE instance or a Cloud SQL instance using Kubernetes. And that enables us to configure more resources than just the application use case that Kubernetes has traditionally been used for. I hope we end up in a world with these sorts of things where KRM is the universal language for computing, Kubernetes is the tool that is used to apply it, and configuration management is how we manage all of those resources.
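
With Config Connector installed, the Cloud SQL example becomes an ordinary KRM object that is reconciled into a real instance; the name and tier here are illustrative.

```yaml
# SQLInstance: Config Connector reconciles this Kubernetes object
# into an actual Cloud SQL instance in the bound GCP project.
apiVersion: sql.cnrm.cloud.google.com/v1beta1
kind: SQLInstance
metadata:
  name: my-mysql              # illustrative name
spec:
  region: us-central1
  databaseVersion: MYSQL_8_0
  settings:
    tier: db-f1-micro         # illustrative machine tier
```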

CRAIG BOX: Let's bring that back around now to the idea of the Cluster API and that being a way of defining clusters using the KRM. How does kOps work in the world where we now want to use the KRM as the source of truth for everything?

JUSTIN SANTA BARBARA: I've been involved in kOps for a long time. We've tried to do a lot of things along the way. We have not yet got Cluster API into kOps, though I continue to try.

kOps has actually always been KRM based. The language which you use to define clusters and what we call instance groups, which are effectively groups of nodes in kOps, is KRM. They are using the Kubernetes machinery.
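
A kOps instance group really is plain KRM; a minimal sketch, with illustrative values:

```yaml
# kOps InstanceGroup: a KRM object describing a group of nodes. It is
# stored in the kOps state store today rather than on an API server.
apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
  name: nodes
  labels:
    kops.k8s.io/cluster: example.k8s.local   # illustrative cluster
spec:
  role: Node
  machineType: t3.medium
  minSize: 2
  maxSize: 4
```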

They don't live today on a Kubernetes API server, but they could. The tooling that we are building with configuration as data will actually likely work very well with kOps as it is today. I think one of the things that I'd like to see for kOps, though, is adopting more of these things going forward.

So I'd like to see us have Cluster API support. I continue to work. I continue to try. We probably will one day, touch wood.

The other thing that I think we can learn a lot from is the whole configuration-as-data approach and this whole idea of more structured configuration management. So for example, today in kOps, we embed all the cluster component configuration in the cluster object. And that object becomes sort of big and monolithic.

And it ends up looking a lot like a very complicated Helm chart, for example. It's a generator from an input value that has some very complicated machinery and produces some objects at the other end. They aren't currently KRM objects, but they could well be. And maybe they will be soon.

But can we break up that kOps cluster object into a more minimal package? We're working on that. Upstream Kubernetes has now decided that component configuration, which is the KRM way of specifying flags, is the way to go. And so we're trying to use that to declare the state of the components and hopefully move some of the complexity out of that base package, out of our cluster object.

CRAIG BOX: Your first PR to the Kubernetes project in 2014 was taking some early steps to add support for IPv6. And I hear that the kOps team have finally mostly got it working!

JUSTIN SANTA BARBARA: That is mostly true, yes. It's actually wonderful to see, after eight years. Those were my very early steps on IPv6-- and I would emphasize they were very early, from when I was just starting to contribute to the Kubernetes project. I don't want to overstate my contribution there.

CRAIG BOX: I did say only mostly working.

JUSTIN SANTA BARBARA: The Kubernetes community has done a lot of work on IPv6. Both AWS and GCP have added IPv6 support to their clouds. And so it is now possible to run pure IPv6 pods on Kubernetes.

And it actually works very well. There are some snafus around load balancers. But you should go watch the talk about it and learn where we are today. I hope in eight years we're sitting here, and configuration management is just the natural way everyone does things, and we're talking about billion-pod clusters that, of course, all have IPv6 addresses.

CRAIG BOX: All right, well, thank you very much for joining us today, Justin.

JUSTIN SANTA BARBARA: Thanks, Craig. It's been a pleasure.

CRAIG BOX: You can find Justin on Twitter at justinsantab, and you can find his work on the web at kpt.dev.


CRAIG BOX: That brings us to the end of another episode. If you've enjoyed the show, please help us spread the word and tell a friend. If you have any feedback for us, you can find us on Twitter at KubernetesPod, or reach us by email at kubernetespodcast@google.com.

You can also check out the website at kubernetespodcast.com, where you will find transcripts, and show notes, as well as links to subscribe. Thanks for listening. And we'll see you next week.