#250 April 1, 2025

Kubernetes Resource Orchestrator (KRO), with Jesse Butler and Nic Slattery

Hosts: Abdel Sghiouar, Kaslin Fields

Today we welcome Jesse Butler and Nic Slattery to talk about the Kubernetes Resource Orchestrator, or KRO. Jesse works as a principal product manager at AWS, and Nic is a product manager at Google. The Kubernetes Resource Orchestrator is a new cloud-agnostic tool meant to simplify Kubernetes resource management for developers and platform admins.

Do you have something cool to share? Some questions? Let us know:

News of the week

  • Kubernetes JobSets: An open-source API for managing distributed jobs as a single unit. Integrates with Kueue for better resource utilization.
  • Google Cloud Next ‘25: Happening in Las Vegas, April 9-11. The Kubernetes Podcast team will be there!
  • Kagent: A new open-source AI agent framework built on Microsoft’s Autogen, designed for automating operations and troubleshooting in Kubernetes.

ABDEL SGHIOUAR: Hi, and welcome to the "Kubernetes Podcast From Google." I'm your host, Abdel Sghiouar.

KASLIN FIELDS: And I'm Kaslin Fields.

[MUSIC PLAYING]

ABDEL SGHIOUAR: In this episode, we talk to Jesse Butler and Nic Slattery. Jesse is a principal product manager at AWS, and Nic is a product manager at Google. We talked about the Kubernetes Resource Orchestrator, or KRO for short. KRO is a new cloud-agnostic tool meant to simplify Kubernetes resource grouping for developers and platform admins.

KASLIN FIELDS: But first, let's get to the news.

[MUSIC PLAYING]

Kubernetes introduced JobSets, an open-source API for representing distributed jobs. JobSets allows users to manage child jobs as a single unit using a higher-level API. It also integrates with Kueue and allows users to submit job sets via Kueue to oversubscribe their clusters by queuing workloads to run as capacity becomes available.

ABDEL SGHIOUAR: Google Cloud Next will be held in Las Vegas, Nevada on April 9 to 11. The "Kubernetes Podcast" team will be on site, so come by and say hello if you are going to be there. Keep an eye on our social media for more info about where we'll be at the event.

KASLIN FIELDS: Kagent is a new open-source agent framework introduced by Solo. The framework, which is built on AutoGen from Microsoft, allows DevOps and platform engineers to run AI agents in Kubernetes, automating complex operations and troubleshooting tasks. And that's the news.

ABDEL SGHIOUAR: Hi, everyone. Today we welcome Jesse Butler and Nic Slattery. Jesse works as a principal product manager at AWS, and Nic is a product manager at Google. We will be talking about a recent joint announcement called K-R-O, or KRO. The Kubernetes Resource Orchestrator is a new cloud-agnostic tool meant to simplify Kubernetes resource management for developers and platform admins. Welcome to the show, Jesse and Nic.

JESSE BUTLER: Thanks so much for having us. Really, really happy to be here.

NIC SLATTERY: Yeah, thank you.

ABDEL SGHIOUAR: I'm so excited for this episode. We've been trying hard to make it happen, so I'm excited that it's finally happening.

NIC SLATTERY: Me, too.

JESSE BUTLER: Yeah, me, as well. I think that was my fault, right? I had a bit of a plumbing issue in my house, so, you know.

ABDEL SGHIOUAR: Yeah, I'm well aware of that. [LAUGHS] No problems, no problems.

NIC SLATTERY: I'm not sure people want to hear about your plumbing issues, Jesse.

JESSE BUTLER: Not mine, my house. So yeah, back to the program.

ABDEL SGHIOUAR: Yes.

[LAUGHTER]

All right, so I'm going to start with-- I mean, I've been talking about KRO for a while. I've been posting about it. We wrote some articles. We're still writing more articles. I literally gave a talk today at 9:00 AM, a one-hour talk at an internal conference, and KRO was the topic that got the most questions, which was super interesting. So let's start with an introduction. Can one of you-- or both of you-- tell us, what is KRO?

NIC SLATTERY: Yeah, sure. Well, I think the explanation requires a little bit of a backstory. But basically, what we found in the Google team is that most of our customers are building internal developer platforms on Kubernetes, and they were struggling with the orchestration layer-- how to build an orchestration layer that is native to Kubernetes and works across clouds or across any of their third-party tools. And what we found when we investigated this is that there was no way to group resources in Kubernetes in a way that runs server-side and is Kubernetes-native.

And it just makes it easier for platform teams to build services that can be easily consumed by their end users, who are typically developers or data scientists. So KRO provides that mechanism to group resources, and so platform teams can use KRO to essentially define repeatable patterns for how services should be consumed across their organizations.

JESSE BUTLER: Yeah, that's exactly right. And I guess my backstory isn't-- I maybe preloaded that. So we at AWS were having the same discussions, the same problems. There's no Kubernetes-native feature that does this. We also were finding that customers were kind of writing their own custom controllers with custom resource definitions and going through the Kubebuilder journey, which is-- it's fun, and it's fine, but if you manage 50 or 100 of these things, and it's not really your day job, it's not as fun.

So that was another problem we were trying to solve. When you're building Kubernetes operators simply to use other Kubernetes operators and not putting business logic into it, we're like, well, we could try to solve that problem. So yeah, sort of last year, earlier last year, we were working on this problem. Some open-source friends at Azure pinged us, said, we're trying to look at this problem, and we know that you're looking at this problem. Maybe we could collaborate.

Through a mutual customer, both Nic and I heard that we were both collaborating, so we said, hey, let's just get in a room. And so we got in a room. And it basically was a great set of conversations where we realized, hey, we all have the same problems. And from the position of cloud service providers, we want to solve these problems for our customers.

And one of the other problems in Kubernetes, as we all know, is that there's 85 different ways to do everything. So what we wanted to make sure is that we weren't going to build three new ideas that customers then had to choose from and context switch between depending on where they're running their application. So we sort of said, well, let's just align.

And so we took some of the ideas that Google was working on and some of the ideas that we had and some of the things that Azure was working on, and we combined them all around this one, simple primitive to make it, ideally, something that we could see as a Kubernetes feature, rather than another end-to-end solution, something that you have to build and manage and learn all of the ins and outs of. Let's just say, let's keep it simple, and let's just make it what it is. Limit the scope. Don't try to boil the ocean. And that's what we did.

And so, yeah, it's been going really well. It's not viral, which makes me happy because we didn't want it to just explode because it's still effectively pre-alpha as a project. So right now what we're doing is iterating toward an alpha, iterating toward code complete as we're hearing from customers that we're on the right track. Customers, also ecosystem members, community members, college kids who are hobbyists in Kubernetes, everybody really likes this idea. So that's kind of, in a nutshell, where we are and why we're on this journey.

ABDEL SGHIOUAR: I mean, you're saying you didn't want it to be viral, and for a project that didn't want to be viral-- I was pinging Nic yesterday. We were mentioned on InfoQ. There is an article about it on The New Stack. There is a long discussion on Reddit that you both have been responding to. So it's created some noise. There are some discussions, which is good.

JESSE BUTLER: Yeah, it is good.

NIC SLATTERY: Jesse's just being humble.

JESSE BUTLER: Well, I mean, I think we're all pretty humble because we-- like I said, we just got in a room and said, hey, what if we solve this problem this way? And it just has resonated. And I think one of the really exciting things is that it's not yet another direction to go in. It's more returning to basics and saying, how does Kubernetes work? What can we do?

And so that was sort of at the root of it. We said, well, let's not go build another huge ecosystem. Let's not build a new DSL. Let's not have a whole new toolchain. Let's just look at Kubernetes. And so the heart of this project is just that. It's, how do we make the KRM-- the Kubernetes Resource Model-- work for you, especially when you're just combining resources?

So at the heart of it, we have this idea that we'll dynamically create, register-- install and register CRDs for you on the fly based on your specification of this resource graph definition. And then we have a single controller that adapts, and it's an adaptive controller. So in place, it learns how to support those new resources. So we think that it really simplifies that whole model. And certainly, if you just want to use resource A and resource B together, you shouldn't have to write a whole new controller. So that's part of it.
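As a sketch of the mechanism Jesse describes, a resource graph definition looks roughly like the following. This is illustrative, based on the project's published pre-alpha examples; the field names and `${...}` expression syntax may change as KRO iterates toward alpha.

```yaml
# Minimal ResourceGraphDefinition sketch (illustrative, pre-alpha syntax).
apiVersion: kro.run/v1alpha1
kind: ResourceGraphDefinition
metadata:
  name: web-app
spec:
  # The schema section becomes a new CRD that KRO installs and
  # registers in the cluster on the fly.
  schema:
    apiVersion: v1alpha1
    kind: WebApp
    spec:
      name: string
      replicas: integer | default=2
  # The resources section is the hidden implementation behind that CRD.
  resources:
    - id: deployment
      template:
        apiVersion: apps/v1
        kind: Deployment
        metadata:
          name: ${schema.spec.name}
        spec:
          replicas: ${schema.spec.replicas}
          selector:
            matchLabels:
              app: ${schema.spec.name}
          template:
            metadata:
              labels:
                app: ${schema.spec.name}
            spec:
              containers:
                - name: app
                  image: nginx
```

KRO's single adaptive controller then watches for instances of the generated WebApp kind and reconciles the templated resources underneath.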

And then the other part is, as we do see a lot of interest in it, we're really trying to keep it focused. We just want this project to be this project. We don't necessarily need it to be the next new thing. It's really just Kubernetes. It's just looking at it a little differently.

ABDEL SGHIOUAR: Yeah. So I want to step back a little bit and try to answer a very specific question because like KRO has some stuff that makes it different from tools that already exist that try to solve the same problem. But I think that resource grouping, as you discussed, or resource composition, however you want to call it, why do you think that that's important? Why, in your mind, being able to group resources under an umbrella resource or a meta resource is important?

NIC SLATTERY: I can give the high-level answer, and I think, Jesse, you can get more into the technical details, if you like. But Jesse already spoke to two of our goals. The first was simplicity-- how do we solve this problem in the simplest possible way using whatever tools are already available to us? And then focus-- focusing the scope of the project so it doesn't snowball into this behemoth or some end-to-end solution that we're pitching. It's just focused on a particular problem, and that problem is platform teams who run applications on Kubernetes.

And they're tasked with providing a Kubernetes-based platform to their end users and a bunch of other services on top of that. And then they have to go and sell that platform to their end users, as well, which means they need to provide a simple end user experience. So that's the use case for the customers that we're talking to and people we're addressing with KRO.

And the challenge they were having is that they wanted to provide that simple end user experience, but they also wanted the monitoring capabilities of Kubernetes. So Kubernetes provides this thing called CRDs that lets you extend the Kubernetes API. And so we thought, OK, let's just use CRDs. Let's not invent a new construct.

Let's make everything native to Kubernetes so that the professionals working on Kubernetes can use all the benefits of it but also then provide a developer who may not know what Kubernetes is with an experience that they can wrap their heads around. So I know that's a vague answer to your question, Abdel, but it is very focused on the actual problem that we're trying to solve.

So what are our customers saying to us? What do they actually want us to do? I think our team at Google and Jesse's team at Amazon and our counterparts at Azure wouldn't have all come together around this unless all of us had customers saying, we want you to do exactly this, please. So that's really the impetus for the whole project.

JESSE BUTLER: I actually like your answer because it's not vague. It's pretty specific. This is a problem we have. And honestly, it's across the ecosystem. If you are a hobbyist, you'll encounter this. You want to get really deep into Kubernetes, and the next thing you know, it's like the meme of the woman standing with the 55 Ethernet cables going, "Help. I just want email." It's kind of like that.

So I like that answer. I think the complexity or even the vagueness comes from how you get there. There's just so many different ways to get here, and a lot of them come with other whole ecosystems and these other toolchains and other solutions. And in Kubernetes, we get back to that question that has been driving a lot of us for several years: if Kubernetes is a platform for platforms, then where is the platform?

NIC SLATTERY: Yes.

JESSE BUTLER: And it's like, we keep having these incredibly powerful abstractions, and Kubernetes just got so much stuff right. How do we leverage that with the next layer of abstraction? And so we have an enormous amount of awesome projects and tools and capabilities in the CNCF ecosystem, and those work with Kubernetes often as it is.

And really, that's where we said, hey, if we try to get down to first principles and really look at the ground truths of the problems that we're trying to solve, we could build these little primitives that extend on other things that already existed, like CEL. Let's use CEL. Let's make better use of it. So maybe we can avoid more DSLs or having yet another compiler-compiler sort of stuff.

So I think that's why that answer really works is that it sounds vague, but it's just a big, broad problem space. And so that's really the space that we're playing in and trying to just solve one little problem that's hopefully useful for a lot of use cases and other projects to integrate with.

NIC SLATTERY: Yeah, I agree with that.

ABDEL SGHIOUAR: And I think, in a way, the core of the problem that KRO is trying to solve, creating these abstraction layers, is not new. There have been tools trying to do that. I mean--

JESSE BUTLER: Absolutely.

ABDEL SGHIOUAR: --arguably Knative was trying to do that, specifically for serving applications, but that was kind of the purpose in a very dumb way, if you want. I'm not trying to compare the two. But I think that's one question that comes-- I mean, this is basically on the Reddit thread that I've been reading. If you would have to explain to somebody how KRO is different than what we have now, and what we have now are things like Terraform, which I think is easy to understand, but like Helm and kpt-- why would I use KRO versus what I already use? Arguably Helm, because Helm is server-side, technically.

JESSE BUTLER: Yeah. I think there's two questions there. I'd love to start with the Terraform question-- because Nic and I have both talked about all of this before-- and then maybe we'll switch over to how it differs from other Kubernetes-based tools. So I think Terraform is amazing. Terraform allows you to build modules that abstract abstractions, and you can do all of these incredibly powerful things.

There's a couple of promises with Terraform that, once you start using it, you realize, well, OK, that was a good idea-- so for example, portability. And we see this with Kubernetes, as well. Yes, the tool is the same no matter where you run the-- wherever your environment is. But everybody knows that if you're on AWS, and you're on Google with the same application, with the same types of resources, 80% of your code is different.

And that's where modules come in. You can just ingest these things, and now you simplify your life. So that model really works. And it worked with Puppet and Chef. It worked with Salt and Ansible. It worked way back in the day with bash scripts and duct tape. It really is a pattern that we have just reproduced repeatedly. And this is just sort of like, how do we bring that into Kubernetes?

And I think that's where we differ with this project because it's not really this project is different from Terraform. Kubernetes is different from using Terraform on a MacBook to drive infrastructure as code. So leaning into saying, people use Kubernetes because it is an open standard, because it's an evolving, de facto open standard, so we use it. And we don't then want to context switch out of it to do things like provisioning Kubernetes clusters or managing add-ons.

So I think that's the big difference of prior, other tools and coming into Kubernetes. That does open you up to your kpt, and Helm, and Crossplane, and all these other things that are also Kubernetes. So segue-- Nic, you can talk about that a little bit.

ABDEL SGHIOUAR: Yeah. So then how is that-- I mean, the Terraform distinguishing is pretty easy to make. So then Helm, kpt, Crossplane-- where do we draw the line, essentially?

JESSE BUTLER: Yeah.

NIC SLATTERY: Well, I'll address the Helm question. I guess basically, we wouldn't be interested in investing in something new if there weren't a gap. And like Jesse said, I think KRO is probably better defined by the things that we're not doing. We are not building a new templating language. We are not designing a new way to extend Kubernetes. We're just using CRDs, and we're using existing languages like CEL. And we had a proof of concept in Google that used Jinja, as well, which usually makes people cringe when I say that.

ABDEL SGHIOUAR: I am one of the weirdos.

NIC SLATTERY: But then if you look at an existing tool like Helm and then look at the use case that we are targeting, which, again, is platform teams building, essentially, custom APIs on Kubernetes for consumption by some end user, like a developer, Helm wasn't really designed for that use case. And again, this is getting into my personal opinion. I'm sure some people agree and some disagree.

Helm is really good for distributing third-party software. If I was going to start a company today, and I needed to distribute my software on Kubernetes, I would definitely package it as a Helm chart. Not much thought needs to go into that. It's very good for that use case.

It is brittle, I suppose, and complicated. And what we hear from customers is that it's hard to manage the lifecycle of Helm charts, and it's hard to track dependencies when you apply Helm to the use case of building a platform for your internal developer consumption. It's a complicated use case. It often requires customization, depending on the region that you're deploying a service in or the team that's consuming it, things like that.

So you end up with these layers and layers of Helm charts, often stored in repos, which then have different layers of permissions within a repo. And so then, when something goes wrong, trying to troubleshoot and figure out which layer that problem is coming from is a huge challenge. And this applies to many other tools out there. I'm using Helm as an example.

But the feedback was, it would be really nice if I could just come to my cluster to see the intended state and the actual state of all my individual resources, but then also how they are meant to be grouped together. And if there's things like dependency ordering, I want to be able to see that in the cluster because I'm a platform admin. I'm a Kubernetes professional. That's where I want to see my source of truth.

ABDEL SGHIOUAR: Yeah, and if I may add one maybe simple addition to that, I think that-- like what you were talking about, if I am a platform admin trying to sell an IDP to developers, telling them to use something that already exists in the cluster, like a resource graph definition, versus telling them, "Hey, there is this Helm chart here. Go install this CLI to be able to use this Helm chart"-- there is a huge difference between something that already exists and is easy to use with the tools you have, and something that you have to adopt, kind of.

So you mentioned something very interesting about resource ordering, or the dependency graph, or DAGs. I did some tests, as in I deployed the sample example, which is a deployment and service with an ingress, and it just worked out of the box, perfect. Can you talk a little bit about where you see that dependency graph becoming interesting as people use and adopt KRO going forward? Why do we even need it?

JESSE BUTLER: Yeah. I think with that simple sample application, you see why you need it because what you want to do is configure some resources against the values that exist in other resources. And so that's one of the basic things that KRO does is that when it looks at a resource graph definition that you give it, it automatically infers that dependency by config injection. If you need a foo, and you need a bar, and bar needs to be configured with the UUID of the foo, you need the foo first. And that's pretty easy to do.
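A sketch of the inference Jesse describes, using hypothetical Foo and Bar kinds: because bar's template references a field from foo, KRO can infer the edge foo → bar in the DAG without any explicit ordering directive. The expression syntax follows KRO's pre-alpha examples and is illustrative.

```yaml
# Inside an RGD's resources section (hypothetical kinds, illustrative syntax).
resources:
  - id: foo
    template:
      apiVersion: example.com/v1   # hypothetical CRD
      kind: Foo
      metadata:
        name: my-foo
  - id: bar
    template:
      apiVersion: example.com/v1   # hypothetical CRD
      kind: Bar
      metadata:
        name: my-bar
      spec:
        # This expression makes bar depend on foo: KRO must create foo,
        # wait for its status to be populated, then inject the value here.
        fooID: ${foo.status.uuid}
```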

What's not super easy to do, which is some ongoing conversations that we're having now, because one of the primary use cases here is, beyond simple web applications, we want platform engineers and small DevOps teams alike to be able to author their own cloud building blocks. We want them to be able to say, oh, give me a Kubernetes cluster and have all of the complexity of VPCs, and subnets, and security profiles and all that stuff just abstracted away.

If we're going to do that, you need to understand the dependency management there. And yes, again, if you have, oh, a subnet, and you need a VPC, so you have this cloud thing, and you have-- OK, cool, I can do configuration injection. Configuration injection will tell me the order there.

What if that isn't part of that? What if you're just saying, give me a foo, and give me a bar, and inherently, I need the foo before the bar, but I haven't done any configuration mapping. That's kind of an external data source that doesn't yet exist but we've left a door open for. And I think what's interesting there-- it gives you the ability to plug this into other state-aware configuration management solutions.

One of the really important things here is if you're building a primitive that should help everybody move the ball forward, you've got to make sure you have the things that they need to actually integrate. So that's a really good question. Big picture, I think, typically, the use cases that we're seeing-- and we already do have some users of the project, customers of ours and other folks that we've seen in the community, building fairly complex things.

And that is, at its root, the ability to say, well, OK, I need some infrastructure, and then I'm going to have a couple of services, and they're going to run on that infrastructure, and then I'm going to have my application which leverages those services. So there's some configuration elements that I need out of them.

You just put it in the RGD specification using that simple schema, which is another piece of the project that's really interesting to us-- as Nic said, make it simple: nice, human-readable, feels very Kubernetes-ish. You just define that RGD, define how these resources interact, and KRO sorts the rest of it out using a DAG. We have static analysis built in so that if you've done something that won't work, we'll tell you right up front: well, that doesn't work.

Looking further down the line at other verification methodologies that we can integrate so you can have system verification capabilities that leverage these primitives-- that's sort of the big picture. But yeah, I think it's super important and vital, that whole resource-- we abstractly call resource orchestration, having all this bundle of stuff. How does it work? You don't need to worry about it.

ABDEL SGHIOUAR: Yeah.

NIC SLATTERY: Can I add? Because Jesse just made me think of something because your answer was great. But it gets to how flexible KRO is because it is built as native to Kubernetes. So Jesse just described a few use cases that require some ordering of resources. And there's simple use cases, like I need a service account created before my deployment, which is just using Kubernetes-native objects. But then that same workflow applies to much more complicated use cases.

And Jesse started talking about subnets and other resources that are not native to Kubernetes, which then gets into how KRO works with other Kubernetes controllers. And I think Jesse was alluding to ACK-- AWS Controllers for Kubernetes-- at AWS, and there's KCC, Config Connector, at Google. And these tools allow you to use KRO to orchestrate not just Kubernetes objects, but any of your cloud resources. So in theory, because KRO is native to Kubernetes, any third-party vendor can build a Kubernetes controller that orchestrates their API. And then it would be compatible with KRO, as well.

ABDEL SGHIOUAR: Yeah. And so that's a really good segue to my next question because it's very important to stress the fact that KRO works with Kubernetes objects. So they could be native, which are the stuff that you have inside Kubernetes already, or any, technically, object that you install through a CRD-- KCC, ACK, ASO, whatever, right?

JESSE BUTLER: Yeah, correct.

ABDEL SGHIOUAR: Before we get there, I think I'm going to make a very, very stupid comparison, and please feel free to tell me, you are dumb here. Looking at how you define a resource graph definition in KRO, which for the people who are going to listen to us, that's how you define a new CRD, technically, putting it very simply. It feels to me like you are defining an API interface because in the schema, you're exposing the inputs or the parameters that the user will be able to define.

And then in the resources section, you are hiding away the complexity of the underlying resources. Some of the values in those underlying resources will be coming from what the user defines, and some will be set to default values. Is that a fair way of describing it?

JESSE BUTLER: Absolutely.

NIC SLATTERY: Yeah, I think that's fair, yeah.

JESSE BUTLER: Yes.

NIC SLATTERY: It's essentially a custom API that is native to Kubernetes. Yeah.

ABDEL SGHIOUAR: Cool. Good. I'm happy about it. So in that case, where do you see that becoming important for cloud-specific resources like defining an EKS or a GKE cluster? I am quite sure that EKS, similar to GKE, has tons of parameters, maybe hundreds.

NIC SLATTERY: Oh, yeah. Yeah.

ABDEL SGHIOUAR: Or maybe less. I don't know. But do you see the point? How do you see that as an advantage that KRO has?

JESSE BUTLER: Yeah, so I'll just loop back for a moment. I did blow right by how you would create those VPCs and subnets. So thank you, Nic, for keeping me grounded. And for any listener who asked, how did you do that? KRO provisions VPCs? No. You use it with other controllers-- ACK, ASO, KCC-- solutions that integrate with the cloud SDKs to get cloud resources.

So, yeah, you basically are saying, I'm going to create this-- so for example, we'll use cluster. It could be GKE cluster. It could be EKS, AKS, whatever. So if you have a cluster, maybe-- I don't think EKS clusters have thousands of settings. We have a fair amount, but there's a lot of resources under the covers. And those resources all have settings. Some of them are substantial.

So at the end of the day, if you have this single RGD that is cluster, it's probably a dozen or so resources, each with dozens of configuration elements. It's wildly complex. So one of the most important things in defining this as sort of-- to use your framing, to use this as an API to say, I want to expose an API to my-- for example, I'm a platform engineer. I want to expose an API to my developers for self-service cluster provisioning.

It should be something that we all have. That's a great goal to have. So a developer comes along and says, give me a cluster. There's a dozen resources under there that I don't want them actually even touching, necessarily, outside the context of a cluster. And then there's a bunch of resource configuration for each one of them. They have code to write. They have business logic to implement. They don't want to worry about that.

So one of the cool things here with the RGD, in defining that API, you get to define, as we had said, the resources-- their ordering, their dependencies, how configuration is shared among them, what new resources have values that get injected as configuration to other elements in this one thing called a cluster. So I give that to you. And you now have to set, what, 140, 150 configuration elements. No, thank you.

So what we do with the RGD is we also allow you to expose only the things that you want your resource consumer to have to worry about. And so to your point, the implementation of this API, almost like the back end, is a collection of resources. And you can set default configurations, and you can inject configuration from other resources.

So you basically now are in this position to say, OK, with this cluster example, give me a VPC, some subnets, some security groups, some IAM resources, whatever. Put this all together. Create the cluster with them. So now you're creating resources with something like KCC or ACK. You're coming through, creating these resources and injecting configuration where you need to, setting default configuration where you want to.

And at the end of the day, you're exposing it through a new CRD that KRO will dynamically create and install and register in the cluster for you where your downstream developer just sets name and region, and that's it. They just set those two things because, in the back end of that API-- I did air quotes there for podcast listeners-- the back end of the API is actually like an API, so you're implementing a lot of complexity and relationship management and stuff in the back end, and you just expose the simplest view that you see fit.

And of course, for power users, maybe you have two flavors. Maybe they can set a handful of configuration variables that give them a little bit more control. So this is entirely in your hands as far as authoring a resource graph definition. So yeah, that's my take. I think that's a really good question.
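For concreteness, the developer-facing instance Jesse describes could be as small as the following, assuming the platform team's RGD defined a Cluster kind and exposed only two fields. The API group shown is the one KRO's pre-alpha examples use for generated CRDs; treat it as illustrative.

```yaml
# What the downstream developer actually writes (illustrative).
apiVersion: kro.run/v1alpha1
kind: Cluster
metadata:
  name: team-a-cluster
spec:
  name: team-a
  region: us-west-2
```

Everything else-- VPCs, subnets, security groups, IAM, default configuration-- is resolved by the resources defined in the RGD's "back end."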

NIC SLATTERY: Can I throw in another example?

ABDEL SGHIOUAR: Yeah.

NIC SLATTERY: Because your question is really good, which is, why is this important, or why does this make life better for managing cloud resources outside of Kubernetes? And it, again, gets to, how do we make life easy for a platform engineer who has to sell something to their end users? So we need to make the end user's life easy, as well.

And from a platform engineer's point of view, it's, OK, I have this Kubernetes platform because my developers are running their applications on Kubernetes, and I've built all these GitOps pipelines that are native to Kubernetes. Oh, and hey, now I have the tools to use those same GitOps pipelines and these same monitoring tools with any cloud resources, whether it's related to Kubernetes or not. So if my end users, a developer, needs a storage bucket, I can offer them that service as something that's native to Kubernetes.

The developer doesn't need to know about Kubernetes, by the way. They just, like Jesse said, input the name of a storage bucket in the region. They don't need to know anything about all these GitOps pipelines and all of the platform engineering stuff. But from a platform engineering point of view, it's, OK, now my developers can provision a storage bucket, and the way that that gets provisioned is compatible with all the other tools that are used to run their Kubernetes-based applications.

ABDEL SGHIOUAR: Yeah. Yeah, and if I may add there, as you folks were talking, it just made me think of something. If you have written even the simplest thing, like the deployment-service-ingress example, in the past, one of the most annoying things is that sometimes there is the same value that you have to repeat everywhere. Like in a deployment, it's the same value in the name, in the template, in the label. It's the same thing in the service, same thing in the ingress. If you can just set it once in the RGD definition--

JESSE BUTLER: Exactly.

ABDEL SGHIOUAR: --then you don't have to worry about it. It just gets trickled down through a variable. And I'm putting air quotes here as a variable. And then it just works.

JESSE BUTLER: Yeah. There are nice little syntactic things like that that make using KRO every day sticky, which I do. I create wildly scalable Hello, World applications all the time. And it's sort of like having a favorite programming language. You don't think about UX or human design when you think, oh, I love Go, oh, I really like JavaScript. But that plays a big role.

So just these little things that we've found through approaching the problem from first principles and then building up what we think would work. We have little things like that. So I think there's a third use case that is one of my favorites, which is, maybe I'm just a one-person shop selling WordPress. I can simplify my life by using Kubernetes because we know it's much easier to do that than running big monoliths on compute VMs.

So I'm using Kubernetes. Even if I don't have a downstream consumer, even if it's just for myself, I can take all of the complexity and all of the configuration things, and I'm always typing the same defaults and these-- but sometimes they're a little different, so I have to write sort of the same string 85 times a day-- and just bake it into an RGD, and now I just create them on the fly. And I have a little bash script that runs, and I just make money. Yay. So there's so much of this where it's just, if you look at it just like this little, primitive building block to build other building blocks, it's very powerful and very exciting.

ABDEL SGHIOUAR: Yeah. I love how it always comes back to bash at some point, eventually.

JESSE BUTLER: For me, it always does. It used to be Korn shell, but I finally transitioned.

ABDEL SGHIOUAR: OK. Welcome to the right side.

[LAUGHTER]

All right, awesome. Well, folks, this was a really insightful discussion. What's next? What's coming for KRO? Where do you want people to check it out, and what do you want them to do with it?

JESSE BUTLER: Yeah, yeah. Come join us. One of the reasons we made a lot of noise about it is that we want everybody to know this is a super, super open project. We think there's a really exciting future for it. And it is fully vendor-agnostic. And we're sort of adopting CNCF governance right out of the gate, just because that's what people are familiar with in this ecosystem.

So super open, meritocracy. Everybody's welcome. We have a community meeting every other week on Wednesday mornings, 9:00 Eastern Time. It is tomorrow. So if folks are interested-- or actually, wait, I don't know when this podcast goes out. So it is--

ABDEL SGHIOUAR: No, this is not going to come live tomorrow. Yeah. OK.

JESSE BUTLER: It's March 5. So that one will be in the past by the time you hear this, but they're every other week from there. Also, yeah, we have the KRO channel in the Kubernetes Slack. You can come join us. Obviously, github.com/kro-run/kro is the project. Please come take a look, open issues, file PRs. We're really happy to have you contribute and join the conversation.

NIC SLATTERY: Yeah. I'll just say that there's a lot of reporting that typically opens with, oh, Google and Amazon and Microsoft are collaborating on this thing, which has been true. But we don't want to own it. It's a community project. Anyone is welcome to contribute. We do have a handful of independent contributors already, which is great. But yeah, by no means is this just a Google, Amazon, and Microsoft thing.

ABDEL SGHIOUAR: Yeah, it's very important. The project is completely independent. Yeah, as long as you guys continue doing good work, I'll continue advocating for it. So you have my backing there.

JESSE BUTLER: We appreciate that.

ABDEL SGHIOUAR: Thank you very much, Jesse and Nic. Thanks for your time.

NIC SLATTERY: Thanks, Abdel.

JESSE BUTLER: Thanks so much. Thanks for having us.

ABDEL SGHIOUAR: Yeah, thank you. Have a good one.

NIC SLATTERY: You, too.

KASLIN FIELDS: Thank you for that interview, Abdel. I've been looking forward to learning about KRO or Kubernetes Resource Orchestrator. That's what it stands for, right?

ABDEL SGHIOUAR: Correct.

KASLIN FIELDS: But I missed it because I was out, so you're going to have to tell me all about it.

ABDEL SGHIOUAR: Oh, yeah. So it's quite interesting how this came to exist because it was literally a quick discussion with Nic last year at KubeCon North America in Salt Lake City. And up to that time, I did not know about KRO. I had no idea that this project existed. And then we started chatting, and then I realized, oh, snap, this is quite big. And then I started digging into it.

Yeah, so essentially, in a nutshell, as you might have heard in the interview, it is not Kubernetes-native for now, because you still need to install a CRD, but it's as native as it can be in Kubernetes for you to do resource grouping. So you can make higher-level abstractions of complex Kubernetes objects and make them available to developers as a simple object with a "simple schema," quote unquote, as in simple fields.

So you can do complex deployments. You can hide-- if you have a super complex object with a lot of options, you can set some options to default values and only expose what the developers are allowed to change, things like that. So you can think about it as resource grouping, essentially.

KASLIN FIELDS: Meta.

ABDEL SGHIOUAR: Pretty much, yeah.

KASLIN FIELDS: So it's taking existing native Kubernetes resources, and it's a CRD on top of them that allows you to group them into deployable units?

ABDEL SGHIOUAR: That's the beauty of KRO, native and not native, right?

KASLIN FIELDS: Yeah. Interesting.

ABDEL SGHIOUAR: As far as Kubernetes is concerned, the object could be native to Kubernetes-- so the underlying object could be native to Kubernetes, like a deployment, a service, whatever. But they could also be external cloud resources.

KASLIN FIELDS: And the goal is to give you a deployable unit. So this is something that you define and then deploy?

ABDEL SGHIOUAR: The goal is to give the platform administrator a way to group resources and make them easy for developers to deploy. So there are two personas.

KASLIN FIELDS: What about the management side of it? I assume that it makes coordinating management of the components easier.

ABDEL SGHIOUAR: Correct. So there are two personas. As a platform admin, you would define those abstractions using KRO, using this thing called a Resource Graph Definition, RGD. And in your resource graph definition, you have two sections: one section that says, this is how the object will appear to the end users, the name of the object, the schema, the fields that the user is supposed to define; and then another section that says, once the user creates an object of this type, these are the objects that will be created under the hood.
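For readers who want to see the shape of those two sections, here is a minimal sketch of a resource graph definition, modeled on the project's published examples. Treat the exact field names and the simple-schema syntax as approximations and check kro.run for the current format:

```yaml
apiVersion: kro.run/v1alpha1
kind: ResourceGraphDefinition
metadata:
  name: web-app
spec:
  # Section 1: how the object appears to end users --
  # the kind name and the simple schema they fill in.
  schema:
    apiVersion: v1alpha1
    kind: WebApp
    spec:
      name: string
      image: string | default="nginx"
      replicas: integer | default=2
  # Section 2: what actually gets created under the hood
  # when a user creates a WebApp.
  resources:
    - id: deployment
      template:
        apiVersion: apps/v1
        kind: Deployment
        metadata:
          name: ${schema.spec.name}   # set once, reused everywhere
        spec:
          replicas: ${schema.spec.replicas}
          selector:
            matchLabels:
              app: ${schema.spec.name}
          template:
            metadata:
              labels:
                app: ${schema.spec.name}
            spec:
              containers:
                - name: ${schema.spec.name}
                  image: ${schema.spec.image}
    - id: service
      template:
        apiVersion: v1
        kind: Service
        metadata:
          name: ${schema.spec.name}
        spec:
          selector:
            app: ${schema.spec.name}
          ports:
            - port: 80
```

Note how the `${schema.spec.name}` variable is defined once and trickles down into every label and selector, which is exactly the repeated-value annoyance Abdel describes earlier in the episode.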

KASLIN FIELDS: Interesting. I mean, custom resource definitions could be kind of like that too, right?

ABDEL SGHIOUAR: And that's the beauty of KRO because it will register this as a CRD. So you don't need to write a CRD from scratch.

KASLIN FIELDS: Interesting.

ABDEL SGHIOUAR: And then the operator-- so the component that will act on those objects and create them, or delete them, or manage them-- that essentially depends. If the objects are native to Kubernetes, it's the existing controllers in Kubernetes. If they're third-party cloud objects, then you will have to install external operators to be able to manage those objects, right?

KASLIN FIELDS: Right.

ABDEL SGHIOUAR: So KRO is quite agnostic in that sense.

KASLIN FIELDS: And I do want to mention also that there's a pattern I'm seeing more and more, which is that extensibility in Kubernetes is increasingly happening outside of Kubernetes itself. Kubernetes itself is really big. If we let it get too much bigger, it's just going to be unwieldy, un-extensible. The process for getting something into mainline Kubernetes at this point takes so long.

ABDEL SGHIOUAR: Yes.

KASLIN FIELDS: And so the community is leaning more and more toward these types of custom resource definition extensibility types of features, rather than built into Kubernetes features.

ABDEL SGHIOUAR: And this is also kind of the philosophy of KRO: keep things simple. So it's borrowing a page from the Kubernetes book of keeping things simple. So it kind of sits between, I want to build a CRD and an operator, but I don't want to jump right away to do it, so let me see if KRO can solve this problem for me before I start crafting my own. So it's not going to solve all the use cases of people needing to write their own custom resource definitions and operators, but it might solve part of them.

So you can think about it as solving the lowest common denominator between the use cases for CRDs and operators. And then if you need to do something more complex, you can still go the route of building your own thing.

KASLIN FIELDS: I feel like we usually talk about CRDs and operators kind of in the same breath because people don't think that much about the implementation of the thing within Kubernetes, the operator, versus the implementation of what you want to run, the definition of it, the custom resource definition.

ABDEL SGHIOUAR: So in my head-- I might be wrong. In my head, a CRD is sort of useless. A CRD is, by definition, a definition, right?

KASLIN FIELDS: Yes.

ABDEL SGHIOUAR: So a definition without something to act on it is pretty much useless.

KASLIN FIELDS: Exactly. But people use the terms interchangeably--

ABDEL SGHIOUAR: Correct, yeah.

KASLIN FIELDS: --most of the time.

ABDEL SGHIOUAR: And I think that's because, most of the time, a CRD has an accompanying operator that goes along with it to make that CRD useful.

KASLIN FIELDS: It should.

ABDEL SGHIOUAR: Or it should. Yeah, that's a very good point. Yes, it should.

KASLIN FIELDS: If it doesn't, then it's really useless.

ABDEL SGHIOUAR: Yes, pretty much.

KASLIN FIELDS: Then it's a definition of something that doesn't exist in Kubernetes.

ABDEL SGHIOUAR: Exactly. And so, yeah, it's pretty cool. The discussions have been pretty illuminating. I helped write one of the blog posts. I did some talks about it. I did a webinar last week about it. But I think talking to the people who are driving the project at AWS and Google-- and by the way, keep in mind that this is between Google, AWS, and Microsoft. So it's not a one-cloud-provider effort. It's a common effort between the major cloud providers.

KASLIN FIELDS: Yeah, that's very interesting. I noticed when the blog post came out that it was about it being a collaborative effort between all of the major cloud providers. And I was like, is it really? And then as I've seen more and more stuff about it, it's like, yeah, it really is.

ABDEL SGHIOUAR: Because all of them have been thinking about doing the same thing. And I think at some point, they just realized they're trying to solve the same problem, so why not just solve it together?

KASLIN FIELDS: And this allowed you to talk to someone awesome, Jesse Butler. I can't believe we've gotten 10 minutes into the chatter without me mentioning I used to be on the same team as Jesse Butler at a previous job.

ABDEL SGHIOUAR: Oh, I didn't know that.

KASLIN FIELDS: And so I'm extremely jealous that you got to talk with him.

ABDEL SGHIOUAR: Yeah, he seems cool. We didn't meet in person. We just talked online. But I like the way he has this very easy way of describing and explaining things. It's pretty cool.

KASLIN FIELDS: He is very good at that and very deep, too. He's had a long technical career in the Linux space and really knows what he's talking about. I always enjoy learning from Jesse.

ABDEL SGHIOUAR: Nice. Awesome. Yeah, it was pretty cool. It was a very good conversation.

KASLIN FIELDS: Nice. Interesting. So one last question-- were there any particular use cases that you all discussed or that you want to imagine for this? I'd love an example.

ABDEL SGHIOUAR: So yes. So in my talks, one of the very basic examples I give is the example of a standard deployment. Imagine you have a web app you need to deploy. The minimum you have to create in Kubernetes is a deployment, a service, and an ingress to make that thing accessible-- or maybe a deployment and a service of type LoadBalancer, depending on what kind of load balancer you want.

KASLIN FIELDS: This is true.

ABDEL SGHIOUAR: And so if you actually go to kro.run, there is an example where they use KRO to hide the complexity of all of this. So instead of having this long YAML file which contains your deployment, your service, and your ingress, you create an RGD, a Resource Graph Definition, that hides that complexity, and then your application becomes a nine-line kind of thing instead of, whatever, 20, or 30, or 40 lines.
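The nine-line object Abdel mentions would be an instance of whatever kind the RGD author exposed. A sketch of what that end-user manifest might look like (the kind name `WebApp` and its fields are hypothetical here; they depend entirely on the schema the platform admin defined):

```yaml
apiVersion: kro.run/v1alpha1
kind: WebApp
metadata:
  name: my-app
spec:
  name: my-app
  image: nginx:1.27
  replicas: 3
```

Everything else -- the deployment, the service, the ingress, the matching labels -- gets stamped out by KRO from the RGD, so this short manifest is all the developer touches.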

And this is actually-- so somebody published a couple of articles about KRO on The New Stack. And the example that they gave is WordPress. So imagine you are an admin that just needs to deploy WordPress websites for your customers. And essentially, all deployments are the same, except a few things change between deployments. So using KRO, you can just templatize that deployment, in a way, and only change the things that are different between your customers, between the different deployments you need to do.

KASLIN FIELDS: Interesting. Yeah, I have done that, and I would like to have that. Excellent. I have something to do.

ABDEL SGHIOUAR: To be very fair, I mean, there are other tools that can do that. Helm is notorious for that.

KASLIN FIELDS: I was wondering about that.

ABDEL SGHIOUAR: Yeah, Helm solves a completely different problem. There are multiple differences, one of them being that Helm doesn't have reconciliation. If the deployment gets screwed up and you need to redeploy it, you have to trigger Helm manually.

KASLIN FIELDS: It's more on the definition side than the operator side, Helm.

ABDEL SGHIOUAR: Correct, correct.

KASLIN FIELDS: It's like a library of things that you can deploy to your Kubernetes cluster, but it doesn't do anything to the cluster itself.

ABDEL SGHIOUAR: It's a packaging tool. So KRO is more on the server side. So you have that reconciliation loop, as well. So if somebody goes and accidentally deletes something that needs to be there, then you have the benefit of the reconciliation loop of Kubernetes, and Kubernetes will just recreate the object.

And I'm not talking here about scaling down a deployment or deleting pods-- I mean if somebody deletes the deployment altogether. Kubernetes, out of the box, is not going to reconcile that. But because KRO is built as an operator, it will be reconciled.

KASLIN FIELDS: Nice. All right.

ABDEL SGHIOUAR: And so if you go on kro.run, there are a bunch of examples. This is just one simple one, and then there are more complex ones. There is one actually on the AWS side, which is using EKS to deploy an LLM. So it's super complex, with a lot of objects and stuff. So if you are looking to contribute, looking to lend a hand, I think the team will be very happy to have somebody helping out.

KASLIN FIELDS: Yeah. Join me in checking this out next time.

ABDEL SGHIOUAR: Yeah.

KASLIN FIELDS: All right, thank you, Abdel. I'm glad that I got to learn about that and glad that you got to meet someone awesome, in addition to the awesome people that you already know.

ABDEL SGHIOUAR: In addition to you.

KASLIN FIELDS: It's what we do.

ABDEL SGHIOUAR: Exactly.

KASLIN FIELDS: [LAUGHS]

ABDEL SGHIOUAR: Thank you.

KASLIN FIELDS: Thank you.

[MUSIC PLAYING]

That brings us to the end of another episode. If you enjoyed the show, please help us spread the word and tell a friend. If you have any feedback for us, you can find us on social media at "Kubernetes Pod," or reach us by email at <kubernetespodcast@google.com>.

You can also check out the website at kubernetespodcast.com, where you'll find transcripts, show notes, and links to subscribe. Please consider rating us in your podcast player so we can help more people find and enjoy the show. Thanks for listening, and we'll see you next time.

[MUSIC PLAYING]