#262 October 29, 2025

GKE 10 Year Anniversary, with Gari Singh

Hosts: Abdel Sghiouar, Kaslin Fields

GKE turned 10 in 2025! In this episode, we talk with GKE PM Gari Singh about GKE’s journey from early container orchestration to AI-driven ops. Discover Autopilot, IPPR, and a bold vision for the future of Kubernetes.

Do you have something cool to share? Some questions? Let us know:

News of the week

KASLIN FIELDS: Hello. And welcome to the "Kubernetes Podcast" from Google. I'm your host, Kaslin Fields.

ABDEL SGHIOUAR: And I am Abdel Sghiouar.

[MUSIC PLAYING]

KASLIN FIELDS: Google Kubernetes Engine turned 10 this year. In this episode, we talk with outbound product manager Gari Singh about how the product has changed over the years, and some of his favorite things that are happening now. But first, let's get to the news.

[MUSIC PLAYING]

ABDEL SGHIOUAR: The Knative project reached the CNCF graduated maturity level. The project was created at Google in 2018 with contributions from IBM, Red Hat, VMware, and SAP, and was accepted as an incubating project by the CNCF in 2022. The project aims to remove much of the complexity of running modern workloads on Kubernetes by handling infrastructure tasks like autoscaling, routing, and event delivery. Congratulations to the community for reaching this milestone.

KASLIN FIELDS: llm-d introduced release 0.3. The new release provides a fast path to deploying high-performance, hardware-agnostic, easy-to-operationalize inference for large language models at scale. llm-d is an open-source collaborative project between major cloud providers aimed at laying the foundation for LLM inference at scale on Kubernetes.

ABDEL SGHIOUAR: vLLM introduced new updates to its semantic router. vLLM is an open-source library for running large language model inference efficiently. It's available as a Python package or a container that can be used with Kubernetes. The semantic router is a mixture-of-models router that intelligently directs OpenAI API requests to the most suitable model from a defined pool, based on semantic understanding of each request's intent.

The new updates include a dashboard for visualization, a paper on why to use the semantic router, and a YouTube channel. The roadmap for the router is also publicly available.

KASLIN FIELDS: The CNCF introduced the Certified Meshery Contributor. This certification validates technical proficiency in contributing to the Meshery open-source project through written assessments. The certification consists of five distinct exams, each dedicated to one of Meshery's major architectural domains. Meshery is recognized as the CNCF's sixth-highest-velocity project, and with this new certification it aims to offer a thoughtfully designed contributor onboarding experience.

ABDEL SGHIOUAR: A new open-source plugin for Headlamp, created by Kubernetes contributors, is designed to complement Karpenter. With the plugin, Headlamp's UI gives users real-time visibility into Karpenter's activity. It shows how Karpenter resources relate to Kubernetes objects, displays live metrics, and surfaces scaling events as they happen. You can inspect pending pods, review scaling decisions, and edit Karpenter-managed resources with built-in validation.

KASLIN FIELDS: And that's the news.

[MUSIC PLAYING]

Hello, and welcome to the show, Gari Singh. We're going to be talking about the GKE 10-year anniversary today. But, Gari, first, why don't you tell us a little bit about yourself?

GARI SINGH: Hey, Kaslin. Yeah, great to be here. Yeah, so Gari Singh. I'm one of the product managers for GKE. Been here at Google for, I don't know, four years or so. Yeah, my role is kind of an outbound role. So I do lots of events, talk to lots of customers. And I guess, on the other side, I'm probably one of the better testers of the product. I like to use the product a lot. So great to be here to talk about all the cool stuff we've been doing.

KASLIN FIELDS: I was wondering how to work it in that you are someone that we can trust to give your true opinion. [LAUGHS]

GARI SINGH: Yes. Yes. I guess I do not tell a lie.

KASLIN FIELDS: And the product has benefited because of it. I always love meetings that we get to be on together internally, where we're talking about the product, and features that we're thinking of creating, and all kinds of things like that because I feel like you always have great insights into what the community really needs and what it's like to really work with GKE.

GARI SINGH: Yeah, definitely. I mean, it's great to work with customers, hear stuff at various communities, go to meetups. If you're out there, you hear the stuff. And then, yeah, I like to try everything. If you're in a talking role, it's always better to be able to know what you're really saying behind the scenes. And I think doing is the best way to learn.

KASLIN FIELDS: Yeah, I feel like any new thing, I can trust that you have probably already tried it out.

GARI SINGH: Yeah. Yeah, yeah. There's visual learners, oral learners, and then there's doing learners. I'm a doing learner.

KASLIN FIELDS: So you've been here at Google for about four years. But where did your journey with GKE begin?

GARI SINGH: I mean, I guess there's a funny story. The first time I obviously heard of GKE was way back when in, well, Kubernetes in 2014, whatever, 10 years ago. I guess I can say where. We're on a podcast. So I was at IBM at the time. And we were building a bunch of stuff on containers. IBM had its own container service and everything. And then all of a sudden, Google, and Red Hat, and a few others, were like, well, we're going to do Kubernetes. And I was like, well, this is great. At least there's a standard out there.

And GKE was, I think, one of the first managed services out there. So I used it back in the old days, when I think we still had-- I think the API server was a single VM. We didn't have all the cool features that we have today. And then most recently, fast forwarding before I got here, I was running clusters all over the world in my previous job. And no cloud is in every region, so we used GKE in a number of places.

And then it was pretty cool coming here. I was looking, given my background, and I was like, what do you want to do next? I was like, I like Kubernetes, thought it'd be cool to come to Google and work on Kubernetes. So that was kind of my mission. And I got here.

KASLIN FIELDS: Can relate. And those early days of Kubernetes during the container orchestrator wars, as we call them, were an interesting time because it was kind of all about containers, really. In 2013, 2014, it was like there's this new virtualization isolation technology that was getting all of this hype and excitement for very good reason. But it was like, how do you run that at scale? And everybody was trying to figure that out at the same time.

GARI SINGH: Yeah, it was like, how do you run it at scale? How do you get things to talk to each other? There were a lot of single-machine solutions. There were some original versions of-- Docker Compose, I think, was out there. Docker had the original Swarm, not the Swarm mode that came in later. Everybody was building it. Amazon had a container service or something like that.

And it was like, all right, I got it. How do I scale this thing? How do I get this thing across multiple machines and all that? And I think that's where Kubernetes really-- the Docker company helped push that. And then I think Kubernetes really helped push it to the mainstream for production and beyond just using it for development on your laptop.

KASLIN FIELDS: Yeah, with things like minikube. That's a whole other topic-- minikube and K3s and how people actually develop on their machines.

GARI SINGH: As a side thing, I always keep reminding-- and I keep forgetting to submit it. Maybe I'll do it for KubeCon EU, is what if we had-- maybe it'll be a good segue into the cool stuff we do. But what if we had had minikube on day one? I think it could've been really different, right? Because there was a lot of struggles in setting up Kubernetes in the early days. Because, as we know, most people really don't want to deal with networking and don't know networking.

And so getting it set up really put the focus on that rather than necessarily the deploying and the apps and all the cool other types of resources we had. If we had minikube on day one or Kind on day one, it would have been a much more pleasurable experience, I feel.

KASLIN FIELDS: Yeah, it's always a challenge for folks trying to get started with Kubernetes and learn it for the first time: it's really a system that's meant for large enterprise-scale use cases. So testing it out on your own, it's a little difficult to figure out how to set it up in a way that mimics the production environments that are built with it. And those technologies have really grown. Not really what we're here to talk about, but very important.

GARI SINGH: Maybe it will play into what we've been doing for the last 10 years with GKE, trying to make it much easier to run this thing at scale, right?

KASLIN FIELDS: That's true. So how do you think GKE has changed over the years, from the early days of containers being the hotness, and everybody trying to figure out how to orchestrate them, to Kubernetes being released by Google as open source, and working with the Linux Foundation to create the CNCF? And then GKE came out about a year later. The 10-year anniversary of Kubernetes was last year. So we've been around for about 10 years now. And I think things have changed a lot since then. What do you see?

GARI SINGH: I mean, I think a lot of the-- like we said, a nice segue to the-- I think the early days were still all about just getting the thing up and running, mapping it into the cloud. The initial thing with GKE was it was nice that, if you did want to run Kubernetes on multiple machines, it was a quite easy way to get started.

But again, I think as you looked at it, you still had to understand things like networking and setting all this stuff up. And I think as GKE has evolved, we're now at the point where, if you click with Autopilot, which came out in 2021-- I mean, you literally can have-- it's a marketing term, I try not to use too many marketing terms, but you really can have one-click production clusters.

You push a button, or you run a gcloud command-- gcloud container clusters create-auto. And now we've got that, which I think then really starts to move the focus beyond running the infrastructure itself and how all that connectivity runs, to more focus on workloads.

And Kubernetes, in general, I think, has started to focus more on workloads. And I think a lot of what we've been doing on the GKE side has been focusing on the workloads themselves, whether it be AI workloads, stateful workloads, and then trying to make all the day-two operations go away, like automated upgrades, automatic upgrades. And you've even seen the change of that since the first time GKE had it, which was still OK, better than doing it yourself.

But now, you can really trust the thing. And I leave clusters running and they upgrade and things seem to work, that hands-off experience, which really, then, I just start caring about trying new workloads, trying new parts of it, rather than like managing that infra.

KASLIN FIELDS: That is a good callback to really the hardest thing about Kubernetes in the early days was getting it going, I feel like, at least for me it was. Kelsey Hightower's Kubernetes the Hard Way is very famous for a reason, because it came around just at the right time when people were trying to learn about Kubernetes and how it could help them orchestrate containers at scale.

And nobody knew how to run the thing because it's this kind of obscure technology really rooted in the Linux kernel. So in order to deploy it in the early, early days, you had to understand the underlying Linux operating system components that actually made it possible, because you had to set up each one individually using the Linux system.

GARI SINGH: Exactly. Exactly. The kernels didn't support things and whatever. I think when you get to the point where you just know what the thing should do, it is an orchestrator of compute. The underlying stuff, can we just get that out of the way? And I think we've done a really nice job on GKE of getting to that point, where you don't worry about nodes, node pools, and whatever. You literally describe it all when you deploy. When you deploy your workload, we have those things called compute classes now or even on Autopilot, right?

You're like, I want to run on an Arm machine-- annotate your workload, and we get the right Arm machine for you, or whatever it may be-- or an AMD or Intel machine. You don't have to necessarily worry about pre-provisioning all that stuff and how do you size all that stuff. That's all the minutia.
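As a rough sketch of what that looks like in practice (the Deployment name and image are illustrative, not from the episode), the architecture is requested on the workload itself via a nodeSelector, and an Autopilot cluster provisions matching nodes:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: arm-app                       # hypothetical workload name
spec:
  replicas: 2
  selector:
    matchLabels:
      app: arm-app
  template:
    metadata:
      labels:
        app: arm-app
    spec:
      nodeSelector:
        kubernetes.io/arch: arm64     # ask for Arm; Autopilot provisions matching nodes
      containers:
      - name: app
        image: nginx                  # must be an arm64 or multi-arch image
        resources:
          requests:
            cpu: "500m"
            memory: "512Mi"
```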

So I think we've really done a great job of bringing the power of container orchestration, but giving it the ideal cloud experience, which is supposed to be that the cloud is quote unquote "magic," and just scales for you. But how do you tie those two things together? And I think we've put a lot of stuff into open source to make that happen, but then really done a great job of making sure that that ties out really well to Google Cloud infrastructure.

KASLIN FIELDS: In the early days of Kubernetes, I don't know, maybe the mid days--

GARI SINGH: Mid days.

KASLIN FIELDS: --I feel like Kelsey Hightower's favorite thing to say was that, one day, we're not going to care about Kubernetes. It's going to be all about serverless. And I think it's really interesting how the ecosystem has just grown this natural spectrum between how deeply you want to be involved with that underlying Linux system. How much do you want to be a sys admin who's really hands on with your machines and how they're configured, versus how much do you just want to say I'd rather pretend that there aren't servers and just run my workloads? And there's just this whole spectrum now of what level of control you want over that underlying hardware.

GARI SINGH: Yeah, exactly. And I think, as we've changed over time, we've gone from-- I use the term prescription without restriction. Things should just work. Here's the best practices. Here's all the stuff that we've hardened. Here's everything we've learned over all the years. Just give me that.

And then if you so desire, because you just want to or because you have a specific need, how do you go down and tweak a specific parameter if you need to? And I think that's probably where, obviously, Kubernetes shines through. It sort of does everything. You can run anything on it. And I think that's where we've found that fine line on GKE. How do you make it work for most use cases out of the box but then allow people to customize and tweak where necessary?

KASLIN FIELDS: I was talking to someone at an event last week who said that they ran Kubernetes clusters in the past, weren't at that particular moment. But they were saying that their favorite thing about Kubernetes was its extensibility, is that sometimes things would-- I think he said, sometimes things would feel kind of unfinished or really hard to use. But it's kind of by design because you want to implement that flexibility into the system. It's a platform for building platforms. So it's all about what you want to do with it.

And when we were talking about the 10-year anniversary of open-source Kubernetes last year, we also talked a lot about how the future of Kubernetes is all about that extensibility. We need a level of extensibility where it's flexible enough to do the things that people need it to do without being overly complicated, because it has gotten really complicated. And it's really hard to keep the user experience simple. And GKE faces that too.

GARI SINGH: Yeah, most definitely. And I think that's where you start to-- where's that fine line of where do you have the right levels of extensibility, especially in the AI workloads? So I think, obviously, pushing things like DRA was great in the open source side. But then that also allows us to better optimize our experience for using things like TPUs and make those have similar experiences so that they don't look like foreign add-ons.

Or having a standard set of APIs and make auto scaling work-- the cluster autoscaler has its capabilities, but then we added things like node auto provisioning, now custom compute classes. So there's all these levels of extensibility, but you're still getting the core. So we give you an optimal experience, but we're still based on the fundamental core things that are in the open source ecosystem.
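For flavor, a custom compute class is itself just another Kubernetes resource. A minimal sketch might look something like the following; the class name, machine families, and fallback order are purely illustrative rather than a definitive configuration:

```yaml
apiVersion: cloud.google.com/v1
kind: ComputeClass
metadata:
  name: cost-optimized                # hypothetical class name
spec:
  priorities:
  - machineFamily: n2d                # try Spot capacity in this family first
    spot: true
  - machineFamily: n2                 # fall back to on-demand N2 machines
  nodePoolAutoCreation:
    enabled: true                     # let GKE create matching node pools on demand
```

A workload would then opt in by selecting the class by name, for example with a cloud.google.com/compute-class nodeSelector, and the autoscaling machinery works through the priority list on its behalf.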

KASLIN FIELDS: That's been the most interesting thing for me as Kubernetes has evolved during this new AI world that we live in. It's going back to that high focus on the underlying infrastructure because those AI workloads are just so resource intensive that it really matters what kinds of underlying hardware accelerators you're using for those workloads and if they're optimized for those kinds of hardware accelerators. So I think Kubernetes itself has had to make some interesting shifts to meet that flexibility of, you want it to be as easy to use as possible, but also, you want to give people as much control as possible.

And I think that's an interesting thing where that's also, I think, an interesting area where sometimes what do you do on a managed platform like GKE, versus what do you have to put in the core? Because the other interesting thing about AI is that a majority of people on the AI side, most people, the end users will end up being data scientists, or training, or whatever. Kubernetes is a means to an end for running AI workloads.

The main thing they want to do-- DeepMind wants to train Gemini. And they want to leverage Kubernetes for its power of scaling, orchestration, et cetera. But they don't want to have to go in and configure every single kernel parameter, or the overlay networking that makes things like NVIDIA's cross-GPU networking or TPUs work in there.

So we put that level of extensibility in there. And then we try to expose, on the managed side or the GKE side, that experience that's just, deploy your workload; describe it like this; and then we'll use the raw power of Kubernetes to push out that infrastructure. You know what I mean? You didn't have to come in and figure out how to create your 10,000-GPU node, how to network all that together, how to put everything together.

You just described, here's my thing. I need to shard the model, so we take advantage of some of the higher-level software for doing that-- things like Jetstream or JAX and whatever libraries and frameworks. And how do you map those and make that easy to use? And I think that's been the interesting focus, I think, on GKE moving beyond making the infrastructure stuff more automatic with Autopilot to now making it easy for people to just deploy these workloads.

And then of course, the most advanced people can go in and tweak it if they need to. But in a majority of cases, just kubectl apply -f some-yaml and it should work.

KASLIN FIELDS: That's the thing with Kubernetes is that it's not just one hole to fall into.

GARI SINGH: Yeah.

KASLIN FIELDS: It's so many rabbit holes that you can fall into along the way. So which rabbit hole do you want to fall into? And which ones do you want to ignore the existence of?

GARI SINGH: Sure. And I think the thing that we've learned from our customers over time is-- it's interesting because you start to see the rise of platform engineering, where Kubernetes was obviously a great thing for platform engineers because it did a lot of that, as we said, underlying cloud orchestration. And then they built their layers on it. But now you're like, I want to build more of my stuff that's specific to my business. I don't want to have to build add-ons to Kubernetes to do that.

So I think that's where, from the GKE side also, we've looked at how do we add multi-cluster stuff? How do we add additional management of this stuff, multi-cluster routing, config stuff? How do we build more core things into our managed offering, to offload a lot of that, so that people can, again, just focus on the things that matter for running their applications? Because after years of doing that, you're just like, can I? Yeah, I think I just want to offload as much as possible to you guys.

KASLIN FIELDS: Some people do and some people don't.

GARI SINGH: Some people, yeah. It just depends, right? Yeah.

KASLIN FIELDS: Yeah. It's about which rabbit holes you want to fall into. So all of this really is about how GKE has changed over the years and what kind of environment it's existing in now, and how it's changing for the AI sphere. But what do you see for GKE in the future?

GARI SINGH: I think we have some really interesting stuff coming up. And we've touched upon a few of the things. I'm really excited about a lot of the work that we've-- so we've had to push the boundaries of scale due to AI. And then at the same time, we talked about people don't really want to-- they just want to tell us and make this stuff happen.

KASLIN FIELDS: Like AI.

GARI SINGH: Yeah, like AI. I mean, so it's interesting. We talk a lot about running AI on Kubernetes and running AI on GKE. Obviously, we've had a lot of focus on that. I think now we look at, how do we leverage AI more in terms of running GKE and operating in your workloads?

KASLIN FIELDS: kubectl-ai.

GARI SINGH: Yeah, I mean, the canonical example is I want to deploy an application. And people will say, hey, I'm going to deploy an HPA, and I'm going to say the threshold is 70% CPU. Or maybe they're lucky and they put in some custom metric. But at the end of the day, probably what they wanted to say for their web app or whatever it may be, is I want to have a response time of like 150 milliseconds. Figure out what that means. Does that mean I should scale vertically? Does it mean I should scale horizontally? What's the right type of compute?
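For reference, the "70% CPU" approach Gari describes is the standard HorizontalPodAutoscaler pattern. A minimal sketch, with the target Deployment name and replica bounds purely illustrative:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-app                        # hypothetical HPA name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web-app                      # hypothetical target Deployment
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70         # the "70% CPU" threshold from the example
```

The point of the example is that this expresses a proxy (70% CPU) rather than the actual goal (say, a 150-millisecond response time), which is the gap Gari suggests smarter, AI-assisted automation could close.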

And I think, as we start moving more in that direction-- whether you call that AIOps, or autonomic computing, whatever-- there are partial versions of this out there; some serverless environments have that. So I think that's really where we go. We continue to work on scale. Because obviously, the bigger you can scale, the better. You just take that out of the equation. So don't worry about scale. We'll get you the compute that you need, when you need it, how you need it, et cetera. That's stuff like we've been doing with compute classes and things like that.

But then, how do we just move to deploy your workload out there and we'll handle a lot of it for you, the next evolution of what we did with Autopilot? And then I think the next thing on top of that is, then, how do you manage that or how do you observe that? You've had the Kubernetes console, the GKE console. You've got dashboards, Grafana, Prometheus, whatever you have.

But it's really all about the data. And then can we have some agents just looking at what's actually running and starting to automate some of those things? Here's what we've noticed. We've noticed these differences in there. And then also, give you more of a-- and I hate using buzzwords, to be honest with you, but I do believe in this.

So you can have a lot of agents running for people. But can we also just give you your own more personalized-- I think we have some of our PMs talk about giving you a more personalized experience. Why couldn't you just dynamically generate the UI or the experience in the model that you wanted? We can host it somewhere. But here's how you want to look at your Kubernetes distribution, not how we, GKE, thought you should look at it.

So I think it really comes down to AI for running the workloads, agents for monitoring and managing that stuff, but then giving tailored experiences around just what's going on with your workloads, et cetera, which can definitely change the game, I think, from a platform engineering perspective.

KASLIN FIELDS: That is such an important point, I think, in the world of AI. Anytime we're having a conversation about AI in the world of Kubernetes, I always want to pause for a second and be like, are we talking about running AI on Kubernetes or are we talking about running Kubernetes with AI?

GARI SINGH: Yes.

KASLIN FIELDS: Because there's so much potential to make Kubernetes administrators' lives so much easier by using AI to help you understand how the system is running, because that's just so hard to understand right now. There's so much data for an administrator to parse through in terms of the logs of the workloads, the logs of the clusters, all sorts of pieces, the networking. And getting all of that into a view is very, very difficult.

So I think there's a lot of interesting potential there. I think observability is a really important aspect to think about in the world of AI and Kubernetes, and what's next in the future for GKE and Kubernetes in general.

GARI SINGH: I mean, there's obviously the stuff that goes out for app development folks who are like, oh, generate my YAMLs and all that stuff. And obviously, AI is fantastic for code generation and YAML generation. I admit, I did it today. I don't have to write it. I'm like, I need an app that does this. Just deploy it. And I have that workflow always working. But like you said, there's lots of data.

The thing about Kubernetes is, I guess the technical term is, it's an edge-event-driven asynchronous system, which makes it hard sometimes when you-- so you run kubectl apply, you deploy a pod. And then it just comes back and says, OK. But really, all it did was validate that the API server accepted your pod spec.

Then, another task is deploying that. And you've got the logs there, and then you've got the node side. If you're on GKE, those tie to-- did the autoscaler kick in, and then did that mean I had to deploy whatever underlying compute, and maybe there was a log, blah, blah, blah? And how do you connect all those things together?

Well, we have all those sources. And you could hard code a lot of this. But I think where, to me, a lot of the AI stuff becomes fantastic is if we have the right data sources, whether that's model context servers or just APIs, AI gives you a lot of that connectivity. And you can also streamline the logic. Most people know event-driven systems and response systems are fundamentally a bunch of events come in, data is pulled, and you write a ton of if-then-else logic.

Fundamentally, even rules engines, in the end, they're just a higher-level language that still is generating if-then-else statements. But what if we can avoid doing that with putting more natural language in there, where it's actually more figuring this stuff out on the fly, tailored to you? To me, that's the real kind of power that we have with running Kubernetes with AI or using AI to run Kubernetes.

And we know a lot of the best practices, and we've tried to do that. We have a lot of those. But people don't always want to get an alert here, or be looking at a console, or whatever. They'd rather have something that can automatically process that. And then how do we avoid having to hard code every rule? I think that's where AI makes this stuff much more streamlined and possible to try.

KASLIN FIELDS: And flexible.

GARI SINGH: And flexible. Right.

KASLIN FIELDS: Yeah. Just like the Google motto, making the world's information universally accessible and useful, making your Kubernetes clusters maybe not accessible in terms-- the words have some issues there.

GARI SINGH: Accessible to platform administrators or those who have the proper role.

KASLIN FIELDS: Yes.

GARI SINGH: But I mean, there's some good stuff. I mean, there's lots of tools out there. I think we've put out kubectl-ai that's out there. We've got an MCP server you can try out for GKE now. I think we'll get beyond trying to just-- I think a lot of the work right now people are doing is, like I said, generating stuff. And that's great. That's generative AI.

But the best part is when you can have the generative part happening in response to the data that's processing and the events coming in. And that's where you get into the agent mode, or agentic mode. And I think the more of that we can do, the more we can codify-- without writing every single rule-- a lot of the best practices that we have, and start to provide those recommendations.

And then when people look at those recommendations, well, here's my whatever agent that's going to-- yeah, we're going to turn that on. And when it sees these recommendations, it's just going to start acting upon those. And it becomes this pretty powerful, self-contained system, where we're observing it; we have agents observing it; and then you may also have your own experience that you've dynamically created, tailored to what you may specifically be doing.

KASLIN FIELDS: I'm sure that a number of SREs and platform engineers in the audience are, however, at the moment clutching pearls. Like, I don't want agents running my production systems.

GARI SINGH: Yeah, I think, there's always this notion that-- I mean, that's the trust over time that you start to build on.

KASLIN FIELDS: And it's got to be rooted in data. There's got to be checks.

GARI SINGH: Yeah. And the thing is, there's still so much more to do. It's just offloading some of that work that you've had to code up a number of things to do before. Now, you're focusing more on the outer layers that matter. But there's lots of mundane tasks that people have to do. I mean, even stuff like agents that generate reports on snapshots and points in time. Those are probably your starting points, more static.

But then you really could get to the point where-- I'm excited about combining AI and stuff like that with the ability to resize pods. That's a great feature that's coming out-- that we can now resize pods; we can have the VPA running.

But imagine, we could just have some agents running also specific to your kind of characteristics. People don't know what's the right requests and limits to set. Maybe turn it on in your development environment in the first place, and it'll find out the right kind of things to set, and push those to production.

And then someday in the future, we'll leverage that to really help with the-- because we can scale the infrastructure, but it's how do we figure out what the right way to scale the apps is and bind that to the infrastructure? And I think that's what's always problematic for people. And I think that's where we can really help with the agents.
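As a sketch of the "turn it on in your development environment" idea, a VerticalPodAutoscaler can recommend, or directly apply, requests and limits for a workload. The names here are illustrative, not from the episode:

```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: my-app                         # hypothetical VPA name
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app                       # hypothetical target Deployment
  updatePolicy:
    updateMode: "Auto"                 # or "Off" to only surface recommendations
```

Starting with updateMode "Off" in a dev environment matches the pattern Gari describes: collect recommendations first, then carry the right requests and limits into production.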

KASLIN FIELDS: This reminds me of trying to explain non-determinism to engineers. I feel like a lot of engineers really struggle with the concept of non-determinism because it's just kind of like, let the computer figure it out. Offload some of that mental load and let the computer figure out some of the parts of it. I was doing a project with PDF parsing, where it's like, humans wrote a bunch of documents, and they share them as PDFs. And we need to derive rules from those. But they're written in natural human language. And that's just really hard to parse.

Writing that out deterministically is going to be so many if statements, so many kinds of things to consider, whereas you can just be like, AI, figure it out. And it kind of reminds me, too, of the serverless-to-not-so-serverless spectrum in the world of infrastructure, and the many rabbit holes that we could all go down in the world of Kubernetes. It's which ones do you want to go down? Where is the value for you to focus with this infrastructure, with your systems, or with your software? And what can you kind of offload elsewhere?

GARI SINGH: Yeah, exactly. Exactly. That's a good analogy, I think, on the PDFs.

KASLIN FIELDS: Yeah, that one's really good.

GARI SINGH: I had the pleasure of running an OCR product. But imagine that you tried to do some forms and they're like, if they don't match, how do you handle the exceptions? And I think that's where that training came in on those old models. It would do this fuzzy matching. And I think that's where this stuff really starts to help us bring these things together that may not be obvious.

I mean, AI is very good at figuring out how things correlate, what looks similar, and so on. And as long as we can plug in those right data sources, it's using the tools to do it. You still have to have the right data. So Kubernetes has to have the right data. GKE, we have to have the right data. And then you as the platform team or whatever have to have the right data specific to your applications or your infrastructure in tying that all together.

But you probably want to focus more on your stuff, and then hope that we can now tie that to have that full trace all the way through to the low-level stuff when you need it.

KASLIN FIELDS: Yeah. And so I have two last questions for you. Number one, we've talked about a bunch of features of GKE and open-source Kubernetes in this. What is your favorite new feature in GKE and/or open-source Kubernetes?

GARI SINGH: They're sort of tied together. My favorite feature in open source is definitely in-place pod resizing. I think that's what we call it here. What's it called?

KASLIN FIELDS: IPPR.

GARI SINGH: IPPR, but then it has some other name-- whatever the name is in open source.

KASLIN FIELDS: Does it?

GARI SINGH: It's something in open source.

KASLIN FIELDS: In open source?

GARI SINGH: I forget. Yeah, it's whatever that feature is.

KASLIN FIELDS: Oh yeah. It does.

GARI SINGH: Yeah, yeah. We should know. We can look it up and edit it for the thing. But we call it IPPR-- yeah, the ability to resize pods. And I think that enables great things like boost and whatever. And then from a GKE perspective, we introduced this thing that we call container-optimized compute, which allows us to dynamically resize the actual underlying compute.

I think, when you combine those things together, it's pretty nice. We can scale horizontally super fast, but we can also boost applications super fast. So that becomes a core primitive that's needed to do all the AI stuff that I talked about that we should hopefully be getting to.
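For the record, the upstream feature they're reaching for is in-place Pod resize (the InPlacePodVerticalScaling feature gate in open-source Kubernetes). A minimal sketch of the per-container knob it adds, with the Pod name, image, and values purely illustrative:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: resize-demo                    # hypothetical Pod name
spec:
  containers:
  - name: app
    image: nginx                       # illustrative image
    resizePolicy:
    - resourceName: cpu
      restartPolicy: NotRequired       # CPU can be resized without a restart
    - resourceName: memory
      restartPolicy: RestartContainer  # resizing memory restarts this container
    resources:
      requests:
        cpu: "250m"
        memory: "256Mi"
      limits:
        cpu: "500m"
        memory: "512Mi"
```

With a policy like this, resource changes can be applied to the running Pod in place (newer releases expose this through the Pod's resize subresource) rather than by recreating it, which is what makes the fast "boost" behavior Gari mentions possible.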

KASLIN FIELDS: Yeah, it makes me so sad when I hear Kubernetes folks, platform engineers, talk about clunkiness in auto-scaling processes with Kubernetes clusters. I'm like, we have so many tools for this now. And especially right now, the platform is getting so much better at being able to auto scale super smoothly, which also terrifies a lot of the SREs and platform engineers out there because you don't want to have an incident where it's just scaled out of control and now you're paying crazy costs.

GARI SINGH: Right. Or it didn't scale and you have issues.

KASLIN FIELDS: Oh yes, that one too. So it's all about controlling the scaling. But even with controlled scaling, it should be super, super smooth. So I'm excited to see those features too. And a bit of a spicy question for you because I enjoy asking you spicy questions-- what feature do you most want to see added to GKE?

GARI SINGH: That's an interesting one. Well, I guess the one that I would really love-- so we talked about auto scaling or whatever. But I would love for us to be able to just call a Kubernetes API-- so you deploy your thing. You just have this endpoint. And then we just create clusters on the fly.

KASLIN FIELDS: Oh, interesting.

GARI SINGH: Because, right now, you still have to physically create a cluster. And yes, we've made it super simple. But it can still take a couple minutes, whatever, to start that cluster. But imagine a world where you're like, hey, here's my workload. And you've decided I want to scale up in a new region all of a sudden because you needed a new AI accelerator. What if I just don't have to even create a cluster? I just deploy a workload, tell you what region I want it in. If we have a cluster there, we deploy it on the cluster. If we don't, we create a new cluster.

We could do that today, but it takes like six minutes. But what if we could do that in less than like 20 seconds? That, to me, is like the ultimate in-- it's still Kubernetes, but it's basically serverless. It's like a Knative cluster. You know what I mean? When you deploy a Knative workload on a Kubernetes cluster, it sort of scales. Now I'm just pushing it as a Kube API, and a cluster just magically spins up and down, and you don't worry about clusters anymore.

KASLIN FIELDS: Yeah, I thought, at first, you were going to say just a Kubernetes API that I can deploy my workloads to. And I was like, serverless? But yeah, it kind of all comes back together to breaking down the walls more between Kubernetes and the concept of serverless, making it smoother for users to interact with the pieces that they want to interact with, and just not worry about the pieces that they don't want to. It's an interesting idea.

GARI SINGH: I think, because there's overloaded terms where they call it serverless Kubernetes or however you call it. But I think, just at the end, the engine being Kubernetes is still, I think, an important piece to me, or I feel that it's an important piece, because that gives you access to all of the compute infrastructure, whether you need it directly or not.

And so you can run any workload. But now if I could just pull it out a little bit, that I don't have to worry about-- that I can just create these things on the fly in a timely fashion, that just makes everything go away. Upgrades become easier, everything. You get ephemeral clusters. Anyway, we're probably out of time, but you can get to this pretty cool dream. We wanted faster node startup. I want basically faster cluster startup anywhere.

KASLIN FIELDS: I think a lot of people want that. I hear that a lot at events, especially in multi-cluster conversations. I want it to be easier to manage all of my clusters.

GARI SINGH: Exactly. And selfishly-- I do demos of apps all the time and cool features all the time. I don't want to have to have my clusters. I just want to sit around and say, somebody's like, hey, can you show me how to do blah, blah, blah? Yeah, sure can. Boom. Here it was. Just put this in your YAML. Done.

KASLIN FIELDS: And you're good to go. Awesome. Thank you so much, Gari, for hanging out with me today and talking about GKE.

GARI SINGH: Hey, thanks for having us. And looking forward to seeing everybody at KubeCon and some more GKE events, hopefully.

KASLIN FIELDS: Yeah. Join us at KubeCon.

[MUSIC PLAYING]

That brings us to the end of another episode. If you enjoyed the show, please help us spread the word, and tell a friend. If you have any feedback for us, you can find us on social media @kubernetespod or reach us by email at <kubernetespodcast@google.com>.

You can also check out the website at kubernetespodcast.com, where you'll find transcripts, show notes, and links to subscribe. Please consider rating us in your podcast player so we can help more people find and enjoy the show. Thanks for listening, and we'll see you next time.