#252 May 15, 2025

KubeCon EU 2025

Hosts: Abdel Sghiouar, Kaslin Fields, Mofi Rahman

This episode features a curated selection of conversations from the KubeCon EU 2025 show floor. We dive into the rise of platform engineering, explore what’s up with AI workloads on Kubernetes, get updates on core Kubernetes components, and hear some truly unique user stories, like using Kubernetes on a dairy farm!

Do you have something cool to share? Some questions? Let us know:

News of the week

NAIS at NAV, with Hans Kristian Flaatten and Audun Fauchald Strand

Platform Engineering, with Max Körbächer and Andreas (Andi) Grabner

Kubernetes at LinkedIn, with Ahmet Alp Balkan and Ronak Nathani

LLMs on Kubernetes, with Mofi and Abdel

SIG etcd with Ivan Valdes

Open Source Kubernetes, with 

Dairy Farm Automation & Banking with Kubernetes, with Clément Nussbaumer

Being a First-Time KubeCon Attendee, with Nick Taylor

KASLIN FIELDS: Hello and welcome to the Kubernetes Podcast from Google. I'm your host, Kaslin Fields.

MOFI RAHMAN: And I'm Mofi Rahman. This is our KubeCon EU 2025 episode. We usually like to get the event episodes published closer to the actual event date, but we also were participating in Google Cloud Next, which ended up being right after KubeCon. But as we like to say, better late than never.

KASLIN FIELDS: Yes, you and Abdel recently had an amazing time at KubeCon EU 2025 in London. I wish I could have been there. And you ran a series of live streamed interviews straight from the conference floor. I know that the energy was high and you chatted with some really interesting folks across the community. So, in this episode, we're bringing a curated selection of those conversations, diving into the rise of platform engineering, exploring some cutting edge technologies, getting updates on core Kubernetes components, and hearing some truly unique user stories like running Kubernetes on a dairy farm.

MOFI RAHMAN: But first, let's get to the news.

KASLIN FIELDS: The Cloud Native Computing Foundation has announced the release of the Automated Governance Maturity Model. Developed by CNCF's Technical Advisory Group for security, or TAG Security, this model aims to help organizations evaluate and enhance their governance policies, particularly focusing on automation in an era of rapid development and increasing AI system usage. The goal is to ensure systems operate according to organizational expectations, comply with regulations, and meet strategic objectives by embedding automation into traditional governance tasks.

MOFI RAHMAN: Kubernetes 1.33 release feature blogs are continuing to come out. So, if you are interested in learning more about the features in 1.33, make sure you check out the blogs on kubernetes.io. Some cool features with new blogs include dynamic resource allocation, or DRA, image volumes, and horizontal pod autoscaling.
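[Editor's note: as a rough illustration of the image volumes feature mentioned here, this is roughly what a pod mounting an OCI artifact as a read-only volume looks like in 1.33. This is a hedged sketch; the image references and paths are made up for illustration.]

```yaml
# Sketch of the Kubernetes 1.33 image volume feature (KEP-4639):
# mounting an OCI artifact, e.g. model weights, as a read-only volume.
# Requires the ImageVolume feature gate; registry paths are illustrative.
apiVersion: v1
kind: Pod
metadata:
  name: model-server
spec:
  containers:
    - name: server
      image: registry.example.com/inference-server:latest
      volumeMounts:
        - name: weights
          mountPath: /models
          readOnly: true
  volumes:
    - name: weights
      image:
        reference: registry.example.com/llm-weights:v1
        pullPolicy: IfNotPresent
```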

KASLIN FIELDS: The CNCF recently featured a new blog called Understanding Kubernetes Gateway API, a modern approach to traffic management. As we've talked about before on this show, the Kubernetes Gateway API is a really cool update that enables several improvements to the way you can manage ingress for your workloads on Kubernetes. So check out the blog.

MOFI RAHMAN: Open Observability Con has been renamed to Open Observability Summit, to avoid potential confusion with other similarly named events. While the name is changing, the purpose remains the same: to bring together developers, operators, and business leaders to explore and enhance open source observability projects and practices. The event takes place in Denver, Colorado on June 26, 2025.

KASLIN FIELDS: And that's the news. One of the most talked about topics at KubeCon this year was undoubtedly platform engineering. It's all about enabling developers and streamlining operations. Abdel kicked off his KubeCon live streams by speaking with several experts who are deeply involved in building and defining these platforms, and they had some fantastic insights to share. Note that these were recorded live on the KubeCon show floor, so there is a little bit of noise and some audio issues. We hope you'll be able to bear with us through those. First up, Abdel was joined by Hans and Audun from NAV, the Norwegian Labor and Welfare Administration. They've developed an impressive internal platform called NAIS (pronounced "nice"), which we did an episode on in January. There's a link to that in the show notes. And they've been instrumental in fostering a significant platform engineering community in Norway. Let's hear about their journey, how they're using OpenTelemetry, and their plans for the platform.

ABDEL SGHIOUAR: You have a talk tomorrow about platform engineering. Can you tell us a little bit more about that?

AUDUN FAUCHALD STRAND: Yes, well, uh, it's a talk about two things, basically. It's a talk about the platform we have at NAV, called NAIS, and also a community we built around platforms in Norway, where we bring people together from the whole public sector in Norway. I think we have a few thousand members and 60, 70 companies. So they have a Slack, basically, where they can meet and share experiences.

ABDEL SGHIOUAR: Awesome. Cool.

AUDUN FAUCHALD STRAND: Yeah.

ABDEL SGHIOUAR: Cool. So we actually covered NAIS once on the podcast. So there is an episode, we'll make sure to link it somewhere. You know, we had, uh, I don't remember, who did we have? We had you?

AUDUN FAUCHALD STRAND: No, it was Froden and Yanni.

ABDEL SGHIOUAR: Yes, yes, I remember now. That was a while ago. Um, and so, Hans, can you explain to people quickly what NAIS is?

HANS KRISTIAN FLAATTEN: Yeah, so NAIS is a Kubernetes-based application platform. Um, it started on premises, running Kubernetes the hard way, and then transitioned over to Google Cloud, running on top of GKE. So that's the gist of it.

ABDEL SGHIOUAR: And NAIS is an abstraction layer on top?

HANS KRISTIAN FLAATTEN: Yes, so we have built our own Kubernetes operator, our own manifest. This was way back, uh, when this was all very brand new.

ABDEL SGHIOUAR: Before Knative.

HANS KRISTIAN FLAATTEN: Yes.

ABDEL SGHIOUAR: Yeah.

HANS KRISTIAN FLAATTEN: Yes, and it's been serving us so well, because back then it was only on premises, which allowed us to just transition the applications over to the cloud-based environment, add on more features, and really not change the way that our developers were working and configuring their applications.

ABDEL SGHIOUAR: Awesome.

HANS KRISTIAN FLAATTEN: Yeah.

ABDEL SGHIOUAR: And so last year I was at a meetup in Bergen, because these guys are, well, Hans is based in Bergen.

HANS KRISTIAN FLAATTEN: Yeah, you are based in.

ABDEL SGHIOUAR: You are based in Oslo. Yeah, now I remember. Um, you had a talk about integrating OpenTelemetry into the NAIS platform in such a way that it becomes abstracted, in a way, from the developers, right? Can you talk more about that?

HANS KRISTIAN FLAATTEN: Yeah, so the manifest allows us to have uh additional add-ons as well. So you can have databases on Cloud SQL, and storage. And one of the things that we added recently was to allow auto-instrumentation with OpenTelemetry. So that's been sort of one of my projects, coming in and taking a little bit more charge of the observability part of the platform, and then very quickly figuring out that we need to standardize on OpenTelemetry. I really believe that this was a standard worth implementing. We have had uh different observability tools in the past, and sort of constantly had to change out the libraries, change out the agents. So this allows us to instrument once and then uh run anywhere, or display the data anywhere. So we are using the OpenTelemetry Operator for Kubernetes, and this is just adding on the metadata and the configuration, which injects the agent depending on which runtime you have. So that would be Node, Java, .NET, Python. Yeah.

ABDEL SGHIOUAR: Yeah, so that's actually pretty cool, because last time I looked into OpenTelemetry and talked to a lot of people, one of the main complaints was that it's very complex.

HANS KRISTIAN FLAATTEN: Yeah.

ABDEL SGHIOUAR: So auto-instrumentation essentially means people can just add some annotations, labels or whatever, and then your runtime will figure out what kind of programming language, or?

HANS KRISTIAN FLAATTEN: Well, the developers specify that. So it's very, very simple, they just say, oh, observability, enable auto-instrumentation, and by the way, my runtime is Java. And then we set up the rest. So they don't need to know how the agent functions and all the different parameters, setting the resource name and resource namespace and all the extra attributes, etc. That's all set up for them, and it just starts producing OpenTelemetry metrics. And then the developers can add additional metrics, or instrument their application with the business logic. Because this only gets you, sort of, oh, you have this number of requests, and making traces for that. It doesn't really understand the business. That's where the developers need to add on an extra span, add some extra attributes to say, oh, by the way, this is the core functionality that we're doing, this is the customer, or this is the business process that's happening right here, and just adding that on to the data that's being produced.
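[Editor's note: as a rough illustration of the auto-instrumentation flow Hans describes, and not NAIS's actual manifests, the OpenTelemetry Operator works from an `Instrumentation` resource plus a single pod annotation. The resource names, endpoints, and images below are made up.]

```yaml
# Cluster-level Instrumentation resource consumed by the OpenTelemetry
# Operator; the collector endpoint is a placeholder.
apiVersion: opentelemetry.io/v1alpha1
kind: Instrumentation
metadata:
  name: default-instrumentation
spec:
  exporter:
    endpoint: http://otel-collector.observability:4317
  propagators:
    - tracecontext
    - baggage
---
# The platform then only needs to add one annotation to the workload's
# pod template; the operator injects the Java agent automatically.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
      annotations:
        instrumentation.opentelemetry.io/inject-java: "true"
    spec:
      containers:
        - name: my-app
          image: registry.example.com/my-app:1.0
```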

ABDEL SGHIOUAR: Awesome, cool. And so, can you tell us what's next for NAIS? Like, okay, pushing it through the government, pushing it through other agencies. How are you approaching that?

AUDUN FAUCHALD STRAND: Well, that's what's next, now trying to make it a platform for more than just NAV.

ABDEL SGHIOUAR: Yeah.

AUDUN FAUCHALD STRAND: Uh, and that means, first of all, we have to solve some technical problems. We have to scale out and make it a proper multi-tenant platform, where they have different projects in GCP, so there's proper isolation for the data. And then we need to figure out how to get through all the red tape. There's loads of red tape for sharing stuff in the government, it turns out. And the most difficult thing, which you probably know better than us, is that we have to learn how to sell things.

ABDEL SGHIOUAR: Oh yeah.

AUDUN FAUCHALD STRAND: I don't know how to do that, but other people know. We have no idea. We just come there and say, well, this is good and this is bad and please use our product and I don't know.

ABDEL SGHIOUAR: Um, Kubernetes 1.27 introduced this thing called in-place pod resize, where you can update the resources on a pod, the resource requests specifically, on the fly, without having to restart the pod. And that's very useful for Java applications.

HANS KRISTIAN FLAATTEN: Yes, yes.

ABDEL SGHIOUAR: Have you been looking at that?

HANS KRISTIAN FLAATTEN: We have definitely been looking at that. Uh, which version of GKE is it in?

ABDEL SGHIOUAR: So, the capability is in Kubernetes 1.27.

HANS KRISTIAN FLAATTEN: Yeah.

ABDEL SGHIOUAR: So it should be available in GKE. I think right now it's beta.

HANS KRISTIAN FLAATTEN: Yeah.

ABDEL SGHIOUAR: But then, what's available in Kubernetes is being able to update the resources.

HANS KRISTIAN FLAATTEN: Yeah, yeah, yeah.

ABDEL SGHIOUAR: But you need an operator to do that.

HANS KRISTIAN FLAATTEN: Yes, yes.

ABDEL SGHIOUAR: And so what we're doing is we're actually proposing that upstream to the VPA, the Vertical Pod Autoscaler.

HANS KRISTIAN FLAATTEN: Oh yeah. That's really interesting.

ABDEL SGHIOUAR: So there is a proposal, and that's coming, but it's still a proposal. It has to be merged.

HANS KRISTIAN FLAATTEN: Yeah, yeah. No, so currently, we have built this uh developer portal where the developers can see their usage and, sort of, adjust it, but that's not the same functionality.

ABDEL SGHIOUAR: Yeah, yeah, that's completely different.

HANS KRISTIAN FLAATTEN: Yeah. Yes, because we have the same challenge there, that we need a lot of resources starting up, and then it would be really beneficial to scale that down or reduce it once the application is up and running. So yeah, we have been looking at it, but we haven't been able to adopt it yet.
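[Editor's note: for context on the feature discussed above, this is roughly what in-place resize looks like in Kubernetes 1.27+. It is feature-gated, and the pod, image, and values here are illustrative.]

```yaml
# A container can declare a resizePolicy so CPU/memory requests can be
# changed without a restart (InPlacePodVerticalScaling feature gate).
apiVersion: v1
kind: Pod
metadata:
  name: java-app
spec:
  containers:
    - name: app
      image: registry.example.com/java-app:1.0
      resizePolicy:
        - resourceName: cpu
          restartPolicy: NotRequired
        - resourceName: memory
          restartPolicy: NotRequired
      resources:
        requests:
          cpu: "2"      # generous for JVM startup
          memory: 2Gi
# After startup, the requests can then be lowered in place, e.g. with
# recent kubectl versions:
#   kubectl patch pod java-app --subresource resize --patch \
#     '{"spec":{"containers":[{"name":"app","resources":{"requests":{"cpu":"500m"}}}]}}'
```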

ABDEL SGHIOUAR: Awesome. All right. Awesome, cool. Well, thank you very much for your time guys.

HANS KRISTIAN FLAATTEN: Awesome.

KASLIN FIELDS: Continuing the exploration of platform engineering, Abdel had a great chat with Andy and Max, the authors behind the book Platform Engineering for Architects. They offered Abdel some insights into the philosophy of platform engineering, why it's considered an evolution of DevOps, and how to approach building these platforms as a product by focusing on developer pain points.

ABDEL SGHIOUAR: You guys wrote a book about platform engineering, right?

ANDREAS (ANDI) GRABNER: That's right.

ABDEL SGHIOUAR: So, can you tell us a little bit about it?

ANDREAS (ANDI) GRABNER: Yeah, it's called Platform Engineering for Architects, and the tagline is, uh, crafting modern platforms as a product. So I think the key point is that you want to build a product and also treat it as a product. That means, before we build any type of product, we first want to understand who our end user is, what their pain point is, and what problem we solve for them, then build something that solves their problem. And then, ideally, if we really solve it, they want your platform and you're not forcing it on them.

ABDEL SGHIOUAR: Awesome. And so, um, maybe Max, I've been hearing about this thing called platform engineering for a while, and it seems like, if you believe the internet, DevOps is dead and platform engineering is alive and kicking. Uh, but what is platform engineering? Can you give us a definition?

MAX KÖRBÄCHER: Well, I mean, it's like the evolution of it, right? So it uses the same methods, the same procedures and approaches, but it tries to get a little bit away from this "you build it, you run it." That's maybe true for the platform itself, but it's more like being an open environment, uh, an integration layer for any kind of specialty. You often have developers as the target users.

ABDEL SGHIOUAR: Yeah.

MAX KÖRBÄCHER: But I mean, we also wrote in the book, it doesn't need to be the developer itself, right? With all the drive on AI and so on, it can also be data scientists, it's people around security, um, but it's also a question of how we can optimize the environments for operations.

ABDEL SGHIOUAR: Yeah.

MAX KÖRBÄCHER: And that all comes together in platform engineering: break up the silos, put the silo into some horizontal uh approach, but open it up. This is the key success factor of it.

ABDEL SGHIOUAR: Yeah, because it seems to me like, and again, excuse my ignorance, because I'm not an expert on the topic, it seems to me like platform engineering is just putting back the barrier the way it used to be, back in the days of system administration and application development. Because the whole point is that as a developer, you have a platform that you deploy on, well, a self-service platform, all that stuff, right? But it's kind of like going back to that kind of upper layer and bottom layer. Is that true? Is that a fair way of describing it?

MAX KÖRBÄCHER: I can't say anything different about that, but I would say the perspective is more like you open up for any kind of people who want to adopt it, right? It's not an extremely technical area. In some of the development platforms, you have a lot of uh documentation. You have product owners going in.

ABDEL SGHIOUAR: Yeah.

MAX KÖRBÄCHER: How many product owners in the past went to the server? Zero. Nowadays, they can at least have a direct interaction and see what's going on between development and operations, maybe what DevOps or SRE are doing, but they can also see the business perspective if they want to have it.

ABDEL SGHIOUAR: All right. And so, Andy, you've been talking about the book being about building, or considering, or treating platforms as a product, right? So I assume this means product managers, developers, ops people, sort of creating a company within a company, or a group within a group.

ANDREAS (ANDI) GRABNER: Yeah, not a company within a company, but building a product that has internal users, right?

ABDEL SGHIOUAR: Okay.

ANDREAS (ANDI) GRABNER: And I think if you look at here, I think they said 13,000 people are at KubeCon.

ABDEL SGHIOUAR: Yeah.

ANDREAS (ANDI) GRABNER: I think the majority is here for the first time.

ABDEL SGHIOUAR: Yes.

ANDREAS (ANDI) GRABNER: Which means these organizations that they represent, some of them probably started with Kubernetes in pockets years ago. Some of them are experts. Now, more and more of the same organizations try to also adopt Kubernetes. Now the question is, do they all need to learn about service meshes, about networking, about Argo?

ABDEL SGHIOUAR: Yeah.

ANDREAS (ANDI) GRABNER: Definitely not. So the question is, do they all need to reinvent the wheel, or can the experts that have already built up the knowledge build an internal platform to make it easy for everyone in the organization to build, deploy, operate, observe, and secure?

ABDEL SGHIOUAR: So abstract away the complexity of Kubernetes behind a platform.

ANDREAS (ANDI) GRABNER: That's kind of the idea.

ABDEL SGHIOUAR: Yeah. So my question to you is, and I think I have another kind of stupid question, probably: does a platform mean you have to have an IDP?

ANDREAS (ANDI) GRABNER: Not necessarily. I think what a platform needs to have is, first of all, a pain that you solve. Right? And I think we have it in the book as well, we described a fictional company with teams and problems, and the first problem we explained is: engineers have a hard time getting access to their logs.

ABDEL SGHIOUAR: Yeah.

ANDREAS (ANDI) GRABNER: In a production environment.

ABDEL SGHIOUAR: Yeah.

ANDREAS (ANDI) GRABNER: It's a very tedious and long process, with a lot of tickets and manual work involved. So, how do I get to my logs? Do I need an IDP for that? Maybe, maybe not. Maybe I just build something where I can use Slack, MS Teams, whatever, and I can just talk with a chatbot and say, I am Andy, give me the logs that I need for this particular problem. So how you solve it depends on which problems you really want to solve, which tools and processes you currently have in place, and then you try to find a smooth way of solving this problem in a self-service way. And whether it's an IDP or not, that depends.

ABDEL SGHIOUAR: Got it. I like that. Start simple, start with the pain points, and go from there. Cool. So I'm going to ask you one last question. Uh, and this is a question for Max. Cloud Native Summit Munich.

MAX KÖRBÄCHER: Yes.

ABDEL SGHIOUAR: When is that going to happen? And am I invited? I'm just kidding. I'm just kidding.

MAX KÖRBÄCHER: You're always welcome.

ABDEL SGHIOUAR: Thank you.

MAX KÖRBÄCHER: It's happening on the 21st and 22nd of July. So it's a perfect time to come to Munich. Uh, enjoy two days of open source conference. A very, I would say, cozy environment to network with everyone there.

ABDEL SGHIOUAR: Yes.

MAX KÖRBÄCHER: Uh, and at the same time, it's a perfect time of the year for enjoying the beer gardens as well.

ABDEL SGHIOUAR: Well, you said that last year, and it rained. So I wouldn't.

MAX KÖRBÄCHER: Yes.

ABDEL SGHIOUAR: Yeah, the beer was definitely good, but.

MAX KÖRBÄCHER: In Germany, there's no bad weather, it's just the wrong clothes.

ABDEL SGHIOUAR: I live in Sweden, so I know that. All right, awesome. We'll make sure to have the link to the conference in the show notes. Thank you very much, guys, for being with us.

KASLIN FIELDS: Building platforms at scale brings its own unique set of challenges. Ahmet and Ronak from LinkedIn, who listeners might remember from their recent deep-dive episode with us, joined Abdel at KubeCon following their talk on LinkedIn's scalable compute platform. They discussed their experiences with operators and CRDs at massive scale, and the critical importance of node lifecycle management for demanding AI and machine learning workloads.

ABDEL SGHIOUAR: Hi guys.

RONAK NATHANI: Hello.

ABDEL SGHIOUAR: How has KubeCon been for you?

RONAK NATHANI: Very energizing. I always come thinking I'll go back exhausted, but I always go back very energized.

ABDEL SGHIOUAR: Awesome. So you folks were on the show. We talked about LinkedIn. You were our first end-user guests. And we got a lot of very good feedback. The episode was published like two weeks ago, so it's still too early to figure out the numbers, but usually we look at the numbers a month later to see how the downloads have been progressing. And uh, we talked about a bunch of things in the episode, so people have to go check it out. But you had a talk at KubeCon.

AHMET ALP BALKAN: We did, yes, earlier today.

ABDEL SGHIOUAR: So what was that about?

AHMET ALP BALKAN: Uh, we talked about how we are building a scalable compute platform at LinkedIn, uh, managing bare metal servers all the way to how we deploy apps on top of it.

ABDEL SGHIOUAR: Awesome.

AHMET ALP BALKAN: All using Kubernetes, of course.

ABDEL SGHIOUAR: Awesome. Did you share anything in the talk you didn't say at the podcast?

AHMET ALP BALKAN: Um, I think we went into a little more depth in the talk itself, and we encourage people to check it out, but we talked about how we are trying to improve user experience, specifically around failure categorization, so that we have less support load. So that's what we touched on.

ABDEL SGHIOUAR: Oh, so less work for you.

AHMET ALP BALKAN: Less work for us. Yes, that's the goal.

ABDEL SGHIOUAR: Awesome. And so, Ahmet, you have a reputation for having a thing for operators and CRDs.

AHMET ALP BALKAN: That's right.

ABDEL SGHIOUAR: Yeah. I mean your blog is all about that.

AHMET ALP BALKAN: Yeah.

ABDEL SGHIOUAR: All right, so I'm not putting you on the spot. So, can you tell us a little bit about the learnings? What did you learn from building operators and CRDs, from your experience?

AHMET ALP BALKAN: Yeah, I mean, I would say a lot of the things that we do at LinkedIn have to be fully automated, because our scale is so big, right? So we have to basically rely on operators to do the job for us. So a lot of us end up building a lot of operators at the end of the day. As a result, scalability of the operators, correctness of the operators, becomes top of mind. And KubeCon is a great place to figure that stuff out, because all the maintainers of controller-runtime, the API machinery, they're all here, so we get to exchange ideas and, you know, learn from them.

ABDEL SGHIOUAR: Awesome. Did you folks go to the maintainer summit, by any chance?

AHMET ALP BALKAN: I did.

ABDEL SGHIOUAR: Okay, so how was that? It's the first time in that format, right?

AHMET ALP BALKAN: Yeah, uh, it was pretty cool. Everyone was there, as usual. Um, a lot of interesting breakout sessions, uh, especially around node lifecycle, which is also very close to us, and very hard.

ABDEL SGHIOUAR: Node life cycle, oh, interesting.

AHMET ALP BALKAN: We deeply care about how the nodes get, you know, lifecycle managed, and evictions, drains, and stuff like that. So, uh, yeah, we had a lot of fun there.

ABDEL SGHIOUAR: Yeah. It's actually interesting for me how, if you just come to KubeCon and you look at, you know, the show floor and the exhibition hall and the lightning talks and the booth talks, everything is AI, AI, AI and LLMs. And you guys are still doing node lifecycle. Like the actual bare-bones stuff, the actual thing that matters, right?

AHMET ALP BALKAN: Yeah.

ABDEL SGHIOUAR: So do you handle any LLM workloads? Um, like in your day-to-day?

AHMET ALP BALKAN: We do. We handle LLMs, and what we call LPMs, large personalization models.

ABDEL SGHIOUAR: Yeah.

AHMET ALP BALKAN: Yeah, both of them. So again, node lifecycle becomes pretty crucial, even when it comes to GPUs, for example.

ABDEL SGHIOUAR: Of course, yeah.

AHMET ALP BALKAN: People want to launch large workloads with lots of training, and when their GPU goes down, they're pretty sad about it. So node lifecycle kind of makes it better, so that's where the interest comes from.

ABDEL SGHIOUAR: Of course, of course. And I have one more question. Um, I don't know if you saw the announcement we made about MCO, Multi-Cluster Orchestrator.

RONAK NATHANI: We heard about the announcement, yes.

ABDEL SGHIOUAR: All right, did you did you have the chance to look into it or you just?

RONAK NATHANI: Not yet.

ABDEL SGHIOUAR: All right, so I'm excited about that one. So it's essentially a multi-cluster orchestrator, as the name indicates, right? And uh, it's an open source tool for managing multiple clusters that we open sourced. Um, so I'm excited to see how that progresses. It looks pretty cool on the surface.

RONAK NATHANI: Oh, we would love to check it out and definitely we'll be in touch.

ABDEL SGHIOUAR: Yeah, so maybe in a year from now when you actually get time to use it, we'll bring you back on the show.

RONAK NATHANI: Uh, definitely, we can share more feedback for sure.

ABDEL SGHIOUAR: Awesome. Thank you very much, guys. Thank you for being on the show.

RONAK NATHANI: Thank you for having us.

ABDEL SGHIOUAR: And uh yeah, thank you for coming back.

KASLIN FIELDS: Amidst the KubeCon whirlwind, Abdel and Mofi took a moment to discuss the significant interest in running large language models, or LLMs, on Kubernetes, including the why and the how for auto scaling these workloads. They also touched upon some exciting new developments with the Gateway API, particularly an inference extension that could reshape our interactions with AI models.

MOFI RAHMAN: So many things coming together at the very last minute. I feel like, uh, again, yesterday was a bunch of co-located events, so a lot of people are, A, jet lagged, B, tired from the co-located events, and then running into this big operation. If you have been to KubeCons before, by the way, this is the biggest KubeCon so far.

ABDEL SGHIOUAR: 12,500 people.

MOFI RAHMAN: So 12,500 people, all in all roughly about 13, 14,000 people, maybe. Um, so yeah, it's the first day. I don't know if you can hear this, but right behind me there's a lot of trumpet playing. So I don't know how that is translating over there.

ABDEL SGHIOUAR: Yeah, I don't think we can. There are people dressed up like British soldiers, with those very funny long hats, and they are playing trumpets, which is definitely very loud. So, we did our talk today.

MOFI RAHMAN: We did.

ABDEL SGHIOUAR: We talked about running LLMs on Kubernetes.

MOFI RAHMAN: Yes.

ABDEL SGHIOUAR: We posted a post about it, and we got a lot of questions about why you would want to do that.

MOFI RAHMAN: Do which one?

ABDEL SGHIOUAR: Run LLMs on Kubernetes.

MOFI RAHMAN: Okay.

ABDEL SGHIOUAR: Right? So, can you tell people why would you want to do that?

MOFI RAHMAN: Yeah, so I think it's a matter of, okay, so for most people, right? If you're trying to build an application, taking an off-the-shelf uh LLM like Gemini, GPT or Claude works fairly well. You can pretty much run your application using those APIs, and scale up, scale down, pay per token. But for certain use cases, for example, if you are in a situation like you are a financial company, or government, or healthcare, you can't really send your API requests over to some third-party company like Google, Anthropic or OpenAI. In those situations, you want to keep the model located in your own data center and have all the control over running those models. Another use case could be that when you pay per token, as your usage goes up, your cost also goes up. But if you get to the point where you can basically pre-provision all these resources and run it yourself, your cost is kind of limited. Then you can increase your usage without increasing your cost. So these are probably a few of the biggest reasons you would want to run your models yourself.

ABDEL SGHIOUAR: Awesome. So, um, so there was quite a lot of questions coming during the session. We talked about, I mean, we talked about running LLMs on Kubernetes. We showed this awesome demo that you showed, which is running like 11 LLMs.

MOFI RAHMAN: 10.

ABDEL SGHIOUAR: 10, with a single UI, and asking it for a knock-knock joke, which hopefully did not say anything bad, so I don't think it's going to look bad on the recording. Um, and there were questions about resource optimization. There were questions about auto scaling. We didn't really get to cover that in detail. Um, can you briefly touch on auto scaling specifically? What optimizations does Kubernetes have for auto scaling LLM workloads?

MOFI RAHMAN: Yeah, so usually when you are auto scaling a Kubernetes workload, you are looking at CPU and RAM usage. Uh, you could also look at custom metrics, like number of requests. But when you are running an LLM workload, the number of requests does not necessarily tell you how much resource usage you have currently, because, uh, with some of the LLMs, a single request can ask for up to 32,000 tokens, a million tokens. So you could have that wide a range of how many tokens are being generated. That is kind of a hint that you can use a custom metric like tokens per second as a metric for how you should scale up your workloads, so that you are getting the most tokens generated for your users. So instead of looking at CPU usage or number of requests per second, you kind of flip the script and start scaling based on tokens, or GPU usage, because that is the bottleneck. And again, GPU is one example, TPU, same idea. You want to scale on GPU and TPU usage, because that is your bottleneck in terms of scaling.
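[Editor's note: a hedged sketch of the "flip the script" idea Mofi describes: an HPA v2 object scaling an inference Deployment on a custom tokens-per-second metric instead of CPU. This assumes a custom-metrics adapter, e.g. Prometheus Adapter, already exposes a `tokens_per_second` pod metric; all names and targets are illustrative.]

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: llm-server
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: llm-server
  minReplicas: 1
  maxReplicas: 8
  metrics:
    # Scale on generated tokens per second per pod rather than CPU;
    # requires a custom-metrics adapter surfacing this metric.
    - type: Pods
      pods:
        metric:
          name: tokens_per_second
        target:
          type: AverageValue
          averageValue: "500"
```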

ABDEL SGHIOUAR: All right, good. So, um, there is one more topic, where I'm going to pretend to ask you a question, but then you ask me the question so I can talk about it. The inference extension, the Gateway API inference extension.

MOFI RAHMAN: I mean, passing it back to you, uh, you tell me, what is the inference extension supposed to be?

ABDEL SGHIOUAR: Thank you for asking me that question. Yeah, so I did a talk about the Gateway API inference extension, which is an extension of the Gateway API to be able to do inference. Specifically, it has capabilities around, um, multimodality. For example, you can set up a single load balancer that routes traffic to models based on what the user is trying to do. Are they trying to do text summarization, or trying to do video or pictures or Stable Diffusion or whatever. Then the second thing is model-based routing. So you send a request and you say, I want to talk to this specific model, and then the gateway knows exactly which backend to send your request to. And then there is all the stuff we talked about, which is getting custom metrics out of those, um, inference servers, and using those to make intelligent routing decisions, right? So this is not early access, but it's early work. Uh, there is a spec, there is a website, you can go check it out. Uh, the talk will be recorded, so you can check it out later on YouTube. And um, yeah, I'm excited, actually. There will be a lot of interesting things coming out of that.
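[Editor's note: to make the model-based routing idea concrete, here is a rough sketch based on the early Gateway API inference extension spec. The extension is alpha, and the API group, kinds, and fields shown here are assumptions that may have changed since recording; treat this as illustrative only.]

```yaml
# An InferencePool groups the model-server pods behind a gateway
# (alpha API; names and fields illustrative).
apiVersion: inference.networking.x-k8s.io/v1alpha2
kind: InferencePool
metadata:
  name: llm-pool
spec:
  targetPortNumber: 8000
  selector:
    app: vllm-server
---
# An InferenceModel maps a requested model name onto that pool, so the
# gateway can route "I want to talk to this specific model" requests.
apiVersion: inference.networking.x-k8s.io/v1alpha2
kind: InferenceModel
metadata:
  name: summarizer
spec:
  modelName: text-summarizer
  criticality: Standard
  poolRef:
    name: llm-pool
```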

KASLIN FIELDS: KubeCon is always a fantastic opportunity to get updates on the foundational technologies that underpin everything in the Cloud Native world and to hear about the future direction of the Kubernetes project. Abdel had the chance to connect with some key figures who are shaping exactly that. First in this block, Abdel spoke with Ivan Valdes, who shared the news that he has just become a co-chair of SIG etcd. Ivan gave Abdel an update on the much anticipated etcd 3.6 release and the brand new etcd operator, which is set to simplify running standalone etcd clusters within Kubernetes.

ABDEL SGHIOUAR: Can you introduce yourself really quick?

IVAN VALDES: So, hi, I'm Ivan Valdes, co-chair of SIG etcd as of last week.

ABDEL SGHIOUAR: I think everybody should know this by now, but if you're not aware, etcd became a SIG about a year and a half ago.

IVAN VALDES: Yeah, like year and a half, two years ago.

ABDEL SGHIOUAR: A year and a half, two years. It used to be a standalone project at the CNCF and now it's a SIG, a special interest group. So, what's new in etcd? What's going on?

IVAN VALDES: So the big news, and hopefully everybody can join our talk on Thursday in the maintainer track, is that we want to finally release etcd 3.6.

ABDEL SGHIOUAR: Yeah.

IVAN VALDES: The last version of etcd, 3.5, was released four years ago. So it's not even a major version, it's a minor version, but one of the issues is that it's a small team, and there has historically been a lot of rotation among the maintainers, so it's very difficult to do even a minor release. That's basically what we have been working on. And also, we just released on Monday version 0.1 of the etcd operator. So etcd is finally going to have its own official operator. At this point it's very alpha; it's just kind of a toy project.

ABDEL SGHIOUAR: Okay.

IVAN VALDES: But we are actively looking for contributors, because we definitely need more contributors on the etcd operator. So that's basically the big news from our side.

ABDEL SGHIOUAR: So, etcd operator, is that an operator to run etcd inside of Kubernetes?

IVAN VALDES: Yes, it's an operator to run etcd inside Kubernetes, but of course it's not the data store for Kubernetes itself. It's for the use case where you want to run a standalone etcd inside your own Kubernetes cluster.

ABDEL SGHIOUAR: All right, because last time, when we did the interview about SIG etcd, one of the use cases I remember is that Cilium, for some of its functionality, requires etcd. So you run etcd standalone inside Kubernetes if you need it for something else, right? It's not the storage layer for Kubernetes itself, right?

IVAN VALDES: Yes.

ABDEL SGHIOUAR: Okay, cool. So, all right, so that's that's pretty cool.

IVAN VALDES: Yes.

ABDEL SGHIOUAR: And how has KubeCon been for you so far?

IVAN VALDES: So far, so good. On Monday we had the maintainer summit, I think that's what it's called. It's cool to put a face to the names you know from the Kubernetes team. And now it's also open to other CNCF projects. So it's a nice experience to just meet, say hi, and talk about your own issues, because we all think we're solving our own problems, but in the end everybody is doing something similar, at least in my field, which is CI/CD and tooling. Then yesterday were the co-located events, with some interesting talks, and today, of course, is the main event. So far, it's been great.

ABDEL SGHIOUAR: All right, well, there are two more days to go and we're going to be hanging out here. So, first of all, thank you very much, Ivan, for jumping in last minute.

IVAN VALDES: Of course.

ABDEL SGHIOUAR: Thank you for being on the show.

KASLIN FIELDS: For a broader perspective on the Kubernetes project, Abdel sat down with Jago MacLeod, who leads open source Kubernetes engineering at Google. Let's hear Jago's insights on the overall health of the project, especially how Kubernetes is evolving to support the new wave of AI/ML workloads, and his thoughts on how AI agents might simplify Kubernetes interactions in the future.

ABDEL SGHIOUAR: Uh can you introduce yourself?

JAGO MACLEOD: Let's start there. Hi, I'm Jago MacLeod. I'm an engineering director, and I work on open source Kubernetes and GKE. I lead open source Kubernetes at Google.

ABDEL SGHIOUAR: All right. Yeah, he makes Kubernetes happen, so he's very humble.

JAGO MACLEOD: My my team makes Kubernetes happen.

ABDEL SGHIOUAR: You just tell them what to do.

JAGO MACLEOD: I work with the team.

ABDEL SGHIOUAR: Awesome. So, I think it has been a while since we had somebody on the show talking about the overall open source side of Kubernetes, kind of an overview of the project, because we tend to talk to people working on very specific parts. So how is Kubernetes? I know this is a vague question, but how is Kubernetes going?

JAGO MACLEOD: It's going great. A year ago I was a little nervous. I wasn't sure, in the face of this new tide of AI and ML workloads, if Kubernetes was going to retain its dominance. We've been working really closely with the SchedMD folks who support Slurm, the Ray community, and the Run:ai community. There are all these other schedulers that run these higher-level frameworks. A year ago they were succeeding in spite of Kubernetes, and now we've evolved Kubernetes to be more aware, downward facing, of the hardware and the new evolution in hardware, because accelerators are really different than CPUs.

ABDEL SGHIOUAR: Yeah.

JAGO MACLEOD: And upward facing on these frameworks that are pretty special purpose. So it's a super exciting time. I think the momentum is really showing that Kubernetes runs all of these big AI workloads, and it seems to be gaining even more momentum. So it's exciting, it's super fun.

ABDEL SGHIOUAR: Awesome. My follow-up question: I like that, it's an amazing way of describing it, the downward facing and the upward facing. Kelsey Hightower is quoted for a phrase he said at some point: Kubernetes is going to become the platform. Do you think that's what's happening? Where Kubernetes is just an API that people assume exists, and we build on top of it?

JAGO MACLEOD: Maybe. The way I like to think about it is that there was this hourglass model of the internet, an idea that emerged about 40 years ago.

ABDEL SGHIOUAR: Yeah.

JAGO MACLEOD: Where IP was kind of the center of the internet, and you have different protocols running on top and different physical layers below. And I think Kubernetes has emerged as the narrow waist of the hourglass model of infrastructure management. There are all these frameworks that run on top. So I think there are maybe multiple platforms within that platform we used to describe.

ABDEL SGHIOUAR: Got it.

JAGO MACLEOD: And we've kind of embraced the idea that Slurm is really great at HPC, and there are users who have used and will continue to use Slurm no matter how good Kubernetes gets at batch and HPC workloads. So I think we need to embrace the idea that there are APIs and user experiences that people are attached to and should continue to use.

ABDEL SGHIOUAR: Right, awesome.

JAGO MACLEOD: But we want it to go through and use the underlying hardware in the same way as others do it.

ABDEL SGHIOUAR: Got it. So you spoke of Slurm. I spent a bunch of time with Daniel Martini from Google writing a Slurm guide for GKE. That was definitely an eye-opening experience, because I had never had to deal with HPC-type orchestrators, which is what Slurm is, right? So my question to you is: I know we're doing a lot of work with tools like XPK to abstract away Kubernetes for people, so they don't have to deal with it. Do you see more tools like this, where either we make a tool that uses Kubernetes under the hood without exposing its complexity to data engineers specifically, because we tried that with Kf for apps, kind of the same idea, or where we integrate tools that people are familiar with, like Slurm, into Kubernetes? Slurm is just an example, Ray is another one. Do you see us doing more of those going forward?

JAGO MACLEOD: I think we will, and I think that intermediate layer is more often going to be AI agents. Like, if you use Cursor at this point, you never have to go to the GCP UI.

ABDEL SGHIOUAR: Yeah.

JAGO MACLEOD: Or necessarily even use kubectl to manage the Kubernetes cluster itself. It's kind of a byproduct of what you're trying to accomplish. And so some of those sharp edges start to fade away if it's abstracted at that layer as well. So I think we'll see a few different models that converge over time.

ABDEL SGHIOUAR: Yeah.

JAGO MACLEOD: Um but it's really exciting.

ABDEL SGHIOUAR: Awesome. Thank you very much for coming on the show.

JAGO MACLEOD: Thanks for having me.

ABDEL SGHIOUAR: Thank you.

KASLIN FIELDS: One of the real highlights of any KubeCon is discovering the incredibly diverse and sometimes surprising ways people are using Kubernetes. And of course, it's the vibrant community and the individuals within it that make these events so special. Abdel's final set of conversations from the show floor really capture this spirit. In this interview, Abdel chatted with Clément, who, by day, works on the Kubernetes platform for PostFinance, a Swiss bank, but in his other life lives on a farm and has ingeniously used Kubernetes and Prometheus to automate his family's milk dispensary and monitor their cows. He also shared some insights from his banking role about migrating from kubeadm to Cluster API.

ABDEL SGHIOUAR: We have Clem on the show. Can you introduce yourself to the audience, Clem?

CLÉMENT NUSSBAUMER: Yeah, sure. So, my name is Clem. I'm a Swiss software engineer working at PostFinance on the Kubernetes platform, and I live on a farm with my wife, who is a farmer. I live amongst the cows, which is quite fun.

ABDEL SGHIOUAR: Yeah, so the last time I met Clem was in Switzerland, for KCD Zurich. And you've been telling me, and my first question is completely off topic: you live on a farm, you produce milk, and you had to automate your milk dispensary.

CLÉMENT NUSSBAUMER: Well, true, well.

ABDEL SGHIOUAR: Using Prometheus, right? Can you talk about that?

CLÉMENT NUSSBAUMER: Well, yes. So the basic idea is that we have a self-service shop, and we want customers to be able to get some milk from the farm, raw milk. So we built a machine from scratch, with mechanical devices, electrical devices, and everything. And then the question is, when there is no milk anymore, you want to get an alert for that. So I installed a small lidar, a small laser meter, so I know exactly how many centimeters of milk remain in the tank.

ABDEL SGHIOUAR: Yeah.

CLÉMENT NUSSBAUMER: And then my wife and the whole family receive a Telegram alert, through Grafana and VictoriaMetrics,

ABDEL SGHIOUAR: Yeah.

CLÉMENT NUSSBAUMER: about the milk level in the machine. So people always have milk at the farm.

ABDEL SGHIOUAR: Awesome. Is there any Kubernetes behind that?

CLÉMENT NUSSBAUMER: Of course, it has to be.

ABDEL SGHIOUAR: Where is where is the Kubernetes cluster running?

CLÉMENT NUSSBAUMER: At home. I run a four-node cluster at home, which is super reliable, actually. I run Talos Linux on it, and it works quite well. So that's how I do it: the Prometheus operator, Prometheus, then a few agents to scrape the metrics, and then I can gather data and produce useful alerts.
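[The alerting setup Clément describes could be sketched as a PrometheusRule, the Prometheus Operator CRD for alert definitions. The metric name milk_level_centimeters and the 10 cm threshold below are entirely hypothetical; Clément didn't share his actual names, so this only illustrates the shape of such a rule.]

```yaml
# Hedged sketch of a low-milk alert via the Prometheus Operator.
# The metric name and threshold are made up for illustration.
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: milk-dispenser
spec:
  groups:
  - name: farm
    rules:
    - alert: MilkLevelLow
      expr: milk_level_centimeters < 10   # hypothetical lidar metric
      for: 5m                             # avoid flapping on brief dips
      labels:
        severity: warning
      annotations:
        summary: "Milk dispenser is running low ({{ $value }} cm left)"
```

From there, an Alertmanager receiver (or a Grafana notification channel) would forward the firing alert to Telegram, matching the family-alert flow described above.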

ABDEL SGHIOUAR: All right, so it's almost like an edge case, actually. It's a Kubernetes-on-the-edge kind of situation.

CLÉMENT NUSSBAUMER: Yeah, true. Yeah.

ABDEL SGHIOUAR: Yeah, so that's pretty much what you're running: a Kubernetes cluster on the edge.

CLÉMENT NUSSBAUMER: Just locally, in the local network of the farm, and I gather data. This time it was from the milk dispensing machine, but I'm also gathering data from the cows' production.

ABDEL SGHIOUAR: Oh.

CLÉMENT NUSSBAUMER: We have 65 milking cows, and they produce milk. I want to know how much milk each cow produces, and you can get an alert if a cow's production is going too low.

ABDEL SGHIOUAR: Oh wow. Okay, I didn't know. Okay.

CLÉMENT NUSSBAUMER: So I also export the cows' milking data.

ABDEL SGHIOUAR: Connected cows.

CLÉMENT NUSSBAUMER: Into Kubernetes, and then also into a Grafana dashboard, so we can check the production live.

ABDEL SGHIOUAR: That's awesome. And so in your daily job, which is not the farm job, you are running the Kubernetes platform for PostFinance, which is a bank, right? A Swiss bank.

CLÉMENT NUSSBAUMER: Yeah.

ABDEL SGHIOUAR: Yeah. So, can you tell us a little bit about what kind of work you are doing there?

CLÉMENT NUSSBAUMER: Yeah, sure. So we basically provide an open source, vanilla Kubernetes platform for all the PostFinance banking applications that are deployed on the clusters. The work is really about provisioning from the get-go. We start with nothing: we provision and configure load balancers, we provision VMs, we install kubeadm, and then we bootstrap and configure the cluster. That's what I do during the day.

ABDEL SGHIOUAR: Yeah, and you told me you're moving towards Cluster API, right?

CLÉMENT NUSSBAUMER: Right, yeah. That was actually the topic of my talk yesterday, on migrating from kubeadm to Cluster API, and a live migration, because otherwise it wouldn't be fun. I also did a live demo yesterday, which didn't break, which is also good. The idea is that we have these old Kubernetes clusters that we want to get rid of, or rather, we want to extend those clusters with new Cluster API nodes, so that after some months of stable testing we can remove the old kubeadm nodes, and then we have migrated to a much simpler, streamlined solution on our clusters. That's the idea.

ABDEL SGHIOUAR: All right. Are you planning to open source this work?

CLÉMENT NUSSBAUMER: Yeah, like a guide for how to do a migration from kubeadm. There was the talk yesterday, and I think I will write something on my blog as well, because there was quite some demand and many questions after the talk. So I think it would be quite helpful.

ABDEL SGHIOUAR: Yeah, I would assume there would be a lot of interest from people moving from a CLI-based tool to an API-based tool, right? So showing people how they can do that would probably be useful.

CLÉMENT NUSSBAUMER: And without downtime, because that was also the key driver. We have a lot of applications running on our clusters. We don't want to migrate 500 applications; we want to update one cluster in place. That's the idea.

ABDEL SGHIOUAR: Awesome, awesome. Well, thank you very much, Clément, for coming. Thanks for talking to us.

CLÉMENT NUSSBAUMER: Thank you.

KASLIN FIELDS: Finally, KubeCon is a gateway for so many into the Cloud Native world. Abdel caught up with Nick Taylor, who was attending his very first KubeCon. Nick shared his journey of pivoting into infrastructure and Kubernetes, his learning process, and his initial impressions of the conference and the community. A perspective that I'm sure will resonate with many of you.

ABDEL SGHIOUAR: I am here with Nick Taylor. Hello Nick.

NICK TAYLOR: Hey Abdel. Thanks for having me on. How are you doing today?

ABDEL SGHIOUAR: I'm good. How are you?

NICK TAYLOR: Pretty good. I'm starting to get a little tired as well. Like we talked about briefly, it's my first KubeCon, and we're at a sponsor booth as well, so I've been working the booth a lot this week. It's been busy and it's fun, but it does take it out of you.

ABDEL SGHIOUAR: Okay. So, yeah, you told me yesterday that it's your first KubeCon. Every year when there is a KubeCon, especially the big ones, the European one and the US one, a report comes out, and every year it's the same thing: more than 50% of the attendees are there for their first time. So this year, you are probably in that 50%.

NICK TAYLOR: Yeah, yeah.

ABDEL SGHIOUAR: So, can you tell us a little bit about the experience? How was it for you?

NICK TAYLOR: It's been great. For some context, I've typically been an app developer, and I've pivoted into infra and security. So it's literally my first KubeCon. I am brand new to Kubernetes, like, green. My past experience with Kubernetes, at a startup, was being a front-end dev restarting pods sometimes. That's the extent of my Kubernetes experience. So I'm just really excited about it, and it's a lot to take in, though.

ABDEL SGHIOUAR: So you switched to the dark side. Can you tell us a little bit about your learning process? This is something we get all the time: people who are new just ask, how do I get started, where do I find things, how do I learn?

NICK TAYLOR: Yeah, well, I'm fortunate enough to work with a lot of people who have some really good Kubernetes experience. But the way I've been approaching it, you know, I know there's Kelsey Hightower's Kubernetes the Hard Way.

ABDEL SGHIOUAR: The hard way, yeah.

NICK TAYLOR: I've started to look at that. On my local machine, I've started to install k3s just to get a cluster up. But it's still literally early days for me. I'm starting to look into the ingress controller and stuff, things that I'm sure everybody here is pretty familiar with, but it's really green for me. I don't know, it's fun being in the unknown. I'm always comfortable getting uncomfortable, so it's just been exciting to dig into these things.

ABDEL SGHIOUAR: Awesome. So yeah, k3s, Kubernetes the Hard Way, learning the ingress controller, these are all good resources. If I may add, there is Kubernetes: Up and Running, which is a really cool book. And I think the documentation and also the certifications are pretty good resources for learning.

NICK TAYLOR: Yeah, for sure.

ABDEL SGHIOUAR: All right. Did you attend any talks or sessions?

NICK TAYLOR: I've mainly been working our booth, but in my time off, I've actually been focusing a lot on meeting people, because I'm new to this community.

ABDEL SGHIOUAR: Yeah.

NICK TAYLOR: But I was able to catch some keynotes this morning, and it's kind of funny, I never thought of some of the places where Kubernetes would be. I saw the Oracle talk this morning from their SVP, and it hadn't occurred to me that F1 racing would have Kubernetes in it. It makes sense, but I think I always associated it with tech and software or dev tooling. I hadn't really associated it with real-world stuff yet, you know? So it's kind of cool to hear that Kubernetes is powering, like, a pit in F1 racing or something.

ABDEL SGHIOUAR: Yeah, there are a bunch of use cases we have seen in the past, like the F-16 fighter jets, running, not on the jet itself, but the software behind it running on Kubernetes. We've seen it on boats, on vessels; some of these cruise companies are actually using Kubernetes on their vessels. So yeah, I think Kubernetes sneaks into places you wouldn't expect, right?

NICK TAYLOR: No, totally. We actually had somebody come to our booth yesterday, and they have a completely air-gapped Kubernetes running in a submarine.

ABDEL SGHIOUAR: Oh yeah.

NICK TAYLOR: And I was like, it it just hadn't occurred to me, you know?

ABDEL SGHIOUAR: Yeah, yeah, yeah.

NICK TAYLOR: So to your point about boats and ships, it's pretty wild where it can be, I guess.

ABDEL SGHIOUAR: Yeah, yeah. Oh yeah, exactly. Awesome. Well, thank you very much, Nick for coming to the show.

NICK TAYLOR: Thank you, Abdel. Thanks for having me.

KASLIN FIELDS: Thank you, Mofi, and Abdel, who's not here with us, for doing those interviews at KubeCon EU. I didn't get to go this year, so I'd love to hear more about how the event went. Tell me about it, Mofi.

MOFI RAHMAN: Yeah, I mean, this was the biggest KubeCon to date. KubeCon EU 2025 had roughly 13,000 attendees, or more than 13,000. You might have the actual number, which you can put in the show notes later.

KASLIN FIELDS: Yeah, I could probably ask someone for it, but the transparency report will come out later, which will tell us exactly how many people came. I haven't seen that yet.

MOFI RAHMAN: One of the things they did this year is that the show floor was actually split in half, across either side of the venue. So even though we had a lot of people, some people talked about how it felt a little emptier, because it was spread across two whole sections. But I think it was a good idea, because too many people in the same place would have been crowded. It was a lot of walking for those few days, though; every KubeCon is a lot of walking, you're just walking everywhere. Other than that, the big thing for me, and this probably won't come as a surprise to anyone, is that AI had a big presence again this year. A lot of the keynotes talked about AI, and about how AI can be done better on Kubernetes and a lot of other CNCF projects. On the show floor, in the project pavilion, there were a lot of projects either showcasing running AI/ML or using AI for some sort of benefit, whether in CI platforms, code generation, or monitoring and observability. So a lot of tooling is being built around Kubernetes and the CNCF sub-projects using AI/ML. If you walked around the show floor, you would see the word AI everywhere this year as well.

KASLIN FIELDS: As is to be expected.

MOFI RAHMAN: Yeah. And last KubeCon, KubeCon North America, we announced 65,000 nodes on a single cluster. I actually got to do a demo on a live cluster this KubeCon to showcase running batch workloads. These kinds of features are being built with AI/ML and batch in mind. A lot of the new features coming out of the Kubernetes project, as well as from a lot of the vendors, are in service of running AI/ML and batch workloads. And some of these features are also making the project itself better, because stress testing a cluster at that scale is not something we usually do; the open source Kubernetes project tests up to 5,000 nodes, I think. But to allow for massive training runs like this, we're finding edge cases that otherwise probably wouldn't have been found. So even though the purpose is to enable AI/ML use cases, the project itself is getting better and more robust because we're testing at these hyperscales.

KASLIN FIELDS: It's the Pokemon Go story all over again.

MOFI RAHMAN: Yeah, yeah. Any time a moment like that happens, everybody is trying to do something the tool was, I'm not going to say not ready for, because Kubernetes is one of the most scalable things in the world, but you can still find the upper limit of certain things. Then we push that boundary and make it better.

KASLIN FIELDS: This is why we test in production, right?

MOFI RAHMAN: Yes, absolutely. But also, this year is going to be the first year, I think, with five official KubeCons. So,

KASLIN FIELDS: Yes, yeah.

MOFI RAHMAN: So basically while we're actually.

KASLIN FIELDS: EU, China, Japan, India and NA.

MOFI RAHMAN: And NA, yeah. So basically, in a given year of 12 months, you are having roughly one KubeCon every ten weeks, which is fantastic.

KASLIN FIELDS: It's a lot. It's a lot for sure.

MOFI RAHMAN: Yeah, every ten weeks or so there would be a KubeCon, basically, which is,

KASLIN FIELDS: More chances for announcements spread out throughout the year. And involving more different regions too. So I'm curious to see if uh we hear about different things from the different regions as India and Japan especially get spun up.

MOFI RAHMAN: Yeah, I think so. Kubernetes has three releases a year, and now we're going to have five of these KubeCons, so we're not going to be able to align a release with a conference as much anymore. Also, a lot of the maintainers and organizers are constantly involved in organizing the next one. I hope it doesn't cause any sort of burnout, because organizing a conference is a huge undertaking. So, I'm hoping.

KASLIN FIELDS: From a nonprofit? How could that happen?

MOFI RAHMAN: Yeah, yeah, a nonprofit and a team of volunteers working to make this happen. And there's almost a pressure for maintainers to have some sort of update to share at every KubeCon. So hopefully that doesn't happen. But overall, KubeCon EU, in my experience at least, was a great event. There were a lot of great talks that I got to attend, and I got to speak to people; that's kind of the main goal for me, to speak to the attendees and hear how people are doing and dealing with Kubernetes. So yeah, it was a strong few days at KubeCon. And for us on the GKE team, we were also traveling before KubeCon for some container events in the area, not just London; we actually had a couple of events in Norway as well as Sweden. So we had basically a two-week roadshow. It was a lot of talking, a lot of learning, and I'm excited for the next installment of KubeCon, wherever I get to go.

KASLIN FIELDS: Yeah, a lot of folks forget that there are so many co-located events that happen on site at KubeCon, and when companies can, they'll also create other events in the local area around KubeCon. So KubeCons bring a lot of stuff with them; a lot goes on around them. And it's so great to hear from the community, as we did in this episode. We hope you enjoyed hearing from folks at KubeCon EU. Thank you, Mofi.

MOFI RAHMAN: Thanks, Kaslin. That brings us to the end of another episode. If you enjoyed this show, please help us spread the word and tell a friend. If you have any feedback for us, you can find us on social media @KubernetesPod. Or reach us by email at kubernetespodcast@google.com. You can also check out the website at kubernetespodcast.com where you will find transcripts and show notes and links to subscribe. Please consider rating us in your podcast player so we can help more people find and enjoy the show. Thanks for listening and we'll see you next time.