#139 February 25, 2021

GKE Autopilot, with Yochay Kiriaty

Hosts: Craig Box, Lin Sun

Today Google Cloud introduced GKE Autopilot, a new mode of operation where you no longer manage or configure nodes, and you pay per-pod, per-second. Craig talks Autopilot with GKE product manager Yochay Kiriaty.

Do you have something cool to share? Some questions? Let us know:

Chatter of the week

News of the week

CRAIG BOX: Hi, and welcome to the Kubernetes Podcast from Google. I'm Craig Box with my very special guest host Lin Sun.


CRAIG BOX: Lin, thank you for joining me today. Welcome back to the show. Last time I saw you in person was probably last time I saw anyone in person back in KubeCon in San Diego for your appearance on episode 86, which went out in January 2020. Such a long time ago.

LIN SUN: Yeah. Thanks so much for having me back. Oh, my gosh, I remember seeing you in person. You were so busy recording podcasts. And I remember, there was an Istio boat event the day before I saw you, and it was such a memorable event. It feels like that was the last community event in person for the longest time.

CRAIG BOX: Well, I didn't get to do that obviously because I was preparing our giant KubeCon podcast extravaganza. What was that event, and why was it so memorable?

LIN SUN: Yeah, it was in San Diego at one of the ports, and we went out for a cruise for, I think, two hours? And people were giving a couple of talks. I remember I was talking about my new book, "Istio Explained." I remember Christian Posta was giving a demo on a tiny, tiny TV that nobody could really see, and he was struggling to give a demo. So that was so cool.

CRAIG BOX: I hope people were pretending to pay attention at least, rather than looking at the lovely cruise they were on.

LIN SUN: I think people wanted to see it though, because it was interesting. But then they were having to struggle to see it. Luckily, I was in the first row, so I got to see everything. Yeah, and then at the end of the talk, people just walked around the [boat], and we got to enjoy the beautiful San Diego Bay. So it was so memorable.

CRAIG BOX: Those were the days. Events are all virtual now, and I wanted to have you on this week in particular because you and I are both co-chairs of the IstioCon event, which is on this week.

LIN SUN: Yeah, I'm super excited. I think the event has been tremendously successful. On the first day, you and I got to be relaxed because everything was prerecorded, and I was busy tweeting about ourselves onstage with different people. So it was so fun, and I was so surprised. There were 3,000 people the moment I logged in, and even more in the afternoon.

CRAIG BOX: Well, you don't have to worry about it when everything's prerecorded.

LIN SUN: Right, exactly.

CRAIG BOX: One thing I will say is I haven't been doing very many of these virtual events over the last year, but I recorded a piece for this one, and there were two sets of feedback that I got as it played. The first one was saying, "oh, that's what the guy from the podcast looks like". So thank you to everyone who listens to the show and isn't used to seeing me on stage anymore. It was nice to get a few callouts there. And the second one is, "my, what a lot of hair that guy has".

LIN SUN: That was amazing, yeah. I think it was Neeraj [Poddar] who said you looked like Jeff from Coupling. And I went to the China session during the evening of my time, because-- my Chinese is actually very good for daily language, but I'm not very good at technology words. So I ended up presenting maybe 60%, 70% in Chinese, and the rest in English. And I was able to look at the chat history, and people were saying you have a really cool hair style. So good job, Craig.

CRAIG BOX: Well, thank you. I don't normally do much with my hair, but if I'm going to be on screen, I figured I'd better make it tidy. It's interesting now, with the new plan the UK has to come out of lockdown, I can tell you the exact date of my next haircut. It will be on April the 12th.


CRAIG BOX: Well, no earlier than April the 12th. That's when the hairdressers are all allowed to reopen.


CRAIG BOX: Salons and so on are actually calling people and saying, "hey, we get to reopen on this date. Would you like to book in for a haircut?" Not my barber, though. I just sort of walk up. I'm nowhere near that fancy.

LIN SUN: That's so interesting. I mean, same as you-- if I'm on stage, I've been wearing pearls, and using the pearls as my filter in Zoom. So it makes it a little bit more fancy.

CRAIG BOX: When I saw you were wearing pearls for your half of the video, I thought, hm, she's going to show me up here. Should I find some pearls? I thought about it. I thought, well, probably doesn't suit me.

LIN SUN: Right, so you went with your hair style instead, right?

CRAIG BOX: Perhaps. Like I say, it's not really a style. It's just what happens when you've been in lockdown for a long time, and you can't get a haircut.

If you want to check out our videos, you can find them on the IstioCon website. You will also find them on YouTube momentarily. I'm the guy with the big hair. Lin's the lady with the pearls. But until then, let's get to the news.


CRAIG BOX: Google Cloud has launched GKE Autopilot, a new operation mode for Google's Kubernetes Engine. Autopilot brings fully managed infrastructure, which is charged by the pod as opposed to by the node. Learn more about GKE Autopilot in today's interview.

Microsoft's Dapr, the Distributed Application Runtime, has hit 1.0. Dapr is a library that offloads common application tasks, but instead of being compiled into your application, is addressed by HTTP or gRPC. Building blocks include service invocation, state management, pub/sub messaging, event-driven resource bindings, and distributed tracing. Microsoft set up an independent steering and technical committee in September, and is evaluating donating Dapr to a foundation.

Tigera has announced Calico Cloud, a SaaS platform for Kubernetes security and observability. Calico Cloud builds on the Calico network plugin with a dashboard, which lets you observe, and set policy for, multiple clusters. The product is pitched as an industry-first pay-as-you-go option: you are charged only for services consumed.

At IstioCon this week, Solo.io announced that Gloo Mesh Enterprise is now generally available. Building on top of the open source Gloo Mesh control plane, Enterprise adds production support for Istio with SLAs, support for older versions, and FIPS-compliant builds. A blog post from Solo describes the technology behind the FIPS builds, which are also published for general community use.

Red Hat has closed the acquisition of StackRox, which was announced in early January. They will now proceed to open source StackRox's technology, and will still support multiple Kubernetes platforms as well as Red Hat's OpenShift.

PayPal has contributed a new scheduler implementation to Kubernetes and written up how you can use it in an engineering blog post this week. Trimaran, a three-hulled boat, makes the Kubernetes scheduler aware of the gap between resource allocation and actual resource utilization. It watches the load in your cluster and can make scheduling decisions based on live node utilization values as opposed to requested resources. Trimaran has been proposed to the Kubernetes project as a scheduler plugin.

Finally, Kubernetes generally uses an overlay network to allow multiple IP addresses per node in a cluster. A lot of this is done with encapsulation, often VXLAN, but you can get better performance using IPv6. John Milliken has written up how you can use the 6to4 and Teredo features of the Linux kernel to create fast overlay networks between nodes. And that's the news.


CRAIG BOX: Yochay Kiriaty is a product manager at Google Cloud working on GKE cluster lifecycle and security. Welcome to the show, Yochay.

YOCHAY KIRIATY: Thanks for having me, Craig.

CRAIG BOX: I imagine you're probably disappointed you missed the "Star Wars" show by a week.

YOCHAY KIRIATY: Yeah, yeah, I heard Tim rambling on "Star Wars." We actually had a little bit of a game within the team a couple of months ago about who knows more about "Star Wars." And I have to say that, as much as I think of myself as an avid "Star Wars" fan, Tim smoked me on that part. So yes, kudos to him.

CRAIG BOX: You've done a lot of presentations over the years, and you do seem to be wearing "Star Wars" shirts in effectively all of them.

YOCHAY KIRIATY: Yeah, correct. I clearly have way too many "Star Wars" t-shirts, more than any grown-up my age should own. I very much love the series and everything around it, so this is great. My favorite character is Darth Vader, and at some point, I decided to start wearing a Darth Vader shirt for basically any public presentation I have.

And I have two flavors of my "Star Wars" shirt-- the red and the black one-- that I wear in that essence. There was a person at my previous work who started dressing in red polo t-shirts for his public presentations, and it became a hallmark. And I said, hey, you know what, maybe I'll copy that little trait as well.

CRAIG BOX: How do you decide if it's a red presentation or a black presentation?

YOCHAY KIRIATY: It's pretty much based on the mood. If I need to be slightly more aggressive, it will be the red. The black is where all-day comfort in the Dark Side sits. It's like, just be ready, so yeah.

CRAIG BOX: Now, you worked a long time at Microsoft, including on the launch of Windows 7. Was that the last good version of Windows?

YOCHAY KIRIATY: So Windows 7 was a great release after, obviously, Vista. It built the foundation that everything in Windows is built on today. 8 was not a good version, and then they just skipped 9 and went directly to 10.

And I think 10 is actually pretty good-- solid, stable. My understanding is they're not going to release any more. It's going to stay on 10 forever and ever.

CRAIG BOX: Like Mac OS did for a long time.


CRAIG BOX: We get to 10, and we stick there for 10 plus years.

YOCHAY KIRIATY: 10 is a lucky number, I guess.

CRAIG BOX: You were on both ends of the journey as Microsoft made the transition from a Windows company to a cloud company. What was that like?

YOCHAY KIRIATY: I was lucky enough to join Azure when it was something called Red Dog early, early 2008. I'm saying it with a smile on my face and being very positive. It was a wild, wild West, and it was fun because you could do almost anything you wanted to. A lot easier and quicker to move and ship stuff, so it was really great on that aspect.

The journey was bumpy, obviously, but a lot has been said about Satya and his changes. And yes, he came in when the cloud was already established and so forth, but the push was amazing at that point. Open source and everything around that.

So the cloud really transformed the company's way of thinking. Obviously, business and all that is super important, but you could really see the culture change within. You can see it in open source stuff like VS Code, for example.

CRAIG BOX: Yeah, there's been almost universal praise of Satya as the CEO of Microsoft.

YOCHAY KIRIATY: Yeah, kudos for him. It was really amazing in that part. Yeah, and building up and seeing everything scale the way it did, in terms of the customer demand and the growth, was just amazing.

I mean, it was so fun being part of it. I got super lucky and built basically two services from zero, from nothing, from basically ideas into full blown production and with hundreds of thousands of customers. So it's really rewarding and quite fun, and actually, I'm super happy doing the same journey again here at Google.

CRAIG BOX: You joined Google December of 2019, and I imagine that was just long enough to get used to them cooking all your lovely meals for you, and then they kicked you out of the office.

YOCHAY KIRIATY: Yeah, this was an odd year. At least I have to say I started a new job, and I actually got to see the office and my co-workers. There was someone who just started on my team a couple of weeks ago. He doesn't have this luxury. So I guess I'm lucky to a certain extent on that.

The office was very nice. We got brand new shiny buildings in South Lake Union in Seattle. New buildings, really-- just new. You can still smell the new in them. It was pretty good, and the team is amazing. It was very cool seeing everybody and meeting in the office, and then they just took it all away.

CRAIG BOX: Could you see the sea planes taking off and landing out the window?

YOCHAY KIRIATY: We could definitely see the sea planes taking off out the window. The cafeteria basically has a view of Lake Union, so you can see the planes coming in and out. Yeah, it was pretty nice on sunny days, and even somewhere in between in winter-- it's cold, but it's gorgeous scenery. So it's really a fun place to be.

CRAIG BOX: Over the last year, you've been working on GKE Autopilot, which has just launched this week. Hop in the magic elevator with me and tell me what that is.

YOCHAY KIRIATY: Yeah, yeah, thanks. It's been almost a year in the making for me, Autopilot, and it's been a great journey. So at the top: Autopilot is a fully managed Kubernetes cluster that brings together all the best of the GKE features that we have-- all the advances that we have built in scaling, security, and day-two operations-- everything, into an opinionated way of running Kubernetes clusters with the full support of Google SREs, managing your entire cluster end to end and optimizing for production.

CRAIG BOX: A lot of people think that services like GKE are managed. What isn't fully managed about GKE today?

YOCHAY KIRIATY: To set the context, we can talk a little bit about Kubernetes all up, do-it-yourself, and then we can go through managed, and then the different states of managed, I guess. I think we're at the point-- and obviously people who listen to this podcast would probably subscribe to the notion-- that Kubernetes is the de facto container orchestration tool out there. And it will just keep growing, and the ecosystem is amazing, and innovation continues at a pretty high pace, or is even accelerating, which is, again, amazing to see.

And with that power, if you want to do Kubernetes yourself the hard way, as Kelsey put it, there is a lot of stuff that you need to do. I mean, even if you do it on VM infrastructure, like in GCP, you still need to handle the control plane and worker nodes and security and networking and upgrading and patching and node pools-- the whole thing. You have to monitor everything. You have to scale it up. You have to manage the entire stack yourself.

Here come all the managed services. GKE came out five years ago and took away a lot of the complexity that customers have to deal with if they do it themselves. It manages the control plane. GKE manages your node pools and workers for you, runs them on your behalf, and so forth, and takes away all of the extra work that is associated with do-it-yourself.

You still need to manage your workloads, your workers, node pools, provisioning, security, some of the configuration, and so forth. And over the years, GKE has done a lot of automation-- auto-upgrade, all the reliability effort, release channels, yada, yada, yada. Despite all this, users are still looking for an even simpler way, more ways to build their own developer platform that is more secure and more consistent for developers, because there are still a lot of knobs that you need to control and manage, even with the managed infrastructure or services provided by something like GKE.

So basically, this is where Autopilot comes into play. GKE Autopilot is a new mode of operation. We call it GKE Autopilot mode versus Standard, which is the existing or current GKE. Taking it a step further, we are, again, putting the mark on how we believe GKE should be run and managed across the board, and we're completely removing any notion that you need to handle or manage nodes in any shape or form.

CRAIG BOX: Now, your announcement says that Google Cloud provisions the cluster infrastructure based on your workload specifications.

YOCHAY KIRIATY: Yeah, exactly. So one of the things we found is that customers have quite some difficulty mapping workload requirements to the shape and size of their VMs, or their nodes and node pools, and they mostly over-provision, so you end up with extreme inefficiencies in terms of resource utilization. So there are two aspects to this. One is that, if I'm a platform team, a product team comes into play and says, hey, we have this big new shiny service, we need to launch it tomorrow, the CEO wants it-- now, here, go, take it.

And then you can ask, how many resources do you need? How does it look? All of those interactions that need to happen with a developer are friction points, and we want to reduce that friction. By using existing technologies we already have in GKE-- we're kind of just packaging everything here-- something like Node Auto-provisioning will assess the requirements and dynamically provision different VM shapes and sizes and node pools for you behind the scenes, and you can basically just deploy your workload. And if you configure everything correctly in terms of the autoscalers-- either HPA, VPA, or both-- then those will just kick in as well and scale. We're taking that friction point away from the platform team and the development team as well.
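The core idea of provisioning from workload requirements can be sketched in a few lines. This is a toy illustration, not GKE's actual provisioning logic, and the small machine catalog below is a hypothetical subset of shapes used only for the example:

```python
# Toy sketch of workload-driven provisioning: given a pod's resource
# requests, pick the smallest machine shape whose capacity covers it.
# The catalog is illustrative, not GKE's real decision input.

MACHINE_SHAPES = [
    # (name, vCPU, memory GiB), ordered smallest to largest
    ("e2-small", 2, 2),
    ("e2-standard-4", 4, 16),
    ("e2-standard-8", 8, 32),
]

def pick_machine(cpu_request, mem_request_gib):
    """Return the smallest shape whose capacity covers the request."""
    for name, vcpu, mem in MACHINE_SHAPES:
        if cpu_request <= vcpu and mem_request_gib <= mem:
            return name
    raise ValueError("no machine shape fits this request")

print(pick_machine(0.5, 1))   # a small pod fits the smallest shape
print(pick_machine(6, 20))    # a larger workload forces a bigger shape
```

In the real system the choice is made continuously and per node pool, but the shape of the decision, requests in, machine shapes out, is the friction this mode removes from the platform team.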

CRAIG BOX: Does that mean that every time I submit a new pod, I have a new node created to run it on?

YOCHAY KIRIATY: So it's not exactly every time that you submit a new pod you have a new node. We are doing bin packing. Unlike others, we are not guaranteeing, or we are not forcing one pod to one node. We believe there are reasons for multiple pods being together.

Pod affinity, for example, is still supported, for obvious reasons. And with that in mind, we will just update and adjust the shape of the node pool to optimize for that workload. So you don't have to do that kind of work.
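The placement problem described here is classic bin packing. A minimal first-fit sketch, purely illustrative of what a packer does, and not Autopilot's actual algorithm:

```python
# First-fit bin packing: place each pod (by CPU request) on the first
# node with room, opening a new node only when none fits. This is the
# kind of consolidation that avoids one-pod-per-node waste.

def first_fit(pod_cpu_requests, node_capacity):
    nodes = []  # each entry is the CPU already committed on that node
    for req in pod_cpu_requests:
        for i, used in enumerate(nodes):
            if used + req <= node_capacity:
                nodes[i] += req
                break
        else:
            nodes.append(req)  # no existing node has room; add one
    return nodes

# Six pods totalling 7 CPU land on two 4-CPU nodes instead of six.
print(len(first_fit([2, 1, 1, 1.5, 0.5, 1], 4)))  # → 2
```

Real schedulers also weigh memory, affinity rules, and spreading constraints, but the payoff is the same: fewer, fuller nodes.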

CRAIG BOX: And then the trade-off there is the Autopilot charges by pod spec. So the CPU, memory, and disk that you request, that's actually how you're choosing to pay rather than paying by node per hour.

YOCHAY KIRIATY: Yeah, so this is a good thing. Everything is around the pod. The billing is around the pod, and the SLAs are on the pod as well.

The narrative around that, or the thinking, is as follows. If you don't need to provision nodes and node pools anymore, then you don't need to worry about them, manage them, or pay for them. So you're basically paying for the pod definition that you have. Whatever is in your spec is what gets defined, charged, and billed for.

And therefore, the SLA is also at the pod level. We guarantee that your pods will be scheduled. Now, if the pod is scheduled and, for whatever reason, fails to start because of some error in user code, we can't guarantee anything about that. But we will ensure and guarantee that all your pods across all the zones will be available and able to be scheduled.
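The billing model can be illustrated with back-of-envelope arithmetic. The unit rates here are made-up placeholders, not real Autopilot prices; the point is only the shape of the calculation, billing on the pod's requests rather than on the node underneath:

```python
# Per-pod billing sketch: cost follows the pod spec's requests, so the
# node's unused capacity is not the customer's problem. Rates are
# illustrative placeholders, not Google Cloud's published pricing.

CPU_RATE_PER_HOUR = 0.04   # $ per requested vCPU-hour (illustrative)
MEM_RATE_PER_HOUR = 0.005  # $ per requested GiB-hour (illustrative)

def pod_cost(cpu_request, mem_request_gib, hours):
    """Cost of one pod over `hours`, billed on its requests alone."""
    return (cpu_request * CPU_RATE_PER_HOUR +
            mem_request_gib * MEM_RATE_PER_HOUR) * hours

# A pod requesting 0.5 vCPU and 2 GiB, running for a day:
print(round(pod_cost(0.5, 2.0, 24), 2))
```

Contrast this with node-based billing, where the same pod's cost depends on how efficiently it was packed onto a machine you rented whole.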

CRAIG BOX: And that will involve scaling the cluster up if necessary to achieve that.

YOCHAY KIRIATY: And this will involve, obviously, scaling the cluster, as well. It's important to understand that the nodes are visible. They are accessible to the user. The entire cluster is deployed within the customer project and network.

This is not a multitenant service. We're basically taking GKE and slapping on top of it, if you want to say, best practices, hardening, and guidelines, and everything around that, and packaging it into a managed environment where, basically, Google owns and manages the nodes for you.

CRAIG BOX: Back in 2019, we did a live show with Eric Brewer. And we took a couple of audience questions. One of them was someone asking about why there's no nodeless Kubernetes product from Google. Eric and I both talked about how we believe that there are things in Kubernetes that rely on the node abstraction to be able to run DaemonSets, for example, and define failure domains. Why have you chosen to build Autopilot with nodes rather than a nodeless Kubernetes product?

YOCHAY KIRIATY: As I said, there are certain aspects of Kubernetes within the nodes that are important, like DaemonSets. So Autopilot supports DaemonSets, as long as those DaemonSets don't violate certain rules that we have within Autopilot. And we can cover that in a bit.

Autopilot will not allow you to do node affinity. But as we said, you can do pod affinity and anti-affinity, all of those, in that regard. So it's not exactly nodeless, but it's not "not nodeless".

We truly believe in transparency. And we really want the customer to be comfortable where it is. Current GKE users and future ones, but definitely current ones are comfortable and familiar with the current notion of nodes and node pools.

This would basically just take it to the next level. You'll still be able to see those things but you just cannot access them. So we will show you really everything we are doing behind the scenes for you.

And there is another mode, which we call "break glass". Again, we can cover that later. The user can actually move from Autopilot back to Standard. The easiest way to explain it is that it's basically an upgrade process.

At the end of that, the user has full access and control to everything they can do today in GKE Standard mode. And at that point, you'll see the node pools and nodes, and so forth.

So we hope that not a lot of customers will need that. But in that sense, this is why we are not completely eviscerating the node and saying it's nodeless because it's much more than just the node. Managing the entire infrastructure, taking responsibility for the uptime across the board, security, as well, and so forth. It goes way beyond just the aspect of being nodeless.

CRAIG BOX: How would you contrast Autopilot versus something like the Virtual Kubelet, where there is effectively one giant node which just has machinery behind it that you don't necessarily get to see?

YOCHAY KIRIATY: I think from the developer's point of view, if it's a pure developer who just wants to kubectl apply and deploy pods, that's fine. There are not a lot of differences. There are a lot of differences at the backend in terms of componentization.

If you are an IT administrator and you need to segregate different resources for security and boundaries, and stuff like that, then I think there is still a good case where you need a little bit of control and visibility into how the underlying infrastructure looks. And all up, in Virtual Kubelet as it is, there are still plenty of gaps in the API in terms of what you can do with Kubernetes. So it's not exactly 100% of the Kubernetes API.

CRAIG BOX: With GKE, you get some choice over what operating system the nodes run. With build-your-own Kubernetes, you can, of course, bring anything you like to it. But the node operating system for GKE's Autopilot mode is Container-Optimized OS running containerd.

Does the user care? Does it matter? Is there a contract that says that it will always be that?

YOCHAY KIRIATY: GKE offers COS-- Container-Optimized OS-- as well as Ubuntu. And the large majority don't really care about that. So I would say the hope is that customers don't care.

Because if customers take any dependencies on any of that underlying infrastructure, then maybe that customer is really an advanced customer that needs that level of control and configurability. And that customer can still use GKE Standard. We are not taking that away whatsoever. It's completely there.

Autopilot is another mode of operation for GKE. So you have Standard and Autopilot. And you can think of it as, if I need that level of control, if I need to control and configure, if I need that flexibility, go with Standard. But if you're looking for something more managed and more opinionated, then Autopilot is definitely a place for you.

CRAIG BOX: Do you think that Autopilot, therefore, is only for smaller customers or less advanced customers?

YOCHAY KIRIATY: No, this is a great question. I think the answer is not necessarily that. I would even say on the contrary, some of our more advanced customers are saying, hey, you need to lock down the nodes. I want to prevent people from SSHing to my node. I don't want any random developer to run a privileged pod on the node and arbitrarily deploy whatever. Even if nobody has any kind of malicious intent behind them, the capability, if you give it to someone, most likely they will use it because it will just make their life easier for whatever reason.

And at that point, those platform teams basically lose containment or lose control over the shape of their fleet. It's a very heterogeneous fleet at this point. What we hear is that they want that level of control. That cookie-cutter makes everyone's life a lot easier. And it will save a lot of operational cost, time, and money basically down the road.

So no. I believe in the course of action, we'll see a lot of companies starting with maybe dev/test, because it's very easy and simple to start with, and the clusters are smaller there. But if you're building and developing and testing on Autopilot, then six months, a year from now, you'll be deploying on it as is.

I really hope that we'll see a shift in people's mindset. I know everybody loves that level of control. And as a geek myself, seeing all the knobs excites me. And I'm really happy I can touch and see what happens.

But on the day-to-day, to be able to run, you need to automate everything and monitor everything, and there is a huge investment behind this. Very big customers are actually doing that, at a very big cost. And the outcome of that process for those customers is basically a cookie-cutter platform that they can then provision and prescribe for their internal users.

So we say, hey, why don't we take the same approach? Autopilot is that baseline of cookie-cutter that takes 80%, 90% of what those teams were already doing and provides it for them. But we can do it for everybody at this point.

CRAIG BOX: So what are the things that those teams no longer have to do when they create a cluster in Autopilot mode?

YOCHAY KIRIATY: The first thing they don't have to do is monitor the nodes. Today, when you deploy GKE Standard with node pools, you're still on the hook to make sure that the nodes are there, alive, and running right. And you're still on the hook to monitor and manage core Kubernetes components-- if, for whatever reason, kube-proxy or kube-dns goes away on one of the nodes, there is no Google with a pager that will come and help you at that point. This is a mixed model of responsibility that is basically used across the board with almost all of the managed services.

So with Autopilot, you no longer need to take care of that. Google's SRE team will take this pager duty on your behalf and will ensure that the underlying nodes and infrastructure are there to run your pods. If you need to deploy your pods, if you need to scale, there is a place for the pods to run. So that will be number one. A very, very large investment over there.

The second one is an area we mentioned previously: resource inefficiencies. No more figuring out how to do bin packing. You basically pass the bin-packing pain, or cost, or whatever you want to call it, to someone else.

So Google will make sure that there is a place and infrastructure for your pod to run. And you're just paying for the resources that you're requesting, not for all the overhead it actually takes to manage and run a Kubernetes cluster.

And then, if you want to think about it more: from internal data, we know that about half of the support tickets opened about nodes are basically user-inflicted mistakes.

CRAIG BOX: Disk full.

YOCHAY KIRIATY: That's the other 50%.


We've seen customers deploying stuff like kernel modules that hose the nodes, making changes to iptables that break something else, and so forth, or changing some obscure configuration, whether they knew it or not. And we're going to take that away. You're going to save a lot of time because the node can no longer be mutated. So we take a lot of that pain away.

The last one is onboarding. Getting started with Kubernetes today has a pretty steep learning curve, and I believe that with Autopilot we are reducing that slightly, because we're taking away all the pain of figuring out how Kubernetes applies to a certain environment.

You still need to write a lot of YAML for your Kubernetes objects. But we're taking away almost all the peripheral work of integrating your Kubernetes cluster with the GCP environment.

CRAIG BOX: We've talked about how the Autopilot mode charges based on the requests that you make for your pods. The Datadog container survey says that almost half of containers are using less than 30% of the CPU that they've actually requested. Will GKE Autopilot handle this optimization for me?

YOCHAY KIRIATY: Yeah, this is a great point. And I was listening to Michael the other week on the podcast, a couple episodes ago. And this brings a smile to my face when he talked about this in particular. So the answer is 100% yes. And this goes back to what we've talked about.

Customers have a really difficult time figuring out how to assess a workload. And the fact that the YAML spec asks you, or forces you, to define resources and limits gets you to the point where you need to guess. Developers have certain ideas about it, but in production, naturally, it behaves differently.

And then when it comes to the operator or to the person who runs the platform, they need to handle 10 of those developer teams. So they need to figure out how to do this puzzle. And this puzzle, as we all know, is some kind of an NP-complete task at this point.

So what happened is that people just provisioned way, way more than they needed to. And the interesting aspect is that with Kubernetes, you have an allocatable resource, CPU and memory. And then there is the utilization.

The allocatable basically becomes the 100% that can actually get utilized. And on the allocatable side, pretty much as the Datadog survey says-- we have exactly the same data for our fleet-- it shows that customers are doing a really poor job in allocation. To the point that-- an example: if I took a certain customer with over 1,000 clusters, and I automatically transferred all of those into Autopilot clusters, that particular customer would actually see a sizable-- sizable being 10% and above-- decrease in pure compute cost.

Just because we could take the bin-packing efficiency from the 30% to 40% that this customer was at-- on the allocation side, not even utilization-- all the way to 50%, 60%, 70%, and even more than that.
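The arithmetic behind those efficiency numbers is simple to sketch. All figures here are illustrative, not the customer data discussed above:

```python
import math

# Why packing efficiency translates into compute cost: if requests only
# fill 40% of what a fleet has allocated, raising that to 70% shrinks
# the node footprint roughly proportionally.

def nodes_needed(total_requested_cpu, node_cpu, efficiency):
    """Nodes required when only `efficiency` of each node's capacity
    ends up holding requests (the rest is fragmentation/headroom)."""
    return math.ceil(total_requested_cpu / (node_cpu * efficiency))

before = nodes_needed(400, 8, 0.40)  # 400 CPUs of requests, packed at 40%
after = nodes_needed(400, 8, 0.70)   # same requests, packed at 70%
print(before, after)  # → 125 72
```

The same 400 CPUs of requests drop from 125 eight-core nodes to 72, which is where the "sizable decrease in pure compute cost" comes from before any utilization tuning.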

CRAIG BOX: What about then the difference between allocation and utilization? Should we be attaching vertical pod autoscalers to look at the usage of these containers?

YOCHAY KIRIATY: From the customer's point of view, if you define your pod and its resources, that's the only thing you need to take care of. We will own any inefficiencies that you have. So if, for whatever reason, we use a four-core machine and you're running only 1.5 cores, you're paying only for the 1.5.

The fact that we're using a four core machine behind the scenes -- it's great that you know that, but you don't really care. So that's one thing. Again, this is a shift in how people need to think about it.

CRAIG BOX: But it might also turn out that my workload only needs half a core. But I've requested 1.5.

YOCHAY KIRIATY: We will guarantee that you get billed on what you request, basically. And this is where stuff like VPA kicks into place and can definitely help. So both HPA and VPA are enabled by default, and customers can configure those objects against their pod specs and work with them. And then, obviously, the billing will be appropriate to what the pod is actually requesting.
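As a sketch of what that looks like, a VerticalPodAutoscaler is a separate object that points at the workload whose requests it should right-size; the workload and object names here are hypothetical:

```yaml
# Hypothetical VPA object targeting a Deployment named "web".
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: web-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web
  updatePolicy:
    updateMode: "Auto"   # apply the recommendations automatically
```

In Autopilot, resizing requests this way would also change what the pod is billed for, since billing follows the requested resources.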

On that note, GKE basically GA'd the multidimensional autoscaler, I believe this week or last week. And Autopilot will obviously pick that up by default, as well. So it's enabled, and if you're opted into that, then all this magic will happen for you behind the scenes, as well.

That's the value of all the automation that we can provide -- literally taking off that burden of thinking about how I should actually mix and match workloads to VM shapes and sizes. Which, again, is a pretty difficult task to begin with.

CRAIG BOX: Now you mentioned before that you can take a cluster in Autopilot mode and upgrade it to Standard mode, if you will. Can I add nodes to my Autopilot cluster? Can I have a mixed mode?

YOCHAY KIRIATY: If you look at our docs, you'll see exactly how we implement Autopilot mode. In that sense, Autopilot has a certain number of pods associated with a node, IP ranges, sizes, and so forth. Those sizes are fixed right now. And the cluster size, or number of nodes, basically changes as you scale up and down.

If and when you hit the maximum -- which we don't think people will hit anytime soon, though certainly a few of them will -- at that point, we have a few features down the road that will enable the cluster to dynamically increase in size, including increasing IP ranges dynamically. So adding more CIDRs to it, which will be quite cool. So I'm not too concerned about the size of the cluster.

And if, for whatever reason, Autopilot doesn't work for you, then, as we've said -- you were calling it an upgrade; we can debate whether it's an upgrade or a downgrade -- you can move from Autopilot mode into Standard mode.

Remember, we're not hiding any of the nodes, or anything like that. But you will basically revert to Standard. And you'll have the same cluster, with all the best configuration that we have, everything just as-is. We'll just break the seal, and now you're back to the responsibility of monitoring and managing those nodes, and so forth.

But node auto-provisioning and all the setup is still there. So if you are deploying a new workload and you haven't changed any of the settings, you will still have that same experience we just talked about, in terms of node pools being dynamically created for you with the right shape of node to accommodate your workloads.

CRAIG BOX: If I type "gcloud container clusters create" and don't specify anything, which kind of cluster do I get today?

YOCHAY KIRIATY: You will get GKE Standard, the current GKE experience. We just launched Autopilot. It's GA, with a full SLA, across the Rapid and Regular channels, and later Stable as well. You have full control if you want to create an Autopilot cluster. But the default currently is still Standard.
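For reference, the two modes correspond to two different create commands -- roughly as follows, with the cluster names and locations as placeholders (Autopilot clusters are regional):

```shell
# Standard mode: the default when no mode is specified.
gcloud container clusters create my-cluster --zone=us-central1-a

# Autopilot mode: opted into explicitly via the create-auto command.
gcloud container clusters create-auto my-autopilot-cluster --region=us-central1
```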

CRAIG BOX: Do you see that default changing over time?

YOCHAY KIRIATY: The answer is yes. I would very much want to see this changing. But I think that changing CLI defaults underneath people, without proper versioning of those CLIs, is just going to break a lot of existing automation. So we need to take that into account, and be conscious of the effect that such a change would have. Because it is a dramatic change.

CRAIG BOX: But if I'm coming along as a cluster administrator who just wants to create a cluster, do you think that Autopilot will become the default that I opt into?

YOCHAY KIRIATY: My hope is that down the road, a year or two from now, the answer would be definitely yes. Again, if you need an extra level of control -- if, for whatever reason, you need to SSH into a node, or you have a privileged pod that cannot run unprivileged -- then you can use Standard.

Even if you have those kinds of restrictions or requirements and you need to run GKE Standard, or you are already running in GKE Standard, you can use GKE Autopilot for other workloads. And you can mix and match. You have multi-cluster service discovery that went GA. You talked with Jeremy last week about that.

You can connect all those different clusters to a hub, and then you can definitely mix and match. So you can take just the few highly demanding workloads -- or whatever you want to call them -- that need those special configurations, put them on a Standard cluster, and then throw everything else on an Autopilot cluster. And that definitely would work for you.

CRAIG BOX: What's on the feature backlog? What things would you like to implement between now and the future world where it might become the default option?

YOCHAY KIRIATY: We're just starting with Autopilot. It's important to remember this is the first release, and we definitely have a long road ahead of us. That "break glass" experience we've talked about -- moving from Autopilot to Standard -- will be rolling out literally in a matter of weeks, as we speak.

We want to do burstable workloads. This will be a huge, huge benefit for customers, and it will make things easier for them to define. It's amazing.

Migration from Standard to Autopilot -- we talked about that default. Basically all of the fleet is currently on Standard, and customers may want to migrate. So we want to make that frictionless for them, so they don't need to recreate clusters -- just deploy.

Additional machine types: some customers will need more memory or more CPUs. We want to allow this easily, without creating node pools -- just by defining those in the pod specs, or even by inferring from the ratios they have. GPU and TPU, Windows support -- the list goes on. So it's very exciting where we're at.

CRAIG BOX: Well, it's very exciting to see. And congratulations on the launch. Good luck with the future. Thank you for joining us today.

YOCHAY KIRIATY: Thank you for having me. This has been great fun.

CRAIG BOX: You can find Yochay on Twitter @yochayk.


CRAIG BOX: Thank you very much, Lin, for helping out with the show today.

LIN SUN: Thanks so much for having me again, Craig.

CRAIG BOX: If you've enjoyed the show, please help us spread the word and tell a friend. If you have any feedback for us, you can find us on Twitter at Kubernetespod. Or reach us by email at kubernetespodcast@google.com.

LIN SUN: You can also check out the website at kubernetespodcast.com, where you will find transcripts and show notes, as well as links to subscribe.

CRAIG BOX: I'll be back with another guest host next week. So until then, thanks for listening.