#72 September 24, 2019
Kubernetes 1.16 is out, and our guest this week is its release manager, Lachlan Evenson. Lachie is a Principal Program Manager at Microsoft and an Australian living in the US; Craig and Adam are therefore method-interviewing, being this week in those two countries respectively.
Do you have something cool to share? Some questions? Let us know.
CRAIG BOX: Hi, and welcome to the Kubernetes Podcast from Google. I'm Craig Box.
ADAM GLICK: And I'm Adam Glick.
[THEME MUSIC]
CRAIG BOX: It's always great to see New Zealand in the news. What have you got for us this weekend?
ADAM GLICK: Oh, I saw this fantastic article. There is a thing called a redundancy meeting. It's basically, you're getting canned. And in New Zealand, they allow you to bring one emotional support person-- which is really actually a great idea because that can be a really hard time for people.
So this particular enterprising marketer-- I can only hope I have this kind of creativity some day-- went out and hired a clown to come in and be his emotional support person. And while they were explaining to him that they were letting him go from the company, and the severance and all that, the clown is making balloon animals--
[LAUGHING]
--and clowning his way through what could only have been an incredibly awkward meeting for all involved. And just-- my hat's off to that person.
CRAIG BOX: It does sound like the guy will have been emotionally supported during that time by his clown friend.
ADAM GLICK: You know, he made the best of a situation, and it turns out the story went a little bit viral. He's gotten a new job since then.
CRAIG BOX: [LAUGHING]
ADAM GLICK: So all ends well, but that's just beautiful. It just warmed my heart to see that this week.
CRAIG BOX: Down here in the South Pacific, I've been enjoying a week in Australia. I gave a talk at the Cloud Summit, which was in a giant room that they partition off with curtains to make lots of smaller rooms, and we had those little in-ear radios that people have to wear.
And I have to say, the ambience of that is very strange. When you're in that kind of place, and everyone's quiet, there's no interactivity and no one wants to shout out. Then I flew to Melbourne a few days later and gave a talk at the APIdays conference here, and I have to say, that room was fantastic. It was interactive. There were people having a good time.
So I know that a lot of countries have two cities that like to compete with one another. You have your London and your Birmingham, and you have your New York and your LA. And obviously, Sydney and Melbourne, I'm not trying to say I have a favorite, but just in this particular case, I have to say the Melbourne audience won out.
And so thank you if you did come along to either of those talks. I gave away a lot of podcast stickers. I'm sure I gained two or three new listeners out of that. But it was a good time overall and goes to show that it's not just clowning around in the South Pacific.
ADAM GLICK: Let's get to the news.
[MUSIC PLAYING]
ADAM GLICK: Kubernetes 1.16 has been released. Headline features include the general availability of custom resource definitions and dual-stack IPv4/IPv6. Listen to this week's interview to learn more.
CRAIG BOX: API gateway Traefik has made a 2.0 release this week, with new features including TCP support with SNI routing, multi-protocol ports, new middleware, and the new dashboard and web UI. They've also moved to YAML as a supported markup language, because Kubernetes.
ADAM GLICK: .NET Core 3.0 has been released. The project now embraces gRPC and includes a worker service project for background workers, which runs great on Kubernetes. The base container size has also shrunk by 60 megabytes, to 109 megabytes.
CRAIG BOX: Google Cloud has announced the general availability of container-native load balancing in GKE, with numerous improvements to scalability and performance. Scaling an app down to zero replicas and back up is now over 90% faster than in the beta, giving advantages to low-traffic services. Users can now manage their own load balancer using "standalone network endpoint groups", a beta feature which makes it easier to have a single load balancer manage GKE endpoints alongside other Google services.
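For reference, standalone network endpoint groups are requested with an annotation on a Service; here is a minimal sketch, with the Service name, selector, and port as placeholders.

```yaml
# Illustrative Service requesting standalone NEGs on GKE; names and ports
# are placeholders. GKE creates a network endpoint group per zone, which
# you can then attach to a load balancer you manage yourself.
apiVersion: v1
kind: Service
metadata:
  name: my-service
  annotations:
    cloud.google.com/neg: '{"exposed_ports": {"80": {}}}'
spec:
  type: ClusterIP
  selector:
    app: my-app
  ports:
    - port: 80
      targetPort: 8080
```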
Google has also announced that it will spend an additional 3 billion euros expanding its European data center footprint, alongside the biggest corporate purchase of renewable energy in history.
ADAM GLICK: CloudARK were at the recent Helm Summit in Amsterdam and shared five key takeaways in a recent blog post. On top of the long-promised security improvements, there are projects underway to consolidate artifacts like Helm charts and Kustomize patches, as their management is acknowledged as an issue, with several companies working on solutions. A number of talks were on operators, but all agreed that Helm and operators are complementary: Helm is for templating deployments, and operators are for day-two operations. CloudARK also noted that Helm is a welcoming community, and they did not find a high barrier to entry.
CRAIG BOX: Crossplane has announced version 0.3 of their multi-cloud Kubernetes control plane. In this release, they have broken the providers out of the main code tree so that each integration can be released at its own pace. They've also added support for Pivotal's PKS and the Yugabyte database service, on top of the existing support for the three major cloud providers. Provisioning and lifecycle management for networking and IAM permissions are now included. They have also included a new Crossplane runtime to simplify building controllers, and a WordPress app stack.
ADAM GLICK: Agones has gone 1.0. For any of you who love to get your game on, this is good news, or at least for those developing backends for those games. Agones, which we covered with Cyril Tovena and Mark Mandel in episode 26, is an open-source game server hosting framework that runs on Kubernetes. This release brings numerous improvements, including updated documentation and a list of stability fixes to ensure that 1.0 is production-ready.
CRAIG BOX: Bloomberg has built a TPM node attestation plugin for SPIRE, the SPIFFE runtime environment. Now, if you're running sensitive workloads on a machine with a trusted platform module installed, you can include TPM claims as part of your workload attestation policies. If you don't recognize any of those words, please check out episode 45 with Andrew Jessup.
ADAM GLICK: Azure has announced that their AKS service is now available for users of their Government Cloud and that egress lockdown is now generally available for AKS.
They also released AKS Periscope, an open-source tool for troubleshooting your AKS clusters. The tool deploys onto your cluster nodes and captures data about your pods and nodes, as well as a host of networking settings like iptables, DNS, and connectivity checks, amongst other data collected. The tool currently supports Linux nodes and containers, and the data is automatically stored in the Azure Blob service.
CRAIG BOX: Sumo Logic is expanding their monitoring tools with a new Istio app. As well as monitoring open-source Istio installations, the tool works with Sumo Logic's existing GKE monitoring and Kubernetes app to provide broad monitoring support for Google's Anthos. Sumo Logic now says that it provides comprehensive monitoring for both the infrastructure and application layers.
ADAM GLICK: The latest Forrester Continuous Integration Wave has been released, and Google Cloud is placed as a leader, achieving the highest score for their current offering and the highest score for their strategy. In their blog post, Google Cloud called out the serverless nature of their CI tools, as well as the flexibility, security, and compliance features that they provide.
CRAIG BOX: Updates to Banzai Cloud's operators this week. A new version 2 of their logging operator is more Kubernetes-native, allowing users to filter logs based on Kubernetes labels, and support for the Istio 1.3 release is included in their Istio operator, including version upgrades.
ADAM GLICK: Quentin Hardy from Google Cloud has an interesting post in "Forbes" this past week about the challenges around the term "cloud-native." In particular, he points out that it's a bit of an overloaded term, with several different definitions in active use. In some cases, people are referring to companies that have always provided services over the internet, like Snap or Google, while other people prefer to think of it as a set of cloud-native technologies like Kubernetes. How this will be resolved is left as an exercise for the reader, but I'll propose one potentially biased perspective: as Kubernetes becomes the technology that underlies more and more applications, the distinction may evaporate.
CRAIG BOX: Was that a cloud joke?
ADAM GLICK: Might have been.
CRAIG BOX: Citrix announced that it has integrated Istio into its ADC-- or Application Delivery Controller-- portfolio, which you may remember from its time known as Netscaler. You can now use the Istio APIs to program ADC as an ingress gateway or for sidecar proxies for microservices.
ADAM GLICK: Finally, ContainerShip, a hosting platform founded in 2015 and later re-platformed onto Kubernetes, has announced that they are calling it quits this week. They've announced that they will cease operations on October 31 and have advised customers to move their applications before then.
CRAIG BOX: And that's the news.
[MUSIC PLAYING]
ADAM GLICK: Lachlan Evenson is a principal program manager at Microsoft and the release manager for Kubernetes 1.16. Welcome to the show, Lachlan.
LACHLAN EVENSON: It's great to be here. Thanks for having me, Adam and Craig.
CRAIG BOX: Lachie, I've been looking forward to chatting to you for some time. We first met at KubeCon Berlin in 2017 when you were with Deis. Let's start with a question on everyone's ears-- which part of England are you from?
LACHLAN EVENSON: The prison part. See, we didn't have a choice of going to Australia, but I'd like to say we got the upper hand in the long run. We got that beautiful country, so yes, from Australia, the southern part of England-- the southern tip.
CRAIG BOX: We did set that question up a little bit. I'm actually in Australia this week, and I'll let you know it's quite a nice place. I can't imagine why you would have left.
LACHLAN EVENSON: Yeah, it seems fitting that you're interviewing an Australian from Australia, and that Australian is in San Francisco.
CRAIG BOX: Oh, well, thank you very much for joining us and making it work. This is the third in our occasional series of release lead interviews. We talked to Josh and Tim from Red Hat and VMware, respectively, in episode 10, and we talked to Aaron from Google in episode 46. And we asked all three how their journey in cloud-native started. What was your start to cloud-native?
LACHLAN EVENSON: I remember back in early 2014, I was working for a company called Lithium Technologies. And we'd been using containers for quite some time, and my boss at the time had put a challenge out to me-- go and find a way to orchestrate these containers because they seem to be providing quite a bit of value to our developer velocity.
So he gave me a week, and he said, go and check out both Mesos and Kubernetes. And at the end of that week, I had Kubernetes up and running, and I had workloads scheduled. I was a little bit more challenged on the Mesos side, but Kubernetes was there, and I had it up and running. And from there, I was actually invited to speak at the Kubernetes 1.0 launch at OSCON in Portland in 2014, I believe.
CRAIG BOX: So a real early adopter?
LACHLAN EVENSON: Really, really early. I remember, I think, I started in 0.8, before CrashLoopBackOff was a thing. I remember writing that thing myself.
[LAUGHING]
CRAIG BOX: You were contributing to the code at that point as well?
LACHLAN EVENSON: I was just a user. I was part of the community at that point, but from a user perspective. So I showed up to things like the community meeting. I remember meeting Sarah Novotny in the very early years of the community meeting, and I spent some time in SIG Apps, so really looking at how people were putting workloads onto Kubernetes-- so going through that whole process.
It turned out we built some tools like Helm, before Helm existed, to facilitate rollout and putting applications onto Kubernetes. And then, once Helm existed, that's when I met the folks from Deis, and I said, hey, I think you want to get rid of this code that we've built internally and then go and use the open-source code that Helm provided.
So we got into the Helm ecosystem there, and I subsequently went and worked for Deis, specifically on professional services-- so helping people out in the community with their Kubernetes journey. And that was when we actually met Craig back in Berlin. It seems, you know, I say container years are like dog years; it's 7:1.
CRAIG BOX: Right.
LACHLAN EVENSON: So seven years ago, we were about 50 years-- much younger. And if you go and pull up the photo for me--
CRAIG BOX: That sounds like the same ratio as kangaroos to people in Australia.
LACHLAN EVENSON: It's much the same arithmetic, yes.
ADAM GLICK: What was the most interesting implementation that you ran into at that time?
LACHLAN EVENSON: There weren't a lot of the workload APIs. Back in 1.0, there weren't even Deployments. There wasn't Ingress. So back in the day, there were a lot of people at that point trying to build those workload APIs on top of Kubernetes, but they didn't actually have any way to extend Kubernetes itself. There were no third-party resources, no operators, no custom resources.
So a lot of people were actually trying to figure out how to interact with the Kubernetes API and deliver things like deployments, because in those days, you didn't have ReplicaSets. You had a ReplicationController-- what we called the RC, back in the day. So you didn't have a lot of these things that we take for granted today. There wasn't RBAC. There weren't a lot of the things that we have today.
So it's great to have seen and been a part of the Kubernetes community from 0.8 to 1.16, and actually leading that release. So I've seen a lot, and it's been a wonderful part of my adventures in open-source.
ADAM GLICK: You were also part of the Deis team that transitioned and became a part of the Microsoft team. What was that transition like, from small startup to joining a large player in the cloud and technology community?
LACHLAN EVENSON: It was fantastic. When we came on board with Microsoft, they didn't have a managed Kubernetes offering, and we were brought on to try and seed that. There was also a bigger part, which was building open-source tools to help people in the community integrate. Brendan Burns was on the team. We had Gabe [Monroy]. And we really had that top-down support-- a belief in placing a bet on open-source-- which gave us the autonomy to go and solve problems in open-source, along with contributing to things like Kubernetes.
So I'm part of the upstream team from a PM perspective, and we have a bunch of engineers and a bunch of PMs that are actually working on these things in the Cloud Native Computing Foundation to help folks integrate their workloads into things like Kubernetes and aid their cloud-native journeys.
CRAIG BOX: So there's a number of new tools, and specifications, and so on that are still coming out from Microsoft under the Deis brand. That must be exciting to you as one of the people who joined from Deis initially.
LACHLAN EVENSON: Yeah, absolutely. So we kept that Deis brand-- it's now Deis Labs-- because we really wanted it as a home to signal to the community that we were building things in the hope of putting them out into a foundation. So you may see things like CNAB, Cloud Native Application Bundles. I know you've had both Ralph and Jeremy on the show before--
CRAIG BOX: We have.
LACHLAN EVENSON: --talking about CNAB, SMI-- the Service Mesh Interface-- and other tooling in the ecosystem where we want to signal to the community that we intend to give it to a foundation. So we really want a neutral place to begin that nascent work. Virtual Kubelet, for example, started there as well, and it went out into the Cloud Native Computing Foundation.
ADAM GLICK: Any consternation about the fact that Phippy has become the logo people look to rather than the actual owl [Captain Kube], as part of the donated characters?
LACHLAN EVENSON: Yes, so it's interesting, because I didn't actually work on that project back at Deis, but the Deis folks Karen Chu and Matt Butcher actually created "The Illustrated Children's Guide to Kubernetes," which I thought was fantastic.
ADAM GLICK: Totally.
LACHLAN EVENSON: Because I could sit down and read it to my parents, as well, and tell them-- it wasn't for children. It was more for the adults in my life, I like to say. And so when I give out a copy of that book, I'm like, take it home and read it to mum. She might actually understand what you do by the end of that book.
But it was really a creative way, because this was back in that nascent Kubernetes where people were trying to get their head around those concepts-- what is a pod? What is a secret? What is a namespace? So having that vehicle of a fun set of characters--
ADAM GLICK: Yep.
LACHLAN EVENSON: And Phippy is a PHP app. Remember them? So yeah, it's totally in line with the things that we were seeing people want to containerize and put onto Kubernetes at that time. But Phippy is still cute. I was asked last week about Captain Kube, as well, on the release logo, so we could talk about that a little bit more. But there's a swag of characters in there that are quite cute and illustrate the fun spirit behind the Kubernetes community.
CRAIG BOX: 1.16 has just been released. You were the release manager for that-- congratulations.
LACHLAN EVENSON: Thank you very much. It was a pleasure to serve the community.
CRAIG BOX: What are the headline announcements in Kubernetes 1.16?
LACHLAN EVENSON: Well, I think there are a few. Custom Resources hit GA. Now, that is a big milestone for extensibility in Kubernetes. I know we've spoken about this for some time-- custom resources were introduced in 1.7, and we've been trying to work through that ecosystem to bring the API up to a GA standard. So it hit GA, and I think a lot of the features that went in as part of the GA release will help people in the community who are writing operators.
So there's a lot of lifecycle management, a lot of tooling that you can put into the APIs themselves. There are strict schema checks-- you can do typing, you can do validation, you can prune superfluous fields-- allowing for that ecosystem of operators and extensibility in the community to exist on top of Kubernetes.
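To make that concrete, here is a minimal sketch of a custom resource definition on the GA apiextensions.k8s.io/v1 API, with a structural schema providing the typing, validation, and pruning mentioned above. The CronTab group and fields are illustrative, in the style of the upstream documentation examples.

```yaml
# Illustrative CRD using the GA apiextensions.k8s.io/v1 API.
# The example.com group and the CronTab fields are hypothetical.
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: crontabs.example.com
spec:
  group: example.com
  names:
    kind: CronTab
    plural: crontabs
    singular: crontab
  scope: Namespaced
  versions:
    - name: v1
      served: true
      storage: true
      schema:
        openAPIV3Schema:        # structural schema: every field is typed
          type: object
          properties:
            spec:
              type: object
              properties:
                cronSpec:
                  type: string  # validation via an OpenAPI pattern
                  pattern: '^(\d+|\*)(/\d+)?(\s+(\d+|\*)(/\d+)?){4}$'
                replicas:
                  type: integer
                  minimum: 1
# In v1, fields not declared in the schema are pruned from stored objects.
```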
So I think it's been a long road to get to GA for Custom Resources, but it's great now that they're here and people can really bank on that being an API they can use to extend Kubernetes. So I'd say that's a large headline feature. The metrics overhaul, as well-- so I know this was on the release blog.
The metrics team has actually tried to standardize the metrics in Kubernetes and put them through the same paces as all the other enhancements that go into Kubernetes. So they're really working through: what are the criteria? How do we make them standard? How do we test them? How do we make sure that they're extensible? So it was great to see that team step up and create stable metrics that everybody can build and stack on.
Finally, there were some other additions to CSI as well. Volume resizing was added. This is a maturity story around the Container Storage Interface, which went GA several releases ago. But really, you've seen volume providers build on that interface, and the interface get a little bit broader to adopt things like, I want to resize my storage volume dynamically at runtime. So that's a great story as well for those providers out there.
I think they're the big headline features for 1.16, but there are a slew of others. There were 31 enhancements that went into Kubernetes 1.16. And I know there have been questions out there in the community saying, well, how do we decide what's stable? Eight of those were stable, eight were beta, and the rest, the 15 remaining, were actually in alpha. So there were quite a few things that went from alpha into beta and beta into stable, and I think that's a good progression for the release as well.
ADAM GLICK: As you've looked at all these, which of them is your personal favorite?
LACHLAN EVENSON: I probably have two. One is a little bit biased, because I personally worked on dual-stack with the team in the community. So dual-stack is the ability to give IPv4 and IPv6 addresses to both pods and services. And where I think this is interesting for the community is that Kubernetes is becoming a runtime that is going to new spaces. So think IoT, think edge, think cloud edge.
So when you're pushing Kubernetes into these new operational environments, things like addressing may become a problem, where you might want to run thousands and thousands of pods which all need IP addresses. So having that crossover point where I can have v4 and v6 at the same time, and get comfortable with v6-- I think Kubernetes may be an accelerator for v6 adoption through things like IoT workloads on top of Kubernetes.
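As an illustration only: dual-stack was alpha in 1.16 behind a feature gate, and the API shape changed before it stabilized in later releases. This sketch uses the field names from the stabilized Service API and assumes a cluster already configured with both IPv4 and IPv6 pod and service CIDRs.

```yaml
# Illustrative dual-stack Service, shown with the field names from the
# stabilized API in later releases; 1.16's alpha shape differed.
apiVersion: v1
kind: Service
metadata:
  name: my-service
spec:
  ipFamilyPolicy: PreferDualStack   # ask for both families if available
  ipFamilies:                       # order picks the primary family
    - IPv4
    - IPv6
  selector:
    app: my-app
  ports:
    - port: 80
      targetPort: 8080
```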
So that was one. The other one is Endpoint Slices. Endpoint Slices are about scaling. As you may know, services have endpoints attached to them, and endpoints are all the pod IPs that actually match the label selector on a service. Now, when you have large clusters, you can imagine the number of pod IPs attached to that service growing to tens of thousands. And when you update that, everything that watches those service endpoints needs to get an update-- the delta changes over time-- which gets rather large as things are being added and removed, as is the dynamic nature of Kubernetes.
But what Endpoint Slices make available is that you can actually slice those endpoints up into groups of 100 and then only update the ones that you really need to worry about, which means, as a scaling factor, we don't need to send everybody listening tens of thousands of updates. We only need to update a subsection. So I'd say they're my two highlights, yeah.
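For a picture of what that looks like, here is an illustrative EndpointSlice. In 1.16 the resource was alpha under the discovery.k8s.io group, and in practice slices are generated by the control plane for each Service rather than written by hand; the shape below is the one the API later settled on, with placeholder names and addresses.

```yaml
# Illustrative EndpointSlice; these are normally created by the endpoint
# slice controller for each Service, not authored by users.
apiVersion: discovery.k8s.io/v1
kind: EndpointSlice
metadata:
  name: my-service-abc12
  labels:
    kubernetes.io/service-name: my-service  # ties the slice to its Service
addressType: IPv4
ports:
  - name: http
    protocol: TCP
    port: 8080
endpoints:
  - addresses: ["10.0.1.17"]
    conditions:
      ready: true
  - addresses: ["10.0.2.41"]
    conditions:
      ready: true
# A large Service is split across many slices (roughly 100 endpoints each),
# so a change only triggers updates for the affected slice.
```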
CRAIG BOX: Are there any early stage or alpha features that you're excited to see where they go personally?
LACHLAN EVENSON: Personally, ephemeral containers. The tooling that you have available at runtime in a pod is dependent on the constituents-- the containers that are part of that pod. And what we've seen is containers being built from scratch, and tools like distroless from the folks out of Google, where you can build scratch containers that don't actually have any tooling inside them, just the raw compiled binaries. If you want to go in and debug that at runtime, it's incredibly difficult to insert something.
And this is where ephemeral containers come in. I can actually insert a container into a running pod-- and let's just call that a debug container-- that has all my slew of tools that I need to debug that running workload, and I can insert that into a pod at runtime. So I think ephemeral containers is a really interesting feature that's been included in 1.16 in alpha, which allows a greater debugging story for the Kubernetes community.
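As a sketch of what that looks like in the pod spec: in 1.16 this was alpha and applied through the pods/ephemeralcontainers subresource, and in later releases it surfaced as kubectl debug. The pod, images, and names below are purely illustrative.

```yaml
# Illustrative view of a running pod after a debug container has been
# attached. In 1.16 (alpha) this was done via the pods/ephemeralcontainers
# subresource; later releases wrap it in a command along the lines of
# "kubectl debug -it my-pod --image=busybox --target=app".
apiVersion: v1
kind: Pod
metadata:
  name: my-pod
spec:
  containers:
    - name: app                  # distroless/scratch image: no shell, no tools
      image: example.com/my-app:1.0
  ephemeralContainers:
    - name: debugger
      image: busybox             # brings its own debugging tools
      command: ["sh"]
      stdin: true
      tty: true
      targetContainerName: app   # attach alongside the app container
```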
ADAM GLICK: What feature that slipped do you wish would have made it into the release?
LACHLAN EVENSON: So the feature that slipped that I was a little disappointed that slipped was Sidecar Containers.
ADAM GLICK: Right.
LACHLAN EVENSON: In the world of service meshes, you may want to order the start of some containers, and it's very specific to things like service meshes in the case of the data plane. I need the Envoy sidecar to start before everything else so that it can wire up the networking.
The inverse is true as well. I need it to stop last. So sidecar containers gave you that ordered start. And what we see a lot of people doing in the ecosystem is just laying down one sidecar per node as a DaemonSet, and they want that to start before all the other pods on the machine. Or if it's inside the pod, in the context of one pod, they want to say that sidecar needs to stop after all the other containers in the pod. So giving you that ordered guarantee, I think, is really interesting and really hot, especially with the service mesh ecosystem heating up.
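For context, and purely as an illustration of where this eventually landed: the ordering described here did not ship in 1.16 or 1.17, and when native sidecars did arrive in Kubernetes, many releases later, they took the shape of an init container with restartPolicy: Always rather than the API being proposed at the time. The image and names below are placeholders.

```yaml
# Illustrative only: how native sidecars eventually landed (Kubernetes 1.28+).
# An init container with restartPolicy: Always starts before the app
# containers, keeps running alongside them, and is stopped after them.
apiVersion: v1
kind: Pod
metadata:
  name: app-with-proxy
spec:
  initContainers:
    - name: proxy
      image: envoyproxy/envoy:v1.27.0
      restartPolicy: Always   # marks this init container as a sidecar
  containers:
    - name: app
      image: example.com/my-app:1.0
```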
CRAIG BOX: This release removes a few deprecated beta API versions for things like ReplicaSets and Deployments. That will break deployments for the group of people who have just taken example code off the web and don't really understand it. The GA versions of these APIs were released in 1.9, so that's obviously a long time ago, and there's been a lot of preparation going into this. But what considerations and concerns have there been about the fact that these are now being removed in this particular release?
LACHLAN EVENSON: Let me start by saying that this is the first release that we've had a big API deprecation, so the proof is going to be in the pudding.
CRAIG BOX: Yes.
LACHLAN EVENSON: And we do have an API deprecation policy. So as you mentioned, Craig, the Apps v1 has been around since 1.9. If you go and read the API deprecation policy, you can see that we have a three-release announcement. So around the 1.12, 1.13 time frame, we actually went and announced this deprecation, and over the last few releases, we've been reiterating that.
But really, what we want to do is get the whole community on those stable APIs because it really starts to become a problem when we're supporting all these many now-deprecated APIs, and people are building tooling around them and trying to build reliable tooling. So this is the first test for us to move people, and I'm sure it will break a lot of tools that depend on things. But I think in the long run, once we get onto those stable APIs, people can actually guarantee that their tools work, and it's going to become easier in the long run.
So we've put quite a bit of work into announcing this. There was a blog sent out about six months ago by Vallery Lancey in the Kubernetes community which said, hey, go use 'kubectl convert', where you can actually say, I want to convert this resource from this API version to that API version, and it makes that really easy. I think there will be some problems in the ecosystem, but we need to do this going forward, pruning out the old APIs and making sure that people are on the stable ones.
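As an illustration of the kind of change 1.16 forces, here is the same Deployment moved from one of the removed beta groups to apps/v1, which among other things makes spec.selector required. The name and image are placeholders, and a command along the lines of 'kubectl convert -f deployment.yaml --output-version apps/v1' could generate the new version automatically.

```yaml
# Before (no longer served in 1.16):
#   apiVersion: extensions/v1beta1
#   kind: Deployment
# After, on the stable API group:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 3
  selector:            # required in apps/v1
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
        - name: my-app
          image: example.com/my-app:1.0
```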
ADAM GLICK: Congratulations on the release of 1.16. Obviously, that's a big thing. It must have been a lot of work for you. Can you talk a little bit about what went into leading this release?
LACHLAN EVENSON: The job of the release lead is to oversee the whole process of the release and make sure that the release gets out the door on a specific schedule. So really, what that is is wrangling a lot of different resources and a lot of different people in the community, and making sure that they show up and do the things that they are committed to as part of their duties as either SIG chairs or other roles in the community, and making sure that enhancements are in the right state, and code shows up at the right time, and that things are looking green.
So a lot of it is just making sure you know who to contact and how to contact them, and asking them to actually show up. But when I was asked at the end of the 1.15 release cycle if I would lead, I had to consider how much time it was going to take, and the scheduling-- hours each week dedicated to making sure that this release actually hits the shelves on time and is of a certain quality. So there are lots of pieces to that.
ADAM GLICK: Had you been on the path through the shadow program for release management?
LACHLAN EVENSON: Yeah, I had. I actually joined the shadow program-- the shadow program for the release team. So the Kubernetes release team is tasked with staffing a specific release, and I came in on the 1.14 release under the lead of Aaron Crickenberger. I was an enhancements shadow at that point. I was really interested in how KEPs-- Kubernetes Enhancement Proposals-- worked, so I wanted to make sure that I understood that part of the release team, and I came in and helped in that release.
And then, in 1.15, I was asked if I could be a lead shadow. And the lead shadow stands alongside the lead and helps the lead fulfill their duties. So if they're out, or if they need people to wrangle different parts of the community, I would go out and do that. So I've served on three releases at this point-- 1.14, 1.15, and 1.16.
CRAIG BOX: Thank you for your service.
LACHLAN EVENSON: Absolutely, it's my pleasure.
ADAM GLICK: Release manager emeritus is the next role for you, I assume?
LACHLAN EVENSON: [LAUGHS] Yes. We also have a new role on the release team called emeritus advisor, which is actually to go back and help answer the questions of, why was this decision made? How can we do better? What was this like in the previous release? So we do have that continuity, and in 1.17, we have the old release lead from 1.15, Claire Laurence, coming back to fill in as emeritus advisor. So that is something we do take seriously.
And I think, for the shadow program in general, the release team is a really good example of how you can actually build continuity across releases in an open-source fashion. We actually have a session at KubeCon San Diego on how that shadowing program works-- so allow me to add a little plug for coming along to that. But it's really to get people excited about how we can do mentoring in open-source communities and make sure that the project goes on after all of us have rolled on and off the team.
ADAM GLICK: Speaking of the team, there were 32 people involved, including yourself, in this release. What is it like to coordinate that group? That sounds like a full time job.
LACHLAN EVENSON: It is a full-time job. And let me say that the release team in 1.16 represented five different continents. We can count Antarctica as not having anybody, but we didn't have anybody from South America for this release, which was unfortunate. But we had people from Australia, China, India, Tanzania-- so we had a good spread-- plus Europe and North America. So it's great to have that spread and that continuity, which allowed us to get things done throughout the day.
CRAIG BOX: Until you want to schedule a meeting.
LACHLAN EVENSON: Scheduling a meeting was extremely difficult. So typically, on the release team, we run one meeting that's friendly for Western Europe and North America, and then we ask the team if they would like to hold another meeting. Now, in the case of 1.16, they didn't want to hold another meeting-- we actually put it out to a survey. But in previous releases, we held one in the morning EU time so that people in India, as well, or maybe even late at night in China, could be involved.
ADAM GLICK: Any interesting facts about the team, besides the incredible geographic diversity that you had to work around?
LACHLAN EVENSON: What I really appreciate about the release team is that we're from all different backgrounds, from all different parts of the world, and all different companies. There are people who are doing this on their own time, there are people who are doing this on company time, but we all come together with that shared common goal of shipping that release.
So in this release, we had the five continents. And it's really exciting that in 1.17, the lead roles are represented mainly by women. So 1.17, watch out-- most of the leads for 1.17 are women, which is a great result, and it's through that shadow program that we can foster different types of talent. So I'm excited to see future releases benefiting from diverse groups of people from the Kubernetes community.
CRAIG BOX: What are you going to put in the proverbial envelope for the 1.17 team?
LACHLAN EVENSON: We've had this theme of a lot of roles on the release team being cut and dried, right? We have these release handbooks, so each of the members of the team is cut into different roles. There are seven different roles on the team. There's the lead. There's the CI signal role. There's bug triage. There's comms. There's docs. And there's release notes. And there's also the release branch managers, who actually cut the code and make sure that it has shipped and ends up in all the repositories.
In the previous release, 1.15, we actually had a role called the test-infra role. And thanks to the wonderful work of the folks on the test-infra team out of Google-- Katharine [Berry], and Ben Elder, and other folks-- they actually automated this role so completely that we could get rid of it in the 1.16 release and still be able to get a release out the door.
So I think a lot of these things are ripe for automation, and therefore, we can have a lot less of a footprint going forward. So let's automate the bits of the process that we can and refine the process to make sure that the people who are involved are not doing the same repetitive tasks over and over again. In the area of enhancements, we could streamline that process. CI signal and bug triage-- those are places we could actually go in and automate as well. I think one place that's been done really well in 1.16 was the release notes.
So I don't know if you've seen relnotes.k8s.io, but you can go and check out the release notes there. Now, basically, annotated PRs show up as release notes that are searchable and sortable, all through automated means, whereas that previously took some YAML jockeying to make sure it would actually happen and be digestible to users.
CRAIG BOX: Come on, Lachie, all Kubernetes is just some YAML jockeying.
[LAUGHING]
LACHLAN EVENSON: Yeah, but it's great to have an outcome where we can actually make that searchable and get people out of the mundaneness of things like, let's make sure we're copying and pasting YAML from left to right.
ADAM GLICK: After the release last week, you had a retrospective meeting. What was the takeaway from that meeting?
LACHLAN EVENSON: At the end of each release, we do have a retrospective. It's during the community meeting. That retrospective, it was good. I was just really excited to see that there were so many positives. It's a typical retrospective where we go, what did we say we were going to do last release? Did we do that? What was great? What can we do better? And some actions out of that.
So it was great to see people giving other people on the team so many compliments. It was really, really deep and rich, saying, thank you for doing this, thank you for doing that. People showed up and pulled their weight in the release team, and other people were acknowledging that. So that was great.
I think one thing we want to do is-- so we have a code freeze as part of the release process, which is where we make sure that code basically stops going into master in Kubernetes. Only things destined for the release can actually be put in there. But we don't actually stop the test infrastructure from changing, so the test infrastructure has a lifecycle of its own.
So one of the things that was proposed was that we actually code freeze the test infrastructure as well, to make sure that we're not looking at changes in test-infra causing jobs to fail while we're trying to stabilize the code. So I think that's something we have some high-level agreement about, but getting down into the low-level nitty-gritty would be great in 1.17 and beyond.
ADAM GLICK: We talked about sidecar containers slipping out of this release. Most of the features are on a release train and are put in when they're ready. What does it mean for the process of managing a release when those things happen?
LACHLAN EVENSON: Basically, we have an enhancements freeze, and that applies to the KEPs that are backing these enhancements-- so sidecar containers would have had an enhancement proposal. And the SIG that owns that code would then need to sign off and say that this is in a state called "implementable": we've agreed on the high-level details, and you can go ahead and implement it.
Now, that had actually happened in the case of sidecar containers. The challenge was you still need to write the code and get the code actually implemented, and there's a month gap between enhancement freeze and code freeze. So if the code doesn't show up, or the code shows up and needs to be reviewed a little bit more, you may miss that deadline.
I think that's what happened in the case of this specific feature. It went all the way through to code freeze, the code wasn't complete at that time, and we basically had to make a call-- do we want to grant it an exception? In this case, they didn't ask for an exception. They said, let's just move it to 1.17.
So a lot of people and SIGs still show up at the start of a new release and put forward all of the things they want to ship, and obviously, throughout the release, a lot of those things get plucked off. So I think we went from something like 60 enhancements, and what we got out the door was 31. They either fall off as part of the enhancements freeze or as part of the code freeze, and that is absolutely typical of any release.
ADAM GLICK: Do you think that a three-month wait is acceptable for something that might have had a one- or two-week slip, or would you like to see enhancements be able to be released in point releases between the three-month releases?
LACHLAN EVENSON: Yeah, there's back and forth about this in the community. How can we actually roll things out at different cadences is, I think, the high-level question. Tim Hockin actually put out, how about we do stability cycles as well? Because there are a lot of new features going in, and there are a lot of stability features going in. But if you look at it, half of the features were beta or stable, and the other half were alpha, which means we're still introducing a lot more complexity and largely untested code in the alpha state-- which, as much as we wouldn't like to admit it, does affect the stability of the system.
So there's talk of LTS. There's talk of stability releases as well. I think they're all interesting things now that Kubernetes has that momentum and you are seeing a lot of things go to GA. People are like, I don't need to be drinking from the fire hose as fast. I have CRDs in GA. I have all these other things in GA. Do I actually need to consume this at that rate? So stay tuned. If you're interested in those discussions, the upstream community is having them-- show up there and voice your opinion.
CRAIG BOX: Is this the first release with its own release mascot?
LACHLAN EVENSON: I think the release mascot goes back to-- I would like to say 1.11. If you go back to 1.11, you can actually see the different mascots. I remember 1.11 being "The Hobbit." So it's the Hobbiton front door of Bilbo Baggins with the Kubernetes helm on the front of it, and that one was called "eleventy-one"--
CRAIG BOX: Uh-huh.
LACHLAN EVENSON: "A long-expected release." So they go through each release, and you can actually go check them out in the SIG Release repository upstream.
CRAIG BOX: I do think this is the first time that's managed to make it into a blog post, though.
LACHLAN EVENSON: I do think it is the case. I wanted to have a little bit of fun with the release team, so typically you will see the release teams have a t-shirt. I have, from 1.14, the Caternetes, which Aaron designed, which has a bunch of cats kind of trying to look at a Kubernetes logo.
CRAIG BOX: There was a fun conversation we had with Aaron about his love of cats.
LACHLAN EVENSON: [LAUGHS] And it becomes a token of, hey, remember this hard work that you put together? So it becomes a badge of honor for everybody that participated in the release. So I wanted to highlight it as a release mascot. I don't think a lot of people knew that we did have those across the last few releases. But it's just a bit of fun, and I wanted to put my own spin on things just so that the team could come together. So a lot of it was around the laughs that we had as a team throughout this release-- and my love of Olive Garden.
CRAIG BOX: Your love of Olive Garden feels like it may have become a meme in the community, which might need a little explanation for our audience. For those who are not familiar with American fine dining, can we start with what exactly is Olive Garden?
LACHLAN EVENSON: Olive Garden is the finest Italian dining experience you will have in the continental United States. I see everybody's faces saying, is he sure about that? I'm sure.
CRAIG BOX: That might require a slight justification on behalf of some of our Italian-American listeners.
ADAM GLICK: Is it the unlimited breadsticks and salad that really does it for you, or is it the plastic boat that it comes in?
LACHLAN EVENSON: I think it's a combination of all three things. You know, the Tour of Italy-- you can't go past it. The free breadsticks are fantastic. But Olive Garden just represents the large chain restaurant, and that kind of childhood I had, growing up and thinking about these large-scale chain restaurants. You don't get to choose your meme. And the legacy-- I would have liked to have had a different mascot.
But I just had a run with the meme of Olive Garden. And this came about, I would like to say, about three or four months ago. Paris Pittman from Google, who is another member of the Kubernetes community, kind of put out there, what's your favorite sit-down large-scale restaurant? And of course, I pitched in very early and said, it's got to be the Olive Garden.
And then everybody kind of jumped onto that. And my inbox is full of free Olive Garden gift certificates now, and it's taken on a life of its own. And at this point, I'm just embracing it-- so much so that we might even have the 1.16 release party at an Olive Garden in San Diego, if it can accommodate 10,000 people.
ADAM GLICK: When you're there, are you family?
LACHLAN EVENSON: Yes. Absolutely, absolutely. And I would have loved to put that. I think the release name was "unlimited breadsticks for all." I would have liked to have done, "When you're here, you're family," but that is, sadly, trademarked.
ADAM GLICK: Ah. What's next for you in the community?
LACHLAN EVENSON: I've really been looking at Cluster API a lot-- building Kubernetes clusters with a declarative approach. So I've been taking a look at what we can do in the Cluster API ecosystem. I'm also a chair of SIG PM, so I'm helping foster the KEP process as well, making sure that it continues to happen and continues to be fruitful for the community.
And I'm running for a position on the steering committee as well, so we'll see if that happens. That's a reminder to go and vote-- by no means a plug for me, but the Kubernetes steering committee elections are happening now. So if you're a part of that, please go and vote.
CRAIG BOX: With that, it only remains for us to say, Lachie, thank you so much for joining us today.
LACHLAN EVENSON: Absolutely, the pleasure was all mine. And thank you for this great show, and I wish you very much success for continued episodes.
CRAIG BOX: You can find Lachie on Twitter at @LachlanEvenson.
[MUSIC PLAYING]
CRAIG BOX: Thanks for listening. As always, if you've enjoyed the show, please help us spread the word-- tell a friend. If you have any feedback for us, you can find us on Twitter @kubernetespod, or you can reach us by email at kubernetespodcast@google.com.
ADAM GLICK: You can also check out our website at kubernetespodcast.com, where you'll find transcripts and show notes. Until next time, take care.
CRAIG BOX: See you next week.
[THEME MUSIC]