#12 July 17, 2018

Kubernetes Origins, with Joe Beda

Hosts: Craig Box, Adam Glick

Joe Beda, Craig McLuckie and Brendan Burns are considered the “co-founders” of Kubernetes; working with the cluster management teams at Google, they made the case that their implementation of the Borg and Omega patterns should become a proper product. Joe and Craig now run Heptio, a company working to bring Kubernetes to the enterprise. Your hosts talk to Joe Beda about the history of Kubernetes, creating a diverse company, and what exactly is wrong with YAML.


ADAM GLICK: Hi, and welcome to the Kubernetes Podcast from Google. I'm Adam Glick.

CRAIG BOX: And I'm Craig Box.

[MUSIC PLAYING]

ADAM GLICK: Hey, Craig. How's it going?

CRAIG BOX: Well, if everything plays out as we hope, I'm on a plane as I record this episode.

ADAM GLICK: Excellent.

CRAIG BOX: Actually, I'm on a plane as you listen to this episode as it's released. Through the magic of technology, I am making my way to California for Google Cloud Next next week.

ADAM GLICK: Awesome. You all prepped and ready to go?

CRAIG BOX: Always.

ADAM GLICK: Excellent.

CRAIG BOX: That's a way of saying, eh, there are a couple of last-minute things I'm working on, but that's the fun of life.

ADAM GLICK: Indeed. I'm looking forward to it as well. I'll be heading there next week for Next. Not to overuse the "next" term, but it should be an exciting time: a lot of great sessions, and a lot of good things going on, both in the cloud native world that we'll be talking about and in the broader cloud world. So it'll be an exciting time. And for you who are listening, we'll be recording some shows, so feel free to swing by and find the two of us. We'd love to meet you.

CRAIG BOX: Absolutely. Let's look at this week's news.

ADAM GLICK: Canonical has released Minimal Ubuntu, a version of Ubuntu that's 50% smaller and boots 40% faster than standard Ubuntu server images. The new OS version is designed for both VM and containerized usage, and offers 16.04 LTS and 18.04 LTS images. Images are available in Google Cloud and AWS, as well as on Docker Hub. This new version has a kernel optimized for cloud hypervisors, and is fully compatible with the full version of Ubuntu Server, but has user-interactive parts, like documentation, editors, and locales, removed.

CRAIG BOX: Sysdig this week published an open source security guide with their suggestions for securing Docker and Kubernetes. The first part focuses on runtime security, and not surprisingly, they recommend the use of their open source project Sysdig Falco, a behavior monitoring and [INAUDIBLE]. The post talks about how you can take the output of Falco and plug it into your observability systems, and is well worth a read.

ADAM GLICK: Red Hat, home of JBoss and OpenShift, last week posted an article arguing that Kubernetes is the new application server. They point out that although many common languages may already be portable, with examples being Java WAR files or interpreted language files in Ruby, Python or Node, real portability can be elusive, with things like accessing the file system, meaning forward or backward slashes, language version differences, driver issues, and such. They point out that Kubernetes and containers solve many of these issues, and address common challenges like service discovery, basic invocation, elasticity, logging, monitoring, tracing, CI/CD pipelines, resilience and authentication.

CRAIG BOX: The Kubernetes project this week published several blogs diving deep into parts of the new Kubernetes 1.11 release, including IPVS in-cluster load balancing, CoreDNS, resizing persistent volumes, and dynamic kubelet configuration. Two weeks ago, we interviewed the release managers for Kubernetes 1.11 and the upcoming 1.12, and we've posted the transcript of that interview to the kubernetes.io blog. If you haven't had a chance to listen to the episode, we encourage you to check it out.

ADAM GLICK: Elastifile this week showcased their new integrations with Kubernetes and TensorFlow. Elastifile is a startup in the hybrid storage solutions space, provides functionality like [INAUDIBLE] and NFS file server capabilities in the cloud, and works with all three major cloud vendors.

CRAIG BOX: Heptio announced version 0.9 of their Ark backup tool for Kubernetes clusters. New features include integration with the open source backup project Restic, with a C, to back up Kubernetes volumes, and additional support for exporting metrics in the Prometheus data format. Users of previous versions are encouraged to upgrade, as it also includes a critical bug fix to mitigate a potential data corruption issue.

ADAM GLICK: And that's the news. We now have the great pleasure to welcome Joe Beda to the podcast. Joe is an engineer who started his career at Microsoft, working on Internet Explorer, which was actually the first place that both of us met. While at Google, he launched products including Google Talk and Google Compute Engine before helping to create Kubernetes and launching Google Kubernetes Engine. Joe is currently a co-founder and CTO of Heptio, a company working to make Kubernetes work well in the enterprise. Welcome, Joe.

JOE BEDA: Thank you so much for having me. This is a lot of fun.

CRAIG BOX: Thanks for coming. So this week, we're talking about three years since 1.0. And about a month ago, there was a blog post which you wrote about four years since the actual launch of Kubernetes. It's good that they're together in June or July, but which is the milestone you think we should be celebrating?

JOE BEDA: It's hard for me to celebrate any of this stuff, I think. There's so much about the community coming together, seeing things develop over time. It's really a very different project now than it was then.

And I think about the same thing when I go to KubeCon, where sometimes I'll have people say, hey, look what you started. I'm like, I didn't really do this. It's like, you get the ball rolling and then things happen. I think I'm more excited about the launch, and getting the ideas out there, and starting the community, and getting things off on the right foot. But I don't want to put too much weight on those initial moments. Because really, it's everybody who put everything in over those last four years that makes Kubernetes what it is.

ADAM GLICK: As you look at that-- and you mentioned that Kubernetes has obviously changed a lot over time, and continues to grow and change, with many side projects-- what do you consider the canonical Kubernetes origin story? People talk a lot about where it came from, and you put up that blog post, which is really fascinating. What's the canonical story?

CRAIG BOX: Take us back to that moment.

JOE BEDA: I think a lot of it is the context of what was happening at Google at that time. Craig and I had started Google Compute Engine and got that launched. Google was taking cloud more and more seriously, with more and more investment into cloud. Along the way, Brendan was working on some projects that would eventually become sort of configuration-type stuff with cloud. And there was this real feeling.

And I remember early on, we were presenting-- this was right when I had a new boss, who was from other places at Google. And we were showing GCE. And we'd say, hey, look. You can call these APIs, go to this console. You get a virtual machine. You can SSH in. And look, you have a root prompt.

And to most of the world outside of Google, that's really cool that you can get a virtual machine up. We were faster in terms of launching machines than any of the competition. But the reaction from folks inside Google is like, now what?

And honestly, that "now what" question around I have a machine, now what, has fascinated and has driven so much around how people think about, how they consume, what they do with cloud. So the big push around cloud is not getting those machines. It's, how do you use those machines in a really useful way so that you can turn that into business impact? And so that realization that virtual machines were not where things are going, or at least not where the value is. And then, comparing and contrasting that to how things work inside of Google with things like Borg.

And so for Google to take cloud seriously, Google really needed to dog food their own platform. And so what that meant was you either convince all of the Googlers to start using raw virtual machines, which would have felt like a step backwards. Or you know, we bring Borg-- and I'm using air quotes here-- to the rest of the world.

So that's really where Kubernetes came from. We got to roll forward. We got to get these ideas out of Google to the rest of the world because it helps solve those problems in a way that had been proven at Google for 10 years. But also, it was a way to actually start converging the way that Googlers built software and the way that the rest of the world built software.

And we knew that that was going to be a long journey to be able to do that. And then it was just a matter of, how do we do that? How do we take the ideas in Borg that have been proven out over the last 10 years? And just to be clear, none of us-- me, Brendan, or Craig-- actually helped get Borg off the ground.

I joined Google and was an early user of Borg in some of the earlier projects I was on at Google. Compute Engine was built on top of Borg. Brendan worked on search that used Borg. So like, we had been consumers of Borg, but we weren't the ones that were actually driving Borg internally at Google. But those ideas were clear to us. We saw the usefulness.

ADAM GLICK: Who was driving Borg at that time?

JOE BEDA: There were a lot of folks working on it. And a lot of the investment there and the effort at the time was really split into two different areas. There were the folks working on Borg, which was, hey, we've got a lot of customers. We've got a lot of applications. We've got a lot of stuff that we need to make better and better and continue to refine. Improve the economics. Improve the reporting. Improve the isolation. So there were a lot of folks working on that.

Meanwhile, there were a lot of other folks working on a system called Omega, which was sort of the successor to Borg. And I'm not the one to tell the Borg or the Omega story. I think Brian Grant was one of the folks driving Omega. And over time, the ideas with Omega leaked into Kubernetes. And some of the people started mixing together. And so those ideas from both Borg and Omega really found their way into Kubernetes.

But there was a question, and there were folks who were very much pushing for us to either build a service on top of Borg-- essentially, find a way to expose Borg as sort of a new service with GCP, or to open source Borg itself. And the intuition that Craig and Brendan and I had was that neither of those plans is great.

Open sourcing Borg-- I mean, I love Borg. It's a great system. But it's a real-world system that's been battle-tested within Google for 10 years. It's written in C++. It has its tendrils throughout the rest of Google. It's really not built in a way where it would be easy to extract it to be able to release it. And even if we did, the concepts and some of the assumptions that it makes may not apply to the outside world.

So at the same time, we see Docker starting to emerge, which is a new twist on how to package up and think about containers. There are parallels between the way that Docker uses containers and the way that Google uses containers internally, but there are also big differences there. And so just open sourcing Borg was not really viable in our minds, both in terms of the amount of work and effort and churn there, but also because we didn't think the end product was quite right for an external audience.

And then similarly, if we were going to create a product and sort of create an external-facing API for Borg, Borg wasn't built for that type of scale in terms of number of users. Because the thing is when you build a system for order of number of teams at Google, that's different than order of number of companies that are going to use your cloud platform across the entire world, right? It's a very different thing. And so it wasn't built for that scale.

And we also knew that we needed to change that paradigm for how people thought about building and releasing applications. And it was going to be very, very hard to do that with a proprietary service. And I think we'd seen that with the sort of fragmentation around platform as a service, and how difficult it is to get people to write applications that are really targeted at a very specific set of APIs. At this level, at least.

You add all that together and the decision was really like, let's do something new. Let's do it open source. And let's try and change the way people think about deploying applications so that it's more compatible with the way that Google thinks about deploying applications.

ADAM GLICK: You talk about making it open source. Obviously, Google is a big creator of open source and also contributor to it, but not every project is necessarily turned into an open source project. Was there any discussion about whether it should be open source, or from the get-go were people all on board that that's the right way to move forward with Kubernetes?

JOE BEDA: It took a long time to convince folks that Kubernetes should be open source. Google is, I think, changing its relationship with open source, and has over the last four years since we introduced Kubernetes. I think if you think about the open source efforts that Google had at the time, you'd look at things like Android and Chrome. And Android and Chrome are open source projects. They're great. But it's very clear that Google is in the driver's seat for both of those projects. They are not really community-driven projects in the same vein as something like Kubernetes. And so we really wanted to do something new and different with Kubernetes that went beyond sort of the Chrome and the Android model of open source.

At the same time, there was this argument around-- Google had released things like the MapReduce paper, or the Bigtable paper, or the GFS paper. And these things fundamentally changed how people thought about doing distributed systems and launched whole new industries around things like Hadoop.

And what we saw-- and I led an effort for a couple of months to try and figure out, how can we actually bring Hadoop, the Hadoop ecosystem, back in on top of GCE, and reuse some of the plumbing that we had built for MapReduce internal to Google to accelerate and make Hadoop better? And what we found is that the abstractions, the models, were just slightly different enough that it was very, very difficult to do that. And so the end result was that Google, through its research, ended up igniting these really interesting ecosystems. But because there was no code that was shared-- because it was just the ideas and not the actual implementation-- there was really a limited ability to import this and benefit from it. And so that was one of the big arguments that we made when we were arguing to open source Kubernetes.

CRAIG BOX: One of the reasons that we celebrate the anniversary of the 1.0 announcement is obviously that we had the foundation of the CNCF and the formalization of that external community. But it's also because customers don't like using untested software, and they're very keen for things to be labeled 1.0 or greater, or to be [INAUDIBLE] generally available. Are you surprised at the uptake we've had since that point? Tell me about your experience with seeing people use Kubernetes, and the things that have come in as contributions from outside people, or things that companies have asked for that you wouldn't have thought of when you were building it out, before it got that 1.0 tag on it.

JOE BEDA: Well, I think when we look at sort of the early contributors and what they brought into Kubernetes, you definitely have to give a nod towards Red Hat. I remember there were early meetings at Google I/O before we announced Kubernetes publicly where it was Craig and me, and I think Brendan was there, talking to the Red Hat folks, working to convince them to base the next version of OpenShift on top of Kubernetes. And they took a bet on us. And it's worked out well for everybody in terms of that community. And I think that Kubernetes benefited enormously by having that outside perspective and that outside boost from the get-go.

And one of the interesting things that-- the perspectives that the OpenShift folks brought in early on was this idea of namespaces. Now, they call them projects in OpenShift. I mean, I remember we had arguments over, what should we call them inside of Kubernetes itself? But some of the ideas early on with Kubernetes when we thought about GKE and how we were going to host it was, well, maybe you have a Kubernetes per namespace and it becomes a single tenant-type of thing, or a Kubernetes cluster per project, per GCP project. And then you have a one-to-one relationship between Kubernetes clusters and GCP projects.

But the OpenShift folks were like, well, no-- we're looking at the OpenShift Online stuff. We look at how we deploy it. We really need this idea of, if not multi-tenancy, then multi-team, to be baked in in a fundamental way.

And I think the project is better for that. I think it does create some confusion for GKE because there is this question of, when do you create multiple projects? When do you create multiple namespaces? And I think there's an awkwardness there. But I think when we look at sort of the applicability of Kubernetes more widely, that contribution and that perspective that the Red Hat folks brought was really, really critical.
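
[Editor's note: for listeners who haven't used them, a namespace is just another Kubernetes API object, and the multi-team pattern Joe describes is typically a namespace per team, often paired with a quota. A minimal sketch, with illustrative names:]

```yaml
# A namespace per team, rather than a cluster per team.
apiVersion: v1
kind: Namespace
metadata:
  name: team-a
---
# A quota scoped to that namespace, so teams can share one cluster safely.
apiVersion: v1
kind: ResourceQuota
metadata:
  name: team-a-quota
  namespace: team-a
spec:
  hard:
    requests.cpu: "10"
    requests.memory: 20Gi
    pods: "50"
```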

Do you label something 1.0 and call it GA so that customers can use it? I mean, this is just classic Google. I mean, how long was Gmail labeled beta? I think there is a thinking inside of Google that calling something GA, calling it 1.0-- Google holds the bar very, very, very, very high for itself.

And I think part of becoming a cloud platform for Google, part of Google understanding how other people think about IT, is setting those expectations appropriately around what the quality bar is, what is GA, what the assurances are. I think the fact that Google holds the bar very high, and that feeling translates to the Kubernetes community is good. But I think it's also possible to overdo it.

If you ask me, I'd probably say Kubernetes isn't 1.0 yet, right? Because there are still things we want to do. There's still rough edges.

CRAIG BOX: A lot of people do talk about the idea that, obviously, 1.0 was an MVP. It was: get something out there that meets the standard where the API is unlikely to change. But there were large parts of what we think of as Kubernetes today that weren't there. Robust access control didn't come until long after 1.0. And all the extension points-- we went through third-party resources, and we eventually renamed them and settled on custom resources.

A lot of these extensions and things that happen afterwards, obviously we've been able to make changes to that basic thing. But is there anything that's so baked into Kubernetes that you wish you'd been able to change it at the time knowing what you know now?

JOE BEDA: Oh, yeah. Like so many things. Mistakes were made. But I don't think they're fatal. And I think some of the stuff is that there were concepts in Kubernetes that were well-supported by our experiences at Google, where it's like, yes, this is the right way to do it. We know that this is going to work well. I think the relationship between pods and replica sets-- it's not exactly the way things were done with Borg, but it's very much informed by the core parts of what happens with Borg. Omega was well on its way to moving to this label-based way of relating things.

I think we've done OK with some other stuff to replicate things-- things like deployments are not exactly the way things are done at Google, but I think that they're a useful primitive. The one that I'm unsure about-- and I think if you got Tim Hockin on here, he'd probably tell you the same thing-- I don't know if we got the service abstraction right, because we've really put a lot of stuff into service. It can really be three different things in different layers. The relationship between service and ingress is a little bit overwrought-- service versus endpoints, the layout of the Endpoints object. So I think that whole thing could have done with a little bit more refactoring and baking.
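
[Editor's note: to make the "three different things" concrete, here is a minimal Service manifest, with illustrative names. One object handles label-based endpoint selection, a stable virtual IP, and port mapping:]

```yaml
apiVersion: v1
kind: Service
metadata:
  name: frontend
spec:
  # Role 1: discovery -- select pods by label; the matching pod IPs
  # are tracked in a separate Endpoints object of the same name.
  selector:
    app: frontend
  # Role 2: a stable virtual IP (ClusterIP) in front of those pods.
  type: ClusterIP
  # Role 3: port mapping from the service port to the pods' port.
  ports:
  - port: 80
    targetPort: 8080
```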

Again, I don't think it's fatal. But I also think that it's not as elegant as it probably could have been if we had more time to adjust it. So Kelsey, Brendan, and I wrote the "Kubernetes: Up and Running" book, and I did the service discovery chapter. And I rewrote that a couple of times to try and find the best way to present the service object, because I think it really is one of the trickiest things to get across to folks.

ADAM GLICK: As you think about the stuff that you're building and where it's going over the next four years, what are you looking forward to? And where do you see Kubernetes being four years from now?

JOE BEDA: I think different people have different visions on where they want to take Kubernetes. I am in the sort of microkernel minimalist camp. I think that I want to see a thriving ecosystem. I want to see a lot of ideas tried out outside of the Kubernetes core. And so it worries me because I find myself-- like I think successful projects are defined by their ability to say no. But it's hard because people come at you and they're like, here's a scenario. Here's what we want to do. This is really important.

And then the question is, well, does that have to be built-in to Kubernetes? Is that something that can actually be done outside? So I'll say yes to extension points. Let's make sure we get those extension points right. But I'll also say, I'll take a really critical look, a really hard line, at adding new features that maybe don't need to be built-in and can be layered on top.

And so an example: recently, a bunch of folks from SIG Storage came to the architecture SIG and were asking about snapshots-- how do we build snapshots into Kubernetes, snapshots for volumes? And so my push-- and I think it's an ongoing discussion; I don't know where it's going to end up-- my push is, can we take these features and bake them down into primitives? Can we actually go through and find things that have surprising reusability in other places? Because I think we've found that that has worked well in the past.

But it's a lot of work to do that. It's a lot of churn. People just want to ship their feature. They want to get the stuff out there. And so there's that tension that we're fighting.

So as I look over the next four years, what I'd love to see is Kubernetes calm down and continue its march towards boring. But along the way, I'd love to see an explosion of stuff happening in the larger ecosystem, whether those projects are formally under the Kubernetes umbrella, or whether it's a bunch of projects that aren't part of Kubernetes officially but also play into that system. Because I think that ecosystem will pay dividends and will scale way more than any centrally-controlled project possibly could.

CRAIG BOX: So after you were at Google, you joined Accel as an entrepreneur in residence. And that afforded you some time to think about what you were going to do next. Was it always going to be something to do with Kubernetes?

JOE BEDA: No. I mean, I left Google. I was a little burned out. I took some sabbatical and some vacation. And then I pinged one of the partners at Accel who I'd been talking with, and friendly with, for a while. He's like, hey, do you want to become an entrepreneur in residence? And my first question was, what is that?

So for the listeners-- every time I talk about this with folks, I'd love to demystify this stuff a little bit, and really provide a bit of a decoder ring.

CRAIG BOX: Please.

JOE BEDA: So early venture-- this is like seed and series A type of venture-- is really built on reputation and relationship more than anything else. The assumption is that you start a company and things are going to change. You're going to have unexpected situations. The market's going to change. And so they really want to bet on the people.

You also want to have a good sort of stable of ideas. You want to be creative. You want to have a direction that you want to go. But you also want to bet on those people, knowing that they're actually going to be able to roll with it and come out with something great at the end, right?

A great example is Slack, the communication application. You know, that started out as a gaming company. And they built Slack as an [INAUDIBLE] tool.

CRAIG BOX: It was the co-founders of Flickr, I think. People [INAUDIBLE] experience or history.

JOE BEDA: Yeah. An Accel portfolio company, also.

So an entrepreneur in residence, it's really, I think-- in some ways, you can think about this as a venture firm taking an option out on somebody. It's like, hey, let's build a relationship. Let's give you some support. And in return, you'll get to know us. We'll get to know you. If something interesting happens, if you start a company, then things will go a lot more smoothly because everybody is a known quantity at that point. So it's really a relationship building tool.

So for me, they helped me pay for flights down to the Bay Area to talk to people. We introduced each other to other people. They sometimes had me brainstorm ideas with other folks. So a lot of that was sort of trying to ideate. And I went into the EIR not sure if I was going to do a startup at all, because it's a big commitment.

I've always viewed my career as sort of doing tours of duty. You know, you sign up to do like four or five years on a project because that seems to be the amount of time it takes to do something interesting. And so early in my career, I did a bunch of stuff at Google Talk. That was three or four years. I did some Ad stuff, three or four years. Then I did the Cloud stuff, that was three or four years. And so you see these things.

Whereas, I think a startup, that's a longer commitment. You're probably talking-- if you're going to stick with it, it's 5 to 10 years. You don't know how long this is going to take. So I wanted to be super serious that I was ready for that marathon. I wanted to make sure I had a set of co-founders that I could trust because picking a co-founder is kind of like getting married. It's somebody that you're going to have to live with day to day for a long, long time.

And one of the main causes for startups to fail is founder drama, founder disagreement. So having somebody that you've worked with before that you trust, that you can mind meld with is critically important. So I wanted that. And then, I wanted to be able to see an area, an idea where I could see a business being built. Because I think-- I'm based in Seattle. Accel is down in the Bay Area.

I think Seattle startups bring a little bit more of a-- by necessity, because of the funding. And you know, it's just not quite the same environment up here. You have to think about how you're going to build a real business. What does this stuff look like?

And I think in the Bay Area, you can get money and start throwing stuff at the wall and you'll find somebody to fund you. In the consumer space, if you can get 100 million people to do anything, you're going to make money, right?

ADAM GLICK: Yep.

CRAIG BOX: Did you try Kubernetes for pets?

JOE BEDA: Yeah, exactly, right? But that's like the Snapchat, Instagram model, right? It's like, if you can get a lot of people doing stuff-- scooters now, I guess-- then, you can make money.

B2B, you got to be more thoughtful about it. And so we wanted to see a path, or at least I wanted to see a path for like, how do we build a real business on this stuff? And I think you have to be doubly careful when you have that open source angle.

And so I was exploring all sorts of things, but the opportunity around Kubernetes has continued to grow. It was clear that it was really gaining steam. It was really a do or die moment of like, if I don't take this opportunity, nothing in my career is really going to line up like this again in terms of doing a startup.

ADAM GLICK: When you were looking at those things-- Kubernetes is obviously a slightly bigger ecosystem today, but there's a whole swath of directions you could go with it-- how did you narrow down the list of possibilities for what direction you wanted to take Heptio, and the pieces that Heptio was going to take the lead on and build?

JOE BEDA: Yeah, thank you. I mean, the first thing is we looked at explosions of similar ecosystems, and what happened, and how things played out with the commercial entities that were formed around those. And I think it's very instructive to look at the Hadoop ecosystem. It's very instructive to look at the OpenStack ecosystem. And look to see what worked and what didn't work in those spaces. And there's a couple of themes that I took out of this. And those are things that we're bringing to Heptio.

So the first one is that you can't view a startup that's attached to an open source ecosystem as a value extraction exercise. You can't just go in there and say, hey, we're just going to build something on top of that. We're going to let the community do its thing and we're going to be the ones who monetize it. We didn't want to actually just pull stuff out because that will engender a lot of resentment within the community. You're not investing in the community. Things will eventually peter out.

If Google is the only one investing in Kubernetes, at some point that's not a healthy thing. We really do need to find a way to get lots more people investing in it. And I wanted Heptio-- commensurate with our size because we're still pretty small-- I wanted Heptio to be able to really go through and play its part.

And then, the second lesson out of that was that your business model can't be predicated on the open source project being broken. One of our engineers described this business model as, like, crawling over broken glass as a service. And so you take open source that's fundamentally unusable, fundamentally broken, and you're the ones who clean it up and make it consumable for other people. And you see this. This is just the whole idea of a distribution: oh, that open source stuff is so messy, we're the ones who are going to actually clean it up, and stabilize it, and make it available to you.

I didn't want to see Kubernetes move in that direction. I want people to be able to take upstream Kubernetes as it is and be successful with it, which means that we can't build our business on the idea that we're just going to help you get Kubernetes installed and maintain it.

There is a business there right now doing that. But one of the things that we're investing in is things like SIG Cluster Lifecycle. We just had a blog post on the Heptio blog about everything that's new with kubeadm in 1.11. As we march towards GA for kubeadm, we're adding HA there. There's a lot of stabilization going on there. So a lot of good work that Tim St. Clair at Heptio has been doing, working with a bunch of folks in the community around that. And that's really our effort to get to the point where upstream is as usable as possible.

And so our value-add then has to be around building things on top of Kubernetes that are complementary to Kubernetes. And so, as we start releasing more commercial product-- things that actually grow HKS, our Kubernetes subscription, beyond just support-- you're going to be seeing some of that in the coming months as we start getting it released.

And some of the open source efforts that we're working on: things like Ark, which is a backup and disaster recovery tool; Sonobuoy, which is a diagnostic tool and the basis for the conformance suite for Kubernetes; Ksonnet, which is a configuration tool where we're trying out some new ideas, trying to take things in a new direction; or Gimbal and Contour, which is an ingress controller based on Envoy, to solve north-south traffic problems. We're trying out new ideas there. We're trying to move things forward, not just in core Kubernetes, but also to introduce new things that really help to fill in gaps in the larger ecosystem as we see it.

CRAIG BOX: I'd like to ask about a couple of those projects that Heptio have built to work around the core of Kubernetes like you mentioned. Ksonnet is a templating language for creating Kubernetes objects. And I know that both you and Brendan have worked on high-level abstractions. And I think it was you that said that you think of YAML as being a bit like a machine language. But YAML by design claims to be a human readable format, at least comparable to things like JSON. So I guess my question is, what's wrong with YAML?

JOE BEDA: There's nothing wrong with YAML, per se. I actually really like YAML. And a little while ago, I met one of the original authors of the YAML spec and I apologized to him.

CRAIG BOX: I saw a post recently from the original author of the YAML spec who apologized to everybody.

JOE BEDA: I don't think the problem was with YAML, per se. And I think it's interesting to look at a little bit of the history here. Because when we built Kubernetes, there was an API server. And early on with GCP, there was a machine learning service-- like, one of the early things with machine learning. And I remember we went to a Google I/O demo. And the way they demoed it is that they were writing JSON in an [INAUDIBLE] editor, and then using [INAUDIBLE] to upload it.

And it's like, that does demo it, but you need something that is a little bit more user friendly for people to interact with, to touch and feel stuff, right? And so as we were building Kubernetes, we had the API server, and we needed something that would facilitate actually touching and actuating the API server. So that's sort of where kubectl came from, right? I think it was called kubecfg early on, and then it got renamed to kubectl-- "kube cuttle," "kube ectl," whatever you want to call it.

Kris Nova calls it "kube ectl." It's like one of the worst ways to pronounce these things. And so the early versions of that were like, oh, well, you write JSON, and this is just a facilitator for being able to upload this stuff to the API server. And it was really about the API being the thing.

Now, the API for Kubernetes is really built to be a machine readable API, something for different components to react to. Because it's not only humans that are doing this. You have things like the [INAUDIBLE], and controllers, and everybody is using that same API. So it's built the way machine communications are built: unambiguous and verbose.

And so that API was really built to be as explicit, as necessarily verbose, as possible, to enable the type of interactions between components in the system. But then over time, that leaked into the user experience as we built this tool that became kubectl. We did it with JSON.

Well, JSON's a total pain in the butt. There's no comments. The quotes and the commas. JSON is just horrible from a sort of writing-stuff point of view. Any time you edit your config in VS Code, you start cursing JSON.

And so YAML is just a more human way of actually writing the JSON. We don't use any of the more advanced features of YAML. I think those things kind of confuse things a little bit.

The big mistake here-- when we say that the YAML for Kubernetes is a machine language, or that it wasn't created to be written by humans-- is that it's too verbose. There are too many granular concepts. You'll find that you typically repeat the same labels three times.
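
[Editor's note: here is what that repetition looks like in practice-- a minimal Deployment, with illustrative names, where the same label appears three times:]

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: hello
  labels:
    app: hello        # 1: labels on the Deployment object itself
spec:
  replicas: 2
  selector:
    matchLabels:
      app: hello      # 2: the selector the Deployment uses to find its pods
  template:
    metadata:
      labels:
        app: hello    # 3: the labels stamped onto each pod it creates
    spec:
      containers:
      - name: hello
        image: gcr.io/google-samples/hello-app:1.0
```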

Now, there's reasons where you may want to vary these things, but those are maybe like the 1% reasons versus the 99% reasons. And so there was always this thinking that there would be higher-level abstractions that would help to sort of take the intent and translate that into those API objects. I'm kind of horrified that we're still authoring those API objects directly by hand because we never intended those things to be human consumable.

But it's a challenge, especially in the face of CRDs and the expansion of the number of objects, to find the right way to capture that intent in a way that both makes it easy to use, but also doesn't restrict you from doing more advanced things, and grows with you as we start extending Kubernetes into new directions and new places. And so that's some of the challenge that tools like Ksonnet are working on. Brendan has his Metaparticle stuff, which is a very different take on this. Brian Grant has been advocating a patch-based approach, which is being prototyped with Kustomize.
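
[Editor's note: as a rough sketch of the patch-based approach, with illustrative file names-- Kustomize keeps a base manifest untouched and layers small patches over it:]

```yaml
# kustomization.yaml -- a minimal overlay
resources:
- deployment.yaml         # the unmodified base manifest
patchesStrategicMerge:
- replicas-patch.yaml     # a small patch layered on top
```

```yaml
# replicas-patch.yaml -- override only the fields that differ
apiVersion: apps/v1
kind: Deployment
metadata:
  name: hello             # must match the base object's name
spec:
  replicas: 5
```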

The big headline here is that this is a hard problem and there really is no one-size-fits-all solution. But it's time for us to move past just hand authoring YAML. We got to do better than that.

CRAIG BOX: Do you think the answer might be more that we need to extend the API so that whatever the format we're putting things in, we can actually get the things out in that format as well?

JOE BEDA: I don't know. It's difficult. Every API field is there for a reason. And I think if you look at some of the more advanced Helm charts-- go look at the nginx Helm chart-- what you'll find is that when you try to create an abstraction that works for a wide set of people, you end up parameterizing it. You end up with so many different options that it becomes as complex as the original thing that you're trying to abstract around.

The number of parameters that you see in a complex Helm chart is almost-- and then you look at the YAML that it's generating and it's nothing but [INAUDIBLE] templates. And I think part of the problem that we have here is that there is no silver bullet. So I'm not quite sure what the solution is. I think we can find a solution that will work for a significant number of folks. And that's what we're trying to do with Ksonnet. But I don't think anybody's going to say this is the one true way to do this stuff.
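
[Editor's note: a simplified sketch of the pattern Joe describes, not taken from the actual nginx chart-- once a Helm template tries to serve everyone, nearly every field ends up indirected through a values file:]

```yaml
# templates/deployment.yaml -- a heavily parameterized template (sketch)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ .Release.Name }}-web
spec:
  replicas: {{ .Values.replicaCount }}
  selector:
    matchLabels:
      app: {{ .Release.Name }}-web
  template:
    metadata:
      labels:
        app: {{ .Release.Name }}-web
    spec:
      containers:
      - name: web
        image: "{{ .Values.image.repository }}:{{ .Values.image.tag }}"
        resources:
{{ toYaml .Values.resources | indent 10 }}
```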

CRAIG BOX: Projects like Ark and Contour fill in places where the platform is perhaps lacking-- so, the backup and restore of Kubernetes objects, and general things that are deficient about the ingress implementation, as we talked about before. How do you balance wanting to make a third-party product versus thinking maybe this is something we should extend upstream?

JOE BEDA: Well, like I said before, the choice really is, is this something that we try and move into upstream Kubernetes? Is this something that we do open source? Or is this something that Heptio does as a commercial product?

And like I was talking about earlier, I'm a minimalist when it comes to pushing stuff into upstream. I want to prove things out outside of Kubernetes as much as possible before we move things into upstream. And so I want us to be as surgical as possible, as thoughtful as possible about how we do that.

I think one of the interesting things here is when I look at some of the work that we're doing around Contour. Dave Cheney, who works for us-- he's based in Australia-- is the main driver for Contour. We're exploring some new ideas about, how do you capture ingress intent? How do we actually look at the ingress object and take those ideas, but also, what are some new needs that we can deal with?

And basically, we're looking at multi-team. We're looking at, how do you actually split traffic? Some of the things that folks will want out of an ingress solution, but that don't actually work well with the current ingress object.

We've decided, instead of adding yet more annotations here, to explore using CRDs as a way to prototype ideas. If we can prove that they're successful, then we're going to have a good case to go work with SIG Network and say, well, maybe these will influence some of the ideas as we look at turning the crank on the next version of ingress. But I don't want to go through and design the next version of ingress without having those proof points outside of the core of Kubernetes.
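
[Editor's note: the prototyping approach works because a CustomResourceDefinition registers a new API type without changing core Kubernetes. A hypothetical sketch-- the group and kind here are illustrative, not Contour's actual types:]

```yaml
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  # The name must be <plural>.<group>.
  name: routepolicies.example.heptio.com
spec:
  group: example.heptio.com
  version: v1alpha1
  scope: Namespaced
  names:
    plural: routepolicies
    singular: routepolicy
    kind: RoutePolicy
```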

And then in terms of commercial value versus open source, the closer you are to developers, the closer you are to the stuff that folks are actually running, the more uplift you're going to get from having a community, from having the open source. And so that's generally the way that we think about that. And so things like Contour and Ark, just naturally those core projects are going to be open.

ADAM GLICK: Heptio is known for being a very socially-aware tech company, and has a particular care around diversity. How did you help build that culture into the company?

JOE BEDA: That's a great question. The first thing it started out with was me realizing-- and I don't want to blame Google too much here, because this is on me-- that there were certain aspects of the way that big companies evolve that brought out the worst in me. I looked around and I'm like, I don't like some of how I act. I don't like the way that I interact with other people.

If you ask around Google, I had sharp elbows. I pissed off a lot of people. And I didn't want to be that person, right? So a lot of it started when I took the time off from Google. It was really about, let me start listening to more people. Let me re-evaluate the way that I interact with people. Let me really practice being self-reflective and question my own assumptions about the world. So that's the first thing: all right, I've got to do work on myself.

And then, the second thing is that as you build a company, you're like, I want to build a company that I want to work for. And so I think that it's very clear that we have problems in our industry. The best thing I can do is build a company where we can do our part. We can do as much as possible to help bring some new ideas and solutions to those problems.

It's not easy. We're not doing everything that I want us to be able to do. We have to balance things like growth versus diversity and inclusion. Those are very, very difficult trade-offs that we have to make. And I'm trying to hold us true to that. But just like there's this idea of technical debt, where there's this compounding interest and the problems get harder over time when you don't address them, I think especially with startups there's an idea of cultural debt.

If you wait until you're a hundred people to start addressing these issues, it's really too late. It's going to cost 10 times as much as if you take care of this stuff early on. So we're trying to be very conscious about our culture. Because you're going to have a culture at a startup one way or the other. You might as well pick what kind of culture you're going to have, and really work to live those values and demonstrate that. And so the focus on diversity and inclusion, and in general being thoughtful about our values, is something that Craig and I put a lot of thought into from the get-go.

ADAM GLICK: I've also heard you refer to Heptio as a geographically-distributed company centered around Seattle. Is the hiring policy also inspired by the social goal?

JOE BEDA: Yes and no. I mean, pragmatically, I look at some of the folks-- you know, the engineers and the field engineers and the leadership that we have-- and I'm just continually impressed by the team. But I'm not going to get, say, Tim St. Clair to move to Seattle. He likes his community. He likes where he lives, right? And I think you'll see that everywhere, especially with the open source angle: people who have a proven track record in Kubernetes or in open source don't all live in the Bay Area. They don't all live in Seattle. The West Coast doesn't have a monopoly on good ideas and talent either. So that's one thing.

I think the social goals-- there's a nuance to this over time. Because I think there are two superpowers when it comes to hiring that startups have. The first one is that inside of big companies, job roles end up being straitjackets. It's like, you are a product manager. You are a developer. And if you find somebody who doesn't fit neatly into those, they end up--

CRAIG BOX: Recording a podcast.

JOE BEDA: They end up recording podcasts, yeah. But it's easy to trigger this immune reaction within the larger organization. And you have people say things like stay in your lane, that's not your job, right? And those generalists can be enormously valuable in a startup context. And so one of the things that a startup can do is provide that flexibility where you can really make use and provide a great environment for people to excel outside of the formal definition of a particular job description.

The second superpower on hiring that small companies have is that they can look deeper into the resume. I went to Harvey Mudd down in Southern California, a very well-respected school. I know folks who went to Harvard, MIT. You pick any of those classes-- there are idiots at all of them, right? So using that as your bar, using that as your filter, may be efficient, but I don't think it's smart.

And then, the flip side of that is true. Let's take the University of Wisconsin-Madison, right? That's a great school. They have a great CS department. Maybe you have somebody who went to school there, but they have a sick parent, and so that kept them close to home. So they got a job locally, and their resume doesn't read like a rock-star West Coast trajectory. Those people will get missed. And there's a huge potential there.

Building a remote company and having a company where you can look deeper really opens up opportunities for folks who can't upend their life. They don't have the support. They can't move and find a place to live in the Bay Area, right? And I think that it's both a pragmatic angle around just being able to find good people, but I think there is also room for that social aspect of actually being able to find folks who otherwise wouldn't have those opportunities.

CRAIG BOX: Well, you were talking about the role of a generalist in a startup. You've been producing a weekly video series exploring the Kubernetes ecosystem for almost a year now. What prompted you to start recording these? Were these sessions you were doing beforehand, or is this something you started just for the video?

JOE BEDA: So just for the listeners, this is called TGI Kubernetes. And so it's me-- and more and more lately, Kris Nova, who's a developer advocate at Heptio, has been taking over this stuff. I do it in this room here that I'm in right now. And I don't know why I started it, to be honest. I think a lot of it is that I wanted to put a human face on how Kubernetes works, how you work with it.

I started watching game-play videos. And on one level, they're enormously boring, but they're also enormously interesting. And I think part of what makes them so interesting is that you see the little tips and tricks. You see the real deal of what people do, what they spend time on. And I wanted to give a little bit of that with respect to Kubernetes, where I went through stuff live without a lot of preparation, made mistakes, screwed things up, answered questions. And so I really wanted to get that rawness, that interactivity, around how people learn and operate with and deal with Kubernetes.

Over time, I think that's really resonated with folks. They're long-- I spend like an hour and a half doing them-- so you can watch them at like 1.5x speed. That's probably an appropriate speed if you're watching the back catalog. So that's one thing.

I think the next thing is really about, for me, personal satisfaction. So I'm CTO of the company. I'm currently managing all the engineering at the company.

But you know, you ask anybody, they'll tell you: I'm a mediocre manager. I'm really not a great manager. I'm trying hard. I want to learn. I have to expand my skills, because I owe it to the employees to be able to do that.

But I think one of the things that happens-- and there are a lot of companies that do this-- is they take good engineers and turn them into crappy managers. And I think, for me at least, one of the reasons why I'm not a great manager is that I don't take personal satisfaction in my team doing well. If my team does amazing things, the team did that. I didn't do that, right? I just sat in meetings all day, doing one-on-ones and encouraging them. I was a cheerleader on the side. That's an effective manager.

CRAIG BOX: It's a useful skill.

JOE BEDA: It's a useful skill. But I think personally, the way that I'm wired is that I don't get that satisfaction of like, oh, my team did stuff, so therefore I can claim credit for it.

CRAIG BOX: Right.

JOE BEDA: And so I think TGIK was an outlet for me to do something directly that I can point at, where I can actually say, hey, I'm having an impact. I'm building these relationships. And so it was Friday afternoon. Early on, I'd drink a beer and, like, hang out and try to connect with a larger audience-- an outlet for me to do something that was hands-on-keyboard direct, without promising to do stuff and not delivering, which is so easy to do when you're a manager who used to be an engineer.

ADAM GLICK: Totally. If people follow you on Twitter, they see a lot about your family, especially your son Theo's adventures in learning to code and working with Kubernetes. Is this interest from him, or is this interest from you?

JOE BEDA: This is all him. And he trolls me because he doesn't listen to me at all. He will use the most esoteric technologies and go deep on it. I'll give you some examples.

So early on in my career, I was at Microsoft. I was on Internet Explorer. And then I helped start what at the time was called Avalon, which turned into Windows Presentation Foundation. I was doing the graphics APIs for Avalon at the time. And there are still people working on it, so I want to be respectful, because I still know folks who are working on that stuff. But Avalon's not lighting the world on fire these days, let's be honest.

He loves that stuff. He wants to build real UI. So he's played with Windows Presentation Foundation. He's done Swing Java apps. He started playing with Jini, the Java sort of network object serialization thing. And I'm like, nobody uses this stuff anymore, but he'll go ahead and do it anyway. So what I've been trying to do as much as possible is encourage him to do more web stuff, get interested in Kubernetes. And I think he's getting there, right? And it's fascinating to watch kids learn.

And you talk to teachers about the whole new math thing and the different approaches to teaching kids math. And there's one approach which is very structured: you start with concepts, you build them on top of each other, and you master the things below before you get to the next thing. And I think that's how we like to think people learn. And a lot of times, teaching works that way.

Watching him naturally discover things, what you see is that he'll just circle around and touch technologies, play with them, and move across to something else. And he'll use like four different tools across three different languages, all over the course of a Saturday. And I'm just running to keep up with him, right?

And I think some of the ways that folks are trying to teach math now are some of those like touch back on topics and sort of reinforce things over time. So yeah, he's definitely in the driver's seat around that stuff.

ADAM GLICK: Awesome.

CRAIG BOX: All right. Well, Joe, thank you so much for taking the time to talk to us today.

JOE BEDA: Well, thank you for having me. Hopefully, it's an interesting conversation for folks.

ADAM GLICK: I certainly think so.

CRAIG BOX: Joe Beda can be found on Twitter as @jbeda, and you can read his writing at blog.heptio.com. You can learn more about Heptio at heptio.com.

JOE BEDA: And we're hiring, just by the way. Please come join us. Thank you.

CRAIG BOX: Perfect. Thank you.

ADAM GLICK: Take care.

Thanks for listening. As always, if you've enjoyed the show, please help us spread the word and tell a friend. If you have any feedback for us, you can find us on Twitter @kubernetespod, or reach us by email, kubernetespodcast@google.com.

CRAIG BOX: You can find all the links from our episode today and more at our website at kubernetespodcast.com. Until next time, take care.

ADAM GLICK: Catch you next week.

[MUSIC PLAYING]