#135 January 26, 2021

Siri, Storage and Solutions, with Josh Bernstein

Hosts: Craig Box, Jasmine Jaksic

Josh Bernstein has worked in a number of infrastructure roles before recently landing at Google. He talks about migrating Siri from AWS (pre-acquisition) to VMware to Mesos, and Dell EMC’s work building what would become the Container Storage Interface. Guest host Jasmine Jaksic talks with Craig about snowcreatures.


CRAIG BOX: Hi, and welcome to the Kubernetes Podcast from Google. I'm Craig Box with my very special guest host, Jasmine Jaksic.


CRAIG BOX: Jasmine, welcome back to the show.

JASMINE JAKSIC: Hi, Craig. Thanks for having me on.

CRAIG BOX: You were last with us-- for people who are not fully caught up with the back catalog-- a long, long time ago on episode number 15 in August 2018.

JASMINE JAKSIC: Yes, that was a very long time ago.

CRAIG BOX: A lifetime ago, perhaps?

JASMINE JAKSIC: It actually is. As a matter of fact, my son Oliver was born during September 2018. So it was before his time.

CRAIG BOX: And you joined us with Dan Ciruli to talk about the Istio project. How's that project gone since last we spoke?

JASMINE JAKSIC: Istio has had many releases since then, with architectural changes as well as new features and improved content. So if you haven't checked it out recently, you absolutely should, at istio.io.

CRAIG BOX: Now, for those who are not sure why Jasmine's on the show this week, of course, my co-host Adam has moved on to pastures greener. And so what we're going to do is a rotation of guest hosts over the next few weeks going back through some of the guests that we've spoken to in the past. Adam always used to give us a game or TV show recommendation, so what do you and Oliver like to spend your time competing against each other at?

JASMINE JAKSIC: He is at a point where he's inventing games, but he's not able to communicate them very fluidly. So we go through quite a bit of time trying to figure out what he wants me to do, and then we play his game.

CRAIG BOX: Does he win?

JASMINE JAKSIC: He always wins.

CRAIG BOX: But does he understand the concept of rules?


CRAIG BOX: Would he feel better sometimes if Mummy won?

JASMINE JAKSIC: I don't think he understands the concept of winning. He's just enjoying doing something, and he wants me to do that with him.

CRAIG BOX: So is that a game, or is that just a thing that you're doing together?


CRAIG BOX: That's a deep question.


CRAIG BOX: That's way too deep for us.

JASMINE JAKSIC: What makes doing something together into an actual game? Is it just winning and losing?

CRAIG BOX: I think the element of competition has to be some part of it.

JASMINE JAKSIC: I see you've spent some time in the snow?

CRAIG BOX: Yes. Very unseasonably here for the UK, we had a big snowfall this Sunday. And of course, we're all in lockdown, but we're allowed to go out for our government-mandated one walk per day. And so everyone went to the park, and everyone went to the top of the hill looking over the River Thames near where I live.

And there were a lot of people sledding on plastic sleds, which I'm sure they only get out once every two or three years, or plastic bags failing that, and a lot of snowmen. If you check me out on Twitter, you'll find a bunch of snowmen and snow thing pictures that I took all around the neighborhood. A lot of imaginative uses of snow, shall we say?

There's one snow thing you see a picture of there with a staff, if you will. It looks a little bit more like a rabbit than a person. But to be honest, I was a little bit disappointed there were no snow Yodas.

JASMINE JAKSIC: Is there such a thing as fluid species?

CRAIG BOX: Perhaps.

JASMINE JAKSIC: There we go. It's a person rabbit. Rabbit people?

CRAIG BOX: It seemed wise and sage. But again, the opportunity to have a snow Baby Yoda seemed like it would have been perfect, and no one took it up in Richmond Park at least.

JASMINE JAKSIC: You should, Craig.

CRAIG BOX: Unfortunately, the moment for that has passed. The snow is mostly all melted now. And the weather looks like it is going to just be rain for the rest of the week. I hear that the Bay Area has suffered from a little bit of inclement weather along those lines.

JASMINE JAKSIC: Oh, yes. We experienced extreme winter. It drizzled a little bit yesterday. I'm guessing about 10 minutes? That's winter for us. And the advantage in having a small kid is I had to take him out, and he jumped in a puddle and forced me to jump in the puddle as well.

CRAIG BOX: Sounds like a game.

JASMINE JAKSIC: It is a game. And that's as close as we're ever going to come to experiencing winter in California.

CRAIG BOX: Well, that's the weather for the week. Let's get to the news.


JASMINE JAKSIC: Kubernetes supports horizontal and vertical pod autoscaling, but why not both? Google Kubernetes Engine has introduced multidimensional pod autoscaling, which allows you to use horizontal scaling based on CPU and vertical scaling based on memory at the same time. Multidimensional pod autoscaling is now available in preview in GKE's Rapid Channel.

CRAIG BOX: Hitachi makes power stations, trains, and vacuum cleaners. But you're nothing to us until you have a managed Kubernetes offering. Hitachi Vantara this week launched Hitachi Kubernetes Service-- a SaaS management plane offering multi-cloud and bare metal management, a service catalog, and quote, "the highest level of security." This comes 10 months after the acquisition of the assets of a closed startup called Containership. The service launches on AWS with Google Cloud and Microsoft Azure coming soon.

JASMINE JAKSIC: A new project called Garnet has been released to support running distributed and parallel Python on Kubernetes. Garnet plugs into your existing Notebook, IDE, or CI pipeline and enables users to scale popular ML libraries, including Pandas, NumPy, and scikit-learn, or custom code. It then runs these using the Dask framework, abstracting the Kubernetes away for developers and data scientists.

The preview includes support for Garnet's fully managed cluster, currently offering a free tier. Upcoming releases will support running Garnet against your own cluster. In case you're wondering, a garnet is a red gem that's totally not a ruby.

CRAIG BOX: Kind, Kubernetes in Docker, has released version 0.10. This release adds a new contributor's guide, which must have worked, as the project also welcomed two new maintainers. You can now run kind in Google Cloud Shell, which could well be Docker in Docker in Docker. Or maybe not. As the kind website points out, it's actually all containerd anyway. Learn all about kind in episode 69 with Ben Elder.

JASMINE JAKSIC: Hosted Knative service, Google Cloud Run, has a host of announcements this week, adding support for end-to-end HTTP/2, WebSockets, and gRPC bidirectional streams. Make sure you don't cross those. These features are all available in preview.

CRAIG BOX: A team of Heptio-turned-VMware staff has been working on a book on production Kubernetes, looking at the key considerations as you move to a real world deployment. It has been released to print and will be available in July. But if you can't wait, the e-book is now available for the price of your email address, courtesy of VMware Tanzu.

JASMINE JAKSIC: If e-books are your thing, Alex Ellis from OpenFaaS has written a guide to running it. This one will cost you $25, but funds future development by someone doing open source independently. You can learn all about how Alex is funding his work in episode 116.

CRAIG BOX: Our interview with Chris Aniszczyk last week inspired him to write a blog post with his 2021 predictions. And his colleague Priyanka Sharma, GM of the CNCF, has done the same. She lays out her vision for 2021 as well as her experience six months into the job in a post to The New Stack. You can hear from Priyanka in episode 107.

JASMINE JAKSIC: Last week, Chris mentioned the LFX program for CNCF mentoring. And this week, the CNCF announced the graduation of 14 interns who worked on 11 projects, including Kubernetes, TiKV, and Chaos Mesh. Congratulations to all the graduates.

CRAIG BOX: How do you tell if an attacker has been in your Kubernetes cluster? Brad Geesaman from Darkbit proposes an interesting honeypot, a secret which looks enticing but shouldn't normally be read. If it shows up in your audit log, you know you've been hacked. The honey token he proposes uses the Tiller service from Helm 2, which you should have upgraded from if you're still running Helm.
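The honeypot idea can be sketched as a small audit-log scan. This is a minimal illustration, not the approach from the blog post: it assumes the API server writes audit events as one JSON object per line, and the namespace and secret name below are hypothetical placeholders. The field names (`verb`, `objectRef`, `user`, `sourceIPs`, `requestReceivedTimestamp`) are those of the Kubernetes audit event format.

```python
import json

# Hypothetical honeytoken location; pick names that look enticing in your own cluster.
HONEYTOKEN_NAMESPACE = "kube-system"
HONEYTOKEN_NAME = "tiller-token-decoy"


def find_honeytoken_reads(audit_lines):
    """Return a summary of every audit event that read the honeytoken secret."""
    hits = []
    for line in audit_lines:
        line = line.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not JSON audit events
        if not isinstance(event, dict):
            continue
        ref = event.get("objectRef") or {}
        if (
            event.get("verb") == "get"
            and ref.get("resource") == "secrets"
            and ref.get("namespace") == HONEYTOKEN_NAMESPACE
            and ref.get("name") == HONEYTOKEN_NAME
        ):
            hits.append(
                {
                    "user": (event.get("user") or {}).get("username"),
                    "source_ips": event.get("sourceIPs", []),
                    "time": event.get("requestReceivedTimestamp"),
                }
            )
    return hits


# Made-up sample events: one read of the decoy, one ordinary read, one junk line.
sample_log = [
    json.dumps(
        {
            "verb": "get",
            "objectRef": {
                "resource": "secrets",
                "namespace": "kube-system",
                "name": "tiller-token-decoy",
            },
            "user": {"username": "system:serviceaccount:default:web"},
            "sourceIPs": ["10.0.0.7"],
            "requestReceivedTimestamp": "2021-01-26T10:00:00Z",
        }
    ),
    json.dumps(
        {
            "verb": "get",
            "objectRef": {
                "resource": "secrets",
                "namespace": "default",
                "name": "db-password",
            },
            "user": {"username": "alice"},
        }
    ),
    "not json",
]

hits = find_honeytoken_reads(sample_log)
for hit in hits:
    print("honeytoken read by", hit["user"], "from", hit["source_ips"])
```

Nothing legitimate should ever read the decoy, so any hit is worth an alert. The audit policy has to record requests for secrets for this to work at all.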

JASMINE JAKSIC: If you would like to read more on security, Seth Art from Bishop Fox Labs has a write-up on various exploits that are possible if a pod has too much access. He looks at privileged pods and permission to access various host attributes. We won't spoil it, but even if everything is still locked down, you are not out of the woods.

CRAIG BOX: Finally, by now you will have heard the US Air Force is flying Kubernetes on fighter jets, and on December 11, took its first supersonic flight. It passed its flight tests with flying colors, demonstrating live updates to software on the ground and in the air without any interruption to the system.

JASMINE JAKSIC: And that's the news.


CRAIG BOX: Josh Bernstein is the director of infrastructure modernization solutions for Google Cloud. He has worked on infrastructure and cloud at companies such as Facebook, Apple, Dell EMC, and ByteDance, and was previously on the governing board of the CNCF. Welcome to the show, Josh.

JOSH BERNSTEIN: Thank you, Craig. Great to be here.

CRAIG BOX: Let's pick your story up about 10 years ago when you started at Apple. First of all, what was the company like at that time? What was the infrastructure they were operating?

JOSH BERNSTEIN: It was a really interesting time. I started on the Siri team about six months or so after their acquisition. What was interesting about the infrastructure at that time is the company had claimed that they were one of the largest virtualized deployments in AWS at the time.

And this was long before Netflix got big, so maybe that was true, maybe that wasn't. It predates me. But it was a very, very large virtual machine environment.

The company was purchased in April. I started in December later that year. The goal was basically go build Siri for Apple.

CRAIG BOX: Presumably, they're going to run at a much larger scale than Siri before Apple was going to run it?

JOSH BERNSTEIN: [LAUGHS] Yeah, I mean, you have to imagine that if you were embedded in every iPhone that ships from this point forward, you don't have to grow your user base from 0 to 1,000 to 5,000. You basically go from 0 to tens of millions as soon as the keynote ends. As an engineer, you're given this task. Start now, go build Siri. And you go ask questions like, how many users are we going to support?

CRAIG BOX: All of them.

JOSH BERNSTEIN: How popular is the app going to be?


JOSH BERNSTEIN: All of the questions. And you get no answer. The answers are like--

CRAIG BOX: Think of a number and 10x it.

JOSH BERNSTEIN: Think of a number, right. And very famously, I had asked at one point-- I said, well, do we have a budget? At least give me a budget. Give me some bounding number. And the response I got back was, be fiscally responsible. What do you do with that?

And so we set to work. From December 2010 to the following October, we built what I think at that time was one of the world's largest VMware environments to support Siri. It was basically replicating to some degree the environment that existed in Amazon but with a completely new code base, right? The code base was also completely rewritten.

Everything was-- if I'm giving myself credit, I would call them wild educated guesses. That was the task at hand. It was crazy.

CRAIG BOX: Was there ever any consideration on just keeping it in the cloud?

JOSH BERNSTEIN: I don't think so. There was an opportunity to rewrite the app from the ground up. And in doing so, it would be deployed sort of in Apple's purview. Apple at that time was very strict.

And I think everybody 10 years ago was very paranoid about the security context of the cloud. And Apple, being one of those companies that values user privacy, said, look, we're going to build this all on-prem. I mean, that was just sort of what the plan was going to be. I don't know if there was any discussion about using the cloud at that point.

We had other very silly discussions about, well, if we need to buy all this infrastructure, we need to buy all these servers, are there other companies we should just buy to do that? And we joked around about a number of things. But ultimately, we bought our own servers, we built our own data centers, and launched a very large VMware environment to start, at least for the first, I think, two years or so.

CRAIG BOX: Were they still making Xserves back then?

JOSH BERNSTEIN: I think they had been discontinued. There were some lying around, but buying Xserves also wasn't really an option.

CRAIG BOX: Doesn't really matter what your data center looks like. It doesn't have to be pretty. It just has to be functional.

JOSH BERNSTEIN: Yeah. Well, I think that the Xserve was an incredible piece of hardware. Running it in a data center was still really challenging, right? I think the Xserve was meant for the prosumer video recording and audio recording market, where it would be rack-mounted next to you.

Maybe you would have a cluster of a hundred of them at max, but probably just two or three. But certainly, tens of thousands was not something in scope. So managing them remotely was really hard. They were never an option for us. Perhaps thankfully, we went down the traditional x86 server route.

CRAIG BOX: This was all going down around Steve Jobs's final year. Do you have a Steve Jobs story?

JOSH BERNSTEIN: [LAUGHS] Yeah, I do have a Steve Jobs story. Siri, when I started there, was in Infinite Loop 2. So that was sort of Scott [Forstall]'s building. I was running back inside from a meeting or from lunch or from something, and I came around the corner. And at the base of the staircase, I was looking at my phone, and I bumped into Steve. I physically bumped into him.

And I looked up. I was like, oh my gosh. It was Steve. And I think he could see the shock.

CRAIG BOX: The fear in your eyes?

JOSH BERNSTEIN: I don't know that it was fear. I think I was just in shock. And I said, oh my gosh. I'm so sorry.

And he said, oh, it's quite all right. Tell me your name. And I told him, I'm Josh. What do you do here? And I told him I was working on a project.

And I gave him the code name, because it was sort of Apple culture not to talk about what you're working on. And he said, ah, yes, of course. What are you doing for them? And tell me about what you're working on. And I'm really, really excited about it.

He was full of enthusiasm. He was very gracious and made my first meeting with him very endearing almost. We would go on to have plenty of meetings together over the course of the year until he got very ill, but that's my meeting Steve Jobs story for the first time.

CRAIG BOX: The iPhone is obviously taking off during this period. And as Siri is scaling out, does it remain fiscally prudent to keep running VMware?

JOSH BERNSTEIN: Well, I don't know so much that it was a fiscal issue for us. I think what we learned or what I learned very quickly was we made the VMware choice because we needed manageability. We knew we were going to need to run virtual machines, and we knew we were going to run a lot of them. What could we do?

Well, we had two options. We could buy VMware. I think we even entertained Xen at that time. But then how do you manage and operate VMs at any real scale? And VMware was the only option. And we didn't have the time to write our own tools to do it. VMware was expensive.

But what happened for us was the operational costs and the manageability costs associated with an environment like that were painful. You would issue an API command or set of API calls to, let's say, spin up 1,000 VMs or re-provision a few thousand VMs. And VMware at that time was backed by a database. vCenter was backed by an Oracle database as we were running it.

It was just slow. That API call would fail. And you'd be like, well, how many VMs did we start? How many did we stop? And so the complexity involved in managing an environment of that scale with that kind of model was untenable.

And so what really drove us to look at something else was not so much the price and the costs, because, again, we were being fiscally responsible. We weren't being cheap. We weren't really worried about spending money, but we wanted to spend money where it added value. And we were spending a ton of time operationally just managing that environment.

It worked great. I have no qualms about it. I think it's a phenomenal platform. It's just we outgrew it very quickly. And that's really what drove us to do something else.

CRAIG BOX: What did the process of looking for something else look like?

JOSH BERNSTEIN: [LAUGHS] Well, you have to remember this was 2012 or so. We had hired a few additional engineers. And one of our SRE engineers found this thing called Mesos.

Kubernetes didn't exist. And Mesos, at the time-- this was before the company was started, right? It was essentially some code from Ben's PhD thesis on GitHub that looked like a good option.


JOSH BERNSTEIN: I remember sitting in a meeting with our leadership team. They looked at a colleague of mine and said, so the plan is basically to rebuild the Siri environment on some code you found on GitHub? He said it kind of jokingly, but yeah, that's basically what we did is we tried to-- and we actually attempted over the next year or so to re-provision and rebuild the infrastructure on top of bare metal and Mesos.

CRAIG BOX: Did Mesos improve over that time? And was Apple contributing to it?

JOSH BERNSTEIN: Mesos improved quite a bit. I think that Mesos doesn't get the credit that it deserves for how robust that tool is and how reliable that code was. It was really, really good. It was written in C++.

And so it was difficult to contribute to. I think it was difficult for it to gain traction. The code base was fantastic. We were lucky that we had a great experience with it.

I think that if you look at the way that Kubernetes has developed over the years, Kubernetes' abstraction model and resource model makes it very, very powerful. The fact that it's written in Go makes it more consumable and easier to contribute back to. And the whole ecosystem has exploded in this massive complexity of various tools that do various different things.

And I think, to some degree, we were lucky that Mesos just did container start and stop. We had to build other tools. We didn't have an Istio and an Envoy. And so we had to build our own service discovery layer. We had to hack together NFS scripts for storage, because the CSI wasn't available yet.

CRAIG BOX: Would you call it more of a re-architecture, rewriting, or migration?

JOSH BERNSTEIN: I don't think it was rewriting. I mean, there were certain tools that we had to build and there were certain things that we had to rewrite. But it wasn't a rewrite of the core code. It was a change in the operational model.


JOSH BERNSTEIN: How you did deployments, how you did monitoring, how you reacted to events. Rewrite is way too strong of a word.


JOSH BERNSTEIN: Certainly, there were changes to the code that we made to make it more runnable in containers, if I can say that. To say it was a giant rewrite I think is a real stretch.

CRAIG BOX: Mesos was obviously great at running batch workloads at the time. Were you able to run stateful workloads on Mesos as well?

JOSH BERNSTEIN: Yeah. I would say it depends on your definition of stateful. We ran things like MySQL. We ran certain stateful workloads. One thing we did not run on Mesos, though, was the Hadoop stack and HDFS and things like that.

CRAIG BOX: But that's what it was really good at.

JOSH BERNSTEIN: Later on, it became very good at things like Spark and MapReduce and so on and so forth. But in the early days, it started and stopped containers and did control plane stuff really well.


JOSH BERNSTEIN: And all that capability and all that functionality was added much later. I think one of the beauties of Mesos and one of its Achilles heels was sort of this two-tier scheduling, right? Because it had the Mesos controller and then it had this scheduling framework on top of it.

You know, different than how Kubernetes has been architected, but that gave it a lot of power to run stateful workloads later on when those types of tools were written. But in the beginning, it was just Mesos scheduler and Mesos master. And that was it.
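The two-tier model can be sketched in a few lines. This is a toy illustration, not Mesos code: the master (first tier) only tracks free resources and offers them out, while each framework (second tier) decides which of its own tasks, if any, to place on an offer. All class names, agents, and tasks below are hypothetical, and the real system allocates offers between frameworks with a fairness policy rather than a fixed order.

```python
from dataclasses import dataclass


@dataclass
class Offer:
    agent: str
    cpus: float
    mem_mb: int


@dataclass
class Task:
    name: str
    cpus: float
    mem_mb: int


class Framework:
    """Second tier: decides which of its own pending tasks fit an offer."""

    def __init__(self, name, pending):
        self.name = name
        self.pending = list(pending)

    def consider(self, offer):
        """Return the tasks to launch on this offer; an empty list declines it."""
        accepted = []
        cpus, mem = offer.cpus, offer.mem_mb
        for task in list(self.pending):
            if task.cpus <= cpus and task.mem_mb <= mem:
                accepted.append(task)
                self.pending.remove(task)
                cpus -= task.cpus
                mem -= task.mem_mb
        return accepted


class Master:
    """First tier: tracks free resources per agent and offers them to frameworks."""

    def __init__(self, agents):
        self.free = dict(agents)  # agent name -> (cpus, mem_mb)

    def offer_round(self, frameworks):
        """Offer each agent's free resources to each framework in turn."""
        launched = []
        for agent in list(self.free):
            for fw in frameworks:
                cpus, mem = self.free[agent]
                for task in fw.consider(Offer(agent, cpus, mem)):
                    cpus -= task.cpus
                    mem -= task.mem_mb
                    launched.append((fw.name, task.name, agent))
                self.free[agent] = (cpus, mem)
        return launched


master = Master({"agent-1": (4.0, 8192), "agent-2": (2.0, 4096)})
web = Framework("web", [Task("nginx", 1.0, 512), Task("api", 2.0, 2048)])
db = Framework("db", [Task("mysql", 2.0, 4096)])

launched = master.offer_round([web, db])
for framework, task, agent in launched:
    print(f"{framework} launched {task} on {agent}")
```

The master never needs to know what MySQL or Spark is; anything workload-specific, including how to place stateful tasks, lives in the framework. That split is where both the power and the extra complexity come from.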

CRAIG BOX: Tell me about the experience of moving from Apple to EMC.

JOSH BERNSTEIN: What I learned and what I appreciated leaving Apple was I remember walking around the trade show floor my first week at EMC. And this was before they were acquired. And my colleagues there were talking to me and pointing out all the areas of the show floor, like, this is called hyper-converged infrastructure, where you run compute and storage on the same node.


JOSH BERNSTEIN: And this area of the floor is our DevOps hall. I had to ask them for explanation, like what is DevOps, and what does hyper-converged mean? And as they explained it to me, it was like, oh, these are the things that we organically fell into in our journey at Apple.


JOSH BERNSTEIN: We didn't know we should do DevOps. It was just sort of what we did. And I think the realization that I had was, wow, the whole world doesn't operate the way that we do. They operate completely differently. They don't have that experience that we had in a very, very short period of time.

CRAIG BOX: What was the new job at EMC? What were you brought on to do?

JOSH BERNSTEIN: I came to EMC to be the VP of Technology. And I had a variety of responsibilities there. But one of the things that I had the opportunity to do was lead this open source team that they had put together.

One of the senior executives, who later became the CMO, brilliant gentleman, had said, well, we should do something with DevOps. And we should do something in open source. But we don't really know what we're doing.

They had put together this team of developers and marketing and were trying to figure it out. And when I came over, my responsibility was to lead that team. That team was called the Code team. Your listeners might remember it as EMC {code}.

CRAIG BOX: That's code with squiggly braces around the outside.

JOSH BERNSTEIN: That's right. Maybe you've seen us at KubeCon, and you remember REX-Ray, and you remember the logos. We had a phenomenal team of people.

And the experience that I had in coming over was, well, we can't do a bunch of little projects and little tools. We'll never gain any momentum or any acceptance. So what's the big project we can hang our hat on? And I had this pet peeve coming out of Apple that we had built this amazing new environment with containers and built our own service discovery and done all that, but we were still running persistent applications with hacky NFS mount scripts or pinning containers to specific nodes of local disk.

There was a project one of the engineers had written called Dogged and REX. It was this collection of code that did storage orchestration for containers. And later, that project became known as REX-Ray. And those of you that were around probably remember the logo with the dog and the parrot. We had a really great marketing team that came up with all of this.

But REX-Ray was really designed to be a storage abstraction for container runtimes. And we supported for a long time running with Mesos, running with Docker, and then later running with Kubernetes. And in the latter years, that project became effectively the Container Storage Interface, the CSI. And we co-developed that with the community.

And it was solving a pet peeve I have. How do you run persistent apps in containers? And folding that into the Kubernetes ecosystem was really important to me personally. I'm really proud of the work that we did there.

And now everybody says, oh, well, you know, you run it through a CSI abstraction. Or you read a blog about how to run a persistent app through Kubernetes, and it all goes through the CSI. And it's pretty neat to see that.

It was a huge collaborative effort. I think it was a great example of how the community as a whole banded together to solve a problem for the community. It was open source the way it should have been. And I'm really proud of being involved in all of that.
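The shape of a storage abstraction for container runtimes can be sketched as a small driver interface. This is a hypothetical illustration only: it is not the REX-Ray API and not the CSI specification, which defines its own gRPC services. The point it shows is that the orchestrator calls the same create, attach, and mount steps whatever the storage backend is, so a stateful container can follow its volume to another node instead of being pinned to one.

```python
from abc import ABC, abstractmethod


class VolumeDriver(ABC):
    """Hypothetical driver interface; one implementation per storage backend."""

    @abstractmethod
    def create(self, name, size_gb):
        """Create the volume if it does not exist yet."""

    @abstractmethod
    def attach(self, name, node):
        """Make the volume available to a node."""

    @abstractmethod
    def mount(self, name, node, path):
        """Mount the attached volume at a path on the node."""


class InMemoryDriver(VolumeDriver):
    """Stand-in backend that records calls instead of touching real storage."""

    def __init__(self):
        self.volumes = {}   # volume name -> size in GB
        self.attached = {}  # volume name -> node it is attached to
        self.mounts = {}    # (volume name, node) -> mount path

    def create(self, name, size_gb):
        self.volumes.setdefault(name, size_gb)  # idempotent: keep existing volume

    def attach(self, name, node):
        if name not in self.volumes:
            raise KeyError(f"unknown volume: {name}")
        self.attached[name] = node  # re-attaching moves the volume to the new node

    def mount(self, name, node, path):
        if self.attached.get(name) != node:
            raise RuntimeError(f"{name} is not attached to {node}")
        self.mounts[(name, node)] = path
        return path


def ensure_volume(driver, name, size_gb, node, path):
    """What an orchestrator would do before starting a stateful container."""
    driver.create(name, size_gb)
    driver.attach(name, node)
    return driver.mount(name, node, path)


driver = InMemoryDriver()
first_path = ensure_volume(driver, "mysql-data", 20, "node-a", "/var/lib/mysql")
# The container is rescheduled; the same volume follows it to another node.
second_path = ensure_volume(driver, "mysql-data", 20, "node-b", "/var/lib/mysql")
print(driver.attached)
```

Swapping `InMemoryDriver` for a driver that talks to a real array or cloud disk would not change `ensure_volume`, which is the value of having one shared interface.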

CRAIG BOX: If I have something today that I would like to make into a standard in the way that you took REX-Ray and then built a community around it to what eventually became CSI, how would you suggest you start on a project like that?

JOSH BERNSTEIN: The first thing is you have to be honest with yourself about why you think people are going to care about your project. And the way I guess I think about it is, what problem is this solving in the community? Is this solving an actual problem? Does the community know that it's a problem yet? Do people actually have this pain?

You have to think about it much like you're thinking of a product at a startup. What problem am I solving? And will people care? I see a lot of projects come through that are yet another, right? They're yet another database or yet another--

CRAIG BOX: Markup language.

JOSH BERNSTEIN: Markup language, or yet another version of Helm. I think these sorts of things struggle to gain traction, because they're not solving a real problem. They're just expressing a problem in a different way.


JOSH BERNSTEIN: It's a different opinion. And that's much harder.

CRAIG BOX: If you have that wide open space, though, if you have a new area, and you think, hey, I have perhaps the chance to help guide this, and maybe for commercial reasons a company might think to do this, what direction should they take to make sure that it becomes a community thing and not just yet another solution in a bigger space of solutions?

JOSH BERNSTEIN: It becomes basically a grassroots marketing campaign. You know, at {code}, we were fortunate enough to have people that were connected in the community. We had people that could run marketing. We had great engineers. And we had people that could connect with others in the community.

I mean, we were effectively writing a storage plug-in, if you will. You have to sort of-- not gain notoriety, but you have to gain buy-in. Because it was storage, it was easy to go to the other storage companies that were involved in the community and say, hey, listen, we want to work with you on this. Is this a problem for you?

We worked very, very closely with Mesos and Mesosphere at that time. Mesosphere, as I recall, wrote the original specification. And then the {code} team was doing the original implementation of the specification.

And so once you get buy-in from people, you sort of get this critical mass that then you can take to a SIG. Now we have SIGs or now we have TOC members. And you can get them to believe in your cause. And once you have that kind of momentum, you can go through the SIG process and the TOC process and you can get your project sort of accepted or acknowledged.

It sounds easier than it is. Solving a real problem for the community, it's really hard to do versus just having another opinion about something, right? Yet another markup language doesn't help anybody.

CRAIG BOX: You heard it here first!

JOSH BERNSTEIN: Yeah, [LAUGHS] that's right. You know, that's the other thing. Some companies believe that they can open source something for commercial purposes or financial purposes. Giving away something you invest in is not a good way to make money generally. And so you have to be mindful that if you're open sourcing something, you're giving it away for good reason, right?

There was a lot of discussion and a lot of talk at EMC as to whether or not we should productize REX-Ray and keep the code closed and make it a product versus giving it away for free. Because it was solving a real problem and because it was an enabler for us to sell storage to customers, it made sense to open source it. And too many companies also make the mistake of just we're going to be an open source thing and everybody's going to come and join us. And that's just not how it works.

CRAIG BOX: While you were at EMC, you were the governing board representative on the Cloud Native Computing Foundation. I'd like to get into a little bit about how people donate projects to the foundation and some of those things you just talked about and how they built a business around it. But first of all, what was your experience like on the CNCF governing board? For people who aren't familiar, what does that do?

JOSH BERNSTEIN: CNCF is a nonprofit organization that is a subsidiary of the Linux Foundation. And it's been around for a very long time. And the purpose of the governing board and the purpose of CNCF is to provide some structure and really to act in the best interest of the community around popular projects. So of course, the LF supports the Linux kernel. There's so many organizations that the LF is involved with.

But the CNCF is basically the cloud native or the Kubernetes portion of LF. It was a collection of stakeholders-- really, people from companies that had contributed money at a certain level-- to talk about what is needed in the industry to help foster the community. It's an amazing experience.

One of the biggest things that I can remember that I was involved with there is putting together training and putting together certification around Kubernetes. You can become a certified Kubernetes administrator through the LF now. It's an amazing program.

These are all things that benefit the community as a whole. It's also a funnel through which companies can support a community, because there's a legal entity there. For example, if you have a bunch of cloud computing, a bunch of resources in the cloud, and you want to donate it to a Kubernetes project so they can do regression testing or continuous integration and continuous delivery, it's much easier from a structure standpoint to make that donation through a nonprofit than it is to give that to an individual or give that to a project that has no entity.

So that structure provides tremendous value to the community. And that's really the role that CNCF serves. In my mind, anyway.

CRAIG BOX: The TOC are the body who are responsible for deciding which projects that are submitted are eventually accepted. Do the governing board provide any opinion to how that process should be, or whether or not the CNCF should aim to have multiple projects in the same space, or try and pick the one that is most successful, for example?

JOSH BERNSTEIN: When I was involved there, I would say it was still sort of earlier on. The CNCF had been around for some years, but it was still being sort of crystallized. And there was this notion that the CNCF, it does not exist to be a kingmaker. It's not there to decide which project wins and which project loses. That is explicitly wrong.

A lot of people think that, and it's just not true. The people that are a part of the organization take that mission, I think, to heart. The TOC is there to be a funnel to bring projects to vote and bring projects to the board and get them presented to the board so that the board can make decisions as to what resources we can route to the project, what needs the project actually has. And we can plan how to support these incoming projects.

But to think that the TOC member is the kingmaker is wrong. It really is a community discussion. And it really is a discussion.

And a lot of the discussion happens around, is this a community need? Is this a community problem? Or is this just a different opinion for something that already exists?

And if so, what's the value of supporting both of them or all of them? And sometimes there's reasons one way or the other. It's an incredible group of people. It was when I was involved anyway.

CRAIG BOX: So you've been on both sides of this in terms of donating something, which eventually, obviously, became the CSI, and being on the governing board of the CNCF. When do you think is the right time or the wrong time to donate a project to the CNCF?

JOSH BERNSTEIN: Well, there's a lot of wrong times. The wrong time is you've worked on an open source project for a while, it's fairly mature, it's fairly not mature, and you don't know what to do with it. And so a lot of people try to give it to the CNCF so that it lives on or so that somebody else can do it. That's the wrong time, because this principle of what need are you filling in the community has already told you that it's not filling a need, because you've not got any traction. On the business sense, you've taken the product to market, and the market has responded with no interest.

CRAIG BOX: A lot of commercial products might open source themselves when they feel that they haven't got that interest. That might be valid, but not trying to give it to a foundation.

JOSH BERNSTEIN: Yeah. And so they say, OK, well, we invested a bunch in this. We're going to give it to the CNCF. Well, the CNCF isn't going to do anything with it unless you're going to do anything with it. We're not a digital soup kitchen, where we're just handing out developers to work on projects. The CNCF aims to support and accelerate projects that already have traction and already have this mass of people or this massive community.

So that's the wrong time to do it. You've built a project, and it's not gone anywhere, and you want to be done. You don't want to just throw the code away, right?

CRAIG BOX: So what about the right time?

JOSH BERNSTEIN: Let's say, magically, you have a project like that. Then CNCF accepts it as a blessed project. And then nobody develops on it or nobody works on it?

CRAIG BOX: It kind of becomes rkt at that point.

JOSH BERNSTEIN: [LAUGHS] You said it, not me, Craig. Yeah, there's a lot of projects that went that route. What's the right time? The right time is when you recognize that you've built something, it's turned out pretty good, people are pretty excited about it, you've got some real good interest, and you've reached a point where you are no longer able to scale it yourself, where you need other resources to scale it.

You need test infrastructure. You need community guidelines. You need rules. When the project that you're working on has gotten bigger than yourself-- and generally, it's gotten to that point because you've solved a real problem in the community and people are excited about it-- that's the right time to go to the CNCF and say, hey, listen. I've got 1,000 downloads.

I've got 20 people that I know running this in production. I've got five other contributors from five other organizations outside my own. This has gotten to be at a point where I just need help.

I need help with the governance of it. I don't necessarily need help building a community. Too many times people are like, I'm going to give my project to the CNCF, and I'm going to get a community for free. And it doesn't work that way.

CRAIG BOX: I think a lot of people think they're going to get marketing for free.

JOSH BERNSTEIN: Yeah, they think they get marketing for free. And what good is marketing? Unless you're solving a real problem and unless you've built your own mass in the community, basically you're marketing snake oil. And too many people are under that impression. It's a shame.

I think that if people focus on this community need, they can make much more of an impact. I mean, we see this in the service discovery space. I think this happens because a lot of startups come out with, we're going to do service discovery better than the next company. And then they open source everything because they have to.

One of them inevitably fails or multiple of them fail. And they should have asked themselves in the beginning, well, instead of starting my own service discovery company, why don't I just contribute to one that already exists? And the answer is because you can't get VC capital for writing code for free. I think that narrative explains why so many projects come to CNCF or come to the community or even the Apache Foundation-- it doesn't even have to be the LF or CNCF-- any foundation and just go nowhere, because they're coming out of almost desperation or this misguided hope that marketing is going to help them.

These organizations provide governance. They provide structure. They provide a lot of value. But if you don't have a community that needs that governance or needs that structure, then it's a waste of time.

CRAIG BOX: The first round of Kubernetes startup acquisitions is long past-- companies like Heptio and CoreOS. There are a lot of security companies that have been acquired in the last year. I see a lot of companies that had started building managed Kubernetes infrastructure realizing that that's now table stakes from the hyperscalers, and pivoting to run something else. If I wanted to make money on Kubernetes today and start a business, what should I be looking at?

JOSH BERNSTEIN: [LAUGHS] That's a great question. If you figure that out, Craig, then we should go do a startup together.

CRAIG BOX: Well, there's no money in podcasting. I'll tell you that.

JOSH BERNSTEIN: [LAUGHS] There's no money in podcasting. The problem with containers in general-- and this is not a reflection of the Kubernetes community-- is that it's very, very hard to consume. So if you look at the companies that are making money on Kubernetes-- and I'm talking really about the Kubernetes ecosystem-- it's Red Hat with OpenShift. You could argue it's the cloud providers with their various Kubernetes offerings.

If somebody could make Kubernetes consumable in a way that's consumable for the traditional enterprise, I think that's a good path forward. I'm really excited about this new wave of startups that are providing resource abstractions on top of Kubernetes. Shipa is one that I think is really interesting.

Instead of writing a Kubernetes spec, you write to this abstraction. And then that abstraction can be deployed on OpenShift or Amazon or Google or Azure in this abstracted sense. Maybe that's the magic bullet that is enough to get the traditional enterprise to consume it. That's where I would look.

I think, more specifically, the two areas that I'm very excited about are in the CI/CD space. Companies like Harness, I think, are doing a phenomenal job. Armory is doing a great job with Spinnaker. I think that's very exciting.

This previous generation of companies is around observability. We have this tendency now that we're writing patterns in microservices. And microservices are great for all the reasons they're great, but they add this tremendous layer of complexity. And now you have to manage it and debug it and really observe it. And that's why we've seen this observability space take off in a way that we haven't seen for a while.

So those are the two spaces that I'm bullish on: observability and CI/CD, because they make containers consumable. And that's really what I would look for, because for every container security company that got acquired, there were five that didn't. So for me, it's really about providing capabilities to operationalize containers, whether it's through an abstraction like Shipa, through a CI/CD tool, or on the observability side, which I think is a much more mature market at this point.

CRAIG BOX: Someone who's working on those ideas will almost certainly have to open source some or all of it from the beginning. Do you think that joining a foundation or the CNCF, as we mentioned, is becoming table stakes for new vendors as well?

JOSH BERNSTEIN: No, I disagree with that premise actually. What's happened, if you look at the observability space, things like OpenTelemetry have become table stakes. And so what you've gotten in that space is the collection of data has been commoditized.

This forces companies to differentiate and add value at a higher level on actionable alerts, on correlation between data sources, on dashboard generation. This actually is where there is value for customers. And it allows companies to build proprietary value at levels that customers care about and rely on the open source community to commoditize the things that are lower value, like collection, like OpenTelemetry, like these sorts of standards.


JOSH BERNSTEIN: This is great for the community. So if you're a new startup in that space, yeah, I think it's table stakes that you adopt OpenTelemetry, you adopt one of these open standards. But it's not table stakes that you open source your entire product. It just allows you to focus on differentiating yourself at a real level of value.

That benefits everybody. That benefits the companies. It benefits the investors. And I think it benefits the customers and the community most.

CRAIG BOX: Finally, you've been at Google for six months now. But as everyone will know, we've all been working from home that whole time. [WHISPERING] What do you think the offices are like?

JOSH BERNSTEIN: [LAUGHS] I lived in the Valley long enough. I've been to all the offices everywhere. I think what's amazing about what Google has done with their offices and what a lot of other companies have done is they've just made them a much more humanistic place to work. You go to the Apple campus, you go to the Facebook campus, it's fun. It's comfortable. It makes you happy to be there.

CRAIG BOX: They do your washing for you. You never need to go home. Do more programming.

JOSH BERNSTEIN: [LAUGHS] Those are perks, but I'm most interested in being in an environment where socialization among human beings is built into the fundamental architecture of the office space. And I think Google pioneered that. And so, frankly, I'm really looking forward to experiencing that.

Look, Google did it first, right? Before there was the Menlo Park Facebook campus, there was Google. Maybe it's a little retro of me, but that's what I'm looking for.

How is this envisioned at the Google level? And how do you connect with people at that level? I'm really looking forward to that.

CRAIG BOX: Well, that day can't come soon enough for most of us. And it just reminds me to say thank you very much for joining us today, Josh.

JOSH BERNSTEIN: Thanks, Craig, for having me. I appreciate it.

CRAIG BOX: You can find Josh Bernstein on Twitter, @quityourjoshing.


CRAIG BOX: Thanks again to Jasmine for helping out with the show today.

JASMINE JAKSIC: No problem, Craig. Thank you for having me on. This was a lot of fun and a lot less stressful than the last time I showed up.

CRAIG BOX: If you enjoyed the show, please help us spread the word and tell a friend. If you have any feedback for us, you can find us on Twitter, @kubernetespod, or reach us by email at kubernetespodcast@google.com.

JASMINE JAKSIC: You can also check out the website at kubernetespodcast.com, where you will find transcripts and show notes, as well as links to subscribe.

CRAIG BOX: I'll be back with another guest host next week. So until then, thanks for listening.