#11 July 10, 2018

Helm, with Vic Iglesias

Hosts: Craig Box, Adam Glick

Helm and its Charts help you manage Kubernetes applications. Vic Iglesias, a Solutions Architect at Google Cloud, is a maintainer of the Helm charts repository. He talks to Craig and Adam about how people are using Helm, and where the project is going.

Do you have something cool to share? Some questions? Let us know:

News of the week

CRAIG BOX: Hi, and welcome to the Kubernetes Podcast from Google. I'm Craig Box.

ADAM GLICK: And I'm Adam Glick.

[MUSIC PLAYING]

CRAIG BOX: Welcome back, my friend. How was your holiday?

ADAM GLICK: Oh, the holiday was great here in the States. My wife and I got a chance to go away and feed a large number of mosquitoes out by the water.

CRAIG BOX: Brilliant.

ADAM GLICK: Yes, excellent. They had quite a feast, and we got to visit a wonderful space out-camping, which got me off the grid, which was really nice, as coming back--

CRAIG BOX: A wonderful American space, I hope.

ADAM GLICK: Oh, a fantastic space. I like to think of-- it's the Americas. We actually--

CRAIG BOX: Oh, yes.

ADAM GLICK: We enjoyed the holiday, and then we headed up to Vancouver Island in Canada, actually. It was lovely.

CRAIG BOX: I lived in Canada for a couple of years. It's a lovely place.

ADAM GLICK: Awesome. How was your week, Craig?

CRAIG BOX: Well, it's been an interesting week. It's a full-time job keeping up with politics in the UK at the moment. There's a little bit of change in the environment, but I try and put all that aside.

It's peak festival season here in the hot summer sun, so I have gone and seen a couple of great bands in the last couple of weeks, and have a few more to go. If any of you follow me directly on Twitter, you'll see that I post occasional Kubernetes commentary, but more often than not, it's live concert videos. So I was about four people back from the stage for Roger Waters last week. It's a good place to be.

ADAM GLICK: Excellent. Let's get to the news.

[MUSIC PLAYING]

CRAIG BOX: Google announced Jib, a code-to-container tool for Java that is as easy as building a JAR file. Jib is a plugin for either Maven or Gradle which publishes Docker images to a repository, without requiring you to write a Dockerfile or have any container tools installed. Jib reads your build config, organizes your application into distinct layers, such as dependencies and classes, and only rebuilds and pushes layers that have changed. It assembles containers declaratively from your build metadata and can be configured to produce reproducible images from the same inputs.
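For readers who want to try it, a minimal sketch of the Maven flow from Jib's quickstart; the project and image name are illustrative:

```shell
# Build and push a container image straight from a Maven project --
# no Dockerfile, no local Docker daemon required.
# (gcr.io/my-project/my-app is an illustrative image name.)
mvn compile jib:build -Dimage=gcr.io/my-project/my-app
```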

ADAM GLICK: Congratulations to Codefresh, who landed their series B funding round last week with $8 million in new financing.

CRAIG BOX: Woo!

ADAM GLICK: Codefresh is a Kubernetes-centric CI/CD pipeline tool which promises users faster development times. Codefresh says the new funding will help accelerate their roadmap, adding greater support for advanced deployment strategies using the latest technologies like Helm and Istio.

CRAIG BOX: Do love Helm.

ADAM GLICK: We'll hear more about that in the interview section, I believe.

CRAIG BOX: Yes! Well, as Kubernetes becomes the preferred deployment platform for more and more users, enterprise vendors are starting to target Kubernetes directly rather than relying on end users or the open source community. This week, MongoDB announced the MongoDB Enterprise Operator for Kubernetes and OpenShift. The operator enables a user to deploy and manage MongoDB clusters from within Kubernetes, registering them with an external MongoDB Ops Manager and configuring them with Kubernetes resources like a ConfigMap or Secret. You can deploy the operator with a Helm chart and then create objects with types like the MongoDB sharded cluster, which will create the required Kubernetes objects on your behalf.
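As a rough sketch of the operator pattern being described, a custom resource might look something like the following; the kind and field names here are hypothetical illustrations, not the operator's actual schema:

```yaml
# Hypothetical custom resource illustrating the operator pattern --
# the kind and spec fields are made up for illustration and are not
# MongoDB's real API.
apiVersion: mongodb.com/v1
kind: MongoDbShardedCluster
metadata:
  name: my-sharded-cluster
spec:
  shardCount: 2
  mongodsPerShard: 3
```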

ADAM GLICK: Another project with a new version of their Kubernetes operator this week is OpenFaaS, an open framework for providing functions as a service. OpenFaaS has previously integrated with Kubernetes through a back-end integration aptly named faas-netes, and has been working on their CRD-based integration since last October. Founder Alex Ellis wrote a blog post explaining the operator, and you can find a link to that in our show notes.

CRAIG BOX: Pivotal announced version 1.1 of their Pivotal Container Service, or PKS. 1.1 upgrades Kubernetes to version 1.10, introduces multiple availability zones, and as a beta feature also adds support for high-availability masters spread across those zones. If you'd just like to hyper-converge your budget with someone and have them deploy hardware directly into a data center preloaded with PKS, it is now supported by the Pivotal Ready Architecture from Dell EMC, which includes networking, storage, and compute in a box that may or may not be signed by Michael Dell.

ADAM GLICK: The OpenSDS community, a group working on software-defined storage technologies, announced their first release, codenamed Aruba. OpenSDS Aruba is a software-defined storage controller that provides unified block, file, and object storage services. It offers support for both Kubernetes and OpenStack, allowing a single storage controller to manage storage for both containers and VMs. Other features in this release include a graphical user interface and support for remote data replication. Like the CNCF, the OpenSDS project is a collaborative project hosted by the Linux Foundation, and is supported by a who's who of storage vendors and telcos, including Huawei, IBM, Hitachi, Dell EMC, Fujitsu, Western Digital, Vodafone, Yahoo Japan, and NTT Communications.

CRAIG BOX: And with that, that's the news.

[MUSIC PLAYING]

ADAM GLICK: We're pleased to welcome Vic Iglesias, solutions architect with Google Cloud and contributor to the Helm project. Welcome, Vic.

VIC IGLESIAS: Howdy. How's it going today?

ADAM GLICK: Great. How are you doing?

CRAIG BOX: Great to see you, Vic.

VIC IGLESIAS: Yes.

CRAIG BOX: What we see as Helm today is the merger of two previous projects. Can you talk us a little bit through those projects and the history of Helm?

VIC IGLESIAS: Yes. Originally, there was what is now known as Helm Classic, and that was a very Git-centric way to manage your manifests: store your manifests in Git repositories and use templating in order to install them into your Kubernetes clusters. That was not so great for collaboration between folks, so there was a rewrite underway by the folks at Deis at the time. And in parallel with that effort, there was also an effort at Google to make a version of their Deployment Manager product that was more Kubernetes-centric. So those two efforts found out about each other and actually turned into a single effort, which we now call Helm 2. And that is the most common Helm that you'll see out in the wild.

CRAIG BOX: And so what is the main problem that Helm solves?

VIC IGLESIAS: Well, a Kubernetes application, or a user of Kubernetes, they don't interact with just one type of Kubernetes resource. Usually, you have many things that make up a single application or package. So you have maybe some services, some volume claims, some deployments, and maybe some config maps. Helm allows you to treat all of these as one unit, and to use templating so you don't have to repeat variables all over the place. So you can actually have, for example, the same name used in multiple places and only define it once. So it made managing your applications, which ended up becoming more complex over time, easier.
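A minimal sketch of what Vic describes: the release name is defined once and reused throughout a template, so every resource in the chart stays consistent. The resource and value names here are illustrative:

```yaml
# templates/service.yaml -- the same name, defined once by the release,
# reused in several places via templating (illustrative example)
apiVersion: v1
kind: Service
metadata:
  name: {{ .Release.Name }}-web
  labels:
    app: {{ .Release.Name }}-web
spec:
  selector:
    app: {{ .Release.Name }}-web
  ports:
    - port: {{ .Values.service.port }}
```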

CRAIG BOX: And you're listed as an owner or a maintainer of the Helm project. What exactly does that mean, and how did you get there?

VIC IGLESIAS: Well, I got there by joining very early on into the Helm dev call meetings and being noisy in there, and I was very appreciative that they took my feedback. The place where I engaged the most with the Helm project early on was in what we now call the Kubernetes charts repository. And basically, what we turned that into was the set of examples, or the starter applications, or kind of the de facto place to find an application that's installable via Helm.

So when we started out, we had maybe five or six different types of applications, some databases, some Jenkins-type CI systems. Now we have over 200 different types of applications that you can install easily into your Kubernetes clusters. And I was kind of the curator of those early on. It was easy to do it with a few people when we started out, and now we have multiple maintainers because we have quite a bit of contribution coming from the community.

ADAM GLICK: In that case, are you maintaining the charts themselves, or the system that holds the charts and the tools that deploy the charts?

VIC IGLESIAS: Yeah. Actually, it's both of those things. So first, we had to set up an ecosystem around that repository that allowed for other people to contribute and for us to know that their changes were good, that they passed our testing, and that they had any testing at all to begin with. So that was kind of the first thing that we did.

And then from there, we also wanted to have exemplary charts. So a lot of the chart maintainers will own a chart or two and try to keep it up with best practices. Over time, you'll get many people contributing, and sometimes those charts will change ownership. But we wanted to make sure that we had some examples of what we consider to be best practices within the repo before others started to make their contributions.

ADAM GLICK: What did people do before Helm existed?

VIC IGLESIAS: It was tough to manage your manifests. You had basically just a bunch of manifests in a directory, and maybe you would copy those manifests and alter them in order to have various environments. Really, it was just kind of a free-for-all of managing a bunch of YAML. Helm allows you some optimization on that, in that you can reuse pieces of your YAML through templating.

And when you're deploying, you can deploy them as a unit and roll them back as a unit. So if you install an entire application, which is various Kubernetes manifests, you can upgrade them to a new version of those manifests, and if that doesn't work out, roll back the entire set, which is an important piece of functionality in Helm.
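A quick sketch of that unit-level lifecycle with the Helm 2 CLI of the day; the release name and chart version are illustrative:

```shell
# Install every manifest in the chart as one named release
helm install stable/redis --name my-redis

# Upgrade the whole release to a newer chart version (version illustrative)
helm upgrade my-redis stable/redis --version 3.7.1

# If the upgrade misbehaves, roll the entire set of manifests back to revision 1
helm rollback my-redis 1
```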

ADAM GLICK: Gotcha. We talked to Phil Wittrock recently about Kustomize, and it uses overlays rather than templates. How would you describe the difference between the two approaches?

VIC IGLESIAS: Yeah. I love Phil's work. He does some awesome stuff. The Kustomize project just takes a different approach to managing these manifests. With Helm, a lot of people were used to the idea of templating from previous lives, so it might fit more naturally for them to do templating in the Kubernetes context if they came from, for example, the Ansible world, where their manifests are also templated.

So I think with Kustomize, it breaks from that and gives you a different approach, which is that you have base manifests that you can then add changes to in an overlay format. So it's just a different model for achieving similar goals. Another piece is that Kustomize doesn't do one of the things that I just talked about, which is that application installation and rollback management. It only handles the one chunk of getting an end state of manifests in place, which is similar to what Helm does in its templating engine. So it's a small subset of what Helm is able to accomplish.
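To make the contrast concrete, a sketch of the overlay model from early kustomize releases; the paths and patch file are illustrative:

```yaml
# overlays/production/kustomization.yaml -- a shared base plus
# production-specific patches, with no templating
# (field names follow early kustomize releases; paths are illustrative)
bases:
  - ../../base
namePrefix: prod-
patches:
  - increase-replicas.yaml
```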

CRAIG BOX: Whenever I ask a roomful of Kubernetes users what they're using to roll out applications, I find about half of them normally say Helm. You also spend a lot of time with Google Cloud customers. Is there a consistent way that you see they use Helm?

VIC IGLESIAS: I think mostly, people are using Helm to just get a handle on naming schemes and to provide a sort of API to their applications, or the components of their applications, that they're installing. And so what do I mean by that API? With Helm, when you create a chart, it allows you to provide what's called a values file, and that values file defines the configurable knobs that you're making available to the end users of your chart.

So what I see a lot of is an operations team, or the DevOps side of the house, will expose only certain things to application developers through this values file. And so they have a little bit more control over what's changing over time, and as pull requests come in, they can see the changes to the values that they think are worthwhile to change. And so I think that gives a layer of control to their process that they didn't have previously.
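A sketch of what such a values file might expose; every key here is illustrative rather than taken from a particular chart:

```yaml
# values.yaml -- the "knobs" an operations team chooses to expose to
# application developers (all keys and values illustrative)
image:
  repository: gcr.io/my-project/my-app
  tag: "1.2.0"
replicaCount: 3
resources:
  limits:
    cpu: 500m
    memory: 512Mi
```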

CRAIG BOX: You can see people use Helm as an app store or as an app repository in a way, being able to say, hey, here's the canonical way to deploy something like MySQL. And then, you can obviously have people build their own charts and templates for internal use. Do you see one more than the other?

VIC IGLESIAS: I think you'll see a blend of both at any particular place. Companies iterate on their own charts much more quickly, whereas for off-the-shelf software, they will maybe use something from, for example, the charts repo as a starting point and then mutate that over time; their own applications change at a much higher frequency. But in general, what I see is both, or at the very least dipping your toes into some off-the-shelf software from the Helm charts repo.

CRAIG BOX: Is it possible to define dependencies between them? So my internal Helm chart depends on something which is provided externally?

VIC IGLESIAS: Definitely. And one of the things that's nice about Helm is that you can have this configuration, this parameterization, of these charts. So something that you'll see commonly is to add a dependency on something that you want to install within the cluster in, for example, the case of development. So my application depends on Redis, and for developers, we depend on this Redis chart and install it when we're doing development. But in production, it actually points at some hosted service -- Cloud Memorystore, or some hosted Redis cache that you have available.

CRAIG BOX: Right.
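In Helm 2, that pattern is expressed in a chart's requirements.yaml; a sketch, with the dependency toggled per environment and all versions illustrative:

```yaml
# requirements.yaml (Helm 2) -- depend on the Redis chart for development,
# then disable it in production (e.g. --set redis.enabled=false) and point
# the app at a hosted Redis instead (version and repository illustrative)
dependencies:
  - name: redis
    version: ^3.0.0
    repository: https://kubernetes-charts.storage.googleapis.com
    condition: redis.enabled
```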

ADAM GLICK: Do you end up with multiple copies of something? If I think about how people store containers, you often have something like a public repo, and that will have multiple different versions of even the same application, created by different creators, different companies that have rolled out pieces. Is it the same with Helm charts, that you'll have a repo with multiples of those? Or is it more like you have one canonical version that you provide -- hey, if I want to deploy my Redis or my MySQL, there are just default versions, and that's the version everyone gets?

VIC IGLESIAS: Yeah. The original intent with the chart tree specifically was to try to have charts that were flexible in those ways. If you had Redis, you would have the ability to turn on Redis clustering and/or have replicas.

CRAIG BOX: It's a one-size-fits-all approach.

VIC IGLESIAS: Right. That's what we wanted to get to. In practice, it turned out that those things were different enough in most of those cases that you wanted to have basically a pointed chart that did the right thing and only cared about handling replication -- for example, one of the early ones we had was the MongoDB replica set.

The ways that those were managed were different enough that it made sense to have different charts. But not only that, people's use cases were different. Usually when people install the basic MongoDB, it's something for development. They just need a MongoDB, no replication. When they're looking at the replica set, that's a completely different use case that they're thinking about, generally. So it ended up working out that way in the evolution of the charts repo.

ADAM GLICK: Where should people store their Helm charts? Is it that central repo you're talking about, where basically everything's stored like an app store model? Or is there a federated model where people can store them multiple places?

VIC IGLESIAS: So at the end of the day, a Helm chart is stored as a tarball that gets uploaded to some sort of object storage. So we see a lot of people using Google Cloud Storage or S3 or any of those kinds of object stores to place their charts. For internal use cases, that's mostly what you're going to see.

The reason people want to put things into the charts repo is mostly to share and collaborate with others, see them iterate over time, and potentially have multiple maintainers contribute to the same chart. One of the longest-standing charts is Jenkins, and we've seen a lot of functionality added to it that I personally would not have expected, but that has become very useful as the chart has evolved and as people's use cases for it have come about.

ADAM GLICK: So you can store your charts in the public repo. You could also store them in a Git repository or any object store. You could easily check them in with source code, for instance.

VIC IGLESIAS: Absolutely. And generally, what we'll see is people have a source code repository that hosts their charts -- it can be many charts per repository -- and then a build process, some CI, that takes those charts and dumps them into object storage for them to be installed by the Helm client.
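A sketch of that CI step: package the chart from source control, regenerate the repo index, and push both to object storage. The bucket, URL, and chart names are illustrative:

```shell
# Package a chart from the source repo into a versioned tarball
helm package ./charts/my-app

# Regenerate index.yaml so Helm clients can discover charts at the public URL
helm repo index . --url https://my-charts.storage.googleapis.com

# Publish the tarball and index to object storage (bucket name illustrative)
gsutil cp my-app-0.1.0.tgz index.yaml gs://my-charts/
```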

CRAIG BOX: Istio has recently been updated to be deployed using Helm. Are you seeing a lot of other first parties-- you mentioned Jenkins there. Are the people who build and maintain open source software starting to write and maintain their own Helm charts for it?

VIC IGLESIAS: Yes, absolutely. And that's one of the, I think, best outcomes of starting the charts repo: we just went rogue on a lot of projects and created something that kind of worked. And we saw a lot of projects basically take what we had started and say, you know, I think we can do this better; we know this software a little bit better than the community does -- and create their own versions of that software.

So we have things like GitLab -- the de facto installation for GitLab on Kubernetes is now via a Helm chart. And we're starting to see more and more of that. I think Datadog has their chart upstream, but they actively maintain it themselves. So they've taken back ownership from the community, which I think is a really good thing.

CRAIG BOX: How else have you seen people respond to changing use patterns, either as people packaging applications or the Helm project itself?

VIC IGLESIAS: One of the big things was that we were, I guess, early on in the Kubernetes project and its adoption, when it really started to kind of take off. But the downside of that was that Kubernetes itself was changing significantly every few months. So keeping up with what is the best practice-- I've been doing Kubernetes stuff at Google for a few years now, and I would go to customers and say, here's the best practice today, and then come back three months later and say there's a different best practice.

So just seeing the evolution of the different patterns come through -- and then the downside of that evolution is having to deal with backwards compatibility. Our charts use semver, so basically when we have to make a breaking change, we want to do a major version bump. And so keeping up with those best practices of just the software delivery cycle was challenging.
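Concretely, that signal lives in the chart's Chart.yaml; a sketch, with illustrative values:

```yaml
# Chart.yaml (Helm 2) -- the chart's own semver version; a breaking change
# to the chart means a major bump, e.g. 1.4.2 -> 2.0.0 (values illustrative)
apiVersion: v1
name: my-app
version: 2.0.0
appVersion: "1.10"
description: An illustrative chart demonstrating a major version bump
```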

CRAIG BOX: I remember when role-based access control came out. That was a challenge for Helm.

VIC IGLESIAS: Exactly. Not just RBAC -- now we had network policies, storage classes, all these things that were better versions of what we had previously. In Helm, we were able to hack through some of these use cases ourselves, but then we were able to use the Kubernetes project's best practice in order to implement it.

So I think the most interesting thing is that today, we've seen a sort of stabilization in those types of changes, where previously, one of those features would come out and we would get 50 PRs, one for each chart, to make that change, right? Which was relatively painful. And so now we're seeing less and less of that kind of thing.

ADAM GLICK: Do you maintain multiple versions of a particular chart? Say, for instance, there's a breaking change: do you maintain both the new version and, for people who are using older versions of Kubernetes and may need an updated chart, a version without the breaking change? How do you manage it, basically, when you've got a breaking change and multiple versions of the chart? Do you have an LTS versus the most recent, or what's the strategy?

VIC IGLESIAS: Yeah. For better or worse, we don't have an LTS and/or a branching strategy per chart. It was never really baked into the model, so we're kind of always moving forward. And the best signal we can give you is that we changed the major version. You can always fork a chart and maintain it yourself, but frankly, we just did not have the capacity to keep up with all of the streams of various charts at the time that we started. And I think still now, we're a little bit understaffed on contributors there.

CRAIG BOX: Helm v2 introduced a component called the Tiller, which basically has to run as root in the cluster and actuate all the things that Helm wants to do. I understand that v3 is doing away with the Tiller. Can you tell us about the evolution of the system that deploys these charts, and a little bit about the direction that it's moving in?

VIC IGLESIAS: Helm v3 is really a break from the full code base that was previously there. The first thing that was done to figure out where Helm 3 should go was to take stock of what was good, what was working, what people appreciated about Helm, and then, what problems were they having. And one of the great benefits is that we'd had enough adoption with Helm 2 that we knew about real production problems, real problems in the enterprise. The stickier problems had already come about. So now, with Helm 3, it's time to really fix those things.

One of the big ones has always been the Tiller, and I remember this since the very earliest dev calls of Helm. First, it's a gRPC-only server-side component, so it's just harder to integrate with, harder to think about, because it's not the common path of an HTTP REST service. So people always had trouble with that early on.

And second is the security model around that Tiller. Previously, we just had it wide open to be root on the cluster. Now there are ways to limit its ability, but that also limits your ability to use Helm as an enterprise. So those things are all going to be fixed in Helm 3.

One of the big changes is going to be using CRDs -- Custom Resource Definitions -- which allow us to use the Kubernetes API server as the store of state, somewhere we can actuate state without having to maintain it ourselves. And I think that'll be a really big change for Helm itself, but there are also other things changing around how the packages work, and how you can share packages between various repositories. It's going to be a new look at a similar problem.
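For readers new to CRDs, a generic definition in the apiextensions v1beta1 form current at recording time shows the mechanism; this is an illustrative sketch, not Helm 3's actual schema:

```yaml
# A generic CRD sketch -- it teaches the API server a new "Release" type,
# whose instances it will then store and serve like built-in resources.
# (The group and names are hypothetical, not what Helm 3 ships.)
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  name: releases.example.helm.sh   # must be <plural>.<group>
spec:
  group: example.helm.sh
  version: v1alpha1
  scope: Namespaced
  names:
    plural: releases
    singular: release
    kind: Release
```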

ADAM GLICK: Speaking of Helm v3, what do you see coming in the future for Helm?

VIC IGLESIAS: Well, I think we're going to have to look at how to migrate people and make sure that everything works as they expect as they're going through these upgrades, because we have a lot of people who are dependent now on Helm as part of their production workloads. So I think first and foremost, make sure that as little as possible has to change for them to get to the new software, which we think is going to be much better and much more user friendly, much more maintainable, all these kinds of things. So I think that's paramount. Once that upgrade path is well known and is simple and easy, I think the rest is just figuring out what other usability-type things developers and operators need out of Helm.

ADAM GLICK: Is there a particular upcoming feature that you're excited for?

VIC IGLESIAS: I think one of the big things is looking at distributed search for Helm charts. One of the things we take on as the Helm charts repository is that we're basically the main central repository, the place where most people want their charts to land. As part of that, we get a lot of contributions, and we have a lot to maintain.

So one of the things we're going to try to do is figure out a way to have a more distributed central charts repo, where people can on-board their own repositories that they manage, and we can take some of the pressure off of the charts maintainers. So that's a very selfish thing to want to have happen, but it is definitely one of the things that I'm most excited for.

CRAIG BOX: I imagine that the way most people will get involved with Helm is looking to deploy an application and finding the chart for it. What do you see as the path to contribution for people who want to get involved?

VIC IGLESIAS: Definitely. The most common path is, hey, I want to install software X. You type in "Kubernetes software X", the charts repo version of that shows up, and you fork that and make it your own. I think you can look today at the number of forks on the charts repo, and it's significant, and that's actually a good thing. We're not always going to be everything for everybody, but we want people to have a place to start and to have a working version that they can then iterate on, so that they don't have to start from a blank page. So I think taking a known application -- even if it's not necessarily something that you need to use, just something that you know how to use outside of Kubernetes -- seeing the chart's representation of that, and then forking and mutating it to a use case that makes sense for you, is a really good way to get your training wheels on with Helm.
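That fork-and-mutate path might look like this with the Helm 2 CLI; the chart and release names are illustrative:

```shell
# Pull a chart's source locally so you can read and modify it
helm fetch stable/jenkins --untar

# ...edit jenkins/values.yaml and jenkins/templates/ to fit your use case...

# Install your modified copy from the local directory
helm install ./jenkins --name my-jenkins
```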

CRAIG BOX: And do you see a lot of people submitting patches back from their forks to the upstream repo?

VIC IGLESIAS: Absolutely. Absolutely. We are constantly receiving pull requests. I think we have on the order of 300 active pull requests at any given time. When we started out, it was much less than that. But we see just constant contribution and people who want to see the upstream chart be as good as possible and as well maintained as possible.

CRAIG BOX: All right. Great, Vic. Thank you very much for joining us today. It's been fantastic to hear a little bit about Helm. If you want to learn more about Helm, please find the websites in our show notes: you can check out helm.sh for the main Helm website, or github.com/kubernetes/helm for the chart repository.

ADAM GLICK: Vic, if anyone who's listening wants to get in touch with you or follow what you're doing, how can they follow you?

VIC IGLESIAS: I am on Twitter, which is probably my most active place to be, at vicnastea-- V-I-C-N-A-S-T-E-A-- a very clever way to spell that.

ADAM GLICK: Aha.

CRAIG BOX: I always think it's like Vic ice tea.

VIC IGLESIAS: It's whatever you need it to be. I have no idea how it got there.

ADAM GLICK: Cool. And we'll put a link to that in the show notes as well.

VIC IGLESIAS: All right. Thanks.

ADAM GLICK: Awesome. Thank you so much for your time today, Vic.

CRAIG BOX: Thanks, Vic.

[MUSIC PLAYING]

ADAM GLICK: That's the end of our show for another week. Thank you for listening. If you've enjoyed the show, we have a favor to ask. If you're an Apple user, please take a moment to go to iTunes and rate our show. It really helps other people find the podcast, and we appreciate your help. This five-second process will help us get the show out to more people and keep bringing you news and interviews from around the Kubernetes world.

CRAIG BOX: Aside from that, please just go to Twitter and tweet to us to let us know that you're listening. You can tag us @KubernetesPod, or just tell your friends that you're enjoying the Kubernetes Podcast from Google.

ADAM GLICK: You can find the links for this show and all others at kubernetespodcast.com. Until next week.

CRAIG BOX: See you later.

[MUSIC PLAYING]