#5 May 29, 2018

Kubernetes Documentation, with Zach Corleissen and Jared Bhatti

Hosts: Craig Box, Adam Glick

This week, Craig and Adam bring the news from Google Kubernetes Engine and elsewhere, and talk to SIG-Docs leads Zach Corleissen (from the CNCF) and Jared Bhatti (from Google).

ADAM GLICK: Welcome to the Kubernetes Podcast from Google. I'm Adam Glick.

CRAIG BOX: And I'm Craig Box.

[MUSIC PLAYING]

ADAM GLICK: Hey, Craig, how are things down under?

CRAIG BOX: I've been at Container Camp. It's a fantastic indie conference. There's a great breadth of presentations this year from a lot of people, a lot of first-time speakers giving excellent talks as well.

It is fair to say it's become a little bit Kubernetes camp, which is to be expected, but a lot of in-depth talks about building container images, keeping them secure-- very good overall. Container Camp is run this year in Australia, and later on in the UK. So if you have a chance, I recommend checking it out. How's Milan?

ADAM GLICK: Milan is wonderful. I got a chance to be at the Cloud Summit here in Milan. That's the Google Cloud Conference locally here, a little over 2,000 people, was a great crowd, a ton of interest in Kubernetes and Istio. The session there on that was a packed room, as a lot of people are really looking at what's the next generation of the infrastructure they want to be building, and learning a lot more about Istio and Kubernetes as their way to do it.

CRAIG BOX: Let's take a look at the news.

ADAM GLICK: Google Cloud has updated Kubernetes Engine to bring Kubernetes 1.10 to general availability. Aside from the new open source features, four new enterprise features were announced, including support for running in shared VPC networks, node auto repair, custom metrics for horizontal pod autoscaling, and regional persistent disks.

CRAIG BOX: On most cloud providers, and indeed, until recently on Google Cloud Platform, the disks you can attach to a VM were only available in the single zone. In the event of that zone becoming unavailable, you would not be able to reach that data. With regional persistent disks, Google synchronously replicates the data between two zones in a region, where it can be reattached in the event of a zone outage. With such a highly available storage layer, you can run high-availability systems without having to worry about application-level replication. In the event of that zonal failure, GKE will manage that transition for you.

ADAM GLICK: Another new GKE feature, the horizontal pod autoscaler, lets you set a number of replicas in your deployment based on CPU or memory usage. With GKE and Kubernetes 1.10, you can scale on a custom metric exported from one of your pods, and also now on external metrics, like the length of a Pub/Sub queue.
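The scaling rule behind this can be sketched in a few lines. This is a minimal illustration, assuming the proportional rule described in the Kubernetes documentation (desired replicas = ceil(current replicas x current metric / target metric)); the function name, the default bounds, and the tolerance default are hypothetical choices made for the example, not the autoscaler's actual code.

```python
import math


def desired_replicas(current_replicas, current_value, target_value,
                     min_replicas=1, max_replicas=10, tolerance=0.1):
    """Hypothetical helper: proportional scaling toward a per-pod target.

    current_value and target_value are per-pod averages of any metric,
    for example CPU utilization or messages waiting per pod.
    """
    if target_value <= 0:
        raise ValueError("target_value must be positive")
    ratio = current_value / target_value
    # Close enough to the target: keep the current replica count.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    wanted = math.ceil(current_replicas * ratio)
    # Clamp to the configured bounds (assumed defaults above).
    return max(min_replicas, min(max_replicas, wanted))


# Four pods, each seeing 90 queued messages against a target of 60.
print(desired_replicas(4, 90, 60))
```

The same function covers CPU, custom, and external metrics, since only the ratio between the observed value and the target matters.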

CRAIG BOX: Moving on, containerd's new 1.1 release is now generally available for use in Kubernetes. This release bakes in support for CRI, the Kubernetes Container Runtime Interface. Benchmarks show that Kubernetes with containerd has lower pod startup latency, CPU, and memory usage than Docker, which, of course, comes from containerd being split out from the Docker daemon into its own project. Docker Engine will also use containerd 1.1 in an upcoming release.

ADAM GLICK: The CNCF this week invited two new projects into their sandbox, the entry point for early stage projects in the CNCF. The first of these is CloudEvents, a specification that provides a consistent set of metadata to make events easier to work with for publishers, middleware, subscribers, and applications. It was submitted by the CNCF Serverless Working Group. And contributing companies include Google, Alibaba, Huawei, IBM, Iguazio, Microsoft, Oracle, Red Hat, SAP and VMware.

CRAIG BOX: Secondly, Telepresence is a tool that lets you run a single service locally, while connecting that service to a remote Kubernetes cluster. It was donated by its original creators at Datawire. Namely, an HR SaaS provider and a Kubernetes user, uses Telepresence, along with Istio, to deploy services into their cluster, and have Istio route requests made by a developer working on a particular service to that service running on the developer's laptop. And we'll put a link in the show notes to the video explaining that.

ADAM GLICK: SAP launched Gardener, an open-source tool for managing and updating multiple Kubernetes clusters using Kubernetes itself. In keeping with the botanical metaphor, Gardener is a tool that uses a seed cluster, which is similar to a master node in Kubernetes, to manage multiple shoot clusters, metaphorically similar to Kubernetes nodes, all with custom resources on the Kubernetes API server. It works with Google Cloud, Azure, AWS, and OpenStack today and promises to work with on-prem deployments as well. You can find out more about the project at github.com/gardener. That's G-A-R-D-E-N-E-R.

CRAIG BOX: This week saw the first birthday of the Istio project, announced last year on May 24. Currently in the stability freeze, the proposed long-term support release, Istio 0.8, has only a few remaining issues to be fixed. And we expect a release candidate to be cut any day now.

ADAM GLICK: Canadian consulting company CloudOps has announced that they have joined the Kubernetes training partner program. CloudOps joins seven other companies in the training program, and have upcoming workshops in Montreal, Toronto, and Ottawa.

CRAIG BOX: That's all for the news this week. Our guests today are two [INAUDIBLE], Jared Bhatti from Google, and Zach Corleissen from the CNCF. Welcome.

JARED BHATTI: Hi.

ZACH CORLEISSEN: Hello.

ADAM GLICK: So can you tell us who you are, what do you do?

JARED BHATTI: I'm Jared Bhatti. And I work on documentation for Google. And I focus on Kubernetes content.

ZACH CORLEISSEN: I'm Zach Corleissen. I'm the lead technical writer for the Linux Foundation. And right now, I focus mostly on Kubernetes.

ADAM GLICK: Jared, were you involved with Kubernetes documentation before [INAUDIBLE]?

JARED BHATTI: Yes. So I've been part of the Kubernetes content really since inception, since day one. And I am one, very happy to have Zach on board, and two, very happy that we have a lot more process, and standards, and community contribution process in place. Part of what's so important for me when it comes to the content for Kubernetes is that we democratize the technology, and actually include this larger group of developers who want to contribute.

And content is how we do it. Content is how we share information. It's how we engage with that larger community. And it's a lot easier for them to connect with us if they know where to find their content, and know, actually, where to get started, and how to use it.

CRAIG BOX: My recollection of the early days is very much engineers just write documentation for each feature. And then if you're lucky, it's been edited by someone by the time it goes live. We're a lot further away from that today. How has that evolution been?

JARED BHATTI: Launches used to feel like about 1,000 people trying to cram through a doorway all at the same time. And I would usually be standing there trying to edit their content before it got out the door. And it was really nice with 1.4 to actually sit down, and hammer out a good, solid launch process, where we're actually involved, where people come to us, give us PRs. We review them, and it's a much smoother process.

And a lot of what we're doing now is trying to make anybody in the community capable of doing that process. So somebody who knows our docs, has spent some time filing a few PRs, reads our launch content, our launch documentation-- we actually have content for our launches-- can actually get some out the door before a launch, and do it without a ton of hand-holding from us.

CRAIG BOX: Does most of the documentation still come from launches?

JARED BHATTI: Ooh, we're starting to shift. You want to talk a bit about that?

ZACH CORLEISSEN: So for content related specifically to launches, yes and no at the same time. We still follow the release process, where the documentation process is aligned with the rest of the Kubernetes release cycle. So a lot of feature documentation, either creation or updates, comes through the release cycle. But we also release continuously. So that's part of why we do things differently than any other Kubernetes repository, is the ability to keep documentation fresh and new, or updated, or removed constantly.

ADAM GLICK: How do you ensure the quality of the documentation that people are checking in?

ZACH CORLEISSEN: I would say that's one piece that we're doing a lot better job of now, during the release process, for documentation particularly. So working with the larger Kubernetes release team has been a godsend. As releases open up right now, there's a really well-tracked list of features that are going into a particular release. So it's very easy to track whether or not documentation has been submitted to that.

And we have, from the documentation community, from SIG Docs, every release cycle has a dedicated maintainer attached to it, so a Release-Meister. And that person is responsible for making sure that all of the pull requests associated with a particular feature have the checks, that they have the content needed, that they receive at least a minimum of editorial review, and are shepherded through the process.

CRAIG BOX: What about, then, the overview? If you are committing a new feature, you'll write documentation related to that feature. But then there'll be a bunch of other things that are now invalidated, like if you had a documentation page that said, "Use Feature X," and then Feature Y invalidates that, the Feature Y team aren't necessarily aware of that. Is there some process that's going through and updating each page, each release?

JARED BHATTI: Well, hopefully, they are aware of that during the release process. And part of what we do is we're user zero. We're people who will test the feature before we write content for it, before we launch content for it. So it's really nice for us to actually get our hands dirty, figure out what's the valid use cases for this, and then give feedback to the engineers building it.

So some of that overview content that we-- you asked about, like, are we working on launch content or are we migrating away from that? And yeah, we're doing a lot more work on onboarding people. So you're getting people in the front door, [INAUDIBLE] on the right path, picking the right solutions, whether it's running it themselves, hosting it on GKE, whatever it is they need.

CRAIG BOX: As we've reorganized the documentation into tasks, concepts, and journeys, there are people, presumably, who just say, hey, I know what I want. I want to go find the information specifically about that. And I imagine that they have a very different way of finding information than others. How do you balance those two types of person?

JARED BHATTI: Yeah, it's funny. We did some user studies on Google's content, and basically discovered that some people want to know how everything works right off the bat, before they do anything. And other people, they just want to start doing stuff. And then if they need to know, they want to look into that, which is why we split out tasks and concepts.

We also split out the reference content. Because there's plenty of people who are, like, I just want to know what the API call is. I want to get to that.

ADAM GLICK: Do you want to talk more about that?

ZACH CORLEISSEN: Yeah. I would say that what user response has been like, I think it's been very polarized. I think that there are people who are very happy with the user journeys, and people whose overall response to the documentation is still that they're experiencing a lot of frustration. And I think those are all equally true and equally valid.

There are ways in which we really succeed with documentation with the updates to user journeys. And there are ways in which I think we really fall down still, in terms of our content, Kubernetes documentation really fully meeting users' needs. I think that that's an ongoing project.

ADAM GLICK: How do you go through the evolution of that? One of the things that I always notice when I'm looking at a new piece of software is the first thing that starts is the API documentation, or just here is the raw stuff that if you already know how to use it, you'll understand this. And if not, then this may not be as helpful.

And then normally I see that evolve into something else. So they start with user guides, or getting started guides, or things that actually take people through examples, where they say, here's not just the pieces that you could put in here, but here is how you actually put it all together. And, at least as a weekend hacker, as I go through things, those are the things that I often find to be much more valuable. But it doesn't seem to be the way that, at least many pieces of software, write their pieces. How do you think about what the right way to evolve documentation is?

JARED BHATTI: So that's one piece that plays into what we've been doing over the last six to nine months, and where we're headed. So you mentioned API documentation in particular. That was a piece that was problematic for us, because to be really honest, we didn't know how it was generated.

We had to go to Phillip Wittrock, and say, hey, will you regenerate these docs for us? So it was an opaque piece of functionality on the site. We just didn't know the toolchain for how reference documentation was generated.

We do now. We have a full and complete understanding of how to generate documentation for all of the Kubernetes APIs. And getting to that point of just being able to understand it, and now make improvements to it, has been a huge step.

So in terms of quality, just getting our own mastery of the API documentation squared away has been a big piece. In terms of where we're focusing on in the next six months, we are taking a look at-- on the site, if you go to the Kubernetes website and click on Documentation, in the nav pane at the top, one of those nav buttons is called Setup.

We'd like to change that to Getting Started, and go through all of that content, and make sure that it is up-to-date and accurate. Some pieces of it are woefully out of date. But we are working with the Cloud Provider working group to make sure that all of the providers and platforms who contribute specific setups have a very clear path to ownership and change for that documentation, to make sure that providers are maintaining their own documentation as accurately as possible for themselves.

ADAM GLICK: Nice. Docs really help out in terms of, as someone who is not a master at the technology, but continues to--

JARED BHATTI: I feel validated.

ADAM GLICK: [INAUDIBLE] scaling documentation started out as a project that Jared, you were working on. And then the CNCF now has a full-time member on that. And my guess is that's probably not the end. More people will be added and growing. As the project grows, so will documentation. How do you think about scaling that as a project? And how do you handle the growth of not only the amount of code that you need to document, but the number of people who are contributing to the documentation?

JARED BHATTI: I always review the content. But there are editors on my team who have jumped in, and contributed a tremendous amount of content as well.

ZACH CORLEISSEN: So have other CNCF partners, as well.

JARED BHATTI: Exactly. We host a Doc Sprint here. And pretty much every KubeCon, we have a Doc Sprint, where we bring in people who are interested in contributing, and basically run them through the process of how do you do your first PR? How do you file issues against the site? And really welcome them into filing content and getting involved.

ZACH CORLEISSEN: Amusingly and wonderfully, that's only-- I would say the actual dedicated technical writers tend to be outnumbered at our Doc Sprints. We get a pretty heavy set of interest from feature developers, engineers, people who are more directly involved in creating features, who come to the Doc Sprints. It's wonderful.

ADAM GLICK: That's awesome.

ZACH CORLEISSEN: More to the scaling piece, that's a piece that we struggle with sometimes, because it's the classic technical writer problem. If you could throw enough technical writers at something, it ceases to be cost efficient to hire technical writers.

ADAM GLICK: The Mythical Man-Month?

ZACH CORLEISSEN: Exactly. But no, at the CNCF we're hiring for another technical writer role. It's more likely to be a blended technical writer/developer advocate role. And we're talking with some really good candidates right now. But our hope is to focus not just on Kubernetes documentation, but to solve larger documentation concerns across the whole of the CNCF suite of projects.

JARED BHATTI: Yeah. There's a lot of lessons that we've learned in growing our doc set and building on some of our tooling and templates that we'd like to share with other open source sites, and have them adopted, maybe adopt our community and contribution processes as well, and at the very least, learn from our mistakes.

CRAIG BOX: Tell me about the recent launch of the new doc site.

JARED BHATTI: Oh, Zach, you've been driving that. So I'll let you. Do you want to chat about that?

ZACH CORLEISSEN: Sure.

JARED BHATTI: Even though I was like, I should do this.

ZACH CORLEISSEN: That's kind of how our partnership works, is I get a lot of my best inspiration from you, and then run off and do it.

JARED BHATTI: Aw, thank you.

ADAM GLICK: If you guys need a moment, we can--

ZACH CORLEISSEN: It's been an ongoing moment since we've met four or five years ago. So the changes that we're bringing to the Kubernetes website, we are changing the static site generator that we use. Up until now, Jekyll has been the static site generator that we've been using.

But we are switching to Hugo. And that's driven in part because of internationalization. So you talk about making documentation more available and scaling documentation. Internationalization is a huge challenge in scaling.

And we looked around at Jekyll implementations for different language translations, and they just were not easy. There's nothing that's well-maintained currently. And there's nothing that works easily.

And Hugo is kind of the polar opposite of that. Its internationalization multi-lingual support is built in, and it's very easy to implement. So we are changing the site to take advantage of just an easier pathway for contributors overall, and looking at solutions that are going to scale well for users across the globe.

CRAIG BOX: I wonder then, it seems like the barrier to internationalization is less about tooling than the workflow required. If I come across a document which is initially in English, and I commit a change to that, what process then makes sure that change is updated in all the other languages the document is translated into?

ZACH CORLEISSEN: There's no easy way to do that to make sure that that kind of change is universal. What I've seen in other open source projects that deal with translation, their approach has been simply to flag when a change has been made, so that other maintainers of other repositories for language-specific content, that they are aware that a particular file has changed, so that the change is visible. But there's really no way to automate that translation.
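The flag-on-change approach described here could be sketched roughly as follows. This is an illustration only, assuming each translation records a digest of the English text it was based on; the function names, the data layout, and the example paths are all hypothetical and do not describe tooling used by the Kubernetes site.

```python
import hashlib


def digest(text):
    """Stable fingerprint of a page's English source text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def stale_translations(english_pages, translated_from):
    """Flag pages whose English source changed since translation.

    english_pages:   path -> current English text
    translated_from: language -> {path -> digest of the English text
                     the translation was based on}
    Returns language -> sorted list of paths that need a human look.
    Pages with no recorded digest (never translated) are flagged too.
    """
    report = {}
    for lang, recorded in translated_from.items():
        flagged = [path for path, text in english_pages.items()
                   if recorded.get(path) != digest(text)]
        report[lang] = sorted(flagged)
    return report


# Hypothetical example: one page was edited after the zh translation.
english = {"concepts/pods.md": "Pods, revised", "tasks/run.md": "Run an app"}
recorded = {"zh": {"concepts/pods.md": digest("Pods, original"),
                   "tasks/run.md": digest("Run an app")}}
print(stale_translations(english, recorded))
```

The sketch only makes the change visible to maintainers of each language; as noted above, the translation itself still has to be done by people.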

CRAIG BOX: Do the communities that you have been in, have you seen-- is that normally a batch thing, where every release, a translator will go through and adopt all of the adjustments? Or are they--

ZACH CORLEISSEN: We are so new at this. Translation--

JARED BHATTI: Yeah. We're still figuring this out. We have a great Chinese community that's been doing community translations.

ZACH CORLEISSEN: They have been so patient with us.

JARED BHATTI: And yes. And if they happen to be listening, then thank you so much for your patience and time.

ZACH CORLEISSEN: Yes, thank you.

JARED BHATTI: We're working with them, and developing some of the tooling to engage them on that level. But right now, we're launching Hugo to support them. And we hope it works. But we're still figuring out what the exact mechanisms are.

ADAM GLICK: What's coming up that people should be excited about?

ZACH CORLEISSEN: Well, internationalization, I think, is the big piece.

ADAM GLICK: What languages do you hope to be able to support with that?

ZACH CORLEISSEN: So we've got a Chinese team that's been working really dedicatedly towards completely translated content for the past year. But we were just contacted by a Korean team that wants to begin as well. So I think that the Korean team is going to benefit from all of the pain that the first translation team went through.

Other changes that are coming up-- one of the contribution barriers that we've seen has been the site architecture. When contributors come and want to make changes to documentation, if they're adding documentation or removing it, the requirements for site architecture have not been transparent in Jekyll at all. There are some YAML files that you have to update, and that process is non-intuitive.

And in Hugo, you pretty much just change the file, or add it, or remove it. And the change itself changes the site architecture. So that's a lower barrier.
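The idea that the files themselves define the site structure can be illustrated with a small sketch. It assumes navigation is derived purely from page paths, with no separately maintained table of contents; `build_nav` and the example paths are hypothetical, and this is not how Hugo is implemented internally.

```python
def build_nav(paths):
    """Nest page paths into a dict-of-dicts tree; pages map to None."""
    tree = {}
    for path in sorted(paths):
        parts = path.split("/")
        node = tree
        # Walk (or create) one nested dict per directory in the path.
        for section in parts[:-1]:
            node = node.setdefault(section, {})
        node[parts[-1]] = None
    return tree


# Adding or removing a path is the only step needed to change the tree.
pages = ["concepts/pods.md", "tasks/run.md"]
print(build_nav(pages))
print(build_nav(pages + ["concepts/workloads/deployments.md"]))
```

With this shape there is no second file to keep in sync, which is the lower barrier being described.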

Another thing that we're really excited about in Hugo is faster site builds. Hugo builds instantaneously, more or less. And we use Netlify for staging builds. And in order to preview, in order for a pull request to be mergeable, it has to pass through that gate check. And Netlify has to be able to successfully build it.

And in Jekyll, those builds are many minutes long. Jekyll's performance tends to dwindle when the site gets over about 100 pages. And at what, like 270, 280 pages, Kubernetes is a huge site. And--

JARED BHATTI: That's all of the versions of the site, and the additional languages. It just becomes kind of a monstrosity.

ZACH CORLEISSEN: Yes. And so just better site performance overall.

CRAIG BOX: What proportion of your time do you spend writing versus building the infrastructure to allow other people to write?

ZACH CORLEISSEN: I don't get to write nearly as much as I would like to, but that's--

CRAIG BOX: Do you have an unfinished novel in you?

ZACH CORLEISSEN: Oh, what a deep, dark, tech writer question. I think one of the joys of being a lead, a SIG lead for Kubernetes is being able to contribute really meaningfully to the vision of where a SIG is headed. But the cost of that is not being able to participate so directly in the work.

JARED BHATTI: I think tons of people from the community come forward with great ideas.

ZACH CORLEISSEN: Yes, they do.

JARED BHATTI: And they do a good job of-- I'm glad that people don't just throw-- hey, it would be really great if you had this doc automation tool-- bye. People actually spend the time, and dedicate the time to regularly sync with us, propose ideas. And some of them are-- they're things that we end up turning down.

They're, like, we've got this great idea. You go, well, yeah, not right now. So some of it is just setting our scope and keeping it focused.

ZACH CORLEISSEN: It's another piece. When you talk about going back to the scaling question, how do we scale, so the SIG test infra team for Kubernetes is an awesome group. And they're doing some really cool work with bots on the site to just make contribution a much smoother process overall. And that hasn't always been an easy adoption process for SIG Docs. But the work that they're doing makes our scaling process so much easier, and so much faster.

But like Jared says, we're fortunate enough to have a really vibrant and really dedicated community. And as a lead, that's one of my goals is to make sure that when contributors of whatever background and whatever skill level, if they bring an idea to a SIG meeting, and they present a way to make documentation better, that even if we turn it down, that we give it the full consideration.

ADAM GLICK: So speaking of the community, you've mentioned a couple of ways that people can find their way to get involved with it. Paris Pittman, when we spoke to her earlier, talked about how documentation is a great way for people to get started with the Kubernetes community. You'd mentioned Doc Sprints as a way that people can come in, and you walk them through it.

What about people who aren't able to make it to one of those? What are ways that people can get started getting into the docs? And you mentioned you made it a little bit easier now that you are going with Hugo. But what's a way that someone who is interested in getting-- if I want to go and I want to make my first docs check-in, where do I start?

JARED BHATTI: Well, I think a good place to start is to go to the site and start reading our content. And if you see an issue, file an issue. And every single page has a little pencil icon in the upper right-hand corner that you can click on, and edit that particular page, and send us a pull request.

If you go to our list of issues at kubernetes/website, there's a list of issues that are all tagged with the tag good first issue. And they're usually smaller, easy-to-figure-out fixes that usually have the next actionable step listed in the issue. So you can just claim that issue, file a PR. It's a great pathway to become a contributor, and it's also a great way to just get involved in the community.

ZACH CORLEISSEN: And join the Slack channel, too. At SIG Docs, our Slack Channel is where we do the most communication about Kubernetes documentation. And we have contributors from all across the globe. So it's a great way to communicate asynchronously. If they can't make the 10:30 AM Pacific meeting time, the SIG Docs channel on Kubernetes Slack is a great way to be a part of the community.

JARED BHATTI: As we move to the new Hugo site, we're also looking to revamp the community guides to make them a lot clearer for people coming to the site for the first time. We want to make using the content as easy as possible, and for their ability to offer suggestions to us and give us feedback as easy as possible as well. So I think as people out there see an issue with the site, have an idea for improvements, feel free to file an issue and let us know. We want your feedback.

CRAIG BOX: So as Kubernetes is an open source project with a foundation behind it, in this case the Cloud Native Computing Foundation, I imagine it's quite rare for a full-time, foundation-backed documentation writer. This is a new position for the CNCF, as I understand?

ZACH CORLEISSEN: It is. There's a number of writers who contribute to other Linux Foundation projects, like gRPC, for example. I know Lisa Carey works on that. She's actually been offering us some advice on Kubernetes and other projects, too.

ADAM GLICK: Fantastic. Well, Jared, Zach, it's been great having you both here. Thanks for spending time with us.

JARED BHATTI: It's been an honor to be here with both of you as well. Thanks for having us.

CRAIG BOX: Thank you.

ZACH CORLEISSEN: Our pleasure. Thank you.

ADAM GLICK: That's about all we have time for this week. If you want to learn more about Kubernetes documentation, you can check out Kubernetes.io/docs/home for docs on Kubernetes, as well as info on making your own contributions to the Kubernetes documentation.

CRAIG BOX: Thanks for listening. As always, if you've enjoyed the show, please help us spread the word and tell a friend. If you have any feedback for us, you can find us on Twitter @KubernetesPod. Or reach us by email at kubernetespodcast@google.com.

ADAM GLICK: You can also check out our website at kubernetespodcast.com. Until next time, ciao.

CRAIG BOX: See you later.

[MUSIC PLAYING]