#70 September 11, 2019

Windows Server Containers, with Patrick Lang

Hosts: Craig Box, Adam Glick

Patrick Lang is the co-chair of the Kubernetes Windows SIG. He is a Senior Software Engineer at Microsoft, developing Kubernetes and related open-source projects supporting Windows Server Containers. Patrick joins Adam and Craig to tell the story of how containers came to Windows.

Do you have something cool to share? Some questions? Let us know:

Chatter of the week

News of the week

CRAIG BOX: Hi, and welcome to the Kubernetes Podcast from Google. I'm Craig Box.

ADAM GLICK: And I'm Adam Glick.

[MUSIC PLAYING]

CRAIG BOX: We're both on the road this week. I'm in Hong Kong. The only advice I was given was stay away from the protests. I thought, well, that's pretty safe advice. Instead, we will go up the Victoria Peak, the mountain in the middle of the Hong Kong Island. And so we trotted off to the Victoria Peak tram station.

Which, as it turns out, is right beside the US consulate.

ADAM GLICK: Yes, the base station there.

CRAIG BOX: And I don't know if you had a chance to see what was on the news the last few days, but there was a giant protest at the US consulate over the weekend-- tens of thousands of people all running around with American flags.

ADAM GLICK: Interesting. Did you enjoy the hike? I thought it was a beautiful hike and a beautiful view of a great city.

CRAIG BOX: Yes. Well, I ended up the next day just taking a car up to the top. And it's a much better idea, actually, rather than trying to go up on the tram. Take a car to the top, and that way, it can drive you right to the very top of the garden, and you can walk around and do it all downhill, and then get the tram back down to the bottom. It all cleared up the day after, so it was all very good.

ADAM GLICK: Is there still a mall up there?

CRAIG BOX: The tram station at the top is not so much a tram station as a Madame Tussauds/paid viewing platform/food gallery place, but it has great air conditioning, which, in Hong Kong, is very necessary.

ADAM GLICK: Awesome. I am doing slightly different trekking, moving through Oregon this week, running into a lot of folks from Burning Man, realizing that the hot springs in Oregon are sometimes clothing optional, but running into some great people, hearing some interesting stories, and really enjoying just getting out in nature a little bit.

CRAIG BOX: A lot of cycling in Oregon, I'm told?

ADAM GLICK: There is. You'll actually find a fair amount of cycling on the highways, that a number of the highways have bicycle lanes on the highway itself.

CRAIG BOX: Seems a safe place to put them.

ADAM GLICK: Shall we get to the news?

CRAIG BOX: Let's get to the news.

[MUSIC PLAYING]

CRAIG BOX: The schedule for KubeCon US in San Diego this November has been announced. Highlights for us so far include a keynote featuring Google's Tim Hockin, star of episode 41, and Kal Henidak from Microsoft, who will be doing a dual talk on dual stack IPv4 and v6 in Kubernetes.

There's also a keynote where a team promises to build a fully containerized 5G network live onstage, which sounds like one of those comedy gigs where you have to put your mobile phone in a lockup bag. If you can attend, please make sure you come to whatever meet-the-podcast-hosts party we end up organizing. If you can't attend, all the sessions will be recorded, thanks to the fine folk at Google Cloud.

ADAM GLICK: Shielded VM nodes have been added to GKE in beta. These nodes bring the capabilities of Google Cloud's Shielded VMs to GKE, adding security features such as secure and measured boot, a virtual Trusted Platform Module, or vTPM, UEFI firmware, and integrity monitoring. Establishing trust using the vTPM means you don't need to have or hide access to the cloud metadata service, which has service account credentials for the VM, as this information is now securely exchanged at boot.

CRAIG BOX: Containo.us, makers of the Traefik proxy – spelt with an ash, the letter A-E, like how people write 'encyclopaedia' when they want you to think they speak Latin – has turned it into a service mesh called Maesh, also with an "AE". Maesh uses Traefik as a routing engine and eschews sidecars in favor of the out-of-favor single-daemon-per-node (or single-point-of-failure-per-node) model.

This is the model that Linkerd used before switching to a sidecar proxy model in v2. You also have to configure your DNS per service to opt into using the mesh. This definitely makes it lightweight and simpler-- as promised, but limits the useful scope to inside a single Kubernetes cluster.

ADAM GLICK: Version 0.15 of Project Contour-- an ingress controller for Envoy-- is out. Headlining this release is leader election, where if you run more than one instance of Contour, only the leader will update the Envoy configuration. You also have to opt in or deliberately opt out of TLS security between the controller and Envoy. Another post on the blog also talks about running Contour on kind, neatly tying back to last week's episode.

CRAIG BOX: At the TechCrunch Sessions conference last week, a panel brought together three of the earliest contributors to Kubernetes-- the aforementioned Tim Hockin, Brendan Burns from Microsoft, and Joe Beda from VMware, our guest on episode 12-- as well as Aparna Sinha, director of product management for Kubernetes at Google and our guest on episode 13. In the 30-minute video, you can learn about the past, present, and future of the project and how it has changed how enterprises think about moving to the cloud and developing software.

ADAM GLICK: As Kubernetes 1.16 approaches release, 1.15 is rolling out to the major clouds, launching in preview on Azure Kubernetes Service and the Rapid Channel for Google Cloud's Kubernetes Engine. If you're curious about the update schedule for many of the cloud vendors, a GitHub comment from Aaron Wright House has looked into the last few releases and made guesses as to the pattern.

CRAIG BOX: Scalability isn't just a line drawn on two axes. It's more like an n-dimensional hypercube. To help us understand it in the three dimensions we humans can perceive, Google Cloud has published guidelines for creating scalable GKE clusters. Building on the published Kubernetes limits, the team explains how you can size a cluster based on the amount of work you want to do in it. The article talks about how the limits to numbers of Pods, Services, and CRDs, for example, aren't strictly fixed, but will cause you to fall out of your service level objective if you start trying to do things the system wasn't designed to do, a good way to avoid an entry on Kubernetes failure stories.

ADAM GLICK: In advance of the EU Summit, the Cloud Foundry project has shared an update on their work on service mesh. The networking team has Istio and Envoy working on Cloud Foundry running on the traditional Diego scheduler, but have decided not to further invest in this. Instead, they are now working only to support Cloud Foundry apps running on Kubernetes, a further sign that Kubernetes is the direction of travel for the project going forward.

CRAIG BOX: Ivan Babenko has this week finished a four-part series on using the Symfony PHP framework with Kubernetes. In the blog series, he covers containerizing the application, publishing it with Helm, testing it, and building a continuous delivery pipeline-- all topics with a multitude of existing opinions and writings, but this series is Symfony-flavored.

ADAM GLICK: You can write a blog post about setting up your blog on Kubernetes, or you can do the same and call it the Cult of Kubernetes. You'll get a lot more Hacker News chatter if you do the latter. Christine Dodrill moved her blog to Kubernetes and the orange site is telling her how wrong she did it, many without actually reading the article she wrote. If it continues to generate discussion, we're contemplating renaming the show to the Cult of Kubernetes podcast.

CRAIG BOX: And that's the news.

[MUSIC PLAYING]

ADAM GLICK: Patrick Lang is a senior software engineer developing Kubernetes and related open source projects supporting Windows Server containers at Microsoft. He's the co-chair of the Kubernetes Windows SIG. Welcome to the show, Patrick.

PATRICK LANG: Thank you for having me. Glad to be here.

CRAIG BOX: Most people listening to this will be familiar with the concept of a Linux container. What is a Windows Server container?

PATRICK LANG: A Windows Server container-- much like a Linux container-- is all about being able to wrap up an application and its dependencies in a way that runs the same anywhere. So like on Linux, you start from a base image that may be Alpine, Debian, or whatever, which has a basic userland that's there. And then it uses the shared kernel that you already have on the hosting operating system.

In Windows, what we did was we created two images. One is Windows Server Core, which is pretty much what you would get with a full Windows Server 2016 install, but with the UI removed. The other is Nano Server, which is much, much smaller-- closer to the size of a Linux container. But basically, you're layering what your application needs on top of that and building it up so you can deploy it anywhere. So that way you've got something that's faster and more repeatable than a VM.

CRAIG BOX: These days, you can run Windows applications on Linux with Wine and you can run Linux applications on Windows with the Windows Subsystem for Linux. At a 101 level, you might now consider the operating systems conceptually similar with the single kernel, multiple users, and so on. But how are Windows and Linux different?

PATRICK LANG: A lot of this comes back to sort of historical differences. So one of the goals when Linux was developed was, they were trying to make something that was a relatively stable kernel with a consistent application binary interface, or ABI, as well as a limited set of syscalls. And that was all designed because they were trying to make it easy to move applications over from other commercial Unix-type operating systems but still get a consistent experience there.

Windows was developed quite a bit differently in that it was designed to be an entire operating system-- not just a kernel, but also the API set you used, initially referred to as the Windows API and later as Win16 or Win32. But a lot of those things that people were using in user mode for their application to handle things like clicking on windows, handling text input, concatenating strings, all that kind of stuff was all kind of seen as one programming interface. And so the lines between kernel and user mode were often a bit blurrier and less stable.

CRAIG BOX: Especially on the 286.

PATRICK LANG: [CHUCKLES] Yeah. Don't get me started there. But as each version of Windows came out, whenever they wanted to give new functionality to an application developer, in many cases, some portion of that, like the Windows UI initially, was actually implemented in kernel mode. And so it's always been about matching the version that's there in order to take advantage of new functionality. And so the kernel and the user mode just kind of go hand-in-hand and go together, which is quite a bit different from Linux.

One of the other things is that because it was always designed to interact with the UI from the very beginning, once it came to multi-user, they actually had to think about this and say, well, how would you handle input from multiple users at once? And so one of the things that was created was something that was called a session. And so at one point, there was a separate product called Windows Terminal Server that let you basically have multiple sessions served over a network. So whereas Unix-based multi-user systems started off where everyone had a terminal, and then later virtual terminals, as Linux implemented, Windows actually had full sessions which could render all of those windows.

And then things like system daemons would run in session zero, which was non-interactive. So whenever we bring up a Windows container, we're basically creating a session that's invisible. It's headless, but all the things the APIs may need are still there and encapsulated in that session. And so that means that people can go ahead and do things like install applications that may have relied on a UI in the past, but just script it through an unattended installer. It installs and now it's up and running in that container, even though the user can no longer interact and click on it. But it gives them a way to run the server processes there.

CRAIG BOX: So if you're running an application on Windows, does it always assume that there is a graphical interface even if one isn't present?

PATRICK LANG: Some of the APIs actually do that. One really interesting use case for that is people running things like rendering workflows. So let's say you're a game publishing house. You've got a tool that was written on Windows that can batch process some of your 3D meshes, apply textures, and then generate some sort of scripted intro video or something like that. Those are things where a lot of the time people would render stuff and then capture the frames off the screen and then go ahead and encode and post-process that later. So you can run some of the same types of tasks within a Windows container, as long as they can run on Windows Server Core and you've got the right dependencies installed for that application.

CRAIG BOX: Can I deploy an application with user interaction with Windows containers?

PATRICK LANG: Windows Server containers are focused around running headless applications. And so if you're looking for something to deploy things to desktops in an easier way, they've got other tools like the Windows Application Converter that make that much easier. And then those deploy things through, like, the Windows App Store or the Windows Enterprise Store. And so Windows Server containers and Docker are focused on headless apps, and then the other app converter story is focused on UI apps.

ADAM GLICK: How did Windows Server containers come about?

PATRICK LANG: That was actually a really fun project to start out. I think it was back around sometime during 2014. We had just released Windows Server 2012 a couple of years prior. And we were looking at, what's kind of the next trend? And around that time, Docker was gaining a lot more popularity within the Linux user community. I really enjoyed the way that they could easily wrap up and distribute their applications without having to give people really long scripts to say, OK, well, you've got to get all these app packages right. You've got to go modify these files in /etc. You need to create a dot file here. And Docker sort of encapsulated all that into one thing.

And we then said, well, conceptually, that sounds really, really good. How can we bring some of those same benefits to Windows? And so we just started prototyping it back then before Windows Server 2016 came out, and we actually put out a tech preview that did all the same things, but it was driven through Windows PowerShell. And then at the same time, we also had a parallel prototype where we had actually started with the Docker code base and formed a really good collaboration with some of the folks over at Docker, Inc, and of course, the open source maintainers.

So we basically built it twice, once using Windows-based scripts and PowerShell and once using kind of a prototype port of Docker that was talking to Windows primitives underneath. We've put them both out there. And overwhelmingly, people said hey, we like the Docker implementation because the skills translate over. And so we spent some more time refining that.

And then by the time we released Windows Server 2016, we were basically able to go ahead and get a preview of Docker out and then work on getting that production ready. And so that sort of set the stage to make containers that were something that was no longer tied to just a single OS. It was a concept and a set of practices that could be reusable and let people build really solid DevOps patterns.

ADAM GLICK: Is the isolation model different between Windows and Linux containers?

PATRICK LANG: Yes, the isolation model is a bit different. So probably one of the easiest differences to explain is that on Linux, when you start up, you just have one set of user IDs and group IDs, and those are all just numbers. And they're aliased into friendly names using your /etc/passwd and group files.

But by default on Linux, if you reuse that same user ID on a different system, it may have a different textual name like Patrick or Craig, but it could still be UID 1000. Windows is different: each Windows installation has its own security database that's handled by the Local Security Authority. And so when I create a user like Patrick, it's actually generating a binary identifier that's unique to that Windows installation.

So when you start a container, it gets its own database. And so they don't necessarily share the same users between themselves or between the host. If you wanted to, you could create the same user on two different containers independently, but there's not that one namespace that ties them together. And it's kind of interesting, because even though it's still the same shared Windows kernel underneath, when you're running two containers on the same machine, those containers can have completely different views of the world in terms of what security principals are there. And so they can each make their own independent decisions around what is this user, what are the ACLs on this file, and that's all done within the context of that container.

CRAIG BOX: Following up on that, in the Windows ecosystem, you generally use the Active Directory identity service. Machines will be joined to a domain, and then you have a single user which has that same security identifier whichever machine they happen to be logged into.

PATRICK LANG: Yeah.

CRAIG BOX: How do you work with identity in the world of containers? Do you actually make each container join Active Directory as it's spun up? Or do you only want to work with identity that is ephemeral and created only for each instance?

PATRICK LANG: That was actually a pretty interesting set of challenges to work on. So without Active Directory infrastructure, everything I just said is still there. Every container is its own thing. But one of the challenges we ran into was people frequently wanted to set up service-to-service authentication, and they would do that by usually giving permissions to the computer object in Active Directory. And so Active Directory has been there since Windows Server 2000, and it's in very heavy use everywhere.

The problem is that when we started looking at well, what is the load of creating an object per container? And if a container is going to be short-lived, the actual time that it takes to replicate that across multiple Active Directory servers could be prohibitive. So if the default sync is set to 15 minutes because you've got a highly available setup, do you want to wait 15 minutes for that container identity to be created before your application starts? Probably not.

We looked at what else was available in Active Directory, and there was a really nice feature called Group Managed Service Accounts. And what those do is, it's yet another security principal that's stored in Active Directory, but it can be used by multiple computers or multiple users, and you get to set an ACL on that particular security principal. And so you could say this set of servers is allowed to get the credential for this service account, and this set of administrative users is allowed to retrieve it or update that ACL.

That was almost exactly what we wanted for containers. But the thing was that gMSA still required the machine to be Active Directory domain-joined. And the containers, of course, we already went through this. We didn't want them to have to join the Active Directory independently. And so what we did was, we set up a way that we could basically broker those back to the node.

What this meant was an admin could go set up their Windows servers, domain-join them, and then go and install their container infrastructure on top of that. Then when they go and create the container, that can opt in and say, here's which group managed service accounts I want this container to be able to use. And then those requests will get brokered back and then as that container starts up, it'll be able to just simply assume that identity.

That means that if you've got something like a highly-available SQL database that's already got multiple instances running in your environment but you want to move an application out and into a container, then you could basically set up a service account, give access to that service account on the database, and then start up the container using that group managed service account.

And now you've achieved that service-to-service authentication and authorization that you wanted without having to actually change the way your application works. You know, you're not hard-coding a password, and the stuff that's built into Active Directory that can automatically do things like expire and roll over group managed service account passwords still works, because that node running the container is still Active Directory joined. And so you didn't have to re-implement everything in the container.
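As a rough illustration of the Active Directory side of that flow-- not something spelled out verbatim in the episode-- the setup is typically a few PowerShell cmdlets run by a domain admin. The account name, DNS name, and ContainerHosts group below are hypothetical:

# One-time prerequisite per forest: create a KDS root key so gMSA passwords can be generated.
Add-KdsRootKey -EffectiveImmediately

# Create the group managed service account, and ACL it so that only members of the
# ContainerHosts group (the container nodes) may retrieve its managed password.
New-ADServiceAccount -Name "WebApp01" `
    -DNSHostName "webapp01.contoso.com" `
    -PrincipalsAllowedToRetrieveManagedPassword "ContainerHosts"

# On each domain-joined container host, install and verify the account.
Install-ADServiceAccount -Identity "WebApp01"
Test-ADServiceAccount -Identity "WebApp01"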

CRAIG BOX: What had to change in Kubernetes to support Windows containers? And what had to change in Windows Server?

PATRICK LANG: When we initially released Windows Server 2016, our initial target was Docker and Docker Swarm. The capabilities that were required for that were very different, because there was a network plugin model and a storage plugin model where a lot of that stuff was sort of deferred but still tightly coupled with the Docker engine itself. In Kubernetes, network and storage management is mostly handled by the kubelet calling additional plugins, such as the Container Network Interface for networking and, at the time, FlexVolume for storage, with the Container Storage Interface coming later. But sort of the order of operations was a bit different.

And so what we realized was, on the network side, because network adapter creation was decoupled from creating the container, what we needed to do was develop network namespaces on Windows. And over the course of the Windows Server semi-annual channel releases in 2017 and 2018, we developed that and then made updates to the container runtime and then Kubernetes to be able to take advantage of that, which meant that we could now go ahead and create multiple containers and bind them under the same network namespace, so therefore, they could share the same IP and the same virtual NIC that was needed for all of the routing between pods and services within the Kubernetes cluster.

And so that was a really good example of how, since we were able to ship Windows twice per year as well as work with the open source Docker and Kubernetes communities, we were making changes on all three fronts in order to make this actually achievable. And storage was similar, but not quite as complicated. We realized that we needed to be able to create and map volumes using symlinks on the Kubernetes node in order to give a consistent file system to the containers on Windows. And so we had to make a few tweaks to how symlinks were resolved to make sure they were resolved on the host first.

Because when the initial Docker implementation came out, we were resolving them within the container. And so by moving that, we were able to basically resolve those on the host once. And then the containers don't really see the symlink at all. It's just the restricted view of the files that they're supposed to have for their volume that was created by Kubernetes.

CRAIG BOX: There are lots of parts of the Kubernetes stack that were built with Linux in mind, to say the least. kube-proxy, for example, is basically a mess of iptables rules, which is the way you configure the network system on Linux. Did you have to build an equivalent of that for Windows nodes?

PATRICK LANG: Yeah. For the network control plane, they exposed a lot more configuration, encapsulated in what's called the Host Networking Service (HNS), so that's kind of the management plane. And there was already something that had been developed called the Virtual Filtering Platform (VFP) that made it possible to write more granular rules around how individual packets would be forwarded or rewritten, and also to apply policy rules to them. But the HNS basically became the front end for doing that.

And so you could sort of think of that as an analog to iptables. And so kube-proxy was updated to be able to talk to the HNS and set those up. So I'll take up a concrete example.

In Kubernetes, when a new service is created-- and let's say we're using a bridge network topology rather than an overlay-- kube-proxy can identify that from the API server, and then what it will do is go add some forwarding routes to say, OK, this Kubernetes service on this IP exists at this remote destination. And then those are programmed into VFP rules that are part of the network namespace for that pod that's running there.

So where you might have an iptables forward or masquerade rule, we have an equivalent VFP rule that does the same thing. And so kube-proxy programs all that stuff. That meant that we were able to get a code base that could actually still share the code that was communicating with the rest of Kubernetes, but do the platform-specific network configuration underneath.

ADAM GLICK: How did you merge Windows identity with Kubernetes access controls?

PATRICK LANG: That was actually a really fun partnership that we had with Docker. I've got a couple blog posts that you can put in the notes that cover more about this. But basically, what we did was, we've worked with some of the customers that have already been experimenting with this with Docker Swarm. They were working with some folks at Docker, Inc. And what they did was they took the same mechanisms that we had there in the container runtime interface to configure this. But they took the data that needed to be configured and put that into a Kubernetes custom resource definition.

And so that meant that when we went to schedule a container, we could have a reference to that CRD, and then they developed an admission controller that will actually let you do RBAC within the Kubernetes control plane to make a decision such as, is this user allowed to create a pod with access to this CRD-- much the same way that you would gate access to secrets down to a specific user or even store them within a specific namespace.

And so the model is very, very similar there, so that way you can use the same RBAC rules. But one distinction that I want to call out here is that these service account passwords are never actually exposed. So what is actually stored in this CRD is just configuration data. It's enough to uniquely identify that service account, and then the Windows node itself-- when it is actually scheduling the container and starting it up-- uses its existing secure Active Directory connection to retrieve the credential at the last second.

And so Kubernetes RBAC lets you have control and auditing around who's allowed to create deployments. And then when you want to know what process actually used that security principal-- which is the gMSA-- it still shows up in your Windows Active Directory audit logs. So that way you get consistency on both sides of the experience.
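As a sketch of how that looks on the Kubernetes side-- the resource and field names reflect the gMSA support current around Kubernetes 1.15/1.16 and may differ in other versions, and all names below are hypothetical:

# A GMSACredentialSpec custom resource holds only configuration data that
# identifies the gMSA; no passwords are stored in the cluster.
apiVersion: windows.k8s.io/v1alpha1
kind: GMSACredentialSpec
metadata:
  name: webapp01-credspec
credspec: {}  # in practice, the credential spec JSON generated for the WebApp01 gMSA
---
apiVersion: v1
kind: Pod
metadata:
  name: webapp
spec:
  securityContext:
    windowsOptions:
      # The admission webhook checks, via RBAC, that this pod may use the
      # credential spec named here; the node then retrieves the actual
      # credential from Active Directory when the container starts.
      gmsaCredentialSpecName: webapp01-credspec
  nodeSelector:
    kubernetes.io/os: windows
  containers:
  - name: webapp
    image: mcr.microsoft.com/windows/servercore:ltsc2019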

ADAM GLICK: Which parts of Kubernetes run on Windows?

PATRICK LANG: Today, our focus is on being able to run the kubelet and kube-proxy. That's been there since 1.14. What that lets you do is add a Windows node to an existing Kubernetes cluster. To make that a little bit easier, we've been working on some updates in collaboration with the cluster lifecycle team to support kubeadm for Windows for configuring that. And so that's kind of the next step in making the story easier.

But right now, the focus is not on moving things like the API server and scheduler over to Windows nodes. Because they're written in Go, it's something that's feasible. It's just not something that we've been focusing time on today.

ADAM GLICK: Gotcha. So a focus on the nodes where the workloads are running versus the master that's controlling it all.

PATRICK LANG: Yeah.
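For reference, once the kubeadm-for-Windows work Patrick mentions is in place, joining a Windows worker to an existing cluster looks much like joining any other node: install the Windows kubelet, kube-proxy, and a container runtime, then run kubeadm join against the control plane. The endpoint, token, and hash below are placeholders:

kubeadm join 10.0.0.10:6443 --token <token> --discovery-token-ca-cert-hash sha256:<hash>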

CRAIG BOX: Can I use GPUs in Windows containers?

PATRICK LANG: Yeah. It's a bit of a work in progress right now, though. In the most recent versions of Windows 10 and Windows Server 1903-- but I'd have to double check on that-- there's support to map in or basically share the GPU with what's already on the node. So that lets you use some of the DirectX APIs that are needed for things like some rendering workflows and the Windows Machine Learning Framework so you can run those within a container. But we don't have those hooked up yet to Kubernetes.

CRAIG BOX: When you do, can you call them "IndirectX"?

PATRICK LANG: [CHUCKLES] I'm not enough of a GPU nerd to say how indirect that already is compared with other APIs, but--

CRAIG BOX: There are a lot of people who have legacy workloads running on older versions of Windows Server. As they age out and fall out of their support lifecycle, how should someone with an app-- for example, running on Windows Server 2008-- get that up-to-date running in a container on top of Server 2016 or 2019?

PATRICK LANG: I think probably the first thing I'd recommend doing is-- because the container itself needs to be running Windows Server 2016 or newer, because that's when the components are sort of enlightened and made container-aware-- the first question to ask yourself is, well, can this application work on Windows Server 2016? And so if you're relying on some component that's been deprecated and removed from 2016, well, that's the first problem you have to solve.

But if the compatibility is there and you've got what you need-- if you look at things like the .NET Framework, Windows Server 2016 and 2019 still support .NET versions all the way back to 2.0 from the early 2000s. And so there's a good chance that it's going to be just fine, but what you're going to need to do is basically sit down and look at, OK, well, now that I know what dependencies I need, how do I script this install? Because you're going to be doing this in a headless environment, probably through a Dockerfile.

So typically, what we'll see people do is take whatever run book-- whether it's a stack of sticky notes on someone's desk, which, you'd be surprised, I have plenty of those--

CRAIG BOX: Batch files?

PATRICK LANG: Batch files are a little bit better, or a PowerShell script, better yet. But you're going to want to try to run those in containers. And so most application installations can be done unattended, but in some cases, you may need to make sure that you know how to actually script that. Because I've seen some cases before where people are like, oh well, I used to click through the SQL installer.

And then once that was done, I would simply take a backup of that VM image and that became my base image. Well, you can't just directly convert that to a container. And so that's where you need to make sure you go back and say, OK, for the SQL installer, what XML or INI file, or what magic incantation, do I need to get that installed, as well as any other prerequisites I have?

And so it comes back to, can you script your install? If so, then you can write a Dockerfile. And then when you do that docker build, now you've got a container.

CRAIG BOX: Now I have a container. I have to keep that container up-to-date and I also need to patch the operating system that it runs on. What do I have to do on Patch Tuesday?

PATRICK LANG: On Patch Tuesday, you open up your Dockerfile, you go to the top line that says "FROM", and you update that. And then you rebuild and redeploy it. At Microsoft, whenever there's a cumulative security update coming out on the second Tuesday of each month, that's released both as a Windows patch that you can install on a normal machine, and we also push a new version of the container base images: Nano Server, Server Core, and also a third option called Windows that has a few components in addition to what's there in Server Core.

But all of those are updated and those are available and discoverable on Docker Hub. And then you can pull those down and then just simply rebuild your container on top of it, hand that off to your orchestrator, deploy it, make sure it's working OK, and then shut down your old container.

So Windows patching can become yet another DevOps process where you're just updating your Dockerfile in source control, then relying on your automation to rebuild, deploy, and then shut down the old ones.
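A minimal sketch of that "bump the FROM line" flow, assuming an app image built on Server Core; the build-number tags are illustrative of the monthly tags Microsoft publishes, not specific releases confirmed in the episode:

# Before Patch Tuesday, the image was pinned to last month's Server Core build:
#   FROM mcr.microsoft.com/windows/servercore:10.0.17763.678
# After Patch Tuesday, bump the tag to the newly published base image, then
# rebuild and redeploy through your normal pipeline.
FROM mcr.microsoft.com/windows/servercore:10.0.17763.737

COPY app/ C:/app/
ENTRYPOINT ["C:\\app\\MyService.exe"]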

CRAIG BOX: What about when the versions of the host underneath change? I understand that today you have to run a container that's the same major version-- at least-- as the operating system underneath it.

PATRICK LANG: Yeah, that's true today. So if you've got Windows Server 2019 and say you are patched for June, and then you're looking at how do I deploy the July patches? We're compatible across patch levels, and so you can update your host first and then update your containers later if you want to. That ordering doesn't matter.

But when it comes time to go to the next version of Windows Server, then that's where the incompatibility arises. So we're working on fixing that by using Hyper-V isolation, because Hyper-V will let you bring up the same kernel version that's needed to run that container, which means that you'll be able to update your node OS first and then update your apps at another pace. But that's something that we're still getting cooked up for Kubernetes.

So that's an effort that we're currently working on. But one of the things that we need to do to enable that is, the Hyper-V runtime works much, much better under containerd, and so we're currently talking to the containerd maintainers and sort of working on a roadmap to where we can get containerd with Hyper-V support working great under Kubernetes. So that's something that's in the prototyping and planning phases right now. But that'll make that path forward easier in the future.

If you needed to do this today-- let's say you want to run 2019 and 1903 together-- what I would recommend doing is using either node selectors and labels, or taints and tolerations, so that you can segment your cluster and have some nodes running one version of Windows, some nodes running another version of Windows, and then steer the workload onto the appropriate node.
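A minimal sketch of steering a workload that way: kubernetes.io/os is a standard node label, while the Windows-version label is a hypothetical custom label you would apply to your nodes yourself.

apiVersion: v1
kind: Pod
metadata:
  name: legacy-app
spec:
  nodeSelector:
    kubernetes.io/os: windows
    # Hypothetical custom label, e.g. applied with:
    #   kubectl label node win-node-1 example.com/windows-version=1809
    example.com/windows-version: "1809"
  containers:
  - name: legacy-app
    image: mcr.microsoft.com/windows/servercore:ltsc2019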

ADAM GLICK: Thanks for the insight and the tips on that one. You have an incredible wealth of knowledge here, and I'm wondering, how did you get started with Windows in containers?

PATRICK LANG: I think for me personally, a lot of it was sort of being at the right place at the right time. For probably about the last 10 years at Microsoft, I was actually working as part of the Hyper-V team, which was really close to the Windows kernel team. So I had spent years already developing VMs and storage virtualization technologies and virtual Fibre Channel and all kinds of stuff that was not container-related. But when we looked at how we were going to build containers, all the people that needed to come together to do that were either on my team or just down the hall.

And so what we started to do was we just spun up another group of people to start building the prototyping for this, coming up with the use cases. And it sort of spun out of the VM world. But we went back and said, OK, well, if we're not going to bring up a whole VM, what actually needs to be unique on Windows per container? And so between the VM and the kernel teams, we sat down and figured out what all the right abstractions needed to be, and then we were able to build that model. And so that's kind of how that team got started.

And then early on we had that partnership with Docker, which we learned a whole lot from. And being able to develop things in the open means that you get a whole lot more views on it and come to a better solution ultimately. So it's been a lot of fun.

CRAIG BOX: How should someone who's interested in Windows Server containers get started using them?

PATRICK LANG: Probably the easiest thing would be, if you've got a Windows 10 machine and you want to just start experimenting with things, Docker for Windows can run both Windows and Linux containers. If you right-click on the little Moby icon down on the taskbar in Windows, you can switch between Windows or Linux containers as needed.

So when you first install it, the default is that it's going to run Linux containers in a Linux VM. If you switch over to the Windows mode, then you can build a Dockerfile that will start from Windows Server Core, copy some files over, run a PowerShell script, and then run your process. So if you go to aka.ms/windowscontainers, that's got a quick start guide there, and there's also some excellent documentation over on the Docker side as well.
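A minimal sketch of that kind of Dockerfile-- the app folder and setup.ps1 install script are hypothetical names, not from the episode:

# Start from the Windows Server Core base image.
FROM mcr.microsoft.com/windows/servercore:ltsc2019

# Copy the application and its install script into the image.
COPY app/ C:/app/

# Run an unattended PowerShell install script.
RUN powershell -ExecutionPolicy Bypass -File C:/app/setup.ps1

# Run the headless server process when the container starts.
ENTRYPOINT ["C:\\app\\MyService.exe"]

With Docker for Windows switched to Windows containers, docker build -t myapp . and docker run myapp would exercise it.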

So for getting started with containers at a base level, your Windows 10 machine is all you need. If you don't have a Windows machine, you could either bring up a VM-- in that case you could pick from Windows 10 or Windows Server-- or you could go bring one up on your cloud provider of choice. All the major cloud providers have Windows Server and/or Windows 10 available. So you could spin up a VM and give it a try there.

When you want to move forward onto working with Kubernetes, the Kubernetes docs have a getting started guide that shows you how to set up a cluster that's using Flannel, and then we have steps on how to join a Windows node to that existing cluster. And we're working on amending that to both simplify the steps using kubeadm and better setup scripts, as well as support additional network plugins going forward. So that's kind of a current area of focus.

There are also previews from multiple cloud providers, if you want to get a cluster up and running in the cloud and test on that. So be sure to check with your cloud provider, and you'll probably be able to get added to a preview for that too.

ADAM GLICK: Most of this work is happening in the open source. How can people help contribute to the Windows Server container work?

PATRICK LANG: There's many ways that you can contribute. The first thing I would recommend doing is to get connected with the community. If you go to the Kubernetes Community repo, every SIG has a folder in there and that has a link to our weekly meetings and our agenda, as well as the playlist of all of our past meeting recordings. So I would go there, and then browse on that. And then be sure to go to the Kubernetes Slack and get an invite for that and then join us in the SIG Windows channel there. So that's how you get connected with everybody.

In terms of contributions, one of the things that we're really looking for is more insight on how people are actually using these and what type of applications they're moving. Because what we've seen is that a lot of people are running applications that are a little bit older because they've been building and running their business on them for 10, 15, maybe even 20 years, in some cases. And so the use cases tend to be a little bit different from that of somebody that's developing something completely from scratch.

So if you can share feedback on what works, what doesn't, and where we need to improve things-- whether it's something like, well, performance needs to be better in these particular cases, or we need more compatibility with this set of older APIs, or things like that-- those are the things that'll help us steer the direction.

Because if we look just at what's already there in Kubernetes, there's already a lot of Linux-specific functionality there, and duplicating that may not necessarily be what someone's looking for. What they're looking for is, how do I make my app work great and how do I make my experience work great? And we're only going to get that kind of feedback from people that are using it for Windows applications. And so that's one of the top things we're looking for there.

When it comes to the install experience, that's an area where we've been spending a lot of time working with kubeadm and things like that. And so, we'd love feedback on how that's working and also improving that stuff. If you've already got experience building and deploying applications on Windows, we'd love to have you help on some of that.

And then the other thing is, once you're familiar with setting things up, we do have a project that we use for tracking our release-by-release goals as well as our backlog of feature requests and bugs. And so we'd love for you to jump in and say hi, I went to one of the SIG meetings, and help by taking some of the stuff off that board, if you want to contribute directly to the Kubernetes code base itself.

But the best thing about open source is the community. And so what we really need to do is grow that community as much as we can, because that's how we're going to make things better.

ADAM GLICK: Awesome. It has been fantastic having you on the show today, Patrick. Really appreciate you coming on.

CRAIG BOX: Absolutely.

PATRICK LANG: No problem. It's been a pleasure, and hope to see some of you in San Diego.

ADAM GLICK: You can find Patrick on Slack and GitHub at @PatrickLang.

[MUSIC PLAYING]

CRAIG BOX: Thanks for listening. As always, if you've enjoyed the show, please help us spread the word and tell a friend. If you have any feedback for us, you can find us on Twitter, @KubernetesPod, or reach us by email at kubernetespodcast@google.com.

ADAM GLICK: You can also check out our website at kubernetespodcast.com where you find transcripts and show notes. Until next time, take care.

CRAIG BOX: Happy travels.

[MUSIC PLAYING]