#128 November 3, 2020

Antrea, with Antonin Bas

Hosts: Craig Box, Adam Glick

For pods to talk to each other in Kubernetes, you need a virtual network. Antonin Bas is a staff engineer at VMware and a maintainer of Project Antrea, a CNI plugin which provides such a network. He talks to Adam and Craig about encapsulation, virtualisation, and 10,000-year-old Finnish artifacts.


CRAIG BOX: Hi, and welcome to the Kubernetes Podcast from Google. I'm Craig Box.

ADAM GLICK: And I'm Adam Glick.


CRAIG BOX: Did you enjoy a socially-distanced Halloween?

ADAM GLICK: Oh, yes I did. We got a chance to dress the little one up and walk around. Mostly, just enjoy the decorations in the area, and also take a look at all the things that people had done to be socially distanced. People had set up long gutters up on a top that they'd slide candy down, or put bags out in front, or a table with things on to really make sure that people could enjoy Halloween, even if it was a little different than normal.

CRAIG BOX: Any light shows?

ADAM GLICK: Nothing quite like what we've seen online. There's a link in the show notes to one that is truly, truly impressive from someone who has a lot of time and materials on their hands, and created an incredible show for "Enter Sandman" from Metallica. So check that one out. It's a good watch.

CRAIG BOX: Now, it was Halloween week last week, and this week, of course, it is election week. Now, we're not sure when you are listening to this, but it is important to realize that it is time to go vote for Bird of the Year.

ADAM GLICK: That's right. Have you chosen your bird?

CRAIG BOX: I haven't, but I have a couple weeks to think about it. Bird of the Year is an annual event run by the Forest and Bird Association of New Zealand looking at some of New Zealand's native birds, some more endangered than others, and encouraging you to pick one and then go out and campaign for it. And it's a little bit lower-stakes than some other elections that have been happening around this time, but I thoroughly encourage you to check it out and get behind one. We will probably have made our picks by next week, and we'll have to see who ends up victorious.

ADAM GLICK: [CHUCKLES] Thanks again to all of you who have already filled out our audience survey at kubernetespodcast.com/survey. For those who haven't yet, you've got one week left. We're keeping the survey open until November 10, so please take a moment and go to kubernetespodcast.com/survey and take the three minutes it should take you to fill out the survey to let us know what you think of the podcast, and how we can continue to make this the best podcast for the cloud-native community. Thanks!

CRAIG BOX: Let's get to the news.


CRAIG BOX: D2iQ, formerly Mesosphere, announced this week that the company will stop development of their Mesos-based DC/OS platform. Co-CEO and Co-founder Tobi Knaup announced they will no longer be a co-company, instead transitioning all of their efforts to their Kubernetes-based platform, DKP. Support will still be available for existing customers of DC/OS as the company works to transition them. The announcement contains the ominous statement that they have, quote, "Resized the company to become more nimble and streamlined," which has been confirmed to mean layoffs.

ADAM GLICK: It's been a year since Docker announced their pivot to a developer tools company supporting the Kubernetes ecosystem. Last week, they outlined their expectations for year 2, including prescriptive development environments, more image management tools with automation pipeline integration, and more official container images.

CRAIG BOX: Part of Docker's plan to become profitable was to implement limits on pulls of container images, which has now come into effect. A number of providers have offered workarounds or suggestions. By default, GKE will pull from a cache of the most common Docker Hub images that is hosted on Google Container Registry, or GCR. A post from Google this week outlines how you can identify dependencies that are hosted on Docker Hub, and move those images to GCR directly. Red Hat and VMware also posted about how you can use Quay and Harbor, respectively, to run registries without pull limits.

ADAM GLICK: Security vendor StackRox has released KubeLinter, an open source static analysis tool that checks Kubernetes YAML files and Helm charts to ensure the applications represented in them adhere to best practices. The tool is written in Go and runs at the command line. The release calls out that it's an alpha release, and that breaking changes could still come to the product as it moves through its release stages.

CRAIG BOX: HashiCorp announced the 1.0 beta of Nomad, possibly the final remaining non-Kubernetes orchestrator to deploy containerized and non-containerized applications across cloud and on-premises environments. This release is largely about stability, as the product moves towards GA, but includes a new dynamic application sizing feature in the enterprise version. Namespaces have been moved from enterprise to open source, and a new UI feature allows you to visualize the topology of your cluster.

ADAM GLICK: Vitess is showing off its fitness with general availability of version 8. Vitess, a clustering system for horizontally-scaling MySQL, has increased its MySQL client compatibility with support for a host of popular languages and frameworks, as well as toolings like MySQL Workbench. More logging and metrics were added to the VReplication workflow, and new capabilities for online schema changes and orc failover support were added as experimental. Of note, there are some breaking changes when it comes to compatibility, so check with the release notes before you deploy.

CRAIG BOX: In a recent blog series on ProgrammableWeb, Bob Reselman from CogArtTech has been looking at gRPC: where it came from, how it works, and how you use it. The latest posts talk about real-world applications, including an overview of how the Kubernetes container runtime interface is implemented. The nine-article series is a good read to learn about RPC and API design.

ADAM GLICK: Finally, we want to take a moment and acknowledge a very important member of our community who passed away this week. Dan Kohn helped build the cloud-native community, and specifically the CNCF, as its founding executive director. He was a visionary leader who truly built the organization to make it a force in computing that we are all benefiting from today. You'll be missed, Dan.

CRAIG BOX: And that's the news.


ADAM GLICK: Antonin Bas is a Staff Engineer at VMware and a maintainer for Project Antrea, a Kubernetes network plugin based on Open vSwitch, which he tries to make as simple as possible to deploy and troubleshoot. Antonin is also a core contributor to the P4 language, a Linux Foundation project which enables programmability in the network infrastructure. Welcome to the show, Antonin.

ANTONIN BAS: Thanks. I'm very happy to be here for my first podcast ever.

CRAIG BOX: Nothing to worry about at all. You grew up in France. You got your undergraduate degree there. But then you went to Stanford. How did that happen?

ANTONIN BAS: I think I've got a pretty typical story for my generation, where I got into computers through video games. I played a lot of video games in the late '90s. And I still play video games today probably more than I should. So that's how I got into computers. And then I got into programming through my dad, with some distant memories from his college days. He's not a programmer himself. He's not in the software industry. But he told me, oh, you like computers, you should try writing a program. And he recommended me to try the BASIC language, which is a very surprising recommendation for the time. I think it was like 2003, 2004.


ANTONIN BAS: Nobody was using BASIC anymore, right?

CRAIG BOX: So was that Visual Basic, then?

ANTONIN BAS: I think it was Visual Basic, yes. But even Visual Basic was dated at the time.


ANTONIN BAS: Don't want to make anyone mad, but it was the case. I wrote some simple programs, and then I quickly graduated to C++ and took some tutorials online. That's how you did things at the time, took those tutorials. It was before YouTube was a big thing. I did a little bit of web programming, nothing too complicated. Basically, I realized pretty quickly I wanted to do a career in tech. I went to college in France. Then I was like, if I want to do a career in tech, I think Stanford would be a great place to go. And so I applied to several universities, both in the US and in the UK. Eventually, I chose Stanford. And that's how I got there.

ADAM GLICK: You've gone into computer networking, which isn't the area I normally think of people focusing on, especially if you're into video games where graphics are such a big thing in it. Why was networking the thing that attracted you?

ANTONIN BAS: It's not really the sexy field of computer science, at least not anymore, and not for a long time right now. It used to be graphics, I guess. Now everyone is into machine learning and big data. But I really wanted to understand how the internet worked. And I was interested in distributed applications. How would you build them? How do they communicate with each other? How would you build scalable and resilient systems? That's how I got into computer networking.

Really, it's when I got to Stanford, I met a professor there called Nick McKeown, who is a professor of electrical engineering and computer science. And he's considered one of the founders of software-defined networking. At the time I met him, just after getting to Stanford, he had just gone through his Nicira acquisition by VMware. He's one of the co-founders of Nicira. I met with him. I became a teaching assistant for his Intro to Computer Networking class. I did some research that he sponsored, and he is also how I got my first job. He's been a big influence on me.

CRAIG BOX: Do you find that you really have to understand the technology to be able to teach it to other people?

ANTONIN BAS: Yes, definitely. Otherwise, I just don't feel comfortable explaining things to people. I mean, how would you even explain things to people if you don't know what it's about? Sometimes it takes me a long time to learn things, just because I don't feel comfortable until I get to the bottom of it and I fully understand it. Once I've learned something, I usually feel comfortable teaching it.

CRAIG BOX: Now, around that time, open source was obviously a big thing at places like Stanford. Was that something you were involved with while you were at school?

ANTONIN BAS: No, actually. I didn't get to open source until after I graduated from Stanford. Maybe I did some minor contributions, but actually, I can't remember anything. I really got into open source when I started my first job.

ADAM GLICK: Was that a factor of what you were doing, what you were interested in, or what the company was doing?

ANTONIN BAS: Definitely what the company was doing. Funny story. I was originally going to join a high-frequency trading firm in New York at the end of my master's. I talked to Nick-- Nick McKeown who I just mentioned, and I was like, Nick, I'd like a recommendation letter to go there because I've been working with you for a bit now. OK, I'll give you your recommendation letter. But first, you have to do me a favor, and you have to go meet those guys in Palo Alto who are building the startup which is going to revolutionize networking. Of course, he was one of the co-founders of this startup.


ANTONIN BAS: So I went to talk to them, and eventually I decided to stay. That's how I joined that company and I'm not in finance today. Going back to the question, the company that I joined, Barefoot Networks, was introducing a new paradigm for designing network devices and programming them, like the next step in the software-defined networking revolution. This whole paradigm was built around a programming language called P4, which is a domain-specific language to program network devices and define how individual network packets should be processed.

A big part of the company's strategy was to build a strong open source ecosystem around that P4 programming language, with an entire tool chain that was to be licensed under Apache. You have a compiler for P4, a software switch to emulate P4 programs, SDN APIs for P4, and so on. Pretty much my only job at Barefoot was to maintain this ecosystem and contribute to this ecosystem. So that was really a big part of the company's strategy, and that's how I spent so much time doing open source.

ADAM GLICK: So you're contributing a lot to the P4 language, which is also part of the Linux Foundation. Why a new language, rather than using one of the existing languages?

CRAIG BOX: Like Visual Basic.

ANTONIN BAS: [CHUCKLES] Like Visual Basic. P4 is a domain-specific language. You have constructs built into the language which are specific to networking and packet processing. So you had a few papers at the time around similar concepts, but you didn't have a well-established programming language to define packet processing pipelines. But if you look at P4, in a way, some parts of it are C-like. You can do arithmetic using standard operators. Some parts of it, at least, are imperative. If you look at some selected snippets of P4, it can remind you of C.

CRAIG BOX: A lot of people will have gone through networking back when it was plugging cables into switches, and so on. Having moved on to cloud, some people may have missed the whole software-defined networking space, because we don't really have to worry about how cables are plugged in anymore. We have someone do that. How would you define software-defined networking for someone who's not familiar with the space?

ANTONIN BAS: I think there are two big revolutions that happened in that space, virtual networking and software-defined networking. I would say the two are related, but they're distinct from each other. Virtual networking is the ability to create virtual networks for your virtualized workloads. Just like you can have multiple virtual machines sharing the same physical servers and running on the same physical servers, you can define multiple virtual networks which are going to share the same network physical infrastructure.


ANTONIN BAS: Virtual machines on the same server can have different tenants. Virtual networking can also support multi-tenancy and isolate your workloads. The way you would typically do that, for example, is for each tenant, you can build what we call an overlay network using a technology like VXLAN. And in that case, virtual switches are very important because they're going to act as termination endpoints for your overlay network.

As far as the physical network underneath, what we can call the underlying network fabric, is concerned, nothing has changed. It's not aware that VXLAN is being used. You're building virtual networks using overlay technologies like VXLAN. So virtual switches become very important for that because at the time, you didn't have any physical network device that was able to support those new technologies like VXLAN and act as termination endpoints for them.

CRAIG BOX: We used to be able to take a switch and configure certain ports to be on a VLAN, which would mean that, effectively, it was, like you say, a virtualized network on top. Where does the X come from in VXLAN?

ANTONIN BAS: Actually, virtual networking, it's not a new concept. Like the V in VLAN, as you say, it stands for Virtual. So it was a way to say I have one L2 network, and I'm going to slice it, shard it, or whatever, between multiple tenants. And they're all going to receive a VLAN ID, and those networks are going to be independent. But then you ran into quite a few issues. First, a VLAN tag is quite small. You can only support 4K different VLANs. It's still like an L2 network. It doesn't scale that well.

With VXLAN, you define some new protocol headers. You can basically put whatever you want in those headers. For the network identifier in VXLAN, it's 24 bits. So you get to way more than just 4K different tenants. And then because you are using an encapsulation, which is the big difference between VLAN and VXLAN, your underlying network can be anything. It can be a routed L3 network fabric, which you can't do with VLAN. You get many more possibilities using VXLAN. It lets you build those virtual L2 networks. But underneath, you can have any network as long as you have connectivity between your servers.
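To put numbers on the 4K versus 24-bit comparison, here is a minimal Python sketch. The 12-bit VLAN ID width and the 8-byte header layout come from the 802.1Q and VXLAN (RFC 7348) specifications rather than from the conversation, and the function names are made up for illustration.

```python
import struct

VLAN_ID_BITS = 12    # width of the VLAN ID field in an 802.1Q tag
VXLAN_VNI_BITS = 24  # width of the VXLAN network identifier (VNI)


def id_space(bits):
    """Number of distinct identifiers a field of the given width can hold."""
    return 2 ** bits


def vxlan_header(vni):
    """Pack an 8-byte VXLAN header carrying the given network identifier."""
    if not 0 <= vni < id_space(VXLAN_VNI_BITS):
        raise ValueError("VNI must fit in 24 bits")
    flags = 0x08  # the "I" flag, which marks the VNI field as valid
    # Word 1: 8 bits of flags followed by 24 reserved bits.
    # Word 2: 24 bits of VNI followed by 8 reserved bits.
    return struct.pack("!II", flags << 24, vni << 8)


print(id_space(VLAN_ID_BITS))    # 4096
print(id_space(VXLAN_VNI_BITS))  # 16777216
print(vxlan_header(5000).hex())  # 0800000000138800
```

The header is what the virtual switch prepends (inside a UDP packet) before handing the frame to the underlying network, which only ever sees ordinary IP traffic between servers.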

ADAM GLICK: What's the main benefit for the people that are setting these up? Why would you want all these separate VLANs or these separate pieces together?

ANTONIN BAS: You get isolation. Each tenant gets, basically, its own L2 network. And those L2 networks can work completely independently. But I think a big part of it is, in the virtual world, the ability to migrate your virtual workloads. When you move a VM from one physical server to another physical server within the same VXLAN, you don't have to worry about the underlying physical network at all. You actually don't have to go in and change the configuration for your network switches, your physical devices, or your routers, or whatever. You don't have to touch them.

You just need to update the virtual switch at the source physical server and at the destination physical server. Basically, you have to tell those virtual switches where the VM is going, and that's all that needs to change. The physical network is not aware of the migration at all. And thanks to SDN, Software-Defined Networking, which gives you open APIs to switches, in particular, virtual switches, doing this programming becomes very easy, and you don't have to wait for some convergence event or anything. Migration of virtual workloads is a very big advantage there.

CRAIG BOX: We can relate this back to how Kubernetes does networking. The machines that nodes are run on tend to have a single IP address, whether they be plugged in by a cable in the physical hardware, or whether they be a cloud instance. Kubernetes very early on introduced the idea that each pod has its own IP address. So now we need a way of getting multiple IP addresses to a machine. In the early days, that would be achieved by way of an overlay network. Of the early overlay networks, things like Flannel and Weave Net, are they implementations of the same idea? Or are they somehow different?

ANTONIN BAS: Flannel uses a VXLAN overlay network which actually follows the same principles I've described, pretty much. The project I'm working on, Project Antrea, also builds an overlay network. You can choose between different overlay technologies. And we also support modes which do not require overlays and do not require encapsulation. But in my mind, based on how the Kubernetes network model is, building an overlay for pods remains one of the easiest and most portable solutions for completely different types of Kubernetes clusters.

CRAIG BOX: The alternative, as you mentioned, was a routed network where you didn't have an overlay. And that was one of the early features of the Calico Project, where I believe they used BGP, and basically treated each node like it was a physical switch, I guess. That was touted as being better performance. Is that still a requirement, or are overlay networks now as performant as routed networks?

ANTONIN BAS: When you don't use an overlay network and you don't use encapsulation, you eliminate some of the overhead because encapsulation requires additional headers, maybe a bit of additional processing. You actually reduce your effective MTU, Maximum Transmission Unit. The amount of data you can send in each packet becomes a bit smaller. It's not going to matter for large MTUs, 9K MTU, of course, because then the overhead becomes small.
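The MTU effect can be worked through with a little arithmetic. The sketch below assumes VXLAN over an IPv4 underlay with no VLAN tags; the header sizes are the standard ones for those protocols and the helper names are illustrative, none of which was stated in the conversation.

```python
# Bytes added inside each underlay IP packet by VXLAN encapsulation
# (assumes an IPv4 underlay and untagged inner frames).
OUTER_IPV4 = 20
OUTER_UDP = 8
VXLAN_HEADER = 8
INNER_ETHERNET = 14
VXLAN_OVERHEAD = OUTER_IPV4 + OUTER_UDP + VXLAN_HEADER + INNER_ETHERNET  # 50


def pod_mtu(underlay_mtu):
    """Largest IP packet a pod can send without exceeding the underlay MTU."""
    return underlay_mtu - VXLAN_OVERHEAD


def overhead_fraction(underlay_mtu):
    """Share of each full-size underlay packet spent on encapsulation."""
    return VXLAN_OVERHEAD / underlay_mtu


for mtu in (1500, 9000):
    print(mtu, pod_mtu(mtu), f"{overhead_fraction(mtu):.2%}")
# 1500 1450 3.33%
# 9000 8950 0.56%
```

With a standard 1500-byte MTU the encapsulation costs a few percent of every full-size packet; with a 9K MTU the same 50 bytes are well under one percent.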

ADAM GLICK: As you mentioned, you're working on Project Antrea right now. What is that?

ANTONIN BAS: We've been talking about Kubernetes networking in existing projects, but we haven't really defined what Kubernetes networking is about. If you look at the Kubernetes documentation, the Kubernetes network model, it's very simple. What it says is that pods on the node should be able to communicate with pods on all the other nodes without any network address translation. When the destination pod receives traffic from a source pod, it can identify that source pod using the source pod IP, which has not been rewritten.

And the second thing is that processes on the node, and especially the Kubelet Kubernetes agent on each node, should be able to communicate with all the pods on that node. Project Antrea is a network plugin, so it's going to enforce and implement this Kubernetes network model. But if you look at the CNI ecosystem, the network plugin ecosystem, you can see that all the popular projects, like Calico, Cilium, and so on, they go beyond just those very simple requirements, and they provide you with additional features. Project Antrea is one such Kubernetes network plugin. It provides networking across pods, and it provides some extra features which are useful when running clusters and deploying workloads on clusters.

ADAM GLICK: It has a very interesting name. Is it Greek? Where did it come from?

ANTONIN BAS: [CHUCKLES] You know that quote in computer science which says that there's only two hard things in computer science, and it's cache invalidation and naming things.


ANTONIN BAS: I think it's fair to say that the naming in Antrea was one of the hardest parts of the project. We wanted to stay in the Kubernetes naval theme, right? So you have Kubernetes, you have Helm, you have Prow, all those things. But at the same time, if you look at existing projects in the Kubernetes networking space, you can see that there is another theme, which is around nets and around fabrics. So you have Service Mesh, you have Flannel, you have Weave, and so on.

We came up with a lot of potential candidates before we open-sourced the project which were at the intersection of those two fields. And then we sent them to the VMware legal team, and they cut 95% of them for intellectual property rights concerns, or something like this. I don't remember any of the other candidates, but at the end, pretty much Antrea was the only one left. So not much of a choice left there.

It's named after one of the oldest fishing nets that was discovered. And it was discovered in a Finnish town called Antrea. The net is named after the town, and the project is named after that net. I think the town doesn't exist anymore. I think it's a Russian town now with a different name.


ANTONIN BAS: That's the origin of the name. Originally, we had an internal name which was an acronym. It was OKN, for Open Container Networking, or Open Kubernetes Networking. The VMware legal team doesn't like acronyms much, though.

CRAIG BOX: I like how the K could stand for both "Container" or "Kubernetes" depending on-- there's precedent there.

ANTONIN BAS: Yes, we like that.

ADAM GLICK: [CHUCKLES] I was about to say, I think there's another project that may have also followed that same naming convention. [CHUCKLES]

ANTONIN BAS: More than one probably did. [CHUCKLES]

CRAIG BOX: Antrea is a Container Network Interface, or CNI plugin. CNI is a relatively recent, in Kubernetes terms, system for configuring networking. How does CNI work, and why is it the right thing to implement to solve your problem?

ANTONIN BAS: For people who are not familiar with CNI, you want a top-to-bottom approach. What happens when you create a pod? So the user is going to request for a pod to be scheduled by providing a pod specification to the Kubernetes API, right?


ANTONIN BAS: It can be directly. It can be through a deployment, or something like this. And then the Kubernetes Control Plane is going to schedule that pod on one of the Kubernetes nodes in your cluster. Once it has done that, it's going to let the Kubelet agent running on the node know about this pod and say, OK, you need to run that pod, that container, essentially, on your worker node.

And once Kubelet gets this information, it's going to interact with the container runtime. And that's done through a standard interface called CRI, Container Runtime Interface. And then the container runtime needs to interact with the Kubernetes network plugin through another standard interface, which is called CNI, or Container Network Interface. And CNI is something that exists outside of Kubernetes, really. It's a CNCF project to define the API, and maintain it, and maintain some standard plugins which implement this interface. It's a key part of Kubernetes networking. And because that's how Kubelet eventually interacts with the Kubernetes network plugin to inform it on how to provision networking for a specific pod, all the Kubernetes network plugins have to implement this interface. It's a very simple interface. You actually have three different methods. CNI Add, which is-- I've created this pod. Configure the networking for that pod. Then you have CNI Delete, which is-- I'm deleting this pod. Do your cleanup. And then you have CNI Check, which can be called periodically to make sure that everything is good.
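The three operations can be sketched as a small dispatcher. In the CNI specification, the runtime runs the plugin executable with environment variables such as `CNI_COMMAND` (whose values are `ADD`, `DEL`, and `CHECK`), `CNI_CONTAINERID`, `CNI_NETNS`, and `CNI_IFNAME`, and passes the network configuration as JSON on standard input. The Python below only imitates that contract in memory; the `handle_cni` function, the `attached` table, and the example configuration are hypothetical and are not how Antrea or any real plugin is implemented.

```python
import json

# Toy in-memory state: container ID -> what was configured for it.
# A real plugin would create interfaces and program a datapath instead.
attached = {}


def handle_cni(env, stdin_json):
    """Dispatch one CNI invocation based on the CNI_COMMAND variable."""
    command = env["CNI_COMMAND"]
    container_id = env["CNI_CONTAINERID"]
    config = json.loads(stdin_json)

    if command == "ADD":
        # "I've created this pod. Configure the networking for that pod."
        attached[container_id] = {
            "netns": env["CNI_NETNS"],
            "ifname": env["CNI_IFNAME"],
            "network": config["name"],
        }
        return {
            "cniVersion": config["cniVersion"],
            "interfaces": [
                {"name": env["CNI_IFNAME"], "sandbox": env["CNI_NETNS"]}
            ],
        }
    if command == "DEL":
        # "I'm deleting this pod. Do your cleanup." Missing state is tolerated.
        attached.pop(container_id, None)
        return {}
    if command == "CHECK":
        # Confirm the networking for this container is still in place.
        if container_id not in attached:
            raise RuntimeError(f"container {container_id} is not configured")
        return {}
    raise ValueError(f"unsupported CNI_COMMAND: {command}")


# Example lifecycle for one pod sandbox (all values are made up).
env = {
    "CNI_CONTAINERID": "abc123",
    "CNI_NETNS": "/var/run/netns/example",
    "CNI_IFNAME": "eth0",
}
conf = json.dumps({"cniVersion": "0.4.0", "name": "example-net", "type": "example"})

print(handle_cni({**env, "CNI_COMMAND": "ADD"}, conf))
handle_cni({**env, "CNI_COMMAND": "CHECK"}, conf)
handle_cni({**env, "CNI_COMMAND": "DEL"}, conf)
print(attached)  # {}
```

Keeping the interface this narrow is what lets very different plugins sit behind the same container runtime.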

If you look at what happens when the pod is created, typically on Linux, what that would involve is, you create a "veth" pair, which is like a virtual ethernet cable. Everything that goes through one end exits through the other end. You're going to create that veth, which is two interfaces which are pipelined together. And then you're going to put one end of this in the pod's network namespace and then attach the other end of it to some kind of virtual switch. And that's how you provision networking for a pod. So you assign an IP address to it, configure routes, and so on. The rest really depends on your network plugin implementation.

CRAIG BOX: So now you need to connect the other end of that virtual ethernet cable to a switch. You've mentioned before some of the technologies here, and in particular, Open vSwitch. Why is that the technology that you chose for Antrea?

ANTONIN BAS: The plugins work differently, right? Calico, for example, doesn't use a bridge because it's a routed network plugin. In Antrea, we do use a bridge, and we use Open vSwitch, a virtual switch, to build that overlay network between pods. And there are multiple reasons why we chose to go with Open vSwitch. One, the obvious one-- maybe we should get it out there-- is that Open vSwitch is very actively promoted by VMware.

It was designed by Nicira, and then VMware acquired Nicira. So a lot of the Open vSwitch contributors and maintainers are also VMware employees. We can rely on that expertise when we use Open vSwitch for Project Antrea. There are many other reasons, right? It's a Linux Foundation Project. It has a very strong development community, including outside of VMware. You have a lot of people at Red Hat for example, working on Open vSwitch.

It's available out of the box in pretty much any Linux distribution, and that's going to include cloud-optimized container distributions. AWS has one called Bottlerocket. I think in Google it's Google Container OS. I didn't check that one specifically, but typically, Open vSwitch datapath is going to be available out of the box in Linux. So it's very important for portability.

Another reason, and this one is very important for us, is that Open vSwitch works both across Linux and Windows. It's very portable. For us it's very important, because Kubernetes on Windows is still lagging behind. But it's been picking up some momentum in the latest Kubernetes releases. We really wanted a datapath technology that enables us to support both Linux worker nodes and Windows worker nodes without completely duplicating the efforts across both implementations. If you look at other network plugins, maybe using things like eBPF for their datapath implementation, well, eBPF is not something you can use on Windows. It's a Linux-specific technology.


ANTONIN BAS: Being able to have that portability, define packet processing pipeline for Kubernetes that we can use pretty much identically on both Linux and Windows, was very important for us.

I think finally, a very interesting aspect is the fact that you have a lot of hardware vendors which support Open vSwitch actively. By "support," I mean vendors which provide physical network interface cards with support for Open vSwitch offloading. So for some bare-metal use cases that require a lot of throughput and low latency between your workloads, you can leverage that hardware offload support and actually offload your packet processing to the network interface cards. And that's something that's really facilitated by Open vSwitch.

ADAM GLICK: When the network is connected, how does the pod get its IP address? Is it DHCP? Something else?

ANTONIN BAS: There is that whole notion of IPAM, IP Address Management, around the Kubernetes network plugins. And every plugin is going to be different. What we do in Antrea is right now we rely on a part of the Kubernetes control plane called the Node IPAM Controller. So when you create your Kubernetes cluster, for example, with kubeadm, you provide what we call a pod CIDR, which is a range of IP addresses-- for example, 10.0/16. Then you get 65,000 IP addresses there.

And what the Node IPAM Controller is going to do is every time a new node joins a cluster, it's going to carve out a smaller subnet, a smaller range of IP addresses out of that bigger range. And it is going to allocate it to that node and say, OK, for the pods created on that specific node, you can pick an IP address from that range. Antrea is going to listen to the Node IPAM Controller.

And the way it works is the node specification in the Kubernetes API is going to be updated with that range of addresses. So we listen to that, and then when a new pod is created and Kubelet invokes a CNI interface, we're going to assign one IP address from that range. It all happens locally. It's all reliant on the Kubernetes control plane right now. There are different ways of doing IPAMs that we will probably explore in the future.
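The carve-out described above can be imitated with Python's standard `ipaddress` module. This is a rough model only: the `NodeIpam` and `PodIpam` classes are hypothetical, the cluster range is written out as `10.0.0.0/16`, and the `/24` per-node subnet size is an assumption for the example rather than something stated in the conversation.

```python
import ipaddress


class NodeIpam:
    """Toy stand-in for the control-plane component that gives each node a subnet."""

    def __init__(self, cluster_cidr, node_prefix_len):
        self.cluster = ipaddress.ip_network(cluster_cidr)
        self._subnets = self.cluster.subnets(new_prefix=node_prefix_len)
        self.node_cidrs = {}

    def allocate_node(self, node_name):
        # Each node keeps the subnet it was given when it first joined.
        if node_name not in self.node_cidrs:
            self.node_cidrs[node_name] = next(self._subnets)
        return self.node_cidrs[node_name]


class PodIpam:
    """Toy stand-in for the per-node plugin that hands out pod addresses locally."""

    def __init__(self, node_cidr):
        self._free = node_cidr.hosts()

    def allocate(self):
        return next(self._free)


ipam = NodeIpam("10.0.0.0/16", node_prefix_len=24)  # /24 per node is an assumption
print(ipam.cluster.num_addresses)                   # 65536

node_a = PodIpam(ipam.allocate_node("node-a"))
node_b = PodIpam(ipam.allocate_node("node-b"))
print(ipam.node_cidrs["node-a"], ipam.node_cidrs["node-b"])  # 10.0.0.0/24 10.0.1.0/24
print(node_a.allocate(), node_a.allocate(), node_b.allocate())
# 10.0.0.1 10.0.0.2 10.0.1.1
```

Because each node owns a disjoint range, pod addresses can be assigned without any cross-node coordination at pod creation time.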

CRAIG BOX: If I'm running on a managed service like GKE that interfaces with an underlying network, is Antrea something for me? Or is it just for people who are building from scratch and plugging in network cards and using kubeadm?

ANTONIN BAS: I love this question. I mean, it's pretty complex to answer because every cloud provider and every Kubernetes-managed service is going to be different. We would love for Antrea to be supported anywhere Kubernetes is supported, from bare-metal clusters to on-premise clusters, to clusters that you build yourself on EC2 or GCP, to managed services like AKS and EKS.

If you look at the Antrea repository, you can see that we provide documentation and Kubernetes YAML manifests for you to run Antrea on those cloud-managed services. The problem is, we don't support them natively. So by natively, I mean when you create a cluster on GKE, for example, ideally, we would love for our users to have a drop-down menu. And they just select Antrea, and they hit "create cluster", and the magic happens. Antrea is deployed, and the network configuration is correct.

That's not how it works. For this to happen, your project has to gain some momentum, and you have to work with the cloud providers to enable that support. So obviously, that's something we would love to see happen. But right now, that's not how it works. Right now, we ask our users to first create a cluster. Then usually they have to run some kind of script to get rid of the existing networking solution. And then they can deploy Antrea using a specific YAML manifest that we provide.

And every cloud provider, like I said, is going to be different. For example, in GKE, we can actually do a pretty good job, because you can deploy a cluster with kubenet, which in a way is a default network plugin. Just by itself, kubenet doesn't do much. As you say, it relies on collaborating with the cloud provider to program routes in the physical network infrastructure so that the cloud providers know how to forward packets between pods.

So in GKE, we are able to completely replace kubenet and build our own overlay network to provide Kubernetes networking. For other cloud services like EKS and AKS, we have to work in what we call CNI Chaining Mode, where we collaborate with the existing solution from the cloud provider; for EKS, that is the AWS VPC CNI. So basically, they do the IP address allocation. They create the veth for the containers. But then Antrea is going to be invoked after this. And we're going to steal the veth away and plug it into Open vSwitch. And we do that so that we can properly enforce Kubernetes network policies, which is a big part of what Antrea is providing as a network plugin.

ADAM GLICK: So you're talking about ways that you might want to replace the default CNI that might be available with things. Antrea isn't just about networking, though. There is also security and policy aspects to it. What does Antrea do from those aspects?

ANTONIN BAS: The very basic thing here is that we do implement the Kubernetes NetworkPolicy API. So that's not something that all plugins do. We talked about cloud provider CNIs. Those typically don't enforce Kubernetes network policy, so they don't implement the Kubernetes NetworkPolicy API. If you look at a plugin like Flannel, for example, it also does not implement the Kubernetes NetworkPolicy API. But if you look at the most popular plugins, I want to say Calico and Cilium, they all implement this API, and they actually go beyond that. And we do the same thing in Antrea, where we provide some extensions to the Kubernetes network APIs.

Because what you realize once your plugin is used in the context of an enterprise is that the Kubernetes NetworkPolicy API has some limitations. And you need to extend that to satisfy all the use cases. And so those plugins do that. Sometimes it's fully open source. Sometimes it's proprietary. Sometimes it's somewhere in the middle.

ADAM GLICK: Now, Gatekeeper and OPA are becoming popular. Can you plug OPA into Antrea? Do they work together, or do they compete?

ANTONIN BAS: So they are kind of orthogonal. OPA is an admission controller that lets you restrict what kind of resources can be allowed to be created in your cluster. OPA, in particular, replaces pod security policies, which is an API being deprecated. And that API lets you do things like, OK, I do not want users to be able to create privileged pods, or pods which are going to be able to mount directories from the underlying Kubernetes node.

That's what OPA is about, generalizing that ability to control which resources are going to be created in your cluster. It applies to pods. It applies to any resource, and can apply to network policies as well. And it can apply to custom resource definitions. So using OPA, you can define policies to dictate what kind of Kubernetes network policies you're going to be allowed to create in your cluster. Kubernetes network policies are really about deciding which pods can talk to which other pods.

Earlier on, I talked about the Kubernetes network model, and I said that pods should be able to talk to every other pod without address translation. But as soon as you define Kubernetes network policies and your CNI plugins implement those policies, this is no longer valid. When I apply a network policy to a pod, that pod becomes what we call isolated. And it can only talk to pods for which you explicitly defined network policy rules. So the two, OPA and network policies, are orthogonal. But they can interact with each other, because OPA can let you define policies for network policy admission in the cluster.
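
The isolation behavior described above can be sketched in a few lines of Python. This is an editorial illustration under simplifying assumptions: a single namespace, ingress traffic only, and exact-match label selectors. The `Pod` and `IngressPolicy` types and the `ingress_allowed` function are hypothetical names, not part of Kubernetes or Antrea.

```python
from dataclasses import dataclass, field


@dataclass
class Pod:
    name: str
    labels: dict


@dataclass
class IngressPolicy:
    # Labels selecting the pods this policy applies to.
    applies_to: dict
    # Each entry is a label set; a peer matching any entry is allowed in.
    allow_from: list = field(default_factory=list)


def matches(selector: dict, labels: dict) -> bool:
    """True when every key/value in the selector is present in the labels."""
    return all(labels.get(key) == value for key, value in selector.items())


def ingress_allowed(src: Pod, dst: Pod, policies: list) -> bool:
    """Decide whether src may send traffic to dst under the given policies."""
    selecting = [p for p in policies if matches(p.applies_to, dst.labels)]
    if not selecting:
        # No policy selects the destination, so it is not isolated:
        # the default "every pod can reach every pod" model applies.
        return True
    # The destination is isolated: only explicitly listed peers get through.
    return any(
        matches(peer, src.labels) for p in selecting for peer in p.allow_from
    )
```

With one policy that selects pods labeled `app=db` and allows peers labeled `app=web`, a `web` pod can reach the `db` pod, any other pod cannot, and pods that no policy selects remain reachable by everyone.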

ADAM GLICK: One of the other features of Antrea is that it does encryption-- in particular, node-to-node encryption. Is Antrea a service mesh?

ANTONIN BAS: No. So Antrea does provide encryption, but that would be at a different layer than service mesh. And overall, it's a CNI plugin. And whichever service mesh solution you use, like Istio, they tend to operate on different layers, and they are complementary. The CNI plugin operates at layer 2, layer 3. It provides connectivity between pods. Whereas service mesh is more about how your workloads, not really at the pod level, can communicate with each other. And so in Antrea, we provide node-to-node encryption.

As part of the overlay network, you can use IPsec, which is an encryption and authentication protocol for IP, which is going to encrypt the traffic leaving one node until it gets to another node. But for example, traffic between pods on the same node is not going to be encrypted. Service mesh operates at a different layer, above layer 4, and it does support things like mTLS.

It's going to be a much more ubiquitous form of encryption, where all the traffic that you want can be encrypted. So, different layers. Usually IPsec or other technologies like WireGuard can provide higher throughput than doing end-to-end encryption with mTLS. But it's not really end-to-end encryption, right? It's just traffic that goes across nodes if your underlying network is not considered secure, for example, or if your cluster is split across multiple sites.

ADAM GLICK: Would it be fair to say that what you're talking about is more at the infrastructure layer, versus the service mesh is more at the application layer?

ANTONIN BAS: Yeah, I think it's fair to say that.

CRAIG BOX: How does the Project Antrea community interact with the upstream Kubernetes community? Is it just through the CNI implementation, or is there more there?

ANTONIN BAS: We have quite a few Antrea contributors who are very active upstream, and especially in the Kubernetes SIG Network. There is an ongoing effort to improve the Kubernetes NetworkPolicy API, which we mentioned previously. And one of our core contributors, his name is Abhishek, is very involved in this. The idea behind this effort is, as I said, all the popular network plugins tend to provide their own extensions using CRDs for the Kubernetes NetworkPolicy API. And you can provide different advanced features for the enterprise.

When you look at the landscape here, you can see that all those plugins tend to provide some subset of features which is common across all of them. So the idea behind this upstream effort is to say, OK, maybe we can unify those APIs, or at least part of those APIs, so that we can provide a better experience for Kubernetes users, and maybe avoid vendor lock-in to some extent by standardizing those APIs. And so in Antrea, for example, we have the notion of "ClusterNetworkPolicy", which is the ability to define a network policy which is going to apply to all the pods in a cluster, as opposed to Kubernetes network policies, which are namespaced resources, so they only apply to pods within a single namespace.

If you want to apply that policy across all namespaces, you have to repeat the policy definition. If you look, Calico, Cilium, Antrea, they provide that ClusterNetworkPolicy API. It's kind of similar. So the idea of this upstream effort, which Abhishek contributes to, but I think we have some folks from the GKE team as well, is to extend the Kubernetes NetworkPolicy API with a new API resource for cluster network policies.
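
To make the repetition concrete, here is a small editorial Python sketch that stamps one namespaced NetworkPolicy into several namespaces, which is what a cluster-scoped policy resource would make unnecessary. The manifest uses the standard `networking.k8s.io/v1` NetworkPolicy fields. The `replicate_policy` helper, the policy name, and the namespace names are hypothetical.

```python
import copy


def replicate_policy(policy: dict, namespaces: list) -> list:
    """Return one copy of the namespaced policy per target namespace."""
    copies = []
    for namespace in namespaces:
        clone = copy.deepcopy(policy)
        # A NetworkPolicy only affects pods in its own namespace,
        # so each copy has to be pinned to a different one.
        clone.setdefault("metadata", {})["namespace"] = namespace
        copies.append(clone)
    return copies


# A default-deny ingress policy: the empty podSelector selects every pod
# in the namespace, and no ingress rules are listed.
default_deny = {
    "apiVersion": "networking.k8s.io/v1",
    "kind": "NetworkPolicy",
    "metadata": {"name": "default-deny-ingress"},
    "spec": {"podSelector": {}, "policyTypes": ["Ingress"]},
}

manifests = replicate_policy(default_deny, ["dev", "staging", "prod"])
```

Each new namespace needs another copy, and every copy has to be kept in sync, which is the operational burden that the cluster-wide policy APIs mentioned above aim to remove.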

ADAM GLICK: Currently, Project Antrea is something that you're working on outside of a foundation. Is it something that you'd like to join the CNCF?

ANTONIN BAS: We are seriously considering donating Project Antrea as a CNCF Sandbox Project. That's something we're working on currently. And we haven't done an official submission yet, but that's something that should be coming in the upcoming months. Because if you look at the CNCF landscape and the CNCF projects, there is no actual CNI plugin which is part of the CNCF.

And we think it's important for synergy between those different projects in the CNCF ecosystem to potentially have a CNI plugin as part of the CNCF. Because let's look at Kubernetes Network Policies, for example. They are a standard Kubernetes API, but the default Kubernetes network plugin, kubenet, which is used to run upstream conformance tests, for example, does not implement Kubernetes Network Policy APIs.

So how would you validate those APIs through testing? You really can't. Yet, if you look upstream, you have some end-to-end network policy tests maintained by the community. Maybe if we had a project like Antrea as part of the CNCF, as a Sandbox Project, then it would make sense to run those tests on the project upstream as part of a pre-check-in CI, as part of continuous integration. Basically, validate those tests and validate those APIs.

CRAIG BOX: So aside from potential donation to the CNCF Sandbox, where do you see Antrea going next?

ANTONIN BAS: There are many things. But if I were to mention one, I would say a big part of our focus going forward may be on supporting Telco use cases. There is a lot of noise these days about telcos transitioning from virtual machines to containers to run virtual network functions. So basically, transitioning from the OpenStack ecosystem to the Kubernetes ecosystem. And they want to do this for multiple reasons, but a big part of it is benefiting from the Kubernetes ecosystem, which is very active and very dynamic.

Right now, we're very far from this being a reality. But we get a lot of requests in upstream Antrea to accommodate some of these Telco use cases. Telcos need things like multiple networks, multiple interfaces per pod, the ability to do traffic steering, so service chaining between the different network functions that you can run as containers in a Kubernetes cluster.

For 5G, you have the ability to slice a network, to create multiple virtual networks with their own SLA guarantees. Basically, those are kind of new use cases for Kubernetes. And I think there is definitely a strong interest in having a Kubernetes network plugin provide a unified and integrated solution for Telcos. So that's something we're looking at with Antrea.

ADAM GLICK: For people that are listening to this and this sounds interesting to them, how can they get started?

ANTONIN BAS: Antrea can be deployed by applying a single YAML manifest with kubectl, for example. That's something we really try to keep simple. As I mentioned before for managed Kubernetes services, if you're using managed Kubernetes services, we have dedicated documentation and dedicated YAML manifests. But yeah, single command line.

ADAM GLICK: Thanks for joining us, Antonin.

ANTONIN BAS: Thanks for having me. It was great.

ADAM GLICK: You can find Antonin on Twitter at @AntoninBas, or on GitHub at github.com/antoninbas.


CRAIG BOX: Thanks for listening. As always, if you've enjoyed the show, please help us spread the word and tell a friend. If you have any feedback for us, you can find us on Twitter at @KubernetesPod or reach us by email at kubernetespodcast@google.com.

ADAM GLICK: You can also check out our website at kubernetespodcast.com, where you can find transcripts and show notes, as well as links to subscribe and the ability to take our survey. Until next time, take care.

CRAIG BOX: See you next week.