#25 October 16, 2018

GKE Container-Native Load Balancing, with Ines Envid and Neha Pattan

Hosts: Craig Box, Adam Glick

GKE container-native load balancing enables Google Cloud load balancers to target Pods directly, rather than the VMs that host them, and to evenly distribute their traffic. Product manager Ines Envid and staff software engineer Neha Pattan explain how.

ADAM GLICK: Hi, and welcome to the Kubernetes Podcast from Google. I'm Adam Glick.

CRAIG BOX: And I'm Craig Box.


CRAIG BOX: Well, I'm back home. How about you?

ADAM GLICK: I've got one more week to go before I get home. I'm out at Gartner's ITxpo this week. So if any of you are here in Orlando this week, stop by. I'll be happy to give you a sticker.

CRAIG BOX: Can you also take engagements at Disneyland?

ADAM GLICK: I would love to, but unfortunately I do not have any time to visit Disney. I feel so jealous. I see all the people walking around with the little ear hats on. It's adorable.

CRAIG BOX: I've never been to Florida, but I guess they have the infrastructure for large events.

ADAM GLICK: Yes indeed, everywhere you look, hotels. It's also wonderfully sunny and beautiful here.

CRAIG BOX: How was your London experience?

ADAM GLICK: London was great. I want to say thank you to all the people who came up, found me at the booth, and said hi. It was great to talk to a bunch of you and get your feedback on the podcast and thank all of you for listening. I hear that you got a chance to check out a somewhat famous piece of artwork this week.

CRAIG BOX: I did. That is my story, and I'm sticking to it. Many of you will probably have seen on the news the live creation of a new piece of artwork at Sotheby's when a Banksy print went under auction and then spontaneously shredded itself halfway. It's still not quite clear whether it was entirely a publicity stunt, or whether the auction house was in on it. But as a result, we now have a new piece of art, which has been retitled, and is now called "Love is in the Bin." And it was on display at Sotheby's for two days this weekend, which just happened to be the two days after I flew back to the UK. So I did get a chance. You can check out the show notes for a lovely picture of me with the Banksy print.

There was a very, very long line. There was a whole bunch of other great art there, which was going under the hammer this week. No one really cared about all of that. Everyone was just queuing up to take their selfie with the Banksy. Sort of a statement on modern art, really, isn't it?

ADAM GLICK: It is-- a wonderful statement about making lemonade from lemons. I wonder if it'll show up in the Tate.

CRAIG BOX: You should take a "Kubernetes Podcast" sticker and make a series of small incisions to halfway up and then see what it does to its value.

ADAM GLICK: [LAUGHING] Give that a whirl. Why don't we get to the news.


CRAIG BOX: A number of Kubernetes-related announcements were made at Google Cloud Next in London last week. GKE private clusters are now generally available. Private clusters have both their master and nodes deployed with private IP addresses in a VPC so that they can operate completely disconnected from the public internet. In case you want to get internet access on those nodes, you'll need to use Network Address Translation, or NAT.

Google also launched a managed NAT service in beta. Cloud NAT is a distributed, fully managed, software-defined service and is not instance- or appliance-based. The full details on how this works can be found in the Cloud NAT documentation. Google also announced container-native load balancing, which is now available in beta. This feature lets the load balancer directly target Pod IPs instead of VM instances, and you'll hear a lot more about this in a few minutes. Customers were also told that one-click Istio deployments and mTLS encryption will be available to GKE users next month.

ADAM GLICK: Amazon Elastic Container Service for Kubernetes now supports dynamic admission controllers, allowing customers to deploy custom webhooks. Admission controllers allow you to run a piece of code after an API request has been received, but before the objects are created. They are commonly used to enable automatic sidecar injection with Istio, and that's the feature that Amazon called out in their blog post.

CRAIG BOX: Skaffold, with a K, 0.16 is out. And its headline feature is that you can now sync files to your running containers without any changes to your deployments or extra dependencies. Matt Rickard, with a C and a K, who you met in episode 6, has written up an example of how to use this feature with Python. And you can find that in the show notes.

ADAM GLICK: The Cloud Foundry Foundation announced two new Kubernetes-focused projects at its EU summit in Switzerland last week. The first of these projects, Eirini with an E, will make it possible for vendors to use Kubernetes as the container scheduler for the Cloud Foundry application runtime. The other project, CF Containerization, is designed to package BOSH releases into containers and deploy those containers into Kubernetes. The intent of this project is to give IT the ability to deploy the Cloud Foundry application runtime into existing Kubernetes clusters. These projects come to us from IBM, SUSE, and SAP.

CRAIG BOX: Kubernetes co-founder Brendan Burns has written another book. "Managing Kubernetes," written with Craig Tracey at Heptio, looks at how to manage applications in the Kubernetes environment. Physical copies are due in November, but you can get a jump on the book by giving your email address to Heptio.

ADAM GLICK: Every release of Kubernetes comes with a "Five Days of Kubernetes" series of blog posts, giving deeper dives into new features. Posts for 1.12 include discussions of volume snapshotting, the new RuntimeClass feature for selecting container runtimes on a per-Pod basis, and topology-aware dynamic provisioning, which allows you to provision storage based on things like zones and regions.

CRAIG BOX: Steven Acreman, from the Kubedex, posted a comparison of GKE, Microsoft AKS, and Amazon EKS. After a day on Hacker News and a bunch of comments, it has been updated to include IBM Cloud Kubernetes Service and Alibaba Cloud Container Service for Kubernetes. It offers an external viewpoint on which service is best for users in various situations with various cloud providers, as well as a table of facts and figures for those who prefer to judge based on such things.

ADAM GLICK: New Relic has purchased CoScale to increase their Kubernetes container and microservices monitoring support. The acquisition adds OpenShift and Docker monitoring support to New Relic's current application monitoring offerings. Terms of the deal were not disclosed.

CRAIG BOX: And that's the news.


ADAM GLICK: Our guests this week are Ines Envid and Neha Pattan. Ines is a product manager and Neha a staff software engineer in the networking team here at Google Cloud. Welcome, Ines.

INES ENVID: Hello, everyone.

ADAM GLICK: Welcome, Neha.

NEHA PATTAN: Hi. Great being here.

CRAIG BOX: Congratulations to you both on the launch of Container-native Load Balancing. What exactly is the product that you've just announced?

INES ENVID: It is really basically being able to load balance directly to containers instead of VM instances. And that brings optimizations to the routing, as we're able to remove the double hop. It also brings visibility and optimization into the load balancing algorithms, as we understand what are the number of Pods that are hosted in each specific VM instance.

CRAIG BOX: What was the problem with the previous implementation model, where we used kube-proxy to route requests into a cluster?

NEHA PATTAN: Before Network Endpoint Groups, the way to define a load balancer for containers was to have the instances or nodes that were part of the cluster be defined as backends of the load balancer and to have these nodes be configured with iptables rules that would route the requests that came from the load balancer to the correct healthy containers that were actually backends. And so there was a second hop for the traffic that came from the load balancer. Now Network Endpoint Groups basically get rid of the second hop, and it basically optimizes the load balanced application by sending the request directly from the load balancer to the containers that are backends.
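
The second hop described here can be made concrete with a small model. The following Python sketch is only an illustration under simplifying assumptions that are not from the conversation: the load balancer picks each node with equal probability, and the iptables rules on that node pick any Pod of the Service with equal probability, wherever that Pod runs. The node names and Pod counts are hypothetical.

```python
from fractions import Fraction


def second_hop_fraction(pods_per_node):
    """Expected share of requests that are forwarded to another node.

    Models nodes as load balancer backends: the load balancer sends a
    request to a node, then iptables rules forward it to some Pod of the
    Service, which may live on a different node.
    """
    node_count = len(pods_per_node)
    total_pods = sum(pods_per_node.values())
    share = Fraction(0)
    for local_pods in pods_per_node.values():
        # Assumption: each node is chosen with probability 1/node_count,
        # and each Pod with probability 1/total_pods.
        remote_pods = total_pods - local_pods
        share += Fraction(1, node_count) * Fraction(remote_pods, total_pods)
    return share


def second_hop_fraction_with_neg(pods_per_node):
    """With Pod IPs as backends, every request goes straight to a Pod."""
    return Fraction(0)


# Hypothetical cluster: three nodes, four Pods, unevenly placed.
cluster = {"node-a": 3, "node-b": 1, "node-c": 0}
print(second_hop_fraction(cluster))           # 2/3
print(second_hop_fraction_with_neg(cluster))  # 0
```

In this made-up cluster, two thirds of requests would take the extra hop when nodes are the backends, and none would when the load balancer targets Pods directly.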

CRAIG BOX: The idea of targeting Pod IP addresses directly sounds so obvious in hindsight. What needed to be implemented to make that possible?

INES ENVID: So public clouds have traditionally implemented networking functions to work with VM instances. And therefore things like load balancing or any of the other networking functions have not been working natively with containers. It really requires some modeling, in which the networking is implemented to the granularity of IPs that represent the Pods instead of the granularity of the VM instances. And Google Cloud is really investing in making sure that the networking functions-- load balancing being the first one, but across the portfolio-- work natively for containers.

ADAM GLICK: You mentioned targeting the IPs of the Pods rather than of the VMs or the hosts. Can you talk to me about why that's important?

INES ENVID: Well, when we are instantiating Pods as part of the nodes in a GKE cluster, really the nodes map to the VM instances. But then the Pods that are relevant to the application that is being hosted on the server, it's really represented at the networking level just at IP level. We are hosting a set of IPs that represent the Pods, and all of them are hosted in a single VM instance. So the VM instance itself is not able to represent any more the granularity necessary for a Pod that is hosted in the VM instance. So that is why it's so important to change the paradigm in how we're implementing networking functions to now understand the granularity of IPs and implement the networking functions natively there.

ADAM GLICK: So if someone were familiar with traditional networking, say for VMs, what's different about this, and how should they think about it?

INES ENVID: The way to think about it is understanding that, as they are configuring the GKE cluster, there is a set of Pods that get grouped in services. And the GKE, the Kubernetes Engine model, it models very well the grouping of Pods as part of a service that is persistent in the cluster. When you apply load balancing to those services, then the model that the Kubernetes Engine provides gets mapped directly to the Google Cloud IP-level networking function model. And in doing so, we're able to really replicate the intentions and the declarative intent that is expressed in the Kubernetes Engine configurations to a later model in Google Cloud that is able to implement that intent with the granularity of containers and IPs.

CRAIG BOX: What are the benefits of being able to route traffic directly to Pod IPs rather than having to go through that second hop of the kube-proxy?

NEHA PATTAN: There are actually several benefits of native routing, and we have achieved this using alias IPs. So if you create a cluster using alias IP ranges, then you get native routing support. And one of the benefits of this is that you get anti-spoof checks, and so there is additional security for your routing to your Pod IPs on the nodes.

Another advantage is that you're able to centrally manage all the IP space, and so all the node IPs, the Pod IPs, the cluster IPs, are allocated out of primary and secondary ranges that you can manage centrally in your organization. And the third advantage is that it helps with scalability. And so you're able to create more nodes and more Pods in the cluster.
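
As a rough illustration of the primary and secondary ranges and the anti-spoof check mentioned above, here is a Python sketch using the standard `ipaddress` module. Everything specific in it is an assumption made for the example: the CIDR ranges, the node names, the choice of one /24 slice of the Pod range per node, and the simplified check itself. It models the idea only and is not how Google Cloud implements alias IPs.

```python
import ipaddress

# Hypothetical ranges, chosen only for illustration.
NODE_RANGE = ipaddress.ip_network("10.0.0.0/24")     # primary range: node IPs
POD_RANGE = ipaddress.ip_network("10.4.0.0/14")      # secondary range: Pod IPs
SERVICE_RANGE = ipaddress.ip_network("10.8.0.0/20")  # secondary range: cluster IPs


def ranges_are_disjoint(*networks):
    """Return True if no two of the given networks overlap."""
    return not any(
        a.overlaps(b)
        for i, a in enumerate(networks)
        for b in networks[i + 1:]
    )


def carve_node_pod_ranges(pod_range, node_names, prefix_len=24):
    """Give each node its own slice of the Pod range (assumed /24 per node)."""
    blocks = pod_range.subnets(new_prefix=prefix_len)
    return dict(zip(node_names, blocks))


def passes_anti_spoof(source_ip, node, node_pod_ranges):
    """Accept a Pod packet only if its source IP is in the sending node's slice."""
    return ipaddress.ip_address(source_ip) in node_pod_ranges[node]


node_pod_ranges = carve_node_pod_ranges(POD_RANGE, ["node-a", "node-b", "node-c"])
print(ranges_are_disjoint(NODE_RANGE, POD_RANGE, SERVICE_RANGE))
print(node_pod_ranges["node-b"])
print(passes_anti_spoof("10.4.1.7", "node-b", node_pod_ranges))
print(passes_anti_spoof("10.4.2.7", "node-b", node_pod_ranges))
```

Because every address comes out of ranges declared in one place, overlap checks and per-node ownership checks become simple set-membership questions.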

CRAIG BOX: And then what impact does it have on the fairness and health checking of load balancing?

NEHA PATTAN: With native load balancing, there are several advantages as well. Initially, without Network Endpoint Groups, when we were programming instance groups, the Google Cloud load balancer was able to health check only the node endpoints. And the iptables rules were then configured to route the requests to the actual Pod IPs. And the nodes in the cluster were health checking the actual containers that were serving the traffic of the load balanced application.

Now with native load balancing, you are able to get the benefit of load balancing and health checking natively against the Pod IPs and the containers that are actually serving the application. And so there is an added advantage there.

CRAIG BOX: Are there any occasions where you wouldn't want to use container-native load balancing, where you'd want to use the old method?

NEHA PATTAN: We actually recommend using the new method, mostly because it's more scalable, it's more performant, and it gives better health checking performance, as well as load balancing. Another added advantage is that you're able to get the logging for the flows that come from the load balancer using the correct source IP address, which is that of the load balancer itself.

CRAIG BOX: Yeah, so previously you had to configure services with an annotation that basically said only machines that are running Pods that serve that traffic will be in the instance group for that. And so we can do away with that now with native load balancing.

NEHA PATTAN: That's right.
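
The fairness difference behind this exchange can be sketched numerically. The Python below is a simplified model, not GKE's implementation: it assumes the load balancer splits traffic evenly across its backends, that under the older setup only nodes running a serving Pod are backends, and that each such node splits its share evenly among its local Pods. The node names and Pod counts are made up.

```python
from fractions import Fraction


def pod_share_node_backends(pods_per_node):
    """Per-Pod traffic share when nodes that run serving Pods are the backends."""
    serving = {node: count for node, count in pods_per_node.items() if count > 0}
    node_share = Fraction(1, len(serving))
    return {
        f"{node}/pod-{i}": node_share / count
        for node, count in serving.items()
        for i in range(count)
    }


def pod_share_pod_backends(pods_per_node):
    """Per-Pod traffic share when the Pods themselves are the backends."""
    total_pods = sum(pods_per_node.values())
    return {
        f"{node}/pod-{i}": Fraction(1, total_pods)
        for node, count in pods_per_node.items()
        for i in range(count)
    }


# Hypothetical placement: three Pods on one node, one Pod on another.
cluster = {"node-a": 3, "node-b": 1}
for name, share in pod_share_node_backends(cluster).items():
    print("node backends:", name, share)
for name, share in pod_share_pod_backends(cluster).items():
    print("pod backends: ", name, share)
```

Under these assumptions the lone Pod on `node-b` receives half of all traffic with node backends, while each Pod receives a quarter when Pods are targeted directly.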

ADAM GLICK: What will the introduction of service meshes like Istio do to the world of Kubernetes networking?

INES ENVID: It's just a very interesting concept. And Kubernetes has really provided a very successful model for orchestration and deployment of containerized applications. Kubernetes hasn't solved altogether the networking service-to-service communication, and there is really room for improvement for communicating services in a microservice deployment. So the specific issues that Istio is set to solve are the authentication and encryption security of the service-to-service communication, also richer traffic management for the routing across services, and also observability and visibility for the services. So I truly believe that the addition of Istio and its specific capabilities for service-to-service communication is really going to complement very nicely the Kubernetes capabilities on the networking routing.

CRAIG BOX: Are there any other pieces of low-hanging fruit in terms of networking, of areas where we see that there's an obvious optimization that we should make?

INES ENVID: I think there is a lot of potential room for convergence on some of the APIs that declare aspects like network policies or Ingress in an integration wall with Istio on Kubernetes. I think those aspects are still to be determined. But as we really move towards microservices deployment in which Kubernetes and Istio might complement each other, it's important to come up with convergence in some of the declaration of these data models.

CRAIG BOX: I know that kube-proxy is the most poorly named part of Kubernetes.


Accidentally-- it started out as a proxy, and then it just basically became a router of some sort. Now that we're able to program external access through Network Endpoint Groups, what other roles does kube-proxy play in the cluster?

INES ENVID: kube-proxy is really a very flexible concept. And everything that gets implemented in iptables is extremely flexible. This flexibility comes with tradeoffs for optimization of the routing visibility into the input. I think as we are moving more to a native container network function, definitely in Google Cloud, we want to really remove kube-proxy from the places in which it does not add that additional value, and it was just bringing a convenient way to deliver the functions.

We are really moving those functions to the Google Cloud networking stack, and then we really want to make sure that it doesn't play a role when it's not strictly necessary. I believe that, moving forward, the way to have a proxy type of functionality is really the Istio sidecar proxy. It's bringing a lot more to the table as it is able to really treat the traffic at the application level.

And in doing so, it's able to really provide traffic management functions at the application level. It's able to provide authentication of the service. It's able to provide the encryption of the application layer as well. So it is a much nicer, more value-added way of really communicating in services. So therefore, the approach that we want to take is really provide Layer 3, Layer 4 native routing capabilities as much as possible and then shift the intelligence of the proxy to the application layer in which it still brings a lot more to the table.

ADAM GLICK: What's coming up in container networking that you're most excited about?

NEHA PATTAN: One of the things that we've really taken an initiative in over the last year or so in the networking team is to take a Kubernetes-first approach. And so we've been redefining and rethinking our data model. We've been redefining some of the fundamental aspects of networking, how it works, in order to think of containers and in order to support containers as first-class citizens for Kubernetes using GKE.

And the Network Endpoint Groups launch, with container-native load balancing, is one of the things that we are doing, which is an initial launch in this area. And I think there'll be a lot more that comes up in the space, and all of the work that we are doing here is really exciting.

CRAIG BOX: Ines, Neha, thank you both very much for joining us today. It was great to have you here.

NEHA PATTAN: Thank you so much.

INES ENVID: It's been a lot of fun. Thanks so much, Craig, for hosting us.

CRAIG BOX: You can find out about container-native load balancing in the GKE docs, and you can find that link in the show notes.


ADAM GLICK: Thank you for listening. As always, if you've enjoyed the show, please help us spread the word and tell a friend. If you have any feedback for us, you can find us on Twitter @KubernetesPod or reach us by email at kubernetespodcast@google.com.

CRAIG BOX: You can find all the links from this episode and every other at our website, at kubernetespodcast.com. Until next time, do take care.

ADAM GLICK: Catch you next week.