#209 October 6, 2023

What's new in Istio, with John Howard and Keith Mattix

Hosts: Abdel Sghiouar, Kaslin Fields

This week we explore what’s new in Istio with core maintainers John Howard and Keith Mattix.

Do you have something cool to share? Some questions? Let us know:

News of the week

KASLIN FIELDS: Hello and welcome to the "Kubernetes Podcast" from Google. I'm your host Kaslin Fields. And--

ABDEL SGHIOUAR: I am Abdel Sghiouar.


KASLIN FIELDS: This week, we chatted with Keith Mattix and John Howard. Keith and John are core maintainers of the Istio project. We chatted about Istio, the status of the project, the new ambient service mesh architecture, eBPF, and more. But first, let's get to the news.

ABDEL SGHIOUAR: Linkerd announced version 2.14. The new version of the service mesh tool comes with multi-cluster support, Gateway API conformance, and many, many new features. Click the link in the show notes for details.

KASLIN FIELDS: Amazon announced plans this week to invest up to $4 billion in AI startup Anthropic. Anthropic offers an AI-powered chat bot, which recently launched its first consumer facing premium subscription plan. Anthropic has made a long-term commitment to provide AWS customers around the world with access to future generations of its foundation models via Amazon Bedrock, AWS's fully-managed service that provides secure access to the industry's top foundation models.

ABDEL SGHIOUAR: The Call for Proposals, or CFP, for KubeCon + CloudNativeCon EU is open until November 26th, 2023. The event will take place March 19 to 22 next year in Paris, France.

KASLIN FIELDS: The CNCF announced cloud-native security slam. The 30-day challenge is designed to help project creators and users improve their software supply chain security. Participants will have access to a library of curated content to learn and improve their software security posture. Registration is open. And the event will take place from October 10th to November 9th, 2023.

ABDEL SGHIOUAR: The Istio Certified Associate, or ICA, is a new certification launched by the Linux Foundation. The new certification, which was initially developed and maintained by Tetrate, is designed to help people assess their level of proficiency with Istio. The exam is bookable today, but scheduling a session will only be possible sometime in November 2023.

KASLIN FIELDS: Sonatype said it discovered 14 different npm packages targeting Kubernetes installations. These packages use a technique called dependency confusion to disguise themselves as legitimate JavaScript libraries. Once these files are inside a target machine, they run obfuscated code that collects sensitive information, such as usernames, SSH keys, and IP addresses, and ships it to a remote domain. And that's the news.


ABDEL SGHIOUAR: Well, hello, folks, to a new episode of the "Kubernetes Podcast." Today, I'm talking to Keith Mattix II, who is a software engineer lead at Microsoft, and John Howard, who is a software engineering lead at Google. Both of them are long-term contributors to Kubernetes and Istio.

And we got them on the show today to find out what's going on with Istio, because the last time this topic was on the show was quite some time ago. Welcome to the show, Keith and John.

KEITH MATTIX: Happy to be here.

JOHN HOWARD: Yeah, happy to be here.

ABDEL SGHIOUAR: Thank you for accepting my invitation on such short notice. As I was saying before, we figured, hey, let's just check in with our friends at Istio and figure out what's going on there, because quite a lot of things have happened. I'm going to go in an unorthodox way and start in reverse order. The last one, native sidecar support in Kubernetes 1.28-- I read the article that John published on the istio.io blog. Can either of you shed some light on what that change is, and what it actually means for Istio?

KEITH MATTIX: Yeah, sure, I can talk about sidecars in general, and then let John talk about the process of bringing it over to Istio. So sidecars in Kubernetes have been a long-time pattern without a lot of rigor, without a formal API. And this has led to a lot of weirdness as people try to deploy sidecars in production and use them the way they're used to using other Kubernetes pods.

So a classic example is when you run a sidecar on a cron job. Once that cron job has completed its execution-- the main container finishes its execution-- the sidecar will continue to run. And you get what I've called zombie cron jobs that just keep on running, and running, and running unless you do some weird trap-- Linux trap magic-- to make it stop. Between that and some resource issues and startup ordering, it just hasn't always been super clean if you need to do sidecars alongside regular containers and even init containers in Kubernetes.

And so in 1.28, as an alpha feature, some folks from SIG Node and SIG Network in Kubernetes got together and said, we're going to fix this. And John and I both popped in and worked on that as part of that effort at different times-- so really, a group effort from a lot of folks across the community-- in order to bring that sidecar pattern to parity, with an API, and have it be fully supported as an API. And the scope was specifically kept small in order to make it workable, but it worked. So now in 1.28, as an alpha feature-- you have to turn it on when you boot up your cluster-- folks can actually get native sidecars built into the Kubernetes API, which is a very exciting thing.

ABDEL SGHIOUAR: Yeah, definitely. I was reading a Reddit thread in which Tim Hockin was replying to people asking about design choices, and the reply that I pretty much liked was that it's a noninvasive design choice without too many changes. And speaking of that-- I'll turn it back to you, John. I was wondering why the implementation was done through an init container with a restart policy and a very simple-- "simple" in quotes-- change to the logic flow in the kubelet, rather than, let's say, a full sidecar spec, which is what I was expecting the fix to be.

JOHN HOWARD: So for some context, for folks that haven't-- if you haven't read the blogs, there are blogs on Kubernetes, Istio, Linkerd-- a bunch of people have talked about it, so plenty to read up on. But the basic implementation is that today, a sidecar is typically just a normal container. And there's nothing special about it.

Now, it will be an init container, but it will have this new field, restartPolicy: Always, which is kind of obscure. But it basically means it's a sidecar. And it gives properties that-- there's no "sidecar" anywhere in the API exactly, but it more or less means it's a sidecar.
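The shape John describes can be sketched as a pod spec. This is an illustrative sketch, assuming the Kubernetes 1.28 alpha behavior behind the SidecarContainers feature gate; the image and container names are made up:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: app-with-native-sidecar
spec:
  initContainers:
  # Ordinary init container: runs to completion before anything else starts.
  - name: cert-provisioner
    image: example.com/cert-init:latest
  # Native sidecar: the new field. It starts after the init container above,
  # keeps running alongside the main containers, and is shut down after
  # they exit.
  - name: mesh-proxy
    image: example.com/proxy:latest
    restartPolicy: Always
  containers:
  - name: app
    image: example.com/app:latest
```

Because the sidecar is still an init container, it sequences naturally with the other init containers, which is the property John calls out below.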


JOHN HOWARD: So there's a few reasons for that. One is that we don't want to keep adding more and more fields-- an init container, a sidecar container, then a new container type every time someone comes up with a field-- and keep having all these different types of containers. But this API has been under discussion for a very long time. If you go back and read blogs from the Kubernetes 1.16 release, there's actually a blog post saying that this feature was launched-- not just that it was planned, but that it was actually launched.



JOHN HOWARD: And that was four years ago, right? Of course, it did not launch four years ago. It was almost launched, and then they decided not to. So this has been under discussion for quite a while. And there have probably been 100 different APIs that have been pitched.

This one kind of meets almost every requirement we would want, and it's kind of very minimally invasive, like you said. It's just one extra field on an init container. But one of the especially nice things about it being an init container is you can sequence it with other init containers, for example.

So we're from Istio, so we do service mesh. One of the things a service mesh needs is an MTLS credential. Istio doesn't work this way out of the box-- we have our own MTLS provisioning flow. But you could, for example, have an init container that provisions a certificate.


JOHN HOWARD: Now, how do you run that if we just say it's a sidecar container? Does it run before the sidecar container starts? Does it run after? We have this actually sequencing between init containers, and sidecar containers, and real containers. So by putting it in the init container, it makes it quite easy to sequence.


JOHN HOWARD: The other options were this giant dependency graph of containers. And some very complex stuff was pitched which would be very powerful, but again, very complex. I don't know that anyone loves it, but everyone likes it, and that's what it takes to get something approved these days.

ABDEL SGHIOUAR: Yeah, I think that's a really interesting way of putting it. Because like myself, a lot of people that read about this found the title a little bit confusing at first. It says "native sidecars," and then you go read the implementation, and I'm like, OK, well, it's just an init container with a special field, essentially. But I still think it's a powerful implementation. I still think it solves the problem for probably 95% of the cases, right?

JOHN HOWARD: Yeah. And at the end of the day, most people actually probably-- well, at least most users won't actually know or care how it's implemented.


JOHN HOWARD: In Istio, the sidecar is injected automatically. And so unless you're looking at things under the covers, what's going to happen is, eventually, once it's fully rolled out and on by default, Istio just does this for you and things just work better. Right? It's a completely transparent improvement for users.

ABDEL SGHIOUAR: Correct, except that I still think that some users would care. One of the use cases I think about would be something like fetching a password or certificate from a vault-- not talking specifically about HashiCorp Vault, but a vault or a secret store. That's typically done through sidecars. Right?

Because one of the things you want to do is monitor if the passwords have been rotated and maybe reload the app or whatever. So I think people there would care as well. Another use case-- I'm just going to quickly mention this and stop rambling-- is Argo. Not Argo CD, but Argo Workflows, which is an open source tool that does workflow orchestration. I have worked on a project where adding Istio to it became problematic.

Because Argo Workflows will orchestrate your workflows in the form of jobs and pods that are started, execute a job, and then die. And if those jobs take a couple of seconds, by adding Istio to them-- with all the waiting, and all the retrying, and all the stuff-- it becomes a minute to execute a job. So now if you have a native-ish way to handle sidecars, it's better, essentially.

JOHN HOWARD: Yeah. I'll give you another example that I think is pretty relevant. I used to work doing DevOps for a data engineering team, and they were using Apache Airflow to orchestrate a lot of their jobs. And as AI and ML is one of those very popular things that people are talking about nowadays, a lot of folks who are doing large batch operations in Kubernetes clusters run into that exact same scenario you were explaining with those Argo workflows, where they've got to go and have a vault sidecar to get credentials-- short-lived credentials for talking to their blob storage, or what have you.

And when you try to do that through Istio, and you've got that job as part of the mesh and it's encrypted, you really run into a lot of issues trying to have the sidecar not just continue to churn and keep running after the main job has exited. And so there are ways around it. Istio, in particular-- or Envoy, in particular-- has a quit endpoint you can use a trap to hit, or other folks will run a cleanup job in the cluster every so often. There's a whole project around sequencing containers and killing processes like this. But now that we've got native sidecars in Kubernetes, at least in alpha, it'll work a lot more seamlessly for folks on the data batch job side of Kubernetes consumption.
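The trap-based workaround mentioned above can be sketched as a Job whose main container shuts the injected proxy down on exit. This is a hedged sketch: the worker image and `run-batch-job` command are made up, and it assumes Istio's pilot-agent quit endpoint on port 15020:

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: batch-with-sidecar
spec:
  template:
    spec:
      restartPolicy: Never
      containers:
      - name: worker
        image: example.com/batch-worker:latest  # illustrative image
        command: ["/bin/sh", "-c"]
        args:
        - |
          # When the main process exits, ask the Istio sidecar to quit
          # so the pod (and the Job) can actually complete.
          trap 'curl -fsS -X POST http://127.0.0.1:15020/quitquitquit' EXIT
          run-batch-job  # illustrative workload command
```

With native sidecars, this trap becomes unnecessary: the kubelet stops the sidecar after the main container finishes.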

ABDEL SGHIOUAR: I'm going to jump a little bit ahead and ask very quickly, is there any work going on with the Istio community on optimizing the sidecar itself, like startup time, like the time it takes for the sidecar to be ready?

JOHN HOWARD: That's a good question, sort of.


A lot of our focus on improving the performance, and specifically the resource overhead, of Istio has been going into Ambient. A lot of the concerns we have are architectural issues that are hard to address with the current sidecar architecture. The startup is pretty quick in general, though. It takes about 1 second on a medium-scale cluster.


JOHN HOWARD: A lot of the issue actually seems to be on the Kubernetes side. I don't want to call out Kubernetes as simply being slow when I'm not an expert there, but it does seem like once the container is actually started, it's fairly quick to get ready. So I don't know. I guess my answer is, maybe. [LAUGHS]

ABDEL SGHIOUAR: Maybe. OK. Well, you mentioned Ambient mesh, so let's go there. Maybe one of you, or both of you, have been asked to answer this question hundreds of times. But can one of you give us just a very quick, what is Ambient mesh?

JOHN HOWARD: Sure. Yes. Like we talked about, Istio and other service meshes traditionally have been implemented with the sidecar proxy. You stick another container alongside every pod. That runs the service mesh and does all of the cool things that we like.

Ambient is kind of rethinking that architecture, and it does so by splitting apart the service mesh into two layers. Usually, we start the other way around, but I'll start backwards, because I think it makes more sense. We talked about the issues with sidecars. Right? And some of these are fixed by the sidecar container enhancement, but a lot of them aren't.

Even with that enhancement, it's still hard to onboard the mesh. It's still hard to upgrade it. You can't just take a running pod and say, hey, I want the new Envoy version because there was a CVE or a new feature I want. You can't just add it dynamically to a pod. Usually, the Istio onboarding docs are like, oh, first run install, and then go restart your 10,000 pods, and come back in a week once that's done.

ABDEL SGHIOUAR: Yeah. It's pretty invasive. [LAUGHS]

JOHN HOWARD: Yeah. So-- and there's other issues, too. There's resource footprint of using it in every pod. The list goes on and on. You can find long lists of issues probably somewhere. [LAUGHS] So with Ambient, we said, OK, we want to take that proxy and completely decouple it from the pod.


JOHN HOWARD: We want to run it externally, and externally could be on a different node. It could be completely outside the cluster if we wanted. It's really just this independent, standalone thing. We typically run it as just a standard deployment so it can be auto-scaled up and down, and that has quite a few benefits.

One, it's now just a normal pod. So we can scale it. We can upgrade it, do standing rolling restarts. And we can share it between application pods. So we can amortize some of the costs of the sidecar functionality.

But now that it's standalone, we have an issue. Right? How do we get the traffic from the application pods to the service mesh proxy, and how do we do it securely? One of the huge value adds of service mesh is that we have MTLS everywhere.


JOHN HOWARD: If we're sending all this traffic plain text to this remote proxy, we're not getting a lot of value out of the TLS. Right? So we added another component, which we call the Ztunnel. It's really this per-node component that is very small and really only responsible for two things: one is encrypting all traffic between pods, and two is getting traffic to the waypoints. Oh, I think I forgot to mention-- a waypoint is what we call that remote proxy.


JOHN HOWARD: So getting traffic to the waypoints when it's required. That's what we mean by splitting the mesh into two layers. We have the L4 secure transport layer, and then the full L7 traditional service mesh layer with the waypoint.

ABDEL SGHIOUAR: And we're going to explore that a little bit more in detail, I think, starting at layer 4, the Ztunnel. It's a Rust-based tunnel, which I think the community basically built from scratch. Right? It didn't exist before.

JOHN HOWARD: Yes. Correct.

ABDEL SGHIOUAR: So a lot of questions there. Why a dedicated proxy? I mean, I think that one of the biggest points of contention in the service mesh space is eBPF or not eBPF. Right? I bet you have heard this many, many times before.

JOHN HOWARD: Of course. Yes. Many, many times.

ABDEL SGHIOUAR: We're not going to point any fingers today. We're just going to try to answer why Ztunnel, and not eBPF.

JOHN HOWARD: There's probably more "why nots," because we actually considered quite a few Ztunnel implementations. And we actually made quite a few Ztunnel implementations before we did the initial launch as prototypes.


JOHN HOWARD: So if we go back to what Ztunnel's goals are, one of the key goals is to secure transport between all pods that are communicating. And we had made the decision early on in Istio, and continue to reassess and make the same decision to use mutual TLS as that encryption and authentication layer. This is the industry standard. It meets all the compliance requirements.

It's highly established across service mesh, and well beyond service mesh. Everyone is probably viewing this podcast over HTTPS somewhere. And so we wanted to double down on that. Going directly to the eBPF question: eBPF really is about programming the kernel in a very limited programming environment. Right?


JOHN HOWARD: TLS is extremely challenging to do in that environment. There's some kernel TLS-- kTLS-- support in the kernel to some extent, but it's very tricky and hard to use. It's not something that's really been done widely or is well established. Once you want to do something outside the kernel space, you're then sending packets up to a user-space proxy. Right?


JOHN HOWARD: That's one of the biggest issues with eBPF: it's kind of all or nothing. It's much easier to program something in user space. You have all the programming languages, you're not constrained by the limited programming environment, and you can use whatever battle-tested library you want. Right?


JOHN HOWARD: So if you can do everything in the kernel, it's quite nice, because you don't have to pay the cost of sending things to user space, and you can do whatever you want. But the second you have anything in user space, the more stuff you're doing in the kernel, the harder it is, because it's so constrained. Right?


JOHN HOWARD: So we knew that we were going to need to go to user space for TLS, but also for the tunneling, which I'll get to in a moment. And so it was clear to us that eBPF wasn't going to cut it here. The choice then was, OK, do we use Envoy, which is what we use for the rest of Istio-- for our sidecars and our gateways-- or do we consider something else? And if we consider something else, what is that?

So we actually implemented an Envoy version of Ztunnel, and a Go version, and a Rust version-- all three of them. And I think at one point, they even all passed the same tests, and we had them all side by side, working.

At this point, we've consolidated on just one option. And we really looked at those and considered, what is the current performance? What is the potential optimal long-term future performance? How fast can we iterate on it? And a lot of those things.

And what we found was that Envoy-- I mean, Envoy was the obvious choice. Right? Everyone on Istio is familiar with Envoy, because we've been working with it for everything else, and it's a very established project. It's a great project.

But we found that for what we were molding it to do, it became challenging. Envoy was really designed as a service mesh proxy and ingress gateway proxy. But we were now using it in a different way, where we're this-- almost like a router. It's somewhat different.

So Envoy is very flexible, but putting it through this different model was really stretching its limits in performance and usability, really. It's much easier to be able to say in code, oh, use MTLS. That's a Boolean.

Versus in Envoy, saying, oh, here's 100 lines of config for every pod telling it what doing TLS means. Right? It's very flexible, but it's also very generic, so you have to program very extensive things.

And you can't bake Istio-specific knowledge into Envoy, because it's not an Istio-specific project. So yeah, those were the motivations that led us to building the Ztunnel from the ground up. And we chose Rust over Go because of some performance numbers. Really, that was one of the less interesting decisions, I think.

ABDEL SGHIOUAR: I think it's one of the interesting decisions for me, personally, because I do like Rust as a programming language. But I just want to go back to one thing you mentioned, John. When you were talking about using Envoy to do what Ztunnel does, or building a new one, it felt to me like the same discussion of using Docker as a runtime in Kubernetes versus using containerd.

Docker can technically be used as a runtime, but do you want to use a thing that can do more than just a runtime just to execute a runtime? Which is kind of like-- Envoy can do more than just MTLS, and trying to bend it to just do one simple thing feels like it's easier to just build a new thing from scratch. Right?

JOHN HOWARD: Yeah. I mean, it's always a tricky problem. It feels easy to build something from scratch when you're only--


JOHN HOWARD: --considering the simple use cases. And then it scales up. It scaled up more already than I thought it would, to be honest. I still think it was the right decision. But when we first implemented it to meet what then was all the requirements, it was, like, 1,000 lines of code. And now it's maybe 5,000. Nothing crazy, but quite a bit more than what it initially was.

ABDEL SGHIOUAR: Yeah, but I still think it's substantially less than Envoy itself.

JOHN HOWARD: Yeah. And that was also one of the things. Even if Envoy and the Rust version had been at the same level of performance when we started-- which was not the case; the custom-built one had quite a bit better resource footprint and other characteristics-- we thought that by making it purpose-built, we could have a guaranteed better long-term path. As long as we're willing to invest enough engineering resources-- and Istio is a pretty big project with a lot of contributions from a lot of different companies, so that's a pretty good assumption-- we could beat something that's general purpose, because we have a very, very specific goal of just doing secure transport. And that's all we need to focus on. We're not going to beat Envoy at all the things Envoy does. Envoy does a ton of stuff.


JOHN HOWARD: No one is competing with Envoy there. We just want to do one thing and focus only on doing that thing well. And so we thought with that focus, we could outperform Envoy.

ABDEL SGHIOUAR: That's a very valid point. And when you were talking, you talked about tunneling. The Ztunnel uses something called HBONE. Maybe Keith, you want to shed light on that, what's HBONE. I've never heard about it before hearing of Ztunnel.

KEITH MATTIX: Sure. And I've got a small anecdote here. When Istio launched Ambient mesh back in September of 2022-- John, correct me if I'm wrong. I think that's when it was.


KEITH MATTIX: I remember seeing it, and I was like, huh, HBONE. OK. Let me see what it is. And I tried to Google it and found nothing.

So it's an acronym that, if you Google it, you're going to find nothing but Istio resources. But really, HBONE is actually a fairly generic protocol. It's based on the MASQUE standard in the IETF.

So if you go and Google "MASQUE," M-A-S-Q-U-E, you will find more information about it. And it's really just a way to generically tunnel traffic over HTTP. And the actual MASQUE proposal, the MASQUE standard, is based on QUIC, HTTP/3.


KEITH MATTIX: But we are currently just using HTTP/2 due to some, I believe, performance concerns with QUIC. No, not performance-- I think it's library support. I forget.

But essentially, we take the MASQUE protocol and apply it on a well-known port, and we use the authority header to indicate the destination service. And when you put all those components together-- a tunneling protocol, a well-known port, and a way of using the authority header-- that, together, is what gets you HBONE. And the way that Istio uses this HBONE protocol is for the mesh transport that John just described.
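A rough sketch of what those components look like on the wire, under the description above-- an HTTP/2 CONNECT request over mTLS on a well-known port (Istio's ambient implementation uses 15008); the destination address is illustrative:

```
:method    = CONNECT
:scheme    = https
:authority = 10.0.1.5:8080   # destination workload the tunnel targets
# ...mesh metadata (peer identity, telemetry headers) travels as tunnel
# headers here, not inside the application's own request...

<tunnel body: the application's original byte stream, unchanged>
```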


KEITH MATTIX: So previously-- right now with Istio, you've got what we like to call transparent proxying. And the typical way that most meshes do transparent proxying today is by taking the same network byte stream. If you're proxying an HTTP request, that same request is going to go from your client to your server, and it's going to be relatively unchanged.

And the way that Istio indicates that this traffic is designated for the mesh is by adding some creativity to how we handle a TLS extension called ALPN. And it works pretty OK until you try to do some more complex things. You also end up having to modify the request itself if you want to do anything with headers.

So a classic example is rate limiting. If you want to add rate-limiting headers back and forth in a sidecar mesh or a traditional transparent proxying mesh approach, then your actual traffic request that's got your application data, your application payload, is going to have those additional headers and telemetry information encoded in that transport. By adding in this tunneling protocol, we can actually keep 99%, or really all, of the mesh transport-specific information in the-- I guess you would say the overarching tunneling protocol. So the HBONE request is going to have the headers, and the telemetry, the peer metadata, all these rate-limiting headers, et cetera. All that information is going to be living within the HBONE transport, and your application data will remain unchanged within the tunnel.


KEITH MATTIX: And so that's really what HBONE is. And the fact that we put this MASQUE-based tunneling protocol on a well-known port means that we can make some assumptions within the Ambient mesh about where traffic is supposed to flow when we're doing redirection.


KEITH MATTIX: So, yeah, that's HBONE in 3 minutes.

ABDEL SGHIOUAR: OK. That's a much better description than I could find myself on the internet. A lot to unpack there. So let me ask this very simple question. Ztunnel is technically still a transparent proxy. Right? The pods don't even know it exists.

JOHN HOWARD: Correct. Yes.

ABDEL SGHIOUAR: I remember one of the discussions that we had, I think it's with you, John, early on the Ambient mesh design is the fact that the Ztunnel can impersonate the identity of the pod. So essentially, the receiving pod on the receiving end does not see the proxy as a requester. It sees the pod as a requester. Right?


ABDEL SGHIOUAR: Is that through certificates? Because it's technically a layer 4 thing. Or is it through the service account? I mean, I guess it sends the service account of the pod.

JOHN HOWARD: So how it works-- I think there's two parts: one, how is it transparent, and two, how does it do the identification?


JOHN HOWARD: The transparency part is really just a usage of the Linux networking stack. It has this feature called IP_TRANSPARENT, actually. A well-suited name.

And that, plus TPROXY, plus maybe a few other things, allows you to send requests that claim to be from some other IP address. And so it allows you to do this transparent proxy thing, which is exactly what we want to do. So we can have something in between two destinations and preserve the source IP and destination IP on both sides, basically.


JOHN HOWARD: If that sounds very scary-- it requires high privileges, which is why we don't do this in sidecars, for example. We don't want to give every application root permission, but it's fine for this node-level proxy, which already receives all traffic and is already highly privileged, to have that privilege. Now, the other end of your question, I think, was, how do we actually identify which pods are talking to each other, now that it's not actually the pods talking to each other? The Ztunnels are talking on their behalf.

So each Ztunnel, when they communicate, is actually using the workload certificates that are tied to each workload's identity, just as they would in the service mesh. So if I have a client application talking to a server application, there will be a TLS handshake where two certificates are exchanged-- one will be the client identity, one will be the server identity. And it does that by each Ztunnel that runs on the node getting a unique certificate for every identity running on that node.


JOHN HOWARD: So if I have five different pods running on my node, I may have five different certificates in Ztunnel. And it aggregates them and picks the right one based on the source and destination traffic.


JOHN HOWARD: Importantly, each Ztunnel only has access to-- and that's not just how we code the Ztunnel. That's actually how we set up the permission model. It only has access to the identities of the pods on its node. So it can't just say, hey, I'm a Ztunnel, give me every identity in the cluster, and I'm going to go "man in the middle" everyone. It's very scoped, just like how kubelet is implemented, where kubelet can't just ask for an arbitrary secret.

It can ask for a secret that a pod on its node requested, and then mount it in the pod, but it can't just ask for arbitrary ones. So we do the same thing in Ztunnel to make sure that the blast radius of Ztunnel's permissions is scoped to that node. So it has node privilege, but not cluster-wide privilege.

ABDEL SGHIOUAR: And I guess that's also good from a resource perspective, because if the Ztunnel doesn't have to know the certificates of all the workloads in the whole cluster, then its resource footprint would pretty much be limited to whatever runs on the node. Right?


ABDEL SGHIOUAR: Yeah. Cool. That's actually pretty interesting. And so then the next natural question would be, what happens if two pods are talking on the same node? Do they still go through the Ztunnel?

JOHN HOWARD: Yes. The answer is, yes, they go through the Ztunnel, because we can apply policies-- basically, network policies. But we also added the ability to do identity-based policies.

So in a network policy, you say, these IP addresses can talk to these IP addresses. Whereas an identity-based policy is saying, this client can talk to this server. And we verify that based on the certificates. Right?


JOHN HOWARD: So because we still may need to enforce those policies for two pods talking on the same node-- as well as telemetry-- we do capture all the traffic. What we don't do, though, is encrypt that traffic, because there's nowhere to encrypt it to.

Ztunnel could, I suppose, in theory, loop around and call itself and encrypt that. But it's already Ztunnel to Ztunnel. So we have a fast path. But from the security perspective, it's equivalent, because we still applied the policies and whatnot.

ABDEL SGHIOUAR: When I was trying Ambient mesh, in the authorization policy, there are actually two modes. You could set it based on the identity or based on the request, depending on whether you have the waypoint proxy or not. But just with Ztunnel, you can already have an authorization policy that says, this service account can talk to this service account, or, this service account cannot talk to this service account. And then you add the waypoint proxy, and then you add the layer 7 GET request on /api, /v1, whatever. Right?

JOHN HOWARD: Exactly. Yeah.
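The L7 variant Abdel describes is the same API with request-level match fields added. A sketch with hypothetical names and paths; in ambient, the method and path matching requires a waypoint proxy to parse the HTTP traffic:

```yaml
# Illustrative sketch: same identity rule as before, now narrowed to
# GET requests on /api/v1 paths. The L7 fields are enforced by the waypoint.
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: server-allow-get-api
  namespace: demo
spec:
  selector:
    matchLabels:
      app: server
  action: ALLOW
  rules:
  - from:
    - source:
        principals: ["cluster.local/ns/demo/sa/client"]
    to:
    - operation:
        methods: ["GET"]
        paths: ["/api/v1/*"]
```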

ABDEL SGHIOUAR: Cool. So, you mentioned the encryption part between the pods and the Ztunnel-- because I have done a couple of talks about Ambient mesh in general, like Introduction to Ambient Mesh. And the question I get all the time is, well, now the unencrypted path is much longer than with sidecars. Because with sidecars, it's basically localhost.

JOHN HOWARD: Yes. That is true. I gave a talk about this, actually, an entire talk, 30 minutes about this, at Amsterdam KubeCon. If you want the diagrams and full descriptions--

ABDEL SGHIOUAR: We'll make sure to have a link for it in the notes.

JOHN HOWARD: I will test my explanation skills with no pictures and less time.


But, yes, if you look at the pictures, you see-- and you draw a line of where the unencrypted is and where the encrypted is, with Ambient, the line is longer than with sidecars. And with sidecars, the line of unencrypted traffic is also longer than doing TLS directly in your application.

ABDEL SGHIOUAR: Of course. Yeah.

JOHN HOWARD: And so if you look at it, you say, that's not secure. I don't like that. I don't want this.

But if you start to actually look at the threat model, they start to look a lot more similar. Our goal is not to say that this line is shorter. Right? Our goal is to say that our application is secure. Not even the application, but maybe your entire architecture is secure.

And so if we look at how someone can exploit this longer segment of unencrypted traffic in the three different modes, they end up being quite similar. So you may think, even if you're doing TLS directly in the application, there's no way an attacker can somehow man-in-the-middle that and see my unencrypted traffic. That's actually not true.

If they have high permissions on the node, which an attacker may get if they escalate some container breakout or other means of getting root on the node, they can actually look at the pod's memory. Or, using eBPF, it's actually quite easy to look at the unencrypted data that is being sent through the OpenSSL library, for example. It's really-- it's one line of code in eBPF with some tools. And you can see all the unencrypted traffic coming directly from the application. If you have root on the node, your node is compromised.

ABDEL SGHIOUAR: Yeah, of course. Yeah.

JOHN HOWARD: So everything's kind of out the window at that point. That's why, one, you should protect your nodes quite a bit, and two, you should make sure that you don't have lateral movement from one node to another, which I mentioned earlier.


JOHN HOWARD: So it's quite similar for the sidecar and Ambient. If you have permission to go do a TCP dump inside of the pod network namespace to see the unencrypted traffic between the application and the sidecar, then you have root on the node anyways. And it's the same with Ambient. If you have that permission to inspect the traffic, once again, you have root on the node. You could do all sorts of things like just remove the sidecar, remove Ambient, sniff with eBPF. Right?


JOHN HOWARD: So at the end of the day, it does feel worse, but it's largely, I think for most people, essentially equivalent from a security point of view.

ABDEL SGHIOUAR: Yeah. And as you said, it also always depends on the threat model and what you're trying to protect against. But I like the fact that you mentioned that-- it sounds like eBPF is a double-edged sword. I mean, you could rely on it to do a lot of things, but then you have privileged access, and then you have this Pandora's box of observability that you can just exploit.

KEITH MATTIX: I mean, it's just the kernel.

JOHN HOWARD: Yeah. I'm sure there's ways to do it without eBPF. It just makes it very convenient.



ABDEL SGHIOUAR: Yes. So you folks have been to KubeCon before, and you know every year, KubeCon has one large theme. There is always something which is the thing.


ABDEL SGHIOUAR: Whether it's cost optimization. That was the thing from Amsterdam this year. eBPF was the Valencia thing last year. Everybody was talking about eBPF.

And to me, it always sounds like when you have a hammer, everything looks like a nail. [LAUGHS] It's a pretty fascinating space to be in, to be honest. Let's not dig ourselves a hole there, and just go further and talk about-- so we talked quite a lot about Ztunnel, which is the layer 4 stuff. Now the waypoint proxy. So that's the namespace multi-tenant shared proxy for everything layer 7.

I guess I have a more precise question. Is today-- because obviously, Ambient mesh is still experimental. You have to create it. You have to use the CLI to create the waypoint proxy. Are we going to be in a space where, if you just have a policy, the waypoint proxy is spun up automatically for you? Is that the goal?

KEITH MATTIX: I wouldn't say that's the goal. I do think we plan to make enhancements for deploying waypoint proxies. One notable thing we plan to do is add a way to do it via Helm, for example; that would be the low-hanging fruit. But the key with waypoint proxies-- and you'll see this throughout the Kubernetes community and ecosystem in general-- is that we try to take a persona-based approach to designing these components and architecture.

So when you look at who is going to be deploying the waypoint, there are a couple of options. One, it could be an application developer who has autonomy over their namespace, and they say, hey, I want to be able to do L7 policy here. And so they will deploy a waypoint proxy. That makes sense. It seems relatively intuitive.


KEITH MATTIX: On the flip side of that, you might say, OK, this is a very top-down, enterprise-y type business. And so you might have a cluster admin who says, OK, everybody can only use the Ztunnel, unless we specifically deploy waypoint proxies inside this namespace ahead of time. In both of those cases, you want the declaration of intent from the user to create a waypoint proxy to be separate from applying the policy.


KEITH MATTIX: Because, in the application developer case, if creating an authorization policy for L7 means a waypoint proxy gets spun up, then somebody who just has authorization policy creation access can now cause a new pod to be spun up, which takes extra resources, et cetera. And it also changes a bit how the traffic flows in Ambient from the Ztunnel to the waypoint.


KEITH MATTIX: So it doesn't leave you with really snug feelings when we think about security. On the other side, if it's a cluster admin, similarly, you don't want an application developer being able to override the intent of the cluster admin. So typically with declarative systems like Kubernetes, you want the declaration to be very specific.

And so it's a trade-off between explicitness and making somebody say exactly what they want to do, and ease and things being simple. And so while it might be easier maybe for somebody to just have an authorization policy automatically spin up a waypoint, that assumes that the person creating the policy has a positive intent, and that's not an assumption that we try to make. John, you have anything else you wanted to add to that?

JOHN HOWARD: It's maybe only tangentially related to this, but I think an important thing about waypoint proxies is that you don't necessarily need them everywhere. In the service mesh model today, if you really want to get value out of service mesh, the first step you need to do is onboard your entire cluster to service mesh, more or less. You don't have to, but it's really-- if you want to go and say, hey, I want to enforce that all traffic is MTLS encrypted, well, first you have to go make all the pods have the sidecar. Right?


JOHN HOWARD: If you want to apply a canary policy, even though as a service provider, I'm the one writing the routing rule that says to send 1% of traffic to my new experimental version, that policy is actually applied on all the clients. So I need to go tell all my clients to start using the sidecar first. Right?


JOHN HOWARD: With Ambient, it's not really the case. Step 1 would be you get Ztunnel everywhere, which is extremely noninvasive. And one of our goals is compatibility, which is not strictly a goal with the sidecar-based approach in Istio. We don't modify the HTTP requests, we don't do load balancing.

It's transparent. We do the original source spoofing that we talked about earlier. Yes, you still do have to do that. But it's much easier, and we expect that it could be like checkbox, or something done at cluster creation.

And then later, as you want, you can go to one namespace and say, hey, look. This application, I really need to do a canary here. It's only this one. I'm going to deploy waypoint just for this one specifically. And it's a very intentional action.

I don't think that most users, unless they're all-in on service mesh and have experts that own everything, need a waypoint everywhere. Right? You may just need it for a few applications where you really need the HTTP functionality. Whether that's WASM plugins, canary routing, telemetry, whatever they want. So that's part of why I also think the explicit enablement of waypoints is quite nice, because it's really a very different world with Ambient. It's not like you need a waypoint everywhere.
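John's two steps can be sketched as Kubernetes resources. This is an illustrative sketch against the ambient builds of late 2023; the namespace and waypoint names are made up, and the exact shape may change while ambient is experimental:

```yaml
# Step 1: enroll a namespace in ambient. Pods here get Ztunnel (L4) handling,
# with no sidecar injection and no pod restarts.
apiVersion: v1
kind: Namespace
metadata:
  name: demo
  labels:
    istio.io/dataplane-mode: ambient
---
# Step 2, only where L7 is needed: an explicit waypoint. istioctl generates
# roughly this Gateway API resource; Istio manages the HBONE listener details.
apiVersion: gateway.networking.k8s.io/v1beta1
kind: Gateway
metadata:
  name: demo-waypoint
  namespace: demo
spec:
  gatewayClassName: istio-waypoint
  listeners:
  - name: mesh
    port: 15008
    protocol: HBONE
```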

ABDEL SGHIOUAR: That's actually a very good way of putting it, because I haven't thought about the explicitness versus the ease of use. This means that if I drop a layer 7 policy, whether it's a routing rule, authorization policy, or whatever, if the waypoint proxy is not there, obviously, that rule would not be enforced. Right?

JOHN HOWARD: That is true. Yes.


JOHN HOWARD: Which is obviously an issue. And we are trying to find good ways to signal that to users. So if anyone has suggestions, feel free to send them our way.

ABDEL SGHIOUAR: I think I do, which will lead to my next question, which is about the observability part. One of the strongest things with Istio with the sidecar model is the observability: being able to collect fine-grained telemetry around what's going on in the mesh. And I did, obviously, test Ambient mesh with Kiali. It just works out of the box. You get the graph, you can see what's going on.

So in Ambient mesh, what are the telemetry points? Because they're not the sidecars anymore. So the collection points, what are they?

JOHN HOWARD: We have basically the Ztunnel and the waypoints. They can both collect telemetry. Now, Ztunnel is only operating at L4 or the TCP layer, so you're not going to get HTTP request metrics like you would with sidecar today. But we can get connections opened and closed, and bytes sent, things like that.


JOHN HOWARD: And then the full, I've got 503 error, all these HTTP-level things, that's at the waypoint level. So a lot of the HTTP metrics, very, very nice. If you want a service graph to see what your dependencies are between your services, the TCP-level metrics work fine as well. So that's why you saw on the Kiali graph, it's very easy to then just get an understanding of what's talking to what, which is very powerful, even if you can't see the fine-grained HTTP metrics.

ABDEL SGHIOUAR: Well, I do see that there are a bunch of policies deployed in a bunch of namespaces. But if there are no waypoint proxies to enforce them, then maybe just raise a metric to say, hey, look at this.

JOHN HOWARD: Oh, yeah, yeah. Absolutely.

KEITH MATTIX: John's actually got a proposal for some control plane telemetry enhancements that I'm pretty excited about, to help surface those potential misconfigurations or improvements to users when you get into those situations.

JOHN HOWARD: Yeah. We want to do things like let you know if you're using deprecated features, or experimental features, or things like that as well. We haven't really changed the metrics on the control plane for many years, so there's quite a bit of improvements we're looking to make there.

ABDEL SGHIOUAR: That's exciting. I think control plane metrics are also becoming a big thing in Kubernetes itself. So I think if Istio follows the same model, it's pretty cool. It would be pretty cool.

So then I just want to quickly wrap it up. We talked about Ztunnel and the waypoint proxy quite a lot. So let's say I have two pods on two different nodes, I have an L7 policy, a waypoint proxy, and the Ztunnel. What's the end-to-end flow? What does it look like?

KEITH MATTIX: Are both of the-- client and server, they're both in the mesh? They both have Ztunnels?

ABDEL SGHIOUAR: Both in the mesh, yes.

KEITH MATTIX: OK. So the way that the traffic is going to flow is that your client is going to send a request. In typical fashion, it's going to try to access the server via the Kubernetes service.

The redirect rules that the Istio CNI programs on the host are going to send that traffic to the Ztunnel.


KEITH MATTIX: The Ztunnel is going to apply any policies at that level there, and then it's going to do a lookup and see, OK, where is this traffic destined for?


KEITH MATTIX: OK, it's in the mesh. Oh, it's got a waypoint deployed. And then it's going to send the traffic to the waypoint proxy specifically.

From there, the waypoint proxy is going to apply the L7 policies that are deployed. That's where you're going to get another telemetry point for the L7. And then once the waypoint proxy applies the policies, parses the HTTP traffic, it's then going to send the traffic to the server pod.

ABDEL SGHIOUAR: The receiving end.

KEITH MATTIX: The receiving end. The redirection rules on the server pod are going to send that traffic to the server Ztunnel, and then it's going to proxy that back to the pod. And everything from the client Ztunnel to the server Ztunnel is HBONE encrypted. I think that's right.

ABDEL SGHIOUAR: So basically, pod, client Ztunnel, server Ztunnel, waypoint proxy, receiving pod.

KEITH MATTIX: Close. Pod, client Ztunnel, server waypoint, server Ztunnel.

ABDEL SGHIOUAR: Oh. So if there is a waypoint proxy, the traffic doesn't go through the Ztunnel on the receiving end?

KEITH MATTIX: It does, but after the waypoint.

ABDEL SGHIOUAR: Oh, interesting.

JOHN HOWARD: Yeah. The client Ztunnel is smart enough to know that the request is going to need to go through the waypoint. So we just go directly there.

ABDEL SGHIOUAR: Ah. So the waypoint is able to unpack MTLS, then, in this case?

JOHN HOWARD: Yes. The waypoint will terminate the MTLS, inspect the inner request, and do whatever it needs through the HTTP parsing and whatnot, and then re-encrypt it and send it on to the destination.
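The flow the three just traced can be condensed into a toy model. This is an illustrative Python sketch of the client-side decision only, not real Ztunnel code (Ztunnel is written in Rust), and all the names are made up:

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Workload:
    name: str
    node: str
    in_mesh: bool
    waypoint: Optional[str] = None  # name of the destination's waypoint, if any

def next_hops(src: Workload, dst: Workload) -> List[str]:
    """Toy model of the hops a client-side Ztunnel chooses for src -> dst.

    In the real system the client Ztunnel also applies L4 policy for src here,
    and hops between Ztunnels/waypoints travel over HBONE (mTLS on port 15008).
    """
    if not dst.in_mesh:
        return ["direct"]  # plain passthrough for unmeshed destinations
    hops: List[str] = []
    if dst.waypoint:
        # Client Ztunnel sends straight to the destination's waypoint,
        # which terminates mTLS, applies L7 policy, and re-encrypts.
        hops.append(f"waypoint:{dst.waypoint}")
    hops.append(f"ztunnel:{dst.node}")  # server-side Ztunnel on the dest node
    hops.append(f"pod:{dst.name}")      # finally, the receiving pod
    return hops
```

For example, with a waypoint in the picture the path is waypoint first, then the server-side Ztunnel, matching Keith's correction above; without one, it's Ztunnel to Ztunnel directly.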

ABDEL SGHIOUAR: OK. I guess that's-- if there are no diagrams in the documentation already that explain this flow, we need something badly. Because I was under the impression that the traffic would always go through the Ztunnel first, regardless of whether there is a waypoint proxy or not.

JOHN HOWARD: Yes. There's many open PRs and issues to add docs. So the docs are quite sparse now. But, yeah, we will get there. [LAUGHS]

KEITH MATTIX: We are working on it.

ABDEL SGHIOUAR: If somebody has a particular interest in this--


ABDEL SGHIOUAR: --and are listening to this episode, just please help. OK, cool. So that's good. Then let's wrap it back to the native sidecars. And I know that you covered this in your article, John, but how do you see this impacting Ambient mesh, if at all?

JOHN HOWARD: In Ambient mesh, there are no sidecars, per se. So it doesn't hurt it, but it doesn't help. It's completely orthogonal. However, it's not like we're making a new product. Istio sidecars and Ambient mesh interoperate. Right?


JOHN HOWARD: So we don't expect that sidecars disappear one day and everyone's on this new model. Maybe in the distant future. But for quite a while there will still be sidecars, and so those users that are using sidecars, with or without Ambient, will still get the benefits of native sidecar containers.


JOHN HOWARD: And even with Ambient, they may still use sidecars for other reasons like logging, or other non-service mesh use cases. So it's still a win, even though we have alternatives now.

ABDEL SGHIOUAR: I liked the fact that you mentioned something which is the interoperability part, which is essentially going to allow people to migrate off sidecars to Ambient mesh in a gradual way, I guess. Is that the goal?

JOHN HOWARD: That's the plan. Yeah. So we would have a migration from sidecars, where you remove sidecars and onboard to Ztunnel and Ambient, or even potentially the ability to have sidecars intermixing long-term in the same mesh.

ABDEL SGHIOUAR: Oh, interesting. OK. I didn't think of that. OK. Then sidecars are going to stick around.

JOHN HOWARD: Yeah, maybe, unless it turns out that no one uses them. I don't know. We haven't actually got widespread production usage of Ambient or anything, so it's hard to make forward-looking statements. It could be that we see that everyone wants to keep using sidecars. We could see that no one does. And then we'll reevaluate as we get more feedback.

ABDEL SGHIOUAR: That's a very valid point. I think there are still technically no real production benchmarks for Ambient mesh, because it's still experimental, technically. So I wanted to touch on something very quick, the gateway API. And the reason I want to touch on this-- I know that this was already on the list of questions, but Keith, you mentioned something in the persona model for Ambient mesh. That's kind of inspired by the gateway API, because the gateway API is persona-based.


ABDEL SGHIOUAR: But I think what's interesting is that you folks are adopting the gateway API as a way to manage Istio traffic--


ABDEL SGHIOUAR: --compared to the Istio-specific CRDs like virtual service and destination rule, blah, blah. Right? Is that a fair way to describe it? Can you shed some light on this?

KEITH MATTIX: Yeah. For those of you who might not know, around July of last year the gateway API project started an initiative to take the HTTP route, TCP route, et cetera-- all these different resources and constructs that were typically used for north-south traffic in your traditional ingress gateways-- and bring those same constructs and ideas over to the mesh world and see if there is compatibility. That is what we now call the GAMMA initiative. And John and I are two of the original leads of that initiative, and we're still active in the gateway API today.

And so Istio has experimental support for the GAMMA spec, the service mesh spec within the gateway API. And so users have two choices, depending on what standards body they want to adhere to: either the Istio APIs that have been around for a while-- virtual service, destination rule, you named a lot of them-- or the HTTP route, TCP route, et cetera from the gateway API. And the choice is really going to depend on your use case.

Virtual service and destination rule are a lot more feature-rich. They do a lot more things, and you have more flexibility, especially in sidecar mode, of where you put the virtual service, for example, and how they operate. With the gateway API, the HTTP route, et cetera, today binds to a service, and you can do request/response modification, traffic splitting, and some of those canonical use cases for doing traffic management in Kubernetes.

The choice really just comes down to what your needs are and what you see yourself adopting in the long term. Obviously, as time goes on, we'll add more and more features and capabilities to the gateway API for both ingress and mesh. But if users have a need now, then at this point we still intend to support virtual service and the existing APIs for Ambient and Istio in general.
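The GAMMA pattern Keith describes, where a route binds to a Service rather than to an ingress gateway, looks roughly like this. A sketch with hypothetical names, using the weighted-backend form for the canary-style traffic split mentioned earlier:

```yaml
# Illustrative GAMMA sketch: an HTTPRoute attached to a Service (east-west),
# splitting traffic 99/1 between two backend versions. Names are made up.
apiVersion: gateway.networking.k8s.io/v1beta1
kind: HTTPRoute
metadata:
  name: server-canary
  namespace: demo
spec:
  parentRefs:
  - group: ""          # core API group: binding to a Service, per GAMMA
    kind: Service
    name: server
    port: 80
  rules:
  - backendRefs:
    - name: server-v1
      port: 80
      weight: 99
    - name: server-v2
      port: 80
      weight: 1        # the 1% canary from John's earlier example
```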

ABDEL SGHIOUAR: I am personally very excited about how the gateway API utilization within the Istio project is going to evolve over time. I think one of the main reasons is because everybody basically needs fewer CRDs in general. Just generally speaking, in Kubernetes, we don't need to deploy extra stuff. And if the gateway API becomes standard in Kubernetes out of the box, then I see very good value in using it for Istio traffic management as well.

JOHN HOWARD: One thing I would say is, for some other projects that are either using ingress, which is not the most flexible API, or don't even have an API, the gateway API is a very obvious value-add. But even for Istio, I see people saying, hey, virtual service and an HTTP route, those are kind of the same thing. Why do I care? I think there's actually quite a big improvement, for two reasons. One is, like you said, everyone's using it. So all the documentation, blogs, Helm charts, integrations-- cert-manager, external DNS, et cetera-- are, or will be, gateway API-based. So there's one ecosystem.


JOHN HOWARD: And two, while the APIs look similar, the gateway API is built on five years of mistakes that Istio made, as well as other projects. And so it's similar, but quite a bit better in subtle and important ways that, while they may not be apparent when you're doing the "Hello, World" demo, will be quite apparent once you're in production and you have thousands of routes, and you're trying to understand how they interact and work. So I think it's more important and useful than it may seem at first glance. So I'm super excited about it as well.

ABDEL SGHIOUAR: And I think also I have to say this, that although the CRDs in Istio seem like they're written for English audiences, they are still, for some reason, confusing. If you're sitting in front of an Istio cluster and trying to understand, OK, so it's a virtual service, it's English, but still very confusing English. So I prefer the terminology of the gateway API, personally. But that's just personal preference.

KEITH MATTIX: See, my favorite is-- when I was learning Istio the first time, my favorite was destination rule.


KEITH MATTIX: It's named "destination rule," but oftentimes, it affects how the client operates.


KEITH MATTIX: It was a fun little learning there.

ABDEL SGHIOUAR: Yeah, exactly. I don't want to use more time than you have already generously given me, so I just want to close out on this. Istio is now a CNCF graduated project. Any impacts on the project and the community? You guys have been involved for a while.

JOHN HOWARD: I think that's why Keith is on this call. Right?


ABDEL SGHIOUAR: I am not aware of the history there.

JOHN HOWARD: I guess that's not the graduation. That's the donation.


JOHN HOWARD: Yeah. The graduation-- I don't know. I think it's a nice checkbox, but it's not like the graduation means that Istio is now mature or something. It's more of a formal recognition that we met a list of checkboxes. And it's not something new. Istio has been mature for quite a while.


JOHN HOWARD: So it's really just a recognition of what Istio was, not a promotion of-- or a real change to Istio. So it's nice to have. It's not a huge deal, from my point of view.

ABDEL SGHIOUAR: I have to admit-- I mean, Istio has been mature since way before it was even donated to the CNCF. So I guess the donation part was probably more important, in terms of having other people be involved. Right?

JOHN HOWARD: Yeah, exactly.

ABDEL SGHIOUAR: Did you see more involvement from other companies because of the donation to the CNCF?

KEITH MATTIX: Like John said, that's one reason why--

ABDEL SGHIOUAR: That's why you're on--

KEITH MATTIX: --I'm on this call. So, yeah, the donation to the CNCF has, I think, really been a signal to lots of companies and people across the ecosystem that the Istio project is a place that truly does emphasize and care about collaboration and open governance. And the fact that I'm, at Microsoft, able to be a part of the project and work together and innovate in areas like Ambient mesh and things of that nature, I think it's a really big sign.

And I think that other projects who come in the future will be able to see the benefits of being a part of a foundation like the CNCF, or just in general, the power that open communities can bring to an ecosystem and to an industry. So for me personally, of course the donation has been impactful. As far as the graduation, I think, kind of like John said, it's simply an acknowledgment of the production quality and stability and maturity that Istio has had for years.


KEITH MATTIX: Looking forward to seeing where it continues to grow and to reach a new segment of users.


JOHN HOWARD: Yeah. Last time at KubeCon, the graduated projects got cupcakes. So that's pretty big. In Chicago, we may be eating Istio cupcakes, or maybe even cookies. I don't know.

KEITH MATTIX: We need a mascot. We need an animal. We're a graduated project. Now we need to figure out-- like, Istio iguanas, or whatever. [LAUGHS] It can be any animal.

ABDEL SGHIOUAR: You also get a bigger booth. Graduated projects technically get a bigger booth than the incubated ones, so--


ABDEL SGHIOUAR: --that's a plus 1. [LAUGHS]

KEITH MATTIX: Well, that's enough. It's enough for our booth duty.

ABDEL SGHIOUAR: They typically have booths for different projects, depending on the maturity level. So, yeah, it's pretty cool. Well, this was a really good conversation, folks. I'm really happy I had you.

I'm really happy I managed to get John, because we talked about this back in Amsterdam, and I was thinking he was just saying, yeah, yeah, whatever. But, yeah. That's good.


So thanks for being on the call, John and Keith. And we'll maybe do another checkup with Istio in five years from now. We'll see.

JOHN HOWARD: Yeah. Sounds good. We'll still be talking about the same stuff, probably.



ABDEL SGHIOUAR: All right. Thanks, folks.

JOHN HOWARD: All right. Thanks.

KEITH MATTIX: Thank you.


KASLIN FIELDS: That brings us to the end of another episode. If you enjoyed the show, please help us spread the word and tell a friend. If you have any feedback for us, you can find us on Twitter at @KubernetesPod, or reach us by email at <kubernetespodcast@google.com>.

You can also check out the website at kubernetespodcast.com, where you'll find transcripts, and show notes, and links to subscribe. Please consider rating us in your podcast player so we can help more people find and enjoy the show. Thanks for listening, and we'll see you next time.