#15 August 7, 2018

Istio, with Jasmine Jaksic and Dan Ciruli

Hosts: Craig Box, Adam Glick

Istio has hit 1.0, and there’s no-one better to tell you about it than Jasmine Jaksic and Dan Ciruli from Google Cloud. Adam and Craig bring you this, as well as the news from the ecosystem.


News of the week

ADAM GLICK: Hi, and welcome to the "Kubernetes Podcast" from Google. I'm Adam Glick.

CRAIG BOX: And I'm Craig Box.


Welcome to another week. This is a week that I've spent predominantly on vacation. And Adam, how's your time been spent?

ADAM GLICK: My time has been spent catching up from all the things that we got through in Next and all the follow-up behind that, catching up with some customers, and most importantly, discovering a wonderful series on Netflix called "Fullmetal Alchemist." And I realize I'm coming to this decades late, but it is a fantastic series. I'm one season into it so far, and totally engaging. If anyone out there loves anime and has not seen this, it's seriously something to put into your queue.

CRAIG BOX: Well, unfortunately, we don't have the kind of holidays that let you spend a lot of time watching television. My week has been mountains, marmots, and meetups here in Colorado. I presented what I believe to be the first Meetup talk on Knative last week. And since then, I've been driving around Colorado with my partner, and we've had a fantastic time climbing mountains, driving up mountains, and getting out and climbing the last 100 feet or so. It's not so much a mountain climbing thing. And enjoying a lot of wildlife.

So if any of you follow me on Twitter, you'll see some pictures of some mountain things. I should put moose in there as well. That was a very important part. We didn't find moose early in the trip, so we had to go back to the park. And yesterday indeed, we saw a good seven or eight of them. So fantastic time.

ADAM GLICK: Cool. You want to get to the news?

CRAIG BOX: Last October, Docker announced that they would be integrating Kubernetes into their Desktop and Enterprise products. Support for Kubernetes entered the Docker Edge channel in January and the Enterprise platform in April. It has now graduated to the desktop distribution, which means any user running the stable version of Docker Desktop can develop locally on certified Kubernetes. Docker says that automatic upgrades are rolling out now.

ADAM GLICK: Project Harbor, an open-source container registry, has joined the CNCF at the sandbox level. Harbor is a cloud-native registry that stores, signs, and scans content. It was originally created by VMware and extends Docker registry functionality, adding security, identity, and management functions. Harbor supports replication of images between registries and offers security features, such as vulnerability analysis, role-based access control, and activity auditing.

CRAIG BOX: A trifecta of news items from Microsoft. Azure Kubernetes Service is now available in the Southeast Asia (Singapore) region. Microsoft released an open-source metrics adapter for Azure, allowing you to horizontally scale based on things like Azure Service Bus queues. And CloudBees Core, formerly CloudBees Jenkins Enterprise, is now GA on AKS, joining GKE, Amazon EKS, and OpenShift as supported platforms.

ADAM GLICK: Speaking of OpenShift, Red Hat this week announced Red Hat OpenShift Container Platform 3.10. The primary theme of this release is performance, and it brings commercial support to several features built by Red Hat in the Kubernetes upstream, such as device management for GPUs, the CPU Manager, and support for huge pages. We can't wait to see what's on tap in OpenShift 3.11 for Workgroups.

CRAIG BOX: Hot on the heels of their $8 million financing round, Codefresh, provider of a continuous delivery platform for Kubernetes by the same name, have announced an Enterprise version. Codefresh Enterprise can be run fully on prem, as SaaS, or in a hybrid configuration with a local agent. The Enterprise version offers features such as support for Windows containers, single sign-on authentication, and a layered permissions model for deployment pipelines.

ADAM GLICK: Upside posted an interesting blog post this week about syncing their Kubernetes secrets with LastPass. In this case, they're using LastPass a bit like a key management service and accessing it from their various Kubernetes clusters. While maybe not as enterprise-focused as using a key management service or hardware security module as a way to store secrets, it is an interesting API-driven approach to using a common, broadly adopted consumer technology.

CRAIG BOX: And finally, with the launch of Istio 1.0, Tetrate, a startup founded by former Google Istio product manager Varun Talwar, is starting to come out of stealth. A post by engineer Matt Turner talks about installing Istio on Amazon EKS. While specifically talking about nightly builds, Matt points out the important fact that EKS does not currently support the webhook admission controllers required for automatically adding Istio sidecars to Kubernetes deployments. So at this point, people will have to use manual sidecar injection or create their own clusters with a tool like Kops, with a K.

ADAM GLICK: And that's the news.

Jasmine Jaksic is a technical program manager on the Istio project. Dan Ciruli is a product manager on Istio and works on the managed version of Istio for Google Cloud. Welcome to both of you.

DAN CIRULI: Hey, Adam. Thank you very much. We are really excited to be here.

JASMINE JAKSIC: Hey. Happy to be here.

ADAM GLICK: Congratulations on the 1.0 launch. How would you describe Istio to someone who hasn't heard of it before?

JASMINE JAKSIC: So we all know all the different issues we have with monolithic applications and services. And lately, the trend has been to move closer and closer to microservices, which solves a lot of problems. But what it does not solve is that when you have hundreds or thousands of microservices, it becomes extremely difficult to connect and manage and monitor and secure them. And this is where Istio comes in. You don't need complex libraries. You don't need complicated code or a large team of developers to manage them for you. You get all of those things out of the box.

DAN CIRULI: And I'd say you get those problems far before you're at hundreds or thousands of microservices. If you have 10 pods running in your Kubernetes cluster and they're communicating at all, you need to know what communication is happening. And what Istio does is give you ways to understand the communication between your services, to control how it happens, and to secure it.

CRAIG BOX: Is Istio for developers or for operators?

JASMINE JAKSIC: It is actually for both. And what it does-- it provides abstraction between what a developer does and what an operator does. And what I mean by that is if you're a developer, you can build new services, you can redeploy existing services, and you don't have to worry about operator controls. And similarly, if you're an operator, you can configure, you can make changes, and you don't have to write or change existing code. So that level of abstraction is what is called service mesh. And Google had perfected this way before we came up with Istio. And that's why we wanted to open source it.

CRAIG BOX: Dan, do you remember a couple years ago, we had a customer roundtable day in Toronto that I arranged? A bunch of Google PMs came. We had a chat about this kind of thing.

DAN CIRULI: I remember it like it was yesterday.

CRAIG BOX: Yeah, that's fantastic. Does that mean I'm a co-founder of Istio?

DAN CIRULI: Absolutely, yes. In fact, if you haven't gotten your trophy yet, the co-founders trophy, you should be--

CRAIG BOX: Huh. I didn't know they were issuing those.

DAN CIRULI: Have you been at your cube? It's probably at your cube.

CRAIG BOX: I'll have to go back and check my cube. But it's been a good couple of years now, obviously. I was looking back at the slides we showed that day, and I was surprised, actually, at how much the service [INAUDIBLE] stuff we've just announced for Istio. I'm like, we must have been telegraphing that for quite some time.

DAN CIRULI: The thing that I remember from that day, actually, very clearly is people talking about wanting to make decisions on their Kubernetes clusters based on L7 metrics. I don't want to scale based on CPU consumption. I don't want to scale based on my RAM consumption.

ADAM GLICK: When you're saying L7, you mean like HTTP.

DAN CIRULI: Yeah, things that you only know when you understand the protocol-- for example, how many requests are happening. People want to scale based on things like how many requests per second am I serving right now or what are my latencies looking like right now. Those are the top two things that people want to scale on. And one of the things that Istio does is collect metrics like that for all of your services so that you can plug those back into Kubernetes and scale with them.
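
To make the idea of scaling on requests per second concrete, here is a minimal sketch of the replica calculation that the Kubernetes Horizontal Pod Autoscaler documents: the desired replica count is the current count scaled by the ratio of the observed metric to the target, rounded up, with no change when the ratio is within a small tolerance. The function name, parameter names, and the sample numbers are illustrative assumptions and do not come from the episode; how the per-pod request rate reaches the autoscaler (for example through a metrics adapter) is outside this sketch.

```python
import math


def desired_replicas(current_replicas: int,
                     current_rps_per_pod: float,
                     target_rps_per_pod: float,
                     tolerance: float = 0.1) -> int:
    """Return a replica count based on observed requests per second per pod.

    Mirrors the documented HPA rule:
        desired = ceil(current_replicas * current_metric / target_metric)
    and skips scaling when the ratio is within `tolerance` of 1.0.
    All names and defaults here are hypothetical, chosen for illustration.
    """
    if current_replicas < 1:
        raise ValueError("current_replicas must be at least 1")
    if target_rps_per_pod <= 0:
        raise ValueError("target_rps_per_pod must be positive")

    ratio = current_rps_per_pod / target_rps_per_pod
    if abs(ratio - 1.0) <= tolerance:
        # Close enough to the target: leave the deployment alone.
        return current_replicas

    # Multiply before dividing to keep the arithmetic exact for whole numbers.
    scaled = current_replicas * current_rps_per_pod / target_rps_per_pod
    return max(1, math.ceil(scaled))


if __name__ == "__main__":
    # Four pods each serving 150 requests/second against a 100 requests/second target.
    print(desired_replicas(4, 150, 100))
```

The same shape works for a latency target: only the metric fed into the ratio changes.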

CRAIG BOX: How did we go from the discussions we had on that day to having an open-source product a few months later?

DAN CIRULI: We were thinking about it already. And that was really a time of discovery for us. Because we were going out, and we hadn't chosen a name. We hadn't figured out what the architecture would be. But we were just listening to the problems that people were having.

And what we realized is that the things that Jasmine talked about, those issues that people have with communication-- how you understand what's happening in an application, how you scale it appropriately, how you do a canary rollout-- everyone who was adopting Kubernetes was having those exact problems. And in doing those, in going out and going to meetups and hearing what's hard, Kubernetes is solving a lot of problems for you. It's making it easy to deploy and easy to deploy frequently. That's great.

But now what's happening is it turns out it created a new set of problems. And so we had the same set of problems internally. We had solved it internally with a sidecar proxy that talked to a management plane, a control plane. And we said, hey, it sounds like the world needs that.

CRAIG BOX: There's something in this.

DAN CIRULI: Right. And so we started talking to some people in the industry. Turns out IBM had a project that was doing something very similar.

CRAIG BOX: Only for traffic routing, I understand, though.

DAN CIRULI: Right. It wasn't doing the whole thing. Turns out that Lyft had just open sourced a really fantastic proxy. And we said, I think we can build something that's going to solve these problems for people.

JASMINE JAKSIC: Yeah. And when we realized the amount of overlap we had with Amalgam8, IBM decided to partner with us and shelved the product that they were working on. And like Dan pointed out, we decided to use Lyft's proxy and together build Istio.

ADAM GLICK: Folks from Istio here on the "Kubernetes Podcast"-- and I often hear about Istio in a Kubernetes kind of scenario-- are they part of the same project? Why are they different?

DAN CIRULI: That's a good question. I just got asked that. We considered whether or not Istio should be part of Kubernetes. And the reason it's not is that when we were on that listening tour, when we were going out and talking with people about what we were thinking about building and what it would solve, the overwhelming response we got was, that's great, but I need it not just for Kubernetes. And when we were talking to people then, even people who were doing all their new deployment on Kubernetes, if it was a company that was older than, say, four or five years, they also had plenty running outside Kubernetes. And when they talk about, hey, secure the traffic between my services, or route traffic between my services, or give me a picture of what's going on, they absolutely said this has to include non-Kubernetes stuff. And so that's why, obviously, this pairs very well with Kubernetes. But that's why we said, OK, we'll sit alongside Kubernetes.

ADAM GLICK: In the context of Kubernetes, it's not the only service mesh that's available for people to use with Kubernetes. What makes this one different?

DAN CIRULI: I think the big thing is that we've tried to get a lot of people in the industry behind this. We have worked very hard. IBM, as Jasmine said, has been there from the very beginning. Now VMware has joined. Cisco has joined with us. Red Hat has joined with us.

And the big difference here, I think, is that we've got a lot of people saying, hey, this isn't my solution. This is our solution. And they're contributing. They're helping us make design decisions. I'm sure there will be lots of commercial implementations. I think that's a really positive thing. And I think that's what differentiates it from everything that comes from a single company out there.

JASMINE JAKSIC: And it's interesting. Because when we started back in 2016, it was a relatively small group of developers from IBM and Google mostly working on this. And we used to have one-hour meetings once a week where we would go through all the bugs and feature requests and triage them. And that was all the time we needed.

But that all changed once we launched 0.1 back in May. Because up until that point, we didn't have a community to speak of, mostly because we hadn't launched anything. But after May, there was a huge adoption across the board. And like Dan pointed out, now we have a thriving community, both with users and developers and partners, and we have hundreds of people who have been eager to contribute to this product. And we've been working all together very closely to make 1.0 happen.

CRAIG BOX: So why 1.0, and why now?

DAN CIRULI: Well, I think the biggest thing is that it's ready for production usage. And I'm not saying that because we've finally passed our tests and everything is fine. I'm saying that because people are using it in production today. eBay was-- I was just in a conversation with them. They're using [INAUDIBLE] for services. The Weather Company is running a large installation, 400,000 requests a second. We have a bunch of customers, over a dozen that we know of.

Now, being open source, of course, we don't know most of the people who are using it, which is interesting. But we know of a bunch of companies that are in production using this today. Auto Trader in the UK has said that it has been instrumental in their move to the cloud. So the biggest thing is we know it's ready for production workloads.

And we've also had a lot of people tell us, hey, we're not going to use that until it's 1.0. Right now, we've got some early adopters, people who are courageous, people who worked through some problems. I want to be honest-- of course they found problems. There are a lot more people who won't use a piece of software until it's 1.0 or 1.1. And at this stage, we need to learn more, and we need it to be out there. So we know that it's working. We know that it's working for a bunch of use cases. And we want to encourage adoption there.

CRAIG BOX: Is it finished?

JASMINE JAKSIC: We have spent well over a year working on 1.0, and there are a lot of things that are coming out with this launch-- so as an example, new features. If you think about service-to-service incremental mTLS, that is a very powerful feature to be able to have mTLS traffic and HTTPS traffic and not have to worry about migrating it all overnight.

CRAIG BOX: I have been waiting for that for some time.

JASMINE JAKSIC: Exactly. And there are a lot of other people who've been waiting for that as well. So that will be part of 1.0.

We've also spent a lot of time hardening the existing features. And what I mean by that is a lot of features currently that are in alpha would be moving to beta-- not all of them. I know that would be ideal. But it's going to happen iteratively. So we picked all the key features that the community really wanted and made sure that they're moving to beta.

We also spent a lot of time on performance and scalability. So when we did the testing last time, we realized that the overhead for adding a sidecar is less than 1 millisecond now, which is phenomenal compared to where we started. And if you think about scalability, you can actually increase it to 1,000 or even 1,500 QPS per CPU, and it handles them really, really well. We also improved documentation. We are improving the site. So there's a lot of effort that went into making this happen.
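
As a back-of-the-envelope reading of those figures, the sketch below estimates how many CPU cores a given request rate would need at a stated throughput per core, and how much latency sidecars might add along a call path. It assumes the "QPS per CPU" figure means sustained requests per second per core and that the roughly 1 millisecond overhead applies once per sidecar-equipped hop; both are interpretations, not claims from the episode. The function names and sample workload numbers are hypothetical.

```python
import math


def cores_needed(total_qps: float, qps_per_core: float = 1000.0) -> int:
    """Estimate CPU cores required to handle `total_qps`.

    Assumes throughput scales linearly with cores, which is a
    simplification; real capacity planning should be measured.
    """
    if total_qps < 0:
        raise ValueError("total_qps must not be negative")
    if qps_per_core <= 0:
        raise ValueError("qps_per_core must be positive")
    return math.ceil(total_qps / qps_per_core)


def added_latency_ms(hops: int, overhead_ms_per_hop: float = 1.0) -> float:
    """Upper-bound the latency added when each hop pays the sidecar overhead."""
    if hops < 0:
        raise ValueError("hops must not be negative")
    return hops * overhead_ms_per_hop


if __name__ == "__main__":
    # Hypothetical workload: 2,500 requests/second, three service hops per request.
    print(cores_needed(2500, qps_per_core=1000))
    print(cores_needed(2500, qps_per_core=1500))
    print(added_latency_ms(3))
```

These are planning estimates only; a load test against the actual services is the reliable way to size a mesh.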

DAN CIRULI: Now, is it done? Of course not. But the fun thing is of course it's not done. We have a roadmap. We have features that were in process. There are features that will still be in alpha that we're still developing.

But what's exciting at this point is we don't know what we don't know. And we know of a dozen-ish people using it in production. It's about to get a much broader adoption. And as soon as we do, we're going to learn a lot about how it's used. And we're going to learn. When it falls over, we're going to learn what features people really need to take it to the next step. So no, of course it's not done.

CRAIG BOX: You've talked about the low overhead of installing Istio. Should everyone install Istio on their cluster?

DAN CIRULI: I love that question, and the answer is yes. Istio needs to get to the point that everyone who's using Kubernetes considers it a no-brainer. They should view it as, of course, I'm going to turn it on. Unless you're running batch workloads completely, you're going to want to monitor the traffic on that. Istio has to become the easiest standard way to do that. So that is absolutely my goal: that 90-something percent of the people who turn up a Kubernetes cluster decide to turn up Istio.

JASMINE JAKSIC: And I would also add to that and say if you've been waiting on the sidelines, waiting for a time to use Istio, that is absolutely now.

DAN CIRULI: That time is now.



CRAIG BOX: And the place is here.

DAN CIRULI: I love it.

CRAIG BOX: Istio is also a component of the new Knative serverless ecosystem. Can you speak a little bit about that?

DAN CIRULI: Yeah. Knative is the open-source serverless implementation. We talked to them very early on, and they've actually-- there's been some pushback in the community. Hey, should you take a dependency on Istio? Absolutely there's been some pushback.

Well, here's the deal. When you're running serverless functions, when you're running these functions, you need something to route traffic to them. And unless you're crazy, you need something to monitor what's going on there. And in all likelihood, you want something to secure them. And so the Knative team had a choice to make-- release without those features and say you can add those later if you want them, build all that themselves, or use Istio for that. And they chose the latter.

As I said before, it's the Istio team's job to make adopting Istio easy and, in fact, transparent for a use case like that. And so that's our goal. If it seems too complicated, then it's on the Istio team to make that easier.

But yeah, so Knative requires Kubernetes. It requires Istio. And that may seem like an odd choice right now. But a year from now, I hope you look back and say, well, of course.

ADAM GLICK: So what was announced at Google Cloud Next?

JASMINE JAKSIC: There were a lot of cool announcements that happened at Next that were related to Istio. I will start with 1.0. And obviously, like I said, this is something that we have all been working on for a really long time. And it has a ton of cool features, and it's something that all the customers and partners have been waiting on the sidelines for. So yeah, that's one big thing.

We also have a managed Istio that is going to launch to alpha. And I don't know if any of you got a chance to, as an example, watch the breakout session that the Stackdriver team had with Descartes Labs. And Tim did this really cool demo where he showed all the different services that he had by way of a service graph. And you could see the entire topology and understand the traffic that flows between the different services. You could click on every single service, understand the latency between them, and drill it all the way down to traces. And if you are an SRE, to be able to go back and see what changed over a period of time is a very powerful thing. And that's only a very small piece of what is going to come as part of CSP.

DAN CIRULI: And CSP is the Cloud Services Platform. It is a bunch of projects that we've been working on that we're putting together. Istio, managed Istio is one of the big components there. But probably the main component of the CSP is GKE On-Prem. And what our goal is there is to allow you to adopt a single architecture, whether it's running in your own data centers or running in ours, use a common control plane to control the deployment, to control the administration of those, as well as with Istio, the management of the traffic between those services and into all those services.

CRAIG BOX: And if you're interested in learning more about CSP, you can listen to episode 13 with Aparna Sinha, where we talked in-depth about Google's new Cloud Services Platform.

ADAM GLICK: My favorite episode so far.

DAN CIRULI: Policy is exciting. [LAUGHS]

CRAIG BOX: Now that we've hit 1.0, is there anything we're looking to change about the governance of the Istio project?

DAN CIRULI: Well, we've been rapidly accepting contributions from some big players in the industry, as I mentioned before. And we are looking to get them involved in many ways, one of which is we'll happily take those pull requests. And we think that's great. More developers means faster progress. But we also want to have them contribute to governing the project.

And so one of the tasks on the steering committee is to really get serious about defining exactly who sits on the steering committee and how does it relate to our strategy as a project and get those people involved. As I said before, we don't want this to be perceived as a Google project. This is an industry project. And so yes, we want to get all of those companies involved so that we all feel some ownership over it.

JASMINE JAKSIC: And I would add to that and say yes, there is a steering committee, and there's a technical oversight committee, but also, all the different areas have their own working groups. So if you are, let's say, a security expert, and you have very specific ideas about what you can build and how you want to build it, and you want to have an in-depth discussion with the developers at Istio, a great way to get engaged would be to start attending the weekly and bi-weekly meetings they have. And you can do demos, and you can really be part of the team that builds the next version of Istio.

ADAM GLICK: Speaking of the next versions of Istio, obviously you have a roadmap. What's coming up on the roadmap, and what features in that are you most excited about?

JASMINE JAKSIC: So now that 1.0, like you said, is out, we want to spend more time understanding what the features are that we currently have in alpha that did not make it to beta that our users want in beta soon, and making that happen. We also want to spend more time on hybrid environments. So one of the things that you saw happen in 0.8, which is the previous long-term supported release, was multi-cluster support. So that was our first foray into supporting hybrid environments. And what we would like to do going forward is expand on that. And sometime in the near future, you might be able to use Istio on prem, Istio on GCP, or for that matter, anywhere else, and be able to pull them all into a single mesh and manage them and secure them very seamlessly. So that's something that we would be working on.

We also want to invest in API management capabilities. So be it API analytics or gRPC transcoding, there are a whole host of features that people are excited to have. And that is something that we would be spending time in the future working on and building. And you are all obviously welcome to participate and contribute to it. After all, this is an open-source project. So we rely on the listeners, and obviously all the developers, to help us build this.

DAN CIRULI: I'm glad you brought up API management. That was something I was going to say. And the other thing that I personally feel strongly about is facilitating incremental adoption of Istio. One of the things that we've seen in people who have done successful implementations is start by not implementing all of the pieces and not getting all the functionality. If all you need is routing at the edge, you don't need to install Mixer, for example. And some people aren't using sidecars. Some people are. Certainly you get fewer features if you don't. But we can actually make it easier, say, through our installation and the Helm scripts we published to make it easier for people to start small and phase in more components as they decide to take advantage of more of the features. So I want to see us do that work.

JASMINE JAKSIC: And if you haven't already checked out a presentation that Dan did on a la carte and how people can use Istio incrementally, I think you definitely should check it out. It is an awesome presentation.

CRAIG BOX: It's better in person. He gesticulates a lot.

DAN CIRULI: Invite me over to your company, and I'll tell you all about Istio a la carte.

CRAIG BOX: If you do not have the luxury of a visit from Dan Ciruli, how would you recommend a company would get involved with the Istio project?

DAN CIRULI: We decided to put a bunch of information on the World Wide Web.

CRAIG BOX: That's a good place to have information.

DAN CIRULI: We even have-- I don't remember the IP address. However, we do have a domain name.


DAN CIRULI: A URL. So istio.io is really the best place to go. It's got docs. Docs are never good enough. There's never enough. But in this case, we have really worked hard on them.

CRAIG BOX: Thank you.

DAN CIRULI: I promise that you will find deficiencies. PRs accepted. Of course, that's all open source.

And on istio.io, you'll see a link to Community. First of all, there's all kinds of getting started, here's examples you can run through. But there's also a Community page, and that has links to our Rocket.Chat. It has links to the different Google Groups where you can ask user questions. There's another one for if you're doing development and integrating with Istio, as well as the dates and times of the working group sessions.

JASMINE JAKSIC: Right. And if you want to have more real-time discussion with the [INAUDIBLE] developers, you're welcome to participate in the Slack channel. Granted, it is a full-time job. But if you're open to it, you're welcome.

CRAIG BOX: Are the Rocket.Chat and the Slack the same thing?

DAN CIRULI: The Rocket.Chat is for anyone who's thinking about Istio, using Istio, and the Slack is used for people who are developing on Istio. Rocket.Chat is open to anybody.

JASMINE JAKSIC: And also talking about istio.io, one of the cool sections that we will have under Documentation is the Operator's Guide. So this is equivalent to an SRE playbook. So if you try Istio, you have some questions, you want to troubleshoot it, you want to know how to go about doing that, this would be a great place to start. And it will be part of the 1.0 release.

ADAM GLICK: Jasmine, Dan, thank you so much for coming on the show today.

DAN CIRULI: It was our pleasure.

JASMINE JAKSIC: Thank you for having us over. This was fun.

CRAIG BOX: You can find the IP for istio.io and other links to Dan and Jasmine's Twitter pages on our show notes at kubernetespodcast.com.

ADAM GLICK: Thanks for listening. As always, if you enjoyed this show, please help us spread the word and tell a friend. If you have any feedback for us, you can find us on Twitter @kubernetespod or reach us by email at kubernetespodcast@google.com.

CRAIG BOX: You can also check out our website at kubernetespodcast.com, where you'll find our hand-crafted artisanal show notes. Until next time, take care.

ADAM GLICK: Catch you next week.