Kubernetes Podcast from Google: Episode 160

#160 August 26, 2021

KEDA, with Tom Kerkhove

Hosts: Craig Box, Jimmy Moore

KEDA, the Kubernetes Event-Driven Autoscaler, is a project that adds superpowers to the Kubernetes horizontal pod autoscaler, including zero-to-one scaling. Celebrate KEDA reaching Incubation in the CNCF by listening to an interview with maintainer Tom Kerkhove from Codit. But first, learn about Craig’s worst concert experience.

Chatter of the week

CRAIG BOX: Hi and welcome to the Kubernetes Podcast from Google. I'm Craig Box with my very special guest host Jimmy Moore.

[MUSIC PLAYING]

CRAIG BOX: Every week, we have a little chat before we bring you the news of the week. And I don't want people thinking we don't bring the same journalistic rigor to the chat segment as we do to the news segment. So we should point out that, a couple of weeks ago, we gave you the news that there were new hosts of "Jeopardy!" And it turns out that, that news wasn't the truth for very long.

JIMMY MOORE: No that's right, in fact, Mike Richards was quickly replaced. Or, I guess, canceled as they say.

CRAIG BOX: Yes, he hosted for what will be five episodes live on the air. Which works out to one day of recording time before the time was called on his contract. He will be temporarily replaced by Mayim Bialik, and then the question of who the new host will be returns.

JIMMY MOORE: Well I am team LeVar Burton all the way. Reading Rainbow through "Jeopardy!" For the rest of my life, please.

CRAIG BOX: He did a great guest turn on "Community," if you remember that.

JIMMY MOORE: Mhm. Yes.

CRAIG BOX: Troy will be very disappointed.

JIMMY MOORE: Are you a fan of Ken Jennings, or perhaps the AI robot that won "Jeopardy!?"

CRAIG BOX: Watson from IBM? I don't know if that was a contender for the hosting job. But one thing I will say is, when Adam left, and we were looking to do guest hosts, I gave serious thought to reaching out to Ken Jennings. Because he does host podcasts, he's very active on Twitter. There is some chance he might know what Kubernetes is. And I thought it would be quite funny to have as my first guest host — just have Ken Jennings on and see what people thought. Just completely out of context, that would be an excellent thing to do. But I never got around to it.

JIMMY MOORE: That's awesome. I love that idea. He has a game show now where he basically says, I'm smarter than you, for half an hour, to general, good old red blooded Americans on TV.

CRAIG BOX: Yes, it's an American version of a British TV show, and it's very popular in New Zealand as well. So whenever I'm back there, it's on just before dinner time. And I always end up watching "The Chase".

JIMMY MOORE: Yes, "The Chase". All the best things here are taken from the British, I believe.

CRAIG BOX: Indeed. Now, sad news the last couple of days. They announced the passing of the drummer of the Rolling Stones, Charlie Watts. And that leads me to tell a little story about my worst concert ever. And I'm going to start, Jimmy, by asking you what's your worst concert ever?

JIMMY MOORE: You know Craig, I don't do a ton of concerts, but I do have a good worst concert experience. When I was a little kid, probably eight or nine, my mom took me to see The Judds. They're a country music duo with Wynonna Judd. And I got lost. I was a little boy who got lost at a concert, and my mom and friends went everywhere looking all over for me. She said, we're going to dredge the rivers — you know, it's the '80s, people were scared of children being lost all the time. Very different from today. But the silver lining of that story is that the Judds got to call my name out over the P.A. system. So, Wynonna Judd said, “Jimmy Moore, Jimmy Moore, if you're here, your mama's looking for you.” So I ran up to that stage and found my mom. It turned out all right, but I was scared.

CRAIG BOX: Did they have you both up on stage to announce to everyone you were safe?

JIMMY MOORE: In my version of the story, I'm going to say yes. And then we did a quartet.

CRAIG BOX: I guess that's a thing your mom could have done. She could have just told the band that you were lost, and then that's your way of all getting up on stage together.

JIMMY MOORE: She probably would have. What's your horrible concert experience?

CRAIG BOX: Well, my worst concert experience brings us back, obviously, to Charlie Watts and the Rolling Stones. There was a competition on a radio station I listened to in New Zealand, back in the mid 2000s, to win tickets to see the Rolling Stones, who were going to be playing in Auckland. And what they did was they had five Rolling Stones songs very quickly sped up, maybe two or three seconds of each, but maybe a second and a half of audio in total. And anyone who could guess all five songs in the right order would win two tickets to see the Rolling Stones. And if no one got it within a couple of days, it went up to four tickets. And if no one got it within a couple of days after that, and so on, until it actually got up to eight tickets.

So they were going to try and give away four pairs. They ended up giving away eight tickets. And then it was basically the day or two before the show. And no one had got it, it was just very, very hard to do. And they were just taking dozens of calls at that point. And I was calling up and just guessing songs. I looked up a couple of them and there was really no way you could actually tell what it was. And I happened to get all five songs in the right order. And so, there we are, congratulations, you've won eight tickets to the Rolling Stones.

And what they wanted to do was get me to call some of my friends who I might invite along with me, and have them appear on the radio and be excited about the fact they were going to go to the Rolling Stones. And of course, no one was available. I was trying to get hold of friends of mine who I would go to concerts with, and they just weren't answering the phone that day or something. And maybe no one was hugely excited about this prospect at all. But I had managed to cobble some friends together, and we went along to see the Rolling Stones. And they weren't bad, as far as concerts go.

I should say, for them being a classic rock band, I didn't really have the right appreciation for them. They had a good stage show, for a bunch of old guys at least. They had one part where they had the stage on a tractor or something. It just basically pulled out into the middle of the crowd. It was a B-stage section, which moves in and out. And Charlie's drumming away on it, and so on.

I have downloaded a bootleg of the concert. I'm going to listen back to it now with a little bit more experience and maybe respect for music under my belt, and see how that was. But anyway, opening for the Rolling Stones was Nickelback. And they were the worst concert ever.

JIMMY MOORE: [LAUGHING] Oh, that is the worst silver lining to a story I've ever heard.

CRAIG BOX: Should we get to the news?

JIMMY MOORE: Yes, let's get to the news.

[MUSIC PLAYING]

JIMMY MOORE: Autoscaling project, KEDA, moved from the CNCF sandbox to incubation this week. KEDA, or Kubernetes Event-driven Autoscaling, aims to simplify application autoscaling and optimize for cost by supporting scale to zero. Listen to today's interview to learn more.

CRAIG BOX: Also, two weeks ago, we brought you the news of the NSA developing hardening guidelines for Kubernetes. It didn't take long for the first tool to verify that your cluster is compliant with these. The team at ARMO Security saw overlap between the NSA recommendations and their commercial tool, and have released a command line utility called Kubescape. Kubescape uses the Open Policy Agent with configuration taken from the ARMO platform to check for things such as privileged containers and host access permissions.

JIMMY MOORE: GKE has always used the Google Cloud Identity and Access Management System to authenticate users to Kubernetes. Now you can configure a third party identity service as an external source of user data, such as Active Directory or any OIDC compliant service. The new feature is available in preview. GKE also added support for the Google Virtual NIC Driver, allowing network speeds up to 100 gigabits per second.

CRAIG BOX: Solo.io has announced version 1.1 of Gloo Mesh Enterprise, their Istio powered service mesh. It now includes Gloo Mesh Gateway, taking the features of their standalone API gateway, Gloo Edge, and adding them on top of Istio. Other new features include improved security certificate management and backporting fixes to the previous four versions of Istio. Pay attention to that last one, as a number of CVEs were recently found in Envoy and fixed in the last three versions of upstream Istio.

JIMMY MOORE: And finally, a story. Are time zones supported by CronJobs in Kubernetes? Depends on who you ask. Support for specifying which time zone a CronJob runs in is a popular feature request, but was rejected by SIG Architecture, citing the difficulty of having to maintain a time zone database and just deal with time in general. Then an update to the cron library, which Kubernetes uses, went ahead and implemented time zone support. And someone discovered that Kubernetes inherited it by mistake. Unit tests are being implemented. A welcome addition for people who wanted this feature, but a reminder that your software sits on an iceberg of software supply chain. And you really need to track what's happening in your dependencies.

CRAIG BOX: And that's the news.

[MUSIC PLAYING]

Tom Kerkhove is the containerization practice lead at Codit, an IT integration firm headquartered in Belgium, and a maintainer of the KEDA Project. Welcome to the show, Tom.

TOM KERKHOVE: Thank you for having me Craig.

CRAIG BOX: You live near Bruges in Belgium. It's a fairytale town, isn't it? Canals and bridges, and cobbled streets and those churches.

TOM KERKHOVE: It's really a nice city. A lot of people prefer Gent or Brussels, but I prefer Bruges, which is a smaller, like you say, fairytale town.

CRAIG BOX: How do you end up with an internationally recognized IT firm working out of there?

TOM KERKHOVE: When I graduated from high school, they were one of the potential companies to work for. And I was happy to join them because I mattered as a person to them, and I was not just one of the many.

CRAIG BOX: Tell me a little bit about the kind of work that Codit does.

TOM KERKHOVE: Codit is a consultancy firm that helps you build scalable and reliable platforms in the cloud for you. So we really help you from design, all the way into production, and even beyond that. So we really take you through the whole journey and make sure everything works fine, and it's scalable, and it's ready for production. And beyond production as well. Because it's a continuous journey nowadays.

CRAIG BOX: You are both an Azure architect and the leader of the containerization practice. Tell me how containers became a thing that was part of the tool set that you would offer to customers.

TOM KERKHOVE: I believe containers should be the standard for any software you write because it really gives you the portability between the infrastructure that you run on. I could run them on Kubernetes, in Azure, on prem, or in the cloud. Or, if you want to have less infrastructure, I can take that same application and deploy it on a more platform as a service offering in Azure without changing my application. It's a glide path between the control you want over your infrastructure, in my opinion, and it's super powerful.

CRAIG BOX: Azure had its own specific offerings before Kubernetes, such as Azure Service Fabric. Where do you think the balance came between needing to offer something that was cross platform, multi-vendor, and so on, versus them building out things the way that Microsoft has done in the past?

TOM KERKHOVE: I think Service Fabric has its own place, but it's a very specialized niche scenario. Where Kubernetes is more of the open source, well adopted container orchestrator. That also gave them the power to bring their cloud to their customer. Because with their whole Azure Arc story, they allow you to run anywhere, but have that centralized operational story. And without Kubernetes, that would be super hard because they would need to build their own brand platform that also happens to run on other clouds, for example.

CRAIG BOX: When did you first discover Kubernetes?

TOM KERKHOVE: I think that must have been four years ago, or something, where I had my first customer moving to Kubernetes. Which is also how I got into autoscaling in general.

CRAIG BOX: Was that a customer driven request? Was that something that the customer had heard about and wanted to be involved in? Or did they come to you with a requirement, and then you found Kubernetes to be the best solution to that?

TOM KERKHOVE: They wanted to use Kubernetes, and that was also in the time frame where Kubernetes really started to pick up. Also the marketing machine started picking up. So it was driven by marketing, but in the end it was a good decision for them that gave them the platform agnostic aspect as well.

CRAIG BOX: Don't necessarily like to think of it as marketing, but there are a lot of people who make decisions based on what they'd like to be on their CV for their next job.

TOM KERKHOVE: Fully agree.

CRAIG BOX: What was the type of application that the customer wanted to run on Kubernetes?

TOM KERKHOVE: It was all about workers, so non-HTTP workloads, which they had running on a service called Azure Cloud Services. That service is still there, but it has seen very little investment, and for them that was a big risk. So they decided, we need to run this elsewhere, and Kubernetes was their alternative. So we could just containerize the same application, deploy that on Kubernetes, and that's that. At least, that's what they thought it was. That gave them a lot more responsibilities in terms of infrastructure, and also from an autoscaling perspective.

CRAIG BOX: You say there, that's what they thought. Once the application was running on Kubernetes, where did it not meet their needs, or where did it not deliver the things that they thought they were going to get from it?

TOM KERKHOVE: First, the learning curve was a lot higher for people who were not aware of, first, containers, and then, Kubernetes. But then, a simple example, again, was autoscaling. In Azure they had that as a service. And now it was up to us to implement it ourselves. And again, we're talking years back where the only thing that you had was the HPA, and just the Prometheus metrics adapter, and that was it.

CRAIG BOX: Let's look at what the state of autoscaling was at the time. You mentioned the HPA, which is the Horizontal Pod Autoscaler. That's effectively looking at CPU and memory usage of pods in the cluster, at the time?

TOM KERKHOVE: That is correct. And for HTTP workloads, that is typically OK. But we were using workers, and then these workers are typically processing a queue. You want to autoscale based on the amount of work that's still on the queue. And at the time, that was not really possible. However, that's also the time where, I think it was then called custom metrics, were being added. So you could install your metric adapter, fetch the metrics, and make them available to Kubernetes.

Now at that time, the only metric adapters were Google Stackdriver and Prometheus, I think. But because we use a cloud provider, we were not using Prometheus at all. So I went to my customer and said, hey, we could bring Azure Monitor metrics into Prometheus, use the Prometheus metrics adapter, and then start autoscaling. But they decided that's way too many things to maintain, or even implement, at the time. So they went with manual scaling. But in the end, I did it myself in my spare time, which was how another open source project of mine got started, Promitor, to build that bridge between the two.
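To make the idea of scaling on queue depth rather than CPU concrete, here is a minimal Python sketch. It uses a simplified form of the calculation the HPA applies to an average-value target: divide the backlog by the number of messages one worker should handle, round up, and clamp to the configured bounds. The function name and the example numbers are illustrative assumptions, and the real HPA additionally applies a tolerance and stabilization behavior that are left out here.

```python
import math


def desired_replicas(queue_length: int, target_per_pod: int,
                     min_replicas: int = 1, max_replicas: int = 10) -> int:
    """Replica count for an average-value target, clamped to [min, max].

    Simplified sketch: ignores the HPA's tolerance and stabilization window.
    """
    if target_per_pod <= 0:
        raise ValueError("target_per_pod must be positive")
    wanted = math.ceil(queue_length / target_per_pod)
    return max(min_replicas, min(max_replicas, wanted))


if __name__ == "__main__":
    # Hypothetical backlog sizes, with each worker expected to handle 5 messages.
    for backlog in (0, 5, 42, 500):
        workers = desired_replicas(backlog, target_per_pod=5)
        print(f"{backlog} messages -> {workers} workers")
```

With a plain HPA the lower bound stays at one, which is why an empty queue still yields one worker in this sketch.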

CRAIG BOX: I was going to ask, because I would think that if someone was addressing that today, to say that it only supported Prometheus might not be that much of a bottleneck. Because we almost assume that everything has some Prometheus in it somewhere.

TOM KERKHOVE: That's a bit of a pet peeve for me because, if you're a deep tech Kubernetes developer, that's an assumption you hear a lot. But it's an assumption that's not always true, and with our customers it's almost never the case, because our cloud provider is able to scrape Prometheus metrics and make them available next to other Azure services. So we tend to rely on that. And when you have a closer look at KEDA, you will see that there is no requirement for Prometheus, because we do not want to force people to run it. Those who have it can use it, but it's not a requirement. And that's a difference with other autoscaling technologies out there.

CRAIG BOX: KEDA was announced in May 2019 as a collaboration between Microsoft and Red Hat. Can you summarize, even though you weren't working for one of those vendors, what the project was at its launch?

TOM KERKHOVE: I had an early preview to that, and I was super excited because this was really the big gap that we were suffering from. At that point, I think we had around five to ten scalers. You could deploy KEDA, and you could bring metrics from the standard ones like AWS, Azure, Prometheus, and I think Redis or it was Kafka. One of the two, and that was it. So you could get started, but just with a handful of scalers just to try it out. And you could easily scale your applications.

CRAIG BOX: When you talk about these scalers, it sounds to me like KEDA is a broker between queue based systems and the metrics based scaling that's built into Kubernetes?

TOM KERKHOVE: KEDA, basically, is a metrics adapter for a variety of systems. Why is this important? Kubernetes only allows you to run one of those in the cluster. So if you would use Prometheus, and let's say, Azure Monitor, and another system, without KEDA you would need three adapters, which is not possible. So we replace that with KEDA, and we offer a catalog of scalers. And these scalers are just a way for you to say, I rely on system A, this is how you authenticate, and we want to scale. And then KEDA is able to authenticate, integrate, and retrieve the information it needs to, to make decisions for scaling.
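The "one adapter, many scalers" idea can be pictured as a single registry that dispatches on the trigger type. The Python sketch below is only a conceptual model and not KEDA's actual code: `MetricsAdapter`, the trigger type strings, and the stand-in scaler functions are all hypothetical.

```python
from typing import Callable, Dict

# A scaler takes the trigger's metadata and returns the current metric value.
Scaler = Callable[[dict], int]


class MetricsAdapter:
    """Single entry point that dispatches to one scaler per trigger type."""

    def __init__(self) -> None:
        self._scalers: Dict[str, Scaler] = {}

    def register(self, trigger_type: str, scaler: Scaler) -> None:
        self._scalers[trigger_type] = scaler

    def metric_for(self, trigger: dict) -> int:
        scaler = self._scalers.get(trigger["type"])
        if scaler is None:
            raise KeyError(f"no scaler registered for {trigger['type']!r}")
        return scaler(trigger.get("metadata", {}))


if __name__ == "__main__":
    adapter = MetricsAdapter()
    # Stand-ins for real integrations; a real scaler would authenticate and
    # query the external system instead of returning a fixed number.
    adapter.register("queue", lambda meta: 42)
    adapter.register("prometheus", lambda meta: 7)
    print(adapter.metric_for({"type": "queue", "metadata": {"name": "orders"}}))
    print(adapter.metric_for({"type": "prometheus"}))
```

The cluster sees one adapter, while each workload picks whichever registered scalers it needs.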

CRAIG BOX: Now when you got that preview, did they give you a hint as to why they didn't just build an Azure Monitor plugin for Kubernetes? Why did they decide to build something open, in collaboration with other vendors?

TOM KERKHOVE: That was actually already there. So, by that time, there was an Azure Monitor metrics adapter. But the two of them, meaning Microsoft and Red Hat, really wanted to collaborate to build an open application autoscaler for any product, platform, or technology that you had to use. And I was very happy that was the case. And that's also why, in March 2020, they decided to donate this to the CNCF as a sandbox project, to also show to the community, we want to be vendor neutral and this is how we want to do it.

CRAIG BOX: So your customer, who has these workers that they want to be able to scale on, how did they implement KEDA to solve that particular problem?

TOM KERKHOVE: They're not my customer today, anymore, but I think they were just installing KEDA, using the Azure Service Bus scaler, and that was actually fixing their needs completely.

CRAIG BOX: Users of Kubernetes will be familiar with the idea of defining a deployment, which says effectively, I want this particular specification for my pod. And I want to have perhaps an autoscaler which says, I have a certain number of those. What does the process look like if you're using KEDA today? Do you still use deployments? How do you scale those with these external scalers?

TOM KERKHOVE: KEDA is easily installable through either Helm or the OperatorHub. And you can take any workload in the cluster, and just deploy what's called a ScaledObject. The ScaledObject defines what you want to autoscale, and how you want to autoscale. With what you want to autoscale, you can just point to your deployment, to your jobs, to your custom resource definition, even. And then by using the triggers, you define when it needs to scale. And you can use one or more of them in conjunction.
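As a rough illustration of what a ScaledObject looks like, the Python sketch below builds one as a dictionary and prints it as JSON, which `kubectl apply -f` accepts as well as YAML. The resource names (`orders-scaler`, `order-processor`, `orders`) are hypothetical. The field names follow the KEDA v2 `keda.sh/v1alpha1` API and the Azure Service Bus trigger as best understood here, so check them against the scaler documentation for your KEDA version. Authentication is deliberately left out.

```python
import json


def scaled_object(name: str, deployment: str, queue_name: str,
                  messages_per_pod: int,
                  min_replicas: int = 0, max_replicas: int = 10) -> dict:
    """Build a ScaledObject manifest pointing one queue trigger at a deployment."""
    return {
        "apiVersion": "keda.sh/v1alpha1",
        "kind": "ScaledObject",
        "metadata": {"name": name},
        "spec": {
            # What to autoscale: the workload the ScaledObject points at.
            "scaleTargetRef": {"name": deployment},
            "minReplicaCount": min_replicas,
            "maxReplicaCount": max_replicas,
            # When to autoscale: one or more triggers.
            "triggers": [
                {
                    "type": "azure-servicebus",
                    "metadata": {
                        "queueName": queue_name,
                        # Trigger metadata values are strings.
                        "messageCount": str(messages_per_pod),
                    },
                    # Assumption: authentication is omitted in this sketch and
                    # would be supplied separately in a real manifest.
                }
            ],
        },
    }


if __name__ == "__main__":
    manifest = scaled_object("orders-scaler", "order-processor", "orders", 5)
    print(json.dumps(manifest, indent=2))
```

Adding a second entry to `triggers` is how several scalers would be combined for the same workload.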

CRAIG BOX: I obviously have to connect my Kubernetes cluster now to signals that are outside the cluster. How do I handle the security here? Is KEDA running as something in the cluster that has to have permission to act upon my pods and then also connect out to my external queue service?

TOM KERKHOVE: No, so KEDA itself connects directly to the systems, and it doesn't change anything in the application itself. All it does is use an HPA under the hood, for which it fetches the metrics, and then it's just a typical HPA approach that scales your deployment. So we never touch your workload itself, we just extend it. And we use an operator to make sure that all your changes are applied how you want them.

CRAIG BOX: With autoscaling, I'm able to set a maximum and a minimum number of pods that I want to run. Scaling is a trade-off between cost and time. If I want to process my queue quicker, I can run more workers. But I obviously need to find what the right sweet spot is, and having those limits is great, in terms of being able to not overrun cost. But how do I monitor or audit the scaling system to figure out what the right point is to run that?

TOM KERKHOVE: We provide a few ways. One is we emit Kubernetes events for KEDA itself, so when we start scaling and so on. And because we use the HPA, you also get all the observability there. And we also expose Prometheus metrics for those who want to consume them. So those would be the ways. But depending on the tooling that you're using, you could also extend it. For example, at my current customer, I take all the Kubernetes events, bring them outside of the cluster, and I add my own autoscaling-over-time dashboard that shows the instance count per application, per namespace, and so on.

CRAIG BOX: Now let's say I want to scale the number of pods running based on ambient sunlight here in the UK. And I should just point out, this is not a scale to zero joke. How would I do that?

TOM KERKHOVE: You could use the cron scaler, or you can build your own external scaler which determines when there is sunlight, and then you just plug it into KEDA and we can start scaling your application.

CRAIG BOX: What does that process of plugging into KEDA look like, though? Is that something that I write, or is that a script that runs on top of the existing cluster?

TOM KERKHOVE: You would need to have that custom external scaler, as I said, which could be pull or push based. It runs in your cluster, and then KEDA talks to that API to check, do we need to scale or not? It's fully separated, it's fully extensible, and you can even publish your own external scaler on Artifact Hub so that other people can find it.
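For the sunlight example, the logic inside such an external scaler could be sketched roughly as below. A real KEDA external scaler is a gRPC service. This Python sketch leaves the transport out and keeps only the decisions, with method names loosely mirroring the RPCs in KEDA's external scaler contract (`IsActive`, `GetMetricSpec`, `GetMetrics`). The `SunlightScaler` class, the lux thresholds, and the `read_lux` sensor callback are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class SunlightScaler:
    """Decision logic only; a real external scaler would expose this over gRPC."""

    read_lux: Callable[[], float]    # hypothetical sensor callback
    active_above_lux: float = 1000.0  # below this, the workload may sit at zero
    lux_per_replica: int = 5000       # target value the HPA divides by

    def is_active(self) -> bool:
        # Answers "should there be any replicas at all?"
        return self.read_lux() > self.active_above_lux

    def get_metric_spec(self) -> dict:
        # Tells the autoscaler which metric to watch and its per-replica target.
        return {"metricName": "ambient_lux", "targetSize": self.lux_per_replica}

    def get_metrics(self) -> dict:
        # Reports the current value of that metric.
        return {"metricName": "ambient_lux", "metricValue": int(self.read_lux())}


if __name__ == "__main__":
    sunny = SunlightScaler(read_lux=lambda: 12000.0)
    print(sunny.is_active(), sunny.get_metric_spec(), sunny.get_metrics())
```

Passing the sensor in as a callback keeps the scaling decision testable without real hardware, or real British sunlight.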

CRAIG BOX: Let's have a serious conversation about scaling to zero, then. That's not a thing you can do with the horizontal pod autoscaler out of the box. You need to have a minimum of at least one pod. Why is that?

TOM KERKHOVE: Frankly, I don't know, because this is really a great way to build cost efficient applications and really get the most out of your investment. If I'm not mistaken, there is a KEP, but I'm not sure. But we've also been thinking, should we contribute back more upstream, make sure things become part of Kubernetes itself. We're still investigating that. But being a separate product makes sure that we can pivot more easily than going through the Kubernetes release cycle, which is more robust for obvious reasons.

CRAIG BOX: KEDA can support scale to zero. How is that implemented?

TOM KERKHOVE: It's fairly simple. So what we do is, when you go from one to zero, we delete the HPA and manually scale your workload to zero. And then once we go from zero to one, we do the reverse. We scale your workload back to one, we recreate the HPA, and then that's where it takes over. It will decide, is one enough? Or should we go to your maximum defined, or something in the middle? So that's where we extend.
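The zero-to-one handoff Tom describes can be modeled as a tiny state machine. The Python sketch below follows his description only and is not taken from KEDA's source: `Workload` and `reconcile` are hypothetical names, and it ignores any cooldown delay before scaling down.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    replicas: int = 0
    hpa_exists: bool = False


def reconcile(workload: Workload, scaler_active: bool) -> str:
    """Apply the 0 <-> 1 transitions; between 1 and max, the HPA is in charge."""
    if scaler_active and workload.replicas == 0:
        # Zero to one: scale up manually, then recreate the HPA so it takes over.
        workload.replicas = 1
        workload.hpa_exists = True
        return "activated"
    if not scaler_active and workload.replicas > 0:
        # One to zero: remove the HPA first, then scale down manually.
        workload.hpa_exists = False
        workload.replicas = 0
        return "deactivated"
    return "unchanged"


if __name__ == "__main__":
    w = Workload()
    for active in (False, True, True, False):
        print(active, reconcile(w, active), w)
```

The HPA only ever sees a workload with at least one replica, which is how this approach sidesteps the HPA's minimum of one.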

CRAIG BOX: And being able to run zero replicas of a particular thing lends itself well to multi-tenant situations and to function-based workloads. And indeed, the announcement for KEDA was very heavily tied into announcements around Azure Functions and the ability to basically have nothing running, and then get an HTTP request come in, and then scale up. Is that watching a queue or a load balancer? Or, how in that case are you getting the signal that you need to create that HPA and that initial pod?

TOM KERKHOVE: Just to be clear, HTTP is not supported out of the box. We have a separate component for that. But yeah, that's how it works. So KEDA scales your application based on watching the queue metrics. And if it sees there is suddenly a message on that queue, it will fire up your deployment and scale it accordingly. So, we really watch those systems that you integrate with so that you don't have to do that.

CRAIG BOX: There is a similar feature built into Knative in being able to spin up pods when it receives the new request. I understand that Knative also uses KEDA for some of its autoscaling needs.

TOM KERKHOVE: Knative is really strong at HTTP based autoscaling. And it's fully push based, while KEDA is more pull based for those metrics. And actually, Knative is also using KEDA under the hood for scaling all the non HTTP workloads, if I'm not mistaken.

CRAIG BOX: After you found KEDA, and you'd implemented it for the use case we've talked about before, you've since become a maintainer of the project. What was that transition like? Were they looking for help and you were able to jump in? Or was it a conscious choice that you wanted to be able to contribute to something that you were using?

TOM KERKHOVE: I think it's mainly about the excitement of seeing them build what I needed for a long, long time. So I just started helping on the small pieces, such as documentation, thinking about potential features, helping with the sandbox proposal for the CNCF. But I didn't implement any features and actually, fast forward almost two years later, I still haven't because I'm not a Go developer. And maybe that's why KEDA is so successful, because I didn't contribute any code. Doing open source is so much more than code. There's documentation, there's governance, there's automation, there's just triage. There's so many things, and that's where I've been focusing on. So that the people who know what they're doing can make KEDA a better place.

CRAIG BOX: So you've been there through the process of getting KEDA into the CNCF sandbox, and now into incubation. What were those two processes like?

TOM KERKHOVE: It was a reflection of how nice CNCF is as a community. I think Sandbox was a bit harder because we were in the midst of changing the process for Sandbox, so it was not really clear what the new approach was. But, in general, a lot of nice people helping us be successful with our project, helping when things were not clear, and also giving valuable input on how to make our project a better place as part of the foundation. I highly recommend it.

CRAIG BOX: You mentioned before whether or not this should be part of upstream Kubernetes. And there are a lot of projects that are saying, well, we can't just be contributing to the core because the core releases on this three, now four, month schedule. And we want to be able to act at a pace that makes sense for our project. Did you give thought to whether this should be a top level CNCF project, or whether it should be a subproject of Kubernetes, even if it doesn't release as part of the main core?

TOM KERKHOVE: Good question. We haven't thought about it because we were fully independent when it started. We're still fully independent, but maybe becoming a Kubernetes subproject could be a valuable option so that we still have the flexibility, but that we have a bigger reach so that it's a lot easier for people to autoscale.

CRAIG BOX: The project had released version 1.0 before submission to the Sandbox and released version 2.0 last year before its recent graduation to the incubation phase. What was the decision process behind when the project was ready for each of those major version increments?

TOM KERKHOVE: I think the answer is just that we are doing semantic versioning, and that purely determines when we ship a new major version. For 1.0, and actually that might be my first big feature, we wanted end users to be able to decouple the authentication from their application. So that's how trigger authentication got introduced, which had some breaking changes. And then for 2.0, that was the separation of ScaledObject and ScaledJob. We supported both in 1.0, but they have different characteristics, and we noticed that this was very confusing for end users. With a ScaledJob, you basically get a pod run in a job manner, while a ScaledObject is more the daemon approach, and we'll just scale it out. And because of that confusion, we decided to split them and be more explicit, along with the custom CRD approach.

CRAIG BOX: One of the things that a 1.0 release does is signify to a community or to potential users that the project is ready for production. Whether or not that's actually true, that's very much a thing that people believe. And I would like to congratulate you on using semantic versioning properly. How did you tell that people were adopting KEDA? And how have you responded to community use cases?

TOM KERKHOVE: That is one of the biggest problems in open source. You have no idea who's using your stuff.

CRAIG BOX: It's also one of the biggest problems in podcasting.

TOM KERKHOVE: I can imagine. Maybe that's also one of the main things that I've been doing, chasing people to get them on the website. But no, jokes aside, I think we've grown a nice community on the Kubernetes Slack. We try to support other people when they do presentations, when they write blogs, so we have an understanding of what they're trying to achieve. And in terms of the customer adoption, that's still a hard one, but frankly just asking people, hey, nice use case, what company are you working for and can we list them? That's typically how we learn. And also, we do community stand ups every two weeks. There, we also learn a bit from the end users, what they want to achieve and who they are exactly. But it's a big problem, in my opinion.

CRAIG BOX: One of the gates to move to the incubation phase was getting references from end users. And obviously, that's something that the CNCF has, end user communities and abilities to reach those people. What are the gates now for a potential move to a graduated project down the line? What things need to happen in the project before you can meet that bar?

TOM KERKHOVE: Good question, I would need to look it up, but the first thing that comes to mind is you need to have a security audit. And those things need to comply with some project standards as well. But that's far, far away for us now. I think the security, and the operations, and such, is a big factor that really makes you a mature project. And also, in terms of community building maintainers, make sure that the project is healthy with good governance. I think that's important.

CRAIG BOX: Do you think the project is in good standing in those areas at the moment?

TOM KERKHOVE: Our security scanning can be improved, but I think the biggest improvement point is maintainership, where we have four maintainers from three companies. We are always welcoming contributions, and based on those contributions, we are happy to invite another person to the maintainers to spread the work a bit.

CRAIG BOX: You have scalers that have been contributed by a lot of people. So there are a lot of people contributing who are perhaps experts in the particular system that they want to integrate with KEDA and who aren't necessarily going to need to stick around the core. Do you think maybe that the CNCF model needs to appreciate that's a way that some projects will be built? That it doesn't necessarily make sense to have a large core of people working on the center of a project like this?

TOM KERKHOVE: I think it's going to be a mix. We will still have the core, but all the scalers that are part of the core, as well, are still maintained by us. A problem that we are having now is what we could call the "commit and run". We have a system expert, like you mentioned, who contributes a scaler for system X. It gets merged, and then they're gone. But what if there are issues now?

CRAIG BOX: My British sunlight scaler doesn't work anymore, who do we call?

TOM KERKHOVE: For example — and that's why we're looking at introducing governance just for adding scalers — should we add new scalers as external scalers by default? Or should we still add them to the core, but by contributing them, you kind of agree that you will be co-maintaining them with us? Now, nothing is written in stone yet because we don't want to make it too robust and formal. But we've seen cases where bugs are reported for systems that we have no idea how to even test because that person didn't add any tests, even. So that's a bit of a problem we want to fix.

CRAIG BOX: And, in terms of your roadmap, you've talked about HTTP scaling. How would you solve that problem?

TOM KERKHOVE: HTTP autoscaling is fun because it's synchronous communication. So that means, if you scale from one to zero, or zero to one, you need to capture the request, hold it, and wait until your workload is up and running. We're using an interceptor to do that for you, but we also need to find a unified way to get metrics on the workload we need to scale. Those who have used service meshes are aware that there is the Service Mesh Interface, which has a nice traffic metrics specification. Now, we would like to have that, but have it applied on Ingress, service mesh, service to service, basically any HTTP workload on Kubernetes. And that's one thing that we're working on, to see if we can make that happen. Because then, also, tool vendors could rely on that specification and build tools around that, regardless of the type of HTTP traffic.
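
[EDITOR'S NOTE: The hold-and-wait behavior Tom describes can be illustrated with a toy model. This is a minimal sketch, not KEDA's actual interceptor: the class name `HoldingInterceptor`, the `scale_up` and `forward` callbacks, and the `pending` counter are all hypothetical names invented for illustration. It only shows the idea that a request arriving while the workload is at zero is held until a replica reports ready, and that the number of held requests is the kind of signal an autoscaler could read.]

```python
import threading


class HoldingInterceptor:
    """Toy model: hold requests while a workload scales up from zero."""

    def __init__(self, scale_up, forward, timeout_s=5.0):
        self._scale_up = scale_up        # hypothetical: asks the autoscaler for a replica
        self._forward = forward          # hypothetical: sends the request to the workload
        self._timeout_s = timeout_s
        self._ready = threading.Event()  # set while at least one replica is ready
        self._lock = threading.Lock()
        self._scale_requested = False
        self.pending = 0                 # requests currently held or in flight

    def mark_ready(self):
        """Called (in this sketch) when a replica becomes ready."""
        with self._lock:
            self._scale_requested = False
        self._ready.set()

    def mark_scaled_to_zero(self):
        """Called (in this sketch) when the last replica is removed."""
        self._ready.clear()

    def handle(self, request):
        with self._lock:
            self.pending += 1
            # Only the first request that finds the workload at zero triggers a scale-up.
            need_scale = not self._ready.is_set() and not self._scale_requested
            if need_scale:
                self._scale_requested = True
        try:
            if need_scale:
                self._scale_up()
            # Hold the request until the workload is up, or give up after the timeout.
            if not self._ready.wait(self._timeout_s):
                with self._lock:
                    self._scale_requested = False  # let a later request retry
                raise TimeoutError("workload did not become ready in time")
            return self._forward(request)
        finally:
            with self._lock:
                self.pending -= 1
```

[In this sketch a caller blocks inside `handle` until something else calls `mark_ready`. A real interceptor would sit in the HTTP path as a proxy and would learn about readiness from the cluster rather than from a direct method call.]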

CRAIG BOX: Any other major areas of improvement on the roadmap?

TOM KERKHOVE: We want to be more extensible as well. We want to keep on adding more scalers. So if you have any requests, please let us know. For example, we will add CloudEvents support, so that instead of Kubernetes events, we will trigger your HTTP endpoint. But I think the biggest one is that we've been wanting to do predictive autoscaling, but to do that, you need to have information to build a model on. So we will most likely look at making our metrics, and then scaler information, available for long term storage. And then, as a next step, we will look at making that predictive, so that you can really optimize for cost and scale out before the traffic is there, and help you scale your applications.
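
[EDITOR'S NOTE: To make "scale out before the traffic is there" concrete, here is a small sketch under stated assumptions. It is not a KEDA feature or API: `forecast_next` and `desired_replicas` are hypothetical helpers, the forecast is a plain linear extrapolation over the last few stored samples, and the replica count uses the simple "metric divided by per-replica target, rounded up" rule. A real predictive model would be built on the long-term metric storage Tom mentions.]

```python
import math


def forecast_next(history, window=3):
    """Extrapolate the next metric value from the trend of the last `window` samples."""
    recent = history[-window:]
    if not recent:
        return 0.0
    if len(recent) < 2:
        return float(recent[-1])
    slope = (recent[-1] - recent[0]) / (len(recent) - 1)
    return max(0.0, recent[-1] + slope)  # a metric such as queue length cannot go negative


def desired_replicas(metric_value, target_per_replica, max_replicas):
    """Replicas needed so that each one handles at most `target_per_replica`."""
    if metric_value <= 0:
        return 0  # nothing to process: scale to zero
    return min(max_replicas, math.ceil(metric_value / target_per_replica))


# Stored samples of a hypothetical metric (for example, messages waiting in a queue).
history = [100, 200, 300]

reactive = desired_replicas(history[-1], target_per_replica=50, max_replicas=10)
predictive = desired_replicas(forecast_next(history), target_per_replica=50, max_replicas=10)
print(reactive, predictive)
```

[With a rising trend, the predictive figure is higher than the reactive one, so replicas would be requested ahead of the load rather than after it arrives.]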

CRAIG BOX: And finally, your conference bio says that you turn coffee into scalable and secure cloud systems. Are you the most efficient way of doing that, or could I take my coffee and catalyze it somehow and spend a bit less energy on the process?

TOM KERKHOVE: Let's just say that I drink way too much coffee, but I like building cloud systems.

CRAIG BOX: Fair enough. Well, thank you very much for joining us today, Tom.

TOM KERKHOVE: Thank you for having me.

CRAIG BOX: You can find Tom on Twitter at @tomkerkhove or on the web at blog.tomkerkhove.be.

[MUSIC PLAYING]

JIMMY MOORE: Thanks for listening. As always, if you enjoy the show, please help us spread the word and tell a friend. If you have any feedback for us, you can find us on Twitter @KubernetesPod or reach us by email at kubernetespodcast@google.com.

CRAIG BOX: You can also check out our website at kubernetespodcast.com where you will find transcripts and show notes, as well as links to subscribe. Until next time, take care.

JIMMY MOORE: See you later.

[MUSIC PLAYING]
