Kubernetes Podcast from Google: Episode 211 - etcd, with Marek Siarkowicz and Wenjia Zhang

#211 November 17, 2023

etcd, with Marek Siarkowicz and Wenjia Zhang

Hosts: Abdel Sghiouar, Kaslin Fields

Guests are Marek Siarkowicz, Senior Software Engineer at Google Cloud and Tech Lead of SIG-etcd, and Wenjia Zhang, Engineering Manager at Google Cloud and Co-Chair of SIG-etcd. We spoke about the project, the recent change to become a Special Interest Group, and how to learn etcd.

News of the week

Co-host this week is Mofi Rahman, Cloud Developer Advocate at Google
Karpenter graduated to Beta
The Kubernetes SIG Network announced release 1.0 of the Gateway API
Ingress2Gateway, a new CLI to migrate from Ingress to the Gateway API
The Call for Proposals for KubeCon EU 2024 will close on Nov 26, 2023

Transcript

ABDEL SGHIOUAR: Hi, and welcome to the "Kubernetes Podcast" from Google. I'm your host, Abdel Sghiouar.

[MUSIC PLAYING]

Kaslin was not available for this episode, so we brought in a new cohost. Welcome to the show, Mofi.

MOFI RAHMAN: Hey, Abdel. Thanks for having me. Long-time fan of the show, and it's exciting to be here.

ABDEL SGHIOUAR: Awesome. Mofi is a colleague based in New York, and he's working on batch. In this episode, we spoke to etcd maintainers Marek Siarkowicz and Wenjia Zhang.

MOFI RAHMAN: We spoke about the project, the recent change to become a special interest group, and how to learn etcd.

ABDEL SGHIOUAR: But first, let's get to the news.

[MUSIC PLAYING]

MOFI RAHMAN: Karpenter graduated to beta. The open-source tool is a node lifecycle manager for Kubernetes. Introduced by AWS in 2021, the tool has seen good adoption, reaching almost 5,000 stars on GitHub. Karpenter is in the process of being donated to the CNCF under the autoscaling SIG.

ABDEL SGHIOUAR: The Kubernetes SIG Network announced release 1.0 of the Gateway API. With this milestone, a lot of key APIs like Gateway, GatewayClass, and HTTPRoute reach GA. Congratulations to the community, the maintainers, and the contributors on these amazing achievements.

MOFI RAHMAN: Ingress2Gateway is the name of a new CLI released by the Kubernetes community. The tool is a converter from Ingress to the new Gateway API objects. Run it and it will pick up the current context, connect to the cluster, load all Ingress objects, and print out the equivalent YAML manifests using the Gateway API resources.
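To give a feel for the kind of translation described here, the following is a minimal Python sketch that maps a single Ingress manifest (as a dict) to a Gateway and a set of HTTPRoutes. It is not the Ingress2Gateway implementation: the function name `ingress_to_gateway`, the fallback class name `example-class`, the fixed HTTP listener on port 80, and the rule that anything other than `Exact` becomes `PathPrefix` are all assumptions made for illustration.

```python
def ingress_to_gateway(ingress):
    """Map one Ingress manifest (a dict) to a Gateway and a list of HTTPRoutes."""
    name = ingress["metadata"]["name"]
    namespace = ingress["metadata"].get("namespace", "default")
    spec = ingress["spec"]

    gateway = {
        "apiVersion": "gateway.networking.k8s.io/v1",
        "kind": "Gateway",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            # Assumption: reuse the ingress class name as the GatewayClass name.
            "gatewayClassName": spec.get("ingressClassName", "example-class"),
            "listeners": [],
        },
    }

    routes = []
    for index, rule in enumerate(spec.get("rules", [])):
        host = rule.get("host")

        # Assumption: one plain HTTP listener per Ingress rule.
        listener = {"name": f"http-{index}", "protocol": "HTTP", "port": 80}
        if host:
            listener["hostname"] = host
        gateway["spec"]["listeners"].append(listener)

        route_rules = []
        for path in rule.get("http", {}).get("paths", []):
            service = path["backend"]["service"]
            # Assumption: everything that is not Exact is treated as a prefix match.
            path_type = "Exact" if path.get("pathType") == "Exact" else "PathPrefix"
            route_rules.append({
                "matches": [{"path": {"type": path_type, "value": path.get("path", "/")}}],
                "backendRefs": [{"name": service["name"], "port": service["port"]["number"]}],
            })

        route = {
            "apiVersion": "gateway.networking.k8s.io/v1",
            "kind": "HTTPRoute",
            "metadata": {"name": f"{name}-{index}", "namespace": namespace},
            "spec": {"parentRefs": [{"name": name}], "rules": route_rules},
        }
        if host:
            route["spec"]["hostnames"] = [host]
        routes.append(route)

    return gateway, routes
```

The real tool handles TLS, provider-specific annotations, and more; this sketch only shows the basic shape of the mapping from hosts and paths to listeners and route rules.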

ABDEL SGHIOUAR: The call for proposals, or CFP, for KubeCon EU 2024 is still open and will close on November 26 this year. As a reminder, KubeCon EU 2024 will take place in Paris, France, between the 19th and 22nd of March.

MOFI RAHMAN: And that's the news.

KASLIN FIELDS: Today, I'm very excited to be talking about etcd, something that I'm very interested in, with two of the maintainers, Marek Siarkowicz, who is a Senior Software Engineer in Google Cloud, and a Tech Lead of SIG-etcd, and Wenjia Zhang, who is an Engineering Manager at Google Cloud and a Cochair of SIG-etcd. Welcome to the show, folks.

WENJIA ZHANG: Hello. Hello, hello. I have been listening to your podcast a lot, and I'm so excited to actually be here. Yay.

KASLIN FIELDS: Excited to have you.

MAREK SIARKOWICZ: Yeah, great to be here.

KASLIN FIELDS: Yeah.

MAREK SIARKOWICZ: Thanks for inviting me.

KASLIN FIELDS: Welcome. I'm really excited about this because I've been interested in etcd for a long time. I remember in like 2017, one of my first tasks at a new job was to lay out how one would deploy Kubernetes, and the etcd part, I found the hardest. Because at the time, and I think it's still true today, I found etcd to be really difficult to find a lot of information about and really dive deep into unless you really dive into the code.

So I personally found it difficult to learn about, and so I'm really excited to learn about it from you all, who are experts. And then also, over time, I've heard more about the etcd contributors and the work that you've been doing, and I've wanted to get involved. But since I was already involved in Kubernetes, I couldn't really add that on. But we'll talk more about that in a little bit. First off, if anyone out there is not familiar with etcd, could you give a brief overview?

MAREK SIARKOWICZ: etcd is a key-value store, which acts as the Kubernetes backend. So it's everything that happens in Kubernetes. Every change that is made, any operation done by controllers or operators, will go through the API server and be stored in etcd. So etcd is the brain behind the brain of Kubernetes.

WENJIA ZHANG: Yeah, exactly. And now, what is the word etcd, literally? And why is it called etcd?

KASLIN FIELDS: Yes.

WENJIA ZHANG: So in Unix, all system configuration files for a single system are contained in the /etc folder. And then the last letter, d, stands for distributed. So that aligns with what Marek just explained. etcd stores configuration information for large-scale distributed systems.

I still remember when I first talked to the founder of etcd from CoreOS, and they are pretty strict about how you pronounce it, because you cannot say e-t-c-d because it diminishes the meaning of it. It's not the product that they viewed. It has to be etcd. So yeah, you can say my name wrong. You cannot say etcd's name wrong.

KASLIN FIELDS: I love that. I did not know that.

WENJIA ZHANG: Yeah.

MAREK SIARKOWICZ: They are also really strict about how you write it. It should always start with a lowercase letter because of that. And I've seen, when I joined the project and started writing documentation, after some time, someone just went through and fixed all my mentions of etcd, because, yeah, you should not capitalize it. It should always be etcd, as in the folder.

KASLIN FIELDS: I did not know that history. I'm really excited about that part. But I'm also really excited about learning how to properly use the term, because I've definitely heard it pronounced different ways. And I'm glad that it sounds like I've been pronouncing it right.

WENJIA ZHANG: Yeah.

KASLIN FIELDS: And writing it right, because it needs to be lowercase. Wasm, I've also had this issue with recently because I think we interviewed someone about Wasm, and they were saying that the proper way to do it is actually with a capital W and then lowercase "asm" instead of all capital, which is what I definitely see most of the time. I don't know if they're going to win that battle, but etcd, I believe, can.

I love history and the history of the term is great. Could you tell me a little bit more about the history of the project? Where did etcd come from?

MAREK SIARKOWICZ: Yeah. etcd came from CoreOS, which was one of the startups in the early days of the digital transformation into the cloud native computing that we have now.

KASLIN FIELDS: I remember those days.

MAREK SIARKOWICZ: Before Kubernetes, and operators, and all the fancy stuff that we have now, the focus of the project was to maintain a single source of truth for configuration within a data center. So basically, you had a list of machines that all ran a single kernel, or distribution of Linux, and that Linux distribution was able to automatically upgrade. And it did it by taking a look into etcd, because if you are pushing an upgrade and want to upgrade all machines to a new version of Linux, say 5.0, you don't want all of them to start rebooting immediately. So etcd was a strongly consistent, highly available key-value database that also had locking capabilities, so machines could take a look and limit how many machines can be upgraded at the same time.
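The idea of limiting how many machines upgrade at once can be sketched with a compare-and-swap on a single key. The Python below is a toy, in-memory stand-in and not etcd's client API: `ToyStore`, the key name `upgrade/holders`, and the limit of two are hypothetical. A strongly consistent store with an atomic "write only if unchanged" operation is the one property the sketch relies on.

```python
class ToyStore:
    """In-memory stand-in for a strongly consistent key-value store."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def compare_and_swap(self, key, expected, new):
        """Write `new` only if the current value still equals `expected`."""
        if self._data.get(key) != expected:
            return False
        self._data[key] = new
        return True


MAX_PARALLEL_UPGRADES = 2       # hypothetical limit
SLOTS_KEY = "upgrade/holders"   # hypothetical key name


def try_acquire(store, machine):
    """Try to take an upgrade slot; False means wait (or retry after a conflict)."""
    current = store.get(SLOTS_KEY)  # None until the first machine takes a slot
    holders = current or ()
    if machine in holders:
        return True
    if len(holders) >= MAX_PARALLEL_UPGRADES:
        return False
    # The swap fails if another machine changed the holder list in the meantime.
    return store.compare_and_swap(SLOTS_KEY, current, holders + (machine,))


def release(store, machine):
    """Give the slot back once the machine has finished rebooting."""
    current = store.get(SLOTS_KEY)
    holders = current or ()
    if machine not in holders:
        return True
    remaining = tuple(h for h in holders if h != machine)
    return store.compare_and_swap(SLOTS_KEY, current, remaining)
```

A machine would call `try_acquire` before rebooting and `release` afterwards, retrying whenever either call returns False because of a concurrent change.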

It was created in 2013, and within only one year, it became the storage of Kubernetes. This is mainly because of the same principles and philosophies. Both systems have one source of truth, and everything that happens needs to be transparent and available in the source of truth. So it needs to be highly available and reliable. Kubernetes needed something like that, so it made sense for it to adopt etcd.

KASLIN FIELDS: And I think Kubernetes' first public commit was in 2014. So etcd predates Kubernetes a little bit, right?

MAREK SIARKOWICZ: Yes.

KASLIN FIELDS: Yeah.

WENJIA ZHANG: Yeah. Well, I will bring a little history between etcd and CNCF. So etcd was accepted to CNCF on December 11, 2018 at the incubating maturity level. And that was in Seattle, KubeCon 2018. Then two years later, it was moved to the graduated maturity level on November 24, 2020. At that time, we had a whole year of virtual KubeCon.

KASLIN FIELDS: Yeah.

WENJIA ZHANG: So yeah, etcd is a pandemic graduate.

KASLIN FIELDS: I see. And you came in at incubating, so you skipped over the sandbox level. Because it had already been around for five years at that point.

WENJIA ZHANG: Yeah, exactly.

KASLIN FIELDS: Good to know.

MAREK SIARKOWICZ: Yeah, I think etcd was second project that was part of CNCF. One of the first ones definitely.

KASLIN FIELDS: Yeah. Would make sense for it to be early.

WENJIA ZHANG: One of the very early ones, yeah.

KASLIN FIELDS: And how did you all get involved with etcd?

MAREK SIARKOWICZ: I've now been working on the project for over two years. I started around the latest minor release, which was 3.5. When I joined, I had the opportunity to pick any project that was available at that time because I was moving from the previous team, which was working on Kubernetes logging and monitoring.

And when I got an opportunity to work on a distributed system, I immediately-- I don't have any regrets. I immediately picked etcd because I found distributed systems always hard but interesting-- hard in an interesting way, that I want to learn it because barely anyone is talking about this, and this is such a unique area that I need to be there.

KASLIN FIELDS: etcd is such a central tool. Kubernetes, I don't think, could really exist in the way that it does without etcd. One thing I wanted to mention earlier, which I forgot to, is that it's often called the single source of truth in Kubernetes and wherever etcd is used.

WENJIA ZHANG: Yeah.

KASLIN FIELDS: Cool. So the distributed part of it was really interesting to you?

MAREK SIARKOWICZ: Yeah, like distributed systems are-- it's not easy to get an opportunity to work with them. It's complicated enough that it's mostly avoided. You try to build applications that are independent or stateless applications, and even in Kubernetes, everyone is really scared, I think, now. Maybe more, but stateful is something still not understood by many people.

KASLIN FIELDS: One of my favorite topics to talk about is stateful applications and distributed systems. And you're totally right on the-- distributed systems are kind of hard to get experience with. That's something that I tell people a lot when they're interested in learning Kubernetes for the first time, is like, oh, I'll just do a project at home, and I'll use it, and it'll be great. But any project you can come up with to use Kubernetes at home is probably not the best use case because Kubernetes is just meant for scale and meant for that distribution. So it gets tricky.

MAREK SIARKOWICZ: Yeah. So here comes my second passion, which is testing.

KASLIN FIELDS: I love testing.

MAREK SIARKOWICZ: You cannot simulate everything in your home environment; you cannot get every failure mode there. But in testing, you can simulate everything. So yeah, my main thing is proving that etcd is or is not correct in some situations, and it's mostly by simulating or injecting a failure into it.

So technically, you can. But it's just a test. Most of my learning of distributed system is by just testing and evaluating more than trying to run something in my garage.
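As a rough illustration of testing by failure injection, here is a minimal Python sketch. It is not the etcd testing framework mentioned in this conversation: `ToyCluster`, `run_with_failures`, the three-member size, and the random schedule are all invented for the example. The harness randomly crashes and restarts members while issuing writes, then checks one invariant: every write that was acknowledged must still be readable once the cluster is healthy again.

```python
import random


class ToyCluster:
    """Toy replicated log: a write is acknowledged only with a live majority."""

    def __init__(self, size=3):
        self.logs = [[] for _ in range(size)]
        self.alive = [True] * size

    def _live(self):
        return [i for i, up in enumerate(self.alive) if up]

    def _has_quorum(self):
        return len(self._live()) > len(self.logs) // 2

    def _freshest_log(self):
        # Logs only grow through acknowledged writes, so the longest is complete.
        return max((self.logs[i] for i in self._live()), key=len)

    def put(self, key, value):
        if not self._has_quorum():
            return False  # not acknowledged, nothing is written
        new_log = self._freshest_log() + [(key, value)]
        for i in self._live():
            self.logs[i] = list(new_log)  # live members catch up, then append
        return True

    def get(self, key):
        if not self._has_quorum():
            raise RuntimeError("no quorum")
        value = None
        for k, v in self._freshest_log():
            if k == key:
                value = v  # the last write to the key wins
        return value


def run_with_failures(seed, steps=200, size=3):
    """Inject random crashes and restarts; report whether acknowledged writes survive."""
    rng = random.Random(seed)
    cluster = ToyCluster(size)
    acknowledged = {}
    for step in range(steps):
        member = rng.randrange(size)
        action = rng.choice(["crash", "restart", "put", "put"])
        if action == "crash":
            cluster.alive[member] = False
        elif action == "restart":
            cluster.alive[member] = True
        else:
            key = f"k{rng.randrange(5)}"
            if cluster.put(key, step):
                acknowledged[key] = step
    cluster.alive = [True] * size  # heal the cluster before the final check
    return all(cluster.get(k) == v for k, v in acknowledged.items())
```

Running `run_with_failures` over many seeds explores many crash schedules. A real system has far more failure modes (network partitions, disk errors, slow members), but the structure is the same: inject the failure, record what was acknowledged, and check the invariant.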

KASLIN FIELDS: So you're going to help me learn etcd through testing, is what I'm hearing. In all that free time you have.

MAREK SIARKOWICZ: Yeah. I mean, I wrote a full testing framework for etcd and gave at least one or two presentations about this.

KASLIN FIELDS: Excellent. I need to go check those out. We'll put those links in the show notes if we can find those.

WENJIA ZHANG: We really appreciate Marek's passion for etcd. He is now the lead of GKE etcd, and he's been doing a lot of great work in etcd in open source. And he just briefly mentioned it as testing. But before he did this correctness test, how did we find data corruption issues? We found them in production.

How scary is that? And now with this test framework, he can proactively find the problems in the system before any cloud provider's production system actually hits them. Well, there are still a lot of improvements in Marek's mind, but yeah, we've been getting a lot better in recent years. Fixing things or adding features is just easy for him.

KASLIN FIELDS: Yes, I'm a huge advocate for testing. Testing in production is not ideal. It sounds like you've had experiences with that. I'm excited to hear it. And Wenjia, how did you get involved with etcd?

WENJIA ZHANG: I started working on etcd in 2017. That was the year that you mentioned you found etcd pretty hard to operate. And that was in the middle of the 3.2 release, I think that was 3.2.11, right before the release of 3.3.

None of those exist anymore. None of those are supported anymore, that's how long it's been. And I have been working on a lot of different parts around etcd in Kubernetes ever since: open source Kubernetes, etcd, API machinery. And now, with Marek, we are running reliable Kubernetes workloads at scale at Google.

KASLIN FIELDS: So while I was over here struggling to learn etcd, you were doing something about it.

WENJIA ZHANG: Yeah, we are.

KASLIN FIELDS: Getting involved. And what version is etcd on now? It's interesting that the etcd versions we're talking about here are like three point something. I'm used to the Kubernetes world, where it will never go beyond one.

WENJIA ZHANG: We started with-- I think when I joined, it was still two point something. And then there was three point.

KASLIN FIELDS: I feel like I started looking into it with a two point. I feel like 2.8 was what I started with. I might be totally wrong, but.

MAREK SIARKOWICZ: Kubernetes should have cheated like etcd, because etcd just started from 2.0. There was no version 1.0.

KASLIN FIELDS: I see. Interesting.

MAREK SIARKOWICZ: I did some research recently, and first release was version 0.2, and then there was some work on stabilization, and when they released, they released it in version 2.0.

KASLIN FIELDS: I love this history.

MAREK SIARKOWICZ: Or at least the first stable release was announced as version 2.0, to correct myself.

KASLIN FIELDS: Interesting. Usually folks use 1.0 as the stable release, but guess that didn't work out. I would be very curious what those conversations were like.

WENJIA ZHANG: It start 98.

KASLIN FIELDS: Yeah.

MAREK SIARKOWICZ: Usually takes three times to get something right. So maybe we are on the good number now.

WENJIA ZHANG: Yeah.

KASLIN FIELDS: That's true. So 1.0 of etcd was the pancake release. I don't know if you've heard of the pancake rule, but the pancake rule is the first pancake, you always throw away, because it never turns out right.

WENJIA ZHANG: It's interesting.

KASLIN FIELDS: Awesome. So you all have been involved with this for a while now. And we mentioned earlier your titles. Marek, you're a tech lead in SIG-etcd, and Wenjia, you are a co-chair of SIG-etcd. And we've talked about the history here.

etcd started out as this standalone project, which is used in other places aside from Kubernetes, but it's definitely most well known for its role in Kubernetes. And recently, there hasn't really been any announcement about this as of our recording, but etcd became a SIG within Kubernetes. Could you tell me a little bit about that?

WENJIA ZHANG: Yeah, definitely. Woo-hoo!

KASLIN FIELDS: Yay. I'm excited. More people in the Kubernetes community.

WENJIA ZHANG: Yeah, we're excited inside, even though we haven't written a blog post yet. We will.

KASLIN FIELDS: We'll get there. We'll get there. I'll help with that.

WENJIA ZHANG: I have to mention some names to talk about this. Originally, earlier this year Han Kang, who is a long time API machinery contributor and also the chair of SIG Instrumentation, Han and Marek wrote a doc called "The Case of SIGify etcd." So that is the beginning of this journey.

Yeah, now we are a SIG. James Blair from Red Hat and I are the co-chairs, and Marek and Benjamin are the co-tech leads-- sorry, Benjamin Wang from VMware and Marek are the co-tech leads. So why do we want to make etcd a Kubernetes SIG instead of--

KASLIN FIELDS: And by the way, SIG stands for special interest group. I should have said that earlier.

WENJIA ZHANG: Oh, yeah.

KASLIN FIELDS: It's how Kubernetes aligns its work.

WENJIA ZHANG: Yes. So etcd serves as an engine which makes Kubernetes what it is right now. And etcd provides the guarantees and the protocols that enable the controller pattern, which in turn provides the basis of the rich Kubernetes ecosystem. So you can imagine if etcd went wrong, a lot of things can go wrong.

So making etcd a first-class citizen of the Kubernetes project means that we will have a proper space to make an explicit contract between etcd and API machinery, and to prevent changes that would violate that contract on the etcd level.

KASLIN FIELDS: Just tying it much more closely to the project since it's already such an integral part, and just tying in with the processes will probably make that much easier.

WENJIA ZHANG: Yeah, definitely.

MAREK SIARKOWICZ: etcd wasn't the project that everyone appreciated in the Kubernetes ecosystem. There are so many use cases. Like now, you can get Kubernetes on submarines and oil rigs. So you have different things for different folks.

KASLIN FIELDS: On ships. We had an episode about Kubernetes on ships.

WENJIA ZHANG: Oh yeah?

MAREK SIARKOWICZ: And etcd wasn't always the best thing to pick in every environment, even though Kubernetes is always the main thing, but etcd was not. So not everyone agrees that this is the direction that Kubernetes should stay. But one of the things about community is that there is a sense of shared destiny. Because by having etcd as a project that everyone has a stake in, that everyone uses and depends on, everyone can contribute and everyone can benefit from those contributions.

So there is a lot of thought about this. Compared to all other interfaces in Kubernetes, like storage, which sharded after some time and defined an interface, Kubernetes, from the SIG API Machinery point of view, is the shared destiny that we will not want to change. So it's important that we bring it closer so we can all improve it and make it work for different use cases, instead of everyone sharding and then going their different ways, and in the end ending up with a situation where Kubernetes development slows down because everyone wants their database to specialize, either to optimize some kind of performance or another. So compared to all other cases, like databases beyond Kubernetes, there is a huge benefit in everyone contributing and taking part in it instead of saying, oh, let's define an interface, and let's go somewhere else.

KASLIN FIELDS: So from a technical perspective, this makes a lot of sense. Like we've said, etcd is integral to the way that Kubernetes works today. I'd be very interested to hear some of the arguments around Kubernetes not using etcd. I can't imagine Kubernetes without etcd. If folks listening feel strongly about that, maybe you should share on social media or somewhere.

Also I think this change to becoming a special interest group within Kubernetes, it's not just about the technical side of it. It's also about the community side, right? etcd was already a CNCF project. So you had to fill out the CLA, the contributor licensing agreement. That's like a CNCF thing that they organize for a lot of their projects.

Kubernetes is so big, and as a cochair of SIG contributor experience, I deal with a lot of the process around making the community work. So what benefits do you see from joining the Kubernetes community for etcd?

WENJIA ZHANG: I think it's a win-win situation. It benefits the Kubernetes community as well as the etcd community. From the etcd perspective, we want to integrate better with upstream Kubernetes and absorb their best practices, like the KEPs and PRs, to improve the consistency and reliability of the code base. And the Kubernetes release machinery and Kubernetes security response are all helpful for etcd, and bringing Kubernetes contributors to etcd is also one thing that we want, we are trying to do.

And then from the Kubernetes aspect, absorbing etcd as a SIG also means that we can shape the direction of etcd more directly, completing things like downgrade so that Kubernetes can later leverage. And having etcd being a SIG in Kubernetes also makes Kubernetes better.

KASLIN FIELDS: The Kubernetes community is very well known. So we also have a lot of benefits, I think, with hopefully attracting new contributors. Because like I said, I've always wanted to get involved with etcd, but I had other things and I probably still will not be able to.

But hopefully folks who come to the Kubernetes community, because we have a lot of new folks who come in who want to find ways to contribute and we're always trying to tell them, oh, find what you're interested in within the project, go hang out with that SIG, find out what's going on. It's kind of difficult always to get involved, to find that thing that you're really excited about and get integrated with that part of the project. But hopefully being part of Kubernetes, we can direct more of those folks who are coming to Kubernetes because of the big name toward etcd.

WENJIA ZHANG: Yeah, definitely. And then one thing to mention is, as you said, Kubernetes is a humongous project. And compared to Kubernetes, etcd is a relatively smaller code base, and easier to understand. And it's actually a good starting point.

KASLIN FIELDS: It's interesting that it's kind of like a standalone thing that is within the Kubernetes community now. So yeah, it makes a lot of sense for new contributors to be able to grasp.

MAREK SIARKOWICZ: A lot of contributors trying to contribute to etcd are so ingrained in the Kubernetes ecosystem, and its tooling, and bots, and releases, that they contribute to the project assuming that those same tools are available. And this is a big gap: they try to interact with the pull request bot and ask it to retest their code.

And unfortunately, from our perspective, it not only doesn't work, we cannot even give this permission to our contributors because GitHub doesn't support letting users rerun their tests. Only people that can write to the repository can rerun tests, which is, of course, not great if you have one test that is flaky and you're trying to get your PR merged.

KASLIN FIELDS: A lot of the things you mentioned, Wenjia, I think are things that people easily forget about or ignore with developing an open source project. Things around release, and like Marek was saying, GitHub infrastructure and GitHub automation and things are actually really hard to figure out for open source projects, especially at the scale of Kubernetes. So SIG contribex has a subproject that works on a lot of our GitHub stuff and we have all of these unique tools like Prow that help us do the work that we do within GitHub while maintaining certain rules around access, and making sure that things make sense. So you get some of that by joining Kubernetes.

WENJIA ZHANG: I just do want to mention, with all the things that we mentioned, we are not moving the repository into the Kubernetes organization.

KASLIN FIELDS: Oh, interesting.

WENJIA ZHANG: Which means the existing etcd development experience won't change. It will stay the same, but with some more benefit, like the bots and stuff that you can use. So for the people who are already etcd maintainers, contributors, it won't be that much disruption.

And also, becoming a SIG doesn't mean we will only focus on Kubernetes use case. So there are other cases still using etcd, but not Kubernetes. That's fine.

KASLIN FIELDS: Yeah, I had heard some conversation about features within etcd that are not used by Kubernetes but are used in other use cases, and what's going to happen to those now that it has become a SIG. And what I've been hearing, and it sounds like you're echoing here, is those things will continue, and etcd will continue to be a project that folks can use outside of Kubernetes as a reliable key value store in distributed systems. But just now, it has the benefits of using some of Kubernetes's processes.

MAREK SIARKOWICZ: One of my personal goals in moving etcd to Kubernetes was not to make etcd specialized for Kubernetes, but to do the reverse: take the work of the Kubernetes community and integrate it back into etcd. The etcd project has not always solved all the problems, because the use cases are so broad and it's hard to track them. And because Kubernetes users are so specialized in how they use etcd, they built a lot of additional tooling or improved how to use the etcd API in ways that are not available for everyone to use.

What I meant here mostly is the watch cache, which is from Kubernetes perspective, the main driver for optimized usage of etcd. And having this available for everyone and have benefit would be a huge difference. There are now more and more projects that try to-- they need a control plane and they need an efficient, highly scalable control plane that Kubernetes even cannot reach the scalability. For example, to build batch.

Batch has never been great on Kubernetes, and it's hard to make Kubernetes shift from running applications to batch workloads. And I hope that by bringing some concepts from Kubernetes to etcd, it should make it much easier, because there are already projects that support both etcd and Kubernetes as their control plane. And the main problem for them is they don't get the tooling-- like in Kubernetes, you get a lot of benefits from having a watch cache already built into the API server.

But having this also available from etcd would make everyone benefit. So basically, how can we build a reconciliation loop without the overhead of Kubernetes serialization/deserialization? This is one of the things that has been blocking some projects already, like Calico Typha or Cilium, that would greatly benefit from this.

KASLIN FIELDS: I did not expect you to talk about batch use cases, and this is really interesting. One of my teammates has been focused on batch use cases in Kubernetes. And so I've been learning a bit about that.

And one thing that I've noticed is Kubernetes has tools that are meant for batch use cases, like jobs. But honestly, what I see in a lot of real-life production batch use cases is folks want a lot of control over how the workloads are scheduled, and I'd never heard it taken to the etcd level, and I find that very interesting.

MAREK SIARKOWICZ: This is less about scheduling. It's more about you need to store configuration somewhere. And because of the batch workload type, you do a lot of immediate churn.

KASLIN FIELDS: Yeah. Batch is all about doing a big chunk of work in smaller chunks.

MAREK SIARKOWICZ: So in Kubernetes, you pay a lot of costs because you're reading and putting back a lot of objects quickly. You have scheduler that reads/writes and some controller that will read and write them back to the storage. There is a lot of overhead of Kubernetes API that does serialization/deserialization.

You don't have this overhead in etcd. You can just put the configuration, read it back. What you need is just get, list, and watch. And this is already something that etcd provides.
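To illustrate how get, list, and watch fit together, here is a small in-memory Python sketch. It is not the etcd client API: `ToyRevisionStore` and `LocalView` are hypothetical names, and the "watch" simply returns past events rather than streaming them. What it borrows from etcd is the idea of a store-wide revision, so a client can list once and then watch from the next revision without missing a change.

```python
class ToyRevisionStore:
    """In-memory key-value store where every change bumps a global revision."""

    def __init__(self):
        self.revision = 0
        self._data = {}
        self._events = []  # (revision, kind, key, value)

    def put(self, key, value):
        self.revision += 1
        self._data[key] = value
        self._events.append((self.revision, "PUT", key, value))

    def delete(self, key):
        if key in self._data:
            self.revision += 1
            del self._data[key]
            self._events.append((self.revision, "DELETE", key, None))

    def get(self, key):
        return self._data.get(key)

    def list(self, prefix):
        """Return a snapshot of matching keys plus the revision it was taken at."""
        items = {k: v for k, v in self._data.items() if k.startswith(prefix)}
        return items, self.revision

    def watch(self, prefix, start_revision):
        """Return matching events at or after start_revision (a real watch streams)."""
        return [e for e in self._events
                if e[0] >= start_revision and e[2].startswith(prefix)]


class LocalView:
    """Client-side copy kept current with one list followed by watch events."""

    def __init__(self, store, prefix):
        self.store = store
        self.prefix = prefix
        self.items, self.revision = store.list(prefix)

    def sync(self):
        # Resume right after the last revision already reflected locally.
        for revision, kind, key, value in self.store.watch(self.prefix, self.revision + 1):
            if kind == "PUT":
                self.items[key] = value
            else:
                self.items.pop(key, None)
            self.revision = revision
```

A reconciliation loop would call `sync` and then compare `items` against the desired state, reading from its local copy instead of querying the store on every pass.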

KASLIN FIELDS: And batch workloads are also very popular with AI/ML, which, as we all know, is the thing.

MAREK SIARKOWICZ: When I heard about this, it was also surprising for me. But it makes sense that projects like Cilium hit a scalability threshold on Kubernetes and just decided to support an independent, dedicated etcd cluster to store their configuration.

KASLIN FIELDS: Interesting. Also, coming back to what I was saying earlier about learning etcd, etcd is important in my experience in the certified Kubernetes administrator exam. You have to know a little bit about how etcd itself works and how the clusters work. If you don't mind switching gears for a minute here, I would love to hear your advice on how to learn etcd and a little bit of information about how it works.

MAREK SIARKOWICZ: I mean, I learned only two years ago, but it's so long.

WENJIA ZHANG: Marek started with a complicated project, downgrade.

MAREK SIARKOWICZ: Yes, I did.

KASLIN FIELDS: So into the fire. Right into it.

WENJIA ZHANG: Yep.

MAREK SIARKOWICZ: Yeah. For me, the main thing was having a good understanding of Raft.

KASLIN FIELDS: Oh yeah, Raft. I haven't heard about that in a long time. That's a key component of etcd, right?

MAREK SIARKOWICZ: Yes. Yes, it's just a library that came out of a PhD. So there is a full 100-page paper that was written to basically create and discuss all the edge cases. So it's a consensus protocol, which is simple to talk about, but is the crucial point of distributed systems and the hardest thing.

But for me, what Raft provides externally for me as a user or as an etcd-- because you can use Raft in any application that you would want. I don't know if you would want, but it's a crucial part of etcd because it can take any number of requests that can be made at different times by different users and just find one order, or propose one order, and just make sure that every member of an etcd cluster sees that order.

So every process is working as one unit to solve the same problem, whether etcd is storing some value, removing some value, or reading some value. All of them have one history of those changes. So understanding that in etcd some requests come in, then they go to Raft, which creates some order, and then etcd just executes them in this order, really helped. And so for me, Raft was the main thing.
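The mental model described here (requests come in, something agrees on one order, every member executes in that order) can be sketched in a few lines of Python. This is not Raft: `order_requests` is a stand-in that simply sorts by a made-up arrival time, where Raft would have a leader replicate log entries to a majority. The point of the sketch is only the consequence: members that apply the same log end up in the same state.

```python
def order_requests(requests):
    """Stand-in for consensus: turn concurrent requests into one agreed log.

    Assumption: sorting by a hypothetical arrival time and client name.
    Raft reaches agreement on the order in a very different way.
    """
    return sorted(requests, key=lambda r: (r["arrived"], r["client"]))


class Member:
    """One member of the cluster: a state machine that replays the agreed log."""

    def __init__(self):
        self.state = {}
        self.applied = 0  # number of log entries applied so far

    def catch_up(self, log):
        # Apply only the entries this member has not seen yet, in log order.
        for entry in log[self.applied:]:
            if entry["op"] == "put":
                self.state[entry["key"]] = entry["value"]
            elif entry["op"] == "delete":
                self.state.pop(entry["key"], None)
            self.applied += 1
```

A member that applies the log in pieces and a member that applies it all at once reach the same state, because both follow the same history. Applying the same requests in a different order could give a different result, which is why agreeing on the order is the crucial part.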

KASLIN FIELDS: As integral as etcd is to Kubernetes, Raft is as integral to etcd.

WENJIA ZHANG: Yeah. I started learning etcd with that Raft paper as well.

KASLIN FIELDS: Interesting.

WENJIA ZHANG: There are a lot of ways to learn etcd. You can read that paper first. There are a lot of great YouTube videos, a lot of them from KubeCon, Marek has some great tutorials. We can link some things in there.

KASLIN FIELDS: Yeah, that'd be great.

WENJIA ZHANG: And then there are a lot of help-wanted issues in etcd that you can start with. And I just want to mention here we actually also have an etcd mentorship program going on right now. The thing is we are shaping our first cohort of the etcd mentorship program. Unfortunately, right now, we do not have that much bandwidth to help as many people as we want.

KASLIN FIELDS: I mean, maybe you have something going on, becoming a SIG in Kubernetes. I could maybe see how you could be busy.

WENJIA ZHANG: Yeah. But we want to target having more etcd contributors and maintainers first. And then with that, we will have bandwidth to help grow more people who are just interested or who want to start to learn, and things like that. But stay tuned. We'll have more things coming up.

KASLIN FIELDS: Excellent. So several strategies, really, we've gone over here. There's that Raft paper if you want to dive into really how etcd itself works. And we also talked about testing, finding a use case, and playing around with etcd on its own outside of Kubernetes would definitely be a good way to learn. We'll find links to tutorials and videos, if there are any particular ones that you all want to recommend, and the mentorship program.

If you're interested in getting involved with etcd as a contributor, my advice for new Kubernetes contributors is always attend the regular SIG meetings. And when you first attend them, you don't know what's going on, and it's a little overwhelming, and you're like, what am I really doing here? But the key to it is you keep going to those meetings and you'll start picking up the language, you'll start understanding how it's organized, and then you become part of the community. So definitely check that out. And as you said, there are good first issues, right?

WENJIA ZHANG: Yeah.

KASLIN FIELDS: So folks who are interested in starting out with contributing, you can find those good first issues and help out there. And also, another great way to learn is coming up very soon here-- KubeCon.

WENJIA ZHANG: Yay.

KASLIN FIELDS: What does etcd have going on at KubeCon?

WENJIA ZHANG: We actually have a lot of activities going on this time. It's November in Chicago, and we will have SIG-etcd, the SIG track talk, our very first SIG track talk. We used to have project etcd talk. Now we have the SIG track talk.

So we will actually be talking about forging a stronger bond between etcd and Kubernetes. Marek will be there. I'll be there. James Blair, the other chair of etcd, will be there. And then Marek has a very interesting topic, secret of running etcd, there as well.

And then there are tutorial sessions about how to master etcd observability from [? Bogdan ?] and [? Vivek ?] from Apple. They are long-time etcd contributors as well. So that will be interesting. And then I just noticed that, when I looked up the schedule on KubeCon, there is a talk from Priyanka from SUSE. She will be talking about the etcd revision and resource version in Kubernetes. That's actually a very good topic.

KASLIN FIELDS: Priyanka is actually a tech lead in SIG ContribEx.

WENJIA ZHANG: Oh. OK. And we'll have ContribFest. We'll have an etcd kiosk. And all the details will be posted on etcd.io/kubecon.

KASLIN FIELDS: Oh, nice. We'll make sure to put that link in the show notes. What is going on with the etcd kiosk? Is that just on the show floor? You can come and meet the contributors?

WENJIA ZHANG: I think so.

KASLIN FIELDS: Cool.

MAREK SIARKOWICZ: Yeah, we are still pulling strings to get our dedicated kiosks, which is a benefit of being a CNCF project.

KASLIN FIELDS: Nice.

MAREK SIARKOWICZ: We are doubling down both SIG meeting and kiosk, which means that will be every day, all day someone will be there to talk about etcd.

WENJIA ZHANG: Yeah, come talk to us and try to learn it if you want to learn it, or learn how to run it, or learn how to contribute to it. Yeah, come talk to us.

KASLIN FIELDS: After this, I am definitely putting the secret of running etcd on my schedule. I need that. So I definitely haven't figured that secret out myself yet. Awesome. So thank you both so much for being here. I'm really glad that I got to learn a little bit more about etcd, and I look forward to seeing you all around the community.

WENJIA ZHANG: Yeah. Yeah. And thank you so much. It was a very pleasant conversation here. I'm so glad to be here. Yeah, now on podcast. That's exciting.

KASLIN FIELDS: If you're interested in joining etcd, make sure you join the SIG meetings. All right, thank you so much.

WENJIA ZHANG: Thank you. Bye. Bye, Eric.

[MUSIC PLAYING]

ABDEL SIGHIOUAR: Well, that was a pretty interesting episode. What do you think, Mofi?

MOFI RAHMAN: Yeah. I mean, etcd is such a low-level thing for Kubernetes. On a day-to-day use case of Kubernetes, you don't really think of etcd that much. It just happens. It just does things.

ABDEL SIGHIOUAR: Yeah. Of course, unless you are an administrator or something. Then it's your problem. Well, on the self-deployed Kubernetes.

MOFI RAHMAN: Yeah. And also another thing I think the person Kaslin's talking about in her team working on batch was me, and one of the challenges of running batch workload on Kubernetes is that on batch, you are oftentimes running thousands of copies of a job. And there is a possibility that you can overload the etcd server by having too many of certain type of objects, because etcd on Kubernetes has some limitations.

ABDEL SIGHIOUAR: Yeah.

MOFI RAHMAN: So my understanding and learning about etcd came from the point of view of, OK, I have a job that is going to spin up 200,000 pods and I might actually be reaching the limit of etcd.

ABDEL SIGHIOUAR: Yeah, this was an interesting conversation, because I think that this was the churn rates that Marek was mentioning. The API server and etcd have a churn limit, so you're not able to, like, spin up 3,000 pods at the same time. You can only do a certain number of pods per, I don't know, at any single moment. And I don't know, I think on GKE, it's like 20 or something per operation. So you cannot, like, just make requests to spin up more than 20 pods per request.

MOFI RAHMAN: Yeah. For batch workload, a lot of our customers are looking at how many maximum throughput of pods we can get, and a lot of that has to do with the Kubernetes API itself. But underneath, because Kubernetes has to store all that information to etcd, some of the limitation actually hits etcd as well. Can we get 300 pods spinning up per second for a batch workload, or some arbitrary number like that?
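To put the numbers mentioned in this exchange side by side, here is a back-of-the-envelope calculation. It assumes a steady, sustained pod creation rate, which is a simplification, and the helper name `seconds_to_create` is hypothetical. The 200,000 pods and 300 pods per second figures are the arbitrary examples from the conversation, not measured limits of etcd or the API server.

```python
def seconds_to_create(total_pods: int, pods_per_second: float) -> float:
    """Time to create all pods at a constant rate (ignores bursts and retries)."""
    if pods_per_second <= 0:
        raise ValueError("pods_per_second must be positive")
    return total_pods / pods_per_second


total_pods = 200_000   # size of the example batch job from the conversation
rate = 300             # the arbitrary target rate mentioned, in pods per second

elapsed = seconds_to_create(total_pods, rate)
print(f"{total_pods} pods at {rate} pods/s: about {elapsed / 60:.1f} minutes")
```

At that hypothetical rate, the job would take roughly eleven minutes just to get its pods created, which is why pod throughput matters so much for batch workloads.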

ABDEL SIGHIOUAR: I'm wondering if Kueue, you know Kueue, the queuing tool for Kubernetes, if it has solutions for that? Do you know?

MOFI RAHMAN: So Kueue itself does not necessarily control the Kubernetes API server. Kueue handles everything via using the suspended flag on the job. So it will still be a Kubernetes object, but it will be suspended so that it doesn't actually take up resources on the cluster.
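The suspend mechanism described here can be illustrated with a plain Job manifest. This is a minimal sketch built as Python dictionaries, not Kueue's actual code: `make_suspended_job` and `admit` are hypothetical helpers, and the container image and command are placeholders. The only Kubernetes detail it relies on is the `spec.suspend` field of a `batch/v1` Job, which keeps the Job stored as an object without creating pods for it.

```python
import copy


def make_suspended_job(name: str, parallelism: int) -> dict:
    """Build a batch/v1 Job manifest that exists in the API but runs no pods yet."""
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": name},
        "spec": {
            "suspend": True,  # the Job object is stored, but no pods are created
            "parallelism": parallelism,
            "completions": parallelism,
            "template": {
                "spec": {
                    "restartPolicy": "Never",
                    "containers": [
                        # Placeholder container, for illustration only.
                        {"name": "worker", "image": "busybox", "command": ["true"]}
                    ],
                }
            },
        },
    }


def admit(job: dict) -> dict:
    """Hypothetical admission step: return a copy of the Job with suspend turned off."""
    admitted = copy.deepcopy(job)
    admitted["spec"]["suspend"] = False
    return admitted


job = make_suspended_job("demo-batch", parallelism=3)
running = admit(job)
```

Note that a suspended Job is still an object the API server has to store, which is consistent with the point that this approach controls resource usage on the cluster rather than traffic to the API server.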

ABDEL SIGHIOUAR: I see. But I was wondering more if Kueue has a way to-- well, since it's a queue, it has a way to limit how many requests go to the API server. So it's like before the request goes to the API server.

MOFI RAHMAN: So the way Kueue is designed at this point, it is not meant to have its own rate limiter for Kubernetes. It rate limits on how many objects get resources on the infrastructure, not-- because otherwise, Kueue would have to maintain its own storage so that it can cache things before sending it to Kubernetes. And Kueue's goal from the get go was make it as simple and as much of not reinventing the Kubernetes wheel as possible. But again, down the line, it is possible that the use cases become so complex that we need to do something similar. But with Kueue, at this point, it is keeping as much native Kubernetes as possible.

ABDEL SIGHIOUAR: That makes sense. OK, I get it now. I think also the other stuff that Marek was talking about, and I don't know if-- I don't want to speculate here. But he was saying that in addition to etcd, the API server adds overhead with the serialization/deserialization. And I think he was hint-- I'm not sure, but there was a hint in the conversation toward can we actually skip the API server and use the watch feature of etcd directly, right?

MOFI RAHMAN: I thought there was a hint of that. Also in the conversation was really interesting for me where-- I didn't know about this, but they did mention in the community people are talking about can we make this entire interaction between API server and etcd into some sort of interface where other database and other solutions can be used.

ABDEL SIGHIOUAR: Yeah, standardize it. Yeah.

MOFI RAHMAN: And I thought that was interesting because in Kubernetes, pretty much every single other thing is an interface. The image itself is an interface, the networking is an interface.

ABDEL SIGHIOUAR: The CRI, the CNI, all that stuff. Yeah.

MOFI RAHMAN: Yeah, all of these are interfaces. But when you think about in theory, the thing etcd does, it's such a simple thing. It stores all the state.

ABDEL SIGHIOUAR: Yeah, it's key value. That's it.

MOFI RAHMAN: It's key value storage. But the thing he mentioned was really interesting to me is that optimizing for different type of database is way harder than just optimizing for one type of storage.

ABDEL SIGHIOUAR: Of course. Yeah. Yeah. Yeah.

MOFI RAHMAN: So yeah, as a community, we have decided to back etcd. And I think having the community sticking together and making etcd as best-- as good as possible, is probably long term better for the Kubernetes project. But again, as they also spoke about in the conversation, people might have different opinions and we might see some other things happening in the conversation.

I am sold that etcd is-- we have already put so much effort in etcd and making it as good as possible is probably the best idea. But I'm interested in seeing what people have thoughts about. Are people finding limitations or are there any solutions that work better?

ABDEL SIGHIOUAR: Yeah. If they work toward having a standardized interface for, basically, Kubernetes to access a key value store, then that probably would open up the door to other things. You have ZooKeeper, you have-- what's the one from HashiCorp?

MOFI RAHMAN: Consul.

ABDEL SIGHIOUAR: Consul. That's technically-- I mean, I don't think Consul would work because it's HTTP, it's layer seven. So that will add up a bunch of overhead for the requests going in and out of Consul. But yeah, no, stuff like that. Key values, there are tons of them already.

So that was an interesting conversation about the churn rate and the overhead. I didn't know that Kubernetes, the API server would serialize/deserialize the data before-- well, I mean, it's part of the HTTP, so it makes sense. I just did not think about it. But as you say, this is-- it's one of these components that as long as it works, you wouldn't care. And it just, for the most, just works.

So I think converting it to a SIG probably would make sense to also bring up the Kubernetes closer together, because Kubernetes, by this stage, is pretty mature, and already has the tooling, and it's already-- the community knows how to work very well together. They're doing three releases per year and they're very good at doing it. So yeah.

And speaking of this, I was just chatting before we started recording. One of the funny things was they mentioned that the graduation, it was a pandemic project, right?

MOFI RAHMAN: Yeah. November 2020, I think, they graduated.

ABDEL SIGHIOUAR: And also, the last time etcd was on the Kubernetes podcast was in the pandemic. It was the beginning of the pandemic, March 2020. So it's kind of funny how this whole timeline works.

MOFI RAHMAN: And the other thing that was really interesting for me was when the conversation went to learning a new project or learning etcd, and Marek mentioned his first foray into etcd was trying to figure out downgrading etcd, which is such a left field of anything normal people would do with etcd.

And it's kind of trial by fire. You learn most when you're in unfamiliar territory and doing something that most people wouldn't touch. But at the same time, it's probably not how most people learn things. We learn things by doing simple things, Hello, world type things.

So that was interesting, he is now the tech lead of etcd. So I'm assuming he learned it pretty well. But he did that via doing something that 99.99% of the people will never touch, downgrading a software. You don't think about it like that.

ABDEL SIGHIOUAR: Yeah. Yeah. Yeah. It's quite interesting also that-- yeah. You usually would learn something by either reverse engineering it or following some documentation and building some stuff with it, but not having to-- that would be like learning Linux by having a bunch of Windows servers that you have to migrate to Linux and you have to do it, like, now.

MOFI RAHMAN: Yeah.

ABDEL SIGHIOUAR: I'm not saying that Linux is a downgrade over Windows, but you have to learn it on the job. You have to do it and you-- people are expecting you to get stuff done.

MOFI RAHMAN: Yeah. Most of my Kubernetes learning probably happened while troubleshooting something, like something not working and you are basically going through the error logs and trying to figure out what this means, and just go to Google, and Google it, and just read issues that are stale for like four years. It's just--

ABDEL SIGHIOUAR: Yeah. Oh, yeah. The amount of time spent on Stack Overflow and GitHub, and--

MOFI RAHMAN: Yeah. I think all the growth and knowledge comes from frustration.

ABDEL SIGHIOUAR: Yes. Yes, I agree with you. But then this is a side conversation, we're sidetracking here. But this is back to whenever people ask me, what's the great way to learn x, insert x, whatever it is, right, cloud, or whatever.

It's like, you have to have the fundamentals nailed down. You have to know your way around the command line. As long as you do that, troubleshooting a Linux server, or a Kubernetes cluster, whatever, you know how to use the command line, you know how to pipe stuff through things and filter and stuff like that, then it doesn't matter. If it has an SSH console, you know how to fix it.

MOFI RAHMAN: Yeah. And another thing that was really interesting from this conversation as well is that something like etcd is such a fundamental building block of this cloud native system like Kubernetes, when it works well, you don't even think about it exists. So a lot of these projects that we have a lot of thankless maintainers maintaining these projects day in and day out and keeping this whole behemoth of a system that is running right now running, until something actually goes wrong, you don't even think about they exist.

So it is-- both of us, Kaslin included, we work fairly closely with open-source communities, so we have probably decent understanding of these challenges. But most people that are benefiting from a lot of these open source projects probably don't have to think about them as much. And we probably think a little bit more about those things.

ABDEL SIGHIOUAR: While you were talking, I was looking up something that you reminded me of. This gave me the vibes of the-- what was it, the Heartbleed vulnerability in OpenSSL in 2014 when suddenly-- OpenSSL is so fundamental to how the internet works that suddenly that library having that vulnerability, everybody was just running around like headless chickens trying to figure out how to fix it, and then realize, well, none of the maintainers actually work on this thing full time. Everybody does that as a side job.

MOFI RAHMAN: I think we should link this in the show notes, but that XKCD comic of all the internet held up by one person working from the Midwest on the project.

ABDEL SIGHIOUAR: Exactly.

MOFI RAHMAN: All the big projects handling on top of that--

ABDEL SIGHIOUAR: Exactly.

MOFI RAHMAN: That is very, I don't know, relevant to this conversation, too.

ABDEL SIGHIOUAR: Yeah. I mean, yeah, it's just the realization that something so fundamental to how everything works, and then suddenly-- it's actually the same thing of you go to a conference and you meet three people who are the sole three maintainers of a single big library that everybody-- and then it was like, oh, did you fly on the same flight? OK. And you took the same bus to come to this conference.

MOFI RAHMAN: Yeah. We need a new open source project that basically tracks who are the core maintainers of certain things and don't let them schedule things at the same. Yeah.

ABDEL SIGHIOUAR: Are doing--

MOFI RAHMAN: Funny

ABDEL SIGHIOUAR: Exactly. You're not allowed to be on the same bus, not on the same flight, not even in the same corner. That would be very funny. But yeah, it's-- I mean, obviously, this is something that I think you probably have been in the industry long enough to know that the way open source is done today is completely different from the way it was done like 10, 15, 20 years ago.

It was people doing things on the side, in their spare time. Now there are people being paid full time to actually work on open source, which completely, fundamentally changed the vision people have towards open source.

MOFI RAHMAN: Yeah. I think one of the conversations also brought up in the conversation, too, about the shared fate. So a lot of the open source work now is companies understanding that they can go a lot further by sharing their ideas rather than just building individual things.

And also, something switched in the last 15, 20 years where consumers also are looking at open source projects with a more favorable vision where open source project seems to have a better life cycle, seems to have better support long term, and has quicker growth. So it's one of those, dare I say, win-win situations where companies realizing they can do more things, consumers also realizing they can get more out of their vendors and whatnot from being supportive of open source projects.

ABDEL SIGHIOUAR: Yeah. Despite what everybody says, everybody has opinions about the CNCF, this is one of the core values of having a foundation like the CNCF or the Linux Foundation. It's giving that feeling of maturity, that this is not just a random person with a random GitHub repository that can, at any moment in time, turn it into private, and then suddenly no one can download the code. So like, no, no, this is a proper foundation with proper governance and the same people that know what they're doing.

So yeah. Yeah, anyway, that was cool. That was an interesting etcd episode. I didn't know I wanted to even learn anything about it, and then an hour later, I'm like, oh, how come I was missing all of this stuff?

MOFI RAHMAN: I honestly think I am still OK being blissfully ignorant, because we have bigger and more complex things on top of etcd and Kubernetes we can do.

ABDEL SIGHIOUAR: Yes.

MOFI RAHMAN: But I was glad that I spent that time learning about it.

ABDEL SIGHIOUAR: Yeah, it was pretty cool. It was pretty cool. Well, thank you for hanging out with me, Mofi.

MOFI RAHMAN: Thanks for having me.

ABDEL SIGHIOUAR: It was fun having you on the show. And yeah, we'll talk to you next time. Thank you.

MOFI RAHMAN: Until next time.

[MUSIC PLAYING]

KASLIN FIELDS: That brings us to the end of another episode. If you enjoyed the show, please help us spread the word and tell a friend. If you have any feedback for us, you can find us on Twitter, @KubernetesPod, or reach us by email <KubernetesPodcast@google.com>.

You can also check out the website at KubernetesPodcast.com where you'll find transcripts, and show notes, and links to subscribe. Please consider rating us in your podcast player so we can help more people find and enjoy the show. Thanks for listening, and we'll see you next time.

[MUSIC PLAYING]
