Kubernetes Podcast from Google: Episode 225 - Postgres on Kubernetes, with Álvaro Hernández

#225 May 15, 2024

Postgres on Kubernetes, with Álvaro Hernández

Hosts: Abdel Sghiouar, Kaslin Fields

Álvaro Hernández is the founder and CEO of OnGres, a company that provides, among other things, a distribution of Postgres that runs on Kubernetes called “StackGres”. Álvaro is also an AWS Data Hero and a passionate database and open source software developer.

Transcript

ABDEL SGHIOUAR: Hi, and welcome to "The Kubernetes Podcast" from Google. I'm your host, Abdel Sghiouar.

KASLIN FIELDS: And I'm Kaslin Fields.

[MUSIC PLAYING]

ABDEL SGHIOUAR: This week, we speak to Álvaro Hernández, founder and CEO of OnGres, a company that provides, among other things, a distribution of Postgres that runs on Kubernetes called StackGres. We spoke about databases on Kubernetes, Postgres, and its extensions, and building dynamic containers.

KASLIN FIELDS: But first, let's get to the news.

[MUSIC PLAYING]

The largest code cleanup from Kubernetes yet has been completed, removing over 1 million lines of code. The removal of vendor-specific code from Kubernetes, also known as movement of that code from in-tree to out-of-tree, is a trend in Kubernetes which you can see reflected in several projects.

The removal of storage plugin code and the design of the Gateway API are two other examples of this trend. This particular PR removes code that provided functionality specific to cloud-provider environments. Much of this functionality will now be provided out-of-tree in separate controllers.

Check out the links in the show notes for more information and history on this Kubernetes Enhancement Proposal, or KEP, which has been in the works since 2017. And congratulations to all the contributors who made this exciting achievement possible.

ABDEL SGHIOUAR: Google I/O, Google's largest developer conference, is happening online on May 14. You can tune in to the livestreamed keynote and dive into technical content and learning material on demand.

KASLIN FIELDS: The CNCF's report on KubeCon EU 2024 is out now. This was the largest KubeCon ever, with over 12,000 attendees, 52% of whom were first-time attendees and 78% of whom were from Europe.

The event featured 331 total sessions, with 223 breakouts and 19 keynotes from 647 speakers. It also featured KubeCon's first-ever hackathon, CloudNativeHacks, held in conjunction with the United Nations. And over 6,000 people attended 16 co-located events, and there were over 2,540 CFP submissions.

ABDEL SGHIOUAR: The CNCF and Google are hosting a Kubernetes birthday bash event on June 6 at the Google Bay View Campus in Mountain View, California. The birthday bash will blend elements of a reunion show, while celebrating many hidden figures that made this project possible, and a preview into what awaits us in the next 10 years. We'll celebrate people who have been integral in shaping Kubernetes into what it is today and discuss why its significance will only grow in the years ahead. RSVPs for the party are open now.

KASLIN FIELDS: The Kubernetes community has taken over management of the official Kubernetes account on X. Previously, the account was run by the CNCF, and the CNCF will still post on the account occasionally. Follow @kubernetesio on X for news from the Kubernetes project for users, community members, basically, for everyone.

ABDEL SGHIOUAR: For some particularly topical projects, the Kubernetes project spins up working groups, contributor teams with specific topical focus. The community is excited about the creation of a new serving working group, which will focus on AI workloads in Kubernetes. The serving working group's goal is to discuss and enhance the support for inference-serving for accelerated workloads in Kubernetes, and to make Kubernetes the natural choice for hosting production inference reliably, and improve all serving workloads along the way.

KASLIN FIELDS: The Data on Kubernetes Community, which will be discussed more in today's interview, has an ambassador program, with applications open now until May 31, 2024. The DoK Ambassadors Program was created to acknowledge and empower members of the community that have demonstrated expertise in the DoK ecosystem and a level of enthusiasm for participating in the community. And that's the news.

[MUSIC PLAYING]

ABDEL SGHIOUAR: All right. Today, we're talking to Álvaro Hernández. Álvaro is the founder and CEO of OnGres, a company that provides, among other things, a distribution of Postgres that runs on Kubernetes called StackGres. Álvaro is also an AWS Data Hero and a passionate database and open-source software developer. Welcome to the show, Álvaro.

ÁLVARO HERNÁNDEZ: Hello. Thank you very much for having me.

ABDEL SGHIOUAR: This was a long, "had to be done" kind of thing. We met last year in July in Greece. [CHUCKLES] And we've been discussing having you on the show, so I'm glad you're here. Let's get started with the question we ask all the time. Introduce yourself. Who are you? What's your background? [CHUCKLES]

ÁLVARO HERNÁNDEZ: Actually, I think I'm going to answer how other people define me. [CHUCKLES] And it's kind of the Postgres man. In reality, I started using Postgres a long, long time ago, way back in the university times when I didn't know anything about databases. And I needed to develop an application. I had to store some data.

And I asked some of my colleagues. They say, use a database. I was like, what is that? They explained me, and I said, OK, which one do I use? And they say, Postgres. OK, fine. I never looked back, never used another database.

ABDEL SGHIOUAR: Nice.

ÁLVARO HERNÁNDEZ: It always fulfilled my needs. So I'm kind of a Postgres person. But more generically, I'm, of course, a telecommunication engineer by my university degree. I studied both in Spain and in the US. And I'm very, very passionate about technology. I define myself as a learnaholic, which means I like to learn about almost every corner of technology. I have my preferences, but I like to learn about everything and have a wide-open vision about all technology around.

And I'm also a founder of a company, a bootstrapped company, which makes us a small startup, but I'm very proud of this nature. And I also like and advocate very much for open-source software. I've been doing a lot of open-source all my life. Since I was, again, back at the university times, I have contributed and created a lot of open-source software. And this is also my passion. So technology, in general, is all about passion for me.

ABDEL SGHIOUAR: Nice. And I'm reading some of those kind of icebreakers we put in the DoK. You've never actually worked for anyone. You've always been your own boss.

ÁLVARO HERNÁNDEZ: Yeah. Yeah. I don't know if this is good or bad. It's probably bad in many ways because I haven't had the chance to learn from this. But yeah, it's how it's gone, and I like this. Honestly, I was an intern for two or three months at a company while I was studying at the university. I don't know if this counts or not.

ABDEL SGHIOUAR: That doesn't count. [LAUGHS]

ÁLVARO HERNÁNDEZ: Yeah, I guess. I mean, I had a boss back then, and I broke some rules already because, well, for example, the corporate laptop had to have Windows installed. And I said, I cannot work with Windows. I need to use Linux.

And they said, no, this is not permitted. But two days later, it was running on Linux already. And well, they didn't kick me out. And actually, they called me because, oh, you're the Linux expert, and we need to go with a customer. Can you do a training? I was an intern. I was like, OK, whatever, but yeah.

Then what happened is that at the university, we started getting some inquiries from some people who needed help. And that turned into a paid job, and that turned into a company, eventually. And then from that company, there have been a few attempts, or pivoting, as it's called in a nicer way, up until today. So yeah, I've never worked for anyone other than myself.

ABDEL SGHIOUAR: Wow. OK. That's interesting to talk to people with this background where you go straight from university to a founder. I mean, we ignore the time you were an intern.

ÁLVARO HERNÁNDEZ: Right, right.

ABDEL SGHIOUAR: So speaking about founding a company, your company is called OnGres. And you built a piece of software, open-source software, just to be very precise. We're not talking about OnGres today because we don't talk about vendors.

ÁLVARO HERNÁNDEZ: Exactly.

ABDEL SGHIOUAR: But the software is called StackGres.

ÁLVARO HERNÁNDEZ: Yes.

ABDEL SGHIOUAR: What is StackGres? What are you trying to solve?

ÁLVARO HERNÁNDEZ: StackGres, actually, the name tells it all. It's about the stack. So StackGres is a platform for running Postgres on Kubernetes, to put it simply.

But when you look at Postgres, I mean, what I'm going to say about Postgres, I love Postgres. It's a fantastic database in so many ways. But Postgres itself, sometimes I call it-- it's like the Linux kernel. It's something that is very trustable, reliable, powerful, performant, and all that. But do you run just the Linux kernel on your laptop or your server? I mean, no.

ABDEL SGHIOUAR: Probably not. [CHUCKLES]

ÁLVARO HERNÁNDEZ: Right. You need some tools around it. You need at least a shell. Maybe you need some ls, grep utilities. Maybe you need some window manager, like the X, or Wayland, or whatever. So you need a lot of software accompanying it.

And Postgres is exactly the same. Postgres, at its core, it's tiny. It's minimal. If you download Postgres, it's a 10-- 15-megabytes download, compressed. If you build the image, it's like 9 megabytes, basic Postgres image. It's a tiny binary, but it has tons of functionality. But you need all these tools around.

So for example, Postgres does not come with connection pooling. And almost everybody knows that you should run Postgres with connection pooling. You very rarely don't want connection pooling with Postgres. And it doesn't come with Postgres.

Then you probably want high availability. Who doesn't want high availability on their database? But Postgres doesn't come with high availability included. It has the primitives but not a solution for high availability. Postgres does come with a truly basic backup solution, but in reality, you need to buy a proper backup solution that's not included in Postgres.

So we call this the stack problem. To run Postgres, you need the stack of components surrounding it, next to it, that make up the whole ready-to-use, let's say, batteries-included solution. So we have been building this stack at my company for many years. And we are deep believers in IaC, Infrastructure as Code. And we were always trying to pack this in many ways, typically using things like Ansible.

And the thing is that, across customers and across environments, we realized that this was not reusable. I mean, yeah, it's code, but you're supposed to be able to compose it, to reuse it. And every environment was different. And that led us to using different code for every environment. And it was a lot of copy-and-paste and hard-to-maintain code, which diminished the advantages of using infrastructure as code.

So one day, we started thinking about how we could create a uniformly distributed package, something where we could package the stack and deploy it anywhere. It doesn't matter what the environment looks like. The apparent answer was that this is not possible, until thinking deeply about the subject.

We concluded one day, Hey, what about this thing called Kubernetes? Kubernetes is, essentially, an abstraction over infrastructure. It's all API-driven and abstracts away, basically, very roughly speaking of course, compute, networking, and, very importantly for us, storage. It was like, wow.

So we can consider Kubernetes as a virtual infrastructure that is uniform. It doesn't matter what it runs. On any cloud, on-prem, it's going to present the same virtual infrastructure. And then it's all API-driven. I can automate. I can program against this.

It was like, wow, this is the solution to pack our stack into a place where we can deploy uniformly. And that's what it is. It's something that it's packaged for Kubernetes, and it runs anywhere that Kubernetes runs. I mean, this is now, of course, very usual. Everybody understands this. But at the time, that was kind of a revelation, like, wow, we can really do this in a uniform way.

ABDEL SGHIOUAR: Yeah. I mean, yes, everybody understands it now, but I think that Kubernetes, when it was at its inception 10 years ago, was not really designed for these kinds of use cases there. Well, we're going to get into details.

So I want to quote something from an article you wrote on The New Stack. And the quote was, "Is Kubernetes ready for stateful workloads?" question mark. So this article was out in October 2021. Now we're three years later. What's the answer to that question today?

ÁLVARO HERNÁNDEZ: OK. I think I would double down on my answer at the time. So at the time, I stated-- I've gone even as far as recommending everybody that, by default, you should deploy your databases on Kubernetes.

ABDEL SGHIOUAR: Oh, wow. OK. Strong statement. [LAUGHS]

ÁLVARO HERNÁNDEZ: Yes. Yes. In general, strongly opinionated. And I would say that you have to have strong reasons not to do this. And one of the reasons actually piggybacks on what we were just chatting just before, the stack problem.

Deploying a database is not a simple task. It involves especially Postgres, but it applies to pretty much every other database out there. It involves coordinating a lot of software, the database itself, all the companion tools around it, the backup tools, the monitoring, the alerting, the high availability.

And even more, because you're asking me, as of today, well, as of today, for example-- and just to briefly mention something we just added to StackGres-- the support for sharding, to scale out, to create a cluster that may involve dozens of Pods, in Kubernetes terminology. How do you do all this manually?

Even if you do it with something like Ansible, this is a huge endeavor. You need to coordinate all these tools. You need to make sure that the config files are appropriate, that they are compatible with each other, and that the technology you're picking for high availability, which may come from a different vendor or community than the connection-pooling solution, gets along well with it.

So Kubernetes has allowed not only these orchestration capabilities it natively provides but also the concept of operators, which, obviously, is central to running pretty much anything but definitely workloads, like databases.

The goal is that they encapsulate the knowledge domain of running that particular piece of software, the database, so that, A, it's exposed to the users as a very simple-to-use, high-level interface. And B, it packages all this knowledge, so you don't have to have it, and you don't need to go into all these details.

So as of today, deploying a complex piece of software, which is a database with all these tools, with maybe clustering capability for scale-out and a lot of moving parts, and have it streamlined in a very simple CRD or a few CRDs that an operator is providing to you, it's extremely compelling.

But is this enough to be the default? And again, I would say, yes, because one of the things that Kubernetes provides is this high level and a lot of primitives for resilient operation. Say something fails, like the Pod where your database primary is running. In a non-Kubernetes environment, you're going to have some other servers that will take over this role.

But typically, you will not provision a new server to replace the one that has failed. Or this will involve humans. I mean, you can still do, of course, but in practice, I've barely seen this.

Our customers who run on non-Kubernetes environments do have to manually provision. Or maybe it's all automated, but at the end of the day, it's going to be a human saying, OK, we have this server. And then this server will take over this role.

So temporarily, you run with one instance less. Everything is designed for this. And you have capacity planning, and basically, you're paying for an extra server. But with Kubernetes, everything can self-heal automatically, as you have a more common pool of servers.
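The self-healing behaviour described here comes from reconciliation: a controller keeps comparing the desired state with what is actually running and closes the gap. The following is a minimal Python sketch of that idea, not StackGres or Kubernetes code. The `ClusterState` class, the `reconcile` function, and the `pg-N` Pod names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ClusterState:
    """Toy model of a database cluster: desired size vs. Pods that are healthy."""
    desired_instances: int
    healthy_pods: list = field(default_factory=list)
    next_id: int = 0


def reconcile(state: ClusterState) -> list:
    """Create replacement Pods until the healthy count matches the desired count.

    Returns the names of the Pods created during this pass.
    """
    created = []
    while len(state.healthy_pods) < state.desired_instances:
        name = f"pg-{state.next_id}"
        state.next_id += 1
        state.healthy_pods.append(name)
        created.append(name)
    return created


cluster = ClusterState(desired_instances=3)
reconcile(cluster)                     # initial rollout creates three Pods
cluster.healthy_pods.remove("pg-1")    # simulate a failed Pod
replacements = reconcile(cluster)      # next pass replaces it, no human involved
print(replacements, cluster.healthy_pods)
```

In a real cluster the replacement is scheduled onto whatever node in the shared pool has capacity, which is why no dedicated spare server is needed.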

So this is one of the main decision factors for me. The level of high availability and resilience that you get by operating Kubernetes is superior to what you get in non-Kubernetes environment. The other one is related to the so-called day-two operations. After all, Kubernetes presents-- and I call it a programmable API. And against this API, you can program anything that will interact with the compute, with the storage, with the network. And you essentially can do things with this.

So if you think about-- OK, let me bring you a very, very specific example. I want to run a pg_repack operation in a Postgres database. This is an operation that Postgres has. When you're doing updates or deletes, they are soft updates, soft deletes, essentially. So kind of rows are marked for deletion, but they're not immediately removed.

And after some time, you may accumulate some of these dead rows. They're called bloat. So at some point, 40% of your table is essentially bloat. This consumes more memory, more I/O, and you want to get rid of that. There's mechanisms for automatically getting rid of that, but at some point, you need to run an operation called repack that says, let's compact this all and defrag. In the very old times, it definitely doesn't work like--

ABDEL SGHIOUAR: The Windows time?

ÁLVARO HERNÁNDEZ: Yeah.

ABDEL SGHIOUAR: Yeah.

ÁLVARO HERNÁNDEZ: And it's online, so it's OK. But this involves a little bit of domain knowledge. You need to know how to run pg_repack without risks. And you need to check for potential errors and potential conflicts. And there's a small pre-check list and all these things. So how this is done now, as of today-- essentially, manually, I mean, there's going to be a runbook. Maybe someone wrote a script to do this. But it's still an involved task.
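To make the bloat idea concrete, here is a toy Python model of soft deletes and compaction. It only illustrates the concept of dead rows taking up space until they are compacted away; it does not reflect how Postgres stores tuples or how pg_repack works internally, and the `ToyTable` class is hypothetical.

```python
class ToyTable:
    """Rows are only marked dead on delete; space is reclaimed by repack()."""

    def __init__(self, values):
        self.rows = [{"value": v, "dead": False} for v in values]

    def delete(self, value):
        for row in self.rows:
            if row["value"] == value and not row["dead"]:
                row["dead"] = True  # soft delete: the row still occupies space

    def bloat_fraction(self):
        """Share of stored rows that are dead."""
        if not self.rows:
            return 0.0
        return sum(row["dead"] for row in self.rows) / len(self.rows)

    def repack(self):
        """Rewrite the table keeping only live rows."""
        self.rows = [row for row in self.rows if not row["dead"]]


table = ToyTable(range(10))
for value in range(4):
    table.delete(value)
before = table.bloat_fraction()   # 4 dead rows out of 10 stored
table.repack()
after = table.bloat_fraction()
print(before, after, len(table.rows))
```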

With Kubernetes, it's possible to program this in the operator. It's possible to offer this at a distance of five lines of YAML CRD. We've done this in StackGres. I'm speaking from something that exists as of today.

So you write five lines of YAML on a CRD that's called SGDbOps, StackGres database operations. You say, I want to run a pg_repack on this database. And that's it. And Kubernetes will run it for you, and will post the results on the status part of the CRD. So it communicates back with you and tells you, this is the result. So this level of automation of day-two operations can reach much farther than even managed services as of today.
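As a rough illustration of how small such a request can be, the Python sketch below builds an SGDbOps object as a plain dictionary and prints it. The resource kind comes from the conversation; the `apiVersion` and the `sgCluster` and `op` field names are written from memory of the StackGres CRDs and should be treated as assumptions to check against the StackGres reference. The object and cluster names are hypothetical.

```python
import json

# Field names are assumptions; verify them against the StackGres CRD reference.
repack_request = {
    "apiVersion": "stackgres.io/v1",
    "kind": "SGDbOps",
    "metadata": {"name": "repack-demo"},   # hypothetical object name
    "spec": {
        "sgCluster": "my-cluster",         # hypothetical cluster name
        "op": "repack",
    },
}

# kubectl accepts JSON as well as YAML, so this output could be piped
# to `kubectl apply -f -` on a cluster where the operator is installed.
manifest = json.dumps(repack_request, indent=2)
print(manifest)
```

The operator, not the user, carries the pre-checks and error handling; the user only states the intent and later reads the outcome from the object's status.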

ABDEL SGHIOUAR: I think probably this is one of the few times in my whole time we've been doing this podcast that somebody explains day-two operations really clearly, [LAUGHS] because it's a term that a lot of people use all the time, not only related to databases, generally speaking, like, oh, day-two operations. Kubernetes simplifies day-two operations. But what does this actually mean?

But I want to go back to one thing, one thing you said that is kind of interesting for me. When you say an operator encapsulates domain knowledge, people think of operators as just a CRD, that you write an object, and then some wizardry happens behind the scenes, and then something happens. But what I'm hearing here is that there is logic in the operator itself, which is specific to whatever that operator has been written to do. Right?

ÁLVARO HERNÁNDEZ: Yes.

ABDEL SGHIOUAR: So can you give me another example of something specific to Postgres that an operator, like yours or any other operator, should include or does include?

ÁLVARO HERNÁNDEZ: Yeah. There's a huge amount of logic behind it. Just to give you a very quick example, StackGres right now is approximately a quarter million lines of code. So it's big. So the goal of operators-- and yes, it's true that operators, sometimes people see a one-to-one mapping between some code running and the CRD. But both are separate. The CRD, in my opinion, is the API to the user, the--

ABDEL SGHIOUAR: The interface.

ÁLVARO HERNÁNDEZ: --interface to communicate with the user. The user specifies the intent via the spec part of the CRD. And the user receives feedback on the status part of the CRD from the operator. So it's an API. This API, in my opinion, has to be designed in a very high-level way to encapsulate this knowledge to present to the user something simple to use.

For example, CRDs, in my opinion, for Postgres operators, should be very high-level, should not expose Postgres internals, for example. Don't talk to me about how you configure this particular file or how Patroni works. I just want to create three instances. So the field should be called instances, not Patroni configuration, etcd. No. Just talk to me about instances. I want high availability or not--

ABDEL SGHIOUAR: Yeah, abstraction layer.

ÁLVARO HERNÁNDEZ: That's simple, right?

ABDEL SGHIOUAR: Yeah.

ÁLVARO HERNÁNDEZ: Of course, you provide mechanisms for people who like low-level, who are experts in that domain, to go low-level and tweak the details if so they want. But the API needs to be designed from the perspective of the person who is not an expert in Postgres.

Then on the operator land, the operator should then handle all this complexity for you and say, OK, we're taking this very simple interface. Now let's make it work well. And there is two parts to this. One is to orchestrate all these components, as I said. For running Postgres in production, you need all this stack.

And this involves a lot of components, the sidecars, and wiring things, and creating different config files with different styles, making sure they all work together. This is a lot of effort that is done behind the scenes by the operator itself, and you don't need to realize. You don't need to see this. It's all done by the Postgres experts already for you.

But the other part of this is the things that the operator needs to do to run reliably on Kubernetes. And there's a few things that-- there was a public discussion some time ago with Kelsey Hightower about running databases precisely on Kubernetes.

ABDEL SGHIOUAR: Yeah. I remember this one.

ÁLVARO HERNÁNDEZ: And the conclusion, and definitely my opinion, is that, on one hand, there are a few things you need to do to run a database well on Kubernetes, especially a database like Postgres, which has a primary-replica architecture, a single writer node. And I call this twisting Kubernetes' arm a little bit. It's not very hard. You just need to do some things. But they are encoded in the operator, so again, you don't need to worry about this.

It's like, you shouldn't deploy a Postgres Pod by yourself, just take the Postgres image and say, create StatefulSet or even a deployment, you're done. No. Probably that's not what you want to do. You're not going to get very far with that. So a conclusion was also use operators because the operators have already encoded this kind of knowledge.

But let me give you another example. Postgres has a config file called postgresql.conf, where you can specify up to 300 parameters to tune Postgres. If you design the CRD, you can say, OK, here's my literal text. And you embed there directly the config file, the parameters you want to tweak. And the operator just copies this blindly into the config file and runs it. This is one approach.

The approach that, for example, we have taken is, no. Let's create a specific CRD for Postgres configuration. It will have YAML-style syntax, not postgresql.conf syntax, so it will be parameter: value. And the names of the parameters are validated by the operator. The values of the parameters are validated by the operator. The ranges of those parameters and the units, if they have units, are validated by the operator.

So basically, you're not allowed to create a CRD with an invalid configuration. And this is the kind of knowledge that the operators can encode that prevents the user from making these kinds of mistakes or needing to learn more. Some parameters are blocked, for example. You cannot configure them because you will break the cluster. It will not work. So this is the kind of domain knowledge that the operators encode.
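The validation described above (names, values, ranges, units, and a blocklist) can be sketched in a few lines of Python. This is an illustration of the idea only, not StackGres code: the `RULES` table, its ranges, the `BLOCKED` set, and the `validate` function are all hypothetical, and this sketch requires an explicit unit where a parameter has units.

```python
import re

# Hypothetical rule table: parameter -> (min, max, unit multipliers or None).
# Ranges are illustrative; values with units are normalised to kB.
RULES = {
    "max_connections": (1, 262143, None),
    "shared_buffers": (128, 8 * 1024 * 1024, {"kB": 1, "MB": 1024, "GB": 1024 * 1024}),
}
# Hypothetical blocklist: settings the operator manages itself.
BLOCKED = {"listen_addresses", "port"}


def validate(config: dict) -> list:
    """Return a list of error messages; an empty list means the config is accepted."""
    errors = []
    for name, raw in config.items():
        if name in BLOCKED:
            errors.append(f"{name}: managed by the operator")
            continue
        if name not in RULES:
            errors.append(f"{name}: unknown parameter")
            continue
        low, high, units = RULES[name]
        match = re.fullmatch(r"(\d+)\s*([A-Za-z]*)", raw.strip())
        if not match:
            errors.append(f"{name}: cannot parse value {raw!r}")
            continue
        number, unit = int(match.group(1)), match.group(2)
        if units is None:
            if unit:
                errors.append(f"{name}: takes no unit")
                continue
            value = number
        elif unit not in units:
            errors.append(f"{name}: unit must be one of {sorted(units)}")
            continue
        else:
            value = number * units[unit]
        if not low <= value <= high:
            errors.append(f"{name}: {raw} is out of range")
    return errors


print(validate({"max_connections": "200", "shared_buffers": "512MB"}))
print(validate({"port": "5433", "max_conections": "100"}))
```

Rejecting the object at admission time means a typo in a parameter name surfaces immediately, instead of as a database that fails to start.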

ABDEL SGHIOUAR: Got it. So when you said that you need to twist Kubernetes' arm-- I am using air quotes, by the way, because this is not recorded in video-- but when you say that you need to twist Kubernetes' arms to make things like a single-instance writer for Postgres work, what do you mean by that? Can you elaborate a little bit more?

ÁLVARO HERNÁNDEZ: Yeah. Let me give you a couple of examples. So for example, in architectures like this, like Postgres, there's a single primary, and then you have replicas. These replicas are read-only.

ABDEL SGHIOUAR: Yes.

ÁLVARO HERNÁNDEZ: They're replicating all the changes that happens to the database. Now, if you blindly run this on Kubernetes on a StatefulSet, and there is a need to kill a Pod for whatever reason-- resources, or rescheduling, things like this--

ABDEL SGHIOUAR: A node goes down, whatever.

ÁLVARO HERNÁNDEZ: Yeah. Then it matters a lot which one you kill. And Kubernetes will have no say in that. It will pick one. So it may pick your primary.

Yes, Postgres is prepared, and the operator will be prepared so that if the primary dies, another one will be promoted. But this involves a little bit of downtime while the election happens to decide who is going to be the next primary. So you want to kill the primary the least amount of times that you really need to. Essentially, kill everyone else but the primary. So this, you need to signal to Kubernetes. There are no primitives for doing this comfortably in Kubernetes.
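The preference stated here, replicas first and the primary only as a last resort, amounts to a simple ordering rule. The Python sketch below shows that rule in isolation; the Pod names and the `role` field are hypothetical, and nothing here is a Kubernetes API.

```python
def eviction_order(pods: list) -> list:
    """Return Pod names with replicas first and the primary last.

    sorted() is stable, so replicas keep their original relative order.
    """
    ordered = sorted(pods, key=lambda pod: pod["role"] == "primary")
    return [pod["name"] for pod in ordered]


pods = [
    {"name": "pg-0", "role": "primary"},
    {"name": "pg-1", "role": "replica"},
    {"name": "pg-2", "role": "replica"},
]
order = eviction_order(pods)
print(order)
```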

Actually, we planned at some point to create something like a StatefulSet with knowledge of this type of architecture. Call it a primary replica as a StatefulSet. We haven't gone into this because we can actually twist a little bit Kubernetes' arms and prevent this from happening. So there's a little bit of magic behind the scenes. Sometimes we technically remove the primary from the StatefulSet and then bring it back for certain, for example, day-two operations, to ensure this Pod is not killed. That's one of the things.

The other one is, for example-- this is actually quite simple-- by default, all Pods mount a shared memory file system, /dev/shm. And it's typically, by default, 64 megabytes. Most applications are more than happy with this, but Postgres happens to use a lot of shared memory. And this blows up very, very soon, as soon as you use the database a little bit. So you need to set this explicitly to a higher value to allow it to work well.
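A common way to raise the /dev/shm limit in Kubernetes is to mount a memory-backed `emptyDir` volume at that path. The Python sketch below patches a Pod spec, held as a plain dictionary, in that way. The `emptyDir` fields are standard Kubernetes; the helper name, the volume name `dshm`, the image tag, and the 1Gi default are assumptions for illustration, and this is not necessarily how StackGres does it.

```python
import copy


def with_larger_shm(pod_spec: dict, size_limit: str = "1Gi") -> dict:
    """Return a copy of pod_spec with a memory-backed emptyDir mounted at /dev/shm."""
    spec = copy.deepcopy(pod_spec)
    spec.setdefault("volumes", []).append(
        {"name": "dshm", "emptyDir": {"medium": "Memory", "sizeLimit": size_limit}}
    )
    for container in spec["containers"]:
        container.setdefault("volumeMounts", []).append(
            {"name": "dshm", "mountPath": "/dev/shm"}
        )
    return spec


base = {"containers": [{"name": "postgres", "image": "postgres:16"}]}
patched = with_larger_shm(base)
print(patched["volumes"], patched["containers"][0]["volumeMounts"])
```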

So there are certain things you need to tweak on Kubernetes. You need to set up Pod-disruption policies because you don't want, again, your primary to be disrupted easily. So these are the kind of things that are basic. The operator does this in the first six months of development, and then they're done forever. But if you want to deploy, by yourself, a Pod, it will not take you very far because of these reasons.
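One way to express this kind of protection in Kubernetes is a PodDisruptionBudget. The Python sketch below builds one as a dictionary. The `policy/v1` fields are standard; the function name, object name, and the `app` and `role` labels are hypothetical, and an operator would have to keep the `role` label up to date as the primary changes.

```python
import json


def primary_disruption_budget(cluster: str) -> dict:
    """Build a PodDisruptionBudget that blocks voluntary eviction of the primary."""
    return {
        "apiVersion": "policy/v1",
        "kind": "PodDisruptionBudget",
        "metadata": {"name": f"{cluster}-primary"},
        "spec": {
            # With maxUnavailable 0, eviction requests for matching Pods are refused.
            "maxUnavailable": 0,
            # Hypothetical labels identifying the current primary Pod.
            "selector": {"matchLabels": {"app": cluster, "role": "primary"}},
        },
    }


pdb = primary_disruption_budget("my-cluster")
print(json.dumps(pdb, indent=2))
```

A budget like this only covers voluntary disruptions such as node drains, and it will hold a drain until the primary role moves elsewhere; it does nothing for a node that simply fails.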

ABDEL SGHIOUAR: Got it.

ÁLVARO HERNÁNDEZ: And then maybe your experience will not be good, and you say, oh, yeah, Postgres databases will not run well on Kubernetes. Well, that's not exactly the case. You just need to do some work.

ABDEL SGHIOUAR: Yeah, and I have the feeling just from talking to you that a lot of people who have this opinion that Kubernetes is not great for stateful workloads have just used StatefulSets in their default format.

ÁLVARO HERNÁNDEZ: Could be.

ABDEL SGHIOUAR: So StatefulSets are great because they give you a static, predictable, addressable Pod at least, so you can address a specific Pod instead of the default way Kubernetes works, which is using a Service. So that's already a step ahead of stateless workloads. But what you're describing goes beyond just, OK, run a bunch of StatefulSets and cross your fingers [LAUGHS] and hope it will just work, right?

ÁLVARO HERNÁNDEZ: Exactly. Exactly.

ABDEL SGHIOUAR: OK, there was a lot of interesting things there. So I'm going to skip this question about why Kubernetes gets such a bad rap, because I think we covered it. We know, OK, so if you're going to do a database on Kubernetes, you need to use an operator that knows how to handle databases properly.

ÁLVARO HERNÁNDEZ: Yes.

ABDEL SGHIOUAR: Just don't do it yourself, right? [LAUGHS]

ÁLVARO HERNÁNDEZ: Exactly.

ABDEL SGHIOUAR: So that's the TLDR.

ÁLVARO HERNÁNDEZ: That's definitely my take. That's the TLDR of today, yeah.

ABDEL SGHIOUAR: And I mean, StackGres is open-source.

ÁLVARO HERNÁNDEZ: Yes.

ABDEL SGHIOUAR: So anybody can use it, but there is quite a lot of other operators out there. So I'm sorry. I'm going back a little bit to the start of the conversation because you talked a lot about automation and self-healing, restarting, all this stuff. When people don't do Postgres on Kubernetes, it's typically just regular servers, like a VM, a physical server. So how does that work? What does that world look like? Is it just scripts and pets, essentially?

ÁLVARO HERNÁNDEZ: Yeah. I would say Ansible, a lot of-- I mean, there are still two schools: people who do this all manually, which is mind-blowing to me, and the people who use Ansible. And Ansible can take you very far. I love Ansible, nothing against it. But again, first of all, it's very imperative in the perspective of, OK, this is my node A, this is node B, and this is node C, and node A is going to be the primary--

ABDEL SGHIOUAR: They are not cattle. [LAUGHS]

ÁLVARO HERNÁNDEZ: --that's already a wrong assumption. No. Node A is not the primary. It might be the primary when you start the things, but might not be your primary later on.

But then it depends a lot, again, on the environment. The beautiful abstractions that Kubernetes provides, like CSI drivers, for me, that's key. Just say, OK, use whatever storage. I don't mind. It's going to be the same interface. We'll just dynamically provision volumes against the storage class. That's all I need to know. And then you, as a Kubernetes administrator, manage how you want the storage to be.
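The "same interface, whatever storage" point can be seen in a PersistentVolumeClaim: only the storage class name changes between environments. The Python sketch below builds two claims that differ in nothing else. The PVC fields are standard Kubernetes; the function name, claim name, size, and class names are hypothetical.

```python
def data_volume_claim(name: str, size: str, storage_class: str) -> dict:
    """Build a PersistentVolumeClaim that is provisioned dynamically by storage_class."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            "accessModes": ["ReadWriteOnce"],
            "storageClassName": storage_class,
            "resources": {"requests": {"storage": size}},
        },
    }


# Hypothetical class names for a cloud and an on-prem environment.
cloud_claim = data_volume_claim("pg-data", "100Gi", "cloud-ssd")
onprem_claim = data_volume_claim("pg-data", "100Gi", "onprem-ceph")

# Everything except the storage class name is identical.
cloud_claim["spec"].pop("storageClassName")
onprem_claim["spec"].pop("storageClassName")
same_otherwise = cloud_claim == onprem_claim
print(same_otherwise)
```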

Lately, we have developed a functionality which is piggybacking on Kubernetes' functionality, which I think is also a game-changer, which is support for snapshots. Just think about medium-sized databases of today, a 5-terabyte database. If you want to back it up, a backup may take hours to be taken, minutes to hours.

But restoring will probably even be slower because at the end of the restoration process, there is a single-threaded process called WAL replay. I mean, you use backups only for disaster recovery, but if there is a disaster, you may not want to wait for hours for your backup to come back.

When you use snapshots, then taking a backup with a snapshot takes seconds to a few minutes. Restoring takes a few minutes, plus the last phase of applying, which may end up being 20 minutes. Now you've gone down from four hours to 20 minutes, worst case, or even to a few minutes. This is a game-changer. And this is all thanks to this infrastructure of snapshots.
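At the Kubernetes level, taking and restoring a volume snapshot involves two small objects: a VolumeSnapshot that points at an existing claim, and a new claim that names the snapshot as its data source. The Python sketch below builds both as dictionaries. The `snapshot.storage.k8s.io/v1` fields are standard; the function names, object names, and class names are hypothetical, and a snapshot on its own is not a consistent database backup without the WAL handling an operator adds around it.

```python
def volume_snapshot(name: str, pvc_name: str, snapshot_class: str) -> dict:
    """Build a VolumeSnapshot of an existing PersistentVolumeClaim."""
    return {
        "apiVersion": "snapshot.storage.k8s.io/v1",
        "kind": "VolumeSnapshot",
        "metadata": {"name": name},
        "spec": {
            "volumeSnapshotClassName": snapshot_class,
            "source": {"persistentVolumeClaimName": pvc_name},
        },
    }


def claim_from_snapshot(name: str, snapshot_name: str, size: str, storage_class: str) -> dict:
    """Build a PersistentVolumeClaim whose contents are restored from a snapshot."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            "accessModes": ["ReadWriteOnce"],
            "storageClassName": storage_class,
            "resources": {"requests": {"storage": size}},
            "dataSource": {
                "apiGroup": "snapshot.storage.k8s.io",
                "kind": "VolumeSnapshot",
                "name": snapshot_name,
            },
        },
    }


snap = volume_snapshot("pg-data-snap", "pg-data", "csi-snapclass")
restored = claim_from_snapshot("pg-data-restored", "pg-data-snap", "100Gi", "cloud-ssd")
print(snap["spec"], restored["spec"]["dataSource"])
```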

Now, can you use the snapshots on a VM? Of course, you can. Actually, sometimes it might be even simpler, but the reality is that few people use them. Now when I'm running on Kubernetes, and thanks to the support we added on StackGres for snapshots, I stopped using backups. I don't use backups anymore. I use the snapshots as backups.

ABDEL SGHIOUAR: Interesting.

ÁLVARO HERNÁNDEZ: Just to clarify, the snapshots as backups means snapshots plus the WAL files, with the backup taken in a consistent state of the database. I mean, it's the same thing. It's fully equivalent to a backup. It just takes seconds to minutes to take and restore, versus what you typically do without it.

ABDEL SGHIOUAR: So you would still back up a database every now and then?

ÁLVARO HERNÁNDEZ: Yes.

ABDEL SGHIOUAR: But you would take volume snapshots--

ÁLVARO HERNÁNDEZ: Yes.

ABDEL SGHIOUAR: --at the volume level, not even at the database level, right?

ÁLVARO HERNÁNDEZ: At the volume level.
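
For readers who have not used volume-level snapshots, here is a minimal sketch of a Kubernetes VolumeSnapshot manifest built as a Python dict. The object names and the snapshot class are hypothetical. As described in the conversation, a snapshot like this only becomes a usable Postgres backup when it is taken in a consistent state and paired with the WAL files; that coordination is what the operator does and is not shown here.

```python
import json


def build_volume_snapshot(name: str, pvc_name: str, snapshot_class: str) -> dict:
    """Return a VolumeSnapshot manifest that snapshots an existing PVC."""
    return {
        "apiVersion": "snapshot.storage.k8s.io/v1",
        "kind": "VolumeSnapshot",
        "metadata": {"name": name},
        "spec": {
            "volumeSnapshotClassName": snapshot_class,
            "source": {"persistentVolumeClaimName": pvc_name},
        },
    }


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    snap = build_volume_snapshot("pgdata-snap-1", "pgdata-demo", "csi-snapclass")
    print(json.dumps(snap, indent=2))
```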

ABDEL SGHIOUAR: Yeah, that was my understanding. And so when you said that backing up and restoring takes hours, I'm going to whip out my very, very little database knowledge here. That's because, typically, backing up a database means generating SQL statements one row at a time, one table at a time.

ÁLVARO HERNÁNDEZ: No, we're talking about physical backups here. That is called a logical backup. And you can take those, but definitely I don't even call them backups because they represent an instant in time. It's like a copy of the database, how it was at the time you start taking those logical backups.

ABDEL SGHIOUAR: Which might not be consistent because if a dump takes so long, by the time the dump is finished, the database might have changed, right?

ÁLVARO HERNÁNDEZ: Well, fortunately, the databases provide a mechanism for transaction isolation, which, essentially, means that, yes, the database can expose to you a frozen view of the database.

ABDEL SGHIOUAR: Interesting. OK.

ÁLVARO HERNÁNDEZ: For example, if you set Postgres to isolation mode repeatable read, in a read-only transaction, or serializable, you'll get this frozen view of the database. The database [? hasn't ?] stopped, the world [? hasn't ?] stopped, from your own perspective.

ABDEL SGHIOUAR: Got it.

ÁLVARO HERNÁNDEZ: Of course, this has a small impact on performance, and bloat, and what I mentioned before. But the database can do this comfortably. So you'll see the database at the state it was when you start taking this logical backup. But then everything that happens after, you will not have it.
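
The "frozen view" can be tried without a Postgres server. The sketch below swaps in SQLite in WAL mode (from the Python standard library), which gives a reader a stable snapshot for the duration of its transaction. It is an analogy rather than Postgres itself: in Postgres the equivalent would be opening the transaction with `BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ READ ONLY`.

```python
import os
import sqlite3
import tempfile


def frozen_view_demo() -> tuple:
    """Show that an open read transaction keeps seeing its starting snapshot."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.db")

        # isolation_level=None: we issue BEGIN/COMMIT ourselves.
        writer = sqlite3.connect(path, isolation_level=None)
        writer.execute("PRAGMA journal_mode=WAL")
        writer.execute("CREATE TABLE t (id INTEGER)")
        writer.execute("INSERT INTO t VALUES (1)")

        reader = sqlite3.connect(path, isolation_level=None)
        reader.execute("BEGIN")
        # The first read pins the snapshot for this transaction.
        before = reader.execute("SELECT count(*) FROM t").fetchall()[0][0]

        # The "world" keeps changing while the dump is running.
        writer.execute("INSERT INTO t VALUES (2)")

        during = reader.execute("SELECT count(*) FROM t").fetchall()[0][0]
        reader.execute("COMMIT")

        # A new transaction sees the newer state.
        after = reader.execute("SELECT count(*) FROM t").fetchall()[0][0]

        reader.close()
        writer.close()
        return before, during, after


if __name__ == "__main__":
    print(frozen_view_demo())
```

The row inserted while the read transaction was open is invisible to it, which matches the point made above: everything that happens after the logical backup starts is not in the backup.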

When I'm talking about the backup, which is a physical backup, it involves copying the whole directory of the database. In Postgres, everything is contained within a single directory. You copy that. Think of making a tar file. But then the database keeps recording, also, together with this tar, all the WAL files. Or call them the differentials, the changes that happen to the database.

ABDEL SGHIOUAR: Yes.

ÁLVARO HERNÁNDEZ: So both things together form the backup itself. And then you can recover up to the last byte of data. The same with snapshots. The snapshot will just replace the tar, but it will be the snapshot plus these WAL files, these differentials.
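
A toy model may help make "base copy plus WAL files" concrete. In the sketch below the base backup is a dict and each WAL record is a `(sequence, key, value)` tuple; replaying records in order on top of the base recovers the state at any chosen point. This is purely illustrative: real Postgres WAL is a binary log of low-level changes, and the record shape here is an assumption made for the example.

```python
from typing import Dict, List, Optional, Tuple

# (sequence number, key, new value or None meaning "delete")
WalRecord = Tuple[int, str, Optional[str]]


def restore(base: Dict[str, str],
            wal: List[WalRecord],
            up_to: Optional[int] = None) -> Dict[str, str]:
    """Replay WAL records, in sequence order, on top of a base copy.

    If up_to is given, stop after that sequence number (point-in-time
    recovery); otherwise replay everything ("up to the last byte").
    """
    state = dict(base)  # never mutate the base backup itself
    for seq, key, value in sorted(wal, key=lambda record: record[0]):
        if up_to is not None and seq > up_to:
            break
        if value is None:
            state.pop(key, None)
        else:
            state[key] = value
    return state


if __name__ == "__main__":
    base = {"a": "1"}  # the tar file or the volume snapshot
    wal = [(1, "b", "2"), (2, "a", "3"), (3, "b", None)]
    print(restore(base, wal))           # full replay
    print(restore(base, wal, up_to=2))  # stop at an earlier point
```

Whether the base comes from a tar file or from a volume snapshot makes no difference to the replay step, which is why the two are described as equivalent.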

ABDEL SGHIOUAR: Got it. Got it. So I was not familiar with the concept of physical versus logical backups, so that's a learning. For me, backup was basically dumping the database and then restoring it again from the command line. But that's good to know. So, see, that's why you need database experts. [LAUGHS]

ÁLVARO HERNÁNDEZ: [LAUGHS]

ABDEL SGHIOUAR: All right. So moving on, you are an active member of something called DoK, or "dock," or-- I don't know how you pronounce it, but Data on Kubernetes. There is a community called Data on Kubernetes, right?

ÁLVARO HERNÁNDEZ: That's right.

ABDEL SGHIOUAR: What is that?

ÁLVARO HERNÁNDEZ: I pronounce it D-O-K, but I'm not a native English speaker, so I might be wrong. [CHUCKLES]

ABDEL SGHIOUAR: OK. [CHUCKLES]

ÁLVARO HERNÁNDEZ: So this is a community that formed around a set of people who really wanted to say, yes, Kubernetes is ready for stateful or data workloads, because there's some reluctance among people to run data on Kubernetes.

And it's a community that really impresses me. It has progressed very fast. And it's a very large community now with thousands of members. They're doing a lot of activities all around the concept of, yes, it is possible to run data on Kubernetes. It runs a lot of meetups, a lot of virtual events that anyone can, obviously, watch and connect to. I've been a speaker there four or five times already, probably, around topics like the ones we're discussing today.

But it's also very interesting because this community has started creating outputs other than the events. For example, it runs a report every year about the state of data on Kubernetes, which is extremely interesting and I would recommend everyone to see it. You can download it from the website.

ABDEL SGHIOUAR: We will have a link in the show notes for the episode.

ÁLVARO HERNÁNDEZ: Excellent. And it shows the trends on how data on Kubernetes is being utilized today by companies. And what are the predictions of the last years? And numbers are always showing very impressive adoption of data workloads running on Kubernetes.

There's also been a white paper released recently about running data on Kubernetes, showing best practices. I partially co-authored a part of it. And I think it's a great resource for anyone who wants to run any type of data workloads on Kubernetes, from databases to data-processing engines to Kafka on Kubernetes. There is a lot of knowledge there and best practices that is also very interesting.

And from there, it's taking and it's creating other projects to help people understand this landscape. The Operator Feature Matrix is also one of these projects being done by this community. And it's been very active. It's co-hosting, also, a co-located day at all KubeCons already for several years in a row, several editions in a row, with a large attendance. It's just a growing field and dispelling this myth. And I think it's doing a great, great job.

ABDEL SGHIOUAR: Awesome. Nice. And so one of the things that this community does is working on something called the Operator Feature Matrix, OFM. What is that? What is the Operator Feature Matrix?

ÁLVARO HERNÁNDEZ: So this Operator Feature Matrix, the idea, this is a project that has been done within the DoK community. I was part of the team behind this idea. And this comes from a user struggle that we identified, which is, OK, let's say I want to pick a Postgres operator for Kubernetes. How many operators are there? It's like a dozen, even more as of today. It's crazy. There's so many. So which one should I pick? I don't know.

Even more importantly, OK, this one, I like this one for whatever reason, X. Which features does it have? Because this is my list of features. Does it have all these features? Which ones does it have? And then, OK, yeah, you go to the documentation of the operator and you kind of figure out.

But then you're starting to do peer comparisons, and they will call things differently. So one operator says, I have this. Another one says, I have that. And they are the same thing. Sometimes they're different. So it became very confusing for users, not only to compare, but even to understand, which features does a given operator provide?

So the OFM, the Operator Feature Matrix, is an idea to create a matrix of features, with two goals. One is standardizing the name of features. This may sound obvious, but again, what I call vertical scalability, for Postgres, is not what others call vertical scalability.

ABDEL SGHIOUAR: Got it.

ÁLVARO HERNÁNDEZ: Even if you look at the operator maturity level that Red Hat defines and is also in the marketplace, it defines terms that I don't think any operator has.

ABDEL SGHIOUAR: Oh, interesting.

ÁLVARO HERNÁNDEZ: None. It doesn't exist in the real world. This may be because things are called differently. So the first goal is to standardize on feature names. What does it mean that Postgres has, if we're talking about Postgres, vertical scalability? Create a formal definition with an explanation.

And then the goal of the OFM is to create such a matrix-- actually, the matrix, as written, is on GitHub. It's a community-driven project. It's published on GitHub. And it's been created so far for Postgres. And there are already pull requests for other technologies because it may apply to any domain technology.

It has around 110 features already listed. And these features provide also a guidance for vendors as to how to say they're compliant with this feature, like, OK, you say you have vertical scalability. Then that means that you need to have this and this. Demonstrate via this and that.

ABDEL SGHIOUAR: Got it.

ÁLVARO HERNÁNDEZ: So at the end of the day, it also defines a YAML format for vendors to make submissions and say, OK, my operator X version Y has these 78 features out of this 106. And here's the proof to this, links to the documentation, or links to anything that proves that, yes, you have vertical scalability, or you support taking backups or snapshots, for example.

ABDEL SGHIOUAR: Got it.

ÁLVARO HERNÁNDEZ: Those are examples of features. Or you have sharding or whatever. So vendors are-- I mean, this is work in progress of today, but the goal is for vendors to make the submissions.

And then the DoK community-- now, again, it's a work in progress right now, but it is building a website. So users may be able to go to this website and pick any feature, any operator, see the matrix, compare if you want, and understand in a very objective way what features are provided by whom and in which way.
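
To picture what such a comparison could look like once feature names are standardized, here is a small sketch. The operator names, feature identifiers, and data layout are all hypothetical; this is not the actual OFM submission format, only an illustration of why shared names make the comparison mechanical.

```python
from typing import Dict, Set


def missing_features(submissions: Dict[str, Set[str]],
                     required: Set[str]) -> Dict[str, Set[str]]:
    """For each operator, list the required features it does not claim."""
    return {name: required - claimed for name, claimed in submissions.items()}


def full_matches(submissions: Dict[str, Set[str]], required: Set[str]) -> list:
    """Operators that claim every required feature, sorted by name."""
    gaps = missing_features(submissions, required)
    return sorted(name for name, gap in gaps.items() if not gap)


if __name__ == "__main__":
    # Hypothetical submissions keyed by standardized feature names.
    submissions = {
        "operator-a": {"vertical-scaling", "snapshot-backups", "sharding"},
        "operator-b": {"vertical-scaling", "snapshot-backups"},
    }
    required = {"snapshot-backups", "sharding"}
    print(missing_features(submissions, required))
    print(full_matches(submissions, required))
```

The set arithmetic only works because both operators use the same identifier for the same capability, which is the first goal described above.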

ABDEL SGHIOUAR: While you were talking about what the Operator Feature Matrix is, it sounded to me like we need something similar for Ingress, actually, on Kubernetes, [CHUCKLES] it sounds like, because Ingress is the same thing. There are so many operators, and it's very hard for people to choose.

And usually, people choose because, A, familiarity, because they are familiar with one particular proxy versus another-- Nginx versus HAProxy or whatever. Or they choose because they like it. They like the design, or the interface, or whatever.

ÁLVARO HERNÁNDEZ: Yeah.

ABDEL SGHIOUAR: So yeah, it's quite interesting. OK. So I think you talked about this a little bit, but what's the landscape of operators today? I assume the ecosystem is so complex since there is the need for a feature matrix, right?

ÁLVARO HERNÁNDEZ: Yes. Exactly.

ABDEL SGHIOUAR: So what's the landscape? Because you are providing an operator for Postgres. Postgres itself is a community project. It's not owned by a single vendor.

But then there is also MySQL and MariaDB, one of which is owned by a company. So what's the landscape? Between the start-uppy things like yourself and the vendor stuff coming from big companies, what does that look like?

ÁLVARO HERNÁNDEZ: I would say it's a very non-uniform landscape as of today. There are certain technologies that have gotten very deep into Kubernetes. For example, I mentioned already that Postgres has. I lost count of how many operators provide creating databases, operating databases for Postgres on Kubernetes.

But if you look at other data technologies and other databases, there is not a lot of interest. If you look at Oracle, there have recently been some efforts to run Oracle on Kubernetes or even on containers. And obviously, they are vendor-directed or vendor-originated.

ABDEL SGHIOUAR: Yes.

ÁLVARO HERNÁNDEZ: And it doesn't seem like there is a lot of interest, even though, from a technology perspective, apart from the size of the container, which is huge, there's no reason why you shouldn't be able to run another database on Kubernetes, like Oracle. So some technologies have gotten very deep into Kubernetes, like Postgres, maybe also fueled by the recent increase in popularity and adoption. But some others are less present in Kubernetes, let's say.

It also boils down to, a little bit piggybacking on what we're talking about, the stack problem, how much more difficult it is to deploy without the automation that Kubernetes provides. In Postgres, you need all these components that I mentioned, all the sidecars, all this wiring up of these config files, and different technologies from different providers.

When you have everything packed in images and an operator that wires everything for you, this is super convenient against doing it manually. Now, there are other technologies that are just simple. You just deploy and it just works. So there is less appeal for having an operator for this.

Let me give you an example here. Kafka, which is, obviously, a very widely used piece of software, its deployment is complex. It may involve a lot of nodes. It may involve a ZooKeeper layer, even though now it's not strictly necessary, but some people still operate this way. And there's a lot of moving parts.

It's very worthwhile to have an operator to automate, to orchestrate, to apply best practices to deploying a Kafka cluster. And there is Strimzi, for example, that does a very good job in this regard. But if some other technology is so simple to deploy, then there is less interest. You gain less by using an operator, even though it's still compelling, for me.

ABDEL SGHIOUAR: Got it. Yes, I remember recently, Strimzi made it to the CNCF as, I think, an incubated project or something like that. So I never heard about it until we saw it in the news. And I was like, there is an interesting Kafka thing going on there.

OK, so last time when we chatted-- and back to the conversation of Postgres as the core, like the Linux kernel, and then you need all these components next to it-- there is this concept of extensions in Postgres. And you are doing something about it.

ÁLVARO HERNÁNDEZ: Yes.

ABDEL SGHIOUAR: Can you talk about that? [LAUGHS]

ÁLVARO HERNÁNDEZ: OK. So let me start by briefly defining, what is an extension? Let me use an analogy. Postgres is the browser. In your browser, Firefox, for me, or Chrome, or whatever you use, there's this concept of plugins. It's some pieces of code that are written by third parties that you load into your browser and enhance the functionality in some way.

Postgres has the same thing. They're called extensions. And they can be as simple as adding a few extra data types. Postgres can create data types, and you can do this with extensions, creating new functions, user-defined functions.

But they can go as far as literally transforming your database. You can turn, with an extension, your Postgres database into a specialized GIS, a geographic information system database. The most powerful GIS system is an extension to Postgres, for example.

ABDEL SGHIOUAR: Or a vector database, which is very relevant for AI stuff, right? [LAUGHS]

ÁLVARO HERNÁNDEZ: You just stepped on my words. Absolutely. Right?

ABDEL SGHIOUAR: Yeah, so-- [LAUGHS]

ÁLVARO HERNÁNDEZ: A vector database. You can make it a database specialized for time series with an extension called Timescale. You can turn Postgres into a sharded database with an extension called Citus. There are over 1,000 extensions and probably more. Some of them may not be public. You can develop your own extension. It's not that hard. So they greatly enhance, potentially, the functionality of Postgres.

Now, extensions also have another side, which is, they are written in C or any other language. They can access internal functions of Postgres. And they have unlimited access. Unlike in eBPF, there is no VM or any verifier, so it's kind of untrusted code, technically, what you're running.

ABDEL SGHIOUAR: So unlimited access, you're saying?

ÁLVARO HERNÁNDEZ: Unlimited. Unlimited.

ABDEL SGHIOUAR: OK.

ÁLVARO HERNÁNDEZ: They can literally call any internal function they wish.

ABDEL SGHIOUAR: Oh, OK. That sounds scary.

ÁLVARO HERNÁNDEZ: There are extension points, official extension points, in Postgres called hooks, which are like function pointers that you can replace by your own-- and then intercept, like, an [? event ?] workflow-- and then do your own thing. But still, you can do something else and just call an internal function because you want. That's why they're extremely powerful.

But on the other hand, they might be dangerous, or at least you need to take a little bit of care. That's why when you look at the managed services provided by cloud providers, for example, the set of extensions that you can use is limited because it has to be explicitly allowed. And these extensions need to be very well verified that they will not break your cluster. So I would say extensions are the most loved feature of Postgres. And it's probably one of the main factors driving Postgres adoption lately, the quality, and the amount, and the functionality of extensions.
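
The hook idea mentioned above, a replaceable function pointer that an extension sets to its own function and then chains to whatever was there before, can be sketched in a few lines. This is a Python analogy with made-up names (`executor_hook`, `standard_executor`); actual Postgres hooks are C function pointers inside the server.

```python
from typing import Callable, List, Optional

# The "function pointer": None means nobody has installed a hook yet.
executor_hook: Optional[Callable[[str], str]] = None


def standard_executor(query: str) -> str:
    """Stand-in for the server's built-in behavior."""
    return f"ran: {query}"


def run(query: str) -> str:
    """Core code path: call the hook if one is set, else the default."""
    if executor_hook is not None:
        return executor_hook(query)
    return standard_executor(query)


def install_logging_hook(log: List[str]) -> None:
    """What an extension does at load time: save the old pointer, set its own."""
    global executor_hook
    previous = executor_hook

    def hook(query: str) -> str:
        log.append(query)  # the extension's own work
        if previous is not None:
            return previous(query)  # chain to an earlier extension
        return standard_executor(query)

    executor_hook = hook


if __name__ == "__main__":
    seen: List[str] = []
    install_logging_hook(seen)
    print(run("SELECT 1"), seen)
```

Nothing in this pattern stops the hook from skipping the chained call or doing something entirely different, which is the power and the risk being described.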

Now, if we think from a Kubernetes perspective or a containers perspective, if we want to provide a rich set of extensions for users to use-- let's say-- I always target the number 200. I want to provide our users with 200 extensions, so they pick the one that they want to use. What do we do?

OK. Option number one, we build a container with 200 extensions. It just works, and everybody can pick the extension they want. Now, there are a few problems with this approach. The first one is that the container may be big. Actually, I call this the fatty-container problem. How big? It doesn't matter. I mean, I can tell you it's several gigabytes.

But even if you don't care about size-- and let me say size still matters-- the problem is more of a security kind. Do you really want to have a running container image with 200 random pieces of code fetched from GitHub repositories packed into your image? Probably not. [LAUGHS]

And then, even if you accept that, the day you want to update only one of those extensions, you're going to need to replace the container image, which means you need to restart your Pod, which means downtime for your database, which is not very nice.

ABDEL SGHIOUAR: Yes.

ÁLVARO HERNÁNDEZ: So this is not a good solution. And we invented a new solution, which is also part of the open-source StackGres-- and actually, I spoke of this at KubeCon North America 2021, I think. And it's a solution that we have developed for dynamically loading the extensions.

So this is a quite involved process that, first of all, implies creating a repository with extensions with a special format, digitally signed. And then there is a sidecar that we call the local pod controller that is responsible for reacting to the changes on the CRDs.

And basically, the user experience is extremely simple. Just, extensions is an entry, and it's an array, and you just list the name of the extensions. And then the very same moment you apply that YAML, the local pod controller will fetch the extension from the repository, verify the digital signature, and unpack it into the file system, create some symlinks, and then make it available to Postgres. So instantly after, you can go to Postgres and say, create extension, and start using it.

Now, this works very well. You can download, and load, and unload extensions dynamically, and so far, so good. This is what is working as of today. But we realized this is not perfect, either. First of all, it's quite involved because we unpack into the ephemeral layer of the top part of the file system-- the container--
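
As a sketch of that user experience, the code below builds a cluster resource with an `extensions` array and the matching `CREATE EXTENSION` statements. The field layout (`stackgres.io/v1`, `SGCluster`, `spec.postgres.extensions`) is written from memory of the StackGres documentation and should be treated as an assumption to verify; a real cluster definition needs more fields than are shown, and the cluster name is hypothetical.

```python
import json
from typing import List


def build_cluster(name: str, extensions: List[str]) -> dict:
    """Return a partial cluster manifest listing the wanted extensions.

    Assumed layout; check the StackGres CRD reference before relying on it.
    """
    return {
        "apiVersion": "stackgres.io/v1",
        "kind": "SGCluster",
        "metadata": {"name": name},
        "spec": {
            "postgres": {
                "extensions": [{"name": ext} for ext in extensions],
            },
        },
    }


def create_extension_sql(extensions: List[str]) -> List[str]:
    """SQL to run once the extension files are available to Postgres."""
    return [f'CREATE EXTENSION IF NOT EXISTS "{ext}";' for ext in extensions]


if __name__ == "__main__":
    wanted = ["postgis", "timescaledb"]
    print(json.dumps(build_cluster("demo-cluster", wanted), indent=2))
    print("\n".join(create_extension_sql(wanted)))
```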

ABDEL SGHIOUAR: The container, yeah.

ÁLVARO HERNÁNDEZ: --runtime, the snapshot. Then we need to redo this operation when the pod restarts. And because Postgres has primary and replicas, we need to do this in all the pods of the cluster. So the pod local controller and the operator, the StackGres Controller, coordinate all these operations. And again, it works very well, but it's complex.

On the other hand, that's not necessarily the best solution, either, because we're modifying the runtime container. I was listening the other day to the Falco episode. And I've listened to dozens of episodes of this podcast. And the Falco episode said, oh, we will flag if you add a binary, a new file to the container. That's exactly what is happening here. So this is not the best solution, either.

So we kept on thinking how we can fix this problem. And we have come up with a solution that is a little bit work in progress as of today. But it's basically the ability to generate dynamic containers. So what is that dynamic container? If you think about it, containers are essentially layers and a manifest that glues them together.

But these layers are, technically-- or even can be made, independent of each other. More precisely, if the layers are all additive, meaning they are not deleting or updating files that may be present on a layer underneath, you could technically combine the layers in any order, with potentially any other layers.
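
The claim that purely additive layers can be combined in any order is easy to check with a toy model. In the sketch below a layer is a dict from file path to content, and a "manifest" is just the ordered list of layer digests. The paths and contents are invented, and the digest is computed over the dict for simplicity; real OCI layers are tar archives and real digests are taken over those blobs.

```python
import hashlib
from itertools import permutations
from typing import Dict, List

Layer = Dict[str, bytes]  # file path -> file content


def is_additive(layers: List[Layer]) -> bool:
    """True if no path appears in more than one layer."""
    seen = set()
    for layer in layers:
        for path in layer:
            if path in seen:
                return False
            seen.add(path)
    return True


def compose(layers: List[Layer]) -> Layer:
    """Merge layers into one file system view; refuse overlapping layers."""
    if not is_additive(layers):
        raise ValueError("layers overlap, so the result would depend on order")
    merged: Layer = {}
    for layer in layers:
        merged.update(layer)
    return merged


def layer_digest(layer: Layer) -> str:
    """Content-addressed identifier for a layer (simplified)."""
    h = hashlib.sha256()
    for path in sorted(layer):
        h.update(path.encode() + b"\0" + layer[path] + b"\0")
    return "sha256:" + h.hexdigest()


def manifest(layers: List[Layer]) -> List[str]:
    """The 'glue': an ordered list of layer digests."""
    return [layer_digest(layer) for layer in layers]


if __name__ == "__main__":
    base = {"/usr/bin/postgres": b"postgres binary"}
    ext_a = {"/usr/lib/ext_a.so": b"extension A"}
    ext_c = {"/usr/lib/ext_c.so": b"extension C"}
    views = [compose(list(p)) for p in permutations([base, ext_a, ext_c])]
    print(all(view == views[0] for view in views))
    print(manifest([base, ext_a, ext_c]))
```

Because each layer keeps its own digest regardless of what it is combined with, a registry only needs to store each layer once and can assemble a manifest for whichever combination is requested.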

So what we have started developing is what we call a dynamic-container registry that is able to say, OK, you want Postgres with extension A, extension C, and extension 35? Here you have. You do a Docker pull, and you get back exactly that image with only those extensions. The image is immutable. Actually, from a client perspective, from Docker perspective, it's indistinguishable from an image statically built with Docker buildx.

But it's generated in real time. This doesn't involve calling behind the scenes, Docker build, or anything like that. It's generated instantly for you. And this solves a problem that, if we wanted to build all possible combinations of all extensions, all plugins, in all orders and all the factors that we want for the image, we calculated that we would need to generate 10 to the 63--

ABDEL SGHIOUAR: Wow.

ÁLVARO HERNÁNDEZ: --container images, which is larger than the number of atoms in the universe.

ABDEL SGHIOUAR: [LAUGHS] OK.

ÁLVARO HERNÁNDEZ: And we don't want to pay the bill for those images.

ABDEL SGHIOUAR: Of course not.

ÁLVARO HERNÁNDEZ: We cannot. It's not that we don't want-- we cannot. So instead, we came up with this solution to generate, to create, a dynamic registry that is capable of generating, composing, images real-time, based on certain layers.
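
To get a feel for how quickly prebuilt combinations explode, the sketch below counts only the simplest case: every subset of a set of extensions, optionally multiplied by a number of other image variants. How the 10 to the 63 figure quoted here was actually calculated is not stated in the conversation, so this is an independent back-of-the-envelope count under those assumptions, not a reconstruction of it.

```python
def image_variants(num_extensions: int, other_variants: int = 1) -> int:
    """Number of distinct images if any subset of extensions may be chosen.

    Each extension is either in the image or not, giving 2**n subsets;
    other_variants multiplies in things like base image or Postgres version.
    """
    return (2 ** num_extensions) * other_variants


if __name__ == "__main__":
    for n in (10, 50, 200):
        print(n, f"{image_variants(n):.2e}")
```

With the 200 extensions mentioned earlier as a target, subsets alone already exceed 10 to the 60, before counting base images or versions.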

And this is not only applicable to Postgres. This can be applicable to anything. If you look now to the landscape of containers, you'll see that there's variants, tags with Alpine base image, or ubi9 base image, or Debian base image, and then one with the debugging symbols, and without the debugging symbols, one with this plugin, one without this plugin.

We have OpenTelemetry Collector. I love OpenTelemetry. And you go to the Collector, and it has hundreds of plugins, similar to Postgres.

And so how do you build your container with only these three plugins? People have container images with 50 plugins, and that's fine. Well, it's not. It can be done better than that. So this is a widely applicable technology to compose dynamically layers of containers and serve you fine-tuned, customized containers for you that are immutable once you receive them.

ABDEL SGHIOUAR: Yeah, I remember first time we talked about it. I was thinking, there are so many applicable use cases. Proxies have the same problem. OpenTelemetry agents have the same thing.

There's a lot of things where-- even beyond, I mean, you could even take this and apply it in the context of, I have code. I have dependencies. Instead of just building container images with my dependencies each time, I just pull the dependencies dynamically because some dependencies don't change very often.

ÁLVARO HERNÁNDEZ: Exactly.

ABDEL SGHIOUAR: So I don't need to recompile the code each time. And if you add to that some digital signature, then you can ensure that the dependency is a layer that has been vetted, secured, checked, tested, all that stuff. So it's pretty interesting. So what's the current status of this? Is it in progress?

ÁLVARO HERNÁNDEZ: Yes, it's in progress. We're working on it, and we're going to be publishing the early bits that we have as fully open-source code, also.

ABDEL SGHIOUAR: Awesome.

ÁLVARO HERNÁNDEZ: So anyone can use it. And it's two components, one component which is the dynamic registry itself, and the other one is the component that actually applies this to Postgres-specific use case. So we're building Postgres images and extension images, all built as OCI images--

ABDEL SGHIOUAR: Nice.

ÁLVARO HERNÁNDEZ: --so small OCI images, one for every extension, OCI images for Postgres and also other variants. And then the dynamic registry provides the capability to combine them all--

ABDEL SGHIOUAR: To compose them.

ÁLVARO HERNÁNDEZ: --in real time.

ABDEL SGHIOUAR: Awesome. Awesome. Well, Álvaro, this was a really good conversation. I really learned a lot.

ÁLVARO HERNÁNDEZ: Thank you very much.

ABDEL SGHIOUAR: And we are already way beyond how long it takes, which is not a problem. Usually, I like conversations. I think some of our audience like when conversations are very deep and technical. So any last departing words before I let you get your cup of tea, because you don't like coffee?

ÁLVARO HERNÁNDEZ: Right. Yes.

ABDEL SGHIOUAR: And you also don't like chocolate, and we have to talk about that another time. [LAUGHS]

ÁLVARO HERNÁNDEZ: Yeah, I'm a strange person, I know. I hate chocolate. I cannot even get close to it. I have a radar. I detect chocolate in small amounts far from me. Coffee, I don't like it. I mean, if I'm forced to, I can drink a cup of coffee and suffer a bit. Tea, yeah. Tea I like. [LAUGHS]

ABDEL SGHIOUAR: Sure. Cool, cool. Any last departing words?

ÁLVARO HERNÁNDEZ: Yeah. Well, what I would say, TLDR, is that you can run stateful workloads on Kubernetes very well, and it actually is my recommended way of doing so, definitely databases, and definitely Postgres.

There is a lot of community efforts, like this Operator Feature Matrix, that will help you also find the best solution for you, and there is a huge amount of innovation happening. I'm really interested in this idea of these dynamic containers. And if anyone wants to know more, just feel free to ping me anywhere. Happy to talk about it.

ABDEL SGHIOUAR: Yes. Awesome. And you can follow Álvaro on Twitter on-- and I'm going to attempt this-- ahachete?

ÁLVARO HERNÁNDEZ: Yeah, that's pretty good, actually.

ABDEL SGHIOUAR: Oh, that's cool. @ahachete on Twitter, or aht.es is your blog.

ÁLVARO HERNÁNDEZ: That's simpler. Yes.

ABDEL SGHIOUAR: All right. Thank you very much, Álvaro. Thanks for your time.

ÁLVARO HERNÁNDEZ: It's been my pleasure.

[MUSIC PLAYING]

KASLIN FIELDS: Thank you, Abdel, for that interview. I am a big fan of the stateful world of Kubernetes, so [CHUCKLES] this is near and dear to my heart, though, weirdly, I've honestly never really done much with databases. I've never touched Postgres. That's for sure. I've done a couple of small database projects, but I come from a storage background, so that's why I'm interested in the stateful space, even though I don't have that much experience with databases. [LAUGHS]

ABDEL SGHIOUAR: I knew you were going to like the episode. And Álvaro is really, really knowledgeable about stuff.

KASLIN FIELDS: You all covered so much. I learned a lot about Postgres that I didn't know before, like the fact that their extensions are called extensions and not plug-ins. [LAUGHS]

ABDEL SGHIOUAR: And the fact that it has tons of extensions, as well. [LAUGHS]

KASLIN FIELDS: Yeah.

ABDEL SGHIOUAR: I didn't know much myself, either. I mean, I don't have a lot of expertise in databases. But it was pretty cool to discuss all this stuff with Álvaro.

So the story of how this episode came together is that I met Álvaro last year in Crete in Greece. We were together in an [? AN ?] conference, and we started chatting. And then he was talking about Kubernetes and databases. And I'm like, well, we need to have you on the show at some point. [LAUGHS]

KASLIN FIELDS: So to review, we talked a lot about Postgres, which is the general PostgreSQL database, open-source thing. [LAUGHS]

ABDEL SGHIOUAR: Yes.

KASLIN FIELDS: And then what they're building is StackGres. And StackGres is a way to run a PostgreSQL-type database on Kubernetes. So that means, to me, that there is an operator involved, which you all talked about a bit. So I think we're talking about StackGres being run on Kubernetes, using the operator pattern, using a custom controller on Kubernetes.

ABDEL SGHIOUAR: Pretty much. So that's essentially one of the products they are releasing.

KASLIN FIELDS: OK, cool.

ABDEL SGHIOUAR: So StackGres is the Postgres operator for Kubernetes. And then you have this thing that we discussed, which is the dynamic extension container-building tool, whatever the name of that thing is going to be.

KASLIN FIELDS: Yeah. [LAUGHS] And I think we'll have a link in the show notes to a blog post where they talk more about that concept of dynamic containers.

ABDEL SGHIOUAR: Yeah. I mean, just to summarize for people who listen to this, I would assume if you are at this point in the episode that you have listened to the episode. [LAUGHS] But just in case you didn't--

KASLIN FIELDS: In case you just like hearing us talk and you just skipped to the end-- [LAUGHS]

ABDEL SGHIOUAR: Basically, yes. I mean, there are people probably who just like to listen to us talking.

[LAUGHTER]

Basically, they're building a thing that would allow you to say, I want Postgres version Y plus this, and this, and this extension. And they will dynamically generate a container image for you, which contains that Postgres version plus those extensions you have selected. So basically, the image you end up with in production is as minimal as possible, as little code as possible, and as small, as fast, all that stuff.

KASLIN FIELDS: It reminds me a bit of multi-stage builds in Docker, like there are ways that you can configure what you're building to create fewer layers so that you can optimize the container image that comes out of it. So I feel like there's something related there. [LAUGHS]

ABDEL SGHIOUAR: I would assume it uses that under the hood, but I think that the way they are doing it is that they don't really expose the Dockerfile to you. You have a YAML file in which you specify which extensions you want, and then they dynamically generate. So they pull the layers, and just put them together, and give you the final image.

KASLIN FIELDS: Smartly. [LAUGHS]

ABDEL SGHIOUAR: Yeah. I assume under the hood they will use multi-stage build because that's the way to go.

KASLIN FIELDS: Yeah. Learning how to do multi-stage builds for Docker containers efficiently is not something that I ever dove deeply into. [LAUGHS] So if somebody wants to do that optimization for me, all for it.

ABDEL SGHIOUAR: It's a pretty cool idea. I think that this opens up the possibility to other things, potentially. Just thinking here, maybe for Java workloads, you could potentially also dynamically generate Java container images with only the classes you need or only the stuff you need. It basically generates minimal images.

KASLIN FIELDS: There's a lot of interesting discussions that one can have about Java containers. [LAUGHS] The whole concept of the Java virtual machine plus the style of virtualization of containers together makes for some interesting interactions. [LAUGHS]

ABDEL SGHIOUAR: Which, as you are talking, made me think, we should probably have somebody to talk about this on the show. [LAUGHS]

KASLIN FIELDS: Probably. I think the thing there is Spring Boot. It's been a while since I've been in that space. Spring Boot is a thing, right? [LAUGHS]

ABDEL SGHIOUAR: Yes, yes. Well, maybe. I mean, there are multiple-- Spring Boot is just one framework. There are a lot of frameworks in the Java world, but you should think about somebody, somebody cool, and then we can have a deep discussion about this.

KASLIN FIELDS: I have some ideas.

ABDEL SGHIOUAR: Nice.

KASLIN FIELDS: But also, another thing that you all talked about was the Data on Kubernetes Community, which I didn't know that you didn't know about. But I'm a big fan of the Data on Kubernetes Community. I have a number of friends and open-source colleagues [LAUGHS] who have ended up working with the Data on Kubernetes Community closely.

I pop in there every now and then. [LAUGHS] I've always wanted to get more involved than I have been. But as you all said in the episode, they've run a co-located event at KubeCon. They are a virtually active community, so you can attend virtual meetups or read their blog posts. The report he mentioned is a fantastic output of the Data on Kubernetes Community.

So their focus is, as it says, data on Kubernetes. So I actually did a talk at a Data on Kubernetes day at KubeCon about the concept of stateful and how data on Kubernetes means more than just databases. But it is largely focused on databases. It's a great community.

ABDEL SGHIOUAR: Yes. It was my first time learning about it. It looks like they also just have this matrix, this benchmark, that they use to compare different operators. So they do have quite a lot of interesting things going on. I mean, I think I heard about it, or probably, I talked about it at some point. I just never really looked into it. But yeah, it's pretty interesting stuff.

KASLIN FIELDS: There are plenty of communities, so it's [LAUGHS] easy not to be.

ABDEL SGHIOUAR: I mean, is this a working group?

KASLIN FIELDS: No. It is not a Kubernetes open-source thing. It's its own thing.

ABDEL SGHIOUAR: It's a separate community?

KASLIN FIELDS: It's like a meetup, but bigger. [LAUGHS]

ABDEL SGHIOUAR: Got it.

KASLIN FIELDS: They've got a website. They do meetups. They do the reports. They probably have a Discord, or a Slack, or something. I haven't actually looked into that. [LAUGHS]

ABDEL SGHIOUAR: Got it. All right. Interesting, and we will leave a link in the show notes if people want to look into it.

KASLIN FIELDS: There are several links in the show notes about the community itself, and the report, and the white paper. The white paper, actually, that he mentioned-- I think I found the right one-- is by the CNCF Technical Advisory Group, or TAG, for storage. But it's called the Data on Kubernetes White Paper, which is funny because it has the same name as the community. [LAUGHS]

ABDEL SGHIOUAR: Nice, nice. Cool. Good.

KASLIN FIELDS: Yeah.

ABDEL SGHIOUAR: And before we end, congratulations on taking over the official Kubernetes X account.

KASLIN FIELDS: Thanks. To pull back the curtain there. [LAUGHS]

ABDEL SGHIOUAR: I know. I know that has been in the works for a while.

KASLIN FIELDS: Yeah. So I am a co-lead of the subproject for contributor communications in open-source Kubernetes. And so that's the group that now owns that account. So we've been working with the CNCF for a while on how exactly the transition of that ownership will go and making sure that we have at least concepts [CHUCKLES] of rules in place.

We're going to be documenting those over the next couple of weeks around how we manage that account, what kind of content goes out on it, and all that kind of stuff. So we're very excited. We've got a team of open-source contributors ready to work on that. And we're super excited to start talking more with end users through that.

ABDEL SGHIOUAR: Nice. Basically, for the people who don't know, if you are a contributor to Kubernetes and you have received an email, it probably came from the group that Kaslin works in.

KASLIN FIELDS: Yep. [LAUGHS]

ABDEL SGHIOUAR: And so now you are doing the account. That's pretty cool.

KASLIN FIELDS: I now have this vision in my head of our community. If you've ever played a game or watched one of the shows where there's the information broker behind [LAUGHS] all of the monitors, controlling things behind the scenes, that's what I'm going to have as our headcanon for our group now. [LAUGHS]

ABDEL SGHIOUAR: Nice, nice. Well, cool. I mean, it's good to see that the account goes back to the community.

KASLIN FIELDS: Yeah.

ABDEL SGHIOUAR: I didn't know it was owned by the CNCF, but now probably people will be more willing to share stuff on the account, so that's cool.

KASLIN FIELDS: Yeah. Our other accounts are by contributors for contributors, so they're going to be a bit lighter in their [LAUGHS] moderation of what goes out on those. But contributors, if they have any news to share, can talk to us, and we can get it out there for them.

ABDEL SGHIOUAR: Nice.

KASLIN FIELDS: If you're on Slack and you're a Kubernetes contributor, pay us in the SIG a visit. Also, if you want to get involved as a contributor, we have good projects. [LAUGHS]

ABDEL SGHIOUAR: Awesome. Yeah, go check it out. All right. Well, I think we should probably not make this episode longer than it has to be.

KASLIN FIELDS: Yeah. Thank you all for sticking with us, and I hope you enjoyed learning about Postgres and databases on Kubernetes.

ABDEL SGHIOUAR: Thank you.

[MUSIC PLAYING]

KASLIN FIELDS: That brings us to the end of another episode. If you enjoyed the show, please help us spread the word and tell a friend. If you have any feedback for us, you can find us on social media at @KubernetesPod or reach us by email at kubernetespodcast@google.com.

You can also check out the website at KubernetesPodcast.com, where you'll find transcripts, show notes, and links to subscribe. Please consider rating us in your podcast player so we can help more people find and enjoy the show. Thanks for listening, and we'll see you next time.

[MUSIC PLAYING]
