#236 September 17, 2024

Dagger, with Solomon Hykes

Hosts: Abdel Sghiouar, Kaslin Fields

Solomon Hykes is the co-founder of Dagger. He is probably best known as the creator of Docker, the tool that changed how developers package, run, and distribute software over the last 11 years. His impact on our industry is undeniable. Today, we discuss his new venture, Dagger: a new approach to how we do CI/CD.

Do you have something cool to share? Some questions? Let us know:

News of the week

Links from the post-interview chat

ABDEL SGHIOUAR: Hi, and welcome to "The Kubernetes Podcast" from Google. I'm your host, Abdel Sghiouar.

KASLIN FIELDS: And I'm Kaslin Fields.

[MUSIC PLAYING]

ABDEL SGHIOUAR: In this episode, we speak to Solomon Hykes. Solomon is the co-founder of Dagger and is probably best known as the creator of Docker. We spoke about Dagger, CI/CD, and more.

KASLIN FIELDS: But first, let's get to the news. Kubeadm is moving to a new configuration version with the release of Kubernetes 1.31. The v1beta4 introduces some changes to the configuration file used to deploy Kubernetes with Kubeadm. The old version, v1beta3, is officially deprecated and will be removed after three minor Kubernetes versions.

ABDEL SGHIOUAR: The 1.32 release cycle for Kubernetes began on September 9, with an expected release date of December 11, 2024. We wish the release team good luck and look forward to interviewing the release lead.

KASLIN FIELDS: The CNCF announced updates to the CKA exam. The Certified Kubernetes Administrator certification was one of the first to be available for platform administrators. The new updates introduced changes to competencies required for passing the exam and will go into effect no earlier than November 25, 2024.

ABDEL SGHIOUAR: The CNCF and Linux Foundation Research are running a 2024 generative AI survey. The survey aims to understand the deployment, use, and challenges of generative AI technologies in organizations and the role of open source in this domain. The target population for the survey is professionals familiar with the use of generative AI in their organizations. It should only take about 10 minutes to complete the survey. If you're interested in participating, you can find the link in the show notes.

KASLIN FIELDS: Microsoft's Azure Container Networking team has announced new enhancements to Advanced Container Networking Services. Advanced Container Networking Services is a new product offering designed to address the observability and security challenges of modern containerized applications. The updates include introducing fully qualified domain name filtering as a new security feature. And that's the news.

ABDEL SGHIOUAR: Today, I'm talking to Solomon Hykes. Solomon is the co-founder of Dagger. He's probably best known as the creator and co-founder of Docker, the tool that changed how developers package, run, and distribute software in the last 11 years or so. His impact on our industry is undeniable, and I'm incredibly honored to have him on the show today. Welcome to the show, Solomon.

SOLOMON HYKES: Thank you. Thanks for having me.

ABDEL SGHIOUAR: I think that we don't really need to do any introductions. People pretty much know you. If you have used Docker, you have used your software before, right? But we are here today to talk about Dagger, which I learned about, and the more I dug into it, the more intriguing it became to me as a concept, especially what you are trying to achieve. Let's hear it from you. What is Dagger? How can you describe Dagger to people?

SOLOMON HYKES: Dagger is-- well, it's an engine that can run your pipelines in containers, and it can run them anywhere. So that's our short version. It's most commonly used to improve CI, Continuous Integration, which, a lot of times, is a mess in a lot of software teams. It's something that you just kind of cobble together over time as you ship your app, and it just kind of gets more complicated and messy, and you ignore it because you have to move fast. And then, eventually, if your project lives long enough, you can't ignore it anymore.

ABDEL SGHIOUAR: Yes.

SOLOMON HYKES: And it breaks. It becomes super slow or things just stop working. Maybe the person who wrote the first version is gone now. It's kind of just glued together. And so Dagger is an attempt at fixing that. And we're doing it in a way that's very similar to how Docker actually fixed very similar problems for the application.

So the insight is that the problems that you have in your CI/CD pipelines are very similar to the problems that people used to have with their applications, starting with the fact that you can't run it locally and then trust that it'll run the same on the server. It was also very hard to run them across different servers.

So that was just reality before Docker. Every server was kind of a unique snowflake, and reproducing environments was very hard. That's still the reality today with your pipeline. So the application now is more portable, but the pipelines that deliver the application aren't.

ABDEL SGHIOUAR: So it's interesting the way you describe Dagger when you started talking about it, is you said it's a way to run pipelines. But from my understanding, at least, when I was looking into it, it's also a way to define the pipeline, because you have the SDK components, which are available in TypeScript, Node.js, and Go, right?

SOLOMON HYKES: Mhm, and Python also.

ABDEL SGHIOUAR: No, Go, Python--

SOLOMON HYKES: Yeah.

ABDEL SGHIOUAR: Go, Python, and JavaScript. And then you have the runtime part, right? And that's kind of quite different than how most CI tools are today, because they tend to be the runtime for the CI and then a configuration language, which is usually YAML or something. So how do you think Dagger is different than how current CI tools work today?

SOLOMON HYKES: Yeah. The best parallel is to what happened with Docker and containerization about 10 years ago now. You had applications that were stuck on a server, and they only worked on that server that was a unique snowflake, very hard to reproduce environments.

And so the solution was to containerize it. Let's lift that application and package it into some sort of a portable format so that you can give it to another server, give it to another developer to run on their machine. It's a thing now. I can look at it, and I can run it here and here. And there's a runtime that guarantees that the behavior will be the same.

And like I was saying, the CI/CD pipelines have not benefited from that. So the application is now more portable across servers and from dev to production, but the pipeline isn't. It's stuck on the CI server. And so it's very hard, for example, to make changes and improvements to your CI/CD pipeline because you've got to push and pray. We call it push and pray. You make a change in your YAML or Groovy or whatever, git commit, git push, pray. Oh, no, I made a typo. Start over.

I mean, it's that way because CI started out as just a server. So the pipeline definition was just a configuration for a server that runs your builds. But now, it's much more than that because applications got much more complicated. And so the pipeline logic, the logic that builds and tests and deploys and automates all the tasks to get your application ready and deployed, that's basically its own application now. It's very complicated.

And if you look at it as an application, all of a sudden, you see an application that is in desperate need for better tooling and a better development experience. So it's in the Stone Age compared to the application itself. So the starting point is, how do we containerize that? And the reason it hasn't been containerized is that your pipeline is an application, but it's a really special kind of application. So containerizing a CI/CD pipeline is not the same as containerizing a web application. If it were, people would do it already.

And the difficulty comes from a few places. One is that you can't just run the whole thing in one container. You need to actually run each individual step of the pipeline in its own container. And then you need to orchestrate the movement of data, of artifacts, flowing from one step to the next.

And first-generation container engines-- Docker, et cetera-- they don't know how to do that. They don't know how your pipeline works. So you can give them the whole thing, and they'll run it, or you can give them each step in a container, and they'll run each step. But they don't know what's going on between the steps. And so really, they just don't know how your pipeline works.

And so that's what gets us to your question, which is the coding part, the key to containerizing your pipeline so you can solve the problem of the pipeline not being portable, not being standardized, is to make the engine that runs it smarter. You need to be able to describe the pipeline to it. So you need an API.

So the Dagger engine has an API that lets you describe your pipeline as a graph. Here's a step that does the build. Here's the step that downloads the source code, here's the step that deploys, or whatever. And here's the linkage between them. And here's the exact artifact that will flow through.

And then once you have that API, on top of that, you add SDKs in native languages-- like Python, Go, and TypeScript-- to allow the people who understand the pipeline best to describe it. And that's the team developing the application that the pipeline will deploy. So that's the key.
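The graph model Solomon describes here can be sketched in a few lines of Python. This is a toy illustration only, not Dagger's actual API: steps are nodes, the edges are the artifacts flowing between them, and a topological sort recovers a valid execution order.

```python
from graphlib import TopologicalSorter

# Toy pipeline graph (invented names, not the Dagger API): each step
# maps to the set of steps whose artifacts it consumes.
pipeline = {
    "checkout": set(),
    "build": {"checkout"},
    "test": {"build"},
    "deploy": {"build", "test"},
}

# A topological sort yields an order in which every step runs only
# after the steps it depends on have produced their artifacts.
order = list(TopologicalSorter(pipeline).static_order())
print(order)
```

With this particular graph the only valid order is checkout, build, test, deploy; an engine that understands the graph can derive that for you instead of you scripting it by hand.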

Right now, you kind of have these silos. You have the people developing the application, and then you have the people creating the pipelines that will build and deploy that application. And usually, they don't really-- they're not able to help each other because the pipeline is this complicated mess of YAML and shell scripts. It's its own machine.

So you've got the DevOps team or the SRE team or the build team, the designated DevOps people, in charge of that. But the problem is, they're centralizing work that grows exponentially as your team grows, because the toolchains of these teams are always evolving. Now, there's an AI feature. So now, there's all these new tools to do inference and set up models. And there's a whole new team. They're doing data engineering in Python now, and all these tools. And they need those tools added to the CI/CD pipeline.

But now, they don't know how to change the CI/CD pipeline, so they're waiting for the DevOps team to do it. The DevOps team is not familiar with the new tooling, so they have to go and figure out, OK, how do I integrate this new tool chain into the CI/CD pipeline? So everyone's kind of stuck waiting on each other.

You can fix that if you allow each team to program their pieces of the pipeline in the language that they're familiar with. So the Python team-- a team that develops in Python will also develop their pipeline logic in Python. Same thing for Go, TypeScript, et cetera.

ABDEL SGHIOUAR: Right, right. So I mean, there are quite a lot of interesting ramifications to what you described. I think the first one that jumped to mind is, even after containerization, when we got to the step where you could write your CI pipeline as a set of containers where each task is an actual container, there is always the question of, who is responsible for that actual container that executes that actual step? Is it the DevOps team in this case, or is it the developer team?

So what if the developers move from one version of Python to another? Who is going to update that container step? Whose responsibility? And I think that-- the other thing that came to mind, from the way you describe it, it sounds like with Dagger, you could potentially run the pipeline on any server. You don't need a CI server.

SOLOMON HYKES: Right, exactly. Yeah.

ABDEL SGHIOUAR: CI becomes just like any other application. As long as the engine is on that server, you could just run the CI there, and it would just work, right?

SOLOMON HYKES: Exactly. That's a big part of the appeal, is basically unbundling CI, taking the logic from these pipelines that do valuable things, that can automate work in your software project, and separating it from a particular server and a particular infrastructure platform. And a lot of times, those are proprietary platforms. These pipelines will only work on machines operated by GitHub or CircleCI or GitLab or whatever.

And there's a separation of concerns that's needed, because a lot of times, you want to run those pipelines locally. You want to shift left. And what happens, usually, is people already do. They just have to do the work twice. So there's the build. There's the pipeline. There's the official pipeline that runs when you push a new version of the code to the Git server. And then there's the semi-official or unofficial pipeline that is the set of shell scripts and makefiles and more that are glued together that you can run locally. They don't have feature parity.

ABDEL SGHIOUAR: Yeah, they're not identical.

SOLOMON HYKES: They have a ton of drift. But fundamentally, they aspire to be the same thing. It's just that the tooling-- you can't easily take the CI pipeline and run it locally, so people find a way. Again, it's exactly the same problem that we were addressing with Docker. That was just everyday life for the application, just running it locally. And having the same thing was just impossibly hard.

And now, we take it for granted. So we hope the same thing will be true for the pipeline, that your CI server is just a server that happens to be running this set of pipelines that-- because you need the power or you like-- there's something about these servers that you like. But you could change tomorrow. You're not locked in.

ABDEL SGHIOUAR: Or, on the topic of people handcrafting shell scripts and makefiles, sometimes, cloud providers would provide you with an emulator that emulates your CI pipeline locally, which--

SOLOMON HYKES: Of course.

ABDEL SGHIOUAR: --which, then, they have to maintain two pieces of software, right?

SOLOMON HYKES: Totally. Yeah, totally. And it's-- remember OpenStack? That was the same idea. It was considered very weird and unfamiliar, what we were doing with containers. And I think-- because it was very abstract initially when we said, don't you see this is not portable? Wouldn't it be cool if you could just run the same thing? And the answer was, oh, yeah, we're on that. We're just going to make the VM layer, the machine layer, very standard and open and easy to replicate.

And so sure, you're running on a set of proprietary VMs over there, but we're going to force everyone to implement these standards, and there's going to be this implementation that does the same thing as AWS. But it turns out, it's never actually the same thing. Plus, it's not efficient to do it that way. You don't actually want all of that.

If you look at-- we look at CI pipelines all day long. That's all we do. We look at people's pipelines. And a typical CI/CD pipeline today, it drags so much stuff. It's so heavyweight. There's so many dependencies. There's so much complexity. It's just layers and layers and layers of Band-Aids. And at the bottom is always shell scripts. But there's always layers to hide the shell scripts and then another layer to abstract away the abstraction. And so who wants to run that locally?

I admire-- there's a project called ACT. You're probably thinking of that one. There's a few, but there's a project called ACT, A-C-T, that aspires to run your GitHub Actions workflows locally. And I could never do that. I mean, I admire people who just take on that project because it's an impossible target-- full compatibility with a system that's not designed to be portable-- but you try anyway. You go as far as you can.

And then every day, I'm sure, someone brings this terrible use case where they really need compatibility, and now, you're just-- you're debugging two things. You're debugging your CI pipeline, and you're debugging the thing that promises compatibility. So yeah, that was application deployment 10 years ago, and it's pipeline deployment today.

ABDEL SGHIOUAR: Yeah, because we started talking about CI being configuration and shell scripts and makefiles. So Dagger is primarily programming language-driven.

SOLOMON HYKES: Yes.

ABDEL SGHIOUAR: So there is an SDK. There are programming languages. What was the appeal for doing that instead of just another YAML?

SOLOMON HYKES: Well, that was really the starting point. There's something about pipelines that just makes it impossible to solve the problem we want to solve without programmable pipelines. It has to be real code because of the very nature of a pipeline. The point of a pipeline is to connect pieces together. It's to be the glue. It's modular by nature. It's that build connected to that source control connected to that deployment, et cetera, et cetera, et cetera.

And the "et cetera" part is important because it's always changing. It's always growing. The I for CI is integration. So there's kind of a built-in network effect. You're always looking to connect another thing. And so the component system, the composition system, is everything. It can't be this optional thing you tack on later.

And I think every pipeline system that starts out as a strict, nice, pleasant configuration gradually devolves into trying to be code. And sometimes, they don't realize it. Sometimes, they have an epiphany, like, oh, shit, this should have been code. OK, god, let's try to fix it. But we're just embracing the software aspect of a pipeline from day one.

So our starting point is, day one, and we've been working years on finding the right model. OK, if I built an OS, a specialized OS for running pipelines, these kinds of pipelines-- build, test, deployment, but also data pipelines. Now, there's AI pipelines. And all of those are completely intertwined today. If you go to any software team's stack and you look at their pipelines-- build, test, deployment, data engineering, and now, increasingly, AI inference, fine-tuning, whatever.

Those are not cleanly separated at all. They're, like, glued together. So if I were to design an OS that can run these things as an application that I can program, with a full-blown software ecosystem where I'm just installing new components to my pipeline with the same level of productivity and ease that I would when I add a library to my mobile app or my web app, OK, what would that look like? That's day one for us. That was day one.

Otherwise, it's a non-starter. That's my opinion. You always-- you run into a wall at some point. You run into some pipelines that you just can't deal with because you didn't bake them. You didn't think of that particular shape of a pipeline, so you made assumptions in your pipeline system, in your runtime.

Every pipeline is going to be these three phases. There's going to be this phase and then this phase. And then one day, a user says, well, actually, I need a completely different-- I do this completely differently. And then you're screwed. So it's got to be-- yeah, it's got to be like an OS, something that's programmable with an API. And when I say programmable, you can write code, and there's a runtime and system calls, et cetera. So we're thinking of this as an OS for pipelines.

ABDEL SGHIOUAR: Yeah. And one of the first things I noticed when I was looking into the documentation-- and this is one of my absolute favorite things in programming languages in general-- is the function chaining, the "with." Start with an image, do this, add a file, remove a file, do this, do this, do this, dot execute.

What's your take-- I mean, obviously, you implemented it, so you like it. There are people who don't, for reasons. What's your opinion about that? What's your take? Function chaining or using variables to pass stuff between steps in a function?

SOLOMON HYKES: Oh, I see. You're saying those are two opposite-- like, option A, option B.

ABDEL SGHIOUAR: These are the two-- typically, tends to be the two sides of the conversation. There are people who like function chaining for readability, like me, and there are people who absolutely hate it.

SOLOMON HYKES: Oh, I see. There are design constraints here to get to the best system. And the goal for us is the best OS to develop and run your pipelines. I'm simplifying. We don't use OS in our docs, but it's helpful to think of it as an OS, because these are applications. These pipelines are applications.

And so the starting point for us was, what does the engine look like? What does the kernel look like? What's the execution model for a pipeline? Because pipelines are different than regular applications. And so the starting point is, OK, the ideal model for executing a pipeline, for modeling it, is a DAG.

That's why we're called Dagger. It's a Directed Acyclic Graph. So it's really boxes and arrows. And a box is a task, and the arrow is an artifact flowing from one task to the next. Or actually, you could flip it. The box is the artifact. The arrow is the task that's transforming it. It works both ways.

But the point is, it's a graph, either way. And each task-- we call them functions-- in the graph is executed concurrently. So this is-- it's concurrency. Running things in parallel is baked in. And that dictated our choice of technology. So we use BuildKit as our kernel, which is the same tech that powers Docker Build. So when you write a Docker file, that's internally-- Docker Build will convert it to this graph definition that then is executed by BuildKit.

And we discovered BuildKit can do way more than build. It's more of a general-purpose DAG execution engine. And so you can think of Dagger as, what if you built an OS on top of BuildKit? And instead of just running builds, you ran the entire pipeline. It turns out, it works great.
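The concurrency Solomon describes-- each function in the graph running as soon as its inputs exist-- can be sketched with a toy asyncio pipeline. This is illustrative only; BuildKit's real interface looks nothing like this, and the step names are invented.

```python
import asyncio
import time

# Toy sketch of concurrent DAG execution: steps with no edge between
# them run in parallel; a downstream step starts only once the
# artifacts it consumes exist.

async def step(name, seconds):
    await asyncio.sleep(seconds)  # stand-in for real work
    return f"artifact:{name}"

async def pipeline():
    started = time.monotonic()
    # 'lint' and 'unit-test' are independent, so they run concurrently.
    lint, unit = await asyncio.gather(
        step("lint", 0.2), step("unit-test", 0.2)
    )
    # 'package' consumes both artifacts, so it waits for them.
    package = await step("package", 0.1)
    return package, time.monotonic() - started

artifact, elapsed = asyncio.run(pipeline())
print(artifact, round(elapsed, 1))  # the two 0.2s steps overlap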

So that's our starting point. OK, we have this engine. It's the most powerful way to model and run pipelines. You get all these benefits. It's faster, caching everywhere. You get all these benefits. OK, but how do I program it? So we have, this is the best way to run pipelines as a DAG with parallel tasks.

And then separately, the other insights, the only way to really solve pipeline development and pipeline deployment for everyone is to have a programming model that's real software, where people can actually exchange components, reuse each other's codes. So you can really take any pipeline and run it on Dagger. So you need both of these things.

How do we connect them? How do you program this weird engine that runs things as DAGs and has a declarative API for modeling that DAG? That's the key thing. The engine is declarative. The API for the engine is declarative, always will be, because you can't program a graph of concurrent tasks imperatively, because an imperative model is you're telling one computer, do this, then do this, then do this.

But the DAG is lots of little computers, basically, and each one is performing its tasks. So it's like a factory where there's a bunch of stations, and the robots in the factory are each doing their thing in parallel. So you can't just write a Python script to describe that-- or a shell script. It doesn't map.

So most of our struggle before launching, looking for the right design, has been figuring out this dilemma. How do you create a great programming model, a great developer experience for this inherently declarative engine? And initially, we thought we found the solution.

We found a declarative configuration language that was more powerful than the others. And it looked and felt kind of like a nice, familiar, imperative language. So it was, like, YAML but better. And you had reusable components. You had templating. You had comments. You had lots of cool stuff. That was a language called CUE.

So when we launched, the only way to program Dagger was to write these pipeline definitions in this language called CUE. And then we launched on that, and then we spent, basically, six months supporting people and helping them build their pipelines. And then we realized, OK, people just don't want to learn a new language. They love the power of this engine, but it's just too much friction to have to learn this whole new language.

And so we went back to the drawing board, and we found a way to generate SDKs in a declarative language-- sorry, imperative language, like Python, Go, TypeScript-- that could then query a declarative API. And the model for us is actually a very familiar model. There's a precedent, which is SQL. So when you're writing a web app in PHP and you make SQL queries, that's an imperative language dynamically calling--

ABDEL SGHIOUAR: A declarative--

SOLOMON HYKES: --a declarative API. SQL is a declarative language. And so in our case, we use GraphQL. GraphQL is also a declarative language. It turns out GraphQL is great for navigating graphs. Who knew?

ABDEL SGHIOUAR: Happy coincidence.

SOLOMON HYKES: Yeah, so summarizing all that, long story short, now Dagger is-- it's an engine that runs your pipelines as a graph of concurrent tasks with data flowing through them, which is the best model. It's the optimal model for running a pipeline at a fundamental level.

And that engine is driven by a GraphQL API that lets you, any client, written in any language-- or you can do it in Curl if you want. You can do it from a web browser-- describe this DAG of tasks. And these tasks run in containers, et cetera. And so you have the full power of this engine available to you in this declarative API. And then on top of that, we have imperative SDKs that make it easy to query and extend that API in your language. So that's the stack.
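The layering Solomon describes-- an imperative, chainable SDK that compiles down to a declarative query-- can be sketched with a toy builder. All the names here are invented for illustration; this is not the Dagger SDK or its GraphQL schema.

```python
# Toy sketch: imperative chained calls accumulate a declarative spec,
# which only becomes a query when you ask for it (invented names).
class Container:
    def __init__(self, ops=()):
        self.ops = list(ops)  # the accumulated declarative spec

    def _with(self, op):
        # Each call returns a new value, which is what makes
        # chaining safe: nothing executes yet.
        return Container(self.ops + [op])

    def from_(self, image):
        return self._with(("from", image))

    def with_exec(self, args):
        return self._with(("withExec", args))

    def to_query(self):
        # Nest the accumulated ops into a GraphQL-like selection.
        query = "stdout"
        for kind, arg in reversed(self.ops):
            query = f"{kind}({arg!r}) {{ {query} }}"
        return f"{{ container {{ {query} }} }}"

c = Container().from_("alpine").with_exec(["echo", "hi"])
print(c.to_query())
```

The chain reads imperatively, top to bottom, but what reaches the engine is a single declarative description of the whole graph, which is what lets the engine parallelize and cache it.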

Back to your question, when you're at that point, you made your way to a great developer experience for programming DAGs. That gets you to one of the options, naturally. The other option-- it's not a matter of taste. Oh, I like just-- I don't like chaining. If you don't like chaining, you're going to hate the performance and extensibility of your pipeline because that's what a pipeline is. You're chaining operations.

So the point is, it's not a matter of superficial, subjective preference in my mind. There's one path that gets you to the solution and the other that gets you back to where you started, just with another abstraction that won't scale.

ABDEL SGHIOUAR: Yeah. I mean, it's definitely interesting to hear it from you, the logic, in the sense that you walked your way backwards toward SDK and not toward, like-- not starting with the function chaining and then getting into how the engine is designed, but the other way around, right?

SOLOMON HYKES: Yeah, absolutely. Yeah, and I mean, it was-- we did not have any preconceived notion of, oh, this is the syntax we want, to the point that we started with a completely different language. Really, we went on a quest to finding the best developer experience in a very empirical way, which is why the community part is so important.

We call it community-led growth. So everything we do starts with the community. It's, like, the first feature of the platform. It's the same with Docker. So we're kind of refining and improving the model. So we have a Discord server that's very active. We have all these events and calls, and it's just a really fun and engaging place to be. It's all the-- it's like a support group for people-- for traumatized pipeline engineers. We all talk about our terrible pipeline--

ABDEL SGHIOUAR: CI [? terror. ?]

SOLOMON HYKES: --stories and how to fix it. But the reason-- it's not just a gimmick that there's this community.

ABDEL SGHIOUAR: Yeah, there is feedback.

SOLOMON HYKES: It's just that we need the constant feedback. So we're developing the thing in the open. And every day, we're drinking from the fire hose of feedback people are-- they take the new version, and they go and try to improve their pipelines, and they come back, and they say, this was great, this didn't work. And we've been doing this for years. And so the-- starting with, hey, try this language. And then, OK, no. This, yes. This, no. And then we just keep going.

So yeah, no preconceived notion, but a very ruthless, pragmatic, empirical method of what's working here. What do people love? What's making them more productive? And we make assumptions. And sometimes, we're right. Sometimes, we're wrong. But, yeah, that's how we got to this point.

ABDEL SGHIOUAR: Sounds like I need to join the Discord server, because that would be--

SOLOMON HYKES: Oh, yeah, you should join. If any of this seems interesting to anyone listening, the one takeaway is you should join our Discord, because a lot of people in that Discord share your weird, niche interest for DAGs and pipeline engineering.

ABDEL SGHIOUAR: And function chaining.

SOLOMON HYKES: Yeah, function chaining. And if you hate it, come tell us why.

ABDEL SGHIOUAR: Yes. So the next question is going to be-- because you talked a little bit about the logic of starting from the core, the API, the declarative parts, and then the imperative parts with SDK.

And one thing, when I was looking at documentation, is you can write your pipeline as a set of functions, but you can execute each function separately, which is the equivalent of executing each step in a pipeline separately with the Dagger CLI. But you could also execute just-- let's quote, unquote, I'm putting air quotes here-- the "last step," which, for example, for an image is publish, and it will automatically resolve all the previous steps.

SOLOMON HYKES: Right.

ABDEL SGHIOUAR: Can you talk about that? Because that's actually super interesting.

SOLOMON HYKES: Yeah, I agree. And it's almost like-- it sort of requires rethinking the model. It's sort of, OK, you have to change your frame for how you look at the problem. And then once it clicks, everything is much more fun. And so this frame that you're talking about, it's similar to just-in-time manufacturing.

ABDEL SGHIOUAR: Yeah.

SOLOMON HYKES: You're familiar with that?

ABDEL SGHIOUAR: Yeah.

SOLOMON HYKES: So it used to be, you would just build 100,000 units at a time of this one car model, and then you store it in a warehouse, and then you wait for people to--

ABDEL SGHIOUAR: Buy it.

SOLOMON HYKES: or order it or whatever. And then Toyota and this whole lean movement came in, and I guess now, it went through a whole hype curve. But, I mean, it was revolutionary, the idea that you would manufacture on demand. And if your systems could support it, it required retooling everything, rethinking everything. But the efficiencies were massive, because you just didn't have to deal with all this inventory, and you could adapt and be much faster.

But the parallel here is, start with what you need plus the DAG of all the dependencies. And then the engine will figure out what needs to be executed or has been executed before and can be loaded from cache. So you don't talk to Dagger in terms of what to do. You describe a full graph of what could be done, and then you say what you want. And then from there, the engine will give you what you want.
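The pull model described here-- ask for the final result, and the engine resolves or reuses everything upstream-- can be sketched with a toy memoizing decorator. The function names and the cache are invented for illustration; this is not Dagger's actual mechanism.

```python
# Toy pull-model evaluator: describe the whole graph as functions,
# ask only for the node you want, and let the cache skip anything
# already computed (invented names, not Dagger's API).
cache = {}
calls = []  # records which tasks actually executed

def task(fn):
    def wrapper():
        if fn.__name__ not in cache:  # cache hit -> skip re-running
            calls.append(fn.__name__)
            cache[fn.__name__] = fn()
        return cache[fn.__name__]
    return wrapper

@task
def build():
    return "image-sha256:abc"

@task
def test():
    return f"tested({build()})"

@task
def publish():
    # Asking for publish pulls in test and build on demand.
    return f"pushed({test()})"

print(publish())  # first request runs publish -> test -> build
print(publish())  # second request is served entirely from cache
print(calls)      # each task executed exactly once
```

Nothing runs until you ask for `publish`, and nothing runs twice: the same demand-driven evaluation is what lets an engine replace an explicit push-then-pull through a registry with a cache lookup.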

So if you want to publish an image to this registry, say that, and Dagger knows exactly how to get you there. What's interesting is, over time, you start collapsing, because here's what happens. You start-- usually, what happens is you have an existing set of pipelines. And then you start-- for whatever reason, you start using Dagger. You want to simplify it or make it faster.

And so usually, you start with a small piece. You pick the pipeline that's just the most painful, and you would call it Dagger-izing. You Dagger-ize it. And you can do that very easily, very incrementally, because we don't-- you don't have to throw away your existing CI. It's just a tool that will run inside your CI. And over time, your CI gradually-- you kind of eat it from the inside. It just becomes this envelope for running Dagger pipelines.

And then the same pipelines, you can run locally outside of CI. So as you do that, you start finding opportunities to collapse with this on demand-- this pull model. It's a pull model as opposed to push. For example, in a typical CI/CD pipeline these days, there's a lot of intermediary artifacts that are being built and then pushed somewhere only to then have another pipeline--

ABDEL SGHIOUAR: Pull them.

SOLOMON HYKES: --pull them to do something else with them. Sometimes, there's several steps of that, especially now, when there's models and other additional layers. And we just kind of got used to this. Like, oh yeah, here-- what's this pipeline's job? Oh, it's to build and push. What's this other pipeline's job? Oh, it's to pull and do something else and then push or kubectl apply or whatever.

And then so you Dagger-ize-- what happens is, you Dagger-ize one. Great. You just made it more efficient. And then later you say, OK, let's Dagger-ize more stuff. Let me Dagger-ize this. So now, you have a Dagger pipeline pushing to registry and then another Dagger pipeline that's triggered by some complicated system that pulls from that registry to do something else.

And at some point, you're like, wait, why do I need this registry? It's just a cache. I'm just using it as a cache, literally. And I could remove another 500 lines if I just merge these two functions, and I just call the second one, and it will kind of on demand call the other one, and the artifact will just flow through. And Dagger has a cache. So now, literally, the artifact that you used to push and pull explicitly is just in the cache.

So I think that's a really powerful mechanism, and I think it's going to take several years for it to play out. And it's not just Dagger. It's just a general efficiency that just needs to be implemented because this is so inefficient right now. And the end-- I think the end result is-- I think that the very concept of an intermediary CI server kind of goes away because that whole thing in the middle is the embodiment of this push model, because I'm developing here. So when I'm developing, I need to know the results of my tests, or I need to lint my code, or I need to do all these things.

So there are things that you need during development. And then there's things you need at deployment. So really, it's the production server that needs the final container or Kubernetes configuration or whatever to apply. And ideally-- and I don't think we're ready as an ecosystem yet. But eventually, I think the production server can just say-- can ask for the exact artifact it needs. And then the pipeline to produce that artifact kicks off on demand at that moment.

So whatever tests or builds or code generation steps or anything, configuration generation, anything at all, can be kicked off on demand from deployment. So I think, if you have development and then CI and CD, to simplify, I think development and deployment kind of eat CI. They each kind of eat half of it, and you end up with two things, development and production.

ABDEL SGHIOUAR: Yeah. Yeah, and just to be clear for people who are listening to this, what we've been talking about for the last five minutes, I guess. So let's take-- and you correct me if I'm wrong. So let's take a very simple example of a pipeline. You need to build an image, push it somewhere, test it, and deploy it-- or publish it. Well, let's say, build, test, publish.

If you are using a typical pipeline, that would be three steps of YAML-- build, test, publish-- with all this intermediary pull/push you're talking about. Or it doesn't necessarily have to be pull/push. It could be saved to a local folder shared across the steps. That's an example.

But with Dagger, what you would do is you would write your build function, write your test function, which depends on the build function, then write your publish function, which depends on the test function. And then once you execute the publish function, then Dagger automatically knows, oh, these things need the test, which itself needs the build. And that's the graph part you talked about.

SOLOMON HYKES: Exactly. It's really similar to targets in a Makefile, or rules in Bazel. So build systems have this similar DAG model with rules or targets.

ABDEL SGHIOUAR: Exactly.

SOLOMON HYKES: But it's that, but applied to the entire pipeline.

ABDEL SGHIOUAR: Yes. And I think the powerful part is the fact that you don't have to do that intermediary, what am I going to do with my artifact when I move from one step to another, right?

SOLOMON HYKES: Yeah, yeah. You still can. I mean--

ABDEL SGHIOUAR: Of course, yeah.

SOLOMON HYKES: Yeah, there's an adoption part, which is, I think-- it's not like we had this novel idea and no one had thought of this before. It's hard to design and implement correctly, but we're hardly the first to ship a good implementation of this kind of a DAG model.

What's really hard is making it practical for people to adopt and practical to adopt in a ubiquitous way so it adapts to enough real software projects out there that you can reach critical mass. And in this case, the whole point is critical mass. If you don't have critical mass, then you're not useful. You can't justify the overhead of the complexity. So it's-- for us, making it very easy to incrementally adopt is very important.

ABDEL SGHIOUAR: Yeah. And so my next question was going to be-- so you've written all your pipelines and all this stuff, and now you're at the stage where you are going to use Dagger to run your pipeline locally. Your functions could be camel case, which is, each word has a capital letter, but then the CLI will execute them as kebab case, which I find super funny.

SOLOMON HYKES: I see you've actually played with Dagger.

ABDEL SGHIOUAR: Yes, I did. So kebab case basically means, if your function is called Publish with a capital P, you execute it as Dagger run publish with a small p. But if it's a more complex name, it will just lowercase all the words and then daisy-chain them with dashes-- kebab case versus camel case, essentially. This is quite interesting, because in other-- I mean, if you take Java as an example, Java will have a flag where you can pass the class name as a full camel-case class name. So what was the logic there? Why the kebab case versus the camel case?

SOLOMON HYKES: Yeah. Well, first of all, this is a hotly debated topic. It's a great question because it points out a really important dimension of Dagger, which is the importance of a cross-language ecosystem.

So pipelines-- everyone needs a pipeline. And everyone wants to develop their part of their pipeline that's relevant to them in a language that's familiar to them. That means, in order to actually solve the problem for everyone, Dagger has to work great across many languages. And it has to allow composition and linking of the different steps of the pipeline across languages.

And so that's a really hard problem to solve. And at some point, you're going to have a collision between the conventions and expectations of each language silo. And so how to capitalize things is one area where you have this sort of culture shock. And the place where the culture shock takes place is Dagger.

So just to give a little context, so Dagger's this engine. It has an API. And then you call this API to describe what to do, declaratively as a DAG. And you do that by chaining functions. So each box-- if a graph is boxes and arrows, the boxes are function calls. You call these functions through this API.

But what functions do you call? What's available to you? So the Dagger API comes with batteries included. So there is a set of core types and core functions attached to those types for fundamental operations, and then you build up from there. So the fundamental operations are, pull a container image, run a container, move files around, git clone, things like that. Also, networking. You can bind a port--

ABDEL SGHIOUAR: Expose port.

SOLOMON HYKES: --from one container to another. You can set up tunnels. And so from those building blocks, you can build almost any pipeline. And we're still expanding that API, but a lot of it's there. Oh, there's secrets. Also, there's a core secret type, so you can safely pass secrets around, et cetera.
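Solomon mentions later that all of this is exposed through a GraphQL layer. As a rough illustration-- the field names below follow Dagger's documented core types, but treat the exact shape as indicative rather than authoritative-- chaining those building blocks looks something like this:

```graphql
# Pull a container image, run a command in it, and read its output.
query {
  container {
    from(address: "alpine:latest") {
      withExec(args: ["echo", "hello from a pipeline"]) {
        stdout
      }
    }
  }
}
```

Each nested field returns a new immutable object, which is what lets the engine treat the whole query as a DAG and cache it node by node.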

So with that, you can write a client in Go or a client in Python, and then it calls that API, and it does something cool. But very quickly, what happens is you want to extend that API. You don't want to just, you, from your little corner, run a cool pipeline. You want to encapsulate that pipeline logic you just wrote and abstract it in a new type that you define, like a custom artifact that represents your Python project with a particular way of building it or your deployment platform with a particular set of tokens you have to pass, whatever.

And so the extension is key. And so we created this system where you can package your Dagger functions written in your language into what we call a module. And that module is basically an extension for the Dagger API. So the Dagger engine loads it, and then it does magic. And then if a client calls that engine and queries the API, now your types and your functions are also available. And then you can do that recursively. So you can call-- you can load a module, and that module itself depends on other modules.

And those dependencies happen cross-language. So that's the key. So I can write a module in Python using the Python SDK, and then I can use someone else's module written in Go, et cetera, et cetera. And so where everyone meets is this GraphQL layer.

And so all the way back to your question, Python functions-- in each of these silos, we want Python developers to feel at home. So when you write Dagger functions in Python, it should feel like real Python. Same in Go. Same in each SDK. And then, as an extra feature-- I realize I'm talking a lot, maybe too much context.

ABDEL SGHIOUAR: No, I love it.

SOLOMON HYKES: But I guess this will act as a reference. Then you can also call any of those functions from the CLI. So basically, you can Dagger call, and you can compose a pipeline dynamically from the command line by saying, call this function from this module and then chain to this, to this, to this.

And in the CLI, there's also an expectation of capitalization, like you said, kebab case. It's weird for a shell scripter, a DevOps person writing YAML and shell scripts-- you type shell commands all day, and it's very rare that they're capitalized. It's just weird. In Java, things are capitalized, and it feels weird to the shell scripter. We don't want it to feel weird. We want it to feel familiar.

So what we do is we translate the capitalization. So you write your function name in Python with the Python convention, and then if someone calls that function from the command line, we'll expose the same function name, but in a shell-friendly capitalization. If a Go developer calls that function in their module, we're going to generate Go bindings for them that have the capitalization of--

ABDEL SGHIOUAR: Of a Go function.

SOLOMON HYKES: --Go, et cetera. So occasionally, people get confused by that. Why-- or we mess it up. There's lots of edge cases, like, where do you put the dashes? Anyway, what a rabbit hole, what a fun rabbit hole.
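The CLI direction of the translation Solomon describes-- one canonical function name, rendered per language-- is easy to sketch. A minimal Python version (my own illustration, not Dagger's implementation) that turns a camelCase or PascalCase function name into its kebab-case CLI form:

```python
import re


def to_kebab(name: str) -> str:
    """Render a camelCase/PascalCase function name as kebab-case for a CLI."""
    # Insert a dash before every uppercase letter that isn't at the start,
    # then lowercase everything: Publish -> publish, BuildAndPush -> build-and-push.
    return re.sub(r"(?<!^)(?=[A-Z])", "-", name).lower()


assert to_kebab("Publish") == "publish"
assert to_kebab("withExec") == "with-exec"
assert to_kebab("BuildAndPush") == "build-and-push"
```

The edge cases Solomon alludes to show up immediately: an acronym like HTTPServer naively becomes h-t-t-p-server, so a real implementation has to decide where the dashes go.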

ABDEL SGHIOUAR: It is.

SOLOMON HYKES: But it's worth it. That part works really well. It's what makes Dagger work. Cross-language composition is really hard. There's gRPC. There's just REST, I guess. But, yeah, I'm really happy with how it turned out.

ABDEL SGHIOUAR: Yeah, I'm coming from the shell world. So for me, it felt familiar. That's why. So you talked, also, about composition and the fact that you can reuse other people's functions or modules. And those are published into the Daggerverse?

SOLOMON HYKES: Yes. Well, yes, they're searchable in the Daggerverse.

ABDEL SGHIOUAR: Oh, so they can be hosted somewhere else?

SOLOMON HYKES: Yeah, so we copied the Go module system exactly.

ABDEL SGHIOUAR: Oh, interesting.

SOLOMON HYKES: So these modules are just code, first of all. They're not binaries. You don't distribute binary artifacts. It's a source code ecosystem. That's one thing I was always frustrated with in Docker: we had an ecosystem of binaries, these images. Obviously very powerful, but you also want to know, what's behind these images? What's inside?

ABDEL SGHIOUAR: Yeah, how was it built?

SOLOMON HYKES: And we never got around to standardizing that because everyone adopted Docker, and then different platforms went their different ways and created their versions of this. So Red Hat had their own source code ecosystem, Cloud Foundry, a gazillion different platforms. And that was a source of frustration for me because there's some features, some things you can do only if you have visibility into the source code. You can just do more stuff with the platform.

So this is a pure source code ecosystem. Also, it's easier to trust and verify. You can go look at the code. Do I trust this or not? We don't host the modules. So just like Go's model, which I love, it's code, so we already have a very efficient system for distributing code. It's called Git. So actually, yeah, any Git server can host a Dagger module. And then you don't need any third-party service to access it. You just point your Dagger CLI or SDK at that repo, and it will load it and just do its thing.

But if you want to find modules or get information about modules, like, Is this trusted? Is this popular? Where are the modules? We have a search engine called Daggerverse that basically indexes all these modules. And over time, we're going to give you more useful information. And again, Google has their Package Index. I forget what it's called.

ABDEL SGHIOUAR: Yeah, I forgot the name. Yeah, I know what you're talking about.

SOLOMON HYKES: Yeah, same thing, same thing.

ABDEL SGHIOUAR: Nice, nice. Awesome. So I'm going to take the conversation in a slightly different direction, but it's kind of still related. Infrastructure-as-code-- where do you see Dagger fitting, if it fits at all, into that world?

SOLOMON HYKES: I think it's very complementary. I mean, it's a very common integration. And superficially, you could think, oh, there's overlap. This is code. That is code. Surely, Dagger will try to do infrastructure and Terraform/Pulumi will try to do deployment pipelines, build, et cetera. And superficially, you can always find a maximalist for every tool that will try just to do everything with that tool. But in the case of Dagger, we're very clear that Dagger is not an infrastructure-as-code platform. It's a pipeline platform.

And one very common task that is automated in the pipeline is infrastructure provisioning. So it's very common to have a Dagger pipeline that includes a step that calls Terraform or Pulumi or something like that. And because it's code on both sides, actually, it's much nicer to integrate. There's a lot more you can do.

ABDEL SGHIOUAR: Nice.

SOLOMON HYKES: So mostly it's that. They're very complementary tools, because fundamentally, the model is different. Dagger is about one-way, stateless, cacheable pipelines, just a flow of artifacts. Infrastructure management is fundamentally about two-way sync of state. You have the state of your cloud resources, and then you have the view of that state, and then you try to reconcile. That's really-- kind of like the [? Ark ?] thing. I would hate to do that because it's really hard.

But, yeah, it's a great business if you can do it because you have all this lock-in. Who's going to go and mess with the AWS provider? It's like a driver. And so we gladly integrate that and just make sure all the hard work by Pulumi and Terraform is available as a first-class citizen in your pipeline, because your pipeline involves provisioning infrastructure and also 10,000 other things that you need to glue together. So that's our job.

ABDEL SGHIOUAR: Nice, nice. Awesome. This is going to be my last question. I took lots of time from you.

SOLOMON HYKES: Oh, this is-- I love talking about my product, so don't worry about that.

ABDEL SGHIOUAR: Awesome, awesome. So what are your thoughts on the open-source business in light of everything that's been happening recently, without mentioning any names?

SOLOMON HYKES: Oh, like licensing changes, things like that?

ABDEL SGHIOUAR: Yeah, just to name a few-- licensing changes, stopping publishing built artifacts for free, things of that nature.

SOLOMON HYKES: Yeah, I think-- I don't know. I feel like, at some point, people started thinking open source was, like, a business category. Like, what business are you in? Oh, I'm in the business of open source. But that's not a business category. It's an implementation detail of a product and a business.

So I think, if you group everything in between open source and not open source, then you'll get confused, because there's a lot of very different products and businesses that happen to create open-source code or be involved in open source. So we're one model.

On the other end of the spectrum, you have a business like PostHog, just to pick one that I'm familiar with. It's like the open-source Amplitude-- product analytics, but open source. And it's a very different situation, because in their case, open source is a business argument. Hey, this is open source, so that means you can go and run it yourself if you want. You're not locked into us. It's a pragmatic business decision.

And in our case, it's different because Dagger not open source does not make sense, because we need this developer ecosystem. And so you need to give developers what they need to be able to customize and build their own software on top of the platform. So in our case, it's not about convincing a buyer that they won't be locked in. I mean, it helps, but it's just a different-- it's part of a community-led growth strategy.

So I don't know. I guess licensing doesn't matter, in my opinion, as much as people think. If you have a business that requires changing the license, that means you probably screwed up something else. I don't know. Here's our model. Our model is the Red Hat model, basically-- extreme openness on IP. So we have a regular open-source license, with no plans of changing it, and strict control on the trademark. So you can take the Dagger engine and modify it, redistribute it, do anything you want with it, because it is true open source.

But if you want to call it Dagger, then there are rules. You can't take Dagger, patch it, rebuild it, change it in any way, and then still call it Dagger, because that's our trademark. So that's really not about open source. Really, it's about any software product, because otherwise, let's say someone ships a broken, modified version, or they ship a feature that they thought was great but we don't like. Then that's confusing to users. What is Dagger? What is not Dagger?

So that's really important to us. And also, it was very important to Red Hat. I think it's a great model. That way, you stay in control of what's your product, what's your product experience, and then if the community doesn't like what you're doing, they can always fork it and create something else with a different name, and everyone's happy.

ABDEL SGHIOUAR: Awesome, awesome. And I think that we couldn't have ended it better. So we'll leave it at that. Thank you very much for your time, Solomon. This was a pleasure talking to you.

SOLOMON HYKES: Oh, my pleasure. Yeah, thanks for having me.

ABDEL SGHIOUAR: Awesome.

[MUSIC PLAYING]

KASLIN FIELDS: Thank you very much, Abdel, for that interview. It's really exciting, of course, to get to speak to someone who has had such an impact in this area of the industry. I've seen Solomon talk about Dagger at KubeCon. He gave a keynote where he talked about it a little bit, but I hadn't looked into it myself, so I was excited to learn a little bit about what it is. And so it's a CI/CD solution.

And we'll get more into that in a second. But one thing I wanted to say about my perspective on this is I feel like CI/CD is always something that I try to avoid in our world. [LAUGHS] It's very closely tied to the concept of containers because of the whole concept of, it works on my machine being kind of resolved by the way that containers encapsulate the processes. And so it's very close to concepts of CI/CD, and it's used in a lot of CI/CD solutions.

And so the two are close enough to each other that there's some assumption, I think, that when you understand something about containers and Kubernetes, you also know a bit about CI/CD. And I have aggressively avoided that, personally, because I think-- [LAUGHS] I think because I come from more of the sysadmin side of things. And I really love that side of things.

And anything that brings me closer to the developer side of things, I need to avoid. I love getting the things to run on production, but I don't like the process of getting to production from the developer space. But with Solomon describing it, I was like, hmm, maybe I should look into this more.

ABDEL SGHIOUAR: Yeah, I think that the biggest drawback or challenge or whatever you want to call it for CI/CD for most people is the fact that-- when people talk about it, it usually carries this connotation of a very slow feedback loop, in the sense that you have to submit your code and then wait for the pipeline to run to finally realize, oh, there is an error, right?

And most CI/CD tools-- most open-source CI/CD tools, at least-- don't have a way to run your pipeline locally so that you get a fast feedback loop. I mean, there are solutions, for example, Skaffold, which Google released a while ago, that can be used both to run pipelines locally and to run them on CI/CD systems. But yeah, most of the time, people have to live with this slow feedback loop, which I think is a huge drawback if you're a developer, basically, because as a developer, you want to see if your code works, and you won't see that very fast.

KASLIN FIELDS: It is one of those things where, when you talk to developers, they're like, ugh.

ABDEL SGHIOUAR: Yeah.

KASLIN FIELDS: CI/CD.

ABDEL SGHIOUAR: And that's what Dagger is trying to solve, right? That's what-- I mean, besides trying to solve it through code instead of configuration files, it's also trying to be that thing you can run locally and you can run in the cloud and have the same expected behavior-- I think this is the easiest way to phrase it-- basically, consistency in terms of how it works.

KASLIN FIELDS: I think there's a lot of philosophy, also, in the world of CI/CD, the whole concept of-- I like, in the testing and observability world, running things in production and the whole concept of, you never really know if something is going to run in production when you're running it in development environment, because you can try to make it as close as you can, but you can never be quite sure. So that CI/CD step is really important. And when done well, it can prevent a lot of challenges. But it's a very difficult thing to do well. So interesting.

ABDEL SGHIOUAR: Yeah. And as Solomon said in the episode, it's basically-- the inspiration or the reason why Dagger exists is coming from this realization that most people start with something simple and then start building things into their own CI/CD tools, which then turns into a Frankenstein-type configuration file plus bash scripts plus hacks to make it do whatever you need it to do. So that was the starting point of why Dagger existed in the first place.

KASLIN FIELDS: Which is very understandable, because getting from-- solving the problem of, it worked on my machine and it's not working over here, is a very difficult one. And that's kind of the problem that CI/CD is generally trying to solve, is you've got it running in one environment, and you need it to run in a different environment. How do you make sure, along the way, that by the time you get to that other environment, it's going to work? So I liked that he called out the origin of the name of directed acyclic graphs.

ABDEL SGHIOUAR: Yeah, DAG. Yeah. Besides the fact that Solomon seems to have some very interesting things that start with D-- so Docker, Dagger.

KASLIN FIELDS: Yeah, that's true.

ABDEL SGHIOUAR: Actually, there is a video on YouTube of him answering specifically this question of why--

KASLIN FIELDS: Really?

ABDEL SGHIOUAR: Yeah, yeah. There is actually a video. I don't know which-- I don't remember which conference, but we will make sure to have it in the show notes. That's essentially because, preparing for the episode, I was looking up his stuff on YouTube to see kind of what he talks about and how to not ask him the same questions that have been asked before. And one of the questions specifically was like, why do you have an affinity for things that start with D? And-- yeah.

KASLIN FIELDS: I had not thought about that as a trend, but--

ABDEL SGHIOUAR: Well, yeah. I should have asked him the question, like, what's after Dagger?

KASLIN FIELDS: Yeah. Is it something else that starts with a D? [LAUGHS]

ABDEL SGHIOUAR: Yeah, exactly. I think one funny, interesting thing that-- I mean, totally off-topic, really, but it's just interesting to me-- is right before jumping on the episode to record, I was on Wikipedia, because, of course, he has a Wikipedia page, and I realized that he actually speaks French. So, yeah, when we started, we started talking in French. And he was like, how do you know French? And I was like-- because he lived in France for a while, so it was quite interesting for me. It was just, like, completely off topic.

KASLIN FIELDS: Yeah, he talked about that.

ABDEL SGHIOUAR: Yeah, yeah, yeah. It was--

KASLIN FIELDS: He talked about that at KubeCon in Paris in the keynote.

ABDEL SGHIOUAR: Yes, yes. It was just very interesting, because he has such a unique name. Solomon Hykes is not really-- I mean, I think the last name is probably common, but Solomon is not a very common name.

KASLIN FIELDS: That's true.

ABDEL SGHIOUAR: So that's why I was like, why do you speak French? And then, yeah, he gave me a little bit of the backstory.

KASLIN FIELDS: But getting back to Dagger and the concept of directed acyclic graphs-- so what he was saying was that pipelines are always graphs, if you get right down to it. It's about how you connect one system to another, essentially, and the steps along the way. So it can always be represented as a graph. So he's starting with that as the baseline, which I found very interesting, and I liked the way that he compared it to Docker.

Docker was about standardizing at the right level. He talked about how there were also efforts going on around the same time trying to standardize the way that applications were run, but at a lower level, at a hardware level, which I can't imagine ever working. [LAUGHS] That would have been very difficult to implement on a global scale. But what I think was really successful about Docker was that it did standardize at the right level. So he's trying to do that again here, kind of, starting with the directed acyclic graph.

ABDEL SGHIOUAR: Yeah. I mean, if there is anything that people should take from this episode, it's the fact that one of the core features that I really like about Dagger-- because I tested it, and it's obvious in the episode that I'm a big fan-- is that as you are moving your artifacts from one step to another in your CI/CD pipeline, you don't have to save that artifact somewhere. As you said, in the graph representation of a CI/CD pipeline, which is steps connected by lines, as you are moving from one step to another, the artifact is taken care of by Dagger.

So you don't really-- because in a typical CI/CD pipeline, what you will have to do is build the Docker image, for example, and then push it somewhere so that the next step can pull it and do things with it. Either you push it to a registry or you save it to some local directory. But with Dagger, you don't have to do that. The artifact itself will be passed along to the next step in that graph. So it's technically less work or less code that you will have to care about.

KASLIN FIELDS: And something else you did mention there which I also thought was really interesting was his frustration with Docker images being a binary file and that being the unit that you kind of share around because you lose so much in the conversion from the source code to the binary file. And he talked a bit about how, in Dagger, they tried to address that. And I didn't quite understand exactly what that was. So it sounded like some kind of marketplace of Dagger pipelines that are shared as source code. Is that what it was?

ABDEL SGHIOUAR: Yeah, so the Daggerverse, they call it. It's a marketplace of, basically, functions, because in Dagger world, steps in your pipelines are functions.

KASLIN FIELDS: Oh, OK.

ABDEL SGHIOUAR: And so basically, you don't have to reinvent the world. You can just use somebody's function. So they have this place where they can share the code. But what's interesting about that specific thing is they don't host the code. They basically index it.

So you have a place to search for stuff, but the code is not in the Daggerverse. It's somewhere else. They just index-- they merely index the modules. So you can just find them in a global search database kind of thing. And then you can just reuse them, or import them, or import the code and modify it, or do whatever you want with it. So yeah, that's essentially the Daggerverse.

KASLIN FIELDS: I'm seeing it in my head as more like a Stack Overflow of--

ABDEL SGHIOUAR: Pretty much, I guess, minus--

KASLIN FIELDS: Components.

ABDEL SGHIOUAR: Yeah, minus the opinions, I guess.

KASLIN FIELDS: Yeah. [LAUGHS] Good point. That's probably the main part of Stack Overflow. So it's basically kind of a marketplace of source code of functions, and you compose a single Dagger pipeline of multiple functions. So it's, like, the components that you can use to build the right pipeline for you, which I feel like makes sense with the issue he was talking about of, you wouldn't want that to be a binary, something like that. [LAUGHS]

ABDEL SGHIOUAR: Yes, because that was his frustration, as you said, with Docker, the fact that Docker Hub is essentially binaries with not necessarily the code itself. And I think it boils down to the matter of trust. Do you actually trust a built binary that you don't have access to?

And to remedy this in Daggerverse, they basically index the code. So the code is not on Daggerverse itself. So, yeah, I think it's quite interesting as an approach. And it comes from learning from stuff that they have done at Docker that they wanted to do differently.

KASLIN FIELDS: I felt very validated when he said that, too, because [LAUGHS] I have also been frustrated about binaries. I'll get a binary of something, and I'm like, how do I know what's in this thing? [LAUGHS]

ABDEL SGHIOUAR: Exactly. And you shouldn't actually trust stuff from Docker Hub.

KASLIN FIELDS: I was like, guess this is just the way it's done, and I should just be happy with it. But if Solomon Hykes isn't happy about it, I don't have to be either.

ABDEL SGHIOUAR: Correct. Yeah, usually, the way I describe it when I do talks about security, generally speaking, is do not download images from dockerhub.io/iamahacker/hackmeplease kind of image. Just don't do that, [LAUGHS] because you never know.

KASLIN FIELDS: I wonder if something like that exists. It would be pretty funny. [LAUGHS]

ABDEL SGHIOUAR: Probably not the same way I'm describing it, but I am quite confident it does.

KASLIN FIELDS: Well, yes, definitely does in that respect. But something with that naming would be pretty funny from--

ABDEL SGHIOUAR: It would be very funny.

KASLIN FIELDS: --a security professional who's trying to run some capture the flag or something. Would love to see that. Would be pretty funny.

ABDEL SGHIOUAR: Yes. If you listen to this and you find something on Docker Hub that has this, please send this to us. It would be interesting to see.

KASLIN FIELDS: Yes, please do. And one last thing from the interview that I wanted to talk about was his discussion about how IOC relates to Dagger. I loved how he framed it. It's very understandable that people would compare the two and wonder if Dagger was going to be an IOC solution of some sort, an infrastructure-- or IaC. I put IOC in my notes.

ABDEL SGHIOUAR: Infrastructure-as-code.

KASLIN FIELDS: Infrastructure-as-code, not infrastructure of code. But-- [LAUGHS]

ABDEL SGHIOUAR: Potato, "po-tah-toes."

KASLIN FIELDS: Yeah. [LAUGHS] It makes sense that folks would wonder if they're going to be in that space, because infrastructure management and pipeline management can be very conceptually similar things. But I loved how he framed it: infrastructure management is two-way. It's not something that you do once. You're not just moving something from local to production. You're setting something up, and then you are managing it over some amount of time.

So you're going to have to be able to manage changes to it as well, which is very different from what he's trying to do with Dagger, which is a one-way street. It is a directed acyclic graph where you go from the beginning to the end, and then you don't do things again in the middle. So I loved that explanation of, infrastructure-as-code integrates with Dagger, but Dagger has no interest in becoming an infrastructure-as-code solution itself.
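That "one-way street" idea can be sketched in a few lines of code. This is just an illustration of a pipeline modeled as a directed acyclic graph, not Dagger's actual API; the step names and the `run` helper are hypothetical, and Python's standard-library `graphlib` is used to compute the dependency order.

```python
from graphlib import TopologicalSorter

# Hypothetical CI pipeline: each step maps to the set of steps it depends on.
# The graph is acyclic, so execution flows one way, from start to finish.
pipeline = {
    "lint": set(),
    "build": {"lint"},
    "test": {"build"},
    "publish": {"test"},
}

def run(steps):
    """Execute every step exactly once, in dependency order."""
    order = list(TopologicalSorter(steps).static_order())
    for step in order:
        # A real pipeline would invoke the step's work here (build, test, ...).
        print(f"running {step}")
    return order

order = run(pipeline)
# Every step runs once; nothing loops back to an earlier node,
# unlike infrastructure management, which keeps reconciling state over time.
```

Because the example graph is a straight chain, there is only one valid order: lint, build, test, publish.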

ABDEL SGHIOUAR: Yes. And also, the fact that he described these things as complementary to each other. Because the very typical conversation you would have with people is: where does infrastructure-as-code start and stop, and where does your CI/CD pipeline start? Do you use infrastructure-as-code tools to do everything, including deploying the app itself? Or do you use other things? So having that clear distinction between the two is pretty good, but having them be complementary to each other is also interesting, in my opinion.

KASLIN FIELDS: I think that is one thing that has kind of put me off from CI/CD in the past, is that it is such a broad area. Where does it start and stop? [LAUGHS]

ABDEL SGHIOUAR: Yes, yes.

KASLIN FIELDS: And so I like this take on it, and I'm glad that we learned about it today. Thank you very much, Abdel, for conducting that interview. And thank you, Solomon, for being on.

ABDEL SGHIOUAR: Thank you. I hope you have enjoyed it.

KASLIN FIELDS: Thank you, everyone, for listening.

[MUSIC PLAYING]

That brings us to the end of another episode. If you enjoyed the show, please help us spread the word and tell a friend. If you have any feedback for us, you can find us on social media @KubernetesPod or reach us by email at <kubernetespodcast@google.com>.

You can also check out the website at kubernetespodcast.com, where you'll find transcripts, show notes, and links to subscribe. Please consider rating us in your podcast player so that we can help more people find and enjoy the show. Thanks for listening, and we'll see you next time.

[MUSIC PLAYING]