#127 October 27, 2020
David Pait was a touring musician in pop punk band Sparks The Rescue. Now, he’s an SRE working on Kubernetes at an ad-tech company. How did he get there? And if you’re looking to change careers, how might you? Craig and Adam dig in.
ADAM GLICK: Hi, and welcome to the Kubernetes Podcast from Google. I'm Adam Glick.
CRAIG BOX: And I'm Craig Box.
ADAM GLICK: So this past week, actually finishing up on Monday of this week, Steam was running their digital tabletop fest. And as many of you know, I enjoy both tabletop games and the digital versions thereof. So it was fun to watch some of those be played. I was wondering if you tuned into any of that, Craig?
CRAIG BOX: No. Is a digital tabletop like that old Microsoft Surface?
ADAM GLICK: [LAUGHS] You know, I've actually played with one of those. And no, it would be more like if you took all those tabletop board games that you have and removed all the pieces. And you just had a computer that would set them all up for you, which is super handy, nice. Of course, during that I also picked up a new game that I started playing called Similo. Which, if you've ever played Dixit or Mysterium-- basically it's a two-player cooperative game where you are both working like Guess Who to try and figure out what is the one card. And one person can give you clues, showing you cards, and telling you whether the card they're showing you is similar or dissimilar to the one that you're looking for. And it's a little bit of a logical deduction game. A little bit of an abstract thinking game. It's just a fun little thing to play, and we are trying that one out.
CRAIG BOX: I don't think I've played Dixit before, but I did play a lot of Guess Who in high school.
ADAM GLICK: Yeah, so think of a mix of the two.
CRAIG BOX: We had a Guess Who board in our homeroom. And we got to the point where, basically without any knowledge of computer science or anything, a few classmates and I had invented the concept of binary search. We came to the conclusion that we could ask someone, do you have any accessories? And then cut down exactly half of the board depending on whether or not they're wearing a hat, or a scarf, or anything. And very quickly get down to just one candidate. And it got so bad, obviously with nothing else to do in homeroom but play Guess Who, that we could play an entire game without needing the board. There's only about 30 people on the board. And you can remember all of the characteristics and what you need to do to get down to the end and play a game without actually looking at them. So put that in your digital tabletop and smoke it!
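To illustrate the halving strategy Craig describes, here is a minimal sketch that counts the worst-case number of yes/no questions needed when every question splits the remaining candidates in half. The function name is made up for this example, and the board size of 30 is the approximate figure mentioned above.

```python
import math


def questions_needed(candidates: int) -> int:
    """Worst-case yes/no questions when each question halves the board."""
    count = 0
    while candidates > 1:
        # In the worst case the larger half of the board is still standing.
        candidates = math.ceil(candidates / 2)
        count += 1
    return count


# Board size taken from the conversation ("about 30 people").
print(questions_needed(30))
```

With perfectly balanced questions, the count grows with the logarithm of the board size, which is why a memorized set of questions can finish a game so quickly.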
ADAM GLICK: [LAUGHS]
CRAIG BOX: Thank you again to everyone who has filled out our audience survey at kubernetespodcast.com/survey. We've had some great feedback. And if you haven't had a chance yet, there is still time. We're keeping the survey open till November the 10th, so please take a moment and fill it out. It should take no more than three minutes, all going well. Let us know what you think of the podcast, and how we can continue to make it the best show for the cloud native community.
ADAM GLICK: Let's get to the news.
CRAIG BOX: Last week was the Cloud Foundry Summit in Europe, which was marked by a complete pivot of the platform to Kubernetes. Earlier this year, the Cloud Foundry website introduced the product as "an open source cloud application platform". But if you look today, it says it's "the proven developer experience for Kubernetes". To that end, the cf-for-k8s runtime, which lets you install a cloud native packaged version of Cloud Foundry onto any Kubernetes cluster, has gone 1.0. The Helm-based KubeCF runtime released version 2.5, if you want an experience that's more like what you used to run on VMs. The foundation promises that these two packages will continue to converge. To learn more about Cloud Foundry, check out episode 105 with foundation lead Chip Childers.
ADAM GLICK: Microsoft has announced Akri, an open source project for the edge. Essentially it gives Kubernetes a way to identify and find IoT devices, which Microsoft is calling leaf devices. Akri's architecture is made up of four key Kubernetes components-- two custom resources, a device plugin implementation, and a custom controller.
The first custom resource is where you tell Akri the kind of leaf device you want to discover. Then Akri’s agent searches for the availability of your desired leaf. Once your device has been discovered, the Akri controller deploys a broker pod that knows how to connect to the leaf and utilize it. If you're curious to see it in action, there's a demo provided in the announcement.
CRAIG BOX: Contentful has open sourced their kube-secret-syncer, an operator that syncs Kubernetes secrets to the AWS Secrets Manager service. This isn't the first secret syncing service, but this one aims to fix the perceived challenges in security, flexibility, and caching in the existing solutions. One of the big changes Contentful claims is that they cache the secrets and only sync when there is an actual change, which can save money versus more frequent syncs. The project is available now, and the creators are interested in any feedback from the community.
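The "only sync when there is an actual change" idea can be sketched in a few lines. This is not Contentful's implementation; it is a minimal illustration that assumes a hypothetical `write` callable standing in for whatever performs the real (billable) sync, and it fingerprints each secret so unchanged values are skipped.

```python
import hashlib
import json


class ChangeOnlySyncer:
    """Skips the sync call when a secret's content has not changed."""

    def __init__(self, write):
        self._write = write  # hypothetical callable doing the real sync
        self._seen = {}      # secret name -> fingerprint of last synced value

    @staticmethod
    def _fingerprint(value: dict) -> str:
        # Sort keys so the same content always hashes to the same digest.
        encoded = json.dumps(value, sort_keys=True).encode("utf-8")
        return hashlib.sha256(encoded).hexdigest()

    def sync(self, name: str, value: dict) -> bool:
        """Return True if a write happened, False if it was skipped."""
        fingerprint = self._fingerprint(value)
        if self._seen.get(name) == fingerprint:
            return False
        self._write(name, value)
        self._seen[name] = fingerprint
        return True


writes = []
syncer = ChangeOnlySyncer(lambda name, value: writes.append(name))
syncer.sync("db", {"password": "a"})  # first sight of the secret: written
syncer.sync("db", {"password": "a"})  # unchanged: skipped
syncer.sync("db", {"password": "b"})  # changed: written again
```

Storing only a digest rather than the secret itself keeps the cache from becoming a second copy of the sensitive data.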
ADAM GLICK: A distributed trace takes space, and they accumulate apace. Race to place them with Tempo, a new open source tracing back end from Grafana Labs. It is integrated with Loki, Grafana, and Prometheus, while only requiring object storage. Tempo is Jaeger-, Zipkin-, OpenCensus-, and OpenTelemetry-compatible. It ingests batches in any of the aforementioned formats, buffers them, and then writes them to Google Cloud Storage, S3, or local disk, which the creators feel makes it robust and inexpensive.
Grafana also announced Loki 2.0, with improvements to the query language, and transformations now possible across all logged content types.
CRAIG BOX: Now you have somewhere to store all those traces, you can update their version, as the OpenTelemetry project has announced a release candidate of the tracing spec. The project now aims to launch matching RC versions of its APIs, SDKs, collector, and auto-instrumentation components, as well as producing a release candidate of the metrics specification.
ADAM GLICK: In case you're observing things running on AWS, they this week announced the AWS Distro for OpenTelemetry, a secure and supported distribution of the project. It comes with the connectors required to send telemetry data to AWS CloudWatch metrics, traces, and logs back ends, as well as the project's other supported back ends.
CRAIG BOX: Amazon has also announced that their ALB Ingress Controller is now the AWS Load Balancer Controller. It adds extra features, such as support for using Network Load Balancers with Kubernetes services on Fargate, sharing ALBs with multiple Ingress rules, and support for fully private clusters.
ADAM GLICK: The Alibaba and Ant groups have released Nydus, an image service that is an extension of the Dragonfly peer-to-peer container distribution project. Nydus improves on traditional container images, minimizing download times for large containers and providing file integrity checks across the containers' lifecycle. At its heart, Nydus is a new container format that uses a user space file system. It addresses many shortcomings of the current OCI V1 spec. And the team says that they were inspired to open source Nydus after seeing how it implemented many features that were being proposed in the upcoming OCI V2.
CRAIG BOX: Robin.io has announced a new free-for-life version of their storage technology called Robin Express. Robin describes itself as a CSI-compliant block storage solution with bare metal performance that seamlessly integrates with Kubernetes administrative tooling. The Express edition is different from the paid Enterprise edition in that it doesn't offer 24/7 support and is limited to five nodes and five terabytes of storage, but it is available at no cost. We assume that it's free for the product's life and not yours.
ADAM GLICK: Kubernetes has got so enterprise that you now can buy it from a telco. Verizon Business has launched VNS Application Edge, a hosted Kubernetes add-on to their virtual network services. The service is a white label from Rafay, and promises to bring businesses closer to the edge and make them better able to take advantage of Verizon's 5G network.
CRAIG BOX: And that's the news.
CRAIG BOX: David Pait is a site reliability engineer at ad tech company Netsertive. He is a contributor to the Helm project, and currently resides in the Raleigh-Durham area of North Carolina. Welcome to the show, David.
DAVID PAIT: Thanks for having me.
ADAM GLICK: A number of the people we've had on the show have had interesting backgrounds, and you are no exception. You used to play bass in a band called Sparks the Rescue, which may be known to fans of pop punk who are listening. That's quite a start to a career. How did you get into it?
DAVID PAIT: Yeah, it's a pretty interesting story. It actually happened by chance, just kind of fell into it. I had some friends in high school who used to be into the pop punk scene and used to take me to shows all the time. And when bands would come into town, they'd like a place to crash. It used to happen to be my house. So one of the bands that did was the band I ended up going on the road with. I was just out of high school, went to college for a year or two. Didn't know what I was doing, basically majored in partying. It wasn't the greatest fit, so I was back home and trying to figure out what to do with my life.
They were about to go on the Vans Warped Tour. This was 2010. They needed somebody to do merchandise for them, and I happened to say, hey, I got nothing going on. I'd love to come with you. They thought I was cool enough to come with them. I ended up hitting the road. And that was my first tour.
CRAIG BOX: I imagine that's very important these days. I think bands make most of their money on merchandising.
DAVID PAIT: Yeah, quite a bit of it. For us, I think it was probably the bulk of our revenue, really. From that and the shows themselves, for the most part.
CRAIG BOX: So why would you downgrade to being a member of the band?
DAVID PAIT: I know. It paid a lot less, surprisingly.
ADAM GLICK: Hopefully it came with other benefits. What made you decide to move roles within the organization of the band?
DAVID PAIT: That also happened by chance, surprisingly enough. They figured out that I knew how to play guitar. So they set me up playing side stage. So if you guys ever go to concerts and you see somebody onstage and you have no idea who it is, and they're back there playing guitar, or piano, or anything else, and they're not really a part of the band, that's basically what I was doing for them. When they would go into a chorus, or try to do lead guitar or something, it would be me who'd be playing extra rhythm to beef them up and make them sound a little bit better.
I knew all their songs at this point, and their bassist decided that he wanted to go back home and just had enough of the road and the road life. I knew all their songs. Never picked up a bass in my life, but it was two less strings than guitar, so how hard could it be? So I picked it up and--
ADAM GLICK: [LAUGHS] Les Claypool right now is just crawling out of his skin.
CRAIG BOX: What was it like having the spotlight pointed at you for the first time?
DAVID PAIT: It was super interesting, and nerve-wracking, and probably everything you would think it would be. It was very different. Even from just going side stage or backstage to being front stage. People are really looking at you now. You can't mess this up. Those first couple of shows I basically just stood there like a statue.
CRAIG BOX: It matters what shirt you're wearing.
DAVID PAIT: Yeah.
ADAM GLICK: Had to work on the stage presence a little bit?
DAVID PAIT: I basically stood there like a statue for that first tour for the first four weeks, making sure I had every note correctly.
CRAIG BOX: Now, your agent provided us a bio, and it said that you've played shows in 46 of the 48 continental United States. Which two did you miss out?
DAVID PAIT: I believe it was North Dakota and Alabama. We've definitely driven through them, but I've never actually played a show there. For the rest of them, yeah, I've been to every single one.
ADAM GLICK: The world of music is a long ways from the world of IT and Kubernetes. But you're at least the third person that we've had on the show who had a musical background. Do you think there are any skills or talents that you learned from being a part of the music world that are transferable and help you in terms of tech?
DAVID PAIT: I wouldn't say skills. Not a whole lot of it is really transferable. I think probably the biggest insight I can give there is the part of the brain you're using. They're both very creative pursuits. As software engineers, you walk in and you're creating something from nothing. You're writing an app, or you're inventing some API that didn't exist before you literally sat down, and typed it out, and wrote it. It's the same thing with music. You've got something in your head that you like, and you need to flesh it out. You tap into that same creative part of the mind. Some people are just more involved in tech and like it, and shift that way. But I think any kind of creative pursuit is what musicians tend to gravitate towards.
CRAIG BOX: Yeah, I think you can look at programming as being both an art and a science, depending on which way you look at it. There are solutions that are true or false, but then there are also multiple different ways to get to them.
DAVID PAIT: For sure.
CRAIG BOX: So you're on the road with Sparks the Rescue. And obviously, that's not your career at the moment. What changed?
DAVID PAIT: Around 2012, a couple of the members decided that they'd had enough. Looking at the band, we weren't making the milestones we were really trying to hit. We were dropped by our label at one point, dropped by our management a month later. Then bookings dried up. So it was really tapering off for us. Two of them decided they were going to leave and try different endeavors. With that, the last two members live in Maine, and I'm from North Carolina. So it didn't make much sense for me to stick around. So I decided to call it quits as well.
I've always been interested in computers. I don't think it was until the last year I was really in the band that I realized that I could do it for a living. It never clicked in my mind before. Even when I went to college before I started being in the band, I was going for a business degree. I didn't even put two and two together quite yet. Throughout that year I figured out that I like doing tech stuff. I like doing IT. Being basically the tech guy for anybody on the road who's having issues with their laptops, as well as running front of house, which is mostly technology-based stuff now anyways, made it a natural fit for me to transition from that.
CRAIG BOX: So you went back to school?
DAVID PAIT: Yeah, I went back to community college, actually, and got an Associates in Applied Science and Information Technology. From there, I looked around for some jobs during my last year. Found an internship at my current company.
ADAM GLICK: What was the first job that you stepped into?
DAVID PAIT: That last year with my degree, I started looking around for jobs and found an internship. I applied to a bunch of places that were hiring interns and actually got a callback from the company I'm currently at. I ended up getting the job, just doing helpdesk intern stuff. My first task when I got there, we had actually switched antivirus providers, so it was literally going around to every single Mac in the building and uninstalling and installing antivirus.
The Windows side was a little bit easier, because we could do things through group policy. But the Macs, we had no management whatsoever. Through that, I took it upon myself to implement something, anything that could stop me from having to walk to every single laptop in the building and install that kind of stuff.
CRAIG BOX: Laziness is the true sign of the Site Reliability Engineer.
DAVID PAIT: This is very true.
ADAM GLICK: So it sounds like early on you were taking some interesting tech, and then looked at the fact that automation helps you solve things at scale. And looking at how you automated. How did you learn those pieces?
DAVID PAIT: It was really just trying to look at what was on the market. So on the corporate side of things, again being tasked with walking around and installing antivirus, there's gotta be some better way to do this. Looking at projects and open source software, and even paid solutions for how to manage software on Macs. Ended up finding a program called Munki, which is actually open sourced by Disney. And it's how they manage all of their Macs. So went ahead and set all that up for our internal company.
The final piece that was still a pain, I had to actually go around and install Munki itself on every single laptop. But at least after that I could manage everything remotely and push software out, and also have our own internal app store so that people can be more self-serving. So they don't have to put in tickets to get Adobe Acrobat installed. They can get it themselves if they need it.
CRAIG BOX: So it sounds like you've automated yourself out of your first job. How did you move onto the next step?
DAVID PAIT: Luckily, my manager was also the manager for the DevOps team. So we would be in daily stand-ups and go around the room. And I would be talking about how I installed antivirus software. And the next guy would talk about how he was moving stuff from Java 8 to Java 11. Most of the terms that were coming out of his mouth were flying over my head. I had no idea what was going on.
But luckily being around that and having those kind of mentors, I slowly started to pick up things that they had been talking about, and just got more interested in what it was. When I joined the company, I didn't even have any idea that DevOps was a thing or that you could even have a job in software, but on the operations side of things. So just being there with a company that actually had that was super beneficial and really opened my eyes up to the other side of software engineering.
CRAIG BOX: Give us an overview of the software that Netsertive builds and runs.
DAVID PAIT: Our company basically takes ads on behalf of companies. So we work a lot with considered purchases, which is probably the best way to put it. So think of your cars, your furniture, or anything that you're going to research online, and then really go into a store and buy. That could even be medical services. You need to get liposuction done for some reason.
CRAIG BOX: No judgment.
DAVID PAIT: Things that you're going to research.
ADAM GLICK: Generally high ticket items, it sounds like.
DAVID PAIT: Very high ticket. Again, stuff you're going to buy in the store at the end of the day, and not purchase online. We work with those companies to localize all of their ads and run individual campaigns for each of their locations. So think of like an Ashley Furniture, or BMW and then all of their dealers. We'll take ads in, template ads, and then localize all of them for each of their corporate locations. And then push them out to third parties, being AdWords, or Facebook, or Bing.
And then on the inverse side of that, we have the config side. We also have the reporting side where we pull in the data from the previous day every morning, then put it in our reporting dashboard that customers can log into and see how their campaigns are performing.
CRAIG BOX: Now, given that you're talking to us on the Kubernetes Podcast, is it fair to assume that there is some Kubernetes in there?
DAVID PAIT: Sure is. About two years ago, right before I switched from IT to the DevOps world, they had just finished POCing Kubernetes. When I joined, in my first week all of us were tasked to bring up a cluster and actually get a new app into production within the first two or three weeks that I was there. So I was super fortunate to be joining a team that was just starting to do that. And I didn't get on too late, where they had already gone through the trials of figuring things out. I got to do that with them, which was super beneficial for me. I'm a very hands-on person so that's wonderful.
All of our stack, we have about 96, 97 microservices. And I think all but two or three are actually running in Kubernetes. And then all of our pullers right now-- again, I mention the data that we pull in is running on Cron in Kubernetes.
CRAIG BOX: Did you have to learn the art of DevOps at the same time as you were learning Kubernetes? Was that something you picked up by working with colleagues on your new team? Or was that something that you were able to study as you made that transition?
DAVID PAIT: It was mostly learned from having those mentors on the team, and figuring out what they did and how they did it. Working backwards, learning everything that was going on. And then all of a sudden realizing that they were implementing this DevOps methodology. And then going back and researching it and looking more at what the discipline defined.
ADAM GLICK: You mentioned that you're doing a bunch of DevOps work, although your current title is SRE. How do those two fit together? And is there an operations team as well, or is it all merged into one role and team within your organization?
DAVID PAIT: For us in our org, we used to have what we called our DevOps team or DevOps engineers. We actually picked up the Google book on SREs, and decided that this is probably a better way to view DevOps the methodology and implementation of it. Try to stop using DevOps as a job title, and really make it more of a methodology that you apply to a job.
We made that transition, I'd say, six months ago. It's relatively new for us, a new step. I'm trying to figure out how all of that fits together. For right now it's really just the job title that's the implementation of the methodology.
CRAIG BOX: One of the key things about the SRE methodology is the shared responsibility. And being able to say to a development team, we're not going to support this until it meets our standards. To do that you need a certain amount of buy-in from management. How did your team go about getting that buy-in as you moved to SRE?
DAVID PAIT: It was really just making the case for why we needed it, and how it could benefit them. I think another big thing coming from the music world too, is the soft skills that you get from having to deal with different people every day. And trying to figure all that out. That really helped transition a little bit better to know that if I'm talking to the VP of product, or our CEO himself on how exactly implementing these methodologies is going to benefit the company in a certain way that applies to them directly.
CRAIG BOX: What does a day in the life look like?
DAVID PAIT: My days are pretty sporadic. It could be anything at all. But for the most part it's usually coming in and looking at tickets that may have come in overnight or any kind of alarms that may have gone off overnight. We actually have an SRE team in Russia that handles the overnight stuff. And we do hand-offs in the morning to see if anything happened that I need to be aware of. I'm currently the first line of defense.
So if somebody finds an issue with our production systems, or a bug, or anything like that, I'll at least first go in and see if it's, A, an actual bug, or if it's just user error and they're just not using the system right, before I'll hand it off to our development team to take a closer look to figure out what's going on. Or if it's something that I might be able to do myself, and not have to bother a team to take them off of something, I'll do that as well.
ADAM GLICK: Send in the red shirts.
DAVID PAIT: Yeah. Exactly. The other half of it is project work. Whether that be moving things into Kubernetes, or right now we're currently looking at trying to move all of our pullers out of Cron, from being time-based to being more data-driven. So that you start it up in the morning and we don't have to wait four hours for our automation to kick in. Because we're waiting on these pullers to technically be done, when they could be finished in 30 minutes instead of four hours, because you're actually depending on something that's been finished for a while. And you're not just waiting on time for time's sake.
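The shift from time-based to data-driven scheduling that David describes can be sketched as a trigger that fires the downstream automation as soon as every puller reports completion, instead of waiting for a fixed clock time. This is an illustration only, not Netsertive's system: the class, the callback, and the puller names (borrowed from the ad platforms mentioned earlier) are assumptions for the example.

```python
class CompletionTrigger:
    """Runs a callback once every required puller has finished."""

    def __init__(self, required, on_ready):
        self._pending = set(required)  # pullers we are still waiting on
        self._on_ready = on_ready      # hypothetical downstream automation
        self._fired = False

    def mark_done(self, puller: str) -> None:
        self._pending.discard(puller)
        # Fire exactly once, at the moment the last dependency completes.
        if not self._pending and not self._fired:
            self._fired = True
            self._on_ready()


events = []
trigger = CompletionTrigger(
    ["adwords", "facebook", "bing"],
    lambda: events.append("run-automation"),
)
trigger.mark_done("adwords")
trigger.mark_done("facebook")
trigger.mark_done("bing")      # last puller done: automation starts now
trigger.mark_done("bing")      # duplicate report: ignored
```

The automation's start time is then bounded by the slowest puller rather than by a conservative schedule, which is the 30 minutes versus four hours difference in the conversation.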
ADAM GLICK: One of the components of SRE that's talked about in the book is that people spend a certain amount of their time doing coding on automation, and not just the running of the operations pieces of it. Is there a development component to the job that you do?
DAVID PAIT: For sure. It really just depends on the project and what we're working on, and where it fits in. Some of the stuff that we do, there may be some open source software that's already out there that I don't really need to dive into. But there are certainly parts that we have to end up coding ourselves. Really anything that's going to make my life easier is what I tend to gravitate towards. Most recently we're doing Kubernetes upgrades in Amazon with EKS. We're still using our own self-managed nodes.
They've come a long way having eksctl. That helps out a lot with doing those kind of upgrades. But for us, because we started right when they started their managed service offering, our setup is still quite different than how they're doing it today. I've actually written some automation in Go that'll do those upgrades for me. So I don't have to spend three or four hours clicking through the Amazon UI to set up new auto-scaling groups, and new security groups, and IAM roles, and EFS provisioning, and all that kind of mundane things.
ADAM GLICK: I understand that EKS wasn't the first Kubernetes platform that you used. But when you adopted Kubernetes, what platform was it originally running on?
DAVID PAIT: We started doing this POC and migrating to Kubernetes, say, 2018. During that time, Amazon didn't have a managed offering whatsoever. Google, you guys were the only ones who had an offering at the time. And we actually debated whether to hook up our complete AWS stack with Google just to get the Kubernetes option. We ended up deciding to go with Rancher, which was great at the time.
You didn't have to know entirely too much about the system, but you still had to manage the control plane yourself, which, again, was super beneficial for me and I think the rest of the team, to understand how all those pieces fit together, and if something goes wrong, you can still go in and figure out what exactly is happening. We did a lot of chaos engineering at the first outset of all of that. To have somebody go in and run a batch script that just overloads one of the control nodes, to see if we could figure out what entirely is going on and where the issue was to begin with.
CRAIG BOX: When Amazon brought their managed service out, you migrated off Rancher. What were the challenges, and how did you go about that migration?
DAVID PAIT: That migration was pretty interesting. During the time we used the Heptio service Ark, I believe is what it was called then. I believe they got bought out by VMware, and it's now called Velero. But we use that to actually back up all of our clusters, which is a really great offering, especially with Kubernetes being so ephemeral. You're not actually saving EC2 instances anymore. You're just dumping a bunch of the YAML into a zip file, and that's your backup. So we use that to stand up new clusters in EKS and migrate everything over.
Everything was great for about the first four days. On day 5, we started noticing some issues with Ark itself trying to start up. And also our auto scaler just started kicking itself over and over. And we weren't entirely sure what was going on. I opened up some support cases with Amazon. They weren't entirely sure what was happening either. A couple days later I slowly started to dive into what was happening. And you could see some errors about one of the Kubernetes APIs not having any information about it. So it said it existed, but if you actually look for it, it told you it didn't exist.
What had happened is Ark had actually restored one of those beta APIs that was enabled in Rancher, but not enabled in EKS. But it didn't have any information about it. So it made it look like it existed, but it didn't actually exist. When any of those applications started up and tried to use discovery against the Kubernetes API to figure out what exactly is in the cluster, it would hit that certain point and then just fail out. OK, it said it existed, but it didn't actually exist.
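The failure mode David describes, where an API group is advertised but cannot be described, can be modeled with a toy discovery loop. This is not the Kubernetes client or Ark code; the function names and the `example.io/v1beta1` group are hypothetical, and the sketch only contrasts a strict walk that aborts on the phantom group with a tolerant one that skips it.

```python
class DiscoveryError(Exception):
    """Raised when a listed API group cannot actually be described."""


def strict_discovery(listed_groups, resources_by_group):
    """Fail on the first group that is listed but has no resource data."""
    discovered = {}
    for group in listed_groups:
        if group not in resources_by_group:
            raise DiscoveryError(f"{group} is listed but has no resources")
        discovered[group] = resources_by_group[group]
    return discovered


def tolerant_discovery(listed_groups, resources_by_group):
    """Skip phantom groups and report them separately."""
    discovered, phantom = {}, []
    for group in listed_groups:
        if group in resources_by_group:
            discovered[group] = resources_by_group[group]
        else:
            phantom.append(group)
    return discovered, phantom


# Hypothetical cluster state: the beta group is advertised but empty.
listed = ["apps/v1", "example.io/v1beta1"]
resources = {"apps/v1": ["deployments", "statefulsets"]}
```

Any component that behaves like `strict_discovery` at startup would crash-loop in this state, which matches the restart behavior described above.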
CRAIG BOX: You could almost consider that a bug in Ark, that it let that happen. Did you follow up with them? And how did you go about resolving that not only for yourself, but for the community?
DAVID PAIT: I did. At the time I didn't know too much Go. So I did open a ticket and let them know what was happening. I believe they ended up fixing it and trying to avoid restoring the Kubernetes APIs themselves. But any kind of custom resource definitions they would of course do. I believe they stopped doing it from there. I did report it back to them. I believe they fixed it. But at the time, didn't know too much Go, so I didn't dive too much into it. I think today, I probably would have taken it upon myself to implement a fix and commit it back to the repo, and help out the community.
CRAIG BOX: So the move from Rancher to EKS, was that motivated solely by having someone manage the service on your behalf?
DAVID PAIT: 100%. The $0.30 an hour that it costs for them to manage it for us was super beneficial for us. So we didn't have to take up so much of our time to manage those things ourselves. For the clusters that we have and the things we're doing, we don't need too many highly specialized things that we need to manage ourselves. They enable certain things that aren't enabled in EKS. So for us it was a huge relief, a burden that we could get off of ourselves, to not have to manage that control plane anymore.
CRAIG BOX: As Kubernetes becomes a commodity provided by cloud providers, how do you think vendors like Rancher, who started off making it easy to install these things, but now aren't necessarily valuable to you, how do you think they can remain relevant?
DAVID PAIT: I think they still are relevant. I think even today, they've made some changes to highlight Kubernetes as their main thing. Bare metal installs are still huge all around the country, and the world, really. Some people just aren't using cloud yet. To stand that up in your data center, Rancher is a great option to do that.
ADAM GLICK: In terms of your own deployment, and following SRE, and DevOps principles and practices, is everything automated, or are there still a lot of manual tasks to be done?
DAVID PAIT: I will say it's mostly automated. I wish it was 100% automated. There are still some kind of push button things. But since the time that I started in the role, and to where we are now, it's almost night and day. It's almost like a completely different engineering department. We used to have biweekly deploys where we'd actually have to take all of our applications offline, have a scheduled maintenance where we send out email to customers and say, hey, you can't access it for an hour while we're upgrading everything, to now you can deploy it anytime, anywhere, with a big help from Kubernetes to get us there.
But also that and using GitOps as well. As soon as you merge something to develop, it's in our pre-production environments. And our automation kicks off, and you do your testing as you need to. The only last step that we have that still has to be done manually is the deployment to production. Still trying to convince my bosses that it's a good idea for us to automate that. We'll see how that goes.
ADAM GLICK: Still some wariness about fully automating the ability to push things into production in case something goes wrong. The safety valve, essentially, is that a human still has to make that call.
DAVID PAIT: Yup. But the good part about it, too, keeping in line with the SRE methodology, is we've given our developers a whole lot of tools that they can use themselves. So it's not me that has to deploy to production. It's really, anybody can do it. The team that's actually working on the app, as long as they've deemed it worthy, and tested, and everything's good, they can deploy it themselves. They don't need me as the gatekeeper to do it for them.
ADAM GLICK: Have you gone full on cattle versus pets? In terms of, have you shut off all RDP and SSH to all the servers, and any fixes go through a redeployment, versus a go in and make changes?
DAVID PAIT: For the most part. I think now that-- again, once we have most of our applications running in Kubernetes, with the use of Helm as well. It's a lot easier to just run a Helm rollback than it is to go in, and try to figure out what's going on. We still have some kind of legacy services and APIs that are still running on old EC2 instances, that we're hoping to move into Kubernetes or just shut off completely, hopefully by the end of next year.
CRAIG BOX: You've mentioned a few different methodologies there with GitOps, and also with Helm. How do you evaluate the tools out there for this kind of development to deployment on top of Kubernetes? And how do you decide what you will support and offer to your developers?
DAVID PAIT: For us it's really just what's going to fit our needs, and what the company wants to do. We're very open to new technologies. If anybody has a passion about something, or wants to try something new, we're super open about it. For us it's really just trying to find something that's going to fit our needs. For us it was really we need to deploy stuff to Kubernetes. What's the best way to do that? We have four different environments. And how do we get each one of these different? Do we end up using something like Kustomize? Or what was the old one that used to be, Ksonnet, I believe at the time. I think we investigated that as well. Or Helm. And Helm seemed like the best option for us. Right now we have just a repo of blanket Helm templates that we use. And then each one of our applications stores its own values file. And then ends up doing the deploys that way.
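The pattern described here, one set of shared templates with a values file per application, can be sketched in a few lines. This is only an illustration in plain Python, not Helm itself: the template text, the application name, the image, and the four environment names are all hypothetical, and Python's `string.Template` stands in for Helm's templating.

```python
from string import Template

# Hypothetical shared template, standing in for one file in a common templates repo.
SHARED_TEMPLATE = Template(
    "kind: Deployment\n"
    "name: $name\n"
    "replicas: $replicas\n"
    "image: $image:$tag\n"
)

# Hypothetical values owned by a single application.
APP_VALUES = {
    "name": "billing-api",
    "image": "example/billing-api",
    "tag": "1.4.2",
    "replicas": 1,
}

# Hypothetical per-environment overrides (four environments, names assumed).
ENV_OVERRIDES = {
    "dev": {},
    "qa": {},
    "staging": {"replicas": 2},
    "production": {"replicas": 4},
}


def render(env):
    """Layer the environment overrides on the app's values, then fill the shared template."""
    values = {**APP_VALUES, **ENV_OVERRIDES[env]}
    return SHARED_TEMPLATE.substitute(values)


if __name__ == "__main__":
    print(render("production"))
```

The template is written once and only the values change per application and per environment, which is the property the answer above attributes to the Helm setup.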
ADAM GLICK: You mentioned Helm a couple of times. And not only do you use it, but you're also a contributor to the project now, correct?
DAVID PAIT: Yes.
ADAM GLICK: What inspired you to contribute to Helm?
DAVID PAIT: The biggest thing for me to get involved with that was we had a bug that we needed fixed. Most things that I end up contributing to are just stuff that I find that I need to fix for my company, and I'm sure for other people as well. From the migration from Helm 2 to Helm 3, there's a flag that you can set. It's called --reuse-values. We use that a lot in our pre-production environment, to where we have basically a super chart, which is a giant Helm chart made up of all of our microservices, their Helm charts as well.
We use that flag to actually deploy to the environment to stand it up. And then after that, make changes to individual applications. So we leave that flag on all the time. What was happening is if you did it the first time, and you weren't using any values beforehand, it wouldn't deploy anything at all. It would just ignore that flag and completely break. Going in and making that fix to make it work like it did in Helm 2 again was what I contributed.
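The behavior described above can be sketched as a merge of three layers of values. This is a simplified illustration in Python, not Helm's implementation: `values_for_upgrade` and `deep_merge` are hypothetical helpers, and the layering shown (chart defaults, then values carried over from the previous release, then new overrides) is an assumption made for the example. The point it illustrates is the first-deploy case, where there are no previous values and the flag should behave as if the previous values were empty.

```python
from copy import deepcopy


def deep_merge(base, overrides):
    """Return a new dict with overrides layered on base, recursing into nested dicts."""
    merged = deepcopy(base)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


def values_for_upgrade(chart_defaults, previous_values, new_overrides, reuse_values):
    """Hypothetical helper: compute the values used for a deploy.

    previous_values is None on the very first deploy of a release.
    """
    if reuse_values:
        # First deploy: nothing to reuse, so fall back to an empty map
        # instead of failing or deploying nothing.
        carried = previous_values or {}
    else:
        carried = {}
    return deep_merge(deep_merge(chart_defaults, carried), new_overrides)


if __name__ == "__main__":
    defaults = {"api": {"tag": "1.0"}, "web": {"tag": "1.0"}}
    # First deploy with the flag left on: still produces a full set of values.
    print(values_for_upgrade(defaults, None, {"api": {"tag": "1.1"}}, True))
    # Later change to one service: the other service keeps its earlier value.
    print(values_for_upgrade(defaults, {"api": {"tag": "1.2"}}, {"web": {"tag": "2.0"}}, True))
```

Leaving the flag on all the time, as described, then works for both the initial stand-up of the environment and later changes to individual applications.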
CRAIG BOX: Two years into your Kubernetes journey, how have you seen the technology change? And how do you think it would be different for someone getting started today, from when you did it?
DAVID PAIT: Luckily, with Kubernetes it's super backwards-compatible. So a lot of the stuff that we're doing now is still relevant today. There's a lot of new features you can take advantage of, but the stuff that we were doing then almost applies directly today. It has changed quite a bit. There's definitely a lot of new features you can get involved in. And lots of things that may be applicable for your company or may not. It's a bigger landscape than it's ever been.
CRAIG BOX: There are a number of people who unfortunately aren't able to keep up with their previous career, especially in the arts. Touring musicians obviously are a big part of that. If there are people who are interested in retraining today, getting involved in IT, or working their way into an SRE kind of career, what advice would you give them?
DAVID PAIT: I think the biggest thing is really just don't be afraid to do it. Look around, try things. If you're interested in it, give it a shot. Today, especially with Kubernetes itself, two years ago, it was difficult to get started. You had Minikube, and I think that was probably about it, to get something running on your own laptop. Today there's four or five different options that you can get up and running. Docker for Mac has a built-in Kubernetes cluster that you start. You can also run kind and have a cluster up in two seconds to do that kind of stuff.
For me, I think the biggest thing is just hands-on work. Some people learn a little bit differently. But for me it's really just diving in with a new technology and self training.
ADAM GLICK: Finally, for fun, you obviously used to play in a band, know lots of people and things in the music scene. What is one band you think is underappreciated that people might want to check out, but don't know about?
DAVID PAIT: That's a tough one. There's quite a few. I'll try to stick to the pop punk realm. One of the newer bands that's come out is called Hot Mulligan. A very strange name. They have very strange song titles as well. Very solid band. I believe they're out of Seattle. Don't quote me on that though. Super good.
CRAIG BOX: Lansing, Michigan.
DAVID PAIT: Completely wrong!
ADAM GLICK: Ah, they're both the northern border states, right?
DAVID PAIT: That's close enough, right? They're basically next to each other.
CRAIG BOX: Do you have any stories from the road? Do you have any tales of bands who you think, oh, hey, they'll be out there drinking hard, but you go backstage and they're actually drinking a cup of tea, or something like that?
DAVID PAIT: Yeah. You'd be surprised at how many of your favorite death metal bands actually love Taylor Swift, and listen to pop music 24/7. I'm the exception to that rule where I actually really enjoy pop punk music, and I also played it. For the most part, your favorite musicians, whatever they play, they probably don't listen to that as well. One tour we were on we had a death screamo band that went on every night and screamed their lungs out. And then you'd see them a half hour later standing front row for our set, singing along with the words, and just absolutely loved our band. You find it quite a lot, actually.
ADAM GLICK: That's awesome.
CRAIG BOX: I think music people just love music.
DAVID PAIT: Yeah.
ADAM GLICK: Thanks for joining us, Dave.
DAVID PAIT: Happy to be here. Thanks for having me.
ADAM GLICK: You can find David Pait on Twitter @sparksthedave.
DAVID PAIT: That's it.
ADAM GLICK: Thanks for listening. As always, if you've enjoyed the show, please help us spread the word and tell a friend. If you have any feedback for us, you can find us on Twitter @kubernetespod, reach us by email at firstname.lastname@example.org, or fill out our survey at kubernetespodcast.com/survey.
CRAIG BOX: You can find other things at our website, including transcripts and show notes, as well as links to subscribe. Until next time, take care.
ADAM GLICK: Catch you next week.