#158 August 13, 2021

Telekom (with a K), with Vuk Gojnic

Hosts: Craig Box, Jimmy Moore

What is a telecommunications provider, if not a very distributed system? Kubernetes is becoming an important engine for the world’s telcos, especially as they roll out 5G. Vuk Gojnic leads the team rolling out Kubernetes across Deutsche Telekom (the parent company of T-Mobile), and he tells us how the worlds of telco and cloud have converged.

CRAIG BOX: Hi, and welcome to the Kubernetes Podcast from Google. I'm Craig Box, with my very special guest host, Jimmy Moore.


We've had guest hosts on the show for a while. And another show that's been running through a roster of guest hosts is "Jeopardy!" which we've talked a little bit about in the past as being a cultural phenomenon with a host for a very long time, and coming back into the news recently from an impressive winning streak from contestant James Holzhauer. This week, they have announced their plans for permanent hosting.

JIMMY MOORE: Well, you know, you can't really fill the shoes of Alex Trebek. You can add something new and exciting. And actually they have two hosts. And I'm most excited about the host who's going to be doing kind of special events, like college tournaments or spinoffs you mentioned. And it's the actress Mayim Bialik.

CRAIG BOX: Mm-hmm. Of "The Big Bang Theory" fame.

JIMMY MOORE: That's right. You know, she not only played a neuroscientist on TV, but she's one in real life.

CRAIG BOX: That is an excellent example of good casting.

JIMMY MOORE: Yeah, absolutely. And, fun fact, when she was on "Blossom" as a child, her character Blossom was super smart and got to be on "Jeopardy!" So she actually got to play the game as an actress. And now she's hosting it as a real human, I suppose.

CRAIG BOX: My interesting observation about "Jeopardy!" is that the guy who has been chosen to be the somewhat permanent host was the guy who was doing the choosing.

JIMMY MOORE: [CHUCKLES] That's right. He works there.

CRAIG BOX: Yes. It's a guy named Mike Richards, who is the executive producer. He does have game show-hosting experience, as I understand it. He was also in the running to be the host for "The Price is Right," which is hosted by Drew Carey of "The Drew Carey Show" fame.

JIMMY MOORE: That's right. Is he still hosting that? He's been on that forever, longer than Bob Barker.

CRAIG BOX: Really? I do remember Bob Barker's cameo in the movie "Happy Gilmore."

JIMMY MOORE: That's right. Didn't he, like, beat him up?

CRAIG BOX: Kneecapped the guy.

JIMMY MOORE: That's right, right on the golf course, yeah. Do you remember what he ended his shows with?

CRAIG BOX: Well, now, I never watched them, you see. I didn't grow up in the US.

JIMMY MOORE: Oh, that's right.

CRAIG BOX: These are just things that have filtered through to me from popular culture.

JIMMY MOORE: I wasn't sure if we exported. But he reminded you, always, spay and neuter your pets, at the end of every episode, because he wanted to control the pet population.

CRAIG BOX: Well, I hope that worked out well for him.

JIMMY MOORE: Yeah, I mean, I suppose he had lots of pets. Interesting fact-- I was voted most likely to be a game show host in college.

CRAIG BOX: Have you actually hosted a game show?

JIMMY MOORE: Absolutely.

CRAIG BOX: Well, then that's good voting as well.

JIMMY MOORE: I suppose. Not yet on TV. It's just been in my living room. But hey, what are you going to do?

CRAIG BOX: You're also a fantastic news reader, so let's get to the news.


CRAIG BOX: Key vendors supporting eBPF have announced the creation of an eBPF foundation. The group is a subproject of the Linux Foundation, similar to the CNCF. eBPF, covered in episodes 91 and 133, is an extension mechanism for operating system kernels, allowing you to address a number of networking, security, and observability use cases. The foundation establishes a steering committee to take care of the technical direction and vision of eBPF, and the founding chair is Thomas Graf of Isovalent.

JIMMY MOORE: Istio 1.11, out this week, brings a CNI plugin for deploying mesh workloads without requiring network admin capabilities and support for running a mesh with a control plane in a different cluster. New experimental features include support for Kubernetes multi-cluster services.

CRAIG BOX: You know your project has become key infrastructure when the NSA starts telling people that they should secure it. This week, the United States National Security Agency and sister org Cybersecurity and Infrastructure Security Agency launched Kubernetes hardening guidance.

Regular listeners to the show will not be surprised by their recommendations, which include scanning containers for vulnerabilities or misconfigurations, running containers with the least privileges possible, and using network separation to control the amount of damage a compromise can cause.
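
Illustration (not from the episode): the "least privileges possible" recommendation can be made concrete with a small check over a pod spec. This is only a sketch under stated assumptions. The field names `privileged`, `allowPrivilegeEscalation`, and `runAsNonRoot` come from the Kubernetes container `securityContext` schema, but the function name, the rule set, and the example spec are hypothetical, and pod-level settings are ignored for brevity.

```python
# Illustrative check of container settings against a few least-privilege rules.
# Field names follow the Kubernetes container securityContext schema; the rule
# set is a simplified assumption, not the full list from the hardening guidance.

def least_privilege_findings(pod_spec):
    """Return a list of human-readable findings for a pod spec given as a dict."""
    findings = []
    for container in pod_spec.get("containers", []):
        name = container.get("name", "<unnamed>")
        ctx = container.get("securityContext", {})
        if ctx.get("privileged", False):
            findings.append(f"{name}: runs privileged")
        # An unset value is treated here as allowing escalation.
        if ctx.get("allowPrivilegeEscalation", True):
            findings.append(f"{name}: allows privilege escalation")
        # Only the container-level setting is inspected in this sketch.
        if not ctx.get("runAsNonRoot", False):
            findings.append(f"{name}: may run as root")
    return findings


# Hypothetical example: one locked-down container and one risky one.
example_spec = {
    "containers": [
        {"name": "web", "securityContext": {
            "privileged": False,
            "allowPrivilegeEscalation": False,
            "runAsNonRoot": True,
        }},
        {"name": "debug", "securityContext": {"privileged": True}},
    ]
}

for finding in least_privilege_findings(example_spec):
    print(finding)
```

In practice this kind of rule would more likely be enforced by an admission policy than by an ad hoc script; the sketch only shows what the rules look at.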

JIMMY MOORE: The regular listener may also be surprised to learn that not everything that runs on the cloud runs in Kubernetes. Google Cloud Service Directory is a fully managed registry for tracking services across environments, and even across clouds. In a preview released this week, GKE services can now be automatically registered and unregistered from the GCP service directory.

You can access records in the service directory over DNS, HTTP, or gRPC. Google also announced improvements to Cloud Logging and monitoring for GKE, letting you go directly to monitoring data for a Kubernetes object from a log entry.

CRAIG BOX: Monitoring company Sysdig has added a variety of new Prometheus features. A new integrations manager provides automatic discovery and assisted deployment of monitoring integrations, along with pre-configured dashboards and alerts. A new query explorer helps you dig into your data. And Sysdig can now act as a remote data store for long-term storage of Prometheus metrics.

JIMMY MOORE: When you think Kubernetes policy management, you may think of OPA and Gatekeeper. But there are other options out there. Another project in this space is CNCF sandbox project Kyverno. With more than 6 million downloads in less than six months, Kyverno's creator, Nirmata, this week announced $4 million in pre-series A funding. The investment was led by Z5 Capital.

CRAIG BOX: Finally, the CNCF has split their annual survey in half, and have just launched part 2. They'll ask questions like, how are you using cloud-native technologies? And what flavor of ice cream did you turn to when you learned your KubeCon proposal was declined? Make sure you tell them that you get your news from podcasts in question 23.

JIMMY MOORE: And that's the news.


CRAIG BOX: Vuk Gojnic leads the Kubernetes Engine Squad at Deutsche Telekom Technik in Germany. Welcome to the show, Vuk.

VUK GOJNIC: Thanks for having me, Craig.

CRAIG BOX: You grew up in Montenegro, and that's a country that not all of our listeners may be familiar with. Could you give us a bit of a flavor of the modern history of Montenegro?

VUK GOJNIC: Montenegro is a very old country, very small, 620,000 inhabitants, very diverse, both culturally and geographically. It has beautiful nature, warm people. And it's in the heart of the Balkans. And when you mention the Balkans, many people would immediately think about the complex social-political landscape. The whole history is full of extremes and controversies. But it's a very interesting place and a very nice place to be there and to visit as well.

CRAIG BOX: I understand that when you were at school, you did a lot of learning of computers as a concept on the whiteboard-- or on the chalkboard, I should say. And people were explaining to you how to program without you actually having access to computers. Is that the case?

VUK GOJNIC: Yeah, that's very funny from today's point of view. That time of early 90s, in that region, was not the best time of all. This was the time when I was growing up and getting education. So I happened to learn the programming in high school on the blackboard-- or on the greenboard-- where the teacher would draw the algorithms on the board, explain them, and then, next to them, draw the line of the code realizing that algorithm. And so it happened that only two to three years after that I got first permanent and stable access to the PC when my family decided to invest and buy me one in 1995.

CRAIG BOX: That's how Steve Wozniak says he invented the Apple 1. He drew it out on notebooks.

VUK GOJNIC: That's probably, for my case, a bit too ambitious. But was not that bad at the end.

CRAIG BOX: Now, we're going to talk a lot about telecom companies today. Telcos grew out of the national post office. In Britain, for example, there was the GPO. And they were the post and the telegraph company. And then, when telephony became a thing as well, they had what, in Europe, we call PTTs, or Postal, Telegraph, and Telephone Services.

Given the state of change through the Balkans during that period, what was the telecom system like?

VUK GOJNIC: It's pretty much similar to any country in the world at that time, any, I would say, even developed country. Because it was inherited from what once was Yugoslavia. The interesting thing about the country code of Yugoslavia at that time was +381. And it's inherited by today's Serbia. And then all other five countries or former Yugoslav republics got their own country code.

So the transformation in the 90s brought us, out of one country code, the six, which are all in the vicinity of 380-something, I have to say. It was more or less PTT, Post, Telephone, Telegraph service, that was a state-owned system that was very much, especially in that socialist economy, focused on reaching as many people as possible with the telephony services.

And in that situation, you had a lot of built infrastructure, the copper infrastructure, obviously, for that time. And you had all those things like, to connect the last village, there were some twisted pairs that were even 15, 20 kilometers long. That made a bit of issues later on when we were introducing broadband. But the dynamic development of modern telecommunications started in, say, mid-90s, by introducing parallel to fixed line, the mobile networks, and then, soon after, the internet access.

I have to say that I'm not coming from that typical telecom background. I was not dealing with the original communications services. Obviously, in the meantime, I learned a lot about that in my studies as well. So I was somehow coming into that world or close to that world from somebody who, as I said, learned to program on the blackboard, and then got very enthusiastic about that, and then spent a lot of free time to get into it. And it just happened I got in a place in which I merged or I joined these two things, telecommunications and software development.

CRAIG BOX: So you mentioned that you came from an internet background but you ended up in the telco space. You got a master's in telecommunications engineering from the University of Montenegro. Was there a computer science software option at school? Or was that what the tech course was that was available to you?

VUK GOJNIC: It was. The history of that is interesting because I started working so I got enthusiastic about computers and programming and everything in high school. And I was binge-reading all the magazines and the journals. And obviously, at the time, there was no internet, at least not accessible to us in the Balkans. So I couldn't even read more. But I was spending most of my free time looking at it.

So at the beginning of university, I was deciding what to go for. And there was an option on two faculties to go for something that's close to computer science. One is the faculty of mathematics and another was faculty of electrical engineering. And as somebody who was also, in high school, very much interested in physics and so on, mathematics was not my thing. So I went electrical engineering.

But at the same time, I started working in my first company that's, in the meantime, nonexistent and got essentially blended into what's Deutsche Telekom in Montenegro, today, in several iterations. But at the moment I started working in software development, I was doing the studies in parallel. And by year three or even four, when we would choose the specialization, I was looking at what's on the offer for a computer science specialization.

And I said, OK, most of it is either a bit old and not actual to the standards, and many of the things I experienced in practice, I said I'd rather do telecommunications. And it was one and a half years of specialization on this faculty. I like to say that, by education, I am an engineer of telecommunications. But by my professional origin and experience, I am a software developer.

CRAIG BOX: One of the things that you developed early on, you were a founder, I would say, of the most popular website in Montenegro, the Cafe Del Montenegro. How did that come about?

VUK GOJNIC: Yeah, that was a spontaneous thing. I spent a long summer in 1997, before starting faculty, browsing or surfing one CD that came with a magazine. And this CD was kind of a simulation of the internet. It was obviously a set of HTML pages. I had time, and then I got into and then learned how to make web pages and also dynamic web pages and so on.

And when we started at the university with one of my friends who was also enthusiastic about the introduction of the internet that was, around '96, coming into that region, we said, OK, why don't we make a website?

And it was started as a university project without any ambitions. And then we learned through it. And then, at the end, in 2005, we sold it out. But we touched many things that you could say are, today, social networks. We had a community. We had different things around.

So the good thing is that there is this Internet Archive service where I can still see how it looked in 2001. So it's a kind of archaeological site for me. Because what it is today, it's a proper online news site for professional journalists. And we obviously made it much more spontaneous and much more colorful. So this helped me a lot, got a lot of nice experiences, how to develop the property on the web or online, and how to promote it. We had all these things like a banner exchange and stuff, before dotcom boom-- or bust at the end.

CRAIG BOX: So you have a web ring, and say, if you like the site, you can click this link to go to the next site in the web ring.

VUK GOJNIC: Yeah, there was something like that. And all it cost us was just to paste a small JavaScript code or something like that. And then we would get, every month, a check from the US which we could turn into real money. So it was--

CRAIG BOX: Beautiful.

VUK GOJNIC: Beautiful indeed.

CRAIG BOX: You were working for an ISP during this time. Is that the company that you mentioned that was acquired by Deutsche Telekom?

VUK GOJNIC: That's the company that brought internet access to Montenegro in early '97. What it was, actually, was one small room and one rack in that room where there were a few servers and a bunch of these US Robotics modems for dial-up connections. There was another office where there were some standalone tower Digital Alpha servers and so on.

So it was a very, very lucky coincidence that we got the users of that small internet service provider. And then, soon after, there was a member of the team who was developing early internet services, and then internet access, and then infrastructure.

Very soon, we moved to Linux there, and put everything either on open source or on our own development. And it grew to the point where it served a pretty decent number of users for Montenegro size. It was maybe 70,000, 80,000 of these dial-up accounts before it got acquired by a locally state-owned telecom company. And then that bundle got acquired, later on, by Deutsche Telekom group. But historically I came in through a small, I would say, like a garage company.

CRAIG BOX: So you've effectively worked at that same company now for how long?

VUK GOJNIC: 23 years would be this year if I count the continuity. But I could say I've worked for Deutsche Telekom group or its predecessors now for more than 20 years. [CHUCKLES]

CRAIG BOX: I hope that means they give you a nicer parking space.

VUK GOJNIC: Yeah, a parking space is always a topic. It's a benefit. But where I am currently, in Germany-- basically I moved, in the meantime, through a couple of positions-- but where I am currently, a parking space is a commodity in Telekom. [CHUCKLES]

CRAIG BOX: Your story is very interesting to me, because I hear a lot of myself in that and that I worked for an ISP for a while when I was at university, and then I worked for an internet hosting company for a while back in New Zealand. All through this time and for many years before, my father worked for the telecom company of New Zealand, who was, again, spun out of the New Zealand post office, and became known as Telecom, and through a variety of teams and sell-offs and so on, he's still effectively doing the same thing today.

But one thing that was interesting about that was that when I grew up and would go and see him at work, we had all these crossbar switches. And it was all very physical and loud and noisy and so on. Yet when I talk to people in telcos today, they're effectively doing the same thing that I'm doing. They're doing Linux and they're doing cloud-native and Kubernetes and so on.

And it's very much these worlds have come together, the worlds of hardware and software. Everything can be virtualized now. All services are software or available over the internet. How did that change affect you? How did going from an internet company to a telecom company and then effectively having to turn them back into an internet company, how did that work out?

VUK GOJNIC: That's well spotted, actually. I was reflecting many times on that journey that I started in a small ISP company where, even from the very start, there were a bunch of servers and a bunch of network devices, switches, routers to connect you. And that was most of what it was. You had these infrastructure functions like RADIUS for authentication, like proxies to save on the traffic, the famous Squid, at the time, which is still around.

So all of that was a world that I grew up with and I knew. And then, through the different steps and different decisions in career, I came into the world of telecom, which was at that time still-- and also for a long time after that-- was still very much there is a device, there is a network, there is a proper engineering doctrine, how do you draw the lines, the connections, and so on. But these boxes were more or less black boxes, at least for a telecom operator that we are and that we were for a long time.

Inside-- what happened to be clear to me later on, inside of these boxes, if you look at typical vendors or developers of this equipment, they were already, a long time ago, virtualized with some proprietary techniques. There were also software and hardware with certain separation. These vendors were using one platform for many types of network functions, how we call them, and so on. But it was not open. It was a mostly siloed world.

And what started happening in, let's say, late 2000s, maybe around 2010, the movement started going into, OK, could we decouple the hardware from the software? Could we make a generic layer of the hardware-plus-platform where any kind of software can run? So it's where the network function virtualization as such was born.

But in terms of somebody who is an observer from the outside, and for me, it was completely like, oh, I was searching. I caught myself searching, like is there a telephony switch which is open source. And there are some SIP-based soft switches, of course, as a predecessor of that.

But all my time in between, let's say, 2005, when I got more into the telecom world, until 2015 more or less, I was thinking, OK, why are those things not in GitHub or earlier in the SourceForge and so on?

Now this trend came exactly. And what we are seeing is there is a radio access network for mobile under the name of O-RAN, being built as an open source stack of open source code and so on. So a lot of things are now feeling for me like coming back home after a long break of like 10 to 15 years. I'm seeing the things that were completely normal to me in 1997, 1998, and so on.

CRAIG BOX: During that time, you took a little detour into management and you took some executive roles-- strategy, CTIO, VP platform kind of roles. What encouraged you to make that move?

VUK GOJNIC: In the organization and in the society and the culture, there was a kind of notion of career progression that best pays off when you go into the management direction. So I learned, later on, the saying that, take a great engineer, convert him or her into the manager, and you lose the engineer essentially, and you get the bad manager.

I was always somewhere in between. And I got interested simply in leading the team, getting to run bigger projects when you get from a small ISP into Telekom, then you talk about big investments. So I was still very close to technology but managing the launch of the services like ADSL at that time, IPTV, converting the network from the old PSTN into software-based, soft-switch-based, and so on.

So I did a lot of projects, and then, through that, I got exposed to how do you evaluate the investments, how do you check the payback and return on investment, all these things. So it got interesting to me. So simply I was doing more and more on that. And at some point, I decided, OK, I want to go now to general management, general executive roles, corporate strategy and stuff.

It was simply the phase in the career in which I learned a lot. And I got different perspectives, which, after some time, started to have a shortage of challenge, or at least I got the feeling and the craving for doing something hands-on again.

CRAIG BOX: The engineer was still buried under there somewhere?

VUK GOJNIC: Yes, definitely, definitely. I was not sure what's bothering me for some time, let's say, before I switched off. But then, after I changed, I realized, OK, that was the thing I had to deal with and produce something or be very close to where the production is happening and work with the people enthusiastically on the stuff. And that happened to be in the domain of cloud-native and Kubernetes.

CRAIG BOX: Indeed. And this is where the stories recombine to some degree. Why does Deutsche Telekom need Kubernetes?

VUK GOJNIC: It is essentially the continuation of the story of virtualization that I mentioned, network function virtualization. Around 2010, there was a search for, OK, how do we decouple the hardware from software and how do we run software on the common generic platform layer?

And through a number of iterations, we got, till recently, a combination of either running on OpenStack or running on some VMware telco stacks when you talk about on-prem. And there were a lot of challenges with that. And at least in my observation, the single biggest challenge in this network function virtualization is that there was never a clear orchestration story or clear de facto standard in orchestration.

So it gave birth to a lot of different orchestration initiatives that were all kind of having their shortcomings and advantages and disadvantages.

So at the end, there was no critical mass to do something meaningful. And in parallel, while we were in telco debating what's the better standard of the orchestration, Kubernetes started emerging, not without competition. So I essentially started dealing with Kubernetes more or less at the end of the evolution cycle, evolution of the natural selection, where there were still pockets of different species in orchestration, like Mesos and then others.

CRAIG BOX: But we're humans, and those things probably look tasty. And we're likely to wipe them out soon enough.

VUK GOJNIC: I was thinking of it, is it-- like many people refer to orchestrator wars. And somehow, war is too violent or aggressive. So I was seeing it more as a natural selection.

CRAIG BOX: It is indeed the dodo's own fault for being too tasty.

VUK GOJNIC: [CHUCKLES] I could subscribe to that one. But at the end, I will say Kubernetes simply took the prevalence because of a number of lucky and some unexpected reasons. So at the end, we ended up all driving cloud-native ecosystems with Kubernetes.

I diverged from the question. You asked me, why do we need it? At the end, Kubernetes has been shown as a much, let's say, better and much more widely adopted standard that offers many of these orchestration functions that we struggled to build in different orchestrator silos, in the NFV, Network Function Virtualization. So it ended up as a platform of choice, as of today, for driving the move to the cloud-native and to the cloud.

CRAIG BOX: A couple of times you have mentioned network function virtualization. If I think of a telco today, is it made up of a bunch of boxes with wires plugged into them, and then the software runs somewhere else to instruct those things what to do? Or are there other things that you need to run software for?

VUK GOJNIC: It boils down to that. So telco, or as we refer to it, network technology, as opposed to IT, Information Technology, is a collection of different physical artifacts which used to historically be those boxes that your father was dealing with, like cross-bar switches, central switches, remote subscriber stages, and so on. And these boxes or these locations still exist today. So there are core sites where you have core network functions or bigger functions. And there are edge and then far edge sites.

So when we talk about edge, we talk about all the rooms in your neighborhood, when you think of the fixed line-- rooms somewhere in the neighborhood where there are some telco devices to which all the optical, nowadays-- or formerly copper-- lines are connected.

In a similar fashion, you have mobile base stations, which are towers spread and scattered across the country, which are again connected to some box next to a tower, and there are devices with the software and hardware there. And they are back-hauled and upstreamed into the core network.

So it is essentially the nature of the telecommunications services. They are very distributed. And they need to be distributed because of the fact that people that use them are distributed. The thing related to cloud or cloud-like or cloud-native is that, formerly, you would put a box, for example, DSLAM, for a DSL termination--

CRAIG BOX: That is a word I have not heard for many years.

VUK GOJNIC: Definitely. We would talk today about OLT, Optical Line Terminal, which is a word that's probably not that famous like DSLAM. But there is some hardware, some cards that connect the lines, and some also internet or ethernet cards that connect them to the core network. So they were like monolithic boxes.

Today, Deutsche Telekom has another project which is talking about disaggregation. It would say take the brain out of these network devices-- and the brain is obviously software always-- but take it out and build it as a microservices-based application that runs on the local Kubernetes cluster.

And the sole purpose of that local Kubernetes cluster-- or major purpose of that local Kubernetes cluster-- somewhere on the far edge is to talk to and to steer the network device that performs the function of switching, connecting, policy enforcement, and so on.

So in this sense, we are needing many Kubernetes clusters in many locations. In a similar sense, if you talk about bigger Cloud-native Network Functions, or CNFs, we can name 5G core. And 5G core is something that is a set of smaller network functions-- and nowadays, microservices-- that is simply handling the sessions of all mobile subscribers and mobile customers that are using mobile data.

So these sessions are aggregated, policies are applied to those sessions, certain profiles, authentication, and so on and so forth. And this is, in very simplistic terms, what these functions are doing, enabling us to record this show, for example.

CRAIG BOX: Down the street from me, there is a box which occasionally I'll see opened up with the gentleman plugging a bunch of wires into. Unfortunately, it is almost always a gentleman. And somewhere slightly further away there is probably a tower which is disguised as a tree, but probably not very well. Are there likely to be Kubernetes in both of those locations?

VUK GOJNIC: On a longer run, yes. With the caveat that the cabinet that you are talking about might be just cross-connect cabinet that doesn't hold the active equipment. But the O-RAN, for example, Open Radio Access Network standard and initiative, is built on the premises that the software for that base station for the tower, is going to run on a Kubernetes cluster that is either co-located where the current network equipment is that is blinking yellow and green LEDs in the dark there.

CRAIG BOX: Das blinkenlights, as they're called.

VUK GOJNIC: [CHUCKLES] Das blinkenlights, correct. So it's going to be co-located either there, or it's going to be aggregated a bit further away from that but still certainly in the region. There is a notion of distributed unit and remote unit. So it's all very much going in that direction. Are we there yet at the moment? No, so it's mostly in the POCs and in the labs, and in small-scale kind of field trials.

But I can very much foresee, in three to five years, you're going to walk next to or by many small Kubernetes clusters when you are going down that street.

CRAIG BOX: In the steady state, of course, these sites are connected back to your central core with a very fast network. Is there a locality advantage in having the software running so close to them? Or is it required for redundancy?

VUK GOJNIC: There are many new use cases that are opening up by that. So for example, so far, the mobile network stations were individual entities, and they were working for themselves, and being back-hauled all the way to the core for some processing. And there is a latency included in between and so on.

So with these new models, you can have, for example, one small data center covering and handling several of these base stations. And then it could be used to make them cooperate between each other to provide to the customers the boost and the bandwidth, better quality, and so on. There is always a trade-off between latency and the bandwidth. You need to transport this raw-- I'm now emphasizing much more on the mobile network.

There's a similar story on the fixed line and optical network. But there is a lot of bandwidth needed for a mobile to transport this RoE radio, baseband data, so-called. Because these are hundreds of gigabits per second of data. So the longer you transport it, the more complexity in the transport you have. So there is an advantage, definitely, being able to process it locally and then back-haul it as digested information, backwards.

There are also new use cases which would-- it's a bit of visionary thinking. Like when you have a compute capacity close to the subscriber edge, then you can enable these low-latency scenarios in which probably the most illustrative for some of the listeners is gaming. So when you have a ping time very low and the latency very low. So you can have these geolocated gaming or augmented-reality-based gaming sessions where the information doesn't fly or doesn't travel completely all the way to the core or even to the public cloud and back. So you do have this real-time immersive experience.

So there are many use cases that are being envisioned and could come up as a viable thing, calling it far edge cloud or something like that. But these things are still in the very early stage. And we are looking forward to seeing how they are developing.
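
Illustration (not from the episode): the latency argument for processing close to the subscriber can be sized with a back-of-the-envelope calculation. The sketch below assumes light travels at roughly 200,000 km/s in optical fiber (about two thirds of its speed in a vacuum). The distances are made-up examples, and queuing, switching, and processing time are ignored, so real round trips would be longer.

```python
# Rough propagation-delay estimate for light in optical fiber.
# Assumes about 200,000 km/s in fiber; the distances are hypothetical and
# queuing and processing delays are deliberately left out.

FIBER_SPEED_KM_PER_S = 200_000


def round_trip_ms(distance_km):
    """Round-trip propagation delay in milliseconds over a fiber path."""
    return 2 * distance_km / FIBER_SPEED_KM_PER_S * 1000


for label, km in [
    ("far edge site", 10),
    ("regional core", 500),
    ("distant cloud region", 2000),
]:
    print(f"{label}: {round_trip_ms(km):.1f} ms")
```

Under these assumptions the propagation component alone grows from about a tenth of a millisecond at the far edge to tens of milliseconds for a distant region, which is the gap that edge compute is meant to close.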

CRAIG BOX: All right, we've established the need to run Kubernetes in your data centers, in the cabinets on the street, inside the mobile towers. You've built a platform to run Kubernetes at scale. It's called Das Schiff, which is German for "the ship." I think Das Boot was taken.

VUK GOJNIC: Yes, the famous movie took that one.

CRAIG BOX: Tell me about Das Schiff.

VUK GOJNIC: Das Schiff actually is-- we call it Kubernetes engine for Deutsche Telekom Technik, which is essentially the framework and the way to stamp out many clusters in different environments in different locations and have them be continuously reconciled, so that our small team does not need to take continuous care of them, so that the clusters can take care of themselves, at least for a part of it. And it all happens in an automated way.

The notion of Das Schiff, we had in mind the sailing ships, the big ship that is transporting the freight, the load. So it's our positioning within the company as well that the team and the platform that could take and host your cloud-native applications, in analogy with the shipping, transporting them safely to wherever they need to be, this is kind of naming towards our internal customers. What it is, in essence, it's a framework for multi-cluster management.

You know, there's a distinction between ship and boat. People say the ship can carry the boat. And in a sense our engine is actually carrying many of the clusters. And those clusters you can think of as the boats or small boats that are holding different types of workloads. So it is there to present the scope and the nature of this multi-cluster setup that we decided in the early point to go for.

Because having bigger clusters and multi-tenancy in Kubernetes is still a controversial topic, especially for mission-critical or real-time applications. So we decided, let's build a capacity or capability to run many clusters with as little effort as possible.

CRAIG BOX: You've built on top of the Cluster API, which is a Kubernetes subproject. How do you interact with that API?

VUK GOJNIC: Our entire blueprint-- I have to say Das Schiff is a lot of application of a number of components for that very specific use case, which is globally specific, but you could find it in any telco like that, either today or in future. They are moving in transformation.

But we are applying many of the technologies that are out there and combining them in a certain specific setup that does the work for the scenario I just briefly described. And this is coupled with a bit of code that we are developing in order to cover the missing links and the pieces that are needed in order for everything to run smoothly.

We have obviously ambition to make it more systematic and bring more elements so that it's usable out of the box, but it is not at this stage. So currently we are in the situation that we are working with a couple of upstream projects. And some of them we are using and then providing a lot of feedback from the use case, and some of them we are also contributing.

So that's our first thought-- let's try to liaise and engage with interesting and relevant projects and support them instead of writing our own thing.

So one of those is Cluster API, and another project of those is Flux. Because what we looked for is a way we can have a fleet of clusters being reconciled, being managed, or managing Kubernetes out of Kubernetes. And this is what we found with Cluster API in combination with GitOps. Because that cluster, the management cluster of Cluster API, is looking at Git for the definitions of its tenant clusters or so-called workload clusters.

And the entire setup makes it possible that the management cluster is created based on the declarative manifests and based on the immutable principles, spinning out the tenant clusters or the workload clusters, and then reconciling them continuously.

Same thing happens when the workload cluster is up and running as an empty cluster. It is oriented towards the Git repo where it's defined what it needs to run, which components. Like you've got Prometheus, you've got the logging stack, you've got the policy stack, and so on.

And then it's caught in what we call a GitOps loop. So it's essentially the state of the lowest potential energy, is how I like to describe it, where the infrastructure is being reconciled and then we are observing the anomalies and reacting upon those anomalies.
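The reconciliation idea behind that GitOps loop can be sketched in a few lines. This is a toy illustration, not Das Schiff's or Flux's actual code: the desired state stands in for what is declared in Git, the actual state for what is observed, and all cluster names and versions are hypothetical.

```python
from typing import Dict, List, Tuple

# Cluster name -> Kubernetes version. All names here are hypothetical.
State = Dict[str, str]


def plan(desired: State, actual: State) -> List[Tuple[str, str]]:
    """Return the actions needed to move `actual` towards `desired`."""
    actions = []
    for name, version in sorted(desired.items()):
        if name not in actual:
            actions.append(("create", name))
        elif actual[name] != version:
            actions.append(("update", name))
    for name in sorted(actual):
        if name not in desired:
            actions.append(("delete", name))
    return actions


def reconcile(desired: State, actual: State) -> State:
    """Apply the plan to an in-memory copy of the observed state."""
    result = dict(actual)
    for action, name in plan(desired, actual):
        if action == "delete":
            del result[name]
        else:
            result[name] = desired[name]
    return result


desired = {"edge-a": "v1.21", "edge-b": "v1.21"}   # declared in Git
actual = {"edge-a": "v1.20", "old-lab": "v1.19"}   # observed
print(plan(desired, actual))

converged = reconcile(desired, actual)
print(plan(desired, converged))  # nothing left to do once converged
```

A real controller runs this comparison continuously, so any drift from the declared state shows up as a non-empty plan and is corrected on the next pass.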

So this is what we found in Cluster API. And Cluster API, we are applying as it is with two Cluster API infrastructure providers. One is vSphere, because there is a lot of existing legacy infrastructure running on the local vSphere. And we are able to stamp out the clusters into these environments.

And the second one is Metal Kubed (Metal³), which is an infrastructure provider to manage the bare metal servers through Ironic and make bare metal Kubernetes clusters out of them, which is a big thing, actually, we are focused on as a major strategic orientation.

CRAIG BOX: I find it amusing if not ironic that the only piece of OpenStack that seems to be relevant today is the piece that runs Kubernetes, which indeed is called Ironic.

VUK GOJNIC: It is definitely one of the components that is serving multiple purposes, or purposes that went over the scope of what OpenStack was made for and seemed to be a very good general-purpose bare metal management piece of software. There are not many of those in that area. You find a lot of proprietary things.

But Ironic emerged as the one that is doing its job very well, serving as an API towards the bare metal, which is then rendered through Cluster API and the infrastructure providers.

So at the end, it's all abstracted when you set it up. Or it's all abstracted from anybody who is not deeply in the platform like our team is. And essentially you have bare metal servers described as Kubernetes objects, and then you are dealing with them as with any Kubernetes object. So this is what we definitely have as a benefit and as upside on the Ironic, which is having, I think, a very good future.
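As a rough illustration of what "bare metal servers described as Kubernetes objects" means in practice, the sketch below builds plain Python dictionaries shaped like Kubernetes manifests and filters them by label, the same way any other Kubernetes object would be selected. The manifests are loosely modelled on Metal³'s BareMetalHost resource, but the fields shown are a simplified assumption and should be checked against the Metal³ documentation; the host names, labels, and MAC addresses are hypothetical.

```python
from typing import Any, Dict, List


def bare_metal_host(name: str, mac: str, labels: Dict[str, str]) -> Dict[str, Any]:
    """Build a manifest shaped like a Metal3 BareMetalHost (simplified, illustrative)."""
    return {
        "apiVersion": "metal3.io/v1alpha1",
        "kind": "BareMetalHost",
        "metadata": {"name": name, "labels": labels},
        "spec": {"online": True, "bootMACAddress": mac},
    }


def select(objects: List[Dict[str, Any]], kind: str, labels: Dict[str, str]) -> List[str]:
    """Return names of objects of `kind` whose labels include all of `labels`."""
    return [
        obj["metadata"]["name"]
        for obj in objects
        if obj["kind"] == kind
        and labels.items() <= obj["metadata"]["labels"].items()
    ]


# Hypothetical inventory; MAC addresses are from the documentation range.
inventory = [
    bare_metal_host("server-01", "00:00:5e:00:53:01", {"site": "edge-a"}),
    bare_metal_host("server-02", "00:00:5e:00:53:02", {"site": "edge-b"}),
    bare_metal_host("server-03", "00:00:5e:00:53:03", {"site": "edge-a"}),
]

print(select(inventory, "BareMetalHost", {"site": "edge-a"}))
```

The point of the abstraction is that a physical server becomes one more declarative object with metadata and a spec, so the same tooling and reconciliation used for everything else in Kubernetes applies to it.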

CRAIG BOX: Now, you are hiring at the moment. And you've commented that not everyone who you're talking to necessarily has the skill set required for the eventual job. What are the things that you find are missing? And how are you helping upskill people in those areas?

VUK GOJNIC: The skills, definitely in that combination that we are looking for, are not readily available on the market. Being a platform team in the telco where you are hosting a lot of network functions requires, in the ideal case, a person who knows Kubernetes very well and very deeply, especially its networking stack, and then also the networking stack itself for the integration with the surrounding environment, plus being experienced in Linux.

So I would say a person with 15 years of Kubernetes experience, that could be a good description of what we could look for.

But the thing is, anybody who is proficient with Kubernetes is able to, after some time, coupled with the people who are already in the team, learn missing pieces. And the biggest area for that is networking. So we essentially are developing and upskilling by exposing people to those challenges and coupling them with the team members who are more experienced with it.

So we don't want everybody to come up on the same level. Because in the team, we have some pure software developers, we have some very strong networkers, and we have a lot of what we would call SREs-- not a lot, we are a team of less than 10 people at the end, which is our goal and our, let's say, proclaimed challenge, that we want to manage hundreds and thousands of clusters with such a team.

But the team is really mixed profiles. And the bigger challenge is how upskilling in the wider organization is going to scale up. Because we deliver a Kubernetes cluster and say, here is your API endpoint, here are your credentials, essentially based on the IAM system. And then they say, in many cases, now, how do we use the cluster, and so on. So this is where there are structured programs/education going on, but nothing replaces the hands-on experience.

So I think it's a great transformative driver moving to Kubernetes. And some people are finding themselves really going from the person who operated the box into a person who needs to type the commands with kubectl, which we do not favor and promote except in test environments.

In an even worse case, we are pushing people to go and commit their things in the Git. And this is quite new for some people, and an interesting experience. But as I said, it's a transformation. And we are getting there, step by step.

CRAIG BOX: It definitely sounds like telco networking has taken a lot more from internet networking than the telco industry. And what I will take from that is you're probably more likely to hire me than my dad.

VUK GOJNIC: Are you interested? [CHUCKLES]

CRAIG BOX: Well, my dad's very gainfully employed still. He has effectively, like yourself, had the same job for many, many years. Plus I don't think he's looking to relocate right at the moment.

VUK GOJNIC: That's definitely the case. The profiles-- we are hiring, for the platform team, people who are 22, 23, 24 years old, with a completely new view of the world. And then we pair that view and their world with this transforming world of telco. And beautiful things are happening, I have to say.

CRAIG BOX: How do you stop them just looking at TikTok all day?

VUK GOJNIC: They are people who are still on mIRC, looking for the hacking news and stuff. It's a kind of geek subculture of the present generations that we are looking for.

CRAIG BOX: Well, even if people are on TikTok, they can just say, I'm just checking that the network is working correctly.

VUK GOJNIC: Of course. That's a good way to check what has a big impact on the wide number of customers. But a better thing is how we stop them from continuously getting a new fancy thing or new interesting project in GitHub and trying to implement it and push it to production immediately after three days.

CRAIG BOX: Can't keep a good nerd down!

VUK GOJNIC: Definitely.

CRAIG BOX: All right, well, thank you very much for joining us today, Vuk.

VUK GOJNIC: Thank you again for having me. It was a pleasure.

CRAIG BOX: You can find Vuk on Twitter, @VukGojnic. And you can find out more about Das Schiff on GitHub.


JIMMY MOORE: Thanks for listening. And as always, if you enjoyed the show, please help us spread the word and tell a friend. If you have any feedback for us, you can find us on Twitter @KubernetesPod, or reach us by email at KubernetesPodcast@google.com.

CRAIG BOX: You can also check out our website at KubernetesPodcast.com, where you will find transcripts and show notes, as well as links to subscribe. Until next time, take care.

JIMMY MOORE: See you later.