#148 April 30, 2021

Liqo, with Alex Palesandro

Hosts: Craig Box, Patrick Flynn

Liqo is short for Liquid Computing. It’s a tool for extending Kubernetes onto other clusters, developed at the Polytechnic University of Turin. Research assistant and Liqo co-creator Alex Palesandro is our guest this week.

CRAIG BOX: Hi, and welcome to the Kubernetes Podcast from Google. I'm Craig Box, with my very special guest host, Patrick Flynn.

[MUSIC PLAYING]

CRAIG BOX: It was three years ago that this very podcast first launched at KubeCon Europe in Copenhagen. You were there, Patrick. What do you remember from that event?

PATRICK FLYNN: Oh, KubeCon in Copenhagen, that was fun. I think the most notable thing I remember is getting on the plane to go home, looking in my bag, and realizing I had lost my Nintendo Switch. That was pretty brutal.

I had my, like, saved game for "Zelda." And it was before the cloud save. It was pretty traumatic, actually.

CRAIG BOX: Any positive experiences from the time in Copenhagen, or is it just the loss of the Switch?

PATRICK FLYNN: No, I think the other thing that I remember really well is that I think it's when I first met the Jameses, James Rawling and James Strachan. I don't know if you know them, but they were of Jenkins X fame.

CRAIG BOX: Yes. And they will have been of maybe Fabric8 at the time?

PATRICK FLYNN: Yeah. That's right. They were working on Fabric8 at the time. And I was working on Java tools. So we were just starting the Jib project. And Dan Lorenc was there, who's sort of famous now for Tekton and his open source security tool chain work. But I think that was the beginning of the conversations around Tekton, which Jenkins X eventually adopted.

CRAIG BOX: It's amazing what happens at all of these in-person events. And it's kind of a shame to think, well, seeing as we haven't been able to meet up in hallways and just have these conversations, that we may find that we're a little bit behind in some of the serendipitous things that could come out of them. But perhaps with a bit more thoughtful planning, we'll still be able to have some cool stuff come out for the next couple of years until we're able to reprime the pump, as it were, with meetings.

PATRICK FLYNN: Yeah, I hear you. I really hope that we can be back in a conference setting pretty soon.

CRAIG BOX: Now, just as things started to close down around the world, I understand that you packed up and moved to Canada, but not for the reasons others may have.

PATRICK FLYNN: Yeah. I actually had the plan set up before COVID. And then COVID hit just as I was kind of pushing the button. So that was pretty difficult.

I have three children, and we were moving out of New York to Montreal during the quarantine. Yeah, it was a tough choice, but I was really excited by the project. I'm now working on reCAPTCHA. Actually, the product has been around for a long time, but we've just started realizing how valuable it is to enterprise and the fact that we had a big gap in terms of the enterprise offering. So I've been leading that effort.

CRAIG BOX: Is there any truth to the rumor that the fact that we're all now asked to identify what traffic lights look like has something to do with self-driving cars?

PATRICK FLYNN: It's completely false. In fact, it used to be the case that the reCAPTCHA challenges were used to derive value for the G organization and their machine learning algorithms. But that's no longer the case. Google doesn't really get anything out of those challenges. And in fact, we have a new version of the product that doesn't include challenges at all. And we encourage our customers to move away from them.

CRAIG BOX: So we're no longer having to decipher scrawly text. Does that mean we've solved the problem of scanning books?

PATRICK FLYNN: Well, I think we just don't do that anymore.

CRAIG BOX: The robots have got too good at scanning books.

PATRICK FLYNN: Yeah.

CRAIG BOX: One thing that I remember about my time in Canada is a subtle difference on the topic of traffic lights. Where I had grown up in New Zealand and here in the UK as well, the traffic light is normally at the front of the intersection. So you drive up to the line where the first traffic light is, and then you stop. And in Canada, they hang them up in these larger intersections, but the traffic light you're looking at is actually at the back of the intersection.

So you have the whole box of the intersection, and the traffic light's hanging on the other side of it. And I do remember late at night one night thinking, oh, sh-- I've driven straight through the intersection because I'm driving up to try and stop at this red light. But it's not where I expect it to be.

PATRICK FLYNN: Yeah, I've noticed that as well. It seems like an accident waiting to happen. Fortunately, I haven't had a car. So I haven't had to worry too much about it.

CRAIG BOX: Now, unfortunately, by that point, I was well aware of which side of the road I was meant to be driving on. So I didn't have too much trouble there.

PATRICK FLYNN: Yeah, that's the extra challenge for people coming from the UK.

CRAIG BOX: Well, it's going to be a busy couple of weeks in terms of news. There have been a bunch of pre-KubeCon announcements that we'll cover today. And there'll be a whole lot more in next week's KubeCon roundup. So without further ado, let's get to the news.

[MUSIC PLAYING]

CRAIG BOX: Microsoft has acquired the cloud native company Kinvolk. Kinvolk started as a consultancy in Germany and has built products, including Flatcar Linux, a drop-in replacement for CoreOS Container Linux. Microsoft's announcement calls out the intention to continue to support Flatcar as well as Kinvolk's other upstream contributions while the team will become part of the Azure engineering org. We talked to Kinvolk's CEO in episode 79.

PATRICK FLYNN: Red Hat's virtual summit recently concluded, and they're very keen to let you know they're not just an OS company anymore. Announcements include a new version of OpenShift Container Platform called Platform Plus, which drops new episodes Fridays at midnight. Other OpenShift cloud services are in the areas of API management, Apache Kafka, and data sciences. And a new version of Enterprise Linux was also launched to remind you they’re still a little bit of an OS company.

CRAIG BOX: Hosting vendor Rackspace has announced a strategic investment in Platform9, and using their technology, has launched Rackspace Managed Platform for Kubernetes. MPK is a Kubernetes as a service backed by Platform9's PMK, an easy shuffle of letters, and Rackspace's trademark fanaticism. Learn more about Platform9 in episode 88.

PATRICK FLYNN: Kubernetes IDE Lens is going multi-user in the new version 5.0 beta. Lens Spaces is a cloud service which lets you define a set of clusters relating to a team and grant access to multiple users. You can also proxy connections to them through a cluster connect service, giving you one-click access with no VPNs required. No word yet on if Spaces will be a paid feature when it leaves beta, but they did call out that Lens will continue to work as a single-user service.

CRAIG BOX: Another week, another enterprise data protection vendor gets on the Kubernetes train. Last week, it was Zerto. And this week, it's HYCU, or haiku, bringing out HYCU Protégé. The service will back up your YAML and your disks both. And there's GA for GKE with other services to follow soon.

PATRICK FLYNN: This week's billion-dollar security vendor is Sysdig, raising a $189 million Series F at a $1.19 billion valuation. Sysdig's commercial platform is built on their open source projects like Falco, which we learned about in episode 91. The round was led by Third Point Ventures and Premji Invest.

CRAIG BOX: Two new GKE features have been launched to preview. Multi-instance GPUs let you partition your NVIDIA A100 GPU into seven instances, which you can share between different containers for better utilization. You can attach up to 16 GPUs to a GCE virtual machine, which lets you have a whopping 112 individual schedulable GPUs with which to render your blender.

The GKE gateway controller lets you use the new Kubernetes gateway API to define role oriented traffic routing rules in an evolution of the Ingress concept. There are gateway classes for regional, internal, and global external load balancers. And with multi-cluster gateways using the GKE Hub, you can easily run services in multiple clusters worldwide.

PATRICK FLYNN: Finally, if you've been having trouble keeping up with the unrelenting pace of listening to four release team lead interviews every year, you can rest up. The Kubernetes community has officially moved to a three releases per year model, permanently adopting the plan that was first thrust upon us by the year 2020.

CRAIG BOX: And that's the news.

[MUSIC PLAYING]

CRAIG BOX: Alex Palesandro is a research assistant at Politecnico di Torino in Italy and co-creator of the Liqo project. Welcome to the show, Alex.

ALEX PALESANDRO: Thanks, happy to be here.

CRAIG BOX: You studied at the Polytechnic of Turin before coming back and working there where you are now. Did your relationship with the professors change from when you were a student to now you're working there as staff?

ALEX PALESANDRO: Yeah, it's changed quite a bit. Because when you are on the other side, you see university life completely differently. So for example, you understand how much work has to be done in order to provide courses and labs and how many hours are spent. So it was really interesting. And it made me think about, when I was a student, I was always saying, oh, but this course is not that good. Now, I know there is always a lot of work behind it. And it's changed the way I think about the work of people that create courses.

CRAIG BOX: You did do a bachelor's and master's in Turin, but your PhD was from University Jean Moulin in Lyon in France. What language did you do it in?

ALEX PALESANDRO: Mainly in French. After the first year of my master's, I decided to apply for a double degree program with the University of Grenoble. And that made me go to France.

I finished my degree there. And then I moved for an internship to Paris, where I also started my PhD. So my PhD was sponsored by a company with the University of Jean Moulin.

So it was mainly in French. It was a bit hard because you have not only to speak French, but also to know the technical vocabulary in French, which is really different from the English words. But, yeah, at the end, it was also fun.

CRAIG BOX: You obviously speak at least two more languages than I do. What order did you learn the languages in?

ALEX PALESANDRO: I think in France they are really kind and help you to try to express yourself in French. So it's nice to try. And they are not laughing at you if you misspell a word or do the R the wrong way.

As Italians, we make a lot of these kinds of mistakes. I also learned French. I worked in a company speaking French all the time. It was fine.

CRAIG BOX: Had you learned French before that, or did you learn it at school?

ALEX PALESANDRO: I had some French courses in high school, but I was not that confident. And when I moved to France, I had to really put my head in the books to understand a bit more about verbs, and so on.

CRAIG BOX: What was the work you were doing in your internship in Paris?

ALEX PALESANDRO: Yeah, so I worked on nested virtualization. So it was this idea of putting multiple layers of hypervisors in order to enforce more security on cloud infrastructure. So at the time, there was a lot of interest in hypervisors and how hypervisors were designed in order to minimize the trusted computing base.

So the idea was to have one really tiny hypervisor on the lowest layer and then another hypervisor that was deprivileged somehow. And so it cannot harm the system. And it can be controlled by the other hypervisor, which is more privileged.

So this was the work. The main problem was the performance. So these multiple layers of virtualization were not really supported by the hardware at the time. I don't know now. And so the performance was not as good as you have with a single layer of virtualization.

CRAIG BOX: How many layers of virtualization can we do now? Can we go full inception and run hypervisors inside hypervisors inside hypervisors?

ALEX PALESANDRO: There are some papers that asked that question and reached, I think, six layers. But, yeah, I think that more than two is already a lot. In terms of performance, you are really doing this context switching between the different privilege levels of the operating system and the different VMs that you have. Not that useful, I think.

CRAIG BOX: And your doctoral thesis was in the multi-cloud space?

ALEX PALESANDRO: I was working for a company that was really interested in seeing if we can break a bit the vendor lock-in that we have in cloud providers and try to define an abstract infrastructure. So we can define how many VMs in our topology, virtual NICs, and so on. And then there is a kind of language that allows us to compile it on the cloud providers that we prefer.

It's something that, to a certain point, has been achieved with Kubernetes today because you can have a Kubernetes cluster on any provider. But at the time, there was a huge focus on infrastructure as code. And so the idea was to define this meta language that was then compiled into the proprietary language of the cloud provider and then deployed. So this kind of compiler, which I built during the PhD, was my main result.

CRAIG BOX: A lot of people who try and build a system like this, especially in the early days of infrastructure as a service, end up building a lowest common denominator system where they are only able to implement things that exist on all of their target platforms. How did you deal with that problem in your research?

ALEX PALESANDRO: The idea was to have an abstract definition and then have multiple implementations. We have an internal saying, OK, I have three providers implementing this kind of service. And then I can always do a custom one.

So you know, I have a VPN as a service on one provider. And then I have the possibility, the specification, of a new infrastructure that builds a VM, installs the VPN size that I want and deploys it.

So the idea was to say, OK, if I don't have the managed service, I would fall back on the custom one on the least common denominator. But at least I try to describe my infrastructure in an abstract way. And so the managed service could be taken into consideration because the problem was that you are normally building your infrastructure for a specific provider. And if the cloud provider does not have that service, you are not really trying to use the managed service for that.
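The fallback idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the catalog, the provider names, and the `compile_service` function are hypothetical and are not taken from the compiler built during the PhD.

```python
from dataclasses import dataclass

# Hypothetical catalog: which providers offer a managed implementation
# of each abstract service. All names here are illustrative only.
MANAGED = {
    "vpn": {"provider-a"},
    "load-balancer": {"provider-a", "provider-b"},
}


@dataclass(frozen=True)
class Plan:
    service: str
    provider: str
    kind: str    # "managed" or "custom"
    detail: str


def compile_service(service: str, provider: str) -> Plan:
    """Use the managed implementation if the provider has one,
    otherwise fall back to a custom deployment on a plain VM."""
    if provider in MANAGED.get(service, set()):
        return Plan(service, provider, "managed",
                    f"use the managed {service} of {provider}")
    return Plan(service, provider, "custom",
                f"create a VM on {provider} and install {service} software")


# The same abstract request compiles differently per provider.
for target in ("provider-a", "provider-c"):
    print(compile_service("vpn", target))
```

The abstract description stays the same for every target; only the compiled plan changes, which is what keeps the managed service in play whenever a provider offers it.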

CRAIG BOX: We talked about this a little bit when we were talking to Daniel from Crossplane and maybe talking about the idea of, if you have a managed service on the cloud provider and you're running somewhere that doesn't have that service, you may have to deploy something. So they may have a load balancer that exists, but you may have to run NGINX or HAProxy or something on an environment that doesn't have that as a service.

ALEX PALESANDRO: It was really a nice project because it also uses the abstraction of Kubernetes, so the APIs of Kubernetes and this word that is quite popular today and is available on all clusters on all cloud providers. It's a really good project, and I felt really, really fantastic to have lots more work. But see, I have the same idea implemented. And seeing a lot of people enthusiastic about that was really nice.

CRAIG BOX: After spending a few years in Paris, you've moved back to Turin. And we spoke to Gianluca in episode 142. And he pointed out that Turin is basically cloud city in Italy.

ALEX PALESANDRO: We are known as a car city because there were a lot of factories that produced cars back in the last century. Now, I think we are in a breaking point where there are more and more companies working with IT and cloud. We have also plenty of students that are really interested in cloud technologies that came to Turin to study them. We have courses about it.

So I think that maybe it's wishful thinking, but I really think that Turin is becoming, in a sense, cloud city or a city where cloud computing companies are present and they want to invest.

CRAIG BOX: Which car companies were based in Turin?

ALEX PALESANDRO: It was Fiat. Now, it is Stellantis. Before the '80s, there were two or three smaller ones, then the market consolidated quite a lot. So Fiat owns all the other ones.

CRAIG BOX: Tell me about the research group that you work in now.

ALEX PALESANDRO: I'm working in the DAUIN, the department at Politecnico. So this is a department of computer science mainly. And inside that, we are in a research group called Netgroup. The historical focus of that group is on computer networks, in particular network processing.

There is a lot of work, and also open source projects, around the eBPF world recently. So for example, Polycube was initiated there. Now, there is also a shift to more cloud native technologies, so Kubernetes.

There are also cloud computing courses at Politecnico that are done by our group. We also started another project besides Liqo called CrownLabs that is made by students. And it tries to provide a laboratory platform for students during the pandemic.

So the idea is that people can do the lab even if they are not at Politecnico. And they are using VMs on Kubernetes using KubeVirt, and noVNC to connect to them. So it's a simple platform, but it's completely built by students.

It is also operated by students. So it's kind of interesting for them to see which are the jobs that you can do around the cluster. One operates and maintains it. One builds applications in the Kubernetes-native way.

CRAIG BOX: Is that using nested virtualization?

ALEX PALESANDRO: No, I don't think so. We use a bare metal cluster. It was challenging. Because in the beginning, it was kind of a bet that we made.

We had all the ingredients. So we have Rook. We have MetalLB, the open source projects that cover what we have to do. So it's quite, also, fast.

CRAIG BOX: I love how we have the confluence of eBPF and bare metal all coming together in Turin again.

ALEX PALESANDRO: I've not really worked a lot on eBPF, so I cannot discuss it a lot. I know that they published some important papers last year on bare metal. So in the Netgroup, we have the intention of working with the real solid PC, the solid server, so to connect cables.

This is something that we really love. And even if it's not mandatory today, we ask the professor to buy us servers to play with. And we are playing, but we are also getting some interesting results also.

CRAIG BOX: Are you teaching yourself?

ALEX PALESANDRO: I'm doing more of the practical part for students. So there are those labs around Kubernetes, cloud computing, virtual machines, containers, Ansible, and all the main technologies that we have today.

CRAIG BOX: How do you balance the academic side of things and research on this platform versus the fact that this is obviously a marketable skill and that people want to learn things that they can then use when they go out into the real world?

ALEX PALESANDRO: There is the lab part where we have to be able to write code and make people learn how they can use cloud technologies. But there is also the fact that, in academia, if you do not build interesting projects in terms of engineering, it's hard to have an impact with your research.

So for example, you can run your algorithm, you can put in your intelligence, things that are really appreciated in papers. But you do not have the underlying layers to do good experimentation to show that your idea is really valid and worth publishing in a good venue. So I think that the balance is that what we try to do is to have a huge work on engineering as a basis to show how our ideas are interesting.

CRAIG BOX: On the Liqo home page, you say that the project started because you had plenty of powerful computers in your labs, but there were always some students complaining about their laptops not being powerful enough to run their jobs. That sounds to me like a quote that could have come out in the year 2000 when I was at university. We had tools like distcc for compiling. There's things like SETI@home where you can run workloads that are effectively modern big data workloads.

And now, of course, we have the whole Hadoop ecosystem. What kind of jobs were your students looking to run? And why did you build something new versus using one of those existing technologies?

ALEX PALESANDRO: The idea was really to let them run, for example, desktop applications inside using their resources, but also the resources that we have at Politecnico. So for example, we have this course in graphics. And they use a lot of Blender, and they have to render a lot of projects. And they struggle using their laptops.

And also, the solutions provided with virtualization are not powerful enough to cope with it. We would like to have something that is quite generic, and it also can work with containers and modern technologies, the modern way to package applications.

CRAIG BOX: You mentioned Blender there. Do 3D rendering packages like Blender have support for multiple backends? And if so, is that support built on top of containers today?

ALEX PALESANDRO: I have not really directly worked on this project. I know that in the CrownLabs project, there is ongoing work. From the tests I saw, they are able to run the Blender workers. And they have a master Blender, and the work is split among those workers. And they can tweak the amount of resources they consume with resource requests and limits on Kubernetes.

We also did a test using a 3D card GPU. It worked well. You had to configure a bit more what you are doing. So you have to use Docker with the GPU support, but it worked pretty fine.
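As a rough illustration of the requests and limits mentioned above, the sketch below builds the `resources` section of a Kubernetes container spec as a plain Python dictionary. The `requests` and `limits` fields and the `m` and `Mi` unit suffixes are standard Kubernetes conventions; the container name, the image reference, and the `worker_container` helper are hypothetical.

```python
def worker_container(cpu_request_m: int, cpu_limit_m: int,
                     mem_request_mi: int, mem_limit_mi: int) -> dict:
    """Build a container spec fragment for one render worker.

    CPU is given in millicores and memory in mebibytes. The scheduler
    places the pod based on requests; limits cap what it may consume.
    """
    if cpu_request_m > cpu_limit_m or mem_request_mi > mem_limit_mi:
        raise ValueError("requests must not exceed limits")
    return {
        "name": "blender-worker",                          # hypothetical name
        "image": "example.invalid/blender-worker:latest",  # placeholder image
        "resources": {
            "requests": {"cpu": f"{cpu_request_m}m",
                         "memory": f"{mem_request_mi}Mi"},
            "limits": {"cpu": f"{cpu_limit_m}m",
                       "memory": f"{mem_limit_mi}Mi"},
        },
    }


spec = worker_container(500, 2000, 512, 2048)
print(spec["resources"])
```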

CRAIG BOX: We mentioned a problem that the Polytechnic has in terms of resource sharing. Can you talk a little bit now about Liqo, which is your solution to that problem, and what it does?

ALEX PALESANDRO: The idea of Liqo was started from the fact that we have a lot of clusters today. The idea is that Kubernetes, as we were saying before, is the common denominator of the infrastructure. The scenario that we have is that we have a lot of clusters almost everywhere.

Sometimes it's useful to see them as a single cluster and so have the possibility to optimize the pod placement and application placement and the application flows, also in the presence of decentralized governance, so the fact that some resources can belong to the partner of one organization and other resources may belong to another organization. The idea is really to create a way to peer, as we mention on the website, clusters in order to create multi-cluster topologies to fulfill different tasks, for example, hybrid cloud bursting or disaster recovery, and so on.

CRAIG BOX: In the case where you have one owner or one university, in your case, who operates all the machines, you might just think that they should all be joined to one cluster. Why are they not?

ALEX PALESANDRO: For example, in Politecnico, we have several clusters that belong to different departments. So we have the computer science department, the electronics department, the IT department. And they are OK to say, when we are not using our resources, you can use them. But we'd prefer that you are not hanging on to our resources, or that you have a limited amount of resources on our cluster. And on the other hand, for our users, we would not want to have to say, OK, now you have to use that cluster. And so you have to change your kubeconfigs, change your configuration, and so on.

CRAIG BOX: Right.

ALEX PALESANDRO: So what we would like to have is a transparent expansion of our clusters. This is the situation that we would like to model. There are also other points that you may know, for example, from the network perspective, the Kubernetes scheduler does not take into account the latency.

So if you have nodes that are far away, they will be considered near. And so you can have strange behavior. Those were, more or less, the reason why we thought about Liqo in this way.

CRAIG BOX: What then is the process of joining these clusters together? You have a host cluster, and then you somehow need to teach it about the existence of other clusters and give it credentials to operate upon them?

ALEX PALESANDRO: The first problem that we thought to solve is about discovery. For example, we have a cluster. And in an organization, we may want to know which are the other clusters available.

And this is something that Liqo is able to handle using DNS, so SRV records like those used in the SIP protocol. Or we implemented another approach based on mDNS. For example, if I have two Raspberry Pis that are in the same network, with Liqo installed, they can discover each other and see across them.

This is the first point. When we have discovered a cluster, we have this protocol that establishes the administrative peering. So the idea is that one cluster says, I would like to know if you have resources that you can provide me. And the other says, OK, you received our resource offer.

And it says, OK, I can provide you with those resources. And if you are interested, you will accept and create the peering. This peering will have two effects.

The first one is to create a virtual node using a virtual Kubelet. So we implemented the provider for Kubernetes starting from the Microsoft project, which I think is now a CNCF project, Virtual Kubelet. And it creates, also, an overlay network that allows you to have pods communicating across the two clusters or from cluster A to cluster B, for example, trust pod.
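The offer-and-accept exchange described above can be modelled in a few lines of Python. This is a toy model under stated assumptions, not Liqo's actual API: the `Cluster` and `ResourceOffer` classes, their fields, and the virtual node naming are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ResourceOffer:
    cluster_id: str
    cpu_cores: int
    memory_gib: int


@dataclass
class Cluster:
    cluster_id: str
    spare_cpu: int
    spare_mem_gib: int
    nodes: list = field(default_factory=list)

    def make_offer(self) -> ResourceOffer:
        """Answer a peering request by advertising spare capacity."""
        return ResourceOffer(self.cluster_id, self.spare_cpu, self.spare_mem_gib)

    def accept(self, offer: ResourceOffer) -> str:
        """Accept an offer: the remote capacity shows up locally as one
        extra (virtual) node that the scheduler can place pods on."""
        name = f"virtual-node-{offer.cluster_id}"
        self.nodes.append({"name": name,
                           "cpu": offer.cpu_cores,
                           "memory_gib": offer.memory_gib})
        return name


home = Cluster("cluster-a", spare_cpu=0, spare_mem_gib=0)
remote = Cluster("cluster-b", spare_cpu=16, spare_mem_gib=64)

offer = remote.make_offer()      # "I can provide you with those resources"
node_name = home.accept(offer)   # peering is created
print(node_name, home.nodes)
```

The overlay network that lets pods talk across the two clusters is left out of this sketch.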

CRAIG BOX: You mentioned there mDNS networking, or Bonjour as they call it in the Apple world, the idea of these things being able to just discover themselves. Is there an implication that, if you're on the same LAN, then you are part of the same trust domain and these things can just join? Or is this some sort of process where the administrators have to publish out they're willing to share?

ALEX PALESANDRO: You can activate, enable, or disable the sharing. But if you want to enable it, we consider the LAN as a trust domain. But you also can enable a token that you have to provide offline to make the other cluster capable of peering with you. So the idea is that it's like TLS bootstrapping. It's a lot simpler and not as secret, but the idea behind it is the same.

CRAIG BOX: If the clusters trust each other by virtue of being on the same LAN, is there some form of identity on top of that? If I have user accounts on the first cluster, do they somehow replicate through to the subsequent clusters?

ALEX PALESANDRO: We have a cluster identity. So far, we are not focused on the users inside the cluster. So the identities are per cluster.

And they are expected to have a unique cluster ID. And this cluster ID is exchanged with the other one. So I know that it is that cluster, and that cluster is not another one.

So for example, there cannot be conflict on this part. There are many implications for security, so we are working on that. So in this peer to peer world, it's hard to establish who is who.

So what we are trying to avoid is that something fakes its identity to become another ID that you have somehow discovered. So this is what we are facing in the 0.3 version, which is the next version of Liqo. But what we think we should do in the next release is also to be capable of relying on third-party services that can provide you an identity, which is trusted.

CRAIG BOX: You've talked about how these clusters are peers to each other. And if we think about how peering works in an internet network sense, we used the BGP protocol, which, again, is famous for sort of being very open in terms of trust and only recently has had to have identity put on top of that. When you have these networks that you connect together, they may well have the same network address space. So you'll need to do some sort of translation between them. How does the overlay system deal with translation between networks?

ALEX PALESANDRO: This was one of the design principles that we thought about at the beginning. So the idea is that those two clusters may not know anything about each other. So they have been created at different times. And they have different CNIs, for example. But also, they may have conflicting [INAUDIBLE].

So what we implemented in the Liqo networking is a double NAT mechanism that allows you to talk to another pod on the other cluster using a NATted IP. Obviously, you will not discover this pod. You can discover it manually, but also the interesting part is that this translation is inserted in what we call the service replication, so the fact that the endpoints of a cluster are replicated on the other one in order to make those endpoints discoverable on the second cluster and so make services available also on the cluster that is on the other side.
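To make the two-way address translation more concrete, here is a minimal Python sketch using the standard `ipaddress` module. It assumes both clusters use the same pod range and that each side sees the other through a separate, non-conflicting range; the specific ranges and the `remap` and `a_to_b` helpers are assumptions for illustration and do not reflect Liqo's implementation.

```python
import ipaddress

POD_CIDR = "10.244.0.0/16"          # assumed: used by BOTH clusters (the conflict)
B_AS_SEEN_FROM_A = "10.250.0.0/16"  # assumed range A uses to address B's pods
A_AS_SEEN_FROM_B = "10.251.0.0/16"  # assumed range B uses to address A's pods


def remap(ip: str, original: str, translated: str) -> str:
    """Move an address from one network to another, keeping its offset."""
    src = ipaddress.ip_network(original)
    dst = ipaddress.ip_network(translated)
    addr = ipaddress.ip_address(ip)
    if addr not in src:
        raise ValueError(f"{ip} is not in {original}")
    if src.prefixlen != dst.prefixlen:
        raise ValueError("networks must have the same size")
    offset = int(addr) - int(src.network_address)
    return str(dst.network_address + offset)


def a_to_b(src_ip: str, dst_ip: str) -> tuple:
    """Translate a packet travelling from cluster A to cluster B.

    Source NAT: A's real pod IP becomes the address B uses for A.
    Destination NAT: the remapped address becomes B's real pod IP.
    """
    return (remap(src_ip, POD_CIDR, A_AS_SEEN_FROM_B),
            remap(dst_ip, B_AS_SEEN_FROM_A, POD_CIDR))


# A pod in A (10.244.1.7) talks to a pod in B whose real IP is 10.244.3.9,
# which A addresses through the remapped range as 10.250.3.9.
print(a_to_b("10.244.1.7", "10.250.3.9"))
```

Replicated endpoints would carry the remapped address (here `10.250.3.9`), which is how a service in one cluster can point at pods in the other.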

CRAIG BOX: This sounds like a problem that IPv6 was designed to fix, albeit maybe designed 25 years ago. When Kubernetes supports IPv6, do you think that some of that complexity can go away?

ALEX PALESANDRO: Maybe. So far, we have made the assumption that IPv6 is not adopted a lot, and so we stuck to IPv4. IPv4 also has another important concern about the scalability of this model. So if we think of peering multiple clusters, you may easily run out of private IPs. So what do you do?

For that, we think that IPv6 may be a solution, if we see a bit more adoption by the community and system administrators. Because it seems to me always that IPv6 scares people. I personally love it. I don't understand why, but yeah.
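The scalability concern about private IPv4 space can be checked with some quick arithmetic. The sketch below counts how many non-overlapping blocks fit in the three RFC 1918 private ranges. The assumption that every peered cluster needs two dedicated /16 blocks (one for pods, one for services) is only illustrative.

```python
import ipaddress

# The three private IPv4 ranges defined by RFC 1918.
PRIVATE = [ipaddress.ip_network(n)
           for n in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")]


def blocks_available(prefixlen: int) -> int:
    """Count non-overlapping blocks of the given size in private space."""
    return sum(2 ** (prefixlen - net.prefixlen)
               for net in PRIVATE if net.prefixlen <= prefixlen)


total_16s = blocks_available(16)
# Assumption: each cluster needs one /16 for pods and one /16 for services.
max_clusters = total_16s // 2

print(f"/16 blocks in private IPv4 space: {total_16s}")
print(f"clusters with two dedicated /16 blocks each: {max_clusters}")
```

Under that assumption the ceiling is on the order of a hundred clusters before any address must be reused, which is the pressure that makes IPv6 attractive here.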

CRAIG BOX: I moved to London just over 12 years ago. And one of the first things that I did was went to an Internet Society event that Vint Cerf had come over to present at. And he said, IPv6 is just around the corner, and you'll have it tomorrow and gave me some stickers, and so on. And we're still waiting.

ALEX PALESANDRO: I remember when I was a student and the professor said, IPv6 is there, but his slide was dated in the '90s. So will it really be tomorrow, or do we have to wait for a few more years? But, yeah--

CRAIG BOX: Is there a lot of Linux on the desktop in your research group?

ALEX PALESANDRO: All of the people working on Liqo have Linux on their desktop.

CRAIG BOX: Well, there you go. If it's the year of Linux on the desktop, then we can't have IPv6 yet. You mentioned they're running a service with pods on different clusters. I was going to ask, is this used for hosting or real-time work? Or is it predominantly used for batch jobs, things like the Blender workloads you talked about before?

ALEX PALESANDRO: We did a demo with a Google microservices application. So the idea is that we have this private cluster with a lot of microservices that, at a certain point, runs out of resources. And the pods are moved to a remote cluster on the cloud.

And the idea is that services should continue to work. And requests have to be balanced across different endpoints. The idea is that it has to be suitable also for real application.

And for the reflection of endpoints, we started doing it with normal Endpoints resources, which is very complex, because you have only one resource per service. But with EndpointSlices, you may have multiple resources to provide endpoints for the same service. And so it became easier to do that and faster also.
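The difference between a single endpoints object and several slices can be illustrated with a small Python sketch. It assumes a cap of 100 endpoints per slice, which matches the Kubernetes default for EndpointSlices; the `to_slices` helper and the sample addresses are hypothetical.

```python
def to_slices(endpoints: list, max_per_slice: int = 100) -> list:
    """Split one flat endpoint list into fixed-size slices."""
    return [endpoints[i:i + max_per_slice]
            for i in range(0, len(endpoints), max_per_slice)]


# 250 made-up pod addresses backing one service.
endpoints = [f"10.244.{i // 250}.{i % 250 + 1}" for i in range(250)]
slices = to_slices(endpoints)
print([len(s) for s in slices])

# With a single object, any change rewrites all 250 entries.
# With slices, a change to the last endpoint only touches the last slice.
changed_index = 249
touched_slice = changed_index // 100
print(f"endpoint {changed_index} lives in slice {touched_slice}, "
      f"which holds {len(slices[touched_slice])} entries")
```

Smaller objects mean smaller updates to replicate to the other cluster, which is consistent with the "easier and faster" observation above.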

CRAIG BOX: How does your virtual Kubelet node know what the state of these clusters are? Is it only when it's done its discovery? Or if things change in the remote clusters, are they kept up to date?

ALEX PALESANDRO: We basically watch the pods we schedule on the remote cluster. And when there is a change, the changes are reflected on the local cluster.

CRAIG BOX: And how does that reflection process work? Is that an agent that's running in the remote cluster that has permission to call back into the API server of the host?

ALEX PALESANDRO: The virtual Kubelet has informers connected to the remote API server, so it is agent-less, in a sense, on the remote cluster. The reason we adopted that is that we want to keep the resources consumed on the remote cluster as low as possible. So we just want to consume resources when we have deployed something.

So we have those kinds of routines that watch the objects and update them locally. This is done for pods we scheduled, for config maps, and services, and other objects. The point of having a virtual Kubelet was that we create this virtual node, which is completely transparent for the Kubernetes control plane.

It works smoothly with the basic virtual scheduler. So we have not changed a bit of it. And we also can have this dynamic placement of pods in the cluster because it sees it has an extra node of cluster. So it simply attaches a pod to them.
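The watch-and-reflect routine Alex describes can be illustrated with a minimal Python sketch. It is not Liqo's implementation: the `reflect` function and the pod names are hypothetical, and the event list stands in for what an informer would deliver from the remote API server. The `ADDED`, `MODIFIED`, and `DELETED` event types are the ones Kubernetes watches use.

```python
from typing import Dict, Iterable, Tuple

# (event type, object name, object state); a stand-in for informer notifications.
Event = Tuple[str, str, dict]


def reflect(events: Iterable[Event], local_store: Dict[str, dict]) -> Dict[str, dict]:
    """Apply remote watch events to a local copy of the objects."""
    for kind, name, obj in events:
        if kind in ("ADDED", "MODIFIED"):
            local_store[name] = dict(obj)  # keep the latest remote state
        elif kind == "DELETED":
            local_store.pop(name, None)    # drop objects that are gone remotely
        else:
            raise ValueError(f"unknown event type: {kind}")
    return local_store


# Hypothetical stream of changes observed on the remote cluster.
remote_events = [
    ("ADDED", "web-1", {"phase": "Pending"}),
    ("MODIFIED", "web-1", {"phase": "Running"}),
    ("ADDED", "web-2", {"phase": "Running"}),
    ("DELETED", "web-2", {}),
]

local_view = reflect(remote_events, {})
print(local_view)
```

Nothing in the sketch runs on the remote side; the local process only consumes events, which matches the agent-less design described above.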

CRAIG BOX: Does this qualify as federation? What have you learned from previous attempts in Kubernetes that have used that scary word?

ALEX PALESANDRO: So I think that you can create some sort of federation. If you think, for example, to have one cluster that is peered with plenty of other ones-- that is the center of your star, for example-- you can create a topology like that. But it's not mandatory.

When we started looking at the federation options that we had in Kubernetes, for example, kubefed, what we found a bit difficult to adopt is the fact that they ask you to change the way your workloads are defined. So you have these federated Deployments, which are really, really expressive, but you have to tell your users to change the way they are doing their work. And it's not really easy to do sometimes.

On the other hand, the virtual Kubelet approach allows you to spread your pods in a more dynamic way, I would say. So you have not to say, OK, this is deployment for cluster A and the deployment for cluster B. But you have a deployment. And then it goes on cluster A and goes on cluster B, then equally and transparently.
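A toy Python sketch of that idea: one deployment, one pool of nodes, and the remote cluster appearing as just another node. The `place_replicas` function, the node names, and the free-slot counts are all hypothetical, and the greedy fill is a simplification; the real Kubernetes scheduler filters and scores nodes.

```python
from typing import Dict


def place_replicas(replicas: int, free_slots: Dict[str, int]) -> Dict[str, int]:
    """Greedily assign replicas to nodes in order until none remain."""
    placement: Dict[str, int] = {}
    remaining = replicas
    for node, slots in free_slots.items():
        if remaining == 0:
            break
        take = min(slots, remaining)
        if take > 0:
            placement[node] = take
            remaining -= take
    if remaining > 0:
        raise RuntimeError(f"{remaining} replicas could not be placed")
    return placement


# Hypothetical cluster: two local workers plus a virtual node backed by a remote cluster.
nodes = {"worker-1": 2, "worker-2": 1, "virtual-node-cluster-b": 10}

placement = place_replicas(5, nodes)
print(placement)
```

The deployment itself never names a cluster. The overflow lands on the virtual node only because it has spare capacity.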

CRAIG BOX: We've talked about how this works in effectively a single-trust domain being the LAN of your university. Do you see applications for Liqo in commercial multi-tenant situations?

ALEX PALESANDRO: For the moment, we thought about the single organization because Liqo does not provide any isolation in terms of tenants. It's on our roadmap to use a third party project to have this isolation and to protect a bit the cluster. Hopefully, we will reach a level of isolation where this can be interesting. So we can say, OK, I rent a part of my cluster to someone else. For the moment, I think we are not mature enough for a scenario like that, but it may be possible actually.

CRAIG BOX: Another thing on your roadmap is support for Amazon's Elastic Kubernetes Service. You currently support GKE and Microsoft AKS. What is it that needs to change for supporting extra providers? What is it above standard vanilla Kubernetes that needs to be used here?

ALEX PALESANDRO: The main problem is related to networks. So every provider has his way to implement Kubernetes networking. This is true also for on-premise cluster. So you can have multiple CNIs, which have many, many different behaviors.

So we have a different behavior from GKE and AKS. So we have to deal with it. It requires us a bit more work.

And you have also slight differences concerning the control plane, which is something that is provided as a service in managed clusters. For example, the CSR API is slightly different. So it does not allow you to have certain certificates.

For example, if you want to use a virtual Kubelet with logs and exec commands, you have to let your API server connect to the virtual Kubelet. But you need a specific certificate for that, and some providers do not let you have this kind of certificate. So it's ongoing, but yeah, a bit more complex than the others.

CRAIG BOX: And what other things are on your roadmap? Are they defined by the Polytechnic's use case, or are they from, now, external community requests?

ALEX PALESANDRO: We are building an external community to feed our roadmap. This is the idea. So for the moment, we had more internal use cases because they were the only one that we had.

So we have one meeting on Monday 6:30 Central Europe time, so 18:30. We try to talk with other people with the organization, ask for new interesting problems to solve that can be solved in Liqo. So, yeah, we are a bit open.

So now, 02, we showed that the Liqo approach works. So we would like, now, to make it work for more use cases and find who is interested in this economy.

CRAIG BOX: Great. Well, thank you very much for joining us today, Alex.

ALEX PALESANDRO: Thank you very much.

CRAIG BOX: You can find Alex on Twitter at @palexster. How do you pronounce that?

ALEX PALESANDRO: In Italian, it was Pa-lex-er. And the reason why I chose this is because my usual nickname at the time was already taken on Gmail, so it proposed to me Palexster. I like it.

CRAIG BOX: Brilliant. And you can find Liqo at liqo.io.

[MUSIC PLAYING]

CRAIG BOX: Patrick, thank you very much for helping us out with the show today.

PATRICK FLYNN: You're welcome. Thank you for having me.

CRAIG BOX: If you've enjoyed the show, please help us spread the word and tell a friend. If you have any feedback for us, you can find us on Twitter at @kubernetespod or reach us by email at kubernetespodcast@google.com.

PATRICK FLYNN: You can also check out the website at kubernetespodcast.com, where you will find transcripts and show notes as well as links to subscribe.

CRAIG BOX: I'll be back and at KubeCon next week. So tap me on the shoulder if you see me. Until then, thanks for listening.

[MUSIC PLAYING]