WEBVTT

00:00.000 --> 00:15.160
Okay, so welcome everybody, thanks for having us, to this talk about Kubewarden and how it's used,

00:15.160 --> 00:18.000
let's say in the platform engineering team.

00:18.000 --> 00:23.480
So first of all, little introduction, myself, I'm Nina, and today, together with my colleague

00:23.480 --> 00:29.560
Thomas; we are both platform engineers at SUSE. We're going through

00:29.560 --> 00:36.680
a little agenda of topics for today: we're going to talk about the challenges that may arise

00:36.680 --> 00:42.280
while managing an internal development platform, and we'll go into that in a minute, and

00:42.280 --> 00:51.000
how to address those challenges, and how a platform team can enforce strict regulations on multiple

00:51.000 --> 00:56.680
projects on a variety of customers, we'll see what this means in a really little time.

00:56.680 --> 01:02.760
So let's start with something simple, so I hope everybody is familiar with this command,

01:02.760 --> 01:08.600
but anyway, this is a simple kubectl run command. But what will it do? It will spawn

01:08.600 --> 01:16.080
an nginx pod, and we will be doing this while specifying just two option flags: the image

01:16.080 --> 01:24.280
and the privileged flag. And as we will see in a minute, here we are specifying the registry, the tag,

01:24.280 --> 01:32.280
and most importantly, the privileged option, which has some security concerns due to the fact

01:32.280 --> 01:39.720
that it actually has some impact on the host where this containerized image is running.

01:39.720 --> 01:46.760
So here's the thing: even if it is dangerous, this is a valid command, and as we saw a

01:46.840 --> 01:53.400
minute ago, it actually does succeed and the pod exists and starts up. And this is

01:53.400 --> 01:59.400
because Kubernetes, as long as the changes are valid and correct, will

01:59.400 --> 02:05.640
happily accept them, and they will be persisted, even if those may have

02:05.640 --> 02:12.920
consequences or may be dangerous for our environments. So the question arises: what's necessary

02:12.920 --> 02:18.840
to protect our environments, what can we do? To do this, we will need to enforce

02:18.840 --> 02:25.880
guardrails. So let's briefly go back and review the command that we saw just a minute ago.
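
For reference, the command being reviewed might be reconstructed roughly like this; the repository path after the registry host is a placeholder, since only the registry host, the latest tag, and the privileged flag are stated in the talk:

```shell
# Sketch of the kubectl run command under discussion; the image path
# after registry.suse.com is an assumption, not the exact slide content.
kubectl run nginx \
  --image=registry.suse.com/nginx:latest \
  --privileged
```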

02:25.880 --> 02:32.840
So here we can ask ourselves a few questions, and the first question that we can ask ourselves is

02:32.840 --> 02:40.600
about the registry. So do we want to allow registry.suse.com as a valid OCI registry?

02:40.680 --> 02:49.080
Then, next up, we specified latest as the tag for the pod's image. So is this something that we

02:49.800 --> 02:56.920
consider valid and something that we want to allow in our environments? And last but not

02:56.920 --> 03:03.160
certainly not least, as I was saying, do we really want to accept privileged pods in our environment?

03:03.160 --> 03:10.200
So those are all questions that we can ask ourselves with this very tiny and simple example.

03:10.520 --> 03:16.120
But we can go further than that. There can be many more scenarios. So, for

03:16.120 --> 03:24.600
instance: is the image that we want to use cryptographically signed? Should the pod have, or not

03:24.600 --> 03:31.560
have, certain labels? And, for instance, regarding PCI devices, should we allow the pod to

03:31.640 --> 03:40.840
access the host's GPUs? These and many more questions are valid, and we will get to this in a

03:40.840 --> 03:49.400
minute. So this kind of highlights very simply why we need guardrails. And the reason is that

03:49.400 --> 03:55.800
even if the changes are valid, we want to reject the configuration changes that we deem

03:55.800 --> 04:02.920
as bad, before they actually persist. The reason is that if we don't do that,

04:02.920 --> 04:11.240
our environments can become insecure, and this is definitely something we don't want. And as

04:11.240 --> 04:18.680
well, managing a lot of code manually, YAML code for those changes, is not feasible, and

04:18.680 --> 04:27.800
especially impossible at scale. So this kind of requires automation in the mix, and automation

04:27.800 --> 04:36.520
possibly as code; we call that policy as code. So the core component that we will be leveraging

04:36.520 --> 04:45.080
to achieve this is the Kubernetes admission controller. And what it is: it is a piece of code

04:45.080 --> 04:52.040
that intercepts the requests coming in, and once they are authenticated and

04:52.040 --> 05:01.080
authorized, it can validate or mutate them, and it does that

05:01.640 --> 05:09.080
via the related webhooks. So for the sake of this presentation, we will just be considering

05:09.080 --> 05:16.760
the validating webhooks. So in our environments, we will be leveraging multiple admission

05:16.760 --> 05:22.920
controllers, and all those controllers together will make sure to allow, change or reject

05:22.920 --> 05:28.920
the changes. And this is because we will be leveraging multiple policies that may be small,

05:28.920 --> 05:35.960
but together, and isolated from each other, they will allow us to maintain security, enforcing the guidelines

05:35.960 --> 05:42.360
and the governance that we need. So we spoke about our internal platform, but

05:42.360 --> 05:50.200
why do we do so? We are a platform team. We need to provide tools and services and enable our

05:50.200 --> 05:59.320
customers to spawn their workloads and do their tasks on different clusters. And at a very high level,

05:59.320 --> 06:04.920
the internal platform, the internal development platform that we are using is composed of

06:04.920 --> 06:12.280
those three parts. At the center, we have Rancher, which is our open source container management platform

06:12.280 --> 06:19.480
of choice, and then we have the downstream RKE2 clusters.

06:19.480 --> 06:26.360
RKE2 is the Kubernetes distribution which is supported out of the box by Rancher, and which

06:26.440 --> 06:31.400
we chose because it's lightweight and secure. And then we have the HCI, hyperconverged

06:31.400 --> 06:37.240
infrastructure, in the form of Harvester, because with it we can

06:37.240 --> 06:46.760
actually manage compute, networking and storage as native Kubernetes objects. And most importantly,

06:46.760 --> 06:53.800
in both the RKE2 clusters and also on the Harvester HCI solution, we are running the Kubewarden policy

06:53.880 --> 07:02.760
engine, which is what we will be going into detail on in a few more slides. So now we get to the point:

07:02.760 --> 07:08.680
what is Kubewarden and how can it help us? First of all, it has to be said that Kubewarden is a

07:08.680 --> 07:18.680
CNCF policy engine, and it allows us to use what we already know. We don't need to learn

07:18.680 --> 07:24.280
a DSL or a new programming language or something different to write the policies that we will

07:24.280 --> 07:32.920
be using, and those policies will be turned into WebAssembly

07:33.480 --> 07:41.160
as artifacts, and those artifacts will also be able to be embedded, for instance, in CI/CD pipelines

07:41.160 --> 07:48.520
and such. So from now on, I would like to pass the microphone to my colleague.

07:50.200 --> 07:54.920
Hello. Now I would like to talk to you about platform engineering at SUSE and provide a few

07:54.920 --> 08:00.520
examples of some policies that we've recently written. So let's first talk about our customers.

08:00.520 --> 08:05.080
Our platform is the primary way for teams at SUSE to run their workloads, and with us they run

08:05.080 --> 08:10.520
production services, they perform research and development, and they also give sales

08:10.520 --> 08:16.760
demos to prospective customers. And we need to enable our customers but at the same time,

08:16.760 --> 08:20.840
we need to protect them from each other. This is because we don't have unlimited resources.

08:21.400 --> 08:27.960
So we can't give everything to one team. They shouldn't be able to do everything. So we need to

08:27.960 --> 08:33.160
have some level of permissions, and some environments are more protected than others, for example

08:33.160 --> 08:40.760
certifications, where we actually run Common Criteria. But there will always be a new feature

08:40.760 --> 08:46.200
or technology that our customers need, and they need it right away. So let's go over a few examples.

08:47.400 --> 08:54.520
Hey platform team, I need GPUs. For some reason, multiple teams demanded GPUs last year.

08:54.520 --> 08:58.840
The problem is that everyone thinks that they need one and they need the most powerful one possible.

08:59.080 --> 09:03.080
And this is usually because they don't know what resources they actually need to run their

09:03.080 --> 09:07.640
workloads yet. And GPUs are scarce, so we can't hand them out like candy.

09:09.480 --> 09:15.720
And we run our GPUs in Harvester, where we split a physical GPU into multiple vGPUs

09:15.720 --> 09:20.680
and attach in the virtual machines. And each VGPU has unique name. For example, here with the

09:20.680 --> 09:26.360
Tecton 27A, with the number at the end. And the problem is that anyone who knows that unique name can

09:26.360 --> 09:31.320
attach it to their VM. And here I'd like to stress that valid YAML is not the same as a valid

09:31.320 --> 09:36.520
change. Just because you know the name of that unique VGPU doesn't mean you should be able to

09:36.520 --> 09:42.280
attach it to your VM. So we need something more. Since Kubernetes doesn't have built-in controls for these

09:42.280 --> 09:47.960
GPUs, we're going to just build a Kubewarden policy for virtual machines instead. And our goal here is

09:47.960 --> 09:53.960
that Kubernetes will ask our policy whenever a change to a virtual machine happens. So let's look

09:54.120 --> 09:59.240
at an example request. We can see that it's for a virtual machine. We can see the Kubernetes

09:59.240 --> 10:06.680
namespace. And we can see the unique vGPU name. So how should we react? If we create a list of

10:06.680 --> 10:12.120
allowed namespace and vGPU bindings, we can get the namespace and the vGPU from the request.

10:13.160 --> 10:18.120
We can then check if the pair is in the list. And our specification is:

10:18.680 --> 10:23.560
if a vGPU and a namespace are bound, we will accept it. But if they're not, we'll deny it.

10:24.920 --> 10:29.560
So here's some example code where we're checking if the GPU is allowed. And the first thing

10:29.560 --> 10:35.000
that we do is we iterate over the list of bindings. And then we check if the GPU and the namespace

10:35.000 --> 10:41.560
are bound, and we return true if they are. And we return false if there's no match. And here's a

10:41.560 --> 10:46.120
level up, where we're actually calling that function. So the first thing that we do is get the

10:46.120 --> 10:51.320
virtual machine from the request. And then we get the namespace and the GPUs from the virtual

10:51.400 --> 10:59.000
machine, and we iterate over them. And after that, we check if each GPU is allowed.

10:59.000 --> 11:03.640
And if not, we reject the request. And if it wasn't rejected yet, then we allow it.
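
The two levels just described can be sketched in Go, the language the team mentions using for policies. This is an illustrative standalone sketch, not the actual policy code; the type, function and device names are hypothetical:

```go
package main

import "fmt"

// Binding pairs a Kubernetes namespace with a vGPU device name. In the
// real Kubewarden policy these bindings come from the policy settings;
// the names used below are made up for illustration.
type Binding struct {
	Namespace string
	VGPU      string
}

// isGPUAllowed reports whether the namespace/vGPU pair appears in the
// list of allowed bindings: true on a match, false when there is none.
func isGPUAllowed(namespace, vgpu string, bindings []Binding) bool {
	for _, b := range bindings {
		if b.Namespace == namespace && b.VGPU == vgpu {
			return true
		}
	}
	return false
}

// validateVM mirrors the top-level flow: take the namespace and the
// vGPUs requested by the virtual machine, and reject on the first vGPU
// that is not bound to that namespace.
func validateVM(namespace string, vgpus []string, bindings []Binding) (bool, string) {
	for _, g := range vgpus {
		if !isGPUAllowed(namespace, g, bindings) {
			return false, fmt.Sprintf("vGPU %q is not bound to namespace %q", g, namespace)
		}
	}
	return true, ""
}

func main() {
	bindings := []Binding{{Namespace: "team-a", VGPU: "nvidia-a2-0"}}
	// team-b asks for a vGPU bound to team-a, so this is rejected.
	ok, msg := validateVM("team-b", []string{"nvidia-a2-0"}, bindings)
	fmt.Println(ok, msg)
}
```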

11:04.840 --> 11:08.120
And here we can see where we're setting up the policy. We're creating a

11:08.120 --> 11:13.640
ClusterAdmissionPolicy resource. And I'd like to draw special attention to the settings. We're

11:13.640 --> 11:19.400
setting the namespace-device bindings. And anytime we want to add a new GPU, we would have to

11:19.480 --> 11:27.960
update it here. And here we can see it in action. We're trying to add a GPU to a virtual machine.
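
A ClusterAdmissionPolicy like the one described might look roughly like this; the module URL, rule scope, and settings keys are illustrative assumptions, not the exact values from the slides:

```yaml
apiVersion: policies.kubewarden.io/v1
kind: ClusterAdmissionPolicy
metadata:
  name: vgpu-bindings
spec:
  # Hypothetical policy module; not the actual artifact used at SUSE.
  module: registry://ghcr.io/example/vgpu-policy:v0.1.0
  rules:
    - apiGroups: ["kubevirt.io"]
      apiVersions: ["v1"]
      resources: ["virtualmachines"]
      operations: ["CREATE", "UPDATE"]
  mutating: false
  settings:
    # Any new vGPU has to be added to these bindings.
    namespaceDeviceBindings:
      team-a:
        - nvidia-a2-0
```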

11:28.520 --> 11:36.200
And because it's not bound, Kubewarden denies the request, and we can see the red error message at

11:36.200 --> 11:44.360
the top. So, for a second example: hey platform team, restrict that network. Because

11:44.360 --> 11:48.440
Harvester is self-service, anyone can create a virtual machine network in Harvester.

11:49.000 --> 11:54.200
The problem is that if you know the VLAN, you can recreate it in another namespace. And some of

11:54.200 --> 11:59.640
our networks are very secure, like for common criteria. And we need to ensure that they can't be

11:59.640 --> 12:04.760
recreated. And while we can hide this information, it's better to prevent the change from happening in the

12:04.760 --> 12:12.440
first place. So how should we react? If we have a list of allowed VLAN and namespace bindings,

12:12.520 --> 12:19.080
we can get the namespace and the VLAN from the request. Then we can check the pair against our list of bindings,

12:19.080 --> 12:23.480
and the specification would be: if the VLAN and the namespace are bound, we accept it.

12:24.200 --> 12:30.280
If they're not bound together, we deny it. And if neither the VLAN nor the namespace appears in the bindings, we accept it.

12:30.680 --> 12:37.400
And this is to allow for unrestricted VLANs. So if we don't care, and there are no protections for the

12:37.400 --> 12:43.800
VLAN or the namespace, then we just allow it by default. So here's a look at some code.
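
The three-way rule just stated could be sketched in Go like this; again an illustrative sketch with hypothetical names, not the actual policy source:

```go
package main

import "fmt"

// VlanBinding pairs a namespace with a VLAN ID; in the real policy
// these would come from the ClusterAdmissionPolicy settings.
type VlanBinding struct {
	Namespace string
	VlanID    int
}

// vlanAllowed implements the rule from the talk:
//   - the namespace/VLAN pair is explicitly bound            -> accept
//   - the VLAN or namespace is bound, but not as this pair   -> deny
//   - neither appears in the bindings (unrestricted)         -> accept
func vlanAllowed(namespace string, vlan int, bindings []VlanBinding) bool {
	protected := false
	for _, b := range bindings {
		if b.Namespace == namespace && b.VlanID == vlan {
			return true // explicitly bound together
		}
		if b.Namespace == namespace || b.VlanID == vlan {
			protected = true // one side is restricted elsewhere
		}
	}
	return !protected // unrestricted VLANs are allowed by default
}

func main() {
	bindings := []VlanBinding{{Namespace: "common-criteria", VlanID: 100}}
	fmt.Println(vlanAllowed("common-criteria", 100, bindings)) // bound pair: accepted
	fmt.Println(vlanAllowed("sandbox", 100, bindings))         // protected VLAN: denied
	fmt.Println(vlanAllowed("sandbox", 200, bindings))         // unrestricted VLAN: accepted
}
```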

12:43.800 --> 12:49.080
Here we're calling this function. So the first thing that we do is iterate over the bindings.

12:50.520 --> 12:56.520
We verify that the namespace and network are bound. And at the end, we return true if they're

12:56.520 --> 13:02.920
allowed, and false if they're not. And like the previous policy, we have to set up this

13:02.920 --> 13:08.920
ClusterAdmissionPolicy resource. And the same thing with the settings, where we have the namespace-VLAN bindings.

13:08.920 --> 13:13.000
If we ever want to add a new namespace or VLAN, we have to update it here.

13:15.000 --> 13:19.000
And here we can see it in action, where we're trying to create a Harvester virtual machine network.

13:20.200 --> 13:22.840
But the VLAN is protected, so we can see that it's denied.

13:25.080 --> 13:29.560
For our third request: hey platform team, restrict those MIG partitions.

13:29.640 --> 13:37.640
MIG partitions are isolated instances of a physical GPU, and they're very useful for us to

13:37.640 --> 13:43.400
give our users access to GPUs in Kubernetes. The problem is that we have a limited supply

13:43.400 --> 13:49.000
of them and some of them are more powerful than others. So we need to control which

13:49.000 --> 13:55.400
MIG partitions we give to which teams. So let's look at an example request. We can see that

13:55.400 --> 14:00.680
it's for a pod. We can see the Kubernetes namespace, and we can see the MIG partition name.

14:01.720 --> 14:07.880
But instead of requesting a unique device, we're requesting a resource and that comes with a few

14:08.680 --> 14:16.440
problems. So what's different with resources? Well, a MIG partition name like nvidia.com/mig-1g.12gb

14:16.440 --> 14:22.200
is not unique to one specific device. And here we're actually requesting a certain

14:22.280 --> 14:27.160
amount of them. So we can request one of them. We can request two of them or we can request none of them.

14:28.600 --> 14:33.080
And this raises a problem for us, which is: how do we keep track of the MIG partitions that a

14:33.080 --> 14:38.680
namespace already has? This is different from our previous examples, because there we had all the

14:38.680 --> 14:44.680
information that we needed in the request itself. So we had the Kubernetes namespace. We had the VLAN

14:45.320 --> 14:51.320
and we had the device name. But now we have to keep track of a count. Luckily, Kubernetes has

14:51.320 --> 15:00.120
something built in, which is the ResourceQuota. And this is used to limit the amount of resources

15:00.120 --> 15:05.880
that a namespace can consume. And this is usually used for the CPU or for memory. And we can

15:05.880 --> 15:12.120
use it to keep track of the MIG partitions. But it only keeps track of resources that it knows about.

15:12.680 --> 15:17.400
So if the MIG partition is in the quota, it limits it. But if it's not, it's allowed without limits.
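
As a sketch, a ResourceQuota covering one MIG profile might look like this; the namespace, profile name, and count are examples, not values from the talk:

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: mig-partitions
  namespace: team-a
spec:
  hard:
    # At most two MIG partitions of this (example) profile in team-a.
    requests.nvidia.com/mig-1g.12gb: "2"
```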

15:18.360 --> 15:24.120
So how would this scale? If we use it as is right now, we would have to add the MIG partition

15:24.120 --> 15:30.280
to every ResourceQuota in every namespace anytime we add a new MIG partition. And if we

15:30.280 --> 15:35.240
forget a namespace, then that MIG partition is allowed without limits. So this is where the

15:35.240 --> 15:39.720
Kubewarden context-aware policies come in. And they allow us to reach out to the Kubernetes

15:39.720 --> 15:45.240
API and retrieve other resources. So in our case, we would actually want to get the resource

15:46.200 --> 15:52.360
quota. And our goal would be to deny a request if it's requesting a MIG partition but it's not

15:52.360 --> 15:59.160
in the ResourceQuota. So here's an example where we're trying to do that. What we're doing

15:59.160 --> 16:06.440
is requesting all the ResourceQuotas in a namespace and returning them. And then here's where we're

16:06.440 --> 16:12.440
calling that function. So we're trying to get the list of MIG partitions from the namespace,

16:12.520 --> 16:17.480
iterate over it, and then validate that each MIG partition is within the ResourceQuota.
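
The validation step can be sketched in Go. In the real context-aware policy, the quotas are fetched from the Kubernetes API through Kubewarden's host capabilities; here a plain map stands in for the resource names found in the quota's hard limits, and the profile names are examples:

```go
package main

import (
	"fmt"
	"strings"
)

// quotaResources stands in for the set of resource names appearing in
// the namespace's ResourceQuota "hard" limits; in the real policy this
// set would be built from quotas fetched via the Kubernetes API.
func migsCovered(requested map[string]int64, quotaResources map[string]bool) (bool, string) {
	for name := range requested {
		// Only MIG-style extended resources are checked; other
		// resources (cpu, memory, ...) are left to the quota itself.
		if !strings.Contains(name, "mig-") {
			continue
		}
		if !quotaResources[name] {
			return false, fmt.Sprintf("MIG partition %q is not covered by a ResourceQuota", name)
		}
	}
	return true, ""
}

func main() {
	quota := map[string]bool{"nvidia.com/mig-1g.12gb": true}
	// This profile is missing from the quota, so the pod is denied.
	ok, msg := migsCovered(map[string]int64{"nvidia.com/mig-2g.24gb": 1}, quota)
	fmt.Println(ok, msg)
}
```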

16:19.480 --> 16:22.760
And the best thing about this is that we don't need to add any settings for the policy.

16:23.400 --> 16:27.880
Everything's actually controlled by the ResourceQuotas. So we just have to update it there.

16:29.160 --> 16:34.840
And here we can see it in action, where we're trying to create an Ollama pod and use a MIG

16:34.840 --> 16:40.040
partition that isn't bound to the namespace. So we can see that Kubernetes denies the request.

16:41.000 --> 16:46.680
So, the takeaway. As platform engineers, we can translate business requirements into isolated

16:46.680 --> 16:51.080
Kubewarden policies. And the best thing about this is that we can program them in languages

16:51.080 --> 16:58.200
that we're already familiar with: Go, Rust, and so on. And there's already a vast

16:58.200 --> 17:03.480
library of existing policies out there. You can discover them on Artifact Hub. And this is where

17:03.480 --> 17:09.320
we publish our policies. And you can also find policies that are maintained by the Kubewarden

17:09.320 --> 17:15.400
team itself. And hopefully we've convinced you to try Kubewarden, and you might already have

17:15.400 --> 17:21.000
some guardrails that you'd love to create in your environment. My tips would be to start small and

17:21.000 --> 17:26.840
limit your scope, and be sure to add unit tests as well as end-to-end tests. And when you're done,

17:27.400 --> 17:32.200
publish it to artifact hub and share it with the community. Thank you. Any questions?

17:32.200 --> 17:45.960
Yeah? Yeah, if you want to take this one. Sorry. So the question was: what's the difference,

17:47.960 --> 17:55.240
the reason why we chose to use Kubewarden rather than other solutions that do similar jobs.

17:55.320 --> 18:02.920
It's because we thought that, for the use cases that we were trying to address, this tool was

18:02.920 --> 18:08.200
the best fit for us. And on top of that, we have a close relationship with the

18:08.200 --> 18:13.560
Kubewarden team, since it's within SUSE itself. So that would probably help us solve

18:14.440 --> 18:21.000
questions and doubts sooner. So those were the two reasons why we chose this solution.

18:21.960 --> 18:28.600
And on top of that, the fact that we were able to do this by writing Go code or Rust code,

18:28.600 --> 18:36.200
languages that we are already using. So that kind of helps us avoid

18:38.200 --> 18:43.640
learning a new thing, and that speeds things up. That's it. I hope this answers your question.

18:45.240 --> 18:46.200
Any other questions?

18:51.000 --> 19:15.960
So I don't think that we can actually reach out to other... oh, sorry. So the question was basically,

19:16.920 --> 19:27.640
when we're reaching out to other services,

19:27.640 --> 19:34.280
whether there are any issues, any security concerns, in our experience.

19:34.280 --> 19:50.920
Yes, it does. I mean, there's the kwctl command that allows you to actually test it

19:50.920 --> 19:57.240
locally. And as Thomas was saying, by writing unit and end-to-end tests, you can actually

19:57.240 --> 20:01.880
test it on your local machine before applying it, and get all the debugging information that you

20:02.520 --> 20:08.200
require. Sorry, I did not fully hear your question at the beginning, because there is a little bit of noise outside.
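
For reference, local testing with kwctl looks roughly like this; the paths and the policy URI are placeholders, and the exact flags should be verified against kwctl's own help output:

```shell
# Evaluate a policy locally against a recorded admission request
# (placeholder paths and URI; check `kwctl run --help` for flags).
kwctl run \
  --settings-path settings.yaml \
  --request-path admission-request.json \
  registry://ghcr.io/example/vgpu-policy:v0.1.0
```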

20:11.400 --> 20:12.200
Any other questions?

20:17.800 --> 20:19.800
So the question is out, how to,

20:32.040 --> 20:42.200
Okay, so the question is how we would install Kubewarden on a cluster. And Kubewarden

20:42.200 --> 20:46.520
is actually a CNCF project, so it's not really tied to Rancher. I think that there is a

20:46.520 --> 20:51.480
supported version of Kubewarden that you can get. But for everyone else, you can just go to the

20:51.480 --> 20:55.960
Kubewarden project and install it in your cluster. And it comes as a Helm chart. That's what we're

20:56.040 --> 21:03.880
doing, and it's pretty simple. So if I may add, we will be sharing these slides. And there are

21:03.880 --> 21:10.120
all the links that will provide more details and information on how to get this installed and

21:11.080 --> 21:15.560
running. Any other questions?
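
For reference, the Helm-based installation mentioned above looks roughly like this; the chart names are those published by the Kubewarden project, and should be verified against the current installation guide:

```shell
# Install Kubewarden's CRDs and controller from the upstream Helm repo.
helm repo add kubewarden https://charts.kubewarden.io
helm install --namespace kubewarden --create-namespace \
  kubewarden-crds kubewarden/kubewarden-crds
helm install --namespace kubewarden \
  kubewarden-controller kubewarden/kubewarden-controller
```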

21:20.520 --> 21:26.120
Because you showcased how it works with Rancher and Harvester, right? The

21:26.120 --> 21:28.920
question I'm trying to understand is: why would I, as a user,

21:29.640 --> 21:33.960
want to implement this in a specific programming language,

21:33.960 --> 21:38.120
when with something like Kyverno I can just write a declarative policy? As that

21:38.120 --> 21:41.800
kind of user, why would I make this switch?

21:43.640 --> 21:48.840
The scope of this talk is just to explain our experience using this tool to achieve our

21:50.040 --> 21:55.720
use cases, and that's why we chose Kubewarden. This is not about convincing anyone to switch

21:55.720 --> 22:02.200
from one solution to another. Every solution has its own pros and cons. And the main

22:02.200 --> 22:09.240
reason, again, why we chose this against other solutions, is because

22:09.240 --> 22:16.200
we could use knowledge that we already have within the team, with Go and Rust. And also the fact

22:16.200 --> 22:22.680
that, by creating Wasm artifacts, we could store those artifacts in, you know, an OCI registry,

22:22.680 --> 22:31.880
and we could hook up a CI/CD pipeline and run them as part of our pipeline. So that was

22:31.960 --> 22:37.320
our use case. But again, it's not that we're trying to convince anyone to switch from another tool,

22:37.320 --> 22:41.960
that's the point. Yeah, I mean, also in the policy landscape, if you look around,

22:41.960 --> 22:46.280
there are a lot of them being developed. You have to

22:46.840 --> 22:52.600
pick something. So I was curious if you have a certain selling

22:54.200 --> 22:59.720
point for Kubewarden, for me as a Kyverno user. I'm trying to be convinced, but yeah, the answer is:

23:00.360 --> 23:06.680
try it out. And we can take this outside if you want to discuss this further, but

23:08.840 --> 23:12.200
I'm saying that in a very non-harmful way.

23:15.320 --> 23:19.880
Joking, of course. Any other questions?

23:23.720 --> 23:26.360
Okay, so thank you very much.

