WEBVTT

00:00.000 --> 00:15.600
So, hi everybody. Thanks a lot for coming. Sorry, I'm not allowed to go over there. So, today,

00:15.600 --> 00:21.840
we want to show you something about WebAssembly on Kubernetes. And we want to go into

00:21.840 --> 00:30.000
how we can use WebAssembly instead of classical Docker, Open Container Initiative images to run your

00:30.000 --> 00:38.560
workloads quicker, start them up faster, and so on. Who of you already runs Kubernetes? Okay, great. That's

00:38.560 --> 00:43.680
perfect. Otherwise, we would need some additional minutes for an intro. And who has already used WebAssembly?

00:44.640 --> 00:51.600
Okay, some. Great. Perfect. So, today, we have Linus Pazik. He's head of engineering at Kauru.

00:52.000 --> 00:58.080
And myself, I'm working with WebAssembly as a managing consultant and also a

00:58.080 --> 01:04.320
CNCF ambassador. And today, we want to show you where we went over the last few years,

01:04.320 --> 01:09.600
what we found out while working on how we can do event-driven architecture

01:09.600 --> 01:19.120
with Rust and some new ways of working with that. So, how do we want to start? We want to look at

01:19.120 --> 01:26.560
one use case. It's not a real use case, but we want to go through it with you in this presentation.

01:26.560 --> 01:34.160
And we want to show you how we can use these technologies there. Our use case is: we have some

01:34.160 --> 01:41.360
emails that we receive. It's about support cases. We need to channel them through

01:41.920 --> 01:47.600
to our compute. And then we will need to prioritize them for different use cases. So, we have

01:47.600 --> 01:53.840
maybe VIP support. We have some support teams that specialize in other things. And maybe also

01:54.720 --> 02:00.880
a third support queue for not-so-important stuff. So, we need some triage there. So, we need some compute

02:00.880 --> 02:07.360
and we have some emails at the front. And now our problem is we are really good, or maybe our

02:07.360 --> 02:13.360
software is really bad. And now we have many support cases. So, we need to scale when we get a lot

02:13.440 --> 02:19.120
of emails. And we should do that fast. And we are not able to predict because it's really

02:19.120 --> 02:28.080
coming from the outside, from the edge. And so, we now want to look at this example with some

02:28.080 --> 02:33.600
things already in mind. We already use Kubernetes, which you already know and use as well. And we

02:33.600 --> 02:40.240
are now going to look at how, on Kubernetes, we can optimize things to dynamically scale

02:40.240 --> 02:48.000
and have time-sensitive scaling. So, when we start with something like that, right, we have

02:48.000 --> 02:54.000
first our pod. We have some fixed scale, maybe one instance. This will not work for long if we have

02:54.000 --> 03:00.880
a lot of things coming in. So, what we will do first, the simple thing, is we go to horizontal

03:00.880 --> 03:06.320
pod autoscaling based on CPU and RAM, right? That's the first thing. It's a really simple thing.

03:06.400 --> 03:12.320
But the problem here is we always need to wait. So, we will only find out after some time

03:12.320 --> 03:17.680
that our CPU is on full load and we probably should do more. Otherwise, if we just say,

03:17.680 --> 03:25.040
oh, we have some CPU spikes, we go up. We will scale way too fast. So, we can use some custom

03:26.000 --> 03:33.440
auto-scaling metrics. And the world is really open. We can do everything, but it's a lot of work.

03:33.440 --> 03:39.760
And there we have KEDA, a CNCF project that really helps there. It has

03:40.800 --> 03:45.440
some scaling policies. For example, if we use, as mentioned before, for the channel

03:46.160 --> 03:53.280
Kafka, we can scale on that, or we can also use a simple SQS queue and scale

03:53.280 --> 03:58.160
on the length of a queue. And so, we can really just say, hey, something came into the queue.

03:58.160 --> 04:02.800
Now let's already scale up, because we know there's a lot in the queue, and also scale down

04:02.800 --> 04:09.520
if the queue gets fewer new entries, even though our CPU is still at 100%.
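The queue-length rule just described is what KEDA configures declaratively. A minimal sketch of a ScaledObject for the Kafka case; the deployment, topic, and group names are our own placeholders:

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: email-consumer
spec:
  scaleTargetRef:
    name: email-consumer        # the Deployment processing the emails
  minReplicaCount: 0            # scale to zero when the queue is empty
  maxReplicaCount: 100
  triggers:
    - type: kafka
      metadata:
        bootstrapServers: kafka.default.svc:9092
        consumerGroup: email-consumer
        topic: emails
        lagThreshold: "10"      # target lag per replica
```

KEDA drives the replica count from the consumer group's lag, independent of CPU load.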

04:10.560 --> 04:15.600
And then we can even go one step further. If we say, we want to do a little bit more than just

04:15.600 --> 04:22.000
auto-scaling, we can use a full framework like Knative, also a CNCF open source project.

04:22.000 --> 04:28.160
That one is a little bit more opinionated about what we do. And there we can really then

04:28.160 --> 04:32.960
dive deeper a little bit afterwards and say, hey, this can now really help us with the whole

04:32.960 --> 04:38.080
journey, but this is really an opinionated product. And these are the two things that we want to

04:38.080 --> 04:45.120
look at: how we can use KEDA and Knative together. So, now first, let's look at how

04:45.120 --> 04:52.400
Knative and KEDA work together. So, as I said before, let's imagine we have Kafka in

04:52.400 --> 04:57.440
front; we have a topic there where this email information comes in, that we have some emails.

04:58.720 --> 05:06.320
Now, we can use KEDA, which listens to Kafka and says, hey, we know we have a consumer group,

05:06.320 --> 05:10.880
right? We just really do the normal Kubernetes implementation, and our resource says,

05:10.880 --> 05:15.280
now just grab the newest messages, right? We have some consumer group, or normally, in a queue,

05:15.280 --> 05:22.000
we have a specific queue for these readers. And then KEDA can, with the same credentials, just

05:22.080 --> 05:26.880
read these things and say, oh, okay, I know there are a lot coming in. Now, let's scale up,

05:26.880 --> 05:37.200
or there are not many more, and it lets us scale down. Now that we have scaling, how do we make it

05:37.200 --> 05:47.280
faster? And here we switch as well. Thank you very much. So, yeah, as I said, now we have the

05:47.280 --> 05:53.760
scaling part checked. Now, how can we make it scale fast? And as you may remember from the

05:53.760 --> 06:01.040
title of our talk, of course, the answer is: it's WebAssembly. So, before we start looking into

06:01.040 --> 06:07.200
WebAssembly, a short disclaimer. So, what you saw until now is really production-ready stuff.

06:07.200 --> 06:12.960
So, KEDA is a graduated project from CNCF; it's widely used out there in

06:12.960 --> 06:21.200
production. But, yeah, WebAssembly is also quite a mature technology, but in combination with

06:21.200 --> 06:29.360
Kubernetes on the server side, it's a little bit on the bleeding edge. So, yeah, we also observed

06:29.360 --> 06:34.960
over the years, it's like really slowly progressing, but there is potential. So, let's have a look.

06:34.960 --> 06:39.920
How does WebAssembly work on Kubernetes? So, it works very similarly to your normal

06:40.480 --> 06:46.160
container-based workload. So, you push it to a container registry, and then there's, like,

06:46.160 --> 06:52.240
a schedule with some work to be done. The kubelet will pick up the work and will hand it over

06:52.240 --> 06:59.760
down to a CRI runtime, and then there's a little bit of a split between containers and WebAssembly.

06:59.760 --> 07:05.840
So, it's still downloaded from the registry and then handed over to the OCI runtime.

07:05.840 --> 07:10.880
And, yeah, there we have, on the one hand, the container runtimes, which set up, like, the cgroups

07:10.880 --> 07:15.280
and all that stuff. And, on the other hand, we have the WebAssembly runtime.

07:16.880 --> 07:24.080
Usually, we can use the same CRI runtime on top and then just have a shim, which then handles

07:24.080 --> 07:30.240
the translation, all the differences between containers and WebAssembly. And, yeah, with the

07:30.400 --> 07:37.360
runtime, it's then running in a sandbox. But, yeah, let's have a look at WebAssembly.
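On the Kubernetes side, the split described above is usually selected per pod through a RuntimeClass whose handler names the Wasm shim configured in containerd. The handler and image names below are illustrative, not fixed values:

```yaml
apiVersion: node.k8s.io/v1
kind: RuntimeClass
metadata:
  name: wasm
handler: wasmtime        # must match a runtime/shim configured in containerd
---
apiVersion: v1
kind: Pod
metadata:
  name: email-triage
spec:
  runtimeClassName: wasm # the kubelet routes this pod to the Wasm shim
  containers:
    - name: triage
      image: registry.example.com/email-triage:0.1.0
```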

07:37.360 --> 07:41.600
So, what's WebAssembly? So, core WebAssembly is, like, a platform-

07:41.600 --> 07:49.600
independent bytecode format. There are many, many languages which compile down to WebAssembly,

07:49.600 --> 07:55.120
and over the years, more and more languages have started to support it. There are some

07:55.200 --> 08:00.240
interpreted languages which support WebAssembly. But, yeah, for these, they compile

08:00.240 --> 08:06.400
the interpreter itself, and then you run the interpreter in WebAssembly. You don't get all the advantages

08:06.400 --> 08:13.280
of WebAssembly if you do that. But it's possible. Yeah. So, you have the languages to compile down.

08:13.280 --> 08:19.760
Then, one important building block is WASI. The WebAssembly System Interface is a standard

08:19.840 --> 08:25.280
which defines some system interfaces, like I/O to access files, to access clocks,

08:25.840 --> 08:31.840
that kind of stuff, and so on. It's just the interface, and that again is implemented by the

08:32.720 --> 08:37.280
Wasm runtimes. So, the Wasm runtimes take over, like, the heavy lifting.

08:37.280 --> 08:41.520
There's, for example, an interface for receiving HTTP requests. So, the runtime will,

08:41.520 --> 08:45.760
like, parse the request, and maybe there's some authentication, and just hand over, like,

08:45.760 --> 08:52.160
really, the request itself, and the main work is done at the runtime.

08:53.440 --> 08:59.040
And, yeah, then we are at the Wasm runtime. It's like a virtual machine, which takes the

08:59.040 --> 09:05.280
WebAssembly bytecode and then interprets it on your specific platform. Yeah.

09:06.240 --> 09:12.560
Maybe that sounds familiar; it's really, like, the next evolution of the

09:12.640 --> 09:19.920
idea of what Java did for us. And, yeah, background-wise, it was really born in the web.

09:19.920 --> 09:26.480
So, as you maybe know, it was initially built for the browser, to run low-level code safely

09:26.480 --> 09:32.880
in the browser. So, they did a lot of things to make it safe, and to differentiate it from Java

09:32.880 --> 09:39.120
applets, which are notoriously untrustworthy to run in the browser. And, also, they did

09:39.120 --> 09:44.000
something in the structure which makes it easy to start interpreting the WebAssembly while it's

09:44.000 --> 09:54.160
downloading. So, yeah, that's a good point to then go over to the advantages. So, we, like,

09:54.160 --> 10:00.720
wanted to highlight three advantages of WebAssembly over containers. So, on the one hand,

10:00.720 --> 10:05.360
we have tiny images, because you're really just shipping the code, your business logic,

10:05.920 --> 10:10.800
classically, with a container, you're also bringing a lot of additional things, which,

10:10.800 --> 10:16.080
like, take time to download. Then, the other thing: it's just, like, bytecode. So, it's,

10:16.080 --> 10:22.080
yeah, very fast to interpret, even while downloading it, as mentioned before. And,

10:22.080 --> 10:27.600
yeah, it's platform-independent. So, you can have, like, a heterogeneous fleet of servers, which

10:27.600 --> 10:34.160
can run your stuff. And then, the third advantage we see is the WebAssembly System

10:34.720 --> 10:40.960
Interface. So, yeah, it comes with a lot of batteries included. So, you don't even have to include

10:40.960 --> 10:49.760
that in your code; the runtime will provide it for you. Yeah. One disclaimer for the WebAssembly

10:49.760 --> 10:56.240
System Interface: like, it's very early on. So, we currently have versions up to 0.2.

10:56.880 --> 11:02.880
And they promised 0.3 for next month. But, yeah, we will see. And then,

11:02.880 --> 11:11.760
the 1.0 will come at some point. Good. Then, yeah, let's have a look at the whole

11:12.640 --> 11:20.560
theory in practice. So, now we go back to our example from the beginning, where we now

11:20.640 --> 11:27.520
get our emails. We put them on Kafka. Then we have Knative, now with KEDA under the hood,

11:27.520 --> 11:33.600
which takes these events. So, before, right, we said KEDA just checks the consumer lag. Now,

11:33.600 --> 11:39.600
KEDA will do that, but it's really just part of the demo now, where we also use Knative.

11:39.600 --> 11:46.000
Knative has the benefit that it's really one opinionated system for everything. The drawback

11:46.000 --> 11:52.320
is, you will probably not put it on your system when you already have something running. Because

11:52.320 --> 11:58.080
it really dictates everything, how you should do it. It starts with certificate management.

11:58.080 --> 12:03.120
Then how you integrate, also a little bit of namespacing, and so on. So, it's really a

12:03.120 --> 12:08.480
full-fledged thing. But the great thing, also, if it's a new system: it's really

12:08.480 --> 12:14.240
one thing where everything plugs together nicely. And so here, we now have Knative in

12:14.240 --> 12:20.240
between. So, we really just need to tell Knative: we want to consume from here. And,

12:20.240 --> 12:26.720
there, on the other side, we have a small function, which will execute something with that.

12:26.720 --> 12:32.160
And it will now read from Kafka. It will automatically scale with KEDA under the hood.

12:32.160 --> 12:38.480
And it will forward it over HTTP. So, one thing, also, that is really great about switching to HTTP:

12:38.480 --> 12:45.120
we said Wasm is at the start with WASI. So, there's currently no implementation to use Kafka

12:45.120 --> 12:50.560
directly. So, it's really great to go over HTTP there, and Knative will handle all of that for us.
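The Kafka-to-HTTP wiring can be sketched as a Knative KafkaSource whose sink is the function's Knative Service; all names here are our own placeholders, and API versions may differ per installation:

```yaml
apiVersion: sources.knative.dev/v1beta1
kind: KafkaSource
metadata:
  name: email-source
spec:
  bootstrapServers:
    - kafka.default.svc:9092
  topics:
    - emails
  consumerGroup: email-consumer
  sink:
    ref:
      apiVersion: serving.knative.dev/v1
      kind: Service
      name: email-triage   # the Wasm function, invoked over plain HTTP
```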

12:52.960 --> 12:57.600
So, and now we have a demo, which also loads. That's great.

12:58.400 --> 13:06.080
So, what we will show you now is this example from before. Here you will see, on the top left,

13:06.400 --> 13:11.680
all the pods on the cluster that we have. And, we really run it just on a MacBook. So, nothing

13:11.680 --> 13:16.800
powerful, and you will see we will go up to hundreds of containers in really quick time.

13:17.600 --> 13:23.520
On the bottom, we will, at the beginning, with a small producer, simulate a lot of emails

13:23.520 --> 13:32.000
coming into the system and push these events to Kafka. And on the right, you see a UI for Kafka,

13:32.080 --> 13:37.600
where we see our consumer group, where we see that first, we will get a lot of emails,

13:37.600 --> 13:48.000
and then we will process them one by one. And, I hope that play button works. So, and now we

13:48.000 --> 13:54.800
just generate a thousand emails. And, you will now see, we have the email consumer coming up.

13:54.800 --> 14:01.360
It's one, and we set the rule in KEDA: hey, just double it if there's still something in the queue.

14:01.360 --> 14:07.120
That helps for the demo; otherwise, it would just be full of containers. But something that you

14:07.120 --> 14:13.120
can now really see is: when new containers come up, they go from ContainerCreating

14:13.120 --> 14:19.520
to Running almost instantly, even on a small computer, because that's now really the first benefit, right?

14:19.520 --> 14:27.600
And these Wasm images are 0.4 megabytes each, compared to the same logic that we also implemented

14:27.680 --> 14:34.800
with a container and an Alpine image, and that's still 4 megabytes. So, a factor of 10, with no additional

14:34.800 --> 14:40.160
libraries and so on, which really shows we're faster with smaller images, and there is no startup time.

14:40.960 --> 14:46.480
So, this really helps there. And, you see now, they are all coming up, and we're going faster

14:46.480 --> 14:52.800
down. And one thing that is really helpful, particularly if you're going with Kafka,

14:52.800 --> 15:00.400
if you use Knative, is that we only have one consumer here, right? This is because Knative is

15:00.400 --> 15:07.040
handling that. Because in a Kafka consumer group, when you add additional consumers, you have issues. First, you have the

15:07.040 --> 15:13.760
limitation that you will never get more parallel consumers than you have partitions. And on the other

15:13.760 --> 15:20.640
side, the protocol negotiation for getting more consumers also takes time, which we don't have

15:20.640 --> 15:27.440
if we have something like Knative in between. And so, like that, we got the consumer

15:27.440 --> 15:35.280
lag down to almost zero. So, we have set our dummy code to take one minute to process an email,

15:35.280 --> 15:40.320
so that's why it still takes some time, but we see that we're now almost at zero, and I think,

15:40.320 --> 15:46.560
yeah, now it's done. So, that's our small demo that really shows, we can do something like that

15:46.640 --> 15:53.600
really quickly and really scale fast when we use WebAssembly for event-driven architecture

15:53.600 --> 16:01.360
use cases where we need to process something. Now, that's, like, our demo. So, where is it

16:01.360 --> 16:10.160
really used? Here we briefly want to show three well-known projects that now use

16:10.160 --> 16:20.160
Wasm. So, one is Dapr, a big CNCF project, that now uses Wasm for

16:20.160 --> 16:26.560
HTTP handlers. So, if you want to add some custom logic, you can just add some plugins with Wasm,

16:26.560 --> 16:31.600
which is really great, right? You don't need to compile them into the rest; you can just plug them

16:31.600 --> 16:38.000
in, and they can process something. The same with Istio: they are also using it in their plugin

16:38.000 --> 16:46.000
architecture in production; there they have a Proxy-Wasm sandbox. So, again, you can just have

16:46.000 --> 16:51.520
your logic in the programming language you want and just add it to this project without any

16:51.520 --> 16:56.960
need for recompilation, which is really great. And the third thing, again, a plugin use case:

16:56.960 --> 17:02.560
you see where, currently in production, most open source projects see the benefit. It's really about

17:02.560 --> 17:10.080
plugins and not needing to recompile anything. And the third project is Helm, and here it's not

17:10.080 --> 17:15.760
just that it's a plugin you don't need to recompile and you can use your own language; the additional benefit

17:15.760 --> 17:22.240
there is you don't need to trust the author, right? So, Helm is a package manager, and there

17:22.240 --> 17:27.440
these plugins run on your machine. So, you can now download them; they have signing mechanisms,

17:27.840 --> 17:33.680
but you still don't need a lot of trust, because they are fully sandboxed. You don't need a cluster

17:33.680 --> 17:38.320
or anything; they will just run in this Wasm sandbox, the code will only be interpreted,

17:38.320 --> 17:42.480
so it's a little bit safer to run third-party plugins like this on your system.

17:45.440 --> 17:51.360
And how does it work if you want to use Wasm for plugins, which I think is a really

17:51.360 --> 17:58.160
great use case? So, here is, really simplified, how you use it. So, on the right

17:58.160 --> 18:05.520
side, you see how you can implement it on the host. So, this is like a really simple thing,

18:05.520 --> 18:11.840
wazero. Sorry, it's a really simple Wasm runtime that you can just use in

18:11.840 --> 18:18.640
Go, and you really just say, hey, I want a new runtime for WebAssembly. You then need to

18:18.640 --> 18:25.040
instantiate it, and then you can really just call a method; the call will then be passed through,

18:25.040 --> 18:31.200
it will be executed and you get the result. And on the other side, we see now how you can implement

18:31.200 --> 18:35.920
something. This is with TinyGo; they also have a booth somewhere, so if you want to know more

18:35.920 --> 18:40.800
about TinyGo, this is also the right conference for that. And here's just a really small

18:40.800 --> 18:46.160
HTTP function, which is a little bit closer to real use but too complicated to show here. And you see,

18:46.240 --> 18:51.840
you really just have one function, you can run some small code, return, and that's it. So,

18:51.840 --> 18:56.400
I think it's a really good thing if you want to use it for plugins, and it also adds some safety.
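As a sketch of the guest side of such a plugin, one exported function with a small body is all it takes. Here it is in plain Go with made-up triage logic from our email use case; a TinyGo build, e.g. `tinygo build -target=wasi`, would produce the actual Wasm module:

```go
package main

import (
	"fmt"
	"strings"
)

// Triage is the single plugin function: it maps an incoming support email
// to one of the queues from the use case. The rules are invented for
// illustration only.
func Triage(from, subject string) string {
	switch {
	case strings.HasSuffix(from, "@vip.example.com"):
		return "vip"
	case strings.Contains(strings.ToLower(subject), "outage"):
		return "specialist"
	default:
		return "general"
	}
}

func main() {
	fmt.Println(Triage("alice@vip.example.com", "Cannot log in")) // prints "vip"
}
```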

18:56.400 --> 19:08.080
And it just decouples the programming languages. Good. So, what's the approach if you want to get started

19:08.080 --> 19:13.360
with Wasm? So, there's a little bit of reading to do: there's the Wasm component model

19:13.440 --> 19:20.800
you should be familiar with before you start. And then there are also the WASI major releases.

19:21.520 --> 19:27.040
And if you understand them, then it's quite easy. We recommend starting with Rust, because

19:27.040 --> 19:33.200
there the ecosystem is really great. But nowadays, as you saw with TinyGo, there are

19:33.200 --> 19:39.040
a lot of other languages which are starting to really support Wasm as a compile target.

19:39.840 --> 19:46.640
And when you have written your app and have created the Wasm binary,

19:46.640 --> 19:53.360
then you need to decide which runtime you want to use. There's a big, big list, like a lot of projects.

19:53.360 --> 19:58.240
And some of them really strictly implement just the WASI interface. Others have some

20:00.000 --> 20:05.280
proprietary interfaces that they implement as well, to do additional things that the

20:05.280 --> 20:09.920
standard interface doesn't cover. And then, if you really want to go there, all the way down

20:09.920 --> 20:19.520
to Kubernetes, now you really need to look into the runtimes, because the most common ones,

20:19.520 --> 20:27.520
like runc, don't support Wasm directly. But there are some, for example, crun or runwasi,

20:27.680 --> 20:35.360
that already support WebAssembly as a container type on Kubernetes today.

20:38.400 --> 20:43.520
Good. Then yeah, that's it. Let's have some questions.

20:45.680 --> 20:52.320
Maybe, why we have a QR code: if you want the slides, you find them there. And also to point out,

20:52.320 --> 20:57.440
if you're still here a couple of days longer: on Monday, OTel Unplugged starts.

20:57.440 --> 21:02.960
If you want to hear more about OpenTelemetry, which is also heavily used with Kubernetes,

21:02.960 --> 21:06.960
that's a nice event I heard.

