WEBVTT

00:00.000 --> 00:11.520
What we have dubbed reachability analysis, and that requires a little bit of explanation,

00:11.520 --> 00:12.800
so we'll get to that.

00:12.800 --> 00:14.920
So, a little bit about me.

00:14.920 --> 00:15.920
My name is Dylan Reimerink.

00:15.920 --> 00:20.200
I'm a software engineer at Isovalent, which is now part of Cisco.

00:20.200 --> 00:26.880
I am a Cilium committer, and Cilium is sort of the focus of the topic today.

00:27.440 --> 00:31.040
I'm a maintainer of the eBPF Go library,

00:31.040 --> 00:35.120
also known as cilium/ebpf, if you're more familiar with that,

00:35.120 --> 00:42.400
and I also maintain the eBPF Docs, for anyone who has seen that before.

00:42.400 --> 00:48.640
So, I'm standing here on the podium, but I would also like to acknowledge

00:48.640 --> 00:51.840
two colleagues of mine, Timo and Robin.

00:51.840 --> 00:57.280
Timo has co-authored a lot of the work that I'll be presenting here today.

00:57.280 --> 01:05.200
He's a co-maintainer of the cilium/ebpf project, and Robin, who I don't think

01:05.200 --> 01:10.240
directly worked on this, but he did review everything and gave us great feedback,

01:10.240 --> 01:13.440
so I wanted to acknowledge him as well.

01:13.440 --> 01:15.440
So, let's get into it.

01:15.440 --> 01:21.280
So, we have been on a sort of journey to modernize Cilium.

01:21.280 --> 01:27.760
Cilium is now 10 years old; it was actually one of the first users of eBPF ever.

01:27.760 --> 01:34.080
So, there's a lot of history there, and there's also a lot of technical debt.

01:34.080 --> 01:38.160
So, one of the interesting things is that if you were to write any eBPF project

01:38.160 --> 01:43.600
today, most of you would probably compile ahead of time, ship your

01:44.480 --> 01:50.880
programs as an ELF to your target, and then just load it there, and use the nice

01:50.880 --> 01:54.720
CO-RE facilities to make sure that it works everywhere.

01:55.520 --> 02:01.760
However, 10 years ago, all of that didn't exist, so back then, we decided to just ship

02:01.760 --> 02:08.240
all of clang, and all of our source code, to the target, modify it slightly for whatever we

02:08.320 --> 02:15.200
needed, and then run it. We don't want to do that anymore, because we don't have to anymore,

02:15.200 --> 02:19.280
but we have the technical debt. So, there are a few reasons for doing this.

02:19.280 --> 02:22.320
If you don't have to compile, it's faster; compilation is really slow.

02:23.440 --> 02:28.160
If we don't ship clang, we have smaller images, and we get fewer CVE reports

02:28.160 --> 02:36.160
for parts of our images which we don't even use.

02:36.240 --> 02:40.080
There are standard libraries that we had to include, because we also ship clang.

02:40.080 --> 02:44.480
We can just get rid of all of that, which gives us a lot of nice security

02:45.120 --> 02:50.000
wins, and hopefully, if you don't have a whole compiler in the loop at load time,

02:50.720 --> 02:57.600
that's hopefully going to mean fewer bugs. But this has been going on since 2023.

02:57.600 --> 03:03.360
I found the issue that was created back then, and we're still working on this, but

03:03.440 --> 03:11.120
we're getting really quite close now. So, the key in making this happen is load-time configuration.

03:11.120 --> 03:16.960
So, we can't do things at compile time anymore. We need to shift everything that happened at compile

03:16.960 --> 03:24.400
time to happening at load time. Cilium is really complex, and it has a bunch of settings, options,

03:25.120 --> 03:30.720
a lot of optional features. In fact, I counted them. At the moment, we're like

03:30.720 --> 03:37.920
halfway through this conversion, but we have 94 different settings. They're Booleans,

03:37.920 --> 03:43.920
numbers, IP addresses, that sort of thing, that all go into the datapath

03:44.640 --> 03:49.760
at load time. It was all compile time before that, but again, we're migrating away from that.

03:51.440 --> 03:55.520
We used to do this with preprocessor macros, so we would conditionally compile

03:55.600 --> 04:02.720
out bits of the code that we weren't using, and the compiler would take care of all of that.

04:02.720 --> 04:07.200
But now we compile everything in, and you need to disable or enable it at load time.

04:07.760 --> 04:14.480
And the two bits, the two patch series that really made this happen for us are linked here.

04:14.480 --> 04:20.160
So, the verifier can actually do dead code elimination. What it does is, if

04:21.120 --> 04:27.200
you provide it a variable, and it knows that this variable is always a certain value, like a static,

04:28.240 --> 04:34.640
and you have a branch like an if statement, it says, okay, hey, this branch is always taken or never taken.

04:34.640 --> 04:44.000
It realizes nothing can reach that code, and completely removes it. At least, if you're loading

04:44.000 --> 04:48.160
as a privileged user, although patches have been suggested to change that.

04:49.120 --> 04:53.760
The other thing that makes this happen is tracking of read-only maps. So, if you have a

04:53.760 --> 05:00.880
global variable, like in the code that I have here,

05:00.880 --> 05:06.560
that gets put in a map, a BPF map; that's how global variables work in BPF.

05:07.280 --> 05:14.400
If this map is read-only, so constant, and you freeze it when you load it, the verifier says, hey,

05:14.400 --> 05:21.360
the content of that map is static, I'll just treat it as a scalar value, as a

05:21.360 --> 05:26.880
number, and then it flows through, and then it works. So, you can write code like this, where you say,

05:26.880 --> 05:32.640
I have to make it volatile, because otherwise the compiler gets smart on me, but if you declare it

05:32.640 --> 05:37.680
like this, as a volatile const, I can change this constant value from user space, and I can actually

05:37.680 --> 05:43.120
enable or disable code in my BPF program, and it will be completely removed from the jitted output.
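
To make the dead code elimination concrete, here is a minimal sketch in Go of what the verifier conceptually does once a frozen constant fixes a branch outcome. The instruction model (`insn`, the opcode names) is invented for illustration and is far simpler than real BPF; the kernel's actual pass works on real BPF instructions and runs in stages, as described later in the talk.

```go
package main

import "fmt"

// Hypothetical, much-simplified instruction model. Real BPF
// instructions carry opcodes, registers, offsets and immediates.
type insn struct {
	op  string // "mov", "jne", "mapop", "exit", ...
	imm int64  // immediate operand for "jne"
	off int    // jump offset: instructions skipped when the branch is taken
}

// eliminateDeadCode models the verifier's pass: when the value behind
// a "jne" guard is a frozen constant, the branch outcome is known, so
// the jump is dropped and any instructions it proves unreachable are
// removed from the output.
func eliminateDeadCode(prog []insn, frozen int64) []insn {
	var out []insn
	skip := 0
	for _, ins := range prog {
		if skip > 0 {
			skip-- // proven unreachable: drop
			continue
		}
		if ins.op == "jne" {
			if frozen != ins.imm {
				skip = ins.off // branch always taken: the body is dead
			}
			continue // the guard itself is resolved either way
		}
		out = append(out, ins)
	}
	return out
}

func main() {
	prog := []insn{
		{op: "mov"},
		{op: "jne", imm: 1, off: 2}, // if enable_feature != 1, skip body
		{op: "mapop"},               // feature body touching the map
		{op: "mapop"},
		{op: "exit"},
	}
	fmt.Println(len(eliminateDeadCode(prog, 0))) // feature off: 2 insns left
	fmt.Println(len(eliminateDeadCode(prog, 1))) // feature on: 4 insns left
}
```

With the constant set to 0, the two map-touching instructions disappear entirely; with 1 they survive, which is exactly the behavior the talk relies on.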

05:44.720 --> 05:50.320
I'll show an example of that. So, this is what we used to do. The question now is, okay, cool,

05:50.320 --> 05:57.120
I can control conditional bits of my code, but how do I do this with maps?

05:57.760 --> 06:02.480
One of the core values of Cilium is that if you disable a feature, you don't pay for it:

06:02.480 --> 06:08.000
you don't pay for the CPU, you don't pay for the memory, you don't pay for bpf data path complexity.

06:08.000 --> 06:12.720
So, if I'm close to a complexity limit, for example, which used to be the case more,

06:12.720 --> 06:19.200
nowadays it's less of an issue, but if that's the case, then I can disable a big feature

06:19.200 --> 06:24.000
and enable more smaller features. That's the idea: the user chooses what

06:24.000 --> 06:31.600
they want to run, and we just make that happen. But it also means that if I have

06:31.600 --> 06:37.680
feature A, which uses a map, how do I now make sure, in this new world, that I do not have

06:37.680 --> 06:46.240
this map loaded anyway? So, this is what we had, and then you say, okay, we know the solution,

06:46.240 --> 06:54.720
let's do the naive fix: we create this global variable, the constant, we wrap our usage code,

06:54.720 --> 07:01.280
and then, sure, we have to provide the map initially, but surely after that the map is freed,

07:01.280 --> 07:09.040
because the only place that uses it is gone. We know that when the program that uses a map gets unloaded,

07:09.040 --> 07:16.640
it also removes your map. But unfortunately, that's not the case, actually.

07:16.640 --> 07:23.760
So, don't be scared, this is BPF assembly, but it looks almost like C. So,

07:24.320 --> 07:31.200
I'm going to help you steer through it. The things I want you to look at have the red line under them.

07:31.280 --> 07:40.240
So, the first thing that happens is we check our enable flag. So, we load the read-only map,

07:40.240 --> 07:48.160
we load the Boolean from that, and it gets checked. In this case,

07:50.080 --> 07:53.840
we have it set to true, so the body of the function still exists here. And in the body,

07:53.840 --> 07:58.640
we have the second map, which is the one we wanted to get rid of if we don't use it. And then in the

07:58.640 --> 08:06.000
output, if we inspect which maps are associated with which programs, we can see both maps are associated.

08:06.000 --> 08:11.920
Expected, that's normal. So, let's now disable this feature. So, I disable my feature.

08:11.920 --> 08:17.440
We still have the read-only map. We don't use the map anymore that I want to get rid of,

08:17.440 --> 08:25.680
but it's still associated. What gives? So, it turns out that this is a quirk of the way the

08:25.760 --> 08:31.760
verifier works at the moment. This is slightly technical. You can download the slides later and

08:31.760 --> 08:36.560
follow the nice links. Essentially, what happens is: in the verifier, when you load a program,

08:36.560 --> 08:42.960
there's a function called resolve_pseudo_ldimm64, which is a little bit verbose.

08:42.960 --> 08:49.680
But every time you pass file descriptors for maps into the kernel, it converts these

08:49.680 --> 08:56.000
into the actual pointers to the maps, because in the end, the kernel needs to work with the actual

08:56.000 --> 09:03.440
map structs in the kernel. While it does that scan, it maintains this list of the used maps,

09:03.440 --> 09:09.520
which is what you see when you dump the program listing. And it increments the refcount,

09:09.520 --> 09:13.440
to make sure that this map stays loaded in the kernel while we are still using it.

09:13.440 --> 09:17.680
Then, the majority of all verification happens, like all the complex things people

09:17.760 --> 09:26.800
love to hate. And then, at the very end, once we're good on all of the

09:26.800 --> 09:31.200
interesting things, there's this optimization pass, which does the dead code

09:31.200 --> 09:36.160
elimination that I talked about earlier. It happens in three stages. So, first, it detects

09:36.160 --> 09:43.600
the branch with the fixed outcome and hardwires it to a normal jump, and it puts nops in between.

09:44.560 --> 09:50.080
And then, gradually, in steps, it removes the nops and makes it so that

09:50.080 --> 09:56.080
the dead code is completely gone from the jitted output. But the interesting part is that if we

09:56.080 --> 10:02.160
remove the last usage of a map, we never release the refcount, we never remove it

10:02.160 --> 10:08.480
from the used maps list. So, it means that even after dead code elimination, my map is still

10:08.480 --> 10:15.440
associated, and still holds on to the memory, even though the program can never use it. So, that is just

10:15.440 --> 10:21.760
the state of things at the moment, and I'm sure that can be improved. So, we had to work around this.

10:21.760 --> 10:29.360
So, the first workaround that we thought of was to do a double load. So, we would load our

10:29.360 --> 10:35.600
program once, but instead of creating maps that allocate, like, megabytes of space,

10:36.560 --> 10:41.120
we would just clamp them all to a max entries of one. So, we still need to create the maps,

10:41.120 --> 10:47.760
but it's not that bad. We load our program, and then we can read it back and see, hey,

10:47.760 --> 10:55.600
exactly these maps were eliminated, and now we can reload the program with the actual maps

10:55.600 --> 11:00.800
that we're going to use, minus what we determined is unused. And then, for the places

11:00.800 --> 11:10.400
where we had the original unused maps, we patch out the file descriptor. The funny thing is,

11:11.280 --> 11:16.080
you can't just set the file descriptor to zero, it always has to be

11:16.080 --> 11:21.840
valid, but you can basically tell the load instruction that it is not loading a

11:21.840 --> 11:26.080
file descriptor, but just a scalar number. So, that's what we do. We make it a scalar number,

11:26.160 --> 11:33.920
a recognizable hex code, so that if we ever see it in an error message, then we know,

11:33.920 --> 11:40.320
okay, this is probably due to some bug in our logic. And then, after this, we load again,

11:40.320 --> 11:46.400
and this works, it actually worked decently, except for the cases where it doesn't.
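
The file-descriptor patching can be sketched like this. The struct and the marker constant are hypothetical (Cilium's actual marker value may differ); the underlying mechanism is real, though: a BPF_LD_IMM64 instruction with src_reg set to BPF_PSEUDO_MAP_FD carries a map fd in its immediate, and clearing src_reg turns the same instruction into a plain scalar load.

```go
package main

import "fmt"

// Much-simplified view of a BPF_LD_IMM64 instruction: src_reg == 1
// (BPF_PSEUDO_MAP_FD) marks imm as a map file descriptor; src_reg == 0
// means imm is just a scalar.
type ldImm64 struct {
	srcReg uint8
	imm    int32
}

const (
	bpfPseudoMapFD = 1
	// Hypothetical marker: any distinctive constant works. It only
	// surfaces in verifier errors if the elimination logic was wrong
	// and the supposedly dead load is actually reached.
	deadMapMarker int32 = 0x0BADC0DE
)

// neutralizeMapLoad rewrites a map-fd load into a scalar load, so the
// reloaded program no longer references the eliminated map.
func neutralizeMapLoad(ins *ldImm64) {
	if ins.srcReg == bpfPseudoMapFD {
		ins.srcReg = 0 // no longer a pseudo map fd
		ins.imm = deadMapMarker
	}
}

func main() {
	ins := ldImm64{srcReg: bpfPseudoMapFD, imm: 6} // fd 6 -> dead map
	neutralizeMapLoad(&ins)
	fmt.Printf("src_reg=%d imm=%#x\n", ins.srcReg, ins.imm)
}
```

The recognizable constant is the design point here: a plain zero would be indistinguishable from legitimate code, while a magic value makes a logic bug immediately visible in an error message.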

11:48.240 --> 11:53.120
So, one problem with this is that it takes really long. We, Cilium, have really big

11:53.280 --> 12:01.440
programs. We have 50 tail calls altogether. Not all of them are always there, but like,

12:01.440 --> 12:07.360
we have big programs, big tail calls; it can take a significant time, like

12:07.360 --> 12:14.400
200 milliseconds, to load big programs. So, that's really undesirable: it takes CPU, a lot of memory,

12:14.400 --> 12:20.320
et cetera, if you've still created all these maps. And also, there is a limit in the kernel,

12:20.320 --> 12:29.360
where one program can only use 64 maps at max. And I think Cilium currently has 80-

12:29.360 --> 12:36.720
something maps defined. I would have to look it up again. We have a lot of maps. And of course,

12:36.720 --> 12:42.080
we don't use them all at the same time, which is why we want this. But if we still have

12:42.080 --> 12:46.080
to do the initial load with all of them at the same time, there's a chance we are going to run

12:46.160 --> 12:51.840
over this limit at some point. So, we want to avoid that. And if I have a map that might

12:51.840 --> 12:56.880
not be supported on the current kernel, like an arena map, this doesn't work, because I still have to

12:56.880 --> 13:05.120
create it to be able to do that first load. And the last thing, which we actually

13:05.120 --> 13:10.480
discovered later on, after we had decided to switch off of this workaround already,

13:10.560 --> 13:17.280
is that if you have a combination of sysctls enabled, it actually prevents you entirely from reading

13:17.280 --> 13:24.320
back loaded instructions, the jited instructions and even the xlated instructions. And if that's

13:24.320 --> 13:29.920
the case, this approach just wouldn't work. So, if you're on a very secure system, this just

13:29.920 --> 13:36.640
would have been unworkable, if not impossible. So, then we thought, okay, second workaround,

13:36.720 --> 13:43.600
which in principle is even simpler: you just match in Go, in your user space, whatever you

13:43.600 --> 13:50.080
do in C. So, if you say in C, I only use this map if these features are

13:50.080 --> 13:54.240
enabled, then I match that in my user space. But now you have two places that need to be kept in

13:54.240 --> 13:59.680
sync. If you've ever done this, you know it's very difficult to maintain. We have a lot of

13:59.680 --> 14:05.840
maintainers, a lot of people that might not know about this. A lot of chances for this to go wrong,

14:05.840 --> 14:11.360
and if we make a mistake, our programs just will not load, will not run, will not do anything.

14:12.400 --> 14:19.280
So, we decided this is just not what we want long term. And thus, we landed on

14:19.280 --> 14:25.360
reachability analysis. So, the idea of reachability analysis is: what if we knew which instructions

14:25.360 --> 14:32.480
were reachable before we had even loaded them into the kernel? So, this is essentially

14:32.560 --> 14:38.960
recreating a very, very small bit of the verifier, specifically for these load-time

14:38.960 --> 14:45.520
configurations. We adopted this reachability analysis, and it is actually applicable

14:45.520 --> 14:50.960
not only to maps, which is the main use case I'm presenting, but also to tail calls with

14:50.960 --> 14:58.240
hard-coded slots, and BPF-to-BPF functions, specifically global BPF-to-BPF functions, where,

14:58.240 --> 15:04.880
if I don't use them, I might be able to just completely avoid

15:04.880 --> 15:11.440
loading them into the kernel in the first place, and not spend a lot of time loading programs into the kernel

15:11.440 --> 15:16.640
that never end up being used. The same goes for maps, and the same for global BPF programs.

15:18.400 --> 15:22.720
And it actually addresses most of the concerns we had with the other two approaches: we

15:22.720 --> 15:27.440
don't have to load maps ahead of time, we don't have to do a bunch of things. The only cost is

15:28.320 --> 15:33.200
engineering effort, because now we have to do these interesting things in user space beforehand.

15:34.400 --> 15:41.840
So, how does this work? How does reachability analysis work? First, we need to break up our code,

15:41.840 --> 15:48.800
so I took the same code of this example program; we need to break this up into basic blocks.

15:48.800 --> 15:55.520
And a basic block is a block of code, the sequential instructions that are guaranteed to run together.

15:57.440 --> 16:04.800
So, the way we do this is we first go over the code and we identify all of the branching instructions

16:04.800 --> 16:11.840
that we have here, and we say, okay, that's the end of a block. So, we start with one big list,

16:11.840 --> 16:17.280
mark the end of each block, and now we identify all of the points that we jump to,

16:18.640 --> 16:24.000
and every time you can jump to an instruction, that means that these instructions

16:24.000 --> 16:28.960
might not always execute together, so that's the start of a block, and this gives us our

16:28.960 --> 16:36.080
block list. So, we get this list of blocks, and now we can make assertions about them,

16:36.080 --> 16:41.760
and if we visualize this, you get something like this. We have our entry block,

16:41.760 --> 16:46.400
it can jump all the way to exit, it can do these other things. You might have seen this if you are

16:46.400 --> 16:53.120
into reverse engineering, or into compilers, or that sort of thing, and I added a little bit of

16:53.120 --> 16:59.840
annotation here. So, when we do this, we maintain some metadata. We say, okay, we have a branch

16:59.840 --> 17:05.440
block, a fallthrough block, and we have a list of our parent blocks. The bit I left out of this is

17:05.440 --> 17:11.280
BPF functions, because it didn't fit on the slide, frankly, but we also maintain a list of calls: if we

17:11.280 --> 17:18.000
ever do a function call to a BPF-to-BPF function, that's not branching, because it will

17:18.080 --> 17:25.840
always return, but we maintain that just so we can always discover that, if a basic block is

17:25.840 --> 17:31.440
reachable, then we also say, okay, that BPF function is also reachable, and we can recurse through that.
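
The block-splitting pass just described can be sketched as follows. The instruction model and function name are invented for illustration, but the leader rules match the talk: a block starts at instruction 0, at every jump target, and right after every branch.

```go
package main

import "fmt"

// Minimal instruction model: we only need to know whether an
// instruction branches and, if so, where it can jump to.
type insn struct {
	branch bool
	off    int // relative offset: target = index + off + 1
}

// blockStarts finds the indexes that begin a basic block: the entry,
// every jump target, and the fall-through after each branch.
func blockStarts(prog []insn) map[int]bool {
	starts := map[int]bool{0: true}
	for i, ins := range prog {
		if !ins.branch {
			continue
		}
		if t := i + ins.off + 1; t >= 0 && t < len(prog) {
			starts[t] = true // a jump target starts a block
		}
		if i+1 < len(prog) {
			starts[i+1] = true // a branch ends the current block
		}
	}
	return starts
}

func main() {
	prog := []insn{
		{},                     // 0: entry
		{branch: true, off: 2}, // 1: if disabled, jump to 4
		{},                     // 2: feature body
		{},                     // 3
		{},                     // 4: join point
	}
	fmt.Println(blockStarts(prog)) // blocks start at 0, 2 and 4
}
```

From these leaders you can cut the instruction list into blocks and record, per block, the branch target, the fallthrough, and the parent list mentioned above; a recursive walk from the entry block (following calls too) then yields the reachable set.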

17:32.320 --> 17:38.720
So, that's not here, but it does exist in the actual code. So, we've done all of this, we have our

17:38.720 --> 17:46.960
block list, now what? Now we need to discover where our load-time configuration is, and if you go

17:47.040 --> 17:56.080
look at the instructions, if you have these volatile consts, then this is what

17:56.080 --> 18:05.200
it looks like. So, you load the pointer to the map, the read-only map. All of your

18:05.200 --> 18:11.600
constant values live in the same map, in the same array value, so it's an array map with only one

18:11.680 --> 18:17.040
entry, and it's a really long value; the loader actually takes the ELF's read-only data section and

18:17.040 --> 18:23.760
just puts it in a map. So, we have this map, and we take an offset, so we need to know the

18:23.760 --> 18:29.680
map and, in blue, we have the offset. Here it's the only variable, but you can imagine

18:29.680 --> 18:35.520
that if you have a lot of them, then every variable has its own offset. It gets loaded into

18:35.520 --> 18:41.840
this green register, and then that green register is compared; in this example,

18:41.840 --> 18:49.200
it's an equals operation against an IMM value, so a value that is encoded in the instruction

18:49.200 --> 18:58.800
itself. And this is essentially the basic, easy example that we look for. We know

18:59.840 --> 19:04.640
this is a branching instruction, so it must live at the end of a block, so when we go and do this,

19:04.640 --> 19:10.800
we can just loop over all of our blocks, we see if the last instruction of a block is a branching instruction,

19:10.800 --> 19:18.560
and then we backtrack from that, so we go look and see if we can find a case where we find these

19:18.560 --> 19:24.720
instructions. And of course, the register can't be clobbered, so if we loop and we find that between these two

19:24.720 --> 19:30.400
instructions, our R1 gets repurposed for something else, then of course we haven't found it. So we're

19:30.480 --> 19:39.760
looking for this very specific pattern of, okay, we need exactly this, but there are some variations.
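
The backtracking match, and the branch prediction it enables, might look roughly like this in Go. The reduced instruction shapes (`ldx`, `jeq`) and the helper name are invented; Cilium's real analyzer works on actual BPF instructions and handles many more variations, as described next.

```go
package main

import "fmt"

// Reduced instruction shapes for the pattern we search for.
type insn struct {
	op  string // "ldx" (reg = rodata[off]), "jeq" (branch if reg == imm), other
	dst int    // destination register
	off int    // byte offset of the variable in the read-only value
	imm int64  // comparison immediate
}

// predictBranch walks backwards from the branch at index j, looking
// for the load that fills the compared register from the frozen
// read-only data. Any intermediate write to that register clobbers
// the pattern and ok is false. Otherwise it reports whether the
// branch is always taken.
func predictBranch(prog []insn, j int, rodata map[int]int64) (taken, ok bool) {
	jmp := prog[j]
	if jmp.op != "jeq" {
		return false, false
	}
	for i := j - 1; i >= 0; i-- {
		ins := prog[i]
		if ins.dst != jmp.dst {
			continue // unrelated instruction
		}
		if ins.op != "ldx" {
			return false, false // register clobbered before the load
		}
		val, known := rodata[ins.off]
		if !known {
			return false, false
		}
		return val == jmp.imm, true
	}
	return false, false
}

func main() {
	prog := []insn{
		{op: "ldx", dst: 1, off: 0}, // r1 = ENABLE_FEATURE
		{op: "jeq", dst: 1, imm: 1}, // if r1 == 1 goto body
	}
	taken, ok := predictBranch(prog, 1, map[int]int64{0: 0}) // feature off
	fmt.Println(taken, ok)
}
```

A known outcome removes one outgoing edge from the block; blocks no longer reachable from the entry, and the maps or functions only they reference, can then be dropped before the program ever touches the kernel.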

19:39.760 --> 19:44.800
Now, once you've done this, you can actually, because we know which variables we're working

19:44.800 --> 19:50.320
with, because we already have the map value essentially prepared, we can just predict

19:50.320 --> 19:56.160
the outcome of this branching instruction. So we can say, okay, either this always branches,

19:56.160 --> 20:01.600
or this never branches, and then that has implications for our blocks, because if we now know

20:01.600 --> 20:07.680
that we can remove the green or the red arrow, now we suddenly know that, hey, these blocks are

20:07.680 --> 20:14.240
unreachable, or they might be. So we don't need to implement a whole verifier, we only need to

20:14.240 --> 20:21.840
implement this small bit of it. And of course, it's not as easy as just looking for these

20:21.840 --> 20:26.720
instructions right next to each other. So one of the things that can happen is the compiler

20:26.720 --> 20:33.120
gets smart on you, and it's like, okay, we're going to first load

20:33.120 --> 20:36.320
this variable that we're going to compare with, and then we're going to insert a bunch of other

20:36.320 --> 20:41.680
random instructions, including another branching instruction, and only then are we going to use it.

20:42.640 --> 20:48.960
So, now we need to deal with this. What we've done is we looked specifically at

20:48.960 --> 20:54.000
Cilium, and what occurred, and what we found is that in essentially all the

20:54.000 --> 21:01.360
cases that we were able to find, there's usually only one ancestor, only one parent of the

21:01.360 --> 21:07.520
basic block where this is happening. So we don't only just backtrack in the local basic block,

21:07.520 --> 21:12.800
but we can also say, okay, if there's exactly one parent, we can just treat it as if

21:12.800 --> 21:19.520
it were part of the same basic block, because you can only get into this block if you go via this

21:19.520 --> 21:27.120
one parent. So we implemented this, and it can actually follow over multiple blocks, and it again

21:27.120 --> 21:33.360
asserts that this register doesn't get clobbered. There are other interesting cases:

21:33.360 --> 21:42.640
so if you have a signed number, in this case I made a signed 16-bit integer, and the compiler decides

21:42.640 --> 21:53.600
to use a 64-bit comparison instruction, now you need two 64-bit numbers, but you have a 16-bit

21:53.600 --> 22:00.000
number, which you want to load as signed. If you were to just put it in there, it wouldn't work.

22:00.640 --> 22:07.280
Registers are 64-bit; the CPU thinks that, even if I load only 16 bits into one, it'll

22:07.360 --> 22:13.760
still treat it as a 64-bit number. So compilers do this clever thing, where they shift the

22:13.760 --> 22:20.000
number up, and then use an arithmetic shift down, so it preserves the sign, and now you've basically

22:20.000 --> 22:28.720
cast your 16-bit signed number into a 64-bit signed number. But we have to account for the fact that

22:28.720 --> 22:35.440
this register might change, and we also need to track how the actual map value evolves, so

22:35.440 --> 22:42.400
that it matches, so that we can accurately predict the comparison, emulating the CPU instructions

22:42.400 --> 22:51.600
according to the ISA spec. There are variations of this, where it uses 32-bit numbers instead, because there

22:51.600 --> 22:58.560
are 32-bit instructions to do this. So there's a bunch of logic here; the same goes for unsigned numbers.
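
The shift-pair trick is easy to emulate. A sketch in Go, with the shift amount for a 16-bit value: the analyzer has to apply the same two operations (BPF_LSH, then BPF_ARSH) to the known map value so its prediction matches what the CPU will compute.

```go
package main

import "fmt"

// signExtend16 models what the compiler emits to widen a signed
// 16-bit value sitting zero-extended in a 64-bit register: a logical
// shift left by 48 (BPF_LSH) followed by an arithmetic shift right
// by 48 (BPF_ARSH), which smears the sign bit back down.
func signExtend16(v uint64) int64 {
	shifted := v << 48          // move the 16 interesting bits to the top
	return int64(shifted) >> 48 // arithmetic shift preserves the sign
}

func main() {
	fmt.Println(signExtend16(0xFFFB)) // prints -5
	fmt.Println(signExtend16(7))      // prints 7
}
```

In Go, the right shift on a signed integer is arithmetic, which is exactly what BPF_ARSH does on the register; a logical shift there instead would leave the high bits zero and break every negative comparison.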

22:58.560 --> 23:04.880
It can be that the value is masked off, so you actually need to implement these mask

23:04.880 --> 23:09.200
operations as well, and even masking and shifting, if it was for example a bit

23:09.200 --> 23:17.360
field or something like this. Then the last one. I made a mistake on this slide, bonus points

23:17.360 --> 23:26.560
for anyone who spots it, but I think this is the last interesting case

23:26.560 --> 23:33.040
that I'm going to handle, or the second-to-last one. If you have two 64-bit numbers, and you want

23:33.040 --> 23:40.160
to compare them, the compiler actually does this: it uses a register-register compare and puts

23:40.160 --> 23:45.440
the constant in the second register. The reason for this is that the IMM value that

23:45.440 --> 23:51.440
is encoded in the comparison instruction is only 32-bit, so if you need 64-bit

23:51.680 --> 23:59.280
constants, it does it this way instead. So you need to keep track of how this value flows,

23:59.280 --> 24:03.680
so we needed to keep track of this pattern as well. And then the very last edge case, for which

24:03.680 --> 24:14.320
we unfortunately had to do some actual macro stuff in our code, is this. Sometimes the compiler

24:14.320 --> 24:21.040
gets even more clever, and it says, okay, you use this variable, you check this feature

24:21.040 --> 24:27.360
flag, like, 16 times, perhaps we should store it on the stack. But this makes it really difficult to

24:27.360 --> 24:34.320
find this pattern without implementing a whole verifier. So if it does this, where it saves the value

24:34.320 --> 24:39.600
somewhere, and then it reads it back like 40 basic blocks later, that's really annoying,

24:39.600 --> 24:45.360
and we just decided to not handle this case. And the workaround that we found is, if you

24:46.320 --> 24:54.080
change the way you write the code slightly, we can trick the compiler. So what we say to the

24:54.080 --> 25:02.400
compiler is: in this asm block, we always want to dereference this value, we always

25:02.400 --> 25:08.880
want to dereference it, so always emit this right before the branching instruction. So we essentially

25:08.960 --> 25:16.640
force it to co-locate the branching instruction and the dereferencing instruction. Once

25:17.280 --> 25:23.680
we put these together, then, I believe almost guaranteed, it will also put the R1 load instruction

25:23.680 --> 25:32.240
next to them. So it isn't perfect, but it improves the reliability of this enormously.

25:32.880 --> 25:39.920
And that sort of brings me to my final conclusions. So reachability analysis, this whole concept,

25:39.920 --> 25:46.720
seems like a good tool for optimization, because it reduces a lot of time

25:46.720 --> 25:54.400
and memory use during loading, among a bunch of other things. Although I would really also

25:54.400 --> 25:59.760
like to see the verifier get improved. It would be really cool if

25:59.840 --> 26:04.080
we could teach the verifier to just release these maps if they get dead-code eliminated.

26:05.600 --> 26:08.160
But even if that were the case in the future,

26:09.120 --> 26:15.840
there would likely still be a place for this logic, just for the tail calls and

26:15.840 --> 26:21.760
the BPF-to-BPF functions. It should be noted that the system errs on the side of caution,

26:21.760 --> 26:26.560
so we always keep maps unless we can absolutely prove that they are dead.

26:27.520 --> 26:31.360
So the system has false negatives, but doesn't have false positives.

26:33.760 --> 26:38.160
And I wanted to basically put this question to the audience as well: is this

26:38.160 --> 26:43.120
a very Cilium-specific use case? Like, we ran into this, but do other people maybe

26:43.920 --> 26:49.840
see a use for this code, would this be something people are interested in? And then,

26:49.840 --> 26:55.920
a throwaway thought perhaps: could we make this work with signing, or does

26:55.920 --> 27:03.280
doing this user space rewriting, like, make true signing in some way

27:03.280 --> 27:10.320
difficult or impossible? Interesting things to think about; well, maybe we'll do a talk about that

27:10.320 --> 27:15.760
whenever I figure out the answer to that. Then, before I go to the

27:15.760 --> 27:21.680
questions, I realized I completely forgot to put the link to all of this code in the slides,

27:21.680 --> 27:28.080
which is sort of sad. But I do have it, so it lives in the Cilium repository at the moment,

27:28.080 --> 27:35.600
under the package pkg/bpf/analyze. You can just look at it, see what it does. I believe you could even

27:35.600 --> 27:42.240
copy just this code and use it elsewhere; to import it for now, you would need to

27:42.320 --> 27:48.240
import, like, the whole of Cilium, which is a bit excessive, but you can already play with this today.

27:50.400 --> 27:55.440
And yeah, perhaps, if this does end up being useful, we might just split this out

27:55.440 --> 28:01.760
into some separate repository. So, that brings me to questions. I

28:01.760 --> 28:07.840
don't know how much time we exactly have left. Maybe one. Maybe one. Okay.

28:08.800 --> 28:11.440
I see one hand going up.

28:11.440 --> 28:17.200
Hi. Very interesting presentation. Thank you. I was wondering if it's possible, instead of doing

28:17.760 --> 28:24.240
this analysis in user space, to somehow probe the verifier to get the information that you need about

28:24.240 --> 28:28.720
the reachability of certain blocks, so then you just let the verifier do the heavy

28:28.720 --> 28:35.200
lifting. That brings back the double loading problem, of course, but maybe you don't

28:35.280 --> 28:43.600
have to implement that complicated analysis in user space. Yeah. So, technically,

28:43.600 --> 28:49.440
you could just fix this: if the verifier were to release the map refcount,

28:49.440 --> 28:54.640
then you could just load your program. You still have to create the maps, you load them, you have a peak of

28:54.640 --> 29:00.400
memory usage, and then they just get released afterwards, so it's not as nice as this,

29:00.400 --> 29:05.680
but it would be a way to avoid a lot of the user space work. I think that would be a good solution,

29:05.680 --> 29:10.880
at least for the map side of it. Right? I'll go make room for the next speaker.

29:10.880 --> 29:15.120
Do you have any plans to start work on the verifier regarding the map refcount?

29:15.920 --> 29:19.840
I haven't started working on this yet, but I really look forward to seeing if I can get

29:19.840 --> 29:26.640
a kernel patch upstream for that. Right. Thank you, Dylan.

