WEBVTT

00:00.000 --> 00:15.440
I will be talking about PythonBPF, what it is and how it works. But before that, a quick

00:15.440 --> 00:22.680
introduction about me and Varun. So, I am Pragyansh, I work at Canonical in Ubuntu engineering.

00:22.680 --> 00:29.320
I am also the co-maintainer of PythonBPF, which has nothing to do with my day job, and Varun is an

00:29.480 --> 00:34.400
engineering student who works with me on PythonBPF. He could not make it here because

00:34.400 --> 00:40.360
of some reasons. There is another person I forgot to include in this slide, but I have

00:40.360 --> 00:46.560
to acknowledge Karthik here, a PhD student at EPFL, who helped us a lot with

00:46.560 --> 00:57.560
eBPF and was very helpful in guiding us through this project. So, the project: PythonBPF is

00:57.960 --> 01:07.000
a Python frontend for writing eBPF programs. It allows you to write both eBPF and its corresponding

01:07.000 --> 01:13.080
user-space code in Python. They can be in the same file, they can be split

01:13.080 --> 01:20.040
across different files, or they can be in a Python notebook across multiple cells. Now, writing

01:20.040 --> 01:27.480
eBPF in Python is not a new idea and might sound somewhat outdated as well, as BCC

01:27.560 --> 01:35.800
already existed in, like, what, 2015-16. But how PythonBPF does this is very different from BCC.

01:38.680 --> 01:45.960
So, if you have used BCC for writing eBPF programs before, you might

01:45.960 --> 01:50.760
know that you still have to write the BPF-specific part in a multi-line string if you are using

01:50.920 --> 01:57.080
it in the same file, or have a different file and then open that and read that

01:57.080 --> 02:04.520
in your Python file. Your eBPF-specific code is written in a flavor of C and only the

02:04.520 --> 02:11.560
user-space processing part is in Python. In PythonBPF, users use a subset of the existing

02:11.560 --> 02:19.080
Python grammar to write eBPF code, and this has some benefits: Python dev tools like

02:19.160 --> 02:26.680
linters and formatters are useful for the whole file now. This also potentially lowers the barrier

02:26.680 --> 02:33.240
of entry for people to learn eBPF by hiding a lot of typing details and handling common

02:33.240 --> 02:41.640
verifier failure mitigations behind the scenes. So, this is a side-by-side comparison of

02:41.720 --> 02:50.040
the same code written in BCC and PythonBPF. This was one of the examples in the BCC tutorial.

02:51.000 --> 02:56.600
You can see the dev tools at work, with proper syntax highlighting, as compared to the BCC

02:56.600 --> 03:07.240
example given on the left side. This is somewhat superficial when it comes to benefits, as we will

03:07.320 --> 03:16.760
discuss what else PythonBPF enables in our later slides. Let me give you a basic overview of

03:16.760 --> 03:25.320
this program. At the top we have the various imports from PythonBPF: the decorators, the helpers,

03:25.320 --> 03:34.760
the maps, and the type hints, which are necessary. Then we declare a custom data type, data_t,

03:35.720 --> 03:48.360
which has a pid, a timestamp, and the command, which is a char array. Then we also have a perf event

03:48.360 --> 03:56.040
array map, of type BPF perf event array, and then there is the function which attaches itself to the

03:56.280 --> 04:05.640
sys_enter_clone tracepoint and then outputs some data to the perf event array map. So,

04:05.640 --> 04:15.240
that is what a PythonBPF program would typically look like. Let us move on

04:15.240 --> 04:23.480
to how we compile this. So, this is a big image, and I wish I could implore you all to remember

04:23.480 --> 04:29.080
this for the rest of the presentation, but we can come back to this slide again if we need it.

04:30.280 --> 04:38.680
Firstly, as a one-time step, we have to generate vmlinux.py, the vmlinux

04:38.680 --> 04:47.720
header. We generate it using a script that we have written, which runs bpftool to create vmlinux.h

04:47.800 --> 04:56.280
and then runs clang2py on it, with some tweaks, to give us vmlinux.py, so that you can use your running

04:56.280 --> 05:04.760
kernel's data structures in the BPF code that you are writing. Then we take the input as a Python

05:04.760 --> 05:15.080
file or Python notebook source, and the BPF chunks are marked out and processed by PythonBPF.

05:15.160 --> 05:22.920
What are BPF chunks? These are just code snippets which will be compiled to a BPF

05:22.920 --> 05:28.280
object. So, if there is a file that contains both eBPF code and user-space code,

05:29.320 --> 05:36.120
we are just concerned with the BPF chunks; the rest of the code is run through any Python

05:36.200 --> 05:47.720
interpreter. So, here, we first look at vmlinux imports and how they are being used.

05:47.720 --> 05:54.920
So, if there is any kernel data structure or kfunc or something that is present in vmlinux.py,

05:54.920 --> 06:01.160
imported from there and being used, then as a first step we create the LLVM IR for

06:01.800 --> 06:09.720
the said import, and then we create the symbol table for those imports and

06:10.760 --> 06:16.680
inject them into our local and global symbol tables for the input file we have,

06:17.320 --> 06:24.280
which will be populated with other program logic later. Then we do struct processing for

06:24.600 --> 06:32.440
any custom data types you might have declared, and then comes map processing, where

06:33.480 --> 06:39.320
we generate LLVM IR for any maps you might be using. We do this in this specific order because

06:39.880 --> 06:46.280
you might be using those custom struct types in your map declarations, and you will,

06:46.680 --> 06:52.280
of course, always be using those maps in the functions that you will be writing.

06:52.360 --> 06:59.720
So, this is the order we follow. Then comes the real part, function processing, where we

07:01.240 --> 07:09.240
do a pass to kind of guess how many local symbols you need and

07:09.960 --> 07:19.000
how much local stack space you need, and then assign those stack spaces to symbols. At times

07:19.160 --> 07:27.160
you can have the same stack space for multiple symbols. We fill in type metadata and then process

07:27.160 --> 07:39.800
all of the expressions one by one. So, conversion of the Python AST to LLVM IR is done by a

07:39.800 --> 07:47.640
project called llvmlite, which is from Numba. We were thinking of using just LLVM, but we figured,

07:47.880 --> 07:53.880
let us try with llvmlite. It allows us to write everything in Python and, I do not know, this goes

07:53.880 --> 08:06.280
with the spirit of the project. Now that we have that IR file, we can just pass it

08:06.280 --> 08:15.480
to llc and it gives us a BPF object file. We also have a sister project called pylibbpf, which is

08:15.800 --> 08:21.880
essentially just Python bindings for libbpf, and we can take the

08:23.400 --> 08:31.800
struct definitions that we created during this pass and pass them on to our user-space code,

08:31.800 --> 08:37.880
which is using pylibbpf. So, you do not have to declare the same structs twice, and can then use

08:38.760 --> 08:46.360
those map and struct definitions in your user space. So, yeah, that is the compilation flow.
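
NOTE
Editor's sketch, not from the talk: a minimal, hypothetical illustration of how BPF chunks could be
picked out of a source file and ordered as structs, then maps, then functions, as described above.
It uses only the standard-library ast module. The decorator names bpf, struct, map and section are
taken from the talk; find_bpf_chunks, ORDER and the sample source are assumptions made for
illustration, and this is not PythonBPF's actual implementation.
```python
import ast
# Sample input (hypothetical): three BPF chunks and one plain user-space function.
SOURCE = '''
@bpf
@section("tracepoint/syscalls/sys_enter_clone")
def hello(ctx):
    return 0
@bpf
@map
def events():
    pass
@bpf
@struct
class data_t:
    pid: int
def user_space_only():
    pass
'''
# Processing order described in the talk: structs, then maps, then functions.
ORDER = {"struct": 0, "map": 1, "section": 2}
def decorator_name(node):
    # Handle both the bare form (@bpf) and the call form (@section("...")).
    target = node.func if isinstance(node, ast.Call) else node
    return target.id if isinstance(target, ast.Name) else None
def find_bpf_chunks(source):
    chunks = []
    for node in ast.parse(source).body:
        if not isinstance(node, (ast.FunctionDef, ast.ClassDef)):
            continue
        names = [decorator_name(d) for d in node.decorator_list]
        if "bpf" not in names:
            continue  # Not a BPF chunk; left to the normal Python interpreter.
        kind = next((n for n in names if n in ORDER), None)
        if kind is not None:
            chunks.append((kind, node.name))
    # sorted() is stable, so chunks of the same kind keep their source order.
    return sorted(chunks, key=lambda c: ORDER[c[0]])
print(find_bpf_chunks(SOURCE))
```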

08:48.600 --> 08:57.880
Now, I will go through the anatomy of a PythonBPF program through a walkthrough. So,

08:57.880 --> 09:07.160
this is the disksnoop example, which we have ported from BCC to PythonBPF.

09:08.120 --> 09:15.880
First we take in the ctypes imports. These are necessary: even though type hints are

09:16.760 --> 09:25.080
optional in Python, for PythonBPF they are necessary because you need to compile it down to LLVM IR

09:25.080 --> 09:32.040
and there are some places where you just cannot guess types after some point. So, these are

09:32.120 --> 09:39.800
common. And also, when you are creating, suppose, this hash map, which is a map of type BPF hash,

09:39.800 --> 09:49.720
not the data structure hash map, you have to give the key and value types. And they could

09:49.720 --> 09:55.880
have been, like, instead of equals we could have used a colon, but that is something that we will be

09:55.880 --> 10:04.200
talking about in the next slides. Then there are some imports from vmlinux.py: struct pt_regs, which

10:04.200 --> 10:10.600
is the type of the context argument passed to functions attached to kprobes, and then there is the request

10:10.600 --> 10:20.840
struct, which is a common data structure in the vmlinux header. Then we have the decorators: bpf, bpfglobal

10:21.160 --> 10:29.400
(compile is not a decorator), map, and section. These are necessary to point out the BPF chunks

10:29.400 --> 10:37.560
and what kind of BPF chunk they are. Then we have the ktime helper, which maps to bpf_ktime_get_ns, and the hash

10:37.560 --> 10:48.040
map type is imported from pythonbpf.maps; we have listed other map types there as well. So, yeah,

10:50.840 --> 11:04.440
next we have what a function body would look like. So first, if we have to attach a function

11:04.440 --> 11:15.720
to a section, we provide that section using the section decorator. Of course, the bpf decorator

11:15.880 --> 11:25.400
should come first, to mark that this is to be compiled down to a BPF object. And I am not quite

11:25.400 --> 11:32.360
happy with the syntax we use as the argument for section, and I will come to that

11:32.360 --> 11:42.040
later. Then we have trace_completion. I think I have shown this before, and this is a

11:42.920 --> 11:51.400
zoomed-in part of that. So struct pt_regs, as I said, is the type for the context, and this function returns

11:51.400 --> 11:59.960
a c_int64. At the end I have to add a type-ignore for the return value because mypy complains that 0

11:59.960 --> 12:09.720
is not c_int64, but PythonBPF internally would assume a constant is int64 if no type is specified, and

12:09.720 --> 12:16.360
so return 0 is actually a c_int64. mypy doesn't know it, so to satisfy it we have to add it,

12:16.360 --> 12:25.000
you know. Then we have ctx.di. If we just take the type of ctx.di

12:25.000 --> 12:35.480
in LLVM IR it will come out to be int64, but the user knows that it can be converted to

12:35.480 --> 12:41.880
a pointer to a struct, so you just do struct request, get the request object from it, and then

12:41.880 --> 12:47.320
data_len and cmd_flags: the request's data length and the request's command flags. There is also this

12:47.320 --> 12:53.800
interesting bit: we do not know if data_len and cmd_flags are pointers, or pointers

12:53.800 --> 13:02.440
to pointers, or integers. We will just use a dot to dereference to any depth level, and

13:02.520 --> 13:10.360
this is part of the type deduction we have to do. Then another interesting bit is start.lookup.

13:10.360 --> 13:20.440
So in C, for example, if you were doing a map lookup, that would have

13:20.440 --> 13:28.680
been a stateless function where you have to pass the map as well as the argument. But I think

13:29.320 --> 13:36.600
for Python it makes sense to make it look like start is a data container and you can do a lookup

13:36.600 --> 13:45.400
on it. You can also overload the dunder method for indexing, and then you have start, square

13:45.400 --> 13:53.160
brackets, request pointer, and that does it pretty well. Then we have the usage of ktime minus

13:53.240 --> 14:00.760
req_tsp, and this would be interesting because the types of ktime and req_tsp

14:00.760 --> 14:07.320
do not match here, but still it works. Then we have print, which allows you to use Python-like

14:07.320 --> 14:15.560
format strings. print is actually bpf_printk, so you would see there are only three

14:15.640 --> 14:24.440
arguments, because we cannot do four. And lastly, I won't explain trace_start

14:24.440 --> 14:31.080
because I think the explanation of trace_completion covered the things happening here pretty

14:31.080 --> 14:39.000
well. There is this part, bpfglobal, where we specify the GPL license. You need

14:39.000 --> 14:47.160
to do this to use many of the helpers and kfuncs in the Linux kernel. We haven't

14:47.160 --> 14:53.560
automated it here; it has to be done by the user. At the end is the compile function,

14:53.560 --> 14:58.840
and this is important: the compile function has to be at the end of all BPF chunks.

14:58.920 --> 15:10.840
What compile does is spit out an object file, which is a BPF object file.

15:10.840 --> 15:18.680
There are some variations of compile that you can use. You can use compile_to_ir to only get

15:18.680 --> 15:24.520
the LLVM IR instead of the object file, if you need to inspect something while debugging yourself.

15:55.480 --> 16:04.520
Oh, nice. Instead of compile and compile_to_ir you can also use the BPF object from pylibbpf,

16:04.520 --> 16:12.440
which works pretty much like what BCC user-space code does, so you can have all of your

16:12.440 --> 16:21.000
things in the same file, and that's pretty neat. Then, coming to demos and examples: this was presented

16:21.080 --> 16:27.080
at Linux Plumbers Conference 2025, where we presented some demos. I don't want to do that again

16:27.080 --> 16:32.440
because of two reasons: (a) the demo was like 10 minutes, showing four different things; (b)

16:32.440 --> 16:38.440
there are some people who have attended LPC 2025, and this talk would have been like

16:38.440 --> 16:47.320
a waste of time for them if I did this again. But I do implore you to go see the recording

16:47.320 --> 16:54.600
from this timestamp, five minutes 52 seconds. We did four examples: a TUI-based container monitor,

16:54.600 --> 17:02.040
syscall anomaly detection for Spotify, then kernel symbolization using

17:02.040 --> 17:13.000
blazesym, and a vfs_read latency example in different forms: in a Python notebook, then in a TUI and a

17:13.560 --> 17:19.720
web dashboard, and how you can leverage the Python ecosystem to make your BPF tools better using

17:19.720 --> 17:30.520
PythonBPF. Also, I welcome you to mess with the project and the examples we have on GitHub. Most

17:30.520 --> 17:36.840
of these are ported over from BCC. We have a try-it-out section where we list the steps you need

17:36.840 --> 17:48.280
to take to set up PythonBPF and mess with it. I think I have covered bits of this slide:

17:49.480 --> 17:58.440
the bpf decorator is what marks a BPF chunk, and then you can specify if it has to be

17:58.440 --> 18:04.040
a struct, if it has to be a map, if it's a global, or if there's a section that you need to

18:04.040 --> 18:14.520
attach it to. So those are the decorators. A BPF chunk will have at least two of these decorators:

18:14.520 --> 18:24.120
one for marking it, one for specifying it. Now, a big challenge while working on it, and one

18:24.120 --> 18:30.440
of the things that we are still working on, and that's why the slide is named "The internals: trying

18:30.440 --> 18:38.200
to make typing work". One of the reasons why we are still working on PythonBPF is to lower the

18:38.200 --> 18:48.760
barrier of entry to learning BPF and writing those programs, and abstracting away typing has some

18:48.760 --> 18:54.840
benefits, because now people don't have to know everything about what they're writing, and

18:54.920 --> 19:04.360
if we handle some vmlinux or verifier errors and typing ourselves behind the scenes, it makes

19:04.360 --> 19:15.480
for a much easier experience. So let's define an action on a data container, or data, to be an

19:15.480 --> 19:27.080
operation which involves the said data, or maybe a function call to which this data is an argument.

19:27.080 --> 19:37.320
Then we can find such action points in the input file and employ our type deduction on them,

19:37.320 --> 19:42.840
because these action points are the only places where we have to care about the data type of the

19:42.920 --> 19:52.200
container in question. I tried to read about auto and decltype and how they perform type

19:52.200 --> 19:59.720
deduction in C++, and the outcome is to have a set of rules and just wing it. I don't know if we can

19:59.720 --> 20:06.520
ensure that all cases will be covered by us; that might need some formal proof, or I don't know.

20:06.520 --> 20:12.280
But for now, we look at the expected type and the current type of the data at any action point

20:12.280 --> 20:18.040
and check if we can actually convert between them according to our set of rules, and then try to do it.
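
NOTE
Editor's sketch, not from the talk: a minimal illustration of checking, at an action point, whether
the current type can be converted to the expected type under a small rule set. The two rules shown
(dereference pointers to any depth when an integer is expected, and widen a narrower integer) are
assumptions chosen for illustration; plan_conversion, is_pointer and is_integer are hypothetical
names, ctypes types stand in for the compiler's internal types, and PythonBPF's real rules may differ.
```python
import ctypes
def is_pointer(tp):
    # Pointer types created via ctypes.POINTER() subclass ctypes._Pointer.
    return isinstance(tp, type) and issubclass(tp, ctypes._Pointer)
def is_integer(tp):
    return tp in (ctypes.c_int32, ctypes.c_uint32, ctypes.c_int64, ctypes.c_uint64)
def plan_conversion(current, expected):
    """Return the steps turning `current` into `expected`, or None if no rule applies."""
    steps = []
    # Rule 1: peel pointer layers while the action point expects an integer.
    while is_pointer(current) and is_integer(expected):
        current = current._type_
        steps.append("deref")
    if current is expected:
        return steps
    # Rule 2: widen a narrower integer to the expected wider integer.
    if is_integer(current) and is_integer(expected):
        if ctypes.sizeof(current) < ctypes.sizeof(expected):
            return steps + ["widen"]
    return None
u64 = ctypes.c_uint64
ptr_ptr_u64 = ctypes.POINTER(ctypes.POINTER(u64))
print(plan_conversion(ptr_ptr_u64, u64))      # pointer to pointer: two dereferences
print(plan_conversion(ctypes.c_uint32, u64))  # narrower integer: one widening
print(plan_conversion(u64, ctypes.c_uint32))  # narrowing: no rule applies
```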

20:18.840 --> 20:24.200
One quirk which we have in PythonBPF is that at times you need to convert some rvalues to

20:24.760 --> 20:38.760
lvalues. Suppose you have a hash map with int keys and int values, and if you look

20:38.760 --> 20:48.440
at the helpers in the docs and see the helper signature, you'll find out that the keys,

20:48.760 --> 20:56.600
for lookup, the key has to be an lvalue. But PythonBPF allows you to do stuff like

20:57.800 --> 21:07.480
map.lookup(1), or it can be 1 + 1, or it can be ktime;

21:09.320 --> 21:18.040
this returns an int64. And this can be any expression, plus one, plus, I don't know, or x + 1.

21:18.920 --> 21:27.320
Now, the outcome of these operations is usually an rvalue, but you have to pass it to a helper

21:27.320 --> 21:35.800
which needs the address of this thing. So this introduces a lot of complexity in how we allocate

21:36.600 --> 21:45.160
stack space to local variables. Another quirk of working with BPF is that you can only

21:45.640 --> 21:56.040
allocate stack space in the first basic block of a function. So if your function has any if statement

21:56.040 --> 22:04.200
and you are creating a variable inside the body of that if statement, we need to see if we

22:04.200 --> 22:12.520
will ever reach that if statement and then probably create space for that as well. Now, another thing

22:12.520 --> 22:20.200
is that map lookup, this function takes an lvalue, but what kind of lvalue? Maybe it takes,

22:20.200 --> 22:26.040
not just this helper but any function which we might have already declared, it can be like

22:27.000 --> 22:33.800
a pointer to a pointer, and you are passing just one integer. So you need to create

22:34.520 --> 22:43.000
two temporary stack spaces here, which allow you to first create a pointer and then

22:43.000 --> 22:51.640
a pointer to a pointer, and then pass it over to the helper. Because of this we have to create

22:51.640 --> 22:59.240
scratch spaces, where you go through each basic block and see how many such temporary

23:00.200 --> 23:09.080
scratch registers you need, and then create them for each function. There is also the depth

23:09.800 --> 23:19.000
problem. I think this was what I was hinting at here: ktime minus

23:19.000 --> 23:25.320
req_tsp. Here it is very easy, because req_tsp is a pointer and you just need to

23:25.400 --> 23:34.760
dereference it, because we know that the outcome of this expression has to be an int64 or

23:34.760 --> 23:42.920
int32, depending on what the widest integer in this expression is. ktime is int64, so this will be

23:42.920 --> 23:50.840
64-bit. So dereference req_tsp until you get a 64-bit integer; if you don't, then just don't

23:51.000 --> 23:57.240
do it. That's one of the things we do, and we also have to do it the other way, like

23:57.960 --> 24:03.960
allocate as much stack space as you can until you get to a point where you can pass it

24:04.920 --> 24:12.040
with the right type to the function or helper you want to pass it to. Then,

24:12.920 --> 24:18.680
Victor's talk here was also interesting for strings and char arrays, and this is a problem that we

24:18.920 --> 24:25.320
are facing here: there is no clear distinction between strings and char arrays in PythonBPF.

24:25.320 --> 24:32.120
So if a function needs a pointer to a string, you just give it a pointer to the first character

24:32.120 --> 24:40.520
of the character array. But now if you want to have a character array at an action point, we'll first

24:40.520 --> 24:47.480
do a call to the bpf_probe_read_kernel_str helper and give the output of that string, behind the

24:47.480 --> 25:00.200
scenes, to whatever action point that char array is required at. So, yeah, coming back.

25:00.600 --> 25:07.000
So this is somewhat difficult, and we constantly test and improve our type deduction. This is

25:07.000 --> 25:13.480
an area where we face most of our hiccups whenever we try something new in PythonBPF.

25:14.440 --> 25:22.600
This is an example which, I think I did go back to this example, like here, and

25:22.600 --> 25:31.240
if you see, you don't need to worry about types. Then we have this BPF_CORE_READ, which will read the

25:33.240 --> 25:39.800
exact data for you, but here we do the same thing using just the dereference, the struct member

25:39.960 --> 25:56.200
syntax. And that's it for this slide, I believe. One more thing about this is

25:56.200 --> 26:06.200
that you might see that this looks cleaner and less complex, and this is not a code

26:06.200 --> 26:13.960
golf thing. It's fine if the user doesn't know about exact types, as I don't think

26:13.960 --> 26:26.440
they should need to know that to write BPF programs. Then we have this about vmlinux.py and how it works.

26:26.440 --> 26:35.880
The quirk here was that we have to generate debug info for this, and we encountered some types

26:35.960 --> 26:42.440
that we weren't encountering when we wrote programs without vmlinux, like function pointers or

26:42.440 --> 26:50.680
structs within structs within structs, and that led to more challenges, which we overcame.

26:52.760 --> 27:01.160
And then, what PythonBPF does well: it tries to shield users from typing and verifier

27:01.480 --> 27:08.920
pitfalls with basic verifier failure mitigations, like auto-generating null checks for pointers

27:08.920 --> 27:17.000
or boundary checks for raw data from packets. Then, a concise Pythonic syntax;

27:18.040 --> 27:23.080
you can leverage the Python ecosystem to create data analysis and visualization tools,

27:23.160 --> 27:33.880
which we have demoed in the LPC recording, which I urge you to watch. And this can be ideal for

27:33.880 --> 27:44.920
beginners to eBPF and for quick prototyping. And there are some things that we still have to work on.

27:45.480 --> 27:51.640
We got some nice reviews after the presentation at LPC from the attendees. One of the things that we

27:51.720 --> 27:59.320
have to work on is that it's not quite feature complete, which is true. We only have support for some

27:59.320 --> 28:07.080
maps and helpers right now, partly because we started by handwriting all of the helpers.

28:07.080 --> 28:16.760
But now that we can parse vmlinux.py and have a somewhat robust type inference system, we can

28:16.760 --> 28:23.720
probably auto-generate Python signatures from vmlinux.py and try to,

28:25.640 --> 28:34.120
instead of writing it by hand, just use the exact helper or kfunc and pass arguments,

28:34.120 --> 28:38.680
and it should work, barring some cases in which we will still have to write exceptions.

28:39.640 --> 28:47.880
There was also this review that in bpftrace someone wrote a snake game; can someone write a

28:47.880 --> 28:53.640
snake game in PythonBPF? And, yeah, you cannot right now, because we don't support for loops.

28:55.160 --> 29:01.800
And this becomes a whack-a-mole game then: you present this project to someone and someone will

29:01.960 --> 29:09.560
come with a feature request, and then you rush to complete that feature request. This might work,

29:09.560 --> 29:18.200
but this is not how a project should work. We should know what things we have to implement

29:18.200 --> 29:25.720
for this to work. One idea for that was to try to port all the kernel selftests to PythonBPF,

29:26.600 --> 29:36.200
but that seemed like a pretty big undertaking. True, it's very representative of what kind

29:36.200 --> 29:44.760
of BPF programs you can write, but it's a huge amount of code that we will have to work on. So

29:44.760 --> 29:56.280
another thing that we would focus on would be making a simple minimum viable product

29:56.280 --> 30:01.480
and perfecting it, and that comes to the second thing: establishing a feedback loop with our

30:01.480 --> 30:07.480
early adopters. People will come, try this, find one or two features they don't like

30:07.480 --> 30:14.040
or maybe PythonBPF doesn't support, and leave it. We need to have a feedback loop with

30:14.040 --> 30:22.040
our early users to be able to make this better. So we need users whose needs and use cases are

30:22.040 --> 30:29.640
very clearly defined, and we need to have a rapport with them to have constant feedback, a good

30:29.640 --> 30:39.000
feedback loop. This can solve the problem we had with kernel selftests as well. It's a huge thing,

30:39.000 --> 30:47.960
but what if we create it for grad students first, for courses where you have to do a lab?

30:49.640 --> 31:03.400
Is the time done? Yeah, okay. So, this is something we're working on: to make a lab for

31:03.560 --> 31:15.080
my alma mater for the fall sem. That would be a good feedback loop from

31:15.080 --> 31:23.560
the students. There's also how opinionated PythonBPF has to be. So, for things like

31:23.560 --> 31:29.880
ktime, that can be time, and using print instead of printk, because print already exists in Python,

31:29.960 --> 31:38.120
so why should we use the whole bpf_printk thing? And instead of having string arguments to sections,

31:38.120 --> 31:45.960
having named tuples, so that your IDE hints can help you find if a specific tracepoint exists or not.

31:46.680 --> 31:54.680
So things like that can be added. The goal is to make PythonBPF a prime choice and make this

31:55.640 --> 32:06.520
accessible. And this is the summary of this presentation, and the links. I didn't have a QR code,

32:06.520 --> 32:13.800
but you can download and visit them. And that's all. Any questions?

32:14.440 --> 32:32.760
So you mentioned that you're reproducing the tools from BCC as examples using PythonBPF. Do you have

32:32.760 --> 32:42.120
some other tools, like, proper to PythonBPF as well, that are not in BCC? I mean, not tools, but we have, like,

32:44.200 --> 32:50.840
kind of tests. So we implement a feature and then try to write tests for that, and those are

32:50.840 --> 33:00.040
not tools that people can use. We pivoted to converting BCC tools to

33:00.040 --> 33:06.600
PythonBPF because we thought that they might already have users. Maybe just a small comment: if you want

33:06.600 --> 33:13.480
to get grad students interested, I think it would be good to support, for example, BPF

33:13.800 --> 33:21.560
LSM or sched_ext to implement schedulers, and start experimenting with that, to make it really easy

33:21.560 --> 33:29.480
for students to hack on it, or like networking, you know, some cases like that. Yeah, that's something

33:29.480 --> 33:36.040
we are discussing with the professors, what exactly they need for their labs. And since we have

33:36.040 --> 33:42.440
a good timeline for the fall sem, we can implement all of these and then try to see if they're robust,

33:42.520 --> 33:48.200
because students would essentially be fuzzing it for us. So, yeah, that's the plan.

33:50.600 --> 33:54.600
All right, thank you. Thank you.

