WEBVTT

00:00.000 --> 00:10.680
So, yeah, my name is Mikhail, and I'm going to talk about Aya, and so about what's new

00:10.680 --> 00:14.080
in Rust for eBPF.

00:14.080 --> 00:20.000
We talked about Aya in previous editions of FOSDEM, so this is not going to be an

00:20.000 --> 00:25.840
introductory talk, quite the opposite, we'll deep dive into what's new, but for people

00:25.840 --> 00:33.280
who don't know, Aya is an eBPF library written purely in Rust, and it allows you to write

00:33.280 --> 00:37.680
both the user space part and the eBPF part in Rust.

00:37.680 --> 00:42.720
Of course, mixing with other languages is possible, like writing the eBPF component in certain

00:42.720 --> 00:47.440
other languages and still using Aya in user space, or the other way around, but you

00:47.440 --> 00:50.480
can also do both in Rust.

00:50.480 --> 00:56.560
So the first question, which we usually hear from people when we say that we do eBPF in Rust,

00:56.560 --> 01:04.560
is why, because the main selling point of Rust is not so valid here, because eBPF is

01:04.560 --> 01:11.440
already memory safe, because of the verifier, and well, my answer to that is usually that our

01:11.440 --> 01:16.800
choice of Rust, and our preference towards Rust is more about the developer experience

01:16.800 --> 01:19.920
rather than the memory safety.

01:19.920 --> 01:26.080
So we like Cargo, and many people who are Rust developers today like Cargo, they

01:26.080 --> 01:33.760
like using packages, and they like all the tooling which is built around Cargo, and in Aya

01:33.760 --> 01:35.440
it's similar.

01:35.440 --> 01:41.600
For example, to start with Aya, you can use cargo generate to scaffold the project.

01:41.600 --> 01:46.640
You can also use dependencies from crates.io, so actually even in the eBPF part, you can

01:46.640 --> 01:54.240
use crates if they are small enough, and last but not least, in Rust there are Option

01:54.240 --> 01:59.600
and Result and pattern matching, and you can leverage those in kernel space in eBPF as well.
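
NOTE
A minimal sketch of the Option, Result and pattern matching style mentioned above.
It is plain Rust that only touches items also available in core, and it is not Aya's API:
Verdict, ether_type and classify are hypothetical names made up for this illustration.
```rust
/// EtherType values for IPv4 and IPv6 (IEEE-assigned constants).
const ETH_P_IPV4: u16 = 0x0800;
const ETH_P_IPV6: u16 = 0x86DD;
/// Hypothetical verdict type for this sketch (not an Aya type).
#[derive(Debug, PartialEq)]
pub enum Verdict {
    Pass,
    Drop,
}
/// Reads the EtherType at bytes 12..14 of an Ethernet frame, or None if the frame is too short.
pub fn ether_type(frame: &[u8]) -> Option<u16> {
    let bytes = frame.get(12..14)?;
    Some(u16::from_be_bytes([bytes[0], bytes[1]]))
}
/// Passes IP traffic, drops everything else, and reports truncated frames as an error.
pub fn classify(frame: &[u8]) -> Result<Verdict, &'static str> {
    match ether_type(frame) {
        Some(ETH_P_IPV4) | Some(ETH_P_IPV6) => Ok(Verdict::Pass),
        Some(_) => Ok(Verdict::Drop),
        None => Err("frame shorter than an Ethernet header"),
    }
}
```
The bounds-checked get and the early return through the ? operator keep every access explicit, which is the kind of code a verifier-friendly program tends to need anyway.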

02:00.800 --> 02:06.640
So cargo generate is a subcommand of Cargo, which you need to install additionally,

02:06.640 --> 02:11.280
which generates projects from templates, and we have a template as well.

02:11.520 --> 02:18.080
So the best way to start with Aya, if you want to do it, is to run cargo generate with aya-rs/aya-template.

02:18.800 --> 02:24.160
And when you type this command, it will prompt you, it will ask you some questions,

02:24.160 --> 02:30.960
it will ask you what type of program you want to write, whether it's XDP, TC, or LSM,

02:32.160 --> 02:36.560
it will ask you some follow-up questions specific to the program, and then,

02:36.880 --> 02:41.280
yeah, it will generate something which is kind of a hello world for

02:41.280 --> 02:48.000
the given program type, so for XDP it will give you a program which logs packets, for

02:48.640 --> 02:55.520
LSM, it will do some basic LSM hook for, I don't know, accepting a certain action.

02:56.720 --> 03:04.160
And about crates, so I mentioned that crates which you use as dependencies in the eBPF part need to be small enough,

03:05.120 --> 03:10.080
in Rust there is a concept called no_std, which means no standard library,

03:11.120 --> 03:16.560
which means that you can use only the smaller version of the standard library, called core,

03:16.560 --> 03:22.400
which has no operating system dependency, it doesn't have allocations, it doesn't have

03:22.400 --> 03:27.200
synchronization primitives, but otherwise it's still a basic library

03:27.280 --> 03:34.960
which Rust provides out of the box, and one of the main external crates which you can use

03:34.960 --> 03:42.160
in kernel space when you write programs in Aya is network-types, it provides you the

03:42.160 --> 03:50.320
definitions of layer 2 and layer 3 networking headers. When you write eBPF programs with libbpf,

03:50.320 --> 03:55.440
the way you get the network types is usually by taking them from vmlinux.h,

03:55.440 --> 04:03.360
so you use bpftool to generate a C header file from BTF, and that includes the network types

04:03.360 --> 04:08.720
as well, but on the other hand, network types don't really change, they are standardized,

04:08.720 --> 04:16.160
so we decided that, yeah, we will ship it as a crate, and the big pro of this approach is that

04:16.800 --> 04:23.280
in kernel space you can use the network-types crate with a minimal set of features,

04:23.360 --> 04:29.120
but then in user space you can do some more sophisticated packet parsing and serialization,

04:30.800 --> 04:38.400
and yeah, we were presenting about Aya previous times as well, and the main points which

04:38.400 --> 04:45.680
changed since that time are, first of all, we have an integration test framework, we are testing

04:45.760 --> 04:51.760
stuff on ARM, and we are also running unit tests on many other architectures,

04:52.880 --> 05:00.000
we ported all the infrastructure around Aya from using perf buffers to ring buffers, and that includes

05:00.000 --> 05:08.400
the logging part of Aya, so the standard way of printing logs in Aya is using aya-log,

05:08.400 --> 05:14.400
and aya-log is using ring buffers to basically push the information from kernel space to user space,

05:14.400 --> 05:21.840
and then the user space component is basically formatting those arguments and printing the output

05:21.840 --> 05:27.680
on your stdout by default. We added more map types, for example, SK storage,

05:27.680 --> 05:32.960
task storage is coming very soon, we added support for flow dissector and iterator programs,

05:34.160 --> 05:41.440
we added support for the entirety of Rust types in BTF, and I will dive into that and I will dive

05:41.440 --> 05:48.320
exactly into the status of BTF support in the Aya ecosystem, and we also added TCX links.

05:50.400 --> 05:57.920
Well, to give you some context, BTF is a type format, a debug-info sort of type format,

05:58.720 --> 06:07.440
which is way smaller than DWARF, and it's used in the BPF world to do relocations of kernel types

06:07.440 --> 06:17.280
across versions, so for example, if you build your BPF program with kernel 6.2 and you load it on kernel 6.12,

06:18.480 --> 06:23.840
the BTF relocations are basically a way of figuring out what's the offset of the field

06:24.560 --> 06:33.680
of the kernel structure you're accessing, and BTF is generated in two ways, so for BPF programs it is generated

06:33.760 --> 06:39.760
by the compiler, so if you build your BPF program, then LLVM is basically responsible for producing

06:39.760 --> 06:47.120
BTF. When you build the kernel, you know, the kernel contains DWARF debug information,

06:47.120 --> 06:54.720
so what happens on the kernel side is that there is a tool called pahole, which then

06:54.720 --> 07:03.280
transpiles DWARF into BTF. For people who are not familiar with the LLVM architecture,

07:03.360 --> 07:12.080
I mean, here is a very basic outline of it. LLVM is, you could say, a library for writing your own compilers,

07:12.080 --> 07:19.280
which provides you all the backends and LLVM provides all the infrastructure to build

07:19.280 --> 07:27.760
binaries for different architectures, and how you can use LLVM is to create your own programming

07:27.840 --> 07:33.840
language, or to support some programming language, while not worrying about

07:36.320 --> 07:42.640
compiling to the target code, and both Clang and rustc, the default implementation of Rust,

07:42.640 --> 07:52.080
are using LLVM underneath, and then how LLVM works, well, frontends are supposed to generate

07:52.640 --> 07:59.440
the intermediate representation of LLVM, which is like a very minimal language, which then

07:59.440 --> 08:08.640
inside LLVM is optimized first, and then compiled to machine code, and when you have

08:08.640 --> 08:17.920
simple Rust code like this, when it's compiled to LLVM IR, LLVM has this representation

08:17.920 --> 08:24.880
of debug info, which is very similar to DWARF, it already has DWARF tags, it's not technically

08:24.880 --> 08:31.520
DWARF yet, but it's like one step before having DWARF, it's a lot of information, basically

08:31.520 --> 08:40.480
for every structure you are using, and for like all the details of your code, it has information

08:40.480 --> 08:48.800
including also in which part of the file that code is, but BTF is

08:48.800 --> 08:56.480
minimal, BTF doesn't need everything from that, so from this debug info, translating it

08:56.480 --> 09:02.320
to DWARF would be basically translating it one to one, and for BTF we kind of minify it just to

09:02.880 --> 09:13.440
have enough information to make relocations of the offsets, and BTF relocations live in an

09:13.440 --> 09:23.840
ELF section and LLVM produces them using intrinsics, and all the language frontends in LLVM which

09:23.920 --> 09:34.240
support BTF are supposed to use those intrinsics to define the access to the fields for

09:34.240 --> 09:40.480
the structs in which we want to support BTF relocations, and in LLVM there is a concept of

09:40.480 --> 09:46.720
the getelementptr instruction, which is basically accessing the fields of structs or enums or

09:46.800 --> 09:55.120
arrays, and what should happen in the compiler frontends which want to leverage

09:55.120 --> 10:02.000
BTF relocations is to detect the getelementptr calls which concern the structs related

10:02.000 --> 10:09.840
to the kernel, and then kind of wrap them in those intrinsics, and in Clang this is achieved by

10:10.160 --> 10:17.600
adding the preserve_access_index attribute to the struct for which we want relocations,

10:17.600 --> 10:24.960
so for example, task_struct is very common, it's a struct in the kernel which represents the

10:24.960 --> 10:32.640
tasks, the processes. To make this example very short, I'm just including the PID of the process,

10:33.280 --> 10:39.600
and when you write code where you, I don't know, access or return the pid field,

10:40.800 --> 10:47.200
and you have this preserve_access_index attribute, you have this BTF, and let's assume that in the

10:47.200 --> 10:55.600
kernel where you are compiling this program, pid is the 96th field, and it has an offset of

10:55.600 --> 11:06.400
like 21,000 bits. What you end up with in BPF assembly is having a pointer to this task,

11:07.120 --> 11:13.680
and then adding an offset of a certain amount of bytes, and this amount of bytes

11:13.680 --> 11:19.920
corresponds to this amount of bits, and yeah, that's how it ends up in the assembly, but at the

11:19.920 --> 11:26.640
same time, because we added preserve_access_index, in the ELF section for BTF relocations,

11:26.640 --> 11:36.560
.BTF.ext, you have this information that for task_struct and the field pid, at assembly line 5,

11:36.560 --> 11:45.520
the insn offset refers to the number of the line in the assembly code of your program,

11:46.240 --> 11:51.760
like here, and this is the information that the loader of the BPF program needs to do a relocation

11:51.760 --> 11:59.840
there and make sure to check what's the offset of the pid field in the kernel on which the

11:59.840 --> 12:08.080
program is being loaded, and BPF loaders, like Aya or libbpf, are responsible for doing those

12:08.080 --> 12:13.840
relocations, so relocations are not done in the kernel, they are done by the user space loader,

12:13.840 --> 12:21.440
which is about to load the program, and both Aya and libbpf are able to do it, so basically when

12:21.440 --> 12:29.120
you load the program and you have the relocation, it will also, alongside, load the BTF from

12:29.120 --> 12:37.360
the kernel and will fix up your bytecode, so for example in this case, again, your program has

12:37.440 --> 12:42.640
different offsets, and the kernel has different offsets, for example in the kernel like there's one

12:42.640 --> 12:48.240
more field added before it in task_struct, and the offset is a bit further,

12:49.520 --> 12:56.240
the BPF loader is supposed to do this patching, and Aya does this kind of thing,
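
NOTE
A toy model of the patching step described above, assuming a very simplified view of a program.
It is not how Aya or libbpf implement relocations: Insn, FieldReloc, TargetBtf and apply_relocs
are hypothetical names, and the offsets used with them are made-up numbers, not real kernel layouts.
```rust
use std::collections::HashMap;
/// Toy stand-in for a BPF load instruction: only the field offset matters here.
#[derive(Debug, Clone, PartialEq)]
pub struct Insn {
    pub off: i16,
}
/// Toy relocation record: instruction `insn_index` accesses `field` of `type_name`.
pub struct FieldReloc {
    pub insn_index: usize,
    pub type_name: &'static str,
    pub field: &'static str,
}
/// Toy type information of the target kernel: (type, field) mapped to an offset in bits.
pub type TargetBtf = HashMap<(&'static str, &'static str), u32>;
/// Rewrites every relocated instruction with the byte offset found for the target kernel.
pub fn apply_relocs(
    insns: &mut [Insn],
    relocs: &[FieldReloc],
    target: &TargetBtf,
) -> Result<(), String> {
    for r in relocs {
        let bits = *target
            .get(&(r.type_name, r.field))
            .ok_or_else(|| format!("{}.{} not found in target BTF", r.type_name, r.field))?;
        // Member offsets are recorded in bits, instructions address memory in bytes.
        let bytes = bits / 8;
        if bytes > i16::MAX as u32 {
            return Err(format!("offset {} does not fit the instruction", bytes));
        }
        let insn = insns
            .get_mut(r.insn_index)
            .ok_or_else(|| format!("instruction {} out of range", r.insn_index))?;
        insn.off = bytes as i16;
    }
    Ok(())
}
```
The point of the sketch is the division of labor: the compiler only records which instruction touches which field, and the loader looks up the real offset at load time.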

12:58.240 --> 13:05.760
but the missing piece in Rust is emitting the relocations, so Aya is capable of doing

13:05.760 --> 13:12.240
relocations when they are there, so if you take a BPF program written in C, compiled with Clang,

13:12.240 --> 13:17.920
and load it with Aya, Aya is able to do the relocations. The missing piece which we are working on

13:17.920 --> 13:25.920
is adding the relocations support in the Rust compiler, and our idea is basically like

13:25.920 --> 13:31.120
to do something very similar to what you have in Clang, to add this preserve_access_index

13:31.600 --> 13:38.480
attribute to the types, and then the magic of generating the relocations would happen,

13:39.200 --> 13:45.440
and to sum it up, the BTF support in Aya, you know, one or two years ago it would be

13:45.440 --> 13:54.560
mostly red, but nowadays the first thing is that when you have Rust code, all the

13:54.560 --> 14:00.640
Rust types can be converted into correct BTF information by LLVM. Aya is able to do

14:01.120 --> 14:12.640
relocations, and very soon, already on master, we have map definitions fully compatible with BTF,

14:14.080 --> 14:20.320
the last missing piece is the compiler support in Rust for emitting the relocations,

14:21.520 --> 14:29.680
and actually last year we did quite a bit of work on the first point,

14:30.000 --> 14:40.160
and we had to do some work in LLVM to support all the Rust types, and we had to basically

14:40.160 --> 14:45.920
do two things: we needed to add support for variant parts, and if you don't know what

14:45.920 --> 14:51.440
that is, I will explain on the next slide, and we also had to strip the names of pointer types

14:51.440 --> 14:58.400
in the debug info, so basically like the small difference between C and Rust is that C by default,

14:58.400 --> 15:04.240
if you have a pointer type, like for example *const task_struct, it's not going to generate

15:05.200 --> 15:11.760
a name for it in debug info, Rust would do it, and then the kernel and verifier and all the BTF

15:11.760 --> 15:18.400
infrastructure kind of relies on pointer types not having a name, so we had to do a little fix

15:18.400 --> 15:26.480
up there, but variant parts were the most crucial work we had to do in LLVM in order to

15:26.720 --> 15:35.680
support generation of really the entirety of Rust types in BTF, and a variant part is kind of like a union

15:35.680 --> 15:45.600
in C, but it has a discriminant at the very beginning, and for example like in C, the union is

15:46.400 --> 15:51.760
a segment of memory which can carry different variants, and usually it would take the

15:51.760 --> 15:59.680
amount of memory of the biggest variant, and for example, this union has two

15:59.680 --> 16:05.520
variants, it has this struct with two integer fields or it has just one integer, so by default

16:05.520 --> 16:10.320
this union like regardless of which variant it carries would have like enough space for this

16:10.320 --> 16:21.680
upper struct, and in Rust, a very similar thing to C unions are enums, and

16:21.680 --> 16:28.560
they look the same in the code, but the crucial difference is that in C the unions don't have

16:28.560 --> 16:34.960
the discriminant, and most C developers kind of work around it, because normally

16:34.960 --> 16:42.480
if you just have a union by itself, if you make a mistake and you make an assumption that it's

16:42.480 --> 16:48.320
variant B and it's variant A, you are kind of in trouble, it's undefined behavior, of course

16:48.320 --> 16:53.120
there are also C developers who would embed this union into some struct and have a

16:53.120 --> 16:58.400
field indicating which variant it is or have some like integer outside, or like write the code

16:58.400 --> 17:03.440
in a way that like they are really sure that in the certain part of code, the certain variant

17:03.440 --> 17:10.080
like it's always going to be there, but in Rust they have this discriminant just for like

17:10.080 --> 17:19.920
extra safety, and well at the beginning like when we started the work on BTF, and we weren't

17:19.920 --> 17:27.120
aware of, you know, the BPF backend in LLVM not supporting variant parts, if you had this kind of

17:27.280 --> 17:33.200
enum in Rust and you tried to compile it for the BPF target with debug info, LLVM was just crashing,

17:33.200 --> 17:39.360
it was panicking, it didn't know what to do with the variant part. After adding

17:39.360 --> 17:47.600
support for it, there was a lot of discussion about how to represent this thing, like one idea was to

17:47.600 --> 17:52.880
represent it as a struct with the discriminant as a field and then a second field

17:53.840 --> 18:00.880
which would be represented as a union. What we ended up doing is that we just represent the whole

18:00.880 --> 18:08.720
thing as a union, and the first field is the discriminant, and then we represent the variants later on.
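
NOTE
A small sketch of the difference between a C-style union and a Rust enum with a discriminant up front.
It only illustrates Rust's in-memory layout and is not the BTF encoding itself.
Untagged, Tagged, sizes and tag_of are hypothetical names; repr(u32) is used here as an assumption
so that the layout is pinned down and each variant starts with a u32 tag.
```rust
use std::mem::size_of;
/// C-style union: storage for the largest variant, no record of which one is live.
#[repr(C)]
#[derive(Clone, Copy)]
pub union Untagged {
    pub pair: [u32; 2],
    pub single: u32,
}
/// Rust enum carrying data. With `repr(u32)` every variant begins with a u32 tag.
#[repr(u32)]
pub enum Tagged {
    Pair { a: u32, b: u32 },
    Single(u32),
}
/// Returns (size of the untagged union, size of the tagged enum) in bytes.
pub fn sizes() -> (usize, usize) {
    (size_of::<Untagged>(), size_of::<Tagged>())
}
/// Reads the tag stored at the very beginning of a `repr(u32)` enum value.
pub fn tag_of(value: &Tagged) -> u32 {
    // Sound only because of `repr(u32)`: the first bytes of every variant hold the tag.
    unsafe { *(value as *const Tagged as *const u32) }
}
```
The enum is one u32 larger than the union because the tag sits in front of the payload, which matches the "discriminant first, variants after" shape described in the talk.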

18:10.240 --> 18:17.600
And yeah, to sum up our future plans and ideas in general, first of all,

18:17.600 --> 18:24.320
yes, doing the BTF relocations in the Rust compiler is really the last piece missing to have full

18:24.320 --> 18:33.200
CO-RE support, with everything written in Rust in Aya projects. The second thing which we want to do,

18:33.200 --> 18:39.760
so for now Aya has its own custom linker using LLVM as a library,

18:40.080 --> 18:48.560
which is called bpf-linker, and it operates on LLVM bitcode. We would like to eventually

18:48.560 --> 18:56.480
stop relying only on bpf-linker and align more with upstream and make it possible to link

18:56.480 --> 19:04.720
with binutils, or also LLD if possible, but we would like to start with binutils because

19:04.720 --> 19:11.200
binutils already has BPF support, BPF target support, and then also the idea which

19:11.200 --> 19:18.240
is very widely discussed, and there is already a first draft PR, is adding support for

19:18.240 --> 19:24.320
struct_ops and sched_ext. Aya is adopted by different projects and different companies,

19:24.320 --> 19:34.720
I'll outline just the new ones, so one of the heavy users of Aya, for whom I work actually,

19:34.720 --> 19:42.400
is Anza. Anza is a blockchain company which works on Solana, if you know it, and

19:43.360 --> 19:50.800
basically Anza develops Agave, the main Solana validator implementation. We care about performance

19:50.880 --> 19:57.520
a lot, and you know, in a distributed system it's very important that all the nodes communicate with

19:57.520 --> 20:06.160
each other very fast, so we use AF_XDP for it, and we already started developing the Agave XDP library

20:06.160 --> 20:16.400
purely in Rust for using AF_XDP. It's not mature enough to move it to Aya, but that's the

20:16.480 --> 20:25.840
long-term plan. A recent new user of Aya is the man-in-the-middle proxy, or mitmproxy,

20:25.840 --> 20:33.200
they started using Aya, not everywhere yet, but they have certain modes,

20:33.200 --> 20:41.200
including WireGuard and local redirect, where they offload some parts to eBPF, and they

20:41.280 --> 20:49.280
write everything in Aya for that, and yeah, I'm done here, and thank you for listening,

20:49.280 --> 20:52.560
and yeah, it's time for questions.

20:52.560 --> 21:07.200
Thank you.

21:09.200 --> 21:19.440
So my question would be: in the description, it is said that you are now a tier 2 Rust target,

21:19.440 --> 21:24.400
and you are going to go to tier 1, so what's missing to be there?

21:24.400 --> 21:29.680
Actually, it's a bit worse than that, we are in tier 3, and we are trying to get into tier 2,

21:31.200 --> 21:39.520
so yeah, we are in touch with the Rust team about that, there are two requirements from them,

21:39.520 --> 21:44.560
which we need to meet to move up to that tier. Actually, I mentioned that we want to

21:44.560 --> 21:49.920
support binutils, like other upstream linkers, that's one of the requirements, the Rust team

21:50.560 --> 21:56.320
is not really happy with us having a completely custom linker made just for one architecture,

21:58.080 --> 22:07.280
and the second thing is, so BPF's calling convention allows only five arguments per

22:07.280 --> 22:11.840
function, and if you define a function with more arguments than that, it's not supported,

22:12.800 --> 22:18.800
the Rust team is also not happy with that, so we will need to address that and find some solution for it.

22:20.880 --> 22:21.520
Those two things.
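
NOTE
To make the five-argument limit concrete: a common way for user code to stay under it is to bundle
values into a struct and pass one reference. This is only an illustration of what the limit means
for a program author, not the fix the speaker refers to for the compiler target.
FlowKey and field_sum are hypothetical names invented for this sketch.
```rust
/// Hypothetical bundle of six values, passed behind a single reference.
pub struct FlowKey {
    pub src_ip: u32,
    pub dst_ip: u32,
    pub src_port: u16,
    pub dst_port: u16,
    pub proto: u8,
    pub vlan: u16,
}
/// Takes one pointer-sized argument instead of six scalars and adds the fields up.
pub fn field_sum(key: &FlowKey) -> u32 {
    key.src_ip
        .wrapping_add(key.dst_ip)
        .wrapping_add(u32::from(key.src_port))
        .wrapping_add(u32::from(key.dst_port))
        .wrapping_add(u32::from(key.proto))
        .wrapping_add(u32::from(key.vlan))
}
```
A signature that listed the six values as separate parameters would exceed the five arguments the calling convention allows.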

22:23.520 --> 22:29.520
So I just wanted to say, I work in the Rust compiler team, so I want to talk about this afterwards,

22:29.520 --> 22:34.080
maybe we can make some progress there. And you have a question as well, yes.

22:34.080 --> 22:40.000
Yeah, you mentioned for the enum that you are emitting the discriminant in the debug info,

22:40.080 --> 22:41.920
how does this work with niche optimization?

22:44.480 --> 22:49.440
This is when a Rust enum gets optimized, like the discriminant gets optimized away,

22:49.440 --> 22:53.840
because the first variant, for example, is None, and the second is Some of a non-null pointer,

22:54.400 --> 22:58.000
which means that you don't need a discriminant to actually determine the variant,

22:59.600 --> 23:01.200
so there is no discriminant in memory.
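
NOTE
The niche optimization from the question can be observed from plain Rust by comparing type sizes.
A minimal sketch; the two function names are hypothetical and exist only for this illustration.
```rust
use std::mem::size_of;
/// True when `Option<&u32>` needs no separate discriminant: `None` reuses the null pointer value.
pub fn option_ref_uses_niche() -> bool {
    size_of::<Option<&u32>>() == size_of::<&u32>()
}
/// True when `Option<u32>` is larger than `u32`: every u32 bit pattern is valid, so a tag is needed.
pub fn option_u32_needs_tag() -> bool {
    size_of::<Option<u32>>() > size_of::<u32>()
}
```
References can never be null, so the compiler has a spare bit pattern to encode None, while a plain integer has none to spare.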

23:01.600 --> 23:08.000
Okay, so honestly, for that, the question is whether this optimization also affects the debug info,

23:08.080 --> 23:12.240
if the debug info is fixed up as well, then it should be fine, because what we do at the end,

23:12.240 --> 23:20.880
is that at the very end, after all passes are done, we basically transpile the LLVM debug info into BTF,

23:20.880 --> 23:28.720
so if optimizing away the discriminant already happens in the passes, then we are safe.

23:28.720 --> 23:32.240
And so you can translate arbitrary debug info at this point.

23:32.240 --> 23:37.600
Yeah, exactly, okay, so if the debug info doesn't have this discriminant, it will not end up in BTF.

23:38.080 --> 23:43.760
Any other questions?

23:46.160 --> 23:46.480
Yes.

23:49.040 --> 23:52.080
Is there a reason why you switched to ring buffers?

23:54.400 --> 23:56.160
Yeah, they are faster than perf buffers.

23:56.960 --> 24:02.800
And thirdly, I mean, it's a synchronisation hotspot kind of.

24:03.440 --> 24:13.920
Yeah, I mean, so perf buffers are an older type of map in eBPF, but I mean, don't get me wrong,

24:13.920 --> 24:16.880
perf buffer support, in general, is not going away.

24:16.880 --> 24:21.760
If you want to write your own program which uses perf buffers, it's still possible.

24:22.640 --> 24:28.800
We just switched over the core components of Aya to use ring buffers instead of perf buffers.

24:29.760 --> 24:32.800
And it's mostly for logging, so to give you some context,

24:34.240 --> 24:40.560
well, the main way of logging and getting messages from kernel space to user space

24:41.280 --> 24:47.840
in libbpf is using bpf_printk, which basically prints messages to tracefs.

24:47.840 --> 24:54.160
In our case, we decided at first to push the logs through perf buffers,

24:55.440 --> 24:57.600
and we migrated that to ring buffers.

24:58.800 --> 25:04.560
Yeah, but perf buffer support is not going away.

25:09.680 --> 25:10.560
Any more questions?

25:13.840 --> 25:14.720
Thank you.

25:14.720 --> 25:15.360
Thank you.

