WEBVTT

00:22.000 --> 00:48.000
Hello?

00:48.000 --> 01:16.000
Okay.

01:16.000 --> 01:32.000
All right.

01:32.000 --> 01:33.000
Hi.

01:33.000 --> 01:34.000
I'm Amy.

01:34.000 --> 01:38.000
I'm going to talk about ABI stability in the Linux kernel.

01:38.000 --> 01:41.000
How stable it is and how stable it could be.

01:41.000 --> 01:44.000
I work at Chainguard as a software engineer.

01:44.000 --> 01:51.000
Chainguard is a software supply chain company, so we build a lot of open source software, including the Linux kernel.

01:51.000 --> 02:00.000
For some of those builds, we want to reuse pre-compiled object files from another kernel build.

02:00.000 --> 02:05.000
And when you do this, you quickly run into problems with the kernel's ABI.

02:05.000 --> 02:14.000
So a lot of this talk is from that perspective of trying to reuse these objects and work around the instability that there is as much as possible.

02:14.000 --> 02:15.000
So this is what I'll cover.

02:15.000 --> 02:22.000
We'll talk about exactly what we're reusing, and we'll talk about the issues that one encounters when reusing the objects.

02:22.000 --> 02:34.000
How breakages start to appear and then how we could work around them and how the kernel could prevent some of these issues in the first place.

02:35.000 --> 02:46.000
So what I mean when I say object reuse is that I've compiled something.o during a kernel build at some point, and I keep it around somewhere.

02:46.000 --> 03:03.000
Then at some point in a future kernel build, which might be a different kernel config or a different version, I pull something.o into my new source tree, and I re-link it into vmlinux without recompiling it.

03:03.000 --> 03:12.000
This is kind of hacky. Why would you do this? For Chainguard, it's FIPS. You might be familiar with FIPS.

03:12.000 --> 03:20.000
If you're not, it's a series of standards for processing government data and doing cryptographic operations on them.

03:20.000 --> 03:32.000
If you want to comply with FIPS, you need certain things such as crypto modules and entropy sources to be certified, and the certification can only be applied to a binary, not to the source code.

03:33.000 --> 03:47.000
We could certify the whole kernel binary, but that would prevent us from ever updating it until we build a new kernel and certify that, which takes time and money and is a big lift.

03:47.000 --> 03:54.000
So how do we actually do this? First, we want to enable CONFIG_WERROR and CONFIG_OBJTOOL_WERROR.

03:54.000 --> 04:02.000
When you're doing something unusual with the build system, pretty much any warning is indicative of a failure that you're going to see later on.

04:02.000 --> 04:06.000
These are usually indicative of deeper problems.

04:06.000 --> 04:15.000
Then we need to take our pre-built objects, copy them into the source tree, and write a Makefile rule to avoid recompiling the object.

04:15.000 --> 04:20.000
Just touch something.o is fine.

04:20.000 --> 04:29.000
So when you do this, how long does it actually remain compatible? It depends on how tightly you couple with the surrounding kernel code.

04:29.000 --> 04:39.000
If all you do is call printk, you can keep rebuilding for a very long time. If you do actual work and you call actual APIs, you break a little more often.

04:39.000 --> 04:45.000
But generally, you can expect this to break around once per major kernel version.

04:45.000 --> 05:01.000
So for example, if we build an object from 6.6.1, and we try to build that object with 6.6.2, things generally work fine, and we can continue with pretty much the whole 6.6 series.

05:01.000 --> 05:15.000
The APIs are pretty stable within a major version, but as soon as we move to 6.7.1, things break pretty much immediately.

05:15.000 --> 05:22.000
So what is actually breaking? Most of the time, it's not the source code compatibility.

05:22.000 --> 05:33.000
We can take our source code for 6.6.1, put it in the 6.7.1 tree, and recompile, and everything pretty much works.

05:33.000 --> 05:41.000
The internal APIs and the functions that you're calling are pretty stable and don't change a lot.

05:41.000 --> 05:52.000
And when our builds do break, it looks something like this. We either get undefined symbols, unreachable instructions, or BTF ID mismatches.

05:52.000 --> 06:03.000
Undefined symbols generally mean that you're calling a function or accessing a constant somewhere that doesn't exist anymore.

06:03.000 --> 06:12.000
Unreachable instructions usually happen because someone was supposed to call you, and their code was refactored, and they stopped calling you.

06:12.000 --> 06:29.000
And BTF ID mismatches are due to the kernel assigning different IDs during BTF ID generation, so your object disagrees about the ID of a symbol.
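
NOTE
A minimal sketch, not shown in the talk, of how the undefined-symbols case could
be caught before linking: compare what the pre-built object needs against what
the new tree defines. It assumes nm-style text listings as input. The helper
names and the sample symbols below are hypothetical.
```python
def parse_nm(text):
    """Split nm-style output into (undefined, defined) sets of symbol names."""
    undefined, defined = set(), set()
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 2 and fields[0] == "U":
            undefined.add(fields[1])   # "U name": must be provided elsewhere
        elif len(fields) == 3:
            defined.add(fields[2])     # "address type name": provided here
    return undefined, defined
def missing_symbols(prebuilt_nm, kernel_nm):
    """Symbols the pre-built object needs that the new kernel does not define."""
    needed, _ = parse_nm(prebuilt_nm)
    _, provided = parse_nm(kernel_nm)
    return sorted(needed - provided)
# Hypothetical listings standing in for the pre-built object and the new kernel.
PREBUILT = "                 U _printk\n                 U old_helper\n0000000000000000 T entropy_init\n"
KERNEL = "ffffffff81000000 T _printk\nffffffff81000100 T new_helper\n"
print(missing_symbols(PREBUILT, KERNEL))  # prints ['old_helper']
```
An empty list means every symbol the object references still exists. It says
nothing about whether the calling convention behind that symbol still matches.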

06:29.000 --> 06:39.000
So if it compiles, does it actually work? Well, there are really only two outcomes. You either boot or you page fault pretty much immediately.

06:39.000 --> 06:46.000
If you booted, you probably reached user space, and your module is either loaded or built in. Things are great.

06:46.000 --> 06:57.000
If you did not boot, it's probably a page fault from your functions accessing something they weren't supposed to, either while reading from the stack or while returning.

06:57.000 --> 07:10.000
Generally, it's because your functions in your pre-compiled object and functions in the rest of the kernel that you freshly compiled disagree about what registers things are supposed to be in.

07:10.000 --> 07:16.000
Generally, it either works or it doesn't, and if it works, you'll know right away.

07:16.000 --> 07:22.000
So how do we fix these issues and make it easier for us to keep reusing objects?

07:22.000 --> 07:38.000
There are a couple of things that are pretty easy to fix quickly. One of them is your toolchain. Generally, this isn't a big deal because a valid ELF object is a valid ELF object, but if you really need to, you can just pick a major version of your compiler and move on.

07:38.000 --> 07:48.000
The second is BTF mismatch issues. When generating the BTF information, some of this gets embedded into the object itself in the data sections.

07:48.000 --> 07:55.000
And the workaround for this is to just move all of your module_something macros to a different source file.

07:55.000 --> 08:06.000
All of the BTF information lives with those. So you just stick them somewhere else. Then you don't have any BTF information to worry about.

08:06.000 --> 08:21.000
The first of the two difficult parts is making function calls. If you call a function in another object, the binary interface there needs to be pretty much the same as that of the function your pre-compiled object was expecting to call.

08:21.000 --> 08:43.000
If the interface for the function is stable, then everything is okay. But even if the API for a function is stable, it's still possible to break the ABI by changing the type of an argument in the function signature or other minor details that change the way your compiler decides to set up the stack.

08:43.000 --> 08:57.000
What matters is the prologue of the function. We want the stack to be laid out exactly the way we were originally expecting, or at least in a way that looks reasonable to the linker and behaves the same at runtime.

08:57.000 --> 09:04.000
The simplest way to deal with this is just to write a shim layer and only call functions that you control.

09:04.000 --> 09:19.000
Instead of calling printk, you call shim_printk. The API and ABI of your functions never change because you're in control of them, and the real functions that you're actually calling can change as much as they want.

09:19.000 --> 09:32.000
There are some downsides to this, namely there's a performance penalty for calling two functions every time you want to call one function. And of course you can't inline anything because that defeats the purpose of the shim.

09:33.000 --> 09:40.000
The kernel also has these great validation tools for memory access, undefined behavior and code coverage.

09:40.000 --> 09:55.000
Unfortunately, all of them work via compiler instrumentation and sometimes they inject function calls into your code, which have all of the same downsides as function calls that you make on purpose, except that it's not in the source code.

09:55.000 --> 10:07.000
So I can't write any shims for these. So we need to disable them, but fortunately we can disable them just for our pre-compiled objects, not for everything.

10:07.000 --> 10:14.000
And sometimes the kernel changes something very low level about the way objects or functions are laid out.

10:14.000 --> 10:26.000
I have two specific commits up here, which are renaming a data section in the object files and changing the register of the stack protector guard on x86.

10:26.000 --> 10:37.000
These kinds of low level changes are inlined into every function and every object layout and everything the build system touches has to agree on these things.

10:38.000 --> 10:54.000
When these kinds of things change, there is no getting around this. If you want to build with a kernel that has these commits from an object that was built without these commits, you need to revert them or patch around them, which you really don't want to do.

10:54.000 --> 11:05.000
The whole point of trying to get a stable ABI is to pull new changes without having to patch things to recompile.

11:05.000 --> 11:15.000
So could we have a more stable ABI without large changes to the way the kernel is developed? It can't be stable stable.

11:15.000 --> 11:25.000
Part of the problem is just how expansive an ABI really is. It's more than compiler plus architecture plus source code equals ABI.

11:25.000 --> 11:34.000
Anytime you embed information into a binary and other binaries need to agree on that information, it's now part of the ABI.

11:34.000 --> 11:41.000
Even the IDs for the BPF type information, for example, need to be compatible.

11:41.000 --> 11:45.000
There are a lot of good reasons for the kernel ABI to remain unstable.

11:45.000 --> 11:56.000
And if you want to read more, there's a file in the kernel source tree called stable-api-nonsense.rst, which talks more about why having a stable ABI is a bad idea.

11:56.000 --> 12:05.000
However, there are a few changes which could be introduced to kernel development which could make a stable-ish ABI.

12:05.000 --> 12:29.000
It wouldn't actually be stable, but it would be stable enough that Linux distributions, which can control toolchains, target architectures, and kernel configs very tightly, can build on top of this stable-ish base to make kernel packages which have a nearly stable ABI.

12:29.000 --> 12:45.000
How do we accomplish that? First, everything I'm saying is relevant to LTS kernels. If you want stability, you're going to pick an LTS kernel anyway, because fewer patches mean that it's already more stable, even though there are no promises.

12:45.000 --> 12:54.000
The point is that an LTS version with a stable-ish ABI would make an official pathway for this kind of project.

12:54.000 --> 13:10.000
Even incremental improvements for resolving some of these issues would improve ABI stability a lot. If you did convince Greg KH that this was a good idea, what this would look like is these two changes to how patches are accepted into LTS kernels.

13:10.000 --> 13:27.000
The first is freezing the function signatures for anything with EXPORT_SYMBOL. If you want a stable ABI, but you don't want to enforce a stable ABI, what you can do is enforce restrictions on function signature changes.

13:27.000 --> 13:43.000
You can change the entire content of a function at any time, so you do not have a stable ABI. But if you're restricted in when you can change the signature, then you have a stable enough base to build a stable ABI on top of it.

13:43.000 --> 13:52.000
Your function might do something completely different from version to version, but you don't have to recompile the caller.
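
NOTE
A minimal sketch, not shown in the talk, of what checking a signature freeze
could look like: compare the signatures of exported functions between two
releases and flag any that changed or vanished. How signatures are extracted is
left out. The function name, the symbol names and both tables are hypothetical.
```python
def changed_signatures(old, new):
    """Map each exported symbol that was removed or re-typed to a reason."""
    report = {}
    for name, signature in old.items():
        if name not in new:
            report[name] = "removed"
        elif new[name] != signature:
            report[name] = "changed: " + signature + " => " + new[name]
    return report
# Hypothetical exported-function signatures from two LTS point releases.
OLD = {"foo_alloc": "void *(size_t)", "foo_log": "int (const char *)"}
NEW = {"foo_alloc": "void *(size_t, gfp_t)", "foo_log": "int (const char *)"}
for name, reason in sorted(changed_signatures(OLD, NEW).items()):
    print(name, reason)
```
Newly added symbols are deliberately ignored here, since adding an export does
not break an object that was compiled before it existed.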

13:52.000 --> 14:12.000
The second change is refusing patches that make changes to the low level build system primitives. These are pretty much showstoppers for ABI stability. They can't be worked around. You have to patch around them, and even just restricting these kinds of changes would make the ABI much more stable.

14:13.000 --> 14:32.000
The result of this is not a stable ABI. It's a stable-ish ABI. Taking advantage of that stable-ish ABI would still require some patches to the pre-built object: it requires you to write code in a certain way and to disable compiler instrumentation.

14:32.000 --> 14:45.000
But it avoids the need to patch new code from the kernel when you pull updates and recompile with your pre-built object.

14:45.000 --> 14:57.000
It's an ABI that's stable enough that the code that you're compiling to make your pre-built object doesn't need major patches to take advantage of the stability.

14:57.000 --> 15:23.000
So why even do this? Who would benefit? The FIPS case that we have is making a certified binary, and it means that if you use the least reusable approach as it is now, certifying your whole kernel, you are missing out on security updates to keep your certification.

15:23.000 --> 15:36.000
This is the way many organizations manage their FIPS kernels today. They certify a single binary, and they stay on that kernel for maybe a year, maybe two years.

15:36.000 --> 15:55.000
The whole point of FIPS is to improve security, but to achieve compliance with it, you need to forgo updates and stay on this kernel for one or two years as vulnerabilities are discovered and CVEs accumulate.

15:55.000 --> 16:00.000
That's all I have. Thank you for listening.

16:00.000 --> 16:13.000
Questions?

16:13.000 --> 16:42.000
So again, isn't this a case of the tail wagging the dog?

16:42.000 --> 16:53.000
I think this FIPS restriction to compiled objects, or certification only for binary objects, is something that was related to worries about the compiler being corrupted or something like that.

16:53.000 --> 17:00.000
Wouldn't it make more sense to lobby FIPS to sort of update its outdated criteria?

17:00.000 --> 17:17.000
Yeah, absolutely. How valuable is it really to know that my crypto module works exactly the way I expect, down to the assembly,

17:17.000 --> 17:24.000
when I also know that that module has like five or six CVEs that I'm not allowed to update to patch? I agree.

17:24.000 --> 17:31.000
Yeah, the better solution is to stop certifying binaries, but personally I have no access to lobbyists who could do such a thing.

17:31.000 --> 17:36.000
So I'm left with this.

17:36.000 --> 17:47.000
I'm still a bit confused on what you actually then do instead. Do you certify specific object files from a previous kernel or is this about the specific kernel module that you're certifying?

17:47.000 --> 17:59.000
Yes, we certify specific objects from a previous kernel build, and then those can be built in or linked into a module.

18:06.000 --> 18:23.000
I have a question: do you certify based on the code that is used internally, on the function calling and the branch predictions, or just on the functionality and how it behaves?

18:23.000 --> 18:27.000
What is the perspective of certifying such a binary?

18:27.000 --> 18:36.000
The process for certifying a binary is functional testing and code review.

18:36.000 --> 18:53.000
So we build the object, we send over a kernel built from the object and the source used to build it, and the review process involves functional testing of what's in the object.

18:53.000 --> 19:17.000
In our case it was an entropy source, so the functional testing involves entropy sampling and ensuring that it's random enough, and then also code review of the C source code.

19:17.000 --> 19:26.000
I'm not sure if I heard correctly, did you say you hoped that the LTS maintainers would agree with this, or have you talked to them already?

19:26.000 --> 19:31.000
No, I have not talked to them about this.

19:31.000 --> 19:36.000
I don't know if I agree that this is a good idea.

19:36.000 --> 19:45.000
I think this is the path I'm left with for now, is patching my copy of the kernel source code to be capable of doing this.

19:45.000 --> 19:55.000
This is a path that the kernel could take. A better option would be if FIPS certification stopped requiring binaries.

19:55.000 --> 20:11.000
So I'm not sure whether Greg actually wrote this stable-API-nonsense document, but I've seen him reference it a lot of times, so I think he would not be amenable to this.

20:11.000 --> 20:13.000
I don't think so either.

20:13.000 --> 20:27.000
But you can adjust the stable backports locally to work around it if they change something.

20:27.000 --> 20:40.000
For the most part, we can implement most of these workarounds ourselves. We just occasionally have to patch new changes to avoid object layout or stack protector changes.

20:42.000 --> 20:43.000
Does that answer your question?

20:43.000 --> 20:44.000
Yeah, thank you.

20:52.000 --> 21:05.000
When it comes to the changes that you advised, would you advise a gradual adoption across the subsystems, or is it just all or nothing?

21:05.000 --> 21:31.000
Is it all or nothing, or per subsystem? For example, are there subsystems that tend to fuck up the situation more than others, or like

21:31.000 --> 21:46.000
I'm not sure if there are subsystems which would already be more or less stable, which could afford to adopt this more easily.

21:46.000 --> 21:53.000
I think there are subsystems where this is more valuable to adopt than others.

21:53.000 --> 21:56.000
The crypto subsystem specifically.

21:56.000 --> 21:57.000
Obviously.

21:57.000 --> 21:59.000
That's my use case here.

21:59.000 --> 22:01.000
Thank you so much.

22:03.000 --> 22:08.000
With the addition of Rust, even the compiler doesn't.

22:08.000 --> 22:14.000
Even if you use the same compiler, you cannot guarantee a stable ABI for Rust to work.

22:14.000 --> 22:17.000
So how can we handle that?

22:17.000 --> 22:19.000
Sorry, can you repeat the question?

22:19.000 --> 22:20.000
And it's quite here.

22:20.000 --> 22:21.000
Yeah.

22:21.000 --> 22:27.000
Even if you use the same Rust compiler, you cannot guarantee the ABI.

22:27.000 --> 22:34.000
If you stick to the same version of the compiler?

22:34.000 --> 22:35.000
Yeah.

22:35.000 --> 22:38.000
So what are the plans to handle that?

22:38.000 --> 22:44.000
You can guarantee ABI stability if you stick to the same version of the compiler.

22:44.000 --> 22:53.000
But even then, things like the kernel config can change the way the ABI is laid out.

22:53.000 --> 22:54.000
Yeah.

22:54.000 --> 23:01.000
And when you're pulling in new changes from the kernel, these might include updates to the defaults or anything like that.

23:01.000 --> 23:11.000
The compiler is just one variable, and the most easily controlled, I think.

23:11.000 --> 23:14.000
Anything else?

23:14.000 --> 23:19.000
Oh.

23:19.000 --> 23:20.000
Do you have any idea

23:20.000 --> 23:25.000
how many recertifications this saves you?

23:25.000 --> 23:31.000
Or what is like, how long does this certification of an object generally take?

23:31.000 --> 23:34.000
Yeah, like how long in calendar time?

23:34.000 --> 23:37.000
Yeah.

23:37.000 --> 23:42.000
Without any of these changes, it lasts like two months.

23:42.000 --> 23:50.000
With all of these changes, roughly one to two years before you have to build a new object.

23:50.000 --> 23:55.000
Before you hit some change that you can't patch around and you can't deal with.

23:55.000 --> 24:00.000
If you want to certify, if you go to FIPS to certify it.

24:00.000 --> 24:02.000
Depends on what kind of certification you ask for.

24:02.000 --> 24:09.000
If you ask for certification of your entropy source, you can get it back in three to six months.

24:09.000 --> 24:15.000
If you ask for certification of your whole kernel acting as a crypto module,

24:15.000 --> 24:19.000
you get it back in one to two years.

24:19.000 --> 24:21.000
Maybe three if you're unlucky?

24:21.000 --> 24:24.000
Now I understand why you do this.

24:32.000 --> 24:36.000
So there is a way of doing this just for FIPS compliance,

24:36.000 --> 24:40.000
which you're interested in, and we've discussed it at the kernel maintainer level.

24:40.000 --> 24:43.000
It's not actually making the ABI stable.

24:43.000 --> 24:48.000
What we've actually discussed is working out the crypto modules so that we can compile them separately

24:48.000 --> 24:51.000
and then feed the binaries into the kernel build.

24:51.000 --> 24:56.000
So this is not a stable ABI, but we would be able to keep the crypto module stable across

24:56.000 --> 25:01.000
rebuilds of the kernel, which as far as we think is sufficient to certify for FIPS.

25:01.000 --> 25:04.000
Or at least as far as my FIPS regulators tell me.

25:04.000 --> 25:06.000
That's sufficient to certify for FIPS.

25:06.000 --> 25:08.000
Would that also work for you?

25:08.000 --> 25:12.000
Are you reusing the same ELF objects?

25:12.000 --> 25:13.000
Yes.

25:13.000 --> 25:16.000
So basically it's a hack to the kernel build.

25:16.000 --> 25:20.000
So you get the build of the object modules that are the crypto ones.

25:20.000 --> 25:22.000
You take them out and certify that.

25:22.000 --> 25:27.000
And then the next time you do a kernel build, instead of compiling crypto modules from source,

25:27.000 --> 25:31.000
you just put the binary object files back and we build it.

25:31.000 --> 25:35.000
So it doesn't give you the exact binary stability you're looking for, but

25:35.000 --> 25:40.000
to us it seems to be sufficient to satisfy the FIPS criteria of not altering the crypto.

25:40.000 --> 25:41.000
All right.

25:41.000 --> 25:43.000
Well, I didn't know.

25:43.000 --> 25:44.000
I was unfamiliar.

25:44.000 --> 25:48.000
So David Woodhouse is the one at Amazon who's actually looking at that.

25:48.000 --> 25:51.000
Microsoft will probably follow whatever Amazon does.

25:51.000 --> 25:52.000
Okay.

25:52.000 --> 25:53.000
Yeah.

25:53.000 --> 25:54.000
I'm unfamiliar.

25:58.000 --> 26:00.000
Anything else?

26:00.000 --> 26:04.000
I think one's twice done.

26:04.000 --> 26:05.000
Thank you.

26:05.000 --> 26:06.000
Thank you.