WEBVTT

00:00.000 --> 00:10.400
So my name is Peter, I work for Arm, and I've been sort of tangentially involved in the

00:10.400 --> 00:16.240
pointer authentication ABI, mostly by dint of being the author of the specification.

00:16.240 --> 00:21.440
I've written absolutely none of the code, so at this particular point I'm presenting

00:21.440 --> 00:24.240
mostly other people's work here.

00:24.240 --> 00:29.000
So first of all, what I'd like to do is quickly explain what pointer authentication is,

00:29.080 --> 00:32.920
I'm not sure everyone will know what that is, and then I'll go on to why I'm talking about this.

00:34.520 --> 00:40.440
So just in a nutshell, and in its very, very simplest form, you've got two basic operations,

00:40.440 --> 00:46.600
the first of which is sign, and sign will take a raw pointer, a secret key that's sort of

00:46.600 --> 00:52.200
known to the hardware and a local discriminator that's sort of normally derived from something local

00:52.200 --> 00:57.000
to a particular function, and the instructions that you'll see in assembler here

00:57.000 --> 01:00.520
start with PAC, for pointer authentication code.

01:00.520 --> 01:07.720
It'd be nicer if they started with sign, but unfortunately it's PAC, and then there's the reverse operation

01:07.720 --> 01:13.160
of that, which is auth, short for authenticate obviously, and that takes a signed pointer and turns

01:13.160 --> 01:18.040
it back into a raw pointer and it uses the same key and the same discriminator that was used

01:18.040 --> 01:23.000
when you signed your pointer, and then you have the AUT instructions that are basically the

01:23.000 --> 01:27.880
mirror image of the previous ones at that point. So how does this work in hardware?

01:27.880 --> 01:32.760
Sorry, I just didn't mention: the key is a shared secret, the discriminator is local, and

01:34.200 --> 01:40.040
the tricky bit about the pointer authentication ABI is that you've got to choose which

01:40.040 --> 01:44.600
pointers to sign, you've got to choose what key to use, and you've got to choose what discriminator

01:44.600 --> 01:48.680
to use for every particular pointer in the program and everyone who's signing and everyone who's

01:48.760 --> 01:54.520
authenticating has to agree on what those are and every time you change it it's an ABI break.

01:54.520 --> 02:00.120
So it's one of those things where there are deployment challenges with this to say the least.

02:02.120 --> 02:11.080
And there's another form of what you would call ABI clean or at least an ABI neutral form

02:11.080 --> 02:17.480
of pointer authentication you might use already on an AArch64 system: that's -mbranch-protection=standard,

02:17.560 --> 02:22.200
which is where we use pointer authentication to protect the return address. Whereas pointer

02:22.200 --> 02:28.440
authentication ABI goes a lot further than that. So how does this work in hardware? Well,

02:28.440 --> 02:33.720
what we've basically exploited is that no one yet has got 64 bits of virtual address,

02:34.440 --> 02:41.560
addressable memory, so some of the spare values at the top of the pointer are available for extra

02:41.640 --> 02:47.480
metadata and what we do is we use some of that to store the pointer authentication code.

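NOTE
A minimal sketch of the sign and authenticate operations described above, as a Python model.
Assumptions, none of them from the talk: 48-bit virtual addresses (so bits 48..63 are spare),
an HMAC standing in for the hardware cipher, and the hypothetical names sign, auth and compute_pac.
Raising an exception on a mismatch is also a simplification of what real hardware does.
```python
import hashlib
import hmac
VA_BITS = 48  # assumption: 48-bit virtual addresses, so bits 48..63 are spare
PAC_BITS = 64 - VA_BITS
ADDR_MASK = (1 << VA_BITS) - 1
def compute_pac(raw_ptr: int, key: bytes, modifier: int) -> int:
    """Stand-in MAC over pointer and modifier; real hardware uses its own cipher, not HMAC."""
    msg = raw_ptr.to_bytes(8, "little") + modifier.to_bytes(8, "little")
    digest = hmac.new(key, msg, hashlib.sha256).digest()
    return int.from_bytes(digest[:2], "little")  # keep 16 bits, matching PAC_BITS
def sign(raw_ptr: int, key: bytes, modifier: int) -> int:
    """Model of a PAC* instruction: place the code in the spare top bits."""
    assert raw_ptr == raw_ptr & ADDR_MASK, "top bits must be clear before signing"
    return (compute_pac(raw_ptr, key, modifier) << VA_BITS) | raw_ptr
def auth(signed_ptr: int, key: bytes, modifier: int) -> int:
    """Model of an AUT* instruction: recompute the code and strip it if it matches."""
    raw_ptr = signed_ptr & ADDR_MASK
    if signed_ptr >> VA_BITS != compute_pac(raw_ptr, key, modifier):
        raise ValueError("pointer authentication failed")
    return raw_ptr
```
Signing with a key and modifier and then calling auth with the same pair returns the raw pointer;
tampering with the top bits, or using a different key, is expected to make auth fail.
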
02:47.480 --> 02:53.480
So when we do one of these PAC instructions, we take the key and then two registers, one of them

02:53.480 --> 02:59.160
contains the pointer that we want to sign and another one that takes the modifier and then we put

02:59.160 --> 03:07.800
the result of that calculation into the top bits at that particular point. So why am I actually talking

03:08.760 --> 03:15.080
about this? So pointer authentication ABI is not new. It's currently restricted

03:16.600 --> 03:22.520
at the moment, in terms of use of PAC, to -mbranch-protection=standard. And that's, as I've

03:22.520 --> 03:26.840
mentioned before, optimised for being deployable as widely as possible. It will work on all

03:26.840 --> 03:33.720
machines because it's implemented in the hint space. Now the PAuth ABI has existed as a specification

03:33.800 --> 03:41.640
for about five years, and this is all derived from Apple's arm64e ABI. So that was the genesis of this,

03:41.640 --> 03:47.160
and Apple have done a lot of the hard work for this and you'll be able to find a lot more information

03:47.160 --> 03:53.240
on Apple platforms for arm64e. I'm talking specifically here about the ELF implementation

03:53.240 --> 03:59.000
of that, which is sort of more deployable on potentially Linux, potentially other systems that use

03:59.560 --> 04:05.480
ELF. Now the pointer authentication ABI extends pointer authentication to all code pointers,

04:06.680 --> 04:13.160
but the downside of that is that you need hardware that supports it, which is Armv8.3-A, and it's

04:13.160 --> 04:20.600
likely an ABI break if you've already got an existing system. So the spec's been around for about

04:20.600 --> 04:26.440
five years. There's been a group, well, from Access Softek, a company who've been

04:26.520 --> 04:31.480
diligently working away on an ELF implementation for, I don't know, probably the last three years

04:31.480 --> 04:38.760
or so, maybe two, and upstream LLVM has finally got an implementation for testing.

04:40.120 --> 04:47.000
It's not an official target as yet because there's not a fully defined ABI but it's certainly

04:47.000 --> 04:51.160
now something that you can try out with upstream LLVM. So I'm just basically talking about it here

04:51.160 --> 04:57.960
to raise a bit of awareness of it and what it might be able to do, because we're also kind

04:57.960 --> 05:01.880
of in a chicken-and-egg problem: until somebody implements it no one can try it, and no one's

05:01.880 --> 05:09.240
going to bother implementing until somebody wants to use it. Okay and I'm talking about embedded

05:09.240 --> 05:14.120
systems particularly; that's mostly because that's where my background is, but also bare metal doesn't

05:14.120 --> 05:20.360
have as many ABI continuity problems you might be able to just have a completely separate

05:20.360 --> 05:27.000
system, target that and go, rather than trying to, say, force it into an existing distribution. Okay, and yeah,

05:27.000 --> 05:31.640
I would like to see the pointer authentication ABI have an official ELF target that anyone can

05:31.640 --> 05:40.600
use to build a system. Okay, so yes, this is just some example code to show how things would go. So

05:40.600 --> 05:46.920
you can imagine we've got C++ vtables, so vtables are tables of code pointers, for example.

05:47.000 --> 05:51.320
Vtables are an interesting case because there's actually the vtable pointer itself, and that is

05:51.320 --> 05:55.480
strictly speaking a data pointer, but the pointer authentication ABI thinks that is important

05:55.480 --> 06:02.920
enough to protect anyway. So we take this so what this thing is actually going to do is going to load

06:02.920 --> 06:08.920
the vtable pointer, it's going to load the address of the function from the vtable, and then it's going

06:08.920 --> 06:14.280
to indirectly call it. So if we look at the generated code from this, we see the use of the pointer

06:14.360 --> 06:21.960
authentication instructions. So we've got the first one, PACIBSP, which is kind of a shorthand

06:23.160 --> 06:29.000
instruction for protecting the return address using the stack pointer as modifier and you'll see

06:29.000 --> 06:36.280
that matched at the bottom with the RETAB, which is kind of a combined return and authenticate at this

06:36.280 --> 06:42.920
point. So the other instruction you can see there is AUTDA, which is authenticating the vtable pointer,

06:42.920 --> 06:50.280
so that's already been signed when the class was created, and then there's BLRAA, which is a combined

06:50.280 --> 06:54.440
branch and authenticate at that particular point. So at this point you don't see any of the signing

06:54.440 --> 07:01.080
but above those authenticate instructions you can see the MOVKs; that's the blending of the

07:02.120 --> 07:08.200
discriminator, a 16-bit value, into the address to form the modifier at that particular point.

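NOTE
The blending step described above can be sketched in Python as replacing the top 16 bits of an
address with the 16-bit discriminator, which is the effect of a MOVK with a 48-bit left shift.
The function name blend, the example address and the example discriminator are hypothetical.
```python
def blend(address: int, discriminator: int) -> int:
    """Replace the top 16 bits of address with the 16-bit discriminator."""
    assert 0 <= discriminator <= 0xFFFF, "discriminator must fit in 16 bits"
    return (address & 0x0000_FFFF_FFFF_FFFF) | (discriminator << 48)
storage_address = 0x0000_7FFF_DEAD_BEE0  # hypothetical address of the slot holding the pointer
modifier = blend(storage_address, 0x1234)  # 0x1234 is an arbitrary example discriminator
print(hex(modifier))  # prints 0x12347fffdeadbee0
```
The resulting modifier is what gets passed, together with the key, to the sign or authenticate step.
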
07:08.280 --> 07:15.640
Okay, so what actually exists in upstream LLVM today? So we have an experimental

07:15.640 --> 07:22.840
target called pauthtest, which you can use with the Linux OS at the moment. I think it's

07:22.840 --> 07:27.800
restricted to Linux; if you try it with something else it probably won't work. But yes, target equals

07:27.800 --> 07:34.280
aarch64-linux-pauthtest to get some sample code generation going. Yeah, and no one's committed

07:34.360 --> 07:39.320
to an ABI yet. It's currently for testing purposes so it's been deployed so that we can

07:39.320 --> 07:45.560
write tests in upstream LLVM for it. There's also a header file called ptrauth.h,

07:45.560 --> 07:51.400
and that's where you can manually control how pointers are signed, with things like the

07:51.400 --> 07:57.880
__ptrauth qualifier, and that's recommended for things like C function pointers, because C lets you do all sorts

07:57.880 --> 08:02.920
of stuff with function pointers that limits what sort of modifier and address diversity can

08:03.720 --> 08:09.880
be done with it but if you're manually doing it yourself you can put more protection on it than the

08:09.880 --> 08:15.560
basic one. Okay and of course the killer at the moment is you need compatible runtime libraries

08:15.560 --> 08:21.960
and you need a dynamic linker to support it. So why do you need dynamic linker support? It turns out

08:21.960 --> 08:27.800
this is because quite a lot of code pointers are statically initialized, and because the key's only

08:27.800 --> 08:34.120
known at runtime, the static linker can't sign any pointers beforehand; they have to be signed

08:34.840 --> 08:40.600
when the program loads up, and to do that we've extended some of the dynamic relocations so

08:40.600 --> 08:46.200
that when they do the relocation calculation they do an immediate sign afterwards. Of course the

08:46.200 --> 08:53.320
dynamic linker needs to know how the signing's done, and the signing schema is sort of embedded

08:53.400 --> 08:58.760
in the location that you're relocating, so the dynamic linker will pick that up and read it, which tells

08:58.760 --> 09:05.080
it exactly all the instructions it needs to do. Okay so if you want to try this out today in Linux

09:05.080 --> 09:11.800
user space, the team at Access Softek have provided some build scripts where you can build a musl-based

09:12.920 --> 09:21.000
Linux, statically linked Linux toolchain. So yes, basically there's a Docker container that you

09:21.000 --> 09:25.960
can run that will build your toolchain. It produces a SquashFS file system that you can mount,

09:25.960 --> 09:33.000
and then from that you can run the programs produced on QEMU user mode, or if you're

09:33.000 --> 09:39.160
lucky enough to have 8.3 hardware, say for example you're running Linux on a Mac via some kind of

09:40.200 --> 09:45.640
virtualization or whatever then you can run it directly on that at that point so that's the easiest way

09:45.640 --> 09:52.760
to try this out today. Yeah, building the toolchain isn't too much of a problem. Okay, so what's

09:52.760 --> 09:59.000
my part in this? I guess it's trying to see: can we run this on bare metal? Okay, so as I mentioned before,

09:59.960 --> 10:04.840
hardware requirements are less of a problem at that point. So what we need to do is build ourselves

10:04.840 --> 10:10.120
an embedded toolchain and use QEMU to check everything's running. So what's on my shopping list at the

10:10.120 --> 10:14.760
moment? So first of all I've got to add pauthtest support to the bare-metal driver, because it's currently

10:14.920 --> 10:21.080
fixed to Linux. I've got to compile all my runtime libraries, I've got to get some support code to actually

10:21.080 --> 10:26.840
turn pointer authentication on, I've got to get a linker script that's going to get the dynamic

10:26.840 --> 10:31.480
relocations and put them in a location where I know where they are so I can go and resolve them and

10:31.480 --> 10:35.880
then I've actually got to write the resolver that'll go and do that. Okay, and then I've got to make

10:35.880 --> 10:43.720
sure the, excuse me, the linker can go and pick up these things. Okay, so what I've done here

10:44.120 --> 10:50.040
is I've made a fork of LLVM which builds all of this from LLVM components,

10:50.040 --> 10:55.640
and I've put the link in the references if anyone's interested in trying it out. So what's my experience?

10:55.640 --> 10:59.560
So adding the bare-metal target didn't take very long; it was just basically cutting and pasting

10:59.560 --> 11:04.920
code from the Linux driver. LLVM libc had some assembly that changed; I won't go into too much

11:04.920 --> 11:10.120
of the details there, but it's essentially down to the inline assembly doing direct branches,

11:10.200 --> 11:15.160
but when you access things with the pointer authentication ABI it wants it

11:15.160 --> 11:21.240
done indirectly via a register so you need to change the constraints and instruction used and

11:21.240 --> 11:26.840
that bit of linker script there is just going to define some symbols, rela_dyn_start,

11:26.840 --> 11:32.360
rela_dyn_end. I just chose those names so that I can find them later on. Okay, this is how you

11:32.360 --> 11:37.240
would do the dynamic relocations. What I've got there, the things that are in cyan on there,

11:37.240 --> 11:42.920
are just where the signing scheme is encoded. For example, it would be like: if address diversity

11:42.920 --> 11:49.560
and discriminator then take the current location of the relocation blend the 16 bits of

11:49.560 --> 11:53.960
the discriminator into that value then sign with the key that's in that particular bit so it's not

11:53.960 --> 12:00.040
particularly complicated. Okay, so my last slide here, besides the references: what the next steps

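NOTE
The relocation handling described above can be sketched in Python as follows. This is a sketch
under assumptions: the bit layout of the signing schema in the place (bit 63 address diversity,
bits 61:60 key, bits 47:32 discriminator) should be checked against the PAuth ABI ELF
specification, and the names decode_schema, modifier_for, resolve and the sign callback are hypothetical.
```python
from typing import Callable, NamedTuple
KEY_NAMES = ("IA", "IB", "DA", "DB")  # assumed encoding of the 2-bit key field
class SigningSchema(NamedTuple):
    address_diversity: bool
    key: str
    discriminator: int
def decode_schema(place_contents: int) -> SigningSchema:
    """Decode the schema from the 64-bit value found at the relocated place (assumed layout)."""
    return SigningSchema(
        address_diversity=bool((place_contents >> 63) & 1),
        key=KEY_NAMES[(place_contents >> 60) & 0x3],
        discriminator=(place_contents >> 32) & 0xFFFF,
    )
def modifier_for(place_address: int, schema: SigningSchema) -> int:
    """Blend the discriminator into the place address when address diversity is on."""
    if schema.address_diversity:
        return (place_address & 0x0000_FFFF_FFFF_FFFF) | (schema.discriminator << 48)
    return schema.discriminator
def resolve(place_address: int, place_contents: int, target: int,
            sign: Callable[[int, str, int], int]) -> int:
    """Return the signed pointer to store at the place; sign is a caller-supplied stand-in."""
    schema = decode_schema(place_contents)
    return sign(target, schema.key, modifier_for(place_address, schema))
```
On real hardware the sign callback would be the PAC instruction for the decoded key; here it is
left as a parameter so the decoding and blending logic can be exercised on its own.
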
12:00.040 --> 12:03.800
are for the PAuth ABI. So yeah, the support is currently chicken-and-egg at the moment, so

12:04.440 --> 12:10.280
the team at Access Softek are doing this for their own downstream platform, so this

12:10.280 --> 12:16.760
is not something that's easily available to use but to get upstream support we need at least

12:16.760 --> 12:22.840
one platform or something real to actually use it. I think that it's likely that the number of

12:22.840 --> 12:29.000
possible signing schemas will coalesce into a small number of variants, maybe say three, and at that point

12:29.000 --> 12:33.480
there's a chance to get this in as a properly supported target and then anyone can build a tool

12:33.480 --> 12:38.600
chain from there okay so just some last references here the one thing I would say so if you

12:38.600 --> 12:42.760
are interested in this at all, the first one there, the Clang documentation on pointer

12:42.760 --> 12:46.840
authentication is very good because it also goes through some of the caveats that you need to do

12:46.840 --> 12:51.240
to avoid things like signing oracles, and how to really protect your system, so it's a very,

12:51.240 --> 12:55.880
very good document, fully recommend that. And then there was a talk last year at the LLVM dev

12:55.960 --> 13:02.600
meeting, so it's now two years ago, we're in '26 now, from Access Softek, that's also worth

13:02.600 --> 13:08.440
watching. So that's all I had for you today. There may be some time for questions, or I may

13:08.440 --> 13:12.840
have overrun. Two minutes; I can probably take a couple of questions, but if not, find me outside.

13:13.480 --> 13:17.480
Thank you

13:21.720 --> 13:28.920
Could you go into a bit more detail, like how the PAuth ABI and arm64e differ, and so what are the

13:28.920 --> 13:33.800
differences on that promising for example the keys to be used for the different signing of

13:34.440 --> 13:39.640
Right, okay, so I don't know precise details. Obviously the object file format's very different,

13:39.640 --> 13:45.880
so there's Mach-O on one and ELF on the other. I believe there may be also differences in some

13:45.880 --> 13:53.400
of the relocations that get used. I believe the signing schema was picked to be as close as possible,

13:53.400 --> 13:57.400
just to know as you don't have if this do this all in the other in devrims and no

13:57.400 --> 14:03.880
needless differentiation, and there are things like: is the GOT signed, or access to the GOT, because

14:03.880 --> 14:09.960
if you've got RELRO you don't need to sign the GOT, but if your platform doesn't

14:09.960 --> 14:18.200
have that you do so it's mostly a whole bunch of toggleable things and so yeah I'd say it's

14:18.280 --> 14:29.320
very similar at that point, but I couldn't give you detail, sorry. Yes? Yeah, where is this a bit

14:29.320 --> 14:33.560
on finance okay, it's, it's hardware, yeah. So there are various keys for various

14:33.560 --> 14:39.240
exception levels so user space is one exception level there'll be a kernel level exception level

14:39.240 --> 14:47.160
each of them have got keys that can be set by the exception level higher

14:47.160 --> 14:54.840
I think it's just a standard set of hardware registers, that type of thing, but yeah.

14:55.400 --> 14:59.480
Okay, so I guess if there's no more questions it would probably be best to hand over to the next

14:59.480 --> 15:09.240
um oh one more with that and I don't know the full details I think at the moment it's used

15:09.240 --> 15:15.720
by kernel space behind the scenes I don't I there is at least some documentation of how to

15:15.720 --> 15:21.560
use it in user space but I don't know how widely it is unfortunately not very familiar with macOS at

15:21.560 --> 15:30.920
that point yeah okay thank you very much

