WEBVTT

00:00.000 --> 00:22.000
Okay, should be back online, yeah.

00:22.000 --> 00:25.000
Right, the room is full.

00:25.000 --> 00:29.000
That's great as every year.

00:29.000 --> 00:33.000
It's the third year of the GCC devroom, fully packed, great.

00:33.000 --> 00:42.000
Up next is Lancelot Six, who is a maintainer of GDB, in particular of the AMD GPU back end,

00:42.000 --> 00:49.000
and developer of the downstream ROCgdb support for AMD GPUs.

00:49.000 --> 00:56.000
He will, well, I guess he won't actually talk about AMD GPUs specifically, but rather about DWARF 6.

00:56.000 --> 01:01.000
That's dark.

01:01.000 --> 01:04.000
Thank you very much.

01:04.000 --> 01:06.000
Can I keep that, actually?

01:06.000 --> 01:09.000
Let me see.

01:09.000 --> 01:10.000
That work?

01:10.000 --> 01:12.000
Yes, so perfect.

01:12.000 --> 01:15.000
So yes, as said, I'm Lancelot Six.

01:15.000 --> 01:21.000
I'm working for AMD on ROCgdb, which is a debugger for AMD GPU programs.

01:21.000 --> 01:26.000
And we're not going to be talking about GPUs, we're going to be talking about debugging information,

01:26.000 --> 01:33.000
and some changes that recently made it into what should become DWARF 6 eventually.

01:33.000 --> 01:39.000
Before we start, just a tiny bit of background: what I'm going to be talking about was

01:39.000 --> 01:48.000
initially developed within AMD, and that was to overcome the limitations of DWARF 5 for debugging GPU code,

01:49.000 --> 01:55.000
mainly because GPUs are a bit different from CPUs, with different optimization opportunities,

01:55.000 --> 01:59.000
working differently, and so DWARF 5 didn't really work for that.

01:59.000 --> 02:02.000
So that's why those extensions had to be made.

02:02.000 --> 02:07.000
It's currently implemented in the ROCm toolchain, so LLVM-based on the compiler side,

02:07.000 --> 02:10.000
and GDB-based on the debugger side.

02:11.000 --> 02:19.000
And it was initially published as one big omnibus document of what had been changed from standard DWARF,

02:19.000 --> 02:22.000
and that change, as it was,

02:22.000 --> 02:30.000
did not really fit what could be submitted for inclusion in the DWARF standard.

02:30.000 --> 02:36.000
And so eventually what happened, some years ago, almost four years ago, I guess,

02:36.000 --> 02:44.000
an ad hoc working group kind of formed to rework those extensions, including, you know,

02:44.000 --> 02:48.000
many industry players and individuals.

02:48.000 --> 02:56.000
So that group had kind of two tasks: one was to make sure that the extensions we come up with work for all the parties involved,

02:56.000 --> 03:03.000
because we have multiple GPU vendors involved, and the other bit was to kind of,

03:04.000 --> 03:13.000
split the changes that had been made into more manageable chunks that can be submitted for review to the DWARF committee.

03:13.000 --> 03:19.000
One thing to note here, as you see, I am not one of the authors of the original work.

03:19.000 --> 03:25.000
So I don't take any ownership of what I'm presenting, I'm just presenting it.

03:25.000 --> 03:29.000
We do have some of the authors here in the room, and thanks.

03:29.000 --> 03:35.000
So one of the outcomes of that particular work was a proposal called

03:35.000 --> 03:41.000
"Location descriptions on the DWARF stack", and it was accepted last October.

03:41.000 --> 03:43.000
So it's been quite a trek to get there.

03:43.000 --> 03:53.000
And as we said, part of the work of that committee was to try to reduce the size of, you know, what we have and have something manageable.

03:53.000 --> 03:58.000
And we ended up with something which is like 20 pages long, which is, let's say,

03:58.000 --> 04:02.000
larger than the typical DWARF proposal.

04:02.000 --> 04:08.000
But the point I'm going to try to make today is to show that what we did is actually a pretty simple change,

04:08.000 --> 04:13.000
but it just happens to be quite fundamental to some aspects of how DWARF works.

04:13.000 --> 04:22.000
And so that does imply a lot of editorial changes, so the standard, you know, remains consistent.

04:22.000 --> 04:30.000
So for those who don't really know DWARF that much, DWARF in a nutshell is the debugging information produced by the compiler

04:30.000 --> 04:37.000
so that a consumer, usually a debugger, can gain some understanding of the program.

04:37.000 --> 04:45.000
It's pretty much a tree structure for the main parts, where, you know, we have compilation units, which contain types and functions,

04:45.000 --> 04:50.000
and, you know, functions can contain lexical blocks and variables and everything.

04:50.000 --> 04:54.000
And the nodes, they have attributes.

04:54.000 --> 04:59.000
Most of the attributes usually are statically known, you know, values.

04:59.000 --> 05:05.000
So when you have a function, like the name of the function here, main, that's something that

05:05.000 --> 05:08.000
is known statically, and we can express that.

05:08.000 --> 05:11.000
But there are some properties that we do not know statically.

05:11.000 --> 05:16.000
Let's say, let's go serious here, you know, we have a simple program.

05:16.000 --> 05:20.000
We have, you know, a struct, which we'll use a bit throughout the presentation.

05:20.000 --> 05:21.000
We have an instance.

05:21.000 --> 05:26.000
And so in the debugging information for that, we see, you know, the name f, we have some information on

05:26.000 --> 05:29.000
where it was declared, the type; all of that is known statically.

05:29.000 --> 05:34.000
But the location, where you go to find the bytes that make up the value,

05:34.000 --> 05:37.000
you cannot know at compile time.

05:37.000 --> 05:41.000
So what we have, instead of the value,

05:41.000 --> 05:46.000
is what's called a DWARF expression, which, when evaluated, will give you the value.

05:46.000 --> 05:51.000
So it's something that the client needs to evaluate.

05:51.000 --> 05:57.000
So a DWARF expression is a small program generated by the compiler, that the consumer can

05:57.000 --> 06:02.000
use to compute a dynamic value or location.

06:02.000 --> 06:07.000
When executed, it yields a location or a value.

06:07.000 --> 06:16.000
And it's a pretty simple model to, like, work on values.

06:16.000 --> 06:22.000
It's a stack-based mini language, where you can push values.

06:22.000 --> 06:28.000
And you have operations: taking addition, you pop two values, you add them together, you push the result back.

06:28.000 --> 06:32.000
And that's how it works, with, you know, a lot of opcodes for that.
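
NOTE
Editor's sketch of the value stack just described, assuming a toy encoding:
the opcode names "const" and "plus" are hypothetical stand-ins, not real
DWARF encodings, and only the pop-two, add, push-back behaviour comes from the talk.
```python
def evaluate(program, stack=None):
    """Run a list of (opcode, operand) pairs on a stack of plain values."""
    stack = list(stack or [])
    for opcode, operand in program:
        if opcode == "const":      # push a literal value
            stack.append(operand)
        elif opcode == "plus":     # pop two values, push their sum
            rhs = stack.pop()
            lhs = stack.pop()
            stack.append(lhs + rhs)
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    return stack
result = evaluate([("const", 40), ("const", 2), ("plus", None)])
print(result)  # [42]
```
Whatever is left on top of the stack when the program ends is the result.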

06:32.000 --> 06:35.000
The locations, we'll come to that a bit later.

06:35.000 --> 06:38.000
They work a bit differently.

06:38.000 --> 06:43.000
So let's explore a bit, you know, what we can do with expressions and how it works.

06:43.000 --> 06:53.000
So let's take, you know, the same structure and two variables, where we have an instance of that class.

06:53.000 --> 06:59.000
And we have a pointer to member, to an int member of that class.

06:59.000 --> 07:02.000
So we have here the debug information that goes with that.

07:02.000 --> 07:07.000
And so far, nothing is, you know, really unusual. We have the f variable.

07:07.000 --> 07:13.000
And the information says here, if you want to get the value, we have DW_OP_fbreg, which pretty much means,

07:13.000 --> 07:20.000
take the frame base address, subtract 32, and that will give you the address of your variable.

07:20.000 --> 07:22.000
The same for the pointer to member.

07:22.000 --> 07:25.000
And we have some information in the pointer to member type.

07:25.000 --> 07:34.000
And especially DW_AT_use_location, which is used to dereference the pointer, in a way.

07:34.000 --> 07:38.000
And if you look at how it's used, let's say we have that, actually in the debugger.

07:38.000 --> 07:42.000
So we have our object, we can take its address.

07:42.000 --> 07:44.000
So 0x7f something.

07:44.000 --> 07:47.000
We can look at the value of the pointer to member.

07:47.000 --> 07:52.000
And what does the debugger do to evaluate, you know, print f.*member?

07:52.000 --> 07:58.000
And the way it works is we use that DW_AT_use_location attribute that we had.

07:58.000 --> 08:01.000
And the standard says, yeah, it's easy peasy.

08:01.000 --> 08:04.000
You initialize a stack with two values.

08:04.000 --> 08:06.000
One of them is going to be the address of your object.

08:06.000 --> 08:10.000
The other one is going to be the value of the pointer to member.

08:10.000 --> 08:18.000
And then you evaluate the expression associated with DW_AT_use_location.

08:18.000 --> 08:23.000
Let's do that real quick. We have our address on the stack here.

08:23.000 --> 08:26.000
So like 0x7f something.

08:26.000 --> 08:28.000
We have four, which is the value of the pointer to member.

08:28.000 --> 08:33.000
If you do DW_OP_plus, we pop two values, add them together,

08:33.000 --> 08:35.000
push the result back.

08:35.000 --> 08:40.000
And that's the address in memory of the field that we are looking for.

08:40.000 --> 08:43.000
Easy peasy.
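
NOTE
Editor's sketch of the DWARF 5 style walkthrough above, assuming the object
lives in memory. OBJECT_ADDRESS is a hypothetical value standing in for the
"0x7f something" address, and the function name is made up for illustration.
```python
def use_location_dwarf5(object_address, member_pointer_value):
    # Initial stack as described in the talk: the address of the object
    # and the value of the pointer to member.
    stack = [object_address, member_pointer_value]
    # DW_OP_plus: pop two values, push their sum.
    rhs = stack.pop()
    lhs = stack.pop()
    stack.append(lhs + rhs)
    # The top of the stack is the address of the field.
    return stack[-1]
OBJECT_ADDRESS = 0x7FFF00001000  # hypothetical
field_address = use_location_dwarf5(OBJECT_ADDRESS, 4)
print(hex(field_address))  # 0x7fff00001004
```
This only works because the object has an address that fits on the stack as a plain number.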

08:44.000 --> 08:49.000
Now let's forget for a second about the pointer to member.

08:49.000 --> 08:53.000
And look at what can happen to our object.

08:53.000 --> 08:56.000
Let's say we have an optimizing compiler

08:56.000 --> 08:58.000
that decides, for some reason,

08:58.000 --> 09:02.000
yeah, let's not put that value in memory,

09:02.000 --> 09:05.000
but let's promote it to registers.

09:05.000 --> 09:10.000
We do that kind of a lot on GPUs.

09:10.000 --> 09:17.000
So again, we have some way to express that in DWARF with DWARF expressions,

09:17.000 --> 09:22.000
which are composites that kind of describe this kind of layout.

09:22.000 --> 09:30.000
and how we, you know, lay out the data in different storage areas.

09:30.000 --> 09:37.000
So we have here a way to represent that in DWARF, that expression over there.

09:37.000 --> 09:41.000
We'll go and look at how that evaluates.

09:41.000 --> 09:43.000
And see what we have.

09:43.000 --> 09:49.000
So that would be the expression in DW_AT_location of the variable itself.

09:49.000 --> 09:50.000
So if we have that.

09:50.000 --> 09:53.000
So the first opcode is

09:53.000 --> 09:54.000
DW_OP_reg0.

09:54.000 --> 09:58.000
So that means we're pretty much saying,

09:58.000 --> 10:00.000
yeah, we're going to talk about register 0 here.

10:00.000 --> 10:05.000
Then we have DW_OP_piece 4, which means take four bytes out of reg 0,

10:05.000 --> 10:08.000
which is like, in our case, the other register.

10:08.000 --> 10:10.000
And start a composite with that.

10:10.000 --> 10:13.000
So we now reference four bytes out of reg 0.

10:13.000 --> 10:15.000
And then we continue.

10:15.000 --> 10:18.000
DW_OP_reg1.

10:18.000 --> 10:20.000
So we talk about register 1.

10:20.000 --> 10:24.000
And then DW_OP_piece 4, which means take four bytes of reg 1

10:24.000 --> 10:28.000
and just concatenate that after the end of the reg 0 piece.

10:28.000 --> 10:33.000
So now we have a composite which describes four bytes out of reg 0,

10:33.000 --> 10:34.000
And this is 0.

10:34.000 --> 10:35.000
then four bytes of reg 1.

10:35.000 --> 10:36.000
Which is the one which is 0.

10:36.000 --> 10:37.000
8 bytes in total.

10:37.000 --> 10:42.000
And then we have the entire value for our object.
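
NOTE
Editor's sketch of the piece-based composite just described, in the DWARF 5
style where the composite is built next to the stack. The register contents
and the ("reg", n) / ("piece", n) encoding are hypothetical.
```python
REGISTERS = {0: bytes([1, 0, 0, 0]), 1: bytes([2, 0, 0, 0])}  # hypothetical contents
def evaluate_pieces(program):
    pieces = []      # the composite being built: one (register, size) per piece
    current = None   # register named by the most recent "reg" opcode
    for opcode, operand in program:
        if opcode == "reg":        # stands in for DW_OP_reg0, DW_OP_reg1, ...
            current = operand
        elif opcode == "piece":    # stands in for DW_OP_piece
            pieces.append((current, operand))
    return pieces
def read(pieces):
    # Concatenate the described bytes to rebuild the value of the object.
    return b"".join(REGISTERS[reg][:size] for reg, size in pieces)
pieces = evaluate_pieces([("reg", 0), ("piece", 4), ("reg", 1), ("piece", 4)])
value = read(pieces)
print(pieces)      # [(0, 4), (1, 4)]
print(len(value))  # 8
```
Printing the whole object works, but nothing here is a numeric address that could be pushed on a value stack.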

10:42.000 --> 10:44.000
And that works great.

10:44.000 --> 10:46.000
You know, if you want to print the value of the object,

10:46.000 --> 10:47.000
you can do that.

10:47.000 --> 10:49.000
No biggie.

10:49.000 --> 10:53.000
But how do we make that work?

10:53.000 --> 10:55.000
How do we evaluate, you know,

10:55.000 --> 10:58.000
the print f.*member in that case?

10:58.000 --> 11:00.000
If you go back to what we had,

11:01.000 --> 11:03.000
we should create a DWARF stack

11:03.000 --> 11:04.000
with two values.

11:04.000 --> 11:08.000
One of them is going to be the value of the pointer.

11:08.000 --> 11:09.000
That's okay.

11:09.000 --> 11:12.000
And the other one should be the address of the object.

11:12.000 --> 11:14.000
But we don't have an address.

11:14.000 --> 11:17.000
So there is pretty much nothing we can do.

11:17.000 --> 11:19.000
And that's kind of, you know, one

11:19.000 --> 11:23.000
illustration of what didn't really work in DWARF 5.

11:23.000 --> 11:26.000
And that's what ended up

11:26.000 --> 11:29.000
starting all that work.

11:30.000 --> 11:36.000
There are many places in DWARF 5 where we assume that a

11:36.000 --> 11:37.000
location is an address.

11:37.000 --> 11:41.000
And we can take the address and use it as a value.

11:41.000 --> 11:43.000
And work with it on the DWARF stack because it's,

11:43.000 --> 11:46.000
you know, the numerical value of the address.

11:46.000 --> 11:48.000
And as soon as you start to optimize

11:48.000 --> 11:50.000
stuff away and, you know,

11:50.000 --> 11:52.000
things are promoted to registers or they live in

11:52.000 --> 11:54.000
different address spaces and everything.

11:54.000 --> 11:56.000
That kind of, you know, doesn't really work.

11:56.000 --> 11:58.000
So DW_AT_use_location doesn't work.

11:58.000 --> 12:01.000
We have DW_OP_push_object_address,

12:01.000 --> 12:03.000
which has kind of the same problem.

12:03.000 --> 12:07.000
And that's what DWARF 6 is trying to address.

12:07.000 --> 12:10.000
So out of the 20 pages,

12:10.000 --> 12:13.000
or so, you know, of proposal,

12:13.000 --> 12:16.000
I guess it all fits in one slide.

12:16.000 --> 12:19.000
And that's what has been proposed and now accepted.

12:19.000 --> 12:22.000
The answer is actually pretty simple.

12:22.000 --> 12:24.000
We say: we have a stack,

12:24.000 --> 12:26.000
which used to contain only values.

12:26.000 --> 12:30.000
And so now we make it so it can contain values

12:30.000 --> 12:32.000
and locations.

12:32.000 --> 12:36.000
A location is really like a generalization of what an address is.

12:36.000 --> 12:38.000
It just points to the beginning, or rather

12:38.000 --> 12:42.000
to the beginning of an object somewhere in some storage.

12:42.000 --> 12:46.000
So a location is just a tuple containing, you know,

12:46.000 --> 12:49.000
a reference to some storage and an offset into

12:49.000 --> 12:51.000
that storage.

12:51.000 --> 12:54.000
Storage can be pretty much anything which is,

12:54.000 --> 12:58.000
you know, a stream of bits, an indexable stream of bits.

12:58.000 --> 13:02.000
And we define a couple of kinds of storage here.

13:02.000 --> 13:06.000
So memory, register, composite, which we'll, you know,

13:06.000 --> 13:08.000
kind of come back to.

13:08.000 --> 13:10.000
And we have implicit and undefined.

13:10.000 --> 13:13.000
We don't really, you know, discuss those here.
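
NOTE
Editor's sketch of the location model just described: a location as a tuple
of a storage reference and an offset. The class and field names are
hypothetical, and offsets are counted in bytes here for brevity.
```python
from dataclasses import dataclass
STORAGE_KINDS = ("memory", "register", "composite", "implicit", "undefined")
@dataclass(frozen=True)
class Location:
    kind: str        # one of STORAGE_KINDS
    storage: object  # hypothetical handle: an address space, a register number, ...
    offset: int = 0  # position inside that storage
    def __post_init__(self):
        if self.kind not in STORAGE_KINDS:
            raise ValueError(f"unknown storage kind: {self.kind}")
in_memory = Location("memory", "default address space", 0x1000)  # hypothetical address
in_register = Location("register", 0)
stack = [in_register, 4]  # a stack that mixes a location and a plain value
print(stack)
```
A memory address becomes just one kind of location: an offset into a memory storage.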

13:13.000 --> 13:16.000
We have a few new operations to work with locations.

13:16.000 --> 13:20.000
So DW_OP_offset and DW_OP_bit_offset.

13:20.000 --> 13:22.000
And the rest is really

13:22.000 --> 13:24.000
editorial changes and, you know,

13:24.000 --> 13:26.000
backwards compatibility with DWARF 5.

13:26.000 --> 13:30.000
So everything you used to have in DWARF 5 still works in DWARF 6.

13:30.000 --> 13:34.000
So with that, let's go and briefly revisit,

13:34.000 --> 13:36.000
you know, the pointer to member example we had

13:36.000 --> 13:38.000
and see how that works in DWARF 6.

13:38.000 --> 13:42.000
So we have exactly the same program.

13:42.000 --> 13:44.000
And DWARF 6 kind of debug information.

13:44.000 --> 13:46.000
So we have the same variable.

13:46.000 --> 13:50.000
The location for our, you know,

13:50.000 --> 13:54.000
variable is, you know, promoted to registers.

13:54.000 --> 13:56.000
We have the pointer here.

13:56.000 --> 13:58.000
Doesn't really matter where it is.

13:58.000 --> 14:00.000
And we have some slight change to the,

14:00.000 --> 14:02.000
the pointer to member type

14:02.000 --> 14:06.000
and DW_AT_use_location, the way it's going to be defined.

14:06.000 --> 14:08.000
And so if we do the same exercise again,

14:08.000 --> 14:10.000
we need to find out what is,

14:10.000 --> 14:14.000
you know, the location of our variable.

14:14.000 --> 14:18.000
And so we have the exact same piece of DWARF.

14:18.000 --> 14:20.000
We have the same expression that we need to evaluate.

14:20.000 --> 14:22.000
So let's go and do that.

14:22.000 --> 14:24.000
So we have DW_OP_reg0,

14:24.000 --> 14:26.000
which is really defined as,

14:26.000 --> 14:28.000
push a location on the stack,

14:28.000 --> 14:30.000
which, you know, references register zero.

14:30.000 --> 14:32.000
Um, at offset zero.

14:32.000 --> 14:33.000
So let's do that.

14:33.000 --> 14:35.000
And now we have a location on the stack,

14:35.000 --> 14:36.000
which is not an address.

14:36.000 --> 14:40.000
Just a generic location that points to some bits somewhere.

14:40.000 --> 14:42.000
That's kind of the, the new thing in,

14:42.000 --> 14:43.000
in DWARF 6.

14:43.000 --> 14:45.000
And then we can, you know,

14:45.000 --> 14:48.000
carry on and execute the,

14:48.000 --> 14:49.000
the rest.

14:49.000 --> 14:51.000
So DW_OP_piece 4,

14:51.000 --> 14:54.000
we take four bytes out of,

14:54.000 --> 14:56.000
a reference to four bytes out of that,

14:56.000 --> 14:59.000
that location and create a new composite.

14:59.000 --> 15:01.000
So the text is here.

15:01.000 --> 15:02.000
You can, you know,

15:02.000 --> 15:03.000
consult it afterwards,

15:03.000 --> 15:04.000
you know, we'll publish the slides,

15:04.000 --> 15:07.000
but like I'm not going to read through that just at the moment.

15:07.000 --> 15:08.000
Um,

15:08.000 --> 15:10.000
but that's what we do.

15:10.000 --> 15:11.000
We take four,

15:11.000 --> 15:13.000
four bytes of the,

15:14.000 --> 15:16.000
um,

15:16.000 --> 15:17.000
of that register.

15:17.000 --> 15:18.000
Create a composite for that.

15:18.000 --> 15:20.000
Push the composite on the stack.

15:20.000 --> 15:21.000
And we can continue.

15:21.000 --> 15:22.000
So DW_OP_reg1.

15:22.000 --> 15:23.000
We push one new,

15:23.000 --> 15:24.000
um,

15:24.000 --> 15:26.000
new location on the stack,

15:26.000 --> 15:27.000
which is, you know,

15:27.000 --> 15:29.000
offset zero into register one.

15:29.000 --> 15:31.000
And DW_OP_piece 4,

15:31.000 --> 15:34.000
we take four bytes out of register one.

15:34.000 --> 15:36.000
that we can concatenate after the four,

15:36.000 --> 15:38.000
four bytes of register zero.

15:38.000 --> 15:39.000
That gives us a,

15:39.000 --> 15:40.000
a composite of eight bytes.

15:40.000 --> 15:42.000
And we're pointing to offset zero of,

15:43.000 --> 15:44.000
that composite.

15:44.000 --> 15:46.000
And so at the end of the evaluation,

15:46.000 --> 15:47.000
we have.

15:47.000 --> 15:48.000
That location,

15:48.000 --> 15:50.000
which is on top of the stack and we just say,

15:50.000 --> 15:51.000
whatever,

15:51.000 --> 15:52.000
you know,

15:52.000 --> 15:54.000
remains at the top of the stack at the end of the evaluation,

15:54.000 --> 15:57.000
is where you can go and look for your value.
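
NOTE
Editor's sketch of the DWARF 6 style evaluation walked through above, where
locations themselves sit on the stack. This is a simplification under stated
assumptions: the class names and the ("reg", n) / ("piece", n) encoding are
hypothetical, and the real rules for building a composite are more detailed.
```python
from dataclasses import dataclass
@dataclass(frozen=True)
class Register:
    number: int
@dataclass(frozen=True)
class Composite:
    parts: tuple  # ((Location, size_in_bytes), ...)
@dataclass(frozen=True)
class Location:
    storage: object
    offset: int = 0
def evaluate(program):
    stack = []
    for opcode, operand in program:
        if opcode == "reg":
            # Push a location: offset zero into the given register.
            stack.append(Location(Register(operand)))
        elif opcode == "piece":
            part = stack.pop()
            # Extend the composite below if there is one, else start a new one.
            if stack and isinstance(stack[-1].storage, Composite):
                parts = stack.pop().storage.parts
            else:
                parts = ()
            stack.append(Location(Composite(parts + ((part, operand),))))
    return stack
stack = evaluate([("reg", 0), ("piece", 4), ("reg", 1), ("piece", 4)])
top = stack[-1]
print(top.offset)                               # 0
print(sum(size for _, size in top.storage.parts))  # 8
```
The input program is the same as before; only the meaning of each step changes, and the result is a location left on top of the stack.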

15:57.000 --> 15:58.000
So so far,

15:58.000 --> 15:59.000
it works.

15:59.000 --> 16:00.000
Um,

16:00.000 --> 16:01.000
no issue.

16:01.000 --> 16:02.000
And as you see,

16:02.000 --> 16:04.000
we have the same outcome as in DWARF 5.

16:04.000 --> 16:06.000
So we describe the exact same layout,

16:06.000 --> 16:07.000
with the same

16:07.000 --> 16:09.000
bytecode.

16:09.000 --> 16:10.000
So we have the same expression.

16:10.000 --> 16:12.000
It remains compatible.

16:12.000 --> 16:15.000
The only thing that we really changed is the semantics of,

16:15.000 --> 16:17.000
how do we evaluate and how do we,

16:17.000 --> 16:18.000
um,

16:18.000 --> 16:21.000
go through evaluating that.

16:21.000 --> 16:23.000
But now that we have that,

16:23.000 --> 16:24.000
we can,

16:24.000 --> 16:25.000
revisit the,

16:25.000 --> 16:26.000
you know,

16:26.000 --> 16:28.000
pointer to member kind of thing.

16:28.000 --> 16:30.000
And it kind of makes sense now.

16:30.000 --> 16:32.000
If you want to evaluate DW_AT_use_location,

16:32.000 --> 16:33.000
we can do what,

16:33.000 --> 16:34.000
you know,

16:34.000 --> 16:35.000
what was impossible before,

16:35.000 --> 16:36.000
which is,

16:36.000 --> 16:38.000
we initialize a stack.

16:39.000 --> 16:40.000
with

16:40.000 --> 16:41.000
two elements.

16:41.000 --> 16:44.000
One of them is going to be the location of the object.

16:44.000 --> 16:47.000
The other one is going to be the value of the pointer.

16:47.000 --> 16:49.000
And so that's what we have here.

16:49.000 --> 16:50.000
We have a stack,

16:50.000 --> 16:51.000
which is,

16:51.000 --> 16:52.000
which mixes together

16:52.000 --> 16:54.000
Locations and values.

16:54.000 --> 16:55.000
Um,

16:55.000 --> 16:58.000
and then we can run our program as before.

16:58.000 --> 17:01.000
So DW_OP_offset.

17:01.000 --> 17:02.000
Again,

17:02.000 --> 17:04.000
you have the text here for the walls.

17:04.000 --> 17:05.000
Um,

17:05.000 --> 17:06.000
mostly for online.

17:07.000 --> 17:08.000
But um,

17:08.000 --> 17:10.000
what we do is we pop the value,

17:10.000 --> 17:12.000
which is in your initial set.

17:12.000 --> 17:13.000
We pop the.

17:13.000 --> 17:14.000
The location,

17:14.000 --> 17:15.000
which was behind it,

17:15.000 --> 17:17.000
we just modify the offset of the location.

17:17.000 --> 17:18.000
So we add,

17:18.000 --> 17:19.000
we add the offset.

17:19.000 --> 17:21.000
And we push that back.

17:21.000 --> 17:23.000
And we end up now with a composite,

17:23.000 --> 17:24.000
consisting of,

17:24.000 --> 17:26.000
consisting of register zero and register one,

17:26.000 --> 17:27.000
at,

17:27.000 --> 17:28.000
offset four,

17:28.000 --> 17:31.000
four bytes inside that composite, that's really register one.

17:31.000 --> 17:33.000
And that's where the member we're looking for is.
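
NOTE
Editor's sketch of the DW_OP_offset step described above, assuming a
simplified model: a Location holds either a register name or a tuple of
(Location, size) parts. The names op_offset and resolve are hypothetical.
```python
from dataclasses import dataclass, replace
@dataclass(frozen=True)
class Location:
    storage: object  # a register name, or a tuple of (Location, size) parts
    offset: int = 0
def op_offset(stack):
    # Pop the value, pop the location below it, push the displaced location.
    displacement = stack.pop()
    location = stack.pop()
    stack.append(replace(location, offset=location.offset + displacement))
def resolve(location):
    """Walk a composite down to the part that holds location.offset."""
    if not isinstance(location.storage, tuple):
        return location
    remaining = location.offset
    for part, size in location.storage:
        if remaining < size:
            return resolve(replace(part, offset=part.offset + remaining))
        remaining -= size
    raise ValueError("offset past the end of the composite")
obj = Location(((Location("reg0"), 4), (Location("reg1"), 4)))
stack = [obj, 4]  # location of the object, then the pointer to member value
op_offset(stack)
member = resolve(stack[-1])
print(member)  # Location(storage='reg1', offset=0)
```
Offset four into the eight-byte composite lands at the start of register one, matching the walkthrough.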

17:33.000 --> 17:35.000
And so we can go and evaluate that expression now.

17:35.000 --> 17:46.000
And really, that's all it is. So, locations on the stack: it's, when you look at it,

17:46.000 --> 17:51.000
a quite simple change in DWARF, but it's pretty fundamental in the way we define

17:51.000 --> 17:59.000
the evaluation of an expression. It simplifies quite greatly the, you know, the

17:59.000 --> 18:02.000
evaluation model, at least in my mind. We don't have, you know, the

18:02.000 --> 18:07.000
location that lives on the side and some arbitrary rules that we

18:07.000 --> 18:13.000
need to follow. We have backwards compatibility with DWARF 5. So, we keep the same

18:13.000 --> 18:20.000
bytecode, we keep the same outcome. We just change the semantics of the

18:20.000 --> 18:25.000
operations to get there. And from there, that does open up a lot of

18:25.000 --> 18:29.000
options to do more stuff like multiple address space support, which we need on

18:30.000 --> 18:35.000
GPUs. We can redefine how we create overlays, that has some

18:35.000 --> 18:39.000
applications. You know, we can do some extensions on stack unwinding, which

18:39.000 --> 18:46.000
also uses expressions and so on. And so that's pretty much it. So, we have

18:46.000 --> 18:50.000
acknowledgement. So, a big thanks to all the people involved in that, you know,

18:50.000 --> 18:55.000
DWARF for GPUs working group. They've been doing most of the work. I'm

18:55.000 --> 19:00.000
part of that, but like I don't really claim authorship of all of that. They did all

19:00.000 --> 19:05.000
the work. And thanks to everyone who helped me put that together. And I think

19:05.000 --> 19:12.000
with that, last slide, because I have to. And questions.

19:12.000 --> 19:30.000
We had one question behind, and then, yeah.

19:30.000 --> 19:33.000
So, the question is about, you know, what happens if the value is

19:33.000 --> 19:38.000
partially on the stack, partially in registers. We can create a composite

19:38.000 --> 19:43.000
location. It can reference multiple storages. So, you can say, I want to take four

19:43.000 --> 19:48.000
bytes out of memory from that address, then take two bytes of that register, then take

19:48.000 --> 19:52.000
16 bytes of memory at another address if you want to. And then you can say,

19:52.000 --> 19:56.000
then there are two bytes that have been completely optimized out. So, we

19:56.000 --> 20:01.000
use the undefined storage and so on. So, a composite doesn't have to be uniform in

20:01.000 --> 20:07.000
any way, shape or form. So, that also means when the multiple address space

20:07.000 --> 20:14.000
thing will come later, like, that will also work. Yes. And the question is, that is like

20:14.000 --> 20:19.000
an observation about when multiple address spaces come into play later. Yes, we can

20:19.000 --> 20:23.000
have objects which are spread across multiple address spaces, and that will work

20:23.000 --> 20:27.000
as a composite just fine. That was one of the main, you know, motivations initially to

20:27.000 --> 20:31.000
start all that work. I think you had a question. Yes.
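
NOTE
Editor's sketch of the non-uniform composite from the answer above: four bytes
of memory, two bytes of a register, sixteen bytes of memory elsewhere, and two
bytes that were optimized out. The addresses, the register number, and the
storage_for helper are hypothetical.
```python
parts = [
    ("memory", 0x1000, 4),    # four bytes of memory at a hypothetical address
    ("register", 5, 2),       # two bytes of a hypothetical register 5
    ("memory", 0x2000, 16),   # sixteen bytes of memory at another address
    ("undefined", None, 2),   # two bytes that were optimized out
]
def storage_for(byte_index):
    """Return (kind, where, offset_in_part) for one byte of the composite."""
    for kind, where, size in parts:
        if byte_index < size:
            return kind, where, byte_index
        byte_index -= size
    raise IndexError("past the end of the composite")
total = sum(size for _, _, size in parts)
print(total)            # 24
print(storage_for(5))   # ('register', 5, 1)
print(storage_for(23))  # ('undefined', None, 1)
```
Each byte of the object resolves to its own storage, so the parts never need to be of the same kind.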

20:31.000 --> 20:36.000
So, none of this used to be a problem because C didn't allow the address of a register

20:36.000 --> 20:40.000
variable to be taken. Yes.

20:40.000 --> 20:46.000
Then, like, no, it's not absolutely sure that we're going to have.

20:46.000 --> 20:52.000
I will come back to that. But, yeah, Karen.

20:52.000 --> 20:57.000
So, I imagine, since you're coming from the GPU field, is it that you can use

20:57.000 --> 21:02.000
more registers, putting a lot more data into the larger register banks or something like that?

21:02.000 --> 21:07.000
Yes. So, the remark is, you know, that was not a problem because C

21:07.000 --> 21:11.000
doesn't really allow you to take the address of a register anyway. And I'd argue,

21:11.000 --> 21:16.000
most of the time, if your program itself doesn't use that pointer to

21:16.000 --> 21:20.000
member, you know, in that region of your program, the compiler is absolutely

21:20.000 --> 21:24.000
allowed to promote your object into a register.

21:24.000 --> 21:28.000
Because from your program's perspective, it's just going to be the same. Your program

21:28.000 --> 21:32.000
will never take the address of your register. It's just, once you're in your

21:32.000 --> 21:36.000
debugger, you might want to do that. And, like, in my example here, it's a bit

21:36.000 --> 21:40.000
contrived because I still have, you know, the pointer to member, which is still

21:40.000 --> 21:44.000
on the stack and everything. And, like, most likely what would

21:44.000 --> 21:48.000
happen is, you know, your compiler says, I know the pointer to member,

21:48.000 --> 21:53.000
I know the value for that section. And so, I can promote stuff and make it

21:53.000 --> 21:58.000
so everything, you know, looks like it, but it's never really, like the program

21:58.000 --> 22:02.000
is never really going to do that dereference that I wrote. It's just something

22:02.000 --> 22:06.000
that you do in your debug session. So, we could have that in optimized, you know,

22:06.000 --> 22:12.000
in optimized cases on regular CPUs. And the next part of the question is, do we have

22:12.000 --> 22:18.000
that because, you know, on GPUs, we have larger register banks? And, yes,

22:18.000 --> 22:24.000
definitely. On GPUs, we have, like, really, hundreds of registers. And the

22:24.000 --> 22:28.000
accessing memory is quite expensive. So, one of the first optimizations

22:28.000 --> 22:32.000
your compiler would do is just, you know, take as much of the data as you can

22:32.000 --> 22:36.000
out of the memory and put that into registers as long as it fits. And we

22:36.000 --> 22:40.000
can have lots of that. And I don't know for all the vendors, like, for

22:40.000 --> 22:45.000
us, we do even have some way to do, like, indexing. We have some instructions

22:45.000 --> 22:50.000
where we can access, you know, where we can index into registers

22:50.000 --> 22:54.000
in a way. So, like, the move relative kind of instruction would say, like,

22:54.000 --> 22:59.000
take some data out, you know, you say register five. But then the,

22:59.000 --> 23:03.000
the semantics of the operation is just, it's register five,

23:03.000 --> 23:07.000
plus the value you have in some other register. So, we can do that on

23:07.000 --> 23:11.000
GPUs. And these are five options.

23:11.000 --> 23:12.000
These are five options.

23:12.000 --> 23:15.000
Although there are register offsets. You register number offsets.

23:15.000 --> 23:18.000
So, also, you register indices.

23:18.000 --> 23:21.000
Yes, there we do register indices. Yeah.

23:21.000 --> 23:22.000
Yeah.

23:22.000 --> 23:24.000
I don't think I have to add one more case.

23:24.000 --> 23:27.000
I don't know, there are many other cases that are going to be

23:27.000 --> 23:28.000
easily disabled.

23:28.000 --> 23:32.000
But I used to work on a specific architecture, but at a few,

23:33.000 --> 23:35.000
a few, to register back.

23:35.000 --> 23:38.000
They actually used it in stack.

23:38.000 --> 23:41.000
And then we use a lot of the indices that are going to

23:41.000 --> 23:42.000
register.

23:42.000 --> 23:44.000
And then they do basically the stack.

23:44.000 --> 23:47.000
These are the, in the basic space.

23:47.000 --> 23:48.000
Yeah.

23:48.000 --> 23:49.000
Yeah.

23:49.000 --> 23:53.000
Just, so, like, to repeat that for the mic and

23:53.000 --> 23:55.000
everyone outside of the room.

23:55.000 --> 23:58.000
So, the remark was from someone who's been working on an

23:59.000 --> 24:02.000
architecture using quite a large register bank, where the

24:02.000 --> 24:05.000
stack is actually implemented in registers.

24:05.000 --> 24:08.000
And you just offset inside your register bank.

24:08.000 --> 24:12.000
Um, which is kind of what we would be describing here.

24:12.000 --> 24:15.000
Yes.

24:15.000 --> 24:19.000
I suppose that won't work, but what about the function

24:19.000 --> 24:24.000
where those, like, the complete function,

24:25.000 --> 24:31.000
so the function is, yeah, I'm just,

24:31.000 --> 24:34.000
wrapping up that question and then I'm, I'm familiar.

24:34.000 --> 24:37.000
Uh, I guess you can start and, like, you know,

24:37.000 --> 24:38.000
I think.

24:38.000 --> 24:41.000
Um, the case of function pointers, where, you know,

24:41.000 --> 24:43.000
when you optimize out the function, or you just,

24:43.000 --> 24:47.000
know, just so much, we have the, the same problem that, like,

24:47.000 --> 24:51.000
where, we can actually take the, you know,

24:51.000 --> 24:55.000
like, where we can actually take the address of the function.

24:55.000 --> 24:57.000
It's not going to be the address of the same function.

24:57.000 --> 25:00.000
Like, DWARF has a way to tell you, I know that object exists,

25:00.000 --> 25:02.000
but I know I cannot give you the address.

25:02.000 --> 25:05.000
And, you know, there are cases where that, that's actually true.

25:05.000 --> 25:08.000
And what we're trying to make sure is to reduce,

25:08.000 --> 25:11.000
you know, the set of cases where

25:11.000 --> 25:13.000
DWARF, you know, just comes to an end and says,

25:13.000 --> 25:15.000
no, I, I cannot answer that question.

25:15.000 --> 25:18.000
to only the cases where, yes, that question is just

25:18.000 --> 25:22.000
genuinely not answerable.

25:22.000 --> 25:23.000
And I guess that's it.

25:23.000 --> 25:24.000
Yeah.

25:24.000 --> 25:25.000
That's it.

25:25.000 --> 25:26.000
That's it.

25:26.000 --> 25:28.000
Thank you very much.

