WEBVTT

00:00.000 --> 00:11.760
All right, hello everyone, my name is Jonas, and I work on debugging at Apple, and today

00:11.760 --> 00:15.000
I want to talk about WebAssembly debugging in LLDB.

00:15.000 --> 00:22.560
So I'll do my best, the speaker also kind of echoes.

00:22.560 --> 00:28.840
So this work was primarily motivated by Swift, and targeting WebAssembly from Swift has

00:28.840 --> 00:29.960
come a long way.

00:29.960 --> 00:36.280
It started originally as a community project: Akio created WAKit as an open source project

00:36.280 --> 00:42.200
in 2018, and it was the first WebAssembly runtime written in Swift, targeting Swift.

00:42.200 --> 00:47.320
And then a few years later, Max, who may or may not be in the room, there he is, he took

00:47.320 --> 00:52.240
over, I mean, the ownership of the project, and he renamed it to WasmKit.

00:52.240 --> 00:58.560
After that, Yuta interned at Apple, and together with Max, they got 100% spec test coverage

00:58.680 --> 01:01.680
for WebAssembly with WasmKit.

01:01.680 --> 01:09.280
And then with the release of Swift 6.0 in 2024, Swift targeting WebAssembly became an experimental

01:09.280 --> 01:16.680
feature, and also in Swift CI, we adopted WasmKit to test the targeting, so the compiler support.

01:16.680 --> 01:23.640
And then last year in 2025, all their hard work culminated in WebAssembly being officially

01:23.640 --> 01:29.440
supported as of Swift 6.2, and WasmKit now ships as part of the toolchain.

01:29.440 --> 01:33.760
And of course, all of this is open source, I know the kit at the end might be kind of confusing.

01:33.760 --> 01:38.520
The runtime is developed under the SwiftWasm organization on GitHub, and of course, the compiler

01:38.520 --> 01:40.800
support is just in upstream Swift.

01:40.800 --> 01:44.880
And if you want to learn more, check out SwiftWasm.org.

01:44.880 --> 01:50.320
So our goal for Swift compiled to WebAssembly is to provide a first-class debugging experience,

01:50.320 --> 01:54.800
and that means full source-level debugging, where you can set breakpoints, step into and

01:54.800 --> 01:58.320
over Swift code, and of course, show your variables.

01:58.320 --> 02:03.200
And so in order to achieve that, the debugger needs to know about the Swift programming language,

02:03.200 --> 02:04.920
which means that there's two approaches.

02:04.920 --> 02:10.920
One, we could teach existing tools that debug WebAssembly about Swift, or we could

02:10.920 --> 02:15.720
teach LLDB, which already knows about Swift, about WebAssembly.

02:15.720 --> 02:20.200
So maybe the first approach sounds simpler on the surface, but it really isn't.

02:20.200 --> 02:24.800
Debugging Swift is far from trivial, and LLDB already has years and years of investment

02:24.800 --> 02:26.200
in that space.

02:26.200 --> 02:30.520
On top of that, if we add WebAssembly support to LLDB, that doesn't just benefit Swift,

02:30.520 --> 02:34.080
but pretty much every language that is supported by LLDB.

02:34.080 --> 02:37.080
And that's why we thought that was the better approach.

02:37.080 --> 02:42.080
So before we commit to any approach, I do want to compare a few options that exist for

02:42.080 --> 02:44.760
debugging, or existed before I started all this.

02:44.760 --> 02:50.120
And we identified three main approaches, each with their own pros and cons, and an interesting

02:50.120 --> 02:55.520
observation here is that they all use LLDB one way or another.

02:55.520 --> 03:00.200
If you've played around with WebAssembly, you're probably already familiar with Wasmtime.

03:00.200 --> 03:04.080
It is the reference implementation of a WebAssembly runtime, and it's developed by the

03:04.080 --> 03:05.880
Bytecode Alliance.

03:05.880 --> 03:10.640
It uses just-in-time compilation to generate native machine code with debugging

03:10.640 --> 03:11.640
info.

03:11.640 --> 03:16.160
And so debugging in Wasmtime means debugging the runtime, and the JITed code just runs

03:16.160 --> 03:18.680
as part of that same process.

03:18.680 --> 03:23.280
And so you can debug Wasmtime this way with either GDB or LLDB.

03:23.280 --> 03:26.720
And to the debugger, it just looks like you're debugging native code.

03:26.720 --> 03:29.720
And so what's neat about this approach is that there are no changes necessary,

03:29.720 --> 03:32.640
LLDB doesn't need to know about WebAssembly at all.

03:32.640 --> 03:36.840
The trade-off on the other hand is that you are debugging the runtime, and so it can make

03:36.840 --> 03:40.800
it kind of hard to distinguish between the runtime code and the code you're trying

03:40.800 --> 03:41.800
to debug.

03:41.800 --> 03:44.440
For example, if you're looking at a backtrace, you're going to get all these frames coming

03:44.440 --> 03:48.600
from the runtime that you have to mentally filter out.

03:48.600 --> 03:50.720
Chrome takes a different approach.

03:50.720 --> 03:55.600
They have a DevTools extension, which adds support for debugging C and C++ code,

03:55.600 --> 03:58.960
right from within the browser's built-in developer tools.

03:58.960 --> 04:03.720
And this extension also uses LLDB under the hood, and it uses that to parse DWARF and create

04:03.720 --> 04:08.680
types, but pretty much everything else is done by the browser itself.

04:08.680 --> 04:12.400
And so as a developer, that's nice because you can use the same tools for debugging JavaScript

04:12.400 --> 04:14.520
as you do for WebAssembly.

04:14.520 --> 04:18.680
The downside is that for Swift support, or any other language, you need to extend the browser

04:18.680 --> 04:22.840
and teach it about it, which is one of the things we wanted to avoid.

04:22.840 --> 04:28.480
The third approach is the WebAssembly Micro Runtime, sometimes abbreviated as WAMR, and

04:28.480 --> 04:33.360
it's also developed by the Bytecode Alliance, and it's a lightweight standalone runtime

04:33.360 --> 04:36.720
targeting embedded and IoT applications.

04:36.720 --> 04:42.000
And this has a small debug server that talks the GDB remote protocol, which is an industry

04:42.000 --> 04:47.440
standard that's supported by both GDB and LLDB, despite the name.

04:47.440 --> 04:51.240
It's also the approach we started pursuing, and the debug stub in the Micro Runtime meant

04:51.240 --> 04:55.600
that we had something that already existed and we could work with.

04:55.600 --> 04:58.080
So at this point, you're probably curious what this looks like.

04:58.080 --> 05:01.680
So I'm going to jump straight in with a demo and show you what it looks like before

05:01.680 --> 05:05.040
I talk about boring implementation details.

05:05.040 --> 05:08.960
So hopefully, that's not too small, but I realize that I might have underestimated the

05:08.960 --> 05:10.440
size of the screen.

05:10.440 --> 05:14.840
So this is a short recording of Swift, compiled to WebAssembly, and we're using

05:14.840 --> 05:17.680
LLDB in Visual Studio Code to debug it.

05:17.680 --> 05:20.760
The code is fairly trivial, especially for the people in the back.

05:20.760 --> 05:26.920
We got a dictionary mapping some fruits to prices, and then we have a kind of contrived function

05:26.920 --> 05:28.920
that just adds entries to the map.

05:28.920 --> 05:30.920
So I'm going to start playing the demo.

05:30.920 --> 05:36.080
So I'm going to set a breakpoint here, and then I'm going to run. I already compiled this,

05:36.080 --> 05:40.600
and you can see we got a backtrace, exactly as you would expect.

05:40.600 --> 05:44.000
Then we have local variables.

05:44.000 --> 05:47.760
Show you, this is the input to the function, function arguments.

05:47.760 --> 05:50.800
Then I continue, so you're going to see the values changing, because we're in the next

05:50.800 --> 05:53.000
iteration of this function getting called.

05:53.000 --> 05:55.240
And I'm going to continue stepping out of it.

05:55.360 --> 05:59.600
Here I'm just proving that those values look the way you expect, so I continue, or

05:59.600 --> 06:04.960
step out of it, and then here's the fruit prices variable that we can also inspect.

06:04.960 --> 06:08.800
So far, everything just looks like native debugging, so to prove to you that this is

06:08.800 --> 06:15.760
actually debugging WebAssembly, I'm doing a disassemble here to show you all the instructions.

06:15.760 --> 06:19.360
But in order to get here, we had to build several key pieces.

06:19.360 --> 06:22.160
So let me talk about that next.

06:22.160 --> 06:27.160
One of the key design decisions was using the GDB remote protocol, so I want to start out with

06:27.160 --> 06:29.280
an overview of that architecture.

06:29.280 --> 06:34.160
So in a native debug session, LLDB uses a client-server architecture, where the client

06:34.160 --> 06:39.680
is LLDB itself, and it runs on the host, and then it communicates with a debug stub, which

06:39.680 --> 06:44.160
runs next to the program you're debugging, which we call the inferior.

06:44.160 --> 06:49.360
The stub is a lightweight and tiny binary that directly controls the inferior, and usually

06:49.360 --> 06:53.040
does so with help from the operating system, for example, on macOS and Linux that's

06:53.040 --> 06:54.840
going to be ptrace.

06:54.840 --> 06:58.360
And also I want to call out that the same client server architecture is used regardless

06:58.360 --> 07:02.160
of whether you're debugging remotely or locally, in the local case, like you know the host

07:02.160 --> 07:05.000
and the target are just the same machine.

07:05.000 --> 07:10.360
The GDB remote protocol is simple, it's well documented, it's widely adopted, and it's

07:10.360 --> 07:15.840
supported by a bunch of tools, including LLDB and GDB, and I realized that this year, it

07:15.840 --> 07:19.320
has existed for 40 years, so it has really proven itself.

07:19.320 --> 07:22.240
Which is the reason we want to build on top of that.

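NOTE
The framing that makes the GDB remote protocol simple can be sketched in a few lines of Python. A packet is sent as $payload#checksum, where the checksum is the modulo-256 sum of the payload bytes, written as two hex digits. The function names below are hypothetical, and escaping, acknowledgements, and run-length encoding are left out of this sketch.
```python
def frame_packet(payload: str) -> str:
    """Wrap a payload as $<payload>#<checksum>."""
    checksum = sum(payload.encode("ascii")) % 256  # modulo-256 sum of the payload bytes
    return f"${payload}#{checksum:02x}"
def parse_packet(packet: str) -> str:
    """Return the payload of a framed packet after verifying its checksum."""
    if len(packet) < 4 or not packet.startswith("$") or packet[-3] != "#":
        raise ValueError("not a framed packet")
    payload, received = packet[1:-3], packet[-2:]
    expected = f"{sum(payload.encode('ascii')) % 256:02x}"
    if expected != received.lower():
        raise ValueError("checksum mismatch")
    return payload
```
With this sketch, frame_packet("OK") gives "$OK#9a", and parse_packet reverses it.
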
07:22.240 --> 07:26.080
So here's what that architecture looks like for WebAssembly.

07:26.080 --> 07:30.560
Here the runtime would implement the GDB remote stub, and how it does that is entirely implementation

07:30.560 --> 07:31.560
defined.

07:31.560 --> 07:37.080
Different runtimes can choose to optimize for their own constraints.

07:37.080 --> 07:42.600
And then, in just the normal way, LLDB will talk to the debug stub using the GDB remote protocol,

07:42.600 --> 07:45.920
and that way control the WebAssembly inferior.

07:45.920 --> 07:50.960
And the key takeaway here is that this approach provides a standardized way for any debugger

07:50.960 --> 07:56.480
to talk to any WebAssembly runtime that exposes the GDB remote protocol, and the nice thing here

07:56.480 --> 08:00.040
is that LLDB doesn't need to know about the implementation of the runtime and the runtime

08:00.040 --> 08:04.440
doesn't need to be aware of the debugger debugging it.

08:04.440 --> 08:09.280
And also WebAssembly is modeled around a stack-based virtual machine, and as we'll see later

08:09.280 --> 08:15.840
that poses some concepts that can't directly be translated into concepts in the GDB remote protocol,

08:15.840 --> 08:21.120
and so this requires a handful of extensions that both the runtime and the debugger need to support.

08:21.120 --> 08:28.080
Luckily the protocol was always designed to be extensible, and the majority of the packets are all standard.

08:28.080 --> 08:33.080
So let's take a look at the implementation, and previous to this work LLDB already had some

08:33.080 --> 08:38.760
Wasm support, in particular for the Chrome DevTools integration that I mentioned earlier.

08:38.760 --> 08:45.200
It supports reading Wasm binaries, loading them into memory, and then using the DWARF to create types.

08:45.200 --> 08:50.320
And the Micro Runtime's debug stub was also designed to work with LLDB.

08:50.320 --> 08:55.040
The repository contained a patch, which was based on work from Paolo Severini.

08:55.040 --> 09:00.320
He was the driving force behind some of the earlier existing WebAssembly support in LLDB,

09:00.320 --> 09:04.960
and some of my work is a continuation of some of the PRs he put up there.

09:04.960 --> 09:10.040
And we also made it a goal to remain compatible with the debug stub in the Micro Runtime.

09:10.040 --> 09:15.160
During this process we encountered some places where we might have made slightly different design decisions,

09:15.160 --> 09:19.160
but nothing that warrants breaking compatibility.

09:19.160 --> 09:21.240
All right, object files.

09:21.240 --> 09:28.280
So the first challenge here was that LLDB's existing object file Wasm plugin was rather rudimentary.

09:28.280 --> 09:33.960
To keep things simple, we were looking for a handful of known sections and ignoring everything else.

09:33.960 --> 09:39.720
And we needed to expand this to read arbitrary sections, in particular Swift has metadata,

09:39.720 --> 09:43.000
or language-specific metadata sections that we needed to read.

09:43.080 --> 09:48.680
And most of the work was reading the spec, knowing how to parse those data sections.

09:48.680 --> 09:53.480
WebAssembly generally follows the ELF model where sections contain segments.

09:53.480 --> 09:57.560
The one interesting thing is that those segments can be active or passive.

09:57.560 --> 10:02.680
And active segments are automatically loaded into memory during module initialization,

10:02.680 --> 10:06.280
and they use what's called an init expression to specify where.

10:06.280 --> 10:11.320
And an init expression consists of a series of Wasm operations, or a subset of them,

10:11.400 --> 10:16.200
which meant that we had to implement a tiny Wasm interpreter in our object file plugin,

10:16.200 --> 10:19.800
which was a fun exercise, as you can imagine.

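NOTE
A minimal sketch of what such a tiny interpreter might look like, written in Python for brevity. It assumes the simplest init expression, `i32.const N` followed by `end` (opcodes 0x41 and 0x0B in the WebAssembly binary format, with N stored as a signed LEB128 integer). The function names are hypothetical and this is not LLDB's actual code.
```python
I32_CONST, END = 0x41, 0x0B  # opcodes from the WebAssembly binary format
def read_sleb128(data: bytes, pos: int) -> tuple:
    """Decode a signed LEB128 integer; return (value, next position)."""
    result, shift = 0, 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        shift += 7
        if not byte & 0x80:  # last group has the continuation bit clear
            if byte & 0x40:  # sign bit of the final group
                result -= 1 << shift
            return result, pos
def eval_init_expr(expr: bytes) -> int:
    """Evaluate an init expression of the form `i32.const N; end`."""
    if not expr or expr[0] != I32_CONST:
        raise ValueError("only i32.const is handled in this sketch")
    value, pos = read_sleb128(expr, 1)
    if pos >= len(expr) or expr[pos] != END:
        raise ValueError("expected the end opcode")
    return value
```
For an active data segment placed at offset 1024, the bytes 41 80 08 0B evaluate to 1024.
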
10:20.680 --> 10:24.920
The next challenge was supporting symbols, and for that we needed a symbol table.

10:24.920 --> 10:29.480
But WebAssembly's concept of a symbol table is exclusively used for linking,

10:29.480 --> 10:35.560
and it isn't preserved in the final binary, but Wasm does provide something called a name section,

10:35.560 --> 10:38.360
which is one of those new sections we had to parse.

10:38.360 --> 10:42.520
And the name section contains the function names, and an offset, or actually an index

10:42.520 --> 10:47.720
into the function section, and the function section contains an offset into the code section,

10:47.720 --> 10:50.920
together with a size, and I think that's it.

10:50.920 --> 10:54.920
And so we had to combine all of this together, the name, the size, the offset,

10:54.920 --> 10:57.960
and then populate LLDB's concept of a symbol table.

10:57.960 --> 11:00.600
And so with that you can now set breakpoints by symbol name,

11:00.600 --> 11:03.000
even if you do not have DWARF debugging information.

11:03.000 --> 11:09.000
Now, now we can set, so we have a symbol table, we can set breakpoints,

11:09.000 --> 11:10.200
we can run to them.

11:10.200 --> 11:13.400
The next thing you want to do is be able to see where you are.

11:13.400 --> 11:15.400
So for that we use the bt command.

11:15.400 --> 11:20.760
And so normally LLDB would examine the frame pointer register, and then walk the stack.

11:21.800 --> 11:27.800
But unwinding depends on concepts such as registers and an ABI-defined stack layout,

11:27.800 --> 11:29.800
and these things just don't exist in WebAssembly.

11:30.760 --> 11:34.440
So for native code, LLDB has to do the unwinding itself,

11:34.440 --> 11:37.960
but we realized that for WebAssembly, we can rely on the runtime,

11:37.960 --> 11:41.080
which already needs to know this information.

11:41.080 --> 11:43.480
And this brings us to our first extension packet,

11:43.480 --> 11:45.480
to the GDB remote protocol.

11:45.480 --> 11:50.680
It's the qWasmCallStack packet, and this queries the runtime for a list of program counters,

11:50.680 --> 11:53.240
representing the call stack for the current thread.

11:53.240 --> 11:57.160
LLDB then symbolicates them using the information from the symbol table,

11:57.160 --> 12:00.760
or from the DWARF debug info, and then you get the backtrace, as you would expect.

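NOTE
A rough Python sketch of the debugger side of this exchange. The wire details are assumptions rather than something stated in the talk: the query is taken to be qWasmCallStack:<thread id in hex>, and the reply a hex string of 64-bit little-endian program counters, one per frame. The function names are hypothetical.
```python
import struct
def call_stack_query(thread_id: int) -> str:
    """Build the query payload (assumed shape: qWasmCallStack:<tid in hex>)."""
    return f"qWasmCallStack:{thread_id:x}"
def decode_call_stack(reply_hex: str) -> list:
    """Decode a reply assumed to hold hex-encoded 64-bit little-endian PCs."""
    raw = bytes.fromhex(reply_hex)
    if len(raw) % 8:
        raise ValueError("reply is not a whole number of 64-bit values")
    # One program counter per frame, innermost frame first.
    return [pc for (pc,) in struct.iter_unpack("<Q", raw)]
```
Each decoded program counter would then be symbolicated against the symbol table or the DWARF debug info.
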
12:02.680 --> 12:03.880
Next up is variables.

12:05.000 --> 12:08.200
For that, we need a little bit of background about how debuggers work

12:08.840 --> 12:10.520
to parse debug information.

12:10.520 --> 12:14.920
If you were in the talk this morning about DWARF 6, you might have already been prepared.

12:16.120 --> 12:21.000
So debuggers use DWARF location descriptions to find and recover variable values

12:22.360 --> 12:25.560
at runtime, and these things can take several forms.

12:25.640 --> 12:29.240
So they can be empty, in case the value is unavailable,

12:29.240 --> 12:32.040
because the variable has been optimized out.

12:32.040 --> 12:35.880
They can be implicit when the value is known, but there's no runtime representation,

12:35.880 --> 12:37.400
something like a constant.

12:37.400 --> 12:40.840
If a variable is in memory, you have a memory location, and we get an address,

12:40.840 --> 12:44.760
and if it lives in a register, we have a register location, and we get a register name.

12:45.640 --> 12:50.200
Empty and implicit locations work exactly the same way in WebAssembly as they do for native code,

12:50.840 --> 12:53.960
but memory and registers behave slightly differently.

12:54.920 --> 13:00.440
WebAssembly doesn't have a concept of registers, and so therefore no register location descriptions.

13:01.000 --> 13:05.640
However, WebAssembly has a few other places where it can store values, namely locals,

13:05.640 --> 13:07.480
globals, and an operand stack.

13:07.480 --> 13:11.720
And so to handle those cases, WebAssembly uses something called virtual registers,

13:11.720 --> 13:15.160
which is entirely a debugging concept, to describe this in DWARF,

13:15.160 --> 13:17.240
and have the debugger query the runtime.

13:17.960 --> 13:22.200
So when a value is stored in a WebAssembly local, a global, or on the operand stack,

13:22.600 --> 13:28.280
you use a DWARF vendor extension called DW_OP_WASM_location. It takes two arguments,

13:28.280 --> 13:32.120
the first one specifying which one of these three it is, and the second one specifying the

13:32.120 --> 13:34.440
index like the first or the second or the third global.

13:36.520 --> 13:41.320
Each register also has its own corresponding GDB remote packet, these are also extensions,

13:41.320 --> 13:45.640
that the debugger then uses when it encounters it to query the runtime and get the value back out.

13:46.440 --> 13:51.000
So let's look at an example for a function argument,

13:51.160 --> 13:56.520
so in native code, depending on the ABI, you might expect that to be passed in a register and get a register name.

13:57.080 --> 14:03.400
So in WebAssembly, we will get a DWARF location description that uses DW_OP_WASM_location.

14:03.400 --> 14:08.200
And so in this particular case, the value is stored as a function local at index two,

14:08.200 --> 14:09.560
so the second function local.

14:09.560 --> 14:11.160
So when LLDB encounters this,

14:11.160 --> 14:16.200
while parsing the DWARF expression, it's going to query the runtime with a qWasmLocal

14:16.920 --> 14:21.320
packet with index two, get the value back, and show it to you.

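NOTE
A small Python sketch of how the two DW_OP_WASM_location arguments could be turned into a query for the runtime. The numeric kind values (0 local, 1 global, 2 operand stack), the packet names other than qWasmLocal, and the name:frame;index payload shape are assumptions for illustration, not details given in the talk. All identifiers are hypothetical.
```python
from enum import IntEnum
class WasmLocationKind(IntEnum):
    LOCAL = 0          # function local (assumed value)
    GLOBAL = 1         # module global (assumed value)
    OPERAND_STACK = 2  # slot on the operand stack (assumed value)
# Assumed packet name for each kind of virtual register.
PACKET_NAMES = {
    WasmLocationKind.LOCAL: "qWasmLocal",
    WasmLocationKind.GLOBAL: "qWasmGlobal",
    WasmLocationKind.OPERAND_STACK: "qWasmStackValue",
}
def query_for_location(kind: int, index: int, frame: int = 0) -> str:
    """Map a (kind, index) pair to the payload of the matching query packet."""
    name = PACKET_NAMES[WasmLocationKind(kind)]
    return f"{name}:{frame};{index}"
```
For the function argument in the example, query_for_location(0, 2) gives "qWasmLocal:0;2".
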
14:23.320 --> 14:28.520
Variables located in memory behave pretty much the same and can use standard DWARF memory locations,

14:29.160 --> 14:32.120
but WebAssembly's architecture requires a little bit of care.

14:32.120 --> 14:38.360
Specifically, WebAssembly follows a segmented memory model, where code and data

14:39.000 --> 14:43.640
live in separate address spaces. Also, different modules each have their own address space.

14:44.360 --> 14:47.240
And LLDB doesn't currently have address space support,

14:47.240 --> 14:51.240
which is something that's come up a few times today, so we needed a creative solution,

14:51.240 --> 14:56.680
and what we did is we used the top 32 bits of a 64-bit address to encode the address space.

14:56.680 --> 15:02.920
We used the first two bits for the type, so like code or memory, and then the remaining

15:02.920 --> 15:07.320
30 bits we used to encode the module. And so this approach works well for 32-bit

15:07.320 --> 15:12.760
Wasm, which is still very much the default, but obviously for WebAssembly 64, we'll need all 64 bits

15:12.760 --> 15:17.160
for the address, and we will need to come up with something different. Luckily, this address space

15:17.160 --> 15:21.560
support is something that we've been discussing on the forums. I need to hurry up.

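NOTE
The address encoding described above can be sketched in Python as follows: the low 32 bits hold the offset, the next 30 bits the module, and the top 2 bits the type. The tag values for memory and code and all names are illustrative assumptions, not taken from the talk.
```python
OFFSET_BITS, MODULE_BITS, TYPE_BITS = 32, 30, 2
TYPE_MEMORY, TYPE_CODE = 0, 1  # illustrative tag values (assumed)
def encode_address(addr_type: int, module_id: int, offset: int) -> int:
    """Pack type, module, and 32-bit offset into one 64-bit address."""
    if not 0 <= addr_type < (1 << TYPE_BITS):
        raise ValueError("type does not fit in 2 bits")
    if not 0 <= module_id < (1 << MODULE_BITS):
        raise ValueError("module id does not fit in 30 bits")
    if not 0 <= offset < (1 << OFFSET_BITS):
        raise ValueError("offset does not fit in 32 bits")
    return (addr_type << 62) | (module_id << 32) | offset
def decode_address(address: int) -> tuple:
    """Split a 64-bit address back into (type, module id, offset)."""
    offset = address & ((1 << OFFSET_BITS) - 1)
    module_id = (address >> OFFSET_BITS) & ((1 << MODULE_BITS) - 1)
    addr_type = (address >> 62) & ((1 << TYPE_BITS) - 1)
    return addr_type, module_id, offset
```
The range checks show why this breaks down for 64-bit WebAssembly: once the offset needs all 64 bits, no room is left for the type and module fields.
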
15:22.840 --> 15:27.240
Although Swift uses DWARF as its debugging format, for anything but trivial types,

15:27.240 --> 15:32.440
we also need the Swift Reflection metadata, for example, to resolve the concrete type of

15:32.440 --> 15:37.880
a generic at runtime. And so Swift uses a common library, libswiftReflection,

15:37.880 --> 15:44.200
and it's used at runtime for performing reflection, and it's used by the debugger to generate types.

15:44.920 --> 15:49.640
And so what that means is it's nice to have one implementation, so we don't have to duplicate the code.

15:49.640 --> 15:53.000
What that means is that this library also needs to be able to parse

15:53.000 --> 15:58.520
those sections I mentioned earlier, and so we had to redo some of that work in libswiftReflection.

15:58.600 --> 16:04.040
But the thing is that that was pretty much all it took to support Swift, because all the other things

16:04.040 --> 16:09.160
are built on top of the primitives provided by the GDB remote protocol, and we also didn't

16:09.160 --> 16:15.240
need to modify the runtime or anything to make this all work. Finally, we introduced a

16:15.240 --> 16:20.440
WebAssembly platform to LLDB, and so when you create a target in LLDB with a WebAssembly

16:20.440 --> 16:25.240
triple, this platform will automatically get selected, and so one of the responsibilities of the

16:25.240 --> 16:30.200
platform is launching binaries. And this platform can be configured with a specific runtime,

16:30.200 --> 16:34.200
so that when you load it in LLDB, and you type run, it's automatically going to launch that

16:34.200 --> 16:39.800
under that runtime, connect to the GDB stub, and make it look like you're just debugging something natively.

16:39.800 --> 16:44.600
And once configured, like this thing disappears, and you get the user experience that you're used to.

16:46.200 --> 16:50.840
So with all that, I'm happy to say that I think we delivered on our original goal,

16:50.840 --> 16:54.440
and LLDB now has first-class debugging support for WebAssembly,

16:54.440 --> 16:59.880
and not just for Swift, but for any language that is supported. We also succeeded in building a

16:59.880 --> 17:05.240
solution that's not tied to a particular runtime. We already have three runtimes today that support

17:05.240 --> 17:10.600
debugging with LLDB this way, so besides the Micro Runtime and WasmKit, there's also support

17:10.600 --> 17:16.360
in WebKit's JavaScriptCore engine. And so all these runtimes support the protocol extensions

17:16.360 --> 17:20.920
that I discussed earlier. They are formally documented on the LLDB website, and I hope to see

17:20.920 --> 17:28.120
more debuggers and runtimes adopt them. So what's next? The immediate priority for us is to

17:28.120 --> 17:33.080
extend our test suite to build all our test binaries for WebAssembly, and then run them under a runtime.

17:33.080 --> 17:38.600
That's going to allow us to reuse our thousands of existing tests to test our WebAssembly support,

17:38.600 --> 17:43.880
and hopefully this will also help uncover bugs in the different runtimes' GDB stubs.

17:44.680 --> 17:48.680
And beyond that, we'll want to support more Swift features as they make their way over to WebAssembly,

17:49.320 --> 17:54.040
and then address spaces are something we'll need to do in order to get Wasm64 support going.

17:55.080 --> 18:00.520
I want to thank everyone that made it possible to make this happen: Paolo for his work on WebAssembly

18:00.520 --> 18:06.280
and LLDB, Adrian Prantl for his work on the Swift parts, David Spickett for reviewing my PRs,

18:06.280 --> 18:11.480
Max and Yuta for their work on WasmKit, and Yija show and Mark on the WebKit team for

18:11.480 --> 18:16.520
adding this to JavaScriptCore. Thank you. I think there should be a little bit of time for questions.

