WEBVTT

00:00.000 --> 00:12.000
All right. Hello, everyone. I'm Martin and I'm going to talk a little bit about my tool chain,

00:12.000 --> 00:22.800
LLVM-MinGW. So first off, for a little context, what is MinGW? It's a weird acronym that stands for

00:22.800 --> 00:29.040
Minimalist GNU for Windows. Most of you, when you hear MinGW, probably think of GCC on Windows or GCC

00:29.040 --> 00:35.600
for Windows, but a traditional MinGW tool chain usually consists of GCC, it has GNU binutils,

00:35.600 --> 00:41.840
and you have the MinGW component, where the MinGW component is what provides the Windows headers

00:41.840 --> 00:47.600
and libraries. So this is a standalone re-implementation of the whole Windows SDK, which makes that

00:47.600 --> 00:53.200
freely redistributable, which makes things much, much easier for open source use.

00:54.000 --> 01:02.800
And these days the MinGW component is essentially the MinGW-w64 fork. And also to point out,

01:02.800 --> 01:09.200
MinGW is building totally normal native Windows executables. It's not about emulating

01:09.360 --> 01:16.720
POSIX, which is what Cygwin does. So what does LLVM do in this context? When you're targeting

01:16.720 --> 01:22.880
Windows, LLVM can emulate both MSVC and GCC depending on what you pass as the target argument.

01:25.840 --> 01:33.520
So with LLVM, you can emulate what GCC does there. So then what is LLVM-MinGW?

01:34.160 --> 01:42.560
Well, it's a MinGW tool chain where we swapped out all of the GNU components for their LLVM counterparts.

01:42.560 --> 01:48.160
So instead of GCC, you have Clang. Instead of the GNU linker, you have LLD. Instead of

01:48.160 --> 01:55.840
libgcc, you have compiler-rt and libunwind. Instead of libstdc++, you have libc++, and then you have

01:55.840 --> 02:03.120
LLDB as the debugger. So it's a complete tool chain set up from scratch only using the LLVM components.

02:03.120 --> 02:09.040
It uses none of the GNU parts. And if you're contemplating this today, it doesn't sound too hard or

02:09.040 --> 02:14.640
weird to do. But back many years ago, when I started doing this, it wasn't quite as straightforward.

02:16.400 --> 02:21.440
Then for the why: why would you do this? Well, if you want to target ARM or ARM64,

02:21.440 --> 02:27.040
you might want to do this. This tool chain also gives you support for the Microsoft PDB debug format.

02:27.200 --> 02:34.400
It gives you sanitizers, which is nice to have. It gives you C++11 threads without using the

02:34.400 --> 02:41.280
winpthreads library. It gives you Windows native TLS instead of the emulated TLS. You get working

02:41.280 --> 02:48.160
weak symbols. And for other parts, you get one single tool chain that lets you cross compile

02:48.160 --> 02:53.840
and target all of the four architectures that Windows runs on at the moment instead of having to have

02:53.840 --> 03:00.560
four different tool chains for this. And then from the developer perspective, it's built on a

03:00.560 --> 03:07.200
modern code base, which is quite easy to work on, easy to fix new things on. These were the

03:07.200 --> 03:15.040
selling points of this tool chain in say 2019. To be completely honest, though, GCC is catching up

03:15.040 --> 03:21.040
on many of these fronts. So for example, for ARM64 support, there is support that's in progress.

03:21.040 --> 03:27.520
It's not completely done, I'm sure, but it's getting there. There is work on PDB. There is,

03:27.520 --> 03:32.080
I have seen some patches somewhere for sanitizers. I don't know if they're being upstreamed,

03:32.080 --> 03:39.040
but they do exist in some form somewhere. There are C++11 threads without winpthreads,

03:39.040 --> 03:47.040
since a couple of releases. There's been patches for native TLS on x86, merged a couple of months ago,

03:47.040 --> 03:52.480
but you don't get the single tool chain targeting all four architectures.

03:53.840 --> 03:58.880
So the goal of the LLVM-MinGW tool chain is that it's supposed to be drop-in compatible

03:58.880 --> 04:05.920
for any existing project that is ready to be built in a MinGW context. Both that you can have

04:07.600 --> 04:15.520
the source is supposed to work as it used to, but also making it compatible with the build tools.

04:15.680 --> 04:20.400
And the build system, whatever they expect from a MinGW kind of tool chain.

04:22.400 --> 04:28.320
So I'm going to talk about how this project got started, which is the more interesting part.

04:28.880 --> 04:34.480
So I have a background in multimedia, in FFmpeg and VLC and related projects. I have been

04:34.480 --> 04:40.160
porting FFmpeg to weird mobile platforms since 2006, Symbian and Windows CE and these kinds of things.

04:41.040 --> 04:45.760
And in 2014, VLC had the desire to run on Windows Phone.

04:48.720 --> 04:54.960
And the problem is that VLC is very heavily tied to being built with MinGW tool chains,

04:54.960 --> 05:00.480
VLC for Windows. They're building over 100 third-party libraries, most of them at the time

05:00.480 --> 05:05.360
built with Autoconf, Automake and so on, and you primarily cross-compile this from

05:05.360 --> 05:13.840
Unix. And in upstream GCC, there was no support for anything other than x86,

05:13.840 --> 05:17.840
if you want to target Windows. They tried to have a GCC maintainer involved in this,

05:17.840 --> 05:23.520
and there were some patches, but it didn't really go anywhere. So I tried to help them out,

05:23.520 --> 05:30.320
and I came up with a proof of concept. So you can wrap the MSVC cl.exe compiler, wrap it in a shell script

05:30.400 --> 05:38.400
that takes GCC-style command line arguments and then invokes cl.exe internally. I gave them a proof

05:38.400 --> 05:43.360
of a concept that worked for the most trivial library, and they took it on, and made it actually

05:43.360 --> 05:50.640
work for them. So they did ship VLC for Windows Phone with this. It was most certainly

05:50.640 --> 05:58.080
not pleasurable to use, but it was possible to build things with it. So they still had some

05:58.080 --> 06:02.640
desire. They wanted to have a proper MinGW tool chain to target ARM, and they got

06:03.680 --> 06:10.480
aware of the fact that LLVM did support generating code for Windows on ARM, and LLVM did support

06:10.480 --> 06:17.280
targeting MinGW, if you had an existing MinGW tool chain. But they were missing the linker,

06:17.280 --> 06:22.800
they were missing building all of the runtime libraries, they were missing building the import libraries

06:22.800 --> 06:29.440
that you need, and tying it all together. So they had involved a guy called Martell Malone in

06:29.440 --> 06:39.920
around 2015, and I got involved in this also in 2016. Then as a completely separate track, in 2017,

06:39.920 --> 06:48.640
there were lots of rumors about Windows coming to ARM64. If you downloaded the Windows SDK from

06:48.720 --> 06:54.880
Microsoft, it did show that there were import libraries for ARM64, and it had a handful of

06:54.880 --> 07:01.920
executables that you can't run anywhere because this thing doesn't exist. But then people from the

07:01.920 --> 07:09.760
Wine project, a guy called André, had added support to Wine for running ARM64 binaries with Wine

07:09.760 --> 07:16.240
on Linux on ARM64. So from the dozen or so binaries that were out there in the SDK,

07:17.200 --> 07:23.200
most of them crashed. But maybe one or two of them actually did successfully do something.

07:23.200 --> 07:28.080
So I tried this out, tried to play with it, and tried to look at what's failing in the ones

07:28.080 --> 07:34.480
that are crashing. They're crashing in printf. And looking a bit more into it, I realized that printf

07:34.480 --> 07:39.760
or rather any function taking variable arguments is using a different

07:39.760 --> 07:47.360
calling convention than the regular one on Linux. So to solve this issue, I realized that well,

07:47.360 --> 07:52.320
first I did a number of misguided attempts, can we hack around this with some inline assembly,

07:52.320 --> 07:56.400
or whatever? You can't really. You do need to have compiler support to implement the different

07:56.400 --> 08:01.840
calling convention. And as I happened to have LLVM checked out, and was somewhat familiar with it,

08:01.840 --> 08:06.320
I tried to have a look at how hard it could be. Well, an afternoon of grepping around

08:06.800 --> 08:15.040
gets you somewhere. So I got some of the calls to printf working. So, take this onward a

08:15.040 --> 08:20.400
couple of months, I had a mostly working Wine environment where you can run executables

08:20.400 --> 08:25.760
for Windows on ARM64, which does not exist, except for these dozen or so executables that

08:25.760 --> 08:33.600
happened to live in the Windows SDK. So we have a test environment. There were some initial

08:33.600 --> 08:40.320
commits to LLVM and Clang by people from Qualcomm for generating COFF ARM64 object files.

08:41.200 --> 08:45.360
You didn't have anything of the rest of the ecosystem, but you could generate like trivial

08:45.360 --> 08:52.880
object files. And the Windows SDK contained the import libraries, but the MSVC tool chain itself

08:52.880 --> 09:00.160
didn't have a compiler. It didn't have the rest of the MSVC C run time startup libraries, and so on,

09:00.160 --> 09:05.680
that you need to actually do something. But you had like some parts of it, so I tried to play around

09:05.680 --> 09:13.440
with it, and tried to link things, adding support to the LLD linker for this. As it turned out,

09:13.840 --> 09:22.080
the MSVC link.exe linker that already existed actually had support for the COFF ARM64 object file format.

09:22.080 --> 09:26.480
So with that, you could kind of reverse engineer and figure out what it's supposed to be doing

09:26.480 --> 09:37.440
with these relocations and implement the same thing in LLD. So with this, I progressed

09:37.440 --> 09:45.360
a bit more and managed to produce trivial, small executables that I could run in my emulator.

09:47.120 --> 09:52.320
And I folded this into the overall effort of LLVM together with MinGW, so I added support for

09:52.320 --> 09:59.840
this also in the MinGW SDK. And as a small side quest, I added support for targeting the

09:59.840 --> 10:07.520
Universal C Runtime as well in MinGW. So by the end of 2017, I had got a seemingly working tool chain

10:07.520 --> 10:16.000
with this. I had both figured out how to get libunwind, libc++abi, libc++ and so on to properly

10:16.080 --> 10:23.040
work together in general for x86 and so on, and also had a mostly working ARM64 tool chain

10:23.600 --> 10:31.120
targeting the emulator. And the MinGW linker interface for LLD by Martell was finally merged,

10:31.120 --> 10:34.720
which meant that you could actually build this tool chain without any extra patches on top of

10:34.720 --> 10:46.960
LLVM, which is a key thing. So I didn't really set out to maintain my own tool chain distribution.

10:46.960 --> 10:54.080
That's a lot of work and I don't want to do that. But building all of this was kind of non-trivial.

10:54.080 --> 11:00.400
So I needed to have a reference example of how we are supposed to do it. And initially this was

11:00.480 --> 11:06.560
just a branch in VideoLAN's Docker build-bots repo, but at the end of 2017, I graduated it

11:06.560 --> 11:13.280
and split it out to a separate repo on GitHub called llvm-mingw. So I had a mostly working tool chain

11:13.280 --> 11:22.400
targeting ARM64, but only running in an emulator. Well, I contacted my friends at VideoLAN,

11:22.400 --> 11:27.040
asking them, can they talk to Microsoft and get me something? And they managed to get me a

11:27.040 --> 11:32.800
prototype ARM64 device. So I could try out my existing tool chain on the real OS,

11:32.800 --> 11:41.040
on a prototype device. Did it work? Of course not. Who am I kidding? But the thing is it wasn't really

11:41.040 --> 11:47.040
far off, it was actually quite close. So it turns out that you get that dialogue if it refuses to run

11:47.040 --> 11:51.920
your binary. It refuses to run your binary if you don't have the dynamic base flag set. And that's

11:51.920 --> 11:57.040
just one single bit to say that, yes, I am okay with the OS loading me at a different address.
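
NOTE
The "dynamic base" flag described above is, in the PE/COFF format, one bit in the
DllCharacteristics field of the optional header. A minimal Python sketch of checking
and setting that bit in an image held in memory follows. The helper names are
hypothetical and this is only an illustration of where the bit lives, not the change
that went into the linker.
```python
import struct
DYNAMIC_BASE = 0x0040  # IMAGE_DLLCHARACTERISTICS_DYNAMIC_BASE in the PE/COFF spec
def dll_characteristics_offset(image):
    """Return the file offset of the 16-bit DllCharacteristics field."""
    (pe_offset,) = struct.unpack_from("<I", image, 0x3C)  # e_lfanew in the DOS header
    if image[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("not a PE image")
    # 4-byte signature + 20-byte COFF header, then 70 bytes into the optional header
    return pe_offset + 4 + 20 + 70
def has_dynamic_base(image):
    """Report whether the image allows loading at a different address."""
    (flags,) = struct.unpack_from("<H", image, dll_characteristics_offset(image))
    return bool(flags & DYNAMIC_BASE)
def set_dynamic_base(image):
    """Set the flag in place; image must be a bytearray."""
    offset = dll_characteristics_offset(image)
    (flags,) = struct.unpack_from("<H", image, offset)
    struct.pack_into("<H", image, offset, flags | DYNAMIC_BASE)
```
Usage, assuming a hypothetical file name: data = bytearray(open("hello.exe", "rb").read()),
then has_dynamic_base(data) reports whether the bit is set.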

11:58.240 --> 12:02.720
That takes a little while to figure out, but once you do, it's like a couple of lines

12:02.720 --> 12:10.400
of code to fix. Also, simple binaries worked, bigger ones crashed. It turns out that on Windows,

12:10.400 --> 12:16.720
when you're allocating stack, you need to probe it. So if you have a function that say allocates

12:16.800 --> 12:22.640
12 kilobytes of stack, you need to touch it, one page at a time, to let the OS actually allocate the

12:22.640 --> 12:28.960
page for you. You don't notice this when you're running on Wine on Linux, but when you're running on

12:28.960 --> 12:33.760
the real thing, you do. So then I learned what stack probing is and had to implement that. And then

12:33.760 --> 12:39.120
it also turned out that setjmp was broken, because I didn't have the proper SEH unwinding

12:39.120 --> 12:46.240
info, which they require. So instead, I did a trivial reimplementation of these just by dumping

12:46.240 --> 12:52.880
registers, restoring registers. So with this, with the real device in hand, I made

12:52.880 --> 12:58.720
a working build of VLC with lots of hacks in the process, of course, in a couple of weeks.

13:01.360 --> 13:06.480
Then we fast forward a couple of months, and Microsoft had their Build conference, where they were

13:06.560 --> 13:11.840
unveiling Windows on ARM64 for developers, and they were showcasing how you port

13:11.840 --> 13:19.520
apps to it? You do it by recompiling with the latest Visual Studio, but they also showed something else.

13:20.400 --> 13:26.400
You might be thinking, the OpenVPN driver, it looks simple, I just saw like five C files, I saw like

13:26.400 --> 13:31.040
five header files, that sounds really simple. What about a more complicated app? So we reached out

13:31.040 --> 13:36.720
to VLC. VLC, I don't know if you know, is one of the most commonly used pieces of software on Windows,

13:36.720 --> 13:41.280
it's a media player, it's open source, and it's extremely popular, we know that based on our

13:41.280 --> 13:45.600
usage. So we reached out to VLC, and we were like, hey, are you interested in porting your app,

13:45.600 --> 13:50.240
your desktop app, to ARM64? And so Jean-Baptiste, the main developer of VLC,

13:50.240 --> 13:54.640
he was like, yes, we are in, we are keen, let's try this out. And so we shipped them a couple of

13:54.640 --> 14:00.640
devices, right? We loaded them up with Windows, we gave them preview tools, and in shockingly

14:00.640 --> 14:05.200
short order, they had this up and running. In fact, we have folks from VLC in the

14:05.200 --> 14:10.880
room. I don't know if Hugo is here, maybe... there's Hugo. He's from VLC,

14:10.880 --> 14:14.160
so if you do actually have questions on what they did, you should reach out to him.

14:15.200 --> 14:19.360
Again, just to belabor the point, when I asked Jean-Baptiste, how much code

14:19.360 --> 14:25.680
did he have to change? His answer? Zero. He changed zero lines of code. His biggest problem was,

14:25.680 --> 14:30.960
again, VLC has its own home-brew way of building things. And so, ingesting the latest

14:30.960 --> 14:34.800
Visual Studio Preview compiler into his build system, that was his biggest challenge.

14:34.800 --> 14:38.240
Changing code, zero. He didn't have to change a single line of code. Everything just worked.

14:45.360 --> 14:46.880
Yeah, so that wasn't quite true.

14:46.880 --> 14:57.200
But then again, sure, Jean-Baptiste didn't change any lines of code. I did. And quite a lot.

14:57.200 --> 15:04.240
I had like a stack of 60 to 70 patches on top of VLC to get this building. But, the point is,

15:04.240 --> 15:13.200
I wasn't using MSVC. And, I mean, the point is that I'd rather build a complete tool

15:13.200 --> 15:18.480
chain from scratch than try to adapt the Automake and Autoconf build system to use MSVC.

15:21.760 --> 15:27.520
Anyway, going on from there, I mean, that was how things got started. After that, it's just been

15:27.520 --> 15:34.000
a bunch of years of cleaning things up and so on. I'm just going to mention a couple of fun hacks

15:34.000 --> 15:41.760
that we did at the start. So, when I started doing this, LLVM didn't have

15:41.840 --> 15:48.960
a tool for stripping executables or object files for PE/COFF. So, to fix this, I built GNU binutils

15:48.960 --> 15:56.560
and took their strip utility. But it doesn't recognize ARM64 binaries. So what I did, I did a small

15:56.560 --> 16:03.840
wrapper that changes the machine flag, saying that, yes, this is an x86 executable. Please strip it

16:03.840 --> 16:10.960
for me and then I'll change it back to the real one. It works. I mean, a couple months later,

16:10.960 --> 16:16.480
yeah, I did implement the real thing in LLVM, but you can get quite far with hacks. For the resource

16:16.480 --> 16:21.680
compiler, it didn't have integrated pre-processing, which meant, again, a shell script wrapper,

16:22.320 --> 16:27.360
that integrates parsing the command line options, doing the preprocessing and so on.
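
NOTE
The strip wrapper hack described a few cues earlier (relabel the binary as x86, strip it,
then restore the real machine type) can be sketched in Python as below. This is an
illustration under assumptions, not the original wrapper: it uses the x86-64 machine value,
it only handles linked PE images (not bare COFF object files), and strip_tool is a
hypothetical callable standing in for the external strip program.
```python
import struct
IMAGE_FILE_MACHINE_AMD64 = 0x8664  # x86-64, value from the PE/COFF spec
IMAGE_FILE_MACHINE_ARM64 = 0xAA64  # ARM64, value from the PE/COFF spec
def machine_offset(image):
    """Return the file offset of the 16-bit Machine field."""
    (pe_offset,) = struct.unpack_from("<I", image, 0x3C)  # e_lfanew in the DOS header
    if image[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("not a PE image")
    return pe_offset + 4  # Machine is the first field of the COFF header
def swap_machine(image, new_machine):
    """Overwrite the Machine field in place and return the previous value."""
    offset = machine_offset(image)
    (old_machine,) = struct.unpack_from("<H", image, offset)
    struct.pack_into("<H", image, offset, new_machine)
    return old_machine
def strip_with_disguise(image, strip_tool):
    """Relabel a copy as x86-64, run strip_tool on it, then restore the real type."""
    disguised = bytearray(image)  # leave the caller's buffer untouched
    real_machine = swap_machine(disguised, IMAGE_FILE_MACHINE_AMD64)
    stripped = bytearray(strip_tool(bytes(disguised)))
    swap_machine(stripped, real_machine)
    return stripped
```
The design point is that the tool in the middle never sees the unfamiliar machine value,
and the wrapper restores it afterwards.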

16:30.240 --> 16:35.280
Lots of hacks at the beginning, much fewer hacks these days. These days, it's quite

16:35.360 --> 16:43.280
straightforward. For other parts, there's been quite a lot of work on implementing quite

16:43.280 --> 16:48.320
MinGW-specific features. Of them, the auto-import feature is probably the biggest one,

16:49.920 --> 16:56.000
which means that you can reference data symbols in a different DLL without explicitly

16:56.000 --> 17:06.320
declaring them as dllimport. And this is maybe the most MinGW-specific feature that you have.

17:06.320 --> 17:12.080
Before having this, this was the biggest blocker for porting or building lots of existing

17:12.080 --> 17:18.160
MinGW projects with this tool chain. At the start, I was wondering, shouldn't you just

17:18.880 --> 17:23.360
advocate for people actually adding dllimport for everything? But it turns out that if you want to

17:23.360 --> 17:29.840
link the C++ standard library as a DLL with the Itanium C++ ABI that you use in

17:29.840 --> 17:38.800
MinGW, then you do need to have this feature. So the project has matured. We've gotten rid of

17:38.800 --> 17:44.000
most of the hacks in the build process, but these days, since we've added support for ARM64EC,

17:44.720 --> 17:51.680
which is our fifth architecture, we had to add back some hacks around that. We initially

17:51.760 --> 17:57.520
only provided cross compilers from Linux and macOS, but these days it's also running on Windows.

17:59.040 --> 18:06.720
So it works for any case anywhere. And to prove the concept, the MSYS2 package manager

18:06.720 --> 18:13.200
has also added CLANG64 and CLANGARM64 environments, which are environments where you have

18:13.200 --> 18:22.880
Clang and LLD as the system tool chain, exactly like mine. So these days, I do releases by building

18:22.880 --> 18:29.920
everything on the free GitHub Actions. I do nightly builds every day, every night. With the latest LLVM,

18:29.920 --> 18:35.200
latest MinGW. If you want to test things out, you can always grab that. And I do a release every

18:35.200 --> 18:41.280
time there's a new release of LLVM, which is essentially every other week almost all year round.

18:43.680 --> 19:04.240
All right, so the question is what we use for the C library. So one of the cornerstones

19:04.240 --> 19:12.160
in how MinGW works is that we are building things to use the existing Microsoft C runtime on Windows,

19:13.120 --> 19:18.320
which is also a cornerstone in how you get things GPL compatible because that's like

19:19.920 --> 19:25.360
even though it's a closed source library, it's shipped as part of the OS. So the GPL is okay with linking

19:25.360 --> 19:30.880
to it. So that's the main design in MinGW, that we're linking against the existing one.

19:31.600 --> 19:39.600
Yeah? Do you build that from source or do you use prebuilt binaries?

19:40.800 --> 19:46.400
So the question is whether I build MinGW from source or if I use pre-built. I build everything from source

19:47.840 --> 19:53.520
because if you want to switch, if you want to target something else like ARM64 or whatever,

19:53.520 --> 19:58.320
you need to have it, you need to build everything from scratch. You can't reuse anything pre-built.

19:58.880 --> 20:05.920
And that's also part of the porting process. So instead of just going to say I want to build things

20:05.920 --> 20:12.320
for ARM64, you try to build things with a new tool chain for x86, where things are supposed to

20:12.320 --> 20:16.400
be working in the same way. And then once you have that working, then you switch over to a different

20:16.400 --> 20:27.120
architecture. Okay. In the back? The question is whether Wine is still supported.

20:28.080 --> 20:34.720
Yes, I do my local nightly builds and testing with Wine on both x86 and ARM64.

20:34.720 --> 20:41.040
And the ARM64 support in Wine used to be quite hacky. There were a lot of quite fundamental

20:41.040 --> 20:45.440
issues around that, but all those fundamental issues have been solved in the last couple of years.

20:45.440 --> 20:49.840
So also things are really good these days. One final question?

20:57.760 --> 21:05.120
So the question is if MinGW has a permissive license, and yes, it's mostly permissively licensed,

21:05.120 --> 21:13.440
I would say, so some I can't really say for sure, but most of the components have something like

21:13.440 --> 21:18.240
a public domain or similar notice all over them.

