WEBVTT

00:00.000 --> 00:07.760
All right, and it's time to go.

00:07.760 --> 00:12.120
All right, so the next talk is going to be "Netboot without throwing a FIT".

00:12.120 --> 00:16.160
Yeah, so hello everybody, and welcome to my talk.

00:16.160 --> 00:21.640
My name is Ahmad Fatoum, I work for Pengutronix. We do embedded Linux consulting and development.

00:21.640 --> 00:27.000
And something I do a lot at work is network-booting kernels.

00:27.400 --> 00:29.400
Yeah, network-booting kernels.

00:29.400 --> 00:36.440
So the way I usually do it is I extract the rootfs into an NFS-exported directory. Optimally,

00:36.440 --> 00:40.600
the rootfs is already self-describing: it has a kernel, it has a device tree, it has an initramfs,

00:40.600 --> 00:44.320
and it has a bootloader spec entry that ties it all together.

00:44.320 --> 00:49.160
Here is an example: it has a 6.18 kernel, it references a device tree.

00:49.160 --> 00:54.280
And then I can point, in this case, the barebox bootloader at it and boot it.

00:54.320 --> 00:59.320
Unfortunately, this doesn't always work out of the box.

00:59.320 --> 01:05.520
Because I usually extract without root, so the user IDs and the group IDs are all wrong.

01:05.520 --> 01:10.200
And all these executables with SUID bits, they won't work,

01:10.200 --> 01:13.680
because they will try to assume the user I have on the development host

01:13.680 --> 01:16.080
and not root, for example.

01:16.080 --> 01:20.560
There are ways around that: if you are using root for everything,

01:20.560 --> 01:23.000
you can just remove the SUID bit and it mostly works.

01:23.040 --> 01:25.520
You can run it in a fakeroot environment.

01:25.520 --> 01:30.440
Yeah, so there are workarounds. Another problem is NFS being networked.

01:30.440 --> 01:33.400
That means that you need a lot of special casing.

01:33.400 --> 01:38.000
For example, if you have Ethernet switches, you will have to disable some of the system's

01:38.000 --> 01:42.560
network handling. But you can also work around that: if you have a USB host controller,

01:42.560 --> 01:48.240
use a USB Ethernet adapter; if you have a gadget device, a gadget controller,

01:48.240 --> 01:57.040
you can, since kernel 6.12, configure USB 9pfs, which was presented by my colleague Michael last year here at FOSDEM.

01:57.040 --> 02:02.360
What's more difficult to handle is when the user space just says no,

02:02.360 --> 02:07.040
because it's very set in its ways and expects a root block device,

02:07.040 --> 02:12.800
or it's an A/B system and it expects to have an active partition.

02:12.880 --> 02:20.640
What you need to do in this case is look at all the init scripts and other services

02:20.640 --> 02:24.880
and make sure that they would tolerate being run under an NFS root.

02:24.880 --> 02:30.160
Once you add some verified boot scheme into the mix, and then you have device mapper

02:30.160 --> 02:33.920
configuration for dm-verity, dm-crypt, dm-integrity and so on,

02:33.920 --> 02:39.200
you only get more services that depend on a specific block order,

02:39.200 --> 02:43.440
a block device landscape, to be available.

02:43.440 --> 02:49.760
We also usually lose this nice aspect of bootloader spec where you can have

02:49.760 --> 02:55.120
a partition that's fully self-describing, because you will have a signed kernel

02:55.120 --> 03:00.160
bundle with everything in it, and it will usually be in a separate partition and so on.

03:00.160 --> 03:05.440
And the end result is that everyone needs to take care to keep the network boot

03:05.760 --> 03:12.560
workflow working, or it's just not there. And this bothered me a lot in some previous projects,

03:12.560 --> 03:16.800
because I just wanted to network boot the kernel, debug some kernel issue, do a bisect,

03:16.800 --> 03:22.400
but network boot is always bitrotting because it's not the usual way the system is booted.

03:23.440 --> 03:29.520
So I wanted to rethink the issue: what's the minimum viable payload that I need for a kernel boot?

03:29.520 --> 03:32.640
And, very importantly, I don't want to mess with the rootfs at all,

03:32.640 --> 03:36.880
I don't want to mess with any OS build system like Yocto or something.

03:36.880 --> 03:41.120
I just want to build a kernel, I want to be able to network boot it and leave everything as is.

03:42.480 --> 03:49.360
So I am usually working with Arm systems, and they usually probe non-discoverable devices with

03:49.360 --> 03:53.440
device tree. So what I needed is the kernel, the device tree and the modules.

03:54.480 --> 04:00.400
As the bundle format to put them all in, I used FIT. FIT is Flattened Image Tree;

04:01.040 --> 04:06.960
here's a link to the specification. Basically, you have a number of images, and in the same FIT file

04:06.960 --> 04:10.880
you have a number of configurations that reference these images.

04:10.880 --> 04:16.240
Images can be a kernel, a ramdisk, device trees for different boards, and what the bootloader is

04:16.240 --> 04:24.080
going to do is look at the compatible property, for example, inside the configuration

04:24.080 --> 04:28.080
and compare it against its own compatible, the bootloader's.

04:28.080 --> 04:33.040
So I am on a Radxa ROCK 3A board, and it will take the first configuration, it will check:

04:33.040 --> 04:38.800
oh, Freescale i.MX8MM, that's not a match. So it will try the next configuration;

04:38.800 --> 04:43.680
then it sees it is a Radxa ROCK 3A and the configuration is also a Radxa ROCK 3A,

04:43.680 --> 04:50.480
so it takes that configuration, and in that configuration it will find a kernel, a ramdisk

04:50.480 --> 04:54.880
and a device tree, and it's going to load these three things.
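
NOTE
A minimal sketch of the configuration matching described above, written in Python purely for illustration.
The configuration names, image names and dictionary layout are hypothetical, and the compatible strings are only examples.
It takes the first match, as in the walkthrough; a real bootloader may rank several candidates instead.
```python
from typing import Optional
# Hypothetical in-memory stand-in for the configurations inside a FIT image.
FIT_CONFIGS = {
    "conf-imx8mm": {
        "compatible": ["fsl,imx8mm"],
        "kernel": "kernel-1", "fdt": "fdt-imx8mm", "ramdisk": "ramdisk-1",
    },
    "conf-rock3a": {
        "compatible": ["radxa,rock3a", "rockchip,rk3568"],
        "kernel": "kernel-1", "fdt": "fdt-rock3a", "ramdisk": "ramdisk-1",
    },
}
def select_config(board_compatibles: list[str], configs: dict) -> Optional[str]:
    """Return the first configuration that shares a compatible with the board."""
    for name, conf in configs.items():
        if any(c in board_compatibles for c in conf["compatible"]):
            return name
    return None
# The board's own compatible list (assumed values for a ROCK 3A).
chosen = select_config(["radxa,rock3a", "rockchip,rk3568"], FIT_CONFIGS)
# The selected configuration names the three images to load.
images = {key: FIT_CONFIGS[chosen][key] for key in ("kernel", "fdt", "ramdisk")}
print(chosen, images)
```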

04:55.280 --> 05:02.640
The nice thing about FIT is that since 6.10 it's supported in the kernel, so for arm64 it can

05:02.640 --> 05:07.280
just generate a FIT image. The FIT image that's generated will contain the kernel, and it will

05:07.280 --> 05:12.560
contain all enabled device trees. So you could just take the normal defconfig for ARMv8 and you

05:12.560 --> 05:17.120
will have a very big FIT image with, I don't know, thousands of device trees, but it should be

05:17.200 --> 05:25.520
able to boot on all of them. Then there is still the issue of modules. Modules are usually

05:25.520 --> 05:30.720
in the rootfs, but we can't use the modules in the rootfs because they will be mismatched to the

05:30.720 --> 05:35.440
newer kernel that we are going to boot, and we don't want to build everything into the kernel

05:35.440 --> 05:41.360
because we might have dependencies on firmware that's in the rootfs, for example. So we just want

05:41.360 --> 05:46.800
to overlay the modules, and Linux has a nice mechanism that allows us to do this easily,

05:46.800 --> 05:54.320
which is its handling of concatenated CPIOs. So you can have a number of CPIOs (CPIO is

05:54.320 --> 06:00.160
the format for initial ramdisks, or initramfs), and you can compress them individually and just concatenate

06:00.160 --> 06:05.120
them. And what the kernel is going to do: it will take the first initramfs, decompress it

06:05.120 --> 06:10.400
if needed, extract it into the initial ramfs, and then take the next one. And so you can, on the command

06:10.560 --> 06:17.760
line, just with cat, composite your initramfs out of individual components. And for

06:17.760 --> 06:23.920
6.19, I upstreamed a target; it's called modules-cpio-pkg, and it does what it says on the

06:23.920 --> 06:29.520
tin: it's an initramfs with all the modules, and it's your usual directory layout with /lib/modules

06:29.520 --> 06:36.400
and so on, and modules.order and all these files. And there is in parallel a series

06:36.480 --> 06:41.440
by Simon Glass, who added the image.fit target in the first place, which allows adding

06:41.440 --> 06:47.760
ramdisk support to the FIT image. And yeah, now we've got kernel, device tree and also modules in

06:47.760 --> 06:56.960
the initramfs, and what's missing is something to load the modules. And we can also make use of

06:56.960 --> 07:02.640
this concatenation mechanism. So if we already have a ramdisk init, we just need to add a bind

07:02.640 --> 07:09.440
mount into it. So here, for example, if I have a shell in my initrd, I can say mount -o

07:09.440 --> 07:16.000
bind /lib/modules, which is the /lib/modules in the initramfs, and bind mount it over the /lib/modules in

07:16.000 --> 07:20.960
the new root file system that the initramfs is going to switch to. If you don't have an

07:20.960 --> 07:28.240
initramfs available where you can easily add a line, my colleague Stefan Kerkmann added this to rsinit.

07:28.320 --> 07:33.760
rsinit is a very minimal ramdisk init that we wrote for some of the embedded systems

07:33.760 --> 07:39.440
that we are developing on, and soon it should also have the ability to bind mount

07:39.440 --> 07:45.840
folders from the initrd, including the module directory, and we can get that in with exactly the same

07:45.840 --> 07:51.360
concatenation mechanism. So here is a shell script that ties it all together. I build the kernel

07:51.360 --> 07:59.600
with make all, I build modules-cpio-pkg, I compress this modules initramfs, I concatenate

07:59.600 --> 08:05.200
it with rsinit. rsinit has a Makefile that cross-compiles it for different architectures,

08:05.200 --> 08:10.640
and then you have a self-contained initramfs linked against musl. And then with make image.fit

08:10.640 --> 08:15.760
and an extra argument, you get a FIT image out that contains all these components: modules,

08:15.760 --> 08:23.520
kernel, device tree and any initramfs extras you want to add. The last missing piece here is

08:23.520 --> 08:28.080
we miss out on the stuff the bootloader is doing. So usually you have quite a bit of complexity

08:28.080 --> 08:32.480
in the bootloader, in the way that it adds command line arguments, it applies device tree

08:32.480 --> 08:36.960
fixups; for example, with bootloader spec you have an options key, and you would lose out on that

08:36.960 --> 08:44.480
if you are just booting the FIT image directly. So some bootloader integration would be nice. In

08:44.480 --> 08:48.880
barebox, I added this in the form of overrides. So what overrides can do now,

08:48.880 --> 08:55.200
if you check out my branch (I'm still in the process of upstreaming it): you can tell barebox, boot

08:55.200 --> 09:01.600
as you would usually, be it bootloader spec, be it FIT, be it boot script, but take from

09:01.600 --> 09:07.840
this FIT image these images as overrides and apply them. And then you can replace the kernel

09:07.920 --> 09:16.960
image out of the FIT, and you can also append to the initrd on the fly. And a very nifty feature

09:16.960 --> 09:22.720
is that you can put these overrides on your TFTP server, for example, and configure once how you want

09:22.720 --> 09:29.120
to override the boot, and then you can just boot it. And yeah, I think I have like two minutes on the

09:29.120 --> 09:37.440
clock because I started a bit late. I have a small demo that I would show you; it's very short.

09:39.680 --> 09:43.680
I've got 11 minutes? Okay.

09:44.560 --> 09:46.560
Okay.

09:51.840 --> 09:59.680
Yeah, I am not seeing it, so I can tell you something about it. So yeah, it's a bootloader spec file

09:59.680 --> 10:04.320
that would have been booted usually. It's a 6.18 kernel; the bootloader spec file has a

10:04.320 --> 10:10.320
kernel image and a device tree image. And now this is my override script; it's fetched over TFTP.

10:10.320 --> 10:15.760
Now it has an image that's a FIT image, it concatenates some initrd on the fly (notice these

10:17.040 --> 10:21.760
colons, that means append), it has some command line arguments, that's the stuff in double quotes.

10:22.640 --> 10:30.480
Oh, that's a bit weird, but yeah, you can see it when it scrolls up. And now by default, if I just type

10:30.480 --> 10:36.720
boot, it will boot that script over TFTP. You just need to set a user, so you can share the TFTP server

10:36.800 --> 10:41.920
between different users. You see these notices at the top: it says it has overridden stuff.

10:43.040 --> 10:48.720
It has taken the override for the kernel, the override for the initrd and the override for the device

10:48.720 --> 10:54.720
tree. You see, now it's a 6.19 kernel, not the 6.18 that was there before. You see modules

10:54.720 --> 11:03.840
are working and they come from the bind mount. And yeah, now you can just debug something. It doesn't

11:03.920 --> 11:10.400
take much time: you just write the script once, and you can even do some very brute-force printk

11:10.400 --> 11:18.080
debugging. And I like this a lot because it's like you have booted in a minute, and it makes it

11:18.080 --> 11:23.920
very easy: if I don't know where to look, I just paper all over the place with printks, and I don't

11:23.920 --> 11:29.600
need to care about read-only rootfs, FIT image or dm-verity. I can just get to debugging without having

11:29.600 --> 11:40.640
to touch the rootfs. And yeah, that's it. Thanks for listening, and I think we have enough time

11:40.640 --> 11:42.640
for questions. Yes, please.

11:53.120 --> 12:01.760
I have a small question about the netboot image you have. What if something goes wrong

12:01.760 --> 12:08.720
when you try to load the image? How could you debug without any shell or something? For example, in our

12:09.680 --> 12:16.960
case we had a broken or misconfigured network switch, and without a shell it's impossible

12:16.960 --> 12:23.920
to debug, but with a shell we debugged easily what was going wrong. How could it be done in your case?

12:25.120 --> 12:30.720
Okay, so you usually network boot your switch, but it's not working and you want to

12:30.720 --> 12:43.440
debug that? I mean that something is wrong in the configuration of the chosen image itself, like in the

12:43.440 --> 12:51.360
case where you have this kernel, device tree, whatever, but no shell there. You expect that everything works

12:51.360 --> 12:56.720
always; it's not the case. Okay, yeah, that's true. You can always do a full network

12:56.720 --> 13:05.440
boot with a different rootfs coming via NFS; that's the way I usually use it. But often the customer

13:05.440 --> 13:11.280
tells me, yes, I get like audio glitches here with this image and I want to try a new kernel, I want

13:11.280 --> 13:20.160
to debug the kernel. So this is mostly for the part where you want to reuse the rootfs. But with the

13:20.160 --> 13:25.280
bootloader integration, you see, at the end you can specify something different: you can specify

13:25.280 --> 13:32.560
another root, or you can just do an NFS boot. So if you have a build system which can give you a

13:32.560 --> 13:37.600
rootfs that you can just boot over NFS, yeah, go for that and use that and don't use what's on the device.

13:38.560 --> 13:45.920
But in my case I really wanted to use what's already on the device. But yeah, if I had the same issue

13:45.920 --> 13:49.520
as you, I would not use the root file system on the device but use another root file system.

13:50.480 --> 13:57.600
You could also add a root file system via initramfs if you have enough RAM and boot from that. Does that answer your question?

14:02.480 --> 14:06.720
Hi, so I have a question regarding the initrd image.

14:08.080 --> 14:13.520
You talked about the concatenation, right? I mean, so basically, does it mean that you

14:14.080 --> 14:17.040
can concatenate many initrd images?

14:17.760 --> 14:22.080
Because you want to have access to see how it behaves in the user space, like, for example,

14:22.320 --> 14:26.880
the firmware or any other extra kernel modules that need to be there, right?

14:27.920 --> 14:32.320
I didn't quite get the second part, but yes, you can concatenate as many initramfs as you want,

14:32.320 --> 14:39.040
as far as I am aware, and they can even be compressed individually, so that allows you to composite an initramfs just with cat.
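
NOTE
A small byte-level illustration of "compress individually, then concatenate", sketched in Python.
The payloads are placeholder bytes, not real cpio archives, and gzip stands in for whichever compressor is used.
The kernel's own unpacker is separate code; this only shows that plain concatenation keeps each compressed piece intact.
```python
import gzip
# Placeholder stand-ins for two uncompressed cpio archives.
base_cpio = b"BASE-ARCHIVE-BYTES"
modules_cpio = b"MODULES-ARCHIVE-BYTES"
# Compress each segment on its own, then join the results, like `cat a.gz b.gz`.
combined = gzip.compress(base_cpio) + gzip.compress(modules_cpio)
# gzip.decompress() accepts multi-member streams and returns the payloads in order.
restored = gzip.decompress(combined)
print(restored == base_cpio + modules_cpio)
```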

14:39.600 --> 14:42.000
Okay, but okay, let me frame it this way then.

14:42.480 --> 14:47.520
So even if you're concatenating the initrd images, does it follow a particular order?

14:47.520 --> 14:49.520
Or can it be jumbled up, I mean?

14:49.520 --> 14:54.320
Uh, the later ones override the previous ones, so they are getting extracted over each other.

14:54.960 --> 14:56.960
And yeah, so you could overwrite.

14:57.680 --> 15:01.760
If you have modules, for example, in your original initramfs, which is not too uncommon,

15:02.000 --> 15:09.280
you could just replace them with modules that come later, as the later one just overrides them. So they are extracted, as far as I am aware, one after the other into the ramfs.
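
NOTE
A toy model of the ordering described in this answer, sketched in Python.
Each archive is modelled as a dictionary from path to content; all paths and contents are made up for illustration.
Extracting one archive after the other means a later file with the same path replaces the earlier one.
```python
def unpack_in_order(archives: list[dict[str, bytes]]) -> dict[str, bytes]:
    """Extract archives one after the other into a single in-memory tree."""
    ramfs: dict[str, bytes] = {}
    for archive in archives:
        # Files from a later archive overwrite files with the same path.
        ramfs.update(archive)
    return ramfs
# Hypothetical original initramfs that already ships a module.
original = {"/init": b"init-v1", "/lib/modules/example/foo.ko": b"old-module"}
# Hypothetical modules archive appended afterwards.
newer = {"/lib/modules/example/foo.ko": b"new-module"}
ramfs = unpack_in_order([original, newer])
print(ramfs["/lib/modules/example/foo.ko"])
```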

15:10.000 --> 15:13.040
Okay, but is there, like, in the mainline kernel,

15:14.480 --> 15:17.040
maybe this mechanism... I read somewhere that

15:17.680 --> 15:23.680
you don't need to follow a specific order for the initrd images that need to be concatenated; it can handle this automatically.

15:24.640 --> 15:28.320
I'm not aware that you need a specific order. I mean, I've not used it that much.

15:28.880 --> 15:32.880
Yeah, you know, I just concatenate it and that works for me.

15:33.440 --> 15:38.000
I also only learned last year about this mechanism because my colleague told me about it.

15:38.480 --> 15:40.880
And yeah, it works very well for this use case.

15:51.200 --> 15:53.200
To benefit from

15:53.680 --> 15:56.160
this bind mount for modules

15:57.040 --> 16:03.440
without an initrd, would it be feasible to add something to the built-in magic that mounts the

16:03.600 --> 16:05.600
initramfs

16:06.080 --> 16:08.880
in the kernel, to benefit from the modules?

16:10.800 --> 16:15.360
Okay, so you want to have the modules in the initramfs, but some bind mounting to happen from the kernel side?

16:16.160 --> 16:18.160
Yeah, right. Okay.

16:22.400 --> 16:31.280
He had a patch series for doing an overlay in the kernel, and that way you could just say, okay, take...

16:31.680 --> 16:33.680
I

16:34.240 --> 16:38.800
Yes, so you can absolutely add this logic to the kernel, but I don't know how well you can

16:40.640 --> 16:43.680
argue in favor of it, because you can do it with an initramfs.

16:44.480 --> 16:48.640
So it's doable, but I don't know what the chances are to get this upstream,

16:49.200 --> 16:51.200
because you need to argue in favor of it.

16:52.000 --> 16:58.720
And one example is the device tree aliases. As you might be aware, you want to give specific names:

16:59.280 --> 17:03.920
you want to give the root device, the block devices, specific names.

17:04.480 --> 17:11.760
This took many years until we had aliases for MMCs, and the argument all along was: yeah, you can do it in an initramfs.

17:11.760 --> 17:16.800
Why do we need extra logic in the kernel to do something when you can just script it in your initramfs or in user

17:16.800 --> 17:20.160
space? And I think the same would apply here: just do it in the initramfs.

17:20.160 --> 17:23.280
Why should the logic be in the kernel? But if it was, it would be nice, that's for sure.

17:23.600 --> 17:26.880
But I think with rsinit it will be easy enough that you can just

17:27.280 --> 17:29.280
concatenate it at the end,

17:29.280 --> 17:31.280
add a kernel argument, and

17:31.680 --> 17:34.480
it will leave everything as is but just do the bind mount extra.

17:34.480 --> 17:37.040
So that's the way I want to go, so you can just

17:37.600 --> 17:39.680
take it along and even do it offline.

17:39.680 --> 17:44.400
You don't need the bootloader integration; you will just lose out on the command line argument fixups and so on.

17:44.400 --> 17:46.400
But if you don't need that, you can do that by hand.

17:47.040 --> 17:52.160
It should be doable just with offline rsinit concatenation and so on.

17:52.800 --> 17:54.800
Okay, thank you.

17:57.360 --> 17:59.360
Thank you very much for your talk.

17:59.760 --> 18:01.760
Thanks for listening.

