WEBVTT

00:00.000 --> 00:11.000
Yes, thank you. Hello everyone. Welcome to a little bit of an exploration of what

00:11.000 --> 00:21.000
we've been doing there with the LibreOffice lineage family of software. Just a disclaimer,

00:21.000 --> 00:29.000
so despite the title saying we managed to automate SBOM generation, we're actually still in the process

00:29.000 --> 00:36.000
of doing that, so we reached some, we reached some milestone.

00:36.000 --> 00:50.000
Very good. We reached some milestones there, but yeah, it's more like a road to travel rather than an end state.

00:50.000 --> 00:57.000
Anyway, overview and challenges. Well, so the goal obviously was to have the LibreOffice lineage

00:57.000 --> 01:05.000
software ready for the Cyber Resilience Act times, which are upon us, so we're in the transitional period here.

01:05.000 --> 01:13.000
Ultimate goal, of course, to have full dependency relations and ultimately complete transparency

01:13.000 --> 01:22.000
and traceability for all product artifacts. We have achieved quite a lot of that already for container images,

01:22.000 --> 01:29.000
and there are, of course, several other ways to distribute software rather than just in a container.

01:29.000 --> 01:35.000
We will get to that during this talk. Challenges: of course, it's an impressive mix of technology,

01:35.000 --> 01:43.000
and Collabora Online, building on top of LibreOffice core, has JavaScript and TypeScript

01:43.000 --> 01:53.000
with the ecosystem behind that. Beyond that, of course, LibreOffice has more than 100 native code dependencies.

01:53.000 --> 02:01.000
And an unfortunate amount of edge cases, like fonts and dictionaries, no automated solutions,

02:01.000 --> 02:06.000
and evolving standards, so it's more like fixing the car while it's running,

02:06.000 --> 02:14.000
because we're still in the process of actually writing the standards that are then going to be in place here.

02:14.000 --> 02:19.000
So, well, let's get going anyway. What can we do? First steps.

02:19.000 --> 02:25.000
So, well, we looked at the current state, which was sad, because it was nothing.

02:25.000 --> 02:38.000
Then we did a little bit of reading and talking to experts and participating in, let's say, assessment groups and standards setting setups there.

02:38.000 --> 02:45.000
And then we did roll the dice, and we picked SPDX over CycloneDX, just to get things going.

02:45.000 --> 02:54.000
And, of course, the very first obvious thing to do was to get to a license-complete list.

02:54.000 --> 03:02.000
So there's a number of other things you can do with SBOMs, but it's really about having that as a list of things,

03:02.000 --> 03:09.000
a list of ingredients with the license there; that is kind of the first goal to reach.

03:09.000 --> 03:19.000
Yeah, and we decided then, at least for Collabora Online, to do that manually; that was a kind of manageable number of things there,

03:19.000 --> 03:28.000
to handle, so we ran with the manual approach, because there was nothing automated anyway.

03:28.000 --> 03:42.000
So, yeah, we did that for the JavaScript dependencies. Luckily there are not so many, so it's not the typical NPM mess with hundreds or thousands of recursive dependencies.

03:42.000 --> 03:56.000
Then we noticed we ship a font for the admin console and, oh my god, so yeah, we just added that as another dependency, and then we started thinking about fonts as a problem.

03:56.000 --> 04:06.000
Then we looked at the C++ dependencies. That was manageable, at least on that level, because it was just a handful of libraries, plus LibreOffice.

04:06.000 --> 04:28.000
So, yeah, at least minimally we got this sorted and we had a list there, and the minimal thing that we did then, to get at least a little bit of automation, is to retrieve the version information of those dependencies from the build system, so at least that's not a manual step anymore.

04:28.000 --> 04:38.000
Yeah, fonts. Well, what's a font? Well, clearly it has a license, so we need an SBOM for that.

04:38.000 --> 04:53.000
They're binary artifacts, so if you read the relevant standards, you might be led to the conclusion that there are also perhaps security implications there.

04:53.000 --> 05:08.000
And in fact, fonts do contain code: the hinting machinery, that's a virtual machine with a language, a small little programming language, so that is nominally code.

05:08.000 --> 05:22.000
So I guess we probably need that, and we also need to have this kind of security assessment depth now to be able to see like what's on the disk, what version is that?

05:22.000 --> 05:27.000
Is there perhaps a CVE for that? We need to include the hash, etc.

05:27.000 --> 05:37.000
Dictionaries: it's perhaps a little bit easier there, because most probably that's just a license question, so anyway we would still need an SBOM.

05:37.000 --> 05:49.000
But there's this kind of weird edge case of hyphenation patterns, which is clearly not Turing-complete, but it's rather complex, so there might be security implications as well.

05:50.000 --> 06:03.000
Okay, so that was the detour. So yeah, we were kind of saddled with this massive amount, and then we got like 50 dictionaries and hundreds of fonts on top of that, just to make us happier.

06:03.000 --> 06:14.000
Yeah, so fonts, no luck there, so we thought, well, maybe someone else has fixed it, but yeah, no dice, that's still open that bug report.

06:15.000 --> 06:43.000
So what have we actually got today? For Collabora Online, we have the existing machinery as described; that has been in the code since January 2025. So we got a nominal list there of the licenses, so at least from a license assessment we know what's in there, what's the package license and also what's the dependency license. The big elephant in the room, of course, is that,

06:43.000 --> 07:12.000
well, it's just listing the core, the LibreOffice core, as one dependency, but no more than that. So yeah, 26 relationships, version numbers from the build system, mostly manually maintained. And then we have the LibreOffice core, which is, yeah, the big elephant in the room.

07:12.000 --> 07:31.000
A great, large, GNU make based project: very large, like really large, like 10 million lines of code large; very complex; at the top of the stack, so like apex style, like everything is below that.

07:31.000 --> 07:40.000
So a manual approach here is very likely, let's say, challenging, and I think it's silly to even attempt that.

07:40.000 --> 07:49.000
Because you might succeed, but then it will very quickly be out of date, so whatever we do here on that scale, it has to be automated.

07:49.000 --> 08:06.000
So if we go and start with that and analyze what we're looking at, the story there is, as I said, more than 100 third-party libraries, most of them C/C++ based.

08:06.000 --> 08:27.000
So again, the same problem: we have weird legacy build systems, ad hoc things, maybe not, let's say, very actively maintained. I think they're all maintained, but doing something like this here, like automation on this scale,

08:27.000 --> 08:40.000
places a rather significant burden. Lots of fonts, lots of dictionaries. The good thing is, the build system knows about all of that, because it's actually written in a way that it's dynamic.

08:40.000 --> 08:56.000
It can be configured, you can switch things on and off, and it knows what it's doing. In the end, it's kind of wrapping it all up and sticking it into debs or RPM packages or MSIs or, what's that, DMG on Mac.

08:56.000 --> 09:12.000
So yeah, in general, so that would be an approach, but the problem is that there's nothing really, it's a bespoke build system, it's part of that, there was a rewrite of that build system in the early 2000s.

09:12.000 --> 09:41.000
In general, there is not much anyway for C/C++, because there's nothing uniform across platforms. Other languages like Java, Python, Go, whatever, have kind of standardized on, let's say, one or two standard build systems; for C/C++, unfortunately, that's not the case. So yeah, we came up empty there.

09:41.000 --> 10:04.000
So anyway, we got going, so we got a linear list of at least all the dependencies that we ship. That's a work-in-progress patch; it's not merged, but it's complete. That's using the list of third-party code that we have anyway, because we needed a list of licenses and we also ship that.

10:04.000 --> 10:24.000
So that's clear. We kind of annotated that license file with a bit of extra information, and we're able to generate an SBOM, a very minimal one, out of that, but at least it includes all third-party dependencies with their license and also the version and so on.

10:24.000 --> 10:53.000
But that's just the start, so there's a ton of open problems and open issues, because vulnerability assessment is the ultimate goal in the end. License is nice; if somebody else uses that, it's of course nice to know what the aggregate license is there. But the point with the CRA, of course, is being able to assess vulnerabilities and then being able to conclude: do I need to update, is that still safe,

10:53.000 --> 11:11.000
what's the implication and the impact? And that requires that for every file on disk we can deduce what that is, where it's coming from, which version it is, is there a CVE open, what's the vendor, et cetera.

11:11.000 --> 11:30.000
Yeah, so we did more reading and more soul searching, and well, there was this nice BSI, this German computer security, technical report. I can recommend that; it's not binding in any way, but it's quite profound and informative.

11:30.000 --> 11:41.000
So for that we need to update to SPDX 3.0.1, or perhaps switch to CycloneDX; maybe we need to think about that.

11:41.000 --> 11:59.000
Yeah, and we would, of course, massively benefit from at least some of our dependencies doing part of that work, but that's a tall order. As I said, these are sometimes very small, sometimes single-person maintainer teams, so if anyone

11:59.000 --> 12:07.000
has any good ideas, any contacts, any funding for that, let's absolutely talk, that's not really something that we want to do.

12:07.000 --> 12:22.000
We might need to do that, but we would be much happier if we could somehow manage and facilitate funding for the actual maintainer staff, so they can do that and maintain that going forward.

12:22.000 --> 12:32.000
Core to-dos: well, that's actually what is necessary to fulfill those BSI requirements there.

12:32.000 --> 12:50.000
For all executables, and that includes intra-project scripts like Python, like the LibreOffice Basic stuff, and all shared libraries, and perhaps font files, I don't know; if someone has an opinion there, let's chat.

12:50.000 --> 13:04.000
So those must each be described as a component. System libraries that we link against must be identified at least, so we actually need to figure out what this third-party thing is linking against on the system it's running on.

13:04.000 --> 13:10.000
Well, that includes Windows, Mac, and other systems.

13:10.000 --> 13:22.000
Yeah, at least we don't need to take care of that, so whatever is on the system already, that's somebody else's problem, but we at least need to identify that.

13:22.000 --> 13:40.000
And for all installed components, as listed above, we need the following three things: name, checksum, so someone can actually assess whether that's modified and that it's this very version that we referenced there, and some flags on top.

13:40.000 --> 14:01.000
For the bundled external bits, there's a bit more that we need, so that all needs figuring out. Some of that is actually unclear, like CPE identifiers: most of them do not exist, some of them do, some of them might exist tomorrow, so that's a bit of wild goose chasing.

14:01.000 --> 14:25.000
Then there's this fun thing, the declared and concluded license, which is very funny: there might be a declared license, but that might not be the actual license, so we've got to look into that as well. And then some more bits, yeah, anyway. And the sticking point here is all files in the installed package, which is a lot.

14:25.000 --> 14:43.000
And again, we are like, why does LibreOffice not have an SBOM yet? This is just a massive, massive amount of work. All of that is system dependent, so it's going to look different on Windows than it looks on Linux or Mac. So well, yeah, what can we do?

14:43.000 --> 15:03.000
How can we derive that automatically: where do those files that end up on disk come from, how can we derive the dynamic linking dependencies, and static linking dependencies on top? And certainly for C++, there's nothing really that we can use out of the box.

15:03.000 --> 15:25.000
So the actual plan comes with a bonus challenge. As I said, we can configure what's included in the builds: we can say we don't want Python or we actually want Python, or we want this font or that feature, or we want PDF import or rather not. So it's quite dynamic, and it's also platform dependent.

15:25.000 --> 15:37.000
Yeah, so that's just one way. So we started with "let's maybe do this automatically" and we ended up with "we absolutely must do this automatically"; everything else is just not maintainable, not doable.

15:37.000 --> 15:49.000
Yeah, so the good news is that technically the build system knows all of that, but most of that is implicit, and for the third-party things we actually need to dig into their build systems.

15:49.000 --> 16:02.000
So the actual plan is to extract all of that from the build system. I'm going to speed up a little bit, I think; not much time left, and the slides will be online.

16:02.000 --> 16:16.000
So what we have is another draft patch that started with that approach of digging that information out of the existing build system, but it's very early days.

16:16.000 --> 16:29.000
So all the interesting things, like getting the dynamic linking dependencies, getting the third-party stuff out of that, that still needs doing.

16:29.000 --> 16:37.000
So if anyone knows of any other project who's been already looking into that, I'd love to talk.

16:38.000 --> 16:47.000
Yes, it's actually at the top of the stack sometimes, and then again it's also quite interesting and challenging and fun.

16:47.000 --> 16:56.000
So yeah, get involved. Collabora is actually hiring, and also there's a hackfest coming up tomorrow and on Tuesday.

16:56.000 --> 16:57.000
Thanks so much.

16:57.000 --> 17:08.000
No time for questions, I guess.

17:27.000 --> 17:53.000
Okay, you can.

17:53.000 --> 18:05.000
All right, thanks.

18:05.000 --> 18:14.000
So the question was, what version is the LibreOfficeKit, let's say the LibreOffice core dependency, in Online; how's that handled?

18:14.000 --> 18:28.000
So that is, let's say, whatever release version there is. There are different branches there, so we just use whatever is in the branch and then increment that, if that answers the question.

18:28.000 --> 18:39.000
So like any other external dependency, it's versioned.

18:40.000 --> 18:45.000
I think we should maybe take the whole boat.

18:45.000 --> 18:46.000
Thank you.

