WEBVTT

00:00.000 --> 00:13.040
I'll be talking about what's new in Spack version 1.0, which was released last year, and

00:13.040 --> 00:19.680
also what's to come in the next release, which is actually version 1.2, because 1.1 was released

00:19.680 --> 00:23.320
also in the meantime.

00:23.320 --> 00:26.800
Before I start with that, I'll do a short introduction.

00:26.800 --> 00:32.200
It will be slightly different than usual, and the main points about what's new will be

00:32.200 --> 00:41.160
the repository split, some news about a new installer, and compilers as dependencies.

00:41.160 --> 00:43.720
My name is Harmen Stoppels.

00:43.720 --> 00:49.760
I'm based in Zurich and work as an independent software developer, and I've been involved for about

00:49.760 --> 00:55.160
like 6 years already with Spack.

00:55.160 --> 01:03.920
You can find me online usually with this avatar, essentially it'll help her.

01:03.920 --> 01:08.960
I want to talk a little bit about what's unique about Spack, but in the context of Nix and

01:08.960 --> 01:18.560
Guix, because they are very well represented at this conference, and instead of going

01:18.560 --> 01:26.040
for the usual, like Spack is this flexible package manager, and you can mix and match versions,

01:26.040 --> 01:33.200
I want to talk a little bit about how it relates to Nix and Guix.

01:33.200 --> 01:39.280
For those who don't know: Nix, Guix and Spack are all based on the doctoral thesis

01:39.280 --> 01:46.600
of Eelco Dolstra, and he concludes basically two things in his thesis: one, the best

01:46.600 --> 01:53.000
way to describe a software installation is the hash of all its build instructions.

01:53.000 --> 01:59.360
And second, the best way, or the best language for that, is something purely functional.

01:59.360 --> 02:06.160
Oh, sorry.

02:06.160 --> 02:10.840
It is actually purpose.

02:10.840 --> 02:19.720
So in this group of package managers, Nix and Guix, they pride themselves as purely

02:19.720 --> 02:25.560
functional package managers, and they boast high ideals, like purity and reproducibility,

02:25.560 --> 02:28.480
and that's great, like, don't get me wrong.

02:28.480 --> 02:35.160
And that seems to suggest by extension that Spack, being written in this mundane language

02:35.160 --> 02:40.800
Python must be a non-functional package manager, and that is admittedly sometimes true.

02:40.800 --> 02:46.280
But I would like to make a point to the contrary in this talk, saying that Spack also

02:46.280 --> 02:52.480
can lay claim to higher ideals, and that gets a bit philosophical.

02:52.480 --> 02:57.160
So I thought I would stay in the realm of this philosophical discussion, and to make my point

02:57.160 --> 03:01.880
talk a little bit about package ontology and philosophical realism, namely Plato's

03:01.880 --> 03:03.880
allegory of the cave.

03:03.880 --> 03:09.920
So, probably everybody recalls from high school that in this cave you have a group

03:10.000 --> 03:17.280
of captives sitting with their backs to a fire, in front of which objects are passing, and it

03:17.280 --> 03:19.360
casts shadows on the walls.

03:19.360 --> 03:25.560
Now in Spack's contribution to the lexicon of philosophy, those objects are called abstract

03:25.560 --> 03:26.560
specs.

03:26.560 --> 03:32.560
So for example, you can think of this: the command line where CP2K depends on CUDA at 14 is

03:32.600 --> 03:34.600
passing by.

03:34.600 --> 03:44.320
The concretizer casts a shadow of this installation request, and casts it in great detail.

03:44.320 --> 03:50.720
The shadows are called concretizations, and they are basically dependency graphs where everything

03:50.720 --> 03:57.080
is resolved: the package versions, the variants, the edges between the nodes, and it

03:57.120 --> 04:03.320
would, for example, be CP2K at a specific version, built with GCC, whatever, with

04:03.320 --> 04:06.120
a very specific version of CUDA, etc.

04:06.120 --> 04:13.880
So these shadows can be installed and run, but ultimately, they are a projection.

04:13.880 --> 04:20.960
So if you are in the cave as a captive, and you kind of bump the versions of those shadow

04:20.960 --> 04:26.800
objects, like, in other words, you are curating a particular software stack over time, you

04:26.840 --> 04:29.440
are kind of playing with shadows.

04:29.440 --> 04:34.680
Spack package maintainers, however, they transcend this cave and they work in the world

04:34.680 --> 04:36.400
of ideals.

04:36.400 --> 04:44.200
So the package recipes they work on are not like a specific piece of software, but they

04:44.200 --> 04:48.720
describe how this software could exist in any possible reality.

04:48.720 --> 04:51.520
So they are like the platonic ideals of that software.

04:51.560 --> 04:56.800
So it captures the essence, and instead of saying, like, I depend on one specific version of CUDA,

04:56.800 --> 04:59.000
you say, I depend on CUDA.

04:59.000 --> 05:04.040
All right, now to get to the point.

05:04.040 --> 05:10.920
So what's unique about SPAC is that if you consider like a temporal axis, that unlike

05:10.920 --> 05:18.120
Nix and Guix, Spack is not mutating but additive; like, the entity SciPy, for example,

05:18.120 --> 05:26.400
exists beyond time, whereas in Nix and Guix it is a mutating concept over time.

05:26.400 --> 05:31.200
So in Spack new versions are added, and old ones can be kept.

05:31.200 --> 05:35.280
And by doing this, we keep multiple crowds happy.

05:35.280 --> 05:39.800
So on the one hand, there are like the conservative people, the ones that install old versions

05:39.800 --> 05:41.560
of software.

05:41.560 --> 05:46.840
On the other hand, there are people who want to be bleeding edge, like rolling release,

05:46.840 --> 05:51.200
or software developers who want to try out the newest versions of their dependencies,

05:51.200 --> 05:56.600
and we can keep these folks happy at the same commit of the package repository.

05:56.600 --> 06:01.960
And also, we have things like variants, which allow us to avoid community wars over like

06:01.960 --> 06:06.000
what are the proper defaults for a certain package, how to build it.

06:06.000 --> 06:14.560
So ultimately, we keep various, like in this world of ideals, we keep maintainers with different

06:14.640 --> 06:23.840
interests working together, so that back in those caves, users can cast the shadows they like.

06:23.840 --> 06:28.520
So, speaking of the package repository: what we've done in Spack version 1.0, which is already

06:28.520 --> 06:35.160
a little bit old news, is that we split Spack the tool from Spack the packages.

06:35.160 --> 06:41.600
I think Nix has also done this a long time ago, and this was highly requested, because

06:41.600 --> 06:45.280
first of all, it allows you to bump the package repository to the newest without pulling

06:45.280 --> 06:51.400
in the latest bugs that we have developed in Spack, and otherwise, sometimes you want

06:51.400 --> 06:54.760
a new feature, but keep your packages at a specific commit.

06:54.760 --> 06:59.120
So you can now do this in Spack, and that was, of course, a nice PR to put up, removing

06:59.120 --> 07:03.600
half a million lines from Spack the tool.

07:03.600 --> 07:08.720
The glue layer between this is the package API, which we promised to keep stable, and

07:08.720 --> 07:13.360
the package API is basically just like the first Python line of a package that says from

07:13.360 --> 07:20.080
spack.package import *. That API is versioned, and we hope to not deal with version

07:20.080 --> 07:29.240
bumps, hopefully, at all. And the API is kind of set in stone, so if you find something

07:29.240 --> 07:33.160
that regresses, you should hold us accountable.

07:33.160 --> 07:38.240
The second thing is a bit ongoing, so it did not make it into the 1.0

07:38.240 --> 07:45.520
release directly, but it's a new experimental installer, and I will talk a little bit

07:45.520 --> 07:50.760
about a job server, the POSIX job server, and how we use it.

07:50.760 --> 07:56.760
So it's best to show what it looks like: if you run spack install something, you will first

07:56.760 --> 08:02.000
of all notice that now multiple packages are building in parallel, as you can see on the

08:02.000 --> 08:09.880
right-hand side, and there can be many in parallel, in principle, there is just one flag

08:09.880 --> 08:17.760
to control the parallelism, and that's kind of like make -j, if you're used to make,

08:17.760 --> 08:21.800
and we want to have package parallelism by default, to make it a bit easier.

08:21.800 --> 08:27.480
There's a basic terminal user interface, and you can toggle between like overview of what's

08:27.560 --> 08:32.720
building and the logs of a particular package.

08:32.720 --> 08:39.680
And we hope that in the spring we will have this as a proper default.

08:39.680 --> 08:44.760
A little bit about the POSIX job server. I don't know how well known this is,

08:44.760 --> 08:52.560
but it's an ancient protocol, so from a previous century, and it allows composable parallelism

08:52.560 --> 08:55.560
across processes.

08:55.560 --> 09:03.240
So it's been long supported in GCC, but recently there seems to have been, like, a renaissance

09:03.240 --> 09:08.240
of this protocol, like last year somehow everybody got interested in it again, and I think

09:08.240 --> 09:10.920
the motivator was mostly Ninja.

09:10.920 --> 09:18.240
So somebody had created a PR for Ninja job server support nine years ago, and last

09:18.240 --> 09:22.480
year the time was ripe to merge it.

09:22.480 --> 09:28.000
It caused, like, three or four forks of Ninja as a build system, as a build tool.

09:28.000 --> 09:32.480
I cannot go into all the details, but there's a very interesting blog that kind of shows, like,

09:32.480 --> 09:36.440
what it took from a maintainer to deal with these types of community requests like, why

09:36.440 --> 09:38.800
can't you have a job server?

09:38.800 --> 09:44.440
But it was very useful for ninja because sometimes ninja calls ninja, and it would cause

09:44.440 --> 09:49.280
a very high load on the system if there was no job server support.

09:49.280 --> 09:54.560
It went much faster in LLVM: it went from issue to implementation within a couple of

09:54.560 --> 09:55.560
months.

09:55.560 --> 10:02.840
It's not released yet. And Spack had, like, its first implementation also in November.

10:02.840 --> 10:07.240
I think Gentoo had a blog about it, and the Julia community was considering it in an

10:07.240 --> 10:09.440
issue.

10:09.440 --> 10:10.440
So what is it?

10:10.440 --> 10:15.280
Basically, the simplest way to think about it is that you have a server that has

10:15.320 --> 10:23.120
a pouch of coins, and that's actually a pipe, and anytime you take a coin, it means

10:23.120 --> 10:30.400
you read a byte from the pipe; you claim that, like, I'm now doing something in parallel.

10:30.400 --> 10:38.240
And if you do this properly, you create like this process tree, where if you start with

10:38.240 --> 10:43.760
n-1 bytes in the pipe, or tokens in the pouch, and everybody takes one token for every

10:43.760 --> 10:50.000
parallel job, then the leaf nodes are just n leaf nodes, and assuming that they

10:50.000 --> 10:56.960
are like sequentially executing or like using one single thread, one single processor, then

10:56.960 --> 11:01.360
you get, like, an estimate for the load on the system.

11:01.360 --> 11:08.160
The interior nodes in the graph are idle, and I mean they're, like, truly idle.
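
NOTE
Illustration added to the transcript, not part of the talk. A minimal Python
sketch of the token pool described above, assuming a plain os.pipe and threads
standing in for real build processes. The names run_jobs, n_slots and n_jobs are
hypothetical, and this is not how Spack, Ninja or LLVM implement their clients.
```python
import os
import threading
import time
def run_jobs(n_slots, n_jobs):
    """Run n_jobs toy tasks with at most n_slots of them active at once."""
    if n_slots < 2 or n_jobs < 1:
        raise ValueError("this sketch assumes n_slots >= 2 and n_jobs >= 1")
    read_fd, write_fd = os.pipe()
    os.write(write_fd, b"+" * (n_slots - 1))  # the pouch starts with n-1 coins
    lock = threading.Lock()
    state = {"active": 0, "peak": 0}
    def task():
        with lock:
            state["active"] += 1
            state["peak"] = max(state["peak"], state["active"])
        time.sleep(0.01)  # stand-in for real work
        with lock:
            state["active"] -= 1
    def worker():
        token = os.read(read_fd, 1)  # take a coin; blocks while the pouch is empty
        try:
            task()
        finally:
            os.write(write_fd, token)  # always put the coin back
    threads = [threading.Thread(target=worker) for _ in range(n_jobs - 1)]
    for t in threads:
        t.start()
    task()  # the caller's own implicit slot needs no coin
    for t in threads:
        t.join()
    leftover = len(os.read(read_fd, n_slots))  # coins back in the pouch
    os.close(read_fd)
    os.close(write_fd)
    return state["peak"], leftover
```
With n_slots=4 the pipe holds three coins, so at most three workers plus the
caller run at once, and all three coins are back in the pipe at the end.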

11:08.160 --> 11:16.180
So people criticize this protocol. It's super simple, but there's the one-bad-apple argument:

11:16.180 --> 11:20.960
suppose that somebody takes a coin from the pouch, and decides like I will never return

11:20.960 --> 11:27.960
this, I'd like to keep it. Ultimately the job server has no clue who took that coin, and you

11:27.960 --> 11:33.920
continue your build with limited parallelism, so it will just slow down the build.

11:33.920 --> 11:39.120
Well, first of all, I don't think that's very, it doesn't really happen, we don't see

11:39.120 --> 11:42.560
it happen in practice, and I think the Gentoo people have also been doing lots of builds

11:42.560 --> 11:48.040
with these, and they're still using it, so I don't think that's really a common thing to

11:48.040 --> 11:53.720
happen. Otherwise, usually the build just stops anyhow with a build failure, for example.

11:53.720 --> 11:59.240
But if you really are scared of implementing a client for your job server, then you can

11:59.240 --> 12:04.920
also just use make, because it has both a server and a client, and that's actually what

12:04.920 --> 12:11.000
GCC does: their link-time optimization executable actually generates a

12:11.000 --> 12:18.000
Makefile at runtime, and then calls make on it. I don't know if that's great, but it works.

12:18.000 --> 12:22.480
You get the composable parallelism that way. Spack used to promote

12:22.480 --> 12:28.600
this, before the new installer, for power users: you could generate a giant Makefile

12:28.600 --> 12:33.040
for the dependency graph you are about to install, and then run make on it, and make is basically

12:33.040 --> 12:38.240
like the driver that starts the job server, and you would get the parallel installations

12:38.240 --> 12:40.080
like that.

12:40.080 --> 12:44.800
In practice, it was not great, and that's why we have a new installer, for example, sometimes

12:44.800 --> 12:48.520
you don't need to install all the build dependencies, which are part of the dependency graph,

12:48.520 --> 12:54.760
and then this graph is not as static as you think it is, and there was also a lot of overhead

12:54.760 --> 13:00.640
of running make to run Spack to install a package; for a very short install that was just

13:00.640 --> 13:06.640
too much start-up and shut down overhead of Python.

13:06.640 --> 13:10.440
Other people claimed that this protocol was not modern enough, and by modern I mean 2008,

13:10.440 --> 13:16.240
when the gold linker first introduced, like, a thread pool in linking.

13:16.240 --> 13:23.720
So for example, if you have a thread pool with many short-lived tasks, then, like, reading from

13:23.720 --> 13:30.040
a pipe and writing to it, that would be a lot which is purely in C. So people thought

13:30.040 --> 13:35.800
it was not possible to combine this ancient protocol with modern parallelism.

13:35.800 --> 13:40.760
But LLVM seems to have pulled it off, so let's take a quick look at how they do it, and

13:40.760 --> 13:45.920
funnily enough it's being discussed as we speak, so maybe this slide is already not current

13:45.920 --> 13:53.120
as I speak, but basically what they implemented in a first iteration is they create a thread pool,

13:53.160 --> 14:00.880
that does, for example, linking jobs or short-lived tasks, and they start all the threads and

14:00.880 --> 14:08.320
then they block all but one on acquiring a token from a job server, and only when they get

14:08.320 --> 14:14.120
a token do they start doing parallel work.

14:14.120 --> 14:20.320
It's not yet completely finished, I would say, so I hope it works in the next LLVM release,

14:20.320 --> 14:26.960
but we'll see. There are some open issues, and actually the Gentoo people were on that,

14:26.960 --> 14:30.680
and I started also commenting on it, so I'm now getting updates about it.

14:30.680 --> 14:36.760
I hope that the next LLVM release has a proper implementation.

14:36.760 --> 14:41.760
Last thing, or maybe second to last thing, I want to talk about is compilers as dependencies,

14:41.760 --> 14:44.520
which landed in Spack version 1.0.

14:44.520 --> 14:48.640
So I think this was kind of like the catalyst, or the point at which we said, like, probably

14:48.640 --> 14:55.080
now it's time to make a 1.0 release of Spack, if we have this. And I think it was

14:55.080 --> 15:02.160
actually introduced, or first alluded to, by Todd Gamblin at FOSDEM in 2018, so

15:02.160 --> 15:04.960
it's like very, very long ago.

15:04.960 --> 15:13.560
In Spack 0.x, we used to have compilers as attributes on a node, so for example, you

15:13.560 --> 15:20.360
would have installed OpenSSL 3, and then a compiler would be just like some property

15:20.360 --> 15:27.520
on that node, and that was a bit of a simplification of the real world.

15:27.520 --> 15:33.960
Now we do it like this: OpenSSL is a node, the compiler is also a node, and we

15:33.960 --> 15:40.360
even add a couple other nodes that are kind of necessary to make the model work, namely

15:40.360 --> 15:45.640
the runtime libraries as separate nodes in the dependency graph.

15:45.640 --> 15:50.080
So this is what you now typically see if you do spack install something, and I've colored

15:50.080 --> 15:55.240
a couple in green and the other one in gray, so the compiler is now a build dependency in a

15:55.240 --> 15:59.880
principle, you can also uninstall it, and your software would still continue to work,

15:59.880 --> 16:07.480
which wasn't the case in the past with Spack.

16:07.480 --> 16:09.160
So how does it work?

16:09.160 --> 16:18.560
In Spack, we now have a better concept of languages as entities: languages like C, CXX, and

16:18.560 --> 16:24.360
Fortran, which right now are kind of like virtual packages, and the compiler

16:24.360 --> 16:28.240
packages, they are providers of these virtuals.

16:28.240 --> 16:33.840
So for example, a compiler could provide Fortran, and the package depends on Fortran, and

16:33.840 --> 16:37.440
then our solver links these two together.

16:37.440 --> 16:41.440
It can be all conditional, so for example, you can have like an incomplete installation of

16:41.440 --> 16:48.400
GCC without a Fortran compiler; then it just provides C and CXX.

16:48.400 --> 16:56.280
The compilers as packages, they can inject further dependencies into their parent, and that's

16:56.280 --> 17:01.760
how we deal with the GCC runtime, like libstdc++ etc., and glibc as dependencies

17:01.760 --> 17:03.760
in the graph.

17:03.760 --> 17:08.120
Now, we have some special logic, as these runtimes are a bit special, like

17:08.120 --> 17:14.280
you can mix different glibc versions and GCC runtime versions in the graph, and

17:14.280 --> 17:18.720
that allows us to do like compiler mixing.

17:18.720 --> 17:23.920
And we try to deal nicely with, like, sometimes there's a breaking change in the ABI

17:23.920 --> 17:29.320
of gfortran, like libgfortran, and we try to ensure that you don't accidentally get

17:29.320 --> 17:34.520
like half a stack compiled with old GCC and the other half with new, and there would be

17:34.520 --> 17:41.520
ABI issues there, so we try to avoid that in the solver.

17:41.520 --> 17:46.440
We added some new syntax, and I think this is something that is probably unique to

17:46.440 --> 17:53.800
this package manager right now: you can properly talk about mixed toolchains.

17:53.800 --> 18:01.520
So for example, you could use Clang for C and C++, and GCC for Fortran, which was, like,

18:01.520 --> 18:06.520
maybe less so now, but it used to be quite a popular combination of compilers, and we can

18:06.520 --> 18:07.960
actually model this.

18:07.960 --> 18:12.480
So on the command line, you can actually say like I want to install this package, and

18:12.480 --> 18:18.600
then the percent means depends on, and then it goes like language equals compiler, or it

18:18.600 --> 18:23.960
is actually more generic, it is virtual equals package.
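
NOTE
Illustration added to the transcript, not part of the talk. A toy Python model
of the "virtual equals package" idea, assuming a hand-written provider table.
PROVIDERS and resolve are hypothetical names and made-up data; Spack's real
solver and package metadata work differently and are far more general.
```python
# Hypothetical toy data: which virtuals (languages) each compiler package provides.
PROVIDERS = {
    "gcc": {"c", "cxx", "fortran"},
    "llvm": {"c", "cxx"},
}
def resolve(needed, choices):
    """Pick one provider per needed virtual, honouring explicit virtual=package choices."""
    result = {}
    for virtual in sorted(needed):
        wanted = choices.get(virtual)  # None means the user expressed no preference
        candidates = sorted(
            package
            for package, virtuals in PROVIDERS.items()
            if virtual in virtuals and wanted in (None, package)
        )
        if not candidates:
            raise ValueError(f"no provider for {virtual!r}")
        result[virtual] = candidates[0]  # toy policy: first name alphabetically
    return result
```
Asking for c=llvm, cxx=llvm and fortran=gcc yields a mixed toolchain, while
asking for fortran=llvm fails because this toy llvm entry provides no Fortran.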

18:23.960 --> 18:28.240
You can use this syntax everywhere, so we can now, we also have finer granularity, when

18:28.240 --> 18:35.320
you know like this package here, you cannot use a C++ compiler from GCC.

18:35.320 --> 18:42.520
You can specify a conflict like this to add that constraint, and we added some, like you

18:42.520 --> 18:48.080
can specify in config, what your compiler tool chain is, where you can say like you can

18:48.080 --> 18:53.360
shorten this a little bit, so you can, for example, talk about installing things with Clang

18:53.360 --> 18:58.400
with gfortran as a dependency.

18:58.400 --> 19:02.400
I think I still have one minute, so another thing that we're concerned about, or actually

19:02.400 --> 19:08.080
users are concerned about, is that Spack was pretty slow. Spack is probably different in

19:08.080 --> 19:14.240
the sense that there is a resolver that is doing what otherwise would be done by a community

19:14.240 --> 19:21.600
pushing, like, this ecosystem forward, so the solver was kind of slow, and also the fact

19:21.600 --> 19:26.720
that we used Python did not make it much faster.

19:26.720 --> 19:31.720
I think I won't go over all of these, but there are, like, various targeted PRs in the last

19:31.720 --> 19:38.040
two months to make Spack as a command line tool a bit easier to use and faster

19:38.120 --> 19:44.760
to respond, and I think most of it comes down to, like, trying to make the dimensionality

19:44.760 --> 19:54.680
of the problem we're solving smaller, so the search is shallower, and we're trying

19:54.680 --> 20:01.560
to reduce allocations and that kind of stuff, and we use some new tools that are in Python

20:01.640 --> 20:08.520
3.15 that are actually quite nice, and I can recommend, for example, the new sampling

20:08.520 --> 20:15.480
profiler, it's actually quite nice to figure out where your bottlenecks are, and I think

20:16.840 --> 20:21.000
I'll end with this slide, where over time you can see that there are some, like, this is the

20:21.000 --> 20:29.160
package repository where we solve our largest set of packages with our solver, and every now and

20:29.160 --> 20:34.680
then we bump the Spack version there, and then you see this drop, and we hope to get a few more

20:34.680 --> 20:41.800
drops so that spack install doesn't feel like it's stuck for a minute. I'll end there,

20:41.800 --> 20:45.080
and I'm taking questions from the audience.

20:45.080 --> 21:06.520
Yeah, 3.15 is not released. Oh, the question is, isn't Python 3.15 not released yet,

21:06.520 --> 21:12.440
and that's true, but you can already use it. In fact, you could probably spack install it too.

21:30.120 --> 21:35.800
So the question is about compiler bootstrapping, that people would spack install

21:36.520 --> 21:45.480
GCC, and then mark GCC installed by Spack as an external or, like, as a configured compiler.

21:45.480 --> 21:53.800
So I didn't touch on this, but this is irrelevant in Spack 1.0, because if you install a compiler,

21:54.520 --> 22:01.480
then next time you install something with Spack, Spack realizes that, hey, this GCC package is

22:01.480 --> 22:11.240
available in my store, so I can reuse that. So with compilers being ordinary packages,

22:11.800 --> 22:16.120
they should not be very different from, like, if I first install CMake and then install the

22:16.120 --> 22:21.080
package that uses CMake, the solver would also realize, hey, I can reuse this installed CMake.

22:21.640 --> 22:37.400
So the question is, can you use Spack without, like, any links to the operating system?

22:39.720 --> 22:50.040
I would say we do, like, until the level of the libc library. So that's, I think, different from,

22:50.120 --> 22:57.320
like, Nix and Guix: Spack ultimately still has this dependency on the system version

22:58.440 --> 23:05.160
of, usually, glibc. But otherwise, that's like the only required external package.

23:08.600 --> 23:08.840
Yes?

23:09.800 --> 23:18.520
Spack was on a 0.x version for ages, so maybe not a bit close, and you have Spack 1.0 since

23:18.520 --> 23:23.480
that's similar, isn't she? Have you seen any impact on that on how development is done,

23:23.480 --> 23:29.320
that people need to be more careful to keep the API stable, because you are used to having,

23:29.320 --> 23:33.960
let's say, the freedom to create things. And now that you're on 1.0, you will, I'm

23:33.960 --> 23:37.960
going to mention that you want to be sure that things are stable. Does it have a specific

23:37.960 --> 23:39.960
impact system of how things are built?

23:39.960 --> 23:45.480
So the question is, with the 1.0 release, are there any concerns about, like,

23:45.480 --> 23:52.120
are we limiting ourselves, or is there impact on development? Is there impact on how development

23:52.120 --> 23:57.720
is done, now that there is a 1.0 release? I was actually a bit scared about this, so

23:58.120 --> 24:05.880
notably with defining a package API, and we did it pretty quickly. So we split the repo,

24:05.880 --> 24:11.080
and then, okay, now we'll see, like, maybe we forgot about all kinds of stuff that we want to

24:11.080 --> 24:16.360
change in Spack, and that's now no longer possible. In practice, I would say we did not really see this.

24:16.360 --> 24:22.120
So our day-to-day development on the tool side is not really impacted. Sometimes you have to look

24:22.120 --> 24:28.360
up, like, am I touching public API here? But in practice, we still, like, for example, I was

24:28.360 --> 24:34.360
worried, like, we cannot do performance improvements, but we can. And actually, we just started

24:34.360 --> 24:38.680
with that in 1.0, because we felt like, okay, now we have something stable. We can do more

24:38.680 --> 24:44.280
targeted performance improvements, for example. So it feels like we're still quite flexible.

