WEBVTT

00:00.000 --> 00:10.000
Hello, welcome to the next speaker.

00:10.000 --> 00:20.000
Alfonso will be talking about the open weight dilemma.

00:20.000 --> 00:28.000
So there is a precise moment when the safety of an artificial intelligence model, and all the safety

00:28.000 --> 00:31.000
work around it, evaporates.

00:31.000 --> 00:34.000
And it does not happen when the model hallucinates.

00:34.000 --> 00:41.000
It does not happen when the user coaxes the model into harmful behavior using a jailbreaking

00:41.000 --> 00:42.000
prompt.

00:42.000 --> 00:44.000
It happens quietly.

00:44.000 --> 00:47.000
It happens at the speed of a download.

00:47.000 --> 00:52.000
It happens when the model weights, those raw parameters that constitute the cognitive

00:53.000 --> 01:01.000
architecture of the model itself, leave the control perimeter and enter free circulation.

01:01.000 --> 01:09.000
Now, in that moment, the acceptable use policy stops being a control mechanism

01:09.000 --> 01:13.000
and becomes essentially a piece of paper.

01:13.000 --> 01:20.000
In that moment, API throttling, monitoring, and each and every enforcement mechanism

01:20.000 --> 01:21.000
vanishes.

01:21.000 --> 01:27.000
The model stops being essentially a service and becomes a material.

01:27.000 --> 01:30.000
It becomes clay.

01:30.000 --> 01:37.000
Now, if you work in open source, you already feel what this implies, what this entails,

01:37.000 --> 01:38.000
right?

01:38.000 --> 01:46.000
Because that clay means that models are not just software you can inspect; they are capabilities

01:46.000 --> 01:48.000
you can fork.

01:48.000 --> 01:50.000
So, good afternoon, FOSDEM.

01:50.000 --> 01:55.000
We are here because we believe in that clay in one way or another, right?

01:55.000 --> 02:00.000
We believe in the wave of innovation that open weight models such as

02:00.000 --> 02:04.000
Llama, Mistral, and DeepSeek have unleashed.

02:04.000 --> 02:13.000
And we believe, essentially, that doing inference, fine-tuning, or to some very limited extent training

02:14.000 --> 02:21.000
on our own premises prevents monopoly capture and drives transparency.

02:21.000 --> 02:27.000
But I am here to tell you that that very openness created a security paradox.

02:27.000 --> 02:32.000
The kind of security paradox that we can no longer ignore.

02:32.000 --> 02:39.000
For 20 years or so, we lived by Lawrence Lessig's maxim: code is law.

02:39.000 --> 02:43.000
The architecture of software regulated behavior.

02:43.000 --> 02:48.000
But in the age of open weight AI, code is no longer the law.

02:48.000 --> 02:50.000
Code is the substrate.

02:50.000 --> 03:01.000
And once the substrate is running on the user's GPU, the law is whatever the user chooses it to be, right?

03:01.000 --> 03:05.000
We have entered here what I call a mitigation gap.

03:05.000 --> 03:09.000
On the one side, we have regulators in Brussels and in D.C. trying to regulate AI

03:09.000 --> 03:14.000
as if it were a software-as-a-service platform, right?

03:14.000 --> 03:18.000
On the other side, we have the technical reality of open weights.

03:18.000 --> 03:25.000
Where control is distributed, anonymous, and irrevocable.

03:25.000 --> 03:27.000
I am Alfonso De Gregorio.

03:27.000 --> 03:33.000
I am a researcher working at the intersection of cybersecurity research and AI policy.

03:33.000 --> 03:37.000
I have spent my career attacking and defending computers.

03:37.000 --> 03:43.000
And recently I sat across the table from the European Commission, where I tried to help them understand

03:43.000 --> 03:49.000
that we cannot regulate a downloadable file in the same way we regulate a cloud API.

03:49.000 --> 03:55.000
So today, in the next 20 minutes or so, I want to walk you through three key ideas.

03:55.000 --> 04:04.000
First, the threat reality: the technical evidence that the skill floor for offensive cyber operations is collapsing.

04:04.000 --> 04:07.000
Second, the policy trap.

04:07.000 --> 04:11.000
How well-intentioned regulations such as the EU AI Act

04:11.000 --> 04:16.000
could accidentally kill open source if we are not careful in what we are doing.

04:16.000 --> 04:23.000
And third and lastly, the way forward, how we fix the open weight dilemma without killing open source.

04:24.000 --> 04:29.000
So let's strip away the hype and look at the signal.

04:29.000 --> 04:35.000
We often hear that AI risks are fictional, that they are speculative.

04:35.000 --> 04:41.000
The reality is that they are not. The offensive capabilities of open weight models are real,

04:41.000 --> 04:44.000
they are here, and they can be measured.

04:44.000 --> 04:47.000
So look at MITRE's OCCULT framework, for instance.

04:47.000 --> 04:53.000
This is a framework for measuring the offensive capabilities of open weight models,

04:53.000 --> 04:57.000
particularly with respect to cybersecurity skills.

04:57.000 --> 05:04.000
And in early 2025, less than one year ago, structured evaluations showed that openly released models,

05:04.000 --> 05:07.000
specifically reasoning models such as DeepSeek-R1,

05:07.000 --> 05:13.000
achieve over 90% accuracy on challenging offensive cyber knowledge tests.

05:13.000 --> 05:19.000
Now, we are talking about expert-level knowledge and proficiency in vulnerability discovery,

05:19.000 --> 05:22.000
privilege escalation, and exploit chaining.

05:22.000 --> 05:27.000
And they can be downloaded by anyone, anywhere on earth.

05:27.000 --> 05:33.000
Now, possessing knowledge is not the same as executing a clever attack, granted,

05:33.000 --> 05:36.000
but the gap between knowing and doing is closing.

05:37.000 --> 05:45.000
So the unique danger of open weights is not just what they know about these fields,

05:45.000 --> 05:52.000
but rather that the constraints, the security guardrails, can be easily stripped away.

05:52.000 --> 05:57.000
When a company releases a model, they first try to align it.

05:57.000 --> 06:03.000
They use reinforcement learning from human feedback in order to teach the model what not to do:

06:03.000 --> 06:07.000
not to generate phishing emails, not to explain how to build a bomb.

06:07.000 --> 06:12.000
So in the closed-weight world, this alignment is sticky,

06:12.000 --> 06:18.000
but in the open weight model, this alignment is superficial at best.

06:18.000 --> 06:25.000
So my research and the research of other people in this field, possibly also somebody in the room,

06:25.000 --> 06:31.000
has shown that safety in this space is not a permanent attribute of the model,

06:31.000 --> 06:34.000
but it is a thin veneer.

06:34.000 --> 06:39.000
With a consumer-grade GPU such as this one, a few hours of fine-tuning,

06:39.000 --> 06:46.000
or even surgical techniques such as abliteration, which suppress refusal vectors selectively,

06:46.000 --> 06:52.000
an adversary can strip away those security guardrails.

06:52.000 --> 06:54.000
This is not just theoretical.

06:55.000 --> 07:01.000
In early 2025, we saw the emergence of tools like Xanthorox AI in darknet forums.

07:01.000 --> 07:03.000
These are not chatbots.

07:03.000 --> 07:09.000
These are modular, autonomous attack platforms running on private infrastructure,

07:09.000 --> 07:12.000
powered by custom-tuned open weight models.

07:12.000 --> 07:17.000
They don't rely on OpenAI APIs, they don't have a kill switch,

07:17.000 --> 07:23.000
they automate the drudgery of cybercrime by generating polymorphic code that can evade detection,

07:23.000 --> 07:28.000
or by scaling spearphishing campaigns to 1,000 targets across multiple languages,

07:28.000 --> 07:34.000
or by doing vulnerability research faster than we typically can in real life.

07:34.000 --> 07:39.000
This is, so to speak, the democratization of offense.

07:39.000 --> 07:44.000
The skill floor for offensive cyber operations is collapsing,

07:44.000 --> 07:51.000
and this brings us to the core problem for our community working at the intersection of cybersecurity and AI.

07:51.000 --> 07:54.000
Cyber conflicts are typically shaped by asymmetries of power.

07:54.000 --> 07:57.000
They typically revolve around the idea that we can control

07:57.000 --> 08:01.000
who gets access to what specialized knowledge and tooling.

08:01.000 --> 08:04.000
But open weight AI may reverse this power imbalance,

08:04.000 --> 08:10.000
it may erase these power asymmetries by giving state-level capabilities to script

08:10.000 --> 08:13.000
kiddies and criminal syndicates alike.

08:13.000 --> 08:19.000
If we, the open source community, do not have a good answer for this,

08:19.000 --> 08:21.000
a compelling answer for this.

08:21.000 --> 08:24.000
I promise you the regulators will find the answer for us,

08:24.000 --> 08:26.000
and we are not going to like it.

08:26.000 --> 08:29.000
And this brings me to the policy trap.

08:29.000 --> 08:34.000
I've been deeply involved with the consultation process of the EU AI Act,

08:34.000 --> 08:39.000
and specifically about the general purpose AI code of practice.

08:39.000 --> 08:44.000
The initial instinct of the regulators was to regulate the provider,

08:44.000 --> 08:46.000
the entity that trains the model,

08:46.000 --> 08:48.000
The logic seems sound at first.

08:48.000 --> 08:51.000
If Ford builds a car,

08:51.000 --> 08:56.000
Ford is responsible in the event that the brakes fail.

08:56.000 --> 08:58.000
So applying the same line of reasoning,

08:58.000 --> 09:00.000
if Mistral or Meta releases a model,

09:00.000 --> 09:05.000
they should be responsible if the same model is used for attacking a bank.

09:05.000 --> 09:07.000
But here lies the trap, right?

09:07.000 --> 09:13.000
Because the EU AI Act creates obligations for high-risk AI systems.

09:13.000 --> 09:15.000
It demands risk management.

09:15.000 --> 09:19.000
It demands post-market monitoring and record keeping.

09:19.000 --> 09:20.000
Now ask yourself,

09:20.000 --> 09:25.000
how can an open source developer perform post-market monitoring?

09:25.000 --> 09:26.000
Realistically.

09:26.000 --> 09:28.000
Once you download the weights,

09:28.000 --> 09:31.000
the developer has zero visibility into what you do.

09:31.000 --> 09:33.000
They cannot log your prompts,

09:33.000 --> 09:35.000
they cannot recall the model,

09:35.000 --> 09:40.000
they cannot push a security patch to your laptop or edge devices.

09:41.000 --> 09:43.000
Now, if the law insists

09:43.000 --> 09:47.000
that the original developer is liable for downstream misuse,

09:47.000 --> 09:53.000
the rational business decision is to not release the open weights in the first place, right?

09:53.000 --> 09:58.000
This was the mitigation gap I highlighted to the European Commission.

09:58.000 --> 10:02.000
We risk creating a regulatory environment

10:02.000 --> 10:08.000
that implicitly favors closed, proprietary AI systems.

10:09.000 --> 10:12.000
Not because they are necessarily better or safer,

10:12.000 --> 10:15.000
but because they are controllable.

10:15.000 --> 10:19.000
So we had to fight for a new legal architecture.

10:19.000 --> 10:23.000
And we secured a critical victory in the interpretation of the Act,

10:23.000 --> 10:27.000
centered around the concept of substantial modification.

10:27.000 --> 10:32.000
This is the legal firewall protecting open source.

10:32.000 --> 10:35.000
So bear with me for a second, please.

10:35.000 --> 10:40.000
The argument I successfully advocated to the European Commission is the following.

10:40.000 --> 10:43.000
When a third party downloads a model and fine-tunes it,

10:43.000 --> 10:48.000
especially for the sake of removing the security guardrails or of specializing the model

10:48.000 --> 10:50.000
for malicious use,

10:50.000 --> 10:53.000
they are no longer a simple user.

10:53.000 --> 10:55.000
They become a new provider.

10:55.000 --> 10:59.000
So by performing a substantial modification, for any

10:59.000 --> 11:02.000
meaningful notion of substantial modification,

11:02.000 --> 11:05.000
the liability shifts accordingly.

11:05.000 --> 11:10.000
And so if I release a general-purpose model and you abliterate it

11:10.000 --> 11:12.000
to create a phishing engine,

11:12.000 --> 11:16.000
you have created a new high-risk AI system.

11:16.000 --> 11:19.000
Therefore, you become responsible for risk assessment

11:19.000 --> 11:23.000
and in turn you become responsible for compliance.

11:23.000 --> 11:25.000
This distinction is vital,

11:25.000 --> 11:27.000
because it acknowledges the technical reality

11:27.000 --> 11:32.000
that the safety of the model is conditional on the deployment context.

11:32.000 --> 11:37.000
It moves the regulatory crosshair from the creation to the modification,

11:37.000 --> 11:40.000
to the weaponization of the tool itself.

11:40.000 --> 11:42.000
However, the battle is not over,

11:42.000 --> 11:45.000
because there is still ambiguity about the open source

11:45.000 --> 11:49.000
exemption present in the general purpose AI code of practice.

11:49.000 --> 11:54.000
Does it apply to open weight models that do not release the training data?

11:54.000 --> 11:58.000
If we stick to the strict open source initiative definition,

11:58.000 --> 12:03.000
then current models like Llama and Mistral do not qualify.

12:03.000 --> 12:10.000
So we need a pragmatic interpretation that protects, in practice, open

12:10.000 --> 12:13.000
weight models regardless of data purity,

12:13.000 --> 12:17.000
so to speak, provided the documentation is transparent.

12:17.000 --> 12:20.000
So how do we navigate this minefield?

12:20.000 --> 12:23.000
We cannot close the source, we don't want that.

12:23.000 --> 12:29.000
But we cannot ignore the possibility that the open weight model gets weaponized.

12:29.000 --> 12:32.000
We need a new socio-technical pact,

12:32.000 --> 12:34.000
a new standard of responsible release,

12:34.000 --> 12:38.000
and here is what it looks like in practice.

12:38.000 --> 12:43.000
To begin with, because we cannot control the model after the release,

12:43.000 --> 12:47.000
the act of releasing becomes the single most critical moment

12:47.000 --> 12:49.000
in the lifetime of an open weight model.

12:49.000 --> 12:55.000
We must treat the documentation not just as a simple README file,

12:55.000 --> 13:00.000
but rather as a legal and technical dowry

13:00.000 --> 13:04.000
that is passed to the downstream actor, to the downstream user.

13:04.000 --> 13:07.000
This means that model cards that are explicit,

13:07.000 --> 13:11.000
that list offensive capabilities explicitly,

13:11.000 --> 13:15.000
need to be populated after proper red teaming.

13:15.000 --> 13:19.000
So if your model scores, let's say, 80% on the OCCULT benchmark,

13:19.000 --> 13:21.000
then let's write it down on the label,

13:21.000 --> 13:23.000
let's write it down on the model cards.

13:23.000 --> 13:25.000
This is not about admitting defeat,

13:25.000 --> 13:27.000
quite the other way around.

13:27.000 --> 13:30.000
It's about allowing the liability shift.

13:30.000 --> 13:31.000
By disclosing the risk,

13:31.000 --> 13:35.000
we transfer the duty of care to the downstream actor.

13:35.000 --> 13:38.000
Second, we need to move away from the idea of banning models

13:38.000 --> 13:41.000
based on parameter count and computational thresholds.

13:41.000 --> 13:45.000
Instead, we need to focus on specific high-risk capabilities.

13:45.000 --> 13:49.000
Can the model synthesize biological weapon protocols?

13:49.000 --> 13:54.000
Can the model discover zero-day vulnerabilities in C++ code bases?

13:54.000 --> 13:58.000
Now, if the model has these types of capabilities,

13:58.000 --> 14:03.000
perhaps the corresponding weights need to be released with some kind of friction,

14:03.000 --> 14:09.000
and this friction may entail tiered access or gated know-your-customer protocols,

14:09.000 --> 14:15.000
while the general reasoning core remains open.

14:15.000 --> 14:20.000
Now, granted, this is technically hard to do and it is sub-optimal,

14:20.000 --> 14:22.000
but it's better than a blanket ban.

14:22.000 --> 14:27.000
And finally, we must recognize that the only defense against AI-scale offense

14:27.000 --> 14:29.000
is AI-scale defense.

14:29.000 --> 14:31.000
Look, we are in 2026,

14:31.000 --> 14:34.000
and it's clear that many of the assumptions security programs

14:34.000 --> 14:37.000
were built upon are falling apart.

14:37.000 --> 14:40.000
Attackers are no longer constrained by human speed,

14:40.000 --> 14:44.000
With AI, reconnaissance, exploitation, and lateral movement

14:44.000 --> 14:47.000
can now run continuously and in parallel.

14:47.000 --> 14:52.000
What used to take weeks to complete now takes just minutes.

14:52.000 --> 14:55.000
Most defensive models still assume time exists,

14:55.000 --> 14:58.000
but periodic testing assumes attackers move slowly,

14:58.000 --> 15:01.000
reactive detection assumes there's time to intervene,

15:01.000 --> 15:05.000
and manual validation assumes humans can keep up with all of this.

15:05.000 --> 15:08.000
Those assumptions are breaking apart.

15:08.000 --> 15:10.000
Not because teams are failing,

15:10.000 --> 15:13.000
but because the model itself is under strain.

15:13.000 --> 15:16.000
The real question in 2026 is not "are we secure?"

15:16.000 --> 15:20.000
The real question is: can we run security programs at scale

15:20.000 --> 15:24.000
without humans becoming the bottleneck in all of this?

15:24.000 --> 15:28.000
The same open weight models that attackers use for finding vulnerabilities

15:28.000 --> 15:30.000
can be used for patching vulnerabilities.

15:30.000 --> 15:34.000
So, we need to prioritize AI-enabled defense.

15:34.000 --> 15:37.000
We need to flood, metaphorically speaking,

15:37.000 --> 15:40.000
the security space with defensive technologies.

15:40.000 --> 15:43.000
Because if the skill floor for attackers is dropping,

15:43.000 --> 15:48.000
we must raise the floor for defenders even faster.

15:48.000 --> 15:53.000
And open source is probably the only way we can do this fast enough

15:53.000 --> 15:58.000
for the largest number of security stakeholders.

15:58.000 --> 16:01.000
Because security through obscurity never worked in the past

16:01.000 --> 16:04.000
and it's not going to work in the future.

16:04.000 --> 16:06.000
So, let me conclude with this.

16:06.000 --> 16:09.000
The era of "code is law",

16:09.000 --> 16:11.000
which is to say,

16:11.000 --> 16:15.000
the era of the architecture itself enforcing the rules of the game,

16:15.000 --> 16:19.000
is definitively over when we speak about artificial intelligence.

16:19.000 --> 16:22.000
In the open weight era, the code is lawless.

16:22.000 --> 16:24.000
It's fluid, it's everywhere.

16:24.000 --> 16:28.000
This terrifying freedom is also our greatest strength.

16:28.000 --> 16:31.000
The mitigation gap is real, the risks are high.

16:31.000 --> 16:35.000
But the solution is not to retreat behind the walled garden of proprietary APIs.

16:35.000 --> 16:38.000
The solution is to build a governance model that is well matched

16:38.000 --> 16:42.000
to the technical realities we have out there.

16:42.000 --> 16:45.000
We can do so.

16:45.000 --> 16:51.000
If we shift liability to the actors who actually control the system,

16:51.000 --> 16:55.000
if we are radically transparent about the real-world capabilities

16:55.000 --> 16:57.000
of the open weight models we design.

16:57.000 --> 17:03.000
And if we use the power of open source to build the immune system of the internet.

17:03.000 --> 17:08.000
Ultimately, we are the architects of this new reality.

17:08.000 --> 17:12.000
So, let's try to build the kind of reality we can survive in.

17:12.000 --> 17:13.000
Thank you.

17:13.000 --> 17:22.000
We've got time for some questions.

17:22.000 --> 17:25.000
So, does anyone have questions?

17:25.000 --> 17:27.000
I don't see anyone.

17:43.000 --> 17:47.000
I guess this governance model works within the EU.

17:47.000 --> 17:54.000
But yeah, what do you think about, for example, the global picture,

17:54.000 --> 17:58.000
other governments and organisations?

17:58.000 --> 17:59.000
Well, thanks for asking.

17:59.000 --> 18:02.000
Because, you know, over the next one year or two,

18:02.000 --> 18:06.000
my ambition is to bridge the gap from a regulatory perspective,

18:06.000 --> 18:08.000
also overseas.

18:08.000 --> 18:10.000
But you know, in Europe we have a joke.

18:10.000 --> 18:15.000
We typically joke saying that America innovates, China manufactures,

18:15.000 --> 18:17.000
and we regulate things.

18:17.000 --> 18:24.000
Because you know, the power of EU regulation is that we are using the market lever

18:24.000 --> 18:26.000
as a bargaining chip.

18:26.000 --> 18:33.000
And we try to put forward the kind of regulations that are cross-national,

18:33.000 --> 18:38.000
because they cross beyond borders, and by the time other players

18:39.000 --> 18:43.000
around the globe want to do business with European entities,

18:43.000 --> 18:47.000
they need to comply with our regulation, right?

18:47.000 --> 18:53.000
So, by operating at the EU level, we are already having an impact overseas.

18:53.000 --> 18:57.000
Then of course, there are different sensibilities around the globe.

18:57.000 --> 19:04.000
And it's my ambition to work on that over the next 12, 24 months or so.

19:04.000 --> 19:13.000
Any other questions from the audience?

19:13.000 --> 19:25.000
You said the correct response to this kind of aggressive AI is responding with defensive AI systems,

19:25.000 --> 19:28.000
and open source specifically.

19:28.000 --> 19:31.000
Are there any projects we should keep an eye on?

19:32.000 --> 19:37.000
I don't have a taxonomy, because this is a space that is moving very, very fast.

19:37.000 --> 19:40.000
It is also a cat-and-mouse game, as per usual.

19:40.000 --> 19:46.000
But you know, I've been working in the offensive cyber operations working group

19:46.000 --> 19:50.000
at Royal Holloway, University of London, where we tried to brainstorm

19:50.000 --> 19:55.000
how offensive cyber operations are going to evolve in the future.

19:55.000 --> 20:00.000
And one element we agreed upon was that the tempo of everything

20:00.000 --> 20:04.000
is going to be dramatically affected.

20:04.000 --> 20:10.000
So, this is the reason that compels me to suggest scaling defense using AI,

20:10.000 --> 20:16.000
because humans are not going to keep up.

20:16.000 --> 20:22.000
Any other questions?

20:22.000 --> 20:28.000
Maybe the last question?

20:28.000 --> 20:35.000
Any wishes for software developers who are not developing new open-weight models

20:35.000 --> 20:41.000
but use existing models for improving security and safety in their applications

20:41.000 --> 20:46.000
when it comes to things like prompt injection protection and other issues?

20:46.000 --> 20:51.000
Well, if we don't measure things, we cannot manage those things.

20:51.000 --> 20:58.000
The first, most important thing I would like to do, myself and with the help of everyone

20:58.000 --> 21:03.000
here, is to measure the de facto capabilities of open weight models.

21:03.000 --> 21:07.000
This has been done to some limited extent by MITRE with the OCCULT framework,

21:07.000 --> 21:12.000
but they didn't publish the datasets and corpora as open source.

21:12.000 --> 21:20.000
So, a very first step could be replicating those results and building incrementally on top of that.

21:20.000 --> 21:26.000
Any last question?

21:26.000 --> 21:28.000
If not, thank you for your talk.

21:28.000 --> 21:29.000
Thank you for the questions.

21:29.000 --> 21:30.000
Thank you.

