WEBVTT

00:00.000 --> 00:10.240
All right. Welcome everyone. I'm Florian. I'm based in Switzerland at the Bern University

00:10.240 --> 00:14.280
of Applied Sciences, and today I'm going to present to you a project that we've been working

00:14.280 --> 00:22.880
on for, yeah, a bit like a year in the core phase, let's say, and it's called OpenParlData.ch.

00:22.880 --> 00:29.000
So basically, if you ask yourself what has happened in Switzerland, in the education policy,

00:29.000 --> 00:34.000
for example, in the last, let's say, five years, as, for example, a researcher,

00:34.000 --> 00:40.760
a journalist, or also civil society organization, advocacy manager, for example, this is really

00:40.760 --> 00:45.000
hard to answer this question. It's going to take you a lot of time or a lot of money like

00:45.000 --> 00:50.280
you need access to, like, an expensive lobbying or monitoring tool to get to kind of

00:50.280 --> 00:56.360
acquire this data. Why is this the case? Like a lot of countries in the world, Switzerland

00:56.360 --> 01:01.000
is also a federalist country, so the decisions are not just made at the national level, but also

01:01.000 --> 01:05.760
at the subnational level. So that means we have one parliament, we have 26 cantonal

01:05.760 --> 01:11.440
parliaments, which is like the subnational level, and then we even have 461 municipal parliaments,

01:11.440 --> 01:17.760
and that's for a country of 8 million people, so you can imagine. Yes, and if you look

01:17.760 --> 01:24.000
at, like, legislative data in Switzerland, you don't have to be able to read that, but basically

01:24.000 --> 01:29.080
the message is, it's apples and oranges. So there's a couple of parliaments that provide data

01:29.080 --> 01:33.360
through APIs, but most of them, they have, like, they merely have websites, there's a lot

01:33.360 --> 01:39.120
of PDFs going on. Yeah, so it's a really messy situation. So the data is not really accessible,

01:39.120 --> 01:46.720
it's definitely not harmonized. Yeah, and it comes in, like, many forms and shapes. So again,

01:46.720 --> 01:52.520
as I've said, it's quite costly, you need a lot of resources to have access to these APIs.

01:52.520 --> 01:57.800
There's also kind of asymmetries, like, given that, because, yeah, if you have, if you have

01:57.800 --> 02:02.920
the financial means, then you're able to access the data if you don't have the financial

02:02.920 --> 02:08.200
means you simply can't. And there's also a lot of inefficiencies, especially in the research,

02:08.200 --> 02:13.040
but also in the journalism sector, it's like, like, political scientists, for example,

02:13.040 --> 02:18.800
they get the data, they clean it for their specific project, but they don't share the data

02:18.800 --> 02:25.520
with others. So what we try to do or what we decided to do is we want to collaboratively,

02:25.520 --> 02:31.520
basically, involving all the user groups, build an open standard, but also, obviously, an open

02:31.520 --> 02:42.560
API for harmonized Swiss legislative data. And that's OpenParlData.ch. So we give

02:42.720 --> 02:47.840
everyone at this point access to harmonized open data from currently 78 national,

02:47.840 --> 02:54.960
cantonal and municipal parliaments. So researchers, journalists, civil society organizations

02:54.960 --> 03:00.960
can, on the one hand, analyze this data, but also monitor it. And with that, we basically

03:00.960 --> 03:08.480
want to foster transparency, participation and innovation. So, basically, we have, kind of,

03:08.480 --> 03:13.200
like, a two track approach. On the one hand, we kind of, like, we're currently fighting symptoms

03:13.200 --> 03:19.440
on one hand. So we built this API as an MVP, as a kind of short-term solution. So we import

03:19.440 --> 03:24.960
the data, we clean it, we harmonize it, and we publish it openly through our API. That's basically

03:24.960 --> 03:32.960
what we did in 2025, to basically create value ASAP. And also, like, get feedback from the community

03:32.960 --> 03:38.560
from the users. As the next step, and that's what we're currently doing, we want to kind of

03:38.560 --> 03:44.320
build this into a product, kind of in the medium-term. So we're adding, like, a value-based governance,

03:44.320 --> 03:48.320
but also a business model to be able to fund it and actually sustain the structure.

03:49.600 --> 03:53.680
But then, on the other hand, we're also trying to fight root causes. That's why we're developing

03:53.680 --> 03:59.920
data standards in a kind of, like, a Swiss framework, an organization, for public, in that sense,

04:00.320 --> 04:05.200
legislative data. And in the next step, as soon as we have the standard, which is going to be the

04:05.200 --> 04:13.280
case, at the end of the year, basically, fingers crossed, to then enable and encourage, in that sense,

04:13.280 --> 04:18.880
like, also actively lobby parliaments and governments to implement standards and also publish their

04:18.880 --> 04:24.640
data through open APIs. Yes. And if that all goes right, we will be able to kind of,

04:24.640 --> 04:28.880
like, phase out one crawler after another that we're currently using to get most of the data.

04:30.080 --> 04:37.680
Yes. So maybe just quickly about how do we do that? So, as I've said, we scrape the data, we do that with

04:37.680 --> 04:43.360
it's a bit cut off, sorry, from the rendering, but we do that with Apache Hop ETL, put that into

04:43.920 --> 04:52.880
a PostgreSQL database, and then that gets queried with a FastAPI backend, an Angular

04:52.960 --> 05:00.160
front end, and then a React admin panel. So you get the JSON output from our API. You can test that,

05:00.160 --> 05:04.240
or you can check it out, it's currently in beta, but we're about to switch to release version

05:04.240 --> 05:10.320
in the next weeks, and we always welcome feedback. So, please make an issue if you see something

05:10.320 --> 05:19.280
that doesn't behave as it should. What data do we have? So, yeah, we have data on, like, what do we

05:19.280 --> 05:24.480
mean by that? Legislative data, sorry. So, we have data on bills, we have data on

05:24.480 --> 05:28.960
political actors, so, like, MPs, how they vote, which committees they are part of,

05:29.680 --> 05:35.760
when they meet, agendas in parliament, meeting minutes, we have votes in parliament, we have

05:35.760 --> 05:41.360
different decrees, so laws, we have a lot of documents, also, yeah, we extracted all the information

05:41.360 --> 05:46.000
from those documents as well, so it's all queryable. So, a lot of text and a lot of processes, basically.

05:46.960 --> 05:51.280
Yeah, so, if you query our API, this is what it could look like,

05:51.280 --> 05:57.200
just one excerpt, for example, in this case, the council of Vaud, of the canton of

05:57.200 --> 06:06.240
Vaud, this would be like an extract from the bills. Then, we also built like a GUI, but for the

06:06.240 --> 06:11.440
sole purpose of kind of like for people, also for non-technical people to be able to preview the data,

06:11.840 --> 06:15.920
it's not really supposed to be used as kind of a monitoring tool itself, but more to,

06:15.920 --> 06:21.920
so people know what's going on. I think quite a cool feature is also if you go through the GUI,

06:21.920 --> 06:31.520
you will also get the API query as the URL, and get HTML and JSON. So, what can you do with our data?

06:32.320 --> 06:38.240
We basically, like, evaluated, or are focusing on, two use cases or like two use case

06:38.240 --> 06:43.840
clusters, so to say, one is kind of like one-time or recurring data analysis,

06:44.560 --> 06:48.720
and there we focus on researchers and journalists, like data journalists in that sense,

06:48.720 --> 06:54.400
and then there's kind of the use case of like more continuous and real-time tool-assisted

06:54.400 --> 07:00.160
political monitoring by civil society organizations, by intercantonal conferences, that's a

07:00.160 --> 07:04.640
very Swiss thing, you don't have to know what that is, but then also journalists, so that they

07:04.960 --> 07:11.120
for example get alerts if something that interests them pops up. But we also want to

07:11.120 --> 07:16.160
kind of enable like exchange of data and information between different parliaments,

07:16.160 --> 07:24.480
between administration and parliaments, etc, etc. Yes, so in actual terms, what can you do?

07:24.480 --> 07:31.600
For example, as a researcher, you can analyze bills, so how do they progress in parliaments,

07:31.680 --> 07:36.640
how do they go through the different stages, what are the success factors, like who submits them,

07:36.640 --> 07:42.640
etc, etc. You can evaluate and analyze topical trends, you can analyze decrees and their

07:42.640 --> 07:49.200
legislative footprint, basically, voting behavior, for example, but also links to other

07:49.200 --> 07:55.760
data sources, like media coverage, company registry, etc, etc. And the goal is really

07:55.760 --> 08:00.320
here to kind of also build like an ecosystem or a research ecosystem around this data.

08:00.960 --> 08:08.240
We want to encourage and enable researchers to cooperate, to share data, to share

08:08.800 --> 08:13.680
clean and rich data, so if they process the data, they can share it with others, or they can share

08:13.680 --> 08:19.760
pre-trained ML, like machine learning classifiers, topic modeling stuff, but also data processing

08:19.760 --> 08:28.560
pipelines. Yes, and just like how, how does this kind of like look like, but it's, yeah, it's not

08:28.560 --> 08:33.920
super important, but basically, again, we get the data from the parliaments, but also from

08:33.920 --> 08:39.680
third parties, via some APIs, but mostly by web crawling, and then we can offer it to different

08:39.680 --> 08:45.360
researchers, they can cooperate, among each other, but also kind of going through our platform at some point.

08:47.040 --> 08:52.480
Yes, and like we just finished that work, basically, and there is already like the first article

08:52.480 --> 08:58.560
published a couple of weeks ago, which looks at, for example, voting behavior, in this case,

08:58.560 --> 09:04.960
in Geneva, so basically on a left-right axis kind of compare how the different parties vote.

09:05.760 --> 09:12.000
There's also like another funny prototype happening, so there's a module here, kind of undell

09:12.000 --> 09:17.440
analyzed how animals are discussed in the Swiss parliament. So for example, here you see

09:17.440 --> 09:23.440
bunnies, like this is how much, like, like, rabbits were since the 1990s, like, 13 times mentioned

09:23.440 --> 09:30.800
in Swiss parliament, for example. Yeah, so what we are, what I do is, so basically we have this

09:30.800 --> 09:35.040
prototype or this beta version now, but the question is really, okay, how can we keep the

09:35.040 --> 09:40.400
data accessible and how can we like operate this like open data infrastructure sustainably

09:40.400 --> 09:45.600
and in the public interest? So the questions are kind of okay, how do we govern it,

09:45.600 --> 09:50.880
democratically, how can we like operate it, efficiently and financially sustainably,

09:50.880 --> 09:55.520
and how can others contribute high-quality and interoperable data? And then last but not least,

09:55.520 --> 10:02.240
what can we learn from this? Yeah, so basically, yeah, we're in the process of doing that.

10:02.240 --> 10:07.040
I got like a grant from the Swiss government to kind of focus on these two aspects.

10:08.480 --> 10:11.600
So we are currently evaluating and developing principles.

10:11.600 --> 10:19.520
Yeah, so yeah, that's quite self-explanatory. And also objectives for the whole data

10:19.520 --> 10:24.720
infrastructure, and what we're also doing is kind of focusing on the business model currently.

10:24.720 --> 10:32.640
So based on OSS, but also open data and digital commons project business models,

10:33.200 --> 10:38.720
we kind of prioritized a couple of revenue streams that we'll have a look at and

10:38.720 --> 10:44.880
really evaluate more thoroughly. For example, usage fees, membership fees, kind of like in the

10:44.880 --> 10:52.160
commons area, dual licensing, but also grants and donations, grants is how this project

10:52.160 --> 10:56.960
has been funded so far, but also kind of selling support services and consultancy.

10:58.800 --> 11:04.160
And the next steps of this project are really kind of developing, implementing and evaluating

11:04.320 --> 11:11.120
again, the business model, governance and organizational structure, but we also

11:11.120 --> 11:18.000
want to implement contribution and data quality mechanisms and then last but not least check out

11:18.000 --> 11:22.960
if there are kind of synergies between what we are doing and like a potential legislative database.

11:23.360 --> 11:27.360
And if that all works out, like we want to scale this kind of maybe also to some extent

11:27.360 --> 11:32.400
internationally, but maybe also to other kind of like data spaces in Switzerland.

11:33.520 --> 11:38.080
Yeah, and just very quickly what have we learned so far, super important to continuously

11:38.080 --> 11:43.200
involve the stakeholders and the users from the very, very start. That's what we did.

11:43.920 --> 11:48.640
Coalition of the willing, so we offer a hand to everybody and say do you want to join us?

11:48.640 --> 11:51.920
But if they say no, we don't get blocked by these kind of people.

11:52.880 --> 11:59.040
Yes, and then super important, transparency for us, so we want to really live this as well,

11:59.040 --> 12:04.000
so everything is in the open, working in the open, there's no closed meeting notes, everything is on

12:04.000 --> 12:10.000
GitLab. And focus. So there's like various kinds of creep, feature creep,

12:10.000 --> 12:16.880
blah blah blah, data creep or whatever, try to fight those. Yes, I think that's it from my side,

12:16.880 --> 12:22.800
and I would love to hear your feedback. And if you have any ideas or people you think I should talk to.

12:22.800 --> 12:26.000
Thank you so much.

12:29.200 --> 12:31.040
Couple of minutes for questions, it's great.

12:32.560 --> 12:35.040
Yeah, so I had to run through this. Yeah.

12:46.880 --> 12:50.640
We should also see that some point to access on that. Does it mean that there are some

12:50.640 --> 12:55.760
elements that send this for people to rest of that? No.

13:00.480 --> 13:05.440
Yeah, so the question was that you mentioned, and you asked whether there's

13:05.440 --> 13:11.440
economists that offer data through financial compensation? No, that was a misunderstanding. No,

13:11.440 --> 13:16.160
it's just you pay for a monitoring tool for somebody to give you access to the data,

13:16.160 --> 13:22.000
but like data is all public, so we only currently supply public data, but it's not really accessible.

13:22.000 --> 13:26.000
So if you want to scale, if you don't want to just have a look at like one specific

13:26.000 --> 13:31.840
parliament at one point in time, then it's just, yeah, it's a lot of effort, basically.

13:31.840 --> 13:38.000
Yeah, and the second was about the switch from scraping to APIs. Yes, you were

13:38.000 --> 13:42.640
on an in-game experience, when the parliament is doubting what was in API.

13:42.640 --> 13:45.920
So that doesn't look like it was less switched than what we could scrape.

13:45.920 --> 13:50.320
And so in the end, we remained on scraping, although we have to switch for some parts,

13:50.320 --> 13:54.800
because of the APIs of what's working. Yeah, that's a very good question.

13:54.800 --> 13:59.840
So the question was whether, so from the experience in France, if I got this correctly,

14:00.640 --> 14:05.040
like the data that was provided through APIs was less than what was

14:05.040 --> 14:07.120
available through the website, so they stayed with the website.

14:08.160 --> 14:12.400
Yeah, so during the time that we did the project, there was no switch so far,

14:13.040 --> 14:17.760
but the thing is that basically, so we try to kind of counteract this through the standard

14:17.760 --> 14:23.280
basically. So within the standard, there's defined, okay, this has to be published,

14:23.280 --> 14:28.400
kind of in this or that way, so they can't do that. But yeah,

14:28.400 --> 14:32.640
now, but I think it's definitely a very good point, and I think it's also,

14:32.720 --> 14:39.200
I mean, this is a data project, but it's as well, it's also a kind of lobbying project to that extent.

14:40.720 --> 14:42.000
Yes, thank you.

14:42.000 --> 14:47.920
And very good point, I just seen other initiatives, similar, I just know from Brazil,

14:47.920 --> 14:51.920
I think here is a border, it's LexML. Yeah, with openly the open standard. Yeah,

14:51.920 --> 14:59.280
have to take a look. Yeah, yeah, LexML, so for the decrees, kind of for the text,

14:59.360 --> 15:06.560
we will include LexML and Akoma Ntoso, which is connected to LexML. Yeah, that's a little, that's, yeah.

15:08.400 --> 15:12.960
Yeah, and if you have other questions, I'm going to be outside for the next five minutes or whatever,

15:12.960 --> 15:17.120
or just until there's no more questions. Thank you.

