WEBVTT

00:00.000 --> 00:10.000
Hi, thanks everyone for coming to this lightning talk about IzzyOnDroid's download statistics.

00:10.000 --> 00:13.000
Given this is a lightning talk, we probably won't have time for any questions.

00:13.000 --> 00:18.000
So if you do have questions, I invite you to come to the IzzyOnDroid booth and UD.

00:18.000 --> 00:22.000
So let me just talk about what the download statistics are, how they work and how to use them.

00:22.000 --> 00:27.000
And for the people who don't know yet, IzzyOnDroid is a source for open source apps.

00:27.000 --> 00:33.000
Similar to F-Droid, except that we get the apps from the developers directly, and in 60% of the cases,

00:33.000 --> 00:37.000
have reproducible builds to make sure that it matches the source code.

00:37.000 --> 00:41.000
First about me, who am I? I'm Sylvia van Os.

00:41.000 --> 00:43.000
I'm known online as TheLastProject.

00:43.000 --> 00:50.000
For my day job, I do DevOps and Linux system administration, aka I write a lot of YAML.

00:50.000 --> 00:55.000
I'm a contributor to the IzzyOnDroid project, and some of you may know me for contributing

00:55.000 --> 01:01.000
to Catima. This is a little app for storing QR codes, loyalty cards, etc.

01:01.000 --> 01:05.000
So first of all, why do we even have download statistics?

01:05.000 --> 01:10.000
Well, we already had other ways to, like, prioritize good apps, especially in Neo Store.

01:10.000 --> 01:15.000
But a lot of people have been asking for download statistics to help figure out which apps are popular

01:15.000 --> 01:18.000
and likely to be well maintained for a longer time.

01:18.000 --> 01:22.000
So we decided to implement this to help users choose the right apps for them.

01:23.000 --> 01:28.000
We believe that download statistics help prioritize popular apps, which in the FOSS world is a lot of the time,

01:28.000 --> 01:34.000
mostly word of mouth, because there's generally a bit less budget.

01:34.000 --> 01:39.000
And it also prioritizes older, well maintained apps because an update gives a download spike

01:39.000 --> 01:42.000
and the age is reflected in the total number of downloads.

01:42.000 --> 01:47.000
It's however not a silver bullet, just because an app is downloaded a lot doesn't mean it is a great app.

01:47.000 --> 01:51.000
So it's just an indicator of quality, not a guarantee of quality.

01:51.000 --> 01:54.000
And it works best when used over medium periods.

01:54.000 --> 01:57.000
So you account for the age of the app, but not solely.

01:57.000 --> 02:01.000
So the general overview, as you can see here,

02:01.000 --> 02:06.000
the first step is a client, or your phone, or your browser, whatever.

02:06.000 --> 02:09.000
downloads an APK from one of our IzzyOnDroid mirrors.

02:09.000 --> 02:12.000
That mirror then writes that download to the server log.

02:12.000 --> 02:20.000
Then we have a little tool, which we call the stats builder, which scans through the server log and outputs some statistics JSON.

02:20.000 --> 02:22.000
It's a very simple tool.

02:22.000 --> 02:25.000
It reads the server logs and turns them into statistics.

02:25.000 --> 02:31.000
It's just two Python files and a TOML config file, and it works in two phases.

02:31.000 --> 02:33.000
First of all, of course, we need to configure it.

02:33.000 --> 02:39.000
So say you have this download, this log line, in your Apache log.

02:39.000 --> 02:42.000
Now you'll have to write a regular expression.

02:42.000 --> 02:45.000
And the fields that are necessary are the date-time,

02:45.000 --> 02:49.000
the method, the path, the status code, and the user agent.

02:49.000 --> 02:57.000
The other fields are not necessary, but I've included them here to make the regular expression easier to read.

02:57.000 --> 03:03.000
If you have a server that outputs JSON, for example, like Caddy,

03:03.000 --> 03:06.000
you can also directly parse the JSON.

03:06.000 --> 03:09.000
And you also need to tell it the date-time format.

03:09.000 --> 03:15.000
So as you can see, this is how that regular expression gets split up.
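
NOTE
Editor's sketch, not the project's actual configuration: one way a regular expression
and a date-time format could pick the required fields (date-time, method, path, status
code, user agent) out of an Apache "combined" log line. The names LOG_PATTERN,
DATETIME_FORMAT and parse_line, and the sample line, are assumptions for illustration.
```python
import re
from datetime import datetime
# Hypothetical pattern for Apache's "combined" log format; the real config may differ.
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<datetime>[^\]]+)\] '
    r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<user_agent>[^"]*)"'
)
# Matches timestamps such as 01/Jun/2025:12:34:56 +0200 (English month names).
DATETIME_FORMAT = "%d/%b/%Y:%H:%M:%S %z"
def parse_line(line):
    """Return the required fields as a dict, or None if the line does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    return {
        "datetime": datetime.strptime(match["datetime"], DATETIME_FORMAT),
        "method": match["method"],
        "path": match["path"],
        "status": int(match["status"]),
        "user_agent": match["user_agent"],
    }
# Made-up sample line using a documentation IP address and a fictional package.
sample = ('203.0.113.7 - - [01/Jun/2025:12:34:56 +0200] '
          '"GET /repo/com.example.app_42.apk HTTP/1.1" 200 1048576 '
          '"-" "ExampleClient/1.0"')
record = parse_line(sample)
```
The extra named groups (ip, size, referer) are not used; they only keep the pattern readable.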

03:15.000 --> 03:20.000
So, for example, let's say we run this over our Apache log.

03:20.000 --> 03:23.000
We have the logs in /var/log/apache2/access.log.

03:23.000 --> 03:27.000
We first throw it through grep to filter out all the APK downloads.

03:27.000 --> 03:31.000
That makes it a lot easier to write a proper regular expression

03:31.000 --> 03:35.000
if you have other stuff in your logs, like the server restarting, or whatever.

03:35.000 --> 03:39.000
Then we throw it into the parser with the config file, and we just write it to a JSON file.

03:39.000 --> 03:47.000
And we get a simple JSON file that says, per date, this app had that many downloads with that client.
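
NOTE
Editor's sketch of the counting step, under stated assumptions: records are already
parsed into (date, path, status, user agent) tuples, APK files are named
package_versionCode.apk, and the nested date / app / client layout is illustrative
rather than the tool's real output schema. count_downloads is a hypothetical name.
```python
import json
from collections import Counter
def count_downloads(records):
    """Count successful APK downloads per (date, package, client)."""
    counts = Counter()
    for date, path, status, user_agent in records:
        # Skip failed requests and anything that is not an APK.
        if status != 200 or not path.endswith(".apk"):
            continue
        # Assumed file naming: <package>_<versionCode>.apk
        package = path.rsplit("/", 1)[-1].rsplit("_", 1)[0]
        # Use the product token before the first slash as the client name.
        client = user_agent.split("/", 1)[0]
        counts[(date, package, client)] += 1
    stats = {}
    for (date, package, client), n in sorted(counts.items()):
        stats.setdefault(date, {}).setdefault(package, {})[client] = n
    return stats
# Made-up records: two successful downloads, one 404, one non-APK request.
records = [
    ("2025-06-01", "/repo/com.example.app_42.apk", 200, "ExampleClient/1.0"),
    ("2025-06-01", "/repo/com.example.app_42.apk", 200, "ExampleClient/1.0"),
    ("2025-06-01", "/repo/com.example.app_42.apk", 404, "ExampleClient/1.0"),
    ("2025-06-01", "/repo/index.html", 200, "Mozilla/5.0"),
]
stats = count_downloads(records)
print(json.dumps(stats, indent=2))
```
Filtering on the .apk suffix here plays the same role as the grep step described in the talk.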

03:47.000 --> 03:49.000
That's step one.

03:49.000 --> 03:54.000
And then for step two, after we have all these statistics, we run the index generator,

03:54.000 --> 04:01.000
which creates an index.json, saying, for this date, you have to look into these files.

04:01.000 --> 04:07.000
And it's important to keep this index.json and the stats directory together.

04:07.000 --> 04:13.000
The problem with this, if you want to use these download statistics, is that you only have data for one server.

04:13.000 --> 04:17.000
And it's very verbose. It always has, like, the logs per day.

04:17.000 --> 04:23.000
And you have to read the index.json to figure out where to find the stats for a specific day.

04:23.000 --> 04:27.000
So to make that a bit easier and more usable,

04:27.000 --> 04:33.000
we have the stats collector, which pulls the built stats from all of our known mirrors.

04:33.000 --> 04:39.000
Currently three: two in Germany, one in the United States. We very much welcome more mirrors.

04:39.000 --> 04:46.000
And then it just merges those together into a basic and an upstream directory.
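
NOTE
Editor's sketch of the merge idea, assuming each mirror's stats have already been
reduced to a simple mapping of app name to download count. The function name
merge_mirror_stats and the sample data are hypothetical; the real collector works
on the per-date files described earlier.
```python
from collections import Counter
def merge_mirror_stats(per_mirror):
    """Sum per-app download counts across all mirrors."""
    total = Counter()
    for mirror_stats in per_mirror:
        # Counter.update adds counts instead of overwriting them.
        total.update(mirror_stats)
    return dict(total)
# Made-up numbers for three mirrors.
mirror_a = {"com.example.app": 10, "org.example.other": 3}
mirror_b = {"com.example.app": 5}
mirror_c = {"org.example.other": 1, "net.example.third": 7}
merged = merge_mirror_stats([mirror_a, mirror_b, mirror_c])
```
An app that is missing on one mirror simply contributes nothing for that mirror.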

04:47.000 --> 04:49.000
These files are a lot easier to consume.

04:49.000 --> 04:59.000
And currently, we run an instance of this, which uses Woodpecker CI on Codeberg daily to generate that output.

04:59.000 --> 05:01.000
And we publish it to a public URL.

05:01.000 --> 05:08.000
I've not written the public URL here, not because I want to hide it, but just because I want to make sure that if people use it,

05:08.000 --> 05:14.000
we can communicate with them if we have to make some changes, especially because it's still rather new.

05:14.000 --> 05:18.000
So feel free to ask us for the URL if you want to use it.

05:18.000 --> 05:25.000
We have two basic types. We have the basic type, which is per-app download stats

05:25.000 --> 05:30.000
without the client info. As you saw in the earlier slides, we had like per-client information.

05:30.000 --> 05:33.000
That's too much info for some people.

05:33.000 --> 05:39.000
And we have the upstream statistics, which is basically the exact same format as shown earlier.

05:39.000 --> 05:44.000
And we also provide four date types, four formats.

05:44.000 --> 05:52.000
We provide a yearly format, the monthly, the monthly with days, which is just, for a month, every single day, and the daily format.

05:52.000 --> 05:55.000
And we also have a rolling.json for most of these.

05:55.000 --> 06:00.000
So you can get like the last year of statistics, the last month, etc.

06:00.000 --> 06:06.000
So for example, if you want to know the total number of downloads per app in the last 30 days,

06:06.000 --> 06:11.000
you just grab the rolling.json in the monthly directory in the basic directory.
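
NOTE
Editor's sketch of what a rolling 30-day total means, computed from hypothetical
per-day data. The published rolling.json already contains such totals, so a client
would not normally recompute them; rolling_total and the sample numbers are
assumptions for illustration only.
```python
from datetime import date, timedelta
def rolling_total(daily, end, days=30):
    """Sum per-app downloads over the `days` days ending on `end` (inclusive)."""
    start = end - timedelta(days=days - 1)
    totals = {}
    for day, apps in daily.items():
        # Keys are ISO dates such as 2025-06-01.
        if start <= date.fromisoformat(day) <= end:
            for app, n in apps.items():
                totals[app] = totals.get(app, 0) + n
    return totals
# Made-up daily counts; the May entry falls outside the 30-day window.
daily = {
    "2025-05-01": {"com.example.app": 100},
    "2025-06-01": {"com.example.app": 4, "org.example.other": 1},
    "2025-06-15": {"com.example.app": 6},
}
totals = rolling_total(daily, end=date(2025, 6, 30))
```
With end set to 30 June 2025, the window runs from 1 June to 30 June inclusive.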

06:11.000 --> 06:18.000
That's only a few kilobytes. If you want per-day information, of course that file is going to be a lot bigger.

06:18.000 --> 06:24.000
For example, for June 2025, that's 2025-06.json in monthly in upstream.

06:24.000 --> 06:33.000
And of course, if you just want all the downloads of the apps on a certain date, just grab that date from the daily directory.

06:33.000 --> 06:37.000
Those statistics are then uploaded to the statistics server.

06:37.000 --> 06:41.000
And then a client can download it. So what kind of client?

06:41.000 --> 06:48.000
Well, for our first version, we built stats.izzyondroid.org, which gives you like a web browser dashboard,

06:48.000 --> 06:52.000
which helps a lot to make these JSON files more readable.

06:52.000 --> 06:56.000
But where it actually gets a lot cooler is that we've been working together,

06:56.000 --> 07:04.000
with the Droid-ify and the Neo Store developers, to show these download statistics directly in those clients.

07:04.000 --> 07:08.000
In Droid-ify, you just see the total amount of downloads in a period.

07:08.000 --> 07:15.000
But in Neo Store, you can even sort the apps by popularity, which will help you find popular,

07:15.000 --> 07:18.000
most likely well maintained apps.

07:18.000 --> 07:23.000
And Neo Store has a lot of graphs and stuff, because the Neo Store developer loves graphs.

07:23.000 --> 07:26.000
Some of you will surely love graphs too.

07:26.000 --> 07:29.000
That is basically the quick overview.

07:29.000 --> 07:35.000
There are a lot of text-heavy slides, so please do take this opportunity to just download the slides

07:35.000 --> 07:37.000
and look them over as a reference sheet.

07:37.000 --> 07:42.000
I'd like to say a quick thank you to NLnet for sponsoring this work,

07:42.000 --> 07:46.000
and other related work through an NGI Mobifree grant.

07:46.000 --> 07:51.000
It's the first grant we've ever got as IzzyOnDroid, which is a really cool accomplishment for us.

07:52.000 --> 07:55.000
And I'm just leaving these important links up for a bit,

07:55.000 --> 08:00.000
the QR code contains all of this data, so you can check out some of this stuff.

08:00.000 --> 08:03.000
Yeah, thank you.

