WEBVTT

00:00.000 --> 00:14.160
Next up, we are going to stream a video from a speaker that couldn't make it

00:14.160 --> 00:26.680
because of visa considerations. The person is from Africa, the continent. I won't

00:26.760 --> 00:36.680
discuss the specifics, but we intended the diffusion of the video as some sort of maybe

00:36.680 --> 00:43.240
that's better not recording. I was going to say that we intended the diffusion of her video

00:44.520 --> 00:51.880
more or less against the first-time policies as an act of resistance because it was

00:52.440 --> 01:00.280
tough hearing that she didn't get her visa. We will have her video for 20 minutes and then

01:00.280 --> 01:10.520
we are going to have a live Q&A session with her on the screen, if it works. We did test it beforehand,

01:10.760 --> 01:21.800
but we are going to have a live Q&A session with her on the screen,

01:21.800 --> 01:28.840
that's true, go ahead and then, that's it.

01:51.800 --> 02:19.080
Can you hear anything?

02:19.080 --> 02:26.080
Oh really?

02:26.080 --> 02:34.080
Okay, maybe we should test that instead of...

02:34.080 --> 02:37.080
Oh, you mean a centred?

02:37.080 --> 02:39.080
Yeah, but actually if you...

02:39.080 --> 02:41.080
Or is that just...

02:41.080 --> 02:43.080
You might be the first light?

02:43.080 --> 02:46.080
Yeah, it's just the first light.

02:46.080 --> 02:48.080
That's good.

02:48.080 --> 02:52.080
Okay, perfect.

02:52.080 --> 03:01.080
Okay, perfect.

03:01.080 --> 03:04.080
Tell her, tell her, we are launching the video.

03:04.080 --> 03:09.080
I'm just gonna write after her.

03:09.080 --> 03:23.080
I guess we should be in advance.

03:23.080 --> 03:38.080
So we have time for switching from the video to the queue.

03:38.080 --> 03:48.080
So let's do this.

03:48.080 --> 04:02.080
Hi everybody.

04:02.080 --> 04:05.080
I hope that you're doing well and you are enjoying it.

04:05.080 --> 04:06.080
I'm a first-end.

04:06.080 --> 04:09.080
It's saddening that I am not able to be here physically.

04:09.080 --> 04:12.080
But thank goodness for technology,

04:12.080 --> 04:16.080
and the amazing FOSDEM team, for making space for me

04:16.080 --> 04:19.080
and enabling me to send in a recording of my talk.

04:19.080 --> 04:23.080
I'm also thankful that you have decided to give me the next few minutes.

04:23.080 --> 04:25.080
Please listen to what I have to say.

04:25.080 --> 04:30.080
Now, without further ado, let's get into it.

04:30.080 --> 04:35.080
So, I'm going to talk to you about reproducible AI practices.

04:35.080 --> 04:41.080
And then I will be focusing on LMICs.

04:41.080 --> 04:42.080
Good morning.

04:42.080 --> 04:44.080
My name is Precious [surname inaudible].

04:44.080 --> 04:47.080
I am an open source manager for the Data Science Without Borders project.

04:47.080 --> 04:49.080
Glyst-Ostronal team.

04:49.080 --> 04:52.080
I am a Software Sustainability Institute fellow [year unclear].

04:52.080 --> 04:53.080
Hello?

04:53.080 --> 04:54.080
Yeah.

04:55.080 --> 04:57.280
And I sometimes just memorize the question, I have a question, I have a question,

04:57.280 --> 05:03.080
I do it on a podcast adaptation, I'm speaking, I love to read and I absolutely love to believe.

05:06.080 --> 05:11.080
From my talk today, I have a few things that I'd really like you to take from it.

05:11.080 --> 05:15.080
Because this is the foundation of what I have to say.

05:15.080 --> 05:19.080
So, first, definitions of the key words: reproducibility, AI,

05:19.080 --> 05:23.080
and then open, as a blanket term for open source and open science, right?

05:23.080 --> 05:30.080
I will also be explaining why reproducibility and openness are important.

05:30.080 --> 05:36.080
I will be talking about the challenges of not building for reproducibility and openness.

05:36.080 --> 05:41.080
I will also be talking about how to build with reproducibility and openness in mind.

05:41.080 --> 05:44.080
And sharing some case studies from a global context.

05:44.080 --> 05:52.080
Then I will also be talking about how to build with reproducibility and openness in mind for LMICs.

05:53.080 --> 05:59.080
Now, to definitions: we can't talk about reproducibility without talking about research.

05:59.080 --> 06:01.080
So what exactly is reproducible research?

06:01.080 --> 06:09.080
Reproducible research is work that can be independently recreated from the same data and the same code that the original team used.

06:09.080 --> 06:15.080
So if anybody decides that they want to make use of that material and they get the data and the code,

06:15.080 --> 06:17.080
it should give them the same results.

06:18.080 --> 06:29.080
This definition is that that pair from the turnery and amazing community and resource that typically tries to make which is a bit reset to easy not to do.

06:29.080 --> 06:34.080
Now, explaining what reproducibility is: it's essentially almost the same thing.

06:34.080 --> 06:44.080
Reproducibility is when the work can, in practice, be independently recreated from the same data and the same code that the original team used.

06:44.080 --> 06:48.080
Right, as you can see, this definition is also adapted from The Turing Way.

06:48.080 --> 06:54.080
So there are some words that are usually used in the same context and sometimes used interchangeably.

06:54.080 --> 06:58.080
They don't mean the same thing and this image here tries to explain that.

06:58.080 --> 07:03.080
So if the analysis is the same and the data is the same, then the work is reproducible.

07:03.080 --> 07:08.080
If the analysis is different but the data is the same, then it is robust.

07:08.080 --> 07:16.080
If the analysis is the same but the data being used is different, then the research work is replicable.

07:16.080 --> 07:23.080
And then if the data is different and the analysis is also different, then the research work is generalizable.
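
NOTE
Editor's sketch, not part of the talk: the four cases just described (same or different data, same or different analysis) can be written as a small lookup.
The function name classify_study and its two boolean parameters are hypothetical, chosen only for illustration; the terms follow The Turing Way usage quoted by the speaker.
```python
def classify_study(same_data: bool, same_analysis: bool) -> str:
    """Map a data/analysis combination to the matching term."""
    if same_data and same_analysis:
        return "reproducible"   # same data, same analysis
    if same_data:
        return "robust"         # same data, different analysis
    if same_analysis:
        return "replicable"     # different data, same analysis
    return "generalizable"      # different data, different analysis
if __name__ == "__main__":
    # Print the full two-by-two matrix.
    for data in (True, False):
        for analysis in (True, False):
            print(data, analysis, classify_study(data, analysis))
```
Keeping the two questions as separate arguments makes it explicit which of them changed between the original study and the re-run.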

07:23.080 --> 07:28.080
This image shows the whole process of building a research work.

07:28.080 --> 07:31.080
So data collection and data processing.

07:31.080 --> 07:36.080
And analysis, data publishing and access, data visualization and data reuse.

07:36.080 --> 07:40.080
And then research ideas can come up again and begin the whole process again.

07:40.080 --> 07:43.080
We said data planning and design collection.

07:43.080 --> 07:46.080
Like it's a whole cycle.

07:46.080 --> 07:49.080
Now, we come to explain what open source is.

07:49.080 --> 07:51.080
This is generally an open source conference.

07:51.080 --> 07:54.080
So the assumption might be that everybody knows what it is.

07:54.080 --> 07:57.080
But as a refresher, I will just explain what it is.

07:58.080 --> 08:04.080
Opensource.com describes open source as something people can modify and share because its design is publicly accessible.

08:04.080 --> 08:12.080
This is quite important because, as we've seen, the concept of open source has now extended beyond its original

08:12.080 --> 08:14.080
context, which was software, right?

08:14.080 --> 08:22.080
So now it covers other forms of cultural and creative outputs, such as hardware, educational resources, and even science, right?

08:22.080 --> 08:29.080
And referring to something as open source generally means that it's available under an open license.

08:29.080 --> 08:34.080
This general definition comes from [source unclear].

08:34.080 --> 08:39.080
And then explaining what an AI system is because that's what we're talking about here.

08:39.080 --> 08:42.080
An AI system is a machine-based system

08:42.080 --> 08:55.080
that, for explicit or implicit objectives, infers, from the input it receives, how to generate outputs such as predictions, content, recommendations, or decisions that can influence physical or virtual environments.

08:55.080 --> 09:01.080
Different AI systems vary in their levels of autonomy and adaptiveness after deployment.

09:01.080 --> 09:03.080
This is the OECD definition.

09:03.080 --> 09:11.080
And it's quite important because it's the one that the open source initiative uses to explain what an open source AI is.

09:11.080 --> 09:13.080
That's where it adapts the definition from.

09:13.080 --> 09:19.080
And now we explain what an open source AI system means according to the Open Source Initiative.

09:19.080 --> 09:27.080
So an open source AI is an AI system that is made available under terms

09:27.080 --> 09:33.080
and in a way that grant the freedoms to use, study, modify, and share.

09:33.080 --> 09:38.080
It has to have these four freedoms to be considered open source AI.

09:38.080 --> 09:43.080
These freedoms apply both to a fully functional system and to discrete elements of a system.

09:43.080 --> 09:46.080
So that includes models, weights, and training data.

09:46.080 --> 09:55.080
And the precondition to exercising these freedoms is to have access to the preferred form to make modifications to the system.

09:58.080 --> 10:06.080
It's very easy, when we make mention of openness and reproducibility, to sometimes think that they mean the same thing.

10:06.080 --> 10:08.080
But they do not, right?

10:08.080 --> 10:15.080
A work can be reproducible and open, but that doesn't mean that all reproducible works are open, right?

10:15.080 --> 10:21.080
So reproducible doesn't always mean open, but the central intersection where they meet is the answer to the question:

10:21.080 --> 10:27.080
can others reproduce, understand, and then build on this work without insider knowledge?

10:27.080 --> 10:30.080
Can I see the material and just work with what has been created there,

10:30.080 --> 10:32.080
with what has been provided there,

10:32.080 --> 10:37.080
without having to know the actual researchers or having to find out whoever it is, right?

10:37.080 --> 10:43.080
If I can do that, it means the work can be considered open and reproducible.

10:44.080 --> 10:47.080
We'll stick to this intersection for this presentation,

10:47.080 --> 10:54.080
as I'll be speaking about AI systems and creating them within this context.

10:54.080 --> 10:58.080
So what is the importance of openness and reproducibility?

10:58.080 --> 11:02.080
It helps make results verifiable and trustworthy.

11:02.080 --> 11:08.080
When a material is created and publicly available, it's easy for people to trust what they've seen.

11:08.080 --> 11:15.080
if it's reproducible, if you can find the data or the analysis that it was made from.

11:15.080 --> 11:21.080
As opposed to a researcher just making claims, where nobody can check if it works or if it makes any sense,

11:21.080 --> 11:24.080
or whether it really does what it has been said to do, right?

11:24.080 --> 11:33.080
It also enables adaptation across different contexts, because if person B in the global north is able to apply this to their demographic,

11:33.080 --> 11:40.080
person C in the global south will be able to reproduce that same result with what has been provided.

11:40.080 --> 11:46.080
So it can work in any part of the world where it's applied, or in any situation, depending.

11:46.080 --> 11:52.080
And then it enables the support of responsible and ethical AI.

11:52.080 --> 11:59.080
Because AI is very prevalent right now, and everybody's making use of it, there's a lot of probing into how it is used.

11:59.080 --> 12:05.080
This is an important tool, but we also need to make sure that it is ethical and also respects basic human rights.

12:05.080 --> 12:10.080
There are a lot of questions about how the data is being sourced and the ethics of it.

12:10.080 --> 12:18.080
So when the data is created with openness and reproducibility factored in, it supports responsible use of it.

12:18.080 --> 12:27.080
And it also enables researchers and users to trace where it has come from and whether it was ethically sourced, and to probe further to make sure of that.

12:27.080 --> 12:32.080
And if you nice using feature is you, or it's got some activity.

12:32.080 --> 12:39.080
It also makes researchers accountable, because you know that somebody's going to look into your work and they're going to question where this has come from.

12:39.080 --> 12:40.080
Right.

12:40.080 --> 12:48.080
So it also reduces waste and duplication, because when you see that something has been done before, it's easy for you to build on it, as opposed to

12:48.080 --> 12:55.080
a researcher somewhere else doing the same work, not knowing that somebody else has worked on it because that material was not made public.

12:55.080 --> 13:01.080
They would probably only find out when they submit it to a journal

13:01.080 --> 13:07.080
for publication. It also prevents the concentration of power.

13:07.080 --> 13:15.080
Most of us, or some of us, might have seen the current news of how much power AI systems are taking,

13:15.080 --> 13:24.080
and the way it's affecting certain environments. When we are able to spread the use, the knowledge, and the study of AI systems,

13:24.080 --> 13:33.080
it enables power to not be concentrated in one place, because now different environments and different researchers in different parts of the world are trying out the same thing.

13:33.080 --> 13:38.080
So the use of energy is not all in one place.

13:38.080 --> 13:49.080
It also enables the strengthening of global research capacity. I've mentioned how work can be adapted in different contexts, different environments, or different demographics.

13:49.080 --> 13:56.080
By doing that and creating a material or a research article that is

13:56.080 --> 14:05.080
global, that's reproducible, people from different parts of the world can try it, and because now they have something to build on,

14:05.080 --> 14:13.080
it makes innovation very easy to work on, and now it's widespread and you have people from different parts of the world trying out the same thing.

14:13.080 --> 14:17.080
So it's very great for the strengthening of global research capacity.

14:17.080 --> 14:21.080
So now that is the importance. What are the challenges?

14:21.080 --> 14:27.080
Interestingly, the inverse of all these benefits is what the challenges are right now.

14:27.080 --> 14:30.080
So we see that results cannot be trusted.

14:30.080 --> 14:38.080
There are models out there in the world that people are making use of, or that researchers are making claims about, saying this is open or this is reproducible,

14:38.080 --> 14:44.080
when in practice, or even in theory, it is not, which is why there is

14:44.080 --> 14:49.080
an Open Source AI Definition, to help with that.

14:49.080 --> 14:55.080
If someone says this is open source, this is what you use as a checkmark to see:

14:55.080 --> 15:00.080
okay, this works, or this does not adhere to the definition.

15:00.080 --> 15:06.080
And there is also the difficulty of reusing or adapting existing research material.

15:06.080 --> 15:15.080
It is what exists and what is publicly available that people can use and build on, and that can foster innovation.

15:15.080 --> 15:26.080
When something is not publicly available, when it is not accessible, it is harder for researchers to be able to build on it, right?

15:26.080 --> 15:32.080
And then you now find out there is also the consistency on in the end twist, which I will come to.

15:32.080 --> 15:41.080
Another challenge that we see is power and resources being concentrated, because material is not being shared or made publicly available and cannot be reproduced.

15:41.080 --> 15:50.080
So you find out that it is most likely the same set of people, demographics, or organizations that make

15:50.080 --> 16:05.080
or work on these systems, because they are most likely the ones who have access to the first set of resources.

16:05.080 --> 16:17.080
We find that another challenge is power and resources being concentrated because, again, these materials are not publicly available or they cannot be reproduced.

16:17.080 --> 16:23.080
And so it's just the same set of people that keep on churning out materials or findings.

16:23.080 --> 16:33.080
And people with low resources or lower access to these tools are not able to work on something that is groundbreaking because.

16:33.080 --> 16:36.080
the resources are not available to them.

16:36.080 --> 16:41.080
Ethical risks also increase, because now nobody can probe into how something is being built.

16:41.080 --> 16:48.080
There's no way to say, okay, let me try this test that you said produced these results, and to be able to check

16:48.080 --> 16:53.080
whether it's correct or not. So when

16:53.080 --> 16:56.080
it's not easy for people to look into the work that's been done,

16:56.080 --> 17:01.080
then the question of ethics comes up, and there's less opportunity to question it.

17:01.080 --> 17:12.080
And then we see that time, energy, and funding are wasted, as I mentioned previously when we covered the difficulty of reusing or adapting existing research material.

17:12.080 --> 17:20.080
It can also happen that people work on the same thing, and you cannot tell until you've probably submitted it to be approved.

17:20.080 --> 17:29.080
And then global participation also shrinks, which is really sad, because when work cannot be reproduced,

17:29.080 --> 17:44.080
it's hard for people who are not able to do the first set of research or innovation to build on it and adapt it to their own context.

17:44.080 --> 17:54.080
Building with openness and reproducibility in mind: we want to first of all start with transparent training. So log and version the model architecture, loss functions, optimizers,

17:54.080 --> 18:00.080
hyperparameters, and random seeds, and separate data pre-processing, training, and evaluation.

18:00.080 --> 18:04.080
You want to report the results across multiple runs to capture variability.

18:04.080 --> 18:13.080
This is important because researchers who want to reuse the work would see the different methods that were tried and the different results that came out of them.
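
NOTE
Editor's sketch, not part of the talk: one minimal way to log the configuration and random seeds and to report results across several runs, using only the Python standard library.
Everything here is an assumption made for illustration: CONFIG, train_and_evaluate and run_experiment are hypothetical names, and the "training" is a stand-in that returns a seed-dependent toy score rather than fitting a real model.
```python
import json
import random
import statistics
# Hypothetical configuration; every name and value is illustrative only.
CONFIG = {"model": "toy-linear", "loss": "mse", "optimizer": "sgd", "learning_rate": 0.01, "seeds": [0, 1, 2, 3, 4]}
def train_and_evaluate(seed: int, config: dict) -> float:
    """Stand-in for a real training run: returns a toy score that depends only on the seed."""
    rng = random.Random(seed)  # a private generator keeps each run repeatable
    return 0.80 + rng.uniform(-0.05, 0.05)
def run_experiment(config: dict) -> dict:
    """Run once per seed and return the config with per-run and summary scores."""
    scores = [train_and_evaluate(seed, config) for seed in config["seeds"]]
    return {"config": config, "scores": scores, "mean": statistics.mean(scores), "stdev": statistics.stdev(scores)}
if __name__ == "__main__":
    # Emit one JSON record that can be archived next to the code and data.
    print(json.dumps(run_experiment(CONFIG), indent=2))
```
Storing the seeds inside the logged configuration means a reader can re-run exactly the same set of runs and compare the spread, not just a single best number.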

18:13.080 --> 18:22.080
You want to document data pipelines: describe data sources, filtering, and cleaning steps, and provide relevant statistics.

18:22.080 --> 18:32.080
Then, in the case where data cannot be shared, like if you're working on something sensitive, which will most likely happen at some point, then

18:32.080 --> 18:38.080
just let other researchers know that you made use of the data but it's sensitive and cannot be shared.

18:38.080 --> 18:42.080
Next, document computational and environment requirements.

18:42.080 --> 18:47.080
We report hardware, memory, training time, batch sizes and energy or cost requirements.

18:47.080 --> 18:50.080
Again, this is so that others can reuse what we've done.

18:50.080 --> 18:56.080
They need to have the same analysis, the same code, and the same data set.

18:56.080 --> 19:02.080
So share dependencies, library versions, and clear setup instructions.
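
NOTE
Editor's sketch, not part of the talk: a small helper that records the interpreter, operating system, and installed package versions so they can be shared with the setup instructions.
The function name describe_environment and the example package names are hypothetical; importlib.metadata is part of the Python standard library (3.8 and later), and the list[str] annotation assumes Python 3.9 or later.
```python
import json
import platform
import sys
from importlib import metadata
def describe_environment(packages: list[str]) -> dict:
    """Collect interpreter, OS, and package versions for a run report."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # package is not installed in this environment
    return {"python": sys.version.split()[0], "platform": platform.platform(), "machine": platform.machine(), "packages": versions}
if __name__ == "__main__":
    # The package names below are examples only; list the ones your project imports.
    print(json.dumps(describe_environment(["numpy", "pip"]), indent=2))
```
A record like this does not replace a pinned requirements file, but it gives readers on limited hardware a quick way to see what the original run actually used.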

19:02.080 --> 19:04.080
We want to model behavior as possible.

19:04.080 --> 19:16.080
We also want to design for adaptation.

19:16.080 --> 19:21.080
Others need to be able to reproduce the whole evaluation and then adapt it.

19:21.080 --> 19:32.080
So AI can be meaningful across different contexts, including resource-constrained and LMIC environments.

19:32.080 --> 19:48.080
These are some resources and communities that exist to enable education and common practice of openness and reproducibility.

19:48.080 --> 19:54.080
There's The Turing Way, which has made it a mission to make reproducibility too easy not to do.

19:54.080 --> 19:57.080
You should try and go check it out.

19:57.080 --> 20:01.080
It's like whole found called resources.

20:01.080 --> 20:09.080
There's open data and then there's get from the data or the images to collect data anywhere.

20:09.080 --> 20:16.080
The next point is building with openness and reproducibility in mind for LMICs.

20:16.080 --> 20:24.080
So the previous points that we've considered, which are transparent training, documented data pipelines,

20:24.080 --> 20:26.080
compute and environment clarity,

20:26.080 --> 20:31.080
model behavior [unclear], and design for adaptation.

20:31.080 --> 20:36.080
Considering them, with an added consideration for low-resource settings,

20:36.080 --> 20:40.080
we have: design for reproducibility from the start.

20:40.080 --> 20:43.080
It's not something that should happen as an afterthought.

20:43.080 --> 20:49.080
Because again, this is something that should be applied regardless of the context.

20:49.080 --> 20:58.080
But again, if it's applied from the beginning, it helps low-resource contexts.

20:58.080 --> 21:02.080
So define who can reproduce your work and the level of reproducibility:

21:02.080 --> 21:12.080
exact results, similar performance, or conclusions. This is also where you specify the tools that were used,

21:12.080 --> 21:18.080
so whoever is going to reproduce the work knows whether they need to have a very good GPU to do that.

21:18.080 --> 21:23.080
Right, we'll talk about the considerations for technology somewhere later.

21:23.080 --> 21:25.080
But that's point one.

21:25.080 --> 21:27.080
You want to collect and use local data.

21:27.080 --> 21:32.080
Availability of data sets within the African context is constantly improving.

21:32.080 --> 21:39.080
Training AI models on data sets from here improves the chances of creating AI systems with local context.

21:39.080 --> 21:41.080
Here means Africa.

21:41.080 --> 21:44.080
and, in general, other LMICs.

21:44.080 --> 21:50.080
Citizen science: you want to design models to run on limited hardware and encourage local community contributions.

21:50.080 --> 22:01.080
This is important because first of all, when you're doing a research work, trying to implement or reproduce some of these.

22:01.080 --> 22:02.080
Reset copy.

22:02.080 --> 22:03.080
Sorry.

22:11.080 --> 22:13.080
Let me start that again.

22:13.080 --> 22:18.080
Citizen science: design models to run on limited hardware and encourage local community collaboration.

22:18.080 --> 22:24.080
This is important because, when you design for reproducibility from the start,

22:24.080 --> 22:31.080
you want to factor in using tools that people from lower resource settings will have access to.

22:31.080 --> 22:34.080
Obviously, there are cases when you cannot help it.

22:34.080 --> 22:36.080
You have to use something very, very high end.

22:36.080 --> 22:44.080
But in cases where you can, it is encouraged to use tools that are publicly accessible.

22:44.080 --> 22:50.080
And then also encouraging local community contributions means that individuals, normal citizens,

22:50.080 --> 22:53.080
I mean, not necessarily researchers, are involved in the process.

22:53.080 --> 22:57.080
That way, it also gives you larger data sets.

22:57.080 --> 23:00.080
And it shares more knowledge about what you're building.

23:00.080 --> 23:04.080
And this creates interest in what is being built.

23:05.080 --> 23:15.080
Enable adaptation and local reuse: provide guidance for fine-tuning and for reproducing results on smaller or local data sets.

23:15.080 --> 23:20.080
So for whatever resource is created, provide guiding documentation.

23:20.080 --> 23:25.080
Documentation should be clear and easy to use for future readers.

23:25.080 --> 23:30.080
And sometimes it's also really important even for the team that has done the work:

23:30.080 --> 23:35.080
you want to be able to come back two or three weeks later, or one or two years later,

23:35.080 --> 23:37.080
look at the work that you had done before,

23:37.080 --> 23:43.080
and understand what you did previously.

23:43.080 --> 23:51.080
I have listed some existing communities, initiatives, and tools that exist to help with openness and reproducibility,

23:51.080 --> 23:57.080
with a focus on LMICs, such as DHIS2,

23:57.080 --> 24:01.080
then there's the African Reproducibility Network community,

24:01.080 --> 24:06.080
then as DHIS has done, what are the projects which is focused on making up the data,

24:06.080 --> 24:08.080
accountable, accessible and interpretable.

24:08.080 --> 24:17.080
And then the open science Africa ecosystem, a list of other open science communities, projects, and funders,

24:17.080 --> 24:26.080
and basically policy makers that focus on building with openness and reproducibility in mind.

24:27.080 --> 24:33.080
So to end my talk, I've mostly done this as a way to share knowledge for folks,

24:33.080 --> 24:39.080
core researchers and not-so-core researchers, technical and non-technical,

24:39.080 --> 24:41.080
mostly as general knowledge.

24:41.080 --> 24:46.080
And at least understand what this means and how to apply it and what our work is to study the current.

24:46.080 --> 24:51.080
Open and reproducible AI is how we ensure communities in LMICs are not just users of AI,

24:51.080 --> 24:56.080
but contributors who are shaping how it's built and applied.

24:56.080 --> 25:00.080
And it enables science that truly knows no borders.

25:00.080 --> 25:02.080
And these are the key tenets that we focus on,

25:02.080 --> 25:07.080
such as reproducibility, transparency, collaboration, and sharing.

25:07.080 --> 25:10.080
Thank you so much for your time.

25:10.080 --> 25:12.080
So these are some references

25:12.080 --> 25:19.080
and tools that helped me in making this presentation.

25:19.080 --> 25:21.080
Yeah.

25:21.080 --> 25:26.080
I've shared some resources that you can go through to get a better idea.

25:26.080 --> 25:30.080
This is not exhaustive; I just listed a bunch of them.

25:30.080 --> 25:37.080
If you'd like to find out more, please contact me and I'll do that.

25:37.080 --> 25:38.080
Yeah.

25:38.080 --> 25:44.080
I would really love to get feedback, because I'd like to learn what you learned

25:44.080 --> 25:48.080
or what you think I can improve on, and I look forward to it.

25:48.080 --> 25:55.080
And I'm going to continue on some time and we'll see you in some months.

25:59.080 --> 26:00.080
Thanks, sir.

26:00.080 --> 26:06.080
So this works. Really glad we could host that talk, even through video.

26:06.080 --> 26:13.080
I especially liked the part where Precious said "enabling science that truly knows no borders",

26:13.080 --> 26:16.080
because that's really the problem here.

26:16.080 --> 26:21.080
We would like to try to have a live Q&A session.

26:21.080 --> 26:25.080
It looks like it works, and I hope it will.

26:25.080 --> 26:28.080
Can you hear us, Precious?

26:28.080 --> 26:31.080
Yes, I can. Can you hear me?

26:31.080 --> 26:34.080
Yes, we can.

26:34.080 --> 26:37.080
Okay.

26:37.080 --> 26:42.080
So how do we do that? Do we have questions?

26:42.080 --> 26:45.080
I may repeat them myself.

26:45.080 --> 26:47.080
Yeah, so we have one right here.

26:47.080 --> 26:48.080
Okay.

26:48.080 --> 26:50.080
I want to know.

26:50.080 --> 26:53.080
Maybe, maybe you can take the mic.

26:53.080 --> 26:59.080
I remember I participated in a project from the Wikimedia Foundation called Global Development.

26:59.080 --> 27:06.080
And there was a documentary called People are Knowledge focused on people from India and South Africa

27:06.080 --> 27:10.080
where oral citations are the main source of information.

27:10.080 --> 27:16.080
So I'm wondering if you are thinking about training some of the LLMs in the African context.

27:16.080 --> 27:23.080
Now that it's easier to make the transcription from voice, to train these models and improve the African context.

27:23.080 --> 27:29.080
I just don't know if you know something about it.

27:29.080 --> 27:32.080
So Precious, did you hear the question?

27:32.080 --> 27:33.080
No.

27:33.080 --> 27:35.080
That's terrible.

27:35.080 --> 27:36.080
Sorry.

27:36.080 --> 27:37.080
It was a great question.

27:37.080 --> 27:40.080
And you, do you hear me?

27:40.080 --> 27:42.080
Yes, I hear you well.

27:42.080 --> 27:46.080
I heard some part about contributing to Wikimedia projects.

27:46.080 --> 27:54.080
And then the question about how we can easily transcribe speech to text.

27:54.080 --> 27:57.080
But I didn't clearly hear the rest of the question.

27:57.080 --> 28:01.080
Okay, so you got most of the important part of the question.

28:01.080 --> 28:11.080
The question was, do you plan or anyone around you plan on training LLMs on transcribed oral data?

28:11.080 --> 28:16.080
On data that's mainly oral and then because it's getting easier to transcribe,

28:16.080 --> 28:23.080
having specifically oral corpora of data for training models.

28:23.080 --> 28:25.080
Did you hear me?

28:25.080 --> 28:27.080
Yes, I did.

28:27.080 --> 28:29.080
Thank you.

28:29.080 --> 28:33.080
So I do know some projects that have started on that.

28:33.080 --> 28:36.080
Honestly, I don't know about it in the science context.

28:36.080 --> 28:44.080
But I do know of a Mozilla-funded project where people from the [region unclear] are encouraged to send in video records.

28:44.080 --> 28:45.080
Sorry.

28:45.080 --> 28:51.080
Audio recordings of the language, transcribed, with some of the things written down in text for longevity purposes.

28:51.080 --> 28:57.080
So the hope is that since we have these things beginning,

28:57.080 --> 29:02.080
then a lot of other sectors can also apply it and then if it's applied in one part,

29:02.080 --> 29:06.080
which is a vision of language, then it can be applied to other sectors.

29:06.080 --> 29:08.080
So it's a great way to start.

29:08.080 --> 29:13.080
So I do think that in the future we can see more things like that.

29:13.080 --> 29:16.080
Great. Great.

29:16.080 --> 29:20.080
Any other questions?

29:21.080 --> 29:25.080
Okay.

29:25.080 --> 29:30.080
Shout it out and then I repeat it.

29:41.080 --> 29:42.080
Great question.

29:42.080 --> 29:44.080
So Precious, the question was,

29:45.080 --> 30:02.080
can you give us an idea of the constraints and restrictions for training and using AI and LLMs in LMIC's context?

30:02.080 --> 30:03.080
Okay.

30:03.080 --> 30:08.080
So the first constraint is the general availability of data, right?

30:08.080 --> 30:10.080
So obviously there's improvement,

30:10.080 --> 30:12.080
conversations like this are happening,

30:12.080 --> 30:15.080
and you have institutions that are making it their goal

30:15.080 --> 30:17.080
to have more data available.

30:17.080 --> 30:19.080
Because that's in from there.

30:19.080 --> 30:23.080
We have less data than should be available currently out there.

30:23.080 --> 30:28.080
So one is that there is not particularly proper recording of data.

30:28.080 --> 30:31.080
Things happen, and then,

30:31.080 --> 30:34.080
by the time you want to check,

30:34.080 --> 30:36.080
can we find where this happened?

30:36.080 --> 30:37.080
What is the data saying?

30:37.080 --> 30:39.080
It's hard to see that they were documented.

30:39.080 --> 30:41.080
Sometimes the data is not digitized,

30:41.080 --> 30:44.080
and then you find out that probably from some hospitals,

30:44.080 --> 30:46.080
where they still keep records

30:46.080 --> 30:48.080
taken by pen and paper,

30:48.080 --> 30:50.080
and then if maybe a large fire happens,

30:50.080 --> 30:52.080
or something like that,

30:52.080 --> 30:55.080
then the information is lost, right?

30:55.080 --> 30:59.080
So one of the things that projects like DSWB

30:59.080 --> 31:03.080
do is to first of all train people

31:03.080 --> 31:06.080
to become more digitally trained, right?

31:06.080 --> 31:07.080
So by doing that,

31:07.080 --> 31:10.080
then we have more data that's digitized,

31:10.080 --> 31:12.080
and that's like one constraint.

31:12.080 --> 31:15.080
Another is sometimes internet and power,

31:15.080 --> 31:17.080
because you obviously need power

31:17.080 --> 31:19.080
to run these systems.

31:19.080 --> 31:26.080
So some places don't have as good access to the internet,

31:26.080 --> 31:29.080
or as reliable as it should be,

31:29.080 --> 31:32.080
and then there is also less availability of power.

31:32.080 --> 31:34.080
And sometimes it's also like computational resources,

31:34.080 --> 31:37.080
because it takes a lot of memory and RAM to be able

31:37.080 --> 31:39.080
to run some of the LLMs.

31:39.080 --> 31:41.080
And in this part of the world,

31:41.080 --> 31:44.080
it's generally a little bit more expensive

31:44.080 --> 31:47.080
to procure these devices.

31:47.080 --> 31:49.080
So, yeah.

31:49.080 --> 31:52.080
These are thoughts or conversations that have been had,

31:52.080 --> 31:54.080
but some of the constraints, listing them out,

31:54.080 --> 31:56.080
would be availability of data,

31:56.080 --> 31:58.080
lack of digital literacy to a large extent,

31:58.080 --> 32:02.080
but that's just with respect to storing data,

32:02.080 --> 32:05.080
and then power and internet constraints.

32:05.080 --> 32:08.080
And sometimes lack of education too,

32:08.080 --> 32:11.080
but thankfully more and more people are talking about this,

32:11.080 --> 32:13.080
and then we have more organizations

32:13.080 --> 32:16.080
that are coming up to educate members of the public,

32:16.080 --> 32:20.080
and as many people as are willing to listen.

32:20.080 --> 32:22.080
Cool.

32:22.080 --> 32:23.080
Thanks.

32:24.080 --> 32:27.080
Questions?

32:27.080 --> 32:28.080
It's good.

32:28.080 --> 32:30.080
We are on time.

32:30.080 --> 32:31.080
We have.

32:31.080 --> 32:32.080
Yeah.

32:32.080 --> 32:35.080
It's good.

32:35.080 --> 32:37.080
So.

32:37.080 --> 32:42.080
Well, there's no questions.

32:42.080 --> 32:44.080
It's fine.

32:44.080 --> 32:45.080
Yeah?

32:45.080 --> 32:46.080
We can wrap up.

32:46.080 --> 32:49.080
Precious, do you want to add anything?

32:49.080 --> 32:51.080
If you hear me.

32:52.080 --> 32:55.080
I want to say that this is part of my SSI.

32:55.080 --> 32:57.080
This is my SSI topic.

32:57.080 --> 33:00.080
So, next year, I'm going to be working intensively

33:00.080 --> 33:04.080
on trying to actually make the practices easier to reproduce

33:04.080 --> 33:06.080
and educate people on how to do that.

33:06.080 --> 33:09.080
So, hopefully, I lend better.

33:09.080 --> 33:12.080
So next time I'm speaking on something like this,

33:12.080 --> 33:14.080
I will have a larger resource.

33:14.080 --> 33:16.080
I sent out my links and profiles.

33:16.080 --> 33:18.080
If you have more information on something

33:18.080 --> 33:21.080
that you think might be what I need, please reach out to me.

33:21.080 --> 33:22.080
And let me know.

33:22.080 --> 33:25.080
And thank you so much for attending my talk.

33:25.080 --> 33:26.080
I'm for the full of them.

33:26.080 --> 33:27.080
for giving me space

33:27.080 --> 33:29.080
and making it possible for this to happen.

33:29.080 --> 33:31.080
Thank you so much.

33:31.080 --> 33:32.080
Thank you.

33:38.080 --> 33:39.080
Thanks.

33:39.080 --> 33:40.080
See you soon.

