WEBVTT

00:00.000 --> 00:07.000
Yeah, welcome, everyone.

00:07.000 --> 00:10.000
I'm going to talk today about machine learning on air

00:10.000 --> 00:14.000
and give you an overview about available frameworks and toolboxes

00:14.000 --> 00:19.000
that you can use to improve your own DSP and radio communications.

00:19.000 --> 00:22.000
So, there's a small lie on this slide

00:22.000 --> 00:25.000
because there will be no transmission over air today.

00:25.000 --> 00:29.000
So, all I'm going to talk about is mostly offline optimization

00:29.000 --> 00:33.000
and also the talk is mostly focused on communication.

00:33.000 --> 00:37.000
So, that's what my understanding of DSP and radio is

00:37.000 --> 00:39.000
or when I combine both.

00:39.000 --> 00:41.000
So, yeah.

00:41.000 --> 00:43.000
First of all, who am I?

00:43.000 --> 00:47.000
So, I started using radio sometime 2015

00:47.000 --> 00:50.000
then I worked at this small SDR company

00:50.000 --> 00:52.000
or did an internship there.

00:52.000 --> 00:56.000
Then I became involved with the radio project

00:56.000 --> 00:58.000
and did like some code contribution

00:58.000 --> 01:03.000
and also today I still am responsible for some stuff there.

01:03.000 --> 01:07.000
And then 2019 I finished my masters

01:07.000 --> 01:10.000
and worked a little bit at ESA

01:10.000 --> 01:13.000
and I was happy to also have been organizing

01:13.000 --> 01:16.000
a pre-FOSDEM hack,

01:16.000 --> 01:20.000
a radio hacking event at ESA

01:20.000 --> 01:23.000
and since 2021 I'm actually working

01:23.000 --> 01:25.000
with machine learning on communication.

01:25.000 --> 01:27.000
So, that was also my first exposure.

01:27.000 --> 01:31.000
So, you're a little bit seeing the last four years

01:31.000 --> 01:33.000
what I've learned.

01:33.000 --> 01:34.000
All right.

01:34.000 --> 01:37.000
So, a short overview of what I'm going to talk about.

01:37.000 --> 01:39.000
First, I want to give you a little bit of introduction

01:39.000 --> 01:43.000
into machine learning for digital signal processing and radio.

01:43.000 --> 01:47.000
Then present to you some of the toolboxes

01:47.000 --> 01:51.000
that you can use to run optimization offline

01:51.000 --> 01:56.000
and then give you a short tutorial like a small example

01:56.000 --> 01:59.000
that you can also replicate or use the public code

01:59.000 --> 02:03.000
that I uploaded as a starting point

02:03.000 --> 02:05.000
to use these toolboxes.

02:05.000 --> 02:10.000
So, first, yeah.

02:10.000 --> 02:13.000
So, first, what is AI and machine learning?

02:13.000 --> 02:15.000
So, right now there's a lot of craze and hype

02:15.000 --> 02:17.000
about AI, everyone wants to do it.

02:17.000 --> 02:20.000
Nobody really knows what it is.

02:20.000 --> 02:23.000
So, I want to clarify that

02:23.000 --> 02:25.000
I'm not going to talk about how to use LLMs

02:25.000 --> 02:28.000
to improve your communications or to have

02:28.000 --> 02:31.000
them design your algorithms.

02:31.000 --> 02:33.000
I'm not going to talk about any AI agents

02:33.000 --> 02:36.000
that's also going to do the coding for you.

02:36.000 --> 02:37.000
Yeah.

02:37.000 --> 02:40.000
Also, no using of APIs of chatbots

02:40.000 --> 02:43.000
to come up with clever DSP algorithms.

02:43.000 --> 02:45.000
Oh, there's a point missing,

02:45.000 --> 02:48.000
but we're going to talk about how to apply

02:48.000 --> 02:52.000
the machine learning principles to improve the algorithms itself.

02:52.000 --> 02:55.000
So, there will be a little bit of math.

02:55.000 --> 02:56.000
Oh, yeah.

02:56.000 --> 02:57.000
So, it's not going to be chat.

02:57.000 --> 02:58.000
Yeah.

02:58.000 --> 02:59.000
That's the point.

02:59.000 --> 03:02.000
So, we'll look at the communication system

03:02.000 --> 03:05.000
and see how we can use some of the open source tools

03:05.000 --> 03:08.000
to apply machine learning in the different parts of our system.

03:08.000 --> 03:12.000
So, first of all, what is machine learning now?

03:12.000 --> 03:18.000
So, there's like an old quote from this book.

03:18.000 --> 03:23.000
Basically, we want a computer program

03:23.000 --> 03:26.000
and we want to feed it some experience E

03:26.000 --> 03:29.000
and define some task T it has to solve

03:29.000 --> 03:32.000
with respect to some performance measure P.

03:32.000 --> 03:35.000
So, it's a very basic definition

03:35.000 --> 03:39.000
and it should improve the performance

03:39.000 --> 03:42.000
if you provide more experience.

03:42.000 --> 03:45.000
So, this kind of data driven approach.

03:45.000 --> 03:49.000
And actually, if you think about it more closely,

03:49.000 --> 03:52.000
this means that we already had machine learning

03:52.000 --> 03:54.000
in communications for a long time.

03:54.000 --> 03:56.000
I will give you an example

03:56.000 --> 03:59.000
in some other slides.

03:59.000 --> 04:03.000
So, first of all, this is like a schematic

04:03.000 --> 04:06.000
representation of how a communication system

04:06.000 --> 04:08.000
roughly can look like.

04:08.000 --> 04:11.000
So, you start in the top left with a data source.

04:11.000 --> 04:13.000
So, you have some bits you want to transmit

04:13.000 --> 04:18.000
in a digital system, like pictures or videos or emails.

04:18.000 --> 04:19.000
You compress them.

04:19.000 --> 04:22.000
You put them through some forward error correction

04:22.000 --> 04:25.000
and then you get this capital B and not capital

04:25.000 --> 04:29.000
but the bold B, which we later will also use.

04:29.000 --> 04:33.000
And our system then we map them

04:33.000 --> 04:36.000
with some representation for the physical space

04:36.000 --> 04:40.000
like amplitude shift keying, phase shift keying,

04:40.000 --> 04:44.000
QAM, FSK, different modulation formats.

04:44.000 --> 04:47.000
We put it through some pulse shaping

04:47.000 --> 04:49.000
to put it on the physical medium

04:49.000 --> 04:51.000
and send it to some channel.

04:51.000 --> 04:52.000
I don't know.

04:52.000 --> 04:53.000
It can be anything.

04:53.000 --> 04:54.000
Can be wireless.

04:54.000 --> 04:56.000
Can be a satellite.

04:56.000 --> 04:59.000
Communication channels, cables, fiber optics.

04:59.000 --> 05:00.000
It doesn't really matter.

05:00.000 --> 05:02.000
But, some channel.

05:02.000 --> 05:04.000
And at the receiver,

05:04.000 --> 05:07.000
We have to somehow get rid of all the channel effects.

05:07.000 --> 05:10.000
We have to synchronize our signal again.

05:10.000 --> 05:14.000
And then after that we get some sort of X hat

05:14.000 --> 05:17.000
which should represent as closely as possible

05:17.000 --> 05:21.000
to this original X that we got out of the symbol mapper.

05:21.000 --> 05:24.000
And then our demapper will give us

05:24.000 --> 05:29.000
nowadays mostly or oftentimes soft values

05:29.000 --> 05:32.000
which we call these kind of L values,

05:32.000 --> 05:34.000
or log-likelihood ratios.

05:34.000 --> 05:37.000
And then we put them through the channel decoder, decompress it.

05:37.000 --> 05:38.000
And then at the receiver,

05:38.000 --> 05:42.000
we hopefully have no errors or like a very low bit error rate.

05:42.000 --> 05:44.000
So this is more or less the definition of the system.

05:44.000 --> 05:46.000
And if you want to now apply machine learning,

05:46.000 --> 05:49.000
you basically have to make a choice.

05:49.000 --> 05:52.000
You want to replace the whole transmitter chain,

05:52.000 --> 05:53.000
everything.

05:53.000 --> 05:56.000
So you just feed in bits and get an almost physical

05:56.000 --> 06:00.000
signal out or you can also replace single blocks out of this

06:00.000 --> 06:04.000
with neural networks or with other approaches

06:04.000 --> 06:07.000
that have trainable parameters.

06:07.000 --> 06:11.000
And then you can define the task.

06:11.000 --> 06:13.000
You define the performance measure.

06:13.000 --> 06:16.000
I'm going to show you some of the ones

06:16.000 --> 06:18.000
that you can use for communication.

06:18.000 --> 06:21.000
The ones that I commonly use for communications.

06:21.000 --> 06:25.000
And then you do data-driven simulation.

06:25.000 --> 06:30.000
So generate bits according to either uniformly distributed

06:30.000 --> 06:34.000
or maybe you have some other patterns that are in your data

06:34.000 --> 06:38.000
that you can also feed in this data source.

06:38.000 --> 06:44.000
And you compute this loss or the performance measure

06:44.000 --> 06:48.000
and then you compute the gradient of this objective function that you have.

06:48.000 --> 06:53.000
And the way you do it and the way I'm going to present today

06:53.000 --> 06:57.000
is numerically and you use commonly known frameworks

06:57.000 --> 07:00.000
which provide us with this automatic differentiation.

07:00.000 --> 07:04.000
So it's not always easy from the data sink and the receiver

07:04.000 --> 07:08.000
to calculate the gradient all the way to the source by hand

07:08.000 --> 07:10.000
and analytically it's sometimes not possible.

07:10.000 --> 07:13.000
So we rely on this automatic differentiation

07:13.000 --> 07:16.000
and numerical simulations to give us

07:16.000 --> 07:18.000
an, let's say, approximation of this gradient

07:18.000 --> 07:22.000
because it's of course not exact.

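Any such gradient, automatic or hand-derived, can be sanity-checked against finite differences; here's a minimal sketch, with a toy function f of my own choosing rather than anything from the talk:

```python
import numpy as np

# Sanity-check a hand-derived gradient against central finite differences.
# The function is an illustrative toy: f(x) = sum(x^2), whose gradient is 2x.
def f(x):
    return float(np.sum(x ** 2))

def analytic_grad(x):
    return 2.0 * x

def numeric_grad(func, x, eps=1e-6):
    g = np.zeros_like(x)
    for i in range(x.size):
        step = np.zeros_like(x)
        step[i] = eps
        # central difference: (f(x + eps) - f(x - eps)) / (2 * eps)
        g[i] = (func(x + step) - func(x - step)) / (2 * eps)
    return g

x = np.array([0.5, -1.5, 2.0])
approx = numeric_grad(f, x)
exact = analytic_grad(x)
```

The same check works for any loss you wire up in an autodiff framework.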
07:22.000 --> 07:27.000
And yeah, in order to improve your parameters

07:27.000 --> 07:29.000
that are called theta here,

07:29.000 --> 07:34.000
we apply an optimization step where we have this kind of gradient

07:34.000 --> 07:41.000
and we take a step of size mu, and we try

07:41.000 --> 07:44.000
to find the minimum of our loss function.

07:44.000 --> 07:47.000
Very simple stuff I hope.

07:47.000 --> 07:50.000
All right, so what are good objective functions?

07:50.000 --> 07:54.000
So one that you could think about immediately

07:54.000 --> 07:57.000
is probably the mean squared error, where you just compute the

07:57.000 --> 08:00.000
mean squared error between your transmit symbol

08:00.000 --> 08:05.000
and your receive symbol, take the average across your batch

08:05.000 --> 08:08.000
or your time length of your simulation

08:08.000 --> 08:10.000
and then you have some loss.

08:10.000 --> 08:14.000
And this is already a pretty good one as we can see later.

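That loss is one line once you have transmit and receive symbols; a sketch with illustrative QPSK symbols and noise level:

```python
import numpy as np

# Mean squared error between transmitted symbols x and received symbols
# x_hat, averaged across the batch. Symbols and noise are illustrative.
def mse_loss(x_hat, x):
    return float(np.mean(np.abs(x_hat - x) ** 2))

rng = np.random.default_rng(1)
qpsk = np.array([1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]) / np.sqrt(2)
x = rng.choice(qpsk, size=1000)                          # transmit symbols
noise = 0.1 * (rng.normal(size=1000) + 1j * rng.normal(size=1000))
x_hat = x + noise                                        # received symbols
loss = mse_loss(x_hat, x)
```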
08:14.000 --> 08:17.000
Then it's a bit more complicated.

08:17.000 --> 08:20.000
And I call it now modified cross entropy

08:20.000 --> 08:23.000
because this is not exactly cross entropy.

08:23.000 --> 08:26.000
This is let's say the practitioners formula

08:26.000 --> 08:30.000
that you can use at the end if you have this kind of simulation

08:30.000 --> 08:33.000
where on the left side this minus h of x

08:33.000 --> 08:38.000
is the negative entropy of your symbols.

08:38.000 --> 08:40.000
So the source entropy how much information

08:40.000 --> 08:44.000
you can put into for example your QAM.

08:45.000 --> 08:48.000
So for example 64 QAM you can put in six bits

08:48.000 --> 08:53.000
if you have a uniform occurrence of all the symbols.

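That six-bit figure is just the source entropy of a uniform 64-symbol alphabet, which you can verify directly:

```python
import numpy as np

# Source entropy H(X) = -sum_i p_i * log2(p_i). With 64 equally likely
# symbols (uniform 64-QAM) this comes out to exactly 6 bits per symbol.
p = np.full(64, 1 / 64)
H = float(-np.sum(p * np.log2(p)))
```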
08:53.000 --> 08:58.000
And on the right side is more or less this conditional entropy

08:58.000 --> 09:01.000
basically this is what you get at the receivers

09:01.000 --> 09:06.000
or you're receiving some complex symbol yk

09:06.000 --> 09:13.000
and you want to figure out what kind of symbol xk was sent.

09:13.000 --> 09:15.000
This is the part you want to minimize.

09:15.000 --> 09:18.000
And this is the part of this term that you basically want to maximize,

09:18.000 --> 09:23.000
the overall entropy, yeah.

09:23.000 --> 09:27.000
And you can also formulate the same bitwise

09:27.000 --> 09:30.000
where these are now these log-likelihood ratios

09:30.000 --> 09:34.000
and this is still the source entropy

09:34.000 --> 09:38.000
and you can also use this as a loss function.

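In code, that bitwise loss is a binary cross entropy computed directly from L-values; a sketch assuming the sign convention llr = log P(b=1)/P(b=0), so minus the log of the bit probability is softplus((1 - 2b) * llr). The example bits and L-values are made up:

```python
import numpy as np

# Bitwise loss from L-values (log-likelihood ratios). Convention assumed:
# llr = log P(b=1)/P(b=0), hence -log P(b) = softplus((1 - 2b) * llr).
# np.logaddexp(0, z) = log(1 + e^z) keeps the computation stable.
def bce_from_llrs(bits, llrs):
    return float(np.mean(np.logaddexp(0.0, (1 - 2 * bits) * llrs)))

bits = np.array([0, 1, 1, 0])
llrs = np.array([-4.0, 5.0, 3.0, -2.0])   # confident, correct soft decisions
loss_good = bce_from_llrs(bits, llrs)
loss_bad = bce_from_llrs(1 - bits, llrs)  # same L-values, flipped bits
```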
09:38.000 --> 09:43.000
If you didn't catch exactly where these formulas come from,

09:43.000 --> 09:47.000
I left out basically all of the steps coming from math

09:47.000 --> 09:50.000
to these derivations.

09:50.000 --> 09:56.000
This is just what we are going to use later in the simulation.

09:56.000 --> 10:00.000
So I said we already used this kind of machine learning

10:00.000 --> 10:02.000
quite a long time in communications

10:02.000 --> 10:06.000
and actually the first occurrence in literature

10:06.000 --> 10:11.000
is like 1960 where they came up with the LMS equalizer

10:11.000 --> 10:16.000
and it's quite a simple system.

10:16.000 --> 10:18.000
Where you basically have the system model.

10:18.000 --> 10:26.000
You derive this x hat k by performing equalization

10:26.000 --> 10:30.000
with a vector f and some receive values

10:30.000 --> 10:33.000
and then you can compute this mean squared error.

10:33.000 --> 10:38.000
You can actually manually derive it, find the minimum

10:38.000 --> 10:41.000
and then you get these two terms.

10:41.000 --> 10:44.000
So the gradients are either in the complex world.

10:44.000 --> 10:48.000
You get this or in the real world, you get this.

10:48.000 --> 10:52.000
And you can just apply the same update step

10:52.000 --> 10:54.000
as I showed before.

10:54.000 --> 10:59.000
So you have the previous equalizer taps

10:59.000 --> 11:03.000
and you just subtract this gradient that you can

11:03.000 --> 11:07.000
take as either one of those, depending on whether you are real-valued or not.

11:07.000 --> 11:10.000
And then you can basically step into the right direction

11:10.000 --> 11:13.000
and minimize your mean squared error.

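The whole LMS loop fits in a few lines; a real-valued sketch where the FIR channel, step size, and tap count are illustrative choices:

```python
import numpy as np

# LMS equalizer: x_hat[k] = f . window, e[k] = x_hat[k] - x[k],
# f <- f - mu * e[k] * window (real-valued form of the update above).
rng = np.random.default_rng(2)
x = rng.choice(np.array([-1.0, 1.0]), size=5000)   # BPSK training symbols
h = np.array([1.0, 0.4, 0.2])                      # "unknown" FIR channel
y = np.convolve(x, h)[: len(x)] + 0.01 * rng.normal(size=len(x))

n_taps, mu = 7, 0.01
f = np.zeros(n_taps)                               # equalizer taps
sq_errors = []
for k in range(n_taps, len(x)):
    window = y[k - n_taps + 1 : k + 1][::-1]       # newest sample first
    x_hat = f @ window
    e = x_hat - x[k]                               # error vs known symbol
    f -= mu * e * window                           # stochastic gradient step
    sq_errors.append(e ** 2)
```

The squared error drops as the taps adapt, which is exactly the mean squared error minimization described above.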
11:13.000 --> 11:18.000
So this is let's say quite fun to see that this is nothing new.

11:18.000 --> 11:25.000
But in 2017 there was a quite remarkable publication

11:25.000 --> 11:29.000
saying: let's not do this only on the receiver.

11:29.000 --> 11:33.000
But we can actually start from the transmitter

11:33.000 --> 11:36.000
going to some channel at the receiver

11:36.000 --> 11:40.000
and try to find maybe the best constellation we can transmit

11:40.000 --> 11:41.000
over this channel.

11:41.000 --> 11:45.000
In this case it was a simple AWGN channel.

11:45.000 --> 11:48.000
That's what we're also going to do later in the demo.

11:48.000 --> 11:54.000
And basically the transmitter, you see you put in some

11:54.000 --> 11:59.000
bits, or in this case a one-hot encoded vector

11:59.000 --> 12:03.000
where only one part of the vector is one everything else is zero.

12:03.000 --> 12:10.000
And this leads to getting a complex symbol X that you can transmit.

12:10.000 --> 12:13.000
Put it through this channel, get a complex symbol Y,

12:13.000 --> 12:16.000
put it through another neural network.

12:16.000 --> 12:21.000
And then you just run the optimization chain as we have seen before.

12:22.000 --> 12:26.000
And here on the right are some pictures from the publication.

12:26.000 --> 12:31.000
Basically they got QPSK, PSK and also other constellations

12:31.000 --> 12:35.000
depending on how they put in the constraints.

12:35.000 --> 12:41.000
And later this was also extended bitwise and end-to-end.

12:41.000 --> 12:45.000
So not only improving the location of the symbols,

12:45.000 --> 12:50.000
but also the labeling, so where to put all the labels.

12:50.000 --> 12:53.000
And some of the results you can see here.

12:53.000 --> 12:58.000
So you get a Gray-coded 16-QAM.

12:58.000 --> 13:05.000
And you can also do derivations or simulations to see how it changes

13:05.000 --> 13:06.000
if you change the SNR.

13:06.000 --> 13:08.000
So points are moving.

13:08.000 --> 13:10.000
So here red is very low SNR.

13:10.000 --> 13:12.000
So here you have points that are close together.

13:12.000 --> 13:15.000
So it transmits less.

13:15.000 --> 13:19.000
a smaller amount of different symbols, and here a large amount.

13:19.000 --> 13:26.000
And it was even extended to this kind of approach where you remove

13:26.000 --> 13:29.000
most parts of the transmitter and the receiver, replacing

13:29.000 --> 13:30.000
them with neural networks.

13:30.000 --> 13:34.000
And you don't even have any synchronization or equalization.

13:34.000 --> 13:38.000
But you have this kind of fully pilotless communication.

13:38.000 --> 13:43.000
Which requires, let's say, more neural networks or like deeper neural networks.

13:43.000 --> 13:46.000
But you get also constellations like this.

13:46.000 --> 13:51.000
Which, you see, are highly asymmetric, and one could already,

13:51.000 --> 13:56.000
yeah, assume from that that this is more or less used by the neural networks

13:56.000 --> 14:01.000
to perform this kind of synchronization and equalization.

14:01.000 --> 14:04.000
So we have these publications.

14:04.000 --> 14:11.000
But not necessarily every publication is giving us free and open source code.

14:11.000 --> 14:14.000
So what do we need to replicate this?

14:14.000 --> 14:19.000
So one part we need these boxes that are used in traditional systems

14:19.000 --> 14:24.000
because you're not always replacing your whole system with neural networks.

14:24.000 --> 14:28.000
So you need the classical algorithms that you can still use or you require

14:28.000 --> 14:31.000
to use depending on your own scenario.

14:31.000 --> 14:37.000
And preferably already in this kind of automatic differentiation framework.

14:37.000 --> 14:41.000
We need these kind of channel models so we can do simulations

14:41.000 --> 14:44.000
because we are not allowed to always use the real thing.

14:44.000 --> 14:47.000
And that's also good.

14:47.000 --> 14:53.000
And a lot of utility functions, also in these automatic differentiation frameworks.

14:53.000 --> 14:59.000
So we can leverage this computation of the gradient and we can optimize things.

14:59.000 --> 15:03.000
And now the question is who should write these toolboxes?

15:03.000 --> 15:05.000
Because the authors aren't always doing that.

15:05.000 --> 15:08.000
But actually the authors should do that.

15:08.000 --> 15:12.000
And in this case the authors also did.

15:12.000 --> 15:18.000
So the first toolbox I'm going to present is called Sionna.

15:18.000 --> 15:22.000
And it is developed by a research group at NVIDIA.

15:22.000 --> 15:25.000
And you can only see the first name, Hoydis.

15:25.000 --> 15:29.000
But actually, from this group came all the previous papers that I showed.

15:29.000 --> 15:33.000
Some authors from this group actually contributed to this library.

15:33.000 --> 15:39.000
So they gave back their knowledge to open source and also free software.

15:39.000 --> 15:42.000
It's Apache 2 licensed.

15:42.000 --> 15:44.000
So that's quite nice.

15:44.000 --> 15:49.000
And they use TensorFlow as a base because it was very popular at the time.

15:49.000 --> 15:54.000
And that's also how they created their research papers.

15:54.000 --> 15:58.000
Then I'm also going to show you a little bit about MOKka,

15:58.000 --> 16:02.000
which is more or less my work that I was able to do together with my colleagues.

16:02.000 --> 16:08.000
So that's where, when we develop things and write papers, we try it.

16:08.000 --> 16:12.000
Or I also try to make them contribute their code.

16:12.000 --> 16:20.000
So we can also have a growing toolbox and give back to the public good.

16:20.000 --> 16:23.000
And our toolbox is based on PyTorch.

16:23.000 --> 16:30.000
And then there's also a third toolbox that I'm just going to put on this slide here.

16:30.000 --> 16:38.000
And I also have the QR code at the end. It is developed by a group at Hong Kong Polytechnic University.

16:38.000 --> 16:40.000
And it's based on Jack.

16:40.000 --> 16:44.000
So let's say if you have some sort of preference yourself for one of these frameworks,

16:44.000 --> 16:52.000
there are already starting points or quite well developed libraries.

16:52.000 --> 16:55.000
So Sionna, what does it consist of?

16:55.000 --> 16:57.000
So what can you optimize?

16:57.000 --> 17:00.000
So they for one provided a system simulator.

17:00.000 --> 17:01.000
So it's like a higher level.

17:01.000 --> 17:04.000
So not the physical layer that they presented before,

17:04.000 --> 17:08.000
but it's like about link adaptation, power control, scheduling.

17:08.000 --> 17:12.000
And then they have a physical layer simulator.

17:12.000 --> 17:16.000
So this is all the stuff that I mentioned before.

17:16.000 --> 17:20.000
So they have the forward error correction implemented inside the framework.

17:20.000 --> 17:22.000
They have the mapping, channel models.

17:22.000 --> 17:26.000
They already have OFDM and MIMO and also 5G New Radio,

17:26.000 --> 17:28.000
like physical layer implementation.

17:28.000 --> 17:35.000
So this is a rather well developed toolbox.

17:35.000 --> 17:41.000
And since I think last year, they published this.

17:41.000 --> 17:44.000
They have also a ray tracing and channel emulator.

17:44.000 --> 17:49.000
So where you can define a 3D model of the landscape you want to simulate.

17:49.000 --> 17:54.000
And you can run a ray tracer to actually get the channel impulse responses

17:54.000 --> 18:00.000
for also moving targets and different antenna patterns,

18:00.000 --> 18:03.000
different antenna arrays.

18:03.000 --> 18:04.000
Yeah.

18:04.000 --> 18:11.000
And it's more or less electromagnetically accurate channel modeling.

18:11.000 --> 18:13.000
And how can you use it?

18:13.000 --> 18:16.000
Well, you can just run pip install sionna,

18:16.000 --> 18:21.000
or uv add sionna, depending, but it's public on PyPI.

18:21.000 --> 18:25.000
And quite easy to get started.

18:25.000 --> 18:29.000
So for MOKka, for our library, we basically have the same stuff.

18:29.000 --> 18:32.000
We have mappers, synchronization, equalization,

18:32.000 --> 18:36.000
discrete channel models, fiber optical channel model, and utilities.

18:36.000 --> 18:39.000
So it's nothing really different.

18:39.000 --> 18:42.000
We don't have this extensive wireless channel model.

18:42.000 --> 18:47.000
And also we're missing this forward error correction implemented inside this framework

18:47.000 --> 18:50.000
because we have smaller team.

18:50.000 --> 18:57.000
But it's also available on PyPI and you can install it simply by running this command.

18:57.000 --> 19:02.000
And yeah, this is more or less some of the results that we got for

19:02.000 --> 19:08.000
let's say special type of channel and DSP algorithms that we optimize

19:08.000 --> 19:11.000
constellations for.

19:11.000 --> 19:15.000
But that's something I won't go into here.

19:15.000 --> 19:17.000
Okay, so we do a short tutorial.

19:17.000 --> 19:21.000
So you had this big graph with the block diagram with all of the things.

19:21.000 --> 19:24.000
And we reduce this to this kind of block diagram.

19:24.000 --> 19:27.000
So we have some data source, the transmitter.

19:27.000 --> 19:34.000
We just use a symbol mapper, have an AWGN channel, and a neural demapper.

19:34.000 --> 19:39.000
So it's a bare-bones system just to show you the capabilities.

19:39.000 --> 19:46.000
And we use this kind of binary cross entropy to find the constellation and the labels.

19:46.000 --> 19:48.000
And how do we do this?

19:48.000 --> 19:54.000
So this is just to give you an overview that it's technically not that difficult.

19:54.000 --> 19:57.000
So you need to do a bunch of imports.

19:57.000 --> 20:03.000
You can define some variables that we use.

20:04.000 --> 20:09.000
You create all these blocks that are available from Sionna.

20:09.000 --> 20:11.000
So you have like a binary source.

20:11.000 --> 20:16.000
You already have these constellations available, like in the source code.

20:16.000 --> 20:22.000
And in this case, an AWGN channel, a neural demapper, and this loss.

20:22.000 --> 20:32.000
And I mean, the way that you define this kind of channel also means you could easily swap in different channel definitions that are available.

20:32.000 --> 20:37.000
Some of them have preconditions, like you have to increase sampling rate, to pulse shaping or anything.

20:37.000 --> 20:43.000
But let's say in the easy case, you can simply swap in a different channel in a simulation.

20:43.000 --> 20:47.000
And then, how do we perform our end-to-end simulation?

20:47.000 --> 20:59.000
Well, we create bits or sample bits, map them to symbols, send them to a channel, get some LLRs, compute the binary cross entropy.

20:59.000 --> 21:02.000
And that's it.

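Those five steps (bits, symbols, channel, LLRs, loss) can also be sketched framework-free, without assuming any toolbox API; here with Gray-mapped QPSK, exact AWGN LLRs, and an illustrative noise variance:

```python
import numpy as np

# Bits -> Gray-mapped QPSK -> AWGN -> per-bit LLRs -> binary cross entropy.
rng = np.random.default_rng(3)
n_sym, sigma2 = 2000, 0.1                  # symbols, noise variance per symbol

bits = rng.integers(0, 2, size=(n_sym, 2))               # 2 bits per symbol
x = ((1 - 2 * bits[:, 0]) + 1j * (1 - 2 * bits[:, 1])) / np.sqrt(2)
noise = rng.normal(scale=np.sqrt(sigma2 / 2), size=(n_sym, 2))
y = x + noise[:, 0] + 1j * noise[:, 1]

# For Gray-mapped QPSK over AWGN the two bits decouple into I and Q.
# Convention assumed: llr = log P(b=1)/P(b=0); b=1 maps to amplitude -1/sqrt(2).
llrs = np.stack([-2 * np.sqrt(2) * y.real / sigma2,
                 -2 * np.sqrt(2) * y.imag / sigma2], axis=1)

bce = float(np.mean(np.logaddexp(0.0, (1 - 2 * bits) * llrs)))
ber = float(np.mean((llrs > 0).astype(int) != bits))     # hard decisions
```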
21:02.000 --> 21:06.000
And then, if you want to then do this kind of optimization step.

21:06.000 --> 21:12.000
We have to wrap this in some sort of model, which is defined in TensorFlow.

21:12.000 --> 21:23.000
Get the weights, the trainable weights, get the gradients of the loss with respect to the weights and apply these step.

21:23.000 --> 21:28.000
So it's also not wildly complicated.

21:28.000 --> 21:40.000
And so I created, or I have, a notebook running that I'm going to show right now. The left QR code goes directly to the Sionna documentation page.

21:40.000 --> 21:45.000
And they have, like, I don't know how many notebooks and examples, but it's a lot.

21:45.000 --> 21:55.000
And on the right is the code for the notebook that is running, because I took a lot of things out of there to make it a bit more simple.

21:55.000 --> 21:59.000
So let's see.

21:59.000 --> 22:03.000
So here is basically the code that we had before.

22:03.000 --> 22:09.000
And I already executed it all the way to this kind of plot, where we have more or less the transmit constellation.

22:09.000 --> 22:12.000
So we start with 64-QAM.

22:12.000 --> 22:14.000
And here right now we have nothing.

22:14.000 --> 22:20.000
And then I'm going to just start this training.

22:21.000 --> 22:26.000
And you basically see, so I think this SNR was selected to be like 10 dB.

22:26.000 --> 22:32.000
So 10 dB is not really suitable to transmit 64-QAM over.

22:32.000 --> 22:45.000
So more or less our machine learning system adapts the constellation to something where it puts some points closer together.

22:45.000 --> 22:56.000
And we sometimes say it sacrifices a bit, because these only differ in one bit of the label mapping.

22:56.000 --> 23:03.000
And this runs now a little bit.

23:03.000 --> 23:11.000
We can continue with the slides, because we can do the same with MOKka.

23:11.000 --> 23:17.000
So this is more or less similar definitions with the defined transmission.

23:17.000 --> 23:19.000
We create our blocks.

23:19.000 --> 23:21.000
So it's all quite modular.

23:21.000 --> 23:23.000
So that's what is also our goal.

23:23.000 --> 23:30.000
So we can create these sort of block diagrams or the mental model of a block diagram.

23:30.000 --> 23:33.000
So we can connect them all together.

23:33.000 --> 23:38.000
And also more or less run the simulation.

23:38.000 --> 23:46.000
So I have bits, so generate bits, map them to symbols, send them through the channel, get LLRs.

23:46.000 --> 23:55.000
And then in our case, or in the PyTorch way, it's a bit simpler to then do the backpropagation.

23:55.000 --> 24:00.000
So we have this loss and we calculate the gradient all the way to the back.

24:00.000 --> 24:05.000
And then we will make the optimizer do a step.

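That backward-then-step pattern looks like this in PyTorch; the tiny demapper network, noise level, and iteration count here are illustrative choices of mine, not the toolbox's actual code:

```python
import torch

# The loss.backward() / optimizer.step() pattern, end to end: a tiny
# neural demapper learns per-bit LLRs for Gray-mapped QPSK over AWGN.
torch.manual_seed(0)

demapper = torch.nn.Sequential(
    torch.nn.Linear(2, 16), torch.nn.ReLU(), torch.nn.Linear(16, 2)
)
optimizer = torch.optim.Adam(demapper.parameters(), lr=1e-2)
bce = torch.nn.BCEWithLogitsLoss()          # binary cross entropy on logits

losses = []
for _ in range(300):
    bits = torch.randint(0, 2, (256, 2)).float()
    x = (1 - 2 * bits) / 2 ** 0.5           # QPSK as two real dimensions
    y = x + 0.3 * torch.randn_like(x)       # AWGN channel
    llrs = demapper(y)                      # soft bit estimates (logits)
    loss = bce(llrs, bits)
    optimizer.zero_grad()
    loss.backward()                         # gradient all the way back
    optimizer.step()                        # the optimization step
    losses.append(loss.item())
```

The loss falls from roughly log(2) towards the channel's conditional bit entropy as the demapper adapts.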
24:06.000 --> 24:09.000
And for that, I have a different simulation.

24:09.000 --> 24:13.000
But we can see, so this has now finished simulating.

24:13.000 --> 24:20.000
So we have this kind of weirdly looking constellation, which has some points closer together and some further apart.

24:20.000 --> 24:25.000
And now we can actually continue in this block.

24:25.000 --> 24:27.000
So that's the nice thing about Sionna.

24:27.000 --> 24:29.000
They have this error correction.

24:29.000 --> 24:34.000
So we can now create a simulation and run BER.

24:34.000 --> 24:41.000
So if this is not acting up, we just define this and run it.

24:41.000 --> 24:49.000
And so now basically, I can have a live plot of the bit error simulation curve for this kind of

24:49.000 --> 24:54.000
constellation and demapper chain that we created.

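A bit error rate simulation at its core is just Monte-Carlo counting; a framework-free sketch for BPSK over AWGN, checked against the closed form, with an illustrative Eb/N0:

```python
import math
import numpy as np

# Monte-Carlo bit error rate for BPSK over AWGN, compared against the
# closed form Pb = 0.5 * erfc(sqrt(Eb/N0)).
rng = np.random.default_rng(4)
ebn0_db = 4.0
ebn0 = 10 ** (ebn0_db / 10)
n_bits = 200_000

bits = rng.integers(0, 2, n_bits)
x = 1.0 - 2.0 * bits                               # BPSK symbols, Eb = 1
y = x + rng.normal(scale=np.sqrt(1 / (2 * ebn0)), size=n_bits)
ber = float(np.mean((y < 0).astype(int) != bits))  # count hard-decision errors

ber_theory = 0.5 * math.erfc(math.sqrt(ebn0))
```

Sweeping ebn0_db over a range gives exactly the kind of BER curve shown in the live plot.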
24:54.000 --> 24:59.000
So this notebook is available in the link.

24:59.000 --> 25:04.000
And the slides are also available online on the first image already.

25:04.000 --> 25:12.000
And so for that, for MOKka, I've created a different demo.

25:12.000 --> 25:18.000
It's not a Jupyter notebook, but it's a standalone application where we have like a nice GUI.

25:18.000 --> 25:28.000
And it's a bit more responsive or, yeah, a bit faster.

25:28.000 --> 25:33.000
More or less the graphical interface; that doesn't mean that the algorithm is faster.

25:33.000 --> 25:36.000
It just looks nicer.

25:36.000 --> 25:44.000
And we can basically, yeah, here change the SNR live.

25:44.000 --> 25:51.000
And you can basically already see in this demo how this has an impact on our receive system.

25:51.000 --> 25:57.000
So this is more or less again a mapper, and this is what the receiver sees.

25:57.000 --> 26:07.000
So it's quite low noise, but if you now increase the noise a lot.

26:07.000 --> 26:14.000
So we basically see it's going to have to change the constellation again.

26:14.000 --> 26:17.000
And yeah, here are some other different channel models.

26:17.000 --> 26:24.000
So this is like a small demo, but you can actually do research with this demo.

26:24.000 --> 26:30.000
And yeah, you can find this with this link or with this QR code.

26:30.000 --> 26:35.000
And yeah, this is the short-ish demo.

26:35.000 --> 26:42.000
So you can download or check out the GitHub repositories for all of them, or download them from PyPI.

26:42.000 --> 26:49.000
And also use it for your own optimization, or you can create your own RF waveform.

26:49.000 --> 26:59.000
And then if you're a licensed amateur radio operator, you could transmit it through whatever means you want.

26:59.000 --> 27:10.000
That's about it. And thank you for your attention.

27:11.000 --> 27:13.000
Yeah.

27:13.000 --> 27:14.000
It was very interesting.

27:14.000 --> 27:18.000
My question is, you're simulating the fiber-optic channel?

27:18.000 --> 27:19.000
Yes.

27:19.000 --> 27:27.000
What are the physical parameters of the fiber-optic channel?

27:27.000 --> 27:34.000
So the question is what are the physical parameters or characteristics to put into the fiber-optic transmission?

27:34.000 --> 27:41.000
So for the optical fiber, similar to wireless communication it also has noise.

27:41.000 --> 27:44.000
But you also have some optical effects.

27:44.000 --> 27:52.000
So there's, for example, the Kerr effect, which more or less means that if you have an increased amplitude,

27:52.000 --> 27:55.000
You have a phase shift that is proportional to this amplitude.

27:55.000 --> 28:01.000
So you're more or less at the transmitter, for example, quite limited in the transmit power.

28:02.000 --> 28:07.000
But also optical systems experience a little bit higher phase noise.

28:07.000 --> 28:10.000
Relative to the symbol rates.

28:10.000 --> 28:17.000
So low phase noise systems are like 15 kilohertz linewidth.

28:17.000 --> 28:24.000
But they are like 32 gigabaud or 60 gigabaud transmit rates.

28:25.000 --> 28:28.000
So these are the, so this is the main effect.

28:28.000 --> 28:33.000
So the Kerr effect, and then you have to do nonlinear fiber simulation.

28:33.000 --> 28:38.000
You need to do this kind of split-step Fourier simulation.

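The Kerr effect just described, a phase rotation proportional to the instantaneous power, is one line to sketch; gamma and the sample values are illustrative, and a full fiber simulation would alternate this nonlinear step with a linear dispersive step (that's the split-step idea):

```python
import numpy as np

# Memoryless sketch of the Kerr effect: a nonlinear phase rotation
# proportional to the instantaneous power |x|^2. In the split-step
# Fourier method this nonlinear step alternates with a dispersive step.
gamma = 0.5                                   # nonlinear coefficient (illustrative)
x = np.array([0.5 + 0.5j, 1.0 + 0.0j, 0.0 + 1.5j])
y = x * np.exp(1j * gamma * np.abs(x) ** 2)   # phase shift grows with power
```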
28:38.000 --> 28:41.000
But then there's also chromatic dispersion that I forgot to mention.

28:41.000 --> 28:46.000
So basically the light on different wavelengths is traveling at different speeds.

28:46.000 --> 28:48.000
So you have this kind of

28:49.000 --> 28:51.000
linear group delay.

28:52.000 --> 28:54.000
Yeah, group delay dispersion.
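
Both effects, chromatic dispersion and the Kerr nonlinearity, are exactly what the split-step Fourier method alternates between: a linear dispersion step applied in the frequency domain and a nonlinear phase step applied in the time domain. A minimal sketch, with all fiber parameters as illustrative assumptions:

```python
import numpy as np

# Split-step Fourier sketch: alternate dispersion (frequency domain) and
# Kerr nonlinearity (time domain). Parameter values are assumptions.
n = 1024
t = np.linspace(-50e-12, 50e-12, n, endpoint=False)  # time grid, s
dt = t[1] - t[0]
freq = np.fft.fftfreq(n, dt)
a = np.exp(-(t / 10e-12) ** 2).astype(complex)       # Gaussian input pulse
e0 = np.sum(np.abs(a) ** 2)                          # input energy

beta2 = -21e-27    # group-velocity dispersion, s^2/m (assumed)
gamma = 1.3e-3     # Kerr coefficient, 1/(W*m) (assumed)
dz = 100.0         # step length: ~100 m steps, as mentioned in the talk
n_steps = 1000     # i.e. 100 km of fiber

lin = np.exp(0.5j * beta2 * (2 * np.pi * freq) ** 2 * dz)
for _ in range(n_steps):
    a = np.fft.ifft(np.fft.fft(a) * lin)              # dispersion step
    a = a * np.exp(1j * gamma * np.abs(a) ** 2 * dz)  # Kerr phase step
```

Both steps are pure phase rotations, so in this lossless sketch the pulse energy is preserved while the pulse shape disperses and acquires nonlinear phase.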

29:01.000 --> 29:05.000
So the question is if it's suitable to adapt this to multipath channels.

29:05.000 --> 29:06.000
And yes, of course.

29:06.000 --> 29:11.000
So Sionna, for example, has this multipath 3GPP model,

29:11.000 --> 29:15.000
where you can create channels with multipath.
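
This is not the actual Sionna/3GPP API, but a generic tapped-delay-line sketch of what such a multipath channel model computes; the path delays and power profile here are assumptions.

```python
import numpy as np

# Tapped-delay-line multipath sketch: each path has a delay (in samples)
# and a Rayleigh-faded complex gain. Delays/powers are assumed values.
rng = np.random.default_rng(42)

delays = np.array([0, 3, 7])              # path delays in samples (assumed)
gains_db = np.array([0.0, -3.0, -9.0])    # average path powers (assumed)
gains = 10 ** (gains_db / 20)

# Rayleigh-fade each tap, then build the channel impulse response.
taps = gains * (rng.standard_normal(3) + 1j * rng.standard_normal(3)) / np.sqrt(2)
h = np.zeros(delays.max() + 1, dtype=complex)
h[delays] = taps

x = rng.choice([1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j], size=256) / np.sqrt(2)  # QPSK
y = np.convolve(x, h)                     # received signal with multipath
```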

29:16.000 --> 29:18.000
Let me show this.

29:25.000 --> 29:27.000
So this work here.

29:27.000 --> 29:31.000
So this was done for OFDM over these kinds of multipath channels.

29:31.000 --> 29:34.000
So yeah, this paper at the bottom.

29:34.000 --> 29:37.000
So I put these IEEE papers, but

29:37.000 --> 29:40.000
I think all of them are available on arXiv as well.

29:40.000 --> 29:43.000
That mishap happened to myself for this conference, but yeah.

29:43.000 --> 29:49.000
So in this "Trimming the Fat from OFDM" paper, basically, on each of the OFDM transmitter and receiver,

29:49.000 --> 29:53.000
they put these kinds of modulators or mappers.

29:53.000 --> 29:56.000
And they removed all of the pilot processing.

29:56.000 --> 30:01.000
So this is all done inside the neural networks.

30:01.000 --> 30:06.000
And they used these wireless channels for this,

30:06.000 --> 30:08.000
so with multipath.

30:08.000 --> 30:12.000
Can you also use this to flip the problem

30:12.000 --> 30:16.000
and instead try to optimize the channel

30:16.000 --> 30:18.000
simulation?

30:18.000 --> 30:19.000
Yes.

30:19.000 --> 30:21.000
So that is also done.

30:21.000 --> 30:23.000
That is one of the,

30:23.000 --> 30:27.000
also one of the research areas that people work on a lot.

30:27.000 --> 30:29.000
So they use this kind of,

30:29.000 --> 30:36.000
yeah, optimization to find channel models that are difficult to model analytically.

30:36.000 --> 30:39.000
So they more or less have a lot of measurement data.

30:39.000 --> 30:44.000
And now you create, for example,

30:44.000 --> 30:45.000
what's it called?

30:45.000 --> 30:48.000
Yeah, some...

30:48.000 --> 30:52.000
I just forgot the name of this type of network again.

30:52.000 --> 30:56.000
The generative adversarial networks?

30:56.000 --> 30:58.000
Generative adversarial networks.

30:58.000 --> 31:01.000
Yes, where you can basically then sample,

31:01.000 --> 31:05.000
let's say, channels that are similar to what they have seen before.
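
As a deliberately minimal illustration of that idea, here is a toy GAN whose linear generator learns to mimic scalar "channel gain" measurements. Everything here (the data distribution, architecture, learning rates) is an assumption for illustration; real channel GANs use deep networks trained on measured channel responses.

```python
import numpy as np

# Toy GAN: linear generator G(z) = a*z + b vs. logistic discriminator
# D(x) = sigmoid(w*x + c), with hand-derived gradients. Illustrative only.
rng = np.random.default_rng(0)
mu, sigma = 2.0, 0.5          # statistics of the "measured" data (assumed)

a, b = 1.0, 0.0               # generator parameters
w, c = 0.1, 0.0               # discriminator parameters
sig = lambda v: 1.0 / (1.0 + np.exp(-v))
lr, batch = 0.05, 128

for _ in range(2000):
    real = rng.normal(mu, sigma, batch)
    z = rng.standard_normal(batch)
    fake = a * z + b

    # Discriminator ascent on E[log D(real)] + E[log(1 - D(fake))].
    d_real, d_fake = sig(w * real + c), sig(w * fake + c)
    gw = np.mean((1 - d_real) * real) - np.mean(d_fake * fake)
    gc = np.mean(1 - d_real) - np.mean(d_fake)
    w, c = w + lr * gw, c + lr * gc

    # Generator ascent on E[log D(fake)] (non-saturating loss).
    d_fake = sig(w * fake + c)
    gx = (1 - d_fake) * w         # d log D(fake) / d fake
    a, b = a + lr * np.mean(gx * z), b + lr * np.mean(gx)

# Sample "new channels" from the trained generator; over training, their
# statistics should drift toward those of the measured data.
samples = a * rng.standard_normal(4096) + b
```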

31:05.000 --> 31:10.000
So yeah, but also for these kinds of systems and multipath channels,

31:10.000 --> 31:15.000
this all relies a little bit on you needing to have the data.

31:15.000 --> 31:20.000
So either you have an analytical model, or you have enough measurement data to cover,

31:20.000 --> 31:29.000
let's say, the space of possibilities.

31:29.000 --> 31:32.000
Hey, I'm curious, in the beginning

31:32.000 --> 31:35.000
you had this graph,

31:35.000 --> 31:41.000
and for me, equalization and synchronization were flipped in order.

31:41.000 --> 31:45.000
Is there a specific reason, or is it just...?

31:45.000 --> 31:50.000
I mean, with synchronization and equalization,

31:50.000 --> 31:54.000
depending on your specific architecture,

31:54.000 --> 31:58.000
you can put them in different orders as well, or parts of equalization

31:58.000 --> 32:03.000
and parts of synchronization. So yeah, this was just to give an idea.

32:03.000 --> 32:10.000
But yeah, this depends on your specific equalizer and synchronization algorithms,

32:10.000 --> 32:15.000
where you might have to put them in a different order.

32:15.000 --> 32:21.000
And are there any limits to the channel simulation?

32:21.000 --> 32:26.000
Or what are the parameters you can use?

32:26.000 --> 32:32.000
So the question is if there are any limits for the channel simulation, or what the parameters are.

32:32.000 --> 32:35.000
And yes, so if you.

32:35.000 --> 32:39.000
So let's say, I did this now on this MacBook because

32:39.000 --> 32:44.000
it was a simple simulation. But if you want to do, for example, these kinds of fiber simulations,

32:44.000 --> 32:52.000
then typically, to get a good result, you need to perform steps of a length of

32:52.000 --> 32:55.000
100 meters.

32:55.000 --> 33:00.000
If you have like a few hundred kilometers of fiber, that's a lot of computation steps.

33:00.000 --> 33:04.000
And in every step, for every calculation,

33:04.000 --> 33:08.000
you have to save the gradients. So that becomes a memory issue as well.

33:08.000 --> 33:13.000
So, for example, we have big GPUs with like 20 or 40 gigabytes of memory,

33:13.000 --> 33:17.000
and even they are not able to process everything in one step.

33:17.000 --> 33:22.000
But this is more or less the limit. And then also the batch size:

33:22.000 --> 33:31.000
if you have a lot of samples, of course, you also need to save all of these numerically computed gradients in order to perform these optimization steps.
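
A back-of-the-envelope sketch of that memory pressure; every number here is an assumed example, not the speaker's figure:

```python
# Memory estimate for backpropagating through a split-step fiber
# simulation: every step's activations must be kept for the backward pass.
span_m = 300e3            # 300 km of fiber (assumed)
step_m = 100.0            # ~100 m steps, as mentioned in the talk
n_steps = int(span_m / step_m)

batch = 64                # optimization batch size (assumed)
n_samples = 16384         # time-domain samples per signal (assumed)
bytes_per = 8             # complex64 per saved activation

total_gb = n_steps * batch * n_samples * bytes_per / 1e9
print(round(total_gb, 1))  # prints 25.2, i.e. more than a 20 GB GPU holds
```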

33:31.000 --> 33:34.000
So yeah, this is.

33:34.000 --> 33:40.000
So for your demo, you showed that

33:40.000 --> 33:43.000
you were able to simulate the...

33:43.000 --> 33:45.000
Yeah, it's still running.

33:45.000 --> 33:50.000
So how realistic is that? How much do you need, or like...

33:50.000 --> 33:53.000
Obviously it doesn't take, for example, a 6G...

33:53.000 --> 33:56.000
No, like a 16G or 20G channel.

33:56.000 --> 33:57.000
How.

33:58.000 --> 34:01.000
What would a realistic transfer rate be there?

34:01.000 --> 34:04.000
You mean now for the training, or for the inference?

34:04.000 --> 34:07.000
Like in interaction with students.

34:07.000 --> 34:10.000
I mean, this depends on your...

34:10.000 --> 34:16.000
So the question is, what are the possible transfer rates using neural network

34:16.000 --> 34:19.000
transmitters and receivers on hardware.

34:19.000 --> 34:21.000
That's highly dependent.

34:21.000 --> 34:26.000
So, for example, let's say you create a transmitter that you trained.

34:26.000 --> 34:31.000
You wouldn't use any neural network weights most likely in a transmitter.

34:31.000 --> 34:36.000
You would just extract the constellation and put this into your hardware.

34:36.000 --> 34:42.000
And for the receiver, you can, for example, use a neural network.

34:42.000 --> 34:43.000
You need to,

34:43.000 --> 34:48.000
Yeah, see how well it can be adapted to the hardware you need.

34:48.000 --> 34:56.000
But from what I've seen for these kinds of simple cases, the neural networks are quite thin, like three layers.

34:56.000 --> 34:59.000
And these are, yeah, not a lot of computation.

34:59.000 --> 35:01.000
So you can even make the case,

35:01.000 --> 35:04.000
for example, if you compare it to like a maximum likelihood receiver,

35:04.000 --> 35:09.000
where you compute the distance to every possible point, that you can be cheaper with a neural network,

35:09.000 --> 35:14.000
because it's doing this intrinsically in a different way.
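
For reference, here is a minimal brute-force minimum-distance demapper of the kind being compared against: one distance per received symbol per constellation point. The 64-QAM constellation and the noise level are assumptions for illustration.

```python
import numpy as np

# Brute-force maximum-likelihood (minimum-distance) demapper sketch.
rng = np.random.default_rng(0)

levels = np.arange(-7, 8, 2)
const = np.array([i + 1j * q for i in levels for q in levels])
const = const / np.sqrt(np.mean(np.abs(const) ** 2))   # unit-power 64-QAM

tx_idx = rng.integers(0, const.size, 1000)
rx = const[tx_idx] + 0.05 * (rng.standard_normal(1000)
                             + 1j * rng.standard_normal(1000))

# 64 squared distances per received symbol, pick the closest point.
dist = np.abs(rx[:, None] - const[None, :]) ** 2
ml_idx = dist.argmin(axis=1)
ser = np.mean(ml_idx != tx_idx)                        # symbol error rate
```

A small three-layer network replaces this exhaustive distance computation with a fixed number of matrix products, which is where the cost comparison comes from.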

