WEBVTT

00:00.000 --> 00:15.000
Okay, can you hear me?

00:15.000 --> 00:17.000
Okay, thank you.

00:17.000 --> 00:22.000
So hello, I'm Pavel Machek and I'm here to talk about phones and cameras.

00:22.000 --> 00:27.000
First, something about me: you have an email address up there, you have my

00:27.000 --> 00:28.000
Fediverse account.

00:28.000 --> 00:38.000
When I'm not hacking phones, I'm usually playing with horses, and I have a good smartwatch, which

00:38.000 --> 00:40.000
shows time, which is not usual for a watch.

00:40.000 --> 00:44.000
So if you want a watch that shows time, then you can take a look.

00:44.000 --> 00:45.000
I'm playing with phones.

00:45.000 --> 00:50.000
I like small ones, so this one actually runs postmarketOS and I'm trying to use

00:50.000 --> 00:54.000
it as a daily driver that will bite me, but whatever.

00:54.000 --> 01:01.000
So I can not hit him when I try to make money, and that's about it, I guess.

01:01.000 --> 01:08.000
So phones usually have dumb sensors, which is a good thing.

01:08.000 --> 01:15.000
Smart sensors are in a PC, but we won't talk about this, because we don't have too much time.

01:15.000 --> 01:21.000
And we usually don't have a useful ISP (image signal processor), because we don't have drivers.

01:21.000 --> 01:25.000
Usually you do have an ISP, but we can't use it.

01:25.000 --> 01:31.000
And a camera is basically mandatory for all users; everybody needs a camera.

01:31.000 --> 01:33.000
So what do we need to do?

01:33.000 --> 01:38.000
First thing we need to do is auto exposure, because if you don't do auto exposure, you have

01:38.000 --> 01:42.000
a completely black frame or a completely white frame, not useful.

01:42.000 --> 01:44.000
Auto exposure is fortunately quite easy.

01:44.000 --> 01:50.000
You can get something quite simple, like if it's too bright, you just turn down the exposure time,

01:50.000 --> 01:53.000
and you will get something working.
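
NOTE
A minimal sketch of the simple rule described in this cue, assuming the application can measure the mean frame brightness (scaled to 0..1) and can set the sensor exposure time. The function name, target level and limits are hypothetical and not taken from the talk.
```python
def next_exposure(exposure_us, mean_level, target=0.5, min_us=100.0, max_us=100000.0):
    """Return the exposure time (microseconds) to use for the next frame."""
    # A frame brighter than the target gives ratio < 1, so the time goes down.
    ratio = target / max(mean_level, 0.01)
    # Limit each step to a factor of two so one odd frame cannot swing it too far.
    ratio = min(max(ratio, 0.5), 2.0)
    # Keep the result inside the (assumed) range the sensor accepts.
    return min(max(exposure_us * ratio, min_us), max_us)
```
With these assumed defaults, a fully white frame at 10000 microseconds halves the time to 5000, and a fully black frame doubles it.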

01:53.000 --> 02:01.000
Auto focus is harder, because most cameras need auto focus, and auto focus is actually

02:01.000 --> 02:04.000
very tight and it's quite hard to do well.

02:04.000 --> 02:08.000
And I'm currently not aware of any good auto focus implementation.

02:08.000 --> 02:16.000
I tried to write a few; maybe you can try them and take a picture, so that needs more work.

02:16.000 --> 02:19.000
Next stuff.

02:19.000 --> 02:23.000
So camera vendors like to lie to you a lot.

02:23.000 --> 02:28.000
So if they speak about pixels, they mean subpixels basically.

02:28.000 --> 02:33.000
And if they speak about bits, they mean something else than what a display means.

02:33.000 --> 02:39.000
So you need conversion between this, and this is called debayering, and then

02:39.000 --> 02:45.000
for color, you multiply with a color matrix and do a bunch of other steps.
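
NOTE
A minimal sketch of the color matrix step mentioned in this cue, assuming 8-bit RGB values and a 3x3 matrix given as three rows. The function name is hypothetical, and real matrix values come from sensor calibration, which the talk does not give.
```python
def apply_color_matrix(rgb, matrix):
    """Multiply one (r, g, b) pixel by a 3x3 matrix and clamp to 0..255."""
    out = []
    for row in matrix:
        # Each output channel is a weighted sum of the three input channels.
        value = row[0] * rgb[0] + row[1] * rgb[1] + row[2] * rgb[2]
        out.append(min(255, max(0, round(value))))
    return tuple(out)
```
An identity matrix leaves the pixel unchanged; doing this for every pixel of every frame is part of why the pipeline is computationally heavy.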

02:45.000 --> 02:50.000
I won't go into too much detail there, but it's computationally quite intensive.

02:50.000 --> 02:57.000
And then there are some optional steps, like lens shading compensation,

02:57.000 --> 03:03.000
because the camera optics are quite bad too in phones.

03:03.000 --> 03:07.000
And if you compensate for that badness, you get a better picture.

03:07.000 --> 03:11.000
Then if you want, you can do denoising, sensor, yeah.

03:11.000 --> 03:18.000
If you had like 100 bad pixels on your screen, you would be mad.

03:18.000 --> 03:23.000
If you have 100 bad pixels on your phone sensor, that's pretty much normal.

03:23.000 --> 03:31.000
So basically they are all bad, there are just different levels of bad, and software is supposed to deal with that.

03:31.000 --> 03:36.000
Then you need to do, well, YUV conversion.

03:36.000 --> 03:43.000
Basically RGB is bad for JPEG and for video encoding, movie encoding.

03:43.000 --> 03:48.000
So you want to get to another color space, and then you need to do encoding.
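
NOTE
A minimal sketch of an RGB to YUV conversion for one pixel, using the full-range BT.601 coefficients that JPEG (JFIF) files use. Which exact YUV variant the speaker's code produces is not stated in the talk, so treat the choice of coefficients and the function name as assumptions.
```python
def rgb_to_ycbcr(r, g, b):
    """Convert one 8-bit RGB pixel to full-range BT.601 (Y, Cb, Cr)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    # Round and clamp each component back into the 8-bit range.
    return tuple(min(255, max(0, round(v))) for v in (y, cb, cr))
```
Gray pixels map to Cb = Cr = 128, which is one reason encoders prefer this space: brightness and color end up in separate channels.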

03:48.000 --> 03:55.000
You often have hardware blocks to do that, but you don't have drivers, so you have to do it on the CPU.

03:55.000 --> 03:57.000
Working design.

03:58.000 --> 04:03.000
If you want something working, you want to keep the sensor in mid to high resolution,

04:03.000 --> 04:12.000
because that's what enables you to do phase detection autofocus, for example,

04:12.000 --> 04:14.000
and you get all the data.

04:14.000 --> 04:21.000
In the mid resolution, you can still get a good enough frame rate, and you can debayer and downscale on the GPU.

04:21.000 --> 04:24.000
The GPU is fast enough to do this.

04:25.000 --> 04:33.000
That's an advantage because, if you want to take a photo, the preview is already running.

04:33.000 --> 04:39.000
The GPU is doing conversion from the raw data to your display,

04:39.000 --> 04:45.000
but if you want to take a photo, you just copy the buffer, and then process this later.

04:45.000 --> 04:47.000
So you just save raw.

04:47.000 --> 04:57.000
For movies, you really want your GPU to convert to a YUV format, and you can pass that to GStreamer.

04:57.000 --> 05:02.000
GStreamer is great, quite fast, actually.

05:02.000 --> 05:11.000
It seems to be faster than the debayering itself, and with this, you will be able to get, like,

05:11.000 --> 05:21.000
0.8 megapixels at 30 FPS from the Librem 5, and the Librem 5 is basically the worst case phone.

05:21.000 --> 05:23.000
Your phone probably is better than that.

05:23.000 --> 05:34.000
The Librem 5 is a brick; it's a well-supported brick, but it's slow. But if we design for that one, everything else will probably just work too.

05:34.000 --> 05:40.000
So, I did an implementation and here it is; I call it click machine.

05:40.000 --> 05:46.000
I'm not the only one who worked on that, I just basically asked for help on social media.

05:46.000 --> 05:50.000
Hey, I need to do this, it should be easy, but it's not easy for me.

05:50.000 --> 05:57.000
Help me, and people did, and the robots did, and you can see the results on GitLab,

05:57.000 --> 06:03.000
so you should be able to search for that; it's part of the Tui project.

06:03.000 --> 06:12.000
I added phase detection autofocus, which is nice for robustness and so on; it needs more tuning.

06:12.000 --> 06:21.000
Design of click machine: if you have seen the project, you would know click machine is movies first,

06:21.000 --> 06:32.000
and movies first is actually a good design, because still images are easier and so on, so that's what we did.

06:32.000 --> 06:36.000
It's quite Librem 5 specific, unfortunately.

06:36.000 --> 06:46.000
So, big thanks go to dos, mentioned here; he's a Purism employee, and he helped with a lot of the work,

06:46.000 --> 06:54.000
but in the end, the thing is Librem 5 specific; but it's a good demonstration that this is possible.

06:54.000 --> 07:01.000
I tried to port it to something else, I wasn't too successful, if I knew more GPU programming,

07:01.000 --> 07:04.000
I would probably be able to do it; help is welcome.

07:04.000 --> 07:10.000
It's 8-bit specific, because the Librem 5 currently only has 8-bit drivers for the camera.

07:10.000 --> 07:16.000
You basically want 10 or 12 bits for good images.

07:16.000 --> 07:25.000
There are patches for that, but they hit the kernel too, so there's more work to be done.

07:25.000 --> 07:30.000
For signal processing, we do debayering and downscaling at the same time, which is great,

07:30.000 --> 07:34.000
because it means that we have better colors, because we don't have to interpolate,

07:34.000 --> 07:42.000
and it actually makes stuff easier, because we can take the photos at the same time and so on, so that's a good thing.
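
NOTE
A minimal sketch of combined debayering and downscaling, assuming an RGGB Bayer layout and a flat list of 8-bit sensor values. This is plain Python for illustration only, while the talk describes doing this step on the GPU; the function name is hypothetical. Each 2x2 sensor block becomes one RGB pixel, so no interpolation between neighbours is needed.
```python
def debayer_downscale(raw, width, height):
    """Turn RGGB Bayer data into rows of (r, g, b) at half the resolution."""
    out = []
    for y in range(0, height - 1, 2):
        row = []
        for x in range(0, width - 1, 2):
            r = raw[y * width + x]             # top-left of the block
            g1 = raw[y * width + x + 1]        # top-right
            g2 = raw[(y + 1) * width + x]      # bottom-left
            b = raw[(y + 1) * width + x + 1]   # bottom-right
            # Average the two green samples; red and blue are used directly.
            row.append((r, (g1 + g2) // 2, b))
        out.append(row)
    return out
```
A 4x2 input therefore yields a single output row with two RGB pixels.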

07:42.000 --> 07:52.000
It's like a few hundred lines of code; it's not like crazy complex, and it works, so I'm happy with that.

07:52.000 --> 08:03.000
It can do 30 FPS at, like, 0.8 megapixels, so video recording with sound via GStreamer.

08:03.000 --> 08:06.000
So, that's something I believe is good.

08:06.000 --> 08:19.000
One sad note: the Librem 5 has a hardware bug, and it's known that if you overload the memory while the camera is working, it will just hang.

08:19.000 --> 08:26.000
So, I don't know a solution, I don't believe anybody knows the solution; the solution may be getting new hardware,

08:26.000 --> 08:37.000
and that is the solution long term, but if you want to see, like, a pretty complete implementation that needs tuning, and a lot of tuning,

08:37.000 --> 08:41.000
that's click machines, and it's available on GitLab.

08:41.000 --> 08:48.000
The next project I should mention is Megapixels; Megapixels is way more mature than click machine,

08:49.000 --> 08:57.000
but it's not designed for movies, so without a redesign, it won't be able to record long movies.

08:57.000 --> 09:00.000
I don't think it's going to be fixed.

09:00.000 --> 09:08.000
It's GTK based, so it has, like, quite a nice interface and so on, but there's trouble with GTK,

09:08.000 --> 09:16.000
and new GTK requires new hardware, but we only have old hardware on phones.

09:16.000 --> 09:23.000
So, something will eventually need to be done with that.

09:23.000 --> 09:29.000
Probably not by me, but hopefully someone will think of something.

09:29.000 --> 09:35.000
And then there's the future, right? The future is libcamera. Everybody knows the future is in libcamera,

09:35.000 --> 09:44.000
because that's where the money is; it's the same for PCs, phones and so on,

09:44.000 --> 09:50.000
but it's also a lot of work, and it's currently missing some quite important pieces.

09:50.000 --> 09:57.000
So on GitLab, in the libcamera repository, I have a modified fork of libcamera,

09:57.000 --> 10:03.000
which contains autofocus from NekoCVD, like, Samsung's to hand,

10:03.000 --> 10:12.000
that autofocus needs a lot of work, but it's important to have the component first.

10:13.000 --> 10:25.000
Libcamera needs a lot of work as a whole; it's moving, I think.

10:25.000 --> 10:36.000
It's supposed to be suitable for any hardware, so work is progressing more slowly than it should.

10:36.000 --> 10:41.000
Then you need some kind of client with libcamera to take photos, right?

10:41.000 --> 10:47.000
And you can take something like Cheese, but Cheese is not a real camera application suitable for a phone.

10:47.000 --> 10:51.000
You would like to, for example, take high-resolution pictures,

10:51.000 --> 10:57.000
and you would like to select time for the exposure and so on, manually and so on.

10:57.000 --> 11:07.000
And I don't think Cheese is suitable for, like, this kind of debugging, so I started my own.

11:07.000 --> 11:12.000
It's called, I called it Millicum, and it's in the repository,

11:12.000 --> 11:16.000
set for the phone name and come.CPT.

11:16.000 --> 11:22.000
It uses SDL because SDL is nice, and SDL doesn't bite unlike GTK,

11:22.000 --> 11:31.000
which means the user interface won't be that pretty, but this is still prototyping, right?

11:31.000 --> 11:41.000
Libcamera made great progress in the last, I believe, 14 days, because GPU ISP processing was finally merged,

11:41.000 --> 11:48.000
after a few months of, like, maybe half a year or more of efforts,

11:48.000 --> 11:51.000
thanks a lot to the people that made it happen.

11:51.000 --> 11:57.000
And that basically means that libcamera is starting to be usable.

11:57.000 --> 12:02.000
I'm not saying it's usable for photos now, but it's moving in the right direction,

12:02.000 --> 12:07.000
and at this moment most of the code is there.

12:07.000 --> 12:11.000
As for what we are still missing: it can do debayering on the GPU,

12:11.000 --> 12:15.000
and it can do color matrix multiplication on the GPU,

12:15.000 --> 12:19.000
which are two heavy steps.

12:19.000 --> 12:23.000
I don't believe it can do lens shading at the moment,

12:23.000 --> 12:28.000
and I don't believe it can do I-O-E.

12:28.000 --> 12:32.000
It's just like, I'm thinking maybe just for all the shading.

12:32.000 --> 12:38.000
Okay, so as you see, libcamera is moving quickly, so we have a patch on the mailing list,

12:38.000 --> 12:43.000
but libcamera is slow; autofocus was on the mailing list too,

12:43.000 --> 12:47.000
a year ago, and it's still there, so it's hard.

12:47.000 --> 12:52.000
And another thing I would really like to see is Bayer reprocessing,

12:52.000 --> 12:56.000
which means if I already have my photo in DNG,

12:56.000 --> 13:00.000
I still want to pass it through the ISP one more time

13:00.000 --> 13:04.000
to get it into a high-resolution, nice photo.

13:04.000 --> 13:07.000
Patches for that are also there, from various people,

13:07.000 --> 13:12.000
but that's another major task waiting for us.

13:12.000 --> 13:15.000
So libcamera will be great in the future.

13:15.000 --> 13:23.000
It starts to be useful, but it's like not yet, not yet there.

13:23.000 --> 13:29.000
So, while developing all this,

13:29.000 --> 13:34.000
I realized that I need a way to see the images basically.

13:34.000 --> 13:38.000
And after some research, I realized that, like,

13:38.000 --> 13:44.000
current data formats are not suitable, so I created my own.

13:44.000 --> 13:49.000
This is supposed to be basically what DNG should be,

13:49.000 --> 13:53.000
but DNG is a very complex, TIFF-based format,

13:53.000 --> 13:57.000
and it can't store everything we would like to store,

13:57.000 --> 14:00.000
so DNG is a non-starter.

14:00.000 --> 14:04.000
Basically what this format does is there's a fixed-size header;

14:04.000 --> 14:08.000
the header contains image width, image height,

14:08.000 --> 14:10.000
and a FourCC code.

14:10.000 --> 14:13.000
And timestamp and stride and so on,

14:13.000 --> 14:17.000
but the idea is that if you have the data from the sensor,

14:17.000 --> 14:21.000
you don't save it raw, you just append this easy header,

14:21.000 --> 14:25.000
and it means I have the other tools that can work with this.
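
NOTE
A minimal sketch of a fixed-size header of the kind described here. The talk lists the fields (width, height, FourCC code, timestamp, stride) but not their order, sizes or byte order, so the layout below is purely hypothetical, as are the function names and the choice to put the header in front of the pixel data.
```python
import struct
# Hypothetical layout: 4-byte FourCC, width, height, stride (32-bit each),
# then a 64-bit timestamp, all little-endian with no padding (24 bytes).
HEADER = struct.Struct("<4sIIIQ")
#
def write_frame(fourcc, width, height, stride, timestamp_ns, pixels):
    """Return the header followed by the untouched sensor data."""
    return HEADER.pack(fourcc, width, height, stride, timestamp_ns) + pixels
#
def read_frame(blob):
    """Split a stored frame back into its header fields and pixel data."""
    fourcc, width, height, stride, timestamp_ns = HEADER.unpack_from(blob)
    fields = {"fourcc": fourcc, "width": width, "height": height,
              "stride": stride, "timestamp_ns": timestamp_ns}
    return fields, blob[HEADER.size:]
```
For example, an 8-bit RGGB Bayer frame could carry the V4L2 code b"RGGB"; because the header size never changes, a tool can skip it and reach the pixels with one offset.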

14:25.000 --> 14:30.000
For example, I can debayer it, I can display it.

14:31.000 --> 14:35.000
I can take a bunch of FourCC images

14:35.000 --> 14:39.000
and make a movie out of them for testing purposes and so on.

14:39.000 --> 14:41.000
I'm quite happy about this.

14:41.000 --> 14:43.000
I believe it helped my development a lot,

14:43.000 --> 14:50.000
so if you find that you need a simple format

14:50.000 --> 14:53.000
for your libcamera application,

14:53.000 --> 14:56.000
FourCC might be for you.

14:57.000 --> 15:00.000
Of course, if you are doing professional photography

15:00.000 --> 15:04.000
with the graphics, that needs DNG output,

15:04.000 --> 15:09.000
but there's quite a way from raw,

15:09.000 --> 15:14.000
raw picture up to DNG or JPEG.

15:14.000 --> 15:17.000
And it can do conversion to PNM,

15:17.000 --> 15:19.000
portable anymap.

15:19.000 --> 15:23.000
It can do conversion to JPEG and so on.

15:24.000 --> 15:26.000
I'm starting to write simple tools,

15:26.000 --> 15:28.000
maybe some are in Python,

15:28.000 --> 15:30.000
some are an instant error on C,

15:30.000 --> 15:34.000
but it's important to see at least something.

15:34.000 --> 15:36.000
Okay, about hardware.

15:36.000 --> 15:39.000
There are two phones that should be mentioned.

15:39.000 --> 15:42.000
One of them is the Librem 5.

15:42.000 --> 15:46.000
The Librem 5 is the brick I was talking about.

15:46.000 --> 15:51.000
If you want to take photos with your phone,

15:51.000 --> 15:53.000
this is the hardware to work with.

15:53.000 --> 15:57.000
It's not cheap, it's not nice, it's a brick.

15:57.000 --> 15:59.000
It doesn't have a good battery life,

15:59.000 --> 16:02.000
but it's graded for us and created this,

16:02.000 --> 16:05.000
and it's quite mature.

16:05.000 --> 16:07.000
The camera worked there two years ago,

16:07.000 --> 16:10.000
and it still works there.

16:10.000 --> 16:14.000
If you want something closer to a real phone,

16:14.000 --> 16:17.000
then maybe the OnePlus 6 is for you.

16:17.000 --> 16:19.000
With the very latest code,

16:19.000 --> 16:23.000
we have camera sensors working in postmarketOS,

16:23.000 --> 16:26.000
and it has autofocus tips and so on.

16:26.000 --> 16:31.000
It's, like, not this thick, it's thinner.

16:31.000 --> 16:37.000
But the general advantage is that the support

16:37.000 --> 16:42.000
is getting there and people are working hard on it.

16:42.000 --> 16:44.000
Okay,

16:44.000 --> 16:48.000
so I believe this is it.

16:48.000 --> 16:51.000
I don't think we could get something like

16:51.000 --> 16:54.000
camera showing on this thing.

16:54.000 --> 16:57.000
So if you want to see,

16:57.000 --> 17:00.000
I have all three applications running here hopefully.

17:00.000 --> 17:04.000
I can show you after the talk ends.

17:04.000 --> 17:10.000
And this is thank you and time for questions.

17:10.000 --> 17:20.000
Thank you.

17:20.000 --> 17:26.000
Or comments or anything else.

17:26.000 --> 17:31.000
So if there are no questions, then I get one.

17:31.000 --> 17:32.000
I don't know.

17:32.000 --> 17:35.000
I don't know if there's a phone from the previous talk.

17:35.000 --> 17:37.000
Okay.

17:37.000 --> 17:40.000
Yeah, so question was why not the telephone?

17:40.000 --> 17:43.000
The telephone is different generation, basically.

17:43.000 --> 17:46.000
So the Librem 5 is like three for it.

17:46.000 --> 17:49.000
So at the moment, and the support is there.

17:49.000 --> 17:52.000
The telephone is more like the OnePlus 6.

17:52.000 --> 17:54.000
It's Qualcomm's thing.

17:54.000 --> 17:59.000
And basically, the OnePlus 6 and the telephone,

17:59.000 --> 18:02.000
I believe they share the SoC,

18:02.000 --> 18:05.000
which means that work done on the telephone

18:06.000 --> 18:08.000
is applicable to the OnePlus 6 and so on.

18:08.000 --> 18:09.000
This is the generation.

18:09.000 --> 18:11.000
We will get there.

18:11.000 --> 18:15.000
But if you want an easy start, it's still the Librem 5.

18:18.000 --> 18:20.000
Any more questions?

18:20.000 --> 18:23.000
Okay, so thank you; by the way, help wanted.

18:23.000 --> 18:39.000
Yes, so question was, if my projects utilize hardware acceleration,

18:39.000 --> 18:41.000
They utilize the GPU for that.

18:41.000 --> 18:47.000
Because basically the drivers for the ISP are not available.

18:47.000 --> 18:50.000
And GPU is fast enough to do that.

18:50.000 --> 19:01.000
So modern phones have an ISP block and a JPEG encoding block and a movie encoding

19:01.000 --> 19:02.000
block.

19:02.000 --> 19:07.000
But I'm not aware of any phone where the hardware drivers would be available.

19:07.000 --> 19:12.000
And that means that I'm using GPU.

19:13.000 --> 19:17.000
Okay, the PinePhone Pro; so I didn't mention the PinePhone Pro.

19:17.000 --> 19:19.000
Maybe I should have.

19:19.000 --> 19:23.000
It's a phone I have heard good things about.

19:23.000 --> 19:27.000
But I don't have, like, I don't have one.

19:27.000 --> 19:34.000
If I found one on my doorstep, I could probably play with that.

19:34.000 --> 19:36.000
But it should be mentioned.

19:36.000 --> 19:39.000
One more thing, there's the PinePhone Pro.

19:39.000 --> 19:42.000
It has a dumb camera and is useful for camera development.

19:42.000 --> 19:46.000
And then there's the good old PinePhone, no Pro.

19:46.000 --> 19:52.000
And it has a smart camera, which basically means it's like a PC and everything works.

19:52.000 --> 19:57.000
So if you want a phone which can take pictures at webcam quality,

19:57.000 --> 20:00.000
you might take a look at the PinePhone.

20:00.000 --> 20:03.000
Maybe for some kind of application development.

20:03.000 --> 20:07.000
But it's really, really different hardware from everything else.

20:07.000 --> 20:11.000
But the PinePhone Pro would be very suitable to play with.

20:17.000 --> 20:20.000
Okay, it's black magic, basically.

20:20.000 --> 20:22.000
It's low-level black magic.

20:22.000 --> 20:27.000
I have some of the function names are the character's long.

20:27.000 --> 20:31.000
And I don't know what it is.

20:31.000 --> 20:34.000
I believe it's EGL or something.

20:35.000 --> 20:42.000
It was a major blocking point because, apparently, if we had GL version 3,

20:42.000 --> 20:46.000
we would be able to work without crazy hacks.

20:46.000 --> 20:51.000
But we are limited to GL 2, which means the GPU can do it.

20:51.000 --> 20:54.000
But we are doing some quite crazy hacks.

20:54.000 --> 20:56.000
Take a look at the click machine.

20:56.000 --> 21:02.000
It's quite black magic in places.

21:03.000 --> 21:08.000
Okay, so if there are no other questions, then thank you.

21:08.000 --> 21:10.000
And if you want to see it at work.

