WEBVTT

00:00.000 --> 00:10.600
This is the other end of the story. This is looking at an application that is consuming

00:10.600 --> 00:20.000
a camera and the troubles we had with it. Have I got this switched on? Maybe.

00:20.000 --> 00:29.120
Light's green. Yeah, so the troubles we had interfacing to the APIs, and yeah, I have

00:29.120 --> 00:33.760
a sub-conclusion, which was that Plan 9 had the right idea.

00:33.760 --> 00:42.560
So yeah, I'm Tim Panton, I'm the CTO at pi.pe GmbH, pipe, and we license a WebRTC stack for

00:42.560 --> 00:48.800
small cameras and some of them move, some of them don't, that's a baby monitor, but some

00:48.800 --> 00:54.160
of them move quite quickly. This is actually the subject of the talk, I don't know if you

00:54.160 --> 01:00.320
can see it up in the top corner there, but it's a camera that's mounted in a race car over

01:00.320 --> 01:08.240
the driver's shoulder so that the driver can be helped by the pit crew, so it's a real-time

01:08.240 --> 01:13.600
video from behind the driver's shoulder looking out over the track, but also what the driver

01:13.600 --> 01:23.400
was doing, back to the pit crew. So we're interested in low latency, high quality video

01:23.480 --> 01:29.000
over 5 or 10 kilometers, so we're not talking thousands of miles, but a distance, so we use

01:29.000 --> 01:34.120
5G for that, but we're not, I'm not really going to talk about much of that, except that

01:34.120 --> 01:42.440
that's the context of it, oh and by the way, that's what it looks like. Yeah, so I'm not really

01:42.440 --> 01:47.960
going to talk about a lot of that, what I'm interested in talking about is the bit between

01:48.040 --> 01:55.640
the camera and our software, how that interface works together. I mean, just to give you

01:55.640 --> 02:04.520
the full picture, we have an ARM SBC sitting there, running Linux with our software, sitting there

02:04.520 --> 02:10.920
doing the WebRTC piece and some smart stuff about bandwidth management and various other

02:10.920 --> 02:17.560
security things and whatever else, monitoring some sensors and stuff like that, but it needs to

02:17.560 --> 02:23.640
get, the key thing is it needs to get the camera data. So this talk is kind of about how we did that

02:23.640 --> 02:31.800
and the history of how we've walked through that. So the first thing was like a lab prototype

02:31.800 --> 02:37.000
to see whether we could make this work at all and what we did, which is what you do with any video

02:37.000 --> 02:42.520
project like this, is you start off with GStreamer. You put a GStreamer pipeline together

02:42.520 --> 02:47.880
and you kind of get something to vaguely work. And that's what we did. And so basically what we did

02:47.880 --> 03:00.040
is we took the V4L2 camera source and encoded it to H.264, packetized it to RTP, all in GStreamer,

03:00.040 --> 03:05.240
and then sent those RTP packets to localhost, which we then picked up in our

03:05.240 --> 03:14.680
WebRTC stack, did stuff, magic stuff with them, and then sent them out over the wire. And this

03:14.680 --> 03:20.200
actually worked, I mean it's simple, it's a really simple setup. And one of the nice things

03:20.200 --> 03:30.120
about it is that the video bit is well isolated from our stuff in the WebRTC stack. It works really

03:30.120 --> 03:36.280
well in the lab, but the moment you get it out onto a real network with packet loss and variable

03:36.280 --> 03:42.920
bit rates and stuff like that, it failed miserably. And basically what you want to be able

03:42.920 --> 03:49.320
and in order to get anything you have to keep sending keyframes, which reduces what you get

03:49.320 --> 03:57.400
in terms of the performance. So I want to say a little bit about like isolation, what I mean

03:57.400 --> 04:05.080
about the separation. So there's kind of essentially three sorts of isolation I'm talking about there,

04:05.080 --> 04:12.520
we're talking about process isolation. So is it running in a separate process? Does my, in fact,

04:12.520 --> 04:18.680
Java process? Does it have to care about what's happening in the video, low-level video APIs?

04:18.680 --> 04:23.000
And it's really nice if they don't because then the threading gets a lot easier and stuff like that.

04:23.960 --> 04:28.120
The second piece of isolation is about memory. Are they sharing memory? Is there a security

04:28.120 --> 04:33.720
risk between like sending data between these two things? Is it something I need to worry about?

04:33.720 --> 04:40.280
And then the third piece is a, is license isolation. And this is actually, in our particular case,

04:40.280 --> 04:47.320
we potentially have some proprietary algorithms sitting in the Java, doing stuff that

04:47.320 --> 04:54.440
maybe the race manufacturers have told us about. And we can't leak that out into open source,

04:54.440 --> 04:59.000
although we want as much of the stack as possible to be open source. So we need to be able to

04:59.000 --> 05:04.200
understand what the license isolation is. And that's not always possible. So all of these things

05:04.200 --> 05:13.000
are really desirable. So next step is, okay, so let's generate full frames, let's be able to control

05:13.000 --> 05:19.080
generating full frames and control the bit rate. So we can manage the bit rate on the encoder.

05:19.080 --> 05:24.200
So we went off and found a nice open source project for the Raspberry Pi, which managed the

05:24.200 --> 05:29.560
camera and did all that. And it was, it's another GStreamer node. So we did that. We wrapped it up in

05:29.560 --> 05:37.160
a pile of C so that we exposed those two pads up onto standard I/O. And then we had the Java

05:37.240 --> 05:45.640
exec that. And we got the ability to control the bit rate and to force generation of a full frame

05:45.640 --> 05:54.680
directly from Java, which was, but keeping the isolation, which was lovely. And with a really

05:54.680 --> 06:02.360
simple API, a really elegant API, which is just basically ASCII. And so we got, still got good

06:02.600 --> 06:08.520
isolation. So the only downside is, it's a GStreamer pipeline. Now, I love GStreamer,

06:08.520 --> 06:13.720
it's really useful, but it is a pipeline. So you end up with some latency due to the fact

06:13.720 --> 06:19.000
you've got a bunch of frames in the different processes in the queue, in the different stages of

06:19.000 --> 06:25.320
the pipeline. And the other downside of this design is that every packet has to loop up and down

06:25.400 --> 06:32.280
through localhost, through the kernel. And that's unnecessary and costly. So we wanted to

06:32.280 --> 06:39.080
get the latency down, because these cars move pretty quickly, so it kind of matters. So we went

06:39.080 --> 06:44.440
for what turns out, in my view, to be absolutely the pinnacle of what we've done with this,

06:44.440 --> 06:54.440
which is to read H.264 frames direct from /dev/video. So you can, with some trickery on a particular

06:54.520 --> 07:05.320
version of Raspberry Pi, you can ask it to generate H.264 from /dev/video. And when you read a block,

07:05.320 --> 07:10.120
do a read from it, you've got an entire frame from it, and you can just packetize that and send

07:10.120 --> 07:16.440
it out. So we did that, and you got very nice low latency. You've got direct V4L2 access to

07:16.440 --> 07:22.680
changing the bit rate and forcing full frames. So this turned out to be a really easy way to do this.

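NOTE
Editor's sketch, not shown in the talk: a minimal Java read loop for a device
node that hands back one encoded frame per read(), as described above. The class
name, MAX_FRAME_BYTES and the callback are assumptions made for illustration.
```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.function.Consumer;
final class FrameReader {
    // Upper bound on one encoded frame; an assumed value, size it for your encoder.
    static final int MAX_FRAME_BYTES = 1 << 20;
    // Calls onFrame once per successful read() and returns how many frames were seen.
    static int pump(InputStream device, Consumer<byte[]> onFrame) {
        byte[] buf = new byte[MAX_FRAME_BYTES];
        int frames = 0;
        try {
            int n;
            while ((n = device.read(buf)) > 0) {
                // One read, one frame: this only holds if the driver behaves as the talk describes.
                onFrame.accept(Arrays.copyOf(buf, n));
                frames++;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return frames;
    }
}
```
A caller would pass an unbuffered FileInputStream opened on the device node
(the exact path depends on the board) and a callback that packetizes each frame.
Wrapping the stream in a BufferedInputStream would merge reads and lose the
frame boundaries.
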
07:22.680 --> 07:31.000
And very, I mean, nicely low latency, and good isolation, because the definition is,

07:31.000 --> 07:36.600
is like, it's fastest to max, that's basically, it's fastest to min exec. So we're not like,

07:36.600 --> 07:41.720
we've got a really nice thing. However, it only works with the Broadcom blob, because that

07:41.720 --> 07:47.160
tight integration between the encoder and the camera is in the Broadcom blob, and it's nowhere else.

07:47.240 --> 07:55.000
And then the Raspberry Pi folks deprecated it in the next release. So that was out, which is a sad thing.

07:56.040 --> 08:01.480
So now we get to the, like, okay, so we can't really use the Raspberry Pi, because they've got

08:01.480 --> 08:10.280
rid of the hardware encoder in the 5, they don't. And for various reasons, it becomes

08:10.360 --> 08:16.360
impractical. So we ended up moving to, sadly not a Rockchip, as it turns out, let's say, Amlogic.

08:18.920 --> 08:25.960
And this hardware, this Khadas VIM4, nice hardware, great. The only problem is that the

08:25.960 --> 08:35.240
supported GStreamer camera interface doesn't, well, sorry, encoder interface doesn't support

08:35.320 --> 08:41.880
forcing full frames and doesn't support changing the bit rate. And I could not find the source code

08:41.880 --> 08:49.880
that would compile for that AML encoder module that would work with the camera. So like, I did

08:49.880 --> 08:54.120
a, I spent a week looking for it, and I just couldn't find anything to compile. What I did find

08:54.120 --> 09:01.640
was the .so that it loaded to, to talk to the encoder. So I found the source code to that,

09:01.640 --> 09:08.280
and did what was either an elegant or ugly hack, depending on your taste, on that, which just

09:09.320 --> 09:15.640
shares a two-byte memory segment between that and the Java. And so you write into that, what bitrate

09:15.640 --> 09:21.640
you want, and whether you want a full frame. And for every encoded frame, it looks at that,

09:21.640 --> 09:26.120
and says, okay, so I need to generate a full frame, and it changes the bitrate as necessary.

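NOTE
Editor's sketch, not shown in the talk: the Java (writer) side of a two-byte
control segment like the one described above, mapped from a file. The byte
layout, the 100 kbit/s unit and all names are assumptions made for illustration;
the talk does not say how the segment is laid out or how a request is cleared.
```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
final class EncoderControl {
    private final MappedByteBuffer seg;
    EncoderControl(Path file) {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.CREATE,
                StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            // Map exactly two bytes; the mapping stays valid after the channel is closed.
            seg = ch.map(FileChannel.MapMode.READ_WRITE, 0, 2);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
    // Byte 0 (assumed layout): target bitrate in units of 100 kbit/s, clamped to 0..255.
    void setBitrateKbps(int kbps) {
        seg.put(0, (byte) Math.max(0, Math.min(255, kbps / 100)));
    }
    // Byte 1 (assumed layout): a non-zero value asks the encoder for a full frame.
    void requestKeyframe() {
        seg.put(1, (byte) 1);
    }
    // Reads a byte back as an unsigned value, for inspection only.
    int rawByte(int index) {
        return seg.get(index) & 0xFF;
    }
}
```
The encoder side would map the same file read-only and check both bytes once
per encoded frame, so data only ever flows from Java to the encoder.
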
09:26.120 --> 09:30.120
That turned out, and what was nice about that, as I say, it's ugly in some respects,

09:30.200 --> 09:37.320
but it's nice in that you can say to the, you can mark this shared memory segment as effectively

09:37.320 --> 09:44.680
read-only in the encoder software, and you can have it write-only in Java. So this is like,

09:44.680 --> 09:48.120
you know which direction the data is traveling. So although you're sharing memory and you're

09:48.120 --> 09:52.680
losing isolation from that point of view, it's still sort of controllable from that point of view,

09:52.680 --> 09:56.600
which is nice. And we're back at the high latency, because we've got a pipeline,

09:57.560 --> 10:06.360
and it, you know, in some sense, it's ugly. So this is what we're doing at the moment,

10:06.360 --> 10:12.920
which is the next step, which is to say, the thing I'd been avoiding, trying to avoid doing

10:12.920 --> 10:22.760
for a long time, that we, I didn't want to talk direct to the APIs in Java. I wanted a nice

10:22.760 --> 10:29.240
clean interface, turned out to be impossible to do that, or it turned out, I've avoided doing

10:29.240 --> 10:36.120
JNI, which is just a really ugly API for that. But the Java people have produced a

10:36.120 --> 10:45.000
much cleaner API for accessing .so's, that I started to use, called FFM, and I'm not going to talk

10:45.000 --> 10:50.200
anything about it, except that it, it gives you at least some sort of control over the memory

10:50.200 --> 11:00.280
access and, and allocation, and you can invoke methods on the DLL. So, we basically

11:01.000 --> 11:08.920
access both the encoder and V4L2. The downside with the V4L2 is we couldn't just read from

11:08.920 --> 11:17.640
/dev/video. We had to use the, the, um, ioctls for doing the memory mapping in that segment, but that's

11:17.720 --> 11:23.720
turned out to be manageable in the FFM, so that was okay. And we ended up with really nice latency,

11:23.720 --> 11:29.720
we had 270 milliseconds glass to glass, which I was quite pleased with, and that's with, with a

11:29.720 --> 11:34.200
private 5G between them. It's in pure Java, so I don't have to maintain any C, which is really

11:34.200 --> 11:42.120
nice. Um, and I don't have to change what ships in the, by the, by the Khadas

11:43.000 --> 11:48.760
guys in the operating system. I don't have to modify their .so, which I'm much happier with.

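NOTE
Editor's sketch, not shown in the talk: what calling into a native library
through Java's FFM API (java.lang.foreign, final since Java 22) looks like. It
calls the C library's strlen as a stand-in; the encoder and V4L2 calls made in
the talk are not reproduced here, and the class and method names are assumptions.
```java
import java.lang.foreign.Arena;
import java.lang.foreign.FunctionDescriptor;
import java.lang.foreign.Linker;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;
import java.lang.invoke.MethodHandle;
final class FfmDemo {
    // Calls the C library's strlen through FFM, with no JNI glue and no hand-written C.
    static long nativeStrlen(String s) {
        Linker linker = Linker.nativeLinker();
        // Look the symbol up in the default (C library) lookup and describe its signature.
        MethodHandle strlen = linker.downcallHandle(
                linker.defaultLookup().find("strlen").orElseThrow(),
                FunctionDescriptor.of(ValueLayout.JAVA_LONG, ValueLayout.ADDRESS));
        // The confined arena frees the native copy of the string when the block exits.
        try (Arena arena = Arena.ofConfined()) {
            MemorySegment cString = arena.allocateFrom(s); // NUL-terminated UTF-8 copy
            return (long) strlen.invokeExact(cString);
        } catch (Throwable t) {
            throw new RuntimeException(t);
        }
    }
}
```
The arena is where the control over memory allocation mentioned above comes
from: native memory is owned by an explicit scope rather than by the garbage
collector.
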
11:48.760 --> 11:53.720
The downside is the isolation is obviously a little bit limited, and there's more code. I've got

11:53.720 --> 12:04.360
some code to maintain. However, a terrible thing happened, and they have now updated the hardware,

12:04.360 --> 12:13.800
so that it doesn't support that interface anymore. Um, the, the V4L2, uh, /dev/video has gone,

12:13.800 --> 12:24.520
and it's now /dev/media, and it has a completely different API, um, which basically V4L2 kind of

12:24.520 --> 12:32.360
pretty much doesn't do anything that you expect it to. Um, but there is a .so, which is compiled

12:32.520 --> 12:39.080
in C++, so I have to mangle a bunch of names in order to try and even access these methods.

12:39.080 --> 12:47.640
But it, uh, and I have to, because of the C++ API, I have to mangle, um, mangle and I have to do

12:47.640 --> 12:54.840
LD_PRELOAD in order to get this thing to work. But it does actually work. Um, I guess it's because

12:54.840 --> 12:59.000
it's Android compatible or something. I don't, I don't know why this change has happened,

12:59.080 --> 13:04.920
but it's kind of fairly unsatisfactory from my personal point of view. Um, and I'm working with

13:04.920 --> 13:10.920
Khadas to try and make the V4L2 stuff at least, uh, work so that I can control the brightness and the

13:10.920 --> 13:19.080
hue and, and, and, and region of interest and all of those things. So I kind of want to say, the summary

13:19.080 --> 13:26.440
of this is that Plan 9 was right, like the file system interface, the cleanness, the simplicity

13:26.520 --> 13:33.160
of a file system interface, makes, for people who are trying to, uh, write applications

13:33.160 --> 13:39.240
that aren't in C++, their lives massively easier. If you're trying to use a memory safe

13:40.200 --> 13:47.480
development environment, then the C++ .so is the wrong answer, like it just is unsatisfactory

13:47.480 --> 13:53.320
and in, in, in every respect. And it, it would be really nice to, to, to get back to the kind of Plan 9,

13:53.400 --> 13:59.080
It's in the, it's, everything is a file, um, aspect. There's a talk nine o'clock tomorrow morning,

13:59.080 --> 14:03.960
if anyone's awake, on the basics of Plan 9. I'd recommend you go to it. It's a real eye opener,

14:04.840 --> 14:09.960
how they think about things, and how much you can get done with a really simple clean interface.

14:11.000 --> 14:16.760
And I'd love us to get back at least close to that. Now I want to say, like, open source is great,

14:16.760 --> 14:23.880
like I've been grumbling, but like the open source, I remember long ago when it was cheaper to

14:23.880 --> 14:31.320
write a whole new TCP/IP stack than to license one from Intel. So we've, we've come an

14:31.320 --> 14:38.600
awful long way since then, and I, I do, I appreciate the fact that any of this was possible at all,

14:38.600 --> 14:43.800
but it would be nice if we got back to something that was simpler, um, for the developers to use.

14:44.760 --> 14:50.280
So, um, that's kind of the, the point, and come and talk to me afterwards, ask questions,

14:50.280 --> 14:56.440
but shout, because I'm a bit deaf. Um, and like there's a bunch of open source, we, we do

14:56.440 --> 15:03.240
consulting on and some of which we've written, and some of which we haven't. Um, and, uh, yeah,

15:03.240 --> 15:10.040
you can contact me on any of those, or just find me, I'm around today. So yeah, thank you.

15:13.960 --> 15:24.200
Anyone got a question? Uh, I'm going to talk to you a little more later, but the one thing I'm

15:24.200 --> 15:30.200
really curious about, I see you really like to read from the device, yeah, instead of memory mapping the

15:31.480 --> 15:35.640
for your use case, you, I would expect you'd want memory-mapped buffers, because then there's those

15:36.040 --> 15:43.720
so we do, we do, so why do we, why do I want to read rather than memory map? Um, just, uh,

15:43.720 --> 15:49.560
so yeah, the reason we want to read rather than memory map is that it's just simpler, right?

15:49.560 --> 15:57.000
It's a simpler, until we got to the, um, FF, um, FFM interface of Java,

15:57.800 --> 16:03.720
memory mapping was like, I had to write a bunch of C that would then I'd have to link in to the

16:03.800 --> 16:11.240
Java, in order to do the memory mapping, whereas, um, with FFM, I'm allowed to do that without

16:11.240 --> 16:15.240
having to write any C, so it's essentially about not having to write C code.

16:15.240 --> 16:22.680
Yeah, at the cost of, but I mean, the latency is, if I'm down below a couple of hundred

16:22.680 --> 16:27.320
milliseconds, I'm happy. Any other questions?

16:34.200 --> 16:38.040
Because it's just like you asked up with a three-day shift in library from the vendor,

16:38.040 --> 16:41.800
who's done a school at the school. So, I love to respond to you.

16:41.800 --> 16:47.080
I mean, um, this is kind of why I'm here today to have that conversation. Why am I not using

16:47.080 --> 16:54.840
mainline? Um, I should be looking at it. It didn't appear to be possible, so I don't know whether

16:54.840 --> 17:02.840
it is or it isn't. Yeah, we'll, we'll, we'll find out. Anyone else? No? Good. Thank you.

17:03.720 --> 17:13.720
Thank you.

