WEBVTT

00:00.000 --> 00:07.000
All right, while we get set up here,

00:07.000 --> 00:09.000
I want to go ahead and welcome Floris.

00:09.000 --> 00:13.000
Floris is a software engineer at number 0 and a mountain climber.

00:13.000 --> 00:17.000
He's going to go ahead and talk about an introduction to iroh.

00:17.000 --> 00:22.000
iroh is a Rust library that simplifies P2P networking by establishing direct QUIC

00:22.000 --> 00:26.000
connections, or "quick" connections, allowing developers to easily build custom

00:26.000 --> 00:29.000
decentralized protocols.

00:29.000 --> 00:31.000
With that, I'll go ahead and hand over to Floris.

00:31.000 --> 00:33.000
Give it up for Floris.

00:39.000 --> 00:40.000
Thank you.

00:40.000 --> 00:47.000
So yeah, I'm Floris and I work at number 0, where we are working on iroh.

00:50.000 --> 00:54.000
With iroh, basically what we want to do is create

00:54.000 --> 00:58.000
a library to make P2P connections essentially.

00:58.000 --> 01:00.000
We just want a connection.

01:00.000 --> 01:05.000
We concentrate on the very basics: you have two endpoints

01:05.000 --> 01:07.000
and a connection between them.

01:07.000 --> 01:09.000
But where at all possible,

01:09.000 --> 01:11.000
we want it to be P2P.

01:11.000 --> 01:14.000
So every endpoint can accept incoming connections.

01:14.000 --> 01:18.000
Regardless of where you are located on the network.

01:18.000 --> 01:23.000
And that's kind of the core of what iroh tries to do.

01:24.000 --> 01:29.000
We use QUIC because QUIC fits very well for this.

01:29.000 --> 01:38.000
And the endpoints we identify by a public key, essentially.

01:38.000 --> 01:42.000
And this means that the public key is actually the thing that you dial.

01:42.000 --> 01:47.000
So you kind of have this notion of dial by endpoint ID.

01:48.000 --> 01:54.000
And yeah, that's basically the idea: this is the core of P2P networking,

01:54.000 --> 02:01.000
and then on top of that you can build your other application protocols etc.

02:01.000 --> 02:08.000
So this is sort of the typical setup that you get on the internet: you have two devices behind home networks.

02:09.000 --> 02:16.000
And generally these are behind NAT devices, NAT routers,

02:16.000 --> 02:21.000
which means you can't actually make a direct connection to the other end.

02:21.000 --> 02:26.000
Even if you have an IPv6 connection, NAT may not be happening,

02:26.000 --> 02:29.000
but you will probably still have a firewall in between there.

02:29.000 --> 02:34.000
So this doesn't really change that much.

02:35.000 --> 02:41.000
So what happens generally is that if you have two applications that want to talk to each other,

02:41.000 --> 02:44.000
you end up with a server somewhere in a data center.

02:44.000 --> 02:49.000
Apparently there's only one data center in the world, so everyone fails at the same time.

02:49.000 --> 02:56.000
This is, yeah, this centralization is kind of bad.

02:57.000 --> 03:05.000
And it ends up with too many failures, and also the server ends up knowing much more than it needs to,

03:05.000 --> 03:06.000
generally.

03:06.000 --> 03:13.000
And we would really like the endpoint devices to be able to have control themselves

03:13.000 --> 03:18.000
and really to have more user agency in that.

03:18.000 --> 03:31.000
So instead, to make this possible, we kind of go to

03:31.000 --> 03:39.000
the iroh library, which essentially manages how to create the connections, and for that we need to hole punch.

03:39.000 --> 03:44.000
Now you see that we still have a relay server.

03:44.000 --> 03:48.000
So how much better is this really if we still have a server?

03:48.000 --> 03:51.000
Unfortunately, that's kind of the reality of the internet.

03:51.000 --> 03:55.000
You cannot make a direct connection without having some sort of server.

03:55.000 --> 03:59.000
The problem is that you need a public IP address.

03:59.000 --> 04:06.000
On the client side, you will always have to dial out, and you need to have some public IP address.

04:06.000 --> 04:12.000
The way we actually make this work is that we tried to make this relay server

04:12.000 --> 04:13.000
do as little as possible.

04:13.000 --> 04:20.000
So the trade-off we make is that we really want this connection to be reliable.

04:20.000 --> 04:25.000
If you can't always establish your connection,

04:25.000 --> 04:29.000
then you're just not going to use it and you're going to go back to the centralized version.

04:29.000 --> 04:33.000
So that's why we have this as an absolute fallback.

04:33.000 --> 04:37.000
There is a transport that goes via the relay server.

04:37.000 --> 04:47.000
The relay server is also... and the other thing is when you establish your connections.

04:47.000 --> 04:52.000
If it takes a long time, if it takes like five seconds to establish, that's also not very good.

04:52.000 --> 04:58.000
So right at the beginning of the connection, we will immediately start sending traffic through the relay server.

04:58.000 --> 05:04.000
And then that traffic will help with the hole punching, and then hopefully you get a direct connection,

05:04.000 --> 05:08.000
which will have much lower latency, et cetera.

05:08.000 --> 05:16.000
What you can't really see in this picture is that the relay server is not just one server.

05:16.000 --> 05:18.000
Everyone can host their own server.

05:18.000 --> 05:24.000
You should think of it more as a federated network of relay servers, really.

05:24.000 --> 05:30.000
You can host your own server; you don't even need to know.

05:30.000 --> 05:36.000
Eventually, when you need to connect to another endpoint, you need to know what relay server they are using,

05:36.000 --> 05:39.000
but they could be using a completely different set of relay servers than you are.

05:39.000 --> 05:47.000
We'll talk in a minute about how you find out where you can locate another endpoint.

05:47.000 --> 05:51.000
And the relay server is also very blind. It doesn't know anything.

05:51.000 --> 05:59.000
It doesn't store anything, and it just sees encrypted data going from one node to another node.

05:59.000 --> 06:07.000
So it has only very minimal information.

06:07.000 --> 06:12.000
So how hole punching works with this is...

06:12.000 --> 06:19.000
So the last thing before going there: the relay server also has this...

06:19.000 --> 06:22.000
You see here that it's an HTTPS connection.

06:22.000 --> 06:27.000
The idea of this is that this is the connection that

06:27.000 --> 06:32.000
is the most likely to work on the internet, regardless of how restricted an environment you are in.

06:32.000 --> 06:36.000
We do a WebSocket upgrade on that.

06:36.000 --> 06:50.000
Even if you have very restrictive environments that do machine-in-the-middle certificate connection hijacking, apparently for security reasons, but whatever,

06:50.000 --> 06:58.000
they are often more likely to let through a WebSocket upgrade than something else, than a custom upgrade.

06:58.000 --> 07:07.000
And once we have the WebSocket connection, what we do is take the UDP packets that we would normally send directly over the internet.

07:07.000 --> 07:09.000
We put those inside that tunnel.

07:09.000 --> 07:13.000
And the only thing the relay server does is: it gets a packet.

07:13.000 --> 07:16.000
It sees what endpoint ID it is for.

07:16.000 --> 07:22.000
And then the only thing it can do is, well, do I have that endpoint ID connected to me?

07:22.000 --> 07:24.000
If so, then put it over there.

07:24.000 --> 07:27.000
If not, then just drop it on the floor.

07:27.000 --> 07:29.000
And that's the only thing the relay can do.

07:29.000 --> 07:33.000
So that's the only kind of knowledge the relay server has.
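
NOTE
Editor's sketch, not from the talk: the relay behaviour described above (look up the destination endpoint ID, forward if connected, otherwise drop) might look roughly like this. The Relay class, its method names, and the string endpoint IDs are hypothetical and are not iroh's API.
```python
class Relay:
    """Forwards opaque packets by destination endpoint ID and stores nothing else."""
    def __init__(self):
        # endpoint ID -> callable that delivers bytes to that endpoint (hypothetical)
        self.connected = {}
    def connect(self, endpoint_id, deliver):
        self.connected[endpoint_id] = deliver
    def disconnect(self, endpoint_id):
        self.connected.pop(endpoint_id, None)
    def handle_packet(self, dest_id, encrypted_payload):
        deliver = self.connected.get(dest_id)
        if deliver is None:
            return False  # destination is not connected here: drop the packet
        deliver(encrypted_payload)  # the payload is passed on without being inspected
        return True
inbox = []
relay = Relay()
relay.connect("endpoint-b", inbox.append)
forwarded = relay.handle_packet("endpoint-b", b"\x17opaque")
dropped = relay.handle_packet("endpoint-c", b"\x17opaque")
```
The only state is the table of currently connected endpoints, which matches the "very blind" relay described in the talk.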

07:33.000 --> 07:36.000
To actually do.

07:36.000 --> 07:38.000
To actually do

07:38.000 --> 07:44.000
hole punching with this, the way this works in iroh is that

07:44.000 --> 07:46.000
you start the connection via the relay server.

07:46.000 --> 07:50.000
And then you have to exchange information about

07:50.000 --> 07:52.000
how to establish a direct connection.

07:52.000 --> 07:56.000
And basically you give candidate IP addresses to each other.

07:56.000 --> 08:02.000
So in this case, you have the two external IP addresses that you would share.

08:02.000 --> 08:08.000
Now when you do this, you don't really know whether you're even on the same network; you could actually be on the same local network.

08:08.000 --> 08:11.000
So you actually just share all your IP addresses with the other side.

08:11.000 --> 08:14.000
You send those via the relay connection.

08:14.000 --> 08:17.000
And then once you both have the other side's

08:17.000 --> 08:20.000
addresses, what you do is you send

08:20.000 --> 08:22.000
packets to each other,

08:22.000 --> 08:25.000
to all of those addresses at the same time.

08:25.000 --> 08:29.000
And normally your firewalls, NAT devices, whatever,

08:29.000 --> 08:34.000
will see the outgoing packets go out at the same time.

08:34.000 --> 08:37.000
And then they cross over somewhere on the internet.

08:38.000 --> 08:42.000
And then when the packet comes back, when the other packet arrives,

08:42.000 --> 08:47.000
hopefully the router will recognize this as being an incoming

08:47.000 --> 08:49.000
packet from the same connection.

08:49.000 --> 08:53.000
And that is kind of the basics of how you do hole punching, essentially.

08:53.000 --> 08:56.000
So iroh basically manages this.

08:56.000 --> 08:58.000
Here there are only two paths.

08:58.000 --> 09:01.000
When you do this in practice, you will get more paths than this probably.

09:01.000 --> 09:03.000
A lot of the time we now have IPv4

09:03.000 --> 09:05.000
and IPv6 direct paths.

09:05.000 --> 09:07.000
We might get both of them.

09:07.000 --> 09:09.000
You could get even several.

09:09.000 --> 09:12.000
And iroh tries to keep all of those connections

09:12.000 --> 09:15.000
and selects the best one out of those that work.

09:15.000 --> 09:19.000
And as soon as the connection is established,

09:19.000 --> 09:21.000
as soon as hole punching happens,

09:21.000 --> 09:25.000
the relay connection, the relay path, will kind of

09:25.000 --> 09:28.000
go quiet and not have any traffic.

09:28.000 --> 09:31.000
And the relay server, because it is completely blind,

09:31.000 --> 09:36.000
doesn't even know that hole punching was attempted or succeeded or anything.

09:36.000 --> 09:42.000
All it knows is that sometimes there's a bit of traffic, sometimes there's not.

09:42.000 --> 09:47.000
So you still have to find out where your endpoint is.

09:47.000 --> 09:55.000
Because you dial by endpoint ID, which is kind of the public key of your cryptographic key pair.

09:56.000 --> 10:02.000
And for that we have to have...

10:02.000 --> 10:05.000
Yeah, for that we have a system, basically.

10:05.000 --> 10:09.000
It's currently called, well, we just renamed that actually in the last release.

10:09.000 --> 10:11.000
It's now called address lookup.

10:11.000 --> 10:16.000
And the basic idea is that you map the endpoint ID that you have,

10:16.000 --> 10:20.000
and you want to find out where you can find it.

10:21.000 --> 10:27.000
Maybe it's on a relay URL, which is kind of the typical scenario that I just described.

10:27.000 --> 10:33.000
But you could also be on a local network somewhere with no access to a relay server,

10:33.000 --> 10:34.000
that's cut off.

10:34.000 --> 10:40.000
So you could also resolve to IP addresses directly, etc.

10:40.000 --> 10:46.000
And this is kind of modeled on DNS resolution, DNS lookups,

10:46.000 --> 10:51.000
which work in a very similar way: you have a name and then you need to find some IP addresses.

10:51.000 --> 10:56.000
And this is also how you find the relay server of the other endpoint,

10:56.000 --> 11:02.000
without having it configured or needing to know it ahead of time.

11:02.000 --> 11:06.000
The nice thing about this is that it is completely extensible.

11:06.000 --> 11:14.000
So depending on the application, you can extend it and add your own address lookup.

11:14.000 --> 11:21.000
For example, the gossip protocol that I will quickly mention at the end

11:21.000 --> 11:33.000
has its own, or adds its own, mechanism to this, because it can already find out information about other endpoints out of band,

11:33.000 --> 11:36.000
so it can already provide that.
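
NOTE
Editor's sketch, not from the talk: one way to picture an extensible address lookup is a list of pluggable lookup functions whose answers are merged. EndpointAddr, table_lookup, resolve, the endpoint ID "abc123" and the example addresses are all hypothetical names for illustration and are not iroh's API.
```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional
@dataclass
class EndpointAddr:
    relay_url: Optional[str] = None
    direct_addrs: List[str] = field(default_factory=list)
Lookup = Callable[[str], Optional[EndpointAddr]]
def table_lookup(table: Dict[str, EndpointAddr]) -> Lookup:
    # Stand-in for a real service such as DNS or local-network discovery.
    return lambda endpoint_id: table.get(endpoint_id)
def resolve(endpoint_id: str, lookups: List[Lookup]) -> EndpointAddr:
    # Ask every configured lookup and merge whatever each one knows.
    merged = EndpointAddr()
    for lookup in lookups:
        found = lookup(endpoint_id)
        if found is None:
            continue
        merged.relay_url = merged.relay_url or found.relay_url
        for addr in found.direct_addrs:
            if addr not in merged.direct_addrs:
                merged.direct_addrs.append(addr)
    return merged
dns_like = table_lookup({"abc123": EndpointAddr(relay_url="https://relay.example")})
lan_like = table_lookup({"abc123": EndpointAddr(direct_addrs=["192.168.1.20:4433"])})
addr = resolve("abc123", [dns_like, lan_like])
```
An application-specific source, such as the gossip protocol mentioned in the talk, would simply be one more function in the list.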

11:36.000 --> 11:41.000
The most common one that we can recommend is actually literally DNS.

11:41.000 --> 11:48.000
There is a very nice system, P-K-A-R-R, Pkarr.

11:48.000 --> 11:53.000
And what this actually is: it's a very small and neat standard.

11:53.000 --> 12:08.000
It is essentially a mechanism to publish some information that is signed by a key pair.

12:08.000 --> 12:17.000
And again, you can run your own DNS server for this,

12:17.000 --> 12:23.000
but we have a DNS server running where, at a subdomain, you can basically ask: this is the ID I'm looking for,

12:23.000 --> 12:25.000
and you get back the DNS records.

12:25.000 --> 12:30.000
This DNS record is kind of signed by the key itself.

12:30.000 --> 12:37.000
To submit, you have to publish this to an HTTP server essentially,

12:37.000 --> 12:42.000
but this server will only accept these records if the signature is actually valid.

12:42.000 --> 12:45.000
And actually, when you look it up, you also have this kind of system.

12:45.000 --> 12:50.000
DNS is kind of nice, because a lot of the internet already works with it;

12:50.000 --> 12:54.000
you get all the caching and such going on in there, so it's a very nice way.

12:54.000 --> 12:57.000
mDNS is another very popular way.

12:58.000 --> 13:04.000
It's local network only, and you basically broadcast on the local network, saying,

13:04.000 --> 13:07.000
is this node ID here, or something?

13:07.000 --> 13:13.000
And the mDNS one is kind of funny, because it does something extra that's not part of address lookup,

13:13.000 --> 13:23.000
because, if you're familiar with mDNS, a lot of people actually expect to just magically know what other devices are on the local network.

13:23.000 --> 13:30.000
And mDNS does this by basically broadcasting every so often: here I am, I have this service, kind of thing.

13:30.000 --> 13:38.000
So, yeah, if you use the mDNS address lookup, you have this little bit of extra information that you can use as well.

13:38.000 --> 13:50.000
The same Pkarr system that allows you to publish DNS records into the DNS system also allows you to publish them directly into the DHT.

13:50.000 --> 14:02.000
The BitTorrent Mainline DHT is the DHT that has been around for a long time and has survived a lot of attacks and such.

14:02.000 --> 14:10.000
So it's a very reliable DHT, and the nice thing about Pkarr is that it can publish directly onto the DHT.

14:10.000 --> 14:19.000
So if you don't want to have any central services involved, this also shows that this is possible.

14:19.000 --> 14:26.000
Depending on your choices, right? And you're free to, yeah, you can make more of these.

14:26.000 --> 14:35.000
I'll go very quickly through this: once you have a connection, we basically provide a QUIC connection between two endpoints, and that's it.

14:35.000 --> 14:47.000
Then it's up to the application to decide what to do. QUIC is very nice for this, because it is built on top of UDP,

14:47.000 --> 14:51.000
which is very nice for hole punching, etc., so it was kind of nice for us.

14:51.000 --> 14:57.000
But it also provides a familiar interface to application developers.

14:57.000 --> 15:04.000
So basically you get streams, which are ordered data, like you get with a TCP one.

15:04.000 --> 15:11.000
You get more than one stream though; in TCP you only get one.

15:12.000 --> 15:23.000
And the nice thing is, if you know HTTP/2, there you also have multiple streams, but you have head-of-line blocking, where if you lose a packet in one stream, all the streams are blocked.

15:23.000 --> 15:27.000
QUIC doesn't have that problem anymore.

15:27.000 --> 15:40.000
There's also a datagrams extension, and these two together mean that if you already have an application protocol that's on TCP or UDP, you can just port it and use QUIC very nicely.

15:41.000 --> 15:47.000
QUIC has a lot of other nice improvements though; especially the connection handshake is much faster.

15:47.000 --> 16:02.000
If you do a connection handshake with TCP and TLS, you get three round trips between the two endpoints before you can start sending application data to the other endpoint.

16:02.000 --> 16:07.000
In QUIC, the normal handshake, the long handshake, is just one half round trip.

16:07.000 --> 16:15.000
QUIC manages to combine the TLS setup and the connection handshake together.

16:15.000 --> 16:25.000
You just send one packet to the remote, you get one response back, and then you can send application data.

16:25.000 --> 16:35.000
Once you have managed to have a connection and you reconnect to the same endpoint again, you can even use 0-RTT, which basically means that you can

16:35.000 --> 16:40.000
put the application data directly in the first packet.

16:40.000 --> 16:46.000
Under the hood there are various other things: loss detection, congestion control, which is very nice.

16:46.000 --> 16:54.000
It's always encrypted, so TLS is mandatory.

16:54.000 --> 17:02.000
And everything in the packets is encrypted, so on the wire there are just a few bits in the header,

17:02.000 --> 17:04.000
basically, that are not encrypted.

17:04.000 --> 17:11.000
Everything else, including, yeah, all the connection metadata, is fully encrypted.

17:11.000 --> 17:18.000
The purpose of this is partially just to make everything look the same; you cannot tell anything.

17:18.000 --> 17:23.000
TCP is very ossified in a way; it's very difficult to evolve.

17:23.000 --> 17:29.000
Yeah, the hope is that this will stop middleboxes from relying on particular properties of QUIC.

17:29.000 --> 17:34.000
But also, the other nice thing is that everything looks like HTTP/3, essentially.

17:34.000 --> 17:43.000
So the more HTTP/3 adoption we get, the more everything just looks like the same datagrams, the same UDP packets, flying around.

17:43.000 --> 17:50.000
Which is kind of, like, a nice thing to aim for.

17:51.000 --> 18:01.000
So, once you have your QUIC connection, iroh wants to provide you with a model of protocols, kind of.

18:01.000 --> 18:12.000
It doesn't force you to use just one application protocol; we try to make it easy to use multiple protocols.

18:12.000 --> 18:17.000
So you can establish multiple connections, and each endpoint can implement various protocols.

18:18.000 --> 18:25.000
So I will quickly talk about two of these protocols that we built ourselves.

18:25.000 --> 18:31.000
They are kind of generic building blocks.

18:31.000 --> 18:36.000
Of course, you're not limited to any of these.

18:36.000 --> 18:44.000
But, yeah, blobs is the protocol that we have for requesting data; this is kind of verified streaming.

18:44.000 --> 18:54.000
So the idea is that if you fetch a file or something, you want to download something,

18:54.000 --> 18:59.000
then as data comes in, you continuously know that this is the correct data coming in.

18:59.000 --> 19:07.000
And gossip is a way of creating a swarm, and without having to have a direct connection to every member in this swarm,

19:07.000 --> 19:11.000
you can send small messages to the entire swarm.

19:11.000 --> 19:20.000
So, very quickly going through these: yeah, it's kind of a neat protocol, I guess.

19:20.000 --> 19:29.000
Blobs is built on top of BLAKE3, which is a modern and fast hashing algorithm.

19:29.000 --> 19:39.000
And basically it uses the internal structure of BLAKE3 to enable this.

19:39.000 --> 19:48.000
So what BLAKE3 does is partition the file into one-kilobyte blocks.

19:48.000 --> 19:51.000
And each block gets its own hash.

19:51.000 --> 20:00.000
This makes BLAKE3 really good for optimizing, because you can basically do this in parallel, with SIMD instructions, et cetera.

20:00.000 --> 20:07.000
And then once you have all those hashes, you can start combining them: you basically hash two of these hashes together,

20:07.000 --> 20:09.000
and that makes up the next layer of hashes.

20:09.000 --> 20:19.000
And you keep building up these hashes into a Merkle tree essentially, all the way until you have a root hash.

20:19.000 --> 20:24.000
Now, once you have managed to build this all the way up to the root hash,

20:24.000 --> 20:32.000
to do the verified streaming, what you have to do is traverse the tree essentially from the top to the bottom and from the left to the right.

20:32.000 --> 20:40.000
So you start with the top one and you send the hashes below it; you can check that those hashes

20:40.000 --> 20:42.000
still make up the root hash.

20:42.000 --> 20:45.000
Then you basically keep doing that and you keep going to the left.

20:45.000 --> 20:46.000
You send those hashes.

20:46.000 --> 20:50.000
Then at the end you end up at the bottom left and you actually send the file chunk.

20:50.000 --> 20:56.000
You've sent very little data now, but at any point when you have received something,

20:56.000 --> 20:58.000
you know that it is valid data.
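
NOTE
Editor's sketch, not from the talk: a simplified Merkle tree over 1024-byte chunks, showing how a receiver can check one chunk against a known root hash using only a few sibling hashes. Assumptions: SHA-256 from the standard library stands in for BLAKE3, the tree layout and the helper names (build_levels, proof_for, verify) are hypothetical, and a per-chunk proof is used instead of the streamed top-down order described in the talk.
```python
import hashlib
CHUNK = 1024
def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()  # stand-in hash, NOT BLAKE3
def build_levels(data: bytes):
    # Level 0 holds one hash per chunk; each higher level hashes pairs together.
    chunks = [data[i:i + CHUNK] for i in range(0, len(data), CHUNK)] or [b""]
    levels = [[h(b"\x00" + c) for c in chunks]]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        nxt = []
        for i in range(0, len(prev), 2):
            if i + 1 < len(prev):
                nxt.append(h(b"\x01" + prev[i] + prev[i + 1]))
            else:
                nxt.append(prev[i])  # an unpaired node is promoted unchanged
        levels.append(nxt)
    return chunks, levels
def proof_for(levels, index):
    # Collect the sibling hash at each level on the path from a chunk to the root.
    proof = []
    for level in levels[:-1]:
        sibling = index ^ 1
        if sibling < len(level):
            proof.append((sibling < index, level[sibling]))
        index //= 2
    return proof
def verify(chunk, proof, root):
    # Recompute the path to the root from the chunk and its sibling hashes.
    node = h(b"\x00" + chunk)
    for sibling_is_left, sibling in proof:
        node = h(b"\x01" + (sibling + node if sibling_is_left else node + sibling))
    return node == root
data = bytes(range(256)) * 10
chunks, levels = build_levels(data)
root = levels[-1][0]
all_valid = all(verify(c, proof_for(levels, i), root) for i, c in enumerate(chunks))
tampered_valid = verify(b"x" + chunks[1][1:], proof_for(levels, 1), root)
```
Because each chunk is checked independently against the root, different chunks could come from different endpoints, as the talk goes on to mention.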

20:58.000 --> 21:05.000
You can go through the entire file like this, and it's also really neat for doing ranges as well.

21:05.000 --> 21:10.000
So you don't have to build the entire tree. You can go through them.

21:10.000 --> 21:14.000
And this shows the order in which you would send these chunks.

21:14.000 --> 21:18.000
And you would always be able to verify the data.

21:18.000 --> 21:23.000
This means you can, yeah, fetch files with different chunks from different endpoints, etc.

21:23.000 --> 21:25.000
Which is kind of very neat.

21:26.000 --> 21:30.000
Very quickly, I think we probably have a minute left or something.

21:30.000 --> 21:32.000
Okay.

21:32.000 --> 21:35.000
Gossip.

21:35.000 --> 21:38.000
Yeah, the gossip protocol is one implementation.

21:38.000 --> 21:41.000
There are many gossip implementations with different trade-offs.

21:41.000 --> 21:47.000
This one is sort of based on spikes written by the Willow folks.

21:48.000 --> 22:01.000
And it's very efficient at distributing a message to a dynamic gossip cluster, I guess, or a number of endpoints.

22:01.000 --> 22:05.000
But it's not necessarily the most attack-resistant one.

22:05.000 --> 22:10.000
If you have malicious nodes as part of that, you might want to reconsider this.

22:10.000 --> 22:16.000
From a very high-level view, it's built on two separate systems.

22:16.000 --> 22:22.000
One is where you have a handful of neighbors, which are just randomly chosen essentially.

22:22.000 --> 22:25.000
And whenever you have a message, you send it to all of those.

22:25.000 --> 22:27.000
But that doesn't scale up.

22:27.000 --> 22:35.000
If you do that throughout the whole swarm, you get a lot of duplicate messages being sent around.

22:35.000 --> 22:43.000
But the active neighbors allow you to quickly find, or become a member of, the group.

22:44.000 --> 22:47.000
The passive neighbors are there in case you lose some active neighbors:

22:47.000 --> 22:51.000
you immediately have other ones you can replace them with.

22:51.000 --> 22:55.000
And if you have too many neighbors, they get handed off to another node and that kind of thing.

22:55.000 --> 22:59.000
And then on top of that, the PlumTree part is building a spanning tree.

22:59.000 --> 23:01.000
And that's actually what you're doing.

23:01.000 --> 23:12.000
You continuously make sure that there is a single tree through which a message can be passed to be delivered to every member in the gossip swarm.

23:13.000 --> 23:18.000
So you have to continuously prune and repair links in this tree as you go.
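
NOTE
Editor's sketch, not from the talk: a toy comparison of how many sends it takes to reach a swarm when every node forwards to all of its random neighbors, versus when only the links of a spanning tree are used. This is a simplified illustration, not the actual gossip implementation; the graph builder, the swarm size of 50 and the 4 neighbors per node are arbitrary assumptions.
```python
import random
from collections import deque
def random_neighbors(n, k, seed=1):
    # Each node picks k random peers; links are made symmetric.
    rng = random.Random(seed)
    graph = {i: set() for i in range(n)}
    for i in range(n):
        for j in rng.sample([x for x in range(n) if x != i], k):
            graph[i].add(j)
            graph[j].add(i)
    return graph
def flood(graph, source):
    # Every node forwards the message once, to all neighbors except the sender.
    seen, sends = {source}, 0
    queue = deque([(source, None)])
    while queue:
        node, sender = queue.popleft()
        for peer in graph[node]:
            if peer == sender:
                continue
            sends += 1
            if peer not in seen:
                seen.add(peer)
                queue.append((peer, node))
    return len(seen), sends
def spanning_tree_sends(graph, source):
    # Sends needed if only the links of a breadth-first spanning tree are used.
    seen = {source}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for peer in graph[node]:
            if peer not in seen:
                seen.add(peer)
                queue.append(peer)
    return len(seen) - 1
graph = random_neighbors(50, 4)
reached, flood_sends = flood(graph, 0)
tree_sends = spanning_tree_sends(graph, 0)
```
Comparing flood_sends with tree_sends shows the duplicate traffic that the pruning and repairing of tree links is meant to avoid.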

23:18.000 --> 23:21.000
That was very, very fast.

23:21.000 --> 23:27.000
But yeah, hopefully it gives you an idea of like, let's help.

23:29.000 --> 23:32.000
Thank you, Floris. We appreciate you coming in here.