WEBVTT

00:00.000 --> 00:12.520
I think we're about time to start for the next talk.

00:12.520 --> 00:22.640
So I'm Floris, I work at number zero where we build iroh, and we kind of embraced

00:22.640 --> 00:29.480
QUIC for this. Specifically, last year I talked about

00:29.960 --> 00:37.160
how we wanted to go from normal single-path QUIC to QUIC multipath, and so this year I

00:37.160 --> 00:42.360
want to talk a bit more about that experience and what we learned along the way.

00:43.560 --> 00:50.520
I won't talk too much about iroh, but essentially the goal that we have is that

00:51.080 --> 00:57.160
we want fast and reliable connections anywhere, and we want them to be peer-based, as in

00:57.160 --> 01:02.200
anyone can accept incoming connections regardless of where you are on the internet, even

01:02.200 --> 01:08.280
on your home device at home or something. So this means we want to do peer-to-peer,

01:08.280 --> 01:13.080
and peer-to-peer means we want to do hole punching. That kind of pushes us towards...

01:14.600 --> 01:22.040
Hole punching is a lot easier on UDP, so that kind of pushes us towards UDP, and then

01:22.120 --> 01:29.800
users don't really want the unreliability of UDP, so on to QUIC, I guess. So that's kind of the story

01:29.800 --> 01:33.800
of how we ended up on QUIC.

01:33.800 --> 01:53.640
All right, sorry. Yeah, and the basic architecture of

01:54.680 --> 02:00.280
iroh, in some sense, is that we inherently have multiple paths, which is why we

02:01.160 --> 02:06.600
were inherently drawn to wanting QUIC multipath, because

02:06.600 --> 02:11.480
if you have the typical setup, you have two devices in home networks or

02:11.480 --> 02:17.640
something, you can't establish direct IP connections to them, and normally those

02:18.600 --> 02:23.800
require hole punching. I'm touching a little bit on hole punching at the end

02:23.960 --> 02:33.160
but mostly not. But the relay connection is sort of: each device makes a connection

02:33.160 --> 02:38.760
to the relay server, we do this via HTTPS upgraded to WebSockets, and then we can transport

02:38.760 --> 02:44.760
UDP datagrams via the relay server to each other. The general idea is that we use that to

02:44.760 --> 02:50.120
establish the connection if you can't establish it directly right away, and then we do

02:50.120 --> 02:54.200
hole punching coordination over that, and then you get the other paths. So we inherently have

02:54.200 --> 03:03.560
several paths; typically these days you get IPv4 and IPv6 if everything goes well. But the reason we want to

03:05.080 --> 03:10.920
go to multipath is that each path has its own properties, right? We just talked about congestion control:

03:10.920 --> 03:16.760
each path has its own congestion controller, and with single path you don't have this. You

03:17.400 --> 03:23.400
start on one of the paths, the traffic goes through that path, and then you switch to another path

03:23.400 --> 03:27.880
and then your congestion controller has to restart and figure everything out again. If you switch

03:27.880 --> 03:33.000
back as you move through networks, the congestion controller loses all that state,

03:33.000 --> 03:40.040
loss detection and such loses all that state, so that's not ideal. So that's kind of what drives us to do it.

03:40.760 --> 03:46.920
Another important part for us is that QUIC also inherently protects against

03:46.920 --> 03:53.560
linkability between different paths, so if you send packets for the same

03:53.560 --> 04:02.360
connection via two different paths, QUIC tries to avoid showing that these

04:02.360 --> 04:06.920
packets are actually related to the same connection. That is something we didn't have in the

04:06.920 --> 04:13.960
single-path QUIC version, so that was also a reason for us. So multipath is

04:13.960 --> 04:21.240
an IETF draft. It's actually, I think, maybe the last version of the draft by now; they

04:21.240 --> 04:28.440
even have the IANA numbers now. This is the full title of it. But it only specifies the

04:28.440 --> 04:33.000
wire protocol: it only says how you build packets, how you send them, how you decode them

04:33.080 --> 04:40.520
on the other side. It doesn't really talk about how you send data over the various

04:40.520 --> 04:44.840
paths or how you decide to do that; that is very much left up to the application.

04:46.840 --> 04:53.880
So it doesn't solve all the problems, but it gives you the tools: you have an interoperable

04:53.880 --> 05:01.400
wire protocol at least, which is a very good start. Before we go into multipath

05:01.400 --> 05:07.640
we need to talk about paths in QUIC itself, RFC 9000. So QUIC already had the notion

05:07.640 --> 05:18.600
of a path in a way, and it mostly existed because usually you have

05:18.600 --> 05:23.000
a client, well, the original design for HTTP/3 was that you have a client behind a

05:23.000 --> 05:28.440
NAT and you have a server in a data center. That's not what we want, but that's the original

05:29.240 --> 05:33.240
design. So the client, the NAT mapping could get rebound: if your connection is idle for a minute

05:33.240 --> 05:39.160
or something and then you suddenly start sending packets again, the NAT could have

05:39.160 --> 05:44.840
already decided that the first flow was finished,

05:44.840 --> 05:51.320
so it has to bind a new NAT mapping, and then suddenly the server sees

05:51.320 --> 05:56.200
valid packets come in from somewhere that isn't the client's old address, but it looks like it is the client.

05:56.520 --> 06:02.280
So what the server has to do is validate that this is actually

06:02.280 --> 06:09.400
the same client, and validating that means knowing that it

06:09.400 --> 06:15.320
still controls the TLS identity. The reason to validate is that this could also be an on-path

06:15.320 --> 06:21.640
attacker or something, where you suddenly pretend you're sending from somewhere else

06:21.720 --> 06:25.320
and then you can potentially redirect a lot of data to some interesting

06:25.320 --> 06:31.880
victim on the internet somewhere. And that's why this path challenge mechanism

06:31.880 --> 06:40.360
already exists, and NAT rebinding is the primary reason for this.

06:41.320 --> 06:46.680
And path challenge is about verifying that it's still the right peer you're talking

06:46.680 --> 06:54.520
to, and they are now just on a new address. So that looks something like this:

06:54.520 --> 07:01.720
assume that your client is originally sending from the first address,

07:01.720 --> 07:05.320
it's sending data, and then at some point the mapping changes. The client is going to be unaware of this, so

07:05.320 --> 07:09.640
at some point it just starts sending again, a STREAM frame in this case, so application data,

07:10.600 --> 07:16.840
and it's just going to appear from a new path. So the server has to do this path challenge

07:16.840 --> 07:21.560
thing, and the client has to send the path response. The client has to make sure that this goes back from

07:21.560 --> 07:27.720
the same address that it received on, which will be important in a second. But right now the

07:27.720 --> 07:35.720
client doesn't really know about this yet, it's just blissfully unaware, and it just

07:35.720 --> 07:42.040
sends a response. This packet has to be padded to at least 1200 bytes of payload;

07:43.880 --> 07:48.120
you get the IP header overhead on top of that. And that's to make sure that

07:48.120 --> 07:54.200
the path is actually capable of carrying traffic, that is, datagrams that are big enough

07:54.200 --> 07:59.320
to be usable. So it gives you a full idea that the path is actually usable.

07:59.880 --> 08:04.360
Only when this path validation has succeeded can the server start sending on the new path.
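
NOTE
Editor's sketch, not from the talk: a minimal Rust model of the path validation just described.
The names PathValidation, send_challenge, on_response and pad_for_validation are hypothetical
and are not Quinn or iroh APIs. It only shows the token match and the 1200-byte padding rule.
```rust
/// Smallest UDP payload that may carry a path challenge or response (RFC 9000).
const MIN_VALIDATION_DATAGRAM: usize = 1200;
/// Hypothetical per-path validation state kept by the challenging side.
#[derive(Debug, Default)]
struct PathValidation {
    pending: Option<[u8; 8]>,
    validated: bool,
}
impl PathValidation {
    /// Remember the 8-byte token we put into our PATH_CHALLENGE frame.
    fn send_challenge(&mut self, token: [u8; 8]) {
        self.pending = Some(token);
        self.validated = false;
    }
    /// Handle a PATH_RESPONSE; the path only validates if the token matches.
    fn on_response(&mut self, token: [u8; 8]) -> bool {
        if self.pending == Some(token) {
            self.pending = None;
            self.validated = true;
        }
        self.validated
    }
    /// Application data may only go out on a validated path.
    fn may_send_data(&self) -> bool {
        self.validated
    }
}
/// Zero-pad a plaintext datagram carrying a challenge or response to the minimum size.
fn pad_for_validation(mut datagram: Vec<u8>) -> Vec<u8> {
    if datagram.len() < MIN_VALIDATION_DATAGRAM {
        datagram.resize(MIN_VALIDATION_DATAGRAM, 0);
    }
    datagram
}
```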

08:04.600 --> 08:14.200
And for our implementation of multipath, we are writing everything in Rust and we

08:14.200 --> 08:20.040
started from the Quinn Rust library, and this is the kind of migration that it supported

08:20.920 --> 08:29.160
inside its data structures. However, even in RFC 9000

08:29.240 --> 08:35.240
that is not the limit of what you can do. Only the client is allowed to migrate,

08:36.840 --> 08:43.880
but that doesn't mean it can only do involuntary

08:43.880 --> 08:49.960
migrations; it can also knowingly migrate. For example, a common situation is

08:49.960 --> 08:53.560
when you have a mobile phone with Wi-Fi and mobile data interfaces.

08:54.440 --> 08:59.880
In theory, what the client is allowed to do, if it is aware of this, is that it can

08:59.880 --> 09:07.000
already try and validate the other path. So it could try and

09:07.000 --> 09:11.160
validate the path for the second interface. If it's originally been sending on the first

09:11.160 --> 09:17.480
address, it can set its source address to the second interface and it can do a

09:17.480 --> 09:24.440
path challenge. As long as it only puts path challenge and a very small

09:24.440 --> 09:31.240
number of other allowed frames in there, it only counts as this thing that is called

09:31.240 --> 09:36.840
a probing packet, and probing packets don't actually switch the path to the new one like

09:36.840 --> 09:44.840
you saw in the involuntary migration. This means that the client can already verify this path.

09:45.800 --> 09:50.280
The other thing to notice is that I added this path challenge on the way back:

09:50.280 --> 09:54.520
the server sees the probing packet, so it knows it doesn't

09:54.520 --> 09:59.080
need to switch the path, the client doesn't want to switch, but it can also already start to validate

09:59.080 --> 10:04.680
this path by putting its own path challenge in, because only the side that receives

10:04.680 --> 10:09.160
the response actually knows that it's validated, and both of them have to know that it's validated,

10:09.160 --> 10:13.880
so you need these three packets before you can fully validate.
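
NOTE
Editor's sketch, not from the talk: how a receiver might tell a probing packet from one that
migrates the path. The Frame enum is an illustrative subset and the function names are
hypothetical. The four probing frame types are the ones listed in RFC 9000 section 9.1.
```rust
/// A small, illustrative subset of QUIC frame types.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Frame {
    Padding,
    Ping,
    Stream,
    NewConnectionId,
    PathChallenge,
    PathResponse,
}
impl Frame {
    /// PADDING, NEW_CONNECTION_ID, PATH_CHALLENGE and PATH_RESPONSE are probing frames.
    fn is_probing(self) -> bool {
        matches!(
            self,
            Frame::Padding | Frame::NewConnectionId | Frame::PathChallenge | Frame::PathResponse
        )
    }
}
/// A packet is a probing packet only if every frame in it is a probing frame.
fn is_probing_packet(frames: &[Frame]) -> bool {
    frames.iter().all(|f| f.is_probing())
}
```
A single PING or STREAM frame is enough to make the packet non-probing, which is what triggers the switch.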

10:15.800 --> 10:21.320
Then, if you have two paths validated, or you could even pre-validate more

10:21.320 --> 10:28.920
paths if you wanted, I'm not sure if anyone actually implemented that fully, but then you're

10:28.920 --> 10:33.960
still sending on the first path, and then you switch to the next path by sending something

10:33.960 --> 10:38.600
that is not a probing packet, in this example a PING, but it doesn't really matter, right?

10:39.160 --> 10:45.560
But that switch happens suddenly: you just switch to the new path

10:45.560 --> 10:50.760
and now you're on that path, and when you do that you probably reset

10:50.760 --> 10:56.680
your congestion controller state etc., and then they build up their knowledge again

10:56.680 --> 11:03.560
from there. This brings us to how multipath models things, because in multipath

11:03.720 --> 11:13.720
they bring in the notion of a path ID, and the path ID is essentially just extending

11:13.720 --> 11:19.160
the packet number spaces. So in original QUIC you have three packet number spaces,

11:19.160 --> 11:25.000
and a packet number space is literally just the packet number: you start at zero,

11:25.000 --> 11:29.400
it keeps incrementing with every packet you send. QUIC never retransmits the same packet;

11:29.480 --> 11:34.280
in case of loss it builds a new packet with the data that was lost in it, but the

11:34.280 --> 11:42.120
packet number will always increment. The initial and handshake spaces are used very early on;

11:42.120 --> 11:47.080
most of the stuff is in the data space and that packet number just keeps incrementing. So for

11:47.080 --> 11:52.440
QUIC to be able to really get good loss detection signals and congestion control signals,

11:52.440 --> 11:57.400
they wanted to have the packet numbers for each path separately, which is why they decided to

11:57.400 --> 12:04.680
just split the packet number spaces up by path ID. It also affects the crypto, how you

12:04.680 --> 12:09.720
encrypt the packets: originally the nonce was derived just from

12:09.720 --> 12:17.720
the packet number, and now it also includes the path ID. But the combination of these basically

12:17.720 --> 12:23.160
should never be reused for the cryptography either, because if you reuse a nonce,

12:23.240 --> 12:33.480
that's not very good. So these spaces, the packet number spaces, are actually

12:36.200 --> 12:41.800
quite difficult to manage in a way. The reason is that the mapping was

12:41.800 --> 12:49.480
very difficult, from before multipath to after multipath. Just

12:50.440 --> 12:55.560
the spaces: if you skip over the initial and handshake spaces,

12:56.280 --> 13:04.520
because you don't really spend very long in those, the data you keep per path, essentially.

13:04.520 --> 13:10.040
And for the path, ideally you want to keep your congestion control state and all that

13:11.160 --> 13:14.360
per 4-tuple, so IP and port, source and destination.
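
NOTE
Editor's sketch, not from the talk: one way to picture the application data space split by
path ID, with a packet number counter per path. DataSpaces, PathState and nonce_input are
hypothetical names. The nonce_input layout (path ID next to packet number) is an assumption
for illustration only; the real construction is defined by the multipath draft.
```rust
use std::collections::BTreeMap;
use std::net::SocketAddr;
/// Hypothetical per-path state: one packet number space per path ID.
#[derive(Debug)]
struct PathState {
    local: SocketAddr,
    remote: SocketAddr,
    next_packet_number: u64,
}
/// Hypothetical container for the data space, keyed by path ID.
#[derive(Debug, Default)]
struct DataSpaces {
    paths: BTreeMap<u32, PathState>,
}
impl DataSpaces {
    /// Start tracking a path; its packet numbers begin at zero.
    fn open(&mut self, path_id: u32, local: SocketAddr, remote: SocketAddr) {
        self.paths.insert(path_id, PathState { local, remote, next_packet_number: 0 });
    }
    /// Hand out the next packet number on a path; numbers are never reused.
    fn next_packet_number(&mut self, path_id: u32) -> Option<u64> {
        let path = self.paths.get_mut(&path_id)?;
        let pn = path.next_packet_number;
        path.next_packet_number += 1;
        Some(pn)
    }
}
/// Assumed layout: path ID and packet number side by side, so the pair stays
/// unique even though every path counts from zero.
fn nonce_input(path_id: u32, packet_number: u64) -> [u8; 12] {
    let mut out = [0u8; 12];
    out[..4].copy_from_slice(&path_id.to_be_bytes());
    out[4..].copy_from_slice(&packet_number.to_be_bytes());
    out
}
```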

13:15.000 --> 13:23.480
But the space also contains the crypto keys. So especially during the

13:23.480 --> 13:29.240
handshake, you have cryptographic keys to encrypt the very first packet

13:29.240 --> 13:35.560
in QUIC, which are just for the ClientHello; then the ServerHello from TLS is

13:35.560 --> 13:39.400
encrypted with different keys, and then finally you have the final keys that the rest of the connection

13:39.400 --> 13:43.960
is encrypted with. So you have to derive these three sets of keys

13:44.680 --> 13:50.120
at the beginning as well. And this whole notion of multipath

13:50.120 --> 13:54.920
splitting the packet number spaces into more packet number spaces doesn't really match fully

13:54.920 --> 14:01.400
one-to-one to that, because the cryptographic keys for the entire data space stay the same

14:01.400 --> 14:07.720
even in multipath. So you end up having to split this data in

14:07.720 --> 14:13.240
very awkward ways, or at least in the implementation we had, in very awkward ways.

14:14.520 --> 14:19.000
The per-connection path data, which contains congestion control etc.,

14:19.960 --> 14:26.200
only had the current version and the previous version, and the reason to have the previous

14:26.200 --> 14:30.920
version is to protect against someone faking a migration.

14:31.880 --> 14:38.120
So the thing that you did there was: when you observe the migration as a server,

14:38.120 --> 14:43.560
you would still try and path-validate both the new and the old path, and then if the

14:43.560 --> 14:47.560
old path still validated, then you know that someone was trying to fake a migration

14:49.160 --> 14:56.040
and you can recover quickly. So that's why you have this previous path data

14:56.440 --> 14:58.440
stuff as well.
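
NOTE
Editor's sketch, not from the talk: the current and previous path slots described above,
in a deliberately simplified form. Paths, PathData and both method names are hypothetical,
and cwnd stands in for all per-path congestion and loss state.
```rust
/// Hypothetical per-path data; cwnd stands in for all congestion state.
#[derive(Debug, Clone, PartialEq)]
struct PathData {
    remote: String,
    cwnd: u64,
}
/// Hypothetical single-path layout: the current path plus the one before it.
#[derive(Debug)]
struct Paths {
    current: PathData,
    previous: Option<PathData>,
}
impl Paths {
    /// A packet arrived from a new address: switch, but keep the old state around.
    fn on_migration(&mut self, remote: &str, initial_cwnd: u64) {
        let fresh = PathData { remote: remote.to_string(), cwnd: initial_cwnd };
        self.previous = Some(std::mem::replace(&mut self.current, fresh));
    }
    /// The old path still answered its challenge: treat the migration as fake
    /// and go back to the saved state instead of starting from scratch.
    fn on_previous_validated(&mut self) {
        if let Some(old) = self.previous.take() {
            self.current = old;
        }
    }
}
```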

15:01.240 --> 15:06.920
The next thing is connection identifiers; this is kind of a big thing that you need to be

15:06.920 --> 15:12.600
aware of in multipath, because the problem is, right, where do packets come from?

15:13.800 --> 15:18.200
Because of this migration, right, you have this involuntary migration that

15:18.200 --> 15:23.560
can happen, packets could suddenly come in from a different IP address, so you can't use that to

15:23.560 --> 15:28.920
know which connection the packet belongs to, because that can suddenly change.

15:28.920 --> 15:34.920
So QUIC solves this with CIDs, connection IDs. It's essentially a small byte array in

15:34.920 --> 15:40.280
the packet header at some fixed offset; it depends on the header used, but essentially

15:40.840 --> 15:45.160
at some fixed offset, and that means that when you see a packet

15:46.040 --> 15:49.800
coming in from your socket, you can look at that connection ID and go: ah, that belongs to that

15:49.880 --> 15:55.000
connection, that belongs to that connection. But this also means that, because it is on the

15:55.000 --> 16:02.280
receiving side that you do this, the sending side has to know the CID that you want, and so you have to

16:02.280 --> 16:09.400
issue the CIDs, which is done with NEW_CONNECTION_ID frames. In multipath that's the

16:09.400 --> 16:15.720
new PATH_NEW_CONNECTION_ID frame, and the headers are actually completely unchanged in multipath.
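
NOTE
Editor's sketch, not from the talk: routing an incoming short-header packet by its
destination connection ID. CidRouter and its methods are hypothetical names, and the sketch
assumes every CID we issue has the same known length, which is what makes the offset fixed.
```rust
use std::collections::HashMap;
/// Hypothetical routing table: destination CID to (connection index, path ID).
#[derive(Debug, Default)]
struct CidRouter {
    table: HashMap<Vec<u8>, (usize, u32)>,
}
impl CidRouter {
    /// Record a CID that we issued to the peer for a given path ID.
    fn issue(&mut self, cid: &[u8], connection: usize, path_id: u32) {
        self.table.insert(cid.to_vec(), (connection, path_id));
    }
    /// In a short header the destination CID follows the first byte.
    fn route(&self, packet: &[u8], cid_len: usize) -> Option<(usize, u32)> {
        let cid = packet.get(1..1 + cid_len)?;
        self.table.get(cid).copied()
    }
}
```
The source address never enters the lookup, so a rebinding or a new path does not break routing.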

16:16.360 --> 16:23.320
So the way to do it is basically you issue sets of CIDs for each path ID, so the

16:23.320 --> 16:30.760
state of the CIDs gets multiplied by the number of paths you want to have. But

16:30.760 --> 16:36.360
one of the interesting consequences in implementing this is that, when you

16:36.360 --> 16:43.320
had a single path, the way CIDs were issued, you always had a CID

16:43.320 --> 16:48.360
from your peer, essentially, because of the way the connection is set up.

16:48.360 --> 16:53.480
You could never end up without a CID. OK, usually you want to rotate CIDs: when you have been

16:53.480 --> 16:58.200
idle for a minute or something you might as well go: well, let's rotate the CIDs, because why not.

16:58.760 --> 17:04.360
And in the worst case you couldn't do that if you didn't have more spare CIDs. In multipath now,

17:04.360 --> 17:09.160
if you want to start sending on a new path, even when you received a packet on that new path, you may

17:09.160 --> 17:14.680
not be able to send, because you just might not have CIDs yet. And that fallibility

17:14.680 --> 17:26.040
also had quite a lot of impact in the implementation. Which brings us to

17:26.040 --> 17:32.680
opening paths, or managing paths I guess. So opening paths: now we know how you can send

17:32.680 --> 17:37.800
a packet on a specific path and all the consequences that come with it.

17:37.880 --> 17:43.000
Opening a path is state that you have to track

17:43.000 --> 17:51.400
per path, which is kind of tricky as well. Firstly, only clients are allowed to

17:51.400 --> 17:59.080
open paths, which carries on from RFC 9000, right,

17:59.880 --> 18:04.840
where only clients were allowed to migrate, and it simplifies a lot of

18:04.840 --> 18:11.000
the design. But it's a little bit annoying for us, because we want to be

18:11.000 --> 18:16.840
peer-to-peer, right, we want both sides to be equal, but we can work around this: the client

18:16.840 --> 18:24.840
is always the one that started the connection. So to open a path, a client has to send, well,

18:24.840 --> 18:32.200
or rather, a side has to send or receive, basically, a packet on a new path ID, and that's all.

18:32.280 --> 18:37.640
It can even still be on the same 4-tuple, although you probably want to avoid that. But that's

18:37.640 --> 18:43.080
not enough to fully have an open path, because each 4-tuple that you actually send

18:43.080 --> 18:48.360
and receive on has to be validated before you can start sending on it. So if you,

18:48.360 --> 18:53.560
as a server, suddenly receive a packet that happens to use the CID of a new path ID that wasn't

18:53.560 --> 18:59.720
used yet, you then have to go and validate, and only when all of that succeeds can you

19:00.360 --> 19:09.400
start using the path. Abandoning is also a thing: both sides can abandon a path for

19:09.400 --> 19:14.120
whatever reason. Coordination-wise this is also tricky, because if you end up with no paths you don't

19:14.120 --> 19:21.240
have a connection anymore, so you have to be a bit careful there as well.
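
NOTE
Editor's sketch, not from the talk: one possible local guard against abandoning the last
open path. OpenPaths, AbandonError and abandon are hypothetical names, and refusing the
request is only one policy; it says nothing about what the peer abandons at the same time.
```rust
use std::collections::BTreeSet;
/// Why a local abandon request was refused in this sketch.
#[derive(Debug, PartialEq)]
enum AbandonError {
    UnknownPath,
    LastPath,
}
/// Hypothetical set of path IDs that are currently open.
#[derive(Debug, Default)]
struct OpenPaths {
    ids: BTreeSet<u32>,
}
impl OpenPaths {
    /// Abandon a path unless it is the only one left.
    fn abandon(&mut self, path_id: u32) -> Result<(), AbandonError> {
        if !self.ids.contains(&path_id) {
            return Err(AbandonError::UnknownPath);
        }
        if self.ids.len() == 1 {
            return Err(AbandonError::LastPath);
        }
        self.ids.remove(&path_id);
        Ok(())
    }
}
```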

19:21.880 --> 19:33.720
So that gets you to the last, I don't know, the last challenge of

19:33.720 --> 19:38.920
multipath, I guess, that I'll talk about, which is packet scheduling:

19:38.920 --> 19:44.440
if you have data to send, you have your pending data that you want

19:44.440 --> 19:48.680
to send, and you have to decide where to send it. That depends on whether there's any congestion

19:48.680 --> 19:59.400
control window and things like that, whether you have CIDs. This is kind of awkward

19:59.400 --> 20:05.640
to decide. The way we decided to do this for now, initially at least, we will

20:05.640 --> 20:12.520
improve on this later, is that initially we only want to send to a single remote

20:12.840 --> 20:19.240
at a time, as the primary. This path status, backup or available, is a mechanism

20:19.240 --> 20:27.720
in multipath that is entirely advisory, and we send it but we don't

20:27.720 --> 20:34.920
really use it. I'm being told my time is up, so, yeah, packet scheduling is sort of: you

20:34.920 --> 20:40.280
have to balance all those things and not do too many checks up front, but still

20:40.280 --> 20:44.360
manage to send your packets on the paths that can actually send them, or decide to

20:44.360 --> 20:51.400
wait. There are a few trade-offs there. The thing that I didn't get around to is how we then use

20:51.400 --> 20:56.840
multipath with QUIC NAT traversal; it's basically using path challenges to do hole

20:56.840 --> 21:11.560
punching, but I'll kind of skip past that part. Yeah. OK, so thanks for the great

21:11.560 --> 21:16.840
talk, Floris. So we don't have any time for questions, but maybe you can just catch them in the

21:16.840 --> 21:22.840
hallway or some other time.