WEBVTT

00:00.000 --> 00:22.220
Hello, welcome everyone. I'm Fernando. I'm a

00:22.220 --> 00:27.400
kernel network engineer at SUSE, and today I'm here to talk about

00:27.400 --> 00:36.520
tunnels and the lightweight tunnel infrastructure with nftables. So let's start with

00:36.520 --> 00:41.560
the topic. So what's a lightweight tunnel? Well, it's something that has been around

00:41.560 --> 00:47.800
in the kernel since 2015. It's not new, and it allows you to

00:47.800 --> 00:54.120
attach encapsulation instructions to routes, originally. We will come back to that

00:54.120 --> 01:01.680
later. It supports multiple kinds of tunnels, like IPIP, VXLAN, and many

01:01.680 --> 01:07.600
others. So why is this useful? This is useful for virtualized environments. If you are

01:07.600 --> 01:14.600
familiar with Kubernetes or environments with a lot of VMs, usually all the

01:14.600 --> 01:20.360
connectivity is tunnels. In enterprise networks you have the machines, VMs or

01:20.360 --> 01:24.560
bare metal connected to switches connected to a huge network and it's

01:24.560 --> 01:31.040
common that you want to connect everything together. Or in Kubernetes, it's also quite

01:31.040 --> 01:38.000
common to have pod-to-pod or node-to-node communication tunneled, and also there

01:38.000 --> 01:45.840
is a lot of work in virtualized networking, like OVS, that works through tunnels. So in

01:45.840 --> 01:51.000
essence, the initial support was something like what's up there. You add a

01:51.000 --> 01:57.920
route to a specific endpoint and you add some encapsulation instructions. So for

01:57.920 --> 02:02.480
example, here, in the case of VXLAN, you add the ID and you add the

02:02.480 --> 02:07.840
destination endpoint and then the device that is going to handle this. So the good
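
NOTE
Editor's sketch, not shown in the talk: the route-based setup described here could look roughly like this with iproute2. The device name vxlan0, the prefixes, the ID 100 and the 192.0.2.x endpoints are made-up example values. The script only prints the commands unless APPLY=1 is set, so it can be read without root.
```shell
#!/bin/sh
# Dry run by default: commands are printed, not executed.
# Set APPLY=1 and run as root to really apply them.
run() { if [ "${APPLY:-0}" = "1" ]; then "$@"; else echo "$@"; fi; }
# One metadata-based ("external") VXLAN device shared by all endpoints.
run ip link add vxlan0 type vxlan dstport 4789 external
run ip link set vxlan0 up
# Per-destination encapsulation instructions attached to routes.
# Prefixes, tunnel ID and remote endpoints are example values only.
run ip route add 10.1.1.0/24 encap ip id 100 dst 192.0.2.10 dev vxlan0
run ip route add 10.1.2.0/24 encap ip id 100 dst 192.0.2.11 dev vxlan0
```
Both routes point at the same vxlan0 device; only the encapsulation instructions differ per destination.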

02:07.840 --> 02:12.520
thing about this, if you are familiar with VXLAN, or with any

02:12.520 --> 02:15.800
tunnel actually, is that usually you would need to create one interface

02:15.800 --> 02:23.640
per tunnel that you create. So you have, let's say, for every endpoint that you

02:23.640 --> 02:28.920
want to connect, you have one tunnel interface that has source, destination,

02:28.920 --> 02:34.120
some options, and then on the other end you have another tunnel with source,

02:34.120 --> 02:39.760
destination and some options, and then the network in the middle. The good thing is that

02:39.760 --> 02:46.120
this allows you to create a single device, it could be IPIP, VXLAN

02:46.120 --> 02:52.960
or whatever, and that single device can be used for different endpoints, and this is

02:52.960 --> 02:56.840
super useful in Kubernetes, for example, when you need to create and manage

02:56.840 --> 03:02.640
a huge amount of interfaces. So you can reduce that amount of interfaces and, in

03:02.640 --> 03:09.920
a sense, make it simpler and more efficient. All right, so what's new? Why

03:09.920 --> 03:16.160
am I talking about this in 2026, 11 years later? Well, we have added support

03:16.160 --> 03:23.800
to Netfilter and nftables for it. And it's true that lightweight tunneling is not

03:23.800 --> 03:27.680
very well documented. It's hard to find documentation on how it works, how to

03:27.680 --> 03:33.840
configure it, and if you are an expert on virtualized networking then it's

03:33.840 --> 03:39.440
easier, but usually it's pretty hard to figure out how to use it.

03:39.440 --> 03:45.360
So there is a low adoption rate, as I say, probably due to the lack of documentation.

03:45.360 --> 03:50.960
And I must admit that the current documentation in nftables is not that good,

03:50.960 --> 03:55.680
so that's on my side. I'm working on it, sorry, and I will try to

03:55.680 --> 04:01.840
have a patch to improve that documentation. And the thing is that in iproute

04:01.840 --> 04:07.360
the implementation and configuration is sometimes not very compatible with

04:07.360 --> 04:11.360
NetworkManager, because NetworkManager also touches routes and all this stuff, so

04:11.360 --> 04:15.440
there are conflicts and things sometimes don't work very well. So either you

04:15.440 --> 04:22.080
cannot use it or you need to drop NetworkManager. And in this case, as the

04:22.080 --> 04:26.800
implementation in nftables does not touch the routes, what it does is create an

04:26.800 --> 04:33.840
nftables ruleset and inject the encapsulation instructions there, it's completely

04:33.840 --> 04:37.280
compatible with this kind of daemon, like NetworkManager, systemd-networkd

04:37.280 --> 04:45.120
and so on. So that's another thing. Also, the support in

04:45.120 --> 04:50.080
nftables is more flexible because it's not dependent on routes anymore. You can match

04:50.720 --> 04:57.040
on any characteristic that you want to split the traffic, and it doesn't necessarily need to be

04:57.040 --> 05:01.600
the source and the destination. It could be the ID or it could be whatever other stuff that

05:01.600 --> 05:07.920
you want to match. And it allows you also to combine it with some features like maps,

05:07.920 --> 05:16.160
sets, dynamic sets, dynamic maps, and actually any nftables expression that you want to use.

05:16.240 --> 05:23.280
And this helps you to scale up and simplify the configuration from the system administrator's

05:23.280 --> 05:30.080
point of view. Also, by the way, in nft this is an object, so it's not like a standard

05:30.080 --> 05:36.160
expression, it's an object statement. And what is good about this is that it supports the update

05:36.160 --> 05:42.080
operation. So imagine that you have a tunnel created with a tunnel template, which we are going to look

05:42.080 --> 05:47.280
at later, and you want to update some of the configured fields. You can do it using the

05:47.280 --> 05:52.320
transaction system that nftables supports, so you won't have intermediate states like you could have,

05:52.320 --> 05:59.440
for example, in iproute, because there you are executing commands in a specific order. If you have

06:00.160 --> 06:06.160
high-flow traffic, there could be some packets in the middle that suffer drops or some

06:06.160 --> 06:13.760
connectivity issues. All right, so first let's look at what we did in the kernel. So this is

06:13.760 --> 06:20.880
the key piece of the work, in which, in essence, we just take the nft tunnel object information, we get the

06:20.880 --> 06:28.960
skb when we have an evaluation, and an evaluation is in essence a match, and we drop initially the

06:28.960 --> 06:37.600
dst metadata that is attached to the skb, and we put our, well, we get a reference to the

06:37.600 --> 06:46.000
metadata that we have generated and we attach it to the skb. And this sounds super simple, but,

06:47.520 --> 06:54.320
well, first, md, it's a metadata structure. In essence, it contains an IP tunnel info

06:54.320 --> 06:59.920
and it carries all the information of the tunnel. And this goes from generic things like, let's

06:59.920 --> 07:10.640
say, source address, IPv4 and IPv6, destination, port, ID, TTL or other stuff, but also the specifics

07:11.520 --> 07:16.000
of that tunnel. Let's say, for example, if you have Geneve, you know that you can have multiple

07:16.000 --> 07:21.760
options configured; if you have VXLAN, you have the GBP configured. So all of that is contained.

07:22.320 --> 07:29.920
And the hard work that nftables is doing is, in essence, when you, from user space,

07:29.920 --> 07:34.560
configure the tunnel, it validates all of that and makes sure that you are not configuring something

07:34.560 --> 07:42.880
wrong, in the sense that it has to make sense, like you're not mixing IPv4 with IPv6 or you are not

07:42.880 --> 07:49.760
setting invalid values, and it's parsing all of that into the correct tunnel type and then

07:49.760 --> 07:57.680
inside the md. But, as I told you before, attaching it to the skb is kind of trivial thanks

07:57.680 --> 08:03.440
to all the mechanisms that we already have, because actually without them it would be quite complex.

08:04.400 --> 08:12.080
So let's look at a real nft tunnel object. So this is a generic template. This is not,

08:12.640 --> 08:21.280
this could be used, for example, for any kind of tunnel, IPIP, VXLAN, Geneve and so on, because it doesn't

08:21.280 --> 08:27.920
have any specific options for it. But of course this is not configuring any of the specific options,

08:27.920 --> 08:35.520
and the default will be whatever is defined in the driver of the tunnel.

08:35.520 --> 08:42.080
So if you want to configure them, we'll look at that later. So in essence we have a table; in that table

08:42.800 --> 08:49.280
we create the object that will be the tunnel. We put the name and we can define the ID,

08:50.240 --> 08:55.280
source address and destination address. This example is IPv4, but IPv6 is supported.

08:56.160 --> 09:03.120
And then the destination port and TTL. That's it. We are going to look later at how to use this

09:03.120 --> 09:12.160
object. But there are variants. So, as I say, if you want to define some of the options that are more

09:12.160 --> 09:18.400
specific to the tunnel type, like for example the VXLAN GBP option, you need to add a

09:18.400 --> 09:25.280
section. So you add a small section under the tunnel definition and you put vxlan, gbp, in this

09:25.280 --> 09:31.200
case 100. And for Geneve it will be the same: geneve and the options configured.
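
NOTE
Editor's sketch, not the slide from the talk: a tunnel object of the shape described here, with the generic fields (ID, source, destination, destination port, TTL) and a VXLAN section setting GBP to 100 as mentioned. The table name demo, the object name t1 and all addresses are hypothetical, and the exact keywords are an assumption to be checked against the nft(8) man page of your nftables version. The script only prints the ruleset unless APPLY=1 is set.
```shell
#!/bin/sh
# Hypothetical names (demo, t1) and example addresses; verify syntax with nft(8).
TUNNEL_TABLE=$(cat <<'EOF'
table netdev demo {
    tunnel t1 {
        id 100
        ip saddr 192.0.2.1
        ip daddr 192.0.2.2
        dport 4789
        ttl 64
        vxlan {
            gbp 100
        }
    }
}
EOF
)
# Dry run by default: print the ruleset. With APPLY=1 (as root) load it in one transaction.
if [ "${APPLY:-0}" = "1" ]; then printf '%s\n' "$TUNNEL_TABLE" | nft -f -; else printf '%s\n' "$TUNNEL_TABLE"; fi
```
Dropping the vxlan section gives the generic template that can be paired with any tunnel device type.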

09:32.960 --> 09:41.520
So, all right, what are the real-world examples that we can deal with with this kind of

09:41.520 --> 09:46.880
tunneling? So we have two VMs that are connected to each other, like for example in a Kubernetes

09:46.880 --> 09:55.520
cluster, where they are the nodes. Then these VMs have a VXLAN, and there are containers

09:55.520 --> 10:03.440
configured with a veth to allow container communication with the host of the VM, with the

10:03.440 --> 10:11.680
guest VM, sorry. So here we will use the template to allow that encapsulation and be able to communicate.

10:11.840 --> 10:18.000
And the good thing is that if we have three or four containers we could use the same VXLAN

10:18.000 --> 10:28.240
interface; we wouldn't need to create one per container. All right, so this will be the ruleset

10:28.240 --> 10:35.600
for that specific scenario. As I say, we have the definition of the tunnel and then we need to

10:35.600 --> 10:41.600
redirect the traffic to the tunnel and from the tunnel, because here we need to redirect

10:41.600 --> 10:48.880
from the container and to the container. So we in essence match the address, and that address will be

10:49.520 --> 10:58.320
the one that is configured on the containers. And after that we define, right, we want the tunnel

10:58.320 --> 11:07.280
object to kick in, and we define tunnel name, the name of the tunnel, and later we forward it to

11:07.280 --> 11:13.680
vxlan0. So the VXLAN, the VXLAN device, will be able to look at the encapsulation

11:13.680 --> 11:22.320
instructions that are defined there and do proper encapsulation. But we also need to do something else:

11:22.320 --> 11:30.240
when we get the traffic from the other container, we need to forward it

11:30.240 --> 11:38.560
to the veth in the host, so it actually lands in the container. So you have traffic in

11:38.560 --> 11:46.560
both ways. This is, by the way, cool: if you use iproute to do this, you will probably need to use tc

11:46.560 --> 11:51.680
to redirect the traffic from the guest to the container when you are receiving it, because with a

11:51.680 --> 12:00.160
route alone it's usually not enough. Another example, that I'm not going to get into very much,
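
NOTE
Editor's sketch of the container scenario described above, not the ruleset shown on the slide. One chain matches the remote container address on traffic coming from the local veth, attaches the tunnel object by name and forwards to vxlan0; a second chain forwards decapsulated traffic arriving on vxlan0 back to the veth. All names (demo, t1, veth0, vxlan0) and addresses are hypothetical, ARP and neighbour handling are left out, and the syntax is an assumption to be checked against nft(8). The script only prints the ruleset unless APPLY=1 is set.
```shell
#!/bin/sh
# Hypothetical names and example addresses; verify syntax with nft(8).
RULESET=$(cat <<'EOF'
table netdev demo {
    tunnel t1 {
        id 100
        ip saddr 192.0.2.1
        ip daddr 192.0.2.2
        dport 4789
        ttl 64
    }
    chain from_container {
        type filter hook ingress device "veth0" priority 0; policy accept;
        # Traffic for the remote container: attach tunnel metadata, send to the shared device.
        ip daddr 10.0.0.2 tunnel name t1 fwd to "vxlan0"
    }
    chain from_tunnel {
        type filter hook ingress device "vxlan0" priority 0; policy accept;
        # Decapsulated traffic for the local container: hand it to the host-side veth.
        ip daddr 10.0.0.1 fwd to "veth0"
    }
}
EOF
)
# Dry run by default: print the ruleset. With APPLY=1 (as root) load it in one transaction.
if [ "${APPLY:-0}" = "1" ]; then printf '%s\n' "$RULESET" | nft -f -; else printf '%s\n' "$RULESET"; fi
```
Adding a container would mean another tunnel object and another pair of rules, while vxlan0 stays the only tunnel device.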

12:00.160 --> 12:07.040
is, yeah, the simple tunnel example: you have two VMs connected to a switch connected to a network,

12:07.680 --> 12:15.120
and you create a VXLAN, and therefore you encapsulate the traffic between them. So

12:15.280 --> 12:23.280
let's look at a real example for this specific scenario.

12:25.680 --> 12:33.120
So this demo is recorded with asciinema, so I'm going to explain what's happening.

12:33.120 --> 12:39.120
So first I'm going to log into one of the VMs, because we need to configure both VMs.

12:45.920 --> 12:57.440
Right, so what we have done here... let me try to make it bigger. Fine. Okay, yeah, so

12:57.440 --> 13:12.240
what we have done here is to create two namespaces. Okay, they are... right. Okay, okay, that's fine.

13:12.400 --> 13:23.120
So we have created two namespaces. Those namespaces have the veths configured and so on,

13:23.120 --> 13:29.520
and we have used the same ruleset that we showed before, but instead of having only one address we

13:29.520 --> 13:36.080
have two addresses, because both containers have different addresses. And in essence, yeah, this is the

13:36.160 --> 13:42.800
ruleset. And now I'm going to log into the other machine to configure the other end.

13:53.680 --> 13:59.200
Okay, yeah, so it's configured, and now I'm going to ping from one container to the other

13:59.200 --> 14:12.320
container using the tunnel, which should be happening... so, yes. All right, so I'm going to

14:13.120 --> 14:18.880
the namespace. This is done with namespaces, but with a VM it works kind of the same way.

14:20.160 --> 14:26.080
All right, and as you can see I'm able to ping from the container to the other end, and from

14:29.200 --> 14:36.560
here, the container, to the other one, so both ends. And I'm going to show you now that I'm only

14:36.560 --> 14:43.120
using one VXLAN. And also, yeah, here you can see there is only one, which doesn't have any IP

14:43.120 --> 14:49.680
configured, and I have both veths on the host, both pairs on the host, and the physical

14:49.680 --> 14:56.320
NIC that I'm using with the address configured, and no routes, well, the default routes of course for

14:56.320 --> 15:05.360
connectivity. And yeah, that's basically it. We were able to achieve our connections

15:06.080 --> 15:14.640
from different endpoints using a single VXLAN, and this can be scaled up to a very, very

15:15.600 --> 15:34.640
big amount of containers and endpoints. So yeah, all right, so that's it. Thank you very much,

15:34.640 --> 15:44.960
everyone here, thank you for listening, and also thanks to all the volunteers and

15:45.040 --> 15:48.880
the organization.

15:52.000 --> 15:57.760
Any questions? Thank you.

15:57.760 --> 15:58.760
Oh, there.

16:10.760 --> 16:11.760
Yeah, hey.

16:11.760 --> 16:15.760
So I just wanted to know the minimum

16:15.760 --> 16:17.760
kernel version, basically.

16:17.760 --> 16:21.760
Since when is this supported in nftables?

16:21.760 --> 16:26.760
So with the latest bug fixes, I believe it is 6.18.

16:26.760 --> 16:28.760
It is 6.18.

16:28.760 --> 16:31.760
So we fixed some bugs in Geneve.

16:31.760 --> 16:37.760
But for VXLAN, you should be able to use it since 6.4.

16:37.760 --> 16:38.760
Something like that.

16:38.760 --> 16:41.760
Off the top of my head; maybe it's a little bit off.

16:41.760 --> 16:44.760
But if you want to use it, you need 6.18.

16:44.760 --> 16:45.760
Thank you.

16:45.760 --> 16:46.760
Thank you.

16:46.760 --> 16:59.760
So suppose that some person from the neighbouring room,

16:59.760 --> 17:01.760
hosting eBPF, comes here

17:01.760 --> 17:02.760
and asks the question:

17:02.760 --> 17:05.760
Is it possible to port this implementation

17:05.760 --> 17:09.760
from nftables to some XDP or BPF program?

17:09.760 --> 17:10.760
Is it possible?

17:10.760 --> 17:11.760
Probably.

17:11.760 --> 17:12.760
I don't know.

17:12.760 --> 17:13.760
I'm not a BPF expert.

17:13.760 --> 17:14.760
I don't know.

17:14.760 --> 17:17.760
But I'm pretty sure that someone could do it.

17:17.760 --> 17:19.760
It's not rocket science.

17:19.760 --> 17:23.760
So it's just manipulation of the packet?

17:23.760 --> 17:25.760
Yeah, it's manipulation of the packet.

17:25.760 --> 17:26.760
In essence, what is it?

17:26.760 --> 17:29.760
Well, actually, I'm not very sure that you can do it.

17:29.760 --> 17:32.760
Because, as far as I know, from BPF or XDP context,

17:32.760 --> 17:35.760
you cannot see the skb, which is the packet representation in the kernel.

17:35.760 --> 17:39.760
And what we are doing here is attaching the

17:39.760 --> 17:43.760
instructions to do the encapsulation to that skb structure.

17:43.760 --> 17:47.760
So actually, I don't know.

17:47.760 --> 17:59.760
Maybe you can go to the... I guess he might know better.

17:59.760 --> 18:00.760
Thank you very much.

18:00.760 --> 18:01.760
All right.

18:01.760 --> 18:02.760
Thank you.

