WEBVTT

00:00.000 --> 00:18.000
Good afternoon everyone, this is Jose Castillo, I'm here with my colleague Raul, we are both

00:18.000 --> 00:25.200
performance engineers at Red Hat, we work in the OpenShift and Kubernetes ecosystem

00:25.200 --> 00:30.640
performance teams and we are going to discuss today a performance comparison and evaluation

00:30.640 --> 00:37.520
of different Kubernetes multi-cluster networking solutions. Basically we compare

00:37.520 --> 00:48.080
Skupper, Submariner and Istio in terms of throughput, latency, resource consumption, and power

00:48.080 --> 00:55.520
and other metrics. Thanks, Raul. This was part of a research effort that we did last week

00:55.520 --> 01:02.960
with some colleagues, Sai, who is now at Nvidia, and André, who is a professor

01:02.960 --> 01:10.480
in Italy, if I'm not mistaken. Next slide, please. So this is a summary of what we are going to discuss

01:10.480 --> 01:16.240
today, first of all we are going to discuss why we need multi-cluster solutions, why multi-cluster

01:16.240 --> 01:24.000
solutions are relevant today versus like huge single-cluster instances. Then we will discuss each

01:24.000 --> 01:29.840
of the solutions that we analyzed, Submariner, Skupper and Istio; each of them is very different

01:29.840 --> 01:37.200
in a lot of aspects, the network architecture, the network technology that they use to transfer

01:37.200 --> 01:42.880
the data, not only that, the use cases that they cover, the end user for each of the solutions

01:42.960 --> 01:48.320
is different, so I think it is important to understand the differences between those and just

01:48.320 --> 01:58.560
to justify why we chose those solutions. Then we will describe the test bed that we use for the

01:58.560 --> 02:03.440
comparison; it was a couple of bare-metal enterprise environments,

02:04.960 --> 02:10.880
some of the tooling, a lot of tooling for the data plane, some for throughput, some for latency,

02:10.960 --> 02:18.080
and also the tooling that we use for the power and energy metrics. Finally we will discuss the

02:18.080 --> 02:24.400
results and we will comment on some future work improvements that we have in mind.

02:27.280 --> 02:35.360
So first question: why multi-cluster, why do we use several Kubernetes clusters? This is a trend right

02:35.360 --> 02:44.960
now. So besides a couple of use cases that really need huge clusters, like I'm thinking like

02:44.960 --> 02:50.320
AI training and stuff like that, in general there's a trend where we see that the

02:50.320 --> 02:58.320
nodes of Kubernetes clusters are usually distributed across smaller clusters. So there's a bigger

02:58.400 --> 03:06.000
number of clusters, each with a smaller number of nodes. The drivers are several: we may need several

03:06.000 --> 03:11.280
clusters for high availability reasons, we may need several clusters because we want to implement

03:11.280 --> 03:16.880
a disaster recovery solution, sometimes it is because, for quality of service reasons, we want

03:16.880 --> 03:23.440
to deliver data closer to the end user. So there's plenty of reasons why we would like to have

03:24.400 --> 03:29.840
several Kubernetes clusters to deploy applications. On top of that, there's hybrid cloud,

03:29.840 --> 03:36.080
sometimes we need or prefer to have some applications running on premises, in the data center,

03:36.080 --> 03:41.600
and we have other environments and other applications that are better located in cloud environments.

03:42.160 --> 03:49.760
So multi-cluster is really important nowadays. There's the management question of multi-cluster,

03:49.920 --> 03:55.520
so there are a lot of solutions there, Open Cluster Management, Advanced Cluster Management and others,

03:55.520 --> 04:00.480
but today we'll be focusing on the data plane aspect, and on the networking aspect of the

04:00.480 --> 04:08.880
multi-cluster framework. And this is the first solution that we're going to be analyzing,

04:08.880 --> 04:17.200
it's called Submariner. Submariner creates an IPsec tunnel. It also supports WireGuard, but IPsec

04:17.200 --> 04:23.920
is the default for creating this connectivity between the clusters. One of the worker

04:23.920 --> 04:30.400
nodes of each cluster is marked as the gateway node, so the east-west traffic in the left cluster,

04:30.400 --> 04:36.080
cluster 1, goes to the gateway node, and the gateway node, using kernel networking and the IPsec

04:36.080 --> 04:43.120
module, establishes a tunnel with the other cluster. So that's how Submariner works.
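
NOTE
A minimal sketch of the Submariner setup described above, using the upstream subctl CLI; kubeconfig paths, cluster IDs and the node name are illustrative, not taken from the talk.
    # On the broker (hub) cluster: deploy the broker and generate broker-info.subm
    subctl deploy-broker --kubeconfig kubeconfig-cluster1
    # Optionally pre-label the worker that should act as the gateway node
    # (otherwise subctl labels one for you)
    kubectl --kubeconfig kubeconfig-cluster1 label node worker-1 submariner.io/gateway=true
    # Join each cluster to the broker; IPsec (Libreswan) is the default cable driver,
    # WireGuard can be selected with --cable-driver wireguard
    subctl join broker-info.subm --kubeconfig kubeconfig-cluster1 --clusterid cluster1
    subctl join broker-info.subm --kubeconfig kubeconfig-cluster2 --clusterid cluster2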

04:43.120 --> 04:49.040
Of the different solutions that we are going to analyze, Submariner is the only one that has

04:49.040 --> 04:54.400
a somewhat centralized architecture. There's one cluster which is the hub, and the other clusters

04:54.400 --> 05:02.960
are considered the spokes. It requires some privileges; for the Submariner use case, the end user is

05:02.960 --> 05:10.400
the cluster administrator. It basically is a solution that connects clusters. The second solution

05:10.400 --> 05:16.560
that we're going to analyze is called Skupper. Skupper is very different: instead of connecting

05:16.560 --> 05:23.760
clusters, it works, let's say, the other way around: it connects different namespaces. The end user of Skupper is

05:23.760 --> 05:30.480
a developer, so you don't need to be an administrator of the cluster to be able to deploy Skupper.

05:30.480 --> 05:35.120
If you have tenant-level user access to a couple of clusters, you are able to connect

05:35.120 --> 05:42.080
those namespaces using mTLS. Basically Skupper creates a pod, which is a router pod,

05:42.080 --> 05:48.960
and all the traffic goes through that gateway pod using user-space networking.

05:49.840 --> 05:53.840
It's also interesting to note that the Skupper router can be scaled out

05:53.840 --> 05:58.560
horizontally, so it's a really cloud-native solution and framework.
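
NOTE
A rough sketch of that namespace-to-namespace flow with the Skupper v1 CLI; namespace, file and deployment names are illustrative.
    # In namespace "east" on cluster 1 (tenant-level access is enough)
    skupper init
    skupper token create east-token.yaml
    # In namespace "west" on cluster 2
    skupper init
    skupper link create east-token.yaml   # mTLS link between the router pods
    # Expose a workload so it is reachable from the linked namespace
    skupper expose deployment/backend --port 8080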

05:58.560 --> 06:11.360
Finally, we analyzed the performance of Istio. It's very well known because it's a service

06:11.360 --> 06:17.120
mesh solution, very popular within one cluster, but it can also be used to connect several clusters.

06:17.920 --> 06:24.160
There are several modes of operation for Istio. Historically, Istio had this

06:24.800 --> 06:30.800
Envoy sidecar, so for every pod you have this sidecar running Envoy, and that's where the

06:30.800 --> 06:37.280
gateway magic happened and all the advanced layer 7 features. There's another

06:37.280 --> 06:41.440
mode of operation; the downside of this Envoy approach is that it

06:44.160 --> 06:50.000
consumes a lot of resources. Basically for each pod, you have an extra container, right?

06:50.560 --> 06:56.000
There's a newer solution, which is called ambient mode, in which you only have like one gateway

06:56.000 --> 07:00.800
per each of the workers. Unfortunately, this new solution called ambient mode doesn't work with

07:00.800 --> 07:08.160
multi-cluster, so we were forced somehow to use the classical Envoy sidecar solution for the testing.

07:11.200 --> 07:16.240
The same way that Submariner connects clusters and Skupper connects namespaces,

07:16.240 --> 07:21.840
Istio basically connects services across clusters, and it's also a solution that can be deployed

07:21.840 --> 07:29.440
at a tenant level. There is some trust that you need to establish between the clusters,

07:29.440 --> 07:36.080
in the case of Istio, to be able to connect the services across the cloud continuum, in the

07:36.160 --> 07:48.640
form of certificates and a trust chain. Here we see a comparison between the three solutions.
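
NOTE
A quick aside on that trust setup before the comparison: in Istio multi-cluster deployments the usual pattern is to give every cluster an intermediate CA signed by a common root, loaded as the cacerts secret, and then exchange remote secrets. A sketch following the upstream docs (file names follow Istio's convention; older releases expose the second command as istioctl x create-remote-secret):
    # Run in each cluster before installing Istio
    kubectl create namespace istio-system
    kubectl create secret generic cacerts -n istio-system \
      --from-file=ca-cert.pem --from-file=ca-key.pem \
      --from-file=root-cert.pem --from-file=cert-chain.pem
    # Let each control plane discover the other cluster's endpoints
    istioctl create-remote-secret --name=cluster1 | kubectl apply -f - --context=cluster2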

07:49.760 --> 07:54.720
Basically, Submariner is the only solution that's more centralized, that follows a

07:54.720 --> 08:00.800
hub-and-spoke architecture. It's the only layer 3 solution; it uses IPsec or

08:00.960 --> 08:07.760
WireGuard in kernel networking mode. We're going to see afterwards, when Raul discusses the results,

08:08.400 --> 08:14.400
that there are some limitations to the IPsec kernel implementation, and this translates

08:15.360 --> 08:21.600
into some bandwidth limitations as well. What each of the solutions connects, as I said:

08:21.600 --> 08:25.840
Submariner connects different clusters, and it needs to be done by the cluster administrator;

08:26.720 --> 08:31.120
Istio connects services and Skupper connects different namespaces.

08:33.120 --> 08:37.680
The end-user personas are also different: for Skupper, the end-user persona is the developer or the

08:38.320 --> 08:42.720
tenant user of the clusters; for Submariner it's usually the cluster admin,

08:42.720 --> 08:45.840
and in Istio we have both kinds of personas.

08:47.520 --> 08:52.400
Security features are a little bit different: both Skupper and Istio leverage mTLS,

08:53.360 --> 08:55.360
and Submariner uses IPsec.

08:57.840 --> 09:03.920
It's also relevant to note that Skupper and Istio are solutions that are independent of the container

09:03.920 --> 09:09.520
network interface, so it doesn't matter if you're using OVN-Kubernetes, Cilium or whatever, but Submariner is

09:09.520 --> 09:16.400
dependent on the C&I, and it supports a subset of the, of the C&I implementations.

09:16.480 --> 09:24.560
All of them obviously support the main scenarios, multi-cloud and hybrid scenarios.

09:25.600 --> 09:32.000
All of them have the same kind of Apache license, and the only project that's not CNCF-

09:32.000 --> 09:41.040
hosted is currently Skupper. And this is the test bed that we used for the testing.

09:41.680 --> 09:48.320
We had a couple of bare-metal clusters, enterprise servers with a number of nodes.

09:51.360 --> 09:57.120
We have obviously the control plane, three masters, and we isolated all the gateway components of

09:57.120 --> 10:03.200
each of the solutions in one particular node, one particular worker, in each of the clusters.

10:03.920 --> 10:08.560
We did that to make sure that we were able to measure the overhead of each of the solutions.

10:08.560 --> 10:14.560
So we wanted to know the price that we need to pay in terms of CPU, memory and energy

10:14.560 --> 10:20.800
for connecting the clusters, both for Submariner, for Skupper and for Istio.
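
NOTE
One generic way to pin the gateway components of each solution to a dedicated worker, as described above; the label and taint names are illustrative, and each solution also has its own placement knobs (for example Submariner's gateway label).
    # Reserve one worker per cluster for the gateway components
    oc label node worker-gw mc-gateway=true
    oc adm taint nodes worker-gw mc-gateway=true:NoSchedule
    # The gateway pods then need a matching nodeSelector and toleration:
    #   nodeSelector:
    #     mc-gateway: "true"
    #   tolerations:
    #   - key: mc-gateway
    #     operator: Exists
    #     effect: NoSchedule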

10:25.200 --> 10:31.040
We also wanted to make sure that the tuning and NUMA approach for the testing was consistent.

10:31.680 --> 10:36.720
So we applied this performance profile in the cluster, the idea was to

10:36.720 --> 10:44.080
reserve some CPUs for the operating system, and to have some CPUs also for the workloads,

10:44.080 --> 10:46.320
for the client and server pods.

10:47.440 --> 10:55.040
We wanted to make sure that we weren't interrupted by the operating system,

10:55.040 --> 11:00.160
so we wanted to make sure that the pods that run the actual workload were not

11:00.160 --> 11:01.760
interrupted at any given moment.

11:03.040 --> 11:09.040
It was also important for us that the workloads run within the same NUMA node; as you guys know,

11:10.240 --> 11:14.240
there's a penalty that you pay if

11:14.240 --> 11:19.680
the PCI address of the interface is in one NUMA node and you have cross-NUMA-node traffic,

11:19.680 --> 11:23.520
so that's something that we also wanted to avoid for this particular testing.

11:24.080 --> 11:29.040
This is the performance profile, and then from the pod perspective,

11:33.680 --> 11:38.560
we just pointed the workloads to this performance profile,

11:38.560 --> 11:45.520
and we didn't get any interruptions from the operating system.

11:46.640 --> 11:50.800
That was to make sure that the workloads completed,

11:50.800 --> 11:53.040
and the results were reproducible.
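
NOTE
The exact profile is the one shown on the slide; a generic OpenShift PerformanceProfile with the properties just described (reserved vs. isolated CPUs, single-NUMA-node alignment) looks roughly like this, with illustrative CPU ranges and node selector:
    apiVersion: performance.openshift.io/v2
    kind: PerformanceProfile
    metadata:
      name: multicluster-netperf
    spec:
      cpu:
        reserved: "0-3"      # kept for the operating system / housekeeping
        isolated: "4-31"     # dedicated to the client and server pods
      nodeSelector:
        node-role.kubernetes.io/worker-cnf: ""
      numa:
        topologyPolicy: single-numa-node
      realTimeKernel:
        enabled: false
    # The workload pods then request guaranteed QoS (integer CPU requests equal to limits)
    # so they are pinned to isolated CPUs and are not interrupted.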

11:56.640 --> 12:01.680
And I'm going to let Raul discuss some of the tooling and the results of the testing.

12:07.360 --> 12:08.800
Right, thank you, Jose.

12:08.800 --> 12:09.520
You hear me well?

12:10.800 --> 12:15.440
Okay, so, I'm going to speak a little bit about the tooling we used

12:16.400 --> 12:24.800
for layer 4 testing, let's say, TCP throughput and latency.

12:24.800 --> 12:29.760
We used uperf; there are many other alternatives like netperf or iperf,

12:29.760 --> 12:36.080
but for this particular testing we used uperf because we were more or less used to

12:36.080 --> 12:45.040
its results, so I consider it a stable tool which returns quite deterministic results.

12:45.200 --> 12:48.960
This is something we are looking for.

12:48.960 --> 12:54.480
For layer 7 testing, HTTP testing, we used wrk2,

12:54.480 --> 13:02.640
which, for those who don't know what it is, is an HTTP load tester,

13:02.640 --> 13:12.240
but with the difference, compared with the original wrk, that wrk2 has an option

13:12.320 --> 13:26.800
to generate a constant, fixed rate of HTTP

13:26.800 --> 13:38.160
requests per second, which is something necessary to measure latency safely.

13:39.120 --> 13:42.880
And with respect to energy monitoring, we used Kepler,

13:42.880 --> 13:46.560
which stands for Kubernetes-based Efficient Power Level Exporter; it's basically an

13:46.560 --> 13:52.560
operator that deploys a DaemonSet; we'll talk about it later.

13:53.600 --> 13:58.000
And we also used the BMC of the bare-metal servers,

13:58.000 --> 14:01.840
that is, basically, we were taking a look at the BMC,

14:02.240 --> 14:13.600
the baseboard management controller, the iDRAC in this case, as we were using Dell servers,

14:13.600 --> 14:22.080
and we were taking a look at the graphs in the BMC to ensure that the results exported by

14:22.080 --> 14:29.280
Kepler were aligned with them. Also, the assets of all the testing can be found in that repository.

14:32.800 --> 14:39.600
With respect to uperf, as I said, we used it for TCP throughput and TCP latency;

14:39.600 --> 14:46.000
for TCP throughput we used this configuration, which as you can see is basically the XML that describes

14:46.000 --> 14:50.880
a profile sending as many write operations as possible during exactly one minute.
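
NOTE
The exact profile is presumably among the assets in the repository mentioned earlier; a generic uperf TCP stream profile of this shape (write as fast as possible for one minute, host and message size taken from environment-variable placeholders) looks roughly like this:
    <?xml version="1.0"?>
    <profile name="stream-tcp">
      <group nthreads="$nthr">
        <transaction iterations="1">
          <flowop type="connect" options="remotehost=$h protocol=tcp"/>
        </transaction>
        <transaction duration="60s">
          <flowop type="write" options="size=$size"/>
        </transaction>
        <transaction iterations="1">
          <flowop type="disconnect"/>
        </transaction>
      </group>
    </profile>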

14:52.160 --> 14:56.880
So, quite simple; there's a placeholder environment variable for the host

14:56.880 --> 15:02.560
that needs to be replaced by the service IP in this case. It's important to note that

15:03.680 --> 15:06.800
inter-cluster communication happens at the service level,

15:11.200 --> 15:17.520
precisely because you cannot go directly to a pod

15:18.240 --> 15:21.200
in a different cluster; you have to go through a service in this case.

15:22.160 --> 15:31.360
We also have another placeholder for size, because we tested multiple packet sizes for the throughput

15:31.360 --> 15:40.880
test. And with respect to TCP latency, we used the well-known TCP request-response latency profile,

15:42.080 --> 15:47.440
which basically gets latency statistics by measuring the duration of small read-write operations

15:48.320 --> 15:57.200
during one minute. Same as before, we have some environment variables that need to be set

15:57.200 --> 16:08.320
up front in order to configure the test properly. Power monitoring with Kepler: so, power monitoring

16:08.320 --> 16:16.480
is basically done by the operator, which deploys a DaemonSet, and the DaemonSet deploys pods on

16:16.480 --> 16:23.360
every node that collect real-time CPU power consumption from the nodes. Those power consumption

16:24.480 --> 16:30.480
numbers can be collected from RAPL, running average power limit, which comes from Intel,

16:30.480 --> 16:35.280
and I think AMD has another alternative with a different name. I don't remember what the

16:35.280 --> 16:40.800
name was, but basically it's high-precision monitoring of CPU package and RAM energy usage;

16:41.760 --> 16:52.080
and Redfish, which is real-time platform power usage reported by the BMC controller. In this case,

16:52.800 --> 16:59.600
Redfish was our choice, because we wanted to measure not only the CPU and memory power consumption,

16:59.600 --> 17:07.040
but also the full-stack power consumption, including network devices, cooling fans,

17:07.040 --> 17:18.160
power supply overhead, and any other device that could be affected by this performance test.
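
NOTE
Two hedged examples of how these numbers can be pulled; metric names and endpoints should be checked against your Kepler release and BMC, and the chassis ID below is just the usual Dell iDRAC default.
    # Per-container energy from Kepler via Prometheus (label/metric names vary by release)
    sum by (pod_name) (rate(kepler_container_joules_total[2m]))
    # Whole-platform power from the BMC over Redfish
    curl -sk -u "$BMC_USER:$BMC_PASS" \
      "https://<idrac-ip>/redfish/v1/Chassis/System.Embedded.1/Power" \
      | jq '.PowerControl[0].PowerConsumedWatts'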

17:22.640 --> 17:28.240
I'm going to speak a little bit about the results. With respect to throughput, we found that

17:28.240 --> 17:35.440
the highest throughput was achieved by Istio. We saw that Skupper was limited

17:35.760 --> 17:43.040
at around 8 gigabits per second, and Submariner comes with a low throughput, in this case around

17:43.040 --> 17:49.840
3 gigabits per second. It was limited because of a limitation

17:50.800 --> 18:01.440
in the kernel IPsec implementation that Submariner uses: IPsec tunnels

18:02.480 --> 18:14.160
are only processed in a single-core fashion. There are active efforts to improve that logic

18:14.160 --> 18:24.800
on the kernel side, there's even an RFC with a proposal to improve that logic, but as far as I know,

18:24.800 --> 18:43.040
there hasn't been any significant progress on this matter yet. The red lines represent the

18:43.040 --> 18:49.040
baseline, which in this case is a regular Kubernetes ClusterIP service, so as you can see,

18:49.040 --> 19:02.400
there's a relevant penalty for using multi-cluster networking. Now, HTTP latency. The latency

19:02.400 --> 19:09.920
was measured with the aforementioned wrk2; in this case, we configured it

19:10.640 --> 19:19.920
to send 10,000 requests per second across 100 simultaneous connections, using two threads,

19:19.920 --> 19:32.000
requesting a 1 KB body size, a 1 KB static file, from the HTTP server.

19:32.000 --> 19:40.640
In this case, wrk2 hit regular NGINX pod replicas serving static files.
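
NOTE
The wrk2 invocation matching the parameters just described (2 threads, 100 connections, a fixed rate of 10,000 requests per second, a 1 KB static file) would look roughly like this; the URL, the file name and the 60-second duration are assumptions, not taken from the talk.
    wrk -t2 -c100 -d60s -R10000 --latency http://<exported-nginx-service>/1kb.bin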

19:40.640 --> 19:45.680
Well, we can see a significant latency overhead in Istio across clusters,

19:46.960 --> 19:53.600
much higher than the baseline and the other solutions, so at this point we can conclude

19:53.680 --> 20:02.000
that Istio could be a good solution to achieve a good throughput,

20:02.000 --> 20:13.760
but at a high latency. With respect to resource consumption overhead, we are measuring here,

20:13.840 --> 20:22.000
we are plotting here the ratio between the throughput and...

20:27.280 --> 20:34.160
oh, sorry, yes, this one represents absolute numbers of CPU consumption,

20:34.160 --> 20:40.160
and you can see that Submariner is the most resource-efficient, followed by Istio,

20:40.160 --> 20:51.040
and we saw that Skupper consumes the highest number of CPUs at all packet-size

20:51.040 --> 20:59.280
and thread combinations. And this was the one I was talking about before,

20:59.280 --> 21:04.960
which basically represents the ratio of resource consumption to the throughput achieved

21:04.960 --> 21:12.960
with each solution, and we can see that Istio and Submariner do much better than

21:12.960 --> 21:22.960
Skupper in terms of throughput per core; especially Istio in this case is clearly beating the other

21:23.040 --> 21:34.960
two solutions. Power consumption: these are the numbers that were returned

21:34.960 --> 21:43.120
and exposed by the Kepler pods, and we saw that higher CPU

21:43.120 --> 21:49.680
usage is basically correlated with higher power consumption, which is predictable, but this

21:49.680 --> 21:57.920
is something we wanted to confirm as well. And this is a nice graph that basically summarizes all

21:57.920 --> 22:08.800
the research we did: the closer the bar points to the border, the better.

22:08.800 --> 22:14.640
I mean, for example, we can note that Submariner is the best at

22:14.720 --> 22:21.360
latency and in the number of transactions and CPUs used, and on the other side we can note that

22:21.360 --> 22:29.360
Istio is the best in terms of throughput. So depending on the requirements of your infrastructure

22:29.360 --> 22:37.440
or solution, you should pick one of these options; the most balanced

22:37.440 --> 22:45.440
one seems to be Skupper, but, for example, if you really need a high throughput,

22:45.440 --> 22:57.280
I would go with Istio, and in case you have a tight latency requirement, that means that you would need to

22:57.280 --> 23:04.960
go with Submariner. So it's a matter of finding the balance, evaluating what your needs are

23:04.960 --> 23:13.520
for your application and infrastructure, and trying one of these solutions. And that was pretty

23:13.520 --> 23:22.960
much it. Yep, and for the future work: yes, we have plans to try to run real-world benchmarks

23:22.960 --> 23:30.400
involving things like messaging, databases and applications based on microservices, and also to explore

23:30.400 --> 23:38.400
the capabilities of the control plane, like service discovery times and service failover time;

23:38.400 --> 23:47.440
those KPIs were not measured in this research, but

23:49.440 --> 23:54.560
they are really good KPIs as well, depending on the workload you have plans to run,

23:55.520 --> 24:02.000
and yes, scale the number of clusters in the tests, so depending on

24:02.000 --> 24:09.360
the number of clusters, the results of each solution will

24:09.360 --> 24:15.200
probably be affected, so that's something we will need to test going forward. And well, also

24:15.200 --> 24:22.000
include ambient mesh mode in the test matrix; probably I have that

24:22.000 --> 24:29.920
wrong, because as far as we know, ambient mode doesn't support multi-cluster yet... it does? Okay,

24:29.920 --> 24:40.560
that's good, awesome, that's great news, thank you. That is pretty much it;

24:41.440 --> 24:45.440
if you have any questions, we have some minutes.

24:54.320 --> 24:56.320
yep

25:10.560 --> 25:36.080
Yes, the question was about what the use case would be, considering that the latency numbers

25:36.080 --> 25:42.960
are much higher than the baseline. Well, in my opinion it depends on the workload, I mean,

25:42.960 --> 25:48.400
not every workload is going to be a database or is going to need to communicate with

25:48.400 --> 25:55.280
a database; you know, not every workload requires super low latency,

25:55.360 --> 26:00.960
not every workload is such a latency-sensitive workload, so it really depends, I mean,

26:05.040 --> 26:06.960
it depends on the application you want to run, I guess,

26:08.960 --> 26:13.600
Just to complement your point, yes, and that's why we want to test with real-world applications,

26:13.600 --> 26:19.200
and we mentioned that as future work. I mean, we basically were just measuring network

26:19.200 --> 26:22.160
characteristics, which by themselves don't make a lot of sense,

26:23.120 --> 26:28.000
especially because Istio is a solution for microservices architectures, so that's

26:28.800 --> 26:34.960
100% the next step to have like realistic applications with databases and stuff like that and see

26:34.960 --> 26:41.280
how they behave. Also scaling the control plane, the number of workers; there are also these metrics

26:41.280 --> 26:46.800
that we were discussing that I think are very important, like from when you create the service in

26:46.800 --> 26:51.760
one of the clusters, how long it takes to be available in the other cluster, so I think there's a lot

26:51.840 --> 26:59.120
of space to explore new metrics, and that was not our intention here at all. Yes.

27:13.120 --> 27:18.560
That's an excellent question. So, we didn't apply any... I guess the question was whether we applied any

27:18.560 --> 27:25.040
tuning to the solutions. We did not, because we didn't think it was fair to tune

27:25.040 --> 27:30.480
one of the solutions and not the others. I mean, we didn't want a very complex matrix to test all

27:30.480 --> 27:36.160
the possible combinations. These are all upstream solutions; we did use

27:36.160 --> 27:39.920
OpenShift, the downstream, for the platform, but these were all upstream

27:40.960 --> 27:45.600
deployments with the out-of-the-box configuration, so we didn't test any specific tuning in

27:45.600 --> 27:49.680
each of them. Interesting. What else can I say about it?

28:16.560 --> 28:27.040
Yeah, it actually depends on the solution, right, because for instance, so the question is,

28:27.040 --> 28:32.560
how do we do capacity planning for the resource consumption of each of the

28:32.560 --> 28:39.200
solutions, I guess, and the answer is that it depends on the solution. For instance, Submariner

28:39.200 --> 28:45.200
allows you to have this worker labeled as a gateway, so you can have, for instance

28:45.200 --> 28:50.400
in AWS, a different flavor for that worker, and you can be very intentional about the

28:50.400 --> 28:55.520
workloads that you schedule, or avoid scheduling, on that worker; you can even reach an extreme

28:55.520 --> 29:02.080
where you have this worker reserved just for gateway functions. In the case of Skupper, it's

29:02.080 --> 29:07.680
just a pod, so you need to be careful where the router is scheduled, and in the case of

29:07.680 --> 29:12.320
Istio, Raul, if you can help me there, I don't know if you have a solution for that.

29:13.600 --> 29:19.760
For Istio it's the same: Istio has a gateway pod, which was running on top of

29:19.760 --> 29:26.400
what we call the gateway node, in the case of the cluster that was receiving the traffic,

29:27.920 --> 29:34.880
because the traffic goes out directly from the worker node, and then reaches the other cluster's

29:35.040 --> 29:41.280
gateway node. Yeah, and that's why we tried to measure, in one of these slides,

29:42.400 --> 29:46.400
the raw overhead of each of the solutions, so we want to be able to do some capacity planning,

29:46.400 --> 29:51.680
like as a cluster administrator, how much resources does it cost me to connect these clusters

29:51.760 --> 29:53.680
using each of the solutions?

30:01.600 --> 30:05.600
Okay, and this single server is not like a cluster, right?

30:09.440 --> 30:13.520
Yeah, I guess this was not covered as part of the study; this is like an

30:13.520 --> 30:19.200
egress traffic scenario. We do cover those in other testing that we do, but that was not the

30:19.280 --> 30:26.320
point of this particular study. Yes, yes, for sure. We do ingress and egress traffic a lot.

30:26.320 --> 30:29.360
In this case, we were trying to focus on multi-cluster scenarios.

30:30.480 --> 30:34.080
Yeah, apart from the results I shared with you, we also tested

30:36.080 --> 30:39.600
pod-to-pod and host network scenarios. Does that answer your question?

30:40.880 --> 30:42.880
Yeah, the results are not there, but

30:43.920 --> 30:46.720
yeah, they should be in the paper, right?

30:50.000 --> 30:52.000
Yes, please.

30:58.560 --> 31:05.680
That's an excellent question. The question, again, is whether any of the solutions support high availability, right?

31:07.520 --> 31:11.360
I can say, off the top of my head, Skupper for sure, because in Skupper you can

31:12.720 --> 31:14.720
scale the router pod horizontally.

31:15.680 --> 31:21.680
I'm not sure about Submariner, honestly, I need to double-check, because you have this gateway node,

31:21.680 --> 31:28.240
so I'm not sure if it automatically recovers; you can have several gateway nodes, but I'm not sure

31:28.240 --> 31:33.360
if there's automatic failure detection or something like that, off the top of my head.

31:33.360 --> 31:40.560
And with Istio, Raul? Yeah, with Istio, istiod takes care of

31:41.120 --> 31:47.040
pushing the configuration to the Envoy proxies, so you can scale out the number of

31:47.040 --> 31:57.360
replicas, having a cluster of istiod pods, so one of them should be active, and you can scale

31:57.360 --> 32:02.320
out the number of gateways across multiple gateway nodes, so there shouldn't be any problem

32:02.400 --> 32:04.400
with respect to high availability.

32:10.240 --> 32:11.440
Thank you very much, everyone.

