WEBVTT

00:00.000 --> 00:12.800
So, hello everybody, nice to have you in the modern email Devroom.

00:12.800 --> 00:18.840
We saw in the previous talk that there was a need for performance testing and benchmarking.

00:19.840 --> 00:20.840
Here we are.

00:20.840 --> 00:23.840
So, I'm Benoit Tellier.

00:23.840 --> 00:31.840
So, I work at Linagora on OP-3 building an alternative to Office 365.

00:31.840 --> 00:38.840
And I'm doing that by building Twake Mail, which is a collaborative email server.

00:38.840 --> 00:43.840
So, it's a product belonging to Linagora, and of course open source.

00:43.840 --> 00:49.840
But built on top of open source building blocks, namely Apache James.

00:49.840 --> 00:54.840
So, that's the mail server developed at the Apache Software Foundation.

00:54.840 --> 01:07.840
It's unique in the sense that, to my knowledge, it's the only mainstream open source mail server that follows an open governance model.

01:07.840 --> 01:12.840
So, which is quite unique.

01:12.840 --> 01:19.840
And why did I need to do some performance testing?

01:19.840 --> 01:33.840
Back when we started working on the project, back in 2014, we actually came up with a radically new architecture.

01:33.840 --> 01:43.840
And I'm quite happy, following the talks today, to see that it's no longer that original to say that the mail server needs to be stateless.

01:43.840 --> 01:50.840
And that it should delegate its state management to specialised databases.

01:50.840 --> 01:59.840
So, we're doing basically the same thing as Stalwart, WildDuck and Dovecot to some extent.

01:59.840 --> 02:06.840
But we needed a tool to validate the idea that the mail server is not the database itself,

02:06.840 --> 02:14.840
but an indirection level on top of existing databases, and that this idea is actually viable.

02:14.840 --> 02:20.840
And also Apache James is a very old software.

02:21.840 --> 02:25.840
The first lines of code date back to 1999.

02:25.840 --> 02:30.840
Top level project was established in 2007.

02:30.840 --> 02:34.840
And the internal architecture is still moving.

02:34.840 --> 02:45.840
And for example, one of the last very big refactorings that we did, back in 2020, was to make the IMAP layer fully asynchronous.

02:45.840 --> 02:54.840
And using Reactor and reactive code principles, which is a massive code change.

02:54.840 --> 03:00.840
And, of course, we wanted to have a tool to tell us if we're messing up something badly.

03:00.840 --> 03:04.840
And shipping performance regressions.

03:04.840 --> 03:09.840
And now on to the actual wish list.

03:09.840 --> 03:15.840
So, of course, we want to benchmark technical choices and architectural changes.

03:15.840 --> 03:18.840
I'm sorry, they have been quite long.

03:18.840 --> 03:21.840
We also want a tool to validate customer deployments.

03:21.840 --> 03:28.840
So, let's say I've got a government and they want to run 50,000 mailboxes.

03:28.840 --> 03:34.840
I need to be able to run a benchmark on their servers to actually check.

03:34.840 --> 03:38.840
Yes, on D-day you won't have load problems.

03:38.840 --> 03:40.840
It won't collapse.

03:40.840 --> 03:46.840
And to do this, we wanted to have an easy-to-run, efficient framework.

03:46.840 --> 03:54.840
So, we wanted to be able to generate the load, a very big load, from a single point.

03:54.840 --> 03:58.840
And, so that it's easier to aggregate statistics.

03:58.840 --> 04:03.840
We also wanted it to be easy to use for us.

04:03.840 --> 04:06.840
So, that means CLI and code.

04:06.840 --> 04:12.840
So that we could review the scenarios, commit them, et cetera, et cetera.

04:12.840 --> 04:17.840
And, of course, we wanted to be able to adapt the scenarios.

04:17.840 --> 04:22.840
For example, simulate clients that don't support QRESYNC.

04:22.840 --> 04:32.840
And what happens if you run a huge proportion of those clients: do things collapse?

04:32.840 --> 04:46.840
So, we actually decided to reuse, we actually did research and found no state-of-the-art tools for doing

04:47.840 --> 04:50.840
extensive benchmarks with IMAP.

04:50.840 --> 04:55.840
There's tons of tools around for doing benchmarks.

04:55.840 --> 04:58.840
And one of them is Gatling.

04:58.840 --> 05:04.840
So, Gatling is an open source product that has been developed by some French people.

05:04.840 --> 05:08.840
It's Java-based, or rather Scala-based.

05:08.840 --> 05:10.840
So, it runs on the JVM.

05:10.840 --> 05:14.840
It feels like home for James people.

05:14.840 --> 05:16.840
It's fully asynchronous.

05:16.840 --> 05:19.840
It's built on top of the Akka actor system.

05:19.840 --> 05:25.840
So, basically it's just messages flowing around internally.

05:25.840 --> 05:35.840
And this Akka-based architecture is what allows it to generate an extreme amount of load

05:35.840 --> 05:41.840
from a single load tester, basically.

05:41.840 --> 05:45.840
And it also comes with beautiful graphs.

05:45.840 --> 05:51.840
But our problem is that Gatling only ships HTTP out of the box.

05:51.840 --> 05:55.840
And we needed to develop an IMAP connector for it.

05:55.840 --> 06:01.840
So, I tried to do a little graph about the internal structure.

06:01.840 --> 06:05.840
And there's a bit of vocabulary.

06:05.840 --> 06:09.840
So, basically, we are running a simulation.

06:09.840 --> 06:11.840
The simulation is a scenario.

06:11.840 --> 06:16.840
So, what a single session is doing during the entire scenario.

06:16.840 --> 06:21.840
And basically, the idea is to come up with a domain-specific language

06:21.840 --> 06:24.840
that describes the scenario in IMAP.

06:24.840 --> 06:31.840
The scenario is subdivided into actions, for example SELECT, and checks.

06:31.840 --> 06:35.840
I'm sorry that it overlaps.

06:35.840 --> 06:43.840
And, basically, the simulation will use a protocol to connect to your mail server.

06:43.840 --> 06:48.840
So, you will need to have described that.

06:48.840 --> 06:53.840
Then the idea is to inject users, who would start running the scenario

06:53.840 --> 06:55.840
following an injection profile.

06:55.840 --> 06:58.840
So, more on that later.

06:58.840 --> 07:04.840
And then, basically, we've got an object that is IMAP sessions

07:04.840 --> 07:12.840
that's an actor that would use the IMAP session associated with the user of that session

07:12.840 --> 07:23.840
to start another actor that would be a binding to the IMAP NIO library.

07:23.840 --> 07:31.840
So, that is actually calling the mail server and doing the IMAP operation.

07:31.840 --> 07:37.840
Then, when the response comes back, another actor would just record the timings

07:37.840 --> 07:44.840
and be able to do the checks on the response and continue on with the scenario.

07:44.840 --> 07:50.840
So, basically, most of the job was designing all those actions

07:50.840 --> 08:00.840
and plugging them with actors onto existing IMAP NIO commands.

08:00.840 --> 08:08.840
Life is complicated and full of wrong assumptions and things like IMAP NIO.

08:08.840 --> 08:20.840
Sadly, it was built on very old Java abstractions that don't allow you to get notified

08:20.840 --> 08:24.840
when the result of your asynchronous operation is actually done.

08:24.840 --> 08:29.840
So, we made a contribution to the IMAP NIO library, which was accepted by Yahoo.

08:29.840 --> 08:37.840
So, this is what the DSL actually looks like.

08:37.840 --> 08:42.840
So, that's no pointer.

08:42.840 --> 08:47.840
So, basically, the idea is that we just name our action.

08:47.840 --> 08:52.840
Then, use the DSL and give, for example, the login.

08:52.840 --> 08:59.840
So, you can see that you can store and also access information on the session.

08:59.840 --> 09:04.840
So, that's what the dollar thing is actually doing.

09:04.840 --> 09:06.840
And we can chain actions.

09:06.840 --> 09:08.840
It's also quite interesting.

09:08.840 --> 09:11.840
So, you've got things like waiting.

09:11.840 --> 09:16.840
And you've got all the control structures built into the Gatling DSL.

09:16.840 --> 09:23.840
So, for example, randomizing things. And what we are trying to do

09:23.840 --> 09:31.840
(I omitted much of the code for simplicity here) is to replicate the load that we have on actual servers,

09:31.840 --> 09:38.840
which we can see in our metrics system, and to replicate exactly the proportions in that scenario

09:38.840 --> 09:41.840
to try to be as relevant as possible.

09:41.840 --> 09:52.840
And below is the scenario: we actually feed things, and we execute the initial connection

09:52.840 --> 09:59.840
and then, during the scenario duration, we just do some new action and pause, like this.

09:59.840 --> 10:04.840
But again, that's not Gatling IMAP.

10:04.840 --> 10:10.840
That's how we use Gatling IMAP to actually build our simulation at Linagora.

10:10.840 --> 10:18.840
This structure is ours and you can do whatever you want with it.

10:18.840 --> 10:23.840
So, then once we get it, we can plug a feeder.

10:23.840 --> 10:33.840
We can choose the injection profile: here we inject, with a flat profile, 10,000 people during the injection period.

10:33.840 --> 10:44.840
It's actually a nice model because, depending on how you decide to inject people, you can inject people very quickly and do endurance tests for a long, long, long time,

10:44.840 --> 10:49.840
just by playing on the scenario duration or on the injection profile.

10:49.840 --> 11:02.840
You can keep injecting people during the entire scenario duration, so that you always have new users coming through the simulation.

11:02.840 --> 11:13.840
And you can also do a breaking-point test by taking a very high number of users and seeing how high you can go.
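
NOTE
A minimal sketch, not taken from the talk or from Gatling, of how a flat injection profile and the scenario duration combine into the number of active users.
It assumes users arrive at a constant rate and each one runs for a fixed time; the function name and the 10 and 60 minute figures are hypothetical.
```python
def active_users(t_s, total_users, injection_s, scenario_s):
    """Users active at time t_s (seconds) when total_users arrive at a constant
    rate over injection_s seconds and each runs for scenario_s seconds."""
    def arrived(t):
        # Users injected so far: linear ramp, clamped to [0, total_users].
        return total_users * min(max(t, 0), injection_s) // injection_s
    # Active users are those who arrived minus those who already finished.
    return arrived(t_s) - arrived(t_s - scenario_s)
# Endurance style: inject 10,000 users in 10 minutes, each stays 60 minutes.
endurance_peak = active_users(1800, 10_000, 600, 3600)
# Continuous style: inject 10,000 users over 60 minutes, each stays 10 minutes.
continuous_mid = active_users(1800, 10_000, 3600, 600)
print(f"endurance: {endurance_peak} active, continuous: {continuous_mid} active")
```
In this model the same 10,000 users give either a long plateau of 10,000 concurrent sessions or a steady turnover of about 1,667.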

11:13.840 --> 11:22.840
So, once we are done, we can use the Scala build tool to actually run Gatling.

11:22.840 --> 11:31.840
So, you've got a very nice execution report on the CLI.

11:31.840 --> 11:43.840
And once you're doing that... there may be a mistake from the guy writing the slide.

11:43.840 --> 11:48.840
So, imagine that it's not a J but an I.

11:48.840 --> 11:53.840
Sorry, this one hopefully is IMAP.

11:53.840 --> 11:56.840
So, that's the execution report.

11:56.840 --> 12:01.840
The top graphic is pure propaganda, that's not the interesting part.

12:01.840 --> 12:04.840
The interesting part is in here.

12:04.840 --> 12:11.840
And hopefully you are able to unfold the actual section.

12:11.840 --> 12:17.840
So, what is interesting is that you've got the mean time, but the mean time sucks.

12:17.840 --> 12:30.840
You have interesting things like the 99th percentile, which gives a better representation of the worst case that your users will be exposed to.
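
NOTE
A small worked example, not from the talk, of why the mean hides what the 99th percentile shows.
The latency values are made up, and the nearest-rank percentile used here is an assumption, not necessarily how Gatling computes its report.
```python
def percentile(samples, p):
    """Nearest-rank percentile: smallest sample with at least p% of values at or below it."""
    ordered = sorted(samples)
    rank = max(1, (p * len(ordered) + 99) // 100)  # integer ceiling of p% of n
    return ordered[rank - 1]
# Hypothetical response times in milliseconds: 97 fast requests and 3 slow ones.
latencies_ms = [20] * 97 + [2000] * 3
mean_ms = sum(latencies_ms) / len(latencies_ms)
p50_ms = percentile(latencies_ms, 50)
p99_ms = percentile(latencies_ms, 99)
print(f"mean={mean_ms:.1f} ms, p50={p50_ms} ms, p99={p99_ms} ms")
```
The mean of about 79 ms matches no request anyone actually saw, while the 99th percentile exposes the 2-second outliers.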

12:30.840 --> 12:37.840
You can also see the number of operations that are done per minute... per second, sorry.

12:37.840 --> 12:41.840
And the percentage of failures that you have.

12:41.840 --> 12:50.840
And basically, after that, we also are able to see here that's the injection profile.

12:50.840 --> 12:56.840
How many users were active on the system over time.

12:56.840 --> 13:03.840
And here is the number of requests per second that the system managed to handle.

13:03.840 --> 13:06.840
And above you see the response time.

13:06.840 --> 13:09.840
So, when you're Java based, it's actually quite interesting.

13:09.840 --> 13:14.840
You can see, for example, your garbage collections.

13:14.840 --> 13:25.840
What is also interesting is that you can get those exact same two graphs for each of the IMAP commands above and see the evolution over time.

13:25.840 --> 13:30.840
So, I'm almost done. So basically, you can use it.

13:30.840 --> 13:33.840
It's AGPL v3.

13:33.840 --> 13:43.840
It may not be on a modern version of Gatling, it's v3, so go contribute an upgrade if you want.

13:43.840 --> 13:53.840
We also have some additional related resources, but those are not released as a separate DSL.

13:53.840 --> 13:57.840
That is the JMAP simulation.

13:57.840 --> 13:58.840
I'm sorry.

13:58.840 --> 14:07.840
And we also have an SMTP tool, that sucks, for doing the same thing, because when you have a hammer, every screw looks like a nail.

14:07.840 --> 14:14.840
We also have, but it's not open source, it's for private deployment validation,

14:14.840 --> 14:17.840
a tool for provisioning data.

14:18.840 --> 14:33.840
So, creating tons of folders and emails, with a nice distribution of email sizes, for the virtual users that we've been seeing before.

14:33.840 --> 14:41.840
And of course, it complements the whole Apache James stack performance toolbox.

14:41.840 --> 14:43.840
That's it.

14:43.840 --> 14:46.840
Do you have questions?

14:47.840 --> 14:50.840
Thank you.

15:04.840 --> 15:05.840
Correct.

15:05.840 --> 15:09.840
And JMeter is an Apache project.

15:09.840 --> 15:19.840
I would say that JMeter, from what we've been looking at, first did not have a built-in IMAP connector.

15:19.840 --> 15:32.840
Second, I think we've not seen a domain-specific language that was that well developed on top of JMeter.

15:32.840 --> 15:42.840
And actually the DSL was very important to us so that we could replicate any client behavior easily in the performance test.

15:42.840 --> 15:48.840
Does that answer your question?

15:48.840 --> 16:01.840
And there's another point, which is architecture: if I'm correct, and I'm not an expert here, JMeter is actually using a thread pool to simulate things.

16:01.840 --> 16:03.840
It's way less efficient.

16:03.840 --> 16:16.840
So, we were easily able to simulate 20 to 50,000 people with a single machine, which JMeter, to my knowledge, is not able to do.

16:17.840 --> 16:20.840
So, I think it was a cute.

16:20.840 --> 16:26.840
I didn't fully get whether the JMAP part of the test suite is functional or not.

16:26.840 --> 16:31.840
If it is, did you ever run it in a comparative fashion?

16:31.840 --> 16:34.840
Can you give any insights about it?

16:34.840 --> 16:37.840
It's the same server shootout.

16:37.840 --> 16:38.840
Okay.

16:38.840 --> 16:43.840
So, the JMAP test suite is fully functional.

16:44.840 --> 16:51.840
It's way easier to write because JMAP is HTTP and JSON based.

16:51.840 --> 17:01.840
So, basically, the code is way less complicated than for IMAP. And for us,

17:01.840 --> 17:07.840
it's directly part of our Gatling tooling and not extracted as a separate library.

17:07.840 --> 17:13.840
If there's interest in that, I think we would be happy to extract it.

17:13.840 --> 17:20.840
And for the comparative section.

17:20.840 --> 17:29.840
So, we, of course, measure better performance with JMAP.

17:30.840 --> 17:39.840
However, this tool doesn't directly reflect real users and real usage.

17:39.840 --> 17:45.840
We try to fit to something that is representative.

17:45.840 --> 17:50.840
But part of the design choice is that, willingly,

17:50.840 --> 17:56.840
we don't invest an extensive amount of resources to try to replicate the real world,

17:56.840 --> 18:00.840
its versatility and complexity and so on and so on.

18:00.840 --> 18:07.840
But the reduction of compute is at least a factor of two for us

18:07.840 --> 18:12.840
and our specific implementation of JMAP behind it.

18:26.840 --> 18:32.840
Thank you very much.

