WEBVTT

00:00.000 --> 00:13.640
and I am rather pleased to introduce another protein talk. As some of you may know, I am kind

00:13.640 --> 00:21.880
of a protein person. So the other aspect of proteins apart from the structure is annotation.

00:21.960 --> 00:32.120
So having a talk about proteins from Aurélien Luciani. So you are physically based

00:32.120 --> 00:36.040
at the EBI, aren't you? You are not visiting, yeah, yeah. So I came with Jai, I think it's the

00:36.040 --> 00:39.160
night coming too fast and over two.

00:39.160 --> 00:53.560
Oh, I look maybe, should I do that? I reset, just. It's good that I tried before, I was working

00:53.560 --> 01:04.360
and made this. It's not seeing anything. I've got a third one if this one doesn't

01:04.360 --> 01:20.160
go. Yeah, I'm loving you, you are. So I mean, yeah, there we go. It's thinking about it.

01:20.160 --> 01:27.440
Thinking about it. Thinking about it really hard and then getting out. Yes, okay. People thought

01:27.440 --> 01:36.320
that's can take so long. Okay, let's get started. So hi, everyone, I'm Aurélien Luciani,

01:36.320 --> 01:41.200
I'm a project lead in the UniProt team. So the talk is ProtVista: open source protein

01:41.200 --> 01:48.400
feature visualization with reusable web components. So as I said, I work at UniProt, which is a

01:48.400 --> 01:52.320
comprehensive, high quality, publicly accessible resource of protein sequence and function

01:52.320 --> 01:57.760
information. For the purposes of today's talk, we're going to focus on annotations.

01:57.760 --> 02:02.640
These are specific parts of the protein, specific positions in the protein sequence, and the

02:02.640 --> 02:08.880
need that we had was to visualize those in some way. So UniProt is a

02:08.880 --> 02:14.640
consortium composed of EMBL-EBI, where I work, and also PIR in the US and SIB in Switzerland.

02:15.360 --> 02:21.280
So 10 years ago, when I started at EBI, I was actually in another team, InterPro, and I was

02:21.360 --> 02:29.120
tasked with adding to the InterPro website a visualization that had been done by the PDBe team.

02:29.760 --> 02:35.360
The issue is that the website that we had was in React. The PDBe visualization was in AngularJS,

02:35.360 --> 02:41.040
actually the first version of Angular, and it just wouldn't work. There was no way to make it work,

02:41.040 --> 02:46.400
actually to make them talk to each other, basically those two bits of code. So I ended up having

02:46.400 --> 02:51.200
to just rewrite the thing, which was a bit of a pity. So at the time we started talking

02:51.200 --> 02:56.560
within the different teams at the EBI to find a way to do visualization that we could share

02:57.360 --> 03:01.760
regardless of the framework that we would be using and also share outside of the EBI

03:02.800 --> 03:08.000
for usage by other teams. That's when we started working on Nightingale.

03:08.960 --> 03:13.360
So Nightingale is an open source visualization library of standard web components. It is focused

03:13.360 --> 03:17.040
on protein visualization, that is, as I said before, annotations within the protein sequence.

03:19.040 --> 03:25.200
Those are composable components that can be associated in different ways. So you can just use

03:25.200 --> 03:33.040
one component or a bunch of them together, and it's interoperable with other standard components regardless

03:33.040 --> 03:38.640
of the underlying framework. So it's compatible with any component that follows the

03:38.720 --> 03:46.400
Nightingale APIs, which are just a specific set of web standard APIs that we've decided to use

03:47.440 --> 03:54.560
in order for those components to talk to each other. So we started using web components.

03:55.360 --> 03:59.760
10 years ago it was not well supported. So it was a bit of a challenge to start but then

04:00.560 --> 04:07.920
support is way better now. So this is what web components are: a group of browser APIs that

04:07.920 --> 04:14.560
includes custom elements, Shadow DOM, and HTML templates, and we use them to be able to develop those

04:14.560 --> 04:20.960
components. So the good thing with that is that it's not dependent on any framework. So regardless of whether

04:20.960 --> 04:28.880
you use React or anything related, like Next, or if you use Angular, or if you use Vue, you should be able

04:28.880 --> 04:34.720
to use those components because they don't depend on those frameworks. And also if you don't use any

04:34.720 --> 04:40.160
framework, that would also work. We tried to limit the number of dependencies that we use within

04:40.160 --> 04:47.120
those components. You can see here the sequence length, so from 1 to 770, and we have some domains

04:47.120 --> 04:52.800
drawn there. But actually on the InterPro website, so a different website, we're able to have the same

04:52.800 --> 04:58.800
components look a bit different, but actually they're all using the same components underneath.

04:58.800 --> 05:03.840
And they're being fed different bits of data depending on the website. But they work in the same way.

05:05.600 --> 05:11.520
And this is the same for PDBe. PDBe also uses some Nightingale components in their own specific

05:11.520 --> 05:19.360
way with their own style. So for UniProt, there was a specific need to have a turnkey component with

05:19.360 --> 05:26.320
all of the components that were important for us to view. And that's where ProtVista comes in.

05:26.320 --> 05:31.840
ProtVista is basically a combination of those Nightingale components, assembled in a specific

05:31.840 --> 05:38.640
way by the UniProt team, fed UniProt data, and having some extra features on top of that.

05:38.640 --> 05:43.840
It's itself a web component wrapping other web components inside. And the good thing with that

05:43.840 --> 05:49.760
is that you can just use that one and then you will use all the underlying components together,

05:49.760 --> 05:56.320
assembled in a specific way. So the viewer is composed of tracks. The tracks are the

05:56.400 --> 06:00.480
Nightingale components that we saw before. They are the fundamental building blocks.

06:01.680 --> 06:08.160
And each track as we saw before can be used individually or in this case they can be combined together.

06:09.760 --> 06:15.120
And ProtVista, so the wrapper, the whole thing, is associating them together, fetching

06:15.120 --> 06:21.520
the data from UniProt APIs and assigning each bit of data to each track, which is responsible for

06:21.520 --> 06:25.520
displaying its own thing. That could be variants, that could be domains, et cetera. And actually

06:25.520 --> 06:32.400
the good thing with that is we also have this structure visualization, which is itself another track.

06:32.960 --> 06:38.400
And that can work with the rest of the tracks, and they can work together and interact with each other.

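NOTE
A minimal sketch of the data assignment step described above, where each track receives only the kind of feature it displays. The Feature type, the category names, and groupByCategory are hypothetical illustrations; they are not taken from the UniProt APIs or from ProtVista itself.
```typescript
// Hypothetical shape of one annotation; the real payload may differ.
type Feature = { category: string; start: number; end: number };
// Group features by category so each track gets only the data it displays.
function groupByCategory(features: Feature[]): Record<string, Feature[]> {
  const groups: Record<string, Feature[]> = {};
  for (const feature of features) {
    const bucket = groups[feature.category] ?? [];
    bucket.push(feature);
    groups[feature.category] = bucket;
  }
  return groups;
}
// Made-up sample data, for illustration only.
const sample: Feature[] = [
  { category: "DOMAIN", start: 10, end: 120 },
  { category: "VARIANT", start: 42, end: 42 },
  { category: "DOMAIN", start: 300, end: 410 },
];
const byTrack = groupByCategory(sample);
```
In this sketch a wrapper would hand byTrack["DOMAIN"] to a domain track and byTrack["VARIANT"] to a variant track.
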
06:40.480 --> 06:44.480
So the architecture is that we have a manager and then within the manager you have all the tracks

06:44.560 --> 06:51.360
and the manager is in charge of listening to what all the tracks are saying. Let's say when the

06:51.360 --> 06:57.280
user interacts with them, and then propagating that to the other tracks in there and also assigning

06:57.280 --> 07:05.040
the data to the right track. So as an example, if a user clicks on a specific variant in one of the

07:05.040 --> 07:13.440
tracks, then the manager, sorry, the element itself will emit a standard event that will be picked up

07:13.600 --> 07:20.400
by the manager and the manager will go to all the components within itself to be able to assign

07:20.400 --> 07:26.400
that information to all the tracks, which means that you can highlight in one track and you will

07:26.400 --> 07:31.440
highlight on all the tracks, even on the structure viewer, and in the same way you can keep in

07:31.440 --> 07:38.640
sync the zoom and the panning of the visualization all together. So under the hood, what do we

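NOTE
A minimal sketch of the manager pattern described above: one track reports a selection and the manager applies it to every registered track. All names here (Track, Manager, onSelect, click) are hypothetical and are not the Nightingale API. The talk describes standard DOM events; this sketch swaps them for a plain callback so that it runs outside a browser.
```typescript
type HighlightRange = { start: number; end: number };
class Track {
  name: string;
  highlight: HighlightRange | null = null;
  private listener: ((range: HighlightRange) => void) | null = null;
  constructor(name: string) {
    this.name = name;
  }
  // The manager registers one callback to hear about user interaction.
  onSelect(listener: (range: HighlightRange) => void): void {
    this.listener = listener;
  }
  // Simulates a user clicking a feature on this track.
  click(range: HighlightRange): void {
    if (this.listener) this.listener(range);
  }
}
class Manager {
  private tracks: Track[] = [];
  register(track: Track): void {
    this.tracks.push(track);
    // A selection on any track is pushed to every registered track.
    track.onSelect((range) => {
      for (const t of this.tracks) t.highlight = range;
    });
  }
}
const manager = new Manager();
const variants = new Track("variants");
const domains = new Track("domains");
manager.register(variants);
manager.register(domains);
variants.click({ start: 42, end: 42 });
```
After the simulated click on the variants track, both tracks hold the same highlight range. Zoom and pan could be kept in sync the same way.
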
07:38.720 --> 07:44.560
have? We have the Lit library for building reusable web components. We also have D3.js for

07:44.560 --> 07:51.280
the data-driven rendering with SVG and canvas. Also recently we started using canvas to render

07:51.280 --> 07:57.520
specific tracks, and that led to way better performance when we had a lot of annotations,

07:58.240 --> 08:04.320
and we still have an SVG overlay on top of that. So we made the move to canvas because we were

08:04.400 --> 08:10.320
previously using only SVG, but the more data you have on the screen, the heavier in memory this gets,

08:10.320 --> 08:14.480
and that would just not be able to scale with the amount of new data that we get every day.

08:16.320 --> 08:21.200
So this is the work that we did recently, performance optimization. You can see here the

08:21.200 --> 08:28.640
number of annotations, and here the initial load time, and here the interaction time or refresh time,

08:29.120 --> 08:35.280
and the SVG implementation is in blue and the canvas implementation is in green, and you can see that

08:35.280 --> 08:43.200
we had some threshold of acceptable time for those two metrics and we managed to reduce that by a lot

08:43.200 --> 08:51.920
by using canvas instead of SVG. So we had some challenges. I mean it's been 10 years since we

08:51.920 --> 08:59.120
started working on that. This is not a focus, sorry, a project that one person is completely

08:59.120 --> 09:04.480
assigned to, so it's a bit of slow work, step by step. And so at the beginning we tried to keep

09:04.480 --> 09:10.880
it pure and not have any dependencies at all, but in the end, especially when integrating new developers

09:10.880 --> 09:16.240
into the project, we realized that we needed to integrate some libraries, some lightweight libraries,

09:17.200 --> 09:24.320
to avoid footguns when we had new developers in the project. We also recently embraced

09:24.320 --> 09:31.120
TypeScript, so the compiled code is not in TypeScript, so anyone can use it, but actually if someone

09:31.120 --> 09:36.960
develops, they will be able to have that enhanced experience by using TypeScript and having those

09:36.960 --> 09:44.560
type annotations. We also had some challenges because all of those components were handled in a

09:44.640 --> 09:50.800
monorepo and it was a bit challenging to make sure that all of them were bundled

09:50.800 --> 09:59.520
independently without having too much data, sorry, too much code in each of them. And as I said before,

09:59.520 --> 10:05.600
we are on the path of performance improvements, towards a full transition to canvas;

10:05.600 --> 10:10.800
we still have some components that are not in canvas, still in SVG, and we will explore WebGL

10:10.800 --> 10:16.240
implementations. We also want to improve developer experience and user experience, because some

10:16.240 --> 10:21.280
users were asking, for example, to rearrange some tracks or to remove some tracks, and so this is

10:21.280 --> 10:27.520
something that we would want to implement. And a big task that we want to do is engaging with a

10:27.520 --> 10:34.400
wider community, not just the EBI, and so this is something that we will do soon. I just wanted to

10:34.560 --> 10:40.080
mention specifically the Software Sustainability Institute grant that we managed to get, so from tomorrow

10:40.080 --> 10:45.200
and for one year we will get some money to be able to work on the sustainability of this specific

10:45.200 --> 10:53.920
project by organizing one hackathon, having some office hours for developers, and also engaging

10:53.920 --> 11:02.080
with the community more. Here are some links to the different repos, and I don't know if we have

11:02.160 --> 11:13.520
time for a few questions, yeah. Questions and congratulations to keep you taught there.

11:15.120 --> 11:19.200
So whichever whilst you're taking questions other than questions.

11:21.200 --> 11:24.720
If there are none, I mean, I can just fill a little bit. Oh yeah, there's one.

11:25.120 --> 11:36.080
So actually this is Mol* within Nightingale, so this is a wrapper. This is the one that

11:36.080 --> 11:41.440
is not really lightweight, to be honest, because if you don't use the structure viewer this is

11:41.440 --> 11:46.480
quite lightweight, but if you want to have the structure viewer within it, we have the Nightingale

11:46.480 --> 11:52.560
component, which is a wrapper to be able to talk the same language, with the same API, as the

11:52.640 --> 11:57.360
other components, but inside it's interacting with the Mol* library, which in itself is

11:57.360 --> 12:03.280
quite big. I mean, yeah, this is something that we're discussing with the PDBe team, to be able to

12:04.560 --> 12:09.680
extract just the bit that we need, and so hopefully that's something that we can do in the future.

12:10.640 --> 12:25.120
And so yeah, just to add to the thing that I said before about the Software Sustainability

12:25.120 --> 12:30.480
Institute: we will announce through LinkedIn, I guess this is our main communication channel,

12:31.520 --> 12:38.720
the hackathon and the different office hours that we'll be doing from now until next year, and hopefully

12:38.720 --> 12:44.320
this means that we will be able to fix and improve all the code. Some of the code

12:44.320 --> 12:49.680
might be 10 years old, so this is something that we really need to update, and actually have time

12:49.680 --> 12:56.640
to spend on improving that, so that hopefully the community can take over and we can get contributions

12:56.640 --> 13:06.080
even from outside of EBI.

