WEBVTT

00:00.000 --> 00:10.560
All right, everyone, help me welcome to the stage Hendrik. Hendrik is a core contributor

00:10.560 --> 00:17.400
to TLSNotary, and he's going to be giving us a technical introduction today to zkTLS

00:17.400 --> 00:18.920
and TLSNotary.

00:18.920 --> 00:23.640
These are systems that use multi-party computation and zero-knowledge proofs to allow users

00:23.640 --> 00:26.680
to prove the authenticity of private HTTPS data.

00:26.680 --> 00:27.680
I'll let him take it away.

00:27.680 --> 00:29.480
Help me welcome him.

00:29.720 --> 00:32.800
Good morning.

00:32.800 --> 00:35.800
Test, test, is it on?

00:35.800 --> 00:36.800
Yep.

00:36.800 --> 00:37.800
Perfect.

00:37.800 --> 00:44.080
Okay, good morning, my name is Hendrik, and thank you for the great introduction.

00:44.080 --> 00:50.640
So we're here for the decentralized internet. We use the internet every day to do things that matter

00:50.640 --> 00:58.520
to us: important stuff, fun stuff. And we do it to access banking information, exam

00:58.520 --> 01:08.080
results, reputations on systems. And TLS, HTTPS, serves us really well, but it's a protocol built

01:08.080 --> 01:10.040
between two parties.

01:10.040 --> 01:14.920
So as soon as we want to start sharing data, it gets difficult.

01:14.920 --> 01:21.560
So HTTPS makes sure communication is secure, but it's hard to share that data.

01:21.560 --> 01:27.480
And in this talk, I will present a way of fixing that, so about turning ordinary

01:27.480 --> 01:35.080
HTTPS transcripts into cryptographic proofs that you can share, and all of this

01:35.080 --> 01:40.600
while keeping the user in full control over their privacy.

01:40.600 --> 01:47.120
So first, let's talk about TLS, so that's the S in HTTPS.

01:47.120 --> 01:52.680
It has been mentioned a few times already, so let's dive a little bit deeper.

01:52.680 --> 02:00.680
So it is a security protocol; it's using a public key infrastructure, so certificates,

02:00.680 --> 02:06.200
to make sure that when you receive data over the internet, you can also check that the data

02:06.200 --> 02:09.800
indeed comes from the source it claims to come from.

02:09.800 --> 02:18.480
It uses key exchanges and encryption so that you also know that nobody was able to mess

02:18.480 --> 02:27.720
with the data in the middle, and also that nobody can eavesdrop, so it's encrypted, and

02:27.720 --> 02:30.720
there's a message authentication code.

02:30.720 --> 02:38.040
So today I will just talk about HTTPS and TLS, and will mix them the whole time, but know

02:38.040 --> 02:45.160
that TLS is also applicable to more than just HTTPS, so other examples are secure

02:45.160 --> 02:51.640
WebSockets or FTP, for example.

02:51.640 --> 02:57.240
But so the challenge is: what if Alice receives data from a website and wants to share that

02:57.240 --> 03:06.320
with Bob? So why isn't this trivial to do? That's because TLS uses symmetric encryption.

03:06.320 --> 03:10.920
Why does it use symmetric encryption? For performance.

03:10.960 --> 03:16.520
So that means that Alice has the same key as the website, so if she would just give the data

03:16.520 --> 03:23.080
to Bob, she could just change things.
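
NOTE
Editor's sketch of the point above, not part of the talk. It assumes a simplified model
in which HMAC-SHA256 stands in for TLS record authentication (real TLS 1.3 uses AEAD
ciphers); the key, the messages and the helper names are hypothetical. It illustrates that
anyone holding the shared symmetric key can authenticate altered data just as well as the server.
```python
import hashlib
import hmac
# Hypothetical session key: in TLS, server and client derive the same symmetric keys.
shared_key = b"session-key-known-to-server-and-alice"
def tag(key: bytes, message: bytes) -> bytes:
    """Compute an HMAC-SHA256 authentication tag over the message."""
    return hmac.new(key, message, hashlib.sha256).digest()
def verify(key: bytes, message: bytes, mac: bytes) -> bool:
    """Constant-time check that the tag matches the message."""
    return hmac.compare_digest(tag(key, message), mac)
# The server authenticates the real response.
real = b'{"balance": 10}'
real_tag = tag(shared_key, real)
# Alice holds the same key, so she can produce an equally valid tag for altered data.
forged = b'{"balance": 1000000}'
forged_tag = tag(shared_key, forged)
```
Under these assumptions both (real, real_tag) and (forged, forged_tag) pass verify, so the
tag alone cannot convince Bob that the data came from the server rather than from Alice.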

03:23.080 --> 03:26.040
And so what systems do we use today?

03:26.040 --> 03:33.480
Unfortunately, a lot of people just share screenshots and still think that is safe, but in the

03:33.520 --> 03:42.120
age of AI, that's just the same as the "trust me, bro" security system. Unfortunately, there

03:42.120 --> 03:48.200
is also, even if it's against the terms of service for lots of applications,

03:48.200 --> 03:56.360
people just sharing their passwords, which is a whole other mess that we definitely

03:56.360 --> 03:59.040
should avoid.

03:59.040 --> 04:06.520
Before I go to solutions, let me re-emphasize why sharing data is important.

04:06.520 --> 04:12.600
There's lots of websites: government websites, banks, social platforms, gaming.

04:12.600 --> 04:19.080
And there's definitely lots of scenarios where there are Bobs that would want to see private

04:19.080 --> 04:24.000
data from Alice on these often walled gardens.

04:24.000 --> 04:29.200
Say you want to start a business, then it's very hard to compete with the big platforms

04:29.200 --> 04:35.200
just because that access isn't there, but also here in the open source community: grassroots

04:35.200 --> 04:39.280
social, peer-to-peer finance systems.

04:39.280 --> 04:47.720
Also things like the Internet Archive, where it could be that users can get the data in a

04:47.720 --> 04:54.520
verifiable way; that would open up way more data sources for these archives, and other things

04:54.520 --> 04:56.880
like Ethereum oracles.

04:56.880 --> 05:04.320
And so what we want to do is unlock this web data that's locked in these walled gardens

05:04.320 --> 05:10.480
so that it can be shared without intermediaries.

05:10.480 --> 05:16.080
Why do we want to do it? It is often how history goes.

05:16.080 --> 05:20.080
We start with a system that's very open.

05:20.080 --> 05:30.720
They attract users to grow, but as soon as they reach their selling points, it's extraction

05:30.720 --> 05:38.160
time, and instead of helping the user, they just try to extract as much value as they can.

05:38.160 --> 05:44.400
The same with cooperating APIs: they get closed after a while. In the beginning, it's all "we build

05:44.400 --> 05:53.200
this together", and then suddenly the selling point is reached, the gates of the walled garden

05:53.200 --> 05:56.800
are closed and it's full competition mode.

05:56.800 --> 06:03.600
So unfortunately we have seen this pattern time and time again, and so we need to fight

06:03.600 --> 06:08.080
against that.

06:08.080 --> 06:18.400
So what other alternatives are there for portable HTTPS data? Actually, there is

06:18.400 --> 06:24.560
a very simple solution and that would be signing the data at the source.

06:24.560 --> 06:32.160
Unfortunately, as we've seen on the slide right before, there is no incentive for the big platforms

06:32.160 --> 06:42.240
to allow competitors, so unfortunately, that might work for government websites in some use cases,

06:42.240 --> 06:48.320
but it's not the generic solution, and definitely not something that small players can

06:48.320 --> 06:50.320
enforce.

06:50.320 --> 06:57.840
Other options are OAuth, for example, but again you need cooperation from the website and

06:57.840 --> 07:03.840
also you introduce another party that you need to trust, and that's another point where

07:03.840 --> 07:07.360
censorship can happen.

07:07.360 --> 07:17.680
So I present you a new player, that's zkTLS, which can help to fight those walled gardens.

07:18.880 --> 07:26.880
It does not only help to get data from Alice, whom I from now on will start calling the Prover,

07:26.880 --> 07:34.320
and Bob, whom I will call the Verifier; so it does not only help Bob to verify that the

07:34.320 --> 07:41.760
data is authentic but also the Prover stays in full control over the privacy so what exact data

07:41.760 --> 07:49.760
is shared with the Verifier but I will of course explain all of that in the following slides.

07:49.760 --> 07:58.960
So it allows you to verify web data without compromising privacy, and in that way shift control

07:58.960 --> 08:01.440
back to users.

08:01.440 --> 08:09.680
So zkTLS: the zk stands for zero knowledge, and it's an interactive verification.

08:09.680 --> 08:18.000
So if there are cryptographers in the room and they hear zk, they start making assumptions,

08:18.000 --> 08:26.880
so: it's not publicly verifiable. It's now TLS between three parties, and because it is TLS,

08:26.880 --> 08:32.560
the three parties need to be online because they will do the TLS session together.

08:35.120 --> 08:44.640
And then there's also a second phase. So first they will do the TLS session together:

08:45.200 --> 08:53.680
the Verifier participates and sees encrypted information, and the Prover will commit to that encrypted

08:53.680 --> 09:00.480
information, so at the end the Prover can make a proof based on those commitments, which the

09:00.480 --> 09:10.560
Verifier can check when data is disclosed in the selective disclosure part.

09:11.200 --> 09:16.880
So there's actually two approaches. So you see a server, a prover and a verifier, that's the

09:16.880 --> 09:23.920
three parties here and it really depends on who does the actual TLS session with the server.

09:23.920 --> 09:31.520
So there's proxy mode which is the simplest where the Verifier is in between the Prover

09:31.520 --> 09:38.000
and the server, and there is the MPC mode where it's the Prover that talks to

09:38.000 --> 09:44.000
the web server and the Verifier checks the result in the end. So they all have different

09:44.000 --> 09:50.320
security, privacy and resource tradeoffs, so let's zoom in on proxy mode first.

09:51.840 --> 09:59.280
So proxy mode is very similar to regular TLS: here the Prover does the

09:59.280 --> 10:06.320
key exchange with the server, and the Verifier just observes everything that happens

10:06.480 --> 10:16.800
and sees the encrypted information. One risk here is that if the Prover has control over the

10:16.800 --> 10:26.400
network then the Prover could fool the Verifier but in lots of use cases this is enough of a

10:26.400 --> 10:32.080
trust model to have, but if there's lots of money involved or it's really important then you should

10:32.080 --> 10:39.600
not use this mode. There are also privacy tradeoffs: because the Verifier communicates with the

10:39.600 --> 10:51.680
website, the Verifier will also know what that website is, and another big challenge of this mode is

10:53.440 --> 11:01.120
that the server will see lots of requests from that Verifier and will start to see patterns there

11:01.120 --> 11:09.200
so this is also easy to censor from the server side. So from the server side this looks like

11:09.200 --> 11:17.360
normal TLS traffic, so that's not an issue, but if that is, say, an Amazon server,

11:18.640 --> 11:26.480
then it's often already blocked by default by many websites, so that could also be a reason

11:26.480 --> 11:34.960
to look at different solutions, but it's a relatively lightweight solution to follow this approach.

11:36.240 --> 11:44.000
The other, and the most secure, approach is to use multi-party computation, so that's a cryptographic

11:44.720 --> 11:53.120
trick from applied cryptography that we can use, and here, instead of the Verifier or the

11:53.120 --> 12:00.880
Prover having the TLS key, a good mental model to have is that they each have a key share, so they

12:00.880 --> 12:12.480
have to work together to do the TLS transaction. And MPC allows this: both parties, Prover

12:12.480 --> 12:19.680
and Verifier, have a private part, that's their key share, but still they can compute a public function

12:19.760 --> 12:27.920
to do the things that need to happen with the server. So this setup has stronger privacy

12:27.920 --> 12:34.880
guarantees because the Prover here is the only one who talks to the server so they could also have

12:34.880 --> 12:42.960
like completely blind proofs to the Verifier, so that could even be presenting a proof without the Verifier actually

12:42.960 --> 12:50.400
knowing what exact server was consulted. So that's already more exotic use cases but just to say

12:50.400 --> 12:59.680
that this is a very generic and safe approach. The only downside is that using MPC has an overhead,

12:59.680 --> 13:06.240
so it does require way more bandwidth between the Prover and the Verifier than the actual

13:07.040 --> 13:17.600
request, and also, yeah, there's cryptography that needs to be computed, so there are also more

13:18.400 --> 13:27.600
compute requirements. The second part of zkTLS is the selective disclosure part, so we used

13:27.600 --> 13:35.520
either proxy or MPC to do the TLS transcript; at that point everybody only has encrypted information.

13:35.520 --> 13:43.840
After commitments it can be decrypted, and then the Prover can decide what exactly is shared with the

13:43.840 --> 13:53.920
Verifier, and the simplest way to do that is just redaction. So in redaction you just say: you did

13:54.960 --> 14:00.880
an HTTP request where there is an authentication code in the headers that's definitely something

14:00.880 --> 14:08.080
you do not want to share with anybody else so you can cover that and say that I only need to

14:08.080 --> 14:17.360
prove a user name on a system I could just disclose that JSON data and redact everything that

14:17.360 --> 14:25.840
the website I'm using does not need. Of course there are things to be mindful of when you start

14:25.840 --> 14:34.880
using redacted data so you have to be careful when verifying that people are not trying to inject

14:34.880 --> 14:42.000
extra data and then with some smart redaction try to fool you that the user name is something else,

14:42.640 --> 14:49.760
and also, if you disclose the length of fields and you have a Boolean there, be careful:

14:50.560 --> 15:01.760
just redacting true or false will disclose the actual answer. And of course we are in the field of

15:01.760 --> 15:08.560
applied cryptography, so instead of using redactions we could also use zero-knowledge

15:08.640 --> 15:19.440
proofs, where we just use the commitments from the first phase, the hash commitments,

15:19.440 --> 15:27.280
and then prove any statement about the transaction to the Verifier, and the Verifier

15:27.280 --> 15:37.040
also has full guarantees that the data that is shared or the statement that is shared is true.
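
NOTE
Editor's sketch of the selective disclosure idea above, not part of the talk and not the
TLSNotary API. It assumes a plain salted SHA-256 commitment per field; the field names,
values and helper functions are hypothetical. The prover commits to every field, later opens
only the chosen ones, and the verifier checks each opening against the earlier commitment.
The last line illustrates the length leak mentioned for redacted Booleans.
```python
import hashlib
import json
import secrets
def commit(value: bytes, salt: bytes) -> str:
    """Salted SHA-256 commitment: hides the value, binds the committer to it."""
    return hashlib.sha256(salt + value).hexdigest()
# Hypothetical response fields; the prover commits to each one separately.
fields = {"username": "alice", "token": "secret-auth-code", "premium": True}
salts = {name: secrets.token_bytes(16) for name in fields}
encoded = {name: json.dumps(value).encode() for name, value in fields.items()}
# In this sketch the commitments dictionary plays the role of what the verifier received.
commitments = {name: commit(encoded[name], salts[name]) for name in fields}
def disclose(name: str) -> tuple[bytes, bytes]:
    """Prover side: open a single commitment, leaving the others redacted."""
    return encoded[name], salts[name]
def check(name: str, value: bytes, salt: bytes) -> bool:
    """Verifier side: accept only values that match the earlier commitment."""
    return commit(value, salt) == commitments[name]
# Length leak: a redacted JSON Boolean whose length is visible still gives the answer away.
leaked_lengths = {len(json.dumps(flag)) for flag in (True, False)}
```
With this sketch, check("username", *disclose("username")) succeeds while a substituted
value fails, and the token stays hidden behind its commitment. A zero-knowledge proof would
go one step further and prove a statement about a committed value without opening it.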

15:39.040 --> 15:49.040
And there you could use any proving system that you want. So why am I explaining all of this?

15:51.040 --> 15:57.840
First because we need it, but also because we built an implementation of zkTLS in the TLSNotary

15:58.160 --> 16:08.720
project, so it's Apache 2 or MIT, at your choice, and we have been building a Rust implementation

16:08.720 --> 16:21.760
since 2022 under sponsorship of the Ethereum Foundation, and so we built a zkTLS protocol using

16:21.760 --> 16:33.760
MPC. This is not just a proof of concept; it's actually already used in the wild, so

16:33.760 --> 16:44.480
there's for example ZKP2P, which uses zkTLS, or web proofs, to prove that bank transactions happened.

16:44.480 --> 16:52.880
So it's a real peer-to-peer crypto on-ramping platform where you don't need a bank or central

16:52.880 --> 17:04.240
exchange; you just have ZKP2P managing contacting people and then getting people onboarded.

17:04.240 --> 17:13.440
Vouch is a system by Vlayer also to do all kinds of web proofs for different applications.

17:14.480 --> 17:23.920
Keyring is a company using it to reuse KYC, know your customer, verifications, so that you

17:23.920 --> 17:31.600
do not always have to upload your passport and stuff, but can reuse existing platforms.

17:34.240 --> 17:42.480
The most natural place to use zkTLS is in the browser, because that's where all your

17:43.360 --> 17:51.440
passwords and cookies live. And since it's written in Rust it can be compiled to WebAssembly, so we

17:51.440 --> 17:59.760
also have a demo extension that you can install just to play around with zkTLS in practice,

18:00.480 --> 18:07.120
and it's also a plugin-based system, so you can just plug in different applications. So I will demo

18:08.320 --> 18:18.640
using it for Spotify but you could use it for Twitter or other things just to make proofs of data

18:18.640 --> 18:21.760
so that you can use it in different places.

18:22.080 --> 18:33.600
So I will show a recorded demo, not to run into problems here. So as a website we will use Spotify;

18:36.080 --> 18:42.800
the prover in this example will be my browser, which connects to a demo application

18:43.520 --> 18:52.720
that connects to Spotify, and then the verifier in this case is just a web service that

18:52.720 --> 19:01.680
we see that does the MPC verification and then just responds back what it actually used, so let's see.

19:02.960 --> 19:09.280
I hope this works. So on top we have the website where the prover runs, and the

19:10.160 --> 19:18.480
verifier runs at the bottom here. So I opened a Spotify app, and first the prover has to agree

19:18.480 --> 19:24.880
that he wants to run a plugin, and the user can cancel at any point. So this

19:26.560 --> 19:36.000
extension opened the Spotify website; I'm now logging in to my account. So this works with any

19:36.000 --> 19:44.960
workflow where you need to get the authentication code but so any second now I'm logged in and now

19:44.960 --> 19:54.320
our extension will detect that the authentication code was found so now the request to get my

19:54.320 --> 20:04.240
favorite Spotify artist can be started, and so if we go see the back end... oh no, sorry, this is

20:04.800 --> 20:12.320
the extension logs where you can see that the multi-party computation is happening and here you can

20:12.320 --> 20:20.800
see my favorite chess artist and this is the server logs and here you can see that if you have good

20:20.800 --> 20:27.600
eyes that I did this on Thursday and that this is the data that was disclosed so this was using

20:27.680 --> 20:39.040
just simple redactions to disclose the data. So this was a very quick demo, but this is a primitive

20:39.040 --> 20:46.080
that you can use in many different ways, so not in all use cases is the prover the user

20:46.880 --> 20:54.400
and the verifier a server; it could also be reversed. So for example, in the second item there, we have a

20:54.400 --> 20:59.760
demo where the website is actually the verifier and it's the server that proves some

21:00.560 --> 21:08.240
access to private data. If you want to play around with the demo I just showed, try the first bullet.

21:09.520 --> 21:17.040
We also have a tutorial on building plugins, or you could also just dive deeper, use it in native

21:17.040 --> 21:26.640
Rust applications, build whatever. But so the message is: if you want to build an app or whatever

21:26.640 --> 21:34.240
where you need to be able to verify data from a user in a privacy-respecting way, try

21:34.240 --> 21:43.040
TLSNotary, and we are very curious to see what you guys will build. And so we want to open the

21:43.040 --> 21:59.760
web without needing to change current walled gardens. So thank you. All right, we do have time for

21:59.760 --> 22:05.280
one question before we close this out. There we go, right here. And don't worry, you can always

22:05.360 --> 22:11.280
start in active presentation as well thanks so much for the presentation I'll try to keep it quick so

22:12.000 --> 22:17.760
my impression of zero knowledge protocols is that they are super computationally expensive

22:17.760 --> 22:23.680
maybe even network expensive, so can you perhaps comment on how expensive this is computationally

22:23.680 --> 22:28.400
compared to well the current state which is no privacy but just TLS

22:28.480 --> 22:38.240
oh yeah my mic is still on sorry I was waiting for the spatula there's there's definitely

22:38.240 --> 22:46.560
an overhead. I don't know the latest numbers, but if you take the Spotify demo, that's a few

22:46.560 --> 22:57.200
kilobytes that you get from Spotify, and it's a multiple of that, something like 50 megabytes I think, that's

22:57.280 --> 23:04.960
exchanged between prover and verifier. And then for speed: this was on a regular laptop,

23:04.960 --> 23:12.400
but it also runs on phones. The computational overhead is limited; it's way bigger than just TLS,

23:13.040 --> 23:18.000
but it's not that bad: after a couple of seconds you have your proof.

23:18.000 --> 23:26.800
All right, one more round of applause for Hendrik, everyone. Thank you very much.

