WEBVTT

00:00.000 --> 00:15.000
So, quick show of hands, who here has dealt with OOM killer errors before?

00:15.000 --> 00:17.000
They're kind of a pain, right?

00:17.000 --> 00:23.000
So, like, your program's running along, and somebody allocates memory, which might not even be your program,

00:23.000 --> 00:28.000
and the kernel decides that your program is the fat program that's using too much memory.

00:28.000 --> 00:34.000
It's killed, and then in your dmesg log, you see some kernel stack trace and nothing about, you know,

00:34.000 --> 00:38.000
what your program is doing at the time, so it's like, it just got nuked.

00:38.000 --> 00:44.000
So, if you're, you know, in production and trying to solve these issues, it can be a challenge.

00:44.000 --> 00:53.000
So, what is oomprof? So, oomprof is a set of eBPF programs that knows how to read

00:53.000 --> 00:57.000
the Go runtime's internal memory profiler data.

00:57.000 --> 01:02.000
And at OOM time, it actually goes and reads that, records it to some eBPF maps,

01:02.000 --> 01:08.000
and then, after your program is dead, allows your memory profile to be saved and uploaded or logged,

01:08.000 --> 01:10.000
or whatever you want to do with it.

01:10.000 --> 01:13.000
So, that's it, so we're done.

01:13.000 --> 01:23.000
So, the motivation: Polar Signals provides a profiling solution.

01:23.000 --> 01:27.000
You know, CPU profiling is kind of our bread and butter, but memory profiling,

01:27.000 --> 01:31.000
and other forms of profiling are things we do as well.

01:31.000 --> 01:39.000
But, you know, if you're profiling memory... memory profiling typically works by scraping

01:39.000 --> 01:44.000
your programs, maybe like once every minute, once every five minutes, you scrape your pprof

01:44.000 --> 01:48.000
endpoint and get a memory profile, but that might not be good enough, right?

01:48.000 --> 01:53.000
So, like, sometimes you have some allocation that's just huge,

01:53.000 --> 01:58.000
and you can just die right away with one allocation, or sometimes you have some, like, service failure

01:58.000 --> 02:02.000
and you start getting into some failure-retry loop, and you kind of can die right away,

02:02.000 --> 02:08.000
and those allocations may not ever show up in a profile if you're using ad hoc profiling.

02:08.000 --> 02:12.000
So, that's the motivation.

02:12.000 --> 02:13.000
Not really.

02:13.000 --> 02:19.000
Actually, the motivation is that I've been battling, um, OOM kill problems for longer than I

02:19.000 --> 02:20.000
care to admit.

02:20.000 --> 02:27.000
I used to work on databases, mostly working on garbage collectors, and, you know, memory,

02:27.000 --> 02:31.000
when these problems happen, it's like, there's got to be a better way, right?

02:31.000 --> 02:35.000
You've got to be able to figure out why this OOM happened and understand what happened.

02:35.000 --> 02:39.000
So, the real answer is that this was an issue I had while I was working on something else,

02:39.000 --> 02:44.000
and because Polar Signals is such a cool company, they allowed me to go take a little while and work on this.

02:44.000 --> 02:51.000
So, anyway, so, for those who don't know... who here writes Go programs?

02:51.000 --> 02:54.000
Alright, so you guys probably already know all of this.

02:54.000 --> 02:58.000
Anyway, one of Go's cool features is a built-in memory profiler,

02:58.000 --> 03:02.000
and it's just a statistical memory profiler; it basically just has a little counter,

03:02.000 --> 03:06.000
and every time it crosses 512 KB (it's configurable),

03:06.000 --> 03:11.000
but every time that happens, it takes a sample, and then it takes that sample,

03:11.000 --> 03:16.000
which is just a call stack, and it sticks that into a hash table, and then that allows it to,

03:16.000 --> 03:20.000
you know, track your memory usage with very low overhead.

03:20.000 --> 03:26.000
So, you know, 512 KB is a pretty big number, so for most allocations, zero overhead.

03:27.000 --> 03:34.000
And then, if a, you know, allocation gets tracked by the profiler, a bit gets flipped,

03:34.000 --> 03:36.000
and so when that thing gets freed by the garbage collector,

03:36.000 --> 03:41.000
some stats are updated, so it knows when those things are freed,

03:41.000 --> 03:46.000
and in addition to seeing allocation volume, you can also see in-use statistics,

03:46.000 --> 03:48.000
which is usually what we care about, right?

03:48.000 --> 03:51.000
We don't care about all the garbage, all the things that were allocated and freed.

03:51.000 --> 03:55.000
We want to know what the kind of outstanding, in-use allocations are.

03:57.000 --> 04:01.000
But Go's GC... so, do we want to count garbage?

04:01.000 --> 04:04.000
Like, garbage is, you know, things that are going to go away,

04:04.000 --> 04:07.000
but if you're dealing with OOMs, you do want to count garbage, right?

04:07.000 --> 04:11.000
So until the garbage collector gets to the sweep point of its cycle,

04:11.000 --> 04:14.000
you know, all your garbage is actually consuming memory.

04:14.000 --> 04:22.000
So, the way this works in the profiler is that, as it's profiling and recording allocs

04:22.000 --> 04:26.000
and frees, they get stored in this kind of, like, scratch,

04:26.000 --> 04:31.000
temporary, you know, future bucket, and then at the end of the sweep,

04:31.000 --> 04:37.000
all that, you know, future free, temp stuff gets swept into the active bucket,

04:37.000 --> 04:41.000
and then that's what's reported to you from the Go pprof endpoints.

04:41.000 --> 04:46.000
So the active bucket, the active, you know, profile is never going to have

04:46.000 --> 04:49.000
any garbage in it as of the end of the sweep.

04:49.000 --> 04:55.000
So what we do is we count all those things, so we read not only just the active buckets;

04:55.000 --> 05:00.000
oomprof can read those future buckets, and so if, you know,

05:00.000 --> 05:04.000
the allocation that caused your problems was recent, a very recent allocation,

05:04.000 --> 05:10.000
which can happen, we'll see those, even though they may not have shown up in a memory profile.

05:10.000 --> 05:12.000
Does that make sense?

05:12.000 --> 05:18.000
So that's where we are in the garbage collection cycle; but anyway, so the first step

05:18.000 --> 05:23.000
of oomprof is that, you know, we have a watcher program that scans

05:23.000 --> 05:26.000
all the pids on your system, it finds the Go programs,

05:26.000 --> 05:30.000
and then it tries to find this symbol called runtime.mbuckets.

05:30.000 --> 05:34.000
So this is the address of the actual memory profile buckets.

05:34.000 --> 05:40.000
So if it can find them, we'll register them, so we'll have a BPF map that says,

05:40.000 --> 05:44.000
these are the Go programs we care about, these are the ones where we know

05:44.000 --> 05:51.000
where the profiling data lives, so these are the ones we're going to watch for.

05:51.000 --> 05:57.000
This stuff is all implemented in the Parca agent, and the oomprof code itself

05:57.000 --> 05:59.000
is a library that we just use.

05:59.000 --> 06:03.000
So you can use the oomprof library separate from the Parca agent,

06:03.000 --> 06:06.000
but I'm not really going to talk about that because it's a little complicated,

06:06.000 --> 06:12.000
but if you're interested in that, you can see me after class.

06:12.000 --> 06:17.000
Anyway, so it works with a couple of tracepoints.

06:17.000 --> 06:23.000
There's a mark_victim tracepoint that allows you to see

06:23.000 --> 06:27.000
when the OOM killer has picked a victim,

06:27.000 --> 06:30.000
and so you can get the pid from that data.

06:30.000 --> 06:33.000
So we can use that to say, okay, here's the pid that's dying:

06:33.000 --> 06:37.000
is this in our map of the Go processes we care about?

06:37.000 --> 06:41.000
And if it is, we know, oh, we care about this, so we can go

06:41.000 --> 06:44.000
read its memory and we're done, right?

06:44.000 --> 06:48.000
Well, not really, it's a little more complicated,

06:48.000 --> 06:52.000
because the OOM killer runs in the context of whatever program was

06:52.000 --> 06:55.000
allocating memory, and that may not be your program, right?

06:55.000 --> 06:58.000
So the mark_victim eBPF program

06:58.000 --> 07:02.000
can't just start reading data from your Go program,

07:02.000 --> 07:06.000
because the context may be systemd or some other program.

07:06.000 --> 07:09.000
So how do we solve that?

07:09.000 --> 07:14.000
So the way we solve that after much spelunking

07:14.000 --> 07:19.000
was to realize that the way the OOM killer works is it uses kill,

07:19.000 --> 07:22.000
so it sends a kill signal to the program that's dying,

07:22.000 --> 07:27.000
so we can hook onto the signal_deliver tracepoint

07:27.000 --> 07:31.000
to see when our program receives this,

07:31.000 --> 07:36.000
and luckily for us, the signal_deliver eBPF tracepoint

07:36.000 --> 07:39.000
occurs in the context of your program.

07:39.000 --> 07:43.000
So our eBPF program that's hooked up to the signal_deliver tracepoint

07:43.000 --> 07:47.000
can just start reading the memory from your process,

07:47.000 --> 07:51.000
and it's still there and it can read it and it just kind of works,

07:51.000 --> 07:53.000
which is nice.

07:53.000 --> 07:56.000
So, you know, just a brief point about this: yes,

07:56.000 --> 07:59.000
there is some overhead in attaching a tracepoint

07:59.000 --> 08:01.000
to signal_deliver, but, you know,

08:01.000 --> 08:05.000
typically signal delivery is not like a super high-throughput thing,

08:05.000 --> 08:10.000
and typically it's not on the critical path

08:10.000 --> 08:13.000
for key workloads, but who knows,

08:13.000 --> 08:15.000
you know, your mileage may vary, so.

08:15.000 --> 08:18.000
When in doubt, measure.

08:18.000 --> 08:22.000
Let's see, let's set it.

08:22.000 --> 08:24.000
All right.

08:24.000 --> 08:31.000
So just an overview of, you know, how this all works.

08:32.000 --> 08:35.000
The Parca agent... the Parca agent is, you know,

08:35.000 --> 08:39.000
a profiling agent that Polar Signals helps develop,

08:39.000 --> 08:43.000
and then it installs the tracepoints and sets up the maps.

08:43.000 --> 08:45.000
And then it scans the processes;

08:45.000 --> 08:48.000
it registers the ones where it can find the bucket address.

08:48.000 --> 08:52.000
Then when an OOM happens, the mark_victim and signal_deliver tracepoints

08:52.000 --> 08:54.000
fire (and exit and exec too, sorry),

08:54.000 --> 09:00.000
and then what we do in the eBPF signal_deliver tracepoint

09:00.000 --> 09:05.000
is that we rip through the buckets of the Go profile,

09:05.000 --> 09:07.000
copy those into an eBPF map,

09:07.000 --> 09:10.000
and then we send a signal via a BPF

09:10.000 --> 09:13.000
perf event output to the user-space agent.

09:13.000 --> 09:16.000
And then the Parca agent gets that signal and says,

09:16.000 --> 09:19.000
oh, a program died, an OOM event.

09:19.000 --> 09:21.000
Now, I can read that map, read those buckets,

09:21.000 --> 09:23.000
which have been copied from your Go program,

09:23.000 --> 09:26.000
and your Go program at this point is dead and gone,

09:26.000 --> 09:29.000
but that eBPF map is still there.

09:29.000 --> 09:32.000
And we can read all the entries from that,

09:32.000 --> 09:34.000
and then create a pprof profile,

09:34.000 --> 09:37.000
and it can be uploaded to Polar Signals,

09:37.000 --> 09:39.000
and then you see your profile,

09:39.000 --> 09:42.000
and you can figure out what happened and save the day, right?

09:42.000 --> 09:44.000
So, does it work?

09:44.000 --> 09:46.000
Sometimes, you know?

09:46.000 --> 09:48.000
Actually, it works pretty well, but like,

09:48.000 --> 09:50.000
there are things that can go wrong, right?

09:50.000 --> 09:56.000
So, the hash table that Go uses to store

09:56.000 --> 09:59.000
these things has a large prime number of slots,

09:59.000 --> 10:03.000
and then each slot has an unbounded number of

10:03.000 --> 10:07.000
possible entries, because it just uses linked-list

10:07.000 --> 10:08.000
chaining.

10:08.000 --> 10:12.000
So, in theory, we could have an infinite number of records to read.

10:12.000 --> 10:17.000
And obviously, that kind of doesn't fly with an eBPF program,

10:17.000 --> 10:18.000
right? The verifier's pretty strict

10:18.000 --> 10:21.000
about, you know, how much execution you can do.

10:21.000 --> 10:25.000
So, our current implementation

10:25.000 --> 10:29.000
can, in one EVPF program,

10:29.000 --> 10:35.000
read about 3,000 buckets, which seems like a small number,

10:35.000 --> 10:38.000
but we haven't done a lot to optimize that

10:38.000 --> 10:41.000
and could probably get a little better, but anyway,

10:41.000 --> 10:45.000
that's not enough for a kind of, you know,

10:45.000 --> 10:48.000
reasonably sized Go program.

10:48.000 --> 10:50.000
So, that's no good.

10:50.000 --> 10:53.000
But then luckily, we can tap into tail calls

10:53.000 --> 10:56.000
and get that up to 100,000 buckets.

10:56.000 --> 10:59.000
And that turns out to be, you know,

10:59.000 --> 11:01.000
for all the things we've tried to throw at it,

11:01.000 --> 11:02.000
there's plenty of buckets.

11:02.000 --> 11:04.000
So, the math is a little hard, right?

11:04.000 --> 11:07.000
So, what is a bucket? A bucket is a unique allocation stack.

11:07.000 --> 11:08.000
And so, if you look at a program,

11:08.000 --> 11:10.000
like you can't just look at it and go,

11:10.000 --> 11:13.000
how many unique allocation paths are in that program?

11:13.000 --> 11:14.000
You know, it could be a small number,

11:14.000 --> 11:16.000
it could be a huge number, you know,

11:16.000 --> 11:18.000
basically, probably roughly correlates

11:18.000 --> 11:19.000
with the size of the program.

11:19.000 --> 11:21.000
So, if you have like a huge go program,

11:22.000 --> 11:23.000
maybe that's not enough.

11:23.000 --> 11:25.000
But in practice that we've seen,

11:25.000 --> 11:26.000
that's plenty.

11:26.000 --> 11:29.000
And then also, because that map's pre-allocated,

11:29.000 --> 11:32.000
how many buckets you support

11:32.000 --> 11:34.000
factors into the size of the map.

11:34.000 --> 11:37.000
And the map size, that roughly works out to:

11:37.000 --> 11:39.000
60,000 buckets takes 40 megabytes.

11:39.000 --> 11:42.000
So, that's not a ton of memory,

11:42.000 --> 11:44.000
but this is configurable, right?

11:44.000 --> 11:46.000
So, you could jack it up,

11:46.000 --> 11:48.000
and you can get up to, like I said,

11:48.000 --> 11:50.000
100,000 before we start running into

11:50.000 --> 11:53.000
our eBPF limits.

11:53.000 --> 11:55.000
So, if you have more,

11:55.000 --> 11:56.000
what happens?

11:56.000 --> 11:58.000
Well, we just stop, right?

11:58.000 --> 12:00.000
I mean, we only, you know,

12:00.000 --> 12:03.000
the loop in our program is limited

12:03.000 --> 12:05.000
to a certain number of iterations.

12:05.000 --> 12:06.000
If we get to the end,

12:06.000 --> 12:08.000
and we haven't reached the end of all

12:08.000 --> 12:10.000
of the memory allocation,

12:10.000 --> 12:11.000
buckets in the hash table,

12:11.000 --> 12:12.000
we'll just record,

12:12.000 --> 12:14.000
but now we didn't get to the end.

12:14.000 --> 12:16.000
This is incomplete.

12:16.000 --> 12:18.000
So, now that may or may not be useful,

12:18.000 --> 12:20.000
like, you know, it depends

12:20.000 --> 12:22.000
on whether we hit the interesting allocations;

12:22.000 --> 12:24.000
maybe the interesting allocations

12:24.000 --> 12:26.000
come later in the hash map,

12:26.000 --> 12:27.000
and we didn't get to them,

12:27.000 --> 12:29.000
but you really still have something

12:29.000 --> 12:30.000
so you can look at it.

12:30.000 --> 12:33.000
So, it's a bit of a crapshoot in that situation,

12:33.000 --> 12:34.000
but at least you know,

12:34.000 --> 12:35.000
like, we'll tell you,

12:35.000 --> 12:38.000
whether it's a complete read or not.

12:38.000 --> 12:40.000
So, why do these limits exist?

12:40.000 --> 12:42.000
Just a little more detail,

12:42.000 --> 12:43.000
like,

12:43.000 --> 12:44.000
the,

12:44.000 --> 12:47.000
my slides suck by the way,

12:47.000 --> 12:49.000
so I apologize for that.

12:49.000 --> 12:50.000
Like, I should have like,

12:50.000 --> 12:51.000
pictures and stuff.

12:51.000 --> 12:52.000
Anyway,

12:52.000 --> 12:56.000
the way these records are laid out,

12:56.000 --> 12:58.000
you have like a header,

12:58.000 --> 13:01.000
you have the stack,

13:01.000 --> 13:04.000
and then you have like,

13:04.000 --> 13:07.000
the actual mem stats that tell you,

13:07.000 --> 13:08.000
how many, you know,

13:08.000 --> 13:10.000
allocs and frees and whatnot.

13:10.000 --> 13:11.000
So, the header's a fixed size,

13:11.000 --> 13:13.000
and then it's got a number at the end,

13:13.000 --> 13:14.000
which is the number of stack frames,

13:14.000 --> 13:16.000
and the stack part is anywhere from zero to a thousand,

13:16.000 --> 13:18.000
stack entries,

13:18.000 --> 13:20.000
which is just the call stack up to that,

13:20.000 --> 13:22.000
and then after that are the mem stats.

13:22.000 --> 13:25.000
So, our eBPF program has to do three reads

13:25.000 --> 13:28.000
to figure this out: it needs to read the header,

13:28.000 --> 13:29.000
read the stack,

13:29.000 --> 13:30.000
and then read the mem.

13:30.000 --> 13:31.000
So, that's,

13:31.000 --> 13:32.000
it's pretty simple,

13:32.000 --> 13:33.000
you know,

13:33.000 --> 13:34.000
if you want to go look at the code,

13:34.000 --> 13:35.000
I encourage you to do so,

13:35.000 --> 13:36.000
but that,

13:36.000 --> 13:37.000
you know,

13:37.000 --> 13:39.000
doing those three reads,

13:39.000 --> 13:41.000
the instructions to pull out the data

13:41.000 --> 13:42.000
we're interested in,

13:42.000 --> 13:44.000
copy them to the eBPF map,

13:45.000 --> 13:46.000
is why,

13:46.000 --> 13:47.000
you know,

13:47.000 --> 13:50.000
we only get three thousand or so buckets per program.

13:55.000 --> 13:56.000
The,

13:56.000 --> 13:57.000
you know,

13:57.000 --> 13:58.000
the other thing I was going to point out is that

13:58.000 --> 14:01.000
the stack can be up to a thousand frames,

14:01.000 --> 14:04.000
but our representation fixes it at 64,

14:04.000 --> 14:06.000
so it's a fixed size allocation,

14:06.000 --> 14:08.000
so that's why the three reads instead of two,

14:08.000 --> 14:13.000
because we don't support the full thousand stack frames,

14:13.000 --> 14:15.000
that Go allows.

14:15.000 --> 14:17.000
Typically, 64 is enough.

14:17.000 --> 14:18.000
This is all configurable,

14:18.000 --> 14:19.000
like,

14:19.000 --> 14:20.000
if your use case is different,

14:20.000 --> 14:22.000
You can go in there and change it.

14:22.000 --> 14:24.000
So,

14:24.000 --> 14:26.000
what can go wrong?

14:26.000 --> 14:27.000
you know,

14:27.000 --> 14:28.000
eBPF reads can fail,

14:28.000 --> 14:29.000
you know,

14:29.000 --> 14:30.000
that's a fact of life,

14:30.000 --> 14:31.000
you know,

14:31.000 --> 14:32.000
if the, you know,

14:32.000 --> 14:34.000
the program's under memory pressure,

14:34.000 --> 14:35.000
which,

14:35.000 --> 14:37.000
if the OOM killer's happening,

14:37.000 --> 14:38.000
you know,

14:38.000 --> 14:40.000
the machine is under memory pressure,

14:40.000 --> 14:42.000
it could be possible that parts of the,

14:43.000 --> 14:45.000
the bucket array,

14:45.000 --> 14:47.000
that massive array, or any of those allocations,

14:47.000 --> 14:48.000
whatever pages they live on,

14:48.000 --> 14:50.000
could get put out to swap,

14:50.000 --> 14:51.000
and if we try to read them,

14:51.000 --> 14:53.000
we just get an error back.

14:53.000 --> 14:54.000
So, that happens.

14:54.000 --> 14:55.000
So, again,

14:55.000 --> 14:57.000
that's why we have that incomplete flag.

14:57.000 --> 14:58.000
If, for

14:58.000 --> 14:59.000
some reason,

14:59.000 --> 15:01.000
we don't get to the end of all of your buckets,

15:01.000 --> 15:03.000
we'll let you know that's an incomplete read.

15:06.000 --> 15:07.000
And then,

15:07.000 --> 15:09.000
something that can go wrong

15:09.000 --> 15:10.000
is,

15:10.000 --> 15:11.000
sometimes,

15:11.000 --> 15:12.000
the OOM killer gets impatient,

15:12.000 --> 15:13.000
and,

15:13.000 --> 15:15.000
while it's trying to kill your program,

15:15.000 --> 15:17.000
it will come along and kill another program,

15:17.000 --> 15:18.000
in that case,

15:18.000 --> 15:19.000
it can also be that the Parca agent

15:19.000 --> 15:20.000
it gets killed,

15:20.000 --> 15:21.000
and then,

15:21.000 --> 15:22.000
you know,

15:22.000 --> 15:24.000
what's the point of oomprof at that point,

15:24.000 --> 15:26.000
because that data's not going anywhere.

15:26.000 --> 15:27.000
But,

15:27.000 --> 15:29.000
it works surprisingly well in practice,

15:29.000 --> 15:30.000
you know,

15:30.000 --> 15:32.000
we have a bunch of different tests in our CI,

15:32.000 --> 15:33.000
that tests different,

15:33.000 --> 15:34.000
you know,

15:34.000 --> 15:36.000
programs that crash in different ways,

15:36.000 --> 15:38.000
and it's pretty reliable for the tests

15:38.000 --> 15:40.000
That we have so far,

15:40.000 --> 15:41.000
but,

15:41.000 --> 15:42.000
you know,

15:42.000 --> 15:43.000
there are probably lots of failure modes

15:43.000 --> 15:44.000
I don't know of,

15:44.000 --> 15:45.000
but,

15:45.000 --> 15:47.000
those are the ones we've seen.

15:47.000 --> 15:50.000
So, future directions for oomprof.

15:50.000 --> 15:51.000
We,

15:51.000 --> 15:55.000
you know,

15:55.000 --> 15:56.000
kind of,

15:56.000 --> 15:57.000
I think the inspiration for Go's

15:57.000 --> 15:58.000
memory profiler

15:58.000 --> 15:59.000
probably came from jemalloc

15:59.000 --> 16:00.000
and its profiler,

16:00.000 --> 16:01.000
so,

16:01.000 --> 16:04.000
including jemalloc support would be pretty easy,

16:04.000 --> 16:05.000
and be cool,

16:05.000 --> 16:07.000
so it wouldn't be just Go programs.

16:08.000 --> 16:09.000
Or tcmalloc,

16:09.000 --> 16:10.000
or mimalloc,

16:10.000 --> 16:12.000
which is commonly used in Rust programs,

16:12.000 --> 16:14.000
would be good to support.

16:14.000 --> 16:15.000
In theory,

16:15.000 --> 16:17.000
we could probably support glibc,

16:17.000 --> 16:21.000
but that would be a little bit more of a project,

16:21.000 --> 16:24.000
because it doesn't have a profiler built in.

16:24.000 --> 16:26.000
And then right now,

16:26.000 --> 16:28.000
we have that limitation where,

16:28.000 --> 16:30.000
if we can't find that runtime

16:30.000 --> 16:32.000
bucket symbol in your program,

16:32.000 --> 16:33.000
then we're dead,

16:33.000 --> 16:34.000
right,

16:34.000 --> 16:36.000
we can't profile it. A lot of Go binaries,

16:36.000 --> 16:37.000
as well,

16:37.000 --> 16:38.000
you know,

16:38.000 --> 16:39.000
strip out those symbols,

16:39.000 --> 16:40.000
so,

16:40.000 --> 16:41.000
not a lot,

16:41.000 --> 16:43.000
I would say some.

16:43.000 --> 16:45.000
But there are ways of finding that stuff,

16:45.000 --> 16:46.000
if it's a stripped binary,

16:46.000 --> 16:49.000
but we just haven't gone to those lengths yet,

16:49.000 --> 16:50.000
but,

16:50.000 --> 16:51.000
you know,

16:51.000 --> 16:52.000
in theory,

16:52.000 --> 16:53.000
you can find a function that,

16:53.000 --> 16:54.000
references that thing,

16:54.000 --> 16:56.000
and then disassemble the function,

16:56.000 --> 16:58.000
and find the address that way.

16:58.000 --> 17:00.000
And the other thing we're looking at,

17:00.000 --> 17:01.000
is,

17:01.000 --> 17:02.000
you know,

17:02.000 --> 17:03.000
like I said before,

17:03.000 --> 17:05.000
a lot of profiling for Go programs

17:05.000 --> 17:07.000
happens by scraping the pprof endpoint;

17:07.000 --> 17:08.000
it's on an interval.

17:08.000 --> 17:10.000
But what if we could do push profiles?

17:10.000 --> 17:12.000
and what if your program could say,

17:12.000 --> 17:13.000
oh,

17:13.000 --> 17:14.000
something interesting happened,

17:14.000 --> 17:16.000
or I'm in some interesting phase of my program.

17:16.000 --> 17:18.000
We're doing some kind of memory-

17:18.000 --> 17:20.000
intensive optimization thing.

17:20.000 --> 17:22.000
Let's push a profile,

17:22.000 --> 17:23.000
you know,

17:23.000 --> 17:24.000
so we have the ability,

17:24.000 --> 17:26.000
because we can just read them at any time.

17:26.000 --> 17:28.000
We can push those profiles,

17:28.000 --> 17:29.000
You know,

17:29.000 --> 17:31.000
instead of having them get pulled.

17:32.000 --> 17:34.000
So,

17:34.000 --> 17:35.000
I encourage you to check it out.

17:35.000 --> 17:36.000
Like I said,

17:36.000 --> 17:38.000
the oomprof project is

17:38.000 --> 17:40.000
under the Parca umbrella

17:40.000 --> 17:43.000
that Polar Signals created.

17:43.000 --> 17:45.000
That's the library,

17:45.000 --> 17:46.000
and then it's included,

17:46.000 --> 17:48.000
and built into the Parca agent.

17:48.000 --> 17:51.000
And then the Parca website has a bunch of

17:51.000 --> 17:53.000
docs and a link to get started on it.

17:53.000 --> 17:54.000
Has anyone ever used Parca?

17:54.000 --> 17:56.000
Anybody?

17:57.000 --> 18:00.000
So, check it out.

18:00.000 --> 18:02.000
You get the Parca agent.

18:02.000 --> 18:03.000
There's a lot of useful things,

18:03.000 --> 18:07.000
in addition to just doing profiling.

18:07.000 --> 18:09.000
So,

18:09.000 --> 18:11.000
and then I guess my last slide wasn't included.

18:11.000 --> 18:12.000
That's all right.

18:12.000 --> 18:13.000
It was just my name.

18:13.000 --> 18:14.000
So,

18:14.000 --> 18:16.000
well,

18:16.000 --> 18:17.000
thank you.

18:17.000 --> 18:18.000
Thank you.


18:30.000 --> 18:32.000
So, we have lots of time for questions,

18:32.000 --> 18:34.000
so I hope you have questions.

18:34.000 --> 18:35.000
Yeah, thanks very much.

18:35.000 --> 18:37.000
I definitely will try it out.

18:37.000 --> 18:39.000
Especially since I do a lot of direct,

18:39.000 --> 18:40.000
we do a lot of direct profiling,

18:40.000 --> 18:41.000
just like you said,

18:41.000 --> 18:42.000
and I do it directly,

18:42.000 --> 18:43.000
against the running

18:43.000 --> 18:44.000
executable,

18:44.000 --> 18:45.000
Go.

18:45.000 --> 18:46.000
I was just wondering,

18:46.000 --> 18:49.000
does this offer any additional information?

18:49.000 --> 18:51.000
Or is it literally just the pprof I'm going to get

18:51.000 --> 18:52.000
at the end?

18:52.000 --> 18:54.000
Is there something extra I get,

18:54.000 --> 18:56.000
in addition to the pprof itself?

18:56.000 --> 18:57.000
So,

18:57.000 --> 18:59.000
it's just the pprof,

18:59.000 --> 19:02.000
and then the information you get is

19:02.000 --> 19:03.000
did a read error occur;

19:03.000 --> 19:04.000
so,

19:04.000 --> 19:05.000
along the way,

19:05.000 --> 19:07.000
if we failed to read from the map,

19:07.000 --> 19:08.000
we'll tell you that that happened,

19:08.000 --> 19:10.000
and then is it complete?

19:10.000 --> 19:13.000
It can be incomplete if we,

19:13.000 --> 19:15.000
if there are too many buckets for us to read,

19:15.000 --> 19:16.000
so we didn't get to the end.

19:16.000 --> 19:17.000
So,

19:17.000 --> 19:19.000
for most cases,

19:19.000 --> 19:20.000
you know,

19:20.000 --> 19:21.000
complete will be true,

19:21.000 --> 19:22.000
and read error will be false,

19:22.000 --> 19:24.000
and that's a full profile,

19:24.000 --> 19:25.000
and you should be able to use that,

19:25.000 --> 19:27.000
to kind of figure out what happened.

19:30.000 --> 19:31.000
Hi,

19:31.000 --> 19:32.000
yeah,

19:32.000 --> 19:33.000
I have a question myself,

19:33.000 --> 19:34.000
over here.

19:34.000 --> 19:35.000
Yeah.

19:37.000 --> 19:39.000
This seems really useful;

19:39.000 --> 19:41.000
I'm super excited to try it out.

19:41.000 --> 19:42.000
I'm sure it will help

19:42.000 --> 19:45.000
to debug lots of these OOM issues.

19:45.000 --> 19:46.000
While you were giving the talk,

19:46.000 --> 19:47.000
I had a thought,

19:47.000 --> 19:50.000
and I'm wondering what your opinion is on it.

19:50.000 --> 19:52.000
So, as the memory pressure builds up,

19:52.000 --> 19:54.000
the kernel obviously tries to do a bunch of things,

19:54.000 --> 19:56.000
like trying to swap pages out,

19:56.000 --> 19:59.000
and basically get them out of memory,

19:59.000 --> 20:01.000
so it can free up space for the allocation,

20:01.000 --> 20:03.000
and it's only after a lot of trying,

20:03.000 --> 20:04.000
when it can't,

20:04.000 --> 20:06.000
that it OOMs the process.

20:06.000 --> 20:08.000
And I'm wondering if you can,

20:08.000 --> 20:10.000
maybe use eBPF in this process

20:10.000 --> 20:13.000
to kind of start to build a very compact model,

20:13.000 --> 20:14.000
of which pages,

20:14.000 --> 20:16.000
the kernel is having problems with,

20:16.000 --> 20:19.000
and if you can communicate that to the user space,

20:19.000 --> 20:21.000
so maybe the user space application can get,

20:21.000 --> 20:22.000
ahead of the OOM,

20:22.000 --> 20:24.000
like it has more context about

20:24.000 --> 20:25.000
which memory should be available

20:25.000 --> 20:26.000
and not be available.

20:26.000 --> 20:27.000
Like early warning system.

20:27.000 --> 20:29.000
Yeah, like an early warning system,

20:29.000 --> 20:31.000
but like with a model of the memory that the kernel has,

20:31.000 --> 20:32.000
so it knows that,

20:32.000 --> 20:34.000
okay, the kernel's having trouble with this section

20:34.000 --> 20:35.000
of memory,

20:35.000 --> 20:37.000
and maybe I have some buckets over here

20:37.000 --> 20:39.000
that I really don't need,

20:39.000 --> 20:41.000
and I can shunt them out.

20:41.000 --> 20:43.000
Yeah, I mean that would be nice, right?

20:43.000 --> 20:46.000
If there was like a signal or something you could tap into,

20:46.000 --> 20:48.000
I'm not aware of anything like that.

20:48.000 --> 20:51.000
And I do know that like the go garbage collector

20:51.000 --> 20:52.000
will, you know,

20:52.000 --> 20:54.000
if it can't allocate,

20:54.000 --> 20:56.000
will, you know, before,

20:56.000 --> 20:57.000
you know, dying,

20:57.000 --> 20:59.000
it will try to run the garbage collector

20:59.000 --> 21:01.000
and try to free up space.

21:01.000 --> 21:04.000
But I don't think there's any kind of like proactive signals

21:04.000 --> 21:06.000
that Linux sends,

21:06.000 --> 21:08.000
but that would probably the way to do it, right?

21:08.000 --> 21:10.000
Like if basically garbage collector could know

21:10.000 --> 21:12.000
that it's running out of memory

21:12.000 --> 21:14.000
that maybe it can try to adjust its,

21:14.000 --> 21:15.000
you know, policy,

21:15.000 --> 21:18.000
so that maybe it doesn't over-allocate too much.

21:18.000 --> 21:20.000
But yeah, it's good idea.

21:20.000 --> 21:22.000
I mean, I also know that there's work

21:22.000 --> 21:24.000
ongoing in the kernel

21:24.000 --> 21:26.000
with improving some of this stuff

21:26.000 --> 21:27.000
and making, you know,

21:27.000 --> 21:28.000
kind of more information available,

21:28.000 --> 21:29.000
like you're talking about.

21:29.000 --> 21:31.000
So, I think the issue is also,

21:31.000 --> 21:33.000
preemptively knowing an OOM

21:33.000 --> 21:34.000
is going to happen,

21:34.000 --> 21:35.000
because you could get high pressure.

21:35.000 --> 21:37.000
But high pressure doesn't necessarily

21:37.000 --> 21:38.000
mean an OOM is actually coming.

21:38.000 --> 21:41.000
That's the challenge I think.

21:42.000 --> 21:43.000
Sorry?

21:45.000 --> 21:46.000
Well, he said,

21:46.000 --> 21:47.000
oh,

21:47.000 --> 21:48.000
so, what did you say?

21:48.000 --> 21:50.000
Ah, "preemptively."

21:50.000 --> 21:51.000
Yeah, preemptively is hard.

21:51.000 --> 21:53.000
It's hard to know preemptively

21:53.000 --> 21:55.000
when you're going to get the OOM, right?

21:55.000 --> 21:56.000
So even if the pressure's high,

21:56.000 --> 21:58.000
even if it sent the signals out,

21:58.000 --> 22:00.000
it could be 10,000 of those signals,

22:00.000 --> 22:01.000
so like, you know,

22:01.000 --> 22:02.000
which one do I get?

22:02.000 --> 22:04.000
Which one corresponds to the OOM I'm looking for?

22:04.000 --> 22:06.000
So, I don't think that.

22:06.000 --> 22:07.000
I think that.

22:07.000 --> 22:08.000
That's okay.

22:08.000 --> 22:09.000
Preemptively is hard.

22:10.000 --> 22:14.000
Yeah, thanks again for your presentation.

22:14.000 --> 22:16.000
I'm wondering a bit,

22:16.000 --> 22:18.000
in that we kind of have the OOM

22:18.000 --> 22:20.000
situation split into two use cases.

22:20.000 --> 22:22.000
Either the machine runs out of memory,

22:22.000 --> 22:24.000
or you have a cgroup-limited process,

22:24.000 --> 22:28.000
which is like 99% of the OOMs.

22:28.000 --> 22:30.000
That process usually gets killed

22:30.000 --> 22:32.000
because it's not the machine that runs out of memory.

22:32.000 --> 22:34.000
It's the limit that's been given.

22:34.000 --> 22:35.000
Yeah.

22:35.000 --> 22:36.000
So the cgroup.

22:36.000 --> 22:38.000
And so I was wondering,

22:38.000 --> 22:40.000
there's work being done

22:40.000 --> 22:43.000
by Roman Gushchin, from Google I believe,

22:43.000 --> 22:48.000
to add OOM management to the BPF capabilities.

22:48.000 --> 22:50.000
So you, you don't get a warning,

22:50.000 --> 22:53.000
but you get plenty more abilities to manage that.

22:53.000 --> 22:55.000
And then the machine is not out of memory.

22:55.000 --> 22:58.000
So you have plenty of time and plenty of memory

22:58.000 --> 23:01.000
to get things done and do all this stuff,

23:01.000 --> 23:03.000
which is more of a thing

23:03.000 --> 23:05.000
than managing the out of memory of the machine,

23:05.000 --> 23:07.000
because then that's a panic.

23:07.000 --> 23:08.000
Yeah.

23:08.000 --> 23:09.000
Yeah.

23:09.000 --> 23:10.000
I mean, oomprof works whether

23:10.000 --> 23:12.000
it's a cgroup or a machine.

23:12.000 --> 23:15.000
So, you know, Parca Agent usually runs in its own,

23:15.000 --> 23:17.000
you know, pod in Kubernetes, and when it

23:17.000 --> 23:19.000
OOMs, that can happen,

23:19.000 --> 23:22.000
because people will typically tell Parca Agent

23:22.000 --> 23:23.000
like don't use a lot of memory

23:23.000 --> 23:24.000
because it's a, you know,

23:24.000 --> 23:26.000
continuous production profiler.

23:26.000 --> 23:27.000
So it's not supposed to,

23:27.000 --> 23:30.000
but yeah, it works in both cases.

23:30.000 --> 23:32.000
And, you know, these things are constantly

23:32.000 --> 23:34.000
going to get refined, and it's a good thing, right?

23:34.000 --> 23:37.000
Like, I think, you know, for certain applications,

23:37.000 --> 23:38.000
especially databases.

23:38.000 --> 23:40.000
Like, you want to be able to use as much memory

23:40.000 --> 23:42.000
as you possibly can and run tight

23:42.000 --> 23:45.000
and, you know, cache as many things as you want

23:45.000 --> 23:48.000
and have, you know, flexible ways

23:48.000 --> 23:50.000
to respond to low-memory environments

23:50.000 --> 23:52.000
and clean up some stuff.

23:52.000 --> 23:55.000
But this is kind of the thing, like,

23:55.000 --> 23:57.000
that didn't exist before for OOMs:

23:57.000 --> 23:59.000
like, when things got too bad,

23:59.000 --> 24:01.000
the process just had to die;

24:01.000 --> 24:03.000
now you know why.

24:03.000 --> 24:05.000
One question over here.

24:05.000 --> 24:06.000
Yeah.

24:06.000 --> 24:07.000
So, if I understand them correctly,

24:07.000 --> 24:08.000
at the end of the day,

24:08.000 --> 24:09.000
it's a correlation of information

24:09.000 --> 24:11.000
from user space and kernel space

24:11.000 --> 24:13.000
via a regular BPF map.

24:13.000 --> 24:15.000
So is there feasible or reasonable

24:15.000 --> 24:17.000
to try maybe to use task local storage

24:17.000 --> 24:19.000
for these purposes to store those buckets?

24:19.000 --> 24:20.000
Use which storage?

24:20.000 --> 24:21.000
Task local storage.

24:21.000 --> 24:23.000
So like a different type of BPF maps

24:23.000 --> 24:25.000
that are local to, to tasks.

24:25.000 --> 24:26.000
Ah, maybe.

24:26.000 --> 24:28.000
So what would be the advantage there?

24:28.000 --> 24:30.000
Essentially the advantage

24:30.000 --> 24:31.000
would be that you will get it

24:31.000 --> 24:33.000
inside your BPF program,

24:33.000 --> 24:35.000
right there, local to your thread;

24:35.000 --> 24:37.000
that's going to be a performance advantage

24:37.000 --> 24:39.000
and probably a couple of other perks.

24:39.000 --> 24:42.000
It is a different type of BPF map for these purposes.

24:42.000 --> 24:43.000
Interesting idea.

24:43.000 --> 24:44.000
Yeah.

24:44.000 --> 24:45.000
Yeah.

24:45.000 --> 24:46.000
Because I'm really, I'm asking,

24:46.000 --> 24:48.000
because it seems to be popular nowadays

24:48.000 --> 24:49.000
for profiling of different types of things,

24:49.000 --> 24:51.000
mostly for CPU profiling, like,

24:51.000 --> 24:52.000
if you take, say, the Python

24:52.000 --> 24:54.000
runtime or something like that,

24:54.000 --> 24:56.000
it would still be a similar idea,

24:56.000 --> 24:57.000
runtime profiling,

24:57.000 --> 24:59.000
but in your case it's going to be memory.

25:00.000 --> 25:01.000
Okay.

25:01.000 --> 25:02.000
And I'm curious if you're, like,

25:02.000 --> 25:04.000
investigating in that direction?

25:04.000 --> 25:06.000
Not at the moment.

25:06.000 --> 25:07.000
Yeah.

25:07.000 --> 25:09.000
Excuse me, could I ask something?

25:09.000 --> 25:10.000
Yeah.

25:10.000 --> 25:15.000
This project was used in the context of a project

25:15.000 --> 25:18.000
where we had to do an unwinder for LuaJIT.

25:18.000 --> 25:21.000
And in the course of doing that project,

25:21.000 --> 25:25.000
we had some code that was written by your

25:25.000 --> 25:26.000
colleague.

25:26.000 --> 25:28.000
It wasn't so great that it OOMed,

25:28.000 --> 25:30.000
so that was kind of the,

25:30.000 --> 25:31.000
that was the thing it was solving.

25:31.000 --> 25:32.000
And it worked.

25:32.000 --> 25:34.000
So, I'm sure there's a million ways

25:34.000 --> 25:35.000
that could be improved,

25:35.000 --> 25:37.000
but it's open source and it's out there.

25:37.000 --> 25:39.000
So, going off of that.

25:39.000 --> 25:42.000
So, I'm curious if you have plans to work

25:42.000 --> 25:44.000
with the Go runtime,

25:44.000 --> 25:46.000
and you mentioned LuaJIT,

25:46.000 --> 25:49.000
to have some support on their side,

25:49.000 --> 25:51.000
to make it more reliable.

25:52.000 --> 25:53.000
Thank you.

25:53.000 --> 25:55.000
So, making things more reliable

25:55.000 --> 25:57.000
from a running out of memory perspective.

25:57.000 --> 26:00.000
Well, the way Go does this right now,

26:00.000 --> 26:01.000
it's super inconvenient,

26:01.000 --> 26:03.000
because this hashmap, for instance,

26:03.000 --> 26:06.000
is scattered all over the memory space.

26:06.000 --> 26:09.000
So, for instance, just, you know,

26:09.000 --> 26:10.000
if they did something

26:10.000 --> 26:12.000
where it wasn't

26:12.000 --> 26:13.000
that scattered,

26:13.000 --> 26:14.000
but stayed in,

26:14.000 --> 26:18.000
I don't know,

26:18.000 --> 26:20.000
three megabytes of contiguous memory space,

26:20.000 --> 26:22.000
then this could be some special mapping

26:22.000 --> 26:27.000
that lived on after the process has been killed.

26:27.000 --> 26:30.000
But I don't want to suggest this.

26:30.000 --> 26:32.000
I'm just thinking, in general,

26:32.000 --> 26:34.000
that maybe if programming languages

26:34.000 --> 26:36.000
added some support,

26:36.000 --> 26:38.000
the whole infrastructure could work better.

26:38.000 --> 26:39.000
And since you're like,

26:39.000 --> 26:41.000
doing the pioneering work in this area,

26:41.000 --> 26:45.000
maybe you started doing something like that already.

26:45.000 --> 26:46.000
I haven't,

26:46.000 --> 26:47.000
but I mean,

26:47.000 --> 26:48.000
I know there are solutions,

26:49.000 --> 26:51.000
especially in database software that do stuff like this,

26:51.000 --> 26:52.000
that kind of monitor,

26:52.000 --> 26:53.000
you know,

26:53.000 --> 26:54.000
how much memory is free,

26:54.000 --> 26:56.000
and will kind of

26:56.000 --> 26:58.000
react early

26:58.000 --> 27:00.000
to low memory situations

27:00.000 --> 27:02.000
to avoid this kind of stuff.
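
The pattern being described, database-style software watching its own memory use and shedding caches before the kernel steps in, can be sketched roughly like this. The cache class and the 90% watermark are purely hypothetical, for illustration:

```python
from collections import OrderedDict

class PressureAwareCache:
    """Toy LRU cache that sheds entries when usage crosses a watermark."""

    def __init__(self, budget_bytes: int, high_watermark: float = 0.9):
        self.budget = budget_bytes
        self.high = high_watermark
        self.used = 0
        self.items: OrderedDict[str, bytes] = OrderedDict()

    def put(self, key: str, value: bytes) -> None:
        self.items[key] = value
        self.items.move_to_end(key)  # mark as most recently used
        self.used += len(value)
        self._maybe_shed()

    def _maybe_shed(self) -> None:
        # React early: start evicting at 90% of the budget,
        # well before an allocation failure or an OOM kill.
        while self.used > self.budget * self.high and self.items:
            _, evicted = self.items.popitem(last=False)  # drop LRU entry
            self.used -= len(evicted)

cache = PressureAwareCache(budget_bytes=100)
cache.put("a", b"x" * 50)
cache.put("b", b"x" * 50)   # 100 bytes > 90-byte watermark: evicts "a"
print(sorted(cache.items))  # ['b']
print(cache.used)           # 50
```

In real systems the trigger would come from actual memory accounting (allocator statistics, cgroup usage, or PSI) rather than a self-maintained byte counter, but the shape of the response is the same.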

27:02.000 --> 27:04.000
But things happen,

27:04.000 --> 27:07.000
and it's good to have an escape

27:07.000 --> 27:08.000
you know,

27:08.000 --> 27:11.000
hatch for when it dies.

27:11.000 --> 27:12.000
All right.

27:12.000 --> 27:13.000
Thanks for coming out.

27:13.000 --> 27:15.000
Thank you, Tommy.

27:18.000 --> 27:20.000
Thank you.

