WEBVTT

00:42.640 --> 00:48.640
So, yep, hello everyone. I'm Matt Graham. I am a research data scientist at the UCL Centre

00:48.640 --> 00:52.640
for Advanced Research Computing, and today I'm going to be giving you a whirlwind tour of

00:52.640 --> 00:58.640
accelerated and differentiable scientific computing with JAX. So, just to begin with

00:58.640 --> 01:02.640
an overview of what JAX is: it's a Python library which is perhaps best known

01:02.640 --> 01:06.640
for its use in machine learning and artificial intelligence contexts, but it's

01:06.640 --> 01:11.640
actually also used in broader scientific computing and high-performance numerical computing contexts,

01:11.640 --> 01:16.640
and that's what we're considering here. It started out as a research project at Google

01:16.640 --> 01:24.640
Brain, by a group including some of the key contributors to the Python autograd package and the

01:24.640 --> 01:30.640
XLA compiler project. The initial version of JAX was described in a 2018

01:30.640 --> 01:35.640
SysML conference paper, and there was an associated open-source release under an

01:35.640 --> 01:41.640
Apache 2.0 license on GitHub in October 2018. Since that point, it's seen

01:41.640 --> 01:46.640
widespread adoption in both machine learning and scientific computing contexts, and there's a large

01:46.640 --> 01:51.640
and vibrant open-source community that's developed around it, and in particular quite a

01:51.640 --> 01:58.640
huge downstream package ecosystem building various functionality on top of it. So, just to

01:58.640 --> 02:04.640
give some of the key features that JAX offers. I would say one of the standout

02:04.640 --> 02:10.640
features is that if you're already familiar with the NumPy way of coding up problems

02:11.640 --> 02:15.640
and with the scientific Python ecosystem, there's a very shallow learning curve to get

02:15.640 --> 02:21.640
you up and running with JAX. It offers an almost drop-in NumPy API replacement that you can

02:21.640 --> 02:27.640
directly use in place of NumPy, but the real power of JAX comes from the

02:27.640 --> 02:31.640
composable function transformations that it offers for doing things like just-in-time

02:31.640 --> 02:39.640
compilation, automatic differentiation, automatic vectorization, and parallelization. On the

02:39.640 --> 02:46.640
platform side, it allows you to run across both CPUs and also accelerator devices, such as

02:46.640 --> 02:53.640
graphics processing units and Google's tensor processing units. So, just to give a bit of context
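
[Editor's illustration] As a minimal sketch of what one of these transforms looks like in use, here is automatic vectorization with jax.vmap applied to a hypothetical toy function (not one from the talk):

```python
import jax
import jax.numpy as jnp  # near drop-in replacement for numpy

# A hypothetical toy function, written once for a single pair of
# vectors using the NumPy-style API.
def squared_distance(x, y):
    return jnp.sum((x - y) ** 2)

# jax.vmap vectorizes over a leading batch axis of x without
# rewriting the function body.
batched = jax.vmap(squared_distance, in_axes=(0, None))

xs = jnp.arange(6.0).reshape(3, 2)  # batch of three 2-vectors
y = jnp.ones(2)
print(batched(xs, y))  # one squared distance per batch element
```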

02:53.640 --> 02:57.640
while I'm talking about JAX: my main experience is from developing downstream packages that

02:57.640 --> 03:02.640
build on top of it, and so I'm not going to have much time to talk about these today, but if you're

03:02.640 --> 03:06.640
interested, do come and chat to me afterwards. I work on a package for approximate

03:06.640 --> 03:11.640
inference which builds upon JAX's differentiability, and another one for doing spherical harmonic transforms

03:11.640 --> 03:22.640
in a space and Earth sciences context. So, how is JAX actually structured? At the

03:22.640 --> 03:27.640
top, there's the actual JAX Python package and its public API, which exposes both

03:27.640 --> 03:32.640
the function transforms that I'll talk a little bit about, and various numerical modules

03:32.640 --> 03:38.640
for building your functions on top of. There's also a relatively well-developed

03:38.640 --> 03:43.640
series of APIs for providing extensions on top of JAX,

03:43.640 --> 03:47.640
and for accessing the internals. Underneath that, one of the key components is

03:47.640 --> 03:52.640
jaxlib, which provides the Python wrappers around the binary components of JAX, and which

03:52.640 --> 03:57.640
is separated from the pure-Python JAX package. Alongside that, there are some

03:57.640 --> 04:03.640
vendor-specific plugins for some of the multi-device support. Underneath all of that, JAX

04:03.640 --> 04:08.640
basically builds on top of the OpenXLA compiler infrastructure, using the

04:08.640 --> 04:13.640
StableHLO MLIR dialect for the communication between the Python package and the

04:13.640 --> 04:20.640
XLA compiler toolchain, and that's what gives you this device-agnostic behavior.

04:20.640 --> 04:25.640
So, just to flag that, from more recent versions of JAX and, indeed,

04:25.640 --> 04:30.640
NumPy, there's support for what's called the Python array API standard, which allows you to write code

04:30.640 --> 04:35.640
which is portable across different array back-ends. So, this would look something like: rather than

04:35.640 --> 04:41.640
importing numpy as np, you would take your array namespace from one of your array arguments,

04:41.640 --> 04:47.640
using the special __array_namespace__ method, and then define your numerical operations

04:47.640 --> 04:51.640
in terms of that. This gives you the advantage that if you pass in a JAX array, you will

04:52.640 --> 04:58.640
dispatch to the JAX operations, but if you pass in a NumPy array, you will dispatch to the NumPy operations.

04:58.640 --> 05:04.640
So, I've mentioned JAX has these function transformations; just to dig a little bit into how they actually

05:04.640 --> 05:10.640
work, and in particular the just-in-time compilation transformation. JAX transforms work by

05:10.640 --> 05:16.640
tracing a function with abstract arguments that represent the specific shapes and types of the

05:16.640 --> 05:21.640
actual arguments, recording all of the different numerical operations that are applied to those.

05:21.640 --> 05:26.640
In particular, for just-in-time compilation, the output of that tracing is an internal

05:26.640 --> 05:30.640
intermediate representation called a jaxpr, short for JAX expression.
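
[Editor's illustration] You can inspect this intermediate representation directly with jax.make_jaxpr; a toy example (the function here is hypothetical):

```python
import jax
import jax.numpy as jnp

def f(x):
    return jnp.sum(jnp.sin(x) ** 2)

# Tracing f with an abstract stand-in for x records the primitive
# operations into a jaxpr, which we can print.
jaxpr = jax.make_jaxpr(f)(jnp.ones(3))
print(jaxpr)
```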

05:30.640 --> 05:36.640
There are then a series of lowering rules defined to translate those numerical

05:36.640 --> 05:41.640
primitives that you've traced in the jaxpr to operations in the StableHLO dialect,

05:42.640 --> 05:47.640
which is then what is passed to the XLA compiler; an executable is compiled,

05:47.640 --> 05:52.640
wrapped as a new Python function, and executed, with the outputs returned

05:52.640 --> 05:58.640
and the executable cached, so that we don't hit that compilation overhead on subsequent

05:58.640 --> 06:04.640
function invocations. So, in code, what does this look like? If we have some Python function,

06:04.640 --> 06:10.640
we apply the jax.jit transform, we get a new function out, and then we would expect that

06:10.640 --> 06:16.640
if we evaluate that on some real argument, we will get numerically equivalent outputs,

06:16.640 --> 06:21.640
but for larger functions, we would see a performance gain from the jitted version once

06:21.640 --> 06:26.640
we've amortized the cost of compilation. So, as well as just-in-time compilation,
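
[Editor's illustration] A sketch of that workflow; the function here is a hypothetical stand-in, not the one from the slides:

```python
import jax
import jax.numpy as jnp

# A hypothetical scalar-valued function of an array argument.
def f(x):
    return jnp.sum(jnp.cos(x) * jnp.sin(x))

f_jit = jax.jit(f)  # traced and compiled on first call, then cached

x = jnp.linspace(0.0, 1.0, 1000)
print(jnp.allclose(f(x), f_jit(x)))  # numerically equivalent outputs
```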

06:26.640 --> 06:31.640
one of the other key features that JAX provides is automatic differentiation.

06:31.640 --> 06:38.640
So, one of the most basic instances of this is the jax.grad

06:38.640 --> 06:44.640
transform: this takes a scalar-valued function and returns a function which computes the

06:44.640 --> 06:50.640
gradient with respect to its real-valued arguments. So, if we apply that to the simple example

06:50.640 --> 06:54.640
we coded up earlier, rather than getting a scalar out, we now get an array that represents the

06:54.640 --> 07:00.640
derivatives with respect to each of the elements. And just to flag: because this is using an

07:00.640 --> 07:04.640
efficient reverse-mode automatic differentiation implementation, the gradients are actually

07:04.640 --> 07:08.640
only a constant overhead factor more expensive to compute than the function itself.
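
[Editor's illustration] A sketch, reusing the same hypothetical function; since transforms compose, the gradient can itself be jit-compiled:

```python
import jax
import jax.numpy as jnp

# Hypothetical scalar-valued function of an array argument.
def f(x):
    return jnp.sum(jnp.cos(x) * jnp.sin(x))

grad_f = jax.grad(f)           # d f / d x, same shape as x
fast_grad_f = jax.jit(grad_f)  # transforms compose: jit of grad

x = jnp.linspace(0.0, 1.0, 5)
print(grad_f(x))  # array of per-element derivatives, here cos(2 x)
```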

07:08.640 --> 07:14.640
The AD transform uses the same tracing approach as just-in-time compilation,

07:14.640 --> 07:20.640
and indeed the other transforms that JAX provides are also built on top of this tracing architecture,

07:20.640 --> 07:24.640
and can be composed with each other, so you can do just-in-time compilation of your derivatives,

07:24.640 --> 07:30.640
and so on. So, just to touch on some of JAX's sharp edges,

07:30.640 --> 07:34.640
which can be a little bit challenging to work around when you're first using JAX.

07:34.640 --> 07:38.640
So, the eagle-eyed among you may have noticed in some of those examples that we were

07:38.640 --> 07:42.640
getting single-precision floating-point outputs. This is aligned with

07:42.640 --> 07:46.640
the fact that JAX is typically used on accelerators, so its default

07:46.640 --> 07:50.640
semantics around data types are more tailored to that use case.

07:50.640 --> 07:54.640
If you want behavior closer to what you would expect from base NumPy,

07:54.640 --> 07:58.640
you will need to enable the jax_enable_x64 configuration option.
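
[Editor's illustration] For example:

```python
import jax

# Must be set before any arrays are created.
jax.config.update("jax_enable_x64", True)

import jax.numpy as jnp

print(jnp.ones(3).dtype)  # float64 instead of the default float32
```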

07:58.640 --> 08:06.640
JAX arrays are immutable, and in general JAX assumes a functional

08:06.640 --> 08:10.640
programming style, where the functions being transformed are pure,

08:10.640 --> 08:14.640
and to do indexed updates, you need to use these functional

08:14.640 --> 08:20.640
.at methods rather than the standard NumPy in-place update syntax.
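
[Editor's illustration] A minimal sketch of the functional update style:

```python
import jax.numpy as jnp

x = jnp.zeros(4)
# x[1] = 5.0 would raise an error: JAX arrays are immutable.
y = x.at[1].set(5.0)  # functional indexed update, returns a new array

print(y)     # [0. 5. 0. 0.]
print(x[1])  # 0.0 -- original array is unchanged
```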

08:20.640 --> 08:26.640
Just-in-time compilation in particular puts some requirements on the traced control flow:

08:26.640 --> 08:32.640
if using base Python control flow, this can only depend on the shapes and

08:32.640 --> 08:36.640
data types of array arguments, not their values, and you also need to be able to

08:36.640 --> 08:40.640
infer the shapes of all of the intermediate and output arrays statically from the

08:40.640 --> 08:44.640
input arguments. There are structured control-flow primitives that

08:44.640 --> 08:47.640
JAX provides that do allow data-dependent control flow within a

08:47.640 --> 08:50.640
jit compilation context, but then you're having to step away from

08:50.640 --> 08:56.640
just using basic Python syntax.
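
[Editor's illustration] One of these structured control-flow primitives is jax.lax.cond; a hypothetical example of data-dependent branching under jit:

```python
import jax
import jax.numpy as jnp
from jax import lax

@jax.jit
def clipped_sqrt(x):
    # Under jit, a Python `if x < 0:` would fail because the branch
    # would depend on a traced value; lax.cond expresses it instead.
    return lax.cond(x < 0, jnp.zeros_like, jnp.sqrt, x)

print(clipped_sqrt(4.0))   # 2.0
print(clipped_sqrt(-1.0))  # 0.0
```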

08:56.640 --> 09:00.640
So, to conclude: I would say the major selling point of JAX is that it offers a very

09:00.640 --> 09:04.640
low barrier to entry for exploiting GPU acceleration,

09:04.640 --> 09:08.640
and indeed doing that in a multi-vendor way: it does support

09:08.640 --> 09:12.640
Nvidia and AMD GPUs, and I think there's some limited support for

09:12.640 --> 09:16.640
Intel as well, and on top of that, getting automatic differentiation

09:16.640 --> 09:22.640
and the other transforms that JAX offers. With JAX and NumPy

09:22.640 --> 09:26.640
and a number of other libraries now supporting the array API standard,

09:26.640 --> 09:28.640
you can actually write portable code across these different

09:28.640 --> 09:32.640
array back-ends with relatively minimal effort compared to targeting

09:32.640 --> 09:36.640
just one of them. The requirement, though, to be able to trace

09:36.640 --> 09:40.640
functions as part of JAX's transform infrastructure does put

09:40.640 --> 09:44.640
some limitations on, for example, control flow, and that can somewhat

09:44.640 --> 09:48.640
run counter to, for example, that back-end portability:

09:48.640 --> 09:51.640
if you're having to lean on these structured control

09:51.640 --> 09:55.640
primitives, they're JAX-specific, and so you do end up being a

09:55.640 --> 09:59.640
little bit more tied in in some cases. Okay, that's everything.

09:59.640 --> 10:03.640
Thanks for listening. If you are interested in learning more about JAX,

10:03.640 --> 10:07.640
there's a longer tutorial I gave as part of the ARCHER2 training series

10:07.640 --> 10:10.640
back in December, and all the material for it is available here.

10:10.640 --> 10:11.640
Thank you.

