Chain rule
Let's take an arbitrary function f that takes variables x and y as input, and suppose there is some small change in each variable so that $x \to x + \Delta x$ and $y \to y + \Delta y$. Using this, we can find the change in f using the following:

$$\Delta f = f(x + \Delta x,\, y + \Delta y) - f(x, y)$$
This leads us to the following equation:

$$\Delta f \approx \frac{\partial f}{\partial x}\,\Delta x + \frac{\partial f}{\partial y}\,\Delta y$$
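To make this concrete, here is a minimal Python sketch that checks the approximation numerically; the function $f(x, y) = x^2 y$ and the step sizes are illustrative choices of mine, not from the text:

```python
# Check the first-order approximation
#   delta_f ≈ (∂f/∂x) Δx + (∂f/∂y) Δy
# for the illustrative function f(x, y) = x**2 * y,
# whose partials are ∂f/∂x = 2xy and ∂f/∂y = x**2.

def f(x, y):
    return x**2 * y

x, y = 1.0, 2.0
dx, dy = 1e-4, -2e-4                      # small changes in each input

exact = f(x + dx, y + dy) - f(x, y)       # true change in f
approx = (2 * x * y) * dx + (x**2) * dy   # first-order estimate

print(exact, approx)  # the two values agree up to second-order terms
```

For the small steps above, the estimate and the true change match to several decimal places; shrinking dx and dy further makes the agreement tighter, which is exactly what the approximation predicts.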
Then, by taking the limit as $\Delta x \to 0$ and $\Delta y \to 0$, we can derive the chain rule for partial derivatives.
We express this as follows:

$$df = \frac{\partial f}{\partial x}\,dx + \frac{\partial f}{\partial y}\,dy$$
We now divide this equation by a small change $dt$ in an additional variable $t$ on which x and y both depend, to find the gradient of f along the curve $(x(t), y(t))$. The preceding equation then becomes this one:

$$\frac{df}{dt} = \frac{\partial f}{\partial x}\frac{dx}{dt} + \frac{\partial f}{\partial y}\frac{dy}{dt}$$
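As a quick numerical sketch (the curve $x(t) = \cos t$, $y(t) = \sin t$ and the function f below are my own illustrative choices, not from the text), we can check this formula by comparing the chain-rule value of $df/dt$ against a centred finite difference:

```python
import math

def f(x, y):
    return x**2 * y  # same illustrative function as before

def df_dt(t):
    # Chain rule: df/dt = (∂f/∂x)(dx/dt) + (∂f/∂y)(dy/dt)
    x, y = math.cos(t), math.sin(t)
    dx_dt, dy_dt = -math.sin(t), math.cos(t)
    return (2 * x * y) * dx_dt + (x**2) * dy_dt

t, h = 0.7, 1e-6
numeric = (f(math.cos(t + h), math.sin(t + h))
           - f(math.cos(t - h), math.sin(t - h))) / (2 * h)

print(df_dt(t), numeric)  # analytic and numeric values agree closely
```

This pattern, differentiating a scalar function through intermediate variables that all depend on a single parameter, is the same computation that backpropagation performs layer by layer.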
The differentiation rules that we came across earlier still apply here and can be extended to the multivariable case.
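For example, the product rule still holds when we differentiate with respect to one variable while treating the others as constants. A short sympy sketch (the functions u and v are illustrative choices of mine, and it assumes sympy is available):

```python
import sympy as sp

x, y = sp.symbols('x y')
u = x**2 * y        # illustrative functions of two variables
v = sp.sin(x * y)

# Product rule with partial derivatives:
#   ∂(uv)/∂x = u ∂v/∂x + v ∂u/∂x
lhs = sp.diff(u * v, x)
rhs = u * sp.diff(v, x) + v * sp.diff(u, x)

print(sp.simplify(lhs - rhs))  # prints 0, so the rule carries over
```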