The Chain Rule
Recall that in function composition we obtain the function $(f\circ g)(x)=f(g(x))$ by replacing every occurrence of $x$ in $f(x)$ with $g$ to obtain $f(g)$, and then substituting the formula for $g(x)$ for every occurrence of $g$ in $f(g)$. What if we then wanted to determine the derivative of $(f\circ g)(x)$?
For example, if $f(x)=x^3+3$ and $g(x)=x+2$ then
\begin{eqnarray}
(f\circ g)(x)&=&f(g(x))\\
&=&(x+2)^3+3\\
&=&(x+2)(x+2)(x+2)+3\\
&=&(x^2+4x+4)(x+2)+3\\
&=&x^3+6x^2+12x+11.
\end{eqnarray}
Since $(f\circ g)(x)$ is a polynomial, and we know how to differentiate a polynomial term by term, $(f\circ g)'(x)=3x^2+12x+12$.
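
The expansion and its derivative can be double-checked mechanically. The following is a minimal Python sketch, not something from the original discussion. It assumes a polynomial is stored as a list of coefficients with the lowest degree first, and the helper names `poly_mul` and `poly_derivative` are chosen here purely for illustration.

```python
def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists (lowest degree first)."""
    result = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            result[i + j] += a * b
    return result


def poly_derivative(p):
    """Differentiate term by term: the derivative of c*x^k is k*c*x^(k-1)."""
    return [k * c for k, c in enumerate(p)][1:]


g = [2, 1]                           # g(x) = x + 2
cubed = poly_mul(poly_mul(g, g), g)  # (x + 2)^3
composite = cubed[:]                 # copy before adding the constant term
composite[0] += 3                    # (f o g)(x) = (x + 2)^3 + 3

print(composite)                   # coefficients of 11 + 12x + 6x^2 + x^3
print(poly_derivative(composite))  # coefficients of 12 + 12x + 3x^2
```

Reading the lists from right to left recovers $x^3+6x^2+12x+11$ and $3x^2+12x+12$.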
Finding $(f\circ g)'(x)$ was difficult mostly because we first had to find $(f\circ g)(x)$. But what if we could find $(f\circ g)'(x)$ without actually having to find $(f\circ g)(x)$ first? This is where the Chain Rule comes in handy. It is best to state the Chain Rule using Leibniz notation:
The Chain Rule (Leibniz notation) |
---|
\[ \frac{d}{dx}\left[f\circ g\right] =\frac{df}{dg}\cdot\frac{dg}{dx} \] |
The Leibniz notation is useful here because it provides an easy way to remember the Chain Rule. Let's call $e(x)=f(g(x))=(f\circ g)(x)$. Then finding the derivative of the composition is the same as finding $de/dx$. Remember that anything nonzero divided by itself is $1$. So, treating $dg$ as if it were an ordinary nonzero quantity, let's use $1=dg/dg$. Therefore,
\[
\frac{de}{dx}=\frac{de}{dx}\cdot 1=\frac{de}{dx}\cdot \frac{dg}{dg}=\frac{de}{dg}\cdot\frac{dg}{dx}.
\]
But $de/dg=df/dg$ because $e=f(g)$. Therefore,
\[
\frac{de}{dx}=\frac{df}{dg}\cdot\frac{dg}{dx}.
\]
Again, this is not a proof, but a simple way to keep the rule in your head. (The actual proof is here in case you really need to know.)
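
The cancellation trick is only a memory aid for derivatives, but it is literally true for finite changes, as long as the change in $g$ is not zero. The sketch below is an illustration added here, not part of the original argument. It uses the example functions $f(x)=x^3+3$ and $g(x)=x+2$, an arbitrarily chosen point $x=1$ and step $h=1/1000$, and Python's `fractions.Fraction` so that the arithmetic is exact.

```python
from fractions import Fraction


def f(u):
    """Outer function f(u) = u^3 + 3."""
    return u ** 3 + 3


def g(x):
    """Inner function g(x) = x + 2."""
    return x + 2


x = Fraction(1)         # sample point (an arbitrary choice)
h = Fraction(1, 1000)   # a small but finite step in x

delta_g = g(x + h) - g(x)        # change in g caused by the step
delta_e = f(g(x + h)) - f(g(x))  # change in e = f(g(x))

lhs = delta_e / h                          # (change in e) / (change in x)
rhs = (delta_e / delta_g) * (delta_g / h)  # (change in e / change in g) * (change in g / change in x)

print(lhs == rhs)   # the two sides agree exactly for finite changes
print(float(lhs))   # approaches (f o g)'(1) = 3*(1+2)^2 = 27 as h shrinks
```

Making `h` smaller moves the printed quotient closer to $27$, which is the value the Chain Rule gives at $x=1$.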
In our "prime" notation the Chain Rule is:
The Chain Rule (prime notation) |
---|
\[(f\circ g)'(x)=f'(g(x))\,g'(x)\] |
Now let's write down a step-by-step procedure and apply it to our example above:
Step | Example |
---|---|
Determine $\frac{df}{dx}$ and $\frac{dg}{dx}$. | \[\frac{df}{dx}=3x^2\mbox{ and }\frac{dg}{dx}=1\] |
Replace every occurrence of $x$ in $\frac{df}{dx}$ with $g$. Call this $\frac{df}{dg}$. | \[\frac{df}{dg}=3g^2\] |
Substitute the formula for $g(x)$ for every occurrence of $g$ in $\frac{df}{dg}$. We still call this $\frac{df}{dg}$. | \[\frac{df}{dg}=3(x+2)^2\] |
Multiply $\frac{df}{dg}$ by $\frac{dg}{dx}$. This is $\frac{d}{dx}\left[f\circ g\right]$. | \[\frac{df}{dg}\cdot\frac{dg}{dx}=\left[3(x+2)^2\right]\cdot 1\] |
If needed, do some algebra to simplify the result. | \[\frac{df}{dg}\cdot\frac{dg}{dx}=3x^2+12x+12\] |
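
The steps in the table can be mirrored in a few lines of Python. This is only an illustrative sketch: the function names (`f_prime`, `g`, `g_prime`, `chain_rule_derivative`, `expanded_derivative`) are made up for this example, and the check simply compares the Chain Rule result with $3x^2+12x+12$, the derivative of the expanded polynomial, at a few sample points.

```python
def f_prime(u):
    """Derivative of f(u) = u^3 + 3 with respect to its argument."""
    return 3 * u ** 2


def g(x):
    """Inner function g(x) = x + 2."""
    return x + 2


def g_prime(x):
    """Derivative of g(x) = x + 2, which is the constant 1."""
    return 1


def chain_rule_derivative(x):
    """(f o g)'(x) = f'(g(x)) * g'(x)."""
    return f_prime(g(x)) * g_prime(x)


def expanded_derivative(x):
    """Derivative of the expanded polynomial x^3 + 6x^2 + 12x + 11."""
    return 3 * x ** 2 + 12 * x + 12


# Print x, the Chain Rule value, and the expanded-polynomial value side by side.
for x in [-3, 0, 1, 5]:
    print(x, chain_rule_derivative(x), expanded_derivative(x))
```

Note that `chain_rule_derivative` never needs the expanded form of $(f\circ g)(x)$; it only uses $f'$, $g$, and $g'$.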
So, in summary, there are two ways to find the derivative of $(f\circ g)(x)$: either find $(f\circ g)(x)$ and determine its derivative directly, or apply the Chain Rule using the derivatives of $f(x)$ and $g(x)$.
One of the most important uses of the Chain Rule is differentiating expressions of the form $[f(x)]^n$. We can think of such an expression as a function composition and find its derivative using the Chain Rule. Let's say that $u(x)=x^n$, so that $u'(x)=nx^{n-1}$. Then $[f(x)]^n=u(f(x))=(u\circ f)(x)$, and the Chain Rule gives $(u\circ f)'(x)=u'(f(x))\,f'(x)$, that is,
\[ \frac{d}{dx}\left[\,[f(x)]^n\,\right]=n\left[\,f(x)\,\right]^{n-1}\cdot\frac{df}{dx} \] |
---|
For example, try finding the derivative of $(x^2-1)^{100}$ by directly expanding it and then by using the Chain Rule. Expanding it will take you forever and then you still have to calculate the derivative. With the Chain Rule it only takes a few seconds. Problems like this make you appreciate the Chain Rule!
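
For this example the formula above gives $100(x^2-1)^{99}\cdot 2x=200x(x^2-1)^{99}$. As a sanity check, the sketch below compares that result with the slow route of expanding first. It is an illustration under stated assumptions: the expansion uses the binomial theorem through Python's `math.comb` (available from Python 3.8), the function names are hypothetical, and the comparison is made only at a handful of integer sample points where the arithmetic is exact.

```python
from math import comb

N = 100  # the exponent in (x^2 - 1)^100


def expanded_coefficients(n):
    """Coefficients of (x^2 - 1)^n, lowest degree first, via the binomial theorem."""
    coeffs = [0] * (2 * n + 1)
    for k in range(n + 1):
        # term: C(n, k) * (x^2)^k * (-1)^(n - k)
        coeffs[2 * k] = comb(n, k) * (-1) ** (n - k)
    return coeffs


def derivative_from_expansion(x, n):
    """Differentiate the expanded polynomial term by term and evaluate at x."""
    coeffs = expanded_coefficients(n)
    return sum(k * c * x ** (k - 1) for k, c in enumerate(coeffs) if k > 0)


def derivative_from_chain_rule(x, n):
    """n * [f(x)]^(n-1) * f'(x) with f(x) = x^2 - 1 and f'(x) = 2x."""
    return n * (x ** 2 - 1) ** (n - 1) * (2 * x)


# Compare the two routes at a few integer points.
for x in [-2, 0, 1, 3]:
    print(x, derivative_from_expansion(x, N) == derivative_from_chain_rule(x, N))
```

The expanded polynomial has 201 coefficients (half of them zero), while the Chain Rule version is a single line.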
Finally, for a beginning Calculus course you typically don't need to understand the proof of the Chain Rule. Our recommendation is that you do not attempt to learn it, but instead understand how it is applied. However, if you still want to see the proof you may find it here.