In these exercises we'll take a close look at inverse kinematics through the lens of continuous optimization. In fact, you'll implement a scheme like this one "for real" in A4. Hopefully these exercises will build your confidence in the concepts behind the A4 tasks, give you a deeper understanding of what's going on, and show you how to improve your IK solver---and why things might fail!

## Build a Kinematic Chain

Consider a linear kinematic chain connecting three vertices p0, p1, p2, with angles of rotation theta0, theta1, as depicted above. Ultimately, this chain will try to "reach" for a given goal point x, i.e., it will try to bring the final vertex as close as possible to x. For now, just give expressions for the locations of the rotated vertices q0, q1, and q2 as functions of the original positions p0, p1, p2 and the current angles theta0, theta1.
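If you want to check your expressions numerically, here is one possible sketch in Python/NumPy. It assumes the common convention that theta0 rotates the whole chain about p0, and theta1 then rotates the second bar about the already-moved joint q1; the helper names (`rotate`, `chain_positions`) are placeholders of mine, not part of A4.

```python
import numpy as np

def rotate(theta, u):
    """Rotate the 2D vector u counterclockwise by the angle theta."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([c * u[0] - s * u[1], s * u[0] + c * u[1]])

def chain_positions(p0, p1, p2, theta0, theta1):
    """Deformed vertex positions of the two-bar chain.

    Assumes theta0 rotates the whole chain about p0, and theta1 then
    rotates the second bar about the already-moved joint q1.
    """
    q0 = p0                                     # the base vertex stays put
    q1 = p0 + rotate(theta0, p1 - p0)           # first bar rotated by theta0
    q2 = q1 + rotate(theta0 + theta1, p2 - p1)  # second bar sees both angles
    return q0, q1, q2
```

A quick sanity check: since each bar is only rotated, the segment lengths |q1 - q0| and |q2 - q1| should always equal the original bar lengths, for any angles.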

## Define the Energy

Write an expression for an energy Phi of the angles theta0 and theta1 that is minimized when the chain is as close as possible to reaching the goal. You may also use other quantities in this expression (such as the points p or q), as long as the dependence of q on the angles theta0 and theta1 is clearly indicated.
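One natural choice is half the squared distance from the final vertex to the goal, Phi(theta0, theta1) = 1/2 |q2(theta0, theta1) - x|^2, which is zero exactly when the chain reaches x. A sketch of this energy (using the same forward-kinematics convention as above; the function names are mine):

```python
import numpy as np

def rotate(theta, u):
    c, s = np.cos(theta), np.sin(theta)
    return np.array([c * u[0] - s * u[1], s * u[0] + c * u[1]])

def end_effector(theta, p0, p1, p2):
    """Position q2 of the final vertex, as a function of the angles theta = (theta0, theta1)."""
    q1 = p0 + rotate(theta[0], p1 - p0)
    return q1 + rotate(theta[0] + theta[1], p2 - p1)

def energy(theta, p0, p1, p2, x):
    """Phi(theta) = 1/2 |q2(theta) - x|^2, minimized when the chain gets as close as it can to x."""
    r = end_effector(theta, p0, p1, p2) - x
    return 0.5 * np.dot(r, r)
```

The factor of 1/2 is just a convenience: it cancels the 2 that appears when differentiating the square, giving a cleaner gradient.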

## Take the Gradient

Give an expression for the gradient of the energy Phi with respect to the angles theta0 and theta1.
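As a hint at one way this works out: since the derivative of a rotation is d/dtheta R(theta) u = perp(R(theta) u), where perp is a quarter-turn, the chain rule gives dq2/dtheta0 = perp(q2 - q0) and dq2/dtheta1 = perp(q2 - q1), so each gradient component is a dot product of (q2 - x) with one of these vectors. A sketch under the same parameterization as above (verified against finite differences; names are mine):

```python
import numpy as np

def rotate(theta, u):
    c, s = np.cos(theta), np.sin(theta)
    return np.array([c * u[0] - s * u[1], s * u[0] + c * u[1]])

def perp(u):
    """Counterclockwise quarter-turn: d/dtheta R(theta) u = perp(R(theta) u)."""
    return np.array([-u[1], u[0]])

def energy(theta, p0, p1, p2, x):
    q1 = p0 + rotate(theta[0], p1 - p0)
    q2 = q1 + rotate(theta[0] + theta[1], p2 - p1)
    return 0.5 * np.dot(q2 - x, q2 - x)

def grad_energy(theta, p0, p1, p2, x):
    """Gradient of Phi = 1/2 |q2 - x|^2 with respect to (theta0, theta1)."""
    q0 = p0
    q1 = p0 + rotate(theta[0], p1 - p0)
    q2 = q1 + rotate(theta[0] + theta[1], p2 - p1)
    r = q2 - x
    # chain rule: dq2/dtheta0 = perp(q2 - q0), dq2/dtheta1 = perp(q2 - q1)
    return np.array([np.dot(r, perp(q2 - q0)), np.dot(r, perp(q2 - q1))])
```

A finite-difference check like the one below is a good habit whenever you derive a gradient by hand.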

## Implement Gradient Descent

Write a basic gradient descent routine that minimizes Phi with respect to theta0, theta1. This routine will take an initial guess for the two angles, a maximum number of steps, and a step size tau. For now, you should just take fixed-size time steps.
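A minimal sketch of such a routine, building on the energy and gradient above (the interface here, returning the whole trajectory and energy history, is just one convenient choice, since you'll want both for plotting in the next part):

```python
import numpy as np

def rotate(theta, u):
    c, s = np.cos(theta), np.sin(theta)
    return np.array([c * u[0] - s * u[1], s * u[0] + c * u[1]])

def perp(u):
    return np.array([-u[1], u[0]])

def energy_and_grad(theta, p0, p1, p2, x):
    q0 = p0
    q1 = p0 + rotate(theta[0], p1 - p0)
    q2 = q1 + rotate(theta[0] + theta[1], p2 - p1)
    r = q2 - x
    grad = np.array([np.dot(r, perp(q2 - q0)), np.dot(r, perp(q2 - q1))])
    return 0.5 * np.dot(r, r), grad

def gradient_descent(theta_init, tau, max_steps, p0, p1, p2, x):
    """Fixed-step gradient descent on Phi.

    Returns the trajectory of angles and the energy at each step.
    """
    theta = np.array(theta_init, dtype=float)
    trajectory, energies = [theta.copy()], []
    for _ in range(max_steps):
        phi, grad = energy_and_grad(theta, p0, p1, p2, x)
        energies.append(phi)
        theta -= tau * grad          # fixed-size step opposite the gradient
        trajectory.append(theta.copy())
    return np.array(trajectory), np.array(energies)
```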

## Run Gradient Descent

Now run gradient descent, for a variety of different initial guesses and step sizes. Plot the trajectory, and visualize the resulting animation. How does the system behave for a large step? A small step? Does it always converge to a local minimum?
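A possible driver for these experiments (self-contained; the particular goal point, initial guess, and tau values are my choices, not from the assignment):

```python
import numpy as np

def rotate(theta, u):
    c, s = np.cos(theta), np.sin(theta)
    return np.array([c * u[0] - s * u[1], s * u[0] + c * u[1]])

def energy_and_grad(theta, p0, p1, p2, x):
    q1 = p0 + rotate(theta[0], p1 - p0)
    q2 = q1 + rotate(theta[0] + theta[1], p2 - p1)
    r = q2 - x
    perp = lambda u: np.array([-u[1], u[0]])
    return 0.5 * np.dot(r, r), np.array([np.dot(r, perp(q2 - p0)),
                                         np.dot(r, perp(q2 - q1))])

def run(tau, steps=200, init=(0.0, 0.0)):
    """Run fixed-step gradient descent; return the energy at every step."""
    p0, p1, p2 = np.array([0., 0.]), np.array([1., 0.]), np.array([2., 0.])
    x = np.array([0.5, 1.0])
    theta = np.array(init)
    energies = []
    for _ in range(steps):
        phi, g = energy_and_grad(theta, p0, p1, p2, x)
        energies.append(phi)
        theta -= tau * g
    return np.array(energies)

# Small steps tend to creep slowly toward a minimum; overly large steps
# can overshoot and oscillate, or even increase the energy.
for tau in (0.01, 0.1, 1.0):
    print("tau =", tau, " final energy =", run(tau)[-1])
```

To visualize the behavior, plot the returned energies against the step number (e.g. with matplotlib's `plt.plot`), or animate the chain by re-evaluating the vertex positions at each recorded pair of angles.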

## Add Line Search

You may have noticed that it's hard to pick the "best" time step for gradient descent a priori. Let's improve our gradient descent routine by adding a simple backtracking line search. In particular, before taking a gradient step, we will "search" for a time step tau that ensures the energy goes down (rather than up) on the next step. To do this, we first measure the energy at the current step. We then take a tentative step of size tau, and measure the energy at the new configuration. If the new energy is bigger than the old energy, we decrease the size of the step and try again.
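The steps above can be sketched as follows (the shrink factor of 1/2 and the tiny lower bound on tau are common but arbitrary choices of mine):

```python
import numpy as np

def rotate(theta, u):
    c, s = np.cos(theta), np.sin(theta)
    return np.array([c * u[0] - s * u[1], s * u[0] + c * u[1]])

def perp(u):
    return np.array([-u[1], u[0]])

def energy_and_grad(theta, p0, p1, p2, x):
    q1 = p0 + rotate(theta[0], p1 - p0)
    q2 = q1 + rotate(theta[0] + theta[1], p2 - p1)
    r = q2 - x
    return 0.5 * np.dot(r, r), np.array([np.dot(r, perp(q2 - p0)),
                                         np.dot(r, perp(q2 - q1))])

def gradient_descent_ls(theta_init, tau0, max_steps, p0, p1, p2, x,
                        shrink=0.5, tau_min=1e-12):
    """Gradient descent where each step backtracks until the energy decreases."""
    theta = np.array(theta_init, dtype=float)
    energies = []
    for _ in range(max_steps):
        phi, g = energy_and_grad(theta, p0, p1, p2, x)
        energies.append(phi)
        tau = tau0
        # shrink the tentative step until the new energy is below the old one
        while energy_and_grad(theta - tau * g, p0, p1, p2, x)[0] >= phi:
            tau *= shrink
            if tau < tau_min:   # no decrease possible; likely at a minimum
                return theta, np.array(energies)
        theta -= tau * g
    return theta, np.array(energies)
```

Note that by construction the energy can never go up from one step to the next, which is exactly the reliability that fixed-step gradient descent lacks.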

Once you've implemented line search, try it out for an initial time step that didn't work with fixed-step gradient descent. As a sanity check, also plot the energy as a function of the step number. Is your routine more reliable than it was before?