Week 211 – Sept. 11th to Sept. 17th

I had an interesting week working through KA. As you’ll see below, I only made it through four questions/concepts from two articles and six videos, but everything I worked on was pretty hard and took a good amount of time and effort for me to understand. I started the week finishing off an article on the second partial derivative test and worked through a question that was TWELVE screen shots long and took me eight pages of notes to work through. 😳 In the four-plus years I’ve been working on KA, that is without a doubt the longest question I’ve added to one of these posts. The other three things I worked through were also long and fairly involved, so even though I didn’t get as much done as I’d hoped I would this week, I’m still happy with the effort I made.

As a recap from last week, the second partial derivative test (SPDT) tells you whether a point on the graph of a two-variable function f(x, y) is a local max, a local min, or a saddle point. It does this by 1) finding where the gradient equals 0 and then 2) checking the sign of H = f_xx · f_yy − (f_xy)² at that point. In other words, it checks that the second partial derivatives in the x- and y-directions AND the mixed partial derivative (which captures the diagonal, xy-directions) all agree on whether the function curves up or down. Here’s the question I worked through from that article I was talking about:
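To make the test concrete, here is a minimal sketch on two made-up functions (they are not the question from the article). It assumes the point being tested is already a critical point, estimates the second partial derivatives with finite differences, and then checks the sign of H = f_xx · f_yy − (f_xy)². The function names, the step size `eps` and the tolerance are all illustrative choices.

```python
def second_partials(f, x, y, eps=1e-4):
    """Central finite-difference estimates of f_xx, f_yy and f_xy at (x, y)."""
    f_xx = (f(x + eps, y) - 2 * f(x, y) + f(x - eps, y)) / eps**2
    f_yy = (f(x, y + eps) - 2 * f(x, y) + f(x, y - eps)) / eps**2
    f_xy = (f(x + eps, y + eps) - f(x + eps, y - eps)
            - f(x - eps, y + eps) + f(x - eps, y - eps)) / (4 * eps**2)
    return f_xx, f_yy, f_xy


def classify(f, x, y, tol=1e-6):
    """Apply the second partial derivative test at a critical point (x, y)."""
    f_xx, f_yy, f_xy = second_partials(f, x, y)
    H = f_xx * f_yy - f_xy**2
    if H > tol:
        return "local min" if f_xx > 0 else "local max"
    if H < -tol:
        return "saddle point"
    return "inconclusive"


# Both functions curve upward along the x- and y-axes at (0, 0),
# but the size of the xy term decides what happens on the diagonals.
def bowl(x, y):
    return x**2 + y**2 + x * y       # f_xy = 1, so H = 2*2 - 1 = 3


def saddle(x, y):
    return x**2 + y**2 + 4 * x * y   # f_xy = 4, so H = 2*2 - 16 = -12


bowl_result = classify(bowl, 0.0, 0.0)
saddle_result = classify(saddle, 0.0, 0.0)
print(bowl_result, "|", saddle_result)
```

The second function shows why the mixed partial derivative matters: it looks like a bowl along both axes, but along the diagonal y = −x it curves downward, so the origin is a saddle point.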

The fifth and final article from the section Optimizing Multivariable Functions (Articles) was titled Gradient Descent. I hadn’t worked on or learned anything about this concept before so reading through this article was pretty confusing. I was able to get a decent idea of what gradient descent is and how it works, but I don’t really understand why the formula works. My general understanding is that gradient descent is a formula used to approximate a local minimum. It’s used when a function has a bunch of inputs and so setting the function’s gradient to equal 0 and solving would be very complicated. The formula only uses the gradient at the current point, which is similar to how a Taylor Polynomial uses derivatives at one point to approximate a function nearby. The formula works by picking a starting point (which can be random) in the multivariable function and then taking a step, with the step size denoted by “alpha” (α), in the negative-gradient direction. (I.e. in the ‘downhill’ direction.) It does this over and over until it gets close to the ‘bottom’ of the function in a specific, local area (a.k.a. it approximates the local minimum).
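The repeated "step downhill" idea can be written as the update (x, y) → (x, y) − α∇f(x, y). Here is a minimal sketch of that loop. The function f, the starting point, the value of α and the number of steps are all made up for illustration and do not come from the article.

```python
def gradient_descent(grad, start, alpha=0.1, steps=200):
    """Repeatedly step in the negative-gradient direction with step size alpha."""
    x, y = start
    for _ in range(steps):
        gx, gy = grad(x, y)
        x, y = x - alpha * gx, y - alpha * gy
    return x, y


# Hypothetical example: f(x, y) = (x - 1)^2 + 2(y + 3)^2,
# which has its only minimum at (1, -3).
def grad_f(x, y):
    return 2 * (x - 1), 4 * (y + 3)


x_min, y_min = gradient_descent(grad_f, start=(5.0, 5.0))
print(x_min, y_min)
```

The choice of α matters: if it is too small the loop needs many steps, and if it is too large the steps can overshoot the minimum and fail to settle down.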

As I said, I still don’t fully understand how/why it all works, but here are some screen shots from the article which partly summarize the concept:

On Thursday I began a new section titled Lagrange Multipliers and Constrained Optimization. The general idea of constrained optimization is that if you have a 3D function, say f(x, y), constrained optimization will tell you what (x, y) values to input into the function to output the highest z-value when the (x, y) coordinates are limited to points that satisfy a constraint (for example, points on a specific curve). Here’s a question I worked through from the first two videos of the section that explain how it works:

This image shows the gradient of f(x, y) and indicates that a contour line of f is tangent to the edge of the red circle. (On the right side of the image in the middle.)

This image takes the constraint, x² + y² = 1 (a.k.a. the unit circle), and turns it into its own function, g(x, y). Here you can see the introduction of what’s called the Lagrange Multiplier, “lambda” (λ), which is a variable used to scale the gradient of g(x, y) so that it’s the same length as the gradient of f(x, y). (I talk more about how this works below.)

I find the math is easy-ish to do but I don’t understand why simply finding where the gradients are equal in both equations and then inputting those critical points back into the OG function tells you the max of the function. I don’t even know if that’s correct, but if it is I don’t intuitively understand why it works and definitely can’t visualize it.
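One way to see why the gradients matter: at the highest point of f on the circle, taking a small step along the circle can’t increase f, so the gradient of f has no component along the circle there. That means it points straight out from (or into) the circle, which is the same direction as the gradient of g, so ∇f = λ∇g. The sketch below checks this numerically. It assumes f(x, y) = x²y purely as an example (the post doesn’t state the function from the video), finds the best point on the unit circle by brute force, and then confirms the two gradients are parallel there.

```python
import math


# Assumed example function (hypothetical, chosen for illustration).
def f(x, y):
    return x**2 * y


def grad_f(x, y):
    return 2 * x * y, x**2


# Constraint g(x, y) = x^2 + y^2 = 1, so grad g = (2x, 2y).
def grad_g(x, y):
    return 2 * x, 2 * y


# Brute force: try many angles around the unit circle and keep the best one.
n = 100_000
angles = (2 * math.pi * k / n for k in range(n))
best_t = max(angles, key=lambda t: f(math.cos(t), math.sin(t)))
x, y = math.cos(best_t), math.sin(best_t)

fx, fy = grad_f(x, y)
gx, gy = grad_g(x, y)

# Two 2D vectors are parallel when this "cross product" is zero.
cross = fx * gy - fy * gx
# If grad f = lambda * grad g, then lambda is the ratio of matching components.
lam = fy / gy

print("best point:", (x, y))
print("cross:", cross, "lambda:", lam)
```

So solving ∇f = λ∇g together with the constraint gives a short list of candidate points, and plugging them back into the original function shows which one is the max.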

The last three videos I watched/worked through this week talked about the Lagrange Multiplier. It was the same type of question as the one just above except the OG function, R(h, s) = 100h^(2/3)s^(1/3), is a bit trickier than the OG function from the former question. This question is supposed to represent a potential real world example of when you’d use constrained optimization. There are two inputs, the amount of Labour, h, and the amount of Steel, s (each of which has a cost), and you’re trying to maximize the revenue, R(h, s). One thing I didn’t understand was how Grant would have come up with the revenue function in the first place. 🤔 In any case, here’s the question and my notes working through it:

This image shows the contour lines for the function R in the top right corner of the image which are the curved blue lines on the graph. The red line represents the constraint, which is g(h, s) = 20h + 2,000s = $20,000 (the budget). The point where a contour line of R is tangent to the red line tells you what values of h and s to input into R to get the highest revenue while spending exactly $20,000. (I understand what’s going on but I don’t know how to put this into words so that may not have made any sense. I also don’t know why this works. 😔)

This image shows the gradient of R in the top right corner and the gradient of g in the bottom left. It also indicates that the partial derivatives match up to the factor λ: R_h(h_m, s_m) = λ g_h(h_m, s_m) and R_s(h_m, s_m) = λ g_s(h_m, s_m). You can see that in the bottom right corner. The coordinates (h_m, s_m) are the values of h and s where R is maximized, which is why there’s an m in the subscript.

Here you can see that Grant substituted u in for (s/h) which makes the algebra a lot easier.

This image shows how Grant solved the system of equations. He did it differently than how I did though. In my notes below you can see that I changed each equation to equal λ and then said the equations equaled each other and solved for u.

This last image shows at the bottom that the optimal values of h and s are 2,000/3 and 10/3, respectively.
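As a sanity check on those values, here is a short sketch that redoes the calculation using R(h, s) = 100h^(2/3)s^(1/3) and the constraint 20h + 2,000s = 20,000 from above. Setting both equations equal to λ and substituting u = s/h gives (10/3)u^(1/3) = (1/60)u^(−2/3), so u = 1/200. The variable names and the brute-force grid are just illustrative choices.

```python
import math


def R(h, s):
    return 100 * h ** (2 / 3) * s ** (1 / 3)


def grad_R(h, s):
    R_h = (200 / 3) * h ** (-1 / 3) * s ** (1 / 3)
    R_s = (100 / 3) * h ** (2 / 3) * s ** (-2 / 3)
    return R_h, R_s


BUDGET = 20_000

# From R_h = 20 * lambda and R_s = 2000 * lambda with u = s / h:
#   (10/3) * u^(1/3) = (1/60) * u^(-2/3)  =>  u = 1/200
u = 1 / 200

# Substitute s = u * h into the budget line 20h + 2000s = 20,000.
h_m = BUDGET / (20 + 2000 * u)
s_m = u * h_m

# Both partial derivatives should give the same lambda at (h_m, s_m).
R_h, R_s = grad_R(h_m, s_m)
lam_from_h = R_h / 20
lam_from_s = R_s / 2000

# Brute-force check: walk along the budget line and keep the best h.
candidates = [k / 10 for k in range(1, 10_000)]
best_h = max(candidates, key=lambda h: R(h, (BUDGET - 20 * h) / 2000))

print("h_m, s_m:", h_m, s_m)
print("lambda:", lam_from_h, lam_from_s)
print("best h on the grid:", best_h)
```

The brute-force search doesn’t use the gradients at all, so landing next to h = 2,000/3 is independent evidence that the tangent point really is the max along the budget line.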

I definitely think I can get through the last two sections of Applications of Multivariable Derivatives (400/500 M.P.) this week. I only have three videos and three articles left between the two of them so I may even be able to have a go at the unit test by the end of the week. I’m hoping that, considering how short this unit is, it won’t take me too many attempts to get through it. I just looked and there are only 10 questions on the test! There are a few concepts from this unit where I still don’t understand why the formulas work the way they do, but for the most part I pretty much understand what the concepts are talking about and how to solve them using the formulas. I guess I’ll find out if I can make it through to the test by the end of this week! 🤞🏼

Solve For Why

by Will Malmo
