I had a pretty disappointing week on KA. I got through the final three videos and three articles of Applications of Multivariable Derivatives but didn’t start the unit test. I got a much better idea of what constrained optimization is, how it works, and a decent grasp on why it works. A new concept was introduced to me called the Lagrangian which, as far as I understand it, is a formula that packages an entire constrained optimization problem into a single equation. I spent the entire week trying to understand it but couldn’t figure it out. I ended up watching two other videos on Lagrange multipliers hoping that they’d help me understand this new formula, but they didn’t. 😒 All in all, I’m not thrilled with what I was able to learn/get through this week, but I put in a solid effort so at least I’ve got that going for me. 🤷🏻‍♂️
Below is the first of the two supplementary videos I watched this week. It’s from a channel I’ve seen before, run by a guy named Professor Leonard. Since I learned about lambda and Lagrange multipliers last week, this video didn’t teach me anything new, but it helped broaden my understanding of Lagrange multipliers and gave me a bit more clarity on how/why they work. Here’s the vid:
The other video I watched, which is just below, gave me a way better idea of how and why constrained optimization works and why Lagrange multipliers play a role. Here’s the video:
Here are some screen shots from that video and a breakdown of what they’re saying:
Here you can see a 3D function in purple and a constraint below it in 2D shown in green. The red border of the 3D function is the most important part of the question being asked in this video. It’s hard to tell from this image, but the red line has the same (x, y) coordinates as the border of the green circle. I like to think of the green function as being projected onto the 3D function. The primary purpose here is to figure out the coordinates of the local maximums and minimums of the 3D function along the red line. A side benefit is that by doing so you’ll also figure out the z-value of the 3D function at those points. (I’m pretty sure that’s the purpose of what’s happening anyways. 😬)
This screen shot just shows that g is not the same thing as f. The constraint is what’s known as a ‘level curve’, i.e. the set of points where the function g equals a single, constant value.
This screen shot shows what the function g would actually look like if you were to model it in 3D. This image shows g being sliced at the value of the constraint (g = 4) that is being used to solve the constrained optimization of f.
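If I try to put those last three screen shots into symbols, I think the problem boils down to something like this (the video doesn’t give the actual formulas, so the circle constraint here is just my made-up stand-in for the green curve):

```latex
% The setup the last few screen shots seem to describe. The video doesn't
% give explicit formulas, so the circle constraint below is an assumed
% example, not the actual function from the video.
\[
\text{maximize/minimize } f(x, y)
\quad \text{subject to} \quad
g(x, y) = 4,
\qquad \text{e.g. } g(x, y) = x^{2} + y^{2}
\]
% In this example the level curve g = 4 is a circle of radius 2 (the green
% circle), and the red line is f restricted to the points on that circle.
```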
This image shows that f has multiple z-values along the red curve but the only places where the gradient points directly up would be at the local maximums and minimums. (Goes without saying that the image above shows only a maximum.)
It’s a bit hard to understand, but this image brings it all together and is indicating that, since the gradient of f at the local maximum from above is pointing straight up, it’s exactly parallel to the gradient of g at that exact point. This is THE reason why you can set the gradients of f and g equal to each other (up to the scaling factor λ) and figure out where the local mins and maxes are, because they will ONLY be parallel at those points. At any other point on the border of f (i.e. the red line), the gradient would be ‘tilted’, so to speak, and would not be parallel to the gradient of g.
(I’m pretty sure this is right and it’s THE reason why constrained optimization works.)
This image just says that for the gradients to actually be equal, you need to scale the gradient of g by λ so that the two vectors have the same magnitude.
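Written out, I’m fairly sure the condition from the last two images is the standard Lagrange multiplier condition:

```latex
% The tangency condition: at a constrained max or min, the gradient of f
% is parallel to the gradient of g, and lambda is the scaling factor that
% makes the two vectors match.
\[
\nabla f(x, y) = \lambda \, \nabla g(x, y),
\qquad
g(x, y) = 4
\]
% Component by component that's two equations, plus the constraint itself,
% giving three equations for the three unknowns x, y and lambda.
```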
This last image shows the simplified math behind the entire process. To be honest, I’m not 100% sure if I understand everything above but I think I have the gist of it correct. It’s still very confusing to me though and so I could definitely be wrong about everything I just talked about.
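To convince myself the process actually works, here’s the ‘simplified math’ run on a tiny made-up example (the function f here is something I picked for illustration, not the one from the video), keeping the same circle constraint from before:

```latex
% A made-up example: maximize f(x, y) = x + y subject to
% g(x, y) = x^2 + y^2 = 4.
\[
\nabla f = (1,\ 1), \qquad \nabla g = (2x,\ 2y)
\]
% Set grad f = lambda * grad g:
\[
1 = 2\lambda x, \quad 1 = 2\lambda y
\;\;\Longrightarrow\;\; x = y
\]
% Plug into the constraint:
\[
x^{2} + x^{2} = 4
\;\;\Longrightarrow\;\; x = y = \pm\sqrt{2}
\]
% So the constrained maximum is 2*sqrt(2) at (sqrt(2), sqrt(2)) and the
% constrained minimum is -2*sqrt(2) at (-sqrt(2), -sqrt(2)).
```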
Like I said at the beginning, my week actually started out trying to learn about what Grant called the Lagrangian. As far as I understand it, it simply takes an entire constrained optimization problem and packages it into a single formula. Here are the first few notes I took about it this week:
I took the notes above after watching the last three videos from the Lagrange Multipliers and Constrained Optimization section multiple times. Below are a few screen shots from those videos with descriptions of what’s going on and the value in understanding and using the Lagrangian:
This screen shot indicates that there’s some multivariable function R(h, s) which is supposed to model revenue given two inputs of h (hours) and s (steel). Another function, B(h, s), represents the budget, i.e. the constraint, being equal to $10k. In the middle/left of the image you can see the equation L(h, s, λ) = R(h, s) – λ(B(h, s) – b), which is the Lagrangian equation. Below it you can see where Grant stated that its gradient equals the 0 vector specifically at the maximizing values of h, s, and λ, which is denoted with a * beside them in superscript (below the Lagrangian equation). To the right where it says M* = R(h*, s*) and M*(b) = R(h*(b), s*(b)), this is stating that the maximum revenue, M*, can be thought of as a function of the budget.
As far as I understand it, this image is explaining that when you maximize the revenue, M*, the constraint term goes to 0, which you can see in the bottom right of the image. This is because you’d have to spend the entire budget in some allocation of h and s in order to maximize revenue, and therefore B(h*, s*) = b, so B(h*, s*) – b = 0.
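To try to make sense of the ‘equals the 0 vector’ part, here’s what I get if I expand ∇L = 0 component by component for Grant’s revenue/budget Lagrangian (hopefully I haven’t mangled the partial derivatives):

```latex
% Writing out grad L = 0 for L(h, s, lambda) = R(h, s) - lambda*(B(h, s) - b):
\[
\frac{\partial L}{\partial h}
  = \frac{\partial R}{\partial h} - \lambda \frac{\partial B}{\partial h} = 0,
\qquad
\frac{\partial L}{\partial s}
  = \frac{\partial R}{\partial s} - \lambda \frac{\partial B}{\partial s} = 0,
\qquad
\frac{\partial L}{\partial \lambda}
  = -\bigl(B(h, s) - b\bigr) = 0
\]
% The first two equations are the "gradients are parallel" conditions, and
% the third one is exactly B(h*, s*) = b: the whole budget gets spent, which
% is why the constraint term is 0 at the maximum.
```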
Clearly this is where things start to get tricky. I believe what Grant is doing here is proving that the derivative of the Lagrangian as a function of b is equal to λ (you can see in the bottom left corner that’s the derivative he’s solving for). The middle of the image shows that he rewrote the Lagrangian as a function of b by adding b in as an input. He then uses the multivariable chain rule to show that the terms for the other variables go to 0 when finding the derivative of the Lagrangian with respect to b, but I’m not sure exactly why. I’m pretty sure it’s because the partial derivatives of L with respect to h, s and λ are all equal to 0 at the maximizing point h*, s*, λ* (🤔), so those terms go to 0. What you’re left with is dL/db = ∂L/∂b which is important for the screen shot below:
When you find the derivative of the Lagrangian with respect to b in the equation at the bottom of the screen, the only thing you’re left with is λ. All this is to say that if you add the budget in as a variable and make the Lagrangian a function of JUST the budget, and assume that h and s are at their maximizing values so the constraint term B(h*, s*) – b goes to 0, then when finding the derivative of the Lagrangian with respect to b everything goes to 0 EXCEPT for λ, and so you’re left with dL/db = λ.
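Here’s my best attempt at reconstructing that chain rule argument from the screen shots, written out step by step:

```latex
% Plug the maximizing values h*(b), s*(b), lambda*(b) back into the
% Lagrangian so that the result only depends on b:
\[
L^{*}(b) = L\bigl(h^{*}(b),\, s^{*}(b),\, \lambda^{*}(b),\, b\bigr)
\]
% Differentiate with the multivariable chain rule:
\[
\frac{dL^{*}}{db}
  = \frac{\partial L}{\partial h}\frac{dh^{*}}{db}
  + \frac{\partial L}{\partial s}\frac{ds^{*}}{db}
  + \frac{\partial L}{\partial \lambda}\frac{d\lambda^{*}}{db}
  + \frac{\partial L}{\partial b}
\]
% The first three partial derivatives are 0 at the maximizing point (that's
% exactly what grad L = 0 says). Since L = R - lambda*(B - b) = R - lambda*B + lambda*b,
% the only direct dependence on b is the lambda*b term, so:
\[
\frac{dL^{*}}{db} = \frac{\partial L}{\partial b} = \lambda^{*}
\]
% Finally, at the maximizing point the constraint term is 0, so L* equals
% the max revenue M*(b), which gives dM*/db = lambda*.
```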
(I’ve never been more confident that I’ve been wrong about something I’ve written as I am about what I just wrote. I think I’m on the right track with all of this but am very certain my reasoning above is not accurate or even coherent. I’m trying though! 😭)
Finally, below are some screen shots from the third article in this section, which was titled Interpretation of Lagrange Multipliers. The screen shots go through why the Lagrangian itself can be thought of as a function of the budget. They don’t explain how λ ends up being the derivative dL/db, but the graphs help make the function L(b) easier to understand:
The graph here lets you move the revenue line (the blue line) and the budget line (the red line). I moved the red line to equal $15K and the blue line to be tangent to it at ~$30k i.e. R/b = $2.
Here I took another screen shot of the same graph but moved the red line to equal $25K and the blue line to be tangent to it at ~$70k i.e. R/b = $2.8. This shows that as you adjust the budget, the max revenue moves with it in a non-linear fashion and so M* = L(b). (I think…)
In this image the revenue is locked to the budget to drive home the point that the budget is the variable in this circumstance and the maximum revenue is the output.
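The way I’m interpreting all of this (and I could easily be off) is that λ ends up telling you how much extra maximum revenue each extra dollar of budget buys you:

```latex
% My reading of the takeaway: the multiplier is the slope of max revenue
% with respect to the budget.
\[
\lambda^{*} = \frac{dM^{*}}{db}
\approx
\frac{\text{extra max revenue}}{\text{extra dollar of budget}}
\]
% e.g. if lambda* = 2, then loosening the budget by $1 should buy roughly
% $2 of additional maximum revenue.
```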
Although I clearly don’t completely understand the Lagrangian, I’m going to move on this coming week and start the unit test for Applications of Multivariable Derivatives (400/500 M.P.). I understand the concept of Lagrange multipliers well enough that, even though I don’t understand the Lagrangian equation, I’m happy enough with my grasp on what’s going on, in general, to feel good about moving forward. As I’ve said many times before, my guess is that it’ll become more clear to me in the units ahead. I’m REALLY hoping I can get through this unit before the end of the week. My goal was to finish this unit before the end of September, meaning I have until Sunday to get through it. The test is only 10 questions long so even if I mess up a few times, it’s not going to take me long to review the questions I mess up and then redo it. Considering how tough I found this week though, I’m really hoping I can crush the unit test so I can get some confidence back. I’m feeling a bit demoralized so hopefully I can turn it around! 😤