COMP 3007 Final Take-home - Page 2 of 7 Due: TBD - See wall

Part 1: The method of Gradient Descent

2.1 Goal

The goal of this project is to find the zeroes of a certain class of function using (indirectly)

an algorithm known as the method of gradient descent.

2.2 Background: Linear Alge

a

This project will require very basic linear alge

a - no more than matrix multiplication.

Consider the following matrix equation:

f(x, y) =

(

3 1

2 −1

)(

x

y

)

−

(

−1

1

)

=

(

3x+ y + 1

2x− y − 1

)

If we define A :=

(

3 1

2 −1

)

, ~x :=

(

x

y

)

, and ~b :=

(

−1

1

)

then we can write this expression

as A~x−~b. Now observe that this expression is functionally equivalent to the following:

f(x, y) = 〈3x+ y + 1, 2x− y − 1〉

Thus there are two ways to write the function f(x, y) - as a either a matrix equation, or as

a typical multivariate function. Symbolab is a great tool if you are unfamiliar with matrix

multiplication. Click here to see the matrix multiplication above done in symbolab. Also,

while we can’t graph f(x, y) fully, we have some tools to analyze this function: link.

The motivation behind this project comes from an important topic in linear alge

a -

solving systems of equations. For example, we might be interested in know for which x, y the

following is true:

3x+ y = −1

2x− y = 1

This is equivalent to asking when f(x, y) = 〈0, 0〉 or equivalently A~x−~b = ~0.

2.3 Background: The Gradient

Recall that for a multivariate function h(x, y) from Rn → R, we can construct the gradient

of h, denoted ∇h. When n = 2, the gradient of h : R2 → R is:

∇h = ∇h(x, y) =

〈

∂h

∂x

,

∂h

∂y

〉

In particular, we saw this indirectly when taking taking a directional derivative of h(x, y).

Notice that the output of h(x, y) is a single number, whereas ∇h is a vector with two

components. There is another important fact about gradients which is illustrated in this

link: https:

www.desmos.com/calculato

sduapnehxo. This illustration shows the level

sets (i.e. preimages!!) of the function h(x, y) = x2 + xy + y2. It also contains a movable

point. I have also graphed the gradient of h(x, y) (which is a vector!) and translated it to

this movable point. You can change which function we take the level sets of in the link, but

you have to change both functions it in the top 2 lines. here is another example.

https:

www.symbolab.com/solve

step-by-step/%5Cbegin%7Bpmatrix%7D3%261%5C%5C%20%202%26-1%5Cend%7Bpmatrix%7D%5Cbegin%7Bpmatrix%7Dx%5C%5C%20%20y%5Cend%7Bpmatrix%7D-%5Cbegin%7Bpmatrix%7D-1%5C%5C%201%5Cend%7Bpmatrix%7D

https:

www.desmos.com/calculato

wovmgdw3om

https:

www.desmos.com/calculato

sduapnehxo

https:

www.desmos.com/calculato

azt4uaopy5

Rectangle

Rectangle

COMP 3007 Final Take-home - Page 3 of 7 Due: TBD - See wall

2.4 Gradient descent project example

2.4.1 Properties of the gradient

(i) Experiment with the the desmos link given in section 2.3 above, trying out a variety of

functions. For each function you choose, move the gradient along one or more of the

level sets. What relationships do you observe between the gradient and the level sets of

the function?

(ii) Now, using any example you chose or the one provided, drag the gradient to a point on

some level set, and zoom in to that point until the level set looks straight. What would

happen on the surface above this point if we move slightly in the direction perpendicula

to the gradient?

(iii) Using the explanation you gave in the previous step, explain why the gradient points in

the direction in which f(x, y) increases the fastest.

2.4.2 Set up

(iv) Now recall the function f(x, y) given in section 2.2 above,

f(x, y) = 〈3x+ y + 1, 2x− y − 1〉 = A~x−~

Why are we unable to take the gradient of f(x, y)?

(v) Using the fact that the length of a vector is given by ||〈a, b〉|| =

√

a2 + b2, compute

||f(x, y)||2 =: L(x, y). Note that your answer should be a function L : R2 → R.

(vi) Now compute ∇||f(x, y)||2 = ∇L(x, y). Note that your answer should be a function

∇L : R2 → R2.

(vii) Now compute AT (A~x−~b). Here AT is the transpose of A, so AT =

(

3 2

1 −1

)

. Again,

if you are unfamiliar with matrix multiplication, the link above to symbolab is a great

tool to ca

y out this calculation. Your answer should be a 2× 1 matrix (vector).

(viii) Compare the results from the previous two steps. Use this to guess the general matrix

formula for the gradient of the function ||A~x −~b||2. That is, complete the following

equation:

∇||A~x−~b||2 =???

2.4.3 Investigating the method of Gradient Descent

(ix) Now consider the general formula for the method of gradient descent for a function

g(x, y), or more formally g : R2 → R:

~an+1 = ~an − η∇g(~an)

Rectangle

Rectangle

COMP 3007 Final Take-home - Page 4 of 7 Due: TBD - See wall

Here η is a step size, and we must start with an initial location of ~a0 =

(

x0

y0

)

. The

method of gradient descent is an algorithm that generates a sequence of points that,

under the appropriate conditions, will converge to the location of a local minimum of

the function g. Give a short explanation for why this algorithm will find the location of

the minimum. You might mention what problems one might encounter. For example,

what can go wrong with η? Give as much detail as possible. It would be a great idea

to include a picture to show what this process looks like.

(x) Provided is Python code to implement the method of gradient descent for the function

L(x, y), using the initial point

(

.5

−.5

)

and a step size of η = .01. Summarize the results

of the output. The code can be run here if you are not familiar with Python.

(xi) Turning our attention back to the functions f(x, y) and L(x, y) = ||f(x, y)||2, explain

why we can’t use the method of gradient descent on f(x, y), but why we can use it on

L(x, y).

(xii) Our goal with this project is to find the (x, y) pair(s) that make the function f(x, y) =

〈0, 0〉. Explain why the results of applying the method of gradient descent on L(x, y)

gives us just that.

(xiii) Use the results of the Python code to estimate a point (x, y) that makes f(x, y) = 0.

Verify that your guess makes f(x, y) = 0.

(xiv) (bonus) What would have changed if our function did not have a solution? What would

have changed in the steps above and what would have remained the same?

https:

epl.it/join/sbpegyhx-

endtgerics

Rectangle

Rectangle

COMP 3007 Final Take-home - Page 5 of 7 Due: TBD - See wall

# Please note t ha t t h i s code i s minimal and i s on ly meant f o r demonstrat ion .

#I t i s c o r r e c t mathemat ica l ly , but not what I would use in a r e a l s i t u a t i o n .

# I t con ta ins no e

or checking , and the f unc t i on s cou ld ( shou ld )

#be wr i t t en in a g en e r a l i z e d f a sh i on . I encourage you to r ewr i t e t h i s

#more f o rma l l y i f you are f am i l i a r wi th programming .

import math

import numpy as np

def vecLength ( v ) :

tempSum = 0.0

for x in v :

tempSum += x [ 0 ]∗∗2

eturn math . s q r t (tempSum)

def f (A, b , x , y ) :

temp = np . matrix ( [ [ x ] , [ y ] ] )

eturn np . matrix (np . dot (A, temp)−b)

def grad (A, b , x , y ) :

vec = np . matrix ( [ [ x ] , [ y ] ] )

eturn np . matrix (2∗np . dot (np . t ranspose (A) , np . dot (A, vec)−b ) )

def main ( ) :

eta = .01

A = np . matrix ( [ [ 3 , 1 ] , [ 2 , − 1 ] ] )

= np . matrix ( [ [ − 1 ] , [ 1 ] ] )

x = −1

y = 1

step = 0

while vecLength ( grad (A, b , x , y ) ) > . 0 1 :

s tep += 1

x = (np . matrix ( [ [ x ] , [ y ] ] ) − eta ∗grad (A, b , x , y ) ) . item (0)

y = (np . matrix ( [ [ x ] , [ y ] ] ) − eta ∗grad (A, b , x , y ) ) . item (1)

print ( ” (x , y ) = ( ” ,x , ” , ” ,y , ” ) ” )

print ( ”The magnitude squared ( l ength ) o f the output o f f )

p r i n t ( , i . e . , | | f (x , y ) | |ˆ2=L(x , y ) at s tep ” , step , ” i s ” , vecLength ( f (A, b , x , y ) )∗∗2)

print ( ”The magnitude ( l ength ) o f the g rad i ent at s tep ” , step , ” i s ” , vecLength ( grad (A, b , x , y ) ) , ’ \n ’ )

main ( )

Rectangle

Answered 2 days AfterMar 11, 2022

Ans 1:

The set of points in the domain of a function where the function is constant is another way to

visualize a 3D function in a 2D domain. As the constant in level sets changes the level curve

of the function providing the gradient vector to change. The change in gradient vector length

was observed with moving the gradient along different level sets.

The minimum value that ?�⃗� ?(? ) can achieve is −∥ ??(? ) ∥, and this occurs when unit

vector ? points in the opposite direction from ∇?(? ), in the direction of −??(? ).

Thus, −??(? ) points in the direction of greatest decrease.

Ans 2:

It was observed that the gradient vector is perpendicular to all the level curves. The level curve

is simply a cross section of the graphs of function ?(?, ?). Moving perpendicular to the gradient

vector will act as normal to the point of gradient. Gradient descent works in spaces of any

number of dimensions, even in infinite-dimensional ones. In the latter case, the search space is

typically a function space, and one calculates the Fréchet derivative of the functional to be

minimized to determine the descent direction.

Figure 1: Variation of gradient with moving the point of value.

https:

en.wikipedia.org/wiki/Function_space

https:

en.wikipedia.org/wiki/Fr%C3%A9chet_derivative

Ans3:

From ?�⃗� ?(? ) =∥ ??(? ) ∥. ?⃗⃗ ⃗ , we can see that ? is the direction vector of the gradient. Now

as the value of ??(? ) increases which is a scalar quantity with the increase in value of the level

set. Thus, the gradient points in the increasing direction in which ?(? ) increases the fastest.

Ans4:

The function ?(?) =< 3? + ? + 1,2? − ? − 1 > = ?�̅� − �̅� represent a relation a point vector

in a space with one-to-one relation between ℝ2 → ℝ2. The gradient of a point space vector is

not defined therefore we are unable to take the gradient of ?(?, ?).

Ans 5:

?(?, ?) = < 3? + ? + 1,2? − ? − 1...

The set of points in the domain of a function where the function is constant is another way to

visualize a 3D function in a 2D domain. As the constant in level sets changes the level curve

of the function providing the gradient vector to change. The change in gradient vector length

was observed with moving the gradient along different level sets.

The minimum value that ?�⃗� ?(? ) can achieve is −∥ ??(? ) ∥, and this occurs when unit

vector ? points in the opposite direction from ∇?(? ), in the direction of −??(? ).

Thus, −??(? ) points in the direction of greatest decrease.

Ans 2:

It was observed that the gradient vector is perpendicular to all the level curves. The level curve

is simply a cross section of the graphs of function ?(?, ?). Moving perpendicular to the gradient

vector will act as normal to the point of gradient. Gradient descent works in spaces of any

number of dimensions, even in infinite-dimensional ones. In the latter case, the search space is

typically a function space, and one calculates the Fréchet derivative of the functional to be

minimized to determine the descent direction.

Figure 1: Variation of gradient with moving the point of value.

https:

en.wikipedia.org/wiki/Function_space

https:

en.wikipedia.org/wiki/Fr%C3%A9chet_derivative

Ans3:

From ?�⃗� ?(? ) =∥ ??(? ) ∥. ?⃗⃗ ⃗ , we can see that ? is the direction vector of the gradient. Now

as the value of ??(? ) increases which is a scalar quantity with the increase in value of the level

set. Thus, the gradient points in the increasing direction in which ?(? ) increases the fastest.

Ans4:

The function ?(?) =< 3? + ? + 1,2? − ? − 1 > = ?�̅� − �̅� represent a relation a point vector

in a space with one-to-one relation between ℝ2 → ℝ2. The gradient of a point space vector is

not defined therefore we are unable to take the gradient of ?(?, ?).

Ans 5:

?(?, ?) = < 3? + ? + 1,2? − ? − 1...

SOLUTION.PDF## Answer To This Question Is Available To Download

- MATH XXXXXXXXXX/22) Computational Methods and Numerical Techniques Contribution: 25% of course Course Leader: Dr Kayvan Nejabati Zenouz Coursework 2 Deadline Date: Thursday 17/03/2022...SolvedMar 16, 2022
- Assessment 1: Review literatureHide Folder InformationInstructionsOverviewLength: 1500 wordsWeight: 30%Due date: End of Week 2 (11.59pm Sunday)This assessment requires you to build strategies for...SolvedMar 15, 2022
- MTH 154 Practice with Google Sheets/MS Excel Finance Complete the following tasks below. Some recommendations for clarity/organization: 1. Label your spreadsheet clearly. It’s very difficult...SolvedMar 05, 2022
- 21. Solve the following differential equation:y′′(x) + 6y′(x) + 25y(x) = 0.32. Consider a linear differential equationay′′ + by′ + cy = f(x)with constant coefficients. Let y1(x), y2(x) be...SolvedFeb 19, 2022
- Math 1104C, 1104J: Test # 1, Due Feb. 9 11:59, 2022Directives:• This document has 2 pages (including this page).• This assignment is due on February 9th at 11:59 on Brightspace.• The...SolvedFeb 09, 2022
- Grading SheetMajor Assignment 2 Grading SheetCompetencyRequirements for full credit(optional for student use) Did you meet the requirements?Points possibleYour pointsScoring...SolvedFeb 05, 2022
- MATH2004C Assignment 1Last Name: First Name:Student ID:• You may either write your answers on a copy of this assignment, or on your ownpaper or on your electronic devices (you do not need to...SolvedFeb 05, 2022
- WH 3SolvedFeb 04, 2022
- For Major Assignment 3 in Topic 6, you will develop a project to provide humanitarian assistance to people in your local community.For this DQ, you will look ahead at the project instructions and take...SolvedJan 30, 2022
- WH2SolvedJan 28, 2022

Copy and Paste Your Assignment Here

Disclaimer: The reference papers provided by TAE serve as model papers for students and are not to be submitted as it is. These papers are intended to be used for research and reference purposes only.

Copyright © 2022. All rights reserved.