Gradient descent versus finding where the gradient vanishes via solving systems of equations

I started learning machine learning and got stuck at the following questions:



  1. Why do we need to iterate the gradient descent algorithm?


  2. Why don't we equate the gradient to zero and find all local minima?


Most likely, we can't reach the minimum exactly; we can only come as close as possible, and the learning rate controls how close. Am I right, or am I missing something?



Sorry if this is a duplicate question. Thanks in advance.
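
---

To make the two approaches concrete, here is a minimal sketch (my own illustration with made-up data, not part of the original question) using ordinary least squares in NumPy. Because the squared-error cost is quadratic, equating its gradient to zero yields a linear system with an exact closed-form solution; gradient descent, by contrast, only approaches that solution iteratively, with the learning rate controlling the step size.

```python
import numpy as np

# Toy least-squares problem: minimize f(w) = ||X w - y||^2 / (2 n).
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
w_true = np.array([2.0, -1.0, 0.5])
y = X @ w_true + 0.1 * rng.normal(size=100)
n = len(y)

# Approach 1: equate the gradient to zero and solve.
# grad f(w) = X^T (X w - y) / n = 0  =>  (X^T X) w = X^T y.
# This is a *linear* system only because the cost is quadratic.
w_exact = np.linalg.solve(X.T @ X, X.T @ y)

# Approach 2: gradient descent. Each step moves only partway toward the
# minimum, so we iterate; the learning rate sets the step size.
w = np.zeros(3)
learning_rate = 0.1
for _ in range(1000):
    grad = X.T @ (X @ w - y) / n
    w -= learning_rate * grad

print("closed form:      ", w_exact)
print("gradient descent: ", w)
print("gap:", np.linalg.norm(w - w_exact))  # shrinks as iterations increase
```

Note that with a suitable learning rate the iterates converge to the exact minimizer in the limit, so the learning rate governs the speed of convergence rather than imposing a hard floor on accuracy. What makes iteration indispensable is that for non-quadratic costs the first approach has no linear system to solve, while the loop above applies unchanged.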







migrated from cstheory.stackexchange.com Jul 30 at 12:18


This question came from our site for theoretical computer scientists and researchers in related fields.














  • Generally speaking, finding where the gradient equals zero is only easy for quadratic cost functions. Solving systems of polynomial equations is not easy. (See the sketch after these comments.)
    – Rodrigo de Azevedo
    Jul 26 at 18:08










  • @RodrigodeAzevedo, thanks for the reply! But why can't we use the Laplace transform in that case? It seems it could take much less computing time.
    – Anton
    Jul 30 at 9:14










  • Laplace transform? Where are the differential equations?
    – Rodrigo de Azevedo
    Jul 30 at 12:36










  • @RodrigodeAzevedo, sorry, I may have misunderstood you. I thought that when we take the derivatives of the MSE function, we get a system of differential equations, and if it is difficult to solve, we might use the Laplace transform.
    – Anton
    Jul 30 at 12:46











  • Take a look at this.
    – Rodrigo de Azevedo
    Jul 30 at 12:52
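
---

As a concrete illustration of the first comment above (my own hedged sketch, with made-up data and step size, not the commenter's code): replace the quadratic least-squares cost with a logistic loss and the stationarity conditions become transcendental equations in the weights, so there is no analogue of the normal equations to solve, yet the gradient-descent loop is unchanged.

```python
import numpy as np

# Logistic loss: f(w) = mean(log(1 + exp(-y_i * (x_i . w)))), y_i in {-1, +1}.
# Equating grad f(w) to zero gives transcendental equations (sigmoids of
# linear forms in w), with no closed-form solution like the normal equations.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 2))
y = np.sign(X @ np.array([1.5, -2.0]) + 0.3 * rng.normal(size=200))

def grad(w):
    # grad f(w) = mean(-y_i * x_i * sigmoid(-y_i * (x_i . w)))
    margins = y * (X @ w)
    s = 1.0 / (1.0 + np.exp(margins))   # sigmoid(-margin)
    return -(X * (y * s)[:, None]).mean(axis=0)

# Gradient descent proceeds exactly as in the quadratic case:
w = np.zeros(2)
for _ in range(2000):
    w -= 0.5 * grad(w)

print("fitted weights:", w)
print("gradient norm:", np.linalg.norm(grad(w)))  # driven toward zero
```

For deeper models the stationarity conditions grow into large coupled nonlinear (often polynomial or transcendental) systems, which is why iterative first-order methods are the default in machine learning.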














asked Jul 26 at 8:52 by Anton
edited Jul 30 at 12:34 by Rodrigo de Azevedo



