Prove Neg. Log Likelihood for Gaussian distribution is convex in mean and variance.












I am looking to compute maximum likelihood estimators for $\mu$ and $\sigma^2$, given $n$ i.i.d. random variables drawn from a Gaussian distribution. I believe I know how to write the expression for the negative log-likelihood (kindly see below); however, before I take derivatives with respect to $\mu$ and $\sigma^2$, I want to prove that the negative log-likelihood is a convex function of $\mu$ and $\sigma^2$.



This is where I'm stuck: I'm unable to prove that the Hessian is positive semidefinite.



The negative log-likelihood function is
$$ l(\mu, \sigma^2) = \frac{n}{2}\ln(2\pi) + \frac{n}{2}\ln(\sigma^2) + \sum_{i=1}^n \frac{(x_i - \mu)^2}{2\sigma^2}. $$
Let $\alpha = \frac{1}{\sigma^2}$ (the book Convex Optimization by Boyd & Vandenberghe notes in Section 7.1 that this transformation should make the negative log-likelihood convex in $\alpha$). We now get
$$ l(\mu, \alpha) = \frac{n}{2}\ln(2\pi) - \frac{n}{2}\ln(\alpha) + \sum_{i=1}^n \frac{(x_i - \mu)^2\,\alpha}{2} $$
$$ = \frac{n}{2}\ln(2\pi) + \frac{1}{2}\sum_{i=1}^n\left(-\ln(\alpha) + (x_i - \mu)^2\,\alpha\right). $$
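(As a sanity check, the grouped expression above agrees numerically with summing the per-observation Gaussian log-densities directly; here is a small NumPy sketch, with the sample and the evaluation point $(\mu, \sigma^2)$ chosen arbitrarily:)

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(loc=2.0, scale=1.5, size=50)  # arbitrary synthetic sample
n = len(x)
mu, sigma2 = 1.0, 2.0  # arbitrary evaluation point, not the MLE

# Grouped form of the negative log-likelihood, as in the post
nll_grouped = n/2*np.log(2*np.pi) + n/2*np.log(sigma2) + np.sum((x - mu)**2)/(2*sigma2)

# Direct form: minus the sum of per-observation Gaussian log-densities
nll_direct = -np.sum(-0.5*np.log(2*np.pi*sigma2) - (x - mu)**2/(2*sigma2))

print(nll_grouped, nll_direct)  # the two values agree
```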



Define
$$ g_i(\mu, \alpha) = -\ln(\alpha) + (x_i - \mu)^2\,\alpha. $$



Now my approach is to show that $g_i(\mu, \alpha)$ is jointly convex in $(\mu, \alpha)$, and then conclude that $l(\mu, \alpha)$, being a sum of convex $g_i$'s, is also convex in $(\mu, \alpha)$. The Hessian of $g_i$ is:



$$ \nabla^2 g_i =
\begin{pmatrix}
2\alpha & -2(x_i - \mu)\\
-2(x_i - \mu) & \frac{1}{\alpha^2}
\end{pmatrix}
$$



And the determinant of the Hessian is
$$ \lvert \nabla^2 g_i \rvert = \frac{2}{\alpha} - 4(x_i - \mu)^2. $$
This is where I'm stuck: I cannot show that this determinant is non-negative for all values of $\mu$ and $\alpha\ (>0)$. Kindly help me figure out my conceptual or other errors.



Kindly note I've consulted the following similar questions:
How to prove the global maximum log likelihood function of a normal distribution is concave



and Proving MLE for normal distribution



However, both of them only show that the Hessian is positive semidefinite at the point where $\mu$ and $\alpha$ equal their estimated values. The mistake I see is that those estimates were arrived at in the first place by assuming the negative log-likelihood is convex (i.e. by equating the gradient to 0, which is the optimality criterion for a convex function).



Thanks




























  • What is Prof. Stephen Boyd's book on Convex Optimization? Are there perhaps other authors?
    – LinAlg
    Jul 16 at 11:59










  • Is the statement about convexity in $\alpha$, or about convexity in both $\alpha$ and $x$?
    – LinAlg
    Jul 16 at 12:04






  • 1




    @LinAlg I was looking for convexity in $z = [\mu, \alpha]$; however, looking at the expression for my Hessian, I cannot show its determinant to be always non-negative, and in fact the Hessian is not positive semidefinite everywhere. Ahmad Bazzi below has confirmed this in his answer, and I now realize that to find the MLE estimates, it is not required for $l$ to be convex in both $\mu$ and $\alpha$. Separately, I have now listed both authors of the Convex Optimization book I alluded to in my original query. My bad.
    – abhimanyutalwar
    Jul 16 at 14:07















edited Jul 16 at 14:10 · asked Jul 16 at 3:51 by abhimanyutalwar











1 Answer


















So you get $$l(\mu,\alpha) = \frac{n}{2}\ln 2\pi - \frac{n}{2}\ln\alpha + \sum \frac{(x_i - \mu)^2\,\alpha}{2}$$
Convex in $\mu$



The second derivative w.r.t. $\mu$ is $$\frac{\partial^2}{\partial \mu^2} l = n\alpha > 0$$
So we get convexity in $\mu$.



Convex in $\alpha$



The second derivative w.r.t. $\alpha$ is $$\frac{\partial^2}{\partial \alpha^2} l = \frac{n}{2\alpha^2} > 0$$
So we get convexity in $\alpha$.
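Both marginal second derivatives, $n\alpha$ and $n/(2\alpha^2)$, can be checked with central finite differences (a sketch with an arbitrary synthetic sample and an arbitrary evaluation point):

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=20)  # arbitrary synthetic sample
n = len(x)

def nll(mu, alpha):
    # l(mu, alpha) = n/2 ln(2 pi) - n/2 ln(alpha) + alpha/2 * sum (x_i - mu)^2
    return n/2*np.log(2*np.pi) - n/2*np.log(alpha) + alpha/2*np.sum((x - mu)**2)

h = 1e-4
mu0, a0 = 0.3, 0.7  # arbitrary evaluation point with alpha > 0
# Central second differences approximate the two marginal second derivatives
d2_mu = (nll(mu0 + h, a0) - 2*nll(mu0, a0) + nll(mu0 - h, a0)) / h**2
d2_alpha = (nll(mu0, a0 + h) - 2*nll(mu0, a0) + nll(mu0, a0 - h)) / h**2

print(d2_mu, n*a0)            # matches n * alpha, positive
print(d2_alpha, n/(2*a0**2))  # matches n / (2 alpha^2), positive
```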



What I think you meant is that you want to prove that $l(\pmb{z})$ is convex in $\pmb{z} = [\mu, \alpha]$ (jointly). Well, it is not convex in $\pmb{z}$, because the Hessian you wrote has a negative determinant for some values of $x_i, \mu, \alpha$: choose a small $\frac{2}{\alpha}$ and a large $4(x_i - \mu)^2$, and this leaves us with a negative determinant. Boyd does not tell you that $l(\mu,\alpha)$ is jointly convex in $(\mu, \alpha)$. The statement "convex in mean and variance" means that it is convex in the mean and convex in the variance separately.
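Concretely, with $\alpha = 1$ and $x_i - \mu = 1$ the determinant is $\frac{2}{1} - 4 = -2 < 0$. A small NumPy check of the $2\times 2$ Hessian determinant at two (arbitrary) points:

```python
import numpy as np

def hessian_det(alpha, d):
    """Determinant of the Hessian [[2*alpha, -2*d], [-2*d, 1/alpha**2]],
    where d = x_i - mu. Closed form: 2/alpha - 4*d**2."""
    H = np.array([[2*alpha, -2*d], [-2*d, 1/alpha**2]])
    return np.linalg.det(H)

print(hessian_det(1.0, 0.1))  # positive at this point
print(hessian_det(1.0, 1.0))  # negative, so the Hessian is not PSD everywhere
```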



The link you shared is something completely different: there, they want to show that the optimal value is concave (at least that is what they state).





























  • Thanks Ahmad, this is quite helpful; I will mark this as the answer. Yes, I was looking to prove convexity in $z$. I realize that to find the MLE estimates, $l$ need not be convex in $z$, and that I can first minimize over $\mu$ and then over $\alpha$. Thanks for your help. (For anyone else interested, refer to "Optimizing over some variables", Section 4.1, in the book "Convex Optimization" by Boyd & Vandenberghe.)
    – abhimanyutalwar
    Jul 16 at 14:00











  • I'm glad you found it helpful @abhimanyutalwar .. If you found the answer helpful, you could upvote it as well ;)
    – Ahmad Bazzi
    Jul 16 at 14:17











  • @abhimanyutalwar first minimizing over mu and then over alpha will only be beneficial if the latter optimization problem is convex, which it probably is not; that means that your strategy is not simple to execute
    – LinAlg
    Jul 16 at 14:24






  • 1




    @LinAlg Agree that this doesn't work as a general strategy. However, in this case my original negative log-likelihood is convex in $\mu$, and its infimum over $\mu$ is convex in $\alpha$, so it worked. Ahmad, I would very much like to upvote!! However, as I'm a new user I do not have enough rep to do that :(
    – abhimanyutalwar
    Jul 19 at 3:22
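The two-step strategy discussed in these comments, minimize over $\mu$ first and then over $\alpha$, can be sketched as follows (arbitrary synthetic data; each step uses the usual closed-form minimizer, which here recovers the standard biased variance MLE):

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.normal(loc=1.0, scale=2.0, size=200)  # arbitrary synthetic sample
n = len(x)

# Step 1: for any fixed alpha > 0, l(mu, alpha) is a convex quadratic in mu,
# minimized at the sample mean.
mu_hat = x.mean()

# Step 2: the partially minimized function
#   h(alpha) = n/2 ln(2 pi) - n/2 ln(alpha) + alpha/2 * sum (x_i - mu_hat)^2
# is convex in alpha; setting h'(alpha) = 0 gives
#   alpha_hat = n / sum (x_i - mu_hat)^2.
alpha_hat = n / np.sum((x - mu_hat)**2)
sigma2_hat = 1.0 / alpha_hat  # the (biased) MLE of the variance

print(mu_hat, sigma2_hat)
```

This matches the usual closed-form MLE: `sigma2_hat` equals the mean squared deviation about the sample mean (the $1/n$ variance).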











edited Jul 16 at 13:17 · answered Jul 16 at 13:12 by Ahmad Bazzi










