Derivation of derivative of multivariate Gaussian w.r.t. covariance matrix
I'm reading a paper on probabilistic CCA in which the authors state derivatives without showing the derivations. I would like step-by-step derivations to convince myself. Consider a $d$-dimensional multivariate Gaussian random variable:
$$
\mathbf{x} \sim \mathcal{N}(\boldsymbol{\mu}, \Sigma)
$$
In probabilistic CCA, we define $\Sigma = W W^\top + \Psi$, where $W \in \mathbb{R}^{d \times q}$ and $\Psi \in \mathbb{R}^{d \times d}$. I'd like to compute the derivatives of the negative log-likelihood w.r.t. $\boldsymbol{\mu}$, $W$, and $\Psi$.
The stationary point for $\boldsymbol{\mu}$ is just the empirical mean $\hat{\boldsymbol{\mu}}$ (derived below*). Plugging this minimizer into the negative log-likelihood, we get:
$$
\frac{\partial \mathcal{L}}{\partial W}
=
\frac{\partial}{\partial W} \Big[
\overbrace{\frac{1}{2} \sum_{i=1}^{n} (\mathbf{x}_i - \hat{\boldsymbol{\mu}})^\top \Sigma^{-1} (\mathbf{x}_i - \hat{\boldsymbol{\mu}})}^{A}
+
\overbrace{\frac{n}{2} \ln |\Sigma|}^{B}
+
\overbrace{\text{const}}^{C}
\Big]
$$
Clearly, $\partial C / \partial W = 0$. But I'm not sure how to handle $A$ and $B$, particularly since $\Sigma = W W^\top + \Psi$.
*Derivative w.r.t. $\boldsymbol{\mu}$
The negative log-likelihood is:
$$
\mathcal{L}
=
\frac{1}{2} \sum_{i=1}^{n} (\mathbf{x}_i - \boldsymbol{\mu})^\top \Sigma^{-1} (\mathbf{x}_i - \boldsymbol{\mu}) + \frac{n}{2} \ln |\Sigma| + \text{const}
$$
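(As a cross-check on this expression, here is a short NumPy/SciPy sketch of my own, which also makes the constant explicit as $\frac{nd}{2}\ln 2\pi$; the variable names are assumptions, not from the paper:)

```python
import numpy as np
from scipy.stats import multivariate_normal

rng = np.random.default_rng(3)
d, n = 3, 20
X = rng.normal(size=(n, d))                  # rows are the samples x_i
mu = X.mean(axis=0)
Sigma = np.cov(X, rowvar=False) + np.eye(d)  # any symmetric positive-definite Sigma

diff = X - mu
quad = 0.5 * np.einsum('ij,jk,ik->', diff, np.linalg.inv(Sigma), diff)
nll = quad + 0.5 * n * np.log(np.linalg.det(Sigma)) + 0.5 * n * d * np.log(2 * np.pi)
print(np.isclose(nll, -multivariate_normal.logpdf(X, mu, Sigma).sum()))  # True
```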
The derivative of the two rightmost terms with respect to $\boldsymbol{\mu}$ is $0$, meaning we just need to solve:
$$
\frac{\partial}{\partial \boldsymbol{\mu}}
\Big[
\frac{1}{2} \sum_{i=1}^{n} (\mathbf{x}_i - \boldsymbol{\mu})^\top \Sigma^{-1} (\mathbf{x}_i - \boldsymbol{\mu})
\Big]
=
0
$$
By the linearity of differentiation, we have:
$$
\frac{1}{2}
\sum_{i=1}^{n}
\frac{\partial}{\partial \boldsymbol{\mu}}
\Big[
(\mathbf{x}_i - \boldsymbol{\mu})^\top \Sigma^{-1} (\mathbf{x}_i - \boldsymbol{\mu})
\Big]
=
0
$$
Using Equation (86) from the Matrix Cookbook, we get:
$$
\frac{1}{2}
\sum_{i=1}^{n}
\Big[
-2 \Sigma^{-1} (\mathbf{x}_i - \boldsymbol{\mu})
\Big]
=
0
$$
Finally, solving for $\boldsymbol{\mu}$, we get:
$$
\begin{align}
0
&= \frac{1}{2} \sum_{i=1}^{n} \Big[ -2 \Sigma^{-1} (\mathbf{x}_i - \boldsymbol{\mu}) \Big]
\\
&= - \sum_{i=1}^{n} \Big[ \Sigma^{-1} \mathbf{x}_i - \Sigma^{-1} \boldsymbol{\mu} \Big]
\\
&= - \sum_{i=1}^{n} \Sigma^{-1} \mathbf{x}_i + n \Sigma^{-1} \boldsymbol{\mu}
\\
- n \Sigma^{-1} \boldsymbol{\mu} &= - \Sigma^{-1} \sum_{i=1}^{n} \mathbf{x}_i
\\
\boldsymbol{\mu} &= \frac{1}{n} \sum_{i=1}^{n} \mathbf{x}_i
\end{align}
$$
And we're done.
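As a quick numerical sanity check on this result, here is a minimal NumPy sketch (my own, not from the paper; all variable names are assumptions) confirming that the gradient of the quadratic term vanishes at the empirical mean:

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 3, 50
X = rng.normal(size=(n, d))                  # rows are the samples x_i
Sigma = np.cov(X, rowvar=False) + np.eye(d)  # any symmetric positive-definite Sigma
Sigma_inv = np.linalg.inv(Sigma)
mu_hat = X.mean(axis=0)                      # empirical mean

# Gradient of the quadratic term w.r.t. mu: -Sigma^{-1} sum_i (x_i - mu)
grad = -Sigma_inv @ (X - mu_hat).sum(axis=0)
print(np.allclose(grad, 0.0))                # True: mu_hat is a stationary point
```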
Tags: statistics, derivatives, partial-derivative, matrix-calculus
asked Jul 30 at 13:34 by gwg
1 Answer
All those Greek letters are a pain to type, so let's use these variables
$$
S = \Sigma, \quad P = \Psi, \quad L = \mathcal{L}, \quad Z = (X - \mu \mathbf{1}^\top)
$$
where $X$ is the matrix whose columns are the $\mathbf{x}_i$ vectors, and $\mu \mathbf{1}^\top$ is the matrix each of whose columns equals $\mu$.
Further, let's use a colon to denote the trace/Frobenius product
$$A:B = \operatorname{tr}(A^\top B)$$
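A quick numerical illustration of this definition (a minimal NumPy sketch; it also uses the standard fact that $A:B$ equals the sum of the elementwise products):

```python
import numpy as np

rng = np.random.default_rng(1)
A, B = rng.normal(size=(4, 4)), rng.normal(size=(4, 4))
# The Frobenius product A:B = tr(A^T B) is the sum of elementwise products.
print(np.isclose(np.trace(A.T @ B), np.sum(A * B)))  # True
```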
Write the objective function in terms of the Frobenius product and these new variables. Then find its differential and gradients.
$$
\begin{align}
L &= \tfrac{n}{2}\log(\det(S)) + \tfrac{1}{2}\, ZZ^\top : S^{-1} + K
\\
dL
&= \tfrac{n}{2}\operatorname{tr}\,(d\log(S)) + \tfrac{1}{2}\, ZZ^\top : dS^{-1} + 0
\\
&= \frac{1}{2}\Big(nS^{-1} - S^{-1}ZZ^\top S^{-1}\Big) : dS
\\
&= \frac{1}{2}\Big(nS^{-1} - S^{-1}ZZ^\top S^{-1}\Big) : d(WW^\top + P)
\\
&= \frac{1}{2}\Big(nS^{-1} - S^{-1}ZZ^\top S^{-1}\Big) : (dW\,W^\top + W\,dW^\top + dP)
\end{align}
$$
Setting $dW=0$ yields the gradient wrt $P$
$$
\begin{align}
dL &= \frac{1}{2}\Big(nS^{-1} - S^{-1}ZZ^\top S^{-1}\Big) : dP
\\
\frac{\partial L}{\partial P}
&= \frac{1}{2}\Big(nS^{-1} - S^{-1}ZZ^\top S^{-1}\Big)
\end{align}
$$
While setting $dP=0$ recovers the gradient wrt $W$
$$
\begin{align}
dL
&= \frac{1}{2}\Big(nS^{-1} - S^{-1}ZZ^\top S^{-1}\Big) : (dW\,W^\top + W\,dW^\top)
\\
&= \Big(nS^{-1} - S^{-1}ZZ^\top S^{-1}\Big)W : dW
\\
\frac{\partial L}{\partial W}
&= \Big(nS^{-1} - S^{-1}ZZ^\top S^{-1}\Big)W
\end{align}
$$
In several of the steps, we've made use of the fact that $S$ is symmetric.
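For a concrete check, here is a finite-difference verification of both gradients (a minimal NumPy sketch of mine, not part of the original answer; it assumes a centered data matrix $Z$ and a positive-definite $P$):

```python
import numpy as np

rng = np.random.default_rng(2)
d, q, n = 4, 2, 30
X = rng.normal(size=(d, n))              # columns are the samples x_i
Z = X - X.mean(axis=1, keepdims=True)    # Z = X - mu 1^T (centered data)
W = rng.normal(size=(d, q))
P = np.eye(d)                            # Psi: any symmetric positive-definite matrix

def L(W, P):
    S = W @ W.T + P
    sign, logdet = np.linalg.slogdet(S)
    return 0.5 * n * logdet + 0.5 * np.trace(Z @ Z.T @ np.linalg.inv(S))

Si = np.linalg.inv(W @ W.T + P)
G = n * Si - Si @ Z @ Z.T @ Si           # common factor in both gradients
grad_W, grad_P = G @ W, 0.5 * G          # the answer's dL/dW and dL/dP

eps = 1e-6                               # central finite difference, one entry each
E = np.zeros((d, q)); E[1, 0] = eps
print(np.isclose((L(W + E, P) - L(W - E, P)) / (2 * eps), grad_W[1, 0]))  # True
E = np.zeros((d, d)); E[2, 3] = eps
print(np.isclose((L(W, P + E) - L(W, P - E)) / (2 * eps), grad_P[2, 3]))  # True
```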
Thanks! I can follow your derivation of $B$, the log of the determinant. I'm confused about how you set up $A$, though. My part $A$ is $\frac{1}{2} \sum_i \mathbf{z}_i^\top S^{-1} \mathbf{z}_i$. If I use the trace trick, that's $\frac{1}{2} \sum_i \operatorname{tr}(\mathbf{z}_i \mathbf{z}_i^\top S^{-1})$. But your setup is $\frac{1}{2}\, ZZ^\top : S^{-1}$. What happened to the summation?
– gwg
Jul 30 at 19:47
Sorry, I misread the A term. The answer has been updated with the correct term. The change consisted of changing the $z$ vector into the $Z$ matrix.
– greg
Jul 30 at 20:36
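For readers following the exchange above, the summation is absorbed into the matrix $Z$ by the trace trick: since the $\mathbf{z}_i$ are the columns of $Z$, we have $\sum_i \mathbf{z}_i \mathbf{z}_i^\top = ZZ^\top$, and therefore
$$
\sum_{i=1}^{n} \mathbf{z}_i^\top S^{-1} \mathbf{z}_i
= \sum_{i=1}^{n} \operatorname{tr}\big(\mathbf{z}_i \mathbf{z}_i^\top S^{-1}\big)
= \operatorname{tr}\big(ZZ^\top S^{-1}\big)
= ZZ^\top : S^{-1}.
$$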
answered Jul 30 at 18:01 (edited Jul 30 at 21:21) by greg