烟雨山河 [不良人 | 尤川] (4)
Author: 紫微客
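I should have been happy, yet my eyes nearly reddened with tears.
I remember that the Saintess can summon a sky full of butterflies only by playing her flute. At the festival a year ago, I danced the Nichang dance to the accompaniment of her flute, and that day the sacrificial altar of the Wandu Grotto (万毒窟) looked like a breathtaking valley of butterflies.
The Saintess's birthday falls close to mine. You Chuan (尤川) no longer gets along with her as he once did, and the two now seem to stand in opposition. I thought: this hairpin was probably meant for the Saintess, wasn't it?
Still, anything he gave me was good. I took the hairpin, lifted my head, and smiled cheerfully as if nothing were wrong: "Did you buy it in the Central Plains?"
"Yes." He was as terse as ever.
"Ha! So you're planning to use buying me a present as a cover for yourself!" I wrinkled my nose on purpose, feigning anger. "You Chuan, that's not very decent of you!"
"I..."
Sure enough, he never managed to get anything out after that "I".
So I went first, giving him an easy smile: "All right, I was only teasing you."
"Thank you. I like it very much!"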
Chapter 3
We can now ignore the constant terms, since they do not affect the maximizer. We therefore only need the terms that depend on $\boldsymbol{\Sigma}_k$, which are
$$
-\frac{1}{2} \sum_{n=1}^N \gamma\left(z_{n k}\right) (\mathbf{x}_n - \boldsymbol{\mu}_k)^T \boldsymbol{\Sigma}_k^{-1} (\mathbf{x}_n - \boldsymbol{\mu}_k)+ \frac{1}{2} \sum_{n=1}^N \gamma\left(z_{n k}\right) \ln \left|\boldsymbol{\Sigma}_k^{-1}\right|.
$$
To maximize this expression, we take the derivative with respect to $\boldsymbol{\Sigma}_k$ and set it equal to zero, which yields the stationarity condition for the maximizer. Specifically, we have
$$
\frac{\partial}{\partial \boldsymbol{\Sigma}_k} \left(-\frac{1}{2} \sum_{n=1}^N \gamma\left(z_{n k}\right) (\mathbf{x}_n - \boldsymbol{\mu}_k)^T \boldsymbol{\Sigma}_k^{-1} (\mathbf{x}_n - \boldsymbol{\mu}_k) + \frac{1}{2} \sum_{n=1}^N \gamma\left(z_{n k}\right) \ln \left|\boldsymbol{\Sigma}_k^{-1}\right|\right) = 0.
$$
First, let's consider the first term:
\[
-\frac{1}{2} \sum_{n=1}^N \gamma(z_{nk}) (\mathbf{x}_n - \boldsymbol{\mu}_k)^T \boldsymbol{\Sigma}_k^{-1} (\mathbf{x}_n - \boldsymbol{\mu}_k).
\]
Using matrix calculus, for a symmetric invertible matrix \(\mathbf{A}\), the derivative of \(\mathbf{a}^T \mathbf{A}^{-1} \mathbf{a}\) with respect to \(\mathbf{A}\) is:
\[
\frac{\partial}{\partial \mathbf{A}} (\mathbf{a}^T \mathbf{A}^{-1} \mathbf{a}) = -\mathbf{A}^{-1} \mathbf{a} \mathbf{a}^T \mathbf{A}^{-1}.
\]
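One way to justify this identity, sketched here under the assumption that \(\mathbf{A}\) is symmetric and invertible and that its entries are treated as independent variables, is to perturb \(\mathbf{A}\) by a small \(d\mathbf{A}\). Differentiating \(\mathbf{A}\mathbf{A}^{-1} = \mathbf{I}\) gives
\[
d(\mathbf{A}^{-1}) = -\mathbf{A}^{-1} \, (d\mathbf{A}) \, \mathbf{A}^{-1},
\]
so that
\[
d(\mathbf{a}^T \mathbf{A}^{-1} \mathbf{a}) = -\mathbf{a}^T \mathbf{A}^{-1} \, (d\mathbf{A}) \, \mathbf{A}^{-1} \mathbf{a} = -\mathrm{tr}\left(\mathbf{A}^{-1} \mathbf{a} \mathbf{a}^T \mathbf{A}^{-1} \, d\mathbf{A}\right).
\]
Reading off the matrix that multiplies \(d\mathbf{A}\) inside the trace, and using the symmetry of \(\mathbf{A}\), recovers \(-\mathbf{A}^{-1} \mathbf{a} \mathbf{a}^T \mathbf{A}^{-1}\).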
Therefore, for the first term:
\[
\frac{\partial}{\partial \boldsymbol{\Sigma}_k} \left( -\frac{1}{2} \sum_{n=1}^N \gamma(z_{nk}) (\mathbf{x}_n - \boldsymbol{\mu}_k)^T \boldsymbol{\Sigma}_k^{-1} (\mathbf{x}_n - \boldsymbol{\mu}_k) \right) = \frac{1}{2} \sum_{n=1}^N \gamma(z_{nk}) \boldsymbol{\Sigma}_k^{-1} (\mathbf{x}_n - \boldsymbol{\mu}_k) (\mathbf{x}_n - \boldsymbol{\mu}_k)^T \boldsymbol{\Sigma}_k^{-1}.
\]
Next, consider the second term:
\[
\frac{1}{2} \sum_{n=1}^N \gamma(z_{nk}) \ln \left|\boldsymbol{\Sigma}_k^{-1}\right|.
\]
We know that \(\ln |\mathbf{A}^{-1}| = -\ln |\mathbf{A}|\), and for a symmetric matrix \(\mathbf{A}\) the derivative of \(\ln |\mathbf{A}|\) with respect to \(\mathbf{A}\) is \(\mathbf{A}^{-1}\). Thus,
\[
\frac{\partial}{\partial \boldsymbol{\Sigma}_k} \left( \frac{1}{2} \sum_{n=1}^N \gamma(z_{nk}) \ln \left|\boldsymbol{\Sigma}_k^{-1}\right| \right) = -\frac{1}{2} \sum_{n=1}^N \gamma(z_{nk}) \boldsymbol{\Sigma}_k^{-1}.
\]
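The log-determinant derivative used here can be sketched from Jacobi's formula, where \(\ln |\mathbf{A}|\) denotes the logarithm of the determinant and \(\mathbf{A}\) is assumed symmetric and invertible with entries treated as independent:
\[
d \ln |\mathbf{A}| = \mathrm{tr}\left(\mathbf{A}^{-1} \, d\mathbf{A}\right)
\quad \Longrightarrow \quad
\frac{\partial \ln |\mathbf{A}|}{\partial \mathbf{A}} = \mathbf{A}^{-T} = \mathbf{A}^{-1}.
\]
Since \(|\mathbf{A}^{-1}| = 1 / |\mathbf{A}|\), it follows that
\[
\frac{\partial \ln |\mathbf{A}^{-1}|}{\partial \mathbf{A}} = -\mathbf{A}^{-1},
\]
which, with \(\mathbf{A} = \boldsymbol{\Sigma}_k\), is the factor that appears in the derivative of the second term.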
Combining the derivatives of both terms, we get:
\[
\frac{1}{2} \sum_{n=1}^N \gamma(z_{nk}) \boldsymbol{\Sigma}_k^{-1} (\mathbf{x}_n - \boldsymbol{\mu}_k) (\mathbf{x}_n - \boldsymbol{\mu}_k)^T \boldsymbol{\Sigma}_k^{-1} - \frac{1}{2} \sum_{n=1}^N \gamma(z_{nk}) \boldsymbol{\Sigma}_k^{-1} = 0.
\]
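As a sketch of how this stationarity condition can be solved, assume \(\boldsymbol{\Sigma}_k\) is invertible and multiply the equation by \(\boldsymbol{\Sigma}_k\) on both the left and the right, then cancel the common factor \(\frac{1}{2}\):
\[
\sum_{n=1}^N \gamma(z_{nk}) (\mathbf{x}_n - \boldsymbol{\mu}_k) (\mathbf{x}_n - \boldsymbol{\mu}_k)^T - \sum_{n=1}^N \gamma(z_{nk}) \boldsymbol{\Sigma}_k = 0.
\]
Writing \(N_k = \sum_{n=1}^N \gamma(z_{nk})\) (a shorthand introduced here for the effective number of points assigned to component \(k\)), this rearranges to
\[
\boldsymbol{\Sigma}_k = \frac{1}{N_k} \sum_{n=1}^N \gamma(z_{nk}) (\mathbf{x}_n - \boldsymbol{\mu}_k) (\mathbf{x}_n - \boldsymbol{\mu}_k)^T.
\]
As a quick sanity check in one dimension, with \(\boldsymbol{\Sigma}_k = \sigma_k^2\) and scalar data \(x_n\), the same condition reduces to
\[
\sigma_k^2 = \frac{\sum_{n=1}^N \gamma(z_{nk}) (x_n - \mu_k)^2}{\sum_{n=1}^N \gamma(z_{nk})},
\]
that is, a responsibility-weighted sample variance.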
\newpage
To find the optimal value of \(\pi_k\), we can use the Lagrange multiplier method, since the mixing coefficients must satisfy the constraint \(\sum_{k=1}^K \pi_k = 1\).
Consider the maximization of the following expression with respect to \(\pi_k\) while keeping the responsibilities \(\gamma(z_{nk})\) fixed:
\[
\mathbb{E}_{\mathbf{Z}}[\ln p(\mathbf{X}, \mathbf{Z} \mid \boldsymbol{\mu}, \boldsymbol{\Sigma}, \boldsymbol{\pi})] = \sum_{n=1}^N \sum_{k=1}^K \gamma(z_{nk}) \left\{\ln \pi_k + \ln \mathcal{N}(\mathbf{x}_n \mid \boldsymbol{\mu}_k, \boldsymbol{\Sigma}_k)\right\}.
\]
Let's define the Lagrangian function as:
\[
\mathcal{L}(\boldsymbol{\pi}, \lambda) = \sum_{n=1}^N \sum_{k=1}^K \gamma(z_{nk}) \left\{\ln \pi_k + \ln \mathcal{N}(\mathbf{x}_n \mid \boldsymbol{\mu}_k, \boldsymbol{\Sigma}_k)\right\} + \lambda \left(\sum_{k=1}^K \pi_k - 1\right).