<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Stochastic Environment |</title><link>https://emma-gnabeyeu.github.io/tags/stochastic-environment/</link><atom:link href="https://emma-gnabeyeu.github.io/tags/stochastic-environment/index.xml" rel="self" type="application/rss+xml"/><description>Stochastic Environment</description><generator>HugoBlox Kit (https://hugoblox.com)</generator><language>en-us</language><lastBuildDate>Wed, 22 Oct 2025 00:00:00 +0000</lastBuildDate><image><url>https://emma-gnabeyeu.github.io/media/icon_hu_da05098ef60dc2e7.png</url><title>Stochastic Environment</title><link>https://emma-gnabeyeu.github.io/tags/stochastic-environment/</link></image><item><title>👩🏼‍🏫 Awards and Certificates</title><link>https://emma-gnabeyeu.github.io/blog/award/</link><pubDate>Wed, 22 Oct 2025 00:00:00 +0000</pubDate><guid>https://emma-gnabeyeu.github.io/blog/award/</guid><description>&lt;h2 id="-award-2025-best-european-masters-thesis-in-mathematical-finance"&gt;🏆 Award: 2025 Best European Master’s Thesis in Mathematical Finance.&lt;/h2&gt;
&lt;p&gt;Recipient of the 2025 Natixis Award for Best Master’s Thesis in Mathematical Finance.&lt;/p&gt;
&lt;h2 id="related-work"&gt;Related Work&lt;/h2&gt;
&lt;p&gt;Below is the research paper associated with this award:&lt;/p&gt;
&lt;div class="pub-list-item view-citation" style="margin-bottom: 1rem"&gt;
&lt;i class="far fa-file-alt pub-icon" aria-hidden="true"&gt;&lt;/i&gt;
&lt;span class="article-metadata li-cite-author"&gt;
&lt;span &gt;&lt;a href="../../authors/me/"&gt;Emmanuel G.&lt;/a&gt;&lt;/span&gt;&lt;span class="relative inline-block ml-1" x-data="{ tooltip: false }"&gt;
&lt;button
@mouseenter="tooltip = true"
@mouseleave="tooltip = false"
@click="tooltip = !tooltip"
class="author-notes text-primary-600 dark:text-primary-400 hover:text-primary-800 dark:hover:text-primary-200 transition-colors cursor-help"
data-tooltip="Equal contribution"
aria-label="Equal contribution"
type="button"
&gt;
&lt;svg class="inline-block w-4 h-4" fill="currentColor" viewBox="0 0 20 20" xmlns="http://www.w3.org/2000/svg"&gt;
&lt;path fill-rule="evenodd" d="M18 10a8 8 0 11-16 0 8 8 0 0116 0zm-7-4a1 1 0 11-2 0 1 1 0 012 0zM9 9a1 1 0 000 2v3a1 1 0 001 1h1a1 1 0 100-2v-3a1 1 0 00-1-1H9z" clip-rule="evenodd"&gt;&lt;/path&gt;
&lt;/svg&gt;
&lt;/button&gt;
&lt;div
x-show="tooltip"
x-transition:enter="transition ease-out duration-200"
x-transition:enter-start="opacity-0 transform scale-95"
x-transition:enter-end="opacity-100 transform scale-100"
x-transition:leave="transition ease-in duration-150"
x-transition:leave-start="opacity-100 transform scale-100"
x-transition:leave-end="opacity-0 transform scale-95"
@click.away="tooltip = false"
class="absolute z-50 bottom-full left-1/2 transform -translate-x-1/2 mb-2 px-3 py-2 text-sm text-white bg-gray-900 dark:bg-gray-700 rounded-lg shadow-lg whitespace-nowrap"
x-cloak
&gt;
Equal contribution
&lt;div class="absolute top-full left-1/2 transform -translate-x-1/2 -mt-1 w-0 h-0 border-l-4 border-r-4 border-t-4 border-transparent border-t-gray-900 dark:border-t-gray-700"&gt;&lt;/div&gt;
&lt;/div&gt;
&lt;/span&gt;, &lt;span &gt;&lt;a href="../../authors/omar-karkar/"&gt;Omar Karkar&lt;/a&gt;&lt;/span&gt;&lt;span class="relative inline-block ml-1" x-data="{ tooltip: false }"&gt;
&lt;button
@mouseenter="tooltip = true"
@mouseleave="tooltip = false"
@click="tooltip = !tooltip"
class="author-notes text-primary-600 dark:text-primary-400 hover:text-primary-800 dark:hover:text-primary-200 transition-colors cursor-help"
data-tooltip="Equal contribution"
aria-label="Equal contribution"
type="button"
&gt;
&lt;svg class="inline-block w-4 h-4" fill="currentColor" viewBox="0 0 20 20" xmlns="http://www.w3.org/2000/svg"&gt;
&lt;path fill-rule="evenodd" d="M18 10a8 8 0 11-16 0 8 8 0 0116 0zm-7-4a1 1 0 11-2 0 1 1 0 012 0zM9 9a1 1 0 000 2v3a1 1 0 001 1h1a1 1 0 100-2v-3a1 1 0 00-1-1H9z" clip-rule="evenodd"&gt;&lt;/path&gt;
&lt;/svg&gt;
&lt;/button&gt;
&lt;div
x-show="tooltip"
x-transition:enter="transition ease-out duration-200"
x-transition:enter-start="opacity-0 transform scale-95"
x-transition:enter-end="opacity-100 transform scale-100"
x-transition:leave="transition ease-in duration-150"
x-transition:leave-start="opacity-100 transform scale-100"
x-transition:leave-end="opacity-0 transform scale-95"
@click.away="tooltip = false"
class="absolute z-50 bottom-full left-1/2 transform -translate-x-1/2 mb-2 px-3 py-2 text-sm text-white bg-gray-900 dark:bg-gray-700 rounded-lg shadow-lg whitespace-nowrap"
x-cloak
&gt;
Equal contribution
&lt;div class="absolute top-full left-1/2 transform -translate-x-1/2 -mt-1 w-0 h-0 border-l-4 border-r-4 border-t-4 border-transparent border-t-gray-900 dark:border-t-gray-700"&gt;&lt;/div&gt;
&lt;/div&gt;
&lt;/span&gt;, &lt;span &gt;&lt;a href="../../authors/imad-idboufous/"&gt;Imad Idboufous&lt;/a&gt;&lt;/span&gt;
&lt;/span&gt;
(2024).
&lt;a href="../../publications/conference-paper/" class="underline"&gt;Solving The Dynamic Volatility Fitting Problem: A Deep Reinforcement Learning Approach&lt;/a&gt;.
In &lt;em&gt;IJCNN&lt;/em&gt;.
&lt;div class="flex flex-wrap space-x-3"&gt;
&lt;a class="hb-attachment-link hb-attachment-link-small" href="https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4991699" target="_blank" rel="noopener"&gt;
&lt;svg style="height: 1em" class='inline-block' xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24"&gt;&lt;path fill="none" stroke="currentColor" stroke-linecap="round" stroke-linejoin="round" stroke-width="1.5" d="M19.5 14.25v-2.625a3.375 3.375 0 0 0-3.375-3.375h-1.5A1.125 1.125 0 0 1 13.5 7.125v-1.5a3.375 3.375 0 0 0-3.375-3.375H8.25m0 12.75h7.5m-7.5 3H12M10.5 2.25H5.625c-.621 0-1.125.504-1.125 1.125v17.25c0 .621.504 1.125 1.125 1.125h12.75c.621 0 1.125-.504 1.125-1.125V11.25a9 9 0 0 0-9-9"/&gt;&lt;/svg&gt;
PDF
&lt;/a&gt;
&lt;a class="hb-attachment-link hb-attachment-link-small" href="../../../uploads/PresentationVolFitting.pdf" &gt;
&lt;svg style="height: 1em" class='inline-block' xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24"&gt;&lt;path fill="none" stroke="currentColor" stroke-linecap="round" stroke-linejoin="round" stroke-width="1.5" d="M3.75 3v11.25A2.25 2.25 0 0 0 6 16.5h2.25M3.75 3h-1.5m1.5 0h16.5m0 0h1.5m-1.5 0v11.25A2.25 2.25 0 0 1 18 16.5h-2.25m-7.5 0h7.5m-7.5 0l-1 3m8.5-3l1 3m0 0l.5 1.5m-.5-1.5h-9.5m0 0l-.5 1.5M9 11.25v1.5M12 9v3.75m3-6v6"/&gt;&lt;/svg&gt;
Slides
&lt;/a&gt;
&lt;a class="hb-attachment-link hb-attachment-link-small" href="../../../uploads/Poster_RL_for_vol_IJCNN2025-_.pdf" &gt;
&lt;svg style="height: 1em" class='inline-block' xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24"&gt;&lt;path fill="none" stroke="currentColor" stroke-linecap="round" stroke-linejoin="round" stroke-width="1.5" d="m2.25 15.75l5.159-5.159a2.25 2.25 0 0 1 3.182 0l5.159 5.159m-1.5-1.5l1.409-1.409a2.25 2.25 0 0 1 3.182 0l2.909 2.909m-18 3.75h16.5a1.5 1.5 0 0 0 1.5-1.5V6a1.5 1.5 0 0 0-1.5-1.5H3.75A1.5 1.5 0 0 0 2.25 6v12a1.5 1.5 0 0 0 1.5 1.5m10.5-11.25h.008v.008h-.008zm.375 0a.375.375 0 1 1-.75 0a.375.375 0 0 1 .75 0"/&gt;&lt;/svg&gt;
Poster
&lt;/a&gt;
&lt;button class="hb-attachment-link hb-attachment-link-small js-cite-clipboard cursor-pointer" type="button" data-filename="/publications/conference-paper/cite.bib"&gt;
&lt;svg style="height: 1em" class='inline-block' xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24"&gt;&lt;path fill="none" stroke="currentColor" stroke-linecap="round" stroke-linejoin="round" stroke-width="1.5" d="M15.75 17.25v3.375c0 .621-.504 1.125-1.125 1.125h-9.75a1.125 1.125 0 0 1-1.125-1.125V7.875c0-.621.504-1.125 1.125-1.125H6.75a9.06 9.06 0 0 1 1.5.124m7.5 10.376h3.375c.621 0 1.125-.504 1.125-1.125V11.25c0-4.46-3.243-8.161-7.5-8.876a9.06 9.06 0 0 0-1.5-.124H9.375c-.621 0-1.125.504-1.125 1.125v3.5m7.5 10.375H9.375a1.125 1.125 0 0 1-1.125-1.125v-9.25m12 6.625v-1.875a3.375 3.375 0 0 0-3.375-3.375h-1.5a1.125 1.125 0 0 1-1.125-1.125v-1.5a3.375 3.375 0 0 0-3.375-3.375H9.75"/&gt;&lt;/svg&gt;
&lt;span&gt;Cite&lt;/span&gt;
&lt;/button&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;h2 id="video"&gt;Video&lt;/h2&gt;
&lt;p&gt;Here is the full ceremony replay:&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Dailymotion&lt;/strong&gt;:&lt;/p&gt;
&lt;iframe frameborder="0" width="670" height="400" src="https://geo.dailymotion.com/player/xh36y.html?video=k3bKFXMZ7XNtD6E3r5U" allowfullscreen&gt;&lt;/iframe&gt;
&lt;p&gt;&lt;strong&gt;Video file&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Videos of some animations:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Training Skew&lt;/th&gt;
&lt;th&gt;Volatility Fitting&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;div style="width:100%; max-width:720px; height:520px; margin:auto; overflow:hidden; display:flex; align-items:center; justify-content:center;"&gt;
&lt;video controls &gt;
&lt;source src="../../media/TrainingSkew_RL_animation.mp4" type="video/mp4"&gt;
&lt;/video&gt;
&lt;/div&gt;&lt;/td&gt;
&lt;td&gt;&lt;div style="width:90%; max-width:520px; height:520px; margin:auto; overflow:hidden; display:flex; align-items:center; justify-content:center;"&gt;
&lt;video controls &gt;
&lt;source src="../../media/vol_fitting_animation_equity.mp4" type="video/mp4"&gt;
&lt;/video&gt;
&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;h2 id="podcast"&gt;Podcast&lt;/h2&gt;
&lt;pre&gt;&lt;code&gt;{{&amp;lt; audio src=&amp;quot;ambient-piano.mp3&amp;quot; &amp;gt;}}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Try it out:&lt;/p&gt;
&lt;audio controls &gt;
&lt;source src="../../blog/award/ambient-piano.mp3" type="audio/mpeg"&gt;
&lt;/audio&gt;
&lt;h2 id="math-block"&gt;Math block:&lt;/h2&gt;
$$
\textcolor{red}{\text{Action network:}} \quad \text{DPG} = \textcolor{blue}{\text{Deep Policy Gradient}}
$$
$$
a_t \sim \textcolor{red}{\pi^{D}(s_t,\theta^{\pi})} + \epsilon_t,
\quad \text{with} \quad
\textcolor{brown}{\epsilon_t \sim \mathcal{N}(0, \sigma_n^2 I_K)},
\quad \text{and} \quad
\sigma_n = \max\!\left(\sigma_0\left(1-\frac{n}{N}\right)^4,\sigma_{\min}\right)
$$
$$
\textcolor{red}{\text{Critic network:}} \quad \textcolor{blue}{\text{Q-Learning and Bellman equation}}
$$
$$
\begin{cases}
R_{t}=\sum_{i=t}^{T}\gamma^{(i-t)} r(s_{i},a_{i}) \\[6pt]
Q^{\pi}(s_{t},a_{t})=\mathbb{E}[R_{t}\mid s_{t},a_{t}]
\end{cases}
\quad \Rightarrow \quad
\textcolor{red}{
L(\theta^{Q})=
\mathbb{E}\left[
\left(
Q^{\pi}(s_{t},a_{t}; \theta^{Q})-
\left(r(s_{t},a_{t})+\gamma Q^{\pi}(s_{t+1},a_{t+1};\theta^{Q})\right)
\right)^{2}
\right]
}
$$
&lt;p&gt;&lt;strong&gt;LaTeX code&lt;/strong&gt;&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-latex" data-lang="latex"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="k"&gt;\begin&lt;/span&gt;&lt;span class="nb"&gt;{&lt;/span&gt;itemize&lt;span class="nb"&gt;}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;\item&lt;/span&gt; &lt;span class="k"&gt;\textcolor&lt;/span&gt;&lt;span class="nb"&gt;{&lt;/span&gt;red&lt;span class="nb"&gt;}{&lt;/span&gt;Action network:&lt;span class="nb"&gt;}&lt;/span&gt; DPG= &lt;span class="k"&gt;\textcolor&lt;/span&gt;&lt;span class="nb"&gt;{&lt;/span&gt;blue&lt;span class="nb"&gt;}{&lt;/span&gt;Deep Policy gradient&lt;span class="nb"&gt;}&lt;/span&gt; &lt;span class="c"&gt;% $ J = \mathbb{E}[R_{s_{t}}] ==== $ (\textbf{Proof:} Cf. Report)
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="k"&gt;\end&lt;/span&gt;&lt;span class="nb"&gt;{&lt;/span&gt;itemize&lt;span class="nb"&gt;}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="sb"&gt;$$&lt;/span&gt;&lt;span class="nb"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nb"&gt;a_t &lt;/span&gt;&lt;span class="nv"&gt;\sim&lt;/span&gt;&lt;span class="nb"&gt; &lt;/span&gt;&lt;span class="nv"&gt;\textcolor&lt;/span&gt;&lt;span class="nb"&gt;{red}{&lt;/span&gt;&lt;span class="nv"&gt;\pi&lt;/span&gt;&lt;span class="nb"&gt;^{D} &lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;s_t,&lt;/span&gt;&lt;span class="nv"&gt;\theta&lt;/span&gt;&lt;span class="nb"&gt;^{&lt;/span&gt;&lt;span class="nv"&gt;\pi&lt;/span&gt;&lt;span class="nb"&gt;}&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;&lt;span class="nb"&gt;} &lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="nb"&gt; &lt;/span&gt;&lt;span class="nv"&gt;\epsilon&lt;/span&gt;&lt;span class="nb"&gt;_t &lt;/span&gt;&lt;span class="nv"&gt;\quad&lt;/span&gt;&lt;span class="nb"&gt; &lt;/span&gt;&lt;span class="nv"&gt;\text&lt;/span&gt;&lt;span class="nb"&gt;{with} &lt;/span&gt;&lt;span class="nv"&gt;\quad&lt;/span&gt;&lt;span class="nb"&gt; &lt;/span&gt;&lt;span class="nv"&gt;\textcolor&lt;/span&gt;&lt;span class="nb"&gt;{brown}{&lt;/span&gt;&lt;span class="nv"&gt;\epsilon&lt;/span&gt;&lt;span class="nb"&gt;_t &lt;/span&gt;&lt;span class="nv"&gt;\sim&lt;/span&gt;&lt;span class="nb"&gt; &lt;/span&gt;&lt;span class="nv"&gt;\mathcal&lt;/span&gt;&lt;span class="nb"&gt;{N}&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="m"&gt;0&lt;/span&gt;&lt;span class="nb"&gt;, &lt;/span&gt;&lt;span class="nv"&gt;\sigma&lt;/span&gt;&lt;span class="nb"&gt;_n^&lt;/span&gt;&lt;span class="m"&gt;2&lt;/span&gt;&lt;span class="nb"&gt; I_K&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;&lt;span class="nb"&gt;} &lt;/span&gt;&lt;span class="nv"&gt;\quad&lt;/span&gt;&lt;span class="nb"&gt; &lt;/span&gt;&lt;span class="nv"&gt;\text&lt;/span&gt;&lt;span class="nb"&gt;{and} &lt;/span&gt;&lt;span class="nv"&gt;\quad&lt;/span&gt;&lt;span class="nb"&gt; &lt;/span&gt;&lt;span 
class="nv"&gt;\sigma&lt;/span&gt;&lt;span class="nb"&gt;_n &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt; &lt;/span&gt;&lt;span class="nv"&gt;\text&lt;/span&gt;&lt;span class="nb"&gt;{max}&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;\sigma&lt;/span&gt;&lt;span class="nb"&gt;_&lt;/span&gt;&lt;span class="m"&gt;0&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="nv"&gt;\frac&lt;/span&gt;&lt;span class="nb"&gt;{n}{N} &lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;&lt;span class="nb"&gt;^{&lt;/span&gt;&lt;span class="m"&gt;4&lt;/span&gt;&lt;span class="nb"&gt;},&lt;/span&gt;&lt;span class="nv"&gt;\sigma&lt;/span&gt;&lt;span class="nb"&gt;_{&lt;/span&gt;&lt;span class="nv"&gt;\text&lt;/span&gt;&lt;span class="nb"&gt;{min}}&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;&lt;span class="nb"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="s"&gt;$$&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="k"&gt;\begin&lt;/span&gt;&lt;span class="nb"&gt;{&lt;/span&gt;itemize&lt;span class="nb"&gt;}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;\item&lt;/span&gt; &lt;span class="k"&gt;\textcolor&lt;/span&gt;&lt;span class="nb"&gt;{&lt;/span&gt;red&lt;span class="nb"&gt;}{&lt;/span&gt;Critic network:&lt;span class="nb"&gt;}&lt;/span&gt; &lt;span class="k"&gt;\textcolor&lt;/span&gt;&lt;span class="nb"&gt;{&lt;/span&gt;blue&lt;span class="nb"&gt;}{&lt;/span&gt;Q-Learning and Bellman equation.&lt;span class="nb"&gt;}&lt;/span&gt; &lt;span class="c"&gt;% $ Q_{\theta^{Q}}(s_{t_{i}},a_{t_{i}}) = - \sum_{k=t_i}^{T} \mathbb{E}_{(s_{k},a_{k})\sim \rho_\pi} [ \gamma^{(k-t_i)} \xi (\vec{\theta}_{t_{k}} )] $
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="k"&gt;\end&lt;/span&gt;&lt;span class="nb"&gt;{&lt;/span&gt;itemize&lt;span class="nb"&gt;}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="sb"&gt;$$&lt;/span&gt;&lt;span class="nb"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nv"&gt;\begin&lt;/span&gt;&lt;span class="nb"&gt;{cases} R_{t}&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nv"&gt;\sum&lt;/span&gt;&lt;span class="nb"&gt;_{i&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;t}^{T}&lt;/span&gt;&lt;span class="nv"&gt;\gamma&lt;/span&gt;&lt;span class="nb"&gt;^{&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;i&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="nb"&gt; t&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;&lt;span class="nb"&gt;}r&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;s_{i},a_{i}&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;&lt;span class="nv"&gt;\\&lt;/span&gt;&lt;span class="nb"&gt;Q^{&lt;/span&gt;&lt;span class="nv"&gt;\pi&lt;/span&gt;&lt;span class="nb"&gt;}&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;s_{t},a_{t}&lt;/span&gt;&lt;span class="o"&gt;)=&lt;/span&gt;&lt;span class="nv"&gt;\mathbb&lt;/span&gt;&lt;span class="nb"&gt;{E}&lt;/span&gt;&lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;R_{t}|s_{t},a_{t}&lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt;&lt;span class="nb"&gt;&amp;amp;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nv"&gt;\end&lt;/span&gt;&lt;span class="nb"&gt;{cases} &lt;/span&gt;&lt;span class="nv"&gt;\quad&lt;/span&gt;&lt;span class="nb"&gt; &lt;/span&gt;&lt;span class="nv"&gt;\Rightarrow&lt;/span&gt;&lt;span class="nb"&gt; &lt;/span&gt;&lt;span class="nv"&gt;\textcolor&lt;/span&gt;&lt;span class="nb"&gt;{red}{L&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;\theta&lt;/span&gt;&lt;span class="nb"&gt;^{Q}&lt;/span&gt;&lt;span class="o"&gt;)=&lt;/span&gt;&lt;span class="nv"&gt;\mathbb&lt;/span&gt;&lt;span class="nb"&gt;{E}&lt;/span&gt;&lt;span class="o"&gt;[(&lt;/span&gt;&lt;span class="nb"&gt;Q^{&lt;/span&gt;&lt;span class="nv"&gt;\pi&lt;/span&gt;&lt;span class="nb"&gt;}&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;s_{t},a_{t}; &lt;/span&gt;&lt;span class="nv"&gt;\theta&lt;/span&gt;&lt;span class="nb"&gt;^{Q}&lt;/span&gt;&lt;span class="o"&gt;)-[&lt;/span&gt;&lt;span class="nb"&gt;r&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;s_{t},a_{t}&lt;/span&gt;&lt;span class="o"&gt;)+&lt;/span&gt;&lt;span class="nv"&gt;\gamma&lt;/span&gt;&lt;span class="nb"&gt; Q^{&lt;/span&gt;&lt;span class="nv"&gt;\pi&lt;/span&gt;&lt;span class="nb"&gt;}&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;s_{t&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="nb"&gt;},a_{t&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="nb"&gt;};&lt;/span&gt;&lt;span class="nv"&gt;\theta&lt;/span&gt;&lt;span class="nb"&gt;^{Q}&lt;/span&gt;&lt;span class="o"&gt;)])&lt;/span&gt;&lt;span class="nb"&gt;^{&lt;/span&gt;&lt;span class="m"&gt;2&lt;/span&gt;&lt;span class="nb"&gt;}&lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt;&lt;span class="nb"&gt;}
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="s"&gt;$$&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;</description></item></channel></rss>