CS-566 Deep Reinforcement Learning
Notebook: Gaussian Density
Instructor: Nazar Khan
This notebook provides an interactive introduction to the Gaussian Density function.
1. What is a Gaussian Density?¶
A Gaussian (also called Normal Distribution or Bell Curve) is a smooth hill-shaped curve.
- Most values are near the center
- Fewer values appear far away
- The shape looks like a bell
It is important in:
- Machine learning
- Reinforcement learning (policies!)
- Statistics
- Physics
- Nature (heights, errors, noise, etc.) -- hence the name Normal
2. Why is Gaussian Density important in RL?¶
In REINFORCE with continuous actions:
- The policy outputs a mean ($\mu$) and a standard deviation ($\sigma$)
- Actions are sampled from a Gaussian, which gives the agent a built-in mechanism for exploration
- The log-probability of the action is needed for gradients
So Gaussian density is at the heart of continuous-action RL.
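As a rough illustration of the three points above, here is a minimal NumPy sketch of a single policy step. The values of `mu` and `sigma` are placeholders standing in for the outputs of a policy network, and the helper name `gaussian_log_prob` is hypothetical (it is not part of any library). It only computes the logarithm of the Gaussian density; a real REINFORCE implementation would compute this inside an autodiff framework so that gradients can flow.

```python
import numpy as np

def gaussian_log_prob(a, mu, sigma):
    """Log of the Gaussian density N(a | mu, sigma)."""
    return -0.5 * ((a - mu) / sigma) ** 2 - np.log(sigma) - 0.5 * np.log(2 * np.pi)

rng = np.random.default_rng(0)  # seeded only so the run is repeatable

# Placeholder policy outputs (assumed values, not from a trained network)
mu, sigma = 0.5, 0.8

# Exploration: the action is a random draw centered on mu
action = rng.normal(mu, sigma)

# Log-probability of the sampled action, the quantity REINFORCE differentiates
log_prob = gaussian_log_prob(action, mu, sigma)
print(f"action = {action:.3f}, log-prob = {log_prob:.3f}")
```

A larger `sigma` spreads the sampled actions further from `mu`, which means more exploration.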
3. The Formula¶
The probability density at a value $x$ under a Gaussian with mean $\mu$ and standard deviation $\sigma$ is:
$\mathcal{N}(x \mid \mu, \sigma) = \frac{1}{\sqrt{2\pi}\sigma} \exp\left( -\frac{1}{2} \left( \frac{x - \mu}{\sigma} \right)^2 \right)$
Think of it like this:
- $\mu$ = where the bell curve is centered
- $\sigma$ = how wide or skinny it is
- The exponential makes the curve drop smoothly as you move away from the center.
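To see the formula in action, the sketch below evaluates it at two points and checks numerically that the area under the curve is 1. The function name `gaussian_pdf` and the grid used for the sum are illustrative choices, not something fixed by the notebook.

```python
import numpy as np

def gaussian_pdf(x, mu, sigma):
    """Gaussian density N(x | mu, sigma), written exactly as in the formula."""
    coef = 1.0 / (np.sqrt(2 * np.pi) * sigma)
    return coef * np.exp(-0.5 * ((x - mu) / sigma) ** 2)

# Density at the center and one standard deviation away (mu = 0, sigma = 1)
peak = gaussian_pdf(0.0, mu=0.0, sigma=1.0)
one_sigma = gaussian_pdf(1.0, mu=0.0, sigma=1.0)
print(f"density at the mean:      {peak:.4f}")       # 0.3989
print(f"density one sigma away:   {one_sigma:.4f}")  # 0.2420

# Simple Riemann sum over a wide grid: the total area should be close to 1
x = np.linspace(-10, 10, 20001)
dx = x[1] - x[0]
area = np.sum(gaussian_pdf(x, 0.0, 1.0)) * dx
print(f"area under the curve:     {area:.6f}")
```

Note that the values are densities, not probabilities: only areas under the curve are probabilities, which is why the density itself can exceed 1 when $\sigma$ is small.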
4. Let's interactively change $\mu$ and $\sigma$¶
We will create interactive sliders to change the bell curve shape.
Requires: ipywidgets. Install if needed:
!pip install ipywidgets
Interactive Gaussian Visualization¶
# ============================================================
# INTERACTIVE GAUSSIAN DEMO
# Extremely child-friendly explanation + visualization
# ============================================================
import numpy as np
import matplotlib.pyplot as plt
from ipywidgets import interact, FloatSlider
from IPython.display import display
# Nice plot style
plt.style.use("seaborn-v0_8")
# ------------------------------------------------------------
# Function to draw a Gaussian curve
# ------------------------------------------------------------
def plot_gaussian(mu, sigma):
    # Make 400 points from -10 to 10
    x = np.linspace(-10, 10, 400)
    # Gaussian formula
    coef = 1.0 / (np.sqrt(2 * np.pi) * sigma)
    exponent = np.exp(-0.5 * ((x - mu) / sigma) ** 2)
    y = coef * exponent
    plt.figure(figsize=(8, 4))
    plt.plot(x, y, linewidth=3)
    plt.title(f"Gaussian Density (μ = {mu}, σ = {sigma})", fontsize=16)
    plt.xlabel("x")
    plt.ylabel("Probability Density")
    plt.ylim(0, max(0.5, np.max(y) * 1.1))
    # Vertical line at mean
    plt.axvline(mu, color="red", linestyle="--", label="Mean (center)")
    plt.legend()
    plt.grid(True)
    plt.show()
# ------------------------------------------------------------
# Interactive sliders
# ------------------------------------------------------------
interact(
    plot_gaussian,
    mu=FloatSlider(value=0, min=-5, max=5, step=0.1, description="μ (center)"),
    sigma=FloatSlider(value=1.0, min=0.2, max=5, step=0.1, description="σ (spread)")
);
5. Explanation of parameter effects¶
$\mu$ (mean): Moves the whole bell curve left or right. (Like moving the center of a flashlight beam.)
$\sigma$ (standard deviation): Controls the width. (Like zooming in and out on a hill.)
- Small $\sigma\implies$ skinny + tall
- Large $\sigma\implies$ wide + flat
The interactive sliders show this clearly.
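The tall-versus-flat trade-off can also be read off the formula: at $x = \mu$ the exponential equals 1, so the height of the peak is $\frac{1}{\sqrt{2\pi}\sigma}$. The short sketch below prints that height for a few values of $\sigma$; the helper name `peak_height` is hypothetical and used only for this illustration.

```python
import numpy as np

def peak_height(sigma):
    """Density at x = mu (the top of the bell): 1 / (sqrt(2*pi) * sigma)."""
    return 1.0 / (np.sqrt(2 * np.pi) * sigma)

# Smaller sigma gives a taller peak, larger sigma a flatter one
for sigma in (0.5, 1.0, 2.0):
    print(f"sigma = {sigma}: peak height = {peak_height(sigma):.4f}")
```

Halving $\sigma$ doubles the peak height. The total area under the curve stays at 1, so a narrower bell has to be taller.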
6. Interactive Sampling Demo¶
# ============================================================
# GAUSSIAN SAMPLING DEMO
# ============================================================
import numpy as np
import matplotlib.pyplot as plt
from ipywidgets import interact, FloatSlider, IntSlider
plt.style.use("seaborn-v0_8")
def sample_gaussian(mu, sigma, n):
    # Draw n random values from a Gaussian with the chosen mean and spread
    samples = np.random.normal(mu, sigma, n)
    plt.figure(figsize=(8, 4))
    # density=True rescales the bars so that their total area is 1
    plt.hist(samples, bins=20, density=True, alpha=0.6, color="orange")
    plt.title(f"Samples from Gaussian (μ = {mu}, σ = {sigma})", fontsize=16)
    plt.xlabel("Value")
    plt.ylabel("Density")
    plt.grid(True)
    plt.show()
interact(
    sample_gaussian,
    mu=FloatSlider(value=0, min=-5, max=5, step=0.1),
    sigma=FloatSlider(value=1, min=0.2, max=5, step=0.1),
    n=IntSlider(value=200, min=50, max=2000, step=50, description="Samples")
);
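A non-interactive way to check what the histogram suggests is to compare the sample mean and sample standard deviation with the true $\mu$ and $\sigma$ as the number of samples grows. The sketch below is one possible check; the specific values of `mu`, `sigma`, the sample sizes, and the `estimates` dictionary are arbitrary choices made for illustration.

```python
import numpy as np

rng = np.random.default_rng(42)  # seeded only so the run is repeatable

mu, sigma = 2.0, 1.5             # arbitrary example parameters
estimates = {}                   # n -> (sample mean, sample std)

for n in (50, 2000, 100_000):
    samples = rng.normal(mu, sigma, n)
    estimates[n] = (samples.mean(), samples.std())
    print(f"n = {n:>6}: mean = {estimates[n][0]:.3f}, std = {estimates[n][1]:.3f}")
```

With more samples, the estimates should settle closer to the true parameters, just as the histogram settles into a bell shape.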
7. Sampling animation¶
# ============================================================
# GAUSSIAN SAMPLING ANIMATION (GIF)
# Samples "falling" onto a histogram frame-by-frame
# ============================================================
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.animation import FuncAnimation, PillowWriter
# Parameters
mu = 0 # center
sigma = 1 # spread
num_samples = 4000 # total points to animate
frames = 100 # how many animation frames
gif_name = "gaussian_falling_samples.gif"
# Generate all samples in advance
samples = np.random.normal(mu, sigma, num_samples)
# Prepare figure
fig, ax = plt.subplots(figsize=(8, 4))
ax.set_xlim(-5, 5)
ax.set_ylim(0, 0.5)
ax.set_title(f"Gaussian Sampling Animation (μ={mu}, σ={sigma})")
ax.set_xlabel("Value")
ax.set_ylabel("Density")
# Draw the true Gaussian density curve
x = np.linspace(-5, 5, 400)
true_pdf = (1/(np.sqrt(2*np.pi)*sigma)) * np.exp(-0.5 * ((x - mu)/sigma)**2)
ax.plot(x, true_pdf, linewidth=2, color="blue", label="True Density")
ax.legend()
# Fixed bin edges: 30 equal-width bins on [-5, 5]
bin_edges = np.linspace(-5, 5, 31)
# Histogram bars, created once with zero height and updated every frame
bars = ax.bar(bin_edges[:-1], np.zeros(30), width=np.diff(bin_edges),
              align="edge", alpha=0.6, color="orange")
# Animation function
def update(frame_index):
    # Reveal a growing prefix of the samples each frame
    chunk = samples[: int((frame_index + 1) * num_samples / frames)]
    # Compute the density-normalized histogram on the fixed bins
    counts, _ = np.histogram(chunk, bins=bin_edges, density=True)
    # Update bar heights in place instead of stacking a new histogram per frame
    for bar, height in zip(bars, counts):
        bar.set_height(height)
    ax.set_title(f"Gaussian Sampling (Frame {frame_index+1}/{frames})")
    return bars
# Create animation
anim = FuncAnimation(fig, update, frames=frames, interval=80)
# Save as GIF
anim.save(gif_name, writer=PillowWriter(fps=20))
plt.close()
print("GIF saved as:", gif_name)
GIF saved as: gaussian_falling_samples.gif

from IPython.display import Image
Image(filename='gaussian_falling_samples.gif')