{ "cells": [ { "cell_type": "markdown", "id": "bd46ea45-c580-4205-9374-37fc926b6c12", "metadata": {}, "source": [ "# CS-866 Deep Reinforcement Learning\n", "\n", "**Notebook: Gaussian Density**\n", "\n", "Instructor: Nazar Khan | Semester: Fall 2025\n", "
\n", "\n", "---\n", "\n", "**This notebook** provides an interactive introduction to the Gaussian density function." ] }, { "cell_type": "markdown", "id": "90972004-63cb-4fce-8f4b-742591f2e48c", "metadata": {}, "source": [ "## **1. What is a Gaussian Density?**\n", "\n", "A **Gaussian** (also called the *Normal distribution* or *bell curve*) is a smooth, hill-shaped curve.\n", "\n", "- Most values fall near the center\n", "- Fewer values appear far from it\n", "- The shape looks like a **bell**\n", "\n", "It appears throughout:\n", "\n", "- Machine learning\n", "- Reinforcement learning (policies!)\n", "- Statistics\n", "- Physics\n", "- Nature (heights, measurement errors, noise, etc.), hence the name **Normal**\n", "\n", "---" ] }, { "cell_type": "markdown", "id": "8204a639-a982-407d-b329-14cd2c0894a8", "metadata": {}, "source": [ "## **2. Why is Gaussian Density important in RL?**\n", "\n", "In REINFORCE with continuous actions:\n", "\n", "- The policy outputs a **mean** (`μ`) and a **standard deviation** (`σ`)\n", "- Actions are **sampled** from the resulting Gaussian, which gives the agent a built-in exploration mechanism\n", "- The log-probability of the sampled action is needed to compute the policy gradient\n", "\n", "So the Gaussian density is at the **heart of continuous-action RL**.\n", "\n", "---" ] }, { "cell_type": "markdown", "id": "e4a31203-1a73-466d-9a79-81e0503f0c52", "metadata": {}, "source": [ "## **3. 
The Formula**\n", "\n", "The probability density at a value `x` under a Gaussian with mean $\\mu$ and standard deviation $\\sigma$ is:\n", "\n", "$\\mathcal{N}(x \\mid \\mu, \\sigma)\n", "= \\frac{1}{\\sqrt{2\\pi}\\sigma}\n", "\\exp\\left(\n", "-\\frac{1}{2}\n", "\\left(\n", "\\frac{x - \\mu}{\\sigma}\n", "\\right)^2\n", "\\right)$\n", "\n", "**Think of it like this:**\n", "\n", "* $\\mu$ = where the bell curve is centered\n", "* $\\sigma$ = how wide or narrow the curve is\n", "* The exponential makes the curve drop smoothly as you move away from the center.\n", "* The factor $\\frac{1}{\\sqrt{2\\pi}\\sigma}$ scales the curve so that the total area under it is 1.\n", "\n", "---" ] }, { "cell_type": "markdown", "id": "ff4ebfd0-8d19-48a0-a365-27ca6e4e6aab", "metadata": {}, "source": [ "## **4. Let's interactively change $\\mu$ and $\\sigma$**\n", "\n", "We will create **interactive sliders** to change the bell curve shape.\n", "\n", "Requires: `ipywidgets`. Install if needed:" ] }, { "cell_type": "code", "execution_count": 1, "id": "5d35d5ef-a090-42f5-a9ef-1213e9a08178", "metadata": {}, "outputs": [], "source": [ "#!pip install ipywidgets" ] }, { "cell_type": "markdown", "id": "8fd322fd-c19f-402a-84c0-de670ce4e3ea", "metadata": {}, "source": [ "---" ] }, { "cell_type": "markdown", "id": "b8c74fdd-7a75-457c-bf3a-2f1c706a0a85", "metadata": {}, "source": [ "### **Interactive Gaussian Visualization**" ] }, { "cell_type": "code", "execution_count": 2, "id": "aa6483f7-e66f-4ce4-a3dd-774edfc733cf", "metadata": {}, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "abae2f9770fe4dd3b3bcb316a2a34653", "version_major": 2, "version_minor": 0 }, "text/plain": [ "interactive(children=(FloatSlider(value=0.0, description='μ (center)', max=5.0, min=-5.0), FloatSlider(value=1…" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# ============================================================\n", "# INTERACTIVE GAUSSIAN DEMO\n", "# Extremely child-friendly explanation + visualization\n", "# ============================================================\n", 
"\n", "import numpy as np\n", "import matplotlib.pyplot as plt\n", "from ipywidgets import interact, FloatSlider\n", "\n", "# Nice plot style\n", "plt.style.use(\"seaborn-v0_8\")\n", "\n", "# ------------------------------------------------------------\n", "# Function to draw a Gaussian curve\n", "# ------------------------------------------------------------\n", "def plot_gaussian(mu, sigma):\n", "    # Make 400 points from -10 to 10\n", "    x = np.linspace(-10, 10, 400)\n", "\n", "    # Gaussian formula: normalizing coefficient times the exponential term\n", "    coef = 1.0 / (np.sqrt(2 * np.pi) * sigma)\n", "    exponent = np.exp(-0.5 * ((x - mu) / sigma) ** 2)\n", "    y = coef * exponent\n", "\n", "    plt.figure(figsize=(8, 4))\n", "    plt.plot(x, y, linewidth=3)\n", "    # Round the slider values so the title stays readable\n", "    plt.title(f\"Gaussian Density (μ = {mu:.1f}, σ = {sigma:.1f})\", fontsize=16)\n", "    plt.xlabel(\"x\")\n", "    plt.ylabel(\"Probability Density\")\n", "    plt.ylim(0, max(0.5, np.max(y) * 1.1))\n", "\n", "    # Vertical line at mean\n", "    plt.axvline(mu, color=\"red\", linestyle=\"--\", label=\"Mean (center)\")\n", "    plt.legend()\n", "    plt.grid(True)\n", "    plt.show()\n", "\n", "# ------------------------------------------------------------\n", "# Interactive sliders\n", "# ------------------------------------------------------------\n", "interact(\n", "    plot_gaussian,\n", "    mu=FloatSlider(value=0, min=-5, max=5, step=0.1, description=\"μ (center)\"),\n", "    sigma=FloatSlider(value=1.0, min=0.2, max=5, step=0.1, description=\"σ (spread)\")\n", ");" ] }, { "cell_type": "markdown", "id": "6c4fb049-cb4c-48ae-9730-71c3ea1b7ac8", "metadata": {}, "source": [ "---" ] }, { "cell_type": "markdown", "id": "905f2459-68c0-44a1-97c2-46b955f6a91b", "metadata": {}, "source": [ "## **5. 
Explanation of parameter effects**\n", "\n", "**$\\mu$ (mean):**\n", "Moves the whole bell curve left or right without changing its shape.\n", "(*Like moving the center of a flashlight beam.*)\n", "\n", "**$\\sigma$ (standard deviation):**\n", "Controls the width of the curve.\n", "\n", "* Small $\\sigma \\implies$ narrow and tall\n", "* Large $\\sigma \\implies$ wide and flat\n", "\n", "(*Like zooming in and out on a hill.*)\n", "\n", "Because the total area under the curve is always 1, a narrower curve must also be taller.\n", "\n", "The interactive sliders show this clearly.\n", "\n", "---" ] }, { "cell_type": "markdown", "id": "539c2ba1-4f49-41bc-871a-11937de067bc", "metadata": {}, "source": [ "## **6. Sampling From the Gaussian**\n", "\n", "Let’s draw random samples from the Gaussian and plot their histogram.\n", "\n", "---" ] }, { "cell_type": "markdown", "id": "839f8e14-c19f-4ae8-a0db-044735f564bb", "metadata": {}, "source": [ "### **Interactive Sampling Demo**" ] }, { "cell_type": "code", "execution_count": 3, "id": "80d1b67a-23ee-4797-b418-ae71b0afa749", "metadata": {}, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "5a6560a94a4a40369d1bf40bbe383e5c", "version_major": 2, "version_minor": 0 }, "text/plain": [ "interactive(children=(FloatSlider(value=0.0, description='mu', max=5.0, min=-5.0), FloatSlider(value=1.0, desc…" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# ============================================================\n", "# GAUSSIAN SAMPLING DEMO\n", "# ============================================================\n", "\n", "import numpy as np\n", "import matplotlib.pyplot as plt\n", "from ipywidgets import interact, FloatSlider, IntSlider\n", "\n", "plt.style.use(\"seaborn-v0_8\")\n", "\n", "def sample_gaussian(mu, sigma, n):\n", "    # Draw n random samples from a Gaussian with the chosen mean and spread\n", "    samples = np.random.normal(mu, sigma, n)\n", "\n", "    plt.figure(figsize=(8, 4))\n", "    # density=True rescales the bars so the histogram area is 1\n", "    plt.hist(samples, bins=20, density=True, alpha=0.6, color=\"orange\")\n", "    plt.title(f\"Samples from Gaussian (μ = {mu:.1f}, σ = {sigma:.1f})\", fontsize=16)\n", "    plt.xlabel(\"Value\")\n", "    plt.ylabel(\"Density\")\n", "    plt.grid(True)\n", "    
plt.show()\n", "\n", "interact(\n", "    sample_gaussian,\n", "    mu=FloatSlider(value=0, min=-5, max=5, step=0.1),\n", "    sigma=FloatSlider(value=1, min=0.2, max=5, step=0.1),\n", "    n=IntSlider(value=200, min=50, max=2000, step=50, description=\"Samples\")\n", ");" ] }, { "cell_type": "markdown", "id": "6578c6c0-114b-4e80-9939-6f6da5581e87", "metadata": {}, "source": [ "---" ] }, { "cell_type": "markdown", "id": "d551b679-4377-4086-83be-25110176c99e", "metadata": {}, "source": [ "## **7. Sampling animation**" ] }, { "cell_type": "code", "execution_count": null, "id": "23599ee6-a1a1-4b62-a622-7efe9870da6f", "metadata": {}, "outputs": [], "source": [ "# ============================================================\n", "# GAUSSIAN SAMPLING ANIMATION (GIF)\n", "# Samples \"falling\" onto a histogram frame-by-frame\n", "# ============================================================\n", "\n", "import numpy as np\n", "import matplotlib.pyplot as plt\n", "from matplotlib.animation import FuncAnimation, PillowWriter\n", "\n", "# Parameters\n", "mu = 0              # center\n", "sigma = 1           # spread\n", "num_samples = 4000  # total points to animate\n", "frames = 100        # how many animation frames\n", "gif_name = \"gaussian_falling_samples.gif\"\n", "\n", "# Generate all samples in advance\n", "samples = np.random.normal(mu, sigma, num_samples)\n", "\n", "# Prepare figure\n", "fig, ax = plt.subplots(figsize=(8, 4))\n", "ax.set_xlim(-5, 5)\n", "ax.set_ylim(0, 0.5)\n", "ax.set_title(f\"Gaussian Sampling Animation (μ={mu}, σ={sigma})\")\n", "ax.set_xlabel(\"Value\")\n", "ax.set_ylabel(\"Density\")\n", "\n", "# Draw the true Gaussian density curve\n", "x = np.linspace(-5, 5, 400)\n", "true_pdf = (1 / (np.sqrt(2 * np.pi) * sigma)) * np.exp(-0.5 * ((x - mu) / sigma) ** 2)\n", "ax.plot(x, true_pdf, linewidth=2, color=\"blue\", label=\"True Density\")\n", "# A fixed legend location avoids the slow loc=\"best\" search on every frame\n", "ax.legend(loc=\"upper right\")\n", "\n", "# Animation function\n", "def update(frame_index):\n", "    # Use every sample drawn up to and including this frame\n", "    chunk = samples[: int((frame_index + 1) * num_samples / frames)]\n", "\n", "    # Remove the previous frame's bars; ax.hist adds Rectangle patches to ax.patches\n", "    for patch in list(ax.patches):\n", "        patch.remove()\n", "\n", "    # Draw updated histogram\n", "    ax.hist(chunk, bins=30, range=(-5, 5), density=True, alpha=0.6, color=\"orange\")\n", "\n", "    ax.set_title(f\"Gaussian Sampling (Frame {frame_index + 1}/{frames})\")\n", "\n", "# Create animation\n", "anim = FuncAnimation(fig, update, frames=frames, interval=80)\n", "\n", "# Save as GIF\n", "anim.save(gif_name, writer=PillowWriter(fps=20))\n", "\n", "plt.close(fig)\n", "\n", "print(\"GIF saved as:\", gif_name)" ] }, { "cell_type": "markdown", "id": "9b01cd69-8eea-4ea0-9f3c-a59d81cb0ebb", "metadata": {}, "source": [ "![Gaussian sampling animation](gaussian_falling_samples.gif)" ] }, { "cell_type": "code", "execution_count": null, "id": "2132b9bd-2ea7-4868-ab3e-83773435b55f", "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.13.0" } }, "nbformat": 4, "nbformat_minor": 5 }