<h1 style="text-align:center;">Gaussian Density Estimation Using Maximum Likelihood</h1>
<p style="text-align:center;">
Nazar Khan
<br>CVML Lab
<br>University of The Punjab
</p>

This is a Python tutorial on **Gaussian density estimation** that demonstrates the equivalence of maximizing likelihood with minimizing mean squared error (MSE). We solve a simple density estimation problem using maximum likelihood estimation (MLE).

---

#### **Objective:**
**Maximum Likelihood Estimation (MLE):** We will show that maximizing the likelihood is equivalent to minimizing the mean squared error (MSE).

We will work with a simple synthetic dataset generated from a normal distribution, and then use MLE for density estimation.

---

### **Step 1: Import Libraries**



In [None]:
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import norm

# For visualization
import seaborn as sns
sns.set(style="whitegrid")

---

### **Step 2: Generate Synthetic Data**

Let's generate some synthetic data from a normal distribution, which will be our target density to estimate.



In [None]:

# Generating synthetic data
#np.random.seed(42)
data = np.random.normal(loc=0, scale=1, size=10)

# Visualizing the data with a histogram
plt.figure(figsize=(8, 4))
sns.histplot(data, bins=20, kde=True, color="blue", stat="density")
plt.title("Histogram of the Synthetic Data")
plt.xlabel("Data Values")
plt.ylabel("Density")
plt.show()

---

### **Step 3: Maximum Likelihood Estimation (MLE)**

In MLE, we assume the data is generated from a normal distribution, and we want to estimate the parameters (mean $\mu$ and variance $\sigma^2$) by maximizing the likelihood function. 

The likelihood function is defined as:

$
L(\mu, \sigma^2) = \prod_{i=1}^{N} \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(x_i - \mu)^2}{2\sigma^2}\right)
$

Taking the logarithm gives the log-likelihood, which is easier to work with and has the same maximizer:

$
\log L(\mu, \sigma^2) = -\frac{N}{2} \log(2\pi) - \frac{N}{2} \log(\sigma^2) - \frac{1}{2\sigma^2} \sum_{i=1}^{N} (x_i - \mu)^2
$
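
As a quick sanity check, the closed-form log-likelihood above can be compared with the sum of per-sample log-densities from `scipy.stats.norm.logpdf`. This is only a minimal sketch: the helper `gaussian_log_likelihood`, the sample `x_check`, and the trial parameters `mu_trial` and `var_trial` are illustrative names and values, not part of the tutorial's dataset.

```python
import numpy as np
from scipy.stats import norm

def gaussian_log_likelihood(x, mu, var):
    """Closed-form Gaussian log-likelihood of samples x for mean mu and variance var."""
    n = x.size
    return (-0.5 * n * np.log(2 * np.pi)
            - 0.5 * n * np.log(var)
            - np.sum((x - mu) ** 2) / (2 * var))

# Illustrative sample and trial parameters (assumptions, chosen arbitrarily)
rng = np.random.default_rng(0)
x_check = rng.normal(loc=0.0, scale=1.0, size=10)
mu_trial, var_trial = 0.3, 1.5

closed_form = gaussian_log_likelihood(x_check, mu_trial, var_trial)

# norm.logpdf takes the standard deviation as `scale`, so pass sqrt(var)
reference = np.sum(norm.logpdf(x_check, loc=mu_trial, scale=np.sqrt(var_trial)))

# The two values should agree up to floating-point error
print(closed_form, reference)
```

Working with variance in the formula and standard deviation in SciPy is a common source of mistakes, which is why the conversion is made explicit here.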

The maximum likelihood estimates (MLE) for the mean and variance are:

$
\hat{\mu}_{MLE} = \frac{1}{N} \sum_{i=1}^{N} x_i
$

$
\hat{\sigma}^2_{MLE} = \frac{1}{N} \sum_{i=1}^{N} (x_i - \hat{\mu}_{MLE})^2
$
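
These estimates follow from setting the partial derivatives of the log-likelihood to zero. A short derivation sketch, using the same symbols as above:

$
\frac{\partial \log L}{\partial \mu} = \frac{1}{\sigma^2} \sum_{i=1}^{N} (x_i - \mu) = 0 \;\Rightarrow\; \hat{\mu}_{MLE} = \frac{1}{N} \sum_{i=1}^{N} x_i
$

$
\frac{\partial \log L}{\partial \sigma^2} = -\frac{N}{2\sigma^2} + \frac{1}{2\sigma^4} \sum_{i=1}^{N} (x_i - \mu)^2 = 0 \;\Rightarrow\; \hat{\sigma}^2_{MLE} = \frac{1}{N} \sum_{i=1}^{N} (x_i - \hat{\mu}_{MLE})^2
$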

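For a fixed $\sigma^2$, the first two terms of the log-likelihood do not depend on $\mu$, so maximizing the log-likelihood with respect to $\mu$ is equivalent to minimizing the **mean squared error (MSE)**: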

$
\text{MSE} = \frac{1}{N} \sum_{i=1}^{N} (x_i - \mu)^2
$
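
The equivalence can also be checked numerically. The sketch below evaluates the log-likelihood and the MSE over a grid of candidate means and confirms that both pick the same candidate. The sample `x_demo`, the fixed variance `var_fixed`, and the grid `mu_grid` are assumptions made for illustration only.

```python
import numpy as np

# Illustrative sample (an assumption; any 1-D array of observations works)
rng = np.random.default_rng(1)
x_demo = rng.normal(loc=0.0, scale=1.0, size=10)
var_fixed = 1.0  # variance held fixed while mu varies

# Candidate means on a grid with spacing 0.001
mu_grid = np.linspace(-2.0, 2.0, 4001)

# Squared residuals for every (candidate, sample) pair via broadcasting
sq_resid = (x_demo[None, :] - mu_grid[:, None]) ** 2

# MSE and log-likelihood for each candidate mean
mse = sq_resid.mean(axis=1)
n = x_demo.size
log_lik = (-0.5 * n * np.log(2 * np.pi * var_fixed)
           - sq_resid.sum(axis=1) / (2 * var_fixed))

mu_best_ll = mu_grid[np.argmax(log_lik)]   # maximizer of the log-likelihood
mu_best_mse = mu_grid[np.argmin(mse)]      # minimizer of the MSE

# Both should coincide and lie within one grid step of the sample mean
print(mu_best_ll, mu_best_mse, x_demo.mean())
```

Changing `var_fixed` rescales the log-likelihood curve but does not move its peak, which is the point of the equivalence.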

#### **MLE Solution:**

In [None]:
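# MLE estimation for the mean and variance
mu_mle = np.mean(data)
# np.var divides by N by default (ddof=0), which matches the MLE formula.
# Note: despite its name, sigma_mle stores the variance, not the standard deviation.
sigma_mle = np.var(data)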

# Displaying the results
print(f"MLE Estimate for Mean: {mu_mle}")
print(f"MLE Estimate for Variance: {sigma_mle}")
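
As an optional cross-check, the closed-form estimates can be compared with `scipy.stats.norm.fit`, which returns maximum likelihood estimates of the location and scale. The snippet is a sketch that draws its own sample (`sample`, an assumption) so that it runs independently of the cells above.

```python
import numpy as np
from scipy.stats import norm

# Illustrative sample standing in for `data` (an assumption, so the snippet runs on its own)
rng = np.random.default_rng(2)
sample = rng.normal(loc=0.0, scale=1.0, size=10)

# Closed-form ML estimates
mu_hat = np.mean(sample)
var_hat = np.var(sample)  # ddof=0 by default, i.e. division by N

# norm.fit returns (loc, scale); scale is a standard deviation, not a variance
loc_fit, scale_fit = norm.fit(sample)

print(mu_hat, loc_fit)          # mean estimates
print(var_hat, scale_fit ** 2)  # variance estimates
```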

---

### **Step 5: Visualize the Results**

Let's compare the MLE density estimate with the true distribution.

In [None]:
# Generating the density estimates
x_values = np.linspace(-3, 3, 100)
mle_density = norm.pdf(x_values, loc=mu_mle, scale=np.sqrt(sigma_mle))
true_density = norm.pdf(x_values, loc=0, scale=1)  # True distribution

# Plotting the densities
plt.figure(figsize=(10, 6))
plt.plot(x_values, true_density, label="True Density", color="black", linestyle="--")
plt.plot(x_values, mle_density, label="MLE Estimate", color="blue")
plt.title("Density Estimation: MLE vs True Density")
plt.xlabel("x")
plt.ylabel("Density")
plt.legend()
plt.show()


---

### **Step 6: Conclusion**

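- **MLE**: Maximizes the likelihood; for the mean, this is equivalent to minimizing the MSE. The resulting Gaussian fits the observed data well, but with few samples (here $N = 10$) the estimates can deviate noticeably from the true parameters, and $\hat{\sigma}^2_{MLE}$ underestimates the true variance on average by a factor of $(N-1)/N$.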
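
The effect of sparse data can be illustrated with a small simulation. The sketch below repeatedly draws datasets from $N(0, 1)$ and looks at how the ML estimates behave for a small and a large sample size. The helper `ml_estimates`, the number of repetitions `n_trials`, and the sample sizes are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(3)
n_trials = 20000  # number of simulated datasets per sample size

def ml_estimates(n_samples):
    """Draw n_trials datasets of size n_samples from N(0, 1); return ML estimates per dataset."""
    draws = rng.normal(loc=0.0, scale=1.0, size=(n_trials, n_samples))
    # var uses ddof=0 by default, i.e. division by N (the ML estimator)
    return draws.mean(axis=1), draws.var(axis=1)

for n_samples in (10, 1000):
    mu_hats, var_hats = ml_estimates(n_samples)
    print(f"N={n_samples}: spread of mean estimates={mu_hats.std():.3f}, "
          f"average variance estimate={var_hats.mean():.3f}, "
          f"(N-1)/N={(n_samples - 1) / n_samples:.3f}")
```

With the smaller sample size, the mean estimates scatter more widely and the average variance estimate sits below the true value of 1, close to $(N-1)/N$.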