\n",
"## Click here for text recap of video\n",
"\n",
"Finding a good model can be difficult. One of the most important concepts to keep in mind when modeling is the **bias-variance tradeoff**.\n",
"\n",
"**Bias** is the difference between the model's predictions and the true output values you are trying to predict. Models with high bias do not fit the training data well, since their predictions are quite different from the true data. These high-bias models are overly simplified - they do not have enough parameters and complexity to accurately capture the patterns in the data and are thus **underfitting**.\n",
"\n",
"\n",
"**Variance** refers to the variability of model predictions for a given input. Essentially, do the model predictions change a lot with changes in the exact training data used? Models with high variance are highly dependent on the exact training data used - they will not generalize well to test data. These high variance models are **overfitting** to the data.\n",
"\n",
"In essence:\n",
"\n",
"* High bias, low variance models have high train and test error.\n",
"* Low bias, high variance models have low train error and high test error.\n",
"* Low bias, low variance models have low train and test error.\n",
"\n",
"\n",
"As we can see from this list, we ideally want low bias and low variance models! These goals can be in conflict though - models with enough complexity to have low bias also tend to overfit and depend on the training data more. We need to decide on the correct tradeoff.\n",
"\n",
"In this section, we will see the bias-variance tradeoff in action with polynomial regression models of different orders.\n",
"\n",
"\n",
"Graphical illustration of bias and variance.\n",
"(Source: http://scott.fortmann-roe.com/docs/BiasVariance.html)\n",
"\n",
"![bias-variance](https://www.cs.cornell.edu/courses/cs4780/2018fa/lectures/images/bias_variance/bullseye.png)"
]
},
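{
"cell_type": "markdown",
"metadata": {
"execution": {}
},
"source": [
"To make the two definitions above concrete, here is a minimal sketch that refits a polynomial on many freshly simulated training sets and looks at the predictions at one fixed input. The quadratic ground truth, the noise level, the input range, and the helper names `true_fn` and `bias_variance_at` are all assumptions made for this illustration, and `np.polyfit` stands in for the fitting code used elsewhere in this tutorial.\n",
"\n",
"```python\n",
"import numpy as np\n",
"\n",
"rng = np.random.default_rng(0)\n",
"\n",
"\n",
"def true_fn(x):\n",
"  # Assumed ground truth for this sketch: a quadratic\n",
"  return x**2 - x - 2\n",
"\n",
"\n",
"def bias_variance_at(x0, order, n_repeats=500, n_samples=30, noise_std=1.0):\n",
"  # Refit on many independent training sets and record the prediction at x0\n",
"  preds = np.empty(n_repeats)\n",
"  for i in range(n_repeats):\n",
"    x = rng.uniform(-2, 2.5, n_samples)\n",
"    y = true_fn(x) + noise_std * rng.standard_normal(n_samples)\n",
"    coeffs = np.polyfit(x, y, order)\n",
"    preds[i] = np.polyval(coeffs, x0)\n",
"  bias = preds.mean() - true_fn(x0)  # average prediction minus noiseless truth\n",
"  variance = preds.var()             # spread of predictions across training sets\n",
"  return bias, variance\n",
"\n",
"\n",
"for order in (0, 2, 5):\n",
"  bias, variance = bias_variance_at(x0=-1.5, order=order)\n",
"  print(f'order {order}: bias = {bias:+.2f}, variance = {variance:.2f}')\n",
"```\n",
"\n",
"Under these assumptions, the order 0 fit should show the largest bias in magnitude, while the order 5 fit should show a larger variance than the order 2 fit."
]
},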
{
"cell_type": "markdown",
"metadata": {
"execution": {}
},
"source": [
"We will first fit polynomial regression models of orders 0-5 on our simulated training data just as we did in Tutorial 4."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
" Execute this cell to estimate theta_hats\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"cellView": "form",
"execution": {},
"tags": [
"hide-input"
]
},
"outputs": [],
"source": [
"# @markdown Execute this cell to estimate theta_hats\n",
"max_order = 5\n",
"theta_hats = solve_poly_reg(x_train, y_train, max_order)"
]
},
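{
"cell_type": "markdown",
"metadata": {
"execution": {}
},
"source": [
"`solve_poly_reg` is defined earlier in the notebook and is not shown in this section. As a rough sketch of what a least-squares polynomial fit of this kind can look like, the block below builds a design matrix and solves for one weight vector per order. The names `design_matrix_sketch` and `fit_poly_sketch` and the toy data are hypothetical, and this is not necessarily how the course helpers are implemented.\n",
"\n",
"```python\n",
"import numpy as np\n",
"\n",
"\n",
"def design_matrix_sketch(x, order):\n",
"  # Columns are x**0, x**1, ..., x**order\n",
"  return np.vander(x, order + 1, increasing=True)\n",
"\n",
"\n",
"def fit_poly_sketch(x, y, max_order):\n",
"  # One least-squares weight vector per order, keyed by order\n",
"  theta_hats = {}\n",
"  for order in range(max_order + 1):\n",
"    X_design = design_matrix_sketch(x, order)\n",
"    theta_hats[order], *_ = np.linalg.lstsq(X_design, y, rcond=None)\n",
"  return theta_hats\n",
"\n",
"\n",
"# Toy data (made up for this sketch): an exact quadratic with no noise\n",
"x_demo = np.linspace(-2, 2, 20)\n",
"y_demo = 1 + 2 * x_demo + 3 * x_demo**2\n",
"theta_demo = fit_poly_sketch(x_demo, y_demo, max_order=2)\n",
"print(np.round(theta_demo[2], 3))\n",
"```\n",
"\n",
"Because the toy data are noiseless, the order 2 weights should come out close to the coefficients 1, 2 and 3 used to generate them. Keying the results by order matches how `theta_hats[order]` is indexed in `evaluate_poly_reg`."
]
},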
{
"cell_type": "markdown",
"metadata": {
"execution": {}
},
"source": [
"## Coding Exercise 2: Compute and compare train vs test error\n",
"\n",
"We will use MSE as our error metric again. Compute MSE on training data ($x_{train},y_{train}$) and test data ($x_{test}, y_{test}$) for each polynomial regression model (orders 0-5). Since you already developed code in T4 Exercise 4 for making design matrices and evaluating fit polynomials, we have ported that here into the functions `make_design_matrix` and `evaluate_poly_reg` for your use.\n",
"\n",
"*After completing the exercise, and before reading the text that follows, think about these questions: does the order 0 model have high or low bias? High or low variance? How about the order 5 model?*"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
" Execute this cell for function `evaluate_poly_reg`\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"cellView": "form",
"execution": {},
"tags": [
"hide-input"
]
},
"outputs": [],
"source": [
"# @markdown Execute this cell for function `evaluate_poly_reg`\n",
"\n",
"def evaluate_poly_reg(x, y, theta_hats, max_order):\n",
"  \"\"\" Evaluates MSE of polynomial regression models on data\n",
"\n",
"  Args:\n",
"    x (ndarray): input vector of shape (n_samples)\n",
"    y (ndarray): vector of measurements of shape (n_samples)\n",
"    theta_hats (dict): fitted weights for each polynomial model (dict key is order)\n",
"    max_order (scalar): max order of polynomial fit\n",
"\n",
"  Returns:\n",
"    (ndarray): mean squared error for each order, shape (max_order + 1)\n",
"  \"\"\"\n",
"\n",
"  # One MSE value per polynomial order, from 0 up to max_order\n",
"  mse = np.zeros((max_order + 1))\n",
"  for order in range(0, max_order + 1):\n",
"    X_design = make_design_matrix(x, order)\n",
"    y_hat = np.dot(X_design, theta_hats[order])\n",
"    residuals = y - y_hat\n",
"    mse[order] = np.mean(residuals ** 2)\n",
"\n",
"  return mse"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"execution": {}
},
"outputs": [],
"source": [
"def compute_mse(x_train, x_test, y_train, y_test, theta_hats, max_order):\n",
"  \"\"\"Compute MSE on training data and test data.\n",
"\n",
"  Args:\n",
"    x_train (ndarray): training data input vector of shape (n_samples)\n",
"    x_test (ndarray): test data input vector of shape (n_samples)\n",
"    y_train (ndarray): training vector of measurements of shape (n_samples)\n",
"    y_test (ndarray): test vector of measurements of shape (n_samples)\n",
"    theta_hats (dict): fitted weights for each polynomial model (dict key is order)\n",
"    max_order (scalar): max order of polynomial fit\n",
"\n",
"  Returns:\n",
"    ndarray, ndarray: MSE error on training data and test data for each order\n",
"  \"\"\"\n",
"\n",
"  #######################################################\n",
"  ## TODO for students: calculate mse error for both sets\n",
"  ## Hint: use the function evaluate_poly_reg defined above\n",
"  # Fill out function and remove\n",
"  raise NotImplementedError(\"Student exercise: calculate mse for train and test set\")\n",
"  #######################################################\n",
"\n",
"  mse_train = ...\n",
"  mse_test = ...\n",
"\n",
"  return mse_train, mse_test\n",
"\n",
"\n",
"# Compute train and test MSE\n",
"mse_train, mse_test = compute_mse(x_train, x_test, y_train, y_test, theta_hats, max_order)\n",
"\n",
"# Visualize\n",
"plot_MSE_poly_fits(mse_train, mse_test, max_order)"
]
},
{
"cell_type": "markdown",
"metadata": {
"cellView": "both",
"colab_type": "text",
"execution": {}
},
"source": [
"[*Click for solution*](https://github.com/NeuromatchAcademy/course-content/tree/main/tutorials/W1D2_ModelFitting/solutions/W1D2_Tutorial5_Solution_bb5f169f.py)\n",
"\n",
"*Example output:*\n",
"\n",
"\n",
"\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Submit your feedback\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"cellView": "form",
"execution": {},
"tags": [
"hide-input"
]
},
"outputs": [],
"source": [
"# @title Submit your feedback\n",
"content_review(f\"{feedback_prefix}_Compute_train_vs_test_error_Exercise\")"
]
},
{
"cell_type": "markdown",
"metadata": {
"execution": {}
},
"source": [
"As we can see from the plot above, more complex models (higher order polynomials) have lower MSE for training data. The overly simplified models (orders 0 and 1) have high MSE on the training data. As we add complexity to the model, we go from high bias to low bias.\n",
"\n",
"The MSE on test data follows a different pattern. The best test MSE is for an order 2 model - this makes sense as the data was generated with an order 2 model. Both simpler models and more complex models have higher test MSE.\n",
"\n",
"So to recap:\n",
"\n",
"- Order 0 model: high bias, low variance\n",
"- Order 5 model: low bias, high variance\n",
"- Order 2 model: just right, with low bias and low variance\n"
]
},
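{
"cell_type": "markdown",
"metadata": {
"execution": {}
},
"source": [
"The pattern described above can also be reproduced outside the notebook helpers. The sketch below assumes a quadratic generative model with unit-variance Gaussian noise and uses `np.polyfit` in place of `solve_poly_reg`; the function name `simulate`, the sample sizes, and the random seed are made up for this illustration.\n",
"\n",
"```python\n",
"import numpy as np\n",
"\n",
"rng = np.random.default_rng(1)\n",
"\n",
"\n",
"def simulate(n_samples):\n",
"  # Assumed generative model for this sketch: quadratic plus Gaussian noise\n",
"  x = rng.uniform(-2, 2.5, n_samples)\n",
"  y = x**2 - x - 2 + rng.standard_normal(n_samples)\n",
"  return x, y\n",
"\n",
"\n",
"x_tr, y_tr = simulate(50)  # data used for fitting\n",
"x_te, y_te = simulate(20)  # held-out data\n",
"\n",
"mse_tr, mse_te = [], []\n",
"for order in range(6):\n",
"  coeffs = np.polyfit(x_tr, y_tr, order)\n",
"  mse_tr.append(np.mean((y_tr - np.polyval(coeffs, x_tr)) ** 2))\n",
"  mse_te.append(np.mean((y_te - np.polyval(coeffs, x_te)) ** 2))\n",
"  print(f'order {order}: train MSE = {mse_tr[-1]:.2f}, test MSE = {mse_te[-1]:.2f}')\n",
"```\n",
"\n",
"Train MSE cannot increase as the order grows, because each higher-order model contains the lower-order ones as special cases. Test MSE has no such guarantee, and which order gives the lowest test MSE can vary from one random draw to the next."
]
},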
{
"cell_type": "markdown",
"metadata": {
"execution": {}
},
"source": [
"---\n",
"# Summary\n",
"\n",
"*Estimated timing of tutorial: 25 minutes*\n",
"\n",
"- Training data is the data used for fitting, test data is held-out data.\n",
"- We need to strike the right balance between bias and variance. Ideally, we want to find a model with optimal complexity that has both low bias and low variance.\n",
"  - Overly complex models have low bias and high variance.\n",
"  - Overly simple models have high bias and low variance."
]
},
{
"cell_type": "markdown",
"metadata": {
"execution": {}
},
"source": [
"**Note**\n",
" - Bias and variance are very important concepts in modern machine learning, but it has recently been observed that they do not necessarily trade off (see, for example, the phenomenon and theory of \"double descent\").\n",
"\n",
"**Further readings:**\n",
"- [The Elements of Statistical Learning](https://web.stanford.edu/~hastie/ElemStatLearn/) by Hastie, Tibshirani and Friedman"
]
},
{
"cell_type": "markdown",
"metadata": {
"execution": {}
},
"source": [
"---\n",
"# Bonus\n",
"\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"execution": {}
},
"source": [
"## Bonus Exercise: Proof of bias-variance decomposition\n",
"\n",
"Prove the bias-variance decomposition for MSE:\n",
"\n",
"\\begin{equation}\n",
"\\mathbb{E}_{x}\\left[\\left(y-\\hat{y}(x ; \\theta)\\right)^{2}\\right]=\\left(\\operatorname{Bias}_{x}[\\hat{y}(x ; \\theta)]\\right)^{2}+\\operatorname{Var}_{x}[\\hat{y}(x ; \\theta)]+\\sigma^{2}\n",
"\\end{equation}\n",
"\n",
"where\n",
"\n",
"\\begin{equation}\n",
"\\operatorname{Bias}_{x}[\\hat{y}(x ; \\theta)]=\\mathbb{E}_{x}[\\hat{y}(x ; \\theta)]-y\n",
"\\end{equation}\n",
"\n",
"and\n",
"\n",
"\\begin{equation}\n",
"\\operatorname{Var}_{x}[\\hat{y}(x ; \\theta)]=\\mathbb{E}_{x}\\left[\\hat{y}(x ; \\theta)^{2}\\right]-\\mathbb{E}_{x}[\\hat{y}(x ; \\theta)]^{2}\n",
"\\end{equation}\n",
"\n",
"\n",
"\n",
"## Click here for a hint\n",
"\n",
"Use the equation:\n",
"\n",
"\\begin{equation}\n",
"\\operatorname{Var}[X]=\\mathbb{E}\\left[X^{2}\\right]-(\\mathbb{E}[X])^{2}\n",
"\\end{equation}"
]
},
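{
"cell_type": "markdown",
"metadata": {
"execution": {}
},
"source": [
"A numerical check is not a proof, but it can help confirm that the three terms add up as claimed. The sketch below uses a deliberately biased estimator (a shrunk sample mean) of a constant, which is an assumption chosen for simplicity rather than the polynomial models from this tutorial. Here the expectation is taken over training sets and noise, bias is measured against the noiseless value `mu`, and all numeric values are made up.\n",
"\n",
"```python\n",
"import numpy as np\n",
"\n",
"rng = np.random.default_rng(2)\n",
"\n",
"# All values below are made up for this sketch\n",
"mu, sigma, n, shrink = 3.0, 2.0, 10, 0.8\n",
"n_repeats = 200_000\n",
"\n",
"# Each row is one training set of n noisy observations of mu\n",
"train = mu + sigma * rng.standard_normal((n_repeats, n))\n",
"y_hat = shrink * train.mean(axis=1)                   # deliberately biased estimator\n",
"y_new = mu + sigma * rng.standard_normal(n_repeats)   # fresh noisy targets\n",
"\n",
"mse = np.mean((y_new - y_hat) ** 2)\n",
"bias = y_hat.mean() - mu\n",
"variance = y_hat.var()\n",
"\n",
"print(f'MSE                    = {mse:.3f}')\n",
"print(f'bias^2 + var + sigma^2 = {bias**2 + variance + sigma**2:.3f}')\n",
"```\n",
"\n",
"For this estimator the terms can also be worked out by hand: the bias is `(shrink - 1) * mu`, the variance is `shrink**2 * sigma**2 / n`, and the noise term is `sigma**2`. The two printed values should agree up to Monte Carlo error."
]
},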
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Submit your feedback\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"cellView": "form",
"execution": {},
"tags": [
"hide-input"
]
},
"outputs": [],
"source": [
"# @title Submit your feedback\n",
"content_review(f\"{feedback_prefix}_Proof_bias_variance_for_MSE_Bonus_Exercise\")"
]
}
],
"metadata": {
"colab": {
"collapsed_sections": [],
"include_colab_link": true,
"name": "W1D2_Tutorial5",
"provenance": [],
"toc_visible": true
},
"kernel": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.17"
},
"toc-autonumbering": true
},
"nbformat": 4,
"nbformat_minor": 0
}