
https://colab.research.google.com/drive/1Gdhg31uDsLyZsbpy5FXtvx925ogxGATr

Answered Same Day Nov 23, 2021

Solution

Kshitij answered on Nov 27 2021
{
"nbformat": 4,
"nbformat_minor": 0,
"metadata": {
"colab": {
"name": "Copy of hw4_yourName.ipynb",
"provenance": [],
"collapsed_sections": []
},
"kernelspec": {
"name": "python3",
"display_name": "Python 3"
}
},
"cells": [
{
"cell_type": "markdown",
"metadata": {
"id": "eKIz_ttKiG_1",
"colab_type": "text"
},
"source": [
"# IFT6269 - Homework 4 - Hidden Markov Models\n",
"**Due:** Tuesday, November 26, 2019"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "whO8SD53Y9Vl",
"colab_type": "text"
},
"source": [
"#### Name:\n",
"#### Student ID: \n",
"#### Collaborators: \n",
"\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "02fg3bxiZOMv",
"colab_type": "text"
},
"source": [
"## Introduction\n",
"\n",
"The file `EMGaussian.train` contains samples of data $\\{x_t\\}_{t=1}^T$ where $x_t \\in \\math
{R}^2$, with one datapoint per row. `EMGaussian.test` is structured similarly. This is the same data we used in Homework 3, but this time we use an HMM model to account for the possible temporal structure of the data. This means that we now consider each row of the dataset to be a point $x_t \\in \\math
{R}^2$ co
esponding to some temporal process, rather than thinking of them as *independent* samples as we did in the last homework. \n",
"\n",
"We consider the following HMM model: the chain $(z_t)_{t=1}^T$ has $K=4$ possible states, with an initial probability distribution $\\pi\\in\\Delta_4$ and a probability transition matrix $A \\in \\math
{R}^{4 \\times 4}$ where $A_{ij} = p(z_t=i | z_{t-1} = j),$ and conditionally on the cu
ent state $z_t$, we have observations obtained from Gaussian emission probabilities $x_t| (z_t=k) \\sim \\mathcal{N}(x_t | \\mu_k, \\Sigma_k)$. This is thus a generalization of a GMM since we now allow for time dependencie across the latent states $z_t$.\n",
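"\n",
"*Added illustration (not part of the original assignment):* a minimal sketch of how one could sample a sequence from this generative model, assuming hypothetical parameters `pi` of length $K$, `A` with $A_{ij} = p(z_t = i \\mid z_{t-1} = j)$ (so each *column* of `A` sums to one), and per-state `mus`, `sigmas`:\n",
"\n",
"```python\n",
"import numpy as np\n",
"\n",
"def sample_hmm(pi, A, mus, sigmas, T, rng=None):\n",
"    # pi: (K,) initial distribution; A: (K, K) with A[i, j] = p(z_t = i | z_{t-1} = j)\n",
"    # mus: (K, d) emission means; sigmas: (K, d, d) emission covariances\n",
"    rng = np.random.default_rng() if rng is None else rng\n",
"    K, d = mus.shape\n",
"    z = np.empty(T, dtype=int)\n",
"    x = np.empty((T, d))\n",
"    z[0] = rng.choice(K, p=pi)\n",
"    x[0] = rng.multivariate_normal(mus[z[0]], sigmas[z[0]])\n",
"    for t in range(1, T):\n",
"        z[t] = rng.choice(K, p=A[:, z[t - 1]])  # column z[t-1] holds p(z_t = . | z_{t-1})\n",
"        x[t] = rng.multivariate_normal(mus[z[t]], sigmas[z[t]])\n",
"    return z, x\n",
"```\n",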
"\n",
"This exercise has several implementation objectives:\n",
"* **Sum-product**: probabilistic inference on the HMM\n",
"* **Expectation-Maximization**: parameter estimation\n",
"* **Vite
i**: decoding.\n",
"\n",
"**Note:** You may use the (*possibly co
ected*) code you created for the previous assignment. Furthermore, notice there are some math questions in this notebook: do not forget to solve them! \n",
"\n",
"### Tasks\n",
"0. Get your own copy of this file via \"File > Save a copy in Drive...\",\n",
"1. Fill your personal information and collaborators at the top of this assignment, and rename the notebook accordingly, e.g., `hw4_thomasBayes.ipynb`\n",
"2. Read the instructions provided on each section and cell carefully,\n",
"4. Complete the exercises in the sections **Sum-product**, **Expectation-Maximization**, **Vite
i**, **Comparing methods** and **What about K?**.\n",
" \n",
"**Important**: You are allowed to collaborate with other students in both the math and coding parts of this assignment. However, the answers provided here must reflect your individual work. For that reason, you are not allowed to share this notebook, except for your submission to the TA for grading. **Don't forget to pin and save the version of the notebook you want to be graded on!**"
]
},
{
"cell_type": "code",
"metadata": {
"id": "S3gk4JHNY9yW",
"colab_type": "code",
"outputId": "85d33c40-d77d-46a2-c73b-d36f8d15e228",
"colab": {
"base_uri": "https:
localhost:8080/",
"height": 252
}
},
"source": [
"!wget http:
www.iro.umontreal.ca/~slacoste/teaching/ift6269/A19/notes/hwk3data.zip\n",
"!unzip hwk3data.zip\n",
"\n",
"import numpy as np\n",
"import matplotlib.pyplot as plt\n",
"plt.style.use('seaborn-white')\n",
"\n",
"X_train = np.loadtxt(\"/content/hwk3data/EMGaussian.train\")\n",
"X_test = np.loadtxt(\"/content/hwk3data/EMGaussian.test\")"
],
"execution_count": 0,
"outputs": [
{
"output_type": "stream",
"text": [
"--2019-11-26 06:01:46-- http:
www.iro.umontreal.ca/~slacoste/teaching/ift6269/A19/notes/hwk3data.zip\n",
"Resolving www.iro.umontreal.ca (www.iro.umontreal.ca)... 132.204.26.36\n",
"Connecting to www.iro.umontreal.ca (www.iro.umontreal.ca)|132.204.26.36|:80... connected.\n",
"HTTP request sent, awaiting response... 200 OK\n",
"Length: 7269 (7.1K) [application/zip]\n",
"Saving to: ‘hwk3data.zip’\n",
"\n",
"\rhwk3data.zip 0%[ ] 0 --.-KB/s \rhwk3data.zip 100%[===================>] 7.10K --.-KB/s in 0s \n",
"\n",
"2019-11-26 06:01:46 (585 MB/s) - ‘hwk3data.zip’ saved [7269/7269]\n",
"\n",
"Archive: hwk3data.zip\n",
" inflating: hwk3data/EMGaussian.test \n",
" inflating: hwk3data/EMGaussian.train \n"
],
"name": "stdout"
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "cBJGYsmug-ZD",
"colab_type": "text"
},
"source": [
"## Playground\n",
"\n",
"You are allowed to add as many cells and functions as you wish in this section, but not allowed to change the signature (name and inputs) of the functions we provided!"
]
},
{
"cell_type": "code",
"metadata": {
"id": "A2sgnsCmhG9p",
"colab_type": "code",
"colab": {}
},
"source": [
"# ---------------------------------------------------------------------------\n",
"# Code for plotting the results \n",
"# ! DO NOT MODIFY THESE FUNCTIONS !\n",
"# ---------------------------------------------------------------------------\n",
"\n",
"def plot_smoothing(gamma, K=4, time_limit=100):\n",
" plt.figure(figsize=(14, 2*K))\n",
" plt.suptitle('Smoothing probabilities $p(z_t|x_1, ..., x_T)$', fontsize=16)\n",
" for k in range(K):\n",
" plt.subplot(K, 1, 1+k)\n",
" plt.plot(range(1, time_limit+1), gamma[:time_limit, k] )\n",
" plt.ylabel(r'$p(z_t = ' + str(k+1) + ' | x_{1:T})$')\n",
" plt.ylim(0, 1)\n",
" plt.grid(True)\n",
" plt.xlabel('t')\n",
" plt.show()\n",
"\n",
"def plot_labelling(X, labels, mus, title=\"\"):\n",
" shapes = ['o', '*', 'v', '+'] \n",
" colors = [[31, 119, 180], [255, 127, 14], [44, 160, 44], [148, 103, 189],\n",
" [140, 86, 75], [227, 119, 194], [127, 127, 127], [188, 189, 34]]\n",
"\n",
" fig = plt.figure(figsize=(5, 5))\n",
" cs = [colors[int(_) % len(colors)] for _ in labels]\n",
" plt.scatter(X[:, 0], X[:, 1], c=np.a
ay(cs)/255.)\n",
" plt.scatter(mus[:, 0], mus[:, 1], marker='o', c='#d62728')\n",
" plt.xlim(-12, 12), plt.ylim(-12, 12)\n",
" plt.title(title, fontsize=16) \n",
" plt.show()\n",
"\n",
"def plot_dominoes(data):\n",
" # Pick max from data per timestep\n",
" data_maxhot = (data == data.max(axis=1, keepdims=True))\n",
" \n",
" fig, ax = plt.subplots()\n",
" fig.set_size_inches(12, 3)\n",
" ax.pcolor(1 - data_maxhot[:100,::-1].T, cmap=plt.cm.gray, alpha=0.6)\n",
" ax.set_yticks(np.arange(4) + 0.5, minor=False)\n",
" ax.set_yticklabels([4,3,2,1], minor=False)\n",
" plt.grid(True)\n",
" plt.tight_layout()\n",
" \n",
" plt.show()"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "oYffvc08munF",
"colab_type": "text"
},
"source": [
"## Sum-product [15 pts]"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "nFM6NpRKYvKr",
"colab_type": "text"
},
"source": [
"### $\\alpha$, $\\beta$ recursions\n",
"\n",
"Implement the $\\alpha$ and $\\beta$-recursions seen in class (and that can be found in chapter 12 of Mike's book with slightly different notation). Recall that $\\alpha(z_t) := p(z_t, x_{1:t})$, $\\beta(z_t) := p(x_{(t+1):T} | z_t)$. Implement also a function to compute the emission probabilties $\\epsilon_k(x_t) := p(x_t|z_t=k) = \\mathcal{N}(x_t|\\mu_k, \\Sigma_k)$.\n",
"\n",
"For numerical stability reasons, you are expected to implement your algorithms using **log probabilities** unless noted explicitly!\n",
"\n"
]
},
{
"cell_type": "code",
"metadata": {
"id": "4njzQRaJa5f3",
"colab_type": "code",
"colab": {}
},
"source": [
"\n",
"def log_emission(X, mus, sigmas):\n",
" sigma_inverse = None\n",
" p_x = np.nda
ay(K)\n",
" \n",
" diff = X - mus\n",
" \n",
" if len(sigmas.shape) == 3: # general covariance matrix\n",
" \n",
" if sigma_inverse is None:\n",
" sigma_inverse = np.linalg.pinv(sigmas)\n",
" \n",
" for k in range(K):\n",
" p_x[k] = np.exp(-1/2 * np.matmul(diff[k].T, sigma_inverse[k].dot(diff[k])))\n",
" p_x[k] /= (2 * np.pi)**(d/2) * np.linalg.det(sigmas[k])**(1/2)\n",
" \n",
" elif len(sigmas.shape) == 1: # proportional to identity covariance matrix\n",
" \n",
" for k in range(K):\n",
" p_x[k] = np.exp(-1/2 * np.dot(diff[k], diff[k]) / sigmas[k])\n",
" p_x[k] /= (2 * np.pi * sigmas[k])**(d/2)\n",
" \n",
" else:\n",
" raise Exception('Covariance must be of dim 3 (general covariance) or dim 1 (proportional to identity)!')\n",
" \n",
" return p_x\n",
"\n",
"\n",
"\n",
"def log_alpha_recursion(X, A, log_eps , pi):\n",
" pi = π\n",
" T = X.shape[0]\n",
" α = np.nda
ay((T,K))\n",
...