Self-Organizing Maps (SOM)

Self-Organizing Maps (SOM) are one of the clustering techniques under the branch of unsupervised learning. It was invented by Teuvo Kohonen in 1982, and therefore, sometimes called as Kohonen map.

In this post, we introduce the concept of SOM and the how the algorithm works in Python using an intuitive example. By the end of the post, you will learn about (1) what is SOM, (2) how is it working, and (3) how to implement it in Python.

  • Introduction to Self-Organizing Maps (SOM)
  • Importing required packages
  • Creating training data and initializing SOM cells
  • SOM implementation
  • Results
  • Introduction to Self-Organizing Maps (SOM)

    SOM represents can be treated as a form of artificial neural network that constructs a map based on the examples of training dataset. This map typically takes the form of a 2D rectangular grid of weights, although it can also be expanded to include 3D or higher-dimensional models.

    Each cell in the map can be thought as n-dimentional vector containing learnable weights, where n is also the same as the dimension of an example or the number of features in the example. For example, a training example $x_1$ and a cell $w_{ij}$ of SOM: $$x_1=\{f_{11}, \cdots, f_{1n}\}$$ $$w_{ij}=\{f_{ij1}, \cdots, f_{ijn}\}$$

    Initialization

    For the simplicity, we use the 2D case as our example. To get started, we need to pre-define the shape of SOM - axb - where a and b indicate the number of rows and columns of SOM respectively. In other words, we have axb cells and each cell can be thought as a cluster represented as a n-dimentional vector or weights. As one might expect, we need to initialize those weights. This can be done with a random initialization strategy or using training examples.

    Training

    Those initalized weights of SOM need to be trained/updated with training examples based on predefined number of epochs as we do in other machine learning methods. One big difference here is we don't define a loss function and use backpropagation to update the weights of SOM. Instead, SOM uses Competitive Learning which is a form of unsupervised learning, where constituent elements compete to produce a satisfying result, and only one gets to win the competition. 

    More specifically, for a given trainining instance x, we determine the best-matching cell/unit (refered as BMU) using a distance metric such as Euclidean distance (as both x and cell weight vector are n dimenitonal) who win the competition. Afterwards, we update the weights of the BMU as well as those of its neighboring cells, e.g., those on the left, right, above, and below. An update at time $t+1$ for the weight vector $w_{ij}$ of SOM can be described as: $$w_{ij}^{t+1} \leftarrow w_{ij}^{t} + \eta f(i,j,r_{t}) (x-w_{ij}^t)$$ where $\eta$ is the learning rate, (i,j) indicates the index of the cell in SOM, and $r_{t}$ refers to the radius to determine neighboring cells. $f(\cdot)$ is the neighborhood distance function. Simply put, a training example x pulls the weights of these cells towards it.

    To summarize, the training can be illustrated as below:
    • Initialize all grid weights of the SOM 
    • Repeat until convergence or the predefined maximum epochs 
      • Shuffle the training examples 
      • For each training instance x, we get the best matching unit BMU, and then update the weights of BMU and its neighboring cells
    One thing to note is that both the learning rate and the radius are decayed at each epoch during training. This means the weight updates and the number of neighboring cells is getting smaller in each epoch. In the following, we illustrate the concept of SOM in Python.

    Importing required packages

    First, let's import some required Python packages such as numpy and matplotlib.
    
    import numpy as np
    import matplotlib
    import matplotlib.pyplot as plt
    
    print(np.__version__)
    print(matplotlib.__version__)
    
    

    1.19.5
    3.3.4



    Creating training data and initializing SOM cells

    Here, we create a training data containing 3000 examples, and each example is 3-dimentional vector representing Red, Green, and Blue component value in the RGB color space. Then, we also create $10 \times 10$ SOM in which each cell is also 3-dimentional vector.
    
    m_som = 10
    n_som = 10
    n_x = 3000
    
    rand = np.random.RandomState(777)
    
    # Create training data
    x_train = rand.randint(
        low=0,
        high=255,
        size=(n_x, 3)
    )
    print(x_train.shape)
    
    # Initialize SOM cells
    som = rand.randint(
        low=0,
        high=255,
        size=(m_som, n_som, 3)
    )
    print(som.shape)
    
    

    (3000, 3)
    (10, 10, 3)

    I like this example as we can visualize it and see the generated training examples and the initialized SOM.

    
    fig, ax = plt.subplots(nrows=1, ncols=2, figsize=(12, 6))
    ax[0].imshow(x_train.reshape(50, 60, 3))
    ax[0].set_title('Training data')
    ax[1].imshow(som)
    ax[1].set_title('Initialized SOM')
    
    

    SOM implementation

    Now, we can implement two functions, getBMU() for retrieving the BMU of SOM and update_weights() to update the weights of cells for a given training example as we discussed before.

    Getting BMU cell

    
    def getBMU(x, som):
        """ Return cell index g,h of som clostest to x """
        square_distance = (np.square(x - som)).sum(axis=2)
        return np.unravel_index(
            np.argmin(square_distance, axis=None), 
            square_distance.shape
        )
    
    getBMU(x_train[0], som)
    
    

    (8, 1)

    Updating the weights of SOM cells

    
    def update_weights(
        som, 
        x, 
        lr, 
        radius_sq, 
        bmu, 
        step=3
    ):
        """
        Update the weights of the SOM cells with a training example
    
        :parameter som: SOM
        :parameter x: a training example
        :parameter lr: learing rate
        :parameter radius_sq: radius
        :parameter bmu: (g,h) coordinates of bmu
        :parameter step: to determine neighborhood
        """
        g, h = bmu
        #if radius is close to zero then only BMU is changed
        if radius_sq < 1e-3:
            som[g,h,:] += lr * (x - som[g,h,:])
            return som
        # Change all cells in a small neighborhood of BMU based on step 
        for i in range(max(0, g-step), min(som.shape[0], g+step)):
            for j in range(max(0, h-step), min(som.shape[1], h+step)):
                dist_sq = np.square(i - g) + np.square(j - h)
                dist_func = np.exp(-dist_sq / 2 / radius_sq)
                som[i,j,:] += lr * dist_func * (x - som[i,j,:])   
        return som  
    
    

    Training: Learning the weights of SOM cells

    Now we can start training SOM using those two implemented functions. Here, we train 10 epochs with a decay of 0.1 for both learning rate and radius.
    
    def train(
        som, 
        x_train, 
        learn_rate = 0.1, 
        radius_sq = 1, 
        lr_decay = .1, 
        radius_decay = .1, 
        epochs = 10
    ):
        """ Training SOM cell weights """
        learn_rate_0 = learn_rate
        radius_0 = radius_sq
        for epoch in np.arange(0, epochs):
            rand.shuffle(x_train)      
            for x in x_train:
                g, h = getBMU(som, x)
                som = update_weights(
                        som.astype(float), 
                        x, 
                        learn_rate, 
                        radius_sq, 
                        (g,h)
                )
            # Update learning rate and radius
            learn_rate = learn_rate_0 * np.exp(-epoch * lr_decay)
            radius_sq = radius_0 * np.exp(-epoch * radius_decay)            
        return som
    
    

    Results

    We can also visualize the change of weights in SOM after certain epochs to investigate how the weights of SOM have been changing during our training process.
    
    fig, ax = plt.subplots(
        nrows=1, 
        ncols=4, 
        figsize=(15, 4), 
    )
    
    total_epochs = 0
    for epochs, i in zip([1, 4, 5, 10], range(0,4)):
        total_epochs += epochs
        som = train(som, x_train, epochs=epochs)
        ax[i].imshow(som.astype(int))
        ax[i].title.set_text('Epochs = ' + str(total_epochs))
    
    


    References