Self-Organizing Maps (SOM) are one of the clustering techniques under the branch of unsupervised learning. It was invented by Teuvo Kohonen in 1982, and therefore, sometimes called as Kohonen map.
In this post, we introduce the concept of SOM and the how the algorithm works in Python using an intuitive example. By the end of the post, you will learn about (1) what is SOM, (2) how is it working, and (3) how to implement it in Python.
Introduction to Self-Organizing Maps (SOM)
SOM represents can be treated as a form of artificial neural network that constructs a map based on the examples of training dataset. This map typically takes the form of a 2D rectangular grid of weights, although it can also be expanded to include 3D or higher-dimensional models.
Each cell in the map can be thought as n-dimentional vector containing learnable weights, where n is also the same as the dimension of an example or the number of features in the example. For example, a training example $x_1$ and a cell $w_{ij}$ of SOM: $$x_1=\{f_{11}, \cdots, f_{1n}\}$$ $$w_{ij}=\{f_{ij1}, \cdots, f_{ijn}\}$$
Initialization
For the simplicity, we use the 2D case as our example. To get started, we need to pre-define the shape of SOM - axb - where a and b indicate the number of rows and columns of SOM respectively. In other words, we have axb cells and each cell can be thought as a cluster represented as a n-dimentional vector or weights. As one might expect, we need to initialize those weights. This can be done with a random initialization strategy or using training examples.Training
Those initalized weights of SOM need to be trained/updated with training examples based on predefined number of epochs as we do in other machine learning methods. One big difference here is we don't define a loss function and use backpropagation to update the weights of SOM. Instead, SOM uses Competitive Learning which is a form of unsupervised learning, where constituent elements compete to produce a satisfying result, and only one gets to win the competition.More specifically, for a given trainining instance x, we determine the best-matching cell/unit (refered as BMU) using a distance metric such as Euclidean distance (as both x and cell weight vector are n dimenitonal) who win the competition. Afterwards, we update the weights of the BMU as well as those of its neighboring cells, e.g., those on the left, right, above, and below. An update at time $t+1$ for the weight vector $w_{ij}$ of SOM can be described as: $$w_{ij}^{t+1} \leftarrow w_{ij}^{t} + \eta f(i,j,r_{t}) (x-w_{ij}^t)$$ where $\eta$ is the learning rate, (i,j) indicates the index of the cell in SOM, and $r_{t}$ refers to the radius to determine neighboring cells. $f(\cdot)$ is the neighborhood distance function. Simply put, a training example x pulls the weights of these cells towards it.
- Initialize all grid weights of the SOM
- Repeat until convergence or the predefined maximum epochs
- Shuffle the training examples
- For each training instance x, we get the best matching unit BMU, and then update the weights of BMU and its neighboring cells
Importing required packages
First, let's import some required Python packages such as numpy and matplotlib.
import numpy as np
import matplotlib
import matplotlib.pyplot as plt
print(np.__version__)
print(matplotlib.__version__)
1.19.5
3.3.4
Creating training data and initializing SOM cells
Here, we create a training data containing 3000 examples, and each example is 3-dimentional vector representing Red, Green, and Blue component value in the RGB color space. Then, we also create $10 \times 10$ SOM in which each cell is also 3-dimentional vector.
m_som = 10
n_som = 10
n_x = 3000
rand = np.random.RandomState(777)
# Create training data
x_train = rand.randint(
low=0,
high=255,
size=(n_x, 3)
)
print(x_train.shape)
# Initialize SOM cells
som = rand.randint(
low=0,
high=255,
size=(m_som, n_som, 3)
)
print(som.shape)
(3000, 3)
(10, 10, 3)
I like this example as we can visualize it and see the generated training examples and the initialized SOM.
fig, ax = plt.subplots(nrows=1, ncols=2, figsize=(12, 6))
ax[0].imshow(x_train.reshape(50, 60, 3))
ax[0].set_title('Training data')
ax[1].imshow(som)
ax[1].set_title('Initialized SOM')
SOM implementation
Now, we can implement two functions, getBMU() for retrieving the BMU of SOM and update_weights() to update the weights of cells for a given training example as we discussed before.Getting BMU cell
def getBMU(x, som):
""" Return cell index g,h of som clostest to x """
square_distance = (np.square(x - som)).sum(axis=2)
return np.unravel_index(
np.argmin(square_distance, axis=None),
square_distance.shape
)
getBMU(x_train[0], som)
(8, 1)
Updating the weights of SOM cells
def update_weights(
som,
x,
lr,
radius_sq,
bmu,
step=3
):
"""
Update the weights of the SOM cells with a training example
:parameter som: SOM
:parameter x: a training example
:parameter lr: learing rate
:parameter radius_sq: radius
:parameter bmu: (g,h) coordinates of bmu
:parameter step: to determine neighborhood
"""
g, h = bmu
#if radius is close to zero then only BMU is changed
if radius_sq < 1e-3:
som[g,h,:] += lr * (x - som[g,h,:])
return som
# Change all cells in a small neighborhood of BMU based on step
for i in range(max(0, g-step), min(som.shape[0], g+step)):
for j in range(max(0, h-step), min(som.shape[1], h+step)):
dist_sq = np.square(i - g) + np.square(j - h)
dist_func = np.exp(-dist_sq / 2 / radius_sq)
som[i,j,:] += lr * dist_func * (x - som[i,j,:])
return som
Training: Learning the weights of SOM cells
Now we can start training SOM using those two implemented functions. Here, we train 10 epochs with a decay of 0.1 for both learning rate and radius.
def train(
som,
x_train,
learn_rate = 0.1,
radius_sq = 1,
lr_decay = .1,
radius_decay = .1,
epochs = 10
):
""" Training SOM cell weights """
learn_rate_0 = learn_rate
radius_0 = radius_sq
for epoch in np.arange(0, epochs):
rand.shuffle(x_train)
for x in x_train:
g, h = getBMU(som, x)
som = update_weights(
som.astype(float),
x,
learn_rate,
radius_sq,
(g,h)
)
# Update learning rate and radius
learn_rate = learn_rate_0 * np.exp(-epoch * lr_decay)
radius_sq = radius_0 * np.exp(-epoch * radius_decay)
return som
Results
We can also visualize the change of weights in SOM after certain epochs to investigate how the weights of SOM have been changing during our training process.
fig, ax = plt.subplots(
nrows=1,
ncols=4,
figsize=(15, 4),
)
total_epochs = 0
for epochs, i in zip([1, 4, 5, 10], range(0,4)):
total_epochs += epochs
som = train(som, x_train, epochs=epochs)
ax[i].imshow(som.astype(int))
ax[i].title.set_text('Epochs = ' + str(total_epochs))
References
- Kohonen, Teuvo. "Self-organized formation of topologically correct feature maps." Biological cybernetics 43, no. 1 (1982): 59-69.
- Self-Organizing Maps: Theory and Implementation in Python with NumPy
- Beginners Guide to Anomaly Detection Using Self-Organizing Maps