Self-Organizing Maps | TrendSpider Learning Center

A self-organizing map (SOM) or self-organizing feature map (SOFM) is an unsupervised machine learning technique used to create a low-dimensional (typically two-dimensional) representation of a higher-dimensional dataset while preserving the topological structure of the data.

For example, a dataset with multiple variables measured across numerous observations can be represented as clusters of observations with similar values. These clusters can then be visualized as a two-dimensional “map,” where observations in closer clusters have more similar values than those in distant clusters, making high-dimensional data easier to visualize and analyze.

An SOM is a type of artificial neural network but is trained using competitive learning rather than the error-correction learning (e.g., backpropagation with gradient descent) used by other neural networks. Introduced by Finnish professor Teuvo Kohonen in the 1980s, the SOM is also known as a Kohonen map or Kohonen network.

It builds on biological models of neural systems from the 1970s and morphogenesis models dating back to Alan Turing in the 1950s. SOMs create internal representations reminiscent of the cortical homunculus—a distorted representation of the human body based on a neurological “map” of the brain areas dedicated to processing sensory functions for different body parts.

SOMs operate in two modes: training and mapping. During training, an input dataset (the “input space”) is used to generate a lower-dimensional representation (the “map space”). Mapping involves classifying additional input data using the generated map. Typically, the goal of training is to represent an input space with multiple dimensions as a two-dimensional map space.

The map space consists of components called “nodes” or “neurons” arranged in a hexagonal or rectangular grid. The number and arrangement of nodes are specified beforehand based on the goals of the data analysis. Each node in the map space is associated with a “weight” vector, representing the node’s position in the input space.

While nodes in the map space remain fixed, training involves adjusting the weight vectors to move closer to the input data, minimizing a distance metric such as Euclidean distance while preserving the topology induced by the map space. After training, the map can classify additional input space observations by finding the node with the closest weight vector to the input space vector.
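To make the mapping step concrete, here is a minimal illustrative sketch in Python (NumPy), not taken from any particular SOM library; the array names `weights` and `x` are assumptions for the example:

```python
import numpy as np

# `weights` is assumed to be a (rows, cols, n_features) array of trained
# weight vectors; `x` is a single observation with n_features values.
def map_to_node(x, weights):
    """Return the grid coordinates of the node whose weight vector is
    closest (in Euclidean distance) to the observation x."""
    distances = np.linalg.norm(weights - x, axis=-1)  # one distance per node
    return np.unravel_index(np.argmin(distances), distances.shape)

# Toy example: a 4x4 map of 3-dimensional weight vectors.
rng = np.random.default_rng(0)
weights = rng.random((4, 4, 3))
print(map_to_node(np.array([0.2, 0.7, 0.5]), weights))
```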

The self-organizing map (SOM) illustrated below visualizes U.S. Congress voting patterns. The input data includes rows for each member of Congress and columns representing their yes/no/abstain votes on various bills.

The SOM algorithm arranges these members on a two-dimensional grid, placing similar voting patterns closer together.

  • The first plot shows the groupings when the data is split into two clusters.
  • The second plot displays the average distance to neighbors, with larger distances appearing darker.
  • The third plot predicts party membership, with red indicating Republicans and blue indicating Democrats.
  • The remaining plots overlay the map with predicted votes on specific bills: red for a ‘yes’ vote and blue for a ‘no’ vote.
[Figure: SOM plots of U.S. Congress voting patterns, as described above]

Some of the use cases and applications of self-organizing maps, with representative references, are listed below:

  • Project prioritization and selection (Zheng, 2014)
  • Seismic facies analysis for oil and gas exploration (Walls, 2001)
  • Failure mode and effects analysis (Chang, 2017)
  • Representative species for ecological communities (Park, 2006)
  • Representative days for energy system models (Fouché, 2019)

Origin & History

The development of Self-Organizing Maps by Teuvo Kohonen in the early 1980s revolutionized the field of unsupervised learning. Their biological inspiration, combined with their computational efficiency, allowed for effective visualization and interpretation of high-dimensional data. Over the decades, SOMs have evolved, with various adaptations enhancing their functionality and application scope, solidifying their status as a crucial tool in machine learning and data analysis.

I. Introduction

Self-organizing maps (SOMs) were introduced in the early 1980s by Finnish professor Teuvo Kohonen, who is widely credited with their development. The concept emerged from Kohonen’s research into unsupervised learning and neural networks, drawing inspiration from biological neural systems. The primary goal of SOMs is to visualize and interpret high-dimensional data by mapping it onto a lower-dimensional space, typically two dimensions, while preserving the topological relationships of the data.

II. Foundational Work

The foundational paper on SOMs was published in 1981, marking the beginning of their prominence in the fields of machine learning and data analysis. Kohonen’s work was influenced by earlier models of neural computation that explored how the brain organizes sensory information. The self-organizing nature of SOMs allows them to cluster similar data points together without the need for labeled training data, making them particularly valuable for exploratory data analysis. This unsupervised learning capability is a key feature that sets SOMs apart from other neural network models.

III. Applications

Over the years, SOMs have found applications across various domains, including pattern recognition, data mining, and image processing. Their ability to reduce dimensionality while maintaining the inherent structure of the data has made them a popular choice for visualizing complex datasets. The competitive learning mechanism employed in SOMs, where nodes in the output layer compete to represent input data, differentiates them from other neural network models that typically rely on error correction learning methods. This competition among nodes enhances the ability of SOMs to accurately reflect the underlying patterns in the data.

IV. Advancements

As research progressed, numerous enhancements and adaptations of the original SOM algorithm were developed, leading to improved performance and broader applications. These advancements include modifications to the training process, incorporation of different distance metrics, and development of variants tailored for specific tasks. Today, SOMs are recognized as a significant contribution to the field of artificial intelligence and machine learning. Ongoing research continues to explore their potential in diverse applications, ranging from clustering and classification to anomaly detection and feature extraction.

Architecture of Self-Organizing Maps

The architecture of Self-Organizing Maps (SOMs) is designed to transform high-dimensional data into a lower-dimensional, typically two-dimensional, representation while preserving the topological relationships of the data. This architecture consists of several key components and processes that work together to achieve this transformation. Here is a comprehensive look at the architecture of SOMs:

I. Input Layer

The input layer of a Self-Organizing Map (SOM) is responsible for receiving the raw data, which is composed of multiple features or attributes. Each data point in the input layer is represented as a vector, with each vector having a dimensionality equal to the number of features in the dataset.

For instance, if the dataset has 𝑛 features, each input vector will be an 𝑛-dimensional vector. The primary function of the input layer is to serve as a conduit for the data, passing it along to the next stage in the SOM without performing any computations. The input vectors are fed into the SOM during the training process, where they are used to adjust the weights of the neurons in the output layer.

II. Output Layer

The output layer, also known as the feature map or Kohonen layer, is the core component of the SOM. This layer consists of a grid of nodes (neurons), each of which is associated with a weight vector that has the same dimensionality as the input vectors. The output layer can take various topological forms, such as rectangular or hexagonal grids, depending on the specific implementation and the desired properties of the map.

Each node in the output layer represents a particular point in the high-dimensional space of the input data. During training, the weight vectors of these nodes are adjusted to reflect the patterns and features of the input data. This process involves finding the node whose weight vector is closest to a given input vector (Best Matching Unit or BMU) and then updating the weight vectors of the BMU and its neighboring nodes to move closer to the input vector. This iterative adjustment process continues for many iterations, gradually refining the weight vectors so that the output layer becomes a meaningful representation of the input data’s structure.

The arrangement of the nodes in the output layer ensures that similar input vectors map to nearby nodes, preserving the topological relationships present in the original data. This topological preservation allows the SOM to effectively cluster and visualize the data, making it easier to identify patterns and relationships within the dataset.
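The update of the BMU and its neighbors described above is usually weighted by a neighborhood function over grid distance. As an illustrative sketch (a Gaussian neighborhood is a common, though not the only, choice), the snippet below computes one influence value per node:

```python
import numpy as np

# Gaussian neighborhood over *grid* distance (not input-space distance).
# Returns 1.0 at the BMU and smaller values for nodes farther away on the grid.
def neighborhood(bmu_pos, grid_rows, grid_cols, sigma):
    rows, cols = np.indices((grid_rows, grid_cols))
    grid_dist_sq = (rows - bmu_pos[0]) ** 2 + (cols - bmu_pos[1]) ** 2
    return np.exp(-grid_dist_sq / (2 * sigma ** 2))

h = neighborhood(bmu_pos=(2, 3), grid_rows=5, grid_cols=5, sigma=1.0)
```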

Architecture Example

In the example below, the diagram shows a SOM that reduces the 200 input variables by mapping them onto a grid, arriving at five clusters. These clusters represent groups of similar data points in the input space, identified through the SOM training process. A small code sketch of this setup follows the bullet list below.

[Figure: SOM architecture diagram with a 200-variable input layer, weighted connections, and a 5×5 Kohonen grid]
  • Input Layer: The input layer consists of 200 input variables labeled 𝑋1, 𝑋2,…,𝑋200. Each of these variables represents a feature or attribute of the data being analyzed. In this particular example, the input variables are from a medical informatics database. These input vectors are high-dimensional, meaning they have many features, making them complex and challenging to visualize or analyze directly.
  • Weighted Connections: Each input variable in the input layer is connected to every node in the output layer through weighted connections. These weights are initially set to small random values and are adjusted during the training process. The weights represent the strength of the connection between an input variable and a node in the output layer.
  • Output Layer (Kohonen Space/Grid): The output layer, also known as the Kohonen space or grid, is a two-dimensional grid of nodes (neurons). In this diagram, the grid is organized as a 5×5 grid, meaning there are 25 nodes in total. Each node in the grid has a weight vector of the same dimension as the input vectors (in this case, 200 dimensions). These weight vectors are adjusted during the training process to best represent the input data.
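As an illustrative sketch of this architecture (not the original diagram’s code, and using synthetic data in place of the medical-informatics records), the weights can be held in a single array with one 200-dimensional vector per grid node:

```python
import numpy as np

rng = np.random.default_rng(42)
n_features = 200                 # 200 input variables X1..X200
grid_rows, grid_cols = 5, 5      # 5x5 Kohonen grid, 25 nodes in total

# Weighted connections: small random initial values, one weight vector per node.
weights = rng.uniform(0.0, 0.01, size=(grid_rows, grid_cols, n_features))

# One synthetic input record and its best matching node.
x = rng.random(n_features)
distances = np.linalg.norm(weights - x, axis=-1)
bmu = np.unravel_index(np.argmin(distances), distances.shape)
print("Best matching unit:", bmu)
```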

Types of Self-Organizing Maps

The paper “Self-Organizing Maps, Theory and Applications” by Marie Cottrell, Madalina Olteanu, Fabrice Rossi, and Nathalie Villa-Vialaneix explores several types of Self-Organizing Maps (SOMs), highlighting their unique adaptations for handling different types of data and improving their functionality.

I. Standard SOM

The original SOM algorithm, introduced by Teuvo Kohonen, is designed for numerical data and operates as an unsupervised learning algorithm. It uses competitive learning to map high-dimensional input data onto a lower-dimensional grid while preserving the topological relationships of the data. This basic version is effective for clustering and visualizing multidimensional data and is commonly used in various applications such as data mining and pattern recognition.

II. Batch SOM

The Batch SOM is a deterministic version of the standard SOM, introduced to produce reproducible results. Unlike the online version, which updates the map incrementally with each data point, the Batch SOM updates the map using all data points at once in each iteration. This method ensures consistent results when the initial conditions and data remain unchanged, making it suitable for industrial applications where reproducibility is crucial.
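A hedged sketch of one Batch SOM iteration is shown below; it is a paraphrase of the idea (prototypes recomputed from all data points at once, weighted by the neighborhood function of each point’s BMU), not code from the cited paper:

```python
import numpy as np

def batch_som_step(data, weights, sigma):
    """One deterministic batch update: every prototype becomes the
    neighborhood-weighted average of all data points."""
    rows, cols, n_features = weights.shape
    rr, cc = np.indices((rows, cols))
    grid = np.stack([rr.ravel(), cc.ravel()], axis=1)        # (R*C, 2) node coords
    flat_w = weights.reshape(-1, n_features)                  # (R*C, d)

    # Best matching unit for every data point.
    dists = np.linalg.norm(data[:, None, :] - flat_w[None, :, :], axis=-1)
    bmus = np.argmin(dists, axis=1)                           # (N,)

    # Gaussian neighborhood weight between each point's BMU and every node.
    grid_d2 = np.sum((grid[bmus][:, None, :] - grid[None, :, :]) ** 2, axis=-1)
    h = np.exp(-grid_d2 / (2 * sigma ** 2))                   # (N, R*C)

    # New prototypes: h-weighted averages of all data points.
    new_w = (h.T @ data) / np.maximum(h.sum(axis=0)[:, None], 1e-12)
    return new_w.reshape(rows, cols, n_features)
```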

III. Median SOM

The Median SOM adapts the standard SOM algorithm to handle data represented by a dissimilarity matrix, where the relationships between data points are given as pairwise dissimilarities rather than explicit vector representations. This method is useful for data types like graphs or sequences where the data points do not naturally reside in a vector space. The Median SOM assigns each prototype to be one of the actual data points, optimizing the dissimilarity measure.

IV. Relational SOM

The Relational SOM extends the SOM algorithm to relational data described by a dissimilarity matrix. This approach uses a method that represents data in a pseudo-Euclidean space, allowing the algorithm to handle complex data structures such as social networks or DNA sequences. The prototypes are expressed as convex combinations of the input data points, and the algorithm optimizes these combinations to preserve the topological relationships of the original data.

V. Kernel SOM

The Kernel SOM uses kernel methods to map the input data into a high-dimensional feature space, where the standard SOM algorithm is then applied. This method leverages the power of kernel functions to handle non-linear data structures, enhancing the SOM’s ability to cluster and visualize complex datasets. The Kernel SOM is particularly effective when dealing with data that cannot be easily represented in a Euclidean space.

VI. Soft Topographic Mapping (STM)

Soft Topographic Mapping generalizes the SOM algorithm by incorporating soft assignments of data points to prototypes rather than hard assignments. This variant uses a probabilistic approach, smoothing the assignment process and avoiding local minima during training. The STM employs a free energy cost function and a deterministic annealing scheme to gradually refine the map, improving the clustering results and stability of the final map.

Advantages of Self-Organizing Maps

Self-organizing maps offer significant advantages in data analysis through dimensionality reduction, topological preservation, data visualization, unsupervised learning, and versatility.

I. Dimensionality Reduction

Self-organizing maps (SOMs) are highly effective in reducing the dimensionality of data. By projecting high-dimensional data into a lower-dimensional space, usually two dimensions, SOMs simplify complex datasets. This dimensionality reduction facilitates the visualization of intricate structures within the data, enabling researchers and analysts to identify patterns and relationships that may not be immediately apparent in higher dimensions. The key advantage is that this process typically occurs with minimal loss of critical information, preserving the essential characteristics of the original dataset while making it more manageable and interpretable.

II. Topological Preservation

One of the defining features of SOMs is their ability to maintain the topological relationships of the input data. This means that data points that are similar in the high-dimensional space will remain close to each other in the lower-dimensional map. This property is crucial for meaningful clustering, as it ensures that the data’s inherent structure is preserved. Maintaining these spatial relationships allows SOMs to better interpret and understand the underlying data distributions. This topological preservation is beneficial in fields where the relationships between data points are complex and multidimensional.

III. Data Visualization

The capability of SOMs to map high-dimensional data onto a two-dimensional grid is a significant advantage for data visualization. This transformation makes it easier for users to interpret and analyze complex datasets. By providing a visual representation of the data, SOMs help in identifying clusters, trends, and outliers that may be difficult to discern in a high-dimensional space. This visualization is particularly beneficial in exploratory data analysis, where understanding the data’s structure and relationships is essential for generating hypotheses and guiding further analysis.
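One common way to produce such a visualization is a U-matrix-style plot: for each node, the average distance between its weight vector and those of its immediate grid neighbors (this is what the “average distance to neighbors” plot in the Congress example above shows). Below is a rough sketch for a rectangular grid, assuming a trained `weights` array:

```python
import numpy as np

def neighbor_distance_map(weights):
    """Average weight-space distance from each node to its grid neighbors.
    Larger values (darker cells when plotted) mark boundaries between clusters."""
    rows, cols, _ = weights.shape
    u = np.zeros((rows, cols))
    for r in range(rows):
        for c in range(cols):
            dists = []
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols:
                    dists.append(np.linalg.norm(weights[r, c] - weights[nr, nc]))
            u[r, c] = np.mean(dists)
    return u
```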

IV. Unsupervised Learning

SOMs operate on the principle of unsupervised learning, meaning they do not require labeled data for training. This makes them particularly useful in situations where obtaining labeled data is challenging, time-consuming, or expensive. SOMs learn to recognize patterns and relationships within the data autonomously, making them suitable for exploratory analysis and preliminary data understanding. This capability allows for the discovery of hidden structures and insights in the data without the need for prior knowledge or labeling, broadening the scope of their applicability.

V. Versatility

The versatility of SOMs is one of their most significant strengths. They can be applied across a wide range of fields, including but not limited to image processing, text clustering, and financial analysis. In image processing, SOMs can help in segmenting and classifying images based on pixel similarities. In text clustering, they can organize documents or articles into meaningful groups based on content similarity. In financial analysis, SOMs can cluster and visualize market trends and behaviors. This wide applicability demonstrates the adaptability of SOMs to various types of data and problems, making them a valuable tool for researchers and practitioners in diverse domains.

Disadvantages of Self-Organizing Maps

While Self-Organizing Maps offer significant advantages in data visualization and clustering, they also come with notable disadvantages.

I. High Computational Cost

Training a Self-Organizing Map (SOM) can be computationally intensive, particularly when dealing with large datasets. The process involves multiple iterations of adjusting weight vectors and calculating distances between data points and nodes, which requires significant computational resources. As the size of the dataset increases, the training time and resource consumption grow correspondingly. This high computational demand can be a limiting factor, especially in environments with limited processing power or where quick turnaround times are essential. Therefore, efficient implementation and access to powerful computational resources are crucial for effectively utilizing SOMs in large-scale data analysis.

II. Sensitivity to Initial Conditions

The performance of SOMs is notably sensitive to the initial conditions, particularly the initialization of weight vectors. Poor or random initialization can lead to suboptimal clustering results, where the map fails to accurately represent the structure of the input data. This sensitivity makes it challenging to determine the best starting point for training, often requiring multiple runs with different initializations to achieve satisfactory results. The need for careful initialization adds complexity to the training process and can impact the reliability and consistency of the outcomes.

III. Need for Large Training Data

SOMs typically require a substantial amount of high-quality training data to produce meaningful and accurate results. Insufficient or poor-quality data can lead to the formation of inaccurate or uninformative clusters, diminishing the utility of the SOM. The effectiveness of SOMs heavily depends on the richness and representativeness of the training data, as they learn to map the input space based on the patterns and relationships present in the data. Consequently, the requirement for extensive and high-quality datasets can be a significant constraint, especially in domains where data is scarce or difficult to obtain.

IV. Difficulty in Handling Categorical Data

SOMs are primarily designed for numerical data and struggle to handle categorical or mixed-type data effectively. This limitation arises because SOMs rely on calculating distances between data points, a process that is inherently more complex and less meaningful for categorical variables. While techniques such as one-hot encoding can transform categorical data into a numerical format, they often lead to high-dimensional and sparse datasets, which can complicate the training process and reduce the interpretability of the results. This difficulty in handling non-numerical data restricts the applicability of SOMs in certain fields and datasets where categorical data is prevalent.

V. Optimal Map Size Determination

Choosing the appropriate size for the output map in a SOM is a critical yet challenging task. An incorrect map size can lead to either overfitting or underfitting, both of which negatively impact the quality of the clustering. If the map is too large, it may overfit the training data, capturing noise and minor variations rather than the underlying structure. Conversely, if the map is too small, it may underfit, failing to capture important patterns and relationships within the data. Determining the optimal map size often involves a trade-off and may require extensive experimentation and validation, adding to the complexity and time required for training a SOM.
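There is no single correct answer, but a commonly quoted heuristic (not from this article) is to start with roughly 5·√N nodes for N training samples and then validate with the quantization and topographic errors discussed later. A small sketch of that starting point:

```python
import math

def suggest_grid(n_samples):
    """Heuristic starting point only: about 5 * sqrt(N) nodes, arranged in a
    roughly square grid. Validate and adjust with quantization/topographic error."""
    n_nodes = max(4, round(5 * math.sqrt(n_samples)))
    side = max(2, round(math.sqrt(n_nodes)))
    return side, math.ceil(n_nodes / side)

print(suggest_grid(1000))   # about a 13 x 13 grid for 1,000 samples
```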

Constructing a Self-Organizing Map

To construct a Self-Organizing Map (SOM), you follow a series of steps that involve initializing the map, training it with input data, and updating the weights of the neurons based on the input. Here’s a comprehensive guide on how to construct a SOM:

I. Initialization

Begin by defining the structure of the SOM, which typically consists of a two-dimensional grid of neurons. Each neuron in this grid will have an associated weight vector of the same dimension as the input data. Initialize these weight vectors either randomly or using a heuristic approach to ensure a diverse starting point. Proper initialization is crucial as it influences the convergence and final accuracy of the SOM.

II. Input Data Preparation

Prepare your dataset by normalizing the input data. Normalization ensures that all features contribute equally to the distance calculations during the training process. This step is essential, especially if the input features have different scales, to prevent features with larger scales from dominating the distance metric.
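For example, a simple min-max normalization (one of several reasonable choices) rescales every feature to the [0, 1] range before training:

```python
import numpy as np

def min_max_normalize(data):
    """Rescale each column of an (n_samples, n_features) array to [0, 1].
    Constant columns are left unchanged to avoid division by zero."""
    data = np.asarray(data, dtype=float)
    mins = data.min(axis=0)
    ranges = data.max(axis=0) - mins
    ranges[ranges == 0] = 1.0
    return (data - mins) / ranges
```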

III. Training Process

The training of the SOM involves several iterations where input data points are fed into the network. The training process can be broken down into the following sub-steps:

Step 1: Sampling: Randomly select an input vector from the dataset.

Step 2: Finding the Best Matching Unit (BMU): Calculate the distance between the selected input vector and the weight vectors of all neurons in the output layer. The neuron with the smallest distance is identified as the BMU. Common distance metrics include Euclidean distance.

Step 3: Updating Weights: Adjust the weights of the BMU and its neighboring neurons to make them more similar to the input vector. The update is controlled by a learning rate and a neighborhood function, which determines how much influence the BMU has on its neighbors. The weight update can be expressed as:

w(t+1) = w(t) + α(t) ⋅ h(c,t) ⋅ (x−w(t))

  • 𝑤(𝑡) is the weight vector at time 𝑡
  • 𝛼(𝑡) is the learning rate at time 𝑡
  • ℎ(𝑐,𝑡) is the neighborhood function centered on the BMU 𝑐
  • 𝑥 is the input vector.

The learning rate and neighborhood function typically decrease over time to fine-tune the map gradually.

Step 4: Iterate: Repeat the sampling, BMU identification, and weight updating steps for a predefined number of iterations or until the map converges. Convergence is reached when the weight adjustments become negligible, indicating that the SOM has learned the input data distribution.
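Putting steps 1 to 4 together, the sketch below is a minimal online training loop implementing the update rule above with a linearly decaying learning rate and neighborhood radius; the grid size, decay schedule, and initial parameters are illustrative choices, not prescribed values:

```python
import numpy as np

def train_som(data, grid_rows=10, grid_cols=10, n_iter=5000,
              alpha0=0.5, sigma0=3.0, seed=0):
    """Minimal online SOM trainer: sample, find BMU, update BMU and neighbors."""
    rng = np.random.default_rng(seed)
    n_features = data.shape[1]
    weights = rng.random((grid_rows, grid_cols, n_features))  # random initialization
    rr, cc = np.indices((grid_rows, grid_cols))

    for t in range(n_iter):
        frac = t / n_iter
        alpha = alpha0 * (1.0 - frac)           # learning rate alpha(t), decaying
        sigma = sigma0 * (1.0 - frac) + 1e-2    # neighborhood radius, decaying

        x = data[rng.integers(len(data))]       # Step 1: sampling
        dists = np.linalg.norm(weights - x, axis=-1)
        bmu = np.unravel_index(np.argmin(dists), dists.shape)  # Step 2: BMU

        # Step 3: Gaussian neighborhood h(c, t) over grid distance from the BMU,
        # then w(t+1) = w(t) + alpha(t) * h(c, t) * (x - w(t))
        grid_d2 = (rr - bmu[0]) ** 2 + (cc - bmu[1]) ** 2
        h = np.exp(-grid_d2 / (2 * sigma ** 2))
        weights += alpha * h[..., None] * (x - weights)

    return weights                              # Step 4: repeated for n_iter iterations
```

In practice, convergence can also be monitored by checking that the quantization error (see the Evaluation step below) stops improving.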

The illustration below depicts the training process of a self-organizing map (SOM). The blue area represents the distribution of the training data, while the small white disc indicates the current training datum sampled from this distribution.

[Figure: SOM training process, from random initial state through a training step to convergence]

In the illustration above:

  • Initial State (Left): At the beginning, the SOM nodes are randomly positioned within the data space. The highlighted yellow node is the one nearest to the current training datum.
  • Training Step (Middle): The nearest node (highlighted in yellow) is selected. This node is moved towards the training datum. Additionally, its neighboring nodes on the grid are also adjusted, though to a lesser extent. This step ensures that the map starts to reflect the structure of the data.
  • Convergence (Right): After many iterations, the nodes in the SOM adjust their positions to approximate the distribution of the training data. The grid formed by the nodes represents the underlying data structure, enabling the visualization and interpretation of high-dimensional data.

This process continues iteratively, gradually refining the positions of the nodes to create a meaningful low-dimensional representation of the original data.

IV. Visualization & Interpretation

Once training is complete, visualize the SOM. Each neuron in the grid will represent a cluster of similar input data points. You can use color coding or other visualization techniques to represent the distribution of the input data across the map. This visualization helps in interpreting the relationships and patterns within the data, making it easier to identify clusters, outliers, and the overall structure of the dataset.

V. Evaluation

Evaluate the performance of the SOM by analyzing how well it has clustered the input data. You can assess the quality of clustering by examining the distribution of data points across the map, the distances between similar and dissimilar points, and the preservation of topological relationships. Metrics such as quantization error and topographic error can be used to quantify the map’s performance. Fine-tuning the SOM parameters and re-evaluating can help achieve the best possible results.
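The two metrics mentioned above can be computed directly from the trained weights. The sketch below uses one common formulation (quantization error as the mean distance from each sample to its BMU, topographic error as the share of samples whose two best nodes are not adjacent on the grid, here counting diagonal neighbors as adjacent); other definitions exist:

```python
import numpy as np

def quantization_error(data, weights):
    """Mean distance from each sample to the weight vector of its BMU."""
    flat_w = weights.reshape(-1, weights.shape[-1])
    dists = np.linalg.norm(data[:, None, :] - flat_w[None, :, :], axis=-1)
    return dists.min(axis=1).mean()

def topographic_error(data, weights):
    """Fraction of samples whose best and second-best nodes are not grid neighbors."""
    rows, cols, d = weights.shape
    flat_w = weights.reshape(-1, d)
    dists = np.linalg.norm(data[:, None, :] - flat_w[None, :, :], axis=-1)
    best_two = np.argsort(dists, axis=1)[:, :2]                        # BMU and runner-up
    pos = np.stack(np.unravel_index(best_two, (rows, cols)), axis=-1)  # (N, 2, 2)
    not_adjacent = np.abs(pos[:, 0] - pos[:, 1]).max(axis=-1) > 1
    return not_adjacent.mean()
```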

Self-Organizing Maps in Trading

Case Study I

The paper “Discovering Intraday Price Patterns by Using Hierarchical Self-Organizing Maps” by Chueh-Yung Tsao and Chih-Hao Chou explores the use of Hierarchical Self-Organizing Maps (HSOMs) to detect price patterns in financial markets during different trading periods: the opening, middle, and closing sessions of the market. The study is motivated by the observation that trading activities and behaviors exhibit distinct characteristics during these periods, influenced by market microstructure and trader behaviors, which results in a U-shaped pattern of intraday volatility (Wood et al., 1985; Harris, 1986; Jain and Joh, 1988; McInish and Wood, 1990; Chan et al., 1991).

Methodology

The authors modify the traditional Self-Organizing Maps (SOMs) by implementing a hierarchical structure to address the issue of determining the appropriate number of patterns (map size) for different markets. Traditional SOMs require a pre-specified number of neurons, which can vary significantly between different datasets. The hierarchical approach automates the determination of map size and structure and presents the detected patterns in a hierarchical manner. This method uses a criterion similar to the Akaike Information Criterion (AIC) to balance between explanation and exploitation by penalizing map size based on the average quantization error (AQE) and the number of neurons (Akaike, 1973).

The hierarchical SOM approach involves several steps:

  • Construct an initial map using basic SOMs to determine the best map size and structure.
  • Identify neurons that need further partitioning based on their inner inconsistency measured by cumulative quantization error (CQE).
  • Recursively apply SOMs to the identified neurons to create a second-layer map and continue this process until no further partitioning is required.

Empirical Study

The empirical analysis is conducted on two financial markets: the Taiwan Stock Exchange Capitalization Weighted Stock Index (TAIEX) and the Taiwan Futures Exchange (TAIFEX). The sample period spans from January 2, 2001, to February 27, 2007, using 1-minute intraday data. The study focuses on three trading periods within a day: the first 30 minutes (opening), the middle 30 minutes, and the last 30 minutes (closing). Six experiments (covering both markets and three trading periods) are conducted.

The first layer of the hierarchical SOM detects different map structures for each trading period and market. For TAIEX, the maps are 1×8 (opening), 1×8 (middle), and 9×1 (closing), while for TAIFEX, the maps are 5×1 (opening), 5×1 (middle), and 3×2 (closing). Most neurons in the first-layer maps require further partitioning, leading to the construction of second-layer maps.

Results and Findings

The study finds that each trading period exhibits unique characteristics, confirmed by the quality of the maps (measured by AQE). For example, TAIEX shows higher quality maps for the opening and closing periods compared to the middle period, suggesting more informative patterns during these times. Moreover, there is evidence of a relationship between the opening and closing patterns within the same day and between the closing pattern of one day and the opening pattern of the next day for TAIEX, as indicated by Pearson chi-square tests (Table 2).

Case Study II

In the paper “Self Organizing Map Neural Network and Fuzzy based Method to Identify Profit Making Stocks in Stock Market,” the authors Dr. Asif Ullah Khan, Dr. Bhupesh Gour, and Dr. Krishna Kumar Dwivedi propose a hybrid approach to stock selection using Self-Organizing Maps (SOM) and fuzzy logic. This method aims to identify profitable stocks by combining technical analysis with a neural network and fuzzy logic system.

The primary challenge in stock investment is selecting stocks that have the potential to increase in value. Traditional methods, such as technical and fundamental analysis, have limitations, particularly in appropriately weighing different criteria. The proposed method seeks to improve stock selection by using a hybrid approach that leverages SOMs and fuzzy logic on selected technical indicators.

Methodology

The hybrid approach begins with the selection of stocks based on fundamental analysis criteria. These criteria include various financial metrics like Price/Earnings (P/E) Ratio, Dividend Yield, Return on Net Worth (RONW), and others. Stocks are classified using SOMs, which group stocks into clusters based on these criteria. Each cluster represents a set of stocks with similar characteristics.

Once the stocks are clustered, the best-performing clusters are identified. These selected stocks are then subjected to further analysis using fuzzy logic. The fuzzy logic module evaluates the stocks based on technical indicators such as Relative Strength Index (RSI), Williams %R, Ultimate Oscillator, MACD, Stochastic Oscillator, and On Balance Volume (OBV). The output of this analysis is a trading signal that suggests whether to buy or sell the stock.

Results

The system was tested using historical data from the National Stock Exchange (NSE). The authors found that the stocks selected using the hybrid method of SOM and fuzzy logic provided significantly better returns than the NSE index. Specifically, during the testing period from April 30, 2015, to June 1, 2015, the selected stocks achieved a return of 12.60%, compared to the NSE index’s 3.07% return. This represents a 9.53% improvement in returns.

Conclusion

The hybrid model of SOM and fuzzy logic not only helps in selecting profitable stocks but also aids in determining the optimal timing for investment. This approach has demonstrated its effectiveness in improving returns on investment, making it a valuable tool for investors seeking to maximize their profits. The integration of technical indicators with SOM and fuzzy logic provides a robust framework for stock market prediction and investment decision-making.​

The Bottom Line

Self-organizing maps (SOMs), introduced by Teuvo Kohonen in the 1980s, are a powerful unsupervised learning technique used to visualize and interpret high-dimensional data by mapping it onto a lower-dimensional space, typically two dimensions. By preserving the topological structure of the input data, SOMs effectively cluster similar data points together, making them invaluable for exploratory data analysis across various fields such as finance, biology, and pattern recognition. Their ability to reduce dimensionality while maintaining data relationships offers a unique and intuitive way to uncover patterns and insights from complex datasets.

FAQs

How to understand self-organizing maps?

Self-organizing maps (SOMs) are an unsupervised form of Machine Learning that can be used to cluster data that has many features. The SOM not only clusters your data but also “maps” it on to a lower dimension (usually two dimensions) so that you can more easily visualize the clusters.

What is the best matching unit in a self-organizing map?

Each node is examined to find the one whose weights are most similar to the input vector. This unit is known as the Best Matching Unit (BMU). The selection is typically made using the Euclidean distance formula, which measures the similarity between two vectors.

What are the five stages in self-organizing maps?

Self-organization proceeds through two identifiable phases: ordering and convergence. The SOM algorithm itself is commonly described in five stages: initialization, sampling, matching, updating, and continuation.

How does the SOM algorithm work?

It employs an unsupervised learning methodology and uses a competitive learning algorithm to train its network. To make complex data easier to interpret, a SOM is used for mapping and clustering (or dimensionality reduction), projecting multidimensional data onto a lower-dimensional space.

What are the disadvantages of self-organizing maps?

The main drawback of the SOM is that it requires the neuron weights to be both necessary and sufficient to cluster the inputs. See the Disadvantages section above for a fuller list: computational cost, sensitivity to initialization, the need for large training data, difficulty with categorical data, and map sizing.

What is the learning process of SOM?

Unlike other learning techniques in neural networks, training a SOM requires no target vector. A SOM learns to classify the training data without any external supervision. The Best Matching Unit is found by running through all weight vectors and calculating the distance from each weight vector to the sample vector.

What is BMU in self-organizing maps?

The neuron whose weight vector is most similar to the input is called the best matching unit (BMU). The weights of the BMU and neurons close to it in the SOM grid are adjusted towards the input vector. The magnitude of the change decreases with time and with the grid-distance from the BMU.

How do you implement self-organizing maps?

SOMs iteratively run three stages until convergence: competition, cooperation, and adaptation.
  1. Competition: examine the input data points and find the winning neuron (BMU) for each.
  2. Cooperation: after finding the winning neuron, determine how strongly its grid neighbors are influenced.
  3. Adaptation: update the weights of the winning neuron and its neighbors toward the input vector.

What is the difference between a self-organizing map and K-means clustering?

In K-means, the nodes (centroids) are independent of each other: the winning node adapts itself and only itself. In a SOM, the nodes (centroids) are placed on a grid, so each node has neighbors (the nodes adjacent or near to it with respect to their position on the grid), and those neighbors are updated along with the winner.

What is the Kohonen learning rule?

Kohonen Learning Rule (LEARNK)

The Kohonen rule allows the weights of a neuron to learn an input vector, and because of this it is useful in recognition applications. Thus, the neuron whose weight vector was closest to the input vector is updated to be even closer.

What is the purpose of self-organizing maps?

The Self-Organizing Map (SOM) method is a new, powerful software tool for the visualization of multi-dimensional data. It converts complex, non-linear statistical relationships among high-dimensional data into simple geometric relationships on a low-dimensional display (Kohonen and Oja, 1996).

What is an example of a Kohonen map?

Kohonen nets -- examples

A widely quoted (and more useful) example is the Kohonen Phonetic Typewriter. A 2D array of neurons, each of which has 15 inputs, is fed the Fourier coefficients of a speech signal sampled every 9.83 ms.

What is sigma in a SOM?

Sigma is the neighborhood radius of the SOM, i.e., how far the influence of the BMU extends across the grid; a common default value is 1.0. A related parameter is the learning_rate, which determines how much the weights are adjusted during each iteration. In one set of experiments, a sigma of 0.1 and a learning rate of 0.2 gave decent results.
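The parameter names in this answer (sigma, learning_rate) match those of the popular MiniSom library; assuming that is the library in use, a minimal setup looks roughly like this (values are illustrative only):

```python
import numpy as np
from minisom import MiniSom   # assumes: pip install minisom

data = np.random.rand(500, 10)                      # 500 samples, 10 features
som = MiniSom(8, 8, input_len=10, sigma=0.1, learning_rate=0.2, random_seed=1)
som.random_weights_init(data)
som.train_random(data, num_iteration=5000)          # online (random-order) training

print(som.winner(data[0]))                          # BMU grid coordinates for one sample
print(som.quantization_error(data))                 # average distance to BMUs
```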

Is a self-organizing map supervised or unsupervised learning?

A Self-Organizing Map (or Kohonen Map, SOM) is a type of artificial neural network inspired by biological models of neural systems from the 1970s. It follows an unsupervised learning approach and trains its network through a competitive learning algorithm.

What is the size of the self-organizing map grid?

The SOM map itself (the output space) is formed as a regular grid of cells or units (neurons), for example a 10x8 grid (n=80 units) or a 16x32 grid (n=512 units) ― in fact the grid can be much larger if desired.

What is the structure of self-organizing maps?

The SOM consists of two layers: the first (the input layer) is connected to each vector of the dataset, and the second (the output layer) forms a two-dimensional array of nodes (computational units).
