Cluster analysis

Cluster analysis is applied in many areas, such as the study of social behavior, psychology, geography, and business. In marketing, it is used to segment the market, understand customer behavior, identify opportunities for new products, and select markets for testing different strategies.

Application

In marketing, cluster analysis is applied primarily for the following purposes:

1. Market segmentation

Cluster analysis identifies objects (people, markets, institutions) and divides them into groups that share certain characteristics, such as attitudes, consumption behavior, or media viewing habits. This helps manufacturers and organizations understand the existing market segments.


2. Understanding consumer behavior

Cluster analysis helps identify groups of similar consumers. Manufacturers and organizations can then study the purchasing behavior of each group separately and gain a better understanding of consumer behavior.

3. Identifying opportunities for new products

By clustering brands or products, we can define competitive sets in the market. Brands in the same group compete more strongly with one another than with brands in different groups. Examining these competitive sets therefore helps manufacturers and organizations identify opportunities for potential new products.

4. Selecting test markets

Cluster analysis is also applied to select markets for testing different strategies. This involves identifying similar markets that can substitute for one another in experimental research, which reduces the number of test markets required.

5. Reducing data volume

Used as a data-reduction tool, cluster analysis helps develop subsets or groups of data that are more manageable than individual observations. For example, customers can be divided into groups, and the differences between groups can then be examined with multiple discriminant analysis (MDA) to describe differences in product usage behavior.

Analysis method

Clustering methods include hierarchical clustering, non-hierarchical clustering, and other methods.

Hierarchical clustering

Hierarchical clustering builds a hierarchical structure, also called a tree diagram. It can be carried out by an agglomerative or a divisive method.

Agglomerative

The "bottom-up" approach: each observation starts in its own cluster. The closest clusters are then merged step by step, moving up the hierarchy, until all observations belong to a single cluster.
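The bottom-up procedure can be sketched in a few lines of plain Python. This is only an illustration: the function names `single_link` and `agglomerate`, the sample points, and the choice of single linkage as the distance between clusters are assumptions made for the example, not something the text prescribes.

```python
from itertools import combinations
from math import dist

def single_link(a, b):
    """Distance between two clusters = distance of their closest pair of members."""
    return min(dist(p, q) for p in a for q in b)

def agglomerate(points):
    """Repeatedly merge the two closest clusters; return the merge history."""
    clusters = [[p] for p in points]      # every observation starts alone
    history = []                          # (merged cluster, distance at merge)
    while len(clusters) > 1:
        # Find the pair of clusters with the smallest single-linkage distance.
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda ij: single_link(clusters[ij[0]], clusters[ij[1]]))
        merged = clusters[i] + clusters[j]
        history.append((merged, single_link(clusters[i], clusters[j])))
        # Replace the two merged clusters with their union.
        clusters = [c for n, c in enumerate(clusters) if n not in (i, j)] + [merged]
    return history

# Hypothetical sample data: two tight pairs and one outlying point.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0), (10.0, 0.0)]
history = agglomerate(points)
for members, distance in history:
    print(f"{distance:.2f}  {members}")
```

The merge distances in `history` are what a tree diagram plots on its height axis. A large jump between consecutive merges suggests a natural place to cut the tree.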

Divisive

The "top-down" approach: all observations start in one cluster, which is split into smaller and smaller clusters until each object forms a separate cluster.

The advantage of hierarchical cluster analysis is that the number of clusters does not need to be defined in advance; it can be determined by inspecting the tree diagram. For agglomerative hierarchical clustering, there are three families of methods for measuring the distance between clusters:

  • Linkage methods: single linkage, complete linkage, and average linkage
  • Variance methods (total squared deviation): the most common is Ward's method
  • Centroid methods (distance between cluster centers)

Among these methods, the centroid and Ward's methods have been reported to give better results than the others.
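The distance measures listed above differ only in how they summarize the gap between two groups of points. The sketch below writes each one as a small Python function. The function names and the two sample clusters are hypothetical, and Euclidean distance between observations is assumed.

```python
from math import dist

def centroid(cluster):
    """Coordinate-wise mean of the points in a cluster."""
    n = len(cluster)
    return tuple(sum(coord) / n for coord in zip(*cluster))

def single_linkage(a, b):
    """Smallest distance between any member of a and any member of b."""
    return min(dist(p, q) for p in a for q in b)

def complete_linkage(a, b):
    """Largest distance between any member of a and any member of b."""
    return max(dist(p, q) for p in a for q in b)

def average_linkage(a, b):
    """Mean distance over all pairs with one member from each cluster."""
    return sum(dist(p, q) for p in a for q in b) / (len(a) * len(b))

def centroid_distance(a, b):
    """Distance between the two cluster centers."""
    return dist(centroid(a), centroid(b))

def ward_increase(a, b):
    """Increase in total within-cluster squared deviation if a and b are merged."""
    weight = len(a) * len(b) / (len(a) + len(b))
    return weight * dist(centroid(a), centroid(b)) ** 2

# Hypothetical clusters: two vertical pairs, four units apart.
a = [(0.0, 0.0), (0.0, 2.0)]
b = [(4.0, 0.0), (4.0, 2.0)]
for measure in (single_linkage, complete_linkage, average_linkage,
                centroid_distance, ward_increase):
    print(measure.__name__, round(measure(a, b), 3))
```

At each agglomerative step, the pair of clusters with the smallest value of the chosen measure is merged. For Ward's method this means merging the pair that adds the least within-cluster variance.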

Non-hierarchical clustering

Non-hierarchical clustering (k-means) first defines the cluster centers; objects that fall within a predefined threshold value of a center are then grouped into that cluster.
Some cluster allocation methods:

  • Sequential threshold
  • Parallel threshold
  • Optimizing partitioning
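As an illustration of the first allocation method, the sketch below assumes the common description of the sequential threshold approach: one cluster center is chosen, every object within the threshold of that center is grouped with it, and the process repeats on the objects that remain. The function name, the rule of taking the first remaining point as the next center, and the sample data are all assumptions made for the example.

```python
from math import dist

def sequential_threshold(points, threshold):
    """Pick a center, group every unassigned point within `threshold` of it, repeat."""
    unassigned = list(points)
    clusters = []
    while unassigned:
        seed = unassigned[0]  # assumption: the first remaining point becomes the center
        members = [p for p in unassigned if dist(p, seed) <= threshold]
        clusters.append((seed, members))
        # Only points outside the threshold stay available for later centers.
        unassigned = [p for p in unassigned if dist(p, seed) > threshold]
    return clusters

# Hypothetical sample data: three points near the origin, two near (5, 5).
points = [(0.0, 0.0), (0.5, 0.0), (5.0, 5.0), (5.0, 5.5), (0.0, 0.4)]
clusters = sequential_threshold(points, threshold=1.0)
for seed, members in clusters:
    print(seed, members)
```

Under the same assumptions, a parallel threshold variant would fix several centers at once and assign each object to the nearest center within the threshold, while optimizing partitioning would additionally allow objects to be reassigned later to improve an overall criterion.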

Because it requires less computation and runs faster, this method is often used for large samples. Its drawbacks are that the number of clusters must be specified in advance and that the choice of cluster centers is fairly arbitrary. For this reason, hierarchical and non-hierarchical clustering are often used together: hierarchical clustering is run first, and the resulting number of clusters and cluster centers are then used as the starting point for the optimizing partitioning method.
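A minimal sketch of this two-stage approach follows, using only the Python standard library. It assumes centroid distance for the hierarchical stage and a plain k-means loop for the partitioning stage. The function names, the sample data, and the choice of k = 2 are hypothetical; in practice the number of clusters would be read from the tree diagram.

```python
from itertools import combinations
from math import dist

def centroid(cluster):
    """Coordinate-wise mean of the points in a cluster."""
    return tuple(sum(c) / len(cluster) for c in zip(*cluster))

def hierarchical_centers(points, k):
    """Stage 1: merge the closest clusters (centroid distance) until k remain."""
    clusters = [[p] for p in points]
    while len(clusters) > k:
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda ij: dist(centroid(clusters[ij[0]]),
                                       centroid(clusters[ij[1]])))
        merged = clusters[i] + clusters[j]
        clusters = [c for n, c in enumerate(clusters) if n not in (i, j)] + [merged]
    return [centroid(c) for c in clusters]

def k_means(points, centers, max_iter=100):
    """Stage 2: reassign points to the nearest center until the centers are stable."""
    for _ in range(max_iter):
        groups = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda c: dist(p, centers[c]))
            groups[nearest].append(p)
        # An empty group keeps its previous center.
        new_centers = [centroid(g) if g else centers[n] for n, g in enumerate(groups)]
        if new_centers == centers:
            break
        centers = new_centers
    return centers, groups

# Hypothetical sample data: two well-separated groups of three points.
points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (8.0, 8.0), (9.0, 8.0), (8.0, 9.0)]
initial = hierarchical_centers(points, k=2)   # k would be chosen from the tree diagram
centers, groups = k_means(points, initial)
print(centers)
print(groups)
```

Starting k-means from the hierarchical centers removes the arbitrary choice of initial centers, while the reassignment step lets objects move between clusters, which a purely hierarchical procedure does not allow once a merge has been made.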