The "Customer Segmentation using K-Means Clustering" project uses the K-means clustering algorithm to group supermarket customers by their spending behavior. The goal is to identify distinct customer segments so that marketing strategies can be targeted at each segment to improve customer engagement and satisfaction.
Dataset on Kaggle: https://www.kaggle.com/datasets/vjchoudhary7/customer-segmentation-tutorial-in-python
-
Loading and Exploring Data:
- The dataset is loaded into a Pandas DataFrame.
- Basic exploratory data analysis is performed: viewing the first five rows, checking the shape of the data (rows and columns), and reviewing the summary information for each column.
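
The loading and exploration steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual notebook: the file name `Mall_Customers.csv`, the column names, and the helper `load_customers` are assumptions, and the three inline rows are made-up values used only so the snippet runs without the download.

```python
import io

import pandas as pd

# Made-up sample rows standing in for the Kaggle CSV (assumption: the real
# file uses these column names).
SAMPLE_CSV = """CustomerID,Gender,Age,Annual Income (k$),Spending Score (1-100)
1,Female,25,30,60
2,Male,40,55,45
3,Female,33,80,90
"""


def load_customers(source):
    """Hypothetical helper: read customer data from a path or file-like object."""
    return pd.read_csv(source)


customers = load_customers(io.StringIO(SAMPLE_CSV))
# With the downloaded data (assumed file name):
# customers = load_customers("Mall_Customers.csv")

print(customers.head())  # first five rows (only three exist in this sample)
print(customers.shape)   # (number of rows, number of columns)
customers.info()         # column dtypes and non-null counts
```

On the real dataset, `head()`, `shape`, and `info()` correspond to the three exploration steps listed above.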
-
Data Preprocessing:
- The dataset is checked for missing values; none are found.
- The relevant columns for clustering (Annual Income and Spending Score) are selected.
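
A possible version of these two preprocessing steps is shown below. The column names `Annual Income (k$)` and `Spending Score (1-100)` are assumed to match the Kaggle file, and the small DataFrame is a made-up stand-in for the loaded data.

```python
import pandas as pd

# Made-up stand-in for the loaded customer DataFrame.
customers = pd.DataFrame(
    {
        "CustomerID": [1, 2, 3, 4],
        "Annual Income (k$)": [30, 55, 80, 42],
        "Spending Score (1-100)": [60, 45, 90, 12],
    }
)

# Count missing values per column; all zeros means nothing needs imputing.
missing_per_column = customers.isnull().sum()
print(missing_per_column)

# Keep only the two features used for clustering, as a NumPy array of
# shape (n_customers, 2).
feature_columns = ["Annual Income (k$)", "Spending Score (1-100)"]
X = customers[feature_columns].values
print(X.shape)
```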
-
Determining Optimal Clusters:
- The "Elbow Method" is used to choose the number of clusters.
- The Within-Cluster Sum of Squares (WCSS) is plotted against the number of clusters; the "elbow" is the point after which adding more clusters reduces WCSS only slightly.
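
One way to compute and plot the WCSS curve is sketched below, assuming scikit-learn's `KMeans` (the project description does not name the library) and matplotlib for the plot. The helper names `compute_wcss` and `plot_elbow`, the range of 1 to 6 clusters, and the synthetic three-blob data are all illustrative assumptions. scikit-learn exposes WCSS as the fitted model's `inertia_` attribute.

```python
import numpy as np
from sklearn.cluster import KMeans


def compute_wcss(X, max_clusters=10, random_state=42):
    """Fit K-means for k = 1..max_clusters and return the WCSS for each k."""
    wcss = []
    for k in range(1, max_clusters + 1):
        model = KMeans(n_clusters=k, init="k-means++", n_init=10,
                       random_state=random_state)
        model.fit(X)
        wcss.append(model.inertia_)  # sum of squared distances to nearest centroid
    return wcss


def plot_elbow(wcss):
    """Plot WCSS against the number of clusters."""
    import matplotlib.pyplot as plt

    ks = range(1, len(wcss) + 1)
    plt.plot(ks, wcss, marker="o")
    plt.xlabel("Number of clusters")
    plt.ylabel("WCSS")
    plt.title("Elbow Method")
    plt.show()


# Synthetic data with three well-separated groups (illustration only; in the
# project, X would hold the Annual Income and Spending Score columns).
rng = np.random.default_rng(0)
centers = np.array([[20.0, 20.0], [20.0, 80.0], [80.0, 20.0]])
X = np.vstack([c + rng.normal(scale=3.0, size=(30, 2)) for c in centers])

wcss = compute_wcss(X, max_clusters=6)
print(wcss)
```

Calling `plot_elbow(wcss)` draws the curve. For this synthetic data the bend should appear at three clusters, since that is how the data was generated; the elbow for the real dataset has to be read from its own plot.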
-
Training K-Means Model:
- The K-means model is trained with the number of clusters chosen by the elbow method.
- Each data point is assigned the label of the cluster it belongs to.
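
Training and labelling might look like the sketch below, again assuming scikit-learn's `KMeans`. The value `n_clusters = 3` is a placeholder that fits the synthetic data here, not the value found for the project's dataset.

```python
import numpy as np
from sklearn.cluster import KMeans

# Synthetic stand-in for the two selected feature columns (illustration only).
rng = np.random.default_rng(0)
centers = np.array([[20.0, 20.0], [20.0, 80.0], [80.0, 20.0]])
X = np.vstack([c + rng.normal(scale=3.0, size=(30, 2)) for c in centers])

# Placeholder cluster count; substitute the elbow read from your own WCSS plot.
n_clusters = 3

model = KMeans(n_clusters=n_clusters, init="k-means++", n_init=10,
               random_state=42)

# fit_predict trains the model and returns one cluster index per row of X.
labels = model.fit_predict(X)

print(np.bincount(labels))     # number of points in each cluster
print(model.cluster_centers_)  # centroid coordinates, one row per cluster
```

The `labels` array lines up row by row with `X`, so it can be attached to the original DataFrame as a new column or used to color points in a scatter plot.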
-
Visualization: