PANYAKO, ASHA

PANYAKO, MAKANA ASHA

Asha is a business intelligence developer with four years of experience in the telecommunications industry. Her current role entails data engineering, data mining, data analytics and visualization using tools and languages such as Python, Spark scripting and SQL on both relational and non-relational databases. She is also passionate about community development and currently serves as the Impact Officer at the Nairobi Hub of the Global Shapers Community. She holds a BSc in Computer Science from Karatina University.

Project Summary

Project Title: Customer Segmentation on Mobile Money Users in Kenya.

Research Supervisor: Dr. Evans Miriti

Abstract: Customer segmentation enables organizations to partition a market into subsets that have common needs, interests and priorities. This helps businesses to design products and strategies that fulfil customer needs. Using network and mobile money data, this study compared several clustering algorithms with the aim of identifying the one that creates the most robust customer profiles. Hierarchical (agglomerative) clustering, KMeans and affinity propagation (AP) were used to segment customers and were compared using internal validation measures. Our dataset comprised various demographic and behavioural features obtained from a telecommunications company's data warehouse. Correlation between the features was tested, which enabled us to focus on the following features for modelling: age, network revenue, amounts transacted on mobile money, frequency of loan uptake, customer and organization transfers, goods and service payments, and deposits and withdrawals. The algorithms were then fitted to the dataset. Agglomerative clustering generated seven clusters with a normalized mutual information (NMI) score of 0.5526, an adjusted Rand score of 0.5436 and a silhouette coefficient of 0.4523. KMeans generated 11 clusters with an NMI score of 0.5168, an adjusted Rand score of 0.3315 and a silhouette coefficient of 0.2369. Affinity propagation generated the largest number of clusters (504), had a memory utilization of 91% and took the longest time to execute, which established AP as unsuitable for our dataset. Agglomerative clustering had the best performance in terms of the compactness and connectedness of clusters; however, the clusters obtained from KMeans were more granular than the agglomerative clustering segments.
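
The comparison described in the abstract can be illustrated with a minimal sketch, assuming scikit-learn is available. This is not the study's code: the data below is a synthetic stand-in produced by make_blobs, the cluster counts are arbitrary, and the reference labels used for the NMI and adjusted Rand scores are the synthetic blob labels, since the abstract does not say which reference partition the study used.

```python
from sklearn.cluster import AffinityPropagation, AgglomerativeClustering, KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import (
    adjusted_rand_score,
    normalized_mutual_info_score,
    silhouette_score,
)
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the customer features (assumption, NOT the study's data).
X, reference_labels = make_blobs(
    n_samples=300, centers=4, n_features=5, random_state=0
)
# Scale features so that no single feature dominates the distance computations.
X = StandardScaler().fit_transform(X)

# The three algorithm families named in the abstract; parameters are illustrative.
models = {
    "agglomerative": AgglomerativeClustering(n_clusters=4),
    "kmeans": KMeans(n_clusters=4, n_init=10, random_state=0),
    "affinity_propagation": AffinityPropagation(random_state=0),
}

results = {}
for name, model in models.items():
    labels = model.fit_predict(X)
    n_clusters = len(set(labels))
    # The silhouette coefficient is only defined for 2 .. n_samples - 1 clusters.
    silhouette = (
        silhouette_score(X, labels) if 1 < n_clusters < len(X) else float("nan")
    )
    results[name] = {
        "n_clusters": n_clusters,
        "silhouette": silhouette,
        "nmi": normalized_mutual_info_score(reference_labels, labels),
        "ari": adjusted_rand_score(reference_labels, labels),
    }

for name, scores in results.items():
    print(name, scores)
```

Affinity propagation chooses its own number of clusters, whereas the other two models take the cluster count as a parameter, so in this sketch only its n_clusters value is determined by the data.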