OTIENO, CHRISGONE ADEDE
Student Short Biography:
Dr. Chrisgone Adede is a data scientist with an interest in exploring the potential of digital technologies in solving societal and business challenges. His core interests include machine learning, data visualization, systems analysis and data protection. A data fanatic, Chrisgone works for a European Union (EU) funded drought management project where he leverages his knowledge and experience in predictive data modelling to address drought-associated risks. In this role, he designs systems that enhance transparency in the use of contingency funds, and assesses systems' responsiveness to shocks and their potential to scale.
Previously, Chrisgone worked in systems development and business intelligence in diverse sectors including technology consulting, medical research, micro-finance, banking and finance.
Chrisgone, who believes that data not only tells an organization’s story but also defines the future of organizations, holds a BSc in Computer Science from Egerton University, an MSc and a PhD in Computer Science from the University of Nairobi and is a PRINCE2© certified project manager.
Project Summary
Thesis / Project Title: Model Ensembles for Predictive Drought Severity and Drought Effects Monitoring using Remote Sensing & Socio-Economic data
Thesis / Project Abstract:
The increasing frequency of droughts, especially in the Greater Horn of Africa, and their effects on livelihoods have led to an increase in the demand for ex-ante drought early warning systems that are stable, highly predictive and that have sufficient lead times. The study uses artificial neural networks (ANN) and support vector regression (SVR) as case study techniques to build models that predict both drought severity and drought effects 1 month ahead. Vegetation condition index aggregated over 3 months and nutrition status of children below 5 years are used as the proxy variables for drought severity and effects respectively. Homogeneous and heterogeneous ensembles are built using three approaches: simple averaging, ranked weighted averaging and model stacking. We overproduce 244 ANN and SVR models, from which we select 111 models for model ensembling. In regression, the stacked heterogeneous model ensemble, with an R² of 0.94, is shown to outperform both the homogeneous ensembles and the individual champion models, which post a maximum R² of 0.83. Similarly, in classification, the heterogeneous stacked ensemble improves on the SVR and ANN champion models by 9 and 11 percentage points respectively, with even better performance in outlier classes. We conclude that, despite the computational resource intensiveness of model ensembling, the returns in predictive performance are worth the investment. We nevertheless advise building ensembles from more diverse techniques over extended forecasting periods to estimate the prediction skill of model ensembles over longer lead times.
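As an illustration only, two of the ensembling approaches named in the abstract (simple averaging and ranked weighted averaging) can be sketched in a few lines of Python, together with the R² score used to compare models. This is a minimal sketch and not the thesis implementation: the abstract does not define the ranking scheme, so the rule used here (weight proportional to a model's rank by validation score, with the best model ranked highest) is an assumption, and all function names and toy numbers are hypothetical. Model stacking, which trains a further model on the member predictions, is not shown.

```python
from statistics import mean


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def simple_average(predictions):
    """Unweighted mean of the member predictions at each time step."""
    return [mean(step) for step in zip(*predictions)]


def rank_weights(scores):
    """ASSUMED scheme: weight proportional to rank; the best score gets rank n."""
    n = len(scores)
    order = sorted(range(n), key=lambda i: scores[i])  # worst model first
    ranks = [0] * n
    for rank, idx in enumerate(order, start=1):
        ranks[idx] = rank
    total = n * (n + 1) / 2  # sum of ranks 1..n, so the weights sum to 1
    return [r / total for r in ranks]


def rank_weighted_average(predictions, scores):
    """Weighted mean of member predictions, weighted by validation rank."""
    weights = rank_weights(scores)
    return [sum(w * p for w, p in zip(weights, step))
            for step in zip(*predictions)]


if __name__ == "__main__":
    # Hypothetical toy values, not data from the study.
    observed = [40.0, 35.0, 30.0, 45.0]
    members = [
        [42.0, 33.0, 31.0, 44.0],
        [38.0, 38.0, 27.0, 47.0],
        [45.0, 30.0, 35.0, 40.0],
    ]
    # Scored on the same toy series for brevity; held-out data would be used in practice.
    scores = [r_squared(observed, m) for m in members]
    print("member R2:", scores)
    print("simple average R2:", r_squared(observed, simple_average(members)))
    print("rank-weighted R2:",
          r_squared(observed, rank_weighted_average(members, scores)))
```

Whether the members are all ANN models, all SVR models, or a mix of both is what distinguishes a homogeneous from a heterogeneous ensemble; the combination functions above are the same in either case.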