End-to-end AI and data systems for targeted surveillance and management of COVID-19 and future pandemics affecting Uganda (COAST)
In sub-Saharan Africa, data imbalances and underrepresentation can easily arise because socio-demographic conditions give people unequal access to the government and private services where data is collected. COAST will address these challenges through three specific objectives:
1. To strengthen data systems, resulting in usable and equitable datasets for AI-driven responses to COVID-19 and future pandemics.
2. To develop and deploy AI-driven detection and diagnosis tools for improved patient care and management.
3. To model and evaluate COVID-19 interventions for targeted government responses, based on the fused datasets from objectives 1 and 2.
Building NLP Text and Speech Datasets for Low Resourced Languages in East Africa
The project will deliver open, accessible, and high-quality text and speech datasets for low-resource East African languages from Uganda, Tanzania, and Kenya. Taking advantage of the advances in NLP and voice technology requires large corpora of high-quality text and speech data. This project aims to provide this data for the following languages: Luganda, Runyankore-Rukiga, Acholi, Swahili, and Lumasaaba. The speech data for Luganda and Swahili will be geared towards training a speech-to-text engine for an SDG-relevant use case, and towards general-purpose ASR models that could be used in tasks such as driving aids for people with disabilities and the development of AI tutors to support early education. Monolingual and parallel text corpora will support several NLP applications, including natural language classification, topic classification, sentiment analysis, spell checking and correction, and machine translation. This work is supported by;
Machine Learning Datasets for Crop Pest and Disease Diagnosis based on Crop Imagery and Spectrometry Data
This project will produce quality open and accessible image and spectrometry datasets from Uganda, Tanzania, Namibia, and Ghana for several crops that contribute to food security in sub-Saharan Africa, including cassava, maize, beans, bananas, pearl millet, and cocoa. The team, composed of data scientists and researchers from Makerere University, The Nelson-Mandela African Institution of Science and Technology, Namibia University of Science and Technology, and the karaAgro AI Foundation, expects that the image and spectral datasets will be used for early disease identification, disease diagnosis, and modelling disease spread, which will ultimately help in breeding resistant crop varieties. This work is supported by;
Image phenotyping for necrosis in cassava roots
We automate necrosis phenotyping, making it more efficient than current methods.
We use artificial intelligence to mine data from local village radio stations to generate timely data on crop pests and disease in sub-Saharan Africa. Crop loss due to pests and disease threatens the economic survival of smallholder farmers, and access to surveillance data is critically important yet often not affordable. Local radio shows are a powerful source of information flow in rural African villages: they cover topics including politics, policy, climate, and social circumstances, in addition to crop concerns. Collectively, this information provides a holistic representation of current events in these communities. We will analyze local broadcasts to generate crop surveillance data that is linked to the local community situation. Radio content will be collected at low cost through a collaboration with Pulse Labs Kampala, and we will build artificial intelligence models based on deep neural networks and keyword identification to mine the data. The results will be combined with photographs of diseased crops provided by local farmers and used to train machine learning models that can ultimately extract radio information in multiple languages and with diverse accents. This project will provide near real-time crop surveillance data and allow for timely responses to threats.
Early detection and diagnosis of crop diseases in asymptomatic plants.
Causal discovery in disease data

It is sometimes thought to be impossible to discover causes of events without any background knowledge or the ability to do experiments. However, the field of inferring causes and effects with purely observational data is developing. Correlation does not directly imply causation, but some patterns of association make particular causal relationships more likely than others.
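
As a minimal illustration of how a pattern of association can hint at causal structure, the sketch below simulates a hypothetical "collider": two independent causes x and y of a common effect z. The variable names, coefficients, and sample size are assumptions chosen for illustration and are not the project's method. In this structure x and y are uncorrelated on their own but become correlated once z is controlled for, a signature that a chain or common-cause structure would not produce.

```python
import math
import random


def pearson(a, b):
    """Sample Pearson correlation between two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)


def partial_corr(a, b, c):
    """First-order partial correlation of a and b, controlling for c."""
    r_ab, r_ac, r_bc = pearson(a, b), pearson(a, c), pearson(b, c)
    return (r_ab - r_ac * r_bc) / math.sqrt((1 - r_ac ** 2) * (1 - r_bc ** 2))


# Hypothetical data: x and y are independent, z is caused by both.
rng = random.Random(0)
n = 20_000
x = [rng.gauss(0, 1) for _ in range(n)]
y = [rng.gauss(0, 1) for _ in range(n)]
z = [xi + yi + 0.5 * rng.gauss(0, 1) for xi, yi in zip(x, y)]

marginal = pearson(x, y)             # close to zero
conditional = partial_corr(x, y, z)  # clearly negative
print(f"corr(x, y)     = {marginal:+.3f}")
print(f"corr(x, y | z) = {conditional:+.3f}")
```

Constraint-based causal discovery methods rely on asymmetries of this kind, although real data call for proper independence tests rather than raw correlations.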

This work is focused on developing fast methods to find strong causes and effects related to a target variable from a large set of covariates. This is useful (1) for gaining insight into a domain, and (2) for predicting the effects of interventions. We are particularly interested in applying this to data collected in Uganda concerning the prevalence of disease and the outbreak of epidemics such as cholera and Ebola. This analysis could confirm or disconfirm our ideas about the climatic, demographic and environmental factors that are thought to influence such events. An indication of the relative strengths of different causes can also help in predicting the efficacy of different eradication policies. Our entry to the NIPS 2008 causal discovery competition received an honourable mention for “significant advance on the REGED dataset”.
Mobile monitoring of crop disease
Cassava is the world’s third-largest source of carbohydrate and can grow in hostile conditions where other crops cannot, but it has one major weakness: susceptibility to viral disease. Monitoring the spread of disease is essential in countries that depend on cassava as a staple crop, but the processes currently employed are expensive and slow. We are working on an automated system that uses $100 smartphones to capture images, diagnose disease with computer vision techniques and provide real-time map information, with extensions into banana diseases and automated pest surveys. For more information about this project, see the blog. Work supported by the Bill & Melinda Gates Foundation.
Automated malaria diagnosis with digital microscopy

The most reliable test for malaria is microscopic examination of blood films for the presence of the parasite. The problem is that this requires equipment and an on-site expert to use it. Some researchers have recently indicated the promise of combining microscopy with mobile phones, so that an expert need not be physically present, and others have investigated the use of computer vision techniques for automatic classification, so that a human expert need not be available at all. However, all of this work has been undertaken in ideal laboratory conditions. We are developing these ideas and trialling an automated diagnosis system in the field, intended for use by non-experts. We work with thick blood film slides.

For more information about this project, see here. Work supported by Microsoft Research.
Data generation and language technology for low-resourced African languages

Developing natural language processing techniques for tasks such as Machine Translation (MT) requires monolingual and cross-lingual resources. Currently, exploring advances in NLP techniques for low-resource languages and language pairs in the developing world is complicated by the lack of data resources. For example, in Uganda, where there are over 40 independent languages, there are neither monolingual nor bilingual or multilingual resources for developing NLP systems of the kind that significantly benefit well-resourced languages. We are using both manual and existing automated methods to build bilingual corpora for several language pairs involving low-resourced African languages. We plan to use the corpora to explore several NLP applications involving the respective low-resourced African languages.

Work supported by a Google Research Award.

Robust traffic flow monitoring
Traffic monitoring systems usually make assumptions about the movement of vehicles, such as that they drive in dedicated lanes, and that those lanes rarely include non-vehicle clutter. Urban settings within developing countries often present extremely chaotic traffic scenarios which make these assumptions unrealistic. We are working on robust techniques for traffic congestion monitoring. Instead of tracking individual vehicles we treat a lane of traffic as a fluid and estimate the rate of flow.
Spatiotemporal models for biosurveillance
It is useful to know the geographical density of a transmittable disease in order to plan interventions and to predict its future developments. This can be difficult where there is a lack of co-ordinated statistics, which is often the case where diseases like malaria or tuberculosis are endemic. In such situations it is possible to combine irregular updates from a variety of less consistent sources. We are looking at the use of spatiotemporal state space models for biosurveillance, for use when there are irregular updates about disease counts.
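
To make the idea of filtering with irregular updates concrete, here is a minimal sketch of a one-dimensional local-level Kalman filter for a single location. It is an illustration under stated assumptions rather than the project's model: the function name `filter_counts`, the noise variances, and the weekly case counts are all hypothetical, and a full spatiotemporal model would couple many such states across neighbouring areas. Weeks with no report are marked `None`; in those weeks the filter only predicts, so its uncertainty grows until the next report arrives.

```python
def filter_counts(observations, process_var=4.0, obs_var=25.0,
                  init_mean=0.0, init_var=1e6):
    """Local-level Kalman filter; None marks a time step with no report.

    Returns a list of (mean, variance) estimates, one per time step.
    All default parameter values are illustrative assumptions.
    """
    mean, var = init_mean, init_var
    estimates = []
    for obs in observations:
        # Predict: the latent level follows a random walk, so variance grows.
        var += process_var
        if obs is not None:
            # Update: weight the report by how uncertain the prediction is.
            gain = var / (var + obs_var)
            mean += gain * (obs - mean)
            var *= 1.0 - gain
        estimates.append((mean, var))
    return estimates


# Hypothetical weekly case counts with gaps in reporting.
weekly_reports = [120, None, None, 135, 150, None, 160]
for week, (mean, var) in enumerate(filter_counts(weekly_reports)):
    print(f"week {week}: estimate {mean:7.2f}, variance {var:6.2f}")
```

The same predict-then-update structure extends to sources of differing reliability by giving each source its own observation variance.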
Kudu: Auction design for agricultural commodity trading

We are trialling an auction system called Kudu, which is designed for trading agricultural commodities in Uganda by phone or web. This is a double auction, meaning that buyers and sellers submit their information separately, and we computationally find the best matches.
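
The matching step of a double auction can be sketched as follows. This is a simplified illustration, not Kudu's actual algorithm: it matches on price alone, settles each trade at the midpoint of the two prices, and uses hypothetical names and figures throughout. The highest bids are paired with the lowest asks until the best remaining bid falls below the best remaining ask.

```python
def match_orders(bids, asks):
    """Greedy double-auction matching on price (illustrative assumption).

    bids: list of (buyer, max_price, quantity)
    asks: list of (seller, min_price, quantity)
    Returns a list of (buyer, seller, quantity, price) trades.
    """
    # Copy into mutable lists: highest bid first, lowest ask first.
    bids = sorted(([b, p, q] for b, p, q in bids), key=lambda o: -o[1])
    asks = sorted(([s, p, q] for s, p, q in asks), key=lambda o: o[1])
    trades = []
    i = j = 0
    while i < len(bids) and j < len(asks) and bids[i][1] >= asks[j][1]:
        qty = min(bids[i][2], asks[j][2])
        price = (bids[i][1] + asks[j][1]) / 2  # midpoint pricing rule
        trades.append((bids[i][0], asks[j][0], qty, price))
        bids[i][2] -= qty
        asks[j][2] -= qty
        if bids[i][2] == 0:
            i += 1  # this buyer is fully served
        if asks[j][2] == 0:
            j += 1  # this seller has sold out
    return trades


# Hypothetical orders: prices per kg, quantities in kg.
bids = [("buyer_a", 1200, 50), ("buyer_b", 1000, 30)]
asks = [("seller_x", 900, 40), ("seller_y", 1100, 60)]
trades = match_orders(bids, asks)
for trade in trades:
    print(trade)
```

A deployed system would presumably also need to account for factors such as crop type and location when deciding which buyers and sellers are compatible.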

This approach seems more promising than both single auction systems (i.e. listings sites, which can’t be used with a basic phone or anywhere bandwidth is scarce) and price advisory systems, which have problems with accuracy and timeliness (wholesale market prices in Kampala change in the course of hours, so a weekly price bulletin is of limited use). By matching buyers and sellers algorithmically we can overcome these problems. The prototype web interface to the auction system can be tried here (requires a Ugandan mobile phone number for registration), or text BUY or SELL to 8228. The crops we currently support are coffee, beans, sweet banana and watermelon. Work supported by a Google Research Award.