Each year, millions of patient interactions are logged on mHealth platforms. Although organizations implement strategies to ensure data integrity when information is first captured, inconsistent or problematic data can find their way into mission-critical datasets. This reduces trust in data captured by frontline health workers, limits how effectively the data can support appropriate and timely treatment, and hinders broader data aggregation pathways for deeper analysis.
To address this, the Johnson & Johnson Center for Health Worker Innovation (the Center), with support from the Johnson & Johnson Foundation, teamed up with DataKind, a global non-profit organization providing pro-bono data science, machine learning and artificial intelligence innovation to social organizations. Building on previous collaboration, the Center is working to ensure Community Health Workers (CHWs) are supported in championing good data practices and have access to trustworthy data to inform their delivery of healthcare services.
The Center and DataKind’s work with Lwala Community Alliance (Lwala) in Kenya is one such project geared towards monitoring the quality of data at the point of collection. It supports not only frontline health workers, but also the supervisors, data analysts and health systems managers who play important roles in the rapid remediation of data quality issues.
Based in Migori County in Kenya, Lwala focuses on reducing maternal and child mortality and scaling community-led solutions to strengthen local, district and national health systems. The organization also empowers CHWs with digital data collection tools on Dimagi’s CommCare platform—the largest mHealth data collection and service delivery platform used by CHWs globally, with nearly 200,000 frontline workers and 700,000 users in total.
Lwala partnered with DataKind to determine the best way to apply data science and machine learning to automate data quality checks in a systematic and routine manner. The project has provided Lwala with data verification processes designed to detect issues early and the ability to track improvements in data quality over time. It has also strengthened Lwala’s own internal operations, proving useful in the training of its CHWs, increasing confidence in CHW-generated data, supporting greater CHW connection and integration, and improving data quality practices across key frontline health system stakeholders.
Furthermore, a data integrity dashboard now lets Lwala draw on the live data its frontline health workers generate and quickly isolate high- and low-performing metrics for maternal and child health and for water and sanitation, which in turn inform further on-the-ground investigation.
In early October 2021, Lwala began taking full ownership of the solution, with DataKind supporting the handover through several training sessions and presentations with clear learning outcomes.
Daniele Ressler, Director of Research, Learning & Impact at Lwala, says the organization prioritizes testing its programs, learning continuously, and sharing results. “Real-time digital data systems like our CommCare-based ‘Lwala Mobile’ app offer an innovative job aid to CHWs as they reach last-mile households and provide proactive care,” she says. “Our digital data system also allows Lwala to receive rapid CHW program feedback to track progress and quickly make program adjustments as needed on our path toward impact.”
Ressler goes on to say that the partnership with DataKind offered a unique opportunity to build confidence in community-based digital data through innovative data science solutions. “The data integrity dashboard tool designed by an impressive DataKind volunteer team allows Lwala to further identify user training and support needs quickly; ensure key care delivery metrics are as complete and accurate as possible; and validate the confidence we already hold in our digital data ecosystem,” she says. “We are excited our project is contributing to a larger cross-organizational data integrity initiative through strategic partnership and funding.”
DataKind’s commitment to developing community-oriented, scalable, and replicable data integrity solutions is evident in its earlier work with Medic in Siaya County, Kenya. With support from the Johnson & Johnson Foundation, this initial project built an automated data testing pipeline to scan maternal and child health data and rigorously test it for different types of data quality problems. DataKind and partners continue to explore pathways to increase trust in public health data by demonstrating the feasibility of data integrity solutions at a platform level.
“At DataKind, we’ve learned that better data can be a matter of life and death. When lack of ‘data trust’ becomes the norm, it can affect decision making throughout the entire health system, particularly in places where community health services account for a majority of all primary care visits,” says Mitali Ayyangar, Portfolio Lead, Frontline Health Systems, DataKind. “We’re honored to be working with partners like the Johnson & Johnson Center for Health Worker Innovation, Lwala Community Alliance and Medic to develop solutions that advance responsible data collection, advancing both data quality and data trust. We’re motivated by the impact this will have on enabling community health systems stakeholders to make decisions backed by trusted data and support health workers to optimize their care delivery and provide the right care at the right time.”