As the COVID-19 pandemic continues to wreak havoc on economies around the world, poverty levels are on the rise, with many individuals and families struggling to make ends meet. In low- and middle-income countries, identifying those most in need of social protection programs can be challenging, especially when traditional administrative data, such as tax records, are unavailable for the large share of people who work informally. However, recent research from UC Berkeley and the World Bank suggests that mobile phone data combined with machine learning could offer a promising solution to this problem.
The researchers analyzed call detail records (CDRs) from a large mobile phone operator in Afghanistan to evaluate how accurately such records can identify ultra-poor households eligible for social protection programs. CDRs contain information on phone numbers, communication patterns, networks of contacts, and recharge patterns, among other things, all of which can be analyzed to identify households in need.
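To make the idea of turning raw CDRs into household-level signals more concrete, here is a minimal Python sketch that aggregates toy call and recharge records into a handful of per-subscriber indicators. The field names (caller, callee, duration_s, subscriber, amount), the function name behavioral_indicators, and the records themselves are illustrative assumptions, not the operator's actual schema or the researchers' feature pipeline.

```python
from collections import defaultdict

# Toy records. Field names and values are made up for illustration only.
CALLS = [
    {"caller": "A", "callee": "B", "duration_s": 60},
    {"caller": "A", "callee": "C", "duration_s": 30},
    {"caller": "A", "callee": "B", "duration_s": 90},
    {"caller": "B", "callee": "A", "duration_s": 45},
]
RECHARGES = [
    {"subscriber": "A", "amount": 50},
    {"subscriber": "A", "amount": 20},
    {"subscriber": "B", "amount": 10},
]


def behavioral_indicators(calls, recharges):
    """Aggregate raw records into simple per-subscriber indicators."""
    raw = defaultdict(lambda: {
        "calls_out": 0,
        "total_duration_s": 0,
        "contacts": set(),
        "recharge_count": 0,
        "recharge_total": 0,
    })

    # Communication patterns and contact network, from outgoing calls.
    for call in calls:
        row = raw[call["caller"]]
        row["calls_out"] += 1
        row["total_duration_s"] += call["duration_s"]
        row["contacts"].add(call["callee"])

    # Recharge patterns, from top-up events.
    for recharge in recharges:
        row = raw[recharge["subscriber"]]
        row["recharge_count"] += 1
        row["recharge_total"] += recharge["amount"]

    # Replace the contact set with its size so every feature is numeric.
    features = {}
    for subscriber, row in raw.items():
        features[subscriber] = {
            "calls_out": row["calls_out"],
            "total_duration_s": row["total_duration_s"],
            "distinct_contacts": len(row["contacts"]),
            "recharge_count": row["recharge_count"],
            "recharge_total": row["recharge_total"],
        }
    return features


features = behavioral_indicators(CALLS, RECHARGES)
for subscriber in sorted(features):
    print(subscriber, features[subscriber])
```

A real pipeline would compute many more indicators per subscriber and would then have to link subscribers to households, a step this sketch leaves out.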
Three methods for identifying ultra-poor households were compared: a supervised machine learning model trained on CDR data, an asset-based wealth index, and a consumption metric commonly used to measure poverty in low- and middle-income countries. The supervised model was trained on 797 behavioral indicators computed from the CDR data, and the algorithm selected for it outperformed the other common machine learning algorithms that were tried. An approach that combined all three methods was the most promising, reaching an area under the ROC curve (AUC) of 0.78.
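For readers unfamiliar with AUC, it can be read as the probability that a randomly chosen ultra-poor household receives a higher risk score than a randomly chosen non-ultra-poor household, so 0.5 is chance and 1.0 is a perfect ranking. The sketch below computes it directly from that definition in plain Python. The labels and scores are invented toy values, and averaging two scores is only one simple, assumed way to combine methods; the summary above does not say how the study combined them.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: 1 for ultra-poor, 0 otherwise.
    scores: higher means "more likely ultra-poor".
    Ties between a positive and a negative count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Invented toy data: 3 ultra-poor households, 5 others.
labels = [1, 1, 1, 0, 0, 0, 0, 0]
cdr_score = [0.9, 0.4, 0.7, 0.5, 0.2, 0.3, 0.6, 0.1]
asset_score = [0.8, 0.7, 0.3, 0.4, 0.5, 0.2, 0.1, 0.6]

# Assumed combination rule: a plain average of the two scores.
combined_score = [(c + a) / 2 for c, a in zip(cdr_score, asset_score)]

print("CDR only:   ", round(auc(labels, cdr_score), 3))
print("Assets only:", round(auc(labels, asset_score), 3))
print("Combined:   ", round(auc(labels, combined_score), 3))
```

The toy scores were chosen so that each method misranks different households, which is the situation in which combining methods helps. They say nothing about the magnitudes reported in the study.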
While CDR-based targeting can reduce both the time and the marginal cost of identifying eligible households, it has practical limitations and raises ethical concerns. It depends on access to phone data, so targeting accuracy will suffer if data is unavailable for some segments of the population, such as people without a phone, or if a provider does not permit access to its data. Using CDRs to determine program eligibility may also give individuals an incentive to change their phone behavior strategically in order to qualify. Finally, because CDRs contain sensitive and private information, informed consent and clear privacy standards must be put in place to protect individuals.
Despite these challenges, CDR data and machine learning present a significant opportunity to improve the targeting of social protection programs in low- and middle-income countries, particularly in the wake of the COVID-19 pandemic. By drawing on mobile phone data and machine learning algorithms, policymakers and development practitioners could target ultra-poor households more accurately and efficiently, delivering much-needed assistance to those who need it most.
In conclusion, this study highlights the potential of CDR data and machine learning for targeting ultra-poor households in social protection programs. The results suggest that combining CDR and asset data may be the most feasible option for identifying ultra-poor households, while the highest accuracy came from combining all three methods.
However, the limitations and ethical concerns of CDR-based targeting must be taken into account when implementing it, including dependence on access to phone data, privacy risks, and the potential for strategic behavior. With appropriate safeguards in place, CDR data and machine learning can be a powerful tool to combat poverty and support vulnerable populations in low- and middle-income countries.