Big data was used in and for development long before the UN’s Global Pulse project, notably by Dr MS Swaminathan through the Rural Knowledge Centres. Big data holds huge promise, but it must be used carefully lest we end up merely fascinated by its techniques.

The proliferation of media of all kinds, especially 24/7 news media across print, TV and online, brings wide-ranging coverage of ‘human interest’ stories, government action and inaction, and the grassroots impact of government policies, along with heart-rending pictures that convey so much information, first-person narratives and citizen journalism. All of this adds up to an enormous amount of information that can be harvested to yield intelligence. Sustained gathering of anecdotal information on a well-defined topic may compensate for inadequate data collection. At the least, it will expose the gaps in data collection. Even if we stayed with just textual information, we should discover a fair bit of what we are looking for. The operative phrase is ‘looking for’. The sheer volume of information is not a deterrent as long as you are clear about what you are looking for.

India

The credit for grasping the significance of big data for development, at least in India, should go to Dr MS Swaminathan, who set up Rural Knowledge Centres (later christened Village Knowledge Centres, or VKCs) as a way of harnessing the enormous local wisdom in Indian agriculture. To improve organization, Village Resource Centres were created, each catering to a certain number of VKCs within a geographical area, reminiscent of the hub-and-spoke model used so widely in the FMCG industry. The idea has been taken up by many developing economies as a means of collating and indexing information for use and reuse. It can also serve as a testing ground for new policies, as public response can be gathered from its expression in various forms of social media, general and specialized. This is now quite a standard exercise, given the success of sentiment analysis, which so many students take up as a curricular project.

The Government of India has also begun to harness the potential of big data in the process of development. For example, when it set up an RKC in the Andamans, the government articulated its move thus: “For the development of Agriculture in this Union Territory, farmers have to be motivated, guided, assisted, supported and helped to have access to the latest research findings in the areas of Agriculture, Animal Husbandry and Fisheries by making available all the information that is required in their village so that they may not waste their time, energy and meager resources in transporting the inputs required and their produce to the different locations where the information of goods that they seek is available. With this motive ‘Rural Knowledge Centres’ is established throughout the Islands at identified locations which will function as “Rural Knowledge Bureau” for the respective area”. (http://agri.and.nic.in/RKC.html) While it is no surprise that the GoI used NABARD as the agency facilitating this effort, some commercial banks have also ventured into setting up VKCs, which makes a lot of business sense.

In 2016, the Government of India undertook an exercise that is a good example of the use of big data, although it was not described as such. As a story in The Economic Times put it, “The government wants to capture greater market share in global trade and has kick-started an exercise for mapping region specific exports to achieve this aim. It has identified the pharmaceutical sector as an “export commodity with high potential” to garner higher market share in Europe and auto components to drive exports growth in South America”. (https://economictimes.indiatimes.com/news/economy/foreign-trade/government-begins-mapping-region-specific-exports-for-bigger-share-of-global-trade/articleshow/52843325.cms)

Global Pulse

The UN began a project some years ago, titled Global Pulse, with the stated goal of using big data to facilitate the process of development. Before we discuss it any further, we must emphasize a critical point. The process of development involves both positive and negative factors, and while we need to pay attention to both at the same time, their relative importance may vary with circumstances. Mitigating the potential negative effects of something before it takes root is a huge contribution to development, as it saves both resources and time. You could look for early warning signals of an epidemic, an as yet undefined disease or the unequal distribution of a problem by looking at textual signals, that is, signals that the use of language can reveal.
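To make the idea of a textual early warning signal concrete, here is a minimal sketch in Python. It is not Global Pulse's method; it simply counts, week by week, how many reports mention any term on a watch list and flags a week whose count stands well above the preceding weeks. The watch terms, the function names and the threshold are all assumptions chosen for illustration.

```python
import re
from collections import Counter
from statistics import mean, pstdev

# Hypothetical watch list; a real project would agree on terms with domain experts.
WATCH_TERMS = frozenset({"fever", "rash", "outbreak"})


def weekly_mentions(reports):
    """Count, per ISO week, the reports that mention any watch term.

    `reports` is an iterable of (datetime.date, text) pairs.
    Returns a Counter keyed by (ISO year, ISO week).
    """
    counts = Counter()
    for day, text in reports:
        # Whole-word matching, so "rash" does not match "crash".
        words = set(re.findall(r"[a-z]+", text.lower()))
        if words & WATCH_TERMS:
            iso = day.isocalendar()
            counts[(iso[0], iso[1])] += 1
    return counts


def flag_spikes(counts, baseline_weeks=4, threshold=2.0):
    """Flag weeks whose count exceeds mean + threshold * std of the
    `baseline_weeks` preceding weeks present in `counts`.

    Weeks with no mentions at all are absent from `counts`, so the
    baseline only uses earlier weeks that had at least one mention.
    """
    weeks = sorted(counts)
    flagged = []
    for i in range(baseline_weeks, len(weeks)):
        history = [counts[w] for w in weeks[i - baseline_weeks:i]]
        limit = mean(history) + threshold * pstdev(history)
        if counts[weeks[i]] > limit:
            flagged.append(weeks[i])
    return flagged
```

A flagged week is only a prompt for people on the ground to look closer. The value of the exercise still depends on being clear about what you are looking for, which here means the watch list.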

The UN’s Global Pulse envisions utilizing the potential of big data, trying to bring to the fore what could be hidden in a vast, seemingly incomprehensible mass of words. There are many NLP techniques that can be used to find out what is relevant information. And, as the UN recognizes, the backbone of this possibility is a network of pulse labs across the world. Its expansion will “provide a space where partners from government, private sector and across the U.N. system can brainstorm practical challenges, design exploratory research projects, prototype applications and share findings”. The UN’s Global Pulse is quite ambitious, as it should be, given that it has a global network that tries to bring together different kinds of expertise to reduce asymmetry in information and knowledge. Consider this. The African Institute for Mathematical Sciences – Next Einstein Initiative (AIMS-NEI) states that it “received a 2-year funding from the International Development Research Centre to implement the Harnessing Big Data to meet the Sustainable Development Goals – Building Capacity in the Global South”, adding that “AIMS-NEI agrees to partner with the Local Development Research Institute to create the regional hub for Africa and collaborate with other regional hubs in Asia and Latin America to form the Global South network for this project”. (https://nexteinstein.org/industry-initiative-2/big-data-for-development-bd4d/) As you can make out, this is Knowledge Management, pure and simple. As we use such data more and more frequently, we will need improved indexing and grouping, which will probably come with increased usage.

Caution

Perhaps this is obvious, but it is worth spelling out: big data, as used in development, is largely textual data, although other forms of big data are also relevant. For instance, trends in the volume of housing-related search queries can be a more accurate predictor of house sales in the next quarter than the forecasts of real estate economists. Students of computer science who study analytics learn several techniques for dealing with big data, but students of development should not get carried away by all of these. The fascination with technology can be such that the purpose is lost sight of. What is important, from the perspective of development, is that information is available in multiple formats from multiple sources. The challenge lies in integrating them to portray a complete picture of the problem under study. Arguably, this is the most interesting and challenging aspect of big data.

We need to underline one more aspect. Big data is frequently assumed to mean real-time data, but that is not sacrosanct. Unlike trading, development does not need real-time data, except during epidemics or pandemics. What we need is filtered information, on the condition that the filters used are known to stakeholders. Otherwise, the integrity of such filtered information will be questionable.
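One way to keep filters visible to stakeholders is to give each filter a name and a plain-language description, and to publish a record of what each one removed alongside the filtered data. The Python sketch below illustrates that idea only; the class, the function and the record fields (`district`, `text`) are hypothetical and not drawn from any particular project.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class NamedFilter:
    """A filter that carries its own name and plain-language description."""
    name: str
    description: str
    keep: Callable[[dict], bool]  # returns True if the record should be kept


def apply_filters(records, filters):
    """Apply filters in order and return (remaining records, audit trail).

    The audit trail lists, for each filter, its name, its description and
    how many records it removed, so it can be shared with stakeholders.
    """
    remaining = list(records)
    audit = []
    for f in filters:
        before = len(remaining)
        remaining = [r for r in remaining if f.keep(r)]
        audit.append({
            "filter": f.name,
            "description": f.description,
            "removed": before - len(remaining),
        })
    return remaining, audit


# Hypothetical filters over records with "district" and "text" fields.
EXAMPLE_FILTERS = [
    NamedFilter("has_text", "Drop records with no text",
                lambda r: bool(r.get("text", "").strip())),
    NamedFilter("district_only", "Keep only records from the study district",
                lambda r: r.get("district") == "Kannur"),
]
```

Publishing the audit trail with the data lets anyone see which filters were used and how much each one discarded, which is what makes the filtered information defensible.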

It is instructive to see how one of the leading information & intelligence companies engages with this challenge. SAS identifies different sources of big data:

“Big data for development is constantly evolving. However, a preliminary categorization of sources may reflect:

  • What people say (online content): International and local online news sources, publicly accessible blogs, forum posts, comments and public social media content, online advertising, e-commerce sites and websites created by local retailers that list prices and inventory.
  • What people do (data exhaust): Passively collected transactional data from the use of digital services such as financial services (including purchase, money transfers, savings and loan repayments), communications services (such as anonymized records of mobile phone usage patterns) or information services (such as anonymized records of search queries)”.

(https://www.sas.com/en_us/insights/articles/big-data/big-data-global-development.html)

Clearly, there is a business opportunity even if it is masked as concern for development. In July 2019, Cisco announced that it “would build an Agri-Digital Infrastructure (ADI) Platform and set up Village Knowledge Centers (VKCs) in Kerala’s Kannur district for knowledge delivery and provide access to e-learning and advisory services to the farming and fishing communities”.

(https://www.business-standard.com/article/pti-stories/cisco-to-set-up-village-knowledge-centres-in-kerala-119070200657_1.html)

Even academic institutions have entered this space, offering focused courses with a clear message in the title, Big data & Development, some of them housed in their School of Information. Given this home, it is not surprising that they study (and teach) how satellite imagery can be used to highlight the extent of poverty in specific regions. Let us not split hairs as to whether satellite imagery is big data. There is a more important question to be asked: what does this trend mean for the future of Development Studies or Development Economics? We will consider this in a later post.

Takeaways

Big data holds great promise but must be carefully used

There is a huge variety of possibilities but keeping it simple and relevant should be the focus

Indexing and creating taxonomies of practices for use by others should be the goal

Photo by Markus Spiske from Pexels