According to Traci Gusher-Thomas, Innovation & Enterprise Solutions Principal, “Any given organisation has 80 percent or more of its data locked up as unstructured or ‘dark’ data,” making it difficult to analyse and gain insights. This is especially true when considering new business models that allow any company—regardless of size or industry—to be nimbler and scale infinitely faster. By accessing services rather than growing internal functions and infrastructure, and by capitalising on emerging technologies such as artificial intelligence, 21st century enterprises can generate insights from massive amounts of data to improve business operations and transform customer experience.

When we think about the everything-as-a-service model, the huge dividend lies in how we use the data a service produces for optimisation and new insights. In the Indian context, this relates to the country's huge markets, its consumers, health requirements, and safety and security. It is vital not only to harness this resource for our advantage and development, but also to guard against economic and social crises and any risks to the security and sovereignty of the nation. However, the mayday-level urgency is yet to be recognised, and the limitless opportunities are yet to be captured and scaled.

Prime Minister Narendra Modi has described data as the “new oil” and the “new gold” and said that India has the potential to lead the world in the industrial revolution, which relies on big data analytics and digital technology to improve manufacturing. Research and development in every sphere of the economic, social and security domains needs many data scientists with expertise in institutional research (especially outcomes, enrolment, issues of diversity, retention and finances), survey research, higher education, public policy, data analysis and analytics, statistics, research design, economics, econometrics, predictive modelling and analytics, strategic planning, policy analysis, and programme evaluation and assessment. The main responsibilities to be assumed right now are: gathering vast amounts of structured and unstructured data and converting them into actionable insights; identifying the data-analytics solutions that hold the most significant potential to drive the growth of organisations; using analytical techniques such as text analytics, machine learning and deep learning to analyse data and thereby uncover hidden patterns and trends; encouraging a data-driven approach to solving complex business problems; cleansing and validating data to improve its accuracy and efficacy; and communicating the resulting observations and findings to company stakeholders through data visualisation.

India is the second-largest recruiter in the field of data science and data analytics, with 50,000 positions available, behind only the United States. The demand for data experts is equally competitive, whether you look at the big companies, the e-commerce industry or even startups. We also need accomplished programmers, whether freelancers or employees of companies, institutions and administrative bodies, who can work with statistical and survey research software packages such as Qualtrics, Tableau, SPSS, SAS, Stata and R, with data mining and predictive analytics tools, and with large academic and administrative data sets, and who are excellent with MS Access and Excel and good with relational databases, end-user and query tools, SQL and Python.

India has turned into a hotbed for cybersecurity experts too. According to a recent study by Indeed.com, the cybersecurity job market has become more competitive in India, with more clicks on job posts in India than in the US and the UK. As per industry statistics, hiring is highest for the roles of Network Security Engineer, Cyber Security Analyst, Security Architect, Cyber Security Manager, and Chief Information Security Officer.

Let us also take a brief look at the developmental changes and challenges in India, which has so far been a software provider but is now also one of the biggest network players, with one of the largest populations of digital and social media users. In 2020, India had nearly 700 million Internet users across the country. This figure was projected to grow to over 974 million users by 2025, indicating a big market potential in Internet services for the South Asian country. In fact, India was ranked as the second largest online market worldwide in 2019, second only to China. The number of Internet users was estimated to increase in both urban and rural regions, indicating dynamic growth in access to the Internet.

Of the total Internet users in the country, a majority access the Internet via their mobile phones. There are nearly as many smartphone users as Internet users across the country. The cheap availability of mobile data, a growing smartphone user base, and the utility value of smartphones as compared to desktops and tablets are some of the factors contributing to the mobile-heavy Internet access in India. In terms of mobile connectivity, Jio has emerged as the biggest player in the shortest time.

Growth is still on the cards. The mass of data generated through mobile usage is a game changer. Data lineage is a straightforward use case for metadata. The goal of data lineage is to track data records over their entire life cycle, back to their original data source. This can increase trust and acceptance by making data transparent and comprehensible for business users. Moreover, data lineage is very helpful in tracking down the causes of errors or complying with laws and regulations (e.g. in banking and finance). Complete data lineage requires tracking and storing metadata for every step in the data life cycle. Holistic metadata management is thus essential.
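To make the idea of tracing a record back to its original source concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the `LineageStep` class, the `trace_to_source` function and the system names are hypothetical, and real lineage tools record far richer metadata at each step.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class LineageStep:
    """One step in a record's life cycle (hypothetical, for illustration)."""
    system: str                              # where the data lived at this step
    operation: str                           # what was done to the data here
    parent: Optional["LineageStep"] = None   # the step this one was derived from


def trace_to_source(step: LineageStep) -> List[str]:
    """Follow parent links back to the original source, newest step first."""
    path = []
    current: Optional[LineageStep] = step
    while current is not None:
        path.append(f"{current.system}: {current.operation}")
        current = current.parent
    return path


# Assumed three-step life cycle for a single banking record.
source = LineageStep("core_banking_db", "original entry")
cleaned = LineageStep("staging_area", "validated and cleansed", parent=source)
report = LineageStep("regulatory_report", "aggregated by branch", parent=cleaned)

lineage = trace_to_source(report)
for entry in lineage:
    print(entry)
```

Because every step stores a link to the step it came from, a figure in the final report can be followed back through cleansing to the system where it was first entered, which is the kind of transparency auditors and business users look for.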

Data warehouse automation (DWA) is also a big topic nowadays. DWA automates the data warehouse (DW) life cycle, from source system analysis to testing and documentation, to reduce the effort required to build and run a DW. It depends on metadata, especially technical metadata about data structures, data types, relations and dependencies. DWA is very effective in saving costs and, when done correctly, can increase agility by accelerating development and change cycles. Moreover, automated tests can increase the quality and robustness of systems. Similarly, it helps to automate the extraction and transformation of metadata from data sources.
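As a rough sketch of what automated extraction of technical metadata can look like, the Python snippet below reads table structures, column types and foreign-key relations from a database catalogue. It assumes a SQLite source purely for convenience; the `extract_technical_metadata` function and the two sample tables are hypothetical, and a real DWA tool would cover many more source systems and metadata types.

```python
import sqlite3


def extract_technical_metadata(conn):
    """Collect columns, declared types and foreign keys for every table."""
    metadata = {}
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        # PRAGMA table_info rows: (cid, name, type, notnull, dflt_value, pk)
        columns = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        # PRAGMA foreign_key_list rows: (id, seq, table, from, to, ...)
        fkeys = conn.execute(f'PRAGMA foreign_key_list("{table}")').fetchall()
        metadata[table] = {
            "columns": [(col[1], col[2]) for col in columns],
            "foreign_keys": [(fk[3], fk[2], fk[4]) for fk in fkeys],
        }
    return metadata


# Assumed sample source system with two related tables.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE orders ("
    "id INTEGER PRIMARY KEY, "
    "customer_id INTEGER REFERENCES customers(id), "
    "amount REAL)"
)

schema_metadata = extract_technical_metadata(conn)
for table, info in schema_metadata.items():
    print(table, info)
```

The structures, data types and relations gathered this way are the raw material from which an automation tool could generate load scripts, tests and documentation instead of having them written by hand.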

Artificial intelligence and advanced analytics are also part of the era of big data. There is growing discussion of advanced analytics and innovative applications of artificial intelligence to get more out of data. However, with tons of unstructured data and concepts like data lakes, it becomes increasingly hard to stay on top of things. This is why metadata is necessary to ensure that big data does not just become a large collection of unusable trash data. Metadata is an essential tool that enables algorithms to automatically analyse and match data, thereby making big data manageable and yielding valuable results. Metadata also helps data scientists explore data sets and extract insights. For example, it can be hard for data scientists to understand certain data without knowing where it comes from, how it was calculated and what it means in business terms.

Metadata is often a good candidate for open data and data monetisation, since actual business data is usually too confidential or too business-critical to share with external parties. For instance, it is not very likely that a car manufacturer would share its customer data. However, metadata such as abstracted GPS profiles or engine metadata is not that critical and can be valuable to external partners, for instance, to improve route guidance systems or provide more accurate insurance rates.

Coming to its application at home, with our own satellite-based navigation augmentation system, GAGAN, many possibilities for more discrete innovation and application are opening up. There is also a trend to provide selected data through open APIs to enable external parties to explore and use all kinds of data for new applications. Data shared here is mostly rehashed metadata coming from various processes throughout an organisation.

Despite the large number of Internet users in the country, Internet penetration has taken longer to spread evenly. The number of women who have access to the Internet is much lower than that of men, and the gap is even more evident in rural India. Similarly, Internet usage is lower among older adults due to a lack of Internet literacy and technological know-how. If Internet access is encouraged among marginalised groups, including women, older people and rural inhabitants, India’s digital footprint has significant headroom left to grow.

Personally, I also see vast scope for reforms in cyber laws and regulations, and a need for trained human resources to safeguard against hackers and counter cyberattacks. Any company’s outreach should be legitimate and ethical. India must gear up to secure its business and security environments by having its own servers, bilateral agreements, and implementation of cyber policies and data taxation under company law. The rapid development of information technology is already posing new challenges to the law on such issues. These challenges are not confined to any single traditional legal category but arise in, for example, criminal law, intellectual property law, contract and tort. One such challenge is the growing menace of data theft, in which information in the form of data is illegally copied or taken from a business or individual without their knowledge or consent.

Data is a valuable asset in this modern IT-driven era. It is an important raw material for call centres and IT companies, and an important tool and weapon for corporates wishing to capture larger market shares. Because data is so important in this new era, its security is also a major issue for the IT industry. Piracy of data is a threat faced by IT players, who spend millions to compile or buy data from the market, and whose profits depend upon the security of that data. The ongoing case of Facebook and WhatsApp before the Supreme Court deals with the age-old debate of ‘national security vs privacy’, with the Court tasked yet again with balancing the two. The case, which started with the issue of linking Aadhaar with social media accounts and is now turning into a landmark case, could redefine intermediary liability law as well as bring in social media rules for the first time. Given the crucial role that intermediaries play in protecting the individual from the prying eyes of the State, the case has particularly significant implications for privacy as well. Two issues have come up for debate: first, whether intermediaries are under an obligation to decrypt the information in their possession and, second, whether the government can set up its own decryption agency, and what the surveillance powers and capabilities of such an agency would be.

I will sum up with this awakening thought by K.N. Govindacharya: “We have to think what should be the swadeshi framework of technology. Everything stemmed from the European vision of the globe and what was useful for their development: Free flow of capital and technology but not of humans. India was a nation of software exporters but hardly had a role in hardware, artificial intelligence, robotics, biotech or genetic engineering. In 1999-2000 (software engineers fixing Y2K), we were like computeriya mazdoors; like girmitiyas.” 

Captain Priti Sidharth Singh is a commercial pilot who flies the Boeing 737-700/800/900 and has logged seven thousand hours on jets. She serves as Senior Commander (Line Training Captain) at Air India Express and is also active in national social and development organisations. She is currently pursuing a PhD at IIT.