India’s Data Annotation Revolution: Gig Workers Fueling AI Training Boom

by Drishya Madhur - May 9, 2024, 4:29 pm

Akash Pandey, a 26-year-old government job aspirant from Basti, Uttar Pradesh, found flexible work online that pays Rs 12,000-13,000 per project for transcribing audio and marking objects in images.

Ikshita Nagar, a 26-year-old doctor from Delhi preparing for the PG entrance test, meanwhile dedicated extra hours to classifying types of wounds in images (burns, abrasions, or surgical) and solving NEET questions.

The global market for data annotation is projected to reach $8.22 billion by 2028, growing at an annual rate of 26.2%. India’s segment of this market could exceed $7 billion by 2030, with a workforce of up to 1 million.
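To give a sense of what a 26.2% annual growth rate means, the short Python sketch below works backward from the $8.22 billion projection for 2028. It assumes the rate compounds once a year and holds constant, which the projection does not spell out; the function name `implied_size` is made up for this illustration, and the base year of the forecast is not given in the source.

```python
def implied_size(target, annual_rate, years_before):
    """Market size implied `years_before` years ahead of the target year,
    assuming constant annual compounding (an assumption, not a reported figure)."""
    return target / (1 + annual_rate) ** years_before


# Walk back from the 2028 projection ($8.22 billion at 26.2% a year).
for years_before in range(0, 5):
    year = 2028 - years_before
    size = implied_size(8.22, 0.262, years_before)
    print(f"{year}: about ${size:.2f} billion")
```

The values printed for the earlier years are implied by the arithmetic alone and should not be read as reported market sizes.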

Alok Aggarwal, CEO of AI startup ScryAI, stated, “Annotation-as-a-service is on a meteoric rise, especially in India.”

According to TeamLease, 20,000 full-time workers in India are engaged in managed services as annotators, with an additional 50,000 Indian annotators employed as independent contractors on international platforms.

Databricks, Fractal, Tredence, and startups like Cropin and Minus Zero are expanding their in-house teams for data annotation while also relying on outsourced services in India.

Rajesh Ramdas, senior director at Databricks India, emphasized the importance of the human-in-the-loop approach for the MLOps (machine learning operations) pipeline.
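As a rough illustration of what a human-in-the-loop step can look like, the Python sketch below sends a model's low-confidence labels to human annotators and accepts the rest automatically. It is not Databricks' pipeline: the `Prediction` class, the `route_predictions` function, the 0.9 threshold, and the sample items and scores are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Prediction:
    item_id: str
    label: str
    confidence: float  # model's score between 0 and 1 (illustrative values)


def route_predictions(predictions, threshold=0.9):
    """Split predictions into auto-accepted labels and a human review queue.

    The threshold is an assumed cut-off, chosen here only for illustration.
    """
    auto_accepted, needs_review = [], []
    for p in predictions:
        if p.confidence >= threshold:
            auto_accepted.append(p)
        else:
            needs_review.append(p)
    return auto_accepted, needs_review


# Made-up sample batch; labels borrow the wound types mentioned earlier.
batch = [
    Prediction("img-001", "burn", 0.97),
    Prediction("img-002", "abrasion", 0.55),
    Prediction("img-003", "surgical", 0.91),
]

accepted, review_queue = route_predictions(batch)
print("auto-accepted:", [p.item_id for p in accepted])
print("sent to annotators:", [p.item_id for p in review_queue])
```

In a setup like this, labels corrected by annotators would typically be fed back into training, which is where the human input enters the loop.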

Desicrew Solutions, based in Chennai, reported average growth of 50% over the past few years, driven by increasing demand for annotation for large language models (LLMs).

Manivannan JK, Desicrew’s CEO, highlighted the nuanced nature of annotation for LLMs, which requires annotators to consider factors such as sentiment.

Suraj Amonkar, chief AI research & platforms officer at Fractal, noted the impact of self-supervised learning and open-source datasets on reducing the need for manual tasks in data sorting and marking.

Soumendra Mohanty, chief strategy officer at Tredence, said annotation is evolving into a sub-segment at multiple firms, with the minimum qualification being that of a business analyst with domain knowledge.

Hardik Dave, founder and CEO of startup IndikaAI, highlighted the earning potential for skilled annotators, saying a radiologist can make up to Rs 1 lakh per month for a few hours of work.

Ikshita Nagar sees annotation as more than a second income: for her it is a practice ground and, especially for practicing doctors, a way to participate in healthcare innovation.

Nagar freelances, alongside a team of professionals and amateurs, on Flexibench, a managed services platform created by Indika.AI. The platform hosts an on-demand workforce of 23,000 registered contributors for programmatic data labeling and fine-tuning of foundation models.