A brand-new gray industry: thousands of people worldwide are selling themselves to train AI, but at what price?

CryptoCity

Thousands of people around the world are selling their voices, images, and call records to feed AI in exchange for income, but they face the risks of deepfakes and irrevocable licensing.

Deep Tide Editorial: An investigative report by The Guardian reveals a rapidly growing gray industry: thousands of people globally are earning AI training fees by selling their voices, faces, call records, and daily videos. This is not a vague discussion of privacy controversies, but an investigation with real people, real amounts, and real consequences. One actor who sold his face later saw "himself" promoting an unknown medical product on Instagram, with commenters evaluating his "appearance." As AI companies' hunger for data meets global economic disparity, an unequal transaction is taking shape.

The full text is as follows:

One morning last year, Jacobus Louw, who lives in Cape Town, South Africa, went for his usual walk, feeding seagulls along the way. But this time he recorded a few clips—capturing his footsteps and views as he walked on the sidewalk. This video earned him $14, about 10 times the country’s minimum wage, equivalent to half a week of food expenses for the 27-year-old.

This was a "city navigation" task he completed on Kled AI, an app that pays users to upload photos, videos, and other data for training AI models. In just a few weeks, Louw earned $50 by uploading photos and videos from his daily life.

Thousands of miles away in Ranchi, India, 22-year-old student Sahil Tigga regularly earns money through Silencio—an app that crowdsources audio data for AI training, accessing his phone’s microphone to capture ambient noise from restaurants or busy intersections. He also uploads recordings of his voice. Sahil makes special trips to unique locations, like hotel lobbies that have not yet been mapped on the Silencio app. He earns over $100 a month from this, enough to cover all his food expenses.

In Chicago, 18-year-old welding apprentice Ramelio Hill sold his private phone chat records with friends and family to Neon Mobile, a conversational AI training platform that pays $0.50 per minute, earning hundreds of dollars. For Hill, the math is simple: he believes tech companies already have a lot of his private data, so he might as well get a cut of it.

These “AI training gigs”—uploading surrounding scenes, personal photos, videos, and audio—are at the forefront of a global new data gold rush. As Silicon Valley’s demand for high-quality human data exceeds what can be scraped from the open web, a burgeoning data market industry has emerged to fill this gap. From Cape Town to Chicago, thousands are granting micro-licenses of their biometric identities and private data to the next generation of AI.

But this new gig economy comes at a cost. The few dollars earned are fueling an industry that may ultimately render these workers' skills obsolete, while exposing them to future risks of deepfakes, identity theft, and digital exploitation, risks they are only beginning to understand.

Keeping the AI Gears Turning

AI language models like ChatGPT and Gemini require vast amounts of learning material to continue improving, but they are facing a data drought. The websites behind the most commonly used training datasets, such as C4, RefinedWeb, and Dolma, are increasingly restricting access: researchers have found that roughly a quarter of the highest-quality data in these sources is now off-limits to AI companies scraping content for model training. Researchers estimate that AI companies could run out of fresh, high-quality text as early as 2026. Although some labs have begun training on synthetic data generated by AI itself, this recursive process can cause models to produce erroneous "garbage" and ultimately degrade, a failure mode known as model collapse.

Image source: The Guardian

Apps like Kled AI and Silencio are stepping in to fill this gap. In these data marketplaces, millions of people are feeding and training AI by selling their identity data. Besides Kled AI, Silencio, and Neon Mobile, AI trainers have many options: Luel AI, backed by the famous incubator Y Combinator, acquires multilingual conversation material for about $0.15 per minute; ElevenLabs allows users to digitally clone their voice and offer it to others at a base rate of $0.02 per minute.

Bouke Klein Teeselink, an economics professor at King’s College London, states that AI training gigs are an emerging category of work that will grow significantly.

AI companies know that paying people for data licenses helps them avoid the copyright disputes that can arise from relying entirely on web scraping, Teeselink says. AI researcher Veniamin Veselovsky adds that these companies also need high-quality data to model new, improved behaviors in their systems. "Currently, human data is the gold standard for sampling from outside the model distribution," he says.

The humans powering these machines, especially those in developing countries, often need the money and have little choice. For many AI training gig workers, taking on this work is a pragmatic response to economic disparity. In countries with high unemployment and depreciating local currencies, earning dollars is often more stable and lucrative than local jobs. Some people who struggle to find entry-level work are pushed into AI training to make a living. Even in wealthier countries, rising living costs have made selling one's data a logical financial choice.

AI trainer Louw in Cape Town is acutely aware of the privacy costs involved. Although his income is unstable and insufficient to cover all his monthly expenses, he is willing to accept these conditions to earn money. He has struggled with a neurological disease for years and has been unable to find work, but the money earned from AI data markets (including Kled AI) has allowed him to save $500 to enroll in a spa training course to become a massage therapist.

“As a South African, receiving dollars is worth more than people realize,” Louw says.

Mark Graham, a professor of internet geography at Oxford University and author of “Feeding the Machine,” acknowledges that for individuals in developing countries, this money may have practical significance in the short term, but he warns, “Structurally, this work is unstable, with no upward mobility, and is essentially a dead end.”

Graham adds that the AI data market relies on “wage competition” and “the temporary demand for human data.” Once that demand shifts, “workers will have no security, transferable skills, or safety nets.”

Graham states that the only winners are “platforms in the Global North that capture all the lasting value.”

Image source: The Guardian

Full Authorization

AI trainer Hill from Chicago has mixed feelings about selling his private phone calls to Neon Mobile. About 11 hours of call content earned him $200, but he says the app often goes offline and delays payments. “Neon has always seemed suspicious to me, but I kept using it just to earn some extra cash to pay bills,” Hill says.

Now he is starting to reconsider whether the money is really that easy. In September of last year, just a few weeks after Neon Mobile launched, it went offline after TechCrunch discovered a security vulnerability that allowed anyone to access users’ phone numbers, call recordings, and text transcripts. Hill says Neon Mobile never notified him of this situation, and now he is worried his voice might be misused online.

Jennifer King, a data privacy researcher at Stanford University’s Human-Centered AI Institute, is concerned that the AI data market does not clarify how and where users’ data will be used. She adds that without understanding their rights and failing to negotiate, “consumers face the risk of their data being reused in ways they dislike, do not understand, or did not anticipate, with almost no remedy available at that point.”

When AI trainers share data on Neon Mobile and Kled AI, they grant a sweeping authorization (global, exclusive, irrevocable, transferable, and royalty-free) that allows the platform to sell, use, publicly display, and store their contributions, and even create derivative works based on them.

Avi Patel, founder of Kled AI, states that the company's data agreements limit usage to AI training and research purposes. "The entire business model relies on user trust. If contributors believe their data may be misused, the platform cannot operate." He says the company vets buyers before selling datasets to avoid working with "organizations with questionable intentions," such as the adult industry and "government institutions" that it believes may use the data in ways that violate that trust.

Neon Mobile did not respond to requests for comments.

Enrico Bonadio, a law professor at City, University of London, points out that these agreement terms allow platforms and their clients “to do almost anything with the material, in perpetuity, without additional payment, with no practical way for contributors to withdraw consent or renegotiate.”

An even more concerning risk is that trainers' data could be used to create deepfakes and impersonations. Although data marketplaces claim to strip identifying information (like names and locations) from the data before sale, biometric patterns are inherently difficult to anonymize in any meaningful way, Bonadio adds.

Sellers’ Regret

Even if AI trainers could negotiate more detailed protective clauses regarding the use of their data, they may still regret it. In 2024, actor Adam Coy from New York sold his image for $1,000 to Captions—an AI video editing software now rebranded as Mirage. His agreement stipulated that his identity would not be used for any political purposes, would not be used to promote alcohol, tobacco, or adult content, and that the authorization period lasted for one year.

Captions did not respond to requests for comments.

Shortly thereafter, Adam’s friends began sharing videos they found online featuring his face and voice, with views reaching millions. In one Instagram video, Adam’s AI replica referred to itself as a “vagina doctor,” promoting unverified medical supplements for pregnant and postpartum women.

“Explaining this to others makes me feel embarrassed,” Coy says.

"The comments are weird because they are evaluating my appearance, but that's not me at all," Coy adds. "My thinking when I made the decision to sell my image was that most models would scrape data and images online anyway, so I might as well get paid."

Coy says he has not taken any more AI data gigs since then. He says he would only consider doing it again if a company offered significant compensation.

  • This article is reprinted with permission from: Deep Tide TechFlow
  • Original title: Thousands of people are selling their identities to train AI – but at what cost?
  • Original author: Shubham Agarwal, The Guardian
  • Translation: Deep Tide TechFlow