Gig Workers Are Becoming AI’s Data Collection Network
Training AI systems that interact with the physical world requires real-world data that cannot be scraped from the internet. DoorDash and Uber have found a solution: use their existing networks of gig workers. In March 2026, DoorDash launched “Tasks,” a stand-alone app that pays delivery couriers to film everyday activities, photograph locations, and record speech for AI data collection. Uber introduced a similar program in late 2025. Together, these companies are creating a new model for collecting training data at scale using distributed workers already on the ground.
How Gig Worker Data Collection Works
- DoorDash’s Tasks app pays couriers to film activities like washing dishes or navigating buildings
- Tasks include photographing restaurant menus, recording multilingual speech, and filming hotel entrances
- Pay is shown upfront and determined by effort and complexity
- The data trains both in-house AI models and models developed by partners in retail, insurance, and tech
- Uber lets drivers earn extra income by uploading photos and completing data-labeling tasks
Why This Data Matters for AI and Robotics
AI models designed for robotics, autonomous vehicles, and spatial understanding need to learn from real human interactions with physical objects. A model that controls a dish-washing robot needs thousands of examples of humans washing dishes in different kitchens, with different lighting, different dish types, and different techniques.
Similarly, a delivery robot needs visual data of building entrances, elevator buttons, and door handles from thousands of locations. This kind of data is expensive to collect through traditional means. Hiring dedicated videographers to visit thousands of locations costs far more than paying existing gig workers who already travel to those places every day.
“There are more than 8 million Dashers who can reach almost anywhere in the U.S. That is a powerful capability to digitize the physical world,” said Ethan Beatty, general manager of DoorDash Tasks.
The Economics of Distributed Data Collection
Traditional training data companies like Scale AI and Appen built businesses around centralized data labeling and annotation. Workers sit at computers and label images, transcribe audio, or categorize text. DoorDash and Uber are doing something different. They are collecting raw data in the field, then feeding it to models that learn from unstructured video and audio.
The economics favor the gig model. DoorDash does not need to hire full-time data collectors. Couriers choose tasks voluntarily, work on their own schedules, and get paid per assignment. The per-unit cost of data collection drops significantly when you piggyback on an existing delivery network.
Questions About Worker Compensation and Data Value
The arrangement raises questions about fair value. Video data that trains AI models can generate significant long-term revenue for the companies that use it. A courier paid $5 to film a dish-washing sequence creates data that might train a robotics system worth millions. Whether the compensation reflects the data’s long-term value is a question the industry has not yet answered.
Some regions are already excluded. DoorDash Tasks is not available in California, New York City, Seattle, or Colorado, likely due to stricter labor regulations. As these programs expand, labor advocates will push for clearer standards around data ownership, consent, and compensation that reflect the actual value these workers create.