Client overview
Our client is a U.S.-based research lab working on human-AI interaction. They want to build AI systems that use digital platforms in ways that look and feel like human behavior. To achieve this, they need large volumes of realistic user interaction data.
Business Challenges
- The client asked us to simulate human activity on multiple apps and capture those interactions as training data for LLM development.
- We needed to build a framework and a set of scripts, then generate as many diverse tasks as possible from those scripts.
- Annotators worked 6-8 hours per day on screen recording, simulating natural usage of apps.
- Actions had to follow a rhythm of about 2-3 seconds per interaction, mimicking the natural speed at which people use apps.
- Annotators communicated directly with the Project Manager in a one-to-one format, as though they were interacting with an AI model.
- The client required a strict QA rubric covering accuracy, completeness, efficiency, quality, setting, and keyboard/mouse operation.
- The project prioritized diverse scripts and behaviors, with an accepted error rate of up to 10%.
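The pacing and error-rate constraints above lend themselves to an automated check. The following is a minimal sketch, not the project's actual tooling: the `Interaction` record, its field names, the use of the mean gap as the pacing measure, and the definition of error rate as failed tasks over total tasks are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List

# Thresholds taken from the challenge list; how they were enforced
# in the real project is not stated, so this is an assumed reading.
MIN_GAP_S = 2.0        # lower bound of the "about 2-3 seconds" rhythm
MAX_GAP_S = 3.0        # upper bound of the rhythm
MAX_ERROR_RATE = 0.10  # accepted error rate of up to 10%


@dataclass
class Interaction:
    """One logged user action (hypothetical log record)."""
    timestamp_s: float  # seconds since the start of the recording
    action: str         # e.g. "scroll", "post", "like"


def pacing_gaps(interactions: List[Interaction]) -> List[float]:
    """Return the time gap between each pair of consecutive actions."""
    times = [i.timestamp_s for i in interactions]
    return [later - earlier for earlier, later in zip(times, times[1:])]


def mean_gap_in_rhythm(interactions: List[Interaction]) -> bool:
    """Check whether the average gap falls inside the 2-3 second window."""
    gaps = pacing_gaps(interactions)
    if not gaps:
        return False  # fewer than two actions: pacing cannot be judged
    mean_gap = sum(gaps) / len(gaps)
    return MIN_GAP_S <= mean_gap <= MAX_GAP_S


def batch_accepted(failed_tasks: int, total_tasks: int) -> bool:
    """Check whether a batch stays within the accepted error rate."""
    if total_tasks <= 0:
        raise ValueError("total_tasks must be positive")
    return failed_tasks / total_tasks <= MAX_ERROR_RATE
```

A recording with actions logged at 0, 2.5, 5.0, and 7.5 seconds would pass the pacing check, and a batch of 100 tasks with 10 QA failures would sit exactly at the acceptance limit.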
Project Detail
Domain: Artificial Intelligence
Solutions

1. Requirements: Aligned with the client on app scope, key user actions (posting, shopping, applying, editing), and how to balance dataset volume with realistic user behavior.
2. Team Setup: Assigned 45 members: 2 PMs, 3 Task Creators, 10 QA, and 30 Annotators recording daily app interactions.
3. Training: Trained annotators to simulate natural app usage across social, e-commerce, content, and professional platforms.
4. Execution: Scripts guided diverse app interactions. Annotators recorded 6-8 hours/day, mimicking real behavior (scrolling, posting, liking). The workflow produced thousands of minutes of video weekly with 98%+ accuracy.
5. Delivery: Data was delivered in regular batches with screen recordings, task logs, and QA reports so the client could feed it directly into LLM training.







