Papers
arxiv:2205.11963

The Data-Production Dispositif

Published on May 24, 2022
Authors:
,

Abstract

Machine learning (ML) depends on data to train and verify models. Very often, organizations outsource processes related to data work (i.e., generating and annotating data and evaluating outputs) through business process outsourcing (BPO) companies and crowdsourcing platforms. This paper investigates outsourced ML data work in Latin America by studying three platforms in Venezuela and a BPO in Argentina. We lean on the Foucauldian notion of dispositif to define the data-production dispositif as an ensemble of discourses, actions, and objects strategically disposed to (re)produce power/knowledge relations in data and labor. Our dispositif analysis comprises the examination of 210 data work instruction documents, 55 interviews with data workers, managers, and requesters, and participant observation. Our findings show that discourses encoded in instructions reproduce and normalize the worldviews of requesters. Precarious working conditions and economic dependency alienate workers, making them obedient to instructions. Furthermore, discourses and social contexts materialize in artifacts, such as interfaces and performance metrics, limiting workers' agency and normalizing specific ways of interpreting data. We conclude by stressing the importance of counteracting the data-production dispositif by fighting alienation and precarization, and empowering data workers to become assets in the quest for high-quality data.

Community

Sign up or log in to comment

Models citing this paper 0

No model linking this paper

Cite arxiv.org/abs/2205.11963 in a model README.md to link it from this page.

Datasets citing this paper 0

No dataset linking this paper

Cite arxiv.org/abs/2205.11963 in a dataset README.md to link it from this page.

Spaces citing this paper 1

Collections including this paper 0

No Collection including this paper

Add this paper to a collection to link it from this page.