exploring new ways in which the Deviant Dataset could exist and where

Carly

BACKGROUND This past summer, I completed an MDes in Responsible AI at Elisava in Barcelona. For my thesis, I explored the pervasive and discriminatory censorship of sex-positive and hyper-sexualized bodies in online spaces, focusing on the digital censorship caused by content classification systems and binary dataset designs within AI and ML systems. To explore these issues, I employed a multifaceted methodology. The research included a comprehensive literature review to define existing biases in AI and content moderation systems, alongside qualitative ethnographic interviews with individuals affected by the censorship these technologies enact, including sextech founders, sex workers, and body positivity influencers. I also analyzed social media platforms' content moderation policies and practices, which provided insight into the discriminatory nature of automated censorship. Together, this mixed-method approach allowed for a thorough examination of the intersection between AI, gender identities, sexual identities, and body representation.

The outcome was an alternative dataset – the Deviant Dataset – positioned as a critical design tool and speculative project developed to explore different approaches to data classification and collection. Its primary goal was to challenge existing paradigms by questioning traditional data points and collection methods, while promoting a more contextual and temporal approach to data labeling and positioning data labeling itself as a form of empowerment. In its current structure, the Deviant Dataset emphasizes self-identifying inputs and optional categories, which renders it non-machine-readable and thus non-interoperable.

When I found Solidarity Infrastructures, my goal was to explore new ways in which, and new places where, the Deviant Dataset could exist.

Take a stroll through my digital garden why dontcha

☁️ SELF-HOSTING AS SOLIDARITY INFRASTRUCTURE ☁️ Corporate clouds and mainstream hosting services operate within a logic of surveillance and profit. Content deemed "deviant" is particularly vulnerable to moderation policies that align with state and corporate interests rather than community needs. Self-hosting is an assertion of autonomy, a way that the Deviant Dataset could exist on its own terms. As part of this effort, I set up a remote server using DigitalOcean and installed YunoHost, an open-source self-hosting platform. This was one of the most exciting moments of the course for me, as it went beyond theory and speculative design into testing something practical, even without deep technical expertise. I felt powerful! By moving the Deviant Dataset forward within the framework of Solidarity Infrastructures (as a course and concept), I finally felt like I could contribute to a larger movement toward technological autonomy, one that prioritizes collective care, resilience, and resistance over convenience and growth above all else.

Baby's first virtual server

MACHINE NON-READABLE | I often find myself thinking about what it means to be read by machines and how to resist that. Being machine non-readable is not just a technical choice, but a political and ethical one. It reinforces the Deviant Dataset's role as a relational rather than extractive infrastructure. Self-hosting adds another layer of autonomy, but it also raises questions about long-term feasibility, sustainability, and labor, all of which I need to examine more critically.

MY VALUES AROUND TECHNOLOGY | One of the things I really appreciated about my time in Solidarity Infrastructures was the encouragement to think about and define my values around technology. I would define them, in no particular order, as: