Who:

I am a human-centered computing PhD candidate at Georgia Tech and a member of the DataWorks research team. I am advised by Dr. Betsy DiSalvo and Dr. Carl DiSalvo, and I collaborate closely with Dr. Ben Rydal Shapiro (GSU). Before Georgia Tech, I studied computer science at Wellesley College.

What:

At a high level, I am interested in how people collect and curate data, as well as how they begin to make sense of it. Said differently, how do people decide what data represents? I examine the tools and devices people employ to perform these sense-making activities, with an eye towards improving the experiences of documentation and dataset contextualization. Working primarily in critical data studies and responsible AI (R-AI), my approach is also informed by prior research experience and ongoing interest in programming languages, usable security and privacy, and information credibility.

Where: 

My primary research site is DataWorks, a combined data services firm and work-training program. At DataWorks, we are figuring out how to build an alternative data annotation site that does not reproduce the exploitative work practices common among data annotation platforms and providers. The Data Fellows are full-time university employees with competitive pay and benefits, and they become experts in data cleaning, organization, and standardization through a mix of dedicated training modules and work on real client projects. The emphasis is on creating a lucrative, sustainable career in data work.

Through the unique structure of DataWorks, we are also able to create comprehensive understandings of how a dataset came into being, both in its origin and in its current form, which helps determine fair and pro-social later use. At DataWorks, I focus primarily on datasets being used to train and develop AI and ML systems.

When and how:

I am in the fifth and final year of my PhD, finishing up my dissertation, “Developing Pro-Social AI Training Datasets Through Data Workers’ Critical Perspectives.” Within the dissertation, I explore a series of projects related to developing an alternative data work site: one in which data workers’ lived experiences and perspectives are not only valued, but brought in as assets for the process of dataset collection and curation. To me, data workers can serve as our best chance at auditing a dataset before it is used for something like training an AI system. They have actually seen the contents of the dataset and noted irregularities or offensive content that the requesters of such data labor often don’t realize is there.

Other research “hobbies”:

  • reading about anything related to data and computing, particularly studies of how computing and computational tools are used for surveillance, for prescribing boundaries on human movement, and for the enactment of borders, nationality, and immigration;
  • keeping a kind of data diary of the temporal and financial cost of protecting my personal information on the web – mostly to highlight the barriers to doing so;
  • right to repair and restoration of old and aging computer systems, along with free & open source software, in protest of planned obsolescence and platform capture.