Author: annabelrothschild

  • Configuring Mailbox.org and Addy.io

I previously used addy.io (formerly AnonAddy) with my Proton Mail account with no issue and no additional configuration. However, when I moved to Mailbox.org recently, I kept getting bounce-backs citing unconfigured (misconfigured?) DKIM, DMARC, and SPF policies, despite having configured them in NameCheap (my host). Ugh. The issue appears to be the policies for my custom domain.

    After trying several things, one thing that works is linking Addy not to my custom domain on Mailbox.org, but the default Mailbox.org (xyz@mailbox.org) email that comes with any account. Those DKIM, DMARC, and SPF policies, which are set by Mailbox.org automatically, are intact, and Addy works seamlessly. Just remember to change the sender from xyz@customdomain.com to xyz@mailbox.org when sending an Addy-facilitated reply.
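For the curious, the records in question look roughly like this in a DNS zone (a sketch from memory of mailbox.org’s knowledge base plus generic DMARC syntax; verify the exact values against their current documentation before copying anything):

@                    TXT     "v=spf1 include:mailbox.org ~all"
mbo0001._domainkey   CNAME   mbo0001._domainkey.mailbox.org.
_dmarc               TXT     "v=DMARC1; p=quarantine"

(If I remember right, mailbox.org publishes several DKIM selectors, MBO0001 through MBO0004, each needing its own CNAME.)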

  • “Please add a picture to your student profile”

    This is a fairly innocuous request by one of my PhD advisors: please add a profile picture to your student profile so the rest of the faculty can recognize you during student review. The student review is a time when my graduate program’s faculty gather and each PhD student’s profile is discussed to make sure students are achieving satisfactory progress and/or provide opportunities for other faculty members to offer advice and support. (It is not as scary as it may sound.)

My instinctual answer was “no! Why do the faculty need to know what I look like to assess my progress?” My advisor points out that despite our small program size (40 students or so), research groups are spread across three buildings and faculty might know us by sight but not name. Perfectly reasonable.

    How is passing me in the hallway, or knowing that I prefer to start work early (and then leave early, to be clear) going to help them assess my competencies as a student? Sure, having exchanged pleasantries while heating up our lunches might give an idea of my general personality or lack of cooking skills, but these, again, aren’t concepts on which I’m being assessed. But, it bothered people that I was visually anonymous when it was time to discuss my progress.

A colleague, overhearing the conversation, makes the fair point that it would be to my benefit to “personalize” myself by having a photo in my student profile. We then get into a discussion of what personalization (or de-anonymization) means and who benefits—mostly along the classical social lines of things that may or may not be conveyed (but are often assumed) through a photo (inc. race, class, and gender). Some of these values may or may not be conveyed by my name, but it’s somewhat impossible to be a grad student without a name. I’ve had versions of this conversation with my family for years in an ongoing argument over LinkedIn profiles (and profile pictures).

    What I find interesting about this entire exchange is that I have some degree of freedom to make a choice about whether or not I’ll distribute these identity markers. In this case, my advisor certainly isn’t going to punish me (though they would certainly be entitled to some annoyance regarding my stubbornness over such a seemingly trivial topic).

    I can’t help but compare this to something like the identity markers of an Amazon Mechanical Turk worker (or, “Turker”), who is prevented from sharing such information, by platform design. In some cases, I am led to understand this does benefit a worker who might be from social group(s) that have been historically marginalized or even discriminated against. On the other hand, the de-personalization has been argued to effectively de-humanize the Turker in the eyes of requesters.

I truly don’t know what the balance is. In my case, I might just be making my life slightly harder for no good reason. At the same time, this very tiny quasi-ethnographic snapshot of an anonymous data worker (what is a grad student if not a data worker?) makes me wonder about MTurk requesters. Platform requesters of data work are often white-collar professionals in research or industry for whom professional identity is professional value (or “personal brand” in corporate-speak).

  • Configuring Hetzner turn-key (“Storage”) Nextcloud with OnlyOffice

*** Sad update as of 30 April 2025: due to an incompatibility between the latest version of the Community Document Server (CDS) and Nextcloud, there is a major bug (https://github.com/ONLYOFFICE/onlyoffice-nextcloud/issues/1066). Via email, Hetzner announced (https://status.hetzner.com/incident/e9878d8c-9875-49bb-ba10-3aa263957db2) they are discontinuing the built-in document server. Ugh. ***

I recently switched to Hetzner’s Storage solution (from self-hosting via YunoHost on a Raspberry Pi) in preparation for a move, during which I wouldn’t be able to run my home server. After a fair amount of trial and error (really, just error), and thanks to some help from r/Hetzner, I figured out how to integrate OnlyOffice with Hetzner’s turn-key (“Storage”) server.

1. Get the OnlyOffice app: <HOST>/settings/apps/office/onlyoffice, where <HOST> is your Hetzner Storage domain
2. Go to the OnlyOffice admin page (<HOST>/settings/admin/onlyoffice) and add your community document server location (<HOST>/apps/documentserver_community/). This also requires downloading the “Community document server” app, which, weirdly, I had trouble locating until I used the search bar (after several reboots).
    3. Save!
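Before (or after) saving, a quick sanity check that the document server location actually answers can save some head-scratching; a minimal check, using the same <HOST> placeholder as above:

curl -I https://<HOST>/apps/documentserver_community/

Any HTTP response at all (as opposed to a connection error) suggests the Community document server app is installed and reachable.
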
  • The problem with proxies

    AIES – 258: “The Problems with Proxies: Making Data Work Visible through Requester Practices”

Qualification activity: https://gatech.co1.qualtrics.com/jfe/form/SV_e9Blg7h1sCWISHQ

Paper: https://filedn.eu/lldOHjCIRMjfewo3JirFYqh/website-documents/Website/AIES-24-253.pdf

  • Zotero 7 and ZotFile

I suspect (no, I know) I’m not the only person out there who used ZotFile to manage their slightly-overflowing Zotero library. Turns out, ZotFile is incompatible with Zotero 7. After paging through a series of Zotero forum posts, I came across ZotMoov, which functions more or less the same. I was pleasantly surprised to have no compatibility issues once I switched, and I could access all old files stored with the ZotFile schema from ZotMoov.

One major shortcoming of ZotMoov is that ZotFile’s “send to tablet” option has no clear equivalent (I don’t use it myself, but forum-goers appear displeased), though there appear to be some workarounds available.

My ZotMoov config was pleasantly simple: a new ZotMoov tab appeared in the Zotero preferences/settings pane post-install, and I linked my cloud storage directories as I would have with ZotFile. Voila!

  • Fedora 40 and the case of the missing wifi

I’ve converted enough out-of-the-box Windows and MacOS machines to Linux that missing wifi is no great surprise. What is an extremely unpleasant surprise, however, is when functional wifi goes missing, as it did during the transition to Fedora 40. As it turns out, this is not such a big problem to fix, but isolating the issue itself was difficult. For the record, I had stable wifi on two 2015 Macs (one MacBook Pro, one Air), both with Broadcom network chips, which are known to be finicky.

It turns out that in Fedora 40, MAC addresses are now stable by default. Good for privacy, but bad for my immediate wifi setup.

The fix, fortunately, is not so bad. First, check the chipset details to make sure the chip is discoverable:

    lspci | grep Network

The output should reference the chipset (e.g., 03:00.0 Network controller: Broadcom Inc. and subsidiaries BCM43602 802.11ac Wireless LAN SoC (rev 01)).

wpa_supplicant has not yet caught up with the stable-MAC-address paradigm, so downgrade it:

    sudo dnf downgrade wpa_supplicant

Then do a hard reboot: power the machine off and then back on, rather than a warm restart.
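One caveat: the next dnf upgrade will happily pull the newer wpa_supplicant right back in. A sketch of how to hold it back until things are fixed upstream (these are standard dnf options, nothing specific to this bug):

sudo dnf upgrade --exclude=wpa_supplicant

Or, to make the pin persistent, add excludepkgs=wpa_supplicant under [main] in /etc/dnf/dnf.conf.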

    More information / original source: https://discussion.fedoraproject.org/t/wifi-networks-not-showing-up-in-setting-panel/127924/3

  • A self-hosting journey

A couple of months ago I decided to finally take the plunge and try self-hosting. In particular, I was sick of trying to find a Google Docs alternative, and sharing random CryptPads and Etherpads with academic collaborators seemed dodgy. I’d been interested in Nextcloud, but I’d been using a random provider through their marketplace, which didn’t seem too great, either. With the increase in AI cannibalism (scraping for LLM training), I wanted to move my documents to a place where I understood how they would be used. An E2E-encrypted service didn’t quite work, since I wanted people to be able to edit documents without needing an account on whatever service, and I needed it to appear at least semi-professional.

My approach has been one of relative low-tech; that is to say, I wanted stable infrastructure I could rely on, and thus reasonably well-developed software, rather than trying to set up my own system entirely from scratch. I bought a used Raspberry Pi 400 (in a case — for convenient proximity to the Ethernet connection, it’s housed next to the front door, aka lots of dust and dirt) on eBay for $40.

Pi 400 specs from Raspberry Pi

I booted it up with a fresh install of Raspberry Pi OS, once I belatedly purchased the micro-HDMI adapter I’d forgotten I’d need 🤦‍♀️

I’d read about YunoHost and it seemed like one of the better options — free, open source, decent user community, and needing little to no custom software work. And indeed, setting up the OS itself wasn’t much of a problem. On the other hand, configuring my router to work with YunoHost and the Pi was much, much harder. After losing a few hours mucking around with Netgear Genie, I realized two things: 1) I should actually step back and learn some fundamentals of networking, and 2) interesting networking projects should probably not be run off a who-knows-how-old router that has been inherited, off contract, by multiple generations of tenants.

    That joyful learning experience aside (also with thanks to my flatmate for putting up with at least one complete firmware reset on the router), I got the thing up and running. Yay!

    Now a couple months on, I’ve got a stable workflow with several apps I regularly use:

• Google Drive –> Nextcloud — whew, this one was not fun to configure. I usually use x86-64 hardware (besides an ARM-based PineBook Pro, the Pi is the only ARM64 hardware I’ve used), and in the process of setting up the Collabora server (which, reader, is a separate YunoHost application) I learned that this holds EXCEPT if you’re using ARM, in which case you should use the Nextcloud add-on CODE server instead. I learned this the hard way, though I see the Collabora app page now imparts this rather handy knowledge.
• There’s one major bug I cannot figure out, however: when I create a new document on the web version of Nextcloud (as opposed to mounting Nextcloud as an external drive and creating a new file with a local word processor), I will always lose the first draft of what I write. I *think* this is some kind of sync error, wherein I’m failing to establish (touch) the new document before I start writing to it. My current workaround is creating new documents via the word processor / external drive workflow and then editing from the web view, which is not ideal (a sketch of that mount-and-create workflow follows after this list).
• I also cannot get the Zotero integration to work. While I’ve got the add-on installed, I cannot get it to load my entire library. Instead, I get maybe twenty or so entries, when I have 2,000 or so in my Zotero library. This is another reason I keep returning to local word processing.
• Todoist –> Vikunja — this is one of the rare times that I genuinely cannot find a solid FOSS replacement. I’m making Vikunja work, but it’s driving me a little batty. While I followed the steps to import my data from Todoist, I’m too used to Todoist’s sleek interface and miss the shorthand input (e.g., “repeat week” will set a task to auto-repeat weekly). Vikunja does have a setting to change the shorthand (I selected Todoist), but it’s not as sensitive — e.g., I rarely get the date format correct enough that it automatically assigns the desired completion date. I also wish there were a widget to add a new task from any page, rather than just the homepage. I do really appreciate the Teams feature and plan to use it at some point in the future — for now, my to-do list feels too personal to open up the subdomain to any potential user (I know I can configure it to allow only users I’ve created, but still…)
• My meta-reflection on switching is that I hate how completely I fell prey to Todoist’s gamification of task completion. Somehow seeing the arbitrary five-goals-completed marker made me feel successful in a way that merely checking off individual tasks in Vikunja does not. Of all the things to be gamified, though, maybe it’s good that my to-do list is?
• Google Docs –> CryptPad — I had to give up on this one (well, I’ve “paused” my instance, which is a great YunoHost feature) until I can figure out how to configure it such that only registered users can create new documents. This is a setting that exists for CryptPad more generally, but due to the way it is configured by default on YunoHost, it’s proving trickier.
• ? –> Readeck — this is the one I didn’t realize I needed. Super easy to configure, Readeck is a place to save articles I want to (or should) read but haven’t gotten around to yet. Removing these from my to-do list and instead putting them in a designated app is relieving some stress (from having a never-ending to-do list).
    • Multiple proprietary URL archivers –> Archivebox — I worry a lot about bit rot, especially since I often want to archive things to return to for research. With Archivebox, I’m able to download the site in various forms (PDF, html, etc.). The site interface is clunky but the service is highly valuable — also because I sometimes save things I wouldn’t necessarily want to archive on a public site, where they might be subject to more AI cannibalism web crawling, e.g., someone’s creative work.
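For the curious, the mount-and-create workaround from the Nextcloud bullet above looks roughly like this (a sketch: it assumes the davfs2 package is installed, and the hostname, username, and filename are made up; Nextcloud’s WebDAV endpoint follows the remote.php/dav/files/<user>/ pattern):

# mount Nextcloud as a local drive over WebDAV
sudo mount -t davfs https://cloud.example.org/remote.php/dav/files/annabel/ /mnt/nextcloud
# create (touch) the file first, so it exists server-side before any edits
touch /mnt/nextcloud/Documents/new-draft.odt
# then open it with a local word processor, or switch to editing in the web view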

Overall, the static IP address finagling was worth it — I’m happy to have control over where my data is housed. It’s also been an interesting experience of coming to understand the material infrastructure of my house (a rental). I live in a fairly run-down neighborhood and our internet service, for example, reflects that — while traveling for the summer (and obviously away from the server), I was surprised by how often our service was down. Given that I’m usually using my home network outside of regular working hours, issues that occur overnight or during the workday normally go unnoticed. That’s yet another reason we should all be in favor of expanded broadband access, etc., etc.

  • Operating a Kindle without Amazon (sort of)

I suppose the title gives it away. After the untimely (only 13 years of use!) death of my 2011 Kindle Touch, I’d been in the market for a replacement. I’ve been trying to buy most (re- or non-programmable) tech products used: things I either feel confident I can fully wipe or don’t have to worry about wiping. Part “slightly less tech waste” and part “can’t shell out $200 at the moment”. To avoid Amazon, I wanted to buy a Kobo, but all versions I could find on U.S. sites (eBay, BackMarket) were pricey, which is how I ended up with yet another Kindle ($40 for a scratched-up Kindle 7th gen that works fine). As a weak protest, I decided that this version would stay entirely (as much as possible) outside Amazon’s reach. In other words, I’m trying to use it without activating an Amazon account… I’ve connected wifi, but try to use it in airplane mode in the general case.

    So far I’ve found a semi-decent workflow:

1 – setting up the device. The setup sequence at one point required signing in with an Amazon account; to escape it, I actually had to reset the entire device (apparently there’s no “I changed my mind / misclicked” button). Annoying.

2 – I’m loading my books from Calibre straight to the Kindle, with a data-capable micro-USB cord. As far as getting my books… I’m not sure how well Calibre does with things like library books. I’m downloading files for books I’ve got access to online, DRM-free.
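If a book isn’t already in a Kindle-friendly format, Calibre’s command-line converter handles that too (a sketch; the filenames are made up, and the GUI’s Convert button does the same job):

ebook-convert book.epub book.azw3

Older Kindles also accept .mobi as the output extension.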

3 – for the dictionary, since I can’t use the Kindle store, I’ve found an original Kindle dictionary (specifically, Merriam-Webster’s Advanced Learner’s English Dictionary, the most recent edition I could find). I found it on one of ~those~ sites, which I would likely regret linking to. The Kindle seems to handle “.mobi” dictionaries as native ones, regardless of origin; once I went into settings –> dictionary and chose it, it has worked as my default dictionary.

4 – the general experience is fairly painless, once committed to the Calibre workflow. The one extremely annoying thing is a popup every time I navigate from a book back to the main library, where I get a message from Amazon that “cloud not available, since you must register”. Which is rather the point of not being registered, smh.

  • Configuring Vikunja on YunoHost

1 – create a new authorized Todoist app: https://developer.todoist.com/appconsole.html

2 – the OAuth redirect should be [your domain]/migrate/todoist

    3 – take note of client id and secret

    4 – then edit your Vikunja config located at /opt/vikunja/config.yml

5 – in the config.yml, add the client ID, secret, and redirect URL under migrate –> todoist (a sketch of the block is just below)
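Roughly what that block looks like (a sketch: the values are placeholders, and in the sample config I believe the top-level key is spelled migration rather than migrate, so compare against your version’s config.yml.sample):

migration:
  todoist:
    enable: true
    clientid: "your-client-id"          # from the Todoist app console (step 3)
    clientsecret: "your-client-secret"
    redirecturl: "https://your-domain/migrate/todoist"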

6 – then reload Vikunja: yunohost service restart vikunja

7 – then go to Vikunja’s import page, where you should see a Todoist option; it will link out to Todoist to authorize data sharing (read-only) and then should link back to Vikunja. I got an error when I had the wrong OAuth redirect URL (see step #2).

I’d been a regular Todoist user for a few years (text only, no attached files), and it took maybe 10 minutes for my data to finish transferring over.

  • SEA/SAW 2024

    Curious about DataWorks? Here’s an article and a podcast about the organization. And the critical data literacy curriculum is available here.

    Poster: “Calculating the cost of data refusal”

    Abstract:

    How does data refusal fit into the economic structure of a pro-social data work enterprise? If data is seen as “the new oil,” how can we argue for the value of refusal to engage in potentially harmful data work, in contestation of prevailing conceptions of data as a commodity? In this presentation, I will explore the implications of data workers’ refusal to continue working on a project that capitalized on their identity with concerning social implications. Reporting from DataWorks, a combined work training program and data services provider, I will share an ethnography of data refusal, stemming from a worker-centered training on critical data literacy.  

    Pulling from the larger theory of feminist refusal, I will employ the feminist data manifest-no to understand the tensions between economic productivity and socially just data creation and curation practices. Namely, as a small social enterprise with a limited budget, how do we, at DataWorks, weigh a given act of data refusal, and its implications, against a larger mission? 

As a critical data scientist, my concern is both with the implications for data workers, as individuals, and with the resulting dataset itself. In critical data studies, we know that pro-social and just treatment of data workers, including the right to refusal, results in datasets of higher quality (in terms of requester specifications), but this often runs counter to the expectations of ML and AI dataset requesters. In this presentation, I will share my work translating the concerns of data workers, as domain experts, to machine learning practitioners.

  • Datasets are eggs

    This excerpt is premised on the differences between eggs at American and European grocery stores. Eggs in the US are pasteurized (cleaned) before they can be sold, resulting in a bleached shell that must be refrigerated. Eggs sold in Europe (and some US farm-to-table situations) are more commonly unpasteurized and therefore maintain the dirt and debris of the hen house, a sight that commonly surprises Americans abroad.

As Bowker reminds us, “data is never raw,” [1] but when it arrives in a spreadsheet, not yet cleaned or standardized, it can give the appearance of an unpasteurized egg purchased from a commercial supermarket; on the surface, its aura of distinct origin (feathers, bits of debris clinging to its shell) can mask the complex sociotechnical process by which that given egg arrived on the grocery store shelf. Recall here the vast and potentially even global transportation networks, trade agreements between store and farm, and the hundreds of years of transformation from subsistence to commercial farming that made possible this egg’s residence—if debris-coated—on the grocery store shelf. This example parallels an uncleaned, unstandardized dataset, where the presence of uncleanliness allows us to know something about the object’s origins while simultaneously leading us to think we know more about those origins than we actually do, until we examine further the sociotechnical process that made possible that object’s proximity to us (within arm’s reach on the grocery store shelf, or arrival in our email inbox).

    [1] Geoffrey C. Bowker. 2005. Memory practices in the sciences. MIT Press, Cambridge, Mass.

    From my thesis proposal (in progress).

  • Privacy Diary: 5+ months running Lineage OS

I finally switched to Android when, in 2020, my old iPhone 5S forcibly and needlessly bit the dust at the behest of the Apple Corporation’s planned obsolescence policy.

    While in the process of moving back to Germany during the COVID-19 pandemic, my temporary housing was through a shall-not-be-named platform, whose app no longer ran on the iOS version the 5S had been limited to. In order to adhere to Germany’s (very reasonably) strict quarantine policy for new arrivals at the time, I realized I had no choice but to upgrade, seeing as making it expeditiously to my lodging was a matter of public health, and the app was the only reasonable way of communicating about my arrival with my landlady.

My logic behind making the switch to Android was “oh **** I need a phone that runs a newer OS” and “my budget is about 0 dollars”. In the end, I ended up with a Moto G Power (2020), which was the cheapest conventional smartphone I could find that was compatible with my current cellular plan. The fact that it was an Android was almost an afterthought, though my deep frustration with a certain company’s proclivity towards deceitful dealings with aging products did play some role, I’d like to think. While, yes, I will admit that the Moto G did seem to handle daily life better than the 5S (no, it did not lose all my texts now and then), by the time I’d had it for about two years it fared worse than the 5S did after four or five. Which, in retrospect, did make some sense — I mean, I did buy the cheapest smartphone I could.

    The idea that I had a two-year-old smartphone that was no longer functional drove me crazy though. Sure, there was the financial kicker; at this point, the Moto G was probably as expensive as if I’d just bought an iPhone and kept it for an extra year or two, which I suspect it could have probably handled (especially after those class-action lawsuits Apple ended up in). But there was also the environmental impact — I mean, good gosh, was I supposed to just dispose of a ridiculously resource-draining device after a mere two years of use? Incidentally, this coincided with a research project I was conducting about dumbphones, and the desires of many dumbphone users to keep devices that just worked for a long time. Better for the wallet, better for the environment.

This led me down a rabbit hole of modular phones, most of which exist in popular memory only as headlines along the lines of “X Company Shuts Down Development of Planned Modular Phone”. Failing that, I figured the next best thing was a phone that would at least keep current (receive regular OS and security updates) for some time (i.e., more than two years). So obviously Apple devices were out. While Fairphone was the most reasonable vendor I could find, reviews by mainland-US users indicated that the EU-intended device barely functioned stateside, and rarely reliably, if at all. After much searching, I realized one option would be to buy an older flagship device (easier on the wallet, and somewhat less environmentally bad?) and flash it with a mobile OS more dedicated to longer-lasting support. Which resulted in my purchasing a Google Pixel 4a (via a refurbished-tech site), a device that was, somewhat ironically, about the same age as my malfunctioning Moto G.

I knew I was making a few concessions to modern ease when I switched to Lineage as a “daily driver”. First, Google would treat the bootloader as tampered with, and some apps might be incompatible. I’d read ahead of time that many financial apps, for example, would disable sign-in with fingerprint ID, which was fine with me, since I’d quit using biometric login features back on the 5S.

On the plus side, it meant that I could change my relationship with Google, which was inflexible on the stock Android the Pixel 4a ran by default. Instead, I admitted to my unhealthy reliance on Google Maps (particularly when traveling) and added the Google Apps for Lineage OS (GAPPS) package, which I lean on only when traveling or anticipating getting lost. This is, I admit, a fairly imperfect solution, since I tend to get lost at unexpected times. Maybe the secret to personal privacy is perfecting one’s sense of location?

    I’m now about five or six months into daily life with the mostly-de-Googled, Lineage-running Pixel 4a. On the whole, I’m pleased with my experience. The actual experience of flashing Lineage to the device was much easier than with my Samsung tablet, and took about 30 minutes (though, at this point, I have some experience mucking around in adb).
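For reference, the rough shape of that flashing flow on recent Pixels (a sketch from memory of the Lineage wiki; the filenames are hypothetical, and the unlock step wipes the device, so follow the device-specific instructions rather than this):

adb reboot bootloader
fastboot flashing unlock                    # wipes the device!
fastboot flash boot lineage-recovery.img    # hypothetical filename
# then, from Lineage recovery, choose Apply update and run:
adb sideload lineage-nightly-sunfish.zip    # hypothetical filename (sunfish = Pixel 4a)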

As for the experience of using Lineage, there are quirks, most often with the default phone application. I suppose I could just download the Google one. The Google Wallet feature doesn’t work with any financial details (I can still store things like plane e-tickets, but not credit cards). I feel like this is probably for the best, considering that lodging my credit card in my phone is just one more case of data seepage, but it would be an issue for more regular users of contactless payment. In general, the Lineage default app versions sometimes just don’t work quite right, which isn’t something I can really complain about, given that Lineage doesn’t have the kind of financial and organizational backing that Google’s Android OS teams have. Further, a 1-3% latency with basic applications probably helps me use my device less, since things are not quite as quick and easy as they are with flagship devices. Or maybe this is psychological. Who cares; I think my screentime is slightly down, which is all I can ask for.

As far as hardware goes, the battery life is much better with Lineage than it was for the short time I ran out-of-the-box Google Android. The device I bought has been well used, and the battery definitely strains to last a whole day without a partial recharge, which might require an external battery pack for someone who doesn’t have a desk job. I’m pleased the device still has a headphone jack, so I can make use of the dozens of old Apple corded headphones that have been passed on to me by the rest of my family members, who have upgraded to jack-less iPhone versions. As someone who frequently listens to radio and music, having a dozen or so pairs of headphones makes it a whole lot easier to always have a pair within reach, something I definitely can’t say about bluetooth devices (did I mention the cord also means there’s no battery life to deal with? Wild.)

Somewhere between hardware and software is my main gripe: dual-SIM support. While back in Germany this summer, I needed to maintain both my US number and my German one, with easy access to both. Thanks to the well-timed introduction of Edeka Smart’s e-SIM option, I used my Mint Mobile (US) SIM in the physical SIM slot, with the Edeka as the e-SIM. My voicemail has never recovered, which, honestly, is fine, since in the five or so months I’ve been using the device, I think I’ve gotten about four voicemails total. Would this be an issue for, perhaps, an older user more accustomed to actually speaking with people on the phone? Yes, absolutely. I’m also aware that the 4a is relatively unsophisticated in its dual-SIM capabilities and newer versions of the Pixel might handle it better.

On the whole, I like having my phone be my phone, and not an advertising portal I carry around with me. Is it still a little bit of an advertising portal? Of course, but I feel like I’m able to make reasonable trade-offs in my exposure to data collection — for example, figuring out how to navigate around a new city is worth a few breadcrumbs of location data. Do I use a different Google account with each Google app that I do have installed? Sure. Does it help minimize my exposure? Probably not?

All in all, I feel like this is one of the more reasonable options for a privacy-respecting smartphone. While it certainly requires an intermediate level of tech savvy, at least in getting set up, I think it could reasonably serve as a “daily driver” for anyone used to contemporary smartphones and willing to make some small sacrifices to protect their personal information, while still getting many of the benefits of a smartphone.

  • Privacy Diary: On data brokers

By last count, I’ve lived at six addresses in the United States, with varying degrees of permanence (I’ve been an official resident of one state the entire time, but have had mailing addresses at five other locations, some in-state, some out, due to temporary jobs and schooling). So, when I recently went to fill out an updated renter’s insurance application, I had to stare long and hard at the list of alleged prior addresses presented to confirm my identity.

    If you’re unfamiliar with this kind of verification system, institutions will contract with data brokers, who scrape public data (like voter registration or addresses on tax returns) and ask you to verify whether or not you’ve, for example, lived at any of the five prior addresses, or have ever owned a certain model of car. Making a mistake can send you into a long loop of escalated verification processes, some of which record your conversation with the customer service representative for “security and verification purposes”. I’m not a big fan of biometric data records and avoid them where possible, so I like to guess any prior addresses and car models correctly on the first try. However, there’s ambiguity in the questions themselves, given that I am perhaps not the default case. Having initially registered to vote, for example, at my parents’ address (I was completing high school at the time), I’ve definitely been “associated” with that address. But the question, as posed by my renter’s insurance firm, via their contracted data broker, is, “have you ever owned property” where one of the options is my parents’ address, where I have been registered. The crux of the problem is that I definitely didn’t own that property (as my parents would be quick to remind you, given my lack of contribution to their property taxes), but I don’t know if the data broker has effectively discerned that. Instead, all I know is that I, yes, have lived at that address. So I do what feels to be the reasonable thing — that is, I click “none of the above”.

    Sure enough, I am immediately informed that my verification process has failed, which is deeply ironic, given that the data broker has actually misidentified me as a homeowner. I am then routed to a dreaded customer service interaction, where sure enough, I am given no option but to consent to my voice (as part of the entire conversation) being recorded, subject to a privacy policy “available on the firm’s website”. I need renter’s insurance, so I give in. Of course a lengthy wait time is required and I am forced to give a variety of identifying information, including my social security number, via audio call.

Reflecting later on the incident, it bothers me. Why did it fall to me to go out of my way to correct misleading (in fact, incorrect) data? Why are data brokers allowed to sell faulty systems that could lead, in fact, to false verification of identity? After some Web searches, I find out that there are a few key data brokerage firms in the US, including three big ones: Acxiom, Experian, and Epsilon. If there’s a category of business I hate supporting more than credit-report firms, it’s the American tax-return-preparation-services lobby…

Now, I have a few options. First, I can request a copy of all my data. There appear to be some options for correcting the data found on that report, but I have little to negative interest in giving these firms better records on me. Second, I can opt to have all my data deleted. Depending on the firm, there are additional options. No service (of the three aforementioned) will let me complete multiple steps at once, so each desired goal (getting a copy of my report, coming back later to request deletion) means a separate 10-minute submission per firm (and yes, I have to verify my identity to perform these tasks).

From the Epsilon privacy center application page: a radio-button menu of eight privacy measures, with a note that only one request can be submitted per application, requiring 8x the work to fully remove my data and Epsilon’s use of it.

Oh what fun! After about an hour, I submit three basic applications to get a copy of “my” data (or data on the person these sites seem to think I am) from each of the three firms. I receive the reports about a month after filing. The data inside deserves a much longer post, but suffice it to say, there are plenty of errors. For example, my dad’s name shows up as one of my legal aliases. I’m pretty sure I know how it got there, as he’s listed as a legal custodian on my first bank account, and our motor vehicle registrations are intertwined.

    I can easily imagine a situation where this quickly becomes a serious problem. For example, would my dad’s property ownership records then get mixed up with mine, since our “legal aliases” are? I suspect that’s exactly what happened to me in the case that launched this entire rambling post. This is yet another case of the fallacy of data as truth, and it makes me consider attempting to track down the personal phone number of the CEOs of these firms and deliver a message about the importance of personal privacy. But, unlike these firms, I respect personal privacy.

For now, I tell each firm (well, I take 10 minutes to submit a new privacy application, since opt-out culture is alive and thriving) that they can’t use “my personal information”.

    To see the effects, I try to open a new checking account a few weeks later, at a bank I know uses these data brokerage firms to verify customer identity. Sure enough, where I should hit a “just verify a few basic facts for us by selecting from the following…” page, I get an error code! They are unable to verify my identity at this time! I am always so happy to see when the most fundamental errors go unhandled — for example, an API request returning a “no one with that profile in our system”. I am instead given a phone number to call, and it is the general number for the bank. I am reconnected three or four times before I make it to someone who can actually verify my identity. Interestingly, they ask only for a recording of the call for security purposes, and insist that it will not be used for any marketing ones (do we believe them? I sniff a future class-action lawsuit.) It takes the representative a few minutes to verify my basic information, including social security number (which I could have typed online anyway), and then I am told a decision will be made in a few days. While I am never notified of a decision, a few days later I do receive login information for my shiny new checking account.

    I’m not sure what the concrete results of my opting-out are yet. I know that it led to a long phone call and some honestly horrific hold music (banks should be ranked not by interest rate, but by hold music, hear me out), which isn’t ideal. At the same time, the information I had to provide this time was easy for me to provide since it was basic information that I, you know, am actually associated with. I am still curious how the final verification happened — was it even a full employee of the bank, or a contract worker? Who signed off on what disclosure of data? I was never asked to consent to any information sharing outside of the bank itself.

I am also aware that, besides burning some time I should probably be using to do other things, I have yet to encounter the more concerning implications of refusal. For example, with the “informal” tenant-screening tools used by plenty of landlords, if I have no profile, will that count against me? I guess time will tell, but I am not entirely optimistic. To future me: I do apologize, but it was for the best (I hope).

    Notes: 

    I found the following site deeply helpful: https://privacyrights.org/data-brokers

  • De-Googling a Samsung Galaxy Tab A 8” 2019

When I started grad school in 2020, I wanted a basic tablet to read papers on. Having switched to an Android phone, I figured I’d give Android tablets a shot. The Tab A (SM-T290, also known as the ‘gotowifi’ model) was sluggish from the start, but now, ~3 years after acquiring it, it’s unusable. Beyond wanting to de-Google, my other key complaint was planned obsolescence (though you could argue that the SM-T290 was maybe obsolescent from the start): my tablet was rendered “old” within two years of purchase. I’m actually not sure it ever saw Android 11; or maybe I had decided it was unfit for use by then.

Anyway, given these concerns, I thought I’d try Lineage OS, which conveniently had just been certified fit-to-run on the SM-T290 (here’s a thread of its development from XDA). Luckily, I had not yet updated the boot loader past T290XXU3CVG3. I found the process much trickier than expected, due in part to a critical step: rebooting the device into various modes. As it turns out, from adb (yes, I should have known this from the start) you can reboot straight into download mode, which I eventually discovered after trying to toggle the SM-T290’s ultra-slim keys. Something something haptic feedback. If I had been more forward-thinking I would have taken screenshots, since the only limitation of the Lineage OS official install directions is that (to a novice, like me) the wording sounded close enough to the Samsung stock Android boot loader’s that I didn’t realize why my installation kept failing — namely, that I wasn’t rebooting into the right mode.
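For anyone else stuck at that step, the adb incantations are short (both are standard adb commands, the first specific to Samsung devices, and both assume USB debugging is enabled):

adb reboot download    # Samsung’s download (Odin) mode
adb reboot recovery    # recovery mode, e.g. Lineage recovery once it’s flashed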

Lineage OS boot loader main menu

Note to self (and maybe other potential Lineage converts reading this): the blue “download” screen is not what you want. The photo below (showing the Lineage OS logo) is what you should see (it shows the second level of the menu, not the main menu).

    Anyway, now it’s * drumroll * suddenly a perfectly usable tablet! The SM-T290 now runs faster with Lineage than it ever did with stock Samsung Android. Maybe that’s not unexpected but it drives me insane that the mass-consumer options are so limited.

I also played around with a few variations of limited Google presence — mainly, I wanted to be able to access a few third-party apps for which I could not locate an APK from the original source, and I was hesitant to use one of those APK storefronts. In the end, I went with the default Lineage OS option (well, default if you’re going to have a bit of Google), MindTheGapps, to have access to the Google Play Store for the desired apps.

    The end result:

Lineage OS initial set-up screen


  • Moving from Word, with Zotero citations, to Overleaf

This post describes a workaround for indirectly converting a Word document (or a document from any word-processing software) with linked Zotero citations into Overleaf. It expands on a StackOverflow answer.

1. Set up Zotero desktop. You will need the “Better BibTex for Zotero” (BBT) extension, which you can download here. As with Zotero XPI files on Firefox, you’ll want to right-click on the latest version and “save link as”. Once downloaded, add it to your list of existing Zotero add-ons (from the main Zotero menu: Tools–>Add-ons–>Gear icon–>Install add-on from file) and upload the XPI file. You’ll need to restart Zotero. On restart, it will guide you through a series of prompts – I personally used all defaults, but customize as necessary. Depending on the size of your library, it may take several minutes to attach Citation Keys to all entries.
2. Once you have configured BBT, right-click on the entry pane menu to add an additional column and click to show “Citation Key” (last entry in the list). Then, from the “My Library” collection, select all entries in your Zotero library and right-click on the Citation Key value to select Better BibTex–>“Pin BibTex key”. This locks your Citation Keys so that they’re findable in the “extras” section of an individual entry in your library – i.e., it overrides your default key with one rendered usable to Overleaf. See more information here in the “Pinning BBT cite keys for Overleaf” section.
3. Download this file, which is a custom style file for Zotero citation formatting (again, you can use “save link as” if you get an error message). It will transform inline citations in your Word document from whatever your current citation style is to Overleaf’s default citation style (e.g., “\cite{}”). To add this style to Zotero, from the main Zotero menu: Preferences –> Cite –> add additional style (+ button) –> upload the saved CSL file. It will be entitled “BBC for latex -annabel” (you can change this by editing the CSL file directly; I did not create this file and you will see the original author’s name and information if you open the raw file, linked here).
    4. Change the citation format for your Word document, via the in-program Zotero menu, to “BBC for latex -annabel” (or whatever you renamed the Overleaf specific CSL file to). This will take a minute to run.
    5. You’ll now need to upload your Zotero library to Overleaf, if you haven’t already done so. Details for that process are here. If you already linked the two before adding and pinning the BBT citation keys, you’ll need to refresh the .bib file, otherwise the citation keys won’t match.
    6. You can then copy and paste directly from your Word document to Overleaf and the citation keys should compile without issue. For citations that include page numbers, see known bug 1 below.
    7. If you’re also converting a large amount of formatted Word text, you can use a tool like Pandoc to automate stylistic conversions. My current issue with this is that I haven’t figured out how to make Pandoc ignore the Overleaf-compatible citations (aka, things already in “\cite{}” format).
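On that last point, the basic Pandoc invocation is simple (a sketch; the filenames are made up, and this is plain Pandoc rather than anything Overleaf-specific):

pandoc draft.docx -f docx -t latex -o draft.tex

The open question from step 7 (persuading Pandoc to pass the already-converted \cite{} calls through untouched) remains unsolved on my end.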

    Known bugs:

    1. For citations with attached page numbers, I have yet to create custom code that handles these correctly. You can flag these via the Overleaf compiler and then manually move pages to the correct Overleaf format (see here).
  • Recruiting for study on dumbphone use and users

    9 Jul 2022 – Annabel

A Nokia dumbphone (basic phone)

We (researchers at Aalto University) are running an interview-based study to better understand “dumbphones” and the people who use them.

Are you a “dumbphone” {1} user (or have you used a dumbphone since 2017), and are you over age 18? We are researchers at Aalto University (Finland) investigating contemporary voluntary dumbphone use in high-income {2} settings and would like to interview approximately 10 dumbphone users. Interviews will be held on Zoom, in English, and will last about an hour, including a brief demographic survey. Participants will be compensated with a 20-euro gift card. If you are interested in enrolling in the study, please see this link to determine eligibility and to provide your contact information to schedule an interview. If you are not selected for an interview, you will be informed and your eligibility data will not be used, aside from reporting the total number of interview candidates.

    We are also specifically seeking individuals who have used a “designer dumbphone” (examples include the Light Phone and Punkt phone), or dumbphones developed by individuals, startups, or other small organizations. Further, we’d love to speak with individuals who have been engaged in the design or ideation process behind these devices.

    {1} We define a dumbphone as any mobile phone that is not a smartphone (e.g., traditional flip phones, feature phones) – typically, they have a limited set of features and minimal hardware (e.g., no QWERT{Z/Y} keyboard). If you are not sure if your device qualifies, please email us.

{2} Our definition of high-income is having been able to purchase and use a smartphone, but having chosen to use a dumbphone instead. We are looking for individuals for whom using a smartphone would be the default, but who have actively sought out a dumbphone instead.

  • Running standalone Tor-Snowflake instance on PineBook Pro

    06 Mar 2022 – Annabel

I’ve been running browser-based Snowflake instances for a week or so now and have noticed that I get the most activity (aka my instance is actually useful) at times I’m not reliably on my laptop. So, I set up a standalone instance on my currently minimally-used PineBook Pro. I wanted to leave it running during the day and didn’t need the battery drain of a GUI (my PBP has terrible battery life [to be fixed at some indiscriminate future date]), so I set out to follow these steps, with the goal of running it off a small solar generator & panel I have set up.

Chaos ensued. I would use my PBP so much more regularly if ARM64 were… a few years more developed (sigh). Anyway, after much trial and error, here’s what worked on my PBP running Armbian (‘focal’ version, Ubuntu-based):

    • Get Docker (different directions than regular Docker download) $ curl -fsSL test.docker.com -o get-docker.sh && sh get-docker.sh
• Go ahead and add yourself to the docker group, if that’s your thing. Reminder to log out and back in after doing so. $ sudo usermod -aG docker [your-username]
    • Test Docker $ docker run hello-world
• Get Docker-Compose (and make it executable) $ sudo curl -L "https://github.com/docker/compose/releases/download/v2.2.2/docker-compose-$(uname -s)-$(uname -m)" -o /usr/local/bin/docker-compose && sudo chmod +x /usr/local/bin/docker-compose
• Get the Snowflake Docker yml file and save it.
    • Run your instance $ docker-compose up -d && docker ps
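For reference, the compose file is tiny; a from-memory sketch of its shape (the image name is from memory too, so grab the current file from the Tor Project’s repository rather than trusting this):

version: "3"
services:
  snowflake-proxy:
    image: thetorproject/snowflake-proxy:latest
    network_mode: host
    restart: unless-stopped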

If you’re still feeling edgy about what supporting Tor means (anti-censorship is good; socially-illicit use [as in, harmful to other people] is probably bad), here’s a good read (scroll down to the last part). Mainly: who will be most affected if Tor (+ Snowflake, as an anti-censorship tool) ceases to function?

  • How to write a statement of purpose for computing-as-an-approach PhD programs

    26 Nov 2021 – Annabel

    This guide comes with a few important caveats:

• I have never sat on a hiring / admissions committee. I read a lot of graduate statements as part of formal and informal outreach programs, but I am not a decision-maker myself; my idea of what a “good” statement looks like is based on whether its author ended up joining a program they were happy to be joining, rather than my own judgment of what a “good” program looks like.
• This is specific to PhD applications in interdisciplinary fields that use computing as a means to an end, i.e., computing for the sake of computing is NOT the goal. While there’s a lot of excellent advice out there about applications for roles in which the technology is the focus, there’s less for technology-as-a-tool minded folks, which is the population I intend to speak to. Examples of people for whom this guide might be useful: those interested in iSchool programs (which tend to prioritize the implications of information exchange rather than the exchange system’s design details), contemporary human-computer interaction (HCI) questions, computer science education (CS-ed, CS-E), HCI meets programming languages, etc. An example of one good guide for more tools-as-a-focus folks is that of Mor Harchol-Balter.
• This advice applies to the United States, where my experience is centered. Some of it may apply to other educational systems, some may not.
1. Make sure you’re writing the right statement. Language between programs (even at the same school) varies a great deal. You should always confirm that the statement they’re asking for is the one you’re writing. In my impression (but this is not a perfect delineation, so double check), a “statement of purpose” explains who you are and why you’re applying to the program you’re applying to. If it’s paired with a “personal statement”, the former should emphasize research and your personal background as it relates DIRECTLY to your research. If it’s a “personal statement” paired with a “research statement”, the research statement should be exclusively about what research topic you’re interested in and why; the personal statement can explain your motivations for applying for a PhD in the first place. Throughout this guide I refer to all of the above as “statement”, but take care to match the kind of content to the kind of statement.
2. Use your CV properly. Your essays need not reiterate the specific details of any past internships, jobs, or publications (if you have them – a gentle reminder that the expectation that PhD applicants already have publications is elitist, and you should not worry if you don’t). Instead, focus on what you learned in those experiences, and why you sought them out (if relevant). For example, you need not include a references section in your personal statement with full citations for your works; rather, you can refer to your “Cool Conference ’21 publication” or “personal project A” in which you had the chance to study underwater basket weaving in detail, how it informed your desire to study the way fibers can be combined while immersed, and thus why you want to get your PhD in underwater basket weaving. Readers can then flip to the publications area of your CV if they want more details on the venue or the nature of the project.
3. The Rules of 3. Now is not the time for complex, flowery language. I started my undergraduate career as an English major – I promise you, I love a good metaphor as much as the rest of them – but you should avoid including any figurative language. Your readers (potential advisors, other faculty members) will scan your statement for the first time in a few minutes, often three or less. You want to use simple, precise language that is easy to follow. Further, avoid run-on sentences at all costs. A general rule (assuming Times New Roman 11 or a similar font) is that no sentence should run longer than two lines, three at the absolute most. If this is a familiar binary to you, keep reading; otherwise, don’t worry about it: in my mind, statements of purpose should be written in German-esque language, not French. The former emphasizes direct language; the latter tends to speak broadly and indirectly (e.g., to say it’s raining outside, German says just that, while the structure of French dictates something closer to “there is some rain which is outside”). There’s a place for both in the real world, but in a space-limited statement one is sometimes better suited.
4. Time to brag factually. For many of us, bragging may be uncomfortable and/or an anti-social practice (hello from your Scandinavian-descended author), but you should be factual about your accomplishments. Check that your language is active and emphasizes your contribution. For example, you didn’t “work on the larger team” that explored basket weaving; you “explored basket weaving” as a member of “the Basket-Weaving Lab”. The latter centers you; the former diminishes your role. This is a great time to have a friend or family member who knows you well read your essay. They don’t need any domain knowledge (besides familiarity with the language your statement is written in) or experience with application statements or graduate education; the goal is that they’ve heard you talk about the things you’ve worked on, your challenges and successes, and can help you gut-check that you’re speaking to those fully and giving yourself enough credit. If you don’t have someone who fits those qualifications, you should reach out to me and I would be happy to fill in. Finally, I found it useful to keep in mind that a potential advisor was hiring me, not the team I worked with; therefore, while it’s important to convey that it was not an individual project, you can focus on your role in the project.
5. Make smart use of space. When you’re transitioning between experiences or ideas, make sure to start a new paragraph and use clear transition sentences; this will help readers visually recognize that a shift is occurring, which should reflect the change (or evolution) you’re trying to demonstrate. Following United States-English grammar rules, your paragraphs should consist of at least 3-4 sentences, so be careful they’re not too short.
6. The “Why I’m Not Going to Drop Out” thesis. Let’s be clear here: there should be no shame in dropping out of a PhD program; the program might not be the right fit, you might have found something better to do with your time, etc. The goal of this rule is to help you structure your work in a way that conveys your interests clearly to your advisor. At the end of the day, your potential future advisor will be sinking a big chunk of their research budget into you, so they likely want a clear picture of how interested you are in the thing you’re studying. This is where the computing-as-a-tool-but-not-the-focus angle requires special attention – what is the domain you’re interested in studying and why? The most compelling statements I’ve read often open with some personal passion, followed by some experience in which computing became a tool to address that passion. The key here (IMO) is to be truthful: what is your connection to the domain area, and why is it the thing that “keeps you up at night” (i.e., is always on the back of your mind)? For me, that thing was data literacy – when I watch the news, I tend to subject those around me to a tirade on why many of our domestic problems in the United States come back to a lack of critical data literacy and a social environment that doesn’t support active questioning of authority. You don’t have to indicate that this topic takes over your life – please take time for both yourself and your hobbies! – but you should explain how it shapes your perspective / provides a frame of reference.
7. Run it past someone else. This section is depressing, because it reinforces a lot of privileges that those with family / friends / peers who meet certain criteria already have; e.g., connections with proficient English speakers, adjacency to those with higher education (namely, PhD-level), etc. There is a plus side, however: many programs have pre-read assistance programs through which applicants who don’t have those advantages can get preliminary (i.e., non-official) feedback from current students. You can inquire with specific programs; most have a deadline well in advance of the actual application deadline (so you have time to incorporate feedback), so start looking in late September and early October. We have one at the Georgia Tech College of Computing that I’m happy to answer questions about, or to direct you to someone with more information. With that being said, you should go through (IMO) somewhere between two and four rounds of feedback. Getting more than that may dilute your voice and purpose.
• First read: someone who has familiarity with PhD applications (it doesn’t have to be specific to your research area; experience in US higher education is preferable). This is the person who should tell you “nice first try” and hand you back a copy marked in lots of red pen. Their job is to tell you how well your statement aligns with the goals of a statement in PhD applications generally. My first reader was a friend who is studying for her PhD in Biology – I promise, there was literally no content overlap in our fields – but she could tell me that my opening was trite and that I talked too much about my motivation and too little about my concrete research experience.
• Second read: someone who has familiarity with your area (educational background doesn’t matter). They can help you make sure you’re using the right language to talk about the things you discuss. They might know specific terms that will help your reader connect immediately with what you’re trying to say. In the absence of such a reader, check your terms on a resource like Wikipedia that will show you related terms. Use it to double-check your definitions and take a few minutes to explore adjacent areas briefly to make sure there’s not a better term. As a general rule, it’s always good to spell out a term you’re using and then introduce the abbreviation (e.g., information retrieval as “IR”) in case there’s any confusion or ambiguity for your reader (they might have a background in international relations, for example, and the abbreviation could be confusing).
      • Third read: shortly before you submit it, have someone who knows you well and has English-language proficiency double check your grammar and that you defaulted to active language and fully expounded on your contribution and role in group projects (see #4).
8. After you submit. Find inner peace with that one grammar mistake you made. You’re going to find at least one. Good news: your readers probably won’t catch it and, if they do, they probably won’t care. As people who have submitted many a statement in their own lives, they can probably tell you some of their favorite post-submission errors from their own statements. Mine is that I spelled “the” as “teh” and somehow didn’t catch it between manual proofreading and an automatic spell-checker. If a potential advisor cares enough about a typo to disregard your application entirely, I promise you, you don’t want to be working with them.
9. Roughly one year afterwards. Share your experience and knowledge with the next round of students! Make yourself available to communities around you (as you are able) to help other prospective PhD students adjust to the “hidden curriculum” that is statements. You could join the staff of a pre-read program, for example.
  • New GT CoC Pre-App Review Program

    02 Nov 2021 – Annabel

    The College of Computing has a pre-read program where students from underrepresented backgrounds can have a current PhD student informally read their application materials and give feedback (completely separate from the actual application). The goal is to provide the informal support network that students from overrepresented backgrounds likely already have. You can find more information about our program here.

    If you come across this post and are interested in the GT program, or the HCC PhD program at Georgia Tech, I am happy to answer any questions over email (kontakt@thisdomain).


    More information from the leaders of the program:

The Georgia Institute of Technology Graduate Application Support (GT-GAS) program aims to assist underrepresented applicants as they apply to the GT College of Computing PhD program. A graduate student volunteer may provide one round of feedback on an applicant’s statement of purpose and CV.

    To participate in the GT-GAS Program, please submit your application materials here by 11:59PM EST on November 15th, 2021. If your submission is complete, you will receive feedback by November 30th, 2021. Submissions will be accepted on a rolling basis until our volunteer capacity is reached.

Mentors may reach out to you for drafts of your Statement of Purpose and CV. We don’t expect them to be polished, as long as there is enough for us to read and provide meaningful feedback! Mentors will provide one round of written feedback, though individual mentors may want to set up virtual meetings to give feedback in real time.

    Participation in GT-GAS is completely separate from the official PhD application process and therefore does not guarantee admission to the College of Computing. Your mentor will not review your official application, and participation will not be disclosed to the admissions committee or faculty. Information will be aggregated and anonymized to evaluate the impact of this program, but individual responses will be deleted at the end of this application cycle.

  • An early “Guerilla Linux”

    old desktop computer running Ubuntu, covered in stickers
An already Frankenstein-ed 2008 iMac I used to write mediocre English papers on in high school, covered in age-appropriate stickers, now running surprisingly fast on Ubuntu 20.04 LTS. Yes, I covered my device camera with a matching sticker, because I was both technically precocious and aesthetically very not precocious.

  • PhD Recruiting Panel Discussion

    06 Nov 2020 – Annabel

As part of the larger “PhD Recruiting” event, I’m speaking on a panel with some other super cool folks about the PhD experience and how we arrived at our current positions. The whole event is geared towards students interested in a PhD as well as those who aren’t sure it’s the right choice for them, so join us! More information here.

P.S. If you didn’t find out about the event in time to sign up, there are still ways to get involved with the community at the above link.