Obfuscation: A User's Guide for Privacy and Protest

Finn Brunton and Helen Nissenbaum

Print publication date: 2015

Print ISBN-13: 9780262029735

Published to MIT Press Scholarship Online: September 2016

DOI: 10.7551/mitpress/9780262029735.001.0001



Other Examples

Chapter: 2 Other Examples
Source: Obfuscation
Author(s): Finn Brunton, Helen Nissenbaum
Publisher: The MIT Press
DOI: 10.7551/mitpress/9780262029735.003.0003

Abstract and Keywords

The authors provide an additional eighteen examples of obfuscation. By contrast with the core cases of I.1, these examples are broader and more varied in their application, including speculative design projects, patents, file-sharing tools, and business and financial strategies. This section does not expand the fundamental lexicon of what obfuscation is and does, but instead shows how it is applied to specific kinds of challenges, including textual stylometry, facial recognition, loyalty cards, criminal investigations, and software code.

Keywords: Obfuscation, Privacy, Networks, Identity, Tracking, Filesharing

2.1 Orb-weaving spiders: obfuscating animals

Some animals (and some plants too) have ways to conceal themselves or engage in visual trickery. Insects mimic the appearance of leaves or twigs, rabbits have countershading (white bellies) to eliminate the cues of shape that enable a hawk to easily see and strike, and spots on butterflies’ wings mimic the eyes of predatory animals.

A quintessential obfuscator in the animal world is Cyclosa mulmeinensis, an orb-weaving spider.1 This spider faces a particular problem for which obfuscation is a sound solution: its web must be somewhat exposed in order to catch prey, but that makes the spider much more vulnerable to attack by wasps. The spider’s solution is to make stand-ins for itself out of remains of its prey, leaf litter, and spider silk, with (from the perspective of a wasp) the same size, color, and reflectivity of the spider itself, and to position these decoys around the web. This decreases the odds of a wasp strike hitting home and gives Cyclosa mulmeinensis time to scuttle out of harm’s way.

2.2 False orders: using obfuscation to attack rival businesses

The obfuscation goal of making a channel noisier can be employed not only to conceal significant traffic, but also to raise the costs of organization through that channel—and so raise the cost of doing business. The taxi-replacement company Uber provides an example of this approach in practice.

The market for businesses that provide something akin to taxis and car services is growing fast, and competition for both customers and drivers is fierce. Uber has offered bonuses to recruit drivers from competing services, and rewards merely for visiting the company’s headquarters. In New York, Uber pursued a particularly aggressive strategy against its competitor Gett, using obfuscation to recruit Gett’s drivers.2 Over the course of a few days, several Uber employees would order rides from Gett, then would cancel those orders shortly before the Gett drivers arrived. This flood of fruitless orders kept the Gett drivers in motion, not earning fees, and unable to fulfill many legitimate requests. Shortly after receiving a fruitless order, or several of them, a Gett driver would receive a text message from Uber offering him money to switch jobs. Real requests for rides were effectively obfuscated by Uber’s fake requests, which reduced the value of a job with Gett. (Lyft, a ride-sharing company, has alleged that Uber has made similar obfuscation attacks on its drivers.)

2.3 French decoy radar emplacements: defeating radar detectors

Obfuscation plays a part in the French government’s strategy against radar detectors.3 These fairly common appliances warn drivers when police are using speed-detecting radar nearby. Some radar detectors can indicate the position of a radar gun relative to a user’s vehicle, and thus are even more effective in helping drivers to avoid speeding tickets.

In theory, tickets are a disincentive to excessively fast and dangerous driving; in practice, they also serve as a revenue source for local police departments and governments. For both reasons, police are highly motivated to defeat radar detectors.

The option of regulating or even banning radar detectors is unrealistic in view of the fact that 6 million French drivers are estimated to own them. Turning that many ordinary citizens into criminals seems impolitic. Without the power to stop surveillance of radar guns, the French government has taken to obfuscation to render such surveillance less useful in high-traffic zones by deploying arrays of devices that trigger radar detectors’ warning signals without actually measuring speed. These devices mirror the chaff strategy in that the warning chirps multiply and multiply again. One of them may, indeed, indicate actual speed-detecting radar, but which one? The meaningful signal is drowned in a mass of other plausible signals. Either drivers risk getting speeding tickets or they slow down in response to the deluge of radar pings. And the civic goal is accomplished. No matter how one feels about traffic cops or speeding drivers, the case holds interest as a way obfuscation serves to promote an end not by destroying one’s adversaries’ devices outright but by rendering them functionally irrelevant.

2.4 AdNauseam: clicking all the ads

In a strategy resembling that of the French radar-gun decoys, AdNauseam, a browser plug-in, resists online surveillance for purposes of behavioral advertising by clicking all the banner ads on all the Web pages visited by its users. In conjunction with Adblock Plus, AdNauseam functions in the background, quietly clicking all blocked ads while recording, for the user’s interest, details about ads that have been served and blocked.

The idea for AdNauseam emerged out of a sense of helplessness: it isn’t possible to stop ubiquitous tracking by ad networks, or to comprehend the intricate institutional and technical complexities constituting its socio-technical backend. These include Web cookies and beacons, browser fingerprinting (which uses combinations and configurations of the visitor’s technology to identify their activities), ad networks, and analytics companies. Efforts to find some middle ground through a Do Not Track technical standard have been frustrated by powerful actors in the political economy of targeted advertising. In this climate of no compromise, AdNauseam was born. Its design was inspired by a slender insight into the prevailing business model, which charges prospective advertisers a premium for delivering viewers with proven interest in their products. What more telling evidence is there of interest than clicks on particular ads? Clicks also sometimes constitute the basis of payment to an ad network and to the ad-hosting website. Clicks on ads, in combination with other data streams, build up the profiles of tracked users. Like the French radar decoy systems, AdNauseam isn’t aiming to destroy the ability to track clicks; instead it functions by diminishing the value of those clicks by obfuscating the real clicks with clicks that it generates automatically.
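The dilution effect described above can be illustrated with a minimal sketch. It is not AdNauseam's code; the function names, the ad categories, and the counts are hypothetical, and the "profile" is reduced to each category's share of recorded clicks.

```python
from collections import Counter

def interest_profile(clicks):
    """Return each ad category's share of all recorded clicks."""
    counts = Counter(clicks)
    total = sum(counts.values())
    return {category: n / total for category, n in counts.items()}

def click_everything(real_clicks, ads_served):
    """Mix the user's real clicks with one automatic click per served ad."""
    return list(real_clicks) + list(ads_served)

# Hypothetical data: the user genuinely clicked three camping ads.
real_clicks = ["camping", "camping", "camping"]
# Hypothetical ads served across the visited pages, one category label per ad.
ads_served = ["camping"] * 5 + ["cars"] * 20 + ["finance"] * 25 + ["travel"] * 20

before = interest_profile(real_clicks)
after = interest_profile(click_everything(real_clicks, ads_served))
print("camping share, real clicks only:", before["camping"])
print("camping share, all ads clicked:", round(after["camping"], 3))
```

In this toy data the genuine interest still leaves a trace, but it is outweighed by categories the user never chose, which is the sense in which the clicks lose value without being hidden.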

2.5 Quote stuffing: confusing algorithmic trading strategies

The term “quote stuffing” has been applied to bursts of anomalous activity on stock exchanges that appear to be misleading trading data generated to gain advantage over competitors on the exchange. In the rarefied field of high-frequency trading (HFT), algorithms perform large volumes of trades far faster than humans could, taking advantage of minute spans of time and differences in price that wouldn’t draw the notice or attention of human traders. Timing has always been critical to trading, but in HFT thousandths of a second separate profit and loss, and complex strategies have emerged to accelerate your trades and retard those of your competitors. Analysts of market behavior began to notice unusual patterns of HFT activity during the summer of 2010: bursts of quote requests for a particular stock, sometimes thousands of them in a second. Such activity seemed to have no economic rationale, but one of the most interesting and plausible theories is that these bursts are an obfuscation tactic. One observer explains the phenomenon this way: “If you could generate a large number of quotes that your competitors have to process, but you can ignore since you generated them, you gain valuable processing time.”4 Unimportant information, in the form of quotes, is used to crowd the field of salient activity so that the generators of the unimportant information can accurately assess what is happening while making it more difficult and time consuming for their competitors to do so. They create a cloud that only they can see through. None of the patterns in that information would fool or even distract an analyst over a longer period of time—it would be obvious that they were artificial and insignificant. But in the sub-split-second world of HFT, the time it takes merely to observe and process activity makes all the difference.
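The observer's argument is essentially arithmetic, and a small worked sketch may make it concrete. All figures here (burst size, background quotes, per-quote processing cost) are hypothetical assumptions chosen for illustration, not measurements from any exchange.

```python
def processing_delay_us(quotes_seen, own_quotes, cost_per_quote_us):
    """Microseconds spent processing quotes the firm did not generate itself."""
    return (quotes_seen - own_quotes) * cost_per_quote_us

burst = 5_000       # hypothetical quotes injected in one burst
background = 200    # hypothetical genuine quotes in the same window
cost_us = 2         # hypothetical per-quote processing cost, in microseconds

# The generator can skip its own quotes; a rival has to process everything.
stuffer_delay = processing_delay_us(burst + background, burst, cost_us)
rival_delay = processing_delay_us(burst + background, 0, cost_us)
head_start_us = rival_delay - stuffer_delay

print("generator delay (us):", stuffer_delay)
print("rival delay (us):", rival_delay)
print("head start (us):", head_start_us)
```

Under these assumed numbers the head start is on the order of milliseconds, which matters only because, as the text notes, thousandths of a second separate profit and loss in HFT.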

If the use of “quote stuffing” were to spread, it might threaten the very integrity of the stock market as a working system by overwhelming the physical infrastructure on which the stock exchanges rely with hundreds of thousands of useless quotes consuming bandwidth. “This is an extremely disturbing development,” the observer quoted above adds, “because as more HFT systems start doing this, it is only a matter of time before quote-stuffing shuts down the entire market from congestion.”5

2.6 Swapping loyalty cards to interfere with analysis of shopping patterns

Grocery stores have long been in the technological vanguard when it comes to working with data. Relatively innocuous early loyalty-card programs were used to draw repeat customers, extracting extra profit margins from people who didn’t use the card and aiding primitive data projects such as organizing direct mailings by ZIP code. The vast majority of grocers and chains outsourced the business of analyzing data to ACNielsen, Catalina Marketing, and a few other companies.6 Although these practices were initially perceived as isolated and inoffensive, a few incidents altered the perception of purpose from innocuous and helpful to somewhat sinister.

In 1999, a slip-and-fall accident in a Los Angeles supermarket led to a lawsuit, and attorneys for the supermarket chain threatened to disclose the victim’s history of alcohol purchases to the court.7 A string of similar cases over the years fed a growing suspicion in the popular imagination that so-called loyalty cards were serving ends beyond the allotment of discounts. Soon after their widespread introduction, card-swapping networks developed. People shared cards in order to obfuscate data about their purchasing patterns—initially in ad hoc physical meetings, then, with the help of mailing lists and online social networks, increasingly in large populations and over wide geographical regions. Rob’s Giant Bonus Card Swap Meet, for instance, started from the idea that a system for sharing bar codes could enable customers of the DC-area supermarket chain Giant to print out the bar codes of other customers and then paste them onto their cards.8 Similarly, the Ultimate Shopper project fabricated and distributed stickers imprinted with the bar code from a Safeway loyalty card, thereby creating “an army of clones” whose shopping data would be accrued.9 Cardexchange.org, devoted to exchanging loyalty cards by mail, presents itself as a direct analogue to physical meet-ups held for the same purpose. The swapping of loyalty cards constitutes obfuscation as a group activity: the greater the number of people who are willing to share their cards, and the farther the cards travel, the less reliable the data become.
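The claim that reliability falls as participation rises can be sketched with a toy simulation. The model is an assumption made for illustration: each shopper starts with one card, a chosen fraction of shoppers pass their cards around a ring, and "accuracy" is simply the share of cards still used by their registered owners.

```python
import random

def swap_cards(shoppers, swap_fraction, rng):
    """Map each card's registered owner to the person now carrying that card."""
    shoppers = list(shoppers)
    assignment = {owner: owner for owner in shoppers}
    swappers = rng.sample(shoppers, round(len(shoppers) * swap_fraction))
    if len(swappers) < 2:
        return assignment  # nobody to trade with
    # Pass cards around a ring so every swapper carries someone else's card.
    for owner, user in zip(swappers, swappers[1:] + swappers[:1]):
        assignment[owner] = user
    return assignment

def attribution_accuracy(assignment):
    """Fraction of cards whose purchases still belong to the registered owner."""
    correct = sum(1 for owner, user in assignment.items() if owner == user)
    return correct / len(assignment)

rng = random.Random(0)  # fixed seed so the sketch is repeatable
for fraction in (0.0, 0.4, 0.9):
    assignment = swap_cards(range(100), fraction, rng)
    print(fraction, attribution_accuracy(assignment))
```

The sketch ignores everything a real analyst could still exploit, such as store location or purchase timing, so it shows only the direction of the effect, not its size.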

Card-swapping websites also host discussions and post news articles and essays about differing approaches to loyalty-card obfuscation and some of the ethical issues they raise. Negative effects on grocery stores are of concern, as card swapping degrades the data available to them and perhaps to other recipients. It is worth noting that such effects are contingent both on the card programs and on the approaches to card swapping. For example, sharing of a loyalty card within a household or among friends, though it may deprive a store of individual-level data, may still provide some useful information about shopping episodes or about product preferences within geographic areas. The value of data at the scale of a postal code, a neighborhood, or a district is far from insignificant. And there may be larger patterns to be inferred from the genuine information present in mixed and mingled data.

2.7 BitTorrent Hydra: using fake requests to deter collection of addresses

BitTorrent Hydra, a now-defunct but interesting and illustrative project, fought the surveillance efforts of anti-file-sharing interests by mixing genuine requests for bits of a file with dummy requests.10 The BitTorrent protocol broke a file into many small pieces and allowed users to share files with one another by simultaneously sending and receiving the pieces.11 Rather than download an entire file from another user, one assembled it from pieces obtained from anyone else who had them, and anyone who needed a piece that you had could get it from you. This many-pieces-from-many-people approach expedited the sharing of files of all kinds and quickly became the method of choice for moving large files, such as those containing movies and music.12 To help users of BitTorrent assemble the files they needed, “torrent trackers” logged IP addresses that were sending and receiving files. For example, if you were looking for certain pieces of a file, torrent trackers would point you to the addresses of users who had the pieces you needed. Representatives of the content industry, looking for violations of their intellectual property, began to run their own trackers to gather the addresses of major unauthorized uploaders and downloaders in order to stop them or even prosecute them. Hydra counteracted this tracking by adding random IP addresses drawn from those previously used for BitTorrent to the collection of addresses found by the torrent tracker. If you had requested pieces of a file, you would be periodically directed to a user who didn’t have what you were looking for. Although this was a small inefficiency for the BitTorrent system as a whole, it significantly undercut the utility of the addresses that copyright enforcers gathered, which may have belonged to actual participants but which may have been dummy addresses inserted by Hydra. Doubt and uncertainty had been reintroduced to the system, lessening the likelihood that one could sue with assurance. Rather than attempt to destroy the adversary’s logs or to conceal BitTorrent traffic, Hydra provided an “I am Spartacus” defense. Hydra didn’t avert data collection; however, by degrading the reliability of data collection, it called any specific findings into question.
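A minimal sketch of the padding idea follows. It is not Hydra's implementation: the function names and the decoy count are hypothetical, and the addresses come from ranges reserved for documentation. It shows only how mixing previously seen addresses into a peer list lowers the share of logged addresses that belong to actual participants.

```python
import random

def tracker_response(real_peers, previously_seen, decoy_count, rng):
    """Return a shuffled peer list padded with addresses that hold no pieces."""
    real = set(real_peers)
    candidates = sorted(ip for ip in set(previously_seen) if ip not in real)
    decoys = rng.sample(candidates, min(decoy_count, len(candidates)))
    response = sorted(real) + decoys
    rng.shuffle(response)
    return response

def share_genuine(response, real_peers):
    """Fraction of the listed addresses that actually take part in the swarm."""
    real = set(real_peers)
    return sum(ip in real for ip in response) / len(response)

# Hypothetical addresses from documentation-only ranges.
real_peers = [f"192.0.2.{i}" for i in range(1, 5)]
previously_seen = [f"198.51.100.{i}" for i in range(1, 21)]

rng = random.Random(7)  # fixed seed so the sketch is repeatable
response = tracker_response(real_peers, previously_seen, decoy_count=12, rng=rng)
print(len(response), share_genuine(response, real_peers))
```

An honest downloader pays for this with occasional wasted connection attempts, while anyone harvesting the list can no longer treat an address's presence as evidence of participation.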

2.8 Deliberately vague language: obfuscating agency

According to Jacquelyn Burkell and Alexandre Fortier, the privacy policies of health information sites use particularly obtuse linguistic constructions when describing their use of tracking, monitoring, and data collection.13 Conditional verbs (e.g., “may”), passive voice, nominalization, temporal adverbs (e.g., “periodically” and “occasionally”), and the use of qualitative adjectives (as in “small piece of data”) are among the linguistic constructions that Burkell and Fortier identify. As subtle as this form of obfuscation may seem, it is recognizably similar in operation to other forms we have already described: in place of a specific, specious denial (e.g., “we do not collect user information”) or an exact admission, vague language produces many confusing gestures of possible activity and attribution. For example, the sentence “Certain information may be passively collected to connect use of this site with information about the use of other sites provided by third parties” puts the particulars of what a site does with certain information inside a cloud of possible interpretations. These written practices veer away from obfuscation per se into the more general domain of abstruse language and “weasel words.”14 However, for purposes of illustrating the range of obfuscating approaches, the style of obfuscated language is useful: a document must be there, a straightforward denial isn’t possible, and so the strategy becomes one of rendering who is doing what puzzling and unclear.

2.9 Obfuscation of anonymous text: stopping stylometric analysis

How much in text identifies it as the creation of one author rather than another? Stylometry uses only elements of linguistic style to attribute authorship to anonymous texts. It doesn’t have to account for the possibility that only a certain person would have knowledge of some matter, for posts to an online forum, for other external clues (such as IP addresses), or for timing. It considers length of sentences, choice of words, and syntax, idiosyncrasies in formatting and usage, regionalisms, and recurrent typographical errors. It was a stylometric analysis that helped to settle the debate over the pseudonymous authors of the Federalist Papers (for example, the use of “while” versus “whilst” served to differentiate the styles of Alexander Hamilton and James Madison), and stylometry’s usefulness in legal contexts is now well established.15

Given a small amount of text, stylometry can identify an author. And we mean small—according to Josyula Rao and Pankaj Rohatgi, a sample consisting of about 6,500 words is sufficient (when used with a corpus of identified text, such as email messages, posts to a social network, or blog posts) to make possible an 80 percent rate of successful identification.16 In the course of their everyday use of computers, many people produce 6,500 words in a few days.

Even if the goal is not to identify a specific author from a pool of known individuals, stylometry can produce information that is useful for purposes of surveillance. The technology activist Daniel Domscheit-Berg recalls the moment when he realized that if WikiLeaks’ press releases, summaries of leaks, and other public texts were to undergo stylometric analysis it would show that only two people (Domscheit-Berg and Julian Assange) had been responsible for all those texts rather than a large and diverse group of volunteers, as Assange and Domscheit-Berg were trying to suggest.17 Stylometric analysis offers an adversary a more accurate picture of an “anonymous” or secretive movement, and of its vulnerabilities, than can be gained by other means. Having narrowed authorship down to a small handful, the adversary is in a better position to target a known set of likely suspects.

Obfuscation makes it practicable to muddle the signal of a public body of text and to interfere with the process of connecting that body of text with a named author. Stylometric obfuscation is distinctive, too, in that its success is more readily tested than with many other forms of obfuscation, whose precise effects may be highly uncertain and/or may be known only to an uncooperative adversary.

Three approaches to beating stylometry offer useful insights into obfuscation. The first two, which are intuitive and straightforward, involve assuming a writing style that differs from one’s usual style; their weaknesses highlight the value of using obfuscation.

Translation attacks take advantage of the weaknesses of machine translation by translating a text into multiple languages and then translating it back into its original language—a game of Telephone that might corrupt an author’s style enough to prevent attribution.18 Of course, this also renders the text less coherent and meaningful, and as translation tools improve it may not do a good enough job of depersonalization.

In imitation attacks, the original author deliberately writes a document in the style of another author. One vulnerability of that approach has been elegantly exposed by research.19 Using the systems you would use to identify texts as belonging to the same author, you can determine the most powerful identifier of authorship between two texts, then eliminate that identifier from the analysis and look for the next-most-powerful identifier, then keep repeating the same process of elimination. If the texts really are by different people, accuracy in distinguishing between them will decline slowly, because beneath the big, obvious differences between one author and another there are many smaller and less reliable differences. If, however, both texts are by the same person, and one of them was written in imitation of another author, accuracy in distinguishing will decline rapidly, because beneath notable idiosyncrasies fundamental similarities are hard to shake.
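The elimination procedure described above can be sketched in simplified form. The cited research measures classifier accuracy; the sketch below substitutes a much cruder stand-in, the share of total feature difference that survives after the strongest identifiers are removed. The feature rates (occurrences per 1,000 words) and all names are hypothetical.

```python
def elimination_curve(features_a, features_b, rounds):
    """Share of the total difference left after removing the k strongest features."""
    names = set(features_a) | set(features_b)
    gaps = sorted(
        (abs(features_a.get(n, 0.0) - features_b.get(n, 0.0)) for n in names),
        reverse=True,
    )
    total = sum(gaps)
    # Entry k is the separation that remains once the top k features are dropped.
    return [sum(gaps[k:]) / total if total else 0.0 for k in range(rounds + 1)]

# Hypothetical rates per 1,000 words.
author_x = {"while": 6.0, "upon": 5.0, "the": 60.0, "of": 35.0, "and": 30.0, "to": 28.0}
author_y = {"while": 1.0, "upon": 0.5, "the": 55.0, "of": 31.0, "and": 26.5, "to": 25.0}
# The same author as author_x, imitating author_y's two most visible habits.
imitation = {"while": 1.0, "upon": 0.5, "the": 59.8, "of": 35.1, "and": 30.2, "to": 27.9}

print("different authors:", [round(v, 2) for v in elimination_curve(author_x, author_y, 3)])
print("imitation:        ", [round(v, 2) for v in elimination_curve(author_x, imitation, 3)])
```

With two genuinely different authors the curve falls gradually, because many smaller differences remain; with the imitation it collapses once the two copied habits are removed, which is the signature the text describes.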

Obfuscation attacks on stylometric analysis involve writing in such a way that there is no distinctive style. Researchers distinguish between “shallow” and “deep” obfuscation of texts. “Shallow” obfuscation changes only a small number of the most obvious features—for example, preference for “while” or for “whilst.” “Deep” obfuscation runs the same system of classifiers used to defeat imitation, but does so for the author’s benefit. Such a method might provide real-time feedback to an author editing a document, identifying the highest-ranked features and suggesting changes that would diminish the accuracy of stylometric analysis—for example, sophisticated paraphrasing. It might turn the banalities of “general usage” into a resource, enabling an author to blend into a vast crowd of similar authors.
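As a rough illustration of the feedback loop just described, the sketch below ranks an author's features by how far they deviate from a reference "crowd" and phrases each as a suggestion. This is an assumption-laden toy: real tools rely on trained classifiers, whereas this uses a plain difference in hypothetical rates per 1,000 words, and every name in it is invented for the example.

```python
def telltale_features(author_rates, crowd_rates, top_n=3):
    """Rank features by how far the author deviates from general usage."""
    names = set(author_rates) | set(crowd_rates)
    deviations = {
        n: author_rates.get(n, 0.0) - crowd_rates.get(n, 0.0) for n in names
    }
    # Largest absolute deviation first; ties broken alphabetically.
    ranked = sorted(deviations.items(), key=lambda item: (-abs(item[1]), item[0]))
    return ranked[:top_n]

def suggestions(author_rates, crowd_rates, top_n=3):
    """Turn the highest-ranked deviations into editing hints."""
    tips = []
    for name, delta in telltale_features(author_rates, crowd_rates, top_n):
        advice = "use less often" if delta > 0 else "use more often"
        tips.append(f"{name!r}: {advice} (off by {abs(delta):.1f} per 1,000 words)")
    return tips

# Hypothetical rates per 1,000 words.
author = {"whilst": 4.0, "while": 0.5, "however": 6.0, "the": 58.0}
crowd = {"whilst": 0.2, "while": 4.0, "however": 2.5, "the": 57.0}

for tip in suggestions(author, crowd):
    print(tip)
```

Changing only the top-ranked item would correspond to the "shallow" approach; repeating the ranking after each edit is closer in spirit to the "deep" approach.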

Anonymouth—a tool that is under development as of this writing—is a step toward implementing this approach by producing statistically bland prose that can be obfuscated within the corpus of similar writing.20 Think of the car provided to the getaway driver in the 2011 movie Drive: a silver late-model Chevrolet Impala, the most popular car in California, about which the mechanic promises “No one will be looking at you.”21 Ingenious as this may be, we wonder about a future in which political manifestos and critical documents strive for great rhetorical and stylistic banality and we lose the next Thomas Paine’s equivalent to “These are the times that try men’s souls.”

2.10 Code obfuscation: baffling humans but not machines

In the field of computer programming, the term “obfuscated code” has two related but distinct meanings. The first is “obfuscation as a means of protection”—that is, making the code harder for human readers (or the various forms of “disassembly algorithms,” which help explicate code that has been compiled for use) to interpret for purposes of copying, modification, or compromise. (A classic example of such reverse engineering goes as follows: Microsoft sends out a patch to update Windows computers for security purposes; bad actors get the patch and look at the code to figure out what vulnerability the patch is meant to address; they then devise an attack exploiting the vulnerability they have noticed.) The second meaning of “obfuscated code” refers to a form of art: writing code that is fiendishly complex for a human to untangle but which ultimately performs a mundane computational task that is easily processed by a computer.

Simply put, a program that has been obfuscated will have the same functionality it had before, but will be more difficult for a human to analyze. Such a program exhibits two characteristics of obfuscation as a category and a concept. First, it operates under constraint—you obfuscate because people will be able to see your code, and the goals of obfuscation-as-protection are to decrease the efficiency of the analysis (“at least doubling the time needed,” as experimental research has found), to reduce the gap between novices and skilled analysts, and to give systems that (for whatever reason) are easier to attack threat profiles closer to those of systems that are more difficult to attack.22 Second, an obfuscated program’s code uses strategies that are familiar from other forms of obfuscation: adding significant-seeming gibberish; having extra variables that must be accounted for; using arbitrary or deliberately confusing names for things within the code; including within the code deliberately confusing directions (essentially, “go to line x and do y”) that lead to dead ends or wild goose chases; and various forms of scrambling. In its protective mode, code obfuscation is a time-buying approach to thwarting analysis—a speed bump. (Recently there have been advances that significantly increase the difficulty of de-obfuscation and the amount of time it requires; we will discuss them below.)
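A hand-made toy example may help show several of these strategies at once. It is not the output of any real obfuscator; the names, the decoy constants, and the choice of Python are all assumptions made for illustration. Both functions compute the arithmetic mean of a non-empty list.

```python
def average(values):
    """Readable version: arithmetic mean of a non-empty list."""
    return sum(values) / len(values)

def O0O(OO0):
    # Obfuscated version of the same computation.
    O0 = 0        # running total, hidden behind a look-alike name
    OO = 0        # element counter
    l1 = 7919     # decoy variable: updated on every pass, never used in the result
    for O in OO0:
        if (OO * OO + OO) % 2 == 0:    # always true: n*n + n is even for any integer n
            O0 += O
        else:                           # dead branch that is never taken
            O0 -= l1
        l1 = (l1 * 31 + OO) % 104729    # busywork that only feeds the decoy
        OO += 1
    return O0 / OO

print(average([2, 4, 9]), O0O([2, 4, 9]))
```

The machine executes both versions without difficulty, while a human reader has to establish that the condition always holds and that the decoy never matters before trusting the result. That extra effort is the "speed bump" the text refers to.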

In its artistic, aesthetic form, code obfuscation is in the vanguard of counterintuitive, puzzling methods of accomplishing goals. Nick Montfort has described these practices in considerable detail.23 For example, because of how the programming language C interprets names of variables, a programmer can muddle human analysis but not machine execution by writing code that includes the letters o and O in contexts that trick the eye by resembling zeroes. Some of these forms of obfuscation lie a little outside our working definition of “obfuscation,” but they are useful for illustrating an approach to the fundamental problem of obfuscation: how to transform something that is open to scrutiny into something ambiguous, full of false leads, mistaken identities, and unmet expectations.

Code obfuscation, like stylometry, can be analyzed, tested, and optimized with precision. Its functionality is expanding from the limited scope of buying time and making the task of unraveling code more difficult to something closer to achieving complete opacity. A recent publication by Sanjam Garg and colleagues has moved code obfuscation from a “speed bump” to an “iron wall.” A Multilinear Jigsaw Puzzle can break code apart so that it “fits together” like pieces of a puzzle. Although many arrangements are possible, only one arrangement is correct and represents the actual operation of the code.24 A programmer can create a clean, clear, human-readable program and then run it through an obfuscator to produce something incomprehensible that can withstand scrutiny for a much longer time than before.

Code obfuscation—a lively, rich area for the exploration of obfuscation in general—seems to be progressing toward systems that are relatively easy to use and enormously difficult to defeat. This is even applicable to hardware: Jeyavijayan Rajendran and colleagues are utilizing components within circuits to create “logic obfuscation” in order to prevent reverse engineering of the functionality of a chip.25

2.11 Personal disinformation: strategies for individual disappearance

Disappearance specialists have much to teach would-be obfuscators. Many of these specialists are private detectives or “skip tracers”—professionals in the business of finding fugitives and debtors—who reverse engineer their own process to help their clients stay lost. Obviously many of the techniques and methods they employ have nothing to do with obfuscation, but rather are merely evasive or concealing—for instance, creating a corporation that can lease your new apartment and pay your bills so that your name will not be connected with those common and publicly searchable activities. However, in response to the proliferation of social networking and online presence, disappearance specialists advocate a strategy of disinformation, a variety of obfuscation. “Bogus individuals,” to quote the disappearance consultant Frank Ahearn, can be produced in number and detail that will “bury” pre-existing personal information that might crop up in a list of Web search results.26 This entails creating a few dozen fictitious people with the same name and the same basic characteristics, some of them with personal websites, some with accounts on social networks, and all of them intermittently active. For clients fleeing stalkers or abusive spouses, Ahearn recommends simultaneously producing numerous false leads that an investigator would be likely to follow—for example, a credit check for a lease on an apartment in one city (a lease that was never actually signed) and applications for utilities, employment addresses and phone numbers scattered across the country or the world, and a checking account, holding a fixed sum, with a debit card given to someone traveling to pay for expenses incurred in remote locations. Strategies suggested by disappearance specialists are based on known details about the adversary: the goal is not to make someone “vanish completely,” but to put one far enough out of sight for practical purposes and thus to use up the seeker’s budget and resources.

2.12 Apple’s “cloning service” patent: polluting electronic profiling

In 2012, as part of a larger portfolio purchase from Novell, Apple acquired U.S. Patent 8,205,265, “Techniques to Pollute Electronic Profiling.”27 An approach to managing data surveillance without sacrificing services, it parallels several systems of technological obfuscation we have described already. This “cloning service” would automate and augment the process of producing misleading personal information, targeting online data collectors rather than private investigators.

A “cloning service” observes an individual’s activities and assembles a plausible picture of his or her rhythms and interests. At the user’s request, it will spin off a cloned identity that can use the identifiers provided to authenticate (to social networks, if not to more demanding observers) that it represents a real person. These identifiers might include small amounts of actual confidential data (a few details of a life, such as hair color or marital status) mixed in with a considerable amount of deliberately inaccurate information. Starting from its initial data set, the cloned identity acquires an email address from which it will send and receive messages, a phone number (there are many online calling services that make phone numbers available for a small fee), and voicemail service. It may have an independent source of funds (perhaps a gift card or a debit card connected with a fixed account that gets refilled from time to time) that enables it to make small transactions. It may even have a mailing address or an Amazon locker—two more signals that suggest personhood. To these signals may be added some interests formally specified by the user and fleshed out with existing data made accessible by the scraping of social-network sites and by similar means. If a user setting up a clone were to select from drop-down menus that the clone is American and is interested in photography and camping, the system would figure out that the clone should be interested in the work of Ansel Adams. It can conduct searches (in the manner of TrackMeNot), follow links, browse pages, and even make purchases and establish accounts with services (e.g., subscribing to a mailing list devoted to deals on wilderness excursions, or following National Geographic’s Twitter account). These interests may draw on the user’s actual interests, as inferred from things such as the user’s browsing history, but may begin to diverge from those interests in a gradual, incremental way. (One could also salt the profile of one’s clone with demographically appropriate activities, automatically chosen, building on the basics of one’s actual data by selecting interests and behaviors so typical that they even out the telling idiosyncrasies of selfhood.)

After performing some straightforward analysis, a clone can also take on a person’s rhythms and habits. If you are someone who is generally offline on weekends, evenings, and holidays, your clone will do likewise. It won’t run continuously, and you can call it off if you are about to catch a flight, so an adversary will not be able to infer easily which activities are not yours. The clones will resume when you do. (For an explanation of why we are now talking about multiple clones, see below.) Of course, you can also select classes of activities in which your clones will not engage, lest the actors feigning to be you pirate some media content, begin to search for instructions on how to manufacture bombs, or look at pornography, unless they must do so to maintain plausibility—making all one’s clones clean-living, serious-minded network users interested only in history, charitable giving, and recipes might raise suspicions. (The reason we have switched from talking about a singular clone to speaking about multiple clones is that once one clone is up and running there will be many others. Indeed, imagine a Borgesian joke in which sufficiently sophisticated clones, having learned from your history, demography, and habits, create clones of their own—copies of copies.) It is in your interest to expand this population of possible selves, leading lives that could be yours, day after day. This fulfills the fundamental goal outlined by the patent: your clones don’t dodge or refuse data gathering, but in complying they pollute the data collected and reduce the value of profiles created from those data.
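
The scheduling behavior described above can be made concrete with a minimal sketch. It is not drawn from the patent; every name in it (`active_slots`, `clone_may_act`, `BLOCKED_CATEGORIES`, the sample history) is a hypothetical chosen for illustration. It assumes that a user's "rhythm" can be approximated by the weekday-and-hour slots in which he or she has repeatedly been online, and it leaves out the exception for plausibility-preserving activity mentioned above.

```python
from collections import Counter
from datetime import datetime

# Hypothetical classes of activity the user has told the clones to avoid.
BLOCKED_CATEGORIES = {"piracy", "weapons"}


def active_slots(history, min_count=2):
    """Return the (weekday, hour) slots in which the user is usually online.

    `history` is a list of datetimes of the user's own past activity.
    A slot counts as habitual once it appears at least `min_count` times.
    """
    counts = Counter((t.weekday(), t.hour) for t in history)
    return {slot for slot, n in counts.items() if n >= min_count}


def clone_may_act(now, slots, category, paused=False):
    """Decide whether a clone should perform an action at time `now`."""
    if paused:  # e.g. the user is about to catch a flight
        return False
    if category in BLOCKED_CATEGORIES:
        return False
    return (now.weekday(), now.hour) in slots


# A user who browses on Monday and Tuesday mornings only.
history = [
    datetime(2015, 3, 2, 9, 15),    # Monday
    datetime(2015, 3, 9, 9, 40),    # Monday
    datetime(2015, 3, 3, 10, 5),    # Tuesday
    datetime(2015, 3, 10, 10, 30),  # Tuesday
    datetime(2015, 3, 7, 22, 0),    # one Saturday night, not a habit
]
slots = active_slots(history)
```

With this history, a clone would be allowed to browse photography sites at nine o'clock on a later Monday morning, but not on a Saturday night, not while paused, and never in a blocked category.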

2.13 Vortex: cookie obfuscation as game and marketplace

Vortex—a proof-of-concept game (of sorts) developed by Rachel Law, an artist, designer, and programmer28—serves two functions simultaneously: to educate players about how online filtering systems affect their experience of the Internet and to confuse and misdirect targeted advertising based on browser cookies and other identifying systems. It functions as a game, serving to occupy and delight—an excellent venue for engaging users with a subject as seemingly dry and abstract as cookie-based targeted advertising. It is, in other words, a massively multi-player game of managing and exchanging personal data. The primary activities are “mining” cookies from websites and swapping them with other players. In one state of play, the game looks like a (p.38) few color-coded buttons in the bookmarks bar of your browser that allow you to accumulate and swap between cookies (effectively taking on different identities); in another state of play, it looks like a landscape that represents a site as a quasi-planet that can be mined for cookies. (The landscape representation is loosely inspired by the popular exploration and building game Minecraft.)

Vortex ingeniously provides an entertaining and friendly way to display, manage, and share cookies. As you generate cookies, collect cookies, and swap cookies with other players, you can switch from one cookie to another with a click, thereby effectively disguising yourself and experiencing a different Web, a different set of filters, a different online self. This makes targeted advertising into a kind of choice: you can toggle over to cookies that present you as having a different gender, a different ethnicity, a different profession, and a different set of interests, and you can turn the ads and “personalized” details into mere background noise rather than distracting and manipulative components that peg you as some marketer’s model of your identity. You can experience the Web as many different people, and you can make any record of yourself into a deniable portrait that doesn’t have much to do with you in particular. In a trusted circle of friends, you can share account cookies that will enable you to purchase things that are embargoed in your location—for example, video streams that are available only to viewers in a certain country.

Hopping from self to self, and thereby ruining the process of compiling demographic dossiers, Vortex players would turn online identity into a field of options akin to the inventory screens of an online role-playing game. Instead of hiding, or giving up on the benefits that cookies and personalization can provide, Vortex allows users to deploy a crowd of identities while one’s own identity is offered to a mob of others.
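
The mechanics of holding, switching, and trading cookie sets can be sketched in a few lines. This is an illustration under our own assumptions, not Vortex's code: the `CookieJar` class, the `swap` function, and the cookie names and values are all hypothetical, and a real browser extension would read and write cookies through the browser rather than a dictionary.

```python
class CookieJar:
    """Holds several named cookie sets and tracks which one is in use."""

    def __init__(self):
        self.identities = {}  # label -> {cookie name: value}
        self.active = None    # label of the cookie set currently presented

    def mine(self, label, cookies):
        """Store cookies collected from a site under a label."""
        self.identities[label] = dict(cookies)

    def switch(self, label):
        """Make a stored cookie set the one presented to websites."""
        if label not in self.identities:
            raise KeyError(f"no identity named {label!r}")
        self.active = label
        return self.identities[label]


def swap(jar_a, label_a, jar_b, label_b):
    """Trade one cookie set between two players."""
    if label_a not in jar_a.identities or label_b not in jar_b.identities:
        raise KeyError("both players must own the identity they offer")
    given_by_a = jar_a.identities.pop(label_a)
    given_by_b = jar_b.identities.pop(label_b)
    jar_a.identities[label_b] = given_by_b
    jar_b.identities[label_a] = given_by_a
    # An identity that has been traded away can no longer be active.
    if jar_a.active == label_a:
        jar_a.active = None
    if jar_b.active == label_b:
        jar_b.active = None


alice, bob = CookieJar(), CookieJar()
alice.mine("news-reader", {"ad_profile": "politics"})
bob.mine("gardener", {"ad_profile": "gardening"})

swap(alice, "news-reader", bob, "gardener")
sent = alice.switch("gardener")  # Alice now browses as Bob's gardener
```

After the swap, an advertiser reading Alice's cookies sees a gardening profile that was compiled from someone else's browsing.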

2.14 “Bayesian flooding” and “unselling” the value of online identity

In 2012, Kevin Ludlow, a developer and an entrepreneur, addressed a familiar obfuscation problem: What is the best way to hide data from Facebook?29 The short answer is that there is no good way to remove data, and wholesale withdrawal from social networks isn’t a realistic possibility for many users. Ludlow’s answer is by now a familiar one.

“Rather than trying to hide information from Facebook,” Ludlow wrote, “it may be possible simply to overwhelm it with too much information.” Ludlow’s (p.39) experiment (which he called “Bayesian flooding,” after a form of statistical analysis) entailed entering hundreds of life events into his Facebook Timeline over the course of months—events that added up to a life worthy of a three-volume novel. He got married and divorced, fought cancer (twice), broke numerous bones, fathered children, lived all over the world, explored a dozen religions, and fought for a slew of foreign militaries. Ludlow didn’t expect anyone to fall for these stories; rather, he aimed to produce a less targeted personal experience of Facebook through the inaccurate guesses to which the advertising now responds, and as an act of protest against the manipulation and “coercive psychological tricks” embedded both in the advertising itself and in the site mechanisms that provoke or sway users to enter more information than they may intend to enter. In fact, the sheer implausibility of Ludlow’s Timeline life as a globe-trotting, caddish mystic-mercenary with incredibly bad luck acts as a kind of filter: no human reader, and certainly no friend or acquaintance of Ludlow’s, would assume that all of it was true, but the analysis that drives the advertising has no way of making such distinctions.

Ludlow hypothesizes that, if his approach were to be adopted more widely, it wouldn’t be difficult to identify wild geographic, professional, or demographic outliers—people whose Timelines were much too crowded with incidents—and then wash their results out of a larger analysis. The particular understanding of victory that Ludlow envisions, which we discuss in the typology of goals presented in the second part of this book, is a limited one. His Bayesian flooding isn’t meant to counteract and corrupt the vast scope of data collection and analysis; rather, its purpose is to keep data about oneself both within the system and inaccessible. Max Cho describes a less extreme version: “The trick is to populate your Facebook with just enough lies as to destroy the value and compromise Facebook’s ability to sell you”30—that is, to make your online activity harder to commoditize, as an act of conviction and protest.
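
Seen from the analyst's side, the filtering step Ludlow anticipates could be as simple as the following sketch. It is our illustration rather than anything Ludlow or Facebook is known to use: the function name, the sample counts, and the threshold rule (discard any Timeline with more than five times the median number of events) are assumptions.

```python
from statistics import median


def wash_out_flooders(event_counts, factor=5):
    """Split users into kept and dropped by number of Timeline events.

    A user is dropped when their event count exceeds `factor` times
    the median count across all users.
    """
    typical = median(event_counts.values())
    kept = {u: n for u, n in event_counts.items() if n <= factor * typical}
    dropped = sorted(set(event_counts) - set(kept))
    return kept, dropped


# Hypothetical life-event counts; one user has flooded a Timeline.
counts = {"ana": 12, "ben": 9, "chloe": 15, "dev": 11, "flooder": 640}
kept, dropped = wash_out_flooders(counts)
```

The flooder's record is excluded from the aggregate analysis and everyone else's is untouched, which is why the victory is a limited one: the flood protects the individual without corrupting the larger data set.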

2.15 FaceCloak: concealing the work of concealment

FaceCloak offers a different approach to limiting Facebook’s access to personal information. When you create a Facebook profile and fill in your personal information, including where you live, where you went to school, your likes and dislikes, and so on, FaceCloak allows you to choose whether to display this information openly or to keep it private.31 If you choose to display the information openly, it is passed to Facebook’s servers. If you choose to keep it (p.40) private, FaceCloak sends it to encrypted storage on a separate server, where it may be decrypted for and displayed only to friends you have authorized when they browse your Facebook page using the FaceCloak plug-in. Facebook never gains access to it.

What is salient about FaceCloak for present purposes is that it obfuscates its method by generating fake information for Facebook’s required profile fields, concealing from Facebook and from unauthorized viewers the fact that the real data are stored elsewhere. As FaceCloak passes your real data to the private server, FaceCloak fabricates for Facebook a plausible non-person of a certain gender, with a name and an age, bearing no relation to the real facts about you. Under the cover of the plausible non-person, you can forge genuine connections with your friends while presenting obfuscated data for others.
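
The division of a profile into what Facebook receives and what the separate server holds can be sketched as follows. This shows the data flow only and is not FaceCloak's implementation: the function names, field names, and pool of fake values are hypothetical, and encryption is omitted (a real system would encrypt each private value before storing it).

```python
import random

# Hypothetical pools of plausible stand-in values for private fields.
FAKE_VALUES = {
    "name": ["Alex Morgan", "Sam Carter", "Jamie Lee"],
    "age": [27, 34, 41],
    "hometown": ["Springfield", "Riverton", "Lakeside"],
}


def submit_profile(real_profile, private_fields, rng):
    """Split a profile into what Facebook sees and what is kept private."""
    facebook_view = {}
    private_store = {}
    for field, value in real_profile.items():
        if field in private_fields:
            # The real value goes to the separate server (unencrypted here,
            # for brevity); Facebook receives a plausible fake instead.
            private_store[field] = value
            fakes = [v for v in FAKE_VALUES[field] if v != value]
            facebook_view[field] = rng.choice(fakes)
        else:
            facebook_view[field] = value
    return facebook_view, private_store


def view_profile(facebook_view, private_store, authorized):
    """Return the profile as a given viewer would see it."""
    if not authorized:
        return dict(facebook_view)
    merged = dict(facebook_view)
    merged.update(private_store)  # authorized friends see the real values
    return merged


real = {"name": "Dana Real", "age": 30, "likes": "hiking"}
fb_view, store = submit_profile(real, {"name", "age"}, random.Random(0))
friend_view = view_profile(fb_view, store, authorized=True)
stranger_view = view_profile(fb_view, store, authorized=False)
```

Because every required field still holds a plausible value, nothing in the Facebook-side record signals that real data are being withheld.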

2.16 Obfuscated likefarming: concealing indications of manipulation

Likefarming is now a well-understood strategy for generating the illusion of popularity on Facebook: employees, generally in the developing world, will “like” a particular brand or product for a fee (the going rate is a few U.S. dollars for a thousand likes).32 A number of benefits accrue to heavily liked items—among other things, Facebook’s algorithms will circulate pages that show evidence of popularity, thereby giving them additional momentum.

Likefarming is easy to spot, particularly for systems as sophisticated as Facebook’s. It is performed in narrowly focused bursts of activity devoted to liking one thing or one family of things, from accounts that do little else. To appear more natural, likefarmers employ an obfuscating strategy of liking a spread of pages—generally pages recently added to the feed of Page Suggestions, which Facebook promotes according to its model of the user’s interests.33 The paid work of systematically liking one page can be hidden within scattered likes, appearing to come from a person with oddly singular yet characterless interests. Likefarming reveals the diversity of motives for obfuscation—not, in this instance, resistance to political domination, but simply provision of a service for a fee.

2.17 URME surveillance: “identity prosthetics” expressing protest

The artist Leo Selvaggio wanted to engage with the video surveillance of public space and the implications of facial-recognition software.34 After considering (p.41) the usual range of responses (wearing a mask, destroying cameras, ironic attention-drawing in the manner of the Surveillance Camera Players), Selvaggio hit on a particularly obfuscating response with a protester’s edge: he produced and distributed masks of his face that were accurate enough so that other people wearing them would be tagged as him by Facebook’s facial-recognition software.

Selvaggio’s description of the project offers a capsule summary of obfuscation: “[R]ather than try to hide or obscure one’s face from the camera, these devices allow you to present a different, alternative identity to the camera, my own.”

2.18 Manufacturing conflicting evidence: confounding investigation

The Art of Political Murder: Who Killed the Bishop?—Francisco Goldman’s account of the investigation into the death of Bishop Juan José Gerardi Conedera—reveals the use of obfuscation to muddy the waters of evidence collection.35 Bishop Gerardi, who played an enormously important part in defending human rights during Guatemala’s civil war of 1960–1996, was murdered in 1998.

As Goldman documented the long and dangerous process of bringing at least a few of those responsible within the Guatemalan military to justice for this murder, he observed that those threatened by the investigation didn’t merely plant evidence to conceal their role. Framing someone else would be an obvious tactic, and the planted evidence would be assumed to be false. Rather, they produced too much conflicting evidence, too many witnesses and testimonials, too many possible stories. The goal was not to construct an airtight lie, but rather to multiply the possible hypotheses so prolifically that observers would despair of ever arriving at the truth. The circumstances of the bishop’s murder produced what Goldman terms an “endlessly exploitable situation,” full of leads that led nowhere and mountains of seized evidence, each factual element calling the others into question. “So much could be made and so much would be made to seem to connect,” Goldman writes, his italics emphasizing the power of the ambiguity.36

The thugs in the Guatemalan military and intelligence services had plenty of ways to manage the situation: access to internal political power, to money, and, of course, to violence and the threat of violence. In view of how opaque (p.42) the situation remains, we do not want to speculate about exact decisions, but the fundamental goal seems reasonably clear. The most immediately significant adversaries—investigators, judges, journalists—could be killed, menaced, bought, or otherwise influenced. The obfuscating evidence and other materials were addressed to the larger community of observers, a proliferation of false leads throwing enough time-wasting doubt over every aspect of the investigation that it could call the ongoing work, and any conclusions, into question. (p.43)

Notes:

(1.) Ling Tseng and I.-Min Tso, “A Risky Defence by a Spider Using Conspicuous Decoys Resembling Itself in Appearance,” Animal Behaviour 78, no. 2 (2009): 425-431 (doi:10.1016/j.anbehav.2009.05.017).

(2.) Rip Empson, “Black Car Competitor Accuses Uber of DDoS-Style Attack; Uber Admits Tactics Are ‘Too Aggressive,’” TechCrunch, January 24, 2014 (http://techcrunch.com/2014/01/24/black-car-competitor-accuses-uber-of-shady-conduct-ddos-style-attack-uber-expresses-regret/).

(3.) “Le Gouvernement Veut Rendre les Avertisseurs de Radars Inefficaces,” Le Monde, November 29, 2011 (http://www.lemonde.fr/societe/article/2011/11/29/les-avertisseurs-de-radars-seront-bientot-inefficaces_1610490_3224.html).

(p.103) (6.) Joab Jackson, “Cards Games: Should Buyers Beware of How Supermarkets Use ‘Loyalty Cards’ to Collect Personal Data?” Baltimore City Paper, October 1, 2003 (http://www.joabj.com/CityPaper/031001ShoppingCards.html).

(7.) Robert Ellis Smith, Privacy Journal, March 1999, p. 5.

(8.) http://epistolary.org/rob/bonuscard/, accessed October 25, 2010.

(9.) “The Ultimate Shopper,” Cockeyed.com, last updated December 11, 2002 (http://www.cockeyed.com/pranks/safeway/ultimate_shopper.html).

(11.) For a somewhat technical but accessible overview of BitTorrent that includes a lucid explanation of trackers, see Mikel Izal, Guillaume Urvoy-Keller, Ernst W. Biersack, Pascal Felber, Anwar Al Hamra, and Luis Garcés-Erice, “Dissecting BitTorrent: Five Months in a Torrent’s Lifetime,” Passive and Active Network Measurement 3015 (2004): 1–11 (doi: 10.1007/978-3-540-24668-8_1).

(12.) Hendrik Schulze and Klaus Mochalski, “Internet Study 2008/2009,” Ipoque (http://www.christopher-parsons.com/Main/wp-content/uploads/2009/04/ipoque-internet-study-08-09.pdf).

(13.) Jacquelyn Burkell and Alexandre Fortier, “Privacy Policy Disclosures of Behavioural Tracking on Consumer Health Websites,” Proceedings of the American Society for Information Science and Technology 50, no. 1 (May 2014): 1–9 (doi: 10.1002/meet.14505001087).

(14.) Viola Ganter and Michael Strube, “Finding Hedges by Chasing Weasels: Hedge Detection Using Wikipedia Tags and Shallow Linguistic Features,” in Proceedings of the ACL-IJCNLP Conference Short Papers, 2009 (http://dl.acm.org/citation.cfm?id=1667636).

(15.) David I. Holmes and Richard S. Forsyth, “The Federalist Revisited: New Directions in Authorship Attribution,” Literary and Linguistic Computing 10, no. 2 (1995): 111–127 (doi: 10.1093/llc/10.2.111).

(16.) Josyula R. Rao and Pankaj Rohatgi, “Can Pseudonymity Really Guarantee Privacy?” in Proceedings of the 9th USENIX Security Symposium, 2000 (https://www.usenix.org/legacy/events/sec2000/full_papers/rao/rao_html/index.html).

(p.104) (17.) Daniel Domscheit-Berg, Inside WikiLeaks: My Time With Julian Assange at the World’s Most Dangerous Website (Crown, 2011).

(19.) Moshe Koppel and Jonathan Schler, “Authorship Verification as a One-Class Classification Problem,” in Proceedings of the 21st International Conference on Machine Learning, 2004 (doi: 10.1145/1015330.1015448).

(21.) Drive, directed by Nicolas Winding Refn (Film District, 2011).

(22.) Mariano Ceccato, Massimiliano Di Penta, Jasvir Nagra, Paolo Falcarin, Filippo Ricca, Marco Torchiano, and Paolo Tonella, “The Effectiveness of Source Code Obfuscation: An Experimental Assessment,” in Proceedings of 17th International Conference on Program Comprehension, 2009 (doi: 10.1109/ICPC.2009.5090041).

(23.) See Michael Mateas and Nick Montfort, “A Box, Darkly: Obfuscation, Weird Languages, and Code Aesthetics,” in Proceedings of the 6th Annual Digital Arts and Culture Conference, 2005 (http://elmcip.net/node/3634).

(24.) Sanjam Garg, Craig Gentry, Shai Halevi, Mariana Raykova, Amit Sahai and Brent Waters, “Candidate Indistinguishability Obfuscation and Functional Encryption for all Circuits,” in Proceedings of IEEE 54th Annual Symposium on Foundations of Computer Science, 2013 (doi: 10.1109/FOCS.2013.13).

(25.) Jeyavijayan Rajendran, Ozgur Sinanoglu, Michael Sam, and Ramesh Karri, “Security Analysis of Integrated Circuit Camouflaging,” presented at ACM Conference on Computer and Communications Security, 2013 (doi: 10.1145/2508859.2516656).

(26.) From an interview with Ahearn: Joan Goodchild, “How to Disappear Completely,” CSO, May 3, 2011 (http://www.csoonline.com/article/2128377/identity-theft-prevention/how-to-disappear-completely.html).

(27.) Stephen Carter, “United States Patent: 20070094738 A1—Techniques to Pollute Electronic Profiling,” April 26, 2007 (http://www.google.com/patents/US20070094738).

(28.) Rachel Law, “Vortex” (http://www.milkred.net/vortex/). Much of the detail in this section is based on conversation with Law and on her presentation in the Tool Workshop Sessions at the Symposium on Obfuscation held at New York University in 2014.

(p.105) (29.) Kevin Ludlow, “Bayesian Flooding and Facebook Manipulation,” KevinLudlow.com, May 23, 2012 (http://www.kevinludlow.com/blog/1610/Bayesian_Flooding_and_Facebook_Manipulation_FB/).

(30.) Max Cho, “Unsell Yourself—A Protest Model Against Facebook,” Yale Law & Technology, May 10, 2011 (http://www.yalelawtech.org/control-privacy-technology/unsell-yourself-%E2%80%94-a-protest-model-against-facebook/).

(31.) Wanying Luo, Qi Xie, and Urs Hengartner, “FaceCloak: An Architecture for User Privacy on Social Networking Sites,” in Proceedings of the 2009 IEEE International Conference on Privacy, Security, Risk and Trust (https://cs.uwaterloo.ca/~uhengart/publications/passat09.pdf).

(32.) Charles Arthur, “How Low-Paid Workers at ‘Click Farms’ Create Appearance of Online Popularity,” theguardian.com, August 2, 2013 (http://www.theguardian.com/technology/2013/aug/02/click-farms-appearance-online-popularity).

(33.) Jaron Schneider, “Likes or Lies? How Perfectly Honest Business can be Overrun by Facebook Spammers,” TheNextWeb, January 23, 2014 (http://thenextweb.com/facebook/2014/01/23/likes-lies-perfectly-honest-businesses-can-overrun-facebook-spammers/).

(34.) Leo Selvaggio, “URME Surveillance,” 2014 (http://www.urmesurveillance.com).

(35.) Francisco Goldman, The Art of Political Murder: Who Killed the Bishop? (Grove, 2008).

(36.) Ibid., 109.