Category: Data Science

Algorithms | Books | Data Science

Surveillance Capitalism and the working data scientist

Shoshana Zuboff’s 2019 book, The Age of Surveillance Capitalism, is a powerful critique of the interplay between capitalism and Big Data. In about 700 pages, she makes a convincing argument that the very existence of human autonomy is under threat. In short, Dr. Zuboff is concerned that large companies have perfected a profit loop: they collect immense amounts of data on consumer behavior, devise models that accurately predict behavior using that data, and then use those models to manipulate consumer behavior to drive profit. It is an intriguing and important read.

The book defines several ways of thinking about the data collection and machine learning technologies at the core of this new surveillance capitalism. Still, I think it falls short in articulating a path forward, especially for people who work in the data industry.

If you’re a data scientist or analyst, or work in any of the myriad jobs behind this phenomenon, you might have concerns about how your 9-to-5 contributes to this mess. If so, you’ve probably asked yourself what to do – how to respond as an ethical, responsible, perhaps moral human being. I’ve put down some thoughts below; chime in if you’d like!

We need to name these things!

My partner always emphasizes that to address an issue, we need to be able to name it as precisely as possible. The devil hides in the details. Those of us with academic training, in particular, have an obligation to identify these names and make them accessible to our people – to make it plain, as my grandmothers would say.

In that spirit, there are a few terms that I pulled from The Age of Surveillance Capitalism that helped me view commercial Big Data in a new light.

Rendition

Rendition is the activity that we (or our ETL flows) engage in when we reduce a human being or human activity to some set of observable features and then treat those features as a complete description of the person or activity. We render a visitor’s engagement with our landing page to the act of clicking, the number of milliseconds from page render to first click, the time spent scrolling, etc.

We render emotion through the straw of a signal that we get from a browser API. Zuboff talks about rendition of emotions, rendition of personal experience, rendition of social relations, and rendition of the self: all operations that reduce very rich and complex human activities to numbers.
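As a hypothetical sketch of what rendition looks like in practice (the event shapes and field names here are my own invention, not from the book), an ETL step might collapse a whole visit into three numbers:

```python
# Hypothetical sketch: "rendering" a rich human interaction down to
# a handful of observable features. Field names are illustrative only.

def render_page_visit(events):
    """Reduce a list of raw interaction events to a flat feature record."""
    clicks = [e for e in events if e["type"] == "click"]
    scrolls = [e for e in events if e["type"] == "scroll"]
    return {
        "clicked": len(clicks) > 0,
        "ms_to_first_click": clicks[0]["t_ms"] if clicks else None,
        "scroll_time_ms": sum(e["duration_ms"] for e in scrolls),
    }

events = [
    {"type": "scroll", "t_ms": 300, "duration_ms": 1200},
    {"type": "click", "t_ms": 1800},
]
record = render_page_visit(events)
# The visitor's whole engagement is now three numbers.
```

Everything else about the person – their mood, their context, their intent – is discarded, and the record is treated as if it were the visit.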

Behavioral Surplus

Behavioral surplus refers to data collected beyond what the application or platform requires to complete the task at hand. Say you have an app that provides secure communication. A user happens to talk about a certain brand of sneakers a lot when using your app. Capturing information about their shoe habits has nothing to do with secure instant messaging, but you capture these interactions anyway and share them with a data aggregator, who in turn shares them with a shoe retailer. The behavioral surplus is the “sneaker interest”. Your user, who was counting on privacy and security, learns that their trust was an illusion when ads for those exact shoes show up everywhere they look online. Maybe that user is undocumented. Maybe ICE shows up at the Foot Locker they shop at.

Instrumentarianism

Instrumentarianism refers to a form of behavioral control, based on Behaviorist models of activity modification, enabled by large-scale, cross-platform logging of online behavior, combined with predictive machine learning and the capacity to conduct online behavioral experiments. The Instrumentarian regime knows enough about a given population that, with some certainty C, when a stimulus A is presented to the group, its members will perform some behavior B. The regime can thus control the group with a given certainty. Because of economies of scale, the Instrumentarian need only guarantee a small response rate to this stimulus to reap enormous profit.
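The economics here are simple enough to sketch in back-of-the-envelope arithmetic. All of the numbers below are my own illustrative assumptions, not figures from the book:

```python
# Back-of-the-envelope sketch of why a tiny response rate is enough.
# Every number here is an illustrative assumption.

audience = 50_000_000        # people the stimulus A reaches
response_rate = 0.005        # fraction who perform behavior B (certainty C folded in)
value_per_response = 2.00    # dollars earned per induced behavior
cost_per_impression = 0.001  # dollars to deliver the stimulus once

revenue = audience * response_rate * value_per_response
cost = audience * cost_per_impression
profit = revenue - cost
print(f"profit: ${profit:,.0f}")  # a 0.5% response still nets $450,000
```

A nudge that moves one person in two hundred is invisible at the individual level and enormously profitable in aggregate, which is exactly the asymmetry Zuboff is worried about.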

Sanctuary

The right to sanctuary is the human right to some place (real or virtual) that can be walled off from inspection. Surveillance capitalism, through always-on devices, potentially removes all sense of sanctuary. Every conceivable space is monitored through “smart” bathroom scales, “smart” televisions, “smart” smoke alarms, and robot vacuum cleaners. On this view, each device is loaded with surplus sensing that reports on and records personal and social interaction far beyond the device’s purpose. That is, your Echo records you even though you might have bought it just to play your favorite music. For the surveillance capitalist, the device more than pays for itself by providing precious behavioral insights that might otherwise have been hidden in “sanctuary”.

Holes in Zuboff’s analysis

As I was reading, I noticed three glaring oversights in Zuboff’s analysis.

Instrumentarianism is explicitly violent

Zuboff makes the claim that Instrumentarianism does not require violence for control. But I still leave her book with a concern about the potential for violence that Instrumentarianism and Surveillance Capitalism empower. To quote from the announcement for the upcoming Data for Black Lives Conference:

Our work is grounded in a clear call to action – to Abolish Big Data means to dismantle the structures that concentrate the power of data into the hands of a few. These structures are political, financial, and they are increasingly violent.

Increasingly violent. We need only reflect on the way in which digital manipulation and capitalist interests have coalesced in state-supported violence around India’s Citizenship Amendment Act. The 2019 El Paso shooting, and the interplay of anti-Black policing with surveillance technologies and corporate interests in tools like Amazon Rekognition, illustrate the complex ways in which violence has played out – and could play out – under surveillance capitalism.

Impact upon the marginalized is profound

Zuboff does not discuss the ways in which surveillance capitalism is especially exploitative of marginalized populations. Absent was a discussion of how companies like Palantir and private prisons exploit the incarcerated, including asylum seekers. Absent was a discussion of how Instrumentarianism is particularly pernicious and violent with respect to marginalized populations, including the homeless, the undocumented, communities of color, the poor, and LGBTQ communities. There are important voices that are startlingly absent in Zuboff’s analysis.

Data sharing can be empowering

Zuboff doesn’t like the idea of the “hive mind” – mass collection of patterns of activity. But when people of a group commit to a collective model, it can be a source of empowerment. Many policing abuses were identified because everyday people were able to pool knowledge and data: shared spreadsheets documenting the high incidence of stop-and-frisk, shared cell phone videos of police shootings of innocent citizens. Class action lawsuits against workplace discrimination and the willingness of people to participate in studies that identify housing discrimination are two more activities in which a “hive mind” – the group discovery of adverse behavior at grand scale – benefits both the individual and the community. The real question, I think, is one of transparency – what power relationships is the technology reinforcing? – rather than the technology itself.

A plan for action

Articulating the answer to “where do we go from here?” is where the book falls short. Zuboff claims (convincingly) that nothing less than human agency is at stake. Yet as far as I could tell, her only suggestion is to trust that the EU’s General Data Protection Regulation (GDPR) will come to our rescue. Human agency is too precious a right to entrust to the GDPR alone.

So what to do?

Here are a few thoughts. Got more? Please chime in!

Unite and act

Ruha Benjamin, in a recent talk, cited an example of high school students who refused to have their education mediated by a Facebook instructional platform. They demanded face-to-face instruction from human teachers, refused to have all of their learning activity rendered into the Facebook cloud – and they won. According to Benjamin, the most successful campaigns against surveillance capitalism require collective effort; the technology is too ubiquitous for a single individual to have much impact. A protest staged today at a dozen colleges against the use of facial recognition on campus underscores the power of collective action.

So you’re never too old or too young, or too math- or technology-challenged, to act and make a difference. I think the most powerful “next step” is to think through what to do with your family, friends, neighbors, classmates, and community. Start locally.

Educate yourself

There are a host of recent books, articles, posts, and classes that provide a more complete view and history of how to deal with surveillance capitalism. A sampling includes:

Find or create community

There are professional and community organizations that are thinking through ways to rethink data science and other technologies ethically:

Advocate to center ethics in computing education

Computing is becoming more common in primary school, but in-depth education on ethics and computing is still rare. It should be a central component of computing instruction. I’ve come across just a handful of classes on the ethical design of data platforms; the topics usually appear in advanced classes or as the final lecture in a course on machine learning or data science. Given the immense impact that even the simplest social app can have, we should advocate for ethics having a central role in computer science and related fields. Talk to your friends in academia and at the high schools and grade schools in your neighborhood. Or just…

Teach

A lot of the technology at the core of surveillance capitalism is available to everyday people in spreadsheet packages and cloud environments. The same tools and algorithms at the core of the surveillance regime can be “flipped” to identify and counter manipulative pricing and discriminatory, racist patterns. The groundbreaking work of Rediet Abebe demonstrates the good that can happen when the tools of data science are used by everyday people to improve their lives.

Advocate for transparency and ethical deployment of software systems

Even inside Google, employees were able to advocate for model cards that explain to cloud-service users how machine learning models are trained and how they might be biased. Google employees have also raised questions about many of the company’s instrumentarianist practices. Certainly, advocating for transparency is one move that can ensure users are provided basic protections. There is still little in the way of openness or ethics training provided to users of online experimentation platforms like Optimizely.
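A model card doesn’t require heavy tooling; even a structured summary published alongside a model is a start. Here is a minimal sketch – the field names loosely echo the model-card idea, and every value is an invented placeholder, not a description of any real system:

```python
# Minimal model-card sketch. Field names loosely follow the model cards
# idea; all values below are invented placeholders.

model_card = {
    "model_details": "Gradient-boosted classifier, v1.2, trained 2019-06",
    "intended_use": "Ranking support tickets by urgency",
    "out_of_scope": "Any decision affecting employment, housing, or credit",
    "training_data": "12 months of anonymized tickets, English text only",
    "evaluation": {"accuracy": 0.91, "false_positive_rate": 0.06},
    "known_biases": "Under-performs on non-English and dialectal text",
}

def render_card(card):
    """Render the card as plain text for publishing alongside the model."""
    return "\n".join(f"{key}: {value}" for key, value in card.items())
```

The point is less the format than the discipline: writing down intended use, out-of-scope use, and known biases forces the team to confront them before deployment.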

Build an inclusive workplace

People continue to argue that if we include the perspectives of the marginalized in the development of these platforms, we may be less prone to rush to deploy them in profoundly abusive ways. The presence of marginalized voices in the large surveillance companies remains at unrepresentative levels. Beyond hiring practices, we’ve yet to see wide-scale implementation of community governance structures. As a data professional, advocate for equity and inclusion.

Understand the humanity of your customers

It can be easy for data professionals to think of “customers” as a probability distribution, a score, or a click. To use Zuboff’s terminology, our view of our social relationship with the people who use our systems has been subject to a kind of rendition as well. The data you’re looking at makes it hard to associate a human being with those numbers, much less connect with one. Further, even if you are directly connected with a front-end product (for example, suggesting videos to watch), you’re more often than not playing a game of aggregates – looking at an abstraction of the actions of millions of people.

Final words

The Age of Surveillance Capitalism is a worthwhile read despite its flaws. We live in a critical time for data science, and ultimately it will be up to all of us to determine what direction it takes.

A/B testing | Algorithms | Data Science

Christo Wilson Discusses the Ethics of Online Behavioral Experiments

If your company runs A/B tests involving its user community, this talk is a must-see. Christo Wilson of Northeastern University discusses an analysis his lab ran on how companies use the Optimizely platform to conduct online experiments. Although these experiments tend to be mostly innocuous, there’s a tremendous need for transparency and mechanisms for accountability. How is your company addressing this?

Data for Breakfast

On May 1, 2019, Dr. Christo Wilson gave a talk on his investigation into online behavioral experiments. The talk was based on a paper entitled Who’s the Guinea Pig? Investigating Online A/B/n Tests in-the-Wild, which he and his students presented at the 2019 ACM Conference on Fairness, Accountability, and Transparency in Atlanta, Georgia.

Online behavioral experiments (OBEs) are studies (also known as A/B/n tests) that companies conduct on websites to gain insight into their users’ preferences. Users typically aren’t asked for consent, though the studies are usually benign. A typical OBE explores questions such as whether changing the background color influences how users interact with the site, or whether users are more likely to read an article if the font is slightly larger.
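Mechanically, enrollment in such a test is often just a deterministic function of the user’s ID, which is part of why users never notice they’re in one. A minimal sketch of that idea (my own illustration, not Optimizely’s actual implementation):

```python
import hashlib

def assign_variant(user_id, experiment, variants):
    """Deterministically bucket a user into one of the experiment's variants."""
    key = f"{experiment}:{user_id}".encode()
    # Hash user + experiment so the same user always lands in the same bucket,
    # but independently across different experiments.
    bucket = int(hashlib.sha256(key).hexdigest(), 16) % len(variants)
    return variants[bucket]

# The same visitor always sees the same background color; there is
# no consent step anywhere in this flow.
color = assign_variant("user-8675309", "bg-color-test", ["white", "blue"])
```

Note what is absent from the code: any point at which the user is told an experiment is running, which is exactly the transparency gap Wilson’s talk highlights.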

Sometimes, these studies cross ethical boundaries. For example, Facebook conducted an ethically problematic experiment designed to manipulate the emotional state of its users.


AI | Inclusion | Data Science

Black In AI workshop call for papers

If you are a student, researcher, or professor at a Historically Black College or University and work actively in data science, machine learning, or artificial intelligence, please consider submitting a paper to the 2019 Black in AI workshop. The deadline is now August 7 — I’d encourage submission even (especially!!) if your research and ideas are still coming together. There are also travel grants available and I’ll post that application soon.

The workshop occurs during the 2019 NeurIPS conference (probably the most-attended conference on deep learning and other AI architectures). The specific goal of the workshop is to encourage the involvement of people from Africa and the African diaspora in the AI field, and to promote research that benefits (and does no harm to) the global Black community.

Paper submission extended deadline: Tue August 7, 2019 11:00 PM UTC

Submit at: https://cmt3.research.microsoft.com/BLACKINAI2019

The site will start accepting submissions on July 7th.

No extensions will be offered for submissions.

We invite submissions for the Third Black in AI Workshop (co-located with NeurIPS). We welcome research work in artificial intelligence, computational neuroscience, and its applications. These include, but are not limited to, deep learning,  knowledge reasoning, machine learning, multi-agent systems, statistical reasoning, theory, computer vision, natural language processing, robotics, as well as applications of AI to other domains such as health and education, and submissions concerning fairness, ethics, and transparency in AI. 

Papers may introduce new theory, methodology, applications or product demonstrations. 

We also welcome position papers that synthesize existing work, identify future directions, or inform on neglected/abandoned areas where AI could be impactful. Examples are work on AI & Arts, AI & Policy, etc.

Submission will fall into one of these 4 tracks:

  1. Machine learning algorithms
  2. Applications of AI 
  3. Position papers
  4. Product demonstrations

Work may be previously published, completed, or ongoing. The workshop will not publish proceedings. We encourage all Black researchers in areas related to AI to submit their work; they need not be the first author.

Formatting instructions

All submissions must be in PDF format. Submissions are limited to two content pages, including all figures and tables. An additional page containing only references is allowed. Submissions should be in a single column, typeset using 11-point or larger fonts and have at least 1-inch margin all around. Submissions that do not follow these guidelines risk being rejected without consideration of their merits. 

Double-blinded reviews

Submissions will be peer-reviewed by at least 2 reviewers, in addition to an area chair. The reviewing process will be double-blinded at the level of the reviewers. As an author, you are responsible for anonymizing your submission. In particular, you should not include author names, author affiliations, or acknowledgements in your submission and you should avoid providing any other identifying information.

Travel grants

Use this link to apply for travel grants to the conference. They are available for eligible attendees, and should be submitted by  Wed July 31, 2019 11:00 PM UTC at the latest (Note that this is one day after the paper submission deadline).

Content guidelines

Submissions must state the research problem, motivation, and contribution. Submissions must be self-contained and include all figures, tables, and references. 

Here are a set of good sample papers from 2017: sample papers 

Questions? Contact us at bai2019@blackinai.org.

Books | Data Science | Historically Black Colleges

Black data science book giveaway

The Atlanta University Center Consortium — the umbrella organization of Morehouse, Spelman, Clark Atlanta University, and Morehouse School of Medicine — just launched a Data Science Initiative. To celebrate, I am giving away two books!

Here’s an excerpt from the announcement:

“The AUCC Data Science Initiative brings together the collective talents and innovation of computer science professors from Morehouse College and other AUCC campuses into an academic program that will be the first of its kind for our students,” said David A. Thomas, president of Morehouse College. “Our campuses will soon produce hundreds of students annually who will be well-equipped to compete internationally for lucrative jobs in data science. This effort, thanks to UnitedHealth Group’s generous donation, is an example of the excellence that results when we come together as a community to address national issues such as the disparity among minorities working in STEM.”

Announcement of the Atlanta University Center data science initiative at http://d4bl.org/conference.html

To commemorate and honor the founding of this initiative, I’ve set up two book giveaways at Amazon. The first book is W. E. B. Du Bois’s Data Portraits: Visualizing Black America. W. E. B. Du Bois was a sociologist who taught at the Atlanta University Center. His visualizations of African American life in the early 20th century still set the standard for data visualization, and this book is a collection of visualizations that he and his Atlanta University students produced for the 1900 Paris Exposition. If Atlanta University students were doing amazing data science 100 years ago without laptops, we can only guess what the future holds. Click this link to get your book.

The second book is Captivating Technology: Race, Carceral Technoscience, and Liberatory Imagination in Everyday Life by Dr. Ruha Benjamin, a contemporary African American scholar at Princeton whose work addresses “the social dimensions of science, technology, and medicine”. Click this link to get a copy of Captivating Technology.

There is only one copy of each book available, so the first person to click gets it.

If you want to know more about the work being done by Black data scientists, you should check out the DATA FOR BLACK LIVES III conference.

I’ll close with one of the sessions from the first Data for Black Lives conference. Where are the Black (data) scientists? Definitely at the Atlanta University Center!

Data Science | History | Social Justice

Remembering Bill Jenkins

As Chelsea Manning is again sent to jail for refusing to abandon basic press freedoms, I am reminded of Bill Jenkins.

Mr. Jenkins passed away recently. He was an epidemiologist (and Morehouse graduate) who bravely exposed the horrific Tuskegee experiments, just as Ms. Manning exposed egregious human rights violations that occurred during US military operations.

If you are not aware of the Tuskegee experiment: for three decades, the US Public Health Service allowed Black men to go untreated for sexually transmitted diseases. It was a controlled experiment to determine the effects of untreated syphilis. The participants were all poor Black sharecroppers – men recruited through Tuskegee University who believed they were getting free healthcare in exchange for helping to develop a drug to fight “bad blood”. None of those who had syphilis were given access to penicillin, even after the study supposedly ended. Many perished or suffered irreversible harm.

One outcome was the establishment of informed consent and the other ethical practices we take for granted when we walk into a doctor’s office or sign up for a clinical trial. Jenkins learned of the study and started asking questions, despite being told to ignore it or just “look the other way”. In the current climate, Mr. Jenkins might well have faced prison. Some principles are worth suffering for; some causes are just that important.

Thank you Bill Jenkins, thank you Chelsea, and thank you to the others doing the right thing.

Black History Month | Data Science | History

Black history month book giveaway!

In honor of the U.S. Black History Month commemoration, I am giving away two copies of the book
W. E. B. Du Bois’s Data Portraits: Visualizing Black America.

What do you have to do to be a winner? Be one of the first to create and send in a visualization inspired by the set of infographics on Black America that Dr. W. E. B. Du Bois developed for the 1900 Paris Exposition.

Any visualization you implement that is relevant to the peoples of Sub-Saharan Africa or the Sub-Saharan diaspora qualifies too!

Just email me or post a comment. The first two submissions get the books, and I’ll try to hook something up for outstanding works too!