Category: Data Science

Data Science, History, Social Justice

Remembering Bill Jenkins

As Chelsea Manning is again sent to jail for refusing to abandon basic press freedoms, I am reminded of Bill Jenkins.

Mr Jenkins passed away recently. He was an epidemiologist (and Morehouse graduate) who bravely exposed the horrific Tuskegee experiments, as Ms Manning exposed egregious human rights violations that occurred during US military operations.

If you are not aware of the Tuskegee experiment, the U.S. Public Health Service allowed Black men to go untreated for syphilis for four decades. It was a controlled experiment to determine the effects of untreated syphilis. The participants were all poor Black sharecroppers, men recruited through Tuskegee University who believed that they were getting free healthcare in exchange for helping to develop a drug to fight “bad blood”. None of those who had syphilis were given access to penicillin, even after the study supposedly ended. Many perished or suffered irreversible harm.

One outcome was the establishment of informed consent and other ethical practices we take for granted when we walk into a doctor’s office or sign up for a clinical trial. Jenkins learned of the study and started asking questions, despite being told to ignore it, or just “look the other way”. In the current climate, Mr Jenkins might well have faced prison. Some principles are worth suffering for; some causes are just that important.

Thank you Bill Jenkins, thank you Chelsea, and thank you to the others doing the right thing.

Black History Month, Data Science, History

Black history month book giveaway!

In honor of the U.S. Black History Month commemoration, I am giving away two copies of the book W. E. B. Du Bois’s Data Portraits: Visualizing Black America.

What do you have to do to be a winner? Be one of the first to create and send in a visualization inspired by the set of infographics on Black America that Dr. W.E.B. Du Bois developed for the 1900 Paris Exposition.

Any visualization that you implement that is relevant to the peoples of Sub-Saharan Africa or the Sub-Saharan diaspora is eligible too!

Just email me or post it as a comment. The first two submissions get the book, and I’ll try to hook up something for outstanding works too!

Books, Data Science, Migration

The Little Shop of Data Science Stories

I am happy to announce that The Little Shop of Stories bookstore in Decatur, GA is awesome for data science! A few blocks away from us, it is such a regional treasure for children’s books and events. Diane has brought game-changing books, authors, and programs to Atlanta and environs.

But last week I was ecstatic when I came across a treasure of data visualization on the shelves.

Who knew data science could bring this much joy?

The book I am referring to is W. E. B. Du Bois’s Data Portraits: Visualizing Black America. But if you live in the Atlanta area, please get it at Little Shop — Amazon can make it without your dollars.

You may be aware of Dr. W.E.B. Du Bois’s work in championing and defining civil rights for peoples of the African diaspora during the first half of the twentieth century. You might be aware of his book The Souls of Black Folk, his leadership of the NAACP, and his intellectual nurturing of African independence efforts. But his work at the Atlanta University Center (now Clark Atlanta University) stands the test of time as a model of how to do good data visualization.

Visualizing Black America pulls together the amazing visualizations that he and his AUC students developed for the 1900 Paris Exposition. They are beautiful, innovative, meticulous and tell the story of Black America at the beginning of the 20th century.

We are so lucky in the Atlanta area to have a bookstore with the vision to stock this treasure. Stop through if you are in the ATL.

Algorithms, Data Science

The science of data science

The Foundations of Data Science Boot Camp given last week (August 27 to 31) at the Simons Institute in Berkeley explored how pure mathematics and theoretical computer science are providing actionable insights that the working data scientist can use, or at least ponder.

I found the talk by Ravi Kannan useful in pointing out how dimensionality reduction techniques like SVD can be used to set clustering up for success. When dealing with immense data sets, this can be the difference between useful clusters and garbage.
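To make the idea concrete, here is a minimal sketch (not taken from the talk) that projects noisy high-dimensional points onto their top singular directions and then clusters them with plain k-means. The toy data, the function names, and the farthest-point initialization are all assumptions chosen for illustration.

```python
import numpy as np

def svd_project(X, k):
    """Project the rows of X onto the span of its top-k right singular vectors."""
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    return X @ Vt[:k].T

def kmeans(X, k, n_iter=50):
    """Plain Lloyd's algorithm with a deterministic farthest-point initialization."""
    centers = [X[0]]
    for _ in range(1, k):
        # Next center: the point farthest from every center chosen so far.
        d = np.min([((X - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(X[d.argmax()])
    centers = np.array(centers, dtype=float)
    for _ in range(n_iter):
        # Assign each point to its nearest center, then recompute the centers.
        dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return labels

def make_toy_data(n_per_cluster=300, dim=200, seed=0):
    """Two planted Gaussian clusters that differ only in the first five coordinates."""
    rng = np.random.default_rng(seed)
    mu = np.zeros(dim)
    mu[:5] = 3.0
    X = np.vstack([
        rng.normal(size=(n_per_cluster, dim)) + mu,
        rng.normal(size=(n_per_cluster, dim)) - mu,
    ])
    truth = np.repeat([0, 1], n_per_cluster)
    return X, truth

X, truth = make_toy_data()
labels = kmeans(svd_project(X, k=2), k=2)
# Cluster labels are arbitrary, so score agreement up to swapping 0 and 1.
agreement = max(np.mean(labels == truth), np.mean(labels != truth))
print(f"agreement with planted clusters: {agreement:.2f}")
```

The projection throws away most of the 200 noisy coordinates before any distances are computed, so k-means runs on a two-column matrix instead of the full data.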

I also thought that David P. Woodruff’s lecture on a dimensionality reduction technique called sketching was impressive for its clarity. As a data scientist or analyst, you’re often in a dilemma when your Impala cluster runs out of memory for that critical model build, and you may just have to sample from that terabyte pile of web pages. It is good to know that you have some math magic behind you when the time comes.
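One common instance of sketching, offered here as an illustration rather than as a summary of the lecture, is sketched least squares: multiply a tall regression problem by a short random matrix and solve the much smaller problem instead. The Gaussian sketch, the problem sizes, and the function name below are assumptions for the example.

```python
import numpy as np

def sketched_least_squares(A, b, sketch_rows, seed=0):
    """Solve min ||S A x - S b|| for a random Gaussian sketch S with few rows."""
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    # Scaling by 1/sqrt(sketch_rows) keeps squared norms unbiased in expectation.
    S = rng.normal(size=(sketch_rows, n)) / np.sqrt(sketch_rows)
    x, *_ = np.linalg.lstsq(S @ A, S @ b, rcond=None)
    return x

# Toy problem: 5,000 observations, 10 features, noisy linear response.
rng = np.random.default_rng(1)
A = rng.normal(size=(5000, 10))
b = A @ rng.normal(size=10) + rng.normal(size=5000)

x_full, *_ = np.linalg.lstsq(A, b, rcond=None)
x_sketch = sketched_least_squares(A, b, sketch_rows=400)

# Compare how well each solution fits the ORIGINAL, unsketched data.
ratio = np.linalg.norm(A @ x_sketch - b) / np.linalg.norm(A @ x_full - b)
print(f"residual ratio (sketched / exact): {ratio:.3f}")
```

The sketched problem has 400 rows instead of 5,000, and the ratio shows how much fit is given up in exchange. A dense Gaussian sketch is the simplest to write down; sparse sketches are the usual choice when the data is too large to multiply by a dense matrix.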

Santosh Vempala thinks the seminar was a better value than Netflix. I’m not sure about that, but those were some good lectures.

Data Science, History, Migration, Politics, Social Justice

Back to Mississippi: Black migration in the 21st century

The recent election of Doug Jones to the U.S. Senate in Alabama, thanks largely to African American turnout, got me thinking: What if the Black populations of Southern cities were to experience a dramatic increase? How many other elections would be impacted?

Does that seem far-fetched? Over a tenth of the Black population of the U.S. left the South during the first half of the last century.

They moved from the rural South to the North and West, hoping to escape race-based terrorism and find economic opportunity. The featured image, from the U.S. Library of Congress, is an infographic about the migration made in 1950 by the Census Bureau. My grandparents were part of this movement: they left oppression in small-town Georgia and Alabama hoping to find a (slightly) better situation in Atlanta.

As the U.S. census figure infographic below indicates, this migration — one wave in 1910 – 1940 and another wave coming 1940 – 1970 — was epic. Isabel Wilkerson’s book The Warmth of Other Suns is a gripping history of this Great Migration.

The Great Migration, 1910 – 1970 from: US Census Bureau. (2012). Retrieved from https://www.census.gov/dataviz/visualizations/020/

A trend towards a reverse migration back to the South has been noted recently. In a 2011 story, the New York Times reported that in 2009, of the 44,000 people who left New York City, over half moved to the South. A more recent report by the Times, provocatively entitled Racism Is Everywhere, So Why Not Move South?, explores some of the rationale behind this movement. The sentiments echo the recent paper Individual Social Capital and Migration by Julie L. Hotchkiss and Anil Rupasingha. Improved social capital (the sense that you are a somebody in the place that you live, that your life matters, or could matter someplace) is a powerful catalyst for movement.

The LinkedIn Workforce Report for January confirms that Southern cities are gaining workers at the expense of Northern cities, and this Redfin analysis reports that there has been some North-to-South migration. According to the LinkedIn Workforce Report, Southern cities are still among the top ten in terms of job migration (at least amongst LinkedIn members). Thriving African American communities in cities like Atlanta and Jacksonville, lower costs of living, and the rise of these cities as technology centers are powerful draws.

To look at the potential political impact of a new reverse migration, I ran a few simulations. I assumed a similar reverse migration rate of 2% per year over ten years. I assumed that the main states from which African Americans migrate are New York, Illinois, Michigan, New Jersey, Indiana, Pennsylvania, Maryland, Ohio, and California, the main destinations of the Great Migration. I assumed that the main destinations of the new migrants are among the states that people left during the initial Great Migration: Alabama, Florida, Georgia, Mississippi, and North Carolina. I arguably could have added Tennessee to this mix. I used a Dirichlet distribution to model the allocation of migrants to the various destination states.
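The setup above can be sketched roughly as follows. This is not the original simulation code: the compounding interpretation of the 2% rate, the flat Dirichlet concentration parameters, the migrant total, and the function names are all assumptions made for illustration.

```python
import numpy as np

# Destination states named in the post.
DESTINATIONS = ["Alabama", "Florida", "Georgia", "Mississippi", "North Carolina"]

def cumulative_departure_share(annual_rate=0.02, years=10):
    """Share of an origin population that has left after `years` of compounding
    departures at `annual_rate`. Compounding is an assumption; the post only
    says 2% per year over ten years."""
    return 1.0 - (1.0 - annual_rate) ** years

def sample_allocation(total_migrants, alpha, rng):
    """Draw one random split of migrants across the destination states."""
    shares = rng.dirichlet(alpha)                      # fractions that sum to 1
    counts = rng.multinomial(total_migrants, shares)   # whole people per state
    return dict(zip(DESTINATIONS, counts))

rng = np.random.default_rng(0)
alpha = np.ones(len(DESTINATIONS))   # flat prior: no state favored (assumption)
total = 1_000_000                    # placeholder total, not a figure from the post

print(f"share leaving over ten years: {cumulative_departure_share():.1%}")
for state, count in sample_allocation(total, alpha, rng).items():
    print(f"{state:15s} {count:>9,d}")
```

Larger values in `alpha` would pull every draw toward an even split, while values below 1 would make lopsided allocations (most migrants heading to one or two states) more common.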

Let’s first revisit the 2016 election map

[Figure: 2016 election map]

Below are a couple of illustrative outcomes from my simulations. In most of the outcomes, Florida, Georgia, and North Carolina are the states in which the political effects of the migration are felt most.

Florida, Georgia, and North Carolina are impacted the most

There’s still hope for Mississippi

In all, I let 10,000 simulations play out, each one sampling the allocation of migrants to destination states from a Dirichlet distribution.

To make the point a bit further, below is a bar chart showing the number of outcomes for each state over the 10,000 simulations in which Black voters had a decisive impact upon the presidential election (i.e. allocation of electoral college votes) for that state.

[Figure: bar chart of the number of simulated outcomes, per state, in which Black voters were decisive]
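A tally like the one in the bar chart could be computed along these lines. This is a sketch under stated assumptions, not the original code: "decisive" is taken to mean that the net votes added by migrants exceed the state's baseline margin, and the state names, margins, migrant total, and net vote rate below are made-up placeholders rather than real election data.

```python
import numpy as np

def count_flips(baseline_margin, total_migrants, net_vote_rate,
                n_sims=10_000, alpha=None, seed=0):
    """Count, per state, the simulations in which new migrants overturn the margin.

    baseline_margin: dict of state -> votes the trailing party lost by (placeholder).
    net_vote_rate:   net votes the trailing party gains per migrant, i.e. turnout
                     times vote split (an assumed single number for all states).
    """
    rng = np.random.default_rng(seed)
    states = list(baseline_margin)
    margins = np.array([baseline_margin[s] for s in states], dtype=float)
    alpha = np.ones(len(states)) if alpha is None else np.asarray(alpha, dtype=float)
    # One Dirichlet draw per simulation: each row is a split of migrants across states.
    shares = rng.dirichlet(alpha, size=n_sims)
    net_votes = shares * total_migrants * net_vote_rate
    flips = (net_votes > margins).sum(axis=0)
    return dict(zip(states, flips.tolist()))

# Hypothetical inputs, chosen only to show the mechanics.
margins = {"State A": 50_000, "State B": 200_000, "State C": 1_000_000}
flips = count_flips(margins, total_migrants=1_000_000, net_vote_rate=0.4)
for state, n in flips.items():
    print(f"{state}: decisive in {n:,d} of 10,000 simulations")
```

With these placeholder inputs, a state whose margin exceeds the most votes migrants could ever add is never flipped, which is the pattern a long tail of low bars in such a chart would reflect.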

The point, though, is not really predicting the dominance of one political party or the other; it is understanding the implications of Black voter empowerment: how Black people are empowered to participate in decisions regarding the health, education, policing, and economic viability of their communities. Further, beyond just Black and White, it speaks to me as an opening to think about participatory multi-racial democracy. After all, there was a flash of time between the Civil War and the enactment of Jim Crow racialist laws in which Citizens of Color of the South were actively involved in governance.

Although these are speculative simulations, for me they contain the seeds of a certain kind of hope. Perhaps the future is the past, but maybe we can mold the future in ways that are universally empowering.