Category: AI

AI, Conferences, inclusion

Notes from the Black In AI 2019 Workshop

In early December, I attended the Black In AI workshop (BAI), part of the NeurIPS AI conference held in Vancouver.

Timnit Gebru and Rediet Abebe founded BAI three years ago to address the near-complete lack of Black and African voices at NeurIPS and other AI conferences.

Over that period, the organization has had a tremendous impact: participation has grown to several hundred attendees, it has spawned affiliated conferences like Deep Learning Indaba, it was instrumental in bringing the Eighth International Conference on Learning Representations to Ethiopia in 2020, and it has initiated a range of mentoring and training efforts across the African diaspora.

I spent a few hours this year participating as an organizer (some coordination of remote presenters and travel grants). The talks were streamed and recorded here.

There is a lot that I learned by participating and it was an honor to work with the brilliant people who made the conference happen — I wanted to share some of what I’ve been able to think through in hopes that there might be some nuggets of value.

The interesting stuff happens at the margins

When I first started in AI, it was an area that existed on the margin of computer science. Neural networks were on the margin of that margin. I think that there is a lot of freedom and creativity that comes when one is open to just think and experiment, though there is also the pressure of proving the viability of your position. You can find real innovation being birthed if you look carefully. When you hear talks that put all your assumptions into question, you know that you’ve probably arrived at the right place.

What I found at Black In AI, then, was a lot of work questioning the basic assumptions of a field that has moved from the margin to the spotlight (literally half of the commercial booths at NeurIPS were hedge funds).

There are three talks (among many) that stood out for me in this respect.

Abeba Birhane: Rethinking the Ethical Foundations of AI

I had the privilege of hearing Abeba Birhane, who was deservedly awarded Best Paper.

There is a lot of work on bias in machine learning models. For example, Assessing Social and Intersectional Biases in Contextualized Word Representations was presented a few days after Birhane’s talk. A lot of the “solutions” in the fairness literature focus on de-biasing the training and inference process. But Birhane’s talk called into question the point of de-biasing algorithms, probing the intent of these algorithms. Is the point to present decision processes that are unfair as fair? Is the point really to reify structural oppression, to put lipstick on a pig (to borrow the title of one paper)? She is searching for the voices of the marginalized in artificial intelligence and machine learning.

To take a concrete example, many companies are using the Textio app to rewrite job descriptions to have less gender bias. But maybe the bias it identifies is really an indicator of internal structural patterns of oppression. If so, how do you get companies to address the internal gender issues that give rise to these biased job descriptions to begin with?


Her talks are recorded. Relational Ethics starts at 20:30 into the presentation. Her talk at ML for the Developing World: Challenges and Risks starts 38:00 in. There is an accompanying blog post.

Matthew Kinney: Defending Black Twitter from Deepfakes

Matthew Kinney’s talk, “Creative Red Teaming: Approaches to Addressing Bias and Misuse in Machine Learning”, presented an approach that uses deep learning to safeguard internet platforms from misinformation campaigns.

Kinney began looking at the Internet Research Agency’s disinformation effort when it became apparent that Black Twitter was being targeted as part of voter suppression efforts. Since BAI, we’ve seen similar campaigns launched in support of India’s Citizenship Amendment Act and other repressive efforts. These campaigns are likely to be a constant this year, making Kinney’s work all the more critical.

Lest you think that disinformation campaigns are just about video manipulation, Kinney makes the point that misinformation based on text generators like OpenAI’s GPT-2 can be more harmful.


Sara Menker: Data Science for Agriculture

One of the other impactful talks was by Sara Menker, CEO of Gro Intelligence, a company that does agricultural analytics. I was particularly interested in how the data science team develops models quickly in response to rapidly changing weather and farming conditions, and in how they work as a team that is split across Kenya and New York.

Sara Menker’s talk starts 1:48 into the video.

Prominent themes

A number of the oral presentations at BAI were about speech and language processing, particularly the development of technology to support Amharic, Tigre, Yoruba, and other African languages. I spoke with the founder of a small startup, Latan, who is working on Tigre translation. Healthcare and agriculture applications also featured prominently.


Remote Presentations

A number of presenters were not able to make it, mostly due to visa issues (details below). The diversity of their talks is indicative of the richness of the research community. Here’s a recording of Simba Nyatsanga’s talk on automatic video captioning.

You can access the Black In AI 2019 YouTube channel to view the others.

Visa Privilege

One of the many issues that Black In AI has tackled is transportation exclusion. Many researchers from Africa, South America, and the Caribbean lack either the institutional or personal resources that would enable a trip to Canada (or other destinations where computing conferences are frequently held). A large part of BAI’s fundraising effort is about putting the resources together to bridge that gap: travel grants for presenters and other attendees provide airfare and lodging. This makes BAI one of the most economically inclusive workshops.

All that said, an ongoing challenge, down to the last minute, was getting presenters to the conference.

We had nearly 40 presenters denied visas right off. Most of these denials were reversed once senior IRCC officials reviewed the applications, but for many the reversal came too late, in some cases on the day that the conference was to start. In large part, denials and subsequent reversals seemed to hinge on a political calculus. Senior officials only became involved after pressure from Wired and BBC articles, members of the House of Commons, and various high-profile AI researchers.

My analysis is that Canada wants to be perceived as an inclusive country with a progressive visa policy and is planning on building AI as a growth industry, although these values may not be shared by individual consular staff, or maybe even by the AI programs used for visa screening. This is much less the case in the U.S., where policies are in open opposition to fair visa access for persons from Africa, Islamic countries, and other locations outside of Europe, the U.S., Canada, and Australia.

Despite the reversals, there were other unexpected visa conundrums. Several participants flying through South Africa had to be provided with alternate tickets because they did not have transit visas for Hong Kong. Several Nigerian presenters were price gouged by Turkish Airlines when trying to get on their flights; that is, they were presented with substantial additional visa fees at the gate. The complaints stemming from these policies resulted in last week’s suspension of Turkish Airlines in Nigeria. Conference organizers had to scramble to find alternate flights home for those who flew on Turkish Airlines. I give these anecdotes only to highlight the immense privileges that those of us in the U.S., EU, and Canada enjoy in having relatively open and worry-free travel.

Planning Distributed Conferences

Pulling off the Black in AI workshop itself was the epitome of a distributed team in action. Dealing with the problem of managing visa rejections in Brazil and Nigeria, or just managing hotel payments and livestreams, highlighted the need for coordination and process. There is a lot of process knowledge that I feel is unique to making such a trans-national, inclusive (language, gender identity, diverging racial categorizations) event work. I wondered about the best ways to capture and curate that knowledge.

On Having Allies

I was encouraged to see individuals come together in sincere and supportive ways to bring about a wider view of what global collaboration could be. The coordinated effort by people in Women In AI and LatinX in AI was amazing. The tireless, round-the-clock efforts by those both famous and invisible, and the commitments to encouraging and supporting the emergence of new scholars, developers, artists, and thinkers, were uplifting in spite of so many other causes for concern. I don’t doubt that there is an AI bubble, or that in a few years generative networks and transformers will be as pedestrian as rice cookers or smoke alarms, less AI than just another kind of device or program. What I think is that getting people together from across the globe, really from across the globe, from across the economic and gender and racial divides, is how important and unimagined change happens.

AI, Conferences, inclusion

Black In AI paper submission deadline extended

Black in AI is a workshop that centers the work of Black AI researchers and practitioners from across the globe. The paper submission deadline for the 2019 workshop has been extended to August 7. This is its third year.

I’d encourage submission even (especially!!) if your research and ideas are still coming together. There are a limited number of travel grants available which you can apply for here. You don’t have to have an accepted paper to apply for the travel grant.

AI, Algorithms

Gödel, Incompleteness and Privacy

Avi Wigderson has a nice talk on how Kurt Gödel’s incompleteness theorems bound what can and cannot be computed (or proved) by simple programs.

In this recent post I talked about how Gödel’s theorem was used to show that for certain kinds of learning algorithms, we can’t know definitively whether the algorithm learns the right thing or not. This is kind of equivalent to saying that we can’t definitively know whether there will be a gap in the program’s learning.

The flip side of this, as Wigderson points out, is that it is probably a good thing that there are certain things that are too hard for a program to figure out. This hardness is the key to privacy — the harder it is to decipher an encrypted message, the more you can have confidence in keeping the message content private. This principle is at the core of what allows e-commerce.
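To make the asymmetry concrete, here is a toy sketch (not taken from Wigderson’s talk). It assumes SHA-256 as a stand-in for a function that is easy to compute but has no known shortcut for inversion, and it uses a deliberately tiny secret so that the brute-force search finishes instantly. The names `digest` and `brute_force` are illustrative only.

```python
import hashlib
from itertools import product

ALPHABET = "abcdefghijklmnopqrstuvwxyz"

def digest(secret: str) -> str:
    """Easy direction: hashing a secret takes one cheap function call."""
    return hashlib.sha256(secret.encode("utf-8")).hexdigest()

def brute_force(target: str, length: int):
    """Hard direction: with no shortcut, try every candidate of the given length."""
    tries = 0
    for letters in product(ALPHABET, repeat=length):
        tries += 1
        candidate = "".join(letters)
        if digest(candidate) == target:
            return candidate, tries
    return None, tries

secret = "cab"  # toy secret; real keys are vastly longer
recovered, tries = brute_force(digest(secret), len(secret))
print(recovered, tries)
# The search space has 26 ** length candidates, so every extra letter
# multiplies the worst-case work by 26 while hashing stays one call.
```

The forward computation costs one call regardless of length, while the search grows exponentially; real systems rely on key spaces so large that this kind of search is infeasible.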

Perhaps there is a way to structure our online communications or transactions so that learning our behavior in pernicious ways becomes impossibly hard. This might defuse a lot of the emerging fears surrounding AI.

Wigderson makes his point — what he calls the “Unreasonable usefulness of hard problems” — about 30 minutes into the talk.

AI, Social Justice

Which cities use facial recognition?

San Francisco famously banned the use of facial recognition by police and other municipal authorities on May 14th of this year. Citizens in Detroit, angered by the use of facial recognition in Project Green Light, forced a moratorium on its use. Although Orlando has halted an immediate deployment, a trial is being conducted involving police officers only. According to Natalie Bednarz, the Digital Communications Supervisor in the Orlando office of Communications and Neighborhood Relations:

if the City of Orlando Police Department decides to ultimately implement official use of the technology, City staff would explore procurement and develop a policy governing the technology

Email communication from the Orlando office of Communications and Neighborhood Relations

This report by Georgetown Law School finds that Chicago uses facial recognition in policing and throughout its mass transit systems.

Beyond surveillance cameras, several cities have been forced by ICE to turn over driver’s license photos to ICE’s facial recognition software to identify persons who are not U.S. citizens. Not only is facial recognition software notoriously bad at identifying the faces of African Americans, but systems also score poorly in identifying people who identify ethnically as Latinx.

The Georgetown Law School in 2016 put together a list of city and state governments across the U.S. that use facial recognition.

Should facial recognition be banned altogether in policing?

AI, inclusion, Data Science

Black In AI workshop call for papers

If you are a student, researcher, or professor at a Historically Black College or University and work actively in data science, machine learning, or artificial intelligence, please consider submitting a paper to the 2019 Black in AI workshop. The deadline is now August 7 — I’d encourage submission even (especially!!) if your research and ideas are still coming together. There are also travel grants available and I’ll post that application soon.

The workshop occurs during the 2019 NeurIPS conference (this is probably the most attended conference on deep learning and other AI architectures). The specific goal of the workshop is to encourage involvement of people from Africa and the African diaspora in the AI field, and to promote research that benefits (and does no harm to) the global Black community.

Paper submission extended deadline: Wed August 7, 2019 11:00 PM UTC

Submit at: https://cmt3.research.microsoft.com/BLACKINAI2019

The site will start accepting submissions on July 7th.

No extensions will be offered for submissions.

We invite submissions for the Third Black in AI Workshop (co-located with NeurIPS). We welcome research work in artificial intelligence, computational neuroscience, and its applications. These include, but are not limited to, deep learning,  knowledge reasoning, machine learning, multi-agent systems, statistical reasoning, theory, computer vision, natural language processing, robotics, as well as applications of AI to other domains such as health and education, and submissions concerning fairness, ethics, and transparency in AI. 

Papers may introduce new theory, methodology, applications or product demonstrations. 

We also welcome position papers that synthesize existing work, identify future directions, or inform on neglected/abandoned areas where AI could be impactful. Examples are work on AI & Arts, AI & Policy, etc.

Submission will fall into one of these 4 tracks:

  1. Machine learning Algorithms
  2. Applications of AI 
  3. Position papers
  4. Product demonstrations

Work may be previously published, completed, or ongoing. The workshop will not publish proceedings. We encourage all Black researchers in areas related to AI to submit their work. They need not be first author of the work.

Formatting instructions

All submissions must be in PDF format. Submissions are limited to two content pages, including all figures and tables. An additional page containing only references is allowed. Submissions should be in a single column, typeset using 11-point or larger fonts and have at least 1-inch margin all around. Submissions that do not follow these guidelines risk being rejected without consideration of their merits. 

Double-blinded reviews

Submissions will be peer-reviewed by at least 2 reviewers, in addition to an area chair. The reviewing process will be double-blinded at the level of the reviewers. As an author, you are responsible for anonymizing your submission. In particular, you should not include author names, author affiliations, or acknowledgements in your submission and you should avoid providing any other identifying information.

Travel grants

Use this link to apply for travel grants to the conference. They are available for eligible attendees, and should be submitted by  Wed July 31, 2019 11:00 PM UTC at the latest (Note that this is one day after the paper submission deadline).

Content guidelines

Submissions must state the research problem, motivation, and contribution. Submissions must be self-contained and include all figures, tables, and references. 

Here is a set of good sample papers from 2017: sample papers

Questions? Contact us at bai2019@blackinai.org.

AI, Algorithms, Machine Learning

Gödel, Incompleteness, and AI

Kurt Gödel was one of the great logicians of the 20th century. Although he passed away in 1978, his work is now impacting what we can know about today’s latest A.I. algorithms.

Gödel’s most significant contribution was probably his two Incompleteness Theorems. In essence they state that the standard machinery of mathematical reasoning is incapable of proving all of the true mathematical statements that could be formulated. A mathematician would say that the consistency of standard set theory (a collection of axioms known as Zermelo–Fraenkel set theory with the axiom of choice, or ZFC), meaning that it never proves two contradictory statements, cannot be proved from within ZFC itself. That is, there are some true things which you just can’t prove with math.
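For readers who like symbols, the two theorems can be sketched as follows. This is standard textbook notation rather than anything taken from the sources linked here: $\vdash$ means “proves”, $\mathrm{Con}(\mathrm{ZFC})$ is the formalized statement that ZFC never derives a contradiction, and `\nvdash` assumes the `amssymb` package.

```latex
% First incompleteness theorem (informal form, assuming ZFC is consistent):
% some sentence G can be neither proved nor refuted.
\[
  \exists\, G:\quad \mathrm{ZFC} \nvdash G
  \quad\text{and}\quad
  \mathrm{ZFC} \nvdash \neg G .
\]
% Second incompleteness theorem: a consistent ZFC cannot prove its own consistency.
\[
  \mathrm{Con}(\mathrm{ZFC}) \;\Longrightarrow\; \mathrm{ZFC} \nvdash \mathrm{Con}(\mathrm{ZFC}) .
\]
```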

In a sense, this is like the recent U.S. Supreme Court decision on political gerrymandering. The court ruled “that partisan gerrymandering claims present political questions beyond the reach of the federal courts”. Yeah, the court stuck their heads in the sand, but ZFC just has no way to tell truth from falsity in certain cases. Gödel gives mathematical formal systems a pass.

It now looks like Gödel has rendered his ruling on machine learning.

A lot of the deep learning algorithms that enable Google Translate and self-driving cars work amazingly well, but there’s not a lot of theory that explains why they work so well; a lot of the advances over the past ten years amount to neural network hacking. Computer scientists are actively looking at ways of figuring out what machines can learn, and whether there are efficient algorithms for doing so. There was a recent ICML workshop devoted to the theory of deep learning, and the Simons Institute is running a program on the theoretical foundations of deep learning this summer.

However, in a recent paper entitled “Learnability can be undecidable”, Shai Ben-David, Amir Yehudayoff, Shay Moran and colleagues showed that there is at least one generalized learning formulation which is undecidable. That is, although the particular algorithm might learn to predict effectively, you can’t prove that it will.

They looked at a particular kind of learning in which the algorithm tries to learn a function that maximizes the expected value of some metric. The authors chose as a motivating example the task of picking the ads to run on a website, given that the audience can be segmented into a finite set of user types. Using what amounts to server logs, the learning function has to output a scoring function that says which ad to show given some information on the user. The scoring function learned has to maximize the number of ad views by looking at the results of previous views. This kind of problem obviously comes up a lot in the real world, so much so that there is a whole class of algorithms, Expectation Maximization, that has been developed around this framework.
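As a rough illustration of the ad example, here is a minimal sketch of a learner that reads a log and returns a scoring function. It is not the paper’s algorithm, which works in a far more abstract setting. The log format `(user_type, ad, viewed)` and the names `learn_scoring_function` and `choose_ad` are assumptions made up for this example.

```python
from collections import defaultdict

def learn_scoring_function(log):
    """Return a function mapping a user type to the ad with the best observed view rate.

    `log` is a list of (user_type, ad, viewed) records (a hypothetical format).
    """
    shown = defaultdict(int)
    viewed = defaultdict(int)
    for user_type, ad, was_viewed in log:
        shown[(user_type, ad)] += 1
        viewed[(user_type, ad)] += int(was_viewed)

    best_ad = {}
    best_rate = {}
    for (user_type, ad), count in shown.items():
        rate = viewed[(user_type, ad)] / count
        if user_type not in best_rate or rate > best_rate[user_type]:
            best_rate[user_type] = rate
            best_ad[user_type] = ad

    def choose_ad(user_type, default=None):
        # Fall back to `default` for user types that never appeared in the log.
        return best_ad.get(user_type, default)

    return choose_ad

log = [
    ("cyclist", "bike_ad", True),
    ("cyclist", "bike_ad", True),
    ("cyclist", "car_ad", False),
    ("driver", "car_ad", True),
    ("driver", "bike_ad", False),
]
choose_ad = learn_scoring_function(log)
print(choose_ad("cyclist"), choose_ad("driver"))
```

The learner simply trusts the frequencies it has seen, which works on a finite log like this one. The theoretical question is whether any such procedure can be guaranteed to succeed in general.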

One of the successes of theoretical machine learning is realizing that you can speak about a learning function in terms of a single number called the VC dimension, which roughly measures how rich the family of candidate functions is: it is the size of the largest set of items that the family can label in every possible way. The authors also cleverly use the fact that machine learning is equivalent to compression.
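To make the VC dimension less abstract, here is a brute-force sketch based on its usual textbook definition: the size of the largest set of points that a family of classifiers can label in every possible way (“shatter”). The small integer domain and the two toy families, thresholds and intervals, are assumptions chosen for illustration and do not come from the paper.

```python
from itertools import combinations

def shatters(hypotheses, points):
    """True if the hypotheses realize every possible 0/1 labeling of `points`."""
    realized = {tuple(h(x) for x in points) for h in hypotheses}
    return len(realized) == 2 ** len(points)

def vc_dimension(hypotheses, domain):
    """Largest size of a subset of `domain` that the hypotheses shatter (brute force)."""
    best = 0
    for size in range(1, len(domain) + 1):
        if any(shatters(hypotheses, subset) for subset in combinations(domain, size)):
            best = size
        else:
            # If no set of this size is shattered, no larger set can be either.
            break
    return best

domain = [1, 2, 3, 4, 5]

# Thresholds: label x as 1 when x >= t.
thresholds = [lambda x, t=t: int(x >= t) for t in range(0, 7)]

# Intervals: label x as 1 when lo <= x <= hi.
intervals = [lambda x, lo=lo, hi=hi: int(lo <= x <= hi)
             for lo in range(0, 7) for hi in range(0, 7)]

print(vc_dimension(thresholds, domain), vc_dimension(intervals, domain))
```

Thresholds cannot label a smaller point 1 and a larger point 0, so they shatter only single points. Intervals can shatter any pair but cannot produce the labeling 1, 0, 1 on three points.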

Think of it this way. If you magically could store all of the possible entries in the server log, you could just look up what previous users had done and base your decision (which ad to show) on that. But chances are that since many of the users who are cyclists liked bicycle ads, you don’t need to store all of the responses for users who are cyclists to guess accurately which ad to show someone who is a cyclist. Compression amounts to successively reducing the information you store (training data or features) as long as your algorithm performs acceptably.
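The cyclist example can be sketched in a few lines. This is a toy, not the compression scheme defined in the paper. It assumes that every user of a given type likes the same ad, and the names `compress` and `reconstruct` are made up for illustration.

```python
def compress(log):
    """Keep one representative (user_type, liked_ad) record per user type."""
    kept = {}
    for user_type, liked_ad in log:
        kept.setdefault(user_type, liked_ad)  # first record for each type wins
    return list(kept.items())

def reconstruct(compressed):
    """Rebuild an ad chooser from the compressed records alone."""
    table = dict(compressed)
    return lambda user_type: table.get(user_type)

full_log = [
    ("cyclist", "bike_ad"), ("cyclist", "bike_ad"), ("cyclist", "bike_ad"),
    ("driver", "car_ad"), ("driver", "car_ad"),
]
small_log = compress(full_log)
choose_ad = reconstruct(small_log)

# The chooser built from the small log agrees with every record in the full log.
agrees = all(choose_ad(user_type) == ad for user_type, ad in full_log)
print(len(full_log), len(small_log), agrees)
```

Here five log entries shrink to two without changing any decision. When users of the same type disagree, the compressed log loses information, and that trade-off is where the theory becomes interesting.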

The authors defined a compression scheme (the equivalent of a learning function) and were then able to link the compression scheme to incompleteness. They were able to show that the scheme works if and only if a particular kind of undecidable hypothesis called the continuum hypothesis is true. Since Gödel proved (well, actually developed the machinery to prove) that we can’t decide whether the continuum hypothesis is true or false, we can’t really say whether things can be learned using this method. That is, we may be able to learn an ad placer in practice, but we can’t use this particular machinery to prove that it will always find the best answer. Machine learning and A.I. are by definition intractable problems, where we mostly rely on simple algorithms to give results that are good enough — but having certainty is always good.
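For reference, the continuum hypothesis (CH) can be written in standard notation as below. This is the usual textbook statement, not a formula from the paper: $\aleph_0$ is the size of the natural numbers and $2^{\aleph_0}$ is the size of the real numbers.

```latex
% Continuum hypothesis: no set is strictly larger than the naturals
% and strictly smaller than the reals.
\[
  \text{CH:}\quad \neg\,\exists\, S \ \text{such that}\ \aleph_0 < |S| < 2^{\aleph_0},
  \qquad\text{equivalently (in ZFC)}\qquad 2^{\aleph_0} = \aleph_1 .
\]
```

Gödel showed that ZFC cannot disprove CH, and Paul Cohen later showed that ZFC cannot prove it either, which is why a result tied to CH inherits its undecidability.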

Although the authors caution that it is a restricted case and other formulations might lead to better results, there are two other significant consequences I can see. First, the compression scheme they develop is precisely the same structure that is used in Generative Adversarial Networks (GANs). The GAN neural network is commonly used to generate fake faces and is used in photo apps like Pikazo http://www.pikazoapp.com/. The implication of this research is that we don’t have a good way to prove that a GAN will eventually learn something useful. The second implication is that there may be no provable way of guaranteeing that popular algorithms like Expectation Maximization will avoid optimization traps. The work continues.

It may be no coincidence that the Gödel Institute is in the same complex of buildings as the Vienna University AI institute.


Avi Wigderson has a nice talk about the connection between Gödel’s theorems and computation. If we can’t even prove that a program will be bug-free, then we shouldn’t be too surprised that we can’t prove that a program learns the right thing.

A nice talk by Avi Wigderson. Sometimes hacking is all you got.
AI, Algorithms, Atlanta

The city of Atlanta doesn’t use facial recognition — so why does Delta Airlines?

I recently made an inquiry with the City of Atlanta’s Mayor’s office as to the use of facial recognition software. I received the following reply on the Mayor’s behalf from the Atlanta Police Department:

The Atlanta Police Department does not currently use nor the capability to perform facial recognition. As we do not have the capability nor sought the use of, we not have specific legislation design for or around facial recognition technology.

Delta Airlines, a company based in Atlanta, continues to promote the use of facial recognition software, and according to this Wired article makes it difficult for citizens to opt out of its use.

There are several concerns with use of facial recognition technology, succinctly laid out by the Electronic Frontier Foundation:

Face recognition is a method of identifying or verifying the identity of an individual using their face. Face recognition systems can be used to identify people in photos, video, or in real-time. Law enforcement may also use mobile devices to identify people during police stops. 

But face recognition data can be prone to error, which can implicate people for crimes they haven’t committed. Facial recognition software is particularly bad at recognizing African Americans and other ethnic minorities, women, and young people, often misidentifying or failing to identify them, disparately impacting certain groups.

Additionally, face recognition has been used to target people engaging in protected speech

Electronic Frontier Foundation at https://www.eff.org/pages/face-recognition

So in other words, the technology has the potential for free assembly and privacy abuses, and because the algorithms used are typically less accurate for people of color (POC), the potential abuses are multiplied.

There are ongoing dialogs (here is the U.S. House discussion on the impact on Civil Liberties) on when/how/if to deploy this technology.

Do me a favor? If you happen to fly Delta, or are a member of their frequent flyer programs, could you kindly ask for non-facial recognition check-in? Asking for more transparency on the use and audit of the software would also be an important step forward.