Here’s an analysis I recently did of gender and racial bias in cloud sentiment analysis APIs. The takeaway: test before you leap.

Data for Breakfast

At Automattic, I work in a group that frequently uses natural language processing (NLP), a branch of artificial intelligence (AI) that tries to understand text. We have used NLP to suggest domain names, to tag support interactions, and to understand the different verticals our customers build sites for.

In the course of building these tools, we have often encountered, and have had to work around, gender and racial bias that gets baked into the machine learning models we use for text analysis. This is an acknowledged problem confronting NLP, and the solutions are not simple. Building fair and non-toxic NLP systems requires constant vigilance, and we continuously audit new platforms and models to make sure that the users of our systems are not adversely impacted.

During these audits, I’ve found evidence of gender and racial bias in the sentiment analysis APIs offered by cloud providers.
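One common way to test a sentiment API for this kind of bias is to score template sentences that differ only in the name they contain, then compare average scores across name groups. The sketch below illustrates the idea; `toy_sentiment` is a hypothetical stand-in for a real cloud API call, and the templates and name lists are illustrative, not the ones used in the analysis.

```python
# Minimal name-swap bias audit sketch.
# `toy_sentiment` stands in for a call to the sentiment API under audit.

TEMPLATES = [
    "{name} made me feel happy.",
    "{name} made me feel angry.",
    "I had lunch with {name} today.",
]

# Illustrative name lists only; a real audit would use a curated set.
GROUP_A = ["Emily", "Greg"]
GROUP_B = ["Lakisha", "Jamal"]

def toy_sentiment(text: str) -> float:
    """Toy lexicon scorer; replace with the cloud API call being audited."""
    positive = {"happy", "lunch"}
    negative = {"angry"}
    words = text.lower().rstrip(".").split()
    return sum((w in positive) - (w in negative) for w in words) / len(words)

def mean_score(names) -> float:
    """Average sentiment over all template/name combinations."""
    scores = [toy_sentiment(t.format(name=n)) for t in TEMPLATES for n in names]
    return sum(scores) / len(scores)

gap = mean_score(GROUP_A) - mean_score(GROUP_B)
print(f"score gap between groups: {gap:+.3f}")
```

The toy scorer ignores names, so it reports a gap of zero; a biased model would not. In a real audit you would also test per-template gaps and run a significance test, since a small mean difference can hide large sentence-level swings.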


Posted by charlescearl, data scientist at Automattic.com.
