We Must Fix AI’s Diversity Problem

Pepper, a white robot developed by Softbank Robotics. Photo by Alex Knight on Unsplash.

Our technology industry has a diversity problem. This in itself is not a new issue. But the subset of our industry working on artificial intelligence (AI) has a particularly acute diversity problem, and it is having a negative impact on the lives of millions of people, all around the world.

Since 2014, Information is Beautiful has maintained a visualization of the published diversity statistics for some of the world’s largest technology companies. Despite the 2017 US population being 51 percent female, at that time Nvidia employed only 17 percent female staff, Intel and Microsoft 26 percent, Dell 28 percent, and Google, Salesforce and YouTube 31 percent. This reporting also didn’t account for those who identify as non-binary or transgender, nor the fact that the diversity gap widens at the most senior levels of companies: a 2018 report found that only 10 percent of tech executives are female.

The diversity problem goes beyond just gender. Racial diversity in technology is poor, and even less is being done about it. Consider “colorless diversity”: a phenomenon in which the industry does not invest equally in addressing the under-representation of people of color. In 2015, Erica Joy Baker highlighted that “whether by design or by inertia, the favor [of pro-diversity work] seems to land on white women in particular.” In fact, that year, Baker attended a Salesforce leadership conference in San Francisco and was in the audience for the “Building an Inclusive Workplace” panel. During the panel, Salesforce co-founder and Chief Technology Officer Parker Harris stated that:

“Well, right now I’m focused on women, you know, and it’s back to Marc [Benioff]’s focus on priorities. I have employees, that are, you know, other types of diversity coming to me and saying well why aren’t we focused on these other areas as well, and I said yes we should focus on them but, you know, the phrase we use internally is ‘If everything is important, then nothing is important.’”

This may be a lesson in prioritization for shipping software and meeting sales targets, but it is a single-minded approach that discriminates against the under-represented groups that need the most help.

Fast forward four years, and a new report has shed light on the current state of diversity in undoubtedly the hottest area of our industry: AI. The paper Discriminating Systems: Gender, Race and Power in AI was published by the AI Now Institute of New York University in April 2019. It highlights the scale – and most shockingly, the global impact – of the diversity problem in the institutions doing work at the cutting edge of this field.

The authors of the report state that “recent studies found only 18% of authors at leading AI conferences are women and more than 80% of AI professors are men”. Similar statistics in AI are observed outside of academia. The paper reports that “women comprise only 15% of AI research staff at Facebook and 10% at Google.” Extending the diversity statistics to include race, the paper also notes that “only 2.5% of Google’s workforce is black, while Facebook and Microsoft are each at 4%”.

We have seen how a lack of diversity can stifle innovation, decrease employee retention and, in the worst cases, allow racism and harassment to go unpunished. The lack of diversity in AI, however, can have a harmful effect on the whole of society.

Diversity issues deployed

Consider the following video from 2009. It shows how the face tracking functionality of a Hewlett-Packard webcam seems to be unable to recognize black faces.

Six years later it seemed that the latest Google AI classifiers were still making errors, albeit in an even more racist manner.

So how does this happen? To understand, we have to look at the process followed to create image classification software. Typically, AI engineers will start by finding as many labelled examples of images as they possibly can. By labelled, we mean that a human has looked at a picture of a cat and marked it with the word “cat”. A large collection of these labelled images is called a corpus.

There are many corpora that are publicly accessible, such as ImageNet, which is maintained by the Vision Lab at Stanford University. If you type the word “dog” into ImageNet, you’ll see labelled images of dogs come back in your search. There are similar corpora of image data available online for AI researchers and engineers to use depending on their desired application, such as MNIST for images of handwritten digits and the more general Open Images dataset.
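As a rough illustration of what these corpora look like to an engineer, here is a minimal sketch, assuming the PyTorch and torchvision packages are installed (any similar library would do): it downloads MNIST and inspects a single labelled example.

```python
# A minimal sketch, assuming PyTorch and torchvision are installed:
# download a public labelled corpus (MNIST) and look at one example.
from torchvision import datasets, transforms

# The training split contains 60,000 handwritten digits, each labelled
# by a human with the digit it depicts.
corpus = datasets.MNIST(
    root="./data",
    train=True,
    download=True,
    transform=transforms.ToTensor(),
)

image, label = corpus[0]   # the first (image, label) pair
print(image.shape)         # torch.Size([1, 28, 28]) - a 28x28 greyscale image
print(label)               # the human-assigned label, an integer from 0 to 9
```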

With access to large amounts of accurately labelled data, AI engineers can create classifiers by training them on the examples to recognize similar images. Sometimes existing pre-trained models are extended to cover new inputs using a process called transfer learning. It follows that with enough example images of dogs, a classifier could be created that can take a previously unseen picture of a dog and label it correctly. The question, therefore, is how it is possible for black faces not to be recognized by webcam software, or for a picture of two black people to be classified as gorillas. Are racist labels being assigned to input data, or is there something more subtle at play?
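To make the transfer learning step more concrete, here is a minimal sketch, again assuming PyTorch and torchvision: a network pre-trained on ImageNet is reused, and only a new final layer is trained to recognize a different set of classes. The class count and the data loader are placeholders, not part of any real system described here.

```python
# A rough sketch of transfer learning, assuming PyTorch/torchvision:
# reuse a network pre-trained on ImageNet and retrain only a new final
# layer for a different task. NUM_CLASSES and the data loader are
# placeholders for whatever labelled corpus is being used.
import torch
import torch.nn as nn
from torchvision import models

NUM_CLASSES = 10  # hypothetical number of new categories, e.g. dog breeds

# Load a ResNet-18 with weights learned on ImageNet.
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)

# Freeze the pre-trained feature extractor...
for param in model.parameters():
    param.requires_grad = False

# ...and replace the final fully connected layer with a fresh one
# sized for the new task. Only this layer will be trained.
model.fc = nn.Linear(model.fc.in_features, NUM_CLASSES)

optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# Training then proceeds as usual over batches of (image, label) pairs
# drawn from the new labelled corpus (data loader omitted here).
```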

The answer is that the racial bias of these classifiers typically goes unchallenged. This effect compounds in two places. Firstly, the data sets used to train the classifiers are typically not representative of real-world diversity, thus encoding bias. Secondly, the predominantly racially homogeneous staff do not thoroughly and sensitively test their work on images of people more diverse than themselves.

The AI Now paper highlights that a commonly used public dataset of faces called Labeled Faces in the Wild, maintained by the University of Massachusetts, Amherst, has only 7 percent black faces and 22.5 percent female faces, thus making a classifier trained on these images less able to identify women and people of color. How did this dataset end up being so biased? To find out, we must look at how the images were collected.

Data collection was performed automatically from images featured on news websites in the early 2000s. Thus, the corpus “can be understood as a reflection of early 2000s social hierarchy, as reproduced through visual media”. For example, it contains over 500 pictures of the then US President George W. Bush. Although sampling seemingly random news images might look like an excellent way to build a “real world” facial analysis classifier, classifiers trained on this corpus end up reflecting the lack of diversity in the media of that time.
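One practical defence is to audit the demographic make-up of a corpus before training on it. The short sketch below assumes a hypothetical metadata.csv with one row of human annotations per image; it simply tallies representation so that skews like those in Labeled Faces in the Wild are visible before any model is built.

```python
# A minimal auditing sketch: tally how well each group is represented in
# a face corpus before training on it. Assumes a hypothetical
# metadata.csv with one row per image and columns such as "gender" and
# "skin_type"; real corpora would need their own annotation step.
import pandas as pd

metadata = pd.read_csv("metadata.csv")

for attribute in ["gender", "skin_type"]:
    share = metadata[attribute].value_counts(normalize=True) * 100
    print(f"{attribute} breakdown (% of images):")
    print(share.round(1))

# For a corpus like Labeled Faces in the Wild, this would surface figures
# in the region of 22.5% female faces and 7% black faces - a warning that
# any classifier trained on it will see far fewer examples of those groups.
```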

This phenomenon is not limited to classifiers built from the Labeled Faces in the Wild dataset. A 2018 paper showed that a number of facial analysis techniques misclassify darker-skinned females up to 34.7 percent of the time, compared with an error rate of only 0.8 percent for lighter-skinned males. Clearly, something is wrong here. What can we do?
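One immediate answer is to measure the problem: evaluate a trained classifier separately for each demographic group, rather than reporting a single headline accuracy, which is essentially what the 2018 study did. A rough sketch, again assuming hypothetical per-image annotations stored alongside the model’s predictions:

```python
# A rough sketch of disaggregated evaluation: report error rates per
# demographic subgroup instead of one overall accuracy figure. Assumes a
# hypothetical results.csv holding, for each test image, the true label,
# the model's prediction, and gender and skin-type annotations.
import pandas as pd

results = pd.read_csv("results.csv")
results["error"] = results["true_label"] != results["predicted_label"]

# Error rate for every (skin_type, gender) combination.
by_group = results.groupby(["skin_type", "gender"])["error"].mean() * 100
print(by_group.round(1))

# A large gap between groups - for example 34.7% for darker-skinned women
# versus 0.8% for lighter-skinned men, as in the 2018 study - is a clear
# signal that the system should not be deployed as-is.
```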

The full picture of bias

In her keynote talk at NIPS 2017, Kate Crawford, the co-director and co-founder of the AI Now Institute, explores how bias in the AI systems that we create today propagates the historical and present discriminatory behavior that we see in society itself. Given that AI systems are becoming prevalent in ways that truly affect the outcomes of people’s lives, fixing bias is an important issue for the industry to address. Crawford defines bias as a skew that causes a type of harm. She classifies the effects of bias into two broad areas: harms of allocation and harms of representation.

Harms of allocation are where an AI system allocates or withholds opportunities or resources to or from certain groups. For example, AI may be used to make automated decisions on loans and mortgages. It may automatically screen job applicants for their suitability or their criminal background. It may diagnose illness and thus decide on treatment. It may even inform the police as to which neighborhoods they should be spending their time performing “stop and search” operations.

Development of these systems begins with a labelled data set used to train a model. We have seen that these data sets can encode biases that already exist within society. If it just so happens that historically people under thirty or African American women were often turned down for mortgages, then AI trained on this data may unfairly encode this bias and thus cause a harm of allocation to future applicants based on their age, race or gender. Mathematician and author Cathy O’Neil wrote about how personality tests – which 70% of people in the US have taken when applying for a job – can discriminate unfairly. Kyle Behm, a college student, failed a personality test when applying for a job at a Kroger store. He recognized some of the questions in the test from a mental health assessment he had taken whilst undergoing treatment for bipolar disorder, suggesting that people with mental health issues were being unfairly discriminated against in the hiring process.
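To see how a harm of allocation can arise mechanically, consider the following sketch built entirely on synthetic, made-up data: the model is never shown the protected group, yet it learns the historical bias through a correlated proxy feature and reproduces it for new applicants.

```python
# An illustrative sketch with synthetic data: historical loan decisions
# were biased against "group B". The model never sees the group directly,
# but learns the bias through a correlated proxy (here, a postcode flag).
# All names and numbers are invented for illustration only.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 10_000

group = rng.choice(["A", "B"], size=n)        # protected attribute (not a model input)
postcode = np.where(group == "A",             # proxy: the groups live in different areas
                    rng.choice([0, 1], n, p=[0.8, 0.2]),
                    rng.choice([0, 1], n, p=[0.2, 0.8]))
income = rng.normal(50, 10, n)                # identically distributed across groups

# Historical decisions: equally qualified applicants from group B were
# approved less often - this is the bias baked into the training labels.
base = 1 / (1 + np.exp(-(income - 50) / 10))
approved = rng.random(n) < np.where(group == "B", base * 0.6, base)

X = np.column_stack([income, postcode])
model = LogisticRegression().fit(X, approved)
predictions = model.predict(X)

for g in ["A", "B"]:
    print(g, f"predicted approval rate: {predictions[group == g].mean():.0%}")
# The model reproduces the historical gap despite never being shown "group".
```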

Harms of representation occur when a system misrepresents a group in a damaging way. The example of Google Photos labelling black faces as gorillas is a harm of representation. Latanya Sweeney, a Harvard professor, published a paper showing that searching for racially associated names in Google yields personalized ads that could be interpreted as discriminatory. For example, searching for popular African American male names such as DeShawn, Darnell and Jermaine generated ads suggesting that the person may have been arrested, thus reinforcing a stereotype of black criminality. More recently, full-body scanners operated by the TSA have proved prone to false alarms on hairstyles common among people of color.

A succinct example of harmful gender misrepresentation can be seen in Google Translate. By typing in the two sentences “he is a nurse” and “she is a doctor”, translating them into a gender-neutral language such as Turkish, and then translating them back again, we see the genders of the two roles switch. Given that systems such as Google Translate are trained on large corpora of existing literature, the historical gender bias in the sampled texts for those two roles has been encoded into the system, thus propagating that bias. Similarly, searching for “CEO” or “politician” on Google Images will give you a page of results full of typically white men. Crawford’s talk also notes how Nikon camera software would label Asian faces as “blinking” in photos.
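The round trip just described is easy to check programmatically. The sketch below assumes a hypothetical translate(text, source, target) helper wired to whatever machine translation service is available; the pattern, not any specific API, is the point.

```python
# A sketch of the round-trip gender test described above. The translate()
# helper is hypothetical - connect it to whichever machine translation
# service you have access to.
def translate(text: str, source: str, target: str) -> str:
    """Hypothetical wrapper around a machine translation API."""
    raise NotImplementedError("plug in a real translation backend here")

def round_trip(sentence: str, pivot: str = "tr") -> str:
    """Translate English -> gender-neutral pivot (Turkish) -> English."""
    neutral = translate(sentence, source="en", target=pivot)
    return translate(neutral, source=pivot, target="en")

for sentence in ["he is a nurse", "she is a doctor"]:
    print(sentence, "->", round_trip(sentence))
# Historically, bias in the training text tended to come back as
# "she is a nurse" and "he is a doctor".
```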

Hence, given these biases of allocation and representation, we need to be extremely careful when using AI systems to classify people, since the consequences in the real world can be catastrophic. A controversial paper described the development of an AI classifier that was able to predict the likelihood of an individual being homosexual based on an image of their face. Given that homosexuality is illegal and punishable by death in a number of countries with sharia-based law, this work drew criticism for being highly unethical. What if this technology were to get into the wrong hands? Arguably, similar technology already has. AI that can recognize the ethnicity of faces has been deployed in China by a company called SenseNets in order to monitor Uighur Muslims as part of their repressive crackdown on this minority group.

Fixing AI bias

So what do we do about bias in AI? Given that the impact of the industry’s innovation on the world will only ever increase, we need to take a stand now to prevent the existing discrimination in our society from spreading virally.

The AI Now Institute’s paper argues that in order to fix it, we need to begin by fixing the diversity issue within the AI industry itself. That begins by recognizing that there is a serious problem, and recognizing that without radical action, it will only get worse. Unlike the lack of diversity in technology generally, where those primarily harmed are current and future employees, the diversity bias encoded into AI systems could have a devastating effect on the entire world.

We need to raise the profile of the diversity problem in technology companies so that non-action becomes unacceptable. This goes far beyond identifying the issue as just a “pipeline problem”, where companies claim that because diverse candidates are a minority in the job market, it is harder to find them. Simply blaming the hiring pool puts the onus on the candidates, rather than on the workplaces themselves. Instead, tech companies need to work hard on transparency. This ranges from publishing compensation levels broken down by race and gender, to publishing harassment and discrimination transparency reports, to recruiting more widely than elite US universities, to creating more opportunities for under-represented groups, such as new pathways for contractors, temporary staff and vendors to become full-time employees.

We also need to address the way in which AI systems are built so that discrimination and bias are confronted. The AI Now paper suggests tracking and publicizing where AI systems are used, performing rigorous testing, trials and auditing in sensitive domains, and expanding the field of bias research so that it also encompasses the social issues caused by the deployment of AI. The paper also notes that assessments should be carried out on whether certain systems should be designed at all. Who green-lit the development of a homosexuality classifier? What should the ramifications be for doing so?

At a time in history when significant advances in the speed and availability of compute resources have made wide-scale AI available to the masses, our understanding of the impact that it is having on society is lagging sorely behind. We have a long way to go to fix the diversity problem in AI and in technology in general. However, through the work published by the AI Now Institute, we can see that if we don’t fix it soon, the systems that we create could further divide an already heavily divided world.

Does the UK Really Think it Can Police the Internet?

Sajid Javid, UK Home Secretary. Credit: Foreign Office via Flickr/CC BY 2.0.

A school playground in England, 1997. I am 12 years old. I am walking towards the entrance where my mother is waiting to drive me home. I am giddy with excitement during the journey back. Tonight is the night that I’m finally going to get it.

I barely look up as I eat my dinner, shoveling it into my mouth as quickly as I possibly can. As soon as the knife and fork hit the empty plate, I’m rushing upstairs to the computer. I sit in front of the glow of the Windows 95 boot screen until the familiar start-up sound plays. A few clicks later, I’m listening to the screeching and hissing of the modem as I connect to the Internet.

“Welcome to AOL,” Joanna Lumley chirps.

I fire up a program which I spent most of the previous Sunday downloading. It’s a newsreader: an application that connects to Usenet, a distributed discussion system. I press the “Connect” button. I navigate to alt.engr.explosives, where yesterday somebody asked where they could find The Anarchist Cookbook, a fabled, possibly banned, tome of all sorts of forbidden knowledge. Last night’s reply was a pointer to a File Transfer Protocol (FTP) server, from which I copy the IP address, username and password.

I open up another program and paste in the credentials. I hit the button with the lightning bolt on it. It works. I’m in. I navigate through the file system of another computer somewhere in the world. There it is. A text file containing the book that has been elevated to legendary status through conversations at school. I transfer the file to our home computer and disconnect from the FTP server and from Usenet. I look over my shoulder. I look out of the window. I open the file in Notepad and rapidly scroll through the table of contents.

Credit card fraud, picking master locks, making thermite bombs, landmines and napalm, mixing black powder, telephone phreaking, constructing pipe grenades. It’s all here. I break out into a cold sweat imagining the knock on the door. I close the text file, delete it and turn off the computer. Can the police recover deleted files from hard drives?

Going live

At 1:40 p.m. on Friday 15th March, a man opens Facebook on his phone, taps the camera icon in the top left corner, and then taps “Go Live”. He shares the link with a number of people. Shortly after the video starts, footage from a head-mounted camera depicts the man getting out of his car, approaching a mosque, loading a shotgun scrawled with the names of figures that were participants in major conflicts with Islamic armies throughout history, and then opening fire indiscriminately at innocent worshippers. Seventeen minutes later the live broadcast stops. Within 24 hours, there are 1.5 million upload attempts of the video on Facebook alone. It is unknown how widely shared the video was on other platforms.

The Christchurch shootings, in which 50 people were killed and another 50 were injured, were the worst that New Zealand has seen in recent times. Most notable was the role that technology – especially social media – played in the viral spread of the recorded footage of the incident. In the weeks following the attack, Facebook, Instagram, and YouTube struggled to detect and remove copies of the video posted on their websites. YouTube’s chief product officer, Neal Mohan, noted that it “was a tragedy that was almost designed for the purpose of going viral.” He admitted that the platforms themselves do not have the necessary technology to curb the spread of harmful material. “This incident has shown that, especially in the case of more viral videos like this one, there’s more work to be done,” he said.

Offensive content has existed since the early days of the Internet. My 12-year-old self was able to get his hands on a book of questionable legality. However, being an inquisitive computer geek allowed me to jump through the knowledge hoops of understanding dial-up modems, newsgroups and FTP transfers. The barrier to entry for questionable material back then was high: the Internet of the 1990s was still the realm of the enthusiast, and the means by which to find hidden secrets were obscure and technical. I wouldn’t expect the people that post in my local town’s Facebook group to have been able to stumble across that book during the same era. However, I’m sure that some of those same people were exposed to the Christchurch shooting video via shares from people within their social network and via the algorithm that controls the news feed. I’m also sure that if they had the desire to find that video, it would only be a few simple searches away.

It is easier than ever to create, share, and find content online. Search engines are unbelievably good at finding that article that you couldn’t quite remember the title of. Social networks ingest photos and videos at a dramatic rate and distribute them to family, friends and strangers. We have a fast global Internet infrastructure that allows a video to be live streamed from a mobile phone in New Zealand and watched all around the world instantaneously. For the majority of harmless Internet usage, this is a boon. But with information – both good and bad – so readily available, have we reached a tipping point for more regulation?

Platform versus publisher

Historically, social media platforms have defended the presence of harmful material on their websites by reasoning that it wasn’t their job to police what their users were able to share or find. It could be argued that this attitude is in line with the bohemian and techno-utopian Californian Ideology, with its roots in the writings of Stewart Brand, editor of the Whole Earth Catalog. The publication is famous for the line on the back cover of its final issue that Steve Jobs quoted during his 2005 Stanford commencement address: “Stay hungry. Stay foolish.”  Our social platforms have certainly been hungry for growth, and they’ve most certainly been foolish.

The Californian Ideology is rooted in “cybernetics, free market economics and counter-culture libertarianism” and upholds a belief that the Internet is beyond the control of the state. Twenty-five years ago, limited bandwidth and the above-average technical literacy required to connect and find information meant that one could feasibly see the ideology as an achievable utopia: a space for a certain type of technically literate intellectual, built by the same people that were consuming it. An early filter bubble, if you will. However, we cannot deny that what we consider to be the Internet today is a vastly different place.

In the early days of decentralized content, finding a webpage was more of an action of stumbling rather than calculated steps, and the creation of content required learning HTML and paying for a hosting provider. Today is different. For most, the predominant experience of the Internet is via social platforms that have been built on top of the scaffolding of the Web and are run by companies that profit from the engagement of their users. Given that the demographics and intentions of today’s social platform users are so varied, there are cases being made that content produced and shared should be subject to stricter laws and tighter controls. The platform versus publisher argument has never been more relevant.

But what do we mean by “platform” and “publisher”? As an example of a platform, consider a telecoms company that provides a landline telephone service. The company is responsible for the quality of the platform, but is not liable for any of the content disclosed during phone calls between individuals. They are simply supplying the infrastructure. A hypothetical call between two terrorists to agree on a target for a terror attack would place clear blame for the criminal activity on the participants, but not on the telecoms network.

Now consider a traditional publisher, such as a newspaper. A newspaper is often a for-profit organization where the content of the publication is created and curated by salaried staff. Hence if a newspaper were to publish material that was considered harmful, libelous or otherwise defamatory, then there are existing laws in place to hold the newspaper accountable, regardless of the author.

Given that YouTube, Facebook, Reddit and Instagram are among the most popular sites across much of the world, it follows that the predominant experience of the Internet for many is now via content shared on platforms that allow anyone, anywhere to easily publish and distribute content. The argument that these platforms should invest more time in regulating their content as if they were publishers, rather than platforms, grows ever stronger.

The rise of legislation

Flowers at the Christchurch mosque shooting memorial. Credit: Wikimedia Commons.

In the aftermath of the Christchurch shootings, governments began to prepare their responses. Australia almost immediately proposed legislation that would make it “a criminal offense to not remove harmful material expeditiously”. Failure to do so would be punishable by three years’ imprisonment or fines of up to 10% of the platform’s annual revenue. If the bill becomes law, social media platforms will also need to make the Australian Federal Police aware of any stream of “abhorrent violent conduct” happening in Australia, or face a fine of nearly one million Australian dollars.

Recently, the UK has gone one step further and published the Online Harms White Paper. Similar to the Australian bill, the UK government points the finger of blame at the social networks for enabling the sharing of the video of the Christchurch shootings, but it casts a wider – and arguably more ambiguous – net with the categories of content that it wants to hold the tech giants responsible for. To name just a few examples, it wishes to hold them accountable for all illegal and unacceptable content, for the mechanisms by which terrorist groups and gangs recruit, and for harassment and bullying.

It plans to police this behavior by setting up a new regulatory body for the Internet, similar to how Ofcom regulates UK TV, radio and telecommunications services. This independent body would create and enforce a statutory “duty of care” that the social networks would have to show they are meeting, both by working with law enforcement agencies to prosecute those engaged in illegal activity and by providing annual transparency reports. If the independent body finds that any companies are in breach of the duty of care, then penalties can be levied. These range from fines of up to 10% of annual revenue to imposing personal liability, and therefore criminal charges, on senior management.

The document was unveiled by Home Secretary Sajid Javid amidst strong posturing. “I said in September, I warned the web giants. I told them that keeping our children safe is my number one priority as Home Secretary,” he stated. “I warned you. And you did not do enough. So it’s no longer a matter of choice. It’s time for you to protect the users and give them the protection they deserve, and I will accept nothing else.”

How closely the web giants have listened to Mr. Javid, despite his insistence, is unknown. However, the white paper is causing a stir.

The world reacts

The ambitions of the UK government are bold. They want the UK “to be the safest place in the world to go online, and the best place to start and grow a digital business.” Given the live streaming of the Christchurch massacre, and the backlash against graphic self-harm imagery on Instagram after the death of Molly Russell, one can at least sympathize with their good intention. Writing in The Guardian, John Naughton is supportive of the motion, stating that “since the mid-1990s, internet companies have been absolved from liability – by Section 230 of the 1996 US Telecommunications Act and to some extent by the EU’s e-commerce directive – for the damage that their platforms do.”

However, such positive reactions are in the minority. It has been argued that the white paper highlights how little those in positions of political power understand about the realities of the Internet and technology. A contrasting editorial in The Guardian argues that with the widespread use of the Internet we have created a space that is subject to all of the existing problems that society already has. “No one doubts the harm done by child sexual abuse or terrorist propaganda online, but these things are already illegal. The difficulty there is enforcement, which the white paper does nothing to address.”

Algorithms and artificial intelligence have an extremely long way to go before they can be trusted to automatically censor content online. The traditional “cease and desist” model does not work when content is produced at such a rapid rate. Additionally, without careful development, software can be biased and even racist, so how can we guarantee that automated means will censor fairly, and is it ethical for companies themselves to decide what is and isn’t allowed to be shared?

“Gentleman” by SL.

To police content automatically, we need to know what is and isn’t acceptable. It has been noted that the word ‘harm’ isn’t actually defined in the paper, even though it is in the title. In the absence of an exact definition, the paper instead uses a variety of examples to make its case, such as the spreading of disinformation and individuals promoting gang culture. However, some of these are entirely subjective: who makes the decision on what is classed as “disinformation” and “gang culture”? Apart from the obvious threat to free speech, it highlights the need for humans to very much be part of the process of classifying content. Is it always possible to automatically distinguish a legitimate UK drill video from a promotion of gang culture? What about the difference between deliberate fake news and satire? Even humans struggle with this problem.

The reality of using humans as screens for harmful content is not only problematic in terms of potential error and the sheer amount of material needing to be classified. It also exposes humans to real harm. As Casey Newton wrote in The Verge, employees at Cognizant in Phoenix, Arizona, are contracted by Facebook to moderate posts that have been flagged by users as offensive. This subjects them to videos of terrorist executions, gang murders, bestiality and violence. Exposing workers to extreme content daily is damaging. Staff can speak to visiting counselors and have access to a hotline to help them cope, but employees are reported to have turned to sex, drugs and offensive jokes in order to manage.

So how is the UK government expecting its white paper to be implemented? It highlights that it has already done some work with the social networks to curb online grooming of minors; however, this was little more than a two-day hackathon hosted in the US. Forming an independent regulatory body, defining what harmful content is, drafting a duty of care, and staffing the effort with enough people to make a meaningful difference is no small feat. Does the Home Secretary really know the extent of what he is getting into?

Is pivoting platforms the answer?

Mark Zuckerberg F8 2018. Credit: Wikimedia Commons.

As mentioned previously, perhaps it is the case that social networks will always be full of bad actors and harmful content because they are merely a reflection of the society that uses them. Policing that content will always be a losing battle, so in order to win the war, the networks must develop a different strategy. As Nicholas Thompson and Fred Vogelstein write in Wired, “Like the communications innovations before it—the printing press, the telephone, the internet itself—Facebook is a revolutionary tool. But human nature has stayed the same.”

Last month, Mark Zuckerberg wrote that Facebook will be pivoting the service towards a privacy-focused model, based on his hypothesis that the “future of communication will increasingly shift to private, encrypted services where people can be confident what they say to each other stays secure and their messages and content won’t stick around forever.”

As Benedict Evans writes, this pivot is trying to change the platform so that the present issues become irrelevant. He likens it to how Microsoft had to deal with the threat of malware when it opened up Microsoft Office as a development platform. Macros, which were small user-written programs, could be embedded inside Office documents and were exploited to create viruses that could automatically replicate themselves to everyone in a user’s email address book. Although the intention of macros was to save users time on repetitive tasks, the 1999 “Melissa” virus managed to infect the Pentagon, and was reported to have caused $1.1 billion of damage worldwide. Evans argues that Facebook is now too open in the same way that Microsoft Office was too open, and the solution is to make it more closed. “Russians can’t go viral in your newsfeed if there is no newsfeed. ‘Researchers’ can’t scrape your data if Facebook doesn’t have your data,” he writes.

It could be argued that the UK’s response, albeit with good intention, is way behind the curve. If Facebook and other social platforms decide to pivot towards private, encrypted messaging, then it is unclear how the proposals will be enforced. In fact, the white paper states that “any requirements to scan or monitor content for tightly defined categories of illegal content will not apply to private channels. We are consulting on definitions of private communications, and what measures should apply to these services.”

The white paper has vocal opponents, such as Jim Killock, the executive director of the Open Rights Group, and Matthew Lesh, head of research at the Adam Smith Institute. After all, is it really possible to censor websites that are not social networks, such as “public discussion forums, messaging services and search engines”, without violating existing rights? Freedom of speech campaigners Article 19 also oppose the motion, saying that “such actions could violate individuals’ rights to freedom of expression and privacy.”

However, with the social networks in the firing line beginning to pivot towards private messaging channels, we may find that the white paper, should it eventually become law, is largely irrelevant. The UK Government’s Online Harms White Paper is currently under open consultation. Anyone is able to respond to it online, via email, or in writing until the 1st July.

I suggest that you do so.