Coding Bootcamps: a Glimpse at the Future of Education?

Photo by Charles ?? on Unsplash.

Education is big business.

It is projected that over 20 million students will be enrolled in degree-granting institutions in the US in fall 2020. That’s 20 million people willing to invest multiple years of their lives and to incur an average of nearly $30,000 of debt in order to earn a degree, typically to maximize their chances of starting a career in the discipline of their choice. That is a lot of pressure to take on in the hope of a stable future.

The barrier to entry for our top institutions isn’t just financial. Getting a place is hard work. It is a journey that begins in the early years of a child’s life, requiring persistent effort from them, their teachers and their families. Throughout the high school years they must sustain a high GPA, discover their interests, and apply for the best colleges. Securing a place at a prestigious university is a big deal. The recent US college admissions scandal revealed that the family of a Chinese student paid $6.5 million to help her secure her place at Stanford University.

But let’s step back a second. Why is a place at a prestigious institution worth risking prosecution for a bribe of millions of dollars? Derek Thompson writes for The Atlantic that “Ivy League and equivalent institutions provide more than world-class instruction. They confer a lifetime of assistance from prodigiously connected alumni and a message to all future employees that you’re a rarified talent.” It’s no wonder that many will try to secure a place by any means possible.

However, a 2002 economics paper revealed that the salary increase from going to the most selective schools is “indistinguishable from zero”, although the “payoff to attending an elite college appears to be greater for students from more disadvantaged family backgrounds”. In short, good universities will give your job prospects a lift compared with non-attendance, especially if you are not already in an elite class, but they won’t necessarily make you any better off than if you had attended an institution further down the rankings table.

So let’s step back even further. Colleges, and the education system more broadly, are believed to have originated circa 387 BC. Imagine yourself at Plato’s Academy, set amongst the olive groves of ancient Athens. Here, a selective cohort gravitated around a rare resource: the knowledge that was taught through lectures and discourse. The rarity and exclusivity of knowledge as a resource is a theme that can be observed through history: the Library of Alexandria formed as a central repository of irreplaceable documents in ancient Egypt. Ancient Athenian academic culture gave birth to the formal education system, with the most talented students being taught by the most renowned philosophers: think Socrates, Plato and Aristotle.

In Europe during the High Middle Ages, familiar formal degree titles such as bachelor’s, master’s and doctorate were developed. Teaching at Oxford began in the late 11th century, and the University of Cambridge followed in the early 13th. The tradition of an institution awarding formal degree titles continues today. The most exclusive schools offer the most exclusive degrees, which carry the most prestige. Being admitted to an exclusive school is about gaining access to the knowledge and wisdom that it contains. One tries to gain access to the best possible resources in the hope of a better future.

However, we could argue that the greatest repository of human knowledge is now no longer a particular school, library or degree program. It is the Internet. Armed with a computer and an Internet connection, a self-motivated individual could, in theory, teach themselves anything that they want to know. At the time of writing, the English version of Wikipedia alone has over 5 million articles. Compare that to the library of just 320 volumes that were bequeathed by John Harvard for the founding of the eponymous university in 1636.

However, with no formal syllabus or curriculum, the self-learner would find themselves overwhelmed by the amount of information online. Where would they start? What is essential information and what is merely supplementary? What is a trusted source, and what is incorrect? Autodidactism – the act of self-education without teachers or institutions – is a rare skill possessed by the most self-motivated and often the most naturally gifted. We could not expect everyone on the planet to learn unaided.

The process of finding the most valuable information, and ensuring a steady progression through concepts of increasing complexity, is a key that traditional education institutions still hold: decades of teaching and proven feedback loops, informed by the latest research, keep syllabuses up to date. Teachers can also observe and interact with learners in order to gauge their understanding of the material and help them progress if they are stuck.

Clearly, our schools and colleges offer a learning experience far greater than self-learning via the Internet, regardless of the ease of access to the material. They also reward those that graduate with formal qualifications or accreditations which have become the societal norm for acceptance into graduate schemes and prestigious jobs.

However, is it possible to imagine a future where the Internet could be used as a delivery mechanism of world-class education, regardless of the student’s location or background? Can we lower the barrier of entry to education through technology and educate the world in a democratized manner?

Tutoring turned nonprofit

Salman Khan, founder of Khan Academy. Credit: Darth Viral on Flickr/CC BY 2.0.

In 2003, Salman Khan, a financial analyst, began tutoring his cousin in mathematics over the Internet using Yahoo! Doodle Notepad, a tool for remote participants to draw pictures together. With his friends and relatives also reaching out for his help, he started a YouTube channel. The videos proved extremely popular, enough so for him to quit his job and build Khan Academy, a nonprofit organization that offers free online courses for anyone in the world with an Internet connection.

At present, the site offers a comprehensive curriculum of STEM subjects up to high school level and also features some arts and humanities courses. Google, Comcast and Bank of America are some of the supporters that have donated over $10 million each to the cause, with a total of $53 million being donated in 2017, the latest reporting period. The success stories detail how Khan Academy has allowed high school students to make breakthroughs in understanding mathematics and how adults have used it to refresh their skills after being away from subjects for decades.

Nonprofits such as Khan Academy have harnessed the distribution power of the Internet in order to deliver education to millions around the world. However, their courses are not pitched as a replacement for traditional education. They are instead a supplement delivered alongside the traditional education system, and they do not guarantee any particular outcome or accreditation on completion. That said, the website does state that “students who complete 60% of their grade-level math on Khan Academy experience 1.8 times their expected growth” on the NWEA MAP Test, which measures a student’s academic progress at school.

People can take or leave Khan Academy: it has no ulterior motive. The material is there for learners if they want it, and because the organization does not rely on course-takers for its income, it has no incentive other than its own will to provide good material. Yet not everyone can have Khan Academy’s runaway success. Salman Khan was in the right place at the right time with what the world wanted. He had first-mover advantage.

For those who didn’t ride Khan Academy’s wave, but still want to make a living offering educational resources, the alternative route is through paid models, where the student is also the customer. But is this model tainted by the need to provide results?

Who even needs a degree?

Given that most students attend university in order to maximize their chances of landing a good career, could it be possible that they could achieve this without going to university at all? Whereas some fields still require lengthy formal education, such as medicine, other fields are beginning to relax the constraints on employment. Fueled by the high demand for talent, technology companies are beginning to look at other ways in which they can fill entry-level positions apart from targeting those that are graduating from computer science programs.

There has been an explosion in the number of so-called “coding bootcamps” over the last decade. The premise is simple: students pay to enroll and are given a crash course in computer programming using the most common languages and tools in the industry. Courses typically last 6 months to a year, and those that perform well have a good chance of landing a well-paid job at a technology company, without having to have spent the time and money required to go through a university degree. The cost of enrolling in a coding bootcamp is still high, however: according to Course Report, bootcamp tuition fees can range from $9,000 to $21,000. That is less expensive and less time-consuming than college, particularly as many bootcamps deliver their tuition online, but it is still expensive enough that enrolling is a serious life decision for many.

One of the highest-profile coding bootcamps is Lambda School, a San Francisco-based bootcamp offering courses in computer science combined with either full-stack Web, Android or iOS development. Students can also take courses in Data Science or UX Design. These are all highly in-demand skills in the technology industry. Entry-level positions in these roles in tech-centric US cities such as San Francisco and New York City can net six-figure starting salaries. This bootcamp model seems to be working: Lambda School announced in January that it raised $30 million in Series B funding from venture capital firms such as Google Ventures, and renowned startup incubator Y Combinator amongst others, giving it a post-money valuation of around $150 million.

In addition to having no physical campus with courses being delivered online, Lambda School is offering an extremely attractive financial arrangement for those that enroll: as an alternative to paying the $20,000 tuition fee up front, students can opt to enter an income share agreement. This allows them to defer payment until they are earning over $50,000 annually for two years. At that point, 17% of their salary is paid back to Lambda School until the debt is paid off. If students are unable to find a job with this level of compensation after five years, they owe nothing. For students from disadvantaged backgrounds, or those wanting to make a complete career switch, this offer is seriously enticing.
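To make the arithmetic concrete, here is a rough sketch of the repayment schedule as this article describes it. It is not Lambda School's actual contract: the function name, the constant salary, the equal monthly payments and the example salaries are all illustrative assumptions, and a real agreement may include caps or time limits that are not modeled here.

```python
import math

def months_to_repay(annual_salary, tuition=20_000, share=0.17, threshold=50_000):
    """Return the number of monthly payments needed to clear the tuition,
    or None if the salary is not above the repayment threshold.

    Hypothetical model: constant salary, equal monthly payments,
    and no interest, caps or time limits.
    """
    if annual_salary <= threshold:
        return None  # at or below the threshold, nothing is repaid in this model
    monthly_payment = annual_salary * share / 12
    return math.ceil(tuition / monthly_payment)

print(months_to_repay(60_000))   # 24
print(months_to_repay(100_000))  # 15
print(months_to_repay(45_000))   # None
```

Under these assumptions, a graduate on $60,000 would pay $850 a month and clear the $20,000 in 24 payments, while one on $100,000 would finish in 15.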

Via their Twitter accounts, current and past Lambda School students share their stories of success. There are numerous tweets highlighting graduates that have landed full time programming jobs, with some earning around $100,000 a year. “I make more than both of my parents combined,” writes one anonymous user in a screenshot from their Slack account. “Also my family situation requires me home a lot, which was a huge motivator for the switch to tech anyway.”

According to the self-reported outcomes webpage, graduates have gone on to work for well-known technology companies such as Stripe and Uber. Of their first four cohorts for the Full Stack Web Programming bootcamp, 100%, 88%, 71% and 75% have gone on to find jobs, respectively. At the time of writing, Course Report gives Lambda School an average rating of 4.91 from 65 reviews, and SwitchUp reports a similar average from 143 reviews.

But are students happy?

Searching online for the experiences of students, however, does not turn up only praise. Clusters of reviews on Reddit offer a different picture of the student experience. “I’m going to say this course is not worth it and I don’t recommend it,” says user SpecialistManner. “It’s a scam with a business plan. It’s basically a MOOC [massive open online course] without the organization, a Slack channel, and 8,000x the brogrammer snark,” reports another user. Various posts highlight concerns such as unprofessional staff, disorganized Slack communications, and an inability to cater to the learning styles of individuals.

“They have still neglected to report their hiring stats to CIRR since forever,” writes a participant on Reddit. The CIRR, or the Council on Integrity in Results Reporting by its full name, provides a standardized system for reporting student outcomes for courses. This omission has since been rectified by Lambda School, and the graduate outcomes for January to June 2018 are available. The report states that 71 students graduated in the reported time period, with 51.4% graduating on time, and 78.5% graduating within 150% of the length of the program. After 180 days, 85.9% of students had achieved full-time employment, with a median salary of $60,000. 37.8% of students had landed jobs paying $80,000 or more. Lambda School isn’t alone in reporting such success rates. Hack Reactor Austin reports 94.5% of 50 students in full-time employment earning a median pay of $76,500 and Codesmith NYC reports 91.2% of 31 students employed at a median pay of $112,500.

It is worth noting that most, if not all, schools and colleges have their fair share of bad reviews. According to StudentCrowd, the University of Oxford and the University of Cambridge, currently ranked as world number one and two in the Times Higher Education university rankings, have review scores of 4.4 (20 respondents) and 4.17 (33 respondents) respectively. “As a PhD graduate from Cambridge University, I can say that this is the most racist place on earth. All they want is internal student’s money!” writes user rtert. However prestigious the institution, its students will experience the full spectrum of human behavior, good and bad.

As Lambda School is the poster child for online coding bootcamps, the burning heat of the media spotlight holds it to a certain amount of scrutiny. However, it is worth remembering that the school is a for-profit private institution that does not need to conform to the same kinds of governmental or state regulation that traditional schools are held to. In fact, in 2014, the Californian Bureau for Private Postsecondary Education sent cease and desist letters to a number of coding bootcamps as they were deemed to be operating without approval in states that have authorization requirements for private education.

Regardless of state authorization requirements, the most worrying aspect of the explosion in online coding bootcamps is that their content is unregulated by an external body, meaning that bad actors or courses of poor quality can sell a dream to students which never comes true. At worst, students could find themselves out of pocket and out of luck as the market begins to rapidly expand and other bootcamps try to get in on the action. “False hope and exaggeration is something these camps thrive on,” writes an anonymous source. “I have become increasingly agitated by articles that completely contradict my own experience and the experiences of my fellow students.” With many people considering investing serious amounts of money in these programs, how can they be sure that it will be worthwhile?

What does the future of education hold?

We cannot deny that coding bootcamps are offering real opportunities to people. Nor can we deny that websites like Khan Academy and Udemy are having success in designing and delivering online curriculums. The futurist can ponder whether there could be a universal education that is delivered to students worldwide via the Internet, rather than via the physical classroom. Could we imagine a future where students can learn at their own pace, in self-selected areas of interest, and have software and AI monitor their progress and suggest additional materials and guidance if they are getting stuck? After all, plenty of technology work is now fully remote.

In his tweet storm, Naval Ravikant, CEO of AngelList, paints a compelling picture of technology-based education. “A generation of autodidacts, educated by the Internet and leveraged by technology, will eventually starve the industrial-education system,” he writes. “Eventually, the tide of the Internet and rational, self-interested employers will create and accept efficient credentialing… and wash away our obsolete industrial-education system.” It is a bold view, highlighting the gap between a traditional degree and the skills that an employer wants. Yet one could argue that the sole purpose of university is not to push all students into employment: employment is just one of the outcomes of a journey of self-discovery, alongside the joyous opportunity to explore a subject deeply and the ability to mix with a vast cohort of like-minded individuals.

Perhaps instead we could see a greater diversity in the institutions in which students are able to get their education. Coding bootcamps have some parallels to vocational education whereas a university education is more abstract and academic. Both are valid routes to employment. Students of all ages needing daily support from teachers, either because of their individual needs or their personal preference, may always have a place in traditional school systems. However, one could imagine an alternative self-led education system for the autodidact, taught primarily online.

Such education could reach people who may not have access to quality schools in their local area. Technology could connect the world’s best teachers with the students who need them most. Whether this can begin to replace traditional education at the K-12 level remains to be seen, as school is also implicit childcare for busy and working parents. But as Ravikant says, “The best teachers are on the Internet. The best books are on the Internet. The best peers are on the Internet.” We should be working on more ways to connect these teachers, books and peers together for the good of society.

We Must Fix AI’s Diversity Problem

Pepper, a white robot developed by Softbank Robotics. Photo by Alex Knight on Unsplash.

Our technology industry has a diversity problem. This in itself is not a new issue. But the subset of our industry working on artificial intelligence (AI) has a particularly acute diversity problem, and it is having a negative impact on the lives of millions of people, all around the world.

Since 2014, Information is Beautiful have maintained a visualization of the published diversity statistics for some of the world’s largest technology companies. Despite the 2017 US population being 51 percent female, at that time Nvidia only employed 17 percent female staff, Intel and Microsoft 26 percent, Dell 28 percent, and Google, Salesforce and YouTube 31 percent. This reporting also didn’t account for those that identify as non-binary or transgender, nor the fact that the diversity gap widens at the most senior levels of companies: a 2018 report found that only 10 percent of tech executives are female.

The diversity problem goes beyond just gender. Racial diversity in technology is poor, and even less is being done about it. Consider “colorless diversity”: a phenomenon where the industry is not investing equally in addressing the under-representation of people of color. In 2015, Erica Joy Baker highlighted that “whether by design or by inertia, the favor [of pro-diversity work] seems to land on white women in particular.” In fact, that year, Baker attended a Salesforce leadership conference in San Francisco and was in the audience for the “Building an Inclusive Workplace” panel. During the panel, Salesforce co-founder and Chief Technology Officer Parker Harris stated that:

“Well, right now I’m focused on women, you know, and it’s back to Marc [Benioff]’s focus on priorities. I have employees, that are, you know, other types of diversity coming to me and saying well why aren’t we focused on these other areas as well, and I said yes we should focus on them but, you know, the phrase we use internally is ‘If everything is important, then nothing is important.’”

This may be a lesson in prioritization for shipping software and meeting sales targets, but it is a single-minded approach that discriminates against the under-represented groups that need the most help.

Fast forward four years, and a new report has shed light on the current state of diversity in undoubtedly the hottest area of our industry: AI. The paper Discriminating Systems: Gender, Race and Power in AI was published by the AI Now Institute of New York University in April 2019. It highlights the scale – and most shockingly, the global impact – of the diversity problem in the institutions doing work at the cutting edge of this field.

The authors of the report state that “recent studies found only 18% of authors at leading AI conferences are women and more than 80% of AI professors are men”. Similar statistics in AI are observed outside of academia. The paper reports that “women comprise only 15% of AI research staff at Facebook and 10% at Google.” Extending the diversity statistics to include race, the paper also notes that “only 2.5% of Google’s workforce is black, while Facebook and Microsoft are each at 4%”.

We have seen how a lack of diversity can stifle innovation, decrease employee retention and in the worst case, allow cases of racism and harassment to go unpunished. However, the lack of diversity in AI can have a harmful effect on the whole of society.

Diversity issues deployed

Consider a video from 2009 which shows that the face tracking functionality of a Hewlett Packard webcam seems to be unable to recognize black faces.

Six years later, it seemed that the latest Google AI classifiers were still making errors, albeit in an even more racist manner: Google Photos labelled a picture of two black people as gorillas.

So how does this happen? To understand we have to look at the process that is followed in order to create image classification software. Typically, AI engineers will start by finding as many labelled examples of images as they possibly can. By labelled, we mean that a human has looked at a picture of a cat and has marked it with the word “cat”. A large collection of these labelled images is called a corpus.

There are many corpora that are publicly accessible, such as ImageNet, which is maintained by the Vision Lab at Stanford University. If you type the word “dog” into ImageNet, you’ll see labelled images of dogs come back in your search. There are similar corpora of image data available online for AI researchers and engineers to use depending on their desired application, such as MNIST for images of handwritten digits and the more general Open Images dataset.

With access to large amounts of accurately labelled data, AI engineers can create classifiers by training them on the examples so that they can recognize similar images. Sometimes existing pre-trained models are extended to cover new inputs using a process called transfer learning. It follows that with enough example images of dogs, a classifier could be created that can take a previously unseen picture of a dog and label it correctly. So how is it possible for black faces to go unrecognized by webcam software, or for pictures of two black people to be classified as gorillas? Are racist labels being assigned to input data, or is there something more subtle at play?
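The train-then-classify loop described here can be sketched in a few lines. This is a deliberately tiny stand-in, not how production image classifiers work: real systems typically train neural networks on pixel data, whereas this sketch uses a nearest-centroid classifier on made-up two-number feature vectors. The function names and the toy corpus are hypothetical.

```python
import math
from collections import defaultdict

def train(examples):
    """Average the feature vectors of each label into one centroid per label."""
    sums = {}
    counts = defaultdict(int)
    for features, label in examples:
        if label not in sums:
            sums[label] = [0.0] * len(features)
        for i, value in enumerate(features):
            sums[label][i] += value
        counts[label] += 1
    return {label: [s / counts[label] for s in total] for label, total in sums.items()}

def classify(centroids, features):
    """Return the label whose centroid is closest to the given features."""
    return min(centroids, key=lambda label: math.dist(centroids[label], features))

# Hypothetical labelled corpus: (features, label) pairs standing in for images.
labelled = [
    ((0.9, 0.8), "dog"),
    ((0.8, 0.9), "dog"),
    ((0.2, 0.1), "cat"),
    ((0.1, 0.3), "cat"),
]
model = train(labelled)
print(classify(model, (0.7, 0.7)))  # dog
```

The point to notice is that the model knows nothing beyond the examples it was given: whatever is missing or under-represented in the labelled corpus is missing from the classifier too.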

The answer is that the racial bias of these classifiers typically goes unchallenged. This effect compounds in two places. Firstly, the data sets that are being used for training the classifiers are typically not representative of real-world diversity, thus encoding bias. Secondly, the predominantly racially homogeneous staff do not thoroughly and sensitively test their work on images of people more diverse than themselves.

The AI Now paper highlights that a commonly used public dataset of faces called Labeled Faces in the Wild, maintained by the University of Massachusetts, Amherst, has only 7 percent black faces and 22.5 percent female faces, thus making a classifier trained on these images less able to identify women and people of color. How did this dataset end up being so biased? To find out, we must look at how the images were collected.

Data collection was performed automatically from images featured on news websites in the early 2000s. Thus, the corpus “can be understood as a reflection of early 2000s social hierarchy, as reproduced through visual media”. For example, it contains over 500 pictures of the then US President George W. Bush. Although this dataset seems like an excellent opportunity to create a “real world” facial analysis classifier by sampling seemingly random images, these classifiers end up highlighting the lack of diversity in the media at that time.

This phenomenon is not limited to classifiers built from the Labeled Faces in the Wild dataset. A 2018 paper showed that a number of facial analysis techniques misclassify darker-skinned females up to 34.7 percent of the time, whereas lighter-skinned males are misclassified only 0.8 percent of the time. Clearly, something is wrong here. What can we do?
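Gaps like this only become visible when error rates are reported per group rather than as one overall figure. The sketch below shows that kind of disaggregated check under stated assumptions: the function name, the group names and the eight result records are invented for illustration and are not taken from the paper.

```python
from collections import defaultdict

def error_rate_by_group(records):
    """records: iterable of (group, true_label, predicted_label) tuples.
    Returns {group: fraction of that group's records that were misclassified}."""
    totals = defaultdict(int)
    errors = defaultdict(int)
    for group, truth, predicted in records:
        totals[group] += 1
        if truth != predicted:
            errors[group] += 1
    return {group: errors[group] / totals[group] for group in totals}

# Made-up classifier results: (group, true label, predicted label).
results = [
    ("group_a", "female", "female"),
    ("group_a", "male", "male"),
    ("group_a", "female", "female"),
    ("group_a", "male", "male"),
    ("group_b", "female", "male"),
    ("group_b", "female", "female"),
    ("group_b", "male", "male"),
    ("group_b", "female", "male"),
]
rates = error_rate_by_group(results)
print(rates)  # {'group_a': 0.0, 'group_b': 0.5}
```

In this toy data the classifier is right on 6 of 8 records overall, which sounds acceptable until the breakdown shows that every one of its mistakes falls on a single group.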

The full picture of bias

In her keynote talk at NIPS 2017, Kate Crawford, the co-director and co-founder of the AI Now Institute, explores how bias in the AI systems that we create today is propagating the historical and present discriminatory behavior that we see in society itself. Given that AI systems are becoming prevalent in ways that truly affect the outcome of people’s lives, fixing bias is an important issue for the industry to address. Crawford defines bias as a skew that causes a type of harm. She classifies the effect of bias into two broad areas: harms of allocation and harms of representation.

Harms of allocation are where an AI system allocates or withholds opportunities or resources to or from certain groups. For example, AI may be used to make automated decisions on loans and mortgages. It may automatically screen job applicants for their suitability or their criminal background. It may diagnose illness and thus decide on treatment. It may even inform the police as to which neighborhoods they should be spending their time performing “stop and search” operations.

Development of these systems begins with a labelled data set used to train a model. We have seen that these data sets can encode biases that already exist within society. If it just so happens that historically people under thirty or African American women often get turned down for mortgages, then AI trained on this data may unfairly encode this bias and thus cause a harm of allocation to future applicants based on their age, race or gender. Mathematician and author Cathy O’Neil has written about how personality tests, which 70% of people in the US have taken when applying for a job, can discriminate. Kyle Behm, a college student, failed a personality test when applying for a job at a Kroger store. He recognized some of the questions in the test from a mental health assessment which he had taken whilst undergoing treatment for bipolar disorder, suggesting that people with mental health issues were being unfairly discriminated against in the hiring process.

Harms of representation occur when a system misrepresents a group in a damaging way. The example of Google Photos labelling black faces as gorillas is a harm of representation. Latanya Sweeney, a Harvard professor, published a paper showing that searching for racially associated names in Google yields personalized ads that could be interpreted as discriminatory. For example, searching for popular African American male names such as DeShawn, Darnell and Jermaine generated ads suggesting that the person may have been arrested, thus compounding the bias of black criminality. More recently, full-body scanners operated by the TSA have been reported to be prone to false alarms for hairstyles common among people of color.

A succinct example of harmful gender misrepresentation can be seen through the use of Google Translate. By typing in the two sentences “he is a nurse” and “she is a doctor” and then translating them into a gender-neutral language such as Turkish and then back again, we see the roles of the genders switch. Given that systems such as Google Translate are trained on large corpora of existing literature, the historical gender bias in the sampled texts for those two roles has been encoded into the system, thus propagating that bias. Similarly, searching for “CEO” or “politician” on Google Images will give you a page of results full of typically white men. Crawford’s talk also notes how Nikon camera software would label Asian faces as “blinking” in photos.

Hence, given these biases of allocation and representation, we need to be extremely careful when using AI systems to classify people, since the consequences in the real world can be catastrophic. A controversial paper described the development of an AI classifier that was able to predict the likelihood of an individual being homosexual based on an image of their face. Given that homosexuality is illegal and punishable by death in a number of countries with sharia-based law, this work drew criticism for being highly unethical. What if this technology were to get into the wrong hands? Arguably, similar technology already has. AI that can recognize the ethnicity of faces has been deployed in China by a company called SenseNets in order to monitor Uighur Muslims as part of their repressive crackdown on this minority group.

Fixing AI bias

So what do we do about bias in AI? Given that we can hypothesize that the impact of the industry’s innovation on the world will only ever increase, we need to make a stand now to prevent existing discrimination in our society spreading virally.

The AI Now Institute’s paper argues that in order to fix it, we need to begin by fixing the diversity issue within the AI industry itself. That begins by recognizing that there is a serious problem, and recognizing that without radical action, it will only get worse. Unlike the lack of diversity in technology generally, where those primarily harmed are current and future employees, the diversity bias encoded into AI systems could have a devastating effect on the entire world.

We need to raise the profile of the diversity problem in technology companies so that inaction becomes unacceptable. This goes far beyond identifying the issue as just a “pipeline problem”, where companies claim that because diverse candidates are a minority on the job market, it is harder to find them. Simply blaming the hiring pool puts the onus on the candidates, rather than on the workplaces themselves. Instead, tech companies need to work hard on transparency. This ranges from publishing compensation levels broken down by race and gender, to publishing harassment and discrimination transparency reports, to recruiting more widely than elite US universities, to creating new pathways for contractors, temporary staff and vendors from under-represented groups to become full-time employees.

We also need to change the way in which AI systems are built so that discrimination and bias are addressed. The AI Now paper suggests tracking and publicizing where AI systems are used, performing rigorous testing, trials, and auditing in sensitive domains, and expanding the field of bias research such that it also encompasses the social issues caused by the deployment of AI. It also notes that assessments should be carried out on whether certain systems should be designed at all. Who green-lit the development of a homosexuality classifier? What should the ramifications be for doing so?

At a time in history where significant advances in the speed and availability of compute resources have made wide-scale AI available for the masses, our understanding of the impact that it is having on society is lagging sorely behind. We have a long way to go in order to fix the diversity problem in AI and in technology in general. However, through the work published by the AI Now Institute, we can see that if we don’t fix it soon, the systems that we create could cause further harmful divide in an already heavily divided world.