The Bias Truth of AI Models

Arrived about 20 minutes early to Thursday’s keynote. The Bellco theater is filling up more slowly than yesterday…maybe folk were out late last night? It’s about 7:59 to the presentation as I type these words, so I guess it’s just gonna be fewer people in the hall, which will make leaving a little easier. I’m keen to hear this presentation, as the technology we use is NOT a neutral force in the world.

Near empty Bellco Theater

Presenter

  • Teddy Benson, Director of Data Integration, Walt Disney World, Parks and Resorts

Started with a quick informal poll of audience members who are studying or working in the field of AI (it was not a huge number of folk). Question: How many think that bias, in general, is bad? A mix of half-asleep hands goes up in the air, with most attendees abstaining.

What is AI?

To Teddy (as a youngster), AI was exemplified by C-3PO and R2-D2 from the Star Wars movies. AI borrows characteristics from human intelligence and applies them as algorithms in a computer-friendly way. Machine learning is a subset of AI, and deep learning is a subset of machine learning. AI started in 1952 with a study by the Navy to build a neural-net computer, made by Frank Rosenblatt. Neural nets are layered, which allows for nuanced, creative problem-solving. AI has become popular in the last 10 years because cloud computing platforms such as Azure and AWS have made once-theoretical algorithms cheap to run.

Today what we use for AI is known as narrow or weak – AI designed to perform a narrow task (e.g. facial recognition, internet searches, driving a car).

In the future we may see the creation of AGI (Artificial General Intelligence), designed to successfully perform any intellectual task that a human can do.

Passive AI: search auto-fill, delivery of targeted ads, entertainment recommendations, product recommendations, etc.

Active AI: IBM’s Watson info retrieval, knowledge representation, automated reasoning; Salesforce’s Einstein smart CRM; Amazon’s Alexa virtual assistant; Microsoft’s Cortana personal digital assistant; etc.

AI Types

Symbolic AI: rules and knowledge have to be hand-coded and human-readable.

Non-Symbolic AI: lets the pre-generated model perform calculations on its own. Downside: it’s a black box…hard to understand what’s happening inside the AI and/or the model.
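
To make the distinction concrete for myself, here is a minimal sketch (mine, not from the talk). It puts a hand-written rule next to a tiny perceptron that learns roughly the same decision from examples. The "is it hot?" task, the 30-degree threshold, the toy data and all function names are assumptions made up for illustration.

```python
def symbolic_is_hot(temp_c):
    """Symbolic: a human wrote this rule and anyone can read it."""
    return temp_c >= 30  # threshold chosen by hand (assumed for the example)


def train_perceptron(samples, epochs=600, lr=0.1):
    """Non-symbolic: fit a weight and a bias from labeled examples."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x, label in samples:
            pred = 1 if w * x + b > 0 else 0
            err = label - pred  # 0 when right, +1 or -1 when wrong
            w += lr * err * x
            b += lr * err
    return w, b


# Toy training data: (temperature in Celsius divided by 100, 1 means "hot").
samples = [(0.10, 0), (0.20, 0), (0.35, 1), (0.40, 1)]
w, b = train_perceptron(samples)


def learned_is_hot(temp_c):
    """Apply the learned weight and bias to a new temperature."""
    return w * (temp_c / 100) + b > 0


print(symbolic_is_hot(35), learned_is_hot(35))
print("learned parameters:", w, b)
```

The rule explains itself. The learned version is just two numbers, and with millions of them instead of two you get the black-box problem described above.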

Where are we in the evolution?

  • Type 1: Reactive machines (chess playing)
  • Type 2: Limited memory (self-driving cars)
  • Type 3: Theory of mind (understanding the world)
  • Type 4: Self-awareness (understanding consciousness)

What is Bias?

The attaching of positive or negative meaning to elements in our environment, based on personal or societal influences that shape our thinking.

Conscious bias means being aware, intentional and responsive.

Unconscious bias, on the other hand, refers to being unaware of bias being introduced into the system.

What is AI Bias?

When a machine is biased, it is unable or less able to adapt to various training models, preferring one route as a primary mechanism.

This makes the developed AI algorithm rigid and inflexible, unable to adjust when a variation is created in the data at hand. It is also unable to pick up on the discrete complexities that define a particular data set.

As our use of AI continues to grow, the problem of bias in AI will remain pervasive, and its full extent is not yet realized.

A Google image search (circa 2012) on “What is a CEO?” showed women in only 11% of the results, when the actual share of women among CEOs was about 27%. A similar search on “What is a telemarketer?” had the opposite effect: women were over-represented (64% of results versus an actual share of about 50%).

  • Was it the model causing the bias?
  • Was it the data that the model was given?
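
One way to put a number on that kind of skew is to compare the share shown with the real-world share. This is a small sketch of my own using the CEO figures quoted above (11% shown versus about 27% actual); the function name is made up.

```python
def representation_gap(shown_pct, actual_pct):
    """Return (percentage-point gap, ratio of shown share to actual share)."""
    return shown_pct - actual_pct, shown_pct / actual_pct


# CEO figures from the example: 11% in the results, about 27% in reality.
gap, ratio = representation_gap(11, 27)
print(f"gap: {gap} points, shown/actual ratio: {ratio:.2f}")
```

A ratio below 1 means the group is under-represented in the results; above 1 means over-represented. It does not answer either question above, it only measures the size of the skew.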

In 2016, Microsoft released Tay.ai to the world, and within 24 hours it was reduced to spouting racist and other awful statements. What caused this chatbot to become a xenophobic Nazi-lover? LEARNED BEHAVIOR. Tay learned its behavior from the people who were tweeting at it, taking its conversational cues from the WWW. Given that the internet can be (and often is) a massive verbal garbage firehose of the worst parts of humanity, it is no surprise that Tay began to take on those characteristics.

Word2Vec tries to find associations between words by representing them as vectors in an interconnected space and tying related ones together. For language translation, the theory supposes you can use the cascading vectors to translate between languages. Google’s translation of online articles can fail at this due to vector bias. Vector bias (data influence/inference): nuances in language are not weighted appropriately in vector space.
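
Here is a rough sketch of how bias can show up in word vectors. It is my own illustration, not something shown in the talk, and the three-dimensional vectors are invented by hand; they are NOT real Word2Vec output. The idea is to take the direction from "she" to "he" and check which way other words lean along it using cosine similarity.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Hand-made toy vectors (assumed for illustration only).
vectors = {
    "he": [1.0, 0.0, 0.2],
    "she": [-1.0, 0.0, 0.2],
    "engineer": [0.6, 0.8, 0.1],
    "nurse": [-0.6, 0.8, 0.1],
}


def gender_lean(word):
    """Positive leans toward 'he', negative toward 'she', zero is neutral."""
    direction = [a - b for a, b in zip(vectors["he"], vectors["she"])]
    return cosine(vectors[word], direction)


for word in ("engineer", "nurse"):
    print(word, round(gender_lean(word), 3))
```

In these toy vectors the job titles carry a gender lean they have no reason to have. With real embeddings, any such lean would come from the text the model was trained on.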

Types of Bias

  • Data-driven: missing or one-sided data set
  • Bias through interaction: learned behavior
  • Emergent bias: providing matching data to requested data
  • Similarity bias: matching like for like
  • Conflicting goal bias: stereotyping results

How to Prevent AI Bias?

  • Know your environment
  • Know your data and where it came from
  • Verify your model logic – code reviews
  • Spot check during training of models
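
Two of those checks ("know your data" and "spot check during training") can be sketched in a few lines. This is my own hedged example, not Teddy's: the record layout, the `group` and `label` fields, the toy records and the 0.5 cut-off are all assumptions for illustration.

```python
from collections import Counter, defaultdict


def group_shares(records, key):
    """Know your data: what fraction of the records does each group make up?"""
    counts = Counter(r[key] for r in records)
    total = sum(counts.values())
    return {group: count / total for group, count in counts.items()}


def accuracy_by_group(records, key, predict):
    """Spot check: how often is the model right within each group?"""
    hits, totals = defaultdict(int), defaultdict(int)
    for r in records:
        totals[r[key]] += 1
        hits[r[key]] += int(predict(r) == r["label"])
    return {group: hits[group] / totals[group] for group in totals}


# Toy records (made up): group B is scarce and the stand-in model misses it.
records = [
    {"group": "A", "score": 0.9, "label": 1},
    {"group": "A", "score": 0.8, "label": 1},
    {"group": "A", "score": 0.2, "label": 0},
    {"group": "B", "score": 0.4, "label": 1},
]


def predict(record):
    """Stand-in model: say 1 when the score reaches 0.5."""
    return int(record["score"] >= 0.5)


shares = group_shares(records, "group")
accuracy = accuracy_by_group(records, "group", predict)
print(shares)
print(accuracy)
```

Overall accuracy here is 75%, which looks fine until you split it by group. That split is the kind of spot check the list is asking for.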