“There comes a time in the life of a subject when someone steps up and writes the book about it. AIQ explores the fascinating history of the ideas that drive this technology of the future and demystifies the core concepts behind it; the result is a positive and entertaining look at the great potential unlocked by marrying human creativity with powerful machines.” Steven D. Levitt, bestselling co-author of Freakonomics
From leading data scientists Nick Polson and James Scott, what everyone needs to know to understand how artificial intelligence is changing the world and how we can use this knowledge to make better decisions in our own lives.
Dozens of times per day, we all interact with intelligent machines that are constantly learning from the wealth of data now available to them. These machines, from smart phones to talking robots to self-driving cars, are remaking the world in the 21st century in the same way that the Industrial Revolution remade the world in the 19th century.
AIQ is based on a simple premise: if you want to understand the modern world, then you have to know a little bit of the mathematical language spoken by intelligent machines. AIQ will teach you that language, but in an unconventional way, anchored in stories rather than equations.
You will meet a fascinating cast of historical characters who have a lot to teach you about data, probability, and better thinking. Along the way, you'll see how these same ideas are playing out in the modern age of big data and intelligent machines, and how these technologies will soon help you to overcome some of your built-in cognitive weaknesses, giving you a chance to lead a happier, healthier, more fulfilled life.
About the Author
NICK POLSON is Professor of Econometrics and Statistics at the Chicago Booth School of Business. He does research on machine intelligence and deep learning, and is a frequent speaker. Polson lives in Chicago.
JAMES SCOTT is Associate Professor of Statistics at the University of Texas at Austin. He is a statistician and data scientist who has worked with clients across many industries to help them understand the power of data. Scott lives in Austin with his wife.
Read an Excerpt
On personalization: how a Hungarian émigré used conditional probability to protect airplanes from enemy fire in World War II, and how today's tech firms are using the same math to make personalized suggestions for films, music, news stories — even cancer drugs.
NETFLIX HAS COME so far, so fast, that it's hard to remember that it started out as a "machine learning by mail" company. As recently as 2010, the company's core business involved filling red envelopes with DVDs that would incur "no late fees, ever!" Each envelope would come back a few days after it had been sent out, along with the subscriber's rating of the film on a 1-to-5 scale. As that ratings data accumulated, Netflix's algorithms would look for patterns, and over time, subscribers would get better film recommendations. (This kind of AI is usually called a "recommender system"; we also like the term "suggestion engine.")
Netflix 1.0 was so focused on improving its recommender system that in 2007, to great fanfare among math geeks the world over, it announced a public machine-learning contest with a prize of $1 million. The company put some of its ratings data on a public server, and it challenged all comers to improve upon Netflix's own system, called Cinematch, by at least 10% — that is, by predicting how you'd rate a film with 10% better accuracy than Netflix could. The first team to meet the 10% threshold would win the cash.
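To make the 10% threshold concrete, here is a minimal sketch of how such an improvement could be checked. The excerpt describes the target only as "10% better accuracy"; the contest itself scored entries by root-mean-square error (RMSE), which is the measure assumed here. The ratings, the two sets of predictions, and the function names are all hypothetical and are not taken from the book or from Netflix's data.

```python
import math

def rmse(predicted, actual):
    """Root-mean-square error between predicted and actual ratings."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two non-empty lists of equal length")
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(actual))

def improvement(baseline_rmse, challenger_rmse):
    """Fractional reduction in error relative to the baseline."""
    return (baseline_rmse - challenger_rmse) / baseline_rmse

# Hypothetical 1-to-5 star ratings and two sets of predictions.
actual = [4, 3, 5, 2, 1]
baseline = [3, 3, 4, 3, 2]    # stand-in for the existing system
challenger = [4, 3, 4, 2, 2]  # stand-in for a contest entry

gain = improvement(rmse(baseline, actual), rmse(challenger, actual))
meets_threshold = gain >= 0.10
print(f"improvement: {gain:.1%}, meets 10% threshold: {meets_threshold}")
```

With only five toy ratings the gain is far larger than anything seen in the real contest, where fractions of a percentage point were hard won.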
Over the ensuing months, thousands of entries flooded in. Some came tantalizingly close to the magic 10% threshold, but nobody beat it. Then in 2009, after two years of refining their algorithm, a team calling themselves BellKor's Pragmatic Chaos finally submitted the million-dollar piece of code, beating Netflix's engine by 10.06%. And it's a good thing they didn't pause to watch an extra episode of The Big Bang Theory before hitting the submit button. BellKor reached the finish line of the two-year race just 19 minutes and 54 seconds ahead of a second team, The Ensemble, who submitted an algorithm also reaching 10.06% improvement — just not quite fast enough.
In retrospect, the Netflix Prize was a perfect symbol of the company's early reliance on a core machine-learning task: algorithmically predicting how a subscriber would rate a film. Then, in March of 2011, three little words changed the future of Netflix forever: House of Cards.
House of Cards was the first "Netflix Original Series," the company's first try at producing TV rather than merely distributing it. The production team behind House of Cards originally went to all the major networks with their idea, and every single one was interested. But they were all cautious — and they all wanted to see a pilot first. The show, after all, is a tale of lies, betrayal, and murder. You can almost imagine the big networks asking themselves, "How can we be sure that anyone will watch something so sinister?" Well, Netflix could. According to the show's producers, Netflix was the only network with the courage to say, "We believe in you. We've run our data, and it tells us that our audience would watch this series. We don't need you to do a pilot. How many episodes do you want to do?"
We've run our data, and we don't need a pilot. Think of the economic implications of that statement for the television industry. In the year before House of Cards premiered, the major TV networks commissioned 113 pilots, at a total cost of nearly $400 million. Of those, only 35 went on the air, and only 13 — one show in nine — made it to season two. Clearly the networks had almost no idea what would succeed.
So what did Netflix know in March of 2011 that the major networks didn't? What made its people so confident in their assessment that they were willing to move beyond recommending personalized TV and start making personalized TV?
The pat answer is that Netflix had data on its subscriber base. But while data was important, this explanation is far too simple. The networks had lots of data, too, in the form of Nielsen ratings and focus groups and countless thousands of surveys — and big budgets for gathering more data, if they believed in its importance.
The data scientists at Netflix, however, had two things that the networks did not, things that were just as important as the data itself: (1) the deep knowledge of probability required to ask the right questions of their data, and (2) the courage to rebuild their entire business around the answers they got. The result was an astonishing transformation for Netflix: from a machine-learning-powered distribution network to a new breed of production company in which data scientists and artists come together to make awesome television. As Ted Sarandos, Netflix's chief content officer, famously put it in an interview with GQ: "The goal is to become HBO faster than HBO can become us."
Today, few organizations use AI for personalization better than Netflix, and the approach it pioneered now dominates the online economy. Your digital trail yields personalized suggestions for music on Spotify, videos on YouTube, products on Amazon, news stories from The New York Times, friends on Facebook, ads on Google, and jobs on LinkedIn. Doctors can even use the same approach to give you personalized suggestions for cancer therapy, based on your genes.
It used to be that the most important algorithm in your digital life was search, which for most of us meant Google. But the key algorithms of the future are about suggestions, not search. Search is narrow and circumscribed; you have to know what to search for, and you're limited by your own knowledge and experience. Suggestions, on the other hand, are rich and open ended; they draw on the accumulated knowledge and experience of billions of other people. Suggestion engines are like "doppelgänger software" that might someday come to know your preferences better than you do, at least consciously. How long will it be, for example, before you can tell Alexa, "I'm feeling adventurous; book me a weeklong holiday," and expect a brilliant result?
There's obviously a lot of sophisticated math behind these suggestion engines. But if you're math-phobic, there's also some very good news. It turns out that there's really only one key concept you need to understand, and it's this: to a learning machine, "personalization" means "conditional probability."
In math, a conditional probability is the chance that one thing happens, given that some other thing has already happened. A great example is a weather forecast. If you were to look outside this morning and see gathering clouds, you might assume that rain is likely and bring an umbrella to work. In AI, we express this judgment as a conditional probability — for example, "the conditional probability of rain this afternoon, given clouds this morning, is 60%." Data scientists write this a bit more compactly: P (rain this afternoon | clouds this morning) = 60%. P means "probability," and that vertical bar means "given" or "conditional upon." The thing on the left of the bar is the event we're interested in. The thing on the right of the bar is our knowledge, also called the "conditioning event": what we believe or assume to be true.
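As a minimal sketch of the definition above, the snippet below estimates P(rain this afternoon | clouds this morning) by counting. The ten-day log and the function name `conditional_probability` are illustrative assumptions, not data or code from the book; the log is chosen so that the answer matches the 60% in the example.

```python
from fractions import Fraction

# Hypothetical daily log: (clouds_in_morning, rain_in_afternoon).
days = [
    (True, True), (True, True), (True, True),
    (True, False), (True, False),
    (False, False), (False, False), (False, True),
    (False, False), (False, False),
]

def conditional_probability(records):
    """Estimate P(rain | clouds) as count(clouds and rain) / count(clouds)."""
    cloudy = [rain for clouds, rain in records if clouds]
    if not cloudy:
        raise ValueError("no cloudy mornings to condition on")
    return Fraction(sum(cloudy), len(cloudy))

p_rain_given_clouds = conditional_probability(days)
p_rain = Fraction(sum(rain for _, rain in days), len(days))

print(float(p_rain_given_clouds))  # 0.6
print(float(p_rain))               # 0.4
```

Conditioning restricts attention to the cloudy mornings only, which is why the estimate (60%) differs from the overall chance of rain across all ten days (40%).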
Conditional probability is how AI systems express judgments in a way that reflects their partial knowledge:
You just gave Sherlock a high rating. What's the conditional probability that you will like The Imitation Game or Tinker Tailor Soldier Spy?
Yesterday you listened to Pharrell Williams on Spotify. What's the conditional probability that you'll want to listen to Bruno Mars today?
You just bought organic dog food. What's the conditional probability that you will also buy a GPS-enabled dog collar?
You follow Cristiano Ronaldo (@cristiano) on Instagram. What's the conditional probability that you will respond to a suggestion to follow Lionel Messi (@leomessi) or Gareth Bale (@garethbale11)?
Personalization runs on conditional probabilities, all of which must be estimated from massive data sets in which you are the conditioning event. In this chapter, you'll learn a bit of the magic behind how this works.
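The questions above can be sketched as a tiny suggestion engine that estimates P(likes another title | likes a given title) by counting. This is only an illustration under stated assumptions: the five subscribers, their liked titles, and the function `suggest` are invented here, and real recommender systems work with vastly larger data sets and more sophisticated models.

```python
from collections import Counter

# Hypothetical "liked" sets for a handful of subscribers.
likes = {
    "ana":   {"Sherlock", "The Imitation Game", "Tinker Tailor Soldier Spy"},
    "ben":   {"Sherlock", "The Imitation Game"},
    "chloe": {"Sherlock", "The Big Bang Theory"},
    "dev":   {"The Big Bang Theory"},
    "eli":   {"Sherlock", "The Imitation Game", "The Big Bang Theory"},
}

def suggest(given_title, likes_by_user):
    """Rank other titles by the estimated P(likes title | likes given_title)."""
    # Conditioning event: keep only subscribers who liked the given title.
    fans = [titles for titles in likes_by_user.values() if given_title in titles]
    if not fans:
        return []
    counts = Counter(t for titles in fans for t in titles if t != given_title)
    ranked = [(title, n / len(fans)) for title, n in counts.items()]
    # Highest probability first; ties broken alphabetically.
    return sorted(ranked, key=lambda pair: (-pair[1], pair[0]))

for title, prob in suggest("Sherlock", likes):
    print(f"P(likes {title} | likes Sherlock) = {prob:.2f}")
```

In this toy data, four subscribers liked Sherlock and three of them also liked The Imitation Game, so that title ranks first with an estimated conditional probability of 0.75.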
Abraham Wald, World War II Hero
The core idea behind personalization is a lot older than Netflix, older even than television itself. In fact, if you want to understand the last decade's revolution in the way that people engage with popular culture, then the best place to start isn't in Silicon Valley, or in the living room of a cord-cutting millennial in Brooklyn or Shoreditch. Rather, it's in 1944, in the skies over occupied Europe, where one man's mastery of conditional probability saved the lives of an untold number of Allied bomber crews in the largest aerial campaign in history: the bombardment of the Third Reich.
During World War II, the size of the air war over Europe was truly staggering. Every morning, vast squadrons of British Lancasters and American B-17s took off from bases in England and made their way to their targets across the Channel. By 1944, the combined Allied air forces were dropping over 35 million pounds of bombs per week. But as the air campaign escalated, so too did the losses. On a single mission in August of 1943, the Allies dispatched 376 bombers from 16 different air bases, in a joint bombing raid on factories in Schweinfurt and Regensburg in Germany. Sixty planes never came back — a daily loss of 16%. The 381st Bomb Group, flying out of RAF Ridgewell, lost 9 of its 20 bombers that day.
World War II airmen were painfully aware that each mission was a roll of the dice. But in the face of these bleak odds, the bomber crews had at least three defenses.
1. Their own tail and turret gunners, to ward off attackers.
2. Their fighter escorts: the Spitfires and P-51 Mustangs sent along to defend the bombers from the Luftwaffe.
3. A Hungarian-American statistician named Abraham Wald.
Abraham Wald never shot down a Messerschmitt or even saw the inside of a combat aircraft. Nonetheless, he made an outsized contribution to the Allied war effort using an equally potent weapon: conditional probability. Specifically, Wald built a recommender system that could make personalized survivability suggestions for different kinds of planes. At its heart, it was just like a modern AI-based recommender system for TV shows. And when you understand how he built it, you'll also understand a lot more about Netflix, Hulu, Spotify, Instagram, Amazon, YouTube, and just about every tech company that's ever made you an automatic suggestion worth following.
Wald's Early Years
Abraham Wald was born in 1902 to a large Orthodox Jewish family in Kolozsvár, Hungary, which became part of Romania and changed its name to Cluj after World War I. His father, who worked at a bakery in town, created a home atmosphere of learning and intellectual curiosity for his six children. The young Wald and his siblings grew up playing the violin, solving math puzzles, and listening to stories at the feet of their grandfather, a famous and beloved rabbi. Wald attended the local university, graduating in 1926. He then went on to the University of Vienna, where he studied mathematics under a distinguished scholar named Karl Menger.
By 1931, when he finished his PhD, Wald had emerged as a rare talent. Menger called his pupil's dissertation a "masterpiece of pure mathematics," describing it as "deep, beautiful, and of fundamental importance." But no university in Austria would hire a Jew, no matter how talented, and no matter how strongly his famous advisor recommended him. So Wald looked for other options. In fact, he told Menger that he was happy to take any job that would let him make ends meet; all he wanted to do was keep proving theorems and attending math seminars.
At first, Wald worked as the private math tutor for a wealthy Austrian banker named Karl Schlesinger, to whom Wald remained forever grateful. Then in 1933 he was hired as a researcher at the Austrian Institute for Business Cycle Research, where yet another famous scholar found himself impressed by Wald: economist Oskar Morgenstern, the coinventor of game theory. Wald worked side by side with Morgenstern for five years, analyzing seasonal variation in economic data. It was there at the institute that Wald first encountered statistics, a subject that would soon come to define his professional life.
But dark clouds were gathering over Austria. As Wald's advisor Menger put it, "Viennese culture resembled a bed of delicate flowers to which its owner refused soil and light, while a fiendish neighbor was waiting for a chance to ruin the entire garden." The spring of 1938 brought disaster: Anschluss. On March 11, Austria's elected leader, Kurt Schuschnigg, was deposed by Hitler and replaced by a Nazi stooge. Within hours, 100,000 troops from the German Wehrmacht marched unopposed across the border. By March 15 they were parading through Vienna. In a bitter omen, Karl Schlesinger, Wald's benefactor from the lean years of 1931–32, took his own life that very day.
Luckily for Wald, his work on economic statistics had earned attention abroad. The previous summer, in 1937, he'd been invited to America by an economics research institute in Colorado Springs. Although pleased by the recognition, Wald had initially been hesitant to leave Vienna. But Anschluss changed his mind, as he witnessed the Jews of Austria falling victim to a terrible orgy of murder and theft and betrayal. Their shops were plundered, their homes vandalized, their roles in public life stripped by the Nuremberg Laws — including Wald's role, at the Institute for Business Cycle Research. Wald was sad to say good-bye to Vienna, his second home, but he could see the winds of madness blowing stronger every day.
So in the summer of 1938, at great peril, he snuck across the border into Romania and traveled onward to America, dodging guards on the lookout for Jews fleeing the country. The decision to leave probably saved his life. Remaining in Europe were Wald's parents, his grandparents, and his five brothers and sisters — and all but one, his brother Hermann, were murdered in the Holocaust. By then Wald was living in America. He was safe and hard at work, married and with two children, and he took solace in the joys of his new life. Yet he would remain so stricken by grief over the fate of his family that he never again played the violin.
Wald in America
Abraham Wald would, however, do more than his fair share to make sure that Hitler faced the music.
The 35-year-old Wald arrived in America in the summer of 1938. Although he missed Vienna, he immediately liked his new home. Colorado Springs echoed the Carpathian foothills of his youth, and his new colleagues received him with warmth and affection. He didn't stay in Colorado for long, though. Oskar Morgenstern, who had fled to America himself and was now in Princeton, was telling his math friends all up and down the East Coast about his old colleague Wald, whom he described as a "gentle man" with "exceptional gifts and great mathematical power." Wald's reputation kept growing, and it soon caught the attention of an eminent statistics professor in New York named Harold Hotelling. In the fall of 1938, Wald accepted an offer to join Hotelling's group at Columbia University. He began as a research associate, but he flourished so rapidly as both a teacher and scholar that he was soon offered a permanent position on the faculty.
By late 1941, Wald had been in New York for three years, and the stakes of what was happening across the water were obvious to all but the willfully blind. For two of those years Britain had been fighting the Nazis alone, fighting, as Churchill put it, "to rescue not only Europe but mankind." Yet for those two long years, America had stood aside. It took the bombing of Pearl Harbor to rouse the American people from their torpor, but roused they were at last. Young men surged forward to enlist. Women joined factories and nursing units. And scientists rushed to their labs and chalkboards, especially the many émigrés who'd fled the Nazis in terror: Albert Einstein, John von Neumann, Edward Teller, Stanislaw Ulam, and hundreds of other brilliant refugees who gave American science a decisive boost during the war.
Abraham Wald, too, was eager to answer the call. He was soon given the chance, when his colleague W. Allen Wallis invited him to join Columbia's Statistical Research Group. The SRG had been started in 1942 by four statisticians who met periodically in a dingy room in Rockefeller Center, in midtown Manhattan, to provide statistical consulting to the military. As academics, they were initially unaccustomed to giving advice under pressure. Sometimes this led to episodes revealing comically poor perspective on the demands of war. In the SRG's early days, one mathematician complained resentfully about being forced by a secretary to save paper by writing his equations on both sides of the page.
(Continues…)
Excerpted from "AIQ"
Copyright © 2018 Nicholas Polson and James Scott.
Excerpted by permission of St. Martin's Press.
All rights reserved. No part of this excerpt may be reproduced or reprinted without permission in writing from the publisher.
Table of Contents
1 The Refugee
2 The Candlestick Maker
3 The Reverend and the Submarine
4 Amazing Grace
5 The Genius at the Royal Mint
6 The Lady with the Lamp
7 The Yankee Clipper