Neural Networks and Deep Learning: Crash Course AI #3

Hi, I’m Jabril, and welcome to Crash Course
AI! In the supervised learning episode, we taught
John Green-bot to learn using a perceptron, a program that imitates one neuron. But our brains make decisions with 100 billion
neurons, which have trillions of connections between them! We can actually do a lot more with AI if we
connect a bunch of perceptrons together, to create what’s called an artificial neural
network. Neural networks are better than other methods
for certain tasks, like image recognition. The secret to their success is their hidden
layers, and they’re mathematically very elegant. Both of these reasons are why neural networks
are one of the most dominant machine learning technologies used today. [INTRO] Not that long ago, a big challenge in AI was
real-world image recognition, like recognizing a dog from a cat, and a car from a plane from
a boat. Even though we do it every day, it’s really
hard for computers. That’s because computers are good at literal
comparisons, like matching 0s and 1s, one at a time. It’s easy for a computer to tell that these
images are the same by matching the pixels. But before AI, a computer couldn’t tell
that these images are of the same dog, and had no hope of telling that all of these different
images are dogs. So, a professor named Fei-Fei Li and a group
of other machine learning and computer vision researchers wanted to help the research community
develop AI that could recognize images. The first step was to create a huge public
dataset of labeled real-world photos. That way, computer scientists around the world
could come up with and test different algorithms. They called this dataset ImageNet. It has 3.2 million labeled images, sorted
into 5,247 nested categories of nouns. Like for example, the “dog” label is nested
under “domestic animal,” which is nested under “animal.” Humans are the best at reliably labeling data. But if one person did all this labeling, taking
10 seconds per label, without any sleep or snack breaks, it would take them over a year! So ImageNet used crowd-sourcing and leveraged
the power of the Internet to cheaply spread the work between thousands of people. Once the data was in place, the researchers
started an annual competition in 2010 to get people to contribute their best solutions
to image recognition. Enter Alex Krizhevsky, who was a graduate
student at the University of Toronto. In 2012, he decided to apply a neural network
to ImageNet, even though similar solutions hadn’t been successful in the past. His neural network, called AlexNet, had a
couple of innovations that set it apart. He used a lot of hidden layers, which we’ll
get to in a minute. He also used faster computation hardware to
handle all the math that neural networks do. AlexNet outperformed the next best approaches
by more than 10 percentage points. It only got 3 out of every 20 images wrong. In grade terms, it was getting a solid B while
other techniques were scraping by with a low C. Since 2012, neural network solutions have
taken over the annual competition, and the results keep getting better and better. Plus, AlexNet sparked an explosion of research
into neural networks, which we started to apply to lots of things beyond image recognition. To understand how neural networks can be used
for these classification problems, we have to understand their architecture first. All neural networks are made up of an input
layer, an output layer, and any number of hidden layers in between. There are many different arrangements but
we’ll use the classic multi-layer perceptron as an example. The input layer is where the neural network
receives data represented as numbers. Each input neuron represents a single feature,
which is some characteristic of the data. Features are straightforward if you’re talking
about something that’s already a number, like grams of sugar in a donut. But, really, just about anything can be converted
to a number. Sounds can be represented as the amplitudes
of the sound wave. So each feature would have a number that represents
the amplitude at a moment in time. Words in a paragraph can be represented by
how many times each word appears. So each feature would have the frequency of
one word. Or, if we’re trying to label an image of
a dog, each feature would represent information about a pixel. So for a grayscale image, each feature would
have a number representing how bright a pixel is. But for a color image, we can represent each
pixel with three numbers: the amount of red, green, and blue, which can be combined to
make any color on your computer screen. Once the features have data, each one sends
its number to every neuron in the next layer, called the hidden layer. Then, each hidden layer neuron mathematically
combines all the numbers it gets. The goal is to measure whether the input data
has certain components. For an image recognition problem, these components
may be a certain color in the center, a curve near the top, or even whether the image contains
eyes, ears, or fur. Instead of answering yes or no, like the simple
perceptron from the previous episode, each neuron in the hidden layer does some slightly
more complicated math and outputs a number. And then, each neuron sends its number to
every neuron in the next layer, which could be another hidden layer or the output layer. The output layer is where the final hidden
layer outputs are mathematically combined to answer the problem. So, let’s say we’re just trying to label
an image as a dog. We might have a single output neuron representing
a single answer – that the image is of a dog or not. But if there are many answers, like for example
if we’re labeling a bunch of images, we’ll need a lot of output neurons. Each output neuron will correspond to the
probability for each label — like for example, dog, car, spaghetti, and more. And then we can pick the answer with the highest
probability. The key to neural networks — and really all
of AI — is math. And I get it. A neural network kind of seems like a black
box that does math and spits out an answer. I mean, those middle layers are even called
hidden layers! But we can understand the gist of what’s
happening by working through an example. Oh, John Green-bot? Let’s give John Green-bot a program with
a neural network that’s been trained to recognize a dog in a grayscale photo. When we show him this photo first, every feature
will contain a number between 0 and 1 corresponding to the brightness of one pixel. And it’ll pass this information to the hidden
layer. Now, let’s focus on one hidden layer neuron. Since the neural network is already trained,
this neuron has a mathematical formula to look for a particular component in the image,
like a specific curve in the center. The curve at the top of the nose. If this neuron is focused on this specific
shape and spot, it may not really care what’s happening everywhere else. So it would multiply, or weight, the pixel values
from most of those features by 0 or close to 0. Because it’s looking for bright pixels here,
it would multiply these pixel values by a positive weight. But this curve is also defined by a darker
part below. So the neuron would multiply these pixel values
by a negative weight. This hidden neuron will add all the weighted
pixel values from the input neurons and squish the result so that it’s between 0 and 1. The final number basically represents the
guess of this neuron thinking that a specific curve, aka a dog nose, appeared in the image. Other hidden neurons are looking for other
components, like for example, a different curve in another part of the image, or a
fuzzy texture. When all of these neurons pass their estimates
onto the next hidden layer, those neurons may be trained to look for more complex components. Like, one hidden neuron may check whether
there’s a shape that might be a dog nose. It probably doesn’t care about data from
previous layers that looked for furry textures, so it weights those by 0 or close to 0. But it may really care about neurons that
looked for the “top of the nose” and “bottom of the nose” and “nostrils”. It weights those by large positive numbers. Again, it would add up all the weighted values
from the previous layer neurons, squish the value to be between 0 and 1, and pass this
to the next layer. That’s the gist of the math, but we’re
simplifying a bit. It’s important to know that neural networks
don’t actually understand ideas like “nose” or “eyelid.” Each neuron is doing a calculation on the
data it’s given and just flagging specific patterns of light and dark. After a few more hidden layers, we reach the
output layer with one neuron! So after one more weighted addition of the
previous layer’s data, which happens in the output neuron, the network should have
a good estimate of whether this image is a dog. Which means, John Green-bot should have a
decision. John Green-bot: Output neuron value: 0.93. Probability that this is a dog: 93%! Hey, John Green-bot, nice job! Thinking about how a neural network would
process just one image makes it clearer why AI needs fast computers. Like I mentioned before, each pixel in a color
image will be represented by 3 numbers — how much red, green, and blue it has. So to process a 1000 by 1000 pixel image,
which in comparison is a small 3 by 3 inch photo, a neural network needs to look at 3
million features! AlexNet needed more than 60 million parameters
to achieve this, which is a ton of math and could take a lot of time to compute. Which is something we should keep in mind
when designing neural networks to solve problems. People are really excited about using deeper
neural networks, which are networks with more hidden layers, to do deep learning. Deep networks can combine input data in more
complex ways to look for more complex components, and solve trickier problems. But we can’t make all networks like a billion
layers deep, because more hidden layers means more math which again would mean that we need
faster computers. Plus, as a network gets deeper, it gets harder
for us to make sense of why it’s giving the answers it does. Each neuron in the first hidden layer is looking
for some specific component of the input data. But in deeper layers, those components get
more abstract from how humans would describe the same data. Now, this may not seem like a big deal, but
if a neural network was used to deny our loan request, for example, we’d want to know why. Which features made the difference? How were they weighted towards the final answer? In many countries, we have the legal right
to understand why these kinds of decisions were made. And neural networks are being used to make
more and more decisions about our lives. Most banks for example use neural networks
to detect and prevent fraud. Many cancer tests, like the Pap test for cervical
cancer, use a neural network to look at an image of cells under a microscope, and decide
whether there’s a risk of cancer. And neural networks are how Alexa understands
what song you’re asking her to play and how Facebook suggests tags for our photos. Understanding how all this happens is really
important to being a human in the world right now, whether or not you want to build your
own neural network. So this was a lot of big-picture stuff, but
the program we gave John Green-bot had already been trained to recognize dogs. The neurons already had algorithms that weighted
inputs. Next time, we’ll talk about the learning
process used by neural networks to get to the right weights for every neuron, and why
they need so much data to work well. Crash Course AI is produced in association with PBS Digital Studios. If you want to help keep all Crash Course
free for everyone, forever, you can join our community on Patreon. And if you want to learn more about the math
behind neural networks, check out this video from Crash Course Statistics about them.
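The "weight the pixels, add them up, and squish the result" step that the episode describes for a single hidden neuron can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the episode never names the squishing function, so the sigmoid used here is an assumed choice, and the four pixel values and weights are made-up numbers rather than anything from a trained network.

```python
import math


def squish(x):
    """Assumed squishing function (sigmoid): maps any number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def neuron_output(inputs, weights):
    """Multiply each input by its weight, add everything up, then squish."""
    total = sum(w * x for w, x in zip(weights, inputs))
    return squish(total)


# Hypothetical 4-pixel grayscale "image": brightness values between 0 and 1.
pixels = [0.9, 0.8, 0.1, 0.5]

# Hypothetical trained weights: positive where the neuron looks for bright
# pixels, negative where it looks for dark ones, 0 where it does not care.
weights = [2.0, 2.0, -3.0, 0.0]

activation = neuron_output(pixels, weights)
print(f"Neuron's guess that its pattern is present: {activation:.2f}")
```

The bright pixels under positive weights and the dark pixel under the negative weight both push the sum up, so the activation lands close to 1. The last pixel has a weight of 0, so its brightness has no effect on the result.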

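The full path from input layer to hidden layer to output layer, ending with "pick the answer with the highest probability," might look roughly like the sketch below. Everything specific in it is an assumption for illustration: the three-feature input, the weight values, the sigmoid in the hidden layer, and the use of softmax to turn output scores into probabilities are not taken from the episode, and a real network would learn its weights from data.

```python
import math


def squish(x):
    """Assumed squishing function (sigmoid): maps any number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def layer(inputs, weight_rows):
    """One hidden layer: each row of weights belongs to one neuron."""
    return [squish(sum(w * x for w, x in zip(row, inputs))) for row in weight_rows]


def softmax(scores):
    """Assumed way to turn raw output scores into probabilities that sum to 1."""
    biggest = max(scores)  # subtracting the max keeps exp() from overflowing
    exps = [math.exp(s - biggest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict(features, hidden_weights, output_weights, labels):
    """Input layer -> hidden layer -> output layer -> most probable label."""
    hidden = layer(features, hidden_weights)
    scores = [sum(w * h for w, h in zip(row, hidden)) for row in output_weights]
    probabilities = softmax(scores)
    best = max(range(len(labels)), key=lambda i: probabilities[i])
    return labels[best], probabilities


# Hypothetical tiny network: 3 input features, 2 hidden neurons, 3 labels.
labels = ["dog", "car", "spaghetti"]
features = [0.9, 0.1, 0.4]
hidden_weights = [[1.5, -2.0, 0.0], [-1.0, 2.5, 0.5]]
output_weights = [[3.0, -1.0], [-2.0, 2.0], [0.0, 0.5]]

label, probabilities = predict(features, hidden_weights, output_weights, labels)
print(label, [round(p, 2) for p in probabilities])
```

Every neuron in one layer receives every number from the layer before it, which is why each layer is written as a list of weight rows. Adding another hidden layer would just mean calling `layer` again on the previous layer's outputs.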

  1. Nerd Is The New Black

    August 24, 2019 at 6:06 am

    @t I'm looking to make a educational youtube channel like this , if you want to help you can donate here

  2. Daniel Hill

    August 24, 2019 at 6:09 am

    I'll never be able to articulate how wonderful it is to be learning this from a fellow black man.
    This is a blessing.
    Thank you infinitely my brotha.
    Thank you, you inspire me to be greater.


    August 24, 2019 at 6:54 am

    Apart from the science the dog 🐕 is cute.

  4. joshua fiddy allen

    August 24, 2019 at 7:04 am

    very good Gibraltar! my mans

  5. Blacktommer35

    August 24, 2019 at 11:12 am

    At 5:48 there's a typo that says spagehtti instead of spaghetti

  6. Nawab Khan

    August 24, 2019 at 11:30 am

    amzing sir keep it up and we are waiting for your more tutorials…

  7. Dylan Parker

    August 24, 2019 at 12:44 pm

    so, they got everyone to do their work for nothing? how, er… ingenious

  8. Thomas Chow

    August 24, 2019 at 12:51 pm

    why can't the neurons within each hidden layer interact with each other? For example, if a neighbour neuron got a high number, that'd make another neuron (of the same layer) act differently. is that arrangement helpful or just unnecessarily complicating things?

  9. Admiral Firespammer

    August 24, 2019 at 12:52 pm

    3.2M pics? Isn't that the number of pics some boorus have?
    So…. Where are the neural networks to automatically tag big tiddies anime girls? The cyberpunk that humanity deserves

  10. dejohnny2

    August 24, 2019 at 1:48 pm

    Jabrill, you hit a home run with this video. 5 stars dude!

  12. werothegreat

    August 24, 2019 at 4:56 pm

    You have a very mellow, soothing voice. Just wanted to say that!

  13. Simple Skills

    August 24, 2019 at 6:03 pm

    Crash Course is the place to be. I expecially love this series. This channel has inspired me to create my own channel. It is new and I would love to get some support/guidance on how to improve.

  14. Riccardo Cacchioli

    August 24, 2019 at 6:31 pm

    Is he there because he is black or because he is competent?

  15. W0lfbane Shika

    August 24, 2019 at 6:57 pm

    Jabril: Hey, I'm y'bro!
    Yeah you are! Ahh I'm kidding it just sounds very like it.

  16. Richard Joyce

    August 24, 2019 at 9:06 pm

    And soon the governments will have all your secrets and be able to target children with no famalies so their pedophilic ways can continue, for evedence i site jimmy savile, geoffrey epstine and the fact that both these people were known to be actively molesting children and were protected and enabled by the british and american governments respectively, and they are just the most high profile cases, there are littaraly thousands worldwide being protected right now, this technology IS going to be abused and it IS going to be used to clamp down on your rights and freedoms and enable the governments round the world to silence those of us who speak out about the atrocities and abuses they enable and commit, this technology is allready been put to use in china to target people the government has issues with and the u.s. and the u.k. are not far behind them, welcome to your new orwellian future, you were warned and chose not to act, you only have yourselves to blame to enable violence through inaction is the worst form of violence, the people in power have been conditioned not to care for others from birth, whats your excuse?

  17. Gabe Darrett

    August 24, 2019 at 9:54 pm

    What about quantum computers?

  18. Patrick Vp

    August 25, 2019 at 2:16 am

    Please keep these ML videos coming. This is good, make as many as you possibly can. I cannot express how refreshing this series is, despite being only 3 episodes in. I've been waiting for CrashCourse to make a video series like this for a long time.

  19. Harshal Sojitra

    August 25, 2019 at 11:02 am

    Miss this line "yo guys jabrils here!"

  20. vishal Ghulati

    August 25, 2019 at 12:06 pm

    Hey! Jabril, want to thank you for such an informative and easy to comprehend lecture. But the only thing is that I didn't get that gist of the math imagery. Could you help me out?

  21. Miguel Eduardo Alastre, PMP

    August 25, 2019 at 1:38 pm

    This is awesome. Great big picture about AI
    spaghetti 5:48

  22. Sourav Goyal

    August 25, 2019 at 2:40 pm

    Man your voice is so dull. No excitement for topic and plane.

  23. Neo Mashego

    August 25, 2019 at 5:53 pm

    🥯==🍩? true:false;

  24. Matt Kuhn

    August 25, 2019 at 9:29 pm

    Oh man, I can't wait to see you guys talk about gradient descent! Great job so far!

  25. Josh Jones

    August 26, 2019 at 8:05 pm

    Could you guys do an African American History Crash Course?

  26. Eli Hinze

    August 26, 2019 at 9:11 pm

    I need you to narrate an audiobook. Your voice is so soothing..

  28. evanreidel22

    August 28, 2019 at 3:19 am

    I'm surprised you didn't decide to voiceover your Crash Course as well 🙂

  29. Sordatos Cáceres

    August 29, 2019 at 12:09 am

    What was the guy saying "nope, nope" with his head an closing the laptop, looking at ?

  30. Knack

    August 29, 2019 at 2:56 am

    Just remember, the brain perceives things through a series of guesses. so with billions of neurons doing complex statistical analysis, nobody is as bad at math as they think 🙂

  33. Danilego

    August 31, 2019 at 5:39 pm

    I just love it when John Green Bot completely ignores what Jabril says and does something random lol

  34. Олег Козлов

    September 2, 2019 at 7:12 pm

    World isn't just bagels and donuts. Sometimes it's bees and threes.

  35. That girl from hell that kidnaps bad people.

    September 3, 2019 at 6:53 am

    > Answers simple question correctly
    > Moonwalks away
    John green bot is basically all 1st grade elementary boys ever.

  36. lincoln pepper

    September 3, 2019 at 3:50 pm

    single best explanation of neural networks I've seen.

  37. αทα нєlєทα ρrατα

    September 6, 2019 at 10:01 pm

    fantastic video

  38. amanatee27

    September 7, 2019 at 8:37 pm

    This is a great series, thank you all for taking the time to make it! For future videos, could Jabril's audio be turned up just a bit more? Sometimes, the end of his sentences get quieter and it's harder to catch all the info. Thank you!

  39. bauke

    September 8, 2019 at 5:47 pm

    Great series! Too bad the tape that goes into John Greenbot closes as a cd player tray in stead of a tape recorder 😉

  40. directfunebru

    September 14, 2019 at 9:51 am

    This is a really great series. *takes notes*

  41. juliant eko

    September 16, 2019 at 10:55 am

    Yup like hmmm I hate dog but I love cat

  42. MEDlC

    September 16, 2019 at 9:02 pm

    Now i want to learn to program. Ty for this

  43. Feyisayo Olalere

    September 24, 2019 at 9:56 am

    Could you do a video comparing how the early visual processing /feedforward process works in the brain? thanks for being cool!

  44. CakeZzi

    September 28, 2019 at 12:06 am

    Deeper nueral networks with deeper layers? That's deep

  45. Lia S.

    October 8, 2019 at 9:04 pm

    why red green and blue and not red yellow and blue (or magenta, yellow and cyan blue)? and how do the neurons "distribute tasks"? how do they "decide" which neuron of the hidden layers focuses on what?
