Lesson 1: Practical Deep Learning for Coders


Rachel and I started fast.ai with the idea
of making neural networks uncool again. It is a grand plan, to be sure, because they
are apparently terribly cool. But really there are some things that we want
to see improve and that’s why we’re doing this course. We’re actually not making any money out of
this course. We’re donating our fees both to the diversity
fellowships that we’re running and also to the Fred Hollows Foundation. I would like to briefly give a quick pitch
to the Fred Hollows Foundation (for those of you who aren’t aware of it), because as
you know, deep-learning is fantastic for computer vision; it’s basically allowing computers
to see for the first time. What you might not realize is that there are
something like 3 or 4 million people in the world who can’t see because they have something
called “cataract blindness”. Cataract blindness can be cured for $25 per
eye and actually the group of people who got that price down from thousands of dollars
to $25 was Fred Hollows, who was the Australian of the Year some years ago. He’s passed away now, but his legacy is in
this foundation where if you donate $25, you are giving somebody their sight back. So, as you learn to teach computers how to
see, Rachel and I are also donating our fees from this to helping humans to see. So we think that’s a nice little touch. So we’re doing this both to help the Fred
Hollows Foundation but more importantly to help something we care a lot about, which
is making deep-learning more accessible. It is currently terribly exclusive. As I’m sure you’ve noticed, resources for
teaching it tend to be quite mathematically intensive — they really seem to be focused
on a certain kind of ivory-tower-type audience, so we’re trying to create training and examples which are for people who are not machine-learning or math experts, dealing with small data sets, and giving you applications you can develop quickly. Today we’re going to create a really useful
piece of deep-learning code in seven lines of code. We want to get to the point where it is easy
for domain experts to work with deep-learning. There are a lot of domain experts here — whether
you’re working with getting satellites in the air, or whether you’re working with analyzing
the results of chemical studies, or whether you’re analyzing fraud at a bank — all those
people are here in this audience. You are domain experts that we want to enable
to use deep-learning. At this stage, the audience for this course
is coders because that’s as far as we think we can get at this point. We don’t need you to be a math expert, but
we do need you to be coders. I know that all of you have been told of that
prerequisite. We do hope that with your help we can get
to the point where non-coders will also be able to participate in it. The reason why we care about this is that
there are problems like improving agricultural yields in the developing world, or making
medical diagnostics accessible to folks that don’t have them or so forth. These are things that can be solved with deep
learning. But they are not going to be solved by people
who are at these kind of more ivory tower firms on the whole because they are not really
that familiar with these problems. The people who are familiar with these problems
are the people who work with them every day. So for example, I’ve had a lot to do with
these kinds of people at the World Economic Forum, I know people who are trying to help
cure TB and malaria, I know people who are trying to help with agricultural issues in
the developing world and so forth. These are all people who want to be using
deep-learning for things like analyzing crop imagery from satellites, or my most recent
start-up, which was analyzing radiological studies using deep-learning to deal with things
like the fact that in the entire continent of Africa there are only seven pediatric radiologists. So most kids in Africa, and in fact all kids in most countries, have no access to any radiologists and no access to any kind of modern image-based
medical diagnostics. So these are the reasons that we’re creating
and running this course. We hope that the kind of feel with this community
is going to be very different than the feel of deep-learning communities before,
that have been all about “Let’s trim 0.01% off this academic benchmark.” This is going to be all about “Let’s do shit
that matters to people as quickly as possible.” [Time: 5 minute mark] Sometimes to do that we’re going to have to
push the state-of-the-art of the research. And where that happens, we won’t be afraid
to show you the state-of-the-art of the research. The idea is that by the end of Part 1 of this,
you will be able to use all of the current best practices in the most important deep-learning
applications. If you stick around for Part 2, you’ll be
at the cutting edge of research in most of the most important research areas. So, we are not dumbing this down; we’re just
re-focusing it. The reason why we’re excited about this is
that we now have the three pieces of this universal learning machine. We now have the three critical pieces: an infinitely flexible function, all-purpose parameter fitting, and a way to do that fitting which is fast and scalable. The neural network is the function. We are going to learn exactly how neural networks
work. But the important thing about a neural network
is that they are universal approximation machines. There’s a mathematical proof, the Universal
Approximation Theorem, that we’re going to learn all about which tells us that this kind
of mathematical function is capable of handling any kind of problem we can throw at it. Whether that mathematical function is “How
do I translate English into Hungarian”, or whether that mathematical function is “How
do I recognize pictures of cats”, or whether that mathematical function is “How do I identify
unhealthy crops”. It can handle any of these things. So with that mathematical function, the second thing you need is some way to fit the parameters of that function to your particular need. And there’s a very simple way to do that, called “gradient descent”, and in particular something called “backward propagation” or “back-prop”, which we will learn all about in this lesson and the next lesson. The important thing, though, is that these two pieces together allow us to start with a function that is in theory capable of doing everything and turn it into a function that is in practice capable of doing whatever you want to do, as long as you have data that shows examples of what you want to do.
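To make that concrete, here is a minimal sketch (not code from the lesson) of gradient descent fitting the simplest possible function, a straight line, to some made-up data, by repeatedly nudging the parameters in the direction that reduces the error:

```python
# Minimal gradient descent sketch: fit y = a*x + b to noisy synthetic data.
import numpy as np

np.random.seed(0)
x = np.random.rand(100)
y = 3 * x + 2 + np.random.randn(100) * 0.1   # the "data that shows examples"

a, b = 0.0, 0.0          # start from arbitrary parameters
lr = 0.1                 # learning rate: how big a step we take each time
for _ in range(1000):
    pred = a * x + b
    err = pred - y
    # gradients of the mean squared error with respect to a and b
    grad_a = 2 * np.mean(err * x)
    grad_b = 2 * np.mean(err)
    a -= lr * grad_a     # step downhill
    b -= lr * grad_b

print(a, b)              # ends up close to the true values 3 and 2
```

A neural network is just a much more flexible function than a straight line, and back-prop is the bookkeeping that computes those gradients efficiently for it.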
The third piece, which has been missing until very recently, is being able to do this in a way that actually works with the amount of data that you have in the time you have available. And this has all changed thanks particularly to GPUs. So GPUs are Graphics Processing Units, also
called “video cards” (that’s kind of an older term now), also called “graphics cards”. And these are devices inside your computer
which were originally designed to play computer games. So it’s kind of like: when you’re looking at
this alien from the left-hand side and there’s light coming from above, what pixel color
do I need for each place. That’s basically a whole bunch of linear algebra
operations, a whole bunch of matrix products. It turns out that those are the same operations
we need for deep-learning. So because of the massive amount of money
in the gaming industry that were thrown at this problem, we now have incredibly cheap,
incredibly powerful cards for figuring out what aliens look like. And we can now use these, therefore, to figure
out medical diagnostics in Africa. So, it’s a nice, handy little side-effect. GPUs are in all of your computers, but not
all of your computers are suitable for deep-learning. And the reason is that programming a GPU to
do deep-learning really requires a particular kind of GPU, and in practice at the moment,
it really requires a GPU from Nvidia, because Nvidia GPUs support a kind of programming
called CUDA (which we will be learning about). There are other GPUs that do support deep-learning,
but they’re a bit of a pain, they’re not very widely used. And so one of the things that we’re going
to be doing is making sure that all of you guys have access to an Nvidia GPU. The good news is that in the last month (I
think) Amazon has made available good-quality Nvidia GPUs for everybody for the first time. They call them very excitingly their P2 instances. So I’ve spent the last month making sure that
it’s really easy to use these new P2 instances. I’ve given you all access to a script to do
that. Unfortunately, we’re still at the point where
they don’t trust people to use these correctly, so you have to ask permission to use these
P2 instances. [Time: 10 minute mark] The Data Institute folks, for anybody who
does not have an AWS P2 instance or their own GPU server, they are going to collect
all of your AWS IDs, and they have a contact at Amazon who will go through and get them
all approved. They haven’t made any promises, they’ve just
said they will do what they can. They are aware of how urgent that is, so if
you email your AWS ID to Mindy, she will get that organized. And we’ll come back and look at AWS in more
detail very shortly. The other thing that I have done is on the
wiki I have added some information about getting set up, Installation. There is actually quite an interesting option
called OVH. I’m sure by the time that this is a MOOC there
will be a lot more, but this is the only company I’ve come across who will give you a by-the-month
server with decent deep-learning graphics cards on it, and it’s only $200. To give you a sense of how crazily cheap that
is, if you go to their page for GPU servers, you’ll see that this GTX970 is $195 per month
and their next cheapest is $2000 a month. It just so happens that this GTX970 is ridiculously
cheap for how good it is at deep-learning. The reason is that deep-learning uses single-precision
arithmetic — it uses less accurate arithmetic. These higher-end cards are designed for things
like fluid simulations, tracking nuclear bombs and stuff like that, that require double-precision
arithmetic. So it turns out these GTX970s are only good
for two things, games and deep-learning. So the fact that you can get one of these
things which has got two GTX970s in it is a really good deal. So one of the things you might consider doing
in your team is maybe sharing the cost of one of these things. $200 per month is pretty good compared to
worrying about starting and stopping your 90 cent per hour AWS instance, particularly
if AWS takes a while to say yes. How many of you people have used AWS before? Maybe a third or a half. AWS is Amazon Web Services. I’m sure most of you, if not all of you, have
heard of it. It’s basically Amazon making their entire
back-end infrastructure available to everybody else to use. Rather than calling it a server, you get something
they call an instance. You can think of it as basically being the
same thing. It’s a little computer that you get to use. In fact, not necessarily little. Some of their instances cost $14 or $15 an
hour and give you like 8 or 16 graphics cards and dozens of CPUs and hundreds of gigabytes
of RAM. The cool thing about AWS is that you can do
a lot of work on their free instance. You can get a free instance called a T2.micro
and you can get things set up and working on a
really small dataset and then you can switch it across if you want to then run it on a
big dataset, switch it across to one of these expensive things and have it run and finish
in an hour or two. So that’s one of the things that I really
like about AWS. Microsoft also has something a lot like AWS
called Azure. Unfortunately, their GPU instances are not
yet publicly available. I’ve reached out to Microsoft to see if we
can get access to those as well, and I’ll let you know if we hear back from them. One of the things that Rachel has done today
is to start jotting down some of the common problems that people have found with their
AWS installs. Getting AWS set up is a bit of a pain, so
we’ve created a script that basically will do everything for you. But the nice thing is that this script is
very easy for you to have a look at and see what’s going on, so over time you can kind
of get a sense of how AWS works. Behind the scenes, the script is using their command-line
interface, or CLI, which we’ve given you instructions on how to install. [Time: 15 minute mark] As well as using the CLI, you can also go
to console.aws.amazon.com and use this graphical interface. In general, I try to avoid using this graphical
interface because everything takes so much longer and it’s so hard to get things to work
repeatedly. But it can be nice to kind of look around
and see how things are put together. Again, we’re going to come back and see a
lot more about how to use the graphical interface here, as well as how to create and use scripts. So these are some of the pieces that we want
to show you. I wanted to talk a bit more before we go into
more detail about some of the interesting things that we’ve seen happening in deep-learning
recently. And perhaps the thing that I’ve found most
fascinating recently was when one of the leading folks at Google Brain presented this at a
conference at Stanford, which showed the use of deep-learning at Google. And you can see from this is just 2012 to
today, or maybe two months ago, it’s gone from nothing to over 2500 projects. Now the reason I find this interesting is
that this is what is going to start happening to every organization and every industry over
the next few months and few years. So they’ve kind of described how deep-learning is getting used pretty much everywhere at Google, and you can imagine that if they redid this now,
two months later, it’s probably going to be somewhere up here. So we’ve kind of felt that it would be great
to kind of kick-start lots of other organizations to start going up this ramp. That’s another kind of reason we’re doing
this. I really like looking at applications, and we’ve started seeing some examples of applications from deep-learning amateurs — this is an example of it. What these guys did (they’re not machine-learning or deep-learning experts) is they downloaded a copy of Caffe and ran a pre-existing model. This is what we’re going to learn to do today. We’re going to run a pre-existing model and
use the features from that model to do something interesting. In their case, the thing that they were doing
that was interesting was to take data that they already had, because they are skin lesion
people and analyze skin lesions. These are the different kind of skin lesions
that you can have. They found, for example, that the previous
best for finding this particular kind of skin lesion was 15.6% accuracy. When they did this off-the-shelf Caffe pre-existing model with a simple linear thing on top, they quadrupled it to 60%. Often when you take a deep-learning model and use the very simple techniques we’ll learn today, you can get extraordinary results compared
to non-deep-learning approaches. Another example of that was looking at plant
diseases; there have been at least two groups that have done this in the last few months. Again, very successful results from people
who are not deep-learning or machine-learning experts. Similar results in radio modulation. These folks who are electrical engineering
people found that they could double the effective coverage area of phone networks (this is a
massive result), and again they used very simple approaches. It’s being used in fashion, it’s being used
to diagnose heart disease, and by hedge-fund analysts. There’s a particular post which I found really inspiring actually in trying to put this together: the author of Keras (which is the main library we’ll be using) put together this post showing how to build powerful models
using very little data. I really just want to give a shout-out to
this and say that this work that Francois has been doing has been very important in
a lot of the stuff we’re going to be learning over the next few classes. [Time: 20 minute mark] The basic environment that we’re going to
be working in most of the time is the ipython notebook or the jupyter notebook. Let me just give you a sense of what’s going
on here. When you have a jupyter notebook open, you
will see something which … This is a good time to show you about starting
and stopping AWS instances. So I just tried to start going to my notebook
on AWS and it says it can’t be reached. So my guess is if we go back to my console
you can see that I have zero running instances – I have zero servers currently running. So if I click that, I will see all my servers. Normally I would have one P2 server (or instance)
and one T2, because I use the free one for getting everything set up and then use the
paid one once everything’s working. Because I’ve been fiddling around with things
for this class, I just have the P2 at the moment. So, having gone here, one way I could start
this is by selecting Start here, but I don’t much like using this GUI for stuff because
it’s so much easier to do things through the command line. So one of the things that I showed you guys that
you could download today is a bunch of aliases that you can use for starting and stopping
AWS really quick. If you haven’t got them yet, you can find
links to them on Slack, or you can just go to platform.ai/files and there’s a bunch of
different things here. This aws-alias.sh is a file that sets up these
various aliases. The easiest way to grab stuff on your AWS
instance or server is to use wget, so I would right-click on this and choose CopyLinkAddress,
and then go wget and paste in that and that will go ahead and download that file (I already
had one, so it created a copy of it). We can take a look at that file, and you’ll
see that it’s basically a bunch of lines that say “alias something=somethingElse”. And it’s created aws-get-p2, aws-get-t2, aws-start,
aws-ssh, aws-stop. I’m going to show you what these things do
because I find them pretty convenient. First of all, I’ll say “source aws-alias.sh”
and that just runs this file (in bash, that’s how you run a file). That’s now caused all of those names to appear
as aliases to my system. So if I now run aws-get-p2, that’s going to
go ahead and ask Amazon for the ID of my P2 instance. And not only does it print it, but it’s going
to save it into a variable called “instanceId” and all of my other scripts will use $instanceId. So I now want to start that instance, so I
just type aws-start and that’s going to do the equivalent thing of going to the GUI,
right-clicking, choosing Start. The other nice thing it does is it waits until
the instance is running and at the end it queries for the IP address and
prints it out. Now the script that I have given you to set
up these instances actually uses an elastic IP that actually keeps the same IP address
every time you run it. So you should find that the IP address stays
the same, which makes it easier, so there is the IP. So I then have something called aws-ssh, and
aws-ssh will go ahead and ssh into that instance (ssh ubuntu@$instanceIp). So all it does is basically use the username “ubuntu” (because that’s the default username for this kind of image on AWS) at $instanceIp
(that’s the IP address we just got). [Time: 25 minute mark] The other thing it does is to use the private
key that was created when this was originally set up. Now in my case, I’ve actually moved that private key to be my default key, so I don’t actually need that -i option. But you can just type aws-ssh and you’ll see, bang, here we are. We are now inside that AWS image.
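For reference, here is a rough Python sketch of the same get-instance / start / get-IP workflow using the boto3 library. This is not what aws-alias.sh does (the aliases wrap the AWS CLI), and the p2.xlarge filter and the response indexing here are assumptions:

```python
# Hypothetical boto3 equivalent of aws-get-p2 / aws-start / aws-ip.
import boto3

ec2 = boto3.client('ec2')

# aws-get-p2: look up the ID of the P2 instance
resp = ec2.describe_instances(
    Filters=[{'Name': 'instance-type', 'Values': ['p2.xlarge']}])
instance_id = resp['Reservations'][0]['Instances'][0]['InstanceId']

# aws-start: start it and wait until it is running
ec2.start_instances(InstanceIds=[instance_id])
ec2.get_waiter('instance_running').wait(InstanceIds=[instance_id])

# aws-ip: query the public IP address so we can ssh in as "ubuntu"
resp = ec2.describe_instances(InstanceIds=[instance_id])
print(resp['Reservations'][0]['Instances'][0]['PublicIpAddress'])
```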
One of the handy things about AWS is they have this thing called AMIs, Amazon Machine Images. An AMI is basically a snapshot of a computer
at a particular point in time. And you can start your own instance using
a copy of that snapshot. So in the script I’ve given you guys I’ve
created and provided an AMI which has all the stuff we want installed. So that’s why, when you use that script
and log in to it, you can start running things straight-away. So let’s do that right now. I’ve already created a directory for you called
“nbs”, for notebooks. So we can go ahead and type “jupyter notebook”,
and this is how we ask Amazon to set up a jupyter notebook server for us. When it’s done, it says “The Jupyter Notebook
is running at: http://[all ip addresses on your system]:8888”. So what is our IP address? Well, it told us up here when we started it,
52.40.116.111. So I’m going to go to my instance, 52.40.116.111:8888
(it told me that the port is 8888), and press Enter. I’ve set up a password, it is just “dl_course”. We can look later on at how to change that
password, but I thought it would be handy to have a password there for everybody if you
want to start looking at your own data. Actually, by default it is not going to show
you anything. So now we can just go ahead and say New->Notebook
and choose Python[condaRoot], and this sets up a scientific computing environment for
you, where you can type Python commands and get back responses. The basic idea here is that over there on
Amazon, you have your server, and it’s running a program called jupyter notebook. Jupyter notebook is causing a particular port
(which is 8888) to be opened on that server, where if you access it, it then gives you
access to this jupyter notebook environment. In your team, you guys can all use the same
jupyter notebook if you want to. Or you could run multiple jupyter notebooks
on one machine. It is really pretty flexible. So now that I’ve created one, I can rename
this to say, “jeremy’s nb”. And so then Rachel might come along and be
like I want to run something as well, so she goes New on her computer and it creates a
whole new one over here. And she could say File->Rename and call it
“rachel’s notebook”. If I now go back here, you can see both of
these notebooks are shown to be running. So the server is running multiple kernels,
they’re called. And you can see back here it’s saying, “Creating
new notebook” … “Kernel started …” . So each of those are totally separate. So from one of them, I say “name=rachel” and
in the other one I say “name=jeremy” and over here I say “name”, you’ll see that they are
not in any way talking to each other, they are totally separate. So that’s a super-handy way to do work and
the other nice thing is that you can not just type code, but you can also type Markdown. [Time: 30 minute mark] So I could go, “New Section”, “I want to talk
about something here”. And so that as I do that, it allows me to
mix-and-match information and code. And every piece of code that came out, I can
see where it came from. And, also as you’ll see, it allows us to put
in visualizations and plots and so forth. Some of you may have come across this important
concept called Literate Programming. And Literate Programming is the idea that
as you code, you are documenting what you are doing in a very deep way, not just for
others, but maybe more importantly for yourself. So when you’re doing data science work, work
like a scientist. How many people here are in some form scientists,
or have been scientists? So you guys will know the importance of your
journal notebook. The greatest scientists, there are all sorts
of stories about the kinds of notebooks they kept and how their lab notebooks or their
lab journals worked. This is critical for data scientists too. The idea that as you do experiments, you’re
keeping track — what did I do, what worked, what didn’t work. I can see all the people who put their hands
up as scientists are all nodding right now. So this makes it super-easy to do that. So be helpful to yourself and to your team
by taking advantage of this. Now in order to learn to use this environment,
all you have to do is press H. And when you press H, it brings up all these keyboard shortcuts. After not very long, you will get to know
all of them, because they are all extremely useful. But the main ones I find particularly helpful,
is you hit M to turn into Markdown mode (that’s the mode where you can enter text rather than
code), or Y to switch it back to code again. And you certainly need to know [SHIFT]Enter,
which evaluates the cell and gives you a new cell to enter into, and you also need to know
Escape, which pops you out of entering information and gets you back into this command mode,
and then Enter to get back into enter mode again. And you see as I move around, it changes which
one is highlighted. I’ve started to create some resources on the
wiki to help you with jupyter notebook. It’s still really early, but you guys I’m
sure can help by adding more information here. One of the things I particularly mention is
that there are some good tutorials. I thought I had also mentioned my favorite
book, “Python for Data Analysis” by Wes McKinney. It’s a little old, it also covers Pandas a
lot (which you don’t need), but it’s a good book for getting familiar with this basic
kind of Python scientific programming. The last kind of ingredient I want to introduce
is Kaggle. How many people have been to or have done
anything with Kaggle at any point? Anybody who is in the masters program here
I’m sure will have used Kaggle or will shortly use Kaggle. Mainly because it’s a great place to get all
kinds of interesting data sets. So for example, if you wanted to test your
ability to automate drug discovery, you could go to Kaggle and download the files for the
Merck Molecular Activity Challenge, run some models and test them to see how they compare
to the state-of-the-art by comparing them to the leaderboard. Kaggle is a place where various organizations run machine-learning competitions; they generally run for about three months. It’s super-cool because they get archived,
potentially forever. You can download the data for them later on
and find out how you would have done in that competition. [Time: 35 minute mark] Generally speaking, if you’re in the top 50%
that means you have an okay-ish model that is somewhat worthwhile. If you’re in the top 20%, it means that you
have a very good model. If you’re in the top 10%, it means you’re
at the expert level for these type of problems. If you’re in the top 10, it literally means
you’re one of the best in the world because every time I’ve seen a Kaggle competition
(I used to be president of Kaggle, so I’m very familiar with it) at least the top 10
generally all beat the previous best in the world and generally are very good machine-learning
experts who are going beyond anything that’s been done before. It seems that the power of competition pushes
people way beyond what the previous academic state-of-the-art was. So Kaggle is a great environment to find interesting
data sets and to benchmark your approaches. So we’re going to be using it for both of
these purposes. Our first challenge will be Dogs vs Cats. Sometimes on Kaggle they run competitions
that are not done for lots of money, sometimes they are done for free or for a bit of fun. In this case, it was actually done for a particular
purpose which was can you create an algorithm that recognizes the difference between dog
photos and cat photos. The reason why was that this particular organization
was using that problem as a CAPTCHA, in other words, to tell the difference between humans
and computers. It turned out that the state-of-the-art machine
classifiers could score 80% accuracy on this task. So really this group wanted to know can you
surpass the state-of-the-art, is this a useful CAPTCHA. And then if you can surpass the state-of-the-art,
can they use this in a dogs vs cats recognizer for their pet finding work. So really the goal here was to beat 80%. Now this is a great example of the kind of
thing which you could use for a thousand million different purposes. For example, the work I did in cancer detection
is this — if you take a CT or an x-ray or an MRI and you say to a deep-learning algorithm,
these people have malignant cancer, these people don’t, then it’s the same as cats vs
dogs. If this is a healthy, high crop-yield area
from satellite photos, this area is not, then it’s the same as cats vs dogs. If you say, this is one kind of skin lesion
and this is another kind of skin lesion; if you say that this is an abstract art painting
and this is not; this is an extremely valuable painting and this is not; this is a well-taken
photo and this is not. They’re all image analysis problems that are
generally classification problems, and these are all examples of things that people have
done with this kind of technology. So cats vs dogs, it turns out, is a very powerful
format and so if we can learn to solve this well, we can solve all of these kinds of classification
problems. Not just binary, not just this group or that
group, but also things like that skin lesion example, these are 10 different types of skin
lesions, which type is it. Or the crop disease example, which of these
13 crop diseases are we looking at here. An example of an actual thing that I saw was
cucumber analysis. A Japanese cucumber farmer used this approach
to deep-learning, he automated all the logistics and had a system that would put different
grades of cucumbers into different bins automatically and make the cucumber workflow much more efficient. So, if that was your idea for a start-up,
it’s already been done, sorry … but there’s many more. There are all of our basic pieces. To get started, here we are with this AWS
server, with this pretty empty looking set of notebooks here, so we want to go ahead
and start getting some work done. To do that, we need to download the basic
files that we need. [Time: 40 minute mark] So I’ve sent you all of this information already,
all of the information you need is on our platform.ai website. All of the notebooks are in files/nb. I’m going to show you a cool little trick,
I’m going to press [CNTL]C twice, that shuts down the notebook, the notebook’s not running. Don’t worry, it saves itself automatically
on a regular basis, or you could just hit S to save it right now. After shutting down the notebook, as you’ll
see, the Python notebook files are still there. And you can see actually that behind the scenes,
they’re just big bunches of JSON text, so you can stick them in Github
and they’ll work pretty well. What I generally like to do is run something
called tmux. How many of you here have used tmux or screen
before? Those of you who haven’t, you’re going to
love this trick. Tmux and screen are programs that let you
run programs on your server, close your terminal, come back later and your program will still
be running in the exact same way. I don’t remember if tmux is already installed;
it is. To use it, you just go tmux, and it looks
like nothing happened, except a little green bar here at the bottom. But if I now hit tmux’s magic command, which
is [CNTL]B, and press [CNTL]B? (control-B-?), you can see there are lots
of keystrokes that tmux has ready for me to use. And so one of the ones I like is [CNTL]B”
(control-B-doubleQuote), which creates a second window underneath this one, or [CNTL]B% (control-B-percent),
which creates a second window next to this one. I seem to like to set up a little tmux session
and get it all set up the way I want. So I’m not going to go into detail about how
to do everything I show you, what I really want to do in the class to make the most of
the time, is to say here’s something that exists, here’s something I recommend you use, here’s
what it’s called, and during the week, you can play with it. You can ask questions, you can use it in your
team, and so forth. So here it is, it’s called tmux, this is what
it does, and I’ll show you something cool. If I now go [CNTL]B and then d for detach,
and close out of this altogether, it’s all gone. So if I now go back into my server … I wasn’t able to return to my session properly because currently $instanceIp is not defined. Rather than every time I start, sourcing my
aws-alias.sh file, what I should do is go “vim .bashrc” (.bashrc is a file that is run
every time you run bash), and if I edit my .bashrc file and at the end I type “source aws-alias.sh”, you can see
now all the aliases are there. So before I ssh in, I have to
find out my correct IP address. So I can say aws-get-p2 to get my instance
ID. I’m not sure I have something here to just
get the IP address. As you can see, I’m kind of playing with this
as I go. So I’m going to go ahead and show you how
to do this. Right now, the IP address only gets printed
out when I start an instance. [Time: 45 minute mark] In this case, I’ve already got an instance
running. I’m going to edit this script and I’ll change
it later on. But basically I’m going to create a new alias
called aws-ip, and I’m just going to keep the bit that says instanceIp=somethingSomethingSomething. I then source aws-alias.sh, and I’ve now got
a new alias called aws-ip, and now I can go ssh ubuntu@$instanceIp. Having said all that, because my IP address
is going to be the same every time and I couldn’t really be bothered waiting for all that, I’m actually going to manually put my IP address in here, so that the next time
I run this I can just press upArrow and rerun the command. I’m kind of showing you lots of ways of doing
things so that you can kind of decide what your own workflow is like, or come up with
better ones. So here’s a cool thing, I am back in my box
here, and then if I say “tmux attach”, I am exactly back to where I came from. Whatever I had running, whatever state it
was, it is still sitting there. The particularly cool thing is that any notebooks,
the kernels I had running, they are all still sitting there. This is particularly helpful if you are running
an OVH server, or one of your own servers. With AWS, it is a little less helpful because
you really need to shut it down to avoid paying the money. But if you’ve got something you can keep running. For all the USF students, you all have or
will have access to the GPU server we have here at the University, particularly helpful
for you guys. So I actually tend to use this little bottom
right hand window to permanently have jupyter notebook running and I tend to use this left
hand window to do other things. In particular I am going to go and grab my
notebook. The easiest way to grab things is to use wget,
and if I go “wget http://www.platform.ai/files/nbs/lesson1.ipynb”, I now have a notebook, lesson1 notebook. And so if I go back to my jupyter notebooks,
it is here and if I click on it and here is our notebook. If you’re using a T2 instance (the free one)
generally speaking particularly the first time you run something, it could take quite
a long time to open. You should find the second time is fast, by
the way. So here is our notebook. Hopefully quite a few of you have already
gotten to the point today that you can see this. Those of you that haven’t will get plenty
of help during the week. This particular notebook uses two external
scripts to help. Those scripts are called utils and vgg16. The last thing to do before our break is to
grab those (wget), just toss those all in the notebook directory so they’re all in the
same place. Then unzip them. Then the only other thing that you need is
the data. The data sits in the platform.ai/data directory. The data is all the dogs and cats. Now I’ve taken the Kaggle data and made some changes
to it, which I’m going to be showing you. So rather than downloading it from Kaggle,
I suggest you grab it from platform.ai, and I’ve sent you this information today as well. So I’m going to cd into the data directory
and wget dogscats.zip as well. So that’s going to run for
a few minutes. [Time: 50 minute mark] The previous section for some of you was a
bit of a fire hose of information, here’s bash, here’s AWS, here’s Kaggle, here’s GPUs. And for some of you
it was probably really boring, most practicing data scientists probably are using all of
these things already. If you’re at one extreme (holy shit that was
a fire hose of information), don’t worry, we have all week to get through it. We’ll have the video tomorrow. And by the time that you’re here again next
week, I want to make sure that everybody who has the time and interest to work hard on
it has got through all the material. If you haven’t, like it’s early in the weekend
and you’re not going to get there, please let Rachel and I know. We will work with you in person to get you
there. Everyone who puts the time in, I’m determined
to make sure can get through the material. If you don’t really have the background and
you don’t really have the time, that’s fine. Maybe you won’t get through all the material. But I really am determnied that everybody
who’s prepared and able to put in the time can get through everything. So between the community resources, and the
video, and Rachel and I, we will help everybody. To those of you who are practicing data scientists
and you are familiar with all of these pieces, I apologize that it will be a bit slow for
you and hopefully as we move along there will be more and more new stuff. I’m kind of hoping that for those of you that
have some level of expertise, we will continually give you ways that you can go further. So for example, at the moment, I’m thinking,
can you help us with these scripts, to make them better, to make them simpler, to make
them more powerful, to create Azure versions of them. All this stuff that we’re doing to try and
make deep-learning as accessible as possible, can you help contribute to that, can you contribute
to the wiki. So for those of you that already have a high
level of expertise, I’m looking to make sure there’s always ways to push yourself. So if you’re ever feeling a bit bored, let
me know and I’ll try to give you something to do that you don’t know how to do, and then
you won’t be bored anymore. So at this point, I downloaded dogscats.zip
and I unzipped it (unzip -q dogscats.zip). If you are wondering about the “-q”, it is
just because otherwise unzip prints out every filename as it goes, so “q” is for quiet. So just about the most important thing for
doing this kind of image classification is how the data directories are structured. In particular, you’ll notice that we have
a training set, and a test set. That’s because when we downloaded the data
originally from Kaggle, it had a train.zip and a test.zip. Keras, which is the library we’re going to
use, expects that each class of objects that you’re going to recognize is in a different
directory. So the one main thing I did after I downloaded
it from Kaggle is that I created two directories, one called cats and one called dogs, put all
the cats in the cats directory and all the dogs in the dogs directory. When I downloaded them from Kaggle, they were
all in one directory and they were called cat.1.jpg and dog.1.jpg. (ls train/dogs/dog.1*)
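For reference, this directory convention is what Keras’s own image iterator reads; the vgg16 helper we’ll use wraps a call roughly like the following (the parameter values here are just illustrative):

```python
# Each subdirectory of train/ automatically becomes one class.
from keras.preprocessing.image import ImageDataGenerator

gen = ImageDataGenerator()
batches = gen.flow_from_directory('data/dogscats/train',
                                  target_size=(224, 224),  # VGG expects 224x224 inputs
                                  class_mode='categorical',
                                  batch_size=64)
print(batches.class_indices)   # e.g. {'cats': 0, 'dogs': 1}
```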
So now if I run “ls -l train/dogs/ | wc -l; ls -l train/cats/ | wc -l”, there are 11,500 dogs in there and 11,500 cats in there, so that’s
the number of dogs and cats we have in our training set. [Time: 55 minute mark] So for those of you that haven’t done much
data science before, there’s this really key concept that there’s a training set and a
test set. Kaggle, being a competition, makes this really
obvious, the files in the training set tell you what they are; here is a dog, it’s called
dog.something. But if I look in the test set, they don’t
say anything, they are just numbers. Why is that? That’s because your job in the Kaggle competition
is to say for example, file 43.jpg – is it a dog or is it a cat? So there are 12500 images in the test directory
for you to score, for you to classify. Even if you’re not doing a Kaggle competition,
you should always do this yourself. Ideally, you should get one of your colleagues
to do it without you being involved. Split the data into a test set and a training
set and not let you look at the test set until you’ve promised you’re finished. Kaggle kind of enforces this. They let you submit to the leaderboard and
find out how you’re going, but the final score is given based on a totally separate set of data that is not used for that public leaderboard. So this is like for me, before I started entering
Kaggle competitions, I thought my data science process was reasonably rigorous, but once
I really started doing competitions I realized that that level of enforcing the test/training
data set made me a much better data scientist, you know you can’t cheat. I do suggest that you do this in your own
projects as well. Now because we also want to tune our algorithm, in terms of different architectures and different parameters and so forth, it’s also a good idea to split your training set further, into a training set
and a validation set. You’ll see a lot more about how this works. But you’ll see in this case, I’ve created
another directory, called “valid”, which has dogs and cats subdirectories as well. Let’s check that they’re exactly the same. Here you can see that there are 1000 cats
and 1000 dogs. So when I originally downloaded from Kaggle,
there were 12500 cats and dogs in the training set. That’s why in my training set there are 11500
because I moved 1000 of each of them to the validation set. So that’s the basic data structure we have. Other than splitting things into test, training,
and validation sets (that’s the most important advice I have for data scientists), the second
most important piece of advice I have for data scientists is to do nearly all of your
work on a sample. A sample is a very small amount of data that
you can run so quickly that everything you try, you get a nearly immediate answer to. This allows you to very quickly try things,
change things, and you get a basic process running. So I always create a sample with 100 or so
items to just get started with. So you’ll see I have a directory called sample,
and in that I have a whole separate train and valid. I did not move things there, I copied them
there. The purpose of this sample directory is to
just let me do things really quickly. So, you’ll see inside sample/train, again I have cats and dogs directories, but this time there are 8 files in each directory. I probably should have put more in there. I think more like 100 may have been good, but I think at the time I was using a really low-power computer to do my testing, just enough to check that my script’s working.
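If you want to build the same valid/ and sample/ structure for your own dataset, something like the following sketch would do it. The paths and counts are assumptions; this is not the exact script used to prepare the course data:

```python
# Move 1,000 images per class into valid/, and copy a handful into sample/train/.
import os, shutil
from glob import glob
from random import shuffle

base = 'data/dogscats'
for cls in ('cats', 'dogs'):
    files = glob(os.path.join(base, 'train', cls, '*.jpg'))
    shuffle(files)

    # validation images are MOVED out of the training set
    valid_dir = os.path.join(base, 'valid', cls)
    if not os.path.exists(valid_dir):
        os.makedirs(valid_dir)
    for f in files[:1000]:
        shutil.move(f, valid_dir)

    # sample images are COPIED, so the full training set stays intact
    sample_dir = os.path.join(base, 'sample', 'train', cls)
    if not os.path.exists(sample_dir):
        os.makedirs(sample_dir)
    for f in files[1000:1008]:
        shutil.copy(f, sample_dir)
```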
Now that everything’s downloaded, you can see that my jupyter notebook has already loaded. I’ll get rid of the zip files and the notebooks
that I was just playing with, and we’re ready to get started doing some deep learning. [Time: 1 hour mark] The goal for you guys during this week will
be to replicate everything that I’ve done, initially just by making sure that this notebook
works for you, but then to replicate it with another data set. One of the things we’ll do tomorrow is post
some ideas of other interesting Kaggle data sets you can try, and maybe other people can
also post other interesting data sets they found elsewhere. The idea will be to make sure that during
the week, you can run your own classification process on some data set other than dogs and
cats. But first of all, make sure you can run this. So as you can see in this notebook, I’ve used
Markdown cells. How many people have used Markdown before? So, most of you. For those of you that don’t know, Markdown
is what we use both in the notebook, as well as on the wiki. It’s basically a way of really quickly creating
formatted text. There aren’t enough of you who are unfamiliar with it for me to go over it in detail. If you’re not familiar with it, please google
Markdown, and you can experiment with it either on the wiki or in your notebook. As you can see though, I’ve basically created
cells with headings and some text. During the week, you can read through things
in detail. As we mentioned, we’re going to try to enter
the dogs and cats competition. So 25,000 labeled dog and cat photos half
of each, 12,500 in the test set, and the goal is to beat 80%. As we go along, we are going to be learning
about quite a few libraries. Not too many, but enough that for those of
you that haven’t used Python for data science before, it’s going to seem like quite a bit. By the end of the seven weeks, hopefully you’ll
be pretty familiar with all of them. One of the really important three is matplotlib. Matplotlib does all of our plotting and visualization. And on the wiki, we have a section called
Python Libraries, and as you can see we have our top three listed up here. At the moment, there are just links to where
they come from. I’m hoping that you guys will help us to turn
this into a really rich source of information, about places that you’ve found lots of helpful
stuff, answers to questions. But for now, if you’re not familiar with one
of these things, type the word followed by “tutorial” into google and you’ll find lots
of resources. All of these things are widely used, Keras
a little bit less so because it’s just a deep-learning library and therefore relatively new. Numpy and matplotlib and scikit-learn and
scipy — there’s lots of books about them, there’s lots of tutorials about them. Matplotlib creates plots and one of the things
we need to do is to tell jupyter notebook what to do with these plots. Should it pop up a new window for them, should
it save them? So this “%matplotlib inline” says please show
our plots in the actual jupyter notebook. That’s pretty much the first line in every
jupyter notebook right here. And here’s the thing I told you about, which is sometimes I want to run stuff on a sample, and sometimes I want to run it on everything. So I make it really easy for myself by having a single thing called “path” which I can switch between the sample and the everything. So for now, let’s just do things on the sample; do all of your work on the sample until everything is working.
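So the top of the notebook looks roughly like this; the commented-out line is the only thing you change once everything is working and you are ready to run on the full dataset:

```python
# First notebook cell (IPython/Jupyter): show plots inline and set the data path.
%matplotlib inline

path = "data/dogscats/sample/"   # work on the small sample first
# path = "data/dogscats/"        # switch to this for the full dataset
```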
As you can see, each time I’ve done something, I’ve pressed [SHIFT][ENTER] and it says a particular number after the In (In [2]:),
so this is the second input cell that I’ve run. Like every programming language, a large amount
of the power of Python comes from the libraries that you use. To use a library in Python, you have to do
two things: you have to install it and then you have to import it. In Python, I strongly recommend that you use
a particular Python distribution called Anaconda. And if you’re using the scripts and the AMIs
we provided, you’re already using Anaconda. [Time: 1.05 hour mark] You can check which Python you’re using by
typing “which python” and it will show you. You’ll see that I’m not only using Anaconda,
but I’m using an Anaconda that was installed into my home directory. So no screwing around with sudo or any of
that business. If you use our AMI scripts, this is all being
done for you. With Anaconda, installing anything is as simple
as typing “conda install” and the name of the package. And on Anaconda, everything’s been precompiled,
so you don’t have to wait for it to compile, you don’t have to worry about dependencies,
you don’t have to worry about anything, it just works. That is why we very highly recommend using
Anaconda. It works on Mac, it works on Windows, and
it works on Linux. Lots of Windows users use it, very few Linux
users use it, very few Mac users use it. I think that’s a mistake because lots of Linux
and Mac users also have trouble with compiling dependencies and all that stuff. I suggest that everybody use it. From time-to-time, you’ll come across something
that does not have a conda installer available, in which case you’ll have to use pip instead. In our case, I think just Theano and Keras
are in that situation, but neither of those need compiling anything at all, so they’re
very, very easy to install. So, once you’ve installed it by typing conda
install whatever (and most things are already installed for you with our AMI), you then
have to tell Python that I want to use it in this particular session, which you do by
typing import and the thing you want to look at. So I’m not going to go through all these libraries
right now (I’ll go through them as we use them), but one of the big three is here, which
is numpy. Numpy is the thing which our wiki page describes
that provides all of our linear algebra. How many people here have some familiarity
at all with linear algebra? Nearly all of you, good. So, if you’re somebody who didn’t put up their
hand, I would suggest looking at the resources that Tara added. Go back to the homepage and go to Linear Algebra
for Deep Learning. Generally speaking for any math stuff, my
suggestion is to go to the Khan Academy site. Khan Academy has really great videos for
introducing these kind of simple topics. We just need to know these three things, mainly
just these first two things (matrix product, matrix inverse) for this course. Numpy is the thing that gives you these linear
algebra operations in Python, and as you’ll see it makes them extremely easy to use. Pretty much everybody renames numpy to np,
that’s what “import numpy as np” does. You’ll find in nearly everybody’s script on the Internet, it will be np.something. In general, we try to stick with the same kind of approaches that everybody else uses, so that nothing will be too unfamiliar.
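As a tiny illustration of those two operations using the usual np alias:

```python
import numpy as np

A = np.array([[1., 2.],
              [3., 4.]])
B = np.array([[5., 6.],
              [7., 8.]])

print(np.dot(A, B))      # matrix product
print(np.linalg.inv(A))  # matrix inverse
```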
Okay, so we’ve imported the libraries that we need. We also try to provide additional utilities and scripts for things we think ought to exist but don’t, to make things easier. There are very few of them. Nearly all of them are in a little script
called utils. There’s a cool little trick, if you are using
an external script that you’ve created and you’re changing it quite a bit. For example, now that you’ve got utils, feel
free to add and change and do what you like to it. If you import it like this, “import utils; reload(utils); from utils import plots”, you can go back and rerun that cell later after you’ve changed utils.py, and all of your changes will be there, available for you to use.
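Here is that import pattern as its own cell (note that on Python 3 reload is not a builtin; it lives in importlib):

```python
# Re-running this cell picks up any edits to utils.py without restarting the kernel.
# (Python 3: from importlib import reload)
import utils
reload(utils)
from utils import plots
```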
For now, we’re just going to use one thing from our utils library: plots. So, our first step will be to use a pre-trained
model. What do we mean by a pre-trained model? What we mean is that somebody has already
come along and downloaded millions of images off the Internet and built a deep-learning
model that has learned to recognize the contents of those images. Nearly always when people create these pre-trained
models they use a particular dataset, called ImageNet. One of the key reasons that they tend to use
ImageNet is because ImageNet has the most respected annual computer vision competition. Nowadays people that win the ImageNet
challenge tend to be companies like Google and Microsoft. A couple of years ago, it tended to be people
who immediately got hired by Google and Microsoft. [Time: 1.10 hour mark] ImageNet itself is fun to explore. If you go to ImageNet and go to Explore (image-net.org/explore),
you can check it out. Basically, there are 32,000 categories. So, for example, you could go to ImagNet and
look at plant->crop->field-crop->field-corn->dent-corn. So here we have a number of pictures of dent
corn; there are 397 of them. The folks that create these pre-trained networks
basically download a large subset of ImageNet, the competition has 1000 of these 32,000 categories
that people compete on. So nearly always people just build models
for these 1000. I would be remiss if I did not mention the
shortcomings of the ImageNet dataset. Can anybody tell me something that they notice
in common about what these photos look like or how they are structured? They’re just one thing. Like if you look at an arbitrary photo from
my photo album, you’ll see there’s a person here and a bridge there, and something else
here. ImageNet is carefully curated, for flint corn
there are 312 really good pictures of flint corn, whatever that is. This is an easier problem than many problems
that you will be facing. For example, I was talking to Robin from Planet
Labs at the break about the work that they’re doing with satellite imagery. Their satellite imagery is going to have a
lot more than just a piece of corn. Planet Labs photos
are pretty big, a couple million pixels, you’re going to have 500 sq km. So there’s going to be tennis courts and swimming
pools, and people sunbathing and all kinds of stuff. So when Robin takes this stuff to Planet Labs,
he’s not going to be able to use a pre-trained network directly. But we’re going to show you how you can use
some of the structure of the pre-trained network even if you are not looking at photos that
are this clear. Having said that, if you remember the slide
I showed you earlier of the plant disease project, each of those plant disease pictures
were very clearly just pictures of one thing. Be aware that when you’re using a pre-trained
network, you are inheriting the shortcomings and biases of the data it was trained from,
and therefore you should always look at the data it was trained from. Being aware of that, I would say for us this
is going to be a very suitable dataset and when we look at the dataset, you’ll see why
I say that. So each year, most of the winners of the Imagenet
competition make their source code and their weights available. So when I say their source code and their
weights, the source code is the thing that defines … remember when I told you there
were three bits that give us modern deep learning … infinitely flexible function, way to train
parameters, fast and scalable. The particular functional form is the neural net architecture. So that’s the source code. So generally you download the source code
from the folks that built it all. So the second thing is the parameters that
were learned. Generally an ImageNet winner has trained the
model for days or weeks, nowadays often on many GPUs, to find the particular set of parameters,
set of weights, that make it really good at recognizing ImageNet pictures. So you generally have to get the code and
the weights. And once you have those two things, you can
replicate that particular ImageNet winner’s results. [Time: 1.15 hour mark]
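In Keras terms, the code-versus-weights split looks something like this sketch (the file names here are hypothetical):

```python
# The "source code": a definition of the architecture, saved here as JSON.
from keras.models import model_from_json

with open('vgg16_architecture.json') as f:      # hypothetical file name
    model = model_from_json(f.read())

# The "weights": the parameters learned over days or weeks of training.
model.load_weights('vgg16_weights.h5')          # hypothetical file name
```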
One of the winners of 2014 was the Visual Geometry Group, an Oxford University group, with a model called VGG. You’ll hear about it lots. Generally speaking, every year’s ImageNet
winners, the particular model they used are so well-used in the community that people
call them by name. Like the 2012 winner was AlexNet, the 2014
winner was VGG. The 2015 was Inception, the 2016 was ResNeXt,
so they all have names. VGG is a couple of years old, so it’s not
quite the best today, but it’s special because it’s the last of the really powerful simple
architectures. We will get to the more complex architectures. Depending on how we go, it might be in this
set of classes. If not, it will be in next year’s classes. Hopefully this year’s set of classes. VGG’s simpler approach is not much less accurate
and for teaching purposes, we’re going to be looking at something that is pretty state-of-the-art
and is really easy to understand, so that’s one of the reasons we’re using VGG. Another reason we’re using VGG is it’s excellent
for the kinds of problems we were just talking about that Robin with his satellite imagery
has; it’s a great network for changing so that it works for your problem, even if
your problem’s a little different. So there’s a number of reasons that VGG is
a really great thing for us to be using. My strong preference is to start out by showing
you how to do things that you can use tomorrow, rather than starting with 1+1 and showing
you how to do things that are useful in six years time after you’ve got your PhD. So, I’m going to start out by showing you
7 lines of code that do everything you need. And to get to the punchline, the state-of-the-art
for dogs vs cats in academia is 80% accuracy, this gives you 90% accuracy, and you don’t
need to do anything else. For you, after this class to see if you can
get everything working, basically your job will be can you run these 7 lines of code. And if you can, you can re-run it on your
own dataset as long as you structure the directories the way that I just showed you. So what I’m going to do is I’m going to go
through these 7 lines of code (or something very similar to them) line by line and show
you pictures of what we’re doing along the way. I wanted to start by showing you these 7 lines
of code because we’re going to be looking at all kinds of things along the way in order
to really understand what’s going on, and at some point you might start thinking “Gosh,
there’s a lot to do to do deep-learning.” But there’s not. There’s a lot to do to really explain, and
talk about and think about deep-learning, but for you to actually do image classification,
you just need these 7 lines of code.
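For reference, here is a sketch of roughly what those 7 lines look like, reconstructed from the calls walked through later in this lesson (get_batches, finetune, fit). The class name Vgg16, the path and the batch size are assumptions taken from the course notebook rather than anything guaranteed here; the transcript itself writes the constructor as vgg16(), so use whatever your copy of vgg16.py defines.

    # A minimal sketch of the whole pipeline, assuming the course's vgg16.py is importable
    # and the data is laid out as train/ and valid/ folders with one sub-folder per class.
    from vgg16 import Vgg16

    path = "data/dogscats/sample/"   # point this at the full dataset for the real run
    batch_size = 64                  # reduce this if your GPU runs out of memory

    vgg = Vgg16()                                                   # pre-trained ImageNet VGG16
    batches = vgg.get_batches(path + 'train', batch_size=batch_size)
    val_batches = vgg.get_batches(path + 'valid', batch_size=batch_size * 2)
    vgg.finetune(batches)                      # swap the 1000 ImageNet outputs for your classes
    vgg.fit(batches, val_batches, nb_epoch=1)  # one epoch of training with back-propagation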
So what does it mean to train a model that's already trained? Yes, you're getting a little bit ahead of
us, but it’s great to answer these questions many times. In this case, the VGG model has been trained
to recognize photos of the 1000 types that are in the ImageNet competition. There’s a number of reasons why that does
not give us dogs vs cats. Reason #1 is that if we go into the animals section of ImageNet, it drills down through dogs -> hunting dogs -> sporting dogs -> pointers -> Vizsla; they have 2334 pictures of Vizsla. You could go back, run it, and find all the German Shorthaired Pointers and Vizslas and map them to "dog", but that's something you would have to do yourself. So that's one shortcoming of the VGG approach
compared to what we actually do. [Time: 1.20 hour mark] The second shortcoming is that sometimes it’s
going to get it wrong, and it might get it wrong for very good reasons. For example, this one might come back with
snow. But it’s not going to come back with just
snow, it’s going to come back with a probability for every one of the 1000 categories. It could be a probability of 0.0003 that it’s
a mushroom, and 0.0002 that it’s an airport, and 0.4 that it’s snow, and 0.3 that it’s
a German Shepherd. We want to kind of take advantage of all that
information as well. So what this actually does, it does something
called fine-tuning, something we’re going to learn a lot about. Fine-tuning takes that pre-trained image model
and it says: use everything you know about the 1000 categories to help figure out which images are cats and which are dogs. That's a great question and we're going to
go back and talk about that a second time. So this code can work for any image recognition
task with any number of categories, regardless of whether those categories are in ImageNet or not. Really the only kind of image processing/recognition this isn't going to do is recognizing lots of separate objects in one image; it's specifically for assigning a single class to each image. Let's see how it works. When something's running, it has a little
star. You will probably get this warning: cuDNN is more recent than the version Theano officially supports. So this is a good time to talk about some of the layers that we have going on. In this example, we're using our vgg16 class. It's sitting on top of Keras, which is
the main deep-learning library we’re using. Keras is sitting on top of Theano (which we’ll
be talking about quite a bit, but less than Keras). Theano is the thing that takes Python code
and turns it into compiled GPU code. Theano is sitting on top of a number of things,
broadly speaking Nvidia's CUDA programming environment. Part of CUDA is the CUDA Deep Neural Network library (cuDNN). For most important things in deep-learning,
Theano is simply calling a function inside cuDNN. So one of the things that we’ve set up for
you in the scripts is to get all of this stuff stuck together. Keras is all written in pure Python and what
it does is it takes your deep-learning architectures and code and turns it into Theano code (in
our case). It can also turn it into TensorFlow code. TensorFlow and Theano are very similar. They’re both libraries that sit on top of
CUDA and provide a kind of Python-to-GPU mapping, plus lots of libraries on top of that. TensorFlow comes out of Google and it is particularly
good at things that Google really cares about, in particular running things on lots and lots
of GPUs. One of the things you will hear a lot is that
you can’t do anything with deep-learning unless you have shitloads of data and shitloads of
GPUs. That is totally, totally wrong, as you’ll
see throughout this course. It is true that if you want to win ImageNet
next year, you’ll need lots and lots of GPUs because you’ll be competing for that last
0.1% against Google, against Microsoft. [Time: 1.25 hour mark] However, if you’re trying to recognize 10
different skin lesions (like the folks I just showed you were), they were the first people
to try to do that with deep-learning and they quadrupled the previous state-of-the-art using
1 GPU and a very small amount of data that they had hand-collected. So the reason you see a lot of stuff about
a lot of GPUs and a lot of data is because it's part of the effort to make neural networks cool rather than uncool, to make the field exclusive rather than inclusive. It's like: unless you're us, you're not in the club. And I really don't want you to buy into that. You will find again and again that it's not
true. As I’ve just shown you, with 7 lines of code,
you can turn the state-of-the-art 20% error rate into a 3% error rate, and it takes about
5 minutes to run on a single GPU which costs 90 cents per hour. So I am not going to be talking much about
TensorFlow in this course because it's still very early, it's still very new. It does some cool things, but not the kind of cool things that matter for uncool people like us. Theano, on the other hand, has been around
quite a lot longer. It’s much easier to use. It does not do multi-GPUs well, but it does
everything else well. If you build something in Keras and you get to a point where everything is great, you have a 400% improvement on the state-of-the-art, and you decide you want the extra 5% that comes from running it on 8 GPUs, it's a simple configuration change to switch the back-end to TensorFlow. Specifically, I want to show you that configuration
change. For those of you that haven't used bash before, when you see a tilde "~", that just means your home directory. In your home directory there is a .keras folder, and in there is a keras.json file; this is the configuration file. You'll see "backend": "theano" in there. If you change it to "backend": "tensorflow", Keras will use TensorFlow as its back-end, which can then use all of your GPUs. If you do this, I also recommend changing "image_dim_ordering" from "th" to "tf". We may talk about that in the next course; it's a pretty minor detail.
The other configuration file to be aware of is .theanorc. A lot of Unix-y tools name their configuration files somethingrc or .somethingrc; here it is .theanorc. I want to point out that there's a really important line in there, "device=", which is either "gpu" or "cpu". If you're using a T2 instance, you'll find that the AMI we created has changed "gpu" to "cpu", because the T2 instance does not support a GPU. So if you want to switch between gpu and cpu, just change the "g" to a "c", or the "c" to a "g".
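As a sketch, the relevant part of ~/.theanorc looks something like this (your AMI may have additional settings in it):

    [global]
    # change device to cpu on a T2 instance, which has no GPU
    device = gpu
    floatX = float32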
Those are the two configuration pieces that you may need to know about. For this class, you won't really need to know
about those because everything’s been set up for you, but I like to show you what’s
going on behind the scenes. Back to the warning that cuDNN is too recent: if you see any problems, try updating Theano
or downgrading cuDNN to version 5. I haven’t found any problems, so you can ignore
that warning. It just means that we’re using a more up-to-date
version of cuDNN than the authors have tested. So we create our VGG object [vgg=vgg16()]. In doing so, there’s a whole bunch of stuff
going on behind the scenes, we’re going to look at all of it. By the end of the next lesson, you will understand
every line of code in our vgg script. For now, I would just point out that you can
look at it, and inside it you'll see there are 100 lines of code – so it's not very big
at all. And we’re going to understand all of it by
the end of next class. [Time: 1.30 hour mark] For now, there’s a pre-trained network called
vgg16(), and we now have a vgg object which gives us access to that pre-trained network. With deep-learning, we don't look at images
one at a time, and we also don't look at the whole dataset at a time; we look at a few at a time. That number, the few that we look at, we call either a batch or a mini-batch. A mini-batch is simply a few items (in this case images) grabbed at a time and computed on all at once, and the number we grab is the size of the mini-batch. Why don't we do one at a time? The reason that we don't do one at a time
is because a GPU needs to do lots of things at once to be useful. It loves running on like thousands and thousands
of things at the same time because it can do all of them at the same time. So a single image is not enough to keep a
GPU busy, and so it's slow. Why not do the whole dataset at once? First of all, your GPU only has a certain amount of memory, generally somewhere between about 2GB and 12GB, and your dataset is unlikely to fit in that amount of memory. And secondly, there's no need to do the whole lot; anything we want to do, we can do with small amounts at a time. So in this case, I'm just going to show you
how we can look at the results of this vgg model, and we’re just going to do 4 at a time. So there’s a get_batches command, which basically
says, in our VGG model, let's look inside the path and grab 4 at a time. So we're in the sample directory, and there are 16 images,
so let’s grab one batch. We’re going to grab 4 images and 4 labels. Here are the 4 images, and here are the 4
labels. You can see it’s labeled [0,1] if it’s a dog
and it will be [1,0] if it’s a cat. Now that we’ve done that (so that’s basically
what our data looks like), we can call vgg.predict, passing in the images. So that’s going to ignore the labels of what
it actually is, it’s going to use this pre-trained model and it’s going to tell us what it thinks
the four things are. In this case we run it and it thinks they
are a Rottweiler, an Egyptian cat, a toy terrier and a Rottweiler. So you see it's clearly made a mistake here. It's very rare that it makes a mistake; it must have been confused by all the stuff going on in the background. It's also shown you, for the toy terrier, that it's only 24% sure it's a toy terrier, so you can see that it knows it's not sure. Whereas for the Rottweiler, it's very sure it's a Rottweiler. How come it's not so sure that it's an Egyptian
cat? That’s because there’s a lot of cats that
look a bit like an Egyptian cat, so it doesn't quite know which one it is. We could have a look at all those details to see which other breeds it thought it could be; we'll be looking at that in the next lesson.
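Here is a sketch of that exploration step. The plots helper comes from the course's utils.py, and treat the exact vgg.predict signature (the True flag asking for the single most likely class per image) as an assumption to check against your copy of vgg16.py:

    # Grab one mini-batch of 4 images from the sample set and see what the
    # pre-trained ImageNet model thinks they are.
    from vgg16 import Vgg16
    from utils import plots

    vgg = Vgg16()
    batches = vgg.get_batches("data/dogscats/sample/train", batch_size=4)
    imgs, labels = next(batches)       # 4 images and 4 one-hot labels
    plots(imgs, titles=labels)         # labels are [0,1] for a dog, [1,0] for a cat
    print(vgg.predict(imgs, True))     # ImageNet guesses, e.g. Rottweiler, Egyptian cat, ...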
So the final thing I'm going to do is to take these probabilities and turn them into a dogs vs cats model. I'm going to do it quickly now, and then I'm
going to revisit it at the start of the next class. [Time: 1.35 hour mark] So to take those 1000 probabilities (we're just showing one probability for each image, but there are actually 1000 per image) and turn them into a dog vs cat prediction, we basically do exactly what we did before: do vgg.get_batches, then call finetune, vgg.finetune(batches). What finetune's going to do is build a new model and replace the 1000 categories with the 2 classes that it's found. How does it know what the 2 classes are? That's because
we have directories called cats and dogs. So the finetune command has now created a
model that checks for cats and dogs. Just creating the model is not enough, we
have to actually run it, train it. So if we then go vgg.fit(batches, val_batches,
nb_epoch=1), it will then use that gradient descent method, that back-propagation, that
I talked about earlier, and it will attempt to make that model better at determining cats vs dogs. Now obviously doing it on just 16 data items is fast, but not very accurate. I can run it a few times and you can see that the training accuracy is getting higher and higher each time, but the validation accuracy is not getting much higher, and that's because I'm running it on the sample. So if I ran it on the full dataset, it would take about 5 minutes to run, and you can try it when you get home. Give it a go and see what accuracy you get. If you want to get the accuracy higher, just rerun this cell a bunch of times.
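In code, that "just rerun the cell" advice is simply calling fit again; each call is one more epoch. A minimal sketch, assuming the vgg, batches and val_batches objects from the 7 lines above:

    # Each call to fit trains for one more epoch; watch the training and
    # validation accuracy printed after each one.
    for i in range(3):
        vgg.fit(batches, val_batches, nb_epoch=1)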
That's the end of today's class. This first class is kind of like the opening
of a novel, when you introduce all the characters and their back-stories. A little bit less deep-learning goes on in
the first class, a little bit more getting set up. Your first week, for many of you, is likely
to be the most frustrating and challenging week because many of you will find that you
have some kind of configuration problem, or you don’t understand how some piece of stuff
fits together. Don’t worry. By the end of the 7 weeks, that stuff’s going
to be straightforward, and all of the interesting bits will be in the deep-learning. The more time you can put into this week, making sure that you get all the infrastructure stuff working and are comfortable with what it is, the better. Take a look at the things I've introduced
today, look at the video, google all the stuff you’re not already familiar with. Understand how it works. Anything you’re unclear about, ask your colleagues
on the Slack Channel or on the Forums. Teaching is the best way to learn, so go to
the wiki and try to explain the things you’ve learned. Make sure that you can run the code that we’ve
seen today, up to here. For those of you that are pretty familiar
with this already, make sure that you can run this code on a different dataset. And we’ll talk about some different datasets
that you can use tomorrow. Any of you that want to go further, please let Rachel or me know; we have lots of ideas for ways that you can extend this a long way.

50 Comments

  1. Evan Zamir

    December 20, 2016 at 11:01 pm

    Looking forward to this!

  2. Jeanfranco David Farfan Escobedo

    December 22, 2016 at 1:27 am

    can you add subtitles please?

  3. Furkan Durmuş

    December 22, 2016 at 3:42 pm

    Jeremy you are the king 🙂 can you really add subtitle in english please

  4. Daniel Nyeste

    December 22, 2016 at 6:46 pm

    I'm just curious, why did you used the Hungarian language as an example at the beginning, why not any other language? 🙂

  5. prattarazzi

    December 22, 2016 at 7:22 pm

    Hey Jeremy. Thanks again. The only 1 little thing that didn't work for me was [ import utils; reload(utils) from utils import plots ] in the lesson notebook. This results in an error from utils.py (line 15) : ImportError: No module named cv2. The upshot is that the plots of images don't work. Everything else worked fine, though. Thank you. P.S. I'm using AWS P2.xlarge and all the steps, scripts & setup you described in this video: https://www.youtube.com/watch?v=8rjRfW4JM2I ** UPDATE ** I fixed with conda install -c menpo opencv3=3.1.0 and pip install keras –upgrade

  6. Alister

    December 23, 2016 at 5:41 am

    I finished Stanford's cs231n like long time ago it was cool but they way you explain is awesome besides
    200$ is lots of money for me 🙁

  7. Stephen D

    December 24, 2016 at 12:26 pm

    cool,but have to say I need subtitile in English :)

  8. Jared Wilber

    December 27, 2016 at 8:46 pm

    Calling aws-start works fine. However, calling any of the following results in the following error: Permission denied (publickey).

    <aws-ssh>
    <ssh [email protected]<pub-ip>
    <ssh -i path/to/file.pem [email protected]<pub-ip>
    <ssh [email protected]<pub-ip>

    Any ideas?

  9. Ange Monzo

    December 29, 2016 at 5:23 am

    The "AWS Install – video walkthrough" ==>

    This video has been removed by the user.

    Sorry about that.

  10. Ram

    January 3, 2017 at 3:35 pm

    do we need to buy the membership in order to complete this course?

  11. james wolf

    January 18, 2017 at 2:28 am

    @Jeremy Howard can you give some advice on how to get jupyter notebook to start when the p2 instance starts? I am want to be able to mess with it without having to ssh into the instance just to start jupyter notebook. Initially I thought I could just add the jupyter notebook command to by adding "sudo jupyter notebook" to my rc.local file but that doesnt work. Anyone have some tips?

  12. Matthew Kleinsmith

    January 18, 2017 at 11:47 am

    0:00 – Fast AI & the course
    5:29 – Why Deep Learning is exciting
    10:51 – Deep Learning setup
    16:02 – Deep Learning trends and applications
    20:06 – Starting your AWS instance
    27:07 – Introduction to Jupyter Notebooks
    33:43 – Introduction to Kaggle
    41:14 – Introduction to tmux
    52:57 – Kaggle Dogs vs. Cats data & general data structuring tips
    1:01:01 – Introduction to Markdown
    1:02:02 – Introduction to some scientific Python libraries
    1:09:23 – Pre-trained models & ImageNet
    1:15:15 – VGG model
    1:17:08 – Implementing VGG
    1:22:14 – Python stack being used
    1:23:48 – Theano vs. TensorFlow
    1:27:02 – Keras and Theano settings
    1:30:20 – Batches
    1:34:38 – Finetuning ImageNet VGG16 for Dogs vs. Cats

    http://wiki.fast.ai/index.php/Lesson_1_Timeline

  13. GCM

    January 24, 2017 at 8:47 pm

    FYI, the setup_t2.sh script creates a t2.xlarge instance (which is billable outside the free plan) and not a t2.micro

  14. a a

    January 26, 2017 at 11:59 pm

    How did you get a custom visual style for the Jupyter Notebook? It looks different by default.

  15. yeo yong Lau

    February 7, 2017 at 9:12 am

    jeremy, i am new to aws. if i stop the p2 instance, all the files will remain in the instance or it will be remove?

  16. generalqwer

    February 8, 2017 at 1:18 am

    "cucumber work flow" lol

  17. pepe21101

    February 16, 2017 at 4:07 pm

    Why does his jupyter notebook look so fancy?

  18. Boyan Angelov

    March 6, 2017 at 2:36 pm

    For those who are interested: You can get the same theme for your Jupyter Notebook here: https://github.com/dunovank/jupyter-themes

  19. Wilson Mar

    March 8, 2017 at 8:32 pm

    [22:12] http://www.platform.ai/files/ is returning this:
This is not the place to download notebooks or spreadsheets from any more. Please use git, as mentioned in How To Get Started. If you're looking for the Dogs v Cats download, you can find it here.
    And when I do "wget https://platform.ai/files/aws-alias.sh" I get:
--2017-03-08 15:33:51-- https://platform.ai/files/aws-alias.sh
    Resolving platform.ai... 67.205.12.187
    Connecting to platform.ai|67.205.12.187|:443… connected.
    ERROR: cannot verify platform.ai's certificate, issued by ‘CN=sni.dreamhost.com,O=DreamHost,ST=California,C=US’:
    Self-signed certificate encountered.
    ERROR: certificate common name ‘sni.dreamhost.com’ doesn't match requested host name ‘platform.ai’.
To connect to platform.ai insecurely, use `--no-check-certificate'.

  20. Tema Z

    March 17, 2017 at 11:32 pm

    If u r interesting in cloud gpu instance, just try one from hetzner.com. For 99 euro/m you will get the GTX 1080.

  21. Bozo Jimmy

    March 29, 2017 at 5:36 pm

    Im stuck at bash setup_p2.sh getting error "An error occurred (OptInRequired) when calling the CreateVpc operation: You are not subscribed to this service. Please go to http://aws.amazon.com to subscribe" i think since im not based out of US im seeing this….How can i resolve this ?

  22. Nitin gupta

    April 1, 2017 at 4:53 am

    Hi, Can I learn using my local machine instead of using amazon cloud?

  23. Tianhao Wu

    April 25, 2017 at 7:56 pm

    Thank you Fast.AI for the awesome introduction to deep learning and guide through the tedious setup!

There were some really frustrating parts setting up AWS and I had to swap from a Windows computer to a Linux chromebook since Cygwin wasn't working right with / vs. \ and had issues with path.

    I'm amazed at how fast tech has advanced in the last few years, its crazy to me that renting a 90 cents per hour box on Amazon allows you to label any image in the world with <10 lines of code.

    Really appreciate your teaching methodology of focusing on showing the results first before diving into the theory and numbers to justify your outputs.

  24. Николай

    April 30, 2017 at 8:51 pm

    How should I know if my Python skills are sufficient to get into this course? 'One year experience' is not really good enough since I could be coding for 1 hour a day or 15 hours a day for that one year. Is there any excersize that can check this?

  25. youtube account

    May 9, 2017 at 5:59 pm

    thank you for uploading this

  26. Nick R. Feller

    May 15, 2017 at 3:45 pm

    How do I download the dataset, the command wget http://www.platform.ai/data/dogscats.zip is giving me:
--2017-05-15 15:44:47-- http://www.platform.ai/data/dogscats.zip
    Resolving www.platform.ai (www.platform.ai)… 172.217.7.243
    Connecting to www.platform.ai (www.platform.ai)|172.217.7.243|:80… connected.
    HTTP request sent, awaiting response… 302 Found
    Location: https://www.platform.ai/data/dogscats.zip [following]
--2017-05-15 15:44:47-- https://www.platform.ai/data/dogscats.zip
    Connecting to www.platform.ai (www.platform.ai)|172.217.7.243|:443… connected.
    Unable to establish SSL connection.

  27. Shane Keller

    May 19, 2017 at 8:06 am

    Useful tip in the video at 44:00 – If you want to avoid rerunning alias.sh, add the aliases to your ~/.bash_profile (create ~/.bash_profile if you don't have one). On a Mac, you can do this in either your .bashrc or your .bash_profile.

  28. Tony Nicholas

    May 19, 2017 at 9:03 pm

    Thank yor this course Jeremy. I will have to ask everyone on my udacity DL Foundation nano degree to get onto fast.ai. You made all the years of hardwork worth it. Thanks alot for the clear explanations and the pace of teaching. However, i am not able to find the notebooks. I am not able to access the platform.ai

  29. Rahul Choudhry

    June 2, 2017 at 7:02 pm

    Amazing videos…Thanks a lot Jeremy and Rachel..jumping into code right away is helpful..i keep Bengio's book on the side as reference while going through the experiements..On a secondary note, any suggestions on GPU card. I am looking to buy one for my desktop and was interested in GeForce GTX1080 https://www.amazon.com/dp/B01GAI64GO/?tag=pcpapi-20

  30. S R

    June 14, 2017 at 10:37 am

    @ 00:06:00 Should it not be S(x)=1/(1+e^(-x)) ? Is it not a sigmoid function ?

  31. Sami Imas

    June 29, 2017 at 8:46 am

    Hi, Can I get access to https://www.platform.ai/files/#

  32. Todd Hoff

    July 11, 2017 at 6:41 pm

    Could you please use a white background with black text. The text is nearly unreadable.

  33. Cullen Thomas

    July 14, 2017 at 5:51 pm

    I'm hooked! Let's dive in!

  34. Safak Ozkan

    July 31, 2017 at 6:37 am

    To get the 'dogscats.zip' file type
    wget http://files.fast.ai/data/dogscats.zip

  35. Motiur Rahman

    August 11, 2017 at 6:29 am

    Is this an example of transfer learning?

  36. Robot Corp.

    August 13, 2017 at 5:35 pm

    I'm sorry, I came to this video to learn about image recognition and only last 20minutes were useful where you actually went through the code. Please get better.

  37. John Harper

    August 30, 2017 at 2:48 pm

    Great stuff although that's a lot for a complete beginner.

  38. Dimitri CHARLES

    September 14, 2017 at 6:40 pm

    How to switch between sessions in T-mux? I cannot find the command? Is it C-b O ?

  39. Rami H

    September 16, 2017 at 4:45 pm

    How about using google compute engine, they have GPU available, and they give approval in 1 minute?

  41. ddarhe

    September 29, 2017 at 2:54 pm

    any people doing the course locally? i dont really see much to the setup besides installing the stack i guess, right?

  42. Deepak Murthy

    October 12, 2017 at 8:39 am

    Week 1 assignments can be found at – http://forums.fast.ai/t/lesson-1-discussion/96/3

  43. 孙一航

    October 19, 2017 at 2:15 am

    I found this video is extremely useful to me. As a beginner, I am confused that how did you upload the dataset file to the AWS server? Did you just upload those files by 'SCP' command? If anyone know the answer, please let me know. Thanks in advance!

  44. Bishshoy Das

    November 28, 2017 at 2:53 pm

    The lesson1, vgg16 and utils files are available at:

    wget https://raw.githubusercontent.com/fastai/courses/master/deeplearning1/nbs/vgg16.py
    wget https://raw.githubusercontent.com/fastai/courses/master/deeplearning1/nbs/utils.py
    wget https://raw.githubusercontent.com/fastai/courses/master/deeplearning1/nbs/lesson1.ipynb

  45. Nauroze Hoath

    December 2, 2017 at 9:33 pm

    "Diagnose heart disease by hedge fund analysts", interesting.

  46. Raghavan Kalyanasundaram

    January 9, 2018 at 12:48 pm

    At 1:20:10 to 1:20:22, What is the 1st shortcoming that he's mentioning? I understand that the second one is that the model may get it wrong at times due to the image being unclear/ it calculates the probability of a 1000 different categories? Can someone please clarify?

  47. Donovan Keating

    January 13, 2018 at 3:52 am

    Can we do this course on Google Cloud Platform too?
    Will it matter much in terms of the course material?

  48. doncumentarian

    January 15, 2018 at 8:06 pm

    Run `ls -l train/dogs | head -5` to understand why `wc` reports 11,501 files; the 1st line is the ls header giving the total # of blocks used. Since you're only counting files, you can use `ls -1 …` instead to get an accurate count. (And that's "dash-one" preferred over "dash-el" for those readers with fonts that render "el"s and "one"s near identically.)

  49. Vương Ân

    January 22, 2018 at 6:47 am

    Make AI great again !!

  50. ThankYouESM

    December 22, 2018 at 9:49 am

    Would love to see the most basic neural algorithm written as to produce deepfake images in purely Python… meaning no third-party imports.
