VIDEO: How Deep Learning Works

Jul 12, 2019


MerlinOne CLO David Tenenbaum explains deep learning as analogous to teaching a child to learn with labeled images in this clip from his presentation at Digital Experience 2019.



David Tenenbaum: We're gonna define intelligence as the ability to get better at one or more tasks by learning from experience, and deep learning is almost identical to teaching your favorite three-year-old niece or nephew how to name things by showing them flashcards. And so you hold up a card, and I'm gonna cover up the name, and I say, "Sally, what is this?" And Sally gets all excited and says, "car." And I say, "No, sorry Sally, that's an airplane." Then I'll pick up the next card and she'll go, "okay, that's a car," and I'll go, "great, you got it." And we'll go through the slide deck three, four times and by the end of that, two things have happened. Sally is scoring in the high 90s on these, and she's bored, so we have to go and move on to something else.

The really important thing to grasp is, we have no idea how Sally learns to do this. There's a lot of work being done in brain science, and we are still close to clueless about that. Now, we're gonna start mapping this to being able to do the same kind of thing digitally, and the process is pretty much the same. You've got an input and you've got an input mechanism, in this case your eye, and some signal goes into this black box and there's some sort of network in there. And on the output, if everything works right, you recognize what it is and you get the title "car." We're gonna try and shine a little light on what's in the black box, both for humans and for machines.

So this is a single human neuron, just like the ones in Sally's brain, and these arms at the left are receiving input from other neurons that are further back. The gap between one neuron and the next is called the synapse; you've probably heard all these terms. What happens is, this particular nucleus is tuned to care about some input signal, and if it gets that signal, it goes, "Aha, I'm excited," and it sends a signal out, and these tentacles pass it along to the next set of neurons. One single neuron can't really do anything. You need a network of neurons, and each one of you has roughly 100 billion neurons in your brain, and each one of those is getting input from roughly 7,000 other neurons. So this is what you call a really, really deep network.

Again, we really don't know how it works, but in general terms, you get an input here and some of these neurons are preconditioned, they're trained to fire when they see certain things, and if you're lucky you get "car" at the end. So this is sort of a schematic of what a biological network looks like. A digital network looks like this, and each one of these things we call a node, and it's really just a chunk of software. It's nothing physical at all. And you've got inputs and outputs and the stuff in the middle we call "hidden layers," and basically that's what lives in the black box.

For a deeper network, you would have more of those layers. And almost just like with human neurons, every column of neurons is connected to the next column over, and this node talks to this one and this one and this one and this one and this one, and so on, all the way through until you get to the outputs. 
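The layered, fully connected structure just described can be sketched in a few lines. The layer sizes here are illustrative, not from the talk; the point is that the connections between two adjacent columns of nodes can be captured by one matrix of numbers per layer pair.

```python
import numpy as np

# Illustrative layer sizes: an input layer, two hidden layers, an output layer.
layer_sizes = [8, 5, 5, 3]

# One matrix per pair of adjacent columns: entry [i, j] connects node j in
# one layer to node i in the next, so every node talks to every node in
# the next column over.
rng = np.random.default_rng(0)
weights = [rng.normal(size=(n_out, n_in))
           for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])]

for k, w in enumerate(weights):
    print(f"layer {k} -> layer {k + 1}: {w.shape[1]} nodes feed {w.shape[0]} nodes")
```

A deeper network is just more entries in `layer_sizes`, which is why adding layers is cheap to express but expensive to train.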

Here's the really important thing: Some signal comes into this node, and it can really only do a very small number of things. It can pass it along, and it can amplify it, in this case by three, or by two, or maybe it decides it wants to shrink it down by minus one. And those numbers are called weights, and when you train a network, all you end up with is a whole bunch of these numbers, which are the weights. And they get learned and modified as you do the training and when you're done training, this is it, and you can email it to your friend and they can plug it into their network and it will work the same. So all that's happening, these nodes aren't moving around or anything, all that's happening are these little numbers, the weights, are getting adjusted.
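That small repertoire — pass a signal along, amplify it, or flip it — is just multiplication by a weight followed by a sum. Here is a minimal sketch of one node, using the weights from the example above (three, two, and minus one); the input signals are my own illustration.

```python
# One node: scale each incoming signal by its weight and sum the results.
def node_output(inputs, weights):
    return sum(x * w for x, w in zip(inputs, weights))

signals = [1.0, 1.0, 1.0]
print(node_output(signals, [3, 2, -1]))  # amplify by 3, by 2, flip one -> prints 4.0
```

Training never changes this arithmetic; it only changes the list of weights, which is why a finished model is just a bag of numbers you can email to a friend.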

Starting with a brand-new network, it has no knowledge of anything. So we take all those weights and we initialize each one to a really tiny random number near zero. It's basically dumb, and for this illustration we're gonna see if we can identify one of these five things. So we've got those outputs labeled, and we connect a red light to each one, and if it figures out that it's one of those things the red light should light up.
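Initialization is one line in practice. The matrix shape and the 0.01 scale below are illustrative assumptions; the idea is simply that every weight starts as a tiny random number near zero, so the untrained network "knows" nothing.

```python
import numpy as np

# Start every weight as a tiny random number near zero.
rng = np.random.default_rng(42)
weights = rng.normal(loc=0.0, scale=0.01, size=(5, 8))

print(weights.min(), weights.max())  # all values hug zero
```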

We're gonna put in a picture of an airplane and we're gonna take part of it and put it into the top node and another part into the next node, and do that all the way down the line, and we're gonna start sending that data across. And it travels to the first hidden layer, and that's gonna apply these nonsense default weights and it goes to the next hidden layer which applies its nonsense default weights, and it goes to the output layer, and wait for it, the model says, "dog." Which obviously is wrong, but we kind of expected that because this thing doesn't know anything yet.
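That first forward pass can be sketched as follows. The labels, layer sizes, and the stand-in "pixels" are my own illustration; with tiny random weights, whichever output node happens to score highest is effectively a random guess, which is why the model confidently says the wrong thing.

```python
import numpy as np

labels = ["airplane", "car", "dog", "cat", "boat"]
rng = np.random.default_rng(7)
w1 = rng.normal(scale=0.01, size=(6, 10))  # input -> hidden, nonsense defaults
w2 = rng.normal(scale=0.01, size=(5, 6))   # hidden -> output, nonsense defaults

pixels = rng.random(10)          # stand-in for the chunks of the airplane picture
hidden = np.tanh(w1 @ pixels)    # first hidden layer applies its weights
scores = w2 @ hidden             # output layer activations, one per label

print("model says:", labels[int(np.argmax(scores))])  # essentially a coin flip
```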

Just like we did with Sally, because we know the answer--because the whole idea of a training set is you know the answer--we can say, "Hey, let's do a correction," and we can say, "Let's take two away from the dog activation and add two to the correct answer, airplane, and let's push that back from right to left across this network," and all the little weights in all those nodes are gonna get slightly adjusted.

That's what's called back propagation, and this may seem obvious but it was a huge breakthrough and it's what makes everything work. And it goes all the way through the network, and that's how you train the network, except you do it a few million times. Now if you recall, little Sally got stuff pretty much nailed in four times through the deck.
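The whole loop — forward pass, compare to the known answer, push a correction back from right to left, repeat — can be shown end to end on a toy problem. The two-layer network, the made-up training set, and the learning rate below are all my own illustration, scaled way down from "a few million times."

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.random((20, 4))                  # 20 "flashcards", 4 input signals each
y = (X.sum(axis=1) > 2).astype(float)    # the known answers for the training set

w1 = rng.normal(scale=0.1, size=(4, 6))  # input -> hidden weights
w2 = rng.normal(scale=0.1, size=(6, 1))  # hidden -> output weights

def forward(X):
    h = np.tanh(X @ w1)                  # hidden layer
    out = 1 / (1 + np.exp(-(h @ w2)))    # output node, squashed to 0..1
    return h, out

for step in range(2000):                 # "a few million times," scaled down
    h, out = forward(X)
    err = out - y[:, None]               # how wrong each answer was
    # Back propagation: push the error from right to left, nudging every
    # weight slightly in the direction that makes the answer less wrong.
    grad_w2 = h.T @ err / len(X)
    grad_h = (err @ w2.T) * (1 - h ** 2) # tanh derivative
    grad_w1 = X.T @ grad_h / len(X)
    w2 -= 1.0 * grad_w2
    w1 -= 1.0 * grad_w1

_, out = forward(X)
accuracy = float(((out[:, 0] > 0.5) == (y > 0.5)).mean())
print(f"training accuracy: {accuracy:.0%}")
```

After enough passes through the deck, the weights settle and the network scores well on the training set, which is the digital version of Sally hitting the high 90s.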

Neural networks are a lot dumber than that and need, give or take, a million samples of whatever it is you want them to learn about. But eventually, you end up with a trained network, and we call that a model because it kind of understands a little bit of how our world works.
