Deep Learning — An ELI5 Intro to Neural Networks

--

In this blog post, you’ll learn about neural networks. There’s a lot of hype around artificial intelligence, so I thought I’d write this post for people outside the field, or for anyone curious about what deep learning actually is.

When I first started, this is what I pictured when people said “neural networks”:

This analogy is fairly accurate. A single neuron takes the output of several other neurons in the form of nerve impulses and decides to fire if the combined signal meets a certain threshold.

I personally like to think of it as a system of connected water pipes with knobs. The knobs determine how much water flows through each joint. If all the knobs are at the perfect setting, just the right amount of water flows out of the system.
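To make the pipes-and-knobs picture concrete, here’s a minimal sketch of a single artificial neuron in Python. The weights play the role of the knobs, and the threshold decides whether the neuron fires; all the numbers are made up purely for illustration.

```python
# A single artificial neuron: weighted sum of inputs, then a threshold check.
# The weights are the "knobs" from the water-pipe analogy (values are illustrative).

def neuron(inputs, weights, threshold):
    # Scale each input by its knob setting and add everything up
    total = sum(x * w for x, w in zip(inputs, weights))
    # Fire (output 1) only if the combined signal clears the threshold
    return 1 if total >= threshold else 0

print(neuron([0.5, 0.8, 0.2], [0.4, 0.9, 0.1], threshold=0.7))  # -> 1, the neuron fires
```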

Now let’s work with an example. Here’s a basic graph that models acceptance at a university:

We use the diagonal line to determine whether we get accepted to the university or not. Imagine you got a grade of 4 after writing 10 tests [the point (10, 4)]. According to this graph, you’d still be accepted, since the point lands on the green side. Awesome, right?

In reality, that would be a low score and would probably result in being rejected by the university. This tells us that our line isn’t placed at the best spot or angle. Is a single line even a good option? Maybe, but we can do better here.
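If you’re curious what “being on one side of the line” means in code, here’s a tiny sketch. The slope and intercept are invented numbers chosen so the line sits in a more sensible spot; a real model would learn them from data.

```python
# Decide acceptance by checking which side of a straight line a point falls on.
# The line is grade = slope * tests + intercept; slope and intercept are made-up values.

def accepted(tests, grade, slope=0.5, intercept=2.0):
    boundary = slope * tests + intercept  # the grade the line expects for this many tests
    return grade > boundary               # above the line -> the green (accepted) side

print(accepted(tests=10, grade=4))  # False: with this better-placed line, (10, 4) is rejected
```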

What other shapes can we draw to perfectly separate our data?

How about this?

This wouldn’t really be accurate. Plus, a circle isn’t a function (two values of y map to a single x). How would we even optimize its size?

Let’s try two intersecting lines instead, and maybe we can use some math to position them in the best way (don’t worry about the math yet).

Now our data looks more separated. As a student, I can now check whether I’ll be accepted by asking myself 3 questions when I plot my point on this map.

  1. Is the point above the horizontal line?
  2. Is the point on the right side of the vertical line?
  3. Is the answer to question 1 AND 2 a YES?

Here’s how it would look visually:

Now, if we look at each question independently, we can model it with a neural network like this:

In the image above, you’ll see the AND operation at the end. AND is an operation from logic that checks whether two values are both YES. It outputs a YES only if both of its inputs are YES; if one or both inputs are NO, it outputs a NO.
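Here’s a rough sketch of the whole idea in code: two “hidden” neurons answer questions 1 and 2, and a third neuron performs the AND. The cut-off values (a grade above 5, more than 6 tests) are invented for illustration.

```python
# The acceptance example as a tiny neural network:
# two hidden neurons answer questions 1 and 2, and an output neuron ANDs them.
# The cut-offs (grade > 5, tests > 6) are made-up numbers for illustration.

def above_horizontal_line(grade):
    return 1 if grade > 5 else 0            # question 1: above the horizontal line?

def right_of_vertical_line(tests):
    return 1 if tests > 6 else 0            # question 2: right of the vertical line?

def and_neuron(a, b):
    # AND as a neuron: fire only when the sum of both inputs clears a threshold of 1.5
    return 1 if (a + b) >= 1.5 else 0

def accepted(tests, grade):
    return and_neuron(above_horizontal_line(grade), right_of_vertical_line(tests))

print(accepted(tests=10, grade=4))  # 0 -> not accepted
print(accepted(tests=10, grade=8))  # 1 -> accepted
```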

A simplified depiction of the image above is:

Notice that we represented the AND operation as a graph.

Now we’ve successfully represented our first neural network. There are 3 layers in our network:

  1. The input layer — our inputs TEST and GRADE.
  2. The hidden layer — our plotting on the graph.
  3. The output layer — where the AND operation does its job and outputs a YES or NO.

Here’s a simple depiction of our network:

This is the simplest version of a neural network. Each node is typically called a neuron or a perceptron. In deep learning, we can have several hidden layers.

Here’s a slightly more complex network with 3 inputs (features) and 4 neurons in the hidden layer.
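As a rough sketch, here’s what a single forward pass through such a network could look like in NumPy. The weights here are random placeholders rather than learned values.

```python
import numpy as np

# One forward pass through a 3-input, 4-hidden-neuron, 1-output network.
# The weights are random placeholders; a trained network would have learned values.

rng = np.random.default_rng(0)
x = np.array([0.2, 0.7, 0.5])        # 3 input features

W1 = rng.normal(size=(3, 4))         # weights from the input layer to the hidden layer
b1 = np.zeros(4)
W2 = rng.normal(size=(4, 1))         # weights from the hidden layer to the output layer
b2 = np.zeros(1)

def sigmoid(z):
    # A smooth version of the yes/no threshold: squashes any number into (0, 1)
    return 1 / (1 + np.exp(-z))

hidden = sigmoid(x @ W1 + b1)        # outputs of the 4 hidden neurons
output = sigmoid(hidden @ W2 + b2)   # final prediction between 0 and 1
print(output)
```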

In the real world, data is usually much higher-dimensional. In our example, we were working with just 2 dimensions, GRADE and TEST. What if we had a 100-dimensional input and 4 hidden layers with 100 neurons in each layer? You can start to imagine how complex our network would look.
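To get a feel for “very complex”, here’s a back-of-the-envelope count of the weights (the knobs) in that hypothetical network: 100 inputs, four hidden layers of 100 neurons each, and one output, ignoring bias terms for simplicity.

```python
# Rough weight count for a 100 -> 100 -> 100 -> 100 -> 100 -> 1 network (biases ignored).
layer_sizes = [100, 100, 100, 100, 100, 1]
weights = sum(a * b for a, b in zip(layer_sizes[:-1], layer_sizes[1:]))
print(weights)  # 40100 individual knobs to tune
```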

Isn’t it crazy how with a large dataset, a lot of computing power, and a large neural network, we can teach computers to drive cars, understand language and play games?

Now things start to get a little more complex:

How do you determine which input is more important than the others?

How will the neural network pick up on that?

In my next post, I’ll break down how to make these neural networks learn, covering concepts such as weights, gradient descent, and backpropagation. There are also various families of neural networks, such as Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs), which can be used for computer vision and natural language processing, respectively. They can also be combined to do very crazy stuff! I will be covering them in future posts :)

If you’re curious, this is what part of a CNN and RNN architecture may look like:

Convolutional Neural Network (CNN) architecture for Image Recognition
Recurrent Neural Network (RNN) Structure used for Sequences

Use-Cases of Architectures & Areas to Hype you up:

CNN: Detecting tumors in medical images, in some cases with accuracy rivaling a trained professional.

RNN: Making a bot write new chapters of Game of Thrones.

Generative Adversarial Networks (GANs): Generating new Pokemon using an existing dataset of Pokemon.

Deep Reinforcement Learning: Teaching a robot to do surgery or learning to drive from scratch.

Autoencoders: Compression, de-noising, and dimensionality reduction.

Thanks for reading!
