Intro to Deep Learning, a member of the machine learning family

Joel Hanson · Published in Analytics Vidhya · Apr 1, 2020 · 7 min read

Photo by Andy Kelly on Unsplash

This article is the first in my blog series explaining the Deep Learning textbook by Ian Goodfellow, Yoshua Bengio, and Aaron Courville.

Disclaimer: if I am wrong anywhere, please correct me. I will try my best to explain things as simply as possible.

You remember things well enough when you take good notes.

Yes, writing something down seems to help us understand it better; not writing it down practically invites forgetting. The only way to stop writing things down is to write them down until you remember them well enough not to need to 😃.

I bought the Deep Learning book about six months before writing this article, to learn what machine learning is and the mathematical basics needed for deep learning. But the book sat unread for about five months, and I realized that unless I took action it might remain untouched forever. So I settled on reading each chapter and writing down what I read.

Overview

The Deep Learning textbook is written by three of the most renowned researchers in machine learning and was published in 2016 by MIT Press. It is intended to help students and professionals enter the field of machine learning in general and deep learning in particular. The online version of the book is complete and free to read.

The book is split into three parts:

  1. Part I: Applied Math and Machine Learning Basics
  2. Part II: Modern Practical Deep Networks
  3. Part III: Deep Learning Research

Part I introduces the mathematical concepts needed to understand the basics of deep learning: linear algebra, probability and information theory, numerical computation, and machine learning basics.

Part II covers modern deep learning practice: deep feedforward networks, regularization for deep learning, optimization for training deep models, convolutional networks, sequence modeling, practical methodology, and applications. This part focuses on the approaches essential to the technologies used in industry.

Part III covers more advanced approaches, the ones currently being pursued by the research community. This part is most relevant for researchers: anyone who wants to understand the breadth of perspectives brought to the field of deep learning and push the field toward genuine artificial intelligence.

Introduction

The introduction of the book explains how the terms artificial intelligence and machine learning came into existence. AI is about solving intuitive problems, the ones that feel automatic to us, such as recognizing faces or understanding spoken words.

Deep learning solves complicated problems through a hierarchy of concepts, with each complicated concept built out of simpler ones. Because this hierarchy is many layers deep, we call the approach deep learning.

In the past, many AI projects hard-coded knowledge about the world in a formal language, so the computer could make decisions only through logical inference. None of the projects that used this method achieved major success. The most famous of these knowledge-base approaches to artificial intelligence is Cyc.

Machine learning algorithms depend heavily on the representation of the data they are given. For example, consider a simple algorithm like logistic regression: it can detect fraudulent bank loans based on data about a person, including their personal details and transaction history.

Each piece of information included in the representation is known as a feature. These features determine the result produced by the machine learning algorithm.
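To make the fraud-detection example concrete, here is a minimal sketch of logistic regression trained by gradient descent on synthetic data. The two features (income, transaction count) and the class means are purely illustrative inventions for this sketch, not anything from the book:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical features per applicant: [income, transaction_count].
# Two synthetic, well-separated classes (illustrative only).
n = 200
X_legit = rng.normal(loc=[3.0, 2.0], scale=0.5, size=(n, 2))
X_fraud = rng.normal(loc=[1.0, 4.0], scale=0.5, size=(n, 2))
X = np.vstack([X_legit, X_fraud])
y = np.array([0] * n + [1] * n)  # 1 = fraudulent

# Logistic regression trained with plain gradient descent on the log loss.
w = np.zeros(2)
b = 0.0
lr = 0.1
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))  # predicted fraud probability
    grad_w = X.T @ (p - y) / len(y)         # gradient of the mean log loss
    grad_b = np.mean(p - y)
    w -= lr * grad_w
    b -= lr * grad_b

preds = (1.0 / (1.0 + np.exp(-(X @ w + b))) > 0.5).astype(int)
accuracy = np.mean(preds == y)
print(f"training accuracy: {accuracy:.2f}")
```

The point of the sketch is that the algorithm only ever sees the two feature columns: if those columns captured nothing about fraud, no amount of training would help.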

These features, i.e. the representation, play an important role in the performance of a machine learning algorithm. If we choose a poor representation, or the data does not reflect the real-world scenario, performance declines drastically. For example, if you photograph flowers with a high-resolution camera and train a classifier on those photos, it may work well at first, but deploying it against a low-resolution 144p camera will drastically reduce its prediction quality.

Instead of hand-designing the representation, we can use machine learning to discover not only the mapping from representation to output but the representation itself. This approach is called representation learning.

Consider the example of classifying cars: manually designing features would take a long time. If we have a system that learns the representation on its own, for example that a car has a particular shape and four wheels, then it can discover a good set of features for a complex task in hours.

A quintessential example of representation learning is the autoencoder, which encodes the input into a compact representation that keeps as much information as needed, and then decodes it back to something close to its original state.
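As a toy illustration of the encode-then-decode idea, here is a minimal linear autoencoder in plain NumPy (real autoencoders are nonlinear neural networks; the data, dimensions, and training setup here are assumptions made up for this sketch). The data secretly lives on a 2-D subspace of a 5-D space, so a 2-D code can reconstruct it well:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic 5-D data that actually lies on a 2-D subspace.
latent = rng.normal(size=(500, 2))
mixing = rng.normal(size=(2, 5))
X = latent @ mixing

# Linear autoencoder: encode 5-D input to a 2-D code, then decode it back.
W_enc = rng.normal(scale=0.1, size=(5, 2))
W_dec = rng.normal(scale=0.1, size=(2, 5))
lr = 0.01

def loss(X, W_enc, W_dec):
    """Mean squared reconstruction error."""
    recon = X @ W_enc @ W_dec
    return np.mean((recon - X) ** 2)

initial_loss = loss(X, W_enc, W_dec)
for _ in range(2000):
    code = X @ W_enc        # encoder: compress to 2 dimensions
    recon = code @ W_dec    # decoder: reconstruct the input
    err = recon - X
    grad_dec = code.T @ err / len(X)
    grad_enc = X.T @ (err @ W_dec.T) / len(X)
    W_dec -= lr * grad_dec
    W_enc -= lr * grad_enc

final_loss = loss(X, W_enc, W_dec)
print(f"reconstruction error: {initial_loss:.3f} -> {final_loss:.3f}")
```

The learned 2-D code is exactly the kind of compact representation the paragraph above describes: it discards dimensions the data does not actually use.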


Although representation learning can extract simple features from given data, it struggles when we need high-level, abstract features from raw data. This is where deep learning comes into play.

Deep learning builds complex concepts from simpler ones. Given an image of a car, a deep learning system first extracts simple concepts such as corners and contours, which are in turn defined in terms of edges.
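The simplest of those concepts, an edge, can be sketched directly: convolving an image with a small filter, exactly what the first layer of a convolutional network does. The tiny image and the Sobel-style filter below are illustrative assumptions, not from the article:

```python
import numpy as np

# 6x6 image: dark left half (0), bright right half (1) -> a vertical edge.
image = np.zeros((6, 6))
image[:, 3:] = 1.0

# Sobel-style filter that responds to vertical edges (horizontal gradients).
kernel = np.array([[-1.0, 0.0, 1.0],
                   [-2.0, 0.0, 2.0],
                   [-1.0, 0.0, 1.0]])

def conv2d_valid(img, k):
    """Plain 'valid' 2-D cross-correlation, as used in conv layers."""
    kh, kw = k.shape
    oh, ow = img.shape[0] - kh + 1, img.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * k)
    return out

response = conv2d_valid(image, kernel)
print(response)  # strong response only in the columns straddling the edge
```

Deeper layers then combine such edge responses into corners and contours, and those into object parts, which is the hierarchy of concepts described above.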

Who should read this book?

Deep learning has already become prominent in computer vision, speech and audio processing, natural language processing, and more. The textbook is intended for students who are learning about machine learning and artificial intelligence.

The second audience is software engineers with no background in machine learning or statistics who want to learn more about deep learning.

A Venn diagram showing how deep learning is a kind of representation learning, which is, in turn, a kind of machine learning, which is used for many but not all approaches to AI. Each section of the Venn diagram includes an example of an AI technology. Source

Historical Trends in Deep Learning

The history of deep learning is long and rich. The field has gone by many names, reflecting different viewpoints, and has waxed and waned in popularity.

As data and computing resources grew, deep learning models became more useful, and deep learning algorithms began solving complicated tasks we once thought impossible.

The name deep learning is relatively recent; the field has been known by other names over the years. It was called cybernetics in the 1940s–1960s and connectionism in the 1980s–1990s, and it has gone by deep learning since its current resurgence.

Dataset sizes have increased drastically over the years. As society became increasingly digitized, more and more of our activities took place on computers; our day-to-day data is recorded, and this data can be fed into machine learning applications.

The age of big data has made machine learning much easier, because the key challenge of generalizing well to new data after observing only a small amount of data largely disappears when far more data is available.

Model sizes have also increased over the years. It used to be impossible to run large models because of limits on computational resources, but nowadays it is feasible. A model with only a single neuron is not particularly useful; we need many neurons, and as the number of neurons and connections grows, so does the model size.

Model accuracy has also improved over the years. Once we could only distinguish two categories; now we can classify more than 1,000 classes. The images fed into early models were tiny, whereas now we can feed in images of much larger size and resolution.

This increase in complexity has been pushed to its logical conclusion with models such as neural Turing machines, which learn to read from and write arbitrary content to memory cells. Another enormous accomplishment of deep learning is its application to reinforcement learning, where a model learns a task through trial and error, without guidance from a human operator.

Summary

In summary, machine learning is the only realistic approach to building AI systems that can operate in complicated real-world environments.

Deep learning is a subset of machine learning that achieves great power and flexibility by representing the world as a nested hierarchy of concepts, with each concept defined in relation to simpler concepts.

The field of deep learning is growing rapidly thanks to more powerful computational resources and larger datasets, and new techniques are introduced every day. The years ahead are full of challenges and opportunities to push deep learning research to new frontiers.

Reference: https://www.deeplearningbook.org/

Book link: https://amzn.to/3cXR09b

Please provide your feedback and suggestions. Follow me to get the latest updates.

Linkedin: linkedin.com/in/joel-hanson/

Portfolio: joel-hanson.github.io/

Github: github.com/Joel-hanson

Twitter: twitter.com/Joelhanson25
