Fundamentals of AI: Computer Vision and Natural Language Processing

Published in

Becoming Human: Artificial Intelligence Magazine

6 min read · Aug 12, 2021

COMPUTER VISION & NATURAL LANGUAGE PROCESSING

We’ve talked a lot about the general meaning of AI and the impact it has had on our world. In previous articles we covered the types of AI algorithms and when and why they are used. If you missed those, you can read them here.

Fundamentals of AI : AI for the Layman

You’re scrolling past your Facebook feed. You come across an advertisement “Our AI-powered solutions will change your…

moosa-ali.medium.com

If you have researched AI before, you have probably come across two buzzwords: Computer Vision and NLP.

These are two of the most talked-about, and undoubtedly important, branches of AI, and they are what we’ll talk about in this article.

By the end of this article you will be able to answer the following questions:

What’s Natural Language Processing?
What is Computer Vision?
What is their importance?
What is their impact on the modern world?

Without further ado, let’s get started.

NLP — Natural Language Processing

As the heading gives away, NLP stands for ‘Natural Language Processing’. The basic idea of NLP is all in its name, but if we were to define it, we would say it is:

‘the way machines interpret natural human language’

We know that all machines see are zeros and ones. They do not understand the concepts of words, sentences, or languages, so text must first be processed into a form they can work with, and this is where NLP jumps in to save the day. NLP is a very diverse field: it starts with very simple steps and eventually leads up to very complex memory-based models.

Some simple Natural Language preprocessing techniques include:

Tokenization: breaking sentences down into words.
Stemming: cutting the suffixes off words to get a crude base form, which need not be a real word, e.g. studies → studi.
Lemmatization: very similar to stemming, but instead of simply cutting off the suffix, it reduces the word to its dictionary form (its lemma), e.g. studies → study.
Bag of Words: an approach that represents each text in a corpus as a vector of word counts.
TF-IDF: similar to bag of words, but with the added advantage of giving higher weight to words that are distinctive for a text and lower weight to words that are common across the whole corpus.
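The first three techniques in the list can be sketched in a few lines of plain Python. This is only a toy illustration: the suffix rules and the `LEMMA_TABLE` lookup are hypothetical stand-ins invented for this example, whereas real stemmers and lemmatizers rely on much larger rule sets and dictionaries.

```python
import re

def tokenize(sentence):
    """Split a sentence into lowercase word tokens."""
    return re.findall(r"[a-z0-9']+", sentence.lower())

# Toy suffix rules (an assumption for this sketch, not a full stemming algorithm).
# Each rule is (suffix to cut, replacement); the first matching rule wins.
SUFFIX_RULES = [("sses", "ss"), ("ies", "i"), ("ss", "ss"),
                ("s", ""), ("ing", ""), ("ed", "")]

def stem(word):
    """Chop a suffix off the word; the result need not be a real word."""
    for suffix, replacement in SUFFIX_RULES:
        if word.endswith(suffix) and len(word) > len(suffix) + 1:
            return word[: len(word) - len(suffix)] + replacement
    return word

# Hypothetical mini-dictionary; a real lemmatizer uses a full vocabulary.
LEMMA_TABLE = {"studies": "study", "studying": "study", "was": "be"}

def lemmatize(word):
    """Look the word up and return its dictionary form, if known."""
    return LEMMA_TABLE.get(word, word)

tokens = tokenize("She studies languages.")
print(tokens)                          # ['she', 'studies', 'languages']
print([stem(t) for t in tokens])       # ['she', 'studi', 'language']
print([lemmatize(t) for t in tokens])  # ['she', 'study', 'languages']
```

Note how the stemmer produces "studi", which is not a word, while the lemmatizer returns "study" only because that word is in its table.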

These are just a few of the simplest techniques. Most of them are used in conjunction with one another, and the end result is a numeric vector that actually makes sense to your computer.

Natural language processing covers a vast range of processing techniques.
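To see what such a numeric vector looks like, here is a minimal sketch of bag of words and TF-IDF in plain Python. The three-sentence corpus is made up for illustration, and the TF-IDF formula is the plain, unsmoothed textbook variant (term frequency times the log of inverse document frequency); libraries often use smoothed or normalized variants that give different numbers.

```python
import math
from collections import Counter

# Made-up corpus for illustration only.
corpus = [
    "the cat sat on the mat",
    "the dog sat on the log",
    "cats and dogs are pets",
]
docs = [text.split() for text in corpus]
vocab = sorted({word for doc in docs for word in doc})

def bag_of_words(tokens):
    """Count how often each vocabulary word occurs in the document."""
    counts = Counter(tokens)
    return [counts[word] for word in vocab]

def idf(word):
    """Rarer words across the corpus get a higher inverse document frequency."""
    doc_freq = sum(1 for doc in docs if word in doc)
    return math.log(len(docs) / doc_freq)

def tf_idf(tokens):
    """Weight each word's relative frequency by how rare it is in the corpus."""
    counts = Counter(tokens)
    return [counts[word] / len(tokens) * idf(word) for word in vocab]

print(vocab)
print(bag_of_words(docs[0]))  # [0, 0, 1, 0, 0, 0, 0, 1, 1, 0, 1, 2]

weights = dict(zip(vocab, tf_idf(docs[0])))
print(round(weights["cat"], 3), round(weights["the"], 3))  # 0.183 0.135
```

In the first sentence "the" occurs twice and "cat" only once, yet "cat" ends up with the higher TF-IDF weight because it appears in fewer documents.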

If you have ever removed useless punctuation from a piece of text or converted the entire text to lower case, then congratulations! You have applied NLP in your work.
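That kind of cleanup takes only a few lines. The sketch below uses Python's standard library; the function name `clean_text` is just a label chosen for this example.

```python
import string

def clean_text(text):
    """Lowercase the text, strip ASCII punctuation, and collapse extra spaces."""
    lowered = text.lower()
    no_punct = lowered.translate(str.maketrans("", "", string.punctuation))
    return " ".join(no_punct.split())

print(clean_text("Hello, World!  NLP is FUN..."))  # hello world nlp is fun
```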

On their own, though, all these techniques seem quite useless! What good can we make of them in isolation? The answer is “very little”. Almost all of the techniques mentioned above are used for data preparation. The prepared data is then passed on to larger, more complicated models, which use it to generate useful outputs.

CV-Computer Vision

What is CV?

Where NLP deals with textual data, CV deals with image data: the computer processes images and performs useful tasks on them.

We may see images as colorful paintings, but a computer views them as pixels and channels. A color image typically has 3 channels (Red, Green, Blue), and each pixel stores one intensity value per channel. Manipulation of these pixel values is the basis of Computer Vision.
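A minimal sketch of this idea, using nested Python lists rather than an image library: the 2x2 image is made up for illustration, and the grayscale weights are the commonly used ITU-R BT.601 luma coefficients.

```python
# A tiny 2x2 color image: each pixel is an (R, G, B) tuple with values 0-255.
image = [
    [(255, 0, 0), (0, 255, 0)],      # red,  green
    [(0, 0, 255), (255, 255, 255)],  # blue, white
]

def split_channels(img):
    """Return three 2D grids: the red, green, and blue channels."""
    return [[[pixel[c] for pixel in row] for row in img] for c in range(3)]

def to_grayscale(img):
    """Combine the three channels into one brightness value per pixel."""
    return [[round(0.299 * r + 0.587 * g + 0.114 * b) for r, g, b in row]
            for row in img]

red, green, blue = split_channels(image)
print(red)                  # [[255, 0], [0, 255]]
print(to_grayscale(image))  # [[76, 150], [29, 255]]
```

Converting to grayscale is itself a simple pixel manipulation: every output value is computed from the three channel values of one input pixel.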

Some basic Computer Vision techniques include:

Edge detection: detecting the edges of the objects present in an image.
Color segmentation: grouping similar pixels together to create a mask for the image.
Noise filtering: removing unwanted noise from an image to make it clearer.
Adding filters: this includes adding blurs, changing colors, cropping the image, etc.
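As one example from the list, edge detection can be sketched with the classic Sobel operator. This is a plain-Python illustration on a made-up grayscale grid; the threshold of 100 is an arbitrary choice for this example, and border pixels are simply left at 0.

```python
# Sobel kernels respond to horizontal (X) and vertical (Y) intensity changes.
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]

def apply_kernel(img, y, x, kernel):
    """Weighted sum of the 3x3 neighborhood centered on pixel (y, x)."""
    return sum(kernel[j][i] * img[y + j - 1][x + i - 1]
               for j in range(3) for i in range(3))

def sobel_edges(img, threshold=100):
    """Mark a pixel as an edge (1) when its gradient magnitude is large."""
    height, width = len(img), len(img[0])
    edges = [[0] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = apply_kernel(img, y, x, SOBEL_X)
            gy = apply_kernel(img, y, x, SOBEL_Y)
            if (gx * gx + gy * gy) ** 0.5 >= threshold:
                edges[y][x] = 1
    return edges

# Grayscale test image: dark left half, bright right half.
image = [[0, 0, 0, 255, 255, 255] for _ in range(5)]
for row in sobel_edges(image):
    print(row)
# [0, 0, 0, 0, 0, 0]
# [0, 0, 1, 1, 0, 0]
# [0, 0, 1, 1, 0, 0]
# [0, 0, 1, 1, 0, 0]
# [0, 0, 0, 0, 0, 0]
```

The 1s line up exactly where the image jumps from dark to bright, which is the vertical edge.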

Image Segmentation using K-means clustering algorithm | Python

In a previous article, we saw how to implement K-means algorithm from scratch in python. We delved deep into the…

medium.com

This field is a little more intuitive than Natural Language Processing because it is easier to visualize the process. But just like NLP, the techniques mentioned above are fun to play around with, yet there is very little you can do with them alone. For a computer vision engineer, real-world usefulness comes when complex deep neural nets are introduced into the procedure.

Real-world applications of both NLP and CV are discussed in the next section.
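To make the "easy to visualize" point concrete, here is a small sketch of noise filtering with a 3x3 median filter, written in plain Python on a made-up grayscale grid. Border pixels are left unchanged, which is a simplification chosen for this example.

```python
from statistics import median

def median_filter(img):
    """Replace each interior pixel with the median of its 3x3 neighborhood."""
    height, width = len(img), len(img[0])
    out = [row[:] for row in img]  # copy so the input stays untouched
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            window = [img[y + dy][x + dx]
                      for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
            out[y][x] = median(window)
    return out

# A flat gray image with a single bright "salt" noise pixel.
noisy = [
    [10, 10, 10, 10],
    [10, 255, 10, 10],
    [10, 10, 10, 10],
    [10, 10, 10, 10],
]
for row in median_filter(noisy):
    print(row)
# [10, 10, 10, 10]
# [10, 10, 10, 10]
# [10, 10, 10, 10]
# [10, 10, 10, 10]
```

The outlier disappears because a single extreme value never becomes the median of nine pixels.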

Real World Application

There are many companies dedicated to developing products that use these techniques.

When you wake your personal assistant by saying “Hey Siri” or “Okay Google”, it responds because there is a trained model on your smartphone that interprets these words and decides what action to take. The same kind of language model is at work when you ask your personal assistant to do something for you, e.g. set a reminder.

Object detection / Source: Facebook Research

Most of us are familiar with Snapchat and are, or have been, crazy about the real-time filters it applies to your face, such as placing a virtual crown on your head or making your eyes pop out. This is possible because of Augmented Reality, which relies heavily on Computer Vision. The computer vision model detects the shape of your face and keeps tracking it to update the position of the virtual object. Similar models are used when Facebook AI detects a person in your picture and suggests that you tag them. Facebook’s Detectron2 library provides models that are considered state of the art in object detection in images.

Conclusion

It is undeniable that the impact of artificial intelligence on everyday life has been huge. We use this technology in our everyday applications without even realizing it.

After coming so far in research and development, the world of AI continues to evolve with every passing day as newer models and new research papers are released.

If you enjoy data science and Machine Learning, you can view my other works.