Computer Vision using OpenCV in Python

Published in

Becoming Human: Artificial Intelligence Magazine

7 min read · Apr 5, 2021

Computer vision is a field of study focused on building systems that let computers process, analyze, and gain a high-level understanding of digital images and videos, which are represented internally as arrays (matrices) of pixel values. Computer vision underpins applications such as self-driving cars, detection of life-threatening illnesses, and facial recognition. In this blog, I want to show the basics of computer vision by demonstrating key features of OpenCV.

OpenCV (Open Source Computer Vision)

To install OpenCV for Python, install the opencv-python package with pip:

pip install opencv-python

Load and Display an image

OpenCV makes it easy to load images with cv2.imread(filename, flag). The flag controls how the image is read: cv2.IMREAD_COLOR (or 1, the default) loads a 3-channel color image, cv2.IMREAD_GRAYSCALE (or 0) loads a single-channel grayscale image, and cv2.IMREAD_UNCHANGED (or -1) loads the image as is, including any alpha channel. Note that OpenCV stores color images in BGR channel order, not RGB. To display the image, use cv2.imshow(window_name, image), which opens a new window showing the image. cv2.waitKey(0) then waits indefinitely for a key press, and cv2.destroyAllWindows() closes all the windows that OpenCV opened.

import cv2
img = cv2.imread('img_folder/hello.png', 0) ## 0 loads the image as grayscale
cv2.imshow('image', img)
cv2.waitKey(0) ## wait until any key is pressed
cv2.destroyAllWindows()

Flipping, Resizing, and Saving the Image

The function cv2.resize(image, (width, height)) changes the size of the image to the width and height you pass in. Another way is to scale the image by a factor with cv2.resize(image, (0,0), fx=0.5, fy=0.5), which shrinks the image to half its width and height; other values of fx and fy shrink or enlarge it. If the image comes out inverted or you want to mirror it, use cv2.flip(image, flip_code): 1 mirrors the image horizontally, 0 flips it vertically, and -1 flips it both ways. To save the image, use cv2.imwrite(filename, img); with a relative filename, the edited image is saved in the current directory.

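img = cv2.resize(img, (256,256)) ## resize to 256x256 pixels
img = cv2.resize(img, (0,0), fx = 2, fy = 2) ## double the width and height
img = cv2.flip(img, -1) ## flip both horizontally and vertically
cv2.imwrite('new_image.jpg', img)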

Accessing your Webcam

OpenCV makes it easy to access your webcam and capture images. cv2.VideoCapture(0) opens your primary webcam; pass a different index to open another video-recording device. Each call to cap.read() returns two values: ret, which tells you whether a frame was read successfully, and frame, which is the image itself. While the webcam is running, you can assign keys on your keyboard to different actions in the program. For example, the code below saves a snapshot when you press the space key and ends the program when you press the ESC key. To read the keyboard, we call key = cv2.waitKey(1), which waits 1 ms for a key press and returns its code (or -1 if no key was pressed). Taking that value modulo 256 gives the ASCII code of the key, so the condition if key%256 == 27: is true when the ESC key is pressed.

import cv2
cap = cv2.VideoCapture(0)
while True:
   ret, frame = cap.read()
   if not ret: ## stop if no frame could be read
      break
   cv2.imshow('frame', frame)
   key = cv2.waitKey(1)
   if key%256 == 27: ## ESC key
      print('Esc key hit, closing webcam')
      break
   elif key%256 == 32: ## space key
      cv2.imwrite('mywebcam.jpg', frame) ## save the current frame as a snapshot
cap.release() ## releases the VideoCapture
cv2.destroyAllWindows() ## closes all OpenCV windows
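
The key handling in the loop can be pulled into a small helper, which makes it easy to try without a webcam attached. This is a sketch with hypothetical names (action_for_key, KEY_ESC, KEY_SPACE); it assumes the value comes from cv2.waitKey, which returns -1 when no key was pressed.

```python
KEY_ESC = 27    # ASCII code of the ESC key
KEY_SPACE = 32  # ASCII code of the space key


def action_for_key(key):
    """Map a cv2.waitKey return value to an action name, or None."""
    if key == -1:      # no key was pressed during the wait
        return None
    code = key % 256   # keep only the low byte, i.e. the ASCII code
    if code == KEY_ESC:
        return "quit"
    if code == KEY_SPACE:
        return "snapshot"
    return None


print(action_for_key(27))
print(action_for_key(32))
print(action_for_key(-1))
```

Inside the webcam loop, the result of action_for_key(cv2.waitKey(1)) could then decide whether to break out of the loop or save the current frame.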

Drawing Lines, Circles, and Text

When working with your webcam, an image, or a video, you may want to draw text, lines, or circles at certain positions. When detecting faces or other objects, for example, you need a frame around the subject, and to label or identify it you need to draw or write on or around it. First, you need the width and height of the frame: cam.get(3) gives the width and cam.get(4) gives the height of the frame (image). These indices correspond to cv2.CAP_PROP_FRAME_WIDTH and cv2.CAP_PROP_FRAME_HEIGHT.

The function cv2.line(img, (0,0), (width,height), (255,0,0), 10) draws a line from the top-left corner to the bottom-right corner. It takes the image, the start point (0,0), the end point (width, height), the color (255,0,0), and the line thickness (10). Colors are given in BGR order, so (255,0,0) is blue. cv2.rectangle() takes the same arguments, with the two points being opposite corners of the rectangle. To draw a circle, use cv2.circle(img, (300,300), 60, (0,255,0), 5), which takes the image, the center (300,300), the radius (60), the color (0,255,0), and the line thickness (5); a thickness of -1 fills the circle.

To write text on the frame, use cv2.putText(img, 'This is Crazy', (10,height-10), font, 4, (0,0,0), 5, cv2.LINE_AA). It takes the image, the text ('This is Crazy'), the position of the bottom-left corner of the text (10, height-10), the font, the font scale (4), the color (0,0,0), the line thickness (5), and the line type (cv2.LINE_AA). cv2.LINE_AA draws anti-aliased lines, which makes the text look smoother, and it is recommended by the OpenCV documentation.

import cv2
cam = cv2.VideoCapture(0)
if not cam.isOpened():
    raise IOError("Cannot open webcam")
while True:
    ret, frame = cam.read()
    width = int(cam.get(3))
    height = int(cam.get(4))
    img = cv2.flip(frame, 1)
    img = cv2.line(img, (0,0), 
                        (width, height), (255,0,0), 10)
    img = cv2.line(img, (0, height),
                          (width, 0), (0,255,0), 5)
    img = cv2.rectangle(img, (100,100), 
                              (200,200), (0,0,255), 5)
    img = cv2.circle(img, (300,300), 
                            60, (0,255,0), -1) ##-1 fills the circle
    font = cv2.FONT_HERSHEY_PLAIN
    img = cv2.putText(img, 'This is Crazy', 
                     (10,height-10), font, 4, 
                     (0,0,0), 5, cv2.LINE_AA)   
    cv2.imshow('webcam', img)
    k = cv2.waitKey(1)
    if k%256 == 27:
        print("Esc key hit, closing the app")
        break
    elif k%256 == 32:
        cv2.imwrite('img.jpg', img)
        print("picture taken")
cam.release()
cv2.destroyAllWindows()

The image below is the result of the code above.

Detecting Faces using Haar Cascade

The next step is detecting faces in a frame or image. OpenCV ships with Haar cascade classifiers, a machine-learning-based approach to object detection. A face-detection cascade is trained on a large number of positive images (containing faces) and negative images (containing no faces). Below is a link to the OpenCV repository, which hosts the XML file of the face-detection Haar cascade that you need in order to detect faces.

opencv/opencv (github.com): Open Source Computer Vision Library.

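The call cv2.CascadeClassifier(cv2.data.haarcascades + 'haarcascade_frontalface_default.xml') loads the XML file for face detection; cv2.data.haarcascades is the folder of cascade files that is installed with the opencv-python package. After that, open the webcam as shown above and call face_cascade.detectMultiScale(img, 1.3, 5), which takes an image, a scale factor (1.3), and a minimum number of neighbors (5). The scale factor sets how much the image is shrunk at each step of the search, and the minimum number of neighbors sets how many overlapping candidate detections are needed to keep a face. The function returns a rectangle (x, y, w, h) for each face it finds, which you can then outline with the rectangle function.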

import cv2
import numpy as np
face_cascade = cv2.CascadeClassifier(cv2.data.haarcascades + 'haarcascade_frontalface_default.xml')
cam = cv2.VideoCapture(0)
while True:
    ret, frame = cam.read()
    img = cv2.flip(frame, 1)
    face = face_cascade.detectMultiScale(img,1.3, 5)
    for (x, y, w, h) in face:
        cv2.rectangle(img, (x,y), (x+w,y+h), (0,128,0),2)
    cv2.imshow('image', img)       
    k = cv2.waitKey(1)
    if k%256 == 27:
        break
cam.release()
cv2.destroyAllWindows()

When you run the program, it detects the face and draws a rectangle around it, like the picture below.

In this blog, we learned how to load, display, and save an image, and how to manipulate it by flipping and resizing. We also accessed the webcam, used keyboard keys to add features such as snapshots, and drew lines, circles, and text on the webcam frames. Combining all of these skills, we were able to detect faces using the Haar cascade XML file.