Face detection has become a fundamental aspect of various AI applications, from security systems to personal devices. With the ESP32-CAM, a low-cost microcontroller board with a built-in camera, you can build your own face detection system. This guide shows how to perform face detection using the ESP32-CAM and Python in the Thonny Python IDE. Whether you're a hobbyist or a tech enthusiast, this tutorial will help you create a working project that detects faces in real time.
Prerequisites:
ESP32-Cam module
FTDI programmer
Arduino IDE (installed)
Thonny Python IDE (installed)
Micro-USB cable
Jumper wires
A local Wi-Fi network
Step 1: Set Up the ESP32-CAM with Thonny IDE
1.1 Install Thonny Python IDE
- Download Thonny: Visit [thonny.org](https://thonny.org) and download the IDE for your operating system.
- Install Python (if not already installed): Thonny IDE will install Python automatically, but if you want a separate installation, go to [python.org](https://python.org).
1.2 Connect ESP32-CAM to Your System
Connect the ESP32-Cam to the FTDI programmer
Connect the U0T pin of the ESP32-Cam to the RX pin of the FTDI programmer, and the U0R pin to the TX pin (transmit on one side always goes to receive on the other).
Connect the GND and 5V pins of the ESP32-Cam to the respective FTDI pins.
Make sure the IO0 pin is connected to GND for flashing the ESP32-Cam.
Install the ESP32 board package in Arduino IDE
Open Arduino IDE and go to File > Preferences. In the "Additional Board Manager URLs" field, paste the following link:
Go to Tools > Board > Board Manager and search for ESP32. Install the ESP32 board package.
Select the ESP32-Cam board in Arduino IDE
Go to Tools > Board and choose AI-Thinker ESP32-Cam.
Set the upload speed to 115200 and the correct port for your FTDI programmer.
Upload the Webserver Example Code for Face Detection
Open File > Examples > ESP32 > Camera > CameraWebServer.
In the code, enter your Wi-Fi SSID and password so the ESP32-Cam can join your network, and make sure the selected camera model is `CAMERA_MODEL_AI_THINKER`.
Upload the code to the ESP32-Cam by pressing Upload in the Arduino IDE. Once uploaded, remove the GND connection from IO0 and reset the module.
Step 2: Get the ESP32-Cam’s IP Address
Open Serial Monitor
Go to Tools > Serial Monitor in Arduino IDE. Set the baud rate to 115200.
Once the ESP32-Cam boots, you should see an IP address displayed in the Serial Monitor. Copy this IP address, as it will be used in the next step.
Step 3: Integrate Python for Face Detection
Install the OpenCV and NumPy libraries in Thonny
Open Thonny Python IDE and go to Tools > Manage Packages.
Search for `opencv-python` and `numpy` and install both. OpenCV handles the face detection; NumPy turns the downloaded image bytes into an array that OpenCV can decode.
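Install the requests library (optional)
In the same way, you can search for and install the requests library if you prefer it for talking to the ESP32-Cam's web server. The script explained in Step 4 uses `urllib.request` from Python's standard library instead, so it runs without requests.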
Write the Python Script for Face Detection
Create a New Python Script
In Thonny, create a new file and name it something like face_detection.py.
Write the Code
The script fetches frames from the ESP32-Cam over Wi-Fi and detects faces in each one. Every part of it is explained line by line in Step 4.
Run the Python Script
- Make sure the ESP32-CAM web server is running. In the script, replace the IP address in `url` (for example `'http://192.168.1.104/capture'`) with the actual IP address of your ESP32-CAM.
- Run the Python script in Thonny IDE. A window will pop up displaying the images from the ESP32-CAM with detected faces highlighted.
Step 4: Code Explanation
Let's break down this code for face and eye detection using the ESP32-CAM stream into simple sections for a beginner:
1. Importing Required Libraries
import cv2
import urllib.request
import numpy as np
- cv2: This is the OpenCV library, used for image and video processing.
- urllib.request: This is used to fetch data from URLs (in this case, we’ll fetch images from the ESP32-CAM).
- numpy (`np`): This is used for handling arrays and matrices. We need it to convert the images we get from the URL into a format OpenCV can process.
2. Loading Pre-Trained Models (Haar Cascades) for Face and Eye Detection
f_cas = cv2.CascadeClassifier(cv2.data.haarcascades + 'haarcascade_frontalface_default.xml')
eye_cascade = cv2.CascadeClassifier(cv2.data.haarcascades + 'haarcascade_eye.xml')
- Cascade Classifier: OpenCV uses pre-trained models (Haar cascades) to detect objects like faces and eyes.
- `haarcascade_frontalface_default.xml` is used for detecting faces.
- `haarcascade_eye.xml` is used for detecting eyes.
The `CascadeClassifier` function loads these XML files, which contain the trained models.
3. Defining the ESP32-CAM URL
url = 'http://192.168.1.104/capture'
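- This is the address the script requests frames from. In the CameraWebServer example, `/capture` returns a single JPEG still on each request, which is what the main loop decodes; the continuous video stream is served from a separate endpoint and is not used here. Replace `192.168.1.104` with the actual IP address of your ESP32-CAM, and make sure the ESP32-CAM is connected to the same network as your computer.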
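Since the IP address is the only part of the URL that changes between setups, it can help to build the URL from it and catch copy-paste mistakes early. The helper below is a minimal sketch: `capture_url` is a hypothetical name introduced here, and the `/capture` path is simply the one used in this guide.

```python
import ipaddress


def capture_url(ip: str) -> str:
    """Build the still-image URL for an ESP32-CAM at the given IPv4 address."""
    # IPv4Address raises a ValueError subclass for anything that is not a
    # valid address, e.g. a missing octet copied from the Serial Monitor.
    address = ipaddress.IPv4Address(ip.strip())
    return f"http://{address}/capture"


url = capture_url("192.168.1.104")
print(url)  # http://192.168.1.104/capture
```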
4. Creating a Display Window
cv2.namedWindow("Live Transmission", cv2.WINDOW_AUTOSIZE)
- This creates a window named "Live Transmission" to display the camera feed. `cv2.WINDOW_AUTOSIZE` means the window will automatically adjust its size based on the image size.
5. Main Loop to Continuously Capture and Process Frames
while True:
    img_resp = urllib.request.urlopen(url)
    imgnp = np.array(bytearray(img_resp.read()), dtype=np.uint8)
    img = cv2.imdecode(imgnp, -1)
- `while True:`: This loop continuously fetches frames from the ESP32-CAM.
- `urllib.request.urlopen(url)`: This retrieves the image from the ESP32-CAM via the URL.
- `np.array(bytearray(img_resp.read()), dtype=np.uint8)`: Converts the image from bytes into a NumPy array so it can be handled by OpenCV.
- `cv2.imdecode(imgnp, -1)`: Decodes the NumPy array into an image that OpenCV can work with.
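Called without a timeout, `urlopen` can wait a very long time if the camera drops off the Wi-Fi network, and a truncated download will not decode. The sketch below shows one way to guard the fetch step using only the standard library. `fetch_jpeg` and `looks_like_jpeg` are hypothetical helper names, the 5-second timeout is an arbitrary choice, and the check assumes the camera sends a bare JPEG that ends exactly at its end-of-image marker.

```python
import urllib.error
import urllib.request

JPEG_SOI = b"\xff\xd8"  # start-of-image marker: first two bytes of every JPEG
JPEG_EOI = b"\xff\xd9"  # end-of-image marker: last two bytes of a complete JPEG


def looks_like_jpeg(data: bytes) -> bool:
    """Quick sanity check that the bytes form one complete JPEG image."""
    return len(data) > 4 and data.startswith(JPEG_SOI) and data.endswith(JPEG_EOI)


def fetch_jpeg(url: str, timeout: float = 5.0):
    """Return the raw bytes of one frame, or None if the request fails."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            data = resp.read()
    except (urllib.error.URLError, OSError):
        return None  # camera unreachable or too slow: let the caller retry
    return data if looks_like_jpeg(data) else None
```

Inside the loop, a `None` result would simply mean skipping this iteration and requesting the next frame.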
6. Converting the Image to Grayscale
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
- `cv2.cvtColor` converts the color image (BGR format) into grayscale, which is easier and faster for the detection algorithms (face and eye detection) to process.
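To make the conversion concrete, here is the weighted sum that OpenCV documents for color-to-gray conversion, applied to single pixels in plain Python. This is only an illustration of the arithmetic: `bgr_to_gray` is a hypothetical helper, and OpenCV itself works on whole arrays with its own optimized code.

```python
def bgr_to_gray(b, g, r):
    """Weighted sum for BGR -> grayscale: green counts most, blue least."""
    return round(0.114 * b + 0.587 * g + 0.299 * r)


print(bgr_to_gray(0, 0, 255))  # pure red   -> 76
print(bgr_to_gray(0, 255, 0))  # pure green -> 150
print(bgr_to_gray(255, 0, 0))  # pure blue  -> 29
```

Three color values per pixel become one, so the detector has a third of the data to scan.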
7. Detecting Faces in the Image
face = f_cas.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
- `f_cas.detectMultiScale`: This function detects faces in the grayscale image.
- `gray`: The grayscale image where faces are to be detected.
- `scaleFactor=1.1`: How much the image is shrunk at each step while searching for faces of different sizes (here 10% per step). Values closer to 1 test more sizes, which can find more faces but takes longer.
- `minNeighbors=5`: Defines the minimum number of neighboring rectangles that need to be detected for an object (face) to be considered valid.
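The effect of `scaleFactor` on the amount of work can be estimated with a little arithmetic. The sketch below is a simplified model, not OpenCV's actual implementation: it assumes a 24-pixel starting window (the size the frontal-face cascade was trained on), a 240-pixel-high frame as an example, and it ignores the `minSize` and `maxSize` options. `window_sizes` is a hypothetical helper.

```python
def window_sizes(image_side, base_window=24, scale_factor=1.1):
    """List the face sizes (in pixels) the search steps through."""
    sizes = []
    size = float(base_window)
    while size <= image_side:
        sizes.append(round(size))
        size *= scale_factor  # each step looks for faces this much larger
    return sizes


fine = window_sizes(240, scale_factor=1.1)
coarse = window_sizes(240, scale_factor=1.3)
print(len(fine), len(coarse))  # 25 9
```

Under these assumptions, a factor of 1.1 checks almost three times as many face sizes as 1.3, which is why smaller factors are slower but less likely to miss a face that falls between two sizes.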
8. Drawing Rectangles Around Detected Faces
for x, y, w, h in face:
    cv2.rectangle(img, (x, y), (x + w, y + h), (0, 0, 255), 3)
- `for x, y, w, h in face:`: This loop runs through all the detected faces, where:
- `x` and `y` are the coordinates of the upper-left corner of the face.
- `w` is the width and `h` is the height of the face.
- `cv2.rectangle`: Draws a red rectangle (BGR color `(0, 0, 255)`) around the detected face in the original image (`img`).
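Because each detection is just an `(x, y, w, h)` group of numbers, it is easy to work with directly. The sketch below, using made-up example boxes, shows how the two corner points passed to `cv2.rectangle` are derived and how the largest (usually the nearest) face could be picked. `box_corners` and `largest_face` are hypothetical helpers and are not part of the tutorial script.

```python
def box_corners(x, y, w, h):
    """Return the top-left and bottom-right corners of a detection box."""
    return (x, y), (x + w, y + h)


def largest_face(faces):
    """Return the detection with the biggest area, or None if there are none."""
    return max(faces, key=lambda f: f[2] * f[3], default=None)


# Example boxes, invented for illustration only.
faces = [(40, 30, 60, 60), (150, 20, 100, 100)]
print(box_corners(*faces[0]))  # ((40, 30), (100, 90))
print(largest_face(faces))     # (150, 20, 100, 100)
```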
9. Detecting and Highlighting Eyes Within the Detected Face
# These lines sit inside the `for x, y, w, h in face:` loop from section 8
roi_gray = gray[y:y+h, x:x+w]
roi_color = img[y:y+h, x:x+w]
eyes = eye_cascade.detectMultiScale(roi_gray)
for (ex, ey, ew, eh) in eyes:
    cv2.rectangle(roi_color, (ex, ey), (ex + ew, ey + eh), (0, 255, 0), 2)
- `roi_gray` and `roi_color`: These define the "Region of Interest" (ROI) where eyes are expected to be found, which is the region inside the detected face.
- `eye_cascade.detectMultiScale(roi_gray)`: Detects eyes within the face region in the grayscale image.
- `cv2.rectangle`: Draws a green rectangle (BGR color `(0, 255, 0)`) around each detected eye.
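One detail that is easy to miss: the eye coordinates are measured from the top-left corner of the face region, not of the whole image. Drawing on `roi_color` still works because a NumPy slice is a view of `img`, so the rectangle lands in the right place. If the eye position is needed in full-image coordinates, the face offset has to be added back, as in this sketch (`eye_in_image_coords` is a hypothetical helper and the numbers are made up):

```python
def eye_in_image_coords(face, eye):
    """Convert an eye box from face-relative to full-image coordinates."""
    x, y, _, _ = face
    ex, ey, ew, eh = eye
    return (x + ex, y + ey, ew, eh)


face = (100, 50, 80, 80)  # x, y, w, h of a face in the full image
eye = (10, 20, 15, 15)    # ex, ey, ew, eh measured inside the face region
print(eye_in_image_coords(face, eye))  # (110, 70, 15, 15)
```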
10. Displaying the Result
cv2.imshow("Live Transmission", img)
- `cv2.imshow`: This function displays the current frame with rectangles around detected faces and eyes in the "Live Transmission" window.
11. Exiting the Program
key = cv2.waitKey(5)
if key == ord('q'):
    break
- `cv2.waitKey(5)`: Waits for 5 milliseconds for a key press.
- `if key == ord('q'):`: If the 'q' key is pressed, the program breaks out of the loop and stops the live video feed.
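`cv2.waitKey` returns -1 when no key was pressed during the wait. On some systems the returned key code can also carry extra high bits, so a common precaution is to keep only the lowest byte before comparing. The sketch below wraps that check in a hypothetical helper, `is_quit_key`. It is an optional hardening and not something the original script requires.

```python
def is_quit_key(key, quit_char='q'):
    """Return True if the value from cv2.waitKey corresponds to the quit key."""
    if key == -1:  # no key was pressed during the wait
        return False
    return (key & 0xFF) == ord(quit_char)  # compare only the lowest byte


print(is_quit_key(ord('q')))  # True
print(is_quit_key(-1))        # False
```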
12. Cleanup
cv2.destroyAllWindows()
- `cv2.destroyAllWindows`: Closes the window displaying the video when the loop ends (after pressing 'q').
Summary:
- Import libraries: OpenCV for image processing, `urllib` for getting images from the ESP32-CAM, and NumPy for array handling.
- Haar Cascades: Pre-trained models to detect faces and eyes.
- ESP32-CAM URL: Defines the web address from which the camera feed is fetched.
- Face & Eye Detection: OpenCV converts each frame to grayscale for more efficient detection, uses the two `CascadeClassifier` models to find faces and eyes, and draws rectangles around them with `cv2.rectangle`.
- Live Video Stream: Displays the video feed in real time, with face and eye detection applied, until the user presses 'q' to quit.
Conclusion:
Congratulations! You’ve successfully set up face detection using the ESP32-CAM and Python in the Thonny IDE. This project can be extended for various applications such as smart home security, automated attendance systems, or even facial recognition. If you enjoyed this tutorial, be sure to visit our Skill-Hub for the Arduino Master Class, where you can take your tech skills to the next level!