Gesture-Controlled Tello Drone - Rapolas Kairys
Gesture-Controlled Tello Drone Project
Goal of the Project
The main objective of this project is to control a Tello drone using gesture recognition from a laptop camera. Users raise their arms in specific poses to make the drone move left, right, or up, or to toggle flight (takeoff/landing) by holding an “UP” pose for four seconds. This project aims to:
- Provide a way to interact with and control a small drone using hand gestures.
- Explore computer vision and machine learning for gesture recognition.
- Demonstrate real-time control, using multithreading to keep the video feed smooth.
Description
The system uses:
- MediaPipe Pose Estimation to detect body landmarks (shoulders, elbows).
- OpenCV for video processing.
- DJITelloPy library to send commands to the Tello drone over Wi-Fi, handling takeoff, landing, and movement.
- Multithreading to ensure that drone commands (which can block) do not freeze the camera feed or the user interface.
- Custom gesture logic to determine gestures (LEFT arm up, RIGHT arm up, both arms up, or none).
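For reference, the snippet below is a minimal sketch of the basic DJITelloPy calls this kind of setup relies on: connecting over Wi-Fi, checking the battery, and issuing takeoff, movement, and landing commands. The exact wiring in the project's code may differ.

```python
# Minimal DJITelloPy sketch: connect to the drone and issue the basic
# commands used in this project (assumes the laptop is already joined
# to the Tello's Wi-Fi network).
from djitellopy import Tello

tello = Tello()
tello.connect()                       # open the UDP command link
print("Battery:", tello.get_battery(), "%")

tello.takeoff()
tello.move_left(30)                   # all distances are given in cm
tello.move_right(30)
tello.move_up(30)
tello.land()
```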
Controls
When the user performs a gesture in front of the laptop camera, the system detects it and translates it into a drone command:
- Holding both arms raised for 4 seconds toggles flight (takeoff or landing).
- Once in the air, raising both arms makes the drone go up.
- Raising only the left or right arm makes the drone move left or right.
- Doing neither results in a hover command.
- The drone moves 30 cm for every movement command; this distance can be changed in drone_controller.py.
- A gesture has to be held for 1.5 seconds before its command takes effect, which helps prevent accidental commands and improves safety (a simplified sketch of this hold logic follows the list).
- While the program is running and a drone is connected, pressing "t" makes the drone take off or land. Pressing "q" shuts down the program.
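The following is a simplified sketch of the hold-time logic described above, assuming a 1.5-second hold for movement commands and a 4-second hold of "UP" for the flight toggle. Function and variable names are illustrative and not taken from the project's source.

```python
import time

HOLD_TIME = 1.5     # seconds a gesture must persist before it becomes a command
TOGGLE_TIME = 4.0   # seconds of continuous "UP" needed to toggle takeoff/landing

last_gesture = None
gesture_since = 0.0

def gesture_to_command(gesture, flying, now=None):
    """Illustrative mapping from a detected gesture to a drone command."""
    global last_gesture, gesture_since
    now = time.time() if now is None else now
    if gesture != last_gesture:               # gesture changed: restart the timer
        last_gesture, gesture_since = gesture, now
        return None
    held = now - gesture_since
    if gesture == "UP" and held >= TOGGLE_TIME:
        gesture_since = now                   # reset so the toggle fires only once
        return "TOGGLE_FLIGHT"
    if flying and held >= HOLD_TIME:
        return {"LEFT": "LEFT", "RIGHT": "RIGHT", "UP": "UP"}.get(gesture, "HOVER")
    return None
```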
Steps Taken
- Environment Setup:
- Installed necessary Python libraries: mediapipe, opencv-python, djitellopy.
- Created a virtual environment to keep dependencies organized.
- Implemented Gesture Detection:
- Used MediaPipe’s Pose Estimation to identify shoulder and elbow landmarks.
- Determined a gesture by comparing y-coordinates of elbows vs. shoulders (e.g., elbow above the shoulder).
- Drone Controller:
- Developed a separate background thread for drone commands.
- Stores only one command at a time and executes it before accepting the next (see the sketch after this list).
- Main Application Loop:
- Captures frames from the laptop camera in a separate thread to avoid UI lag.
- Performs gesture recognition in the main loop and, if needed, updates the drone command.
- Displays HUD info: current command, active command, battery status.
- Testing and Fine-Tuning:
- Adjusted arm-raise thresholds for better recognition accuracy.
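Below is a rough sketch of what the single-command controller thread from step 3 might look like. The class and attribute names are assumptions made for illustration, not the contents of the project's actual drone_controller.py.

```python
import threading
import time

class DroneController(threading.Thread):
    """Illustrative background controller: keeps at most one pending command
    and finishes executing it before accepting the next."""

    def __init__(self, tello):
        super().__init__(daemon=True)
        self.tello = tello               # an already-connected djitellopy Tello
        self._pending = None
        self._lock = threading.Lock()
        self._flying = False

    def set_command(self, command):
        with self._lock:
            if self._pending is None:    # drop new input while a command is queued
                self._pending = command

    def run(self):
        while True:
            with self._lock:
                command, self._pending = self._pending, None
            if command is None:
                time.sleep(0.05)         # idle briefly when there is nothing to do
                continue
            if command == "TOGGLE_FLIGHT":
                if self._flying:
                    self.tello.land()
                else:
                    self.tello.takeoff()
                self._flying = not self._flying
            elif self._flying and command == "LEFT":
                self.tello.move_left(30)  # distance in cm, as noted above
            elif self._flying and command == "RIGHT":
                self.tello.move_right(30)
            elif self._flying and command == "UP":
                self.tello.move_up(30)
```

A caller would construct this with a connected Tello object, call start(), and then feed recognized gestures through set_command() from the main loop.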
Algorithm Overview
- Threaded Camera Capture
- A dedicated camera thread continuously reads frames from the laptop camera (cv2.VideoCapture).
- The latest frame is stored in a shared variable accessible to the main loop (see the condensed sketch after this list).
- Pose Detection with MediaPipe
- Each new frame is passed to the MediaPipe Pose module.
- Landmark positions for shoulders and elbows are extracted.
- Gesture Classification
- Compare each elbow's y-coordinate to the corresponding shoulder's y-coordinate (in image coordinates, a smaller y value means higher in the frame):
- If both elbows are above their shoulders, the gesture is “UP.”
- If only the left elbow is above, that’s “LEFT.”
- If only the right elbow is above, that’s “RIGHT.”
- Otherwise, “HOVER.”
- Flight Toggle Logic
- A timer tracks how long a continuous “UP” gesture has been held. If it persists for 4 seconds, the drone’s flight state is toggled (takeoff or landing).
- Command Execution
- A drone controller in a background thread executes any command (TOGGLE_FLIGHT, LEFT, RIGHT, HOVER) without blocking the main loop.
- The main loop keeps updating the camera feed, avoiding freezes.
- HUD & Feedback
- The system displays the currently recognized command, the active command (the one being executed by the drone), and the battery level.
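The listing below condenses steps 1-3 and 6 of the overview into one runnable sketch (capture thread, MediaPipe pose detection, gesture classification, and a minimal HUD). The drone controller is omitted, and variable and helper names are illustrative rather than copied from the project.

```python
import threading
import cv2
import mediapipe as mp

mp_pose = mp.solutions.pose
latest_frame = None

def capture_loop():
    """Camera thread: keep only the most recent frame in a shared variable."""
    global latest_frame
    cap = cv2.VideoCapture(0)
    while True:
        ok, frame = cap.read()
        if ok:
            latest_frame = frame

def classify(lm):
    """Elbow above shoulder means a smaller y value in image coordinates.
    Note: MediaPipe's LEFT/RIGHT landmarks are the person's own sides."""
    left_up = lm[mp_pose.PoseLandmark.LEFT_ELBOW].y < lm[mp_pose.PoseLandmark.LEFT_SHOULDER].y
    right_up = lm[mp_pose.PoseLandmark.RIGHT_ELBOW].y < lm[mp_pose.PoseLandmark.RIGHT_SHOULDER].y
    if left_up and right_up:
        return "UP"
    if left_up:
        return "LEFT"
    if right_up:
        return "RIGHT"
    return "HOVER"

threading.Thread(target=capture_loop, daemon=True).start()
with mp_pose.Pose() as pose:
    while True:
        if latest_frame is None:      # camera thread has not produced a frame yet
            continue
        frame = latest_frame
        results = pose.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
        gesture = "HOVER"
        if results.pose_landmarks:
            gesture = classify(results.pose_landmarks.landmark)
        # Minimal HUD: draw the recognized gesture on the frame.
        cv2.putText(frame, gesture, (10, 30),
                    cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2)
        cv2.imshow("Gesture control", frame)
        if cv2.waitKey(1) & 0xFF == ord("q"):
            break
```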
Links
- Source code: https://github.com/RafalasTypeBeat/Gesture-controlled-drone
- Video example: https://www.youtube.com/watch?v=hZfRaGYyeyY