Gesture-Controlled Tello Drone - Rapolas Kairys
Gesture Controlled Tello Drone Project
Goal of the Project
The main objective of this project is to control a Tello drone using gesture recognition from a laptop camera. The user raises their arms in specific poses to make the drone move left, right, or up, and toggles flight (take off or land) by holding an “UP” pose for four seconds. This project aims to:
- Provide a way to interact with and control a small drone using hand gestures.
- Explore computer vision and machine learning for gesture recognition.
- Demonstrate real-time control, using multithreading to keep the video feed smooth.
Description
The system uses:
- MediaPipe (https://github.com/google-ai-edge/mediapipe) Pose Estimation to detect body landmarks (shoulders, elbows).
- OpenCV (https://opencv.org/) for video processing.
- DJITelloPy (https://github.com/damiafuentes/DJITelloPy) to send commands to the Tello drone over Wi-Fi, handling takeoff, landing, and movement.
- Multithreading to ensure that drone commands (which can block) do not freeze the camera feed or the user interface.
- Custom gesture logic to classify the current pose (LEFT arm up, RIGHT arm up, both arms up, or none).
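As an illustration of how these components fit together, here is a minimal sketch (not the project's actual code, which is linked at the end of this page). The variable names are illustrative, error handling is omitted, and it assumes the laptop is already connected to the Tello's Wi-Fi network:

```python
import cv2
import mediapipe as mp
from djitellopy import Tello

mp_pose = mp.solutions.pose
pose = mp_pose.Pose(min_detection_confidence=0.5)   # MediaPipe pose estimator

tello = Tello()
tello.connect()                                     # assumes the laptop is on the Tello's Wi-Fi network
print("Battery:", tello.get_battery(), "%")

cap = cv2.VideoCapture(0)                           # laptop camera
ok, frame = cap.read()
if ok:
    rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)    # MediaPipe expects RGB, OpenCV delivers BGR
    results = pose.process(rgb)
    if results.pose_landmarks:
        lm = results.pose_landmarks.landmark
        left_elbow = lm[mp_pose.PoseLandmark.LEFT_ELBOW]
        print("Left elbow y:", left_elbow.y)        # normalized [0, 1]; smaller y means higher in the frame
cap.release()
```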
Controls
When the user performs a gesture in front of the laptop camera, the system detects it and translates it into a drone command:
- Holding both arms raised for 4 seconds toggles flight (either takeoff or land).
- Once in the air, raising both arms makes the drone go up.
- Raising only the left or right arm makes the drone move left or right.
- Doing neither results in a hover command.
- The drone moves 30 cm for every movement command; this distance can be changed in drone_controller.py.
- A gesture has to be held for 1.5 seconds to take effect. This helps prevent accidental commands and improves safety.
- While the program is running and a drone is connected, pressing "t" makes the drone take off or land. Pressing "q" shuts down the program. (A sketch of the gesture-to-command mapping and hold timing follows this list.)
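The hold-to-confirm behaviour can be sketched roughly as follows. The constants mirror the values described above, but the function and class names are illustrative and do not necessarily match the actual source:

```python
import time

MOVE_CM = 30          # step size for every movement command (see drone_controller.py)
HOLD_SECONDS = 1.5    # a gesture must be held this long before it takes effect
TOGGLE_SECONDS = 4.0  # holding "UP" this long toggles takeoff/land

def gesture_to_command(gesture, flying):
    """Map a recognized gesture to a drone command name (illustrative names)."""
    if gesture == "LEFT":
        return "MOVE_LEFT"
    if gesture == "RIGHT":
        return "MOVE_RIGHT"
    if gesture == "UP" and flying:
        return "MOVE_UP"          # the 4-second "UP" hold for TOGGLE_FLIGHT is tracked separately
    return "HOVER"

class GestureHold:
    """Only accept a command once the same gesture has been held long enough."""
    def __init__(self, hold_seconds=HOLD_SECONDS):
        self.hold_seconds = hold_seconds
        self.current = None
        self.since = 0.0

    def update(self, gesture):
        now = time.time()
        if gesture != self.current:
            self.current, self.since = gesture, now   # gesture changed, restart the timer
            return None
        if now - self.since >= self.hold_seconds:
            self.since = now                          # rearm so a held gesture repeats at most every 1.5 s
            return gesture
        return None

# Keyboard overrides in the main loop (via cv2.waitKey): "t" toggles takeoff/land, "q" quits.
```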
Steps Taken
- Environment Setup:
- Installed necessary Python libraries: mediapipe, opencv-python, djitellopy.
- Created a virtual environment to keep dependencies organized.
- Implemented Gesture Detection:
- Used Mediapipe’s Pose Estimation to identify shoulder and elbow landmarks.
- Determined a gesture by comparing y-coordinates of elbows vs. shoulders (e.g., elbow above the shoulder).
- Drone Controller:
- Developed a separate background thread for drone commands.
- It stores only one command at a time and executes it before accepting the next (see the controller sketch after this list).
- Main Application Loop:
- Captures frames from the laptop camera in a separate thread to avoid UI lag.
- Performs gesture recognition in the main loop and, if needed, updates the drone command.
- Displays HUD info: current command, active command, battery status.
- Testing and Fine-Tuning:
- Adjusted arm-raise thresholds for better recognition accuracy.
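A simplified sketch of the drone-controller thread described in step 3, using DJITelloPy's Tello class. The single pending-command slot ensures that a blocking drone call never queues up a backlog of commands; class and method names are illustrative and need not match drone_controller.py:

```python
import threading
from djitellopy import Tello

class DroneController:
    """Background worker that executes one drone command at a time (illustrative sketch)."""

    def __init__(self, move_cm=30):
        self.tello = Tello()
        self.tello.connect()
        self.move_cm = move_cm
        self.flying = False
        self._command = None                      # single pending-command slot
        self._lock = threading.Lock()
        self._new_command = threading.Event()
        threading.Thread(target=self._worker, daemon=True).start()

    def submit(self, command):
        """Store the next command; ignored if the previous one has not been picked up yet."""
        with self._lock:
            if self._command is None:
                self._command = command
                self._new_command.set()

    def _worker(self):
        while True:
            self._new_command.wait()
            with self._lock:
                command, self._command = self._command, None
                self._new_command.clear()
            self._execute(command)                # blocking drone calls run here, not in the UI loop

    def _execute(self, command):
        if command == "TOGGLE_FLIGHT":
            if self.flying:
                self.tello.land()
            else:
                self.tello.takeoff()
            self.flying = not self.flying
        elif command == "MOVE_LEFT" and self.flying:
            self.tello.move_left(self.move_cm)
        elif command == "MOVE_RIGHT" and self.flying:
            self.tello.move_right(self.move_cm)
        elif command == "MOVE_UP" and self.flying:
            self.tello.move_up(self.move_cm)
        # "HOVER" needs no explicit call: the Tello holds position between commands.
```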
Algorithm Overview
- Threaded Camera Capture
- A dedicated camera thread continuously reads frames from the laptop camera (cv2.VideoCapture).
- The latest frame is stored in a shared variable accessible to the main loop.
- Pose Detection with Mediapipe
- Each new frame is passed to the Mediapipe Pose module.
- Landmark positions for shoulders and elbows are extracted.
- Gesture Classification
- Compare each elbow's y-coordinate to the corresponding shoulder's y-coordinate:
- If both elbows are above their shoulders, the gesture is “UP.”
- If only the left elbow is above, that’s “LEFT.”
- If only the right elbow is above, that’s “RIGHT.”
- Otherwise, “HOVER.”
- Flight Toggle Logic
- A timer tracks a continuous “UP” gesture. If it persists for 4 seconds, the flight state is toggled (takeoff or land).
- Command Execution
- A drone controller in a background thread executes any command (TOGGLE_FLIGHT, LEFT, RIGHT, HOVER) without blocking the main loop.
- The main loop keeps updating the camera feed, avoiding freezes.
- HUD & Feedback
- The system displays the currently recognized command, the active command (the one being executed by the drone), and the battery level. (A condensed sketch of the full loop follows.)
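Putting the pieces together, the sketch below condenses the loop described above into one illustrative script. For brevity, the dedicated camera thread is replaced by a direct OpenCV read, and the recognized command is only printed instead of being submitted to the drone controller:

```python
import time
import cv2
import mediapipe as mp

mp_pose = mp.solutions.pose
TOGGLE_SECONDS = 4.0

def classify_gesture(landmarks):
    """UP / LEFT / RIGHT / HOVER from elbow vs. shoulder y (smaller y = higher in the image)."""
    ls = landmarks[mp_pose.PoseLandmark.LEFT_SHOULDER]
    rs = landmarks[mp_pose.PoseLandmark.RIGHT_SHOULDER]
    le = landmarks[mp_pose.PoseLandmark.LEFT_ELBOW]
    re = landmarks[mp_pose.PoseLandmark.RIGHT_ELBOW]
    left_up, right_up = le.y < ls.y, re.y < rs.y
    if left_up and right_up:
        return "UP"
    if left_up:
        return "LEFT"
    if right_up:
        return "RIGHT"
    return "HOVER"

cap = cv2.VideoCapture(0)
pose = mp_pose.Pose()
up_since = None                       # timer for the 4-second flight toggle

while True:
    ok, frame = cap.read()
    if not ok:
        break
    results = pose.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
    gesture = "HOVER"
    if results.pose_landmarks:
        gesture = classify_gesture(results.pose_landmarks.landmark)

    if gesture == "UP":
        up_since = up_since or time.time()
        if time.time() - up_since >= TOGGLE_SECONDS:
            print("TOGGLE_FLIGHT")    # the real program submits TOGGLE_FLIGHT to the controller here
            up_since = None
    else:
        up_since = None

    cv2.putText(frame, f"Gesture: {gesture}", (10, 30),
                cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2)   # simple HUD overlay
    cv2.imshow("Gesture control", frame)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break

cap.release()
cv2.destroyAllWindows()
```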
Links
- Source code: https://github.com/RafalasTypeBeat/Gesture-controlled-drone
- Video example: https://www.youtube.com/watch?v=hZfRaGYyeyY