== Gesture-Controlled Tello Drone Project ==

=== Goal of the Project ===

The main objective of this project is to control a Tello drone using gesture recognition from a laptop camera. Users raise their arms in specific poses to make the drone move left, right, or up, or toggle flight (take off/land) by holding an “UP” pose for four seconds. This project aims to:
* Provide a way to interact with and control a small drone using arm gestures.
* Explore computer vision and machine learning for gesture recognition.
* Demonstrate real-time control, using multithreading to keep the video feed smooth.

=== Description ===

The system uses the following components; a minimal initialization sketch follows the list:

* [https://github.com/google-ai-edge/mediapipe MediaPipe] Pose Estimation to detect body landmarks (shoulders, elbows).
* [https://opencv.org/ OpenCV] for video processing.
* The [https://github.com/damiafuentes/DJITelloPy DJITelloPy] library to send commands to the Tello drone over Wi-Fi, handling takeoff, landing, and movement.
* Multithreading to ensure that drone commands (which can block) do not freeze the camera feed or the user interface.
* Custom gesture logic to classify the current gesture (LEFT arm up, RIGHT arm up, both arms up, or none).
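
To make the setup concrete, the core objects can be created roughly as follows. This is only a minimal sketch of the building blocks listed above, not the project's actual module layout, and it assumes the standard APIs of the three libraries:

<pre>
# Minimal initialization sketch (illustrative; not the project's actual file layout).
import cv2
import mediapipe as mp
from djitellopy import Tello

cap = cv2.VideoCapture(0)          # laptop camera (OpenCV)
pose = mp.solutions.pose.Pose()    # MediaPipe pose estimator
tello = Tello()                    # DJITelloPy drone object
tello.connect()                    # connect over the Tello's Wi-Fi network
print("Battery:", tello.get_battery())
</pre>
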
=== Controls ===

When the user performs a gesture in front of the laptop camera, the system detects it and translates it into a drone command:

* Holding both arms raised for 4 seconds toggles flight (takeoff or landing).
* Once in the air, raising both arms makes the drone go up.
* Raising only the left or right arm makes the drone move left or right.
* Doing neither results in a hover command.
* The drone moves 30 cm for every movement command; this distance can be changed in ''drone_controller.py''.
* A gesture has to be held for 1.5 seconds to take effect, which helps prevent accidental commands and improves safety (see the hold-timer sketch after this list).
* While the program is running and a drone is connected, pressing "t" makes the drone take off or land, and pressing "q" shuts down the program.
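
Both timings (the 1.5-second hold and the 4-second takeoff/landing toggle) can be implemented with a simple hold timer. The sketch below only illustrates that idea; the helper name held_for is hypothetical and not taken from the project's source:

<pre>
import time

HOLD_SECONDS = 1.5     # a gesture must persist this long before it becomes a command
TOGGLE_SECONDS = 4.0   # "UP" must persist this long to toggle takeoff/landing

_current = None        # gesture currently being held
_since = 0.0           # time when it started

def held_for(gesture):
    """Return how long the given gesture has been held continuously, in seconds."""
    global _current, _since
    now = time.time()
    if gesture != _current:        # gesture changed: restart the timer
        _current, _since = gesture, now
    return now - _since

# Inside the main loop (sketch):
#   duration = held_for(detected_gesture)
#   if detected_gesture == "UP" and duration >= TOGGLE_SECONDS:
#       toggle takeoff/landing
#   elif duration >= HOLD_SECONDS:
#       send detected_gesture as a movement command
</pre>
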
=== Steps Taken ===

# Environment Setup:
#* Installed the necessary Python libraries: mediapipe, opencv-python, djitellopy.
#* Created a virtual environment to keep dependencies organized.
# Gesture Detection:
#* Used MediaPipe Pose Estimation to identify shoulder and elbow landmarks.
#* Determined the gesture by comparing the y-coordinates of the elbows with those of the shoulders (e.g., elbow above the shoulder).
# Drone Controller:
#* Developed a separate background thread for drone commands.
#* The controller stores only one command at a time and executes it before accepting the next (a sketch of this pattern follows the list).
# Main Application Loop:
#* Captures frames from the laptop camera in a separate thread to avoid UI lag.
#* Performs gesture recognition in the main loop and, if needed, updates the drone command.
#* Displays HUD info: current command, active command, battery status.
# Testing and Fine-Tuning:
#* Adjusted the arm-raise thresholds for better recognition accuracy.
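
The single-command, background-thread pattern from step 3 could look roughly like this. The class name DroneController and its methods are hypothetical and the real ''drone_controller.py'' may be organized differently, but move_left, move_right, and move_up are DJITelloPy's actual movement commands:

<pre>
import threading
import time
from djitellopy import Tello

class DroneController:
    """Executes one drone command at a time in a background thread (sketch)."""

    def __init__(self, tello: Tello, step_cm: int = 30):
        self.tello = tello
        self.step_cm = step_cm       # distance per movement command (30 cm by default)
        self.pending = None          # at most one queued command
        self.lock = threading.Lock()
        threading.Thread(target=self._worker, daemon=True).start()

    def set_command(self, command: str):
        with self.lock:
            if self.pending is None:         # keep only one command at a time
                self.pending = command

    def _worker(self):
        while True:
            with self.lock:
                command, self.pending = self.pending, None
            if command == "LEFT":
                self.tello.move_left(self.step_cm)    # blocks until the drone responds
            elif command == "RIGHT":
                self.tello.move_right(self.step_cm)
            elif command == "UP":
                self.tello.move_up(self.step_cm)
            time.sleep(0.05)                 # avoid busy-waiting while idle
</pre>
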
=== Algorithm Overview ===

# Threaded Camera Capture (see the capture-thread sketch after this list)
#* A dedicated camera thread continuously reads frames from the laptop camera (cv2.VideoCapture).
#* The latest frame is stored in a shared variable accessible to the main loop.
# Pose Detection with MediaPipe
#* Each new frame is passed to the MediaPipe Pose module.
#* Landmark positions for the shoulders and elbows are extracted.
# Gesture Classification (see the classification sketch after this list)
#* Compare each elbow's y-coordinate to the corresponding shoulder's y-coordinate (in image coordinates, "above" means a smaller y value):
#:* If both elbows are above their shoulders, the gesture is “UP.”
#:* If only the left elbow is above, the gesture is “LEFT.”
#:* If only the right elbow is above, the gesture is “RIGHT.”
#:* Otherwise, the gesture is “HOVER.”
# Flight Toggle Logic
#* A timer tracks a continuous “UP” gesture. If it persists for 4 seconds, the flight state is toggled (takeoff or landing).
# Command Execution
#* A drone controller running in a background thread executes each command (TOGGLE_FLIGHT, LEFT, RIGHT, HOVER) without blocking the main loop.
#* The main loop keeps updating the camera feed, avoiding freezes.
# HUD & Feedback
#* The system displays the currently recognized command, the active command (the one being executed by the drone), and the battery level.
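
The capture thread from step 1 is a standard pattern; a minimal sketch with hypothetical variable names is shown below. The thread keeps only the most recent frame in a shared variable, which the main loop reads under a lock:

<pre>
import threading
import cv2

latest_frame = None
frame_lock = threading.Lock()

def camera_thread(index=0):
    """Continuously read frames and keep only the most recent one."""
    global latest_frame
    cap = cv2.VideoCapture(index)
    while cap.isOpened():
        ok, frame = cap.read()
        if not ok:
            break
        with frame_lock:
            latest_frame = frame     # the main loop reads this shared variable

threading.Thread(target=camera_thread, daemon=True).start()
</pre>
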
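The comparison in step 3 translates almost directly into code. The sketch below assumes the MediaPipe solutions.pose API; the function name classify_gesture is hypothetical. Note that MediaPipe's LEFT/RIGHT landmarks refer to the person's own left and right side:

<pre>
import mediapipe as mp

mp_pose = mp.solutions.pose

def classify_gesture(landmarks):
    """Classify a pose as UP, LEFT, RIGHT, or HOVER from MediaPipe landmarks."""
    # Normalized image coordinates: y grows downward, so "above" means a smaller y.
    left_up = (landmarks[mp_pose.PoseLandmark.LEFT_ELBOW].y
               < landmarks[mp_pose.PoseLandmark.LEFT_SHOULDER].y)
    right_up = (landmarks[mp_pose.PoseLandmark.RIGHT_ELBOW].y
                < landmarks[mp_pose.PoseLandmark.RIGHT_SHOULDER].y)
    if left_up and right_up:
        return "UP"
    if left_up:
        return "LEFT"
    if right_up:
        return "RIGHT"
    return "HOVER"

# Typical use: landmarks = pose.process(rgb_frame).pose_landmarks.landmark
</pre>
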
=== Links ===

* [https://github.com/RafalasTypeBeat/Gesture-controlled-drone Source code]
* [https://www.youtube.com/watch?v=hZfRaGYyeyY Video example]
