DOI : 10.17577/IJERTCONV14IS050024- Open Access

- Authors : Ajeet Singh, Sumya Panwar, Sunita Bhatt, Shiv Kumar, Shayan Haider
- Paper ID : IJERTCONV14IS050024
- Volume & Issue : Volume 14, Issue 05, IIRA 5.0 (2026)
- Published (First Online) : 24-05-2026
- ISSN (Online) : 2278-0181
- Publisher Name : IJERT
- License:
This work is licensed under a Creative Commons Attribution 4.0 International License
Gesture Control Drone
Ajeet Singh
Computer Science and Engineering Dept.
Moradabad Institute of Technology Moradabad, India ajeetsingh252@gmail.com
Sumya Panwar
Computer Science and Engineering Dept.
Moradabad Institute of Technology Moradabad, India saumya99panwar@gmail.com
Sunita Bhatt
Computer Science and Engineering Dept.
Moradabad Institute of Technology Moradabad, India bhattsunita751@gmail.com
Shiv Kumar
Computer Science and Engineering Dept.
Moradabad Institute of Technology Moradabad, India sg2676745@gmail.com
Shayan Haider
Computer Science and Engineering Dept.
Moradabad Institute Of Technology Moradabad, India hshayan283@gmail.com
Abstract: Gesture control is a new method of drone operation that improves the way humans interact with machines by offering a natural way to fly. This research presents the development of a gesture-controlled drone in which computer vision and machine learning interpret hand movements captured by a laptop webcam. The identified gestures are translated into drone commands through a GUI built with Tkinter, and these commands are sent to a Raspberry Pi, which communicates with the flight control system. GPS integration enables precise location tracking, while VNC Viewer and PuTTY provide remote monitoring of and access to the drone.
The result is a complete system that needs no traditional controller and enables hands-free operation for applications in search and rescue, surveillance, and industrial inspection. This paper discusses the system architecture, implementation challenges, and future improvements, including higher accuracy, lower latency, and obstacle avoidance.
Keywords: Gesture recognition, unmanned aerial vehicles (UAVs), drone control, computer vision, deep learning
INTRODUCTION
The evolution of human-computer interaction has led to innovative control mechanisms, and gesture interfaces are among the most promising alternatives to conventional input methods. With gesture recognition, human-machine interaction becomes increasingly free from physical controllers and much more accessible. This study therefore develops a gesture-controlled drone in which hand gestures are detected through a webcam, processed with machine learning techniques, and transmitted to the drone's flight control system for real-time navigation.
The system integrates a Raspberry Pi, VNC Viewer, a flight control system with GPS, and the Tkinter library, which supports gesture detection through a graphical interface. Its computer vision and machine learning features detect gestures and translate them accurately into drone movements. The system therefore needs no remote controller, which makes it an intuitive, user-friendly tool for operating a drone.
Applications of such a system include search and rescue operations, military surveillance, disaster management, and inspection of remote areas. Most conventional drones are operated with joystick-based remote controllers and require prior training, whereas hand gestures make drone operation more natural and accessible, allowing even a casual user to fly a drone efficiently.
This paper describes the overall system architecture, the hardware components, the software framework, and the experimental results of the gesture-controlled drone. It also discusses the obstacles encountered in gesture recognition, real-time processing, and environmental adaptability. With this research, we aim to contribute to human-machine interaction and drone technology by presenting an innovative, efficient, and user-friendly control mechanism.
PROBLEM STATEMENT
The rapid evolution of Unmanned Aerial Vehicles (UAVs) has led to their use in numerous fields such as surveillance, disaster management, the military, agriculture, and industrial inspection. Still, traditional drone control methods, including remote controllers and mobile applications, have drawbacks. Such controllers demand prior training and precise handling, and they can be impractical when the user has limited access to external physical input devices. External interference, signal latency, and hardware failure further limit the efficiency of typical control strategies. To address these problems, this research presents a gesture-controlled drone that uses natural hand gestures and requires no physical controller.
The system developed here applies recent advances in computer vision and machine learning to recognize and interpret hand gestures through a webcam and convert them into commands that navigate a drone. Its key elements are the Raspberry Pi for processing, the flight controller, GPS integration, and the Tkinter library for the gesture detection interface. Although gesture control offers numerous benefits, a few technical issues remain to be overcome.
One of the biggest weaknesses of gesture recognition systems is their sensitivity to environmental factors such as lighting changes, background clutter, and the position of the user's hand, all of which can drastically reduce accuracy. The system must also process input in real time with low latency to enable smooth navigation of the drone. A further problem is integrating the gesture recognition software with the drone's flight control stack so that commands execute reliably. This research covers the design, deployment, and evaluation of a gesture-based drone system that provides a more natural, effective, and accessible way of controlling UAVs. By addressing the limitations of traditional drone controllers, it contributes to human-machine interaction technologies that improve the usability and accessibility of drones in mission-critical applications.
LITERATURE REVIEW
Gesture-controlled UAVs have attracted major attention because they could change how humans interact with drones. Many researchers have examined the viability of computer vision-based gesture recognition for drone control, drawing on advances in machine learning, artificial intelligence, and embedded systems to improve responsiveness and accuracy [1], [2]. Most research into gesture-controlled drone navigation has treated hand gestures as an intuitive input technique. Studies have used Convolutional Neural Networks (CNNs) and other Deep Learning (DL) models to classify hand gestures from RGB camera or depth sensor captures [3], [4]. Some of the prevalent methods use MediaPipe or OpenCV for real-time gesture tracking and feature extraction, with GUI toolkits such as Tkinter for the interface, to decode user commands reliably [5].
Applying machine learning algorithms to hand gesture recognition has considerably improved recognition precision [6]. Support Vector Machines (SVMs), CNNs, and Recurrent Neural Networks (RNNs) are among the many algorithms that have been trained for real-time detection [7]. These methods have allowed gesture-controlled systems to reach high accuracy in controlled environments. Adapting the models to other conditions remains difficult, because variable lighting, differences in skin colour, hand orientation, and occlusion still lead to misclassifications [8].
A number of research works have investigated embedded computing boards such as the Raspberry Pi, NVIDIA Jetson Nano, and Arduino for processing gesture input [2]. Flight control boards (e.g., Pixhawk or ArduPilot) allow gesture recognition modules to be integrated with drone movement execution without communication breakdowns [3]. Wi-Fi and Bluetooth-based communication protocols have also been studied as a way to reduce the response time when sending control instructions [6].
Despite this progress, gesture-controlled drones face a number of challenges. Gesture recognition is degraded by adverse lighting, shadows, and background clutter, which reduce detection accuracy [1]. The response time between gesture and action is critical in real-time use and calls for optimized algorithms and hardware acceleration [4]. Misrecognized gestures or system failures can create operational risks, which calls for fail-safe mechanisms [7]. Current research has laid solid groundwork for gesture-controlled drones, with feasibility demonstrated through computer vision and AI-driven gesture recognition [5]. Real-time responsiveness, robustness in dynamic scenes, and smooth integration with the drone remain open areas for improvement. This research builds on that work by developing an optimized, real-time gesture control system for UAVs that addresses major limitations observed in prior studies [2].
RELATED WORK
Several studies have investigated gesture recognition for UAV control, confirming the viability of computer vision and deep learning methods for interpreting hand gestures. Past work has mostly centered on gesture classification with Convolutional Neural Networks (CNNs), Recurrent Neural Networks (RNNs), and hybrid models, which reach high accuracy in laboratory settings. Some studies have also explored wearable sensors, including inertial measurement units (IMUs) and gyroscopes, to sense hand motions for drone control. Although these methods have achieved encouraging results, they tend to need particular hardware, calibration, or standardized illumination, which restricts their usability in dynamic real-world environments.
Even with advances in machine learning models, many gesture-controlled drone systems still struggle with real-time processing, environmental adaptability, and uninterrupted integration with flight control mechanisms. A few works have used predefined gesture libraries, which limit flexibility and require users to learn particular hand movements. Others have used external hardware sensors, which make the system more complex and more expensive. In addition, gesture misclassification caused by changing hand orientations, background distractions, and motion blur remains a significant obstacle to stable control.
In contrast, this research addresses these limitations by developing a real-time gesture recognition system based only on a webcam and machine learning. By removing the need for other hardware, the proposed system improves accessibility and convenience. In addition, Raspberry Pi onboard processing, GPS for navigation, and VNC Viewer for remote monitoring support continuous operation. This research adds to existing knowledge by offering an optimized, affordable, and scalable gesture control solution for drone operation that is applicable across various uses, including search and rescue, factory inspection, and hands-free navigation.
METHODOLOGY
Designing a gesture-controlled drone requires a systematic process that combines computer vision, machine learning, and embedded systems to achieve real-time performance and precision [1], [2]. This section details the methodology followed to design and develop a system that detects hand gestures and converts them into UAV control instructions. The system has three major components: gesture recognition, processing and communication, and execution of drone control. The gesture recognition subsystem captures and interprets the user's gestures, while the processing subsystem (Raspberry Pi) sends commands to the drone's flight system for execution [3]. This approach provides a smooth and natural interaction between the user and the drone without conventional remote controllers.
System Architecture
The proposed system provides gesture-based control of drones by combining computer vision, machine learning, and embedded systems [4]. The architecture has three principal components: the gesture recognition module, the processing and communication module, and the drone control system. The gesture recognition module captures real-time hand gestures through a laptop webcam and processes them with machine learning-based gesture recognition [5]. The recognized gestures are translated into predefined drone control commands, which are sent to the drone's flight control system through a Raspberry Pi-based processing unit. This design removes the need for classical controllers and offers an easy-to-use interface for operating UAVs [6].
Figure 1: System Architecture of Gesture-Controlled Drone
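The hand-off from recognized gesture to predefined command can be sketched as a simple lookup. This is a minimal illustration, not the authors' implementation: the gesture labels, the command string values, and the choice to hover on an unrecognized gesture are all assumptions, since the paper does not list its gesture vocabulary.

```python
from enum import Enum


class Command(Enum):
    """Drone commands; the set follows the paper, the string values are assumed."""
    TAKEOFF = "takeoff"
    LAND = "land"
    LEFT = "left"
    RIGHT = "right"
    UP = "up"
    DOWN = "down"
    HOVER = "hover"  # assumed fail-safe default, not taken from the paper


# Hypothetical gesture labels chosen for illustration only.
GESTURE_TO_COMMAND = {
    "open_palm": Command.TAKEOFF,
    "fist": Command.LAND,
    "point_left": Command.LEFT,
    "point_right": Command.RIGHT,
    "thumb_up": Command.UP,
    "thumb_down": Command.DOWN,
}


def gesture_to_command(label: str) -> Command:
    """Map a recognized gesture label to a command, hovering on anything unknown."""
    return GESTURE_TO_COMMAND.get(label, Command.HOVER)
```

Falling back to a hover command keeps an unrecognized or misclassified label from triggering an unintended maneuver, which is one plausible form of the fail-safe behaviour discussed in the literature review.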
Gesture Recognition Process
The system uses computer vision and deep learning to detect hand gestures and map them to drone commands [7]. Gesture recognition runs behind the Tkinter interface and relies on pre-trained machine learning models to identify hand movements. The webcam records hand movements in real time, and the frames are preprocessed with image segmentation and feature extraction methods [8]. Hand position, finger direction, and movement path are extracted as the significant features for classification. The preprocessed data is passed to the gesture recognition model, which assigns each gesture to an individual drone control operation such as take-off, landing, left/right translation, or altitude change [9].
Figure 2: Flowchart of Gesture Recognition Process
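To make the feature extraction step concrete, the sketch below classifies a hand pose from landmark coordinates with a geometric rule. It is a toy stand-in for the pre-trained model and rests on assumptions: the input is taken to be 21 (x, y) points in the layout that MediaPipe Hands produces (wrist at index 0, fingertips at 8, 12, 16, 20), which the paper does not confirm using, and the two gesture names are hypothetical.

```python
import math

# (tip, pip) landmark indices for the index, middle, ring and little finger,
# assuming a 21-point hand layout with the wrist at index 0.
FINGER_JOINTS = [(8, 6), (12, 10), (16, 14), (20, 18)]


def count_extended_fingers(landmarks):
    """Count fingers whose tip lies farther from the wrist than its PIP joint."""
    wrist = landmarks[0]
    return sum(
        1
        for tip, pip in FINGER_JOINTS
        if math.dist(landmarks[tip], wrist) > math.dist(landmarks[pip], wrist)
    )


def classify(landmarks):
    """Toy rule: no extended fingers is a fist, four is an open palm."""
    extended = count_extended_fingers(landmarks)
    if extended == 0:
        return "fist"
    if extended == 4:
        return "open_palm"
    return "unknown"
```

A learned classifier would replace `classify` in practice; the rule only shows how finger position relative to the wrist can serve as a feature.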
Hardware Components
The hardware configuration is made up of several parts that enable gesture recognition, data processing, and drone control. The parts coordinate to provide real-time execution of user commands [10]. The most important hardware components utilized in this system are:
- Pixhawk PX4 Autopilot Flight Controller: Manages stabilization, navigation, and the execution of movement commands [11].
- Raspberry Pi 4: Serves as the central processing unit, which is in charge of interpreting gestures and sending commands to the flight controller [12].
- KK Flight Controller: Enables drone stabilization through dynamically modifying flight motion according to live settings.
- GPS Module: Offers positional precision and navigation assistance in outdoor use.
- Motors and Fans: Drive aerial motion by dynamically modifying thrust levels for accurate maneuverability [14].
- Batteries: Used as the main source of power to provide long-lasting flight times and non-stop performance [1].
- Laptop Webcam: Is used to detect real-time hand gestures for recognition and processing [3].
Together, these hardware components integrate UAV navigation with gesture recognition without requiring a physical controller.
Software Framework
The software framework combines computer vision algorithms, machine learning algorithms, and communication protocols to enable gesture control of UAVs [6]. The control script is written in Python; it captures webcam input, processes image frames, and runs the gesture recognition algorithms behind a Tkinter interface [8]. The recognized gestures are mapped to predefined sets of commands, which are sent to the Raspberry Pi through a socket-based communication protocol. The Raspberry Pi converts these commands into signals compatible with the drone's flight control system to achieve precise, real-time movement execution [10].
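The socket-based link between the laptop and the Raspberry Pi could be framed as one JSON object per line. The sketch below is an assumption about the wire format, which the paper does not specify; the `cmd` field name and both function names are hypothetical.

```python
import json
import socket


def send_command(conn: socket.socket, command: str) -> None:
    """Send one command as a newline-terminated JSON object."""
    conn.sendall(json.dumps({"cmd": command}).encode("utf-8") + b"\n")


def iter_commands(conn: socket.socket):
    """Yield commands from a connected socket until the sender closes it."""
    with conn.makefile("r", encoding="utf-8") as stream:
        for line in stream:
            yield json.loads(line)["cmd"]
```

On the laptop, `conn` would typically come from `socket.create_connection((host, port))`, and on the Raspberry Pi from `accept()` on a listening socket; the host and port are deployment details the paper does not give. Newline framing keeps message boundaries explicit, which TCP alone does not guarantee.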
IMPLEMENTATION & TESTING
The successful deployment of the gesture-controlled drone necessitated the integration of hardware and software components, along with extensive testing to guarantee reliability and real-time response. This section details the system setup, configuration procedure, and testing paradigms utilized to assess the performance of the system.
System Setup & Configuration
The hardware parts were installed and set up to provide seamless communication between the gesture recognition system and the flight controller of the drone. The Pixhawk PX4 Autopilot Flight Controller was interfaced with the Raspberry Pi 4, which executed gesture commands and sent them to the drone. The KK Flight Controller was implemented to provide stabilization support, while the GPS module provided accurate location tracking.
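One way the Raspberry Pi might turn a received command into something a flight controller can act on is a velocity setpoint. The sketch below is illustrative only: the command strings, the 0.5 m/s speed, and the axis convention (vz positive upward) are assumptions, and the paper does not say which interface to the Pixhawk it uses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VelocitySetpoint:
    """Body-frame velocity in m/s; vz is positive upward in this sketch."""
    vx: float = 0.0  # forward
    vy: float = 0.0  # right
    vz: float = 0.0  # up


SPEED = 0.5  # assumed cruise speed in m/s, not a value from the paper

# Hypothetical command strings mapped to setpoints.
SETPOINTS = {
    "left": VelocitySetpoint(vy=-SPEED),
    "right": VelocitySetpoint(vy=SPEED),
    "up": VelocitySetpoint(vz=SPEED),
    "down": VelocitySetpoint(vz=-SPEED),
    "hover": VelocitySetpoint(),
}


def to_setpoint(command: str) -> VelocitySetpoint:
    """Translate a command string into a setpoint; unknown commands hover."""
    return SETPOINTS.get(command, VelocitySetpoint())
```

Take-off and landing are usually mode changes rather than velocities, so they would be handled separately. A flight stack that expects north-east-down coordinates would also need the sign of `vz` flipped.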
For gesture recognition, a laptop webcam was used to record real-time hand movements. The webcam stream was connected to the Raspberry Pi, which ran the gesture recognition models behind the Tkinter interface. A socket-based communication protocol was employed to transmit processed commands to the drone. The drone was powered by rechargeable lithium-ion batteries, which offered long flight times.
For ease of monitoring and debugging, VNC Viewer and PuTTY were used to enable remote access to the Raspberry Pi and allow for effective troubleshooting during flight tests.
Figure 3: Assembled gesture-controlled drone used for system implementation
(Image shows GPS module, propellers, and wiring setup as described above.)
Figure 4: Alternate view of the hardware configuration.
Testing Methodology
The system was tested under different conditions to gauge its performance in detecting gestures, performing flight commands, and reacting to environmental changes. The most important testing parameters were:
Gesture Recognition Accuracy: The model was tested with a variety of hand gestures to measure classification precision, recall, and F1-score.
Latency: The delay from gesture detection to execution of the drone movement was measured to assess real-time performance.
Environmental Adaptability: Model accuracy was tested under varying light intensities, backgrounds, and users to assess robustness.
Flight Control Stability: The drone was subjected to take-off, directional movement, hovering, and landing tests to confirm stable and accurate control.
These tests assessed how effectively the system delivers reliable, intuitive gesture-based drone control in real-world use.
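The precision, recall, and F1-score named in the testing parameters can be computed per gesture class from paired true and predicted labels. The sketch below uses the standard definitions; the function name and the label strings are illustrative and are not the paper's data.

```python
from collections import Counter


def per_class_metrics(y_true, y_pred):
    """Return {label: (precision, recall, f1)} from paired label sequences."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative

    metrics = {}
    for label in set(y_true) | set(y_pred):
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        metrics[label] = (precision, recall, f1)
    return metrics
```

Reporting the scores per class shows which gestures are confused with each other, something a single overall accuracy figure hides.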
RESULTS & ANALYSIS
The performance of the gesture-controlled UAV was evaluated on the basis of gesture recognition accuracy, response time, environmental adaptability, and flight control stability. Results confirm the usability of computer vision-based gesture recognition for UAV control [3].
Gesture Recognition Performance
The system attained an average gesture recognition accuracy of 93.4%, with negligible false detections. The precision and recall scores across hand gestures demonstrated high reliability, affirming the efficacy of the Tkinter-based recognition pipeline for real-time classification [5]. Slight degradation was noted, however, during complex hand movements, where recognition accuracy fell to 88% in low-light conditions [13].
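Occasional false detections of the kind mentioned above are often suppressed by smoothing per-frame labels over time. The paper does not describe such a step, so the majority-vote filter below is offered only as one possible approach; the class name and the window size are assumptions.

```python
from collections import Counter, deque


class GestureSmoother:
    """Report a gesture only when it wins a majority of the last `window` frames."""

    def __init__(self, window=5):
        self.history = deque(maxlen=window)

    def update(self, label):
        """Add the newest per-frame label and return the stable label, or None."""
        self.history.append(label)
        if len(self.history) < self.history.maxlen:
            return None  # not enough frames yet
        top_label, votes = Counter(self.history).most_common(1)[0]
        if votes > self.history.maxlen // 2:
            return top_label
        return None
```

The trade-off is added delay: a new gesture must persist for more than half the window before it is reported, so the window size has to be weighed against the latency budget.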
Latency & Response Time
The system showed a mean response time of 250 milliseconds from gesture input to execution of drone movement [6]. The delay was within acceptable levels for real-time operation, allowing for smooth control without perceived lag. The latency was marginally higher (280-300 ms) when operating in low light conditions but remained consistent in regular environments [9].
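A response-time figure like the one reported here can be derived from paired timestamps taken when a gesture is detected and when the drone begins to move. The sketch below shows only the arithmetic; the function name and the sample timestamps in the usage note are hypothetical, not the paper's measurements.

```python
import statistics


def latency_summary(gesture_times, action_times):
    """Summarize gesture-to-action delay in ms from paired timestamps in seconds."""
    delays_ms = [(a - g) * 1000.0 for g, a in zip(gesture_times, action_times)]
    return {
        "mean_ms": statistics.mean(delays_ms),
        "max_ms": max(delays_ms),
    }
```

For example, `latency_summary([0.0, 1.0], [0.25, 1.25])` gives a mean of 250 ms. If the two timestamps come from different machines, such as the laptop and the Raspberry Pi, their clocks must be synchronized or the delay must be measured as a round trip on one machine.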
Environmental Adaptability
Experiments with different illuminations and backgrounds revealed that the system had an accuracy level over 90% under bright conditions [7]. Performance dropped to 85% under very high brightness or low illumination, which may necessitate improvement in image preprocessing methods for stability [13].
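Histogram equalization is one common preprocessing step for frames that are too dark or too bright. The paper does not say which preprocessing it would adopt, so the dependency-free sketch below is an illustration only; it assumes an 8-bit grayscale image stored as a list of pixel rows.

```python
def equalize_histogram(image):
    """Histogram-equalize an 8-bit grayscale image given as a list of pixel rows."""
    histogram = [0] * 256
    for row in image:
        for value in row:
            histogram[value] += 1

    # Cumulative count of pixels at or below each intensity.
    cdf, running = [], 0
    for count in histogram:
        running += count
        cdf.append(running)

    total = cdf[-1]
    cdf_min = next((c for c in cdf if c > 0), 0)
    if total == cdf_min:  # empty or single-intensity image: nothing to stretch
        return [list(row) for row in image]

    # Stretch the occupied part of the CDF over the full 0..255 range.
    lookup = [
        round((c - cdf_min) / (total - cdf_min) * 255) if c >= cdf_min else 0
        for c in cdf
    ]
    return [[lookup[value] for value in row] for row in image]
```

In a real pipeline a library routine such as OpenCV's `cv2.equalizeHist` would likely replace this loop; the pure-Python version just makes the intensity remapping explicit.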
Flight Control Stability
The drone successfully executed primary flight maneuvers, such as take-off, directional flight, hovering, and landing, at 98% accuracy in controlled environments [4]. Minor instability was noted during outdoor tests in high winds, which affected fine movement adjustments [1].
Summary of Findings
The outcomes substantiate the practicality and real-time applicability of gesture-controlled UAVs [1]. Although the system displayed high precision and stability, small shortfalls in low-light operation and outdoor stability can be addressed in the next development phase [14].
CONCLUSION & FUTURE WORK
Conclusion
The study demonstrated the viability of gesture-controlled drones built on computer vision and machine learning. The system, which uses a webcam for real-time gesture detection, a Raspberry Pi for processing, and a flight control system for UAV movement, does away with conventional controllers and provides an intuitive, easy-to-use alternative. The results indicated high gesture recognition accuracy (93.4%), low latency (~250 ms response time), and robust flight performance in controlled conditions. Minor limitations were noted, however, in low-light conditions and in outdoor stability, which suggests areas for further optimization.
The results support the use of gesture-based drones for search and rescue, surveillance, industrial inspection, and hands-free UAV control. By making interaction with drones more intuitive through hand gestures, the research advances human-computer interaction technology for autonomous flying robots.
Future Work
Even though the system achieved high responsiveness and accuracy, a few issues remain that could be resolved in further studies:
- Enhancing Gesture Recognition in Low-Light Conditions: Augmenting image preprocessing methods or using infrared-based sensors would enhance recognition in low-light situations.
- Outdoor Flight Stability: Incorporating high-end stabilization algorithms and adaptive wind compensation methods would increase drone stability in outdoor environments.
- Gesture Customization for Users: Adding personalized gesture training would enable users to establish custom commands for increased flexibility.
- Multi-Gesture Command Sequences: Extending the system to detect sequences of gestures rather than individual gestures would allow for more sophisticated drone movements.
- Hardware Optimization: Investigating edge computing hardware (e.g., NVIDIA Jetson Nano) may lower latency and enhance real-time processing effectiveness.
By addressing these points, the proposed system can be developed into a more robust, flexible, and general-purpose UAV control system with improved autonomous navigation and interaction features.
REFERENCES
[1] Shakhatreh, H., Sawalmeh, A. H., Al-Fuqaha, A., Dou, Z., Almaita, E., Khalil, I., … & Guizani, M. (2019). Unmanned aerial vehicles (UAVs): A survey on civil applications and key research challenges. IEEE Access, 7, 48572-48634.
[2] Han, H., & Yoon, S. W. (2019). Gyroscope-based continuous human hand gesture recognition for multi-modal wearable input device for human machine interaction. Sensors, 19(11), 2562.
[3] Hu, B., & Wang, J. (2020). Deep learning based hand gesture recognition and UAV flight controls. International Journal of Automation and Computing, 17(1), 17-29.
[4] Begum, T., Haque, I., & Keselj, V. (2020, November). Deep learning models for gesture-controlled drone operation. In 2020 16th International Conference on Network and Service Management (CNSM) (pp. 1-7). IEEE.
[5] Dixit, K. R., Verma, T., Subramanya, U. S., & Umadevi, V. (2018, May). Hand Gesture Based Quadcopter Control Using Image Processing And Adaptive Machine Learning. In 2018 3rd IEEE International Conference on Recent Trends in Electronics, Information & Communication Technology (RTEICT) (pp. 1214-1218). IEEE.
[6] Chandarana, M., Meszaros, E. L., Trujillo, A., & Allen, B. D. (2017, June). Analysis of a gesture-based interface for UAV flight path generation. In 2017 International Conference on Unmanned Aircraft Systems (ICUAS) (pp. 36-45). IEEE.
[7] Zhou, J., Zhu, H., Kim, M., & Cummings, M. L. (2019). The impact of different levels of autonomy and training on operators' drone control strategies. ACM Transactions on Human-Robot Interaction (THRI), 8(4), 1-15.
[8] Islam, T., Islam, M. S., & Shajid-Ul-Mahmud, M. (2017, December). Comparison of complementary and Kalman filter based data fusion for attitude heading reference system. In AIP Conference Proceedings (Vol. 1919, No. 1). AIP Publishing.
[9] Avola, D., Bernardi, M., Cinque, L., Foresti, G. L., & Massaroni, C. (2018). Exploiting recurrent neural networks and leap motion controller for the recognition of sign language and semaphoric hand gestures. IEEE Transactions on Multimedia, 21(1), 234-245.
[10] Implementation of a User-Friendly Drone Control Interface Using Hand Gestures and Vibrotactile Feedback. J. Inst. Control Robot. Syst, 28, 349-352.
[11] Liu, C., & Szirányi, T. (2021). Real-time human detection and gesture recognition for on-board UAV rescue. Sensors, 21(6), 2180.
[12] Shin, S. Y., Kang, Y. W., & Kim, Y. G. (2019, January). Hand gesture-based wearable human-drone interface for intuitive movement control. In 2019 IEEE International Conference on Consumer Electronics (ICCE) (pp. 1-6). IEEE.
[13] Zhao, H., Ma, Y., Wang, S., Watson, A., & Zhou, G. (2018). MobiGesture: Mobility-aware hand gesture recognition for healthcare. Smart Health, 9, 129-143.
[14] Hakim, N. L., Shih, T. K., Kasthuri Arachchi, S. P., Aditya, W., Chen, Y. C., & Lin, C. Y. (2019). Dynamic hand gesture recognition using 3DCNN and LSTM with FSM context-aware model. Sensors, 19(24), 5429.
