

- Open Access
- Authors : K Shivaraj, Pawan Kumar R, Nandini R, M Vaishnu Kumar, Vijaya Kumar A
- Paper ID : IJERTV14IS040089
- Volume & Issue : Volume 14, Issue 04 (April 2025)
- Published (First Online): 16-04-2025
- ISSN (Online) : 2278-0181
- Publisher Name : IJERT
- License:
This work is licensed under a Creative Commons Attribution 4.0 International License
Math with Gestures using AI
K Shivaraj (Senior Lecturer), Dept of CSE, Sandur Polytechnic, Yeshwantnagar,
Pawan Kumar R (Lecturer), Dept of CSE, Sandur Polytechnic, Yeshwantnagar,
Nandini R (Student)
Dept of CSE, Sandur Polytechnic, Yeshwantnagar,
M Vaishnu Kumar (Student)
Dept of CSE, Sandur Polytechnic, Yeshwantnagar,
Vijaya Kumar A (Student)
Dept of CSE, Sandur Polytechnic, Yeshwantnagar,
Abstract: Solving complex mathematical expressions can be difficult and time-consuming. To address this challenge, we propose a system that lets users solve mathematical problems using hand gestures, making the learning experience more interactive. Our solution combines OpenCV and MediaPipe for real-time gesture recognition with Google's Gemini API, which solves the math problems entered through gestures. The application is built with Streamlit, providing a simple and user-friendly interface. This project aims to enhance the way users interact with technology in learning environments by bridging the gap between physical gestures and digital problem-solving through AI.
Keywords: MediaPipe, OpenCV, Google Gemini API, Gesture Recognition, AI-based Math Solver, Streamlit.
-
INTRODUCTION
Solving math problems on digital devices can often be a slow and frustrating process, especially when it involves typing complex symbols or equations. This can become a barrier for students, young learners, and those with physical limitations. With advancements in artificial intelligence and computer vision, there is now a way to make math interaction more natural and engaging. This project, Math with Gestures Using AI, aims to create an intuitive system where users can perform mathematical operations using simple hand gestures. Using OpenCV for real-time video input and MediaPipe for gesture recognition, the system captures and interprets user gestures. These gestures are then passed to Google's Gemini API, which solves the corresponding math problem.
This system offers several benefits. It makes learning math easier by letting students use hand gestures instead of typing. It also improves accessibility for users who have difficulty writing equations, including those with physical challenges or learning difficulties. The system encourages active participation: since users interact with the screen using their hands, it feels more engaging than just typing or reading. This solution bridges the gap between physical interaction and digital learning, offering a modern, gesture-based approach to solving math problems in real time.
-
LITERATURE SURVEY
AI-Powered Gesture-Controlled Interactive System for Real-Time Mathematical Problem Solving and Visualization (Dr. Saravanan G et al., 2024).
The authors present an AI-based gesture-controlled system that responds to the user's hand motion and solves computer-generated mathematical problems without any need for a mouse or keyboard, or even for the user to leave their seat. Dubbed Math Vision, the system uses OpenCV for detecting gestures and Google Gemini LLM technology for interpreting math equations, on top of a relatively straightforward interface. A webcam or other camera captures the hand movements with which the user draws equations; these are converted into a usable format that can be scrolled, deleted, and confirmed interactively through specific hand gestures. A trained model with custom mathematical layers detects finger gestures while the user writes math expressions. Math Vision serves as an educational software product, adding interactivity to a general problem-solving process in which students receive problems and can either solve mathematical exercises by hand or type their solutions.
Gestures, systemic functional linguistics and mathematics education (Farsani et al., 2022).
Unexpected occurrences can sometimes emerge during the research process. In our study, we encountered such a case when the same gesture was observed across three separate mathematics lessons, each focused on different topics and conducted in different parts of the world. This coincidence led us to reflect on whether the gesture served a similar role in each context, suggesting that it might be part of a broader global classroom culture in mathematics. While gestures have been extensively studied in mathematics education (Sfard, 2009), most of the existing literature centres on how gestures function within specific classroom interactions (Simpson & Cole, 2015), rather than how they might operate across diverse cultural settings. One possible reason for this gap in research is the absence of analytical frameworks capable of identifying gesture functions both within immediate communicative contexts and within broader sociocultural frameworks. In this article, we draw on Halliday's Systemic Functional Linguistics (SFL) to explore and compare the roles that this gesture played across different classroom settings.
Real-Time Math Solver Using Hand Gestures with AI (Radford, 2003).
Handwritten gesture recognition plays a vital role in AI and ML, enabling machines to interpret and process human gestures, specifically those made by hand, including characters, symbols, and shapes. Recent innovations in AI and ML have significantly improved the ability to accurately identify complex handwritten gestures, making this technology valuable for various applications, including educational tools, digital note-taking, and mathematical problem-solving. Handwritten gesture recognition is a specialized area within handwriting character recognition, valuable for science and education as these symbols often appear in mathematical formulas and equations. Researchers have been working on methods to automatically recognize mathematical symbols for over fifty years.
The Roles of Mathematical Metaphors and Gestures in the Understanding of Abstract Mathematical Concepts (Khatin-Zadeh et al., 2023).
Students often struggle to understand new mathematical concepts when they are introduced solely through abstract symbols. This is because such symbols typically lack a direct connection to tangible or easily observable objects. However, when the same ideas are conveyed through a graph, or through gestures that visually represent the graph, they tend to be more accessible and easier to comprehend. Visual tools can also support the process of working through complex mathematical problems. Converting mathematical ideas or problems into graphical representations is a widely used strategy in problem-solving. This can be considered a form of mathematical metaphor, where the abstract problem is interpreted through a visual lens. Moreover, because these graphical forms are inherently visual, they can be naturally illustrated using gestures.
-
AI AND INNOVATION
Artificial Intelligence plays a key role in making this project innovative and impactful. By using AI models, we are able to recognize hand gestures in real time and translate them into mathematical expressions. This removes the need for traditional input methods like keyboards or touchscreens. The system uses MediaPipe to detect hand landmarks and Google's Gemini API to understand and solve the math problems drawn by the user. This combination of gesture recognition and AI problem-solving introduces a new, creative way of learning. It allows users to interact with technology in a more natural way and makes math more approachable. The innovation lies in how we bring together computer vision, AI, and user interaction to create a smart and accessible learning tool.
-
METHODOLOGY
The following steps describe the methodology used in developing the gesture-based math solving system:
Dataset Setup: Collected hand gesture datasets representing numbers and mathematical symbols. Annotated and labelled the gestures based on their intended math operations (e.g., 1–9, +, −, ×, ÷). Public datasets and custom hand gestures were combined to ensure accuracy and diversity.
Preprocessing: Preprocessing was performed on each frame of the video input. This included resizing images, normalizing pixel values, and applying necessary filters to enhance gesture visibility and accuracy. Background noise was reduced to make gesture recognition more precise.
Gesture Detection Using MediaPipe: Implemented MediaPipe Hands to detect and track hand landmarks in real time. The framework identifies 21 hand key points, which are used to analyze the gesture and classify it as a number or symbol.
Gesture Classification: The gesture landmarks were mapped to a trained gesture classification model. Based on the hand's position and movement, the system recognizes the intended math input. Gestures are recorded in sequence to form a full mathematical expression.
Math Solving Using Gemini API: Once the gesture-based expression is formed, it is sent to Google's Gemini API, which processes the input and returns the solved result. This step allows AI to handle even complex calculations seamlessly.
Real-Time Integration with Streamlit: The entire process, from video input to result display, was built into a Streamlit app. Users can view their gestures being recognized live, the math expression formed, and the solution displayed instantly.
Testing and Evaluation: The system was tested with different lighting conditions, hand sizes, and backgrounds. Accuracy of gesture recognition and correct solving
Error Handling and Feedback: Implemented a feedback mechanism to notify users when gestures are unclear or unrecognized. Suggestions or re-attempt prompts are given to enhance user interaction and reduce error rates.
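The landmark-based part of the steps above can be illustrated with a small sketch. It is not the classification model used in this work; it is a common rule-based heuristic, assumed here for illustration, that derives which fingers are raised from the 21 MediaPipe Hands landmarks. The function name `fingers_up` and the thumb rule are assumptions; the landmark indices follow the MediaPipe Hands numbering.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]  # normalized (x, y); y grows downward in image coordinates

# MediaPipe Hands landmark indices.
FINGER_TIPS = (8, 12, 16, 20)  # index, middle, ring, pinky fingertips
FINGER_PIPS = (6, 10, 14, 18)  # the PIP joint of each of those fingers
THUMB_TIP, THUMB_IP, PINKY_MCP = 4, 3, 17


def _distance(a: Point, b: Point) -> float:
    return math.hypot(a[0] - b[0], a[1] - b[1])


def fingers_up(landmarks: Sequence[Point]) -> List[int]:
    """Return [thumb, index, middle, ring, pinky] with 1 for a raised finger.

    Assumes an upright hand facing the camera (heuristic, not a trained model).
    """
    if len(landmarks) != 21:
        raise ValueError("expected 21 hand landmarks")

    # Thumb (assumed rule): extended when its tip lies farther from the pinky
    # knuckle than its IP joint does, which works for either hand.
    anchor = landmarks[PINKY_MCP]
    thumb = int(_distance(landmarks[THUMB_TIP], anchor)
                > _distance(landmarks[THUMB_IP], anchor))

    # Other fingers: raised when the tip is above (smaller y than) the PIP joint.
    others = [int(landmarks[tip][1] < landmarks[pip][1])
              for tip, pip in zip(FINGER_TIPS, FINGER_PIPS)]
    return [thumb] + others
```

A finger-state vector such as `[0, 1, 0, 0, 0]` (index finger only) can then serve as a compact feature for mapping a hand pose to a number, a symbol, or a control gesture.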
-
INSIGHT INTO AI AND GESTURE RECOGNITION
The project leverages MediaPipe and Google's Gemini API, integrating gesture recognition with advanced artificial intelligence to interpret and solve mathematical expressions in real time. MediaPipe is a framework developed by Google that detects and tracks hand landmarks using deep learning models. It identifies 21 key hand points from live video frames, which serve as the core input for gesture analysis.
MediaPipe Hands uses a pipeline of neural networks trained to recognize hand shapes and orientations. These landmarks help classify specific gestures such as numbers and operators. By observing hand shapes and finger positions, the system can identify intended math symbols (e.g., 2, +, =). The Gemini API, backed by a large language model, is used to process the interpreted expression and provide a solution. Once a full expression is formed using gestures, the API solves the problem, allowing for flexible and accurate mathematical output. The gesture recognition system uses a combination of Convolutional Neural Networks (CNNs) and landmark mapping for accuracy. It handles variations in hand size, lighting, and orientation using real-time adjustments and filtering. To improve performance, we applied data normalization, dynamic gesture tracking, and threshold-based recognition. In real-time use, the system processes video frames continuously, detects hand gestures using MediaPipe, converts them into math expressions, and then calls the Gemini API for solving. This process creates an intuitive and fast way to interact with math using only hand movements, with no need for typing or writing.
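The text does not specify how threshold-based recognition is implemented. One plausible reading, sketched below under that assumption, is a frame-count threshold: a gesture is accepted only after it has been observed in several consecutive frames, which suppresses single-frame misdetections. The class name `StableGesture` and the default of five frames are hypothetical.

```python
from typing import Hashable, Optional


class StableGesture:
    """Report a gesture once it has persisted for `hold_frames` consecutive frames."""

    def __init__(self, hold_frames: int = 5) -> None:
        if hold_frames < 1:
            raise ValueError("hold_frames must be at least 1")
        self.hold_frames = hold_frames
        self._last: Optional[Hashable] = None
        self._count = 0
        self._fired = False

    def update(self, gesture: Optional[Hashable]) -> Optional[Hashable]:
        """Feed the gesture seen in the current frame (None if no gesture).

        Returns the gesture exactly once, on the frame where it reaches the
        threshold, and None on every other frame.
        """
        if gesture != self._last:
            # A different gesture restarts the count.
            self._last = gesture
            self._count = 1
            self._fired = False
        else:
            self._count += 1

        if gesture is not None and not self._fired and self._count >= self.hold_frames:
            self._fired = True
            return gesture
        return None
```

Firing only once per held gesture matters for one-shot commands such as clearing the canvas or submitting a problem, which should not repeat on every frame while the hand stays in place.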
The innovation lies in combining gesture detection with AI-powered problem-solving, making math learning more accessible and interactive, especially for students and users with physical or learning challenges.
-
WORKFLOW OF AI-BASED MATH LEARNING THROUGH HAND GESTURES
This diagram represents the core workflow of the Math with Gestures using AI system. It showcases the step-by-step process through which mathematical expressions are understood and solved using hand gestures and AI integration.
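As a rough illustration of the capture-and-detect front end of this workflow, the sketch below reads webcam frames with OpenCV and runs MediaPipe Hands on each one. It assumes the `opencv-python` and `mediapipe` packages and MediaPipe's `solutions` interface; the function names, the camera index, and the confidence value are illustrative choices, not settings reported in this paper.

```python
from typing import Iterator, List, Optional, Tuple

Point = Tuple[float, float]


def to_points(hand_landmarks) -> List[Point]:
    """Convert one MediaPipe hand result into a plain list of (x, y) pairs."""
    return [(lm.x, lm.y) for lm in hand_landmarks.landmark]


def camera_hand_stream(camera_index: int = 0) -> Iterator[Tuple[object, Optional[List[Point]]]]:
    """Yield (frame, landmarks) per webcam frame; landmarks is None if no hand is seen."""
    # Imported lazily so to_points() can be used without a camera or these packages.
    import cv2
    import mediapipe as mp

    capture = cv2.VideoCapture(camera_index)
    hands = mp.solutions.hands.Hands(max_num_hands=1, min_detection_confidence=0.7)
    try:
        while capture.isOpened():
            ok, frame = capture.read()
            if not ok:
                break
            frame = cv2.flip(frame, 1)  # mirror so on-screen motion matches the hand
            # MediaPipe expects RGB input, while OpenCV delivers BGR frames.
            result = hands.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
            detected = result.multi_hand_landmarks
            yield frame, (to_points(detected[0]) if detected else None)
    finally:
        hands.close()
        capture.release()
```

Each yielded landmark list can be passed on to gesture classification, and the resulting expression or drawing to the solver, matching the order of stages described above.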
-
RESULTS
When you open Math with Gestures using AI, you see a simple and user-friendly interface that helps you get started. It has two buttons to turn the camera on or off and a switch to start gesture recognition.
Fig 1. Home Page
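A home page with these controls could be laid out in Streamlit roughly as follows. This is a minimal sketch, not the authors' code: the widget labels, the `camera_on` session key, and both function names are assumptions, and `st.toggle` requires a reasonably recent Streamlit release.

```python
def next_camera_state(is_on: bool, on_clicked: bool, off_clicked: bool) -> bool:
    """Return the new camera state; "off" wins if both buttons report a click."""
    if off_clicked:
        return False
    if on_clicked:
        return True
    return is_on


def render_home_page() -> bool:
    """Draw the controls and return True when frames should be processed."""
    import streamlit as st  # imported lazily; the helper above has no dependencies

    st.title("Math with Gestures using AI")
    if "camera_on" not in st.session_state:
        st.session_state["camera_on"] = False

    # Two buttons side by side for the camera, as on the home page.
    left, right = st.columns(2)
    on_clicked = left.button("Turn camera on")
    off_clicked = right.button("Turn camera off")
    st.session_state["camera_on"] = next_camera_state(
        st.session_state["camera_on"], on_clicked, off_clicked
    )

    # A switch that enables gesture recognition.
    recognising = st.toggle("Start gesture recognition")
    return st.session_state["camera_on"] and recognising
```

Calling `render_home_page()` at the bottom of a script launched with `streamlit run` would render the page; the camera state is kept in `st.session_state` because Streamlit re-executes the script on every interaction.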
In Math with Gestures using AI, users can draw math problems on the screen using a single finger as a virtual pen. The system tracks the finger's movement in real time, allowing for smooth and natural writing.
Fig 2. Drawing Math with Finger
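One way to turn the tracked fingertip into a drawing is to convert the normalized index-fingertip landmark (index 8 in MediaPipe Hands) to pixel coordinates and join successive positions with line segments. The sketch below assumes this approach; `StrokeCanvas` and `to_pixel` are hypothetical names.

```python
from typing import List, Optional, Tuple

Pixel = Tuple[int, int]
Segment = Tuple[Pixel, Pixel]


def to_pixel(point: Tuple[float, float], width: int, height: int) -> Pixel:
    """Map a normalized (x, y) landmark to pixel coordinates, clamped to the frame."""
    x = min(max(point[0], 0.0), 1.0)
    y = min(max(point[1], 0.0), 1.0)
    return int(x * (width - 1)), int(y * (height - 1))


class StrokeCanvas:
    """Collects the line segments traced by the fingertip."""

    def __init__(self, width: int, height: int) -> None:
        self.width = width
        self.height = height
        self.segments: List[Segment] = []
        self._previous: Optional[Pixel] = None

    def pen_down(self, fingertip: Tuple[float, float]) -> None:
        """Extend the current stroke to the fingertip position of this frame."""
        current = to_pixel(fingertip, self.width, self.height)
        if self._previous is not None and current != self._previous:
            self.segments.append((self._previous, current))
        self._previous = current

    def pen_up(self) -> None:
        """End the stroke so the next pen_down starts a new, unconnected one."""
        self._previous = None

    def clear(self) -> None:
        """Erase everything drawn so far."""
        self.segments.clear()
        self._previous = None
```

Each stored segment can be rendered onto the video frame with `cv2.line`. Calling `pen_up()` whenever the drawing gesture is not detected keeps separate symbols from being joined by stray lines.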
In Math with Gestures using AI, you can erase your input and start fresh by pointing your thumb to the left. This makes it easy to clear mistakes and enter a new math problem effortlessly.
Fig 3. Clear Math Input Gesture
By showing five fingers to the camera, the user sends the written problem to the Google Gemini API for processing. The AI then analyzes it and returns the solution.
Fig 4. Send Input for AI Processing
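The three control gestures described so far (one finger to draw, thumb to the left to clear, five fingers to send) can be expressed as a small dispatch function over per-finger states. This is a sketch under assumptions: the finger order is thumb, index, middle, ring, pinky; "left" is taken to mean a smaller x coordinate in the displayed image; and the name `classify_action` and the action labels are illustrative.

```python
from typing import Optional, Sequence, Tuple

Point = Tuple[float, float]  # normalized (x, y) in the displayed image


def classify_action(
    fingers: Sequence[int], thumb_tip: Point, thumb_base: Point
) -> Optional[str]:
    """Map finger states to "draw", "clear", "send", or None.

    `fingers` is [thumb, index, middle, ring, pinky] with 1 for a raised finger.
    """
    state = list(fingers)
    if state == [0, 1, 0, 0, 0]:
        return "draw"  # index finger alone acts as the virtual pen
    if state == [1, 1, 1, 1, 1]:
        return "send"  # open hand submits the drawing
    if state == [1, 0, 0, 0, 0] and thumb_tip[0] < thumb_base[0]:
        return "clear"  # thumb alone, pointing left (assumed: smaller x)
    return None
```

Returning `None` for every other pose leaves room for the feedback step in the methodology, where the user is told that a gesture was not recognized.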
Once the problem is sent to the AI, it quickly processes the input and displays the solution on the screen. This makes solving math problems effortless and interactive.
Fig 5. AI-Generated Math Solution Display
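The request to Gemini might look like the sketch below, which sends the drawn canvas as an image together with a text prompt. It assumes the `google-generativeai` Python package; the prompt wording, the model name `gemini-1.5-flash`, and the `GEMINI_API_KEY` environment variable are assumptions rather than details taken from this paper.

```python
import os
from typing import Callable, List, Optional

PROMPT = (
    "Solve the handwritten math problem in this image. "
    "State the final answer first, then the main steps."
)


def gemini_generate(parts: List[object], model_name: str = "gemini-1.5-flash") -> str:
    """Send the prompt and image to Gemini and return the reply text."""
    # Imported lazily so the rest of the module works without the SDK installed.
    import google.generativeai as genai

    genai.configure(api_key=os.environ["GEMINI_API_KEY"])  # assumed variable name
    model = genai.GenerativeModel(model_name)
    return model.generate_content(parts).text


def solve_drawing(image: object, generate: Optional[Callable[[List[object]], str]] = None) -> str:
    """Return the solution text for a drawn problem.

    `image` is the canvas, for example a PIL image. `generate` can be replaced
    with a stub so the surrounding application can be tested offline.
    """
    request = [PROMPT, image]
    answer = (generate or gemini_generate)(request)
    return answer.strip()
```

For example, `solve_drawing(canvas_image)` would be called once when the send gesture fires, and the returned text shown in the interface. Injecting `generate` keeps the network call out of unit tests.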
-
CONCLUSION
To sum up, the Math Vision project effectively combines AI-powered equation solving with gesture recognition, offering a fresh and engaging method of teaching mathematics. The technology improves learning by allowing users to sketch equations in real time and get immediate solutions using computer vision and AI algorithms. The integration of digital computing and physical gestures creates a link between conventional problem-solving techniques and contemporary technology. This method not only increases interest but also facilitates greater understanding of mathematical ideas. Math Vision can be made even more useful for professionals, students, and teachers in the future by expanding its equation-solving capabilities and improving gesture detection.
-
REFERENCES
-
Dr. Saravanan G, Senthil Kumar M, Abiselvam B, Sneha S, Revathi S, & Vinoth V. (2024). AI-Powered Gesture-Controlled Interactive System for Real-Time Mathematical Problem Solving and Visualization. International Research Journal on Advanced Engineering Hub (IRJAEH), 2(12), 2722–2728. https://doi.org/10.47392/IRJAEH.2024.0376
-
Farsani, D., Lange, T., & Meaney, T. (2022). Gestures, systemic functional linguistics and mathematics education. Mind, Culture, and Activity, 29(1), 75–95. https://doi.org/10.1080/10749039.2022.2060260
-
Khatin-Zadeh, O., Eskandari, Z., & Farsani, D. (2023). The Roles of Mathematical Metaphors and Gestures in the Understanding of Abstract Mathematical Concepts. Journal of Humanistic Mathematics, 13(1), 36–53. https://doi.org/10.5642/jhummath.BZXW2115
-
Radford, L. (2003). Gestures, Speech, and the Sprouting of Signs: A Semiotic-Cultural Approach to Students' Types of Generalization. Mathematical Thinking and Learning, 5(1), 37–70. https://doi.org/10.1207/S15327833MTL0501_02