Global Knowledge Platform
Serving Researchers Since 2012

Bot Detection Without CAPTCHA

DOI : https://doi.org/10.5281/zenodo.19402027
Download Full-Text PDF Cite this Publication

Text Only Version

 

Bot Detection Without CAPTCHA

Shravani Mayekar

Dept. of Artificial Intelligence & Data Science, A. C. Patil College of Engineering, Navi-Mumbai, India

Krutadnya Mhatre

Dept. of Artificial Intelligence & Data Science, A. C. Patil College of Engineering, Navi-Mumbai, India

Reva Pakhale

Dept. of Artificial Intelligence & Data Science, A. C. Patil College of Engineering, Navi-Mumbai, India

Saniya Wagh

Dept. of Artificial Intelligence & Data Science, A. C. Patil College of Engineering Navi-Mumbai, India

Jayaprabha Terdale

Dept. of Artificial Intelligence & Data Science, A. C. Patil College of Engineering Navi-Mumbai, India

Abstract – This project proposes a CAPTCHA-free, passive bot detection system that runs in the background without disrupting users. It analyzes behavioral and environmental datasuch as mouse movements, keystrokes, navigation patterns, and device fingerprintsusing machine learning to distinguish humans from bots. The system adapts to evolving attacks, improves security, and enhances accessibility and user experience by eliminating CAPTCHAs. The system passively monitors user behavior to detect malicious bots without interrupting genuine users. It uses machine learning models trained on interaction patterns to ensure high detection accuracy. This approach improves security while maintaining accessibility and a smooth user experience.

Keywords – Passive Bot Detection, CAPTCHA, Mouse Movement Analysis, Keystroke Dynamics, Device Fingerprinting, HumanBot Classification.

  1. INTRODUCTION

    The rapid growth of online platforms has led to a significant increase in automated bots that perform malicious activities such as data scraping, fraud, spamming, and account abuse.[2] These bots negatively affect system performance, compromise data integrity, and reduce user trust. As bots become more sophisticated and capable of mimicking human behavior, detecting them has become increasingly challenging for modern web applications.[4]

    Traditional bot prevention techniques, particularly CAPTCHA-based mechanisms, introduce usability and accessibility issues while offering diminishing effectiveness.[1], [3] CAPTCHAs disrupt the user experience, create barriers for users with disabilities, and can now be bypassed by advanced AI-driven bots.[5] As a result, reliance on explicit user challenges is no longer sufficient to ensure robust protection against automated attacks.[6], [7]. CAPTCHA (Completely Automated Public Turing test to tell Computers and Humans Apart) systems have long been used as a primary defense mechanism against automated bots and malicious activities on the web [9]. Over time, various CAPTCHA schemes have been proposed and analyzed, highlighting both their effectiveness and limitations in real- world applications [8].

    With the rapid advancement of machine learning and deep learning techniques, traditional CAPTCHA systems have become increasingly vulnerable to automated attacks. Several studies demonstrate that text-based and image-based CAPTCHAs can be broken using sophisticated deep learning models, raising concerns about their long-term reliability [13], [15], [16]. Additionally, new CAPTCHA designs such as motion-based and emerging image-based schemes have been explored; however, these too exhibit vulnerabilities that can be exploited by attackers [14], [19].

    Human interaction behavior has also been studied as an alternative approach for bot detection. In particular, keystroke dynamics and behavioral biometrics have gained attention as potential methods for distinguishing humans from bots [11], [12]. Recent advances in deep learning have further enhanced the effectiveness of keystroke-based biometric systems, enabling more accurate and continuous authentication mechanisms [10], [17], [18]. Moreover, large-scale evaluations of CAPTCHA usability reveal that while CAPTCHAs aim to differentiate humans from machines, they often introduce usability challenges for legitimate users [20]. These limitations have motivated researchers to explore alternative and more user-friendly authentication mechanisms that balance security and usability.

    To overcome these challenges, a CAPTCHA-free, passive bot detection system is proposed that analyzes environmental and behavioral data such as mouse movements, keystroke dynamics, browsing patterns, and device fingerprints. Machine learning models process these signals to accurately distinguish humans from bots while operating unobtrusively in the background. This approach enhances security, improves accessibility, and delivers a frictionless user experience suitable for modern web applications.

  2. RELATED WORK
    1. Literature Survey Summary

      Recent research has explored several approaches to improve bot detection and strengthen CAPTCHA-based

      security systems. The study No Bot Expects the DeepCAPTCHA! Introducing Immutable Adversarial Examples, With Applications to CAPTCHA Generation (2022) proposes a CAPTCHA generation method using Immutable Adversarial Noise (IAN). This technique creates adversarial examples that can confuse deep learning models while remaining recognizable to humans. The approach improves resistance against machine learningbased CAPTCHA solvers and enables the generation of many unique challenges. However, it still remains vulnerable to relay attacks where human solvers complete CAPTCHAs, and generating adversarial samples at scale requires significant computational resources.[1]

      Another work, Creating a Bot-tleneck for Malicious AI: Psychological Methods for Bot Detection (2024) introduces a bot detection mechanism based on psychological screening questions. The system leverages human cognitive abilities such as contextual understanding, commonsense reasoning, and associative thinking to differentiate humans from automated systems. While the approach is scalable and more robust than traditional CAPTCHAs, it may produce false positives, particularly for inattentive users or non- native speakers, and may become less effective as AI language models improve.[2]

      The research An Empirical Evaluation and New Design for CAPTCHA (2023) evaluates existing CAPTCHA schemes, including text-based, image-based, and reCAPTCHA systems, and proposes a new CAPTCHA design that improves usability while maintaining resistance to automated attacks. The new design allows faster and easier interaction for human users while enhancing security. However, the approach still relies on explicit CAPTCHA challenges, which introduce friction in user experience and remain vulnerable to relay attacks.[3]

      Similarly, Implementation of CAPTCHA Mechanisms using Deep Learning to Prevent Automated Bot Attacks (2024) explores the use of deep learning techniques for CAPTCHA systems. The study employs Convolutional Neural Networks (CNNs) for image-based CAPTCHA recognition and Recurrent Neural Networks (RNNs) for sequence-based CAPTCHA handling. This approach improves security and adaptability, making CAPTCHA systems harder for automated bots to bypass. However, integrating deep learning models increases implementation complexity and requires continuous updates to address evolving AI-based attack methods.[4]

      Similarly, BeCAPTCHA-Type: Biometric Keystroke Data Generation for Improved Bot Detection (2023) focuses on enhancing bot detection by modeling natural typing behavior. Using a Kernel Density Estimator (KDE), the researchers developed a system that creates models of how users type across various keys. The primarybenefit of this approach is the implementation of passive CAPTCHA, which allows for bot detection based on natural interactions without forcing users to solve frustrating manual challenges. However, the paper identifies a significant gap: because the models are

      often universal or user-dependent, they occasionally produce unnatural samples because they fail to account for the complex dependencies between specific keys and individual user quirks.[5]

      BeCAPTCHA-Mouse: Synthetic Mouse Trajectories and Improved Bot Detection (2021) explores the generation of realistic mouse movements to train better detection systems. The methodology involves two distinct paths: a Function- based Method using mathematical profiles (like linear and exponential trajectories) and a GAN-based Method that uses Generative Adversarial Networks to synthesize human-like movements from noise. While the system achieved a high accuracy of 98.7% and significantly outperformed prior methods by incorporating neuromotor features, it remains limited to desktop mouse dynamics. Furthermore, the GAN- generated data still contains detectable patterns, meaning the synthetic movements are not yet perfectly indistinguishable from real human behavior.[6]

      BeCAPTCHA: Behavioral Bot Detection using Touchscreen and Mobile Sensors (2021) shifts the focus to mobile platforms, this research utilizes Support Vector Machines (SVM) to classify users based on behavioral patterns captured by touchscreen and accelerometer data. By combining these sensor inputs, the system achieves high accuracy without requiring cognitive tasks or distorted images. The main drawback noted is its limited scope; the study only evaluates swipe (drag-and-drop) gestures, leaving a wide range of other mobile interactions unexplored. Additionally, there is a risk of overfitting, where classifiers trained on one specific synthetic generation method may struggle to detect bots created using different, unseen techniques.[7]

      Overall, existing solutions focus primarily on improving CAPTCHA robustness through adversarial examples, cognitive challenges, or deep learningbased mechanisms. While these approaches strengthen security, they still depend on interactive CAPTCHA challenges, which interrupt user workflows and may affect accessibility. Furthermore, issues such as relay attacks, computational overhead, and evolving AI capabilities remain unresolved. Therefore, there is a growing need for a CAPTCHA-free passive bot detection system that can analyze behavioral and environmental signalssuch as mouse movements, keystroke dynamics, navigation patterns, and device fingerprintsto accurately distinguish between human users and automated bots without disrupting the user experience.

    2. Research Gap

    CAPTCHAs are widely deployed but increasingly ineffective: AI models can solve text, image, and audio CAPTCHAs with high accuracy. Psychological tests (free- response, reasoning tasks) are effective but labor-intensive and slow. Deep learning CAPTCHAs improve robustness but still impose usability barriers for humans. Need for non- intrusive CAPTCHA-free solutions that detect bots in the background.

  3. PROPOSED SYSTEM

    The proposed system introduces a CAPTCHA-free, passive bot detection framework designed to provide strong security while preserving a smooth and frictionless user experience. Unlike traditional CAPTCHA-based methods that interrupt users, this system operates silently in the background by continuously observing user behavior, environmental signals, and cognitive cues to determine whether interactions originate from humans or automated bots.

    The framework captures frontend behavioral data such as keystroke dynamics, mouse movement patterns, and scrolling behavior, which naturally differ between humans and bots. Human interactions are typically irregular, inconsistent, and context-driven, whereas bots exhibit uniform, mechanical patterns. These behavioral signals form a rich source of information for identifying authentic users without requiring explicit challenges.

    In addition to behavioral data, the system analyzes environmental and device fingerprints, including browser configurations, network characteristics, and IP-level attributes, to detect anomalies commonly associated with automated traffic. Cognitive and semantic cuessuch as contextual consistency of user inputsare also evaluated to identify illogical or synthetic responses. Together, these layers enhance detection accuracy while reducing false positives.

    All collected features are processed using machine learning classifiers, with an ensemble approach combining multiple models to improve robustness and adaptability. The system generates a real-time risk score that can be used to allow access, flag suspicious activity, or trigger secondary verification. By leveraging behavioral analytics, device intelligence, and adaptive machine learning, the proposed solution delivers effective bot detection without compromising usability, accessibility, or user satisfaction.

    1. Algorithm:

      Step 1 Collect Data:

      Track user actions in the background without interrupting them:

      Keystroke speed and patterns, Mouse movements, Scrolling behavior.

      Gather device and network information:

      Browser type, screen size, timezone, IP address, location, network type.

      Step 2 Extract Features

      Turn raw data into measurable features, e.g.: Average typing speed, Mouse movement patterns.

      Step 3 Train Model

      Use labeled examples of human and bot sessions.

      Train a machine learning model to detect patterns that distinguish humans from bots.

      Step 4 Decision Making

      If score indicates human allow access.

      If score indicates bot block or limit actions.

  4. SYSTEM ARCHITECTURE

    Fig 1-Architecture

    Our proposed ensemble for the CAPTCHA-free passive bot detection system includes the following models as mentioned in the architecture:

      1. Random Forest Environmental & Structured Data Analysis: Random Forest is an ensemble of decision trees that analyzes structured data such as IP addresses, device fingerprints, and browser settings. Each decision tree makes an independent prediction (bot or human), and the forest aggregates these results through majority voting. In the context of bot detection: Repeated appearances of the same IP across multiple sessions may indicate automated activity, Identical browser settings across many sessions can suggest the presence of bots, Device fingerprint anomalies (screen size, plugins, fonts, timezone mismatches) can flag suspicious users.
      2. LSTM (Long Short-Term Memory) Behavioral Time-Series Analysis: LSTM is a type of Recurrent Neural Network (RNN) capable of learning from sequential and time- based patterns, making it ideal for analyzing keystroke dynamics. In bot detection: Human typing behavior exhibits small, irregular pauses between keystrokes (e.g., he[50ms] ll[70ms] o), Bots tend to type with unnatural precision and little-to-no delay, LSTM learns these timing patterns and detects unnatural sequences indicative of automation.
      3. CNN (Convolutional Neural Network) Behavioral Gesture Analysis: CNNs are deep learning models designed to identify shapes and patterns. They are effective for processing mouse movement paths and gesture data. In bot detection: Human mouse movements are irregular, curved, and include hesitations, Bot movements are often straight, precise, and

        uniform, CNNs classify movement patterns to differentiate human-like behavior from automated scripts.

      4. N-gram: An N-gram model is a probabilistic Natural Language Processing (NLP) approach that analyzes text based on the likelihood of word sequences. It evaluates how frequently certain combinations of words (bigrams, trigrams, etc. occur in natural language. In bot detection: Human- generated text tends to follow common and statistically probable word patterns (e.g., The ice sculpture will melt in the sun, where word transitions like will melt and in the sun are highly probable). Bots, on the other hand, may generate unusual or low-probability word sequences (e.g., Snow hot day bright energy foot surface), where adjacent words rarely co-occur in real language usage.

        Integration of Ensemble Predictions

        The outputs from these individual models are combined using a decision-level fusion strategy, resulting in a final bot detection score. This score determines whether the session is classified as human or bot. The ensemble approach provides enhanced accuracy by leveraging complementary strengths: environmental analysis, behavioral pattern recognition, gesture classification, and semantic understanding. This multi-model design ensures that our bot detection system is robust, adaptive, and capable of detecting sophisticated automated threats without disrupting the user experience.

        Fig 2- System Workflow Step-by-Step Flow:

        User Frontend Data Collector- The user interacts with the interface (e.g., moving mouse, clicking buttons, typing). These actions generate raw behavioral and device signals

        Frontend Data Collector Bot Detection Engine- The frontend component sends a feature vector to the detection system.

        Bot Detection Engine- Stores raw features temporarily for analysis or model retraining. The ML Model processes these features and computes a bot probability score.

        Prediction Decision- Based on a predefined threshold, the engine classifies the user as Human or Bot.

        Bot Detection Engine Frontend Data Collector- The decision (human or bot) is sent back to the frontend, which can then take action like: Allowing access (for human users), Showing an access denied or retry message (for bots).

  5. RESULTS AND DISCUSSION

The proposed CAPTCHA-free passive bot detection system was evaluated to analyze its effectiveness in distinguishing between human users and automated bots. The system utilizes frontend environmental data and a backend machine learning model for classification.

Fig 3.1: Data collection interface capturing user interaction parameters.

Fig 3.2: Data collection interface capturing user interaction parameters.

The system collects passive signals such as mouse movement, typing patterns, and device-related attributes without requiring explicit user input.

Fig 4: Backend classification result showing human/bot prediction.

The machine learning model processes the captured data and classifies the user as either a human or a bot based on learned behavioral patterns.

A. Model Evaluation.

Fig 5: Confusion matrix representing classification results of the proposed system.

Fig 5: Performance evaluation of the proposed CAPTCHA-

free bot detection system.

The evaluation results indicate that the proposed system achieves an accuracy of 75%, with a high precision of 90% and a recall of 52%. The high precision suggests that the model is effective in correctly identifying legitimate users, minimizing false positives. However, the relatively lower recall indicates that a significant number of bots are not detected, leading to false negatives. This behavior is further supported by the confusion matrix, where 221 bot instances were misclassified as human users. Despite this limitation, the system demonstrates strong potential as a user-friendly alternative to traditional CAPTCHA systems by reducing user friction while maintaining reasonable detection performance. Future improvements can focus on enhancing recall by incorporating more diverse behavioral features and improving model generalization.

  1. Comparative Analysis with Existing CAPTCHA Systems

The proposed CAPTCHA-free passive bot detection system is compared with widely used CAPTCHA mechanisms, including text-based, image-based, and Google reCAPTCHA systems. The comparison is based on performance metrics, usability, and user experience, supported by findings from existing literature.

Table 1: Performance Comparison of Proposed System with Existing CAPTCHA Methods

Parameter Text-Based CAPTCHA Image-Based CAPTCHA reCAPTCHA Proposed System
Accuracy 5084% 7181% 7185% 75%
Response Time 915 sec 1532 sec 35 sec ~2 sec
User Effort High High Medium None

As shown in Table 1, the proposed system achieves moderate to high accuracy compared to traditional methods.

Table 2: Usability and User Experience Comparison between Traditional CAPTCHA and Proposed System

Feature Traditional CAPTCHA Proposed System
User Interaction Required Not Required
Task Completion Time Higher Lower
User Frustration High Minimal

Table 2 highlights the improved user experience of the proposed approach.

Traditional CAPTCHA systems prioritize security at the cost of usability, whereas the proposed system prioritizes user experience while maintaining acceptable security performance.

CONCLUSION

This paper presented a CAPTCHA-free passive bot detection system designed to enhance both security and user experience in web-based applications. Unlike traditional CAPTCHA mechanisms such as text-based, image-based, and Google reCAPTCHA, the proposed approach eliminates the need for explicit user interaction by leveraging behavioral and environmental data for classification.

The experimental results demonstrate that the system achieves an accuracy of 75%, with a high precision of 90%, indicating its effectiveness in correctly identifying legitimate users while minimizing false positives. Although the recall of 52% suggests that some bot instances remain undetected, the system maintains a practical balance between detection capability and usability. A comparative analysis with existing CAPTCHA systems highlights the key advantage of the proposed approach in significantly improving user experience by reducing interaction time and eliminating interruptions. Traditional CAPTCHA systems, while effective in bot detection, introduce delays, accessibility issues, and user frustration, which can negatively impact task completion rates.

Overall, the proposed system offers a promising alternative to conventional CAPTCHA-based authentication by prioritizing usability without completely compromising security. This makes it particularly suitable for modern applications where seamless user interaction is critical.

Future work will focus on improving recall and overall detection performance by incorporating more advanced behavioral features, expanding training datasets, and exploring adaptive machine learning techniques to better detect sophisticated bots.

ACKNOWLEDGMENT

I would like to express my sincere gratitude to my guide, Prof. (Mrs.) Jaya Terdale, for her valuable guidance, constant support, and insightful feedback throughout this project. I also thank the Head of the Department, Prof. S. P. Bansu, and the faculty of the AI-DS department at A.C. Patil College of Engineering for providing the necessary resources and a supportive learning environment. I am grateful to my friends and peers for their encouragement and cooperation, and to my family for their continuous motivation and unwavering support. This project has been a valuable learnin experience, and I am thankful to everyone who contributed to its successful completion.

REFERENCES

  1. Alejandro Acien, Aythami Morales , Julian Fierrez, Ruben Vera- Rodriguez. BeCAPTCHA-Type: Biometric Keystroke Data Generation for Improved Bot Detec tion., arXiv preprint arXiv:2207.13394, 2023.
  2. Alejandro Acien, Aythami Morales, Julian Fierrez, Ruben Vera- Rodriguez, Oscar Delgado-Mohatar. BeCAPTCHA-Mouse: Synthetic Mouse Trajectories and Improved Bot Detection., arXiv preprint arXiv:2005.00890, 2021.
  3. Daniel DeAlcala1, Aythami Morales1, Ruben Tolosana1, Alejandro Acien1, Julian Fierrez1, Santiago Hernandez1, Miguel A. Ferrer2, Moises Diaz2 BeCAPTCHA: Be havioral Bot Detection Using Touchscreen and Mobile Sensors Benchmark on Hu MIdb.,

    Proceedings of the International Conference on Biometrics (ICB), Springer LNCS, 2020.

  4. Prof. (Dr.) Sachin R. Sakhare, Mr. Vivek D. Patil Implementation of Captcha Mech anisms using Deep Learning to Prevent Automated Bot Attacks., Research Journal of Computer Systems and Engineering (RJCSE), 4(2), 115, 2023.
  5. M. Osadchy, J. Hernandez-Castro, S. Gibson, O. Dunkelman, and D. P´erez-Cabo, No bot expects the DeepCAPTCHA! Introducing immutable adversarial examples, with applications to CAPTCHA generation., IEEE Transactions on Information Forensics and Security, 12(11), 26402653, Nov. 2017. doi: 10.1109/TIFS.2017.2718479.
  6. C. Rodriguez and D. M. Oppenheimer, Creating a bot-tleneck for malicious AI: Psy chological methods for bot detection., Behavior Research Methods, 56, 62586275, Apr. 2024. doi: 10.3758/s13428-

    024-02357-9. 20

  7. A. Searles, Y. Nakatsuka, E. Ozturk, A. Paverd, G. Tsudik, and A. Enkoji, An Empir ical Study & Evaluation of Modern CAPTCHAs., arXiv preprint, arXiv:2307.12108, Jul. 2023
  8. M. Moradi and M. R. Keyvanpour, CAPTCHA and its alternatives: A review, Security and Communication Networks, vol. 8, no. 13, pp. 21552156, 2015, doi: 10.1002/sec.1157.
  9. L. von Ahn, M. Blum, N. J. Hopper, and J. Langford, The CAPTCHA, 2000. [Online].
  10. A. Acien, A. Morales, John V. Monaco, R. Vera-Rodr´guez, and Julian Fierrez. TypeNet: Deep learning keystroke bio metrics. IEEE Transactions on Biometrics, Behavior, and Identity Science, 4(1):57 70, 2022.
  11. Yousof Al-Hammadi and Uwe Aickelin. Detecting bots based on keylogging activities. In 2008 Third International Conference on Availability, Reliability and Security, pages 896902, 2008.
  12. EmtethalKAlamri, AbdullahMAlnajim, andSulimanAAl suhibany. Investigation of using captcha keystroke dynamics to enhance the prevention of phishing attacks. Future Inter net, 14(3):82, 2022.
  13. F. H. Alqahtani and F. A. Alsulaiman. Is image-based CAPTCHA secure against attacks based on machine learning? An experimental study. Computers & Security, 88:101635, 2020.
  14. S. Gao, M. Mohamed, N. Saxena, and C. Zhang. Emerging-Image Motion CAPTCHAs: Vulnerabilities of Existing Designs, and Counter measures. IEEE Transactions on Dependable and Secure Computing, 16(6):10401053, 2019.
  15. D. Aguilar, D. Riofrío, D. Benítez, N. Pérez and R. F. Moyano, “Text- based CAPTCHA Vulnerability Assessment using a Deep Learning based Solver,” 2021 IEEE Fifth Ecuador Technical Chapters Meeting (ETCM), Cuenca, Ecuador, 2021, pp. 1-6, doi: 10.1109/ETCM53643.2021.9590750.
  16. M. Tang, H. Gao, Y. Zhang, Y. Liu, P. Zhang and P. Wang, “Research on Deep Learning Techniques in Breaking Text-Based Captchas and Designing Image-Based Captcha,” in IEEE Transactions on Information Forensics and Security, vol. 13, no. 10, pp. 2522-2537, Oct. 2018, doi: 10.1109/TIFS.2018.2821096.
  17. J. Fierrez, A. Pozo, M. Martinez-Diaz, J. Galbally, A. Morales, Benchmarking Touchscreen Biometrics for Mobile Authentication, IEEE Transactions on Information Forensics and Security, 13, 11, 2720-2733, 2018.
  18. A. Acien, A. Morales, R. Vera-Rodriguez, J. Fierrez, Multilock: Mobile active authentication based on multiple biometric and behavioral patterns, in: Proc. ACM Intl. Conf. on Multimedia, Workshop on Multimodal Un derstanding and Learning for Embodied Applications (MULEA), 2019,

    pp. 5359.

  19. R. Datta, J. Li, and J. Z. Wang, IMAGINATION: A robust image based CAPTCHA generation system, in Proc. 13th ACM Int. Conf. Multimedia, Singapore, 2005, pp. 331334.
  20. E. Bursztein, S. Bethard, C. Fabry, J. C. Mitchell, and D. Jurafsky, How good are humans at solving CAPTCHAs? A large scale evaluation, in Proc. IEEE Symp. Secur. Privacy., Washington, DC, USA, May 2010, pp. 399413.