Safe-Surf: An Integrated Multi-Layer Cybersecurity Intelligence Architecture for Real-Time Phishing Detection and AI-Augmented Threat Analysis

doi:10.5281/zenodo.20038724

Volume 15, Issue 05 (May 2026)

Safe-Surf: An Integrated Multi-Layer Cybersecurity Intelligence Architecture for Real-Time Phishing Detection and AI-Augmented Threat Analysis

DOI : 10.5281/zenodo.20038724

Download Full-Text PDF Cite this Publication

Open Access
Article Download / Views: 33
Authors : Chetan Sharma, Lakshya Sharma, Dr. Shiksha Kumari
Paper ID : IJERTV15IS050036
Volume & Issue : Volume 15, Issue 05 , May – 2026
Published (First Online): 05-05-2026
ISSN (Online) : 2278-0181
Publisher Name : IJERT
License: This work is licensed under a Creative Commons Attribution 4.0 International License

PDF Version

View

Text Only Version

Safe-Surf: An Integrated Multi-Layer Cybersecurity Intelligence Architecture for Real-Time Phishing Detection and AI-Augmented Threat Analysis

Chetan Sharma

Sushant University School of Eng. & Tech.

Lakshya Sharma

Sushant University School of Eng. & Tech.

Dr. Shiksha Kumari

Supervisor, Sushant University School of Eng. & Tech.

Abstract – Escalating sophistication in cyber threatsparticularly identity theft campaigns and zero-day phishing exploitshas exposed critical weaknesses in conventional, centralised security paradigms. Existing solutions depending on rigid blacklists and monolithic machine learning pipelines routinely fail to intercept structurally obfuscated attacks and offer no mechanism for human-interpretable threat explanations. This paper presents Safe-Surf v3.2, a cybersecurity intelligence platform engineered around a stratified, four-tier defensive architecture. The platform brings together a Random Forest classifier that processes over thirty URL-derived behavioural features, a dedicated Heuristic Research Engine for identifying Punycode homograph and typosquatting threats, and a Self-Healing AI Reasoning Agent grounded in the Gemini 2.0/2.5 model family. A distinguishing contribution is the Behavioural Auditing Layer, which delivers live threat intelligence through REST API-coupled Google Gemini calls and produces Explainable AI (XAI) security briefings. Backend persistence is handled by MongoDB Atlas with high-speed indexed collections, while Kolmogorov-Smirnov statistical drift monitoring guarantees long-run classifier reliability. Experimental outcomes confirm that the fusion of ensemble learning with large language model reasoning yields a robust, scalable solution for distributed cyber-threat intelligence with low computational overhead.

KeywordsPhishing Detection, Machine Learning, Random Forest, Heuristic Analysis, Explainable AI (XAI), Gemini 2.0, Network Security Intelligence, Zero-Day Threats, Behavioural Auditing.

defensive stack: (i) a Random Forest model trained on more

INTRODUCTION

The accelerating digitisation of critical systemsspanning electronic health infrastructure and financial serviceshas positioned online credentials among the most valuable targets in contemporary threat landscapes [1]. Phishing has undergone a fundamental transformation, advancing well beyond primitive email-based deception to encompass sophisticated zero-day exploits engineered to circumvent established centralised defences [2]. Data compiled by the Anti-Phishing Working Group across 20242025 consistently records an upward trajectory in phishing incidents, with adversaries deploying Punycode homograph substitution, targeted typosquatting, and QR-code-embedded payloads (Quishing) to outmanoeuvre conventional detection tooling [5].

Contemporary detection systems labour under three interrelated structural deficiencies. Statically compiled blacklists are architecturally incapable of recognising freshly minted malicious domains absent from threat registries. Conventional ML classifiers struggle to intercept structurally obfuscated threats such as Punycode-encoded homograph URLs. Furthermore, the inherent opacity of most trained models denies analysts the reasoned explanations required for informed incident response [3].

To resolve these gaps, this paper introduces Safe-Surf v3.2a next-generation intelligence platform built upon a four-tier

than 11,000 labelled samples extracting 30+ URL features [10]; (ii) a Heuristic Analysis Engine targeting structural anomalies; (iii) Dynamic Research Scanners for live behavioural auditing; and (iv) a Gemini 2.0/2.5-powered AI Reasoning Agent furnishing self-healing connectivity and auto-generated security briefings [6]. This synthesis of ensemble classification and large language model reasoning delivers transparent, resilient agentic threat intelligence for distributed deployment contexts.
RELATED WORK

Phishing countermeasures have evolved progressively from signature-oriented matching towards compound, multi-stage intelligence architectures. The survey below maps the trajectory of relevant research and highlights residual gaps addressed by Safe-Surf v3.2.
1. Traditional Blacklists and Static ML Filters
  
  Early protective mechanisms relied on manually curated URL blacklists and static filtering heuristics. While computationally inexpensive, these approaches are fundamentally reactive and incapable of addressing zero-day threats [2]. Subsequent work demonstrated that supervised classifiersparticularly Random Forest and SVM variantscould exploit behavioural URL attributes such as lexical entropy, domain length, and special-character ratios to
  
  estimate malicious intent [10]. Nonetheless, such pipelines generate risk scores without transparent reasoning, restricting their utility for analyst-driven response workflows [3].
2. Structural and Pattern-Based Heuristic Detection Researchers introduced heuristic engines to compensate for the blind spots of purely statistical approaches. Techniques based on Levenshtein distance expose typosquatting variants (e.g., paypa1 versus paypal), while Punycode prefix analysis targets homograph attacks exploiting Unicode character similarity [7]. These systems are effective for structural pattern matching but are prone to elevated false-positive rates against legitimately creative domain names and cannot generate higher-level explanatory narratives [11].
3. LLM-Augmented Security Agents
  
  The maturation of large language models has introduced genuine Explainable AI capability to the security domain [12]. Emerging research explores deploying models such as Gemini and GPT-4 to contextualise security telemetry and produce analyst-readable threat summaries [6]. However, most production deployments remain confined to asynchronous batch-processing pipelines, degrading their responsiveness against real-time in-progress attacks.
4. Position of Safe-Surf v3.2
Safe-Surf v3.2 advances the field by unifying all three paradigms within a cohesive four-layer defence engine. Relative to prior art, the platform uniquely combines: (i) live AI reasoning via real-time Gemini 2.0/2.5 API integration;

(ii) self-healing model selection that guarantees uninterrupted service; and (iii) behavioural auditing at the code and network level through Cross-Origin request auditing and Open Redirect scanning [13].
PROBLEM STATEMENT

The widespread adoption of digital healthcare and financial platforms has made user credentials a prime objective for cybercriminal operations [4]. Centralised security repositories introduce inherent systemic risks, including susceptibility to unauthorised access, data tampering, and single-point-of-failure vulnerabilities [1]. The following technical constraints characterise the inadequacy of incumbent frameworks:
- Reactive Static Defences: Blacklist-dependent mechanisms cannot detect zero-day URLs and dynamically generated malicious domains [2].
- Black-Box Opacity: Standard ML classifiers lack Explainable AI output, delivering risk scores devoid of interpretable threat logic [3].
- Advanced Obfuscation Techniques: Homograph attacks
  
  leveraging Punycode encoding, typosquatting variants,
  
  and open-redirect tunnels systematically evade pattern-matching filters [7].
- Computational Scalability Constraints: Resource-intensive legacy architectures impose throughput bottlenecks limiting enterprise-scale deployment [8].
- Data Persistence Risks: Distributed storage models lacking economic incentives introduce long-term availability uncertainties [9].
RESEARCH OBJECTIVES

The primary aim is to design, implement, and evaluate Safe-Surf v3.2a decentralised intelligence framework for comprehensive phishing threat detection and analysis. Specific objectives include:
- Construct a Stratified Defence Engine integrating a Random Forest classifier, heuristic scanner, and self-healing AI agent to intercept threats across network, structural, and visual dimensions.
- Deliver Explainable AI Outputs by converting raw probabilistic risk signals into coherent, actionable security briefings via Gemini 2.0/2.5.
- Strengthen Zero-Day Detection using heuristic scanners that identify homograph attacks, typosquatting mutations, and open-redirect tunnels beyond static blacklist scope.
- Optimise Operational Throughput via MongoDB Atlas indexed search and a modular pipeline with Kolmogorov-Smirnov drift detection.
- Validate Platform Robustness through systematic evaluation against diverse real-world and synthetic threat vectors.
- Enable Future Interoperability via modular interfaces
  
  amenable to cross-network security data exchange.
METHODOLOGY

The methodological foundation of Safe-Surf v3.2 is a modular intelligence framework coupling structured data management pipelines with a layered, multi-threaded defence engine. Processing progresses systematically from raw data ingestion and integrity validation through real-time heuristic scanning to higher-level AI-guided reasoning.
1. Data Ingestion and Statistical Validation
  
  The ingestion pipeline accommodates millions of threat records at operational velocity. Four distinct CSV-sourced datasetsencompassing balanced IP records, domain intelligence, and known phishing linksare loaded into MongoDB Atlas collections [14]. High-throughput search indexes support sub-millisecond querying across the full
  
  corpus. Data integrity is continuously monitored using the Kolmogorov-Smirnov two-sample test (ks_2samp), validating training and live-inference distribution alignment for sustained classifier accuracy [10]. The training pipeline is implemented in Python with a component-based architecture and MLflow experiment tracking, as illustrated in Figs. 1 and 2.
  
  Fig. 1. Training Pipeline Code (train_pipeline.py with MLflow integration)
  
  Fig. 2. Module Architecture networksecurity component imports
2. Four-Layer Defence Engine
  
  Layer 1 Agentic Threat Intelligence (Static Brain): A Random Forest ensemble trained on over 11,000 labelled phishing samples systematically evaluates 30+ URL-derived featuresincluding character entropy, length ratios, and dot-countto infer malicious behavioural intent from historical pattern distributions [10].
  
  Layer 2 Heuristic Analysis Engine (Pattern Brain): When an input URL is absent from the threat registry, the heuristic engine performs structural decomposition to identify Punycode homograph indicators (the xn-- ACE prefix) and typosquatting similarities via string-distance algorithms [7].
  
  Layer 3 Dynamic Research Scanners (The Investigator): This tier performs live site-behaviour auditing, tracing concealed malicious redirect chains through an Open Redirect Scanner and continuously monitoring DNS CNAME configurations for subdomain takeover exploitation [15].
  
  Layer 4 Behavioural Auditing Layer (The Advanced Scout): A Cross-Origin Auditor intercepts unauthorised data-exfiltration attempts, while a Visual Quishing Detector identifies malicious payloads embedded within QR code imagery [16].
3. AI Reasoning Agent and Self-Healing Connectivity
The Smart Intelligence Agent (v3.2) provides the frameworks higher-level cognitive layer. Its Self-Healing Model Discovery mechanism continuously probes available Gemini 2.0/2.5 API endpoints and routes requests to the most stable instance, ensuring 100% service continuity [6]. The

agent consolidates raw confidence scores from all four detection tiers into a structured natural-language Cyber-Briefing, detailing the specific evidence underlying each threat verdicta substantive improvement over the opaque score-only output of conventional ML pipelines [3].
PROPOSED SYSTEM ARCHITECTURE

The architectural design of Safe-Surf v3.2 consists of four vertically integrated, interdependent tiers engineered for high-speed threat detection and actionable intelligence delivery. The multi-tier structure eliminates single-point-of-failure risks while maximising processing efficiency through a hybrid data management strategy. Fig. 3 presents the complete four-layer architecture.

Fig. 3. Safe-Surf v3.2 System Architecture Four-Layer Defence Engine (UI Layer, Intelligence Processing Layer, Data & ML Layer, Behavioural Auditing Layer)
1. Tier 1: User Interface Layer
  
  The frontend comprises a Premium Cyber-Dashboard built around a Glassmorphism visual design language. Live neon-style threat-probability indicators and animated gauges provide immediate visual feedback on URL safety status. A Universal Search Bar incorporates automatic input-type classification (IP address, domain, or full URL). Risk verdicts are conveyed via colour-coded Confidence Badges: Green (Safe), Amber (Suspicious), and Red (High Risk).
2. Tier 2: Intelligence Processing Layer
  
  Backend request orchestration is handled by a FastAPI server exposing the classification pipeline through a RESTful interface. An Identity Detection module classifies incoming inputs prior to routing. A Hybrid Scoring Model synthesises database-matched risk scores for known threats with heuristically derived scores for zero-day candidates. The Self-Healing Model Discovery component polls Gemini 2.0/2.5 endpoints in real time to ensure continuous AI reasoning capacity.
3. Tier 3: Data and Machine Learning Layer
  
  This tier constitutes the systems static analytical brain, maintaining millions of indexed threat signatures within MongoDB Atlas for near-instantaneous retrieval [14]. The Random Forest Classifiertrained on 11,000+ samples with 30+ URL featuresunderpins probabilistic threat scoring.
  
  Dedicated ingestion components manage population and indexing of four primary CSV dataset collections.
4. Tier 4: Behavioural Auditing Layer
The terminal tier addresses operational-level dynamic security. The Heuristic Research Engine performs real-time Punycode prefix scanning and typosquatting proximity analysis. The Dynamic Scanner suite includes a Cross-Origin Auditor for unauthorised data-leakage detection and an Open Redirect Scanner for malicious tunnel identification. All findings are synthesised by the Smart AI Briefing Agent into an automated natural-language Cyber-Grok report.
IMPLEMENTATION AND OPERATIONAL FLOW
1. Input Identification and Request Routing
  
  Execution begins when a user submits a query to the Universal Search Bar on the Cyber-Dashboard. The identify_input_type utility classifies the submission as an IP address, fully qualified domain, or complete URL, and dispatches targeted backend search protocols accordingly to maximise data relevancy at each downstream stage. Fig. 4 illustrates the browser extension detecting a live phishing page
  
  Fig. 4. Safe-Surf Browser Extension Real-Time Phishing Detection (Threat Intensity Index: 100/100 for microsoft-online phishing page)
2. Multi-Tiered Analytical Pipeline
  
  Classified inputs traverse four parallel investigative channels simultaneously. A Database Reputation Lookup queries indexed MongoDB Atlas collections, assigning high-risk scores to entries matching known malicious signatures. The Random Forest Classifier independently evaluates the same input across 30+ behavioural URL features. The Heuristic and Behavioural Auditing channel computes Levenshtein-distance similarity for typosquatting detection, audits Punycode prefix structures, and deploys Cross-Origin and Open Redirect scanning to expose active data-exfiltration pathways. A Visual Quishing Detector operates concurrently
  
  to analyse embedded image payloads [16]. Fig. 5 shows the Cyber Threat Intelligence Feed output.
  
  Fig. 5. Cyber Threat Intelligence Feed DB Matches Panel (0 Flags) and Heuristics Layer (1 Anomaly Found)
3. AI-Augmented Reasoning and Incident Reporting
The final processing stage elevates raw numerical telemetry to actionable analytical output. Self-healing logic selects the optimal available Gemini 2.0/2.5 endpoint and maintains uninterrupted reasoning capacity. The AI agent integrates confidence scores and technical findings from all preceding tiers, producing an Explainable AI report articulating the precise indicators driving each threat verdict. Fig. 6 illustrates the Risk Probability Architecture for goog1e.com. Results are rendered as live gauges alongside colour-coded confidence annotations for immediate analyst consumption [6].

Fig. 6. Risk Probability Architecture Ensemble Analytics Breakdown (Threat Intensity Index: 40/100 for goog1e.com, 40% Algorithmic Behavioural Analysis, 60% Historic Registry Sync)

RESULTS AND ANALYSIS

Safe-Surf v3.2 was evaluated through systematic comparative analysis against representative legacy phishing detection architectures. The results substantiate that the integrated agentic AI approach successfully overcomes the structural limitations inherent in prior single-layer and dual-layer frameworks.

Comparative Performance Assessment

Against Static Blacklisting: Conventional blacklists are fundamentally reactive and blind to newly registered malicious URLs. Safe-Surf v3.2 addresses this gap directly through Heuristic Research Scanners and Behavioural Auditing, enabling real-time structural anomaly detection for Punycode-encoded and typosquatted domains not present in any threat registry [2].

Against Static ML Pipelines: Legacy classifiers emit probabilistic scores without supporting explanation. Safe-Surfs Self-Healing AI Agent provides XAI briefings articulating the evidential reasoning behind each detection decision, substantially improving analyst triage efficiency [3].

Resource Efficiency: MongoDB Atlas indexed search fields support sub-millisecond lookups across multi-million-record collections with minimal CPU and memory overhead [14].

TABLE I. COMPARATIVE EVALUATION: SAFE-SURF V3.2 VS. CONVENTIONAL FRAMEWORKS

Framework	Core Capabilities	Key Limitations
Static Blacklisting	Lookup against curated malicious-signature registries.	Zero-day blind; no structural pattern recognition.
Linear ML Pipeline	Statistical classification on labelled URL datasets.	Opaque outputs; elevated false-positive rates.
Heuristic Scanners	Pattern-based typosquatting and TLD anomaly detection.	Misses code-level tunneling; no AI reasoning layer.
Safe-Surf v3.2 (Proposed)	ML + Heuristics + XAI Agent; 4-tier integrated pipeline.	Requires Gemini API access and MongoDB Atlas setup.

System Validation Results

Experimental validation confirms that Safe-Surf v3.2 significantly outperforms legacy architectures across scalability, cost efficiency, and threat interpretability. The Threat Intensity Index correctly rated the known typosquatting domain goog1e.com at 40/100, with heuristic analysis flagging the character substitution (Fig. 7). The browser extension achieved a perfect 100/100 score on a fabricated Microsoft phishing page, detecting over 30 suspicious lexical indicators and external form-submission exfiltration to microsoft-auth-secure.xyz (Fig. 4). The combination of Random Forest statistical efficiency with Gemini 2.0/2.5 reasoning bridges the gap between automated detection and human-interpretable security intelligence.

Fig. 7. Safe-Surf Hub v3.2 Intelligence Report for goog1e.com (Risk Score: 40/100, Confidence: 92%, Typosquatting Anomaly Detected)

CONCLUSION AND FUTURE SCOPE

This paper has presented Safe-Surf v3.2, a comprehensive cybersecurity intelligence platform that unifies machine learning, heuristic structural analysis, and large language model reasoning within a stratified four-tier defensive architecture. By addressing the foundational weaknesses of static blacklists and opaque ML classifiers, Safe-Surf introduces continuous monitoring through a self-healing AI agent and transparent incident explanation through

Explainable AI briefings. Comparative evaluation confirms measurable advantages in scalability, operational efficiency, and threat interpretability.

A. Future Directions

Autonomous Threat Mitigation: Engineering real-time firewall policy updates and browser-side isolation of confirmed malicious domains.
Cross-Network Intelligence Sharing: Designing interoperability protocols for seamless threat data exchange between distributed nodes and international security consortia.
Federated Model Training: Adopting privacy-preserving federated learning to update classifiers without centralising sensitive organisational data.
Edge and Mobile Deployment: Optimising the pipeline for resource-constrained IoT and mobile edge environments to extend real-time detection coverage.

ACKNOWLEDGMENT

The authors gratefully acknowledge the guidance and institutional support of the School of Engineering and Technology, Sushant University, Gurugram, India. Special appreciation is extended to Dr. Shiksha Kumari for sustained supervisory mentorship throughout this research.

REFERENCES

Microsoft Corporation, Resilient File System (ReFS) Overview, Microsoft Learn, 2024. [Online]. Available: https://learn.microsoft.com/windows-server/storage/refs/refs-overview
Anti-Phishing Working Group (APWG), Phishing Activity Trends Report, 2025. [Online]. Available: https://apwg.org/
R. Caruana and A. Niculescu-Mizil, An Empirical Comparison of Supervised Learning Algorithms, in Proc. 23rd Int. Conf. Mach. Learn. (ICML), 2006, pp. 161168.
E. Casey, Digital Evidence and Computer Crime, 3rd ed. Waltham, MA: Academic Press, 2011.
M. A. Adebowale, K. T. Lwin, E. Sanchez, and M. A. Hossain, Intelligent Phishing Detection Scheme Using Deep Learning Algorithms, J. Enterprise Inf. Manag., vol. 33, no. 3, pp. 703722, 2020.
Google AI for Developers, Gemini 2.0 Flash: Multi-modal Reasoning and Direct REST API Integration, Technical Documentation, 2025.
A. Agten, W. Joosen, F. Piessens, and N. Nikiforakis, Seven Months Worth of istakes: A Longitudinal Study of Typosquatting Abuse, in Proc. NDSS Symposium, 2015.
S. Abu-Nimeh, D. Nappa, X. Wang, and S. Nair, A Comparison of Machine Learning Techniques for Phishing Detection, in Proc. eCrime Researchers Summit, 2007, pp. 6069.
J. Ma, L. K. Saul, S. Savage, and G. M. Voelker, Beyond Blacklists: Learning to Detect Malicious Web Sites from Suspicious URLs, in Proc. 15th ACM SIGKDD, 2009, pp. 12451254.
L. Breiman, Random Forests, Mach. Learn., vol. 45, no. 1,

pp. 532, 2001.
N. Nikiforakis et al., Bitsquatting: Exploiting Bit-Flip Errors

for Fun and Profit, in Proc. Black Hat USA, 2013.
A. Vaswani et al., Attention Is All You Need, in Advances in

NeurIPS, vol. 30, 2017.
K. L. Chiew et al., A New Hybrid Ensemble Feature Selection Framework for ML-Based Phishing Detection, Inf. Sci., vol. 484, pp. 153166, 2019.
MongoDB Atlas Documentation, High-Performance Cloud Database Indexing, 2026. [Online]. Available: https://www.mongodb.com/atlas
S. Scaife et al., CryptoLock (and Drop It): Stopping Ransomware Attacks on User Data, in Proc. IEEE ICDCS, 2016.
Y. Zheng et al., QRishing: Susceptibility of Smartphone Users to QR Code Phishing Attacks, in Financial Cryptography, 2019, pp. 130140.
Scikit-Learn, Ensemble Learning and Random Forest Classifiers, 2011. [Online]. Available: https://scikit-learn.org
Streamlit, Real-Time Dashboard Visualisation Framework,

2026. [Online]. Available: https://streamlit.io
K. Naik, Complete Machine Learning, NLP Bootcamp, MLOps Deployment, Udemy Course, 2024.