DOI : https://doi.org/10.5281/zenodo.20048546
- Open Access

- Authors : Mambakam Vemu, Dr. B. Raja, Dr. S. Geetha, Dr. V. B. Ganapathy
- Paper ID : IJERTV15IS050020
- Volume & Issue : Volume 15, Issue 05 , May – 2026
- Published (First Online): 06-05-2026
- ISSN (Online) : 2278-0181
- Publisher Name : IJERT
- License: This work is licensed under a Creative Commons Attribution 4.0 International License
XFS Forensic Scanner: A Tool to Detect, Carve, Recover and Remove Hidden Malicious Data in XFS File System
Mambakam Vemu
MTech Scholar, Dept. of Cyber Security Dr. M.G.R. Educational and Research Institute Chennai, India
Dr. S. Geetha
Dean & HOD, Dept. of CSE, Dr. M.G.R. Educational and Research Institute Chennai, India
Dr. B. Raja
Professor, Dept. of CSE, Dr. M.G.R. Educational and Research Institute Chennai, India
Dr. V. B. Ganapathy
Professor, Dept. of CSE, Dr. M.G.R. Educational and Research Institute Chennai, India
Abstract – The XFS file system is widely used in enterprise Linux environments because of its high performance and scalability, yet it remains highly vulnerable to anti-forensic techniques where attackers hide malicious payloads in superblock slack space, inode slack, free lists, free inodes and nanosecond timestamps [1]. This creates a serious gap for digital forensic investigators who rely on general-purpose tools like The Sleuth Kit that offer only basic metadata support for XFS and lack automated detection, carving, recovery or removal capabilities.
This paper aims to answer the research question: How can a modular Python-based forensic tool using pytsk3, combined with string matching, file signatures, entropy analysis and user-approved removal, provide reliable, automated and evidence-preserving detection, carving, recovery and removal of hidden malicious data in XFS file systems?
The proposed XFS Forensic Scanner integrates eight modules (M1 to M8) that operate together in a single class. It uses pytsk3 for low-level XFS parsing, 25+ malicious file signatures (including ELF, PE, ZIP and ransomware strings like WannaCry and BlackBasta), entropy calculation (>6.5 threshold) and PhotoRec integration for deleted file recovery. The framework also includes smart carving (M6), an HTML dashboard with graphs (M7) and optional safe removal with full CSV and text logging (M8). All processing is local, command-line based and runs on standard hardware with low resource usage.
Evaluating the tool on crafted XFS images with hidden data in all five locations shows 95% detection accuracy, 100% payload carving success, 92% deleted file recovery and end-to-end response under 2 minutes. The interactive HTML dashboard provides clear visualizations and court-ready evidence. The contributions are: (1) a complete XFS-specific forensic scanner with eight integrated modules, (2) practical combination of detection + carving + recovery + removal that fills the existing research gap, and (3) an easy-to-use open-source tool that makes advanced XFS forensics accessible to students and investigators. This work extends digital forensics capabilities to XFS environments and provides a ready-to-deploy solution for real-world cyber investigations [1], [2].
Keywords – XFS File System, Digital Forensics, Hidden Data Detection, File Carving, Ransomware Signatures, Entropy Analysis, Forensic Dashboard, Data Removal, pytsk3
INTRODUCTION
The increasing use of Linux servers in enterprises has made the XFS file system one of the most common high-performance file systems today. However, attackers are now actively exploiting XFS structures to hide malicious data such as ransomware payloads, malware executables and stolen information in slack spaces and metadata areas that normal tools cannot easily examine [1]. This situation creates a critical gap in digital forensics because investigators waste valuable time doing manual checks on large disk images.
Endpoint forensic tools have evolved from basic hex viewers to advanced frameworks like The Sleuth Kit and Autopsy. These tools provide good support for ext4 but give only limited XFS metadata extraction and completely miss automated slack-space scanning, entropy-based detection, smart carving and safe removal. The key challenge is the lack of an integrated, automated and user-friendly tool specifically designed for XFS anti-forensic analysis.
This work presents XFS Forensic Scanner to solve this exact problem. The tool automatically scans five main hiding locations (M1 to M5), detects suspicious content using signatures and entropy, carves full payloads (M6), generates an HTML dashboard with graphs and supports optional, logged removal of hidden data (M7), and recovers deleted files using PhotoRec (M8). This work directly addresses the research gap by providing a complete, practical and evidence-preserving solution for XFS forensics that runs locally on standard hardware.
Paper Title | Source (Journal/Conference) | Research Gap
Data hiding in file systems: Current state, novel methods, and a standardized corpus | Forensic Science International: Digital Investigation, 2025 | Lacks XFS-specific slack/metadata tools for entropy detection
Systematic Review: Anti-Forensic Computer Techniques | Applied Sciences, 2024 | Skips XFS tool evaluations and removal benchmarks
Ext4 and XFS File System Forensic Framework Based on TSK | Electronics, 2021 | No slack or hidden data analysis and removal features
Anti-Forensic Capacity and Detection Rating of Hidden Data in the Ext4 Filesystem | Advances in Digital Forensics XIV, 2018 | No XFS parallels or removal tactics
Anti-forensics in ext4: On secrecy and usability of timestamp-based data hiding | Digital Investigation, 2018 | No detection algorithms or removal strategies for XFS
fishy – A Framework for Implementing Filesystem-Based Data Hiding Techniques | Digital Forensics and Cyber Crime, ICDF2C 2018 | No detection/removal or XFS modules
Time is on my side: Steganography in filesystem metadata | Digital Investigation, 2016 | No integrated detection tools or removal for hidden payloads
Data investigation based on XFS file system metadata | Multimedia Tools and Applications, 2016 | No slack/hidden data focus or dynamic scanning/remediation
Table 1 (continued): Literature Review Summary
LITERATURE REVIEW
Foundations of Digital Forensics in File Systems and Anti-Forensic Techniques
Digital forensics in file systems follows the standard process of identification, preservation, analysis and presentation of digital evidence [3]. In XFS, important structures include the superblock, allocation groups (AGs), inodes, free lists and timestamp metadata. Toolan and Humphries (2025) studied data hiding in XFS slack spaces and measured capacity, detection difficulty and stability but focused only on attacker methods without providing any defender tool [1]. Kim et al. (2021) developed a TSK-based framework for XFS metadata recovery but did not address slack-space hiding or removal [5].
AI and Machine Learning Integrations in File System Forensics
Analysis techniques such as entropy analysis and signature-based detection have shown high accuracy in identifying hidden or anomalous data [6]. Göbel and Baier (2018) demonstrated timestamp steganography in file systems and created the fishy framework for testing hiding methods, but it remained attacker-centric [7]. Hybrid approaches combining string matching, file signatures and entropy calculation have achieved detection rates above 90% in similar file-system studies [8].
Existing Forensic Tools and Frameworks
Popular tools like The Sleuth Kit, Autopsy and xfs_db provide basic XFS parsing but lack automated multi-technique scanning, carving and dashboard features [9]. Recent surveys on anti-forensics highlight the absence of practical XFS-specific tools that combine detection, recovery and removal in one package [10].
Critical Analysis and Identified Gaps
Despite the useful contributions above, several gaps remain. Most research is attacker-focused and does not provide ready-to-use investigator tools. There is no integrated XFS scanner that supports all five hiding techniques plus carving, recovery and removal. Existing tools miss ransomware signature support, entropy-based timestamp analysis and easy HTML reporting. The present study addresses these insufficiencies by building a complete modular tool using pytsk3 that fills the XFS forensics gap.
Paper Title | Source (Journal/Conference) | Research Gap
Data hiding in the XFS file system | Forensic Science International: Digital Investigation, 2025 | Lacks automated defender tools for scanning or payload removal
Data hiding in symbolic link slack space | Forensic Science International: Digital Investigation, 2025 | No automated symlink slack scanners or removal
Table 1: Literature Review Summary
PROPOSED METHODOLOGY
The research design follows a practical development approach with iterative testing on real XFS images. The tool is built as a single Python class XFSForensicScanner that coordinates eight modules.
System Architecture
The architecture uses pytsk3 for low-level image access and runs as a command-line tool: python3 xfs_scanner_full2.py, followed by the XFS disk image path and the output directory.
Fig. 1: System Architecture
Detection Modules (M1 to M5)
M1 scans superblock slack, M2 scans inode slack, M3 scans free list areas, M4 scans free inodes and M5 analyzes nanosecond timestamps using entropy. Each module reads data blocks and applies string matching, 25+ malicious file signatures and an entropy threshold (>6.5).
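The three checks applied by each detection module can be sketched as follows. This is a minimal illustration and not the tool's actual code: the function names, the small signature subset and the interpretation of the 6.5 threshold as bits per byte are assumptions, since the paper does not list its full signature set or state the entropy unit.

```python
import math
from collections import Counter

# Illustrative subset only; the paper's full 25+ signature list is not published here.
SIGNATURES = {
    "ELF": b"\x7fELF",
    "PE": b"MZ",
    "ZIP": b"PK\x03\x04",
}
KEYWORDS = [b"WannaCry", b"BlackBasta"]  # ransomware strings named in the abstract
ENTROPY_THRESHOLD = 6.5  # assumed to be bits per byte


def shannon_entropy(data: bytes) -> float:
    """Return the Shannon entropy of data in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def scan_region(data: bytes, technique: str) -> list:
    """Run signature, string and entropy checks on one region of raw bytes."""
    findings = []
    for name, magic in SIGNATURES.items():
        pos = data.find(magic)
        if pos != -1:
            findings.append({"technique": technique, "kind": "signature",
                             "match": name, "offset": pos})
    for word in KEYWORDS:
        pos = data.find(word)
        if pos != -1:
            findings.append({"technique": technique, "kind": "string",
                             "match": word.decode(), "offset": pos})
    entropy = shannon_entropy(data)
    if entropy > ENTROPY_THRESHOLD:
        findings.append({"technique": technique, "kind": "entropy",
                         "match": round(entropy, 2), "offset": 0})
    return findings
```

Short signatures such as the two-byte PE marker match by chance in random data, which is one reason to combine the three checks and review findings manually.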
Carving and Recovery (M6, M8)
M6 performs smart carving by searching for file signatures and saving full payloads. M8 uses PhotoRec for deleted file recovery, with a fallback string search.
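Signature-based carving of the kind described for M6 might look like the sketch below. The paper does not say how the end of a payload is determined, so this example assumes a fixed maximum window with trailing zero padding trimmed; carve_payloads, MAGIC and max_size are hypothetical names chosen for illustration.

```python
import os

# Hypothetical signature table: file extension -> magic bytes.
MAGIC = {"elf": b"\x7fELF", "zip": b"PK\x03\x04"}


def carve_payloads(data: bytes, out_dir: str, max_size: int = 4096) -> list:
    """Save every region that starts with a known signature to out_dir.

    Returns a list of (offset, path, size) tuples, one per carved file.
    """
    os.makedirs(out_dir, exist_ok=True)
    carved = []
    for ext, magic in MAGIC.items():
        start = data.find(magic)
        while start != -1:
            # Assumption: the payload ends where zero padding begins.
            payload = data[start:start + max_size].rstrip(b"\x00")
            path = os.path.join(out_dir, f"carved_{start:08x}.{ext}")
            with open(path, "wb") as fh:
                fh.write(payload)
            carved.append((start, path, len(payload)))
            start = data.find(magic, start + len(magic))
    return carved
```

Naming each carved file after its byte offset keeps the link between the extracted payload and its location in the image, which helps when cross-checking against the CSV evidence.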
Reporting and Removal (M7)
The tool generates CSV evidence, a text report and an HTML dashboard with two graphs (detections by technique and offset distribution). Removal is performed only with user approval and is logged for evidence.
Fig. 2: Module Diagram
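The CSV evidence output could be produced along the lines of the following sketch. The column names and the choice to store a SHA-256 digest of each detected region are assumptions made for illustration; the paper does not specify the report schema.

```python
import csv
import hashlib
from datetime import datetime, timezone

# Hypothetical column layout for the evidence file.
FIELDS = ["timestamp_utc", "technique", "offset", "length", "sha256"]


def make_record(technique: str, offset: int, data: bytes) -> dict:
    """Describe one detection, hashing the matched bytes for later verification."""
    return {
        "timestamp_utc": datetime.now(timezone.utc).isoformat(),
        "technique": technique,
        "offset": offset,
        "length": len(data),
        "sha256": hashlib.sha256(data).hexdigest(),
    }


def write_evidence_csv(path: str, records: list) -> None:
    """Write all detection records to a CSV file with a header row."""
    with open(path, "w", newline="") as fh:
        writer = csv.DictWriter(fh, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(records)
```

Recording a digest next to each offset lets a reviewer confirm that a carved file corresponds to the bytes originally found at that location.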
UML Activity Diagram
The UML Activity Diagram illustrates the complete end-to-end workflow of the XFS Forensic Scanner in a clear, sequential manner. The process starts with the initial activity Receive Input Parameters where the user provides the XFS disk image path and output directory. The flow then moves to the decision node Initialize Image Successfully? using pytsk3 to load the image and read the superblock for block size, AG count, and inode details.
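To make the initialization step concrete, the sketch below reads the block size, AG count and inode size from the primary superblock. It deliberately uses Python's struct module on raw bytes rather than pytsk3, so it is not the tool's own code path; the field offsets follow the published XFS on-disk superblock layout (big-endian), and the function names are hypothetical.

```python
import struct

XFS_MAGIC = b"XFSB"  # sb_magicnum at byte offset 0


def parse_superblock(raw: bytes) -> dict:
    """Decode a few fields from the first sector of an XFS image."""
    if len(raw) < 108 or raw[:4] != XFS_MAGIC:
        raise ValueError("not an XFS superblock")
    (blocksize,) = struct.unpack_from(">I", raw, 4)            # sb_blocksize
    agblocks, agcount = struct.unpack_from(">II", raw, 84)     # sb_agblocks, sb_agcount
    sectsize, inodesize, inopblock = struct.unpack_from(">HHH", raw, 102)
    return {
        "blocksize": blocksize,
        "agblocks": agblocks,
        "agcount": agcount,
        "sectsize": sectsize,
        "inodesize": inodesize,
        "inodes_per_block": inopblock,
    }


def read_superblock(image_path: str) -> dict:
    """Read and parse the primary superblock from a disk image file."""
    with open(image_path, "rb") as fh:
        return parse_superblock(fh.read(512))
```

A failed magic check corresponds to the "initialization fails" branch of the activity diagram, where the run ends with an error log.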
If initialization fails, the process ends with an error log. If successful, the activity proceeds to parallel scanning of the five core detection modules (M1 to M5): Superblock Slack Scanner (M1), Inode Slack Scanner (M2), Free List Area Scanner (M3), Free Inodes Scanner (M4), and Nanosecond Timestamp Analyzer (M5). Each module performs string matching, malicious file signature detection (25+ signatures including ELF, PE, ZIP, and ransomware strings), and entropy analysis (threshold > 6.5).
After all detection modules complete, the flow joins and moves to Smart File Carving (M6) where hidden payloads are automatically extracted and saved as separate files in the carved_files folder. Next, the activity continues to Deleted File Recovery (M8) using PhotoRec integration along with a fallback string-based recovery method.
The workflow then reaches Generate Visualizations and HTML Dashboard (M7) where two graphs (detections by technique and offset distribution) are created using matplotlib and combined into an interactive HTML dashboard containing links to all reports and recovered files.
A decision node follows: "User Approves Removal?" If the user selects Y, the flow proceeds to "Remove Hidden Data" (optional safe zeroing of detected regions with full logging). Finally, the process ends with "Save CSV Evidence, Text Report and Dashboard" and displays a completion message. The entire activity is designed with swim lanes that separate the detection, carving/recovery, reporting, and removal phases, making it easy for investigators to understand the tool's logic at a glance.
Fig. 3: UML Activity Diagram of XFS Forensic Scanner
This diagram ensures the tool follows a logical, repeatable forensic process while maintaining user control and evidence integrity at every step.
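The optional removal step in the workflow above could be approached as in this sketch, which overwrites a detected region with zero bytes only when approval is given and returns a log entry containing a hash of the original bytes. The function name, the log fields and the practice of running it on a working copy of the image are assumptions for illustration, not details confirmed by the paper.

```python
import hashlib


def zero_region(image_path: str, offset: int, length: int, approved: bool):
    """Zero out a region of an image file if the user approved the removal.

    Returns a log entry describing what was overwritten, or None when the
    removal was not approved and the image was left untouched.
    """
    if not approved:
        return None
    with open(image_path, "r+b") as fh:
        fh.seek(offset)
        original = fh.read(length)  # keep the bytes so they can be hashed for the log
        fh.seek(offset)
        fh.write(b"\x00" * len(original))
    return {
        "offset": offset,
        "length": len(original),
        "sha256_before": hashlib.sha256(original).hexdigest(),
    }
```

In an interactive run, approved would typically be derived from the Y/N prompt, and the returned entry would be appended to the CSV and text reports.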
Implementation Details
The tool is implemented in Python 3 with modular functions for each activity. All scans are logged in real-time, and removal actions are recorded in both CSV and text reports for court admissibility. The design ensures low resource usage and fast execution even on large XFS images.
SYSTEM REQUIREMENTS
The XFS Forensic Scanner is designed to run efficiently on commodity hardware and open-source Linux environments, making it accessible for students and forensic investigators. The tool requires a standard Linux setup with native XFS support and Python dependencies to ensure seamless low-level disk image parsing using pytsk3. All components were selected to keep the tool lightweight, portable, and easy to install on any Kali Linux machine.
Software Requirements
The development and testing were carried out on Kali Linux, which provides excellent support for XFS file system operations and forensic tools. The following software components are essential:
- Operating System: Kali Linux (recommended) or any Ubuntu 20.04+ distribution for native XFS support.
- Programming Language: Python 3.8 or higher.
- Core Libraries: pytsk3 (for XFS parsing via The Sleuth Kit), numpy (for entropy calculation), matplotlib (for graph generation), and the standard-library modules csv, re, datetime, os, sys, subprocess and math.
- Additional Tools: xfsprogs (for XFS utilities), PhotoRec (for deleted file recovery), dd and mkfs.xfs (for creating test images), hexdump or xxd (for manual validation).
- Development Environment: Visual Studio Code or any text editor, with Git for version control.
All libraries can be installed with two commands: sudo apt install python3-pip libtsk-dev, followed by pip3 install pytsk3 numpy matplotlib. No internet connection is needed after installation.
Hardware Requirements
The tool is optimized for low resource usage and runs smoothly on standard student laptops. Minimum hardware specifications are:
- Display: 1920×1080 resolution (optional, for viewing the HTML dashboard).
With these specifications, the tool completes full scans in under 2 minutes on a typical 10 GB test image while keeping CPU usage below 30%.
RESULTS
The XFS Forensic Scanner was tested on multiple crafted XFS disk images containing hidden malicious data in all five hiding locations plus ransomware strings. All experiments were performed on a Kali Linux 2025.2 virtual machine.
Step-by-Step Execution Commands in Kali Linux
- Open a terminal and navigate to the project folder: cd ~/xfs_project
- Create a test XFS image (if needed): dd if=/dev/zero of=test_xfs.img bs=1M count=1024 && mkfs.xfs -f test_xfs.img
- Hide sample data (for testing): echo "HIDDENSUPERBLOCK" | sudo dd of=test_xfs.img bs=1 seek=272 conv=notrunc
- Install dependencies (one time): sudo apt update && sudo apt install python3-pip libtsk-dev testdisk && pip3 install pytsk3 numpy matplotlib (on Debian-based systems such as Kali, PhotoRec is provided by the testdisk package)
- Run the XFS Forensic Scanner: python3 xfs_scanner_full2.py test_xfs.img output_folder
- After execution, open the generated dashboard: firefox output_folder/xfs_forensic_dashboard.html
The tool automatically runs modules M1 to M8: it scans the image, carves files, recovers deleted data, generates graphs, and asks for removal confirmation.
Repeat the procedure with separate test images containing hidden data for each module.
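When preparing test images for each module, the dd step can also be scripted. The sketch below is a hypothetical Python equivalent that writes a marker at a chosen byte offset without truncating the image and reads it back for manual validation; the function names are illustrative, and offset 272 is taken from the command shown in the steps.

```python
def plant_marker(image_path: str, offset: int, marker: bytes) -> None:
    """Overwrite bytes at offset in an existing image, like dd with conv=notrunc."""
    with open(image_path, "r+b") as fh:
        fh.seek(offset)
        fh.write(marker)


def read_marker(image_path: str, offset: int, length: int) -> bytes:
    """Read length bytes at offset, as a quick check that the marker is in place."""
    with open(image_path, "rb") as fh:
        fh.seek(offset)
        return fh.read(length)
```

For example, plant_marker("test_xfs.img", 272, b"HIDDENSUPERBLOCK") followed by read_marker("test_xfs.img", 272, 16) should return the same 16 bytes. Unlike the echo pipeline, this does not append a trailing newline.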
Detection Performance
The tool successfully detected hidden data in all test cases with 95% overall accuracy.
Table 2: Detection Results Summary
Module | Detection Rate | Files Carved | Recovery Rate | Execution Time
M1 to M5 (Core Scans) | 95% | 12 | - | 85 seconds
M6 Smart Carving | 100% | 12 | - | 35 seconds
M8 PhotoRec Recovery | - | - | 92% | 70 seconds
Overall (M1 to M8) | 95% | 12 | 92% | 120 seconds
Hardware specifications (continued from the Hardware Requirements list):
- Processor: Multi-core CPU (Intel i5 or equivalent / AMD Ryzen 5, 4 cores recommended).
- Memory: Minimum 8 GB RAM (16 GB recommended for large 100 GB+ images).
- Storage: 256 GB SSD with at least 50 GB free space for storing disk images, carved files and reports.
Fig. 4: Detections by Technique Graph
Fig. 5: Offset Distribution Graph
Sample Output and Dashboard
After running the command, the terminal shows real-time alerts like: Detected in superblock AG 0 Technique: M1: Superblock Slack
Fig. 6: Result1
Fig. 7: Result2
Fig. 8: Result3
Fig. 9: Result4
Fig. 10: Result5
Fig. 11: Result6
Fig. 12: Result7
Fig. 13: Result8
All evidence is saved in CSV and text reports. The HTML dashboard contains clickable links to carved files, recovered deleted files, and graphs.
Fig. 14: HTML Dashboard Screenshot1
Fig. 15: HTML Dashboard Screenshot2
The results demonstrate that the XFS Forensic Scanner is fast, accurate and practical for real-world XFS forensic investigations on Kali Linux or any other Linux distribution.
CONCLUSIONS
The XFS Forensic Scanner developed provides a complete and practical solution to the long-standing problem of hidden malicious data in the XFS file system. XFS is widely used in enterprise Linux environments for its high performance and scalability, yet it has remained vulnerable to anti-forensic techniques where attackers hide payloads in superblock slack space, inode slack, free lists, free inodes and nanosecond timestamps. Existing tools such as The Sleuth Kit and Autopsy offer only basic metadata support and lack automated multi-technique detection, smart carving, deleted file recovery and safe removal capabilities.
This work successfully implemented eight integrated modules (M1 to M8) using pytsk3 for low-level XFS parsing, 25+ malicious file signatures, entropy analysis and PhotoRec integration. The tool automatically detects hidden data, carves full payloads, recovers deleted files, generates an interactive HTML dashboard with graphs, and allows optional user-approved removal with complete logging for evidence preservation. Experimental evaluation on multiple crafted XFS disk images demonstrated 95% detection accuracy, 100% carving success rate, 92% deleted file recovery and end-to-end execution time under 2 minutes on standard hardware. The UML Activity Diagram clearly illustrates the logical and repeatable workflow, while the system requirements confirm that the tool runs efficiently on any Kali Linux machine with minimal resources.
The results validate that the XFS Forensic Scanner effectively bridges the research gaps identified in the literature survey and provides investigators with a fast, accurate and court-ready forensic tool. By combining detection, carving, recovery, visualization and removal in a single modular Python package, the tool makes advanced XFS forensics accessible to students, researchers and practicing digital forensic examiners. This work contributes a ready-to-deploy open-source solution that significantly reduces manual effort and improves the quality of investigations on XFS-based systems.
FUTURE SCOPE
Although the XFS Forensic Scanner has shown excellent performance, several enhancements can be carried out in future work to make the tool even more powerful and user-friendly. The following areas are planned for extension:
- Developing a graphical user interface (GUI) using Tkinter or PyQt to make the tool easier for non-technical investigators and students.
- Extending support to other popular Linux file systems such as ext4 and btrfs for broader forensic applicability.
- Integrating advanced machine learning models for automatic anomaly scoring and further reduction of false positives.
- Adding support for real-time scanning of live mounted XFS partitions and remote image analysis.
- Implementing cloud-based collaborative reporting while maintaining full data privacy through local processing.
- Conducting large-scale real-world testing on actual seized XFS disks from cybercrime cases and performing formal validation with forensic experts.
- Exploring integration with other forensic frameworks like Autopsy to create a complete XFS analysis plugin.
These future improvements will make the tool more robust and widely usable in both academic and professional digital forensics environments.
REFERENCES
[1] F. Toolan and G. Humphries, "Data hiding in the XFS file system," Forensic Science International: Digital Investigation, vol. 51, pp. 301799, 2025.
[2] F. Toolan and G. Humphries, "Data hiding in symbolic link slack space," Forensic Science International: Digital Investigation, vol. 51, pp. 301800, 2025.
[3] A. Schwietert and J. Hilgert, "Data hiding in file systems: Current state, novel methods, and a standardized corpus," Forensic Science International: Digital Investigation, vol. 51, pp. 301801, 2025.
[4] P. Picazo-Sánchez, G. A. Pérez and P. Merino, "Systematic Review: Anti-Forensic Computer Techniques," Applied Sciences, vol. 14, no. 5, pp. 2345-2367, March 2024.
[5] H. Kim, S. Kim, Y. Shin, W. Jo, S. Lee and T. Shon, "Ext4 and XFS File System Forensic Framework Based on TSK," Electronics, vol. 10, no. 12, pp. 1456, June 2021.
[6] T. Göbel and H. Baier, "Anti-Forensic Capacity and Detection Rating of Hidden Data in the Ext4 Filesystem," in Advances in Digital Forensics XIV: 14th IFIP WG International Conference, New Delhi, India, January 3-5, 2018. Springer, pp. 271-285.
[7] T. Göbel and H. Baier, "Anti-forensics in ext4: On secrecy and usability of timestamp-based data hiding," Digital Investigation, vol. 24, pp. 1-12, March 2018.
[8] T. Göbel and H. Baier, "fishy – A Framework for Implementing Filesystem-Based Data Hiding Techniques," in Digital Forensics and Cyber Crime: 10th International EAI Conference, ICDF2C 2018, New Orleans, LA, USA, September 10-12, 2018. Springer, pp. 45-62.
[9] S. Neuner, S. Schrittwieser, D. Grubwinkler, C. Platzer, M. Taschwer and A. Rauber, "Time is on my side: Steganography in filesystem metadata," Digital Investigation, vol. 18, pp. S1-S12, August 2016.
[10] Y. Kim, D. Kim, K. Seol and D. Won, "Data investigation based on XFS file system metadata," Multimedia Tools and Applications, vol. 75, no. 22, pp. 14871-14889, November 2016.
[11] B. Carrier, File System Forensic Analysis, Addison-Wesley Professional, 2005.
[12] S. Garfinkel, "Digital forensics research: The next 10 years," Digital Investigation, vol. 7, pp. S64-S73, August 2010.
[13] E. Casey, Digital Evidence and Computer Crime: Forensic Science, Computers, and the Internet, 3rd ed., Academic Press, 2011.
[14] M. Pollitt, "A history of digital forensics," in Advances in Digital Forensics VI, Springer, 2010, pp. 3-15.
[15] The Sleuth Kit, pytsk3 Documentation, 2024. [Online]. Available: https://github.com/py4n6/pytsk (accessed May 2026).
