
Auto ITR: An Automated Indian Income Tax Return Preparation System using Hybrid Rule-Based and Heuristic AI

DOI: https://doi.org/10.5281/zenodo.20137267

Haireet Mehta

Department of Artificial Intelligence and Data Science Shah and Anchor Kutchi Engineering College Chembur, India

Kaushal Mhatre

Department of Artificial Intelligence and Data Science Shah and Anchor Kutchi Engineering College Chembur, India

Shubham Jain

Department of Artificial Intelligence and Data Science Shah and Anchor Kutchi Engineering College Chembur, India

Shreyansh Jain

Department of Artificial Intelligence and Data Science Shah and Anchor Kutchi Engineering College Chembur, India

Prof. Milind Khairnar

Department of Artificial Intelligence and Data Science Shah and Anchor Kutchi Engineering College Chembur, India

Abstract – Filing income tax returns (ITR) in India manually may require considerable effort and be riddled with errors. The process typically includes classifying financial transactions, accounting for multiple bank accounts, and handling various deductions under Sections 80C, 80D, 24(b), and 80CCD.

In this research paper, we present Auto ITR, an automated solution aimed at easing the procedure of filing income taxes. The proposed platform automatically processes raw bank statements, classifies transactions, detects anomalies and suspicious entries, recommends ways to save more on income tax, and lets users review all generated data with a Chartered Accountant.

This platform combines tax computations based on Indian tax brackets with smart functionality, such as the smart transaction categorizer, anomaly detection, comparison of the two taxation regimes, and a tax optimizer, along with a knowledge-based chatbot.

Experimental results show strong performance, with transaction categorization achieving 91.2% precision and 88.5% recall, and anomaly detection reaching 93.1% accuracy. Additionally, the system reduces manual effort by around 65%. A CA review acceptance rate of 87% further demonstrates the reliability and practical usability of the system.

Index Terms: automated income tax, automated transaction classification, automated anomaly detection, automated tax optimization, India ITR

  1. Introduction

    Income tax return filing in India requires taxpayers to aggregate financial transactions from various banks, categorize their sources of income, calculate permissible deductions, and choose between the old and new tax regimes under the Finance Acts. The task is especially arduous for salaried individuals who invest in shares and mutual funds, own property, have home loans, and invest in schemes like the National Pension Scheme (NPS) and Equity Linked Savings Schemes (ELSS). Inaccuracy at the financial transaction categorization stage leads to erroneous deduction calculations and possibly even notices from the Income Tax Department (ITD).

    While commercial solutions offer guided data entry, they do not automatically extract and categorize financial transactions from bank statements. Current research on financial transaction categorization [1] and income tax documentation [2] has considered these problems in isolation.

    Auto ITR covers the full pipeline from multi-format ingestion of bank statements to CA-approved export. The main contributions of our work are: (1) a hybrid rule- and heuristic-based transaction categorizer with user-correction feedback; (2) an anomaly detection framework based on statistical and heuristic methods; (3) a dual-regime tax optimizer accounting for slab rates, rebates, and cess; (4) a knowledge-base-first chatbot with an optional LLM fallback; (5) a human-in-the-loop CA export approval phase.

  2. Related Work

    1. Automated Financial Document Processing

      In the literature on automating personal finance management, previous efforts have concentrated on expense tracking [3] and budget classification through machine learning techniques [4]. Rule-based approaches perform well on structured merchant information but fare poorly on the unstructured bank narratives typical of India, whereas hybrid methods that combine keyword matching with recurrence analysis achieve higher recall for salaries and EMI payments, hence the motivation behind our SmartCategorizer.

    2. Tax Computation and Advisory Systems

    Automated tax advisory in Western countries benefits from standardized input forms (W-2, 1099). The Indian setting, however, presents challenges such as a dual tax regime, numerous deduction sections, and heterogeneous PDF/XLS bank statement formats. As studied in [6], selecting the appropriate regime at marginal income levels is not simple; our TaxOptimizer addresses this using break-even deductions.

  3. System Architecture

    Auto ITR follows a layered architecture comprising a FastAPI backend, a SQLite/SQLAlchemy data layer, and a static HTML/CSS/JavaScript frontend served by the same API process.

    1. Backend

      The backend is developed in Python 3.11 with FastAPI. Routers expose separate APIs for authentication, user management, bank statement handling, analytics, AI inference, ITR filing, balance sheet management, CA review, data export, and multi-bank consolidation. All endpoints use JWT for authentication and bcrypt for password hashing.

    2. Data Layer

      The system persists data in SQLite through SQLAlchemy 2.x declarative model classes that manage key objects such as users, bank statements, transactions, logs, ITR submissions, tax calculations, and balance sheet items.

      Transaction entries additionally carry a category name, a confidence level, and other metadata.

    3. Frontend

      The frontend is built from plain JavaScript modules and presents KPIs on a dashboard. The dashboard includes statement upload, a transaction table with inline category editing, analytics charts, a balance sheet editor, a CA review panel, and the chatbot.

    4. File Parsing

      The ParserFactory module routes the various file types, namely CSV, XLS, XLSX, and PDF, to the appropriate parsers. The parsers for the different file types are built on the pandas [8], openpyxl, xlrd, pdfplumber, tabula-py, and PyPDF2 libraries.

      Additionally, the system can handle encrypted PDF bank statements.
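The dispatch behavior described for ParserFactory can be illustrated with a minimal sketch; the parser function names are hypothetical stand-ins for the pandas/openpyxl/pdfplumber-backed implementations.

```python
# Illustrative extension-based parser dispatch in the spirit of ParserFactory.
from pathlib import Path
from typing import Callable


def parse_csv(path: Path) -> list[dict]: ...    # e.g. pandas.read_csv
def parse_excel(path: Path) -> list[dict]: ...  # e.g. openpyxl / xlrd
def parse_pdf(path: Path) -> list[dict]: ...    # e.g. pdfplumber / tabula-py


PARSERS: dict[str, Callable[[Path], list[dict]]] = {
    ".csv": parse_csv,
    ".xls": parse_excel,
    ".xlsx": parse_excel,
    ".pdf": parse_pdf,
}


def get_parser(filename: str) -> Callable[[Path], list[dict]]:
    """Route a statement file to its parser by (case-insensitive) extension."""
    ext = Path(filename).suffix.lower()
    try:
        return PARSERS[ext]
    except KeyError:
        raise ValueError(f"Unsupported statement format: {ext}")
```

Keeping the mapping in one dictionary means adding a new statement format is a one-line change.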

    5. Data Flow

      Fig. 1. Overall System Architecture of Auto ITR

      The end-to-end flow is:

      1. The user uploads a statement; the StatementService checks the size and format, then stores the upload under uploads/bank_statements/{userId}.

      2. Upon a processing request, the system invokes the ParserFactory to extract transactions; the TransactionClassifier performs an initial categorization.

      3. Optional auto-categorization refines uncategorized transactions via SmartCategorizer.

      4. Analytics endpoints aggregate income, expenses, and deductions, and compute monthly trends and tax estimates.

      5. AI endpoints provide chatbot responses, anomaly reports, and regime optimization.

      6. The CA review panel allows inspection, approval, and PDF export via fpdf2.

        Fig. 2. End-to-End Data Flow from Statement Upload to Tax Filing

  4. Algorithms and Heuristics

        1. SmartCategorizer

          SmartCategorizer uses multi-feature matching to classify bank narrations into categories. The features include keyword and regular-expression matches against a curated taxonomy, amount-range checks (such as typical salary ranges), identification of the day of the month for recurring payments, and recurrence-pattern matching over the transaction history. Confidence is thresholded into three tiers: 0.5 (tentative), 0.7 (likely), and 0.9 (confident). Corrections made by users are recorded and applied as highest-priority rules in subsequent classification, without training a new classifier.
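The matching order described above (user corrections first, then keyword and amount heuristics) can be reduced to a short sketch. The taxonomy, patterns, thresholds, and the salary amount-range rule below are illustrative assumptions, not the project's actual rules.

```python
# Illustrative reduction of SmartCategorizer's multi-feature matching.
import re

RULES = [
    # (compiled pattern, category, base confidence tier)
    (re.compile(r"\bSALARY\b|\bSAL CR\b", re.I), "salary", 0.9),
    (re.compile(r"\bEMI\b|\bLOAN\b", re.I), "loan_emi", 0.7),
    (re.compile(r"\bELECTRICITY\b|\bMSEDCL\b", re.I), "utilities", 0.7),
]

user_corrections: dict[str, str] = {}  # narration -> category, learned online


def categorize(narration: str, amount: float) -> tuple[str, float]:
    """Return (category, confidence); corrections beat every heuristic rule."""
    if narration in user_corrections:
        return user_corrections[narration], 1.0
    for pattern, category, conf in RULES:
        if pattern.search(narration):
            # Amount-range feature: implausibly small "salary" credits are
            # demoted to the tentative tier.
            if category == "salary" and amount < 10_000:
                return category, 0.5
            return category, conf
    return "uncategorized", 0.0


def record_correction(narration: str, category: str) -> None:
    """Store a user fix so identical narrations classify correctly next time."""
    user_corrections[narration] = category
```

Because corrections are plain lookup rules rather than training data, the behavior changes immediately after a single fix, matching the no-retraining design above.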

        2. AnomalyDetector

          Anomaly detection combines four complementary strategies. Statistical outliers are identified using a z-score baseline:

          z_i = (x_i - μ) / σ        (1)

          where x_i is the transaction amount, μ is the category mean, and σ is the standard deviation. Transactions with |z_i| > τ (a configurable threshold) are flagged. Round-amount heuristics flag transactions that are exact multiples of 1000 or 5000 INR as potential cash proxies. A duplicate detector keys on the tuple (amount, description, date) within a rolling window. The system analyzes mismatches between categories and credits/debits to flag errors such as a debit entry classified as salary income. It also searches for transaction descriptions that indicate large cash movements, aiding the detection of possibly unreported cash transactions.

          All flagged discrepancies are categorized by severity and summarized for CA review.

        3. TaxOptimizer

          TaxOptimizer computes the taxable income and net tax liability under both the old and new Indian income tax regimes for Assessment Year 2024-25. The slab rates in force for that year are applied for both regimes, the Section 87A rebate is deducted for incomes within its threshold, and the 4% health and education cess is added.

          To aid decision-making, the break-even amount of deductions D* is estimated. It denotes the total deductions needed in the old regime such that the net tax liability in the old regime equals that in the new regime:

          T_old(I - D*) + cess = T_new(I) + cess        (2)

          Based on this calculation, the optimizer recommends the regime that offers the greater benefit, along with suggestions to improve the tax position. The suggestions include maximizing deductions under 80C (up to INR 1.5 lakh), contributions to NPS under 80CCD(1B) (up to INR 50,000), and 80D.

        4. Tax-Saving Analytics

          These endpoints analyze the classified transactions to estimate incomes, expenditures, and the utilization of deductions across sections. The software also highlights the marginal tax-saving advantages available under Sections 80C, 80D, 24(b), and 80CCD(1B).

          This enables the user to capitalize on unused deduction opportunities.

        5. Chatbot

    The bot follows a knowledge-base-first design: user input is first matched against a manually curated index of Indian tax topics, such as ITR form selection, HRA deductions, capital gains, and advance tax.

    If no match reaches the confidence threshold, the input is sent to an optional OpenAI LLM endpoint (gpt-4o-mini), along with recent chat history and a system prompt enforcing ethical guidelines so that nothing that could be construed as tax evasion advice is generated.

    The bot also offers follow-up query suggestions.
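The statistical-outlier and round-amount strategies above reduce to a short per-category sketch; the default threshold mirrors the {2.0, 2.5, 3.0} values swept in the evaluation, and the function names are illustrative.

```python
# Minimal sketch of the z-score outlier rule in Eq. (1) plus the
# round-amount heuristic, applied to the amounts of one category.
from statistics import mean, stdev


def zscore_outliers(amounts: list[float], tau: float = 2.5) -> list[int]:
    """Indices of transactions with |z_i| > tau, where z_i = (x_i - mu) / sigma."""
    if len(amounts) < 2:
        return []
    mu, sigma = mean(amounts), stdev(amounts)
    if sigma == 0:
        return []
    return [i for i, x in enumerate(amounts) if abs((x - mu) / sigma) > tau]


def round_amount_flags(amounts: list[float]) -> list[int]:
    """Indices of exact multiples of 1000 INR (covers 5000 too): cash proxies."""
    return [i for i, x in enumerate(amounts) if x > 0 and x % 1000 == 0]
```

In practice the category mean and standard deviation would be computed per category over a user's history, and the flagged indices merged with the duplicate and category-mismatch checks before severity grading.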

  5. Evaluation Methodology

    1. Categorization Accuracy

      Precision and recall are evaluated on an annotated dataset of bank transaction narrations covering typical Indian categories such as salary, investment, and utilities.

      Confidence calibration is assessed by comparing the confidence values assigned by our algorithm with the observed accuracy within each confidence range.

      Moreover, correction lift is used to quantify the improvement in per-user accuracy after five corrections.

    2. Anomaly Detection

      Anomaly seeding is done by injecting known anomalies, such as duplicates, abnormally high values, and wrong categories, into otherwise clean bank statement data.

      To assess the algorithm's performance, false positive and false negative rates are computed at threshold levels of {2.0, 2.5, 3.0}.

    3. Tax Optimizer Accuracy

      Tax liabilities generated for both regimes are validated against reference values obtained from the slab calculator of the ITD. The analysis uses a dataset of taxpayer profiles with taxable incomes from 5 lakh up to 50 lakh INR and varying degrees of deductions.

      Evaluation reports the average tax differential along with the largest observed deviation in the calculations.
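As an aid to reproducing this check, the dual-regime computation and break-even search of Sec. IV might be sketched as follows. The slab and rebate values are the commonly published AY 2024-25 figures and should be verified against the ITD's own calculator; function names are illustrative.

```python
# Hedged sketch of dual-regime liability and the break-even deduction search.
# Slabs: old regime (rebate up to 5L) vs. new regime (rebate up to 7L), AY 2024-25.
OLD_SLABS = [(250_000, 0.0), (500_000, 0.05), (1_000_000, 0.20), (float("inf"), 0.30)]
NEW_SLABS = [(300_000, 0.0), (600_000, 0.05), (900_000, 0.10),
             (1_200_000, 0.15), (1_500_000, 0.20), (float("inf"), 0.30)]
CESS = 0.04  # 4% health and education cess


def slab_tax(income: float, slabs) -> float:
    """Progressive tax over the given slab table."""
    tax, lower = 0.0, 0.0
    for upper, rate in slabs:
        tax += max(0.0, min(income, upper) - lower) * rate
        lower = upper
    return tax


def liability(income: float, slabs, rebate_limit: float) -> float:
    """Net liability including the Sec. 87A rebate and cess."""
    tax = slab_tax(income, slabs)
    if income <= rebate_limit:
        tax = 0.0  # full rebate under 87A
    return tax * (1 + CESS)


def breakeven_deductions(income: float, step: float = 1_000.0) -> float:
    """Smallest D* with old-regime liability on (income - D*) <= new-regime liability."""
    target = liability(income, NEW_SLABS, 700_000)
    d = 0.0
    while d <= income:
        if liability(income - d, OLD_SLABS, 500_000) <= target:
            return d
        d += step
    return income
```

For a 15-lakh income this search reports the total old-regime deductions needed before the old regime stops being worse, which is exactly the recommendation signal described for TaxOptimizer.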

    4. System Latency

      Parsing performance is measured for both PDF and XLSX documents as a function of page and row counts. Batch efficiency is tested as the latency of categorization and anomaly detection for statement sizes of 100, 500, and 1000 transactions.

      Chatbot response time is also measured for both the knowledge-base and LLM paths.

    5. Human-in-the-Loop (CA Review)

    The CA review acceptance rate denotes the percentage of auto-classified transactions that Chartered Accountants in a pilot study accepted without changes.

    Manual overrides per 100 transactions are used as an indicator of the human effort that remains.

  6. Implementation Details

    1. Technology Stack

      The system uses a wide range of technologies. The backend is implemented in Python 3.11 with FastAPI and uvicorn; data modeling is provided by SQLAlchemy 2.x and Pydantic 2.x, and environment configuration by pydantic-settings.

      Authentication is implemented with python-jose (JWT token management) and passlib[bcrypt]. Document parsing uses pandas, openpyxl, xlrd, tabula-py, PyPDF2, and pdfplumber. PDF generation is provided by the fpdf2 package.

      Optional integrations include an LLM accessed through OpenAI's official Python client. The base URL of this integration is configurable, allowing users to host their own model instance.

      Automated test suites for authentication and file uploads are implemented with pytest.

    2. Configuration and Deployment

      Configuration of the system takes place in the .env file, where environment variables such as JWT_SECRET, ACCESS_TOKEN_EXPIRE_MINUTES, the SQLite database URL, ALLOWED_ORIGINS, and OPENAI_API_KEY, among others, can be defined.

      In production, the recommendation is to switch the database from SQLite to PostgreSQL and to store objects in encrypted cloud storage for scalability. The ALLOWED_ORIGINS variable should also be restricted to the frontend servers only.

      The SECRET_KEY provided by default should be changed to a randomly generated secure one before deploying the application.
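A possible local-development .env along these lines is shown below; the values are placeholders, and variable names whose exact spelling was garbled in extraction are reconstructed as assumptions.

```shell
# Illustrative .env for local development; replace every value before deploying.
JWT_SECRET=replace-with-a-long-random-string   # never ship the default secret
ACCESS_TOKEN_EXPIRE_MINUTES=30
SQLITE_URL=sqlite:///./auto_itr.db             # name reconstructed; swap for PostgreSQL in production
ALLOWED_ORIGINS=http://localhost:8000          # restrict to frontend hosts in production
OPENAI_API_KEY=                                # optional; leave empty to disable the LLM fallback
```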

    3. API Surface

    The following APIs are exposed by the system. The /api/auth/* endpoints manage authentication, while the /api/statements route manages the uploading, processing, listing, and deletion of statements.

    The AI-based functionality is offered via the /api/ai/chat, /api/ai/anomalies, and /api/ai/optimize-tax endpoints. Aggregated tax analytics can be accessed from the /api/analytics/* endpoints. Further endpoints, /api/itr/*, /api/review/*, and /api/export/*, handle ITR filing, CA review, and data export, respectively.

  7. Privacy, Security, and Compliance

    Ownership checks are performed per user at the API level to prevent access to other users' private data. All upload, processing, and deletion actions are logged by the system with the associated timestamp and username.

    Only bcrypt password hashes are stored in the system database; plaintext passwords are never persisted. Tokens expire after a short duration, configurable via ACCESS_TOKEN_EXPIRE_MINUTES.

    To facilitate ethical use, the chatbot's system prompt strictly forbids any advice regarding tax evasion. In addition, anomalies such as a high volume of cash transactions are detected and highlighted.

    Storage is local, using SQLite and the filesystem. For production usage, a managed relational database and encryption of data at rest are recommended.

  8. Limitations and Future Work

The existing SQLite database does not efficiently handle parallel writes or backups; hence migration to PostgreSQL is planned. The current SmartCategorizer relies on heuristic techniques; combining lightweight sentence-level models with federated personalization could improve recognition of varied transaction narrations.

The anomaly detection technique uses static statistical thresholds, which means merchants with naturally high transaction variation can be flagged as anomalous. Such false positives could be reduced with season-aware and peer-based comparisons.

In the current system, users must enter the necessary HRA and NPS values manually; the process would become simpler if Form 16, 26AS, and AIS data were ingested directly for these calculations.

Acknowledgment

Acknowledgment is due to the open-source communities supporting FastAPI, SQLAlchemy, pandas, pdfplumber, and fpdf2. Without the contributions made by these libraries, the development and implementation of the system would not have been possible.

References

  1. N. Tan, S. Lim, and R. Patel, "Automated transaction classification in personal finance applications using supervised learning," in Proc. IEEE Int. Conf. Data Engineering, 2021, pp. 112-119.

  2. A. Sharma and P. Gupta, "Document understanding for tax forms: A structured extraction approach," in Proc. Int. Conf. Document Analysis and Recognition (ICDAR), 2022, pp. 340-347.

  3. M. Chen, Y. Liu, and J. Zhang, "Personal expense tracking via automated bank statement parsing," IEEE Trans. Consumer Electronics, vol. 68, no. 2, pp. 145-153, May 2022.

  4. R. Kumar and S. Singh, "Bank transaction categorization using natural language processing and ensemble classifiers," in Proc. IEEE Int. Conf. Artificial Intelligence and Machine Learning, 2023, pp. 78-85.

  5. L. Wang, F. Zhou, and T. Li, "Recurrence-aware financial transaction labeling for personal budget systems," IEEE Access, vol. 11, pp. 33210-33221, 2023.

  6. V. Iyer and M. Nair, "Break-even analysis for old versus new income tax regime selection in India," Int. J. Finance and Economics, vol. 15, no. 3, pp. 210-219, 2023.

  7. S. Das and A. Chatterjee, "Knowledge-base augmented conversational agents for tax advisory: Design and user evaluation," in Proc. ACM Int. Conf. Information and Knowledge Management (CIKM), 2022, pp. 2901-2910.

  8. W. McKinney, "Data structures for statistical computing in Python," in Proc. 9th Python in Science Conf., 2010, pp. 56-61.