DOI : https://doi.org/10.5281/zenodo.19440063
- Open Access
- Authors : Adnan Ibrahim, Mrs Kavita Agrawal
- Paper ID : IJERTV15IS040215
- Volume & Issue : Volume 15, Issue 04 , April – 2026
- Published (First Online): 06-04-2026
- ISSN (Online) : 2278-0181
- Publisher Name : IJERT
- License:
This work is licensed under a Creative Commons Attribution 4.0 International License
AI DOCUMENT GENERATOR: with Real-Time Tracking and Updation System
First A. Adnan Ibrahim 1, Second B. Mrs Kavita Agrawal2
1RESEARCH SCHOLAR, Dept. Of Computer Science and Engineering, Integral University, Lucknow, U.P. 2RESEARCH
SCHOLAR, Dept. Of Computer Science and Engineering, Integral University, Lucknow, U.P.
Abstract – The success and adoption of software libraries and frameworks depend on documentation being a very important factor. Although this is important, documentation in most open- source and small-team projects is usually only partially complete, or even old-fashioned, or not in line with updated codebases. This discrepancy results in confusion by the developer, failure of examples, and lack of confidence in software systems. This project shows the work of a documentation generation and maintenance system based on AI, which is used to create extensive documentation directly related to the source code. The proposed system supports both frontend frameworks and backend libraries, unlike current strategies, which typically target the APIs of the backend only or only the summary of static codes in the form of a single pipeline. It produces full documentation websites and exportable PDF manuals, such as API descriptions, conceptual descriptions, usage instructions, and verified examples. Through a procedure of combining language model production with a statistical analysis, repository aware retrieval, and continual update mix, the system reduces documentation drift and guarantees long-term stability. The solution will fulfil the pragmatic requirements of individual maintainers and small development teams, providing documentation of high quality at minimum manual inspection.
Keywords: Perhaps it will go with the automated documentation, frontend frameworks (F.M), backend APIs, large language models (LLMs), continuous documentation, and also with the docs-as-code.
-
INTRODUCTION
Software documentation is important in enhancing the usability, maintainability, and software system adoption [7], [15]. Clear documentation can help developers to read through APIs in the minimum amount of time, use libraries properly, and prevent general errors in implementation. But there is a consistent problem with having proper documentation, especially on open-source work and those with minimal resources for development, so to speak [1], [6]. Software is dynamic and contracts and expands, with APIs, components, and features being introduced and removed. Documentation can frequently lag behind these changes and find itself with obsolete descriptions and faulty examples [10], [16]. Although some of the new developments in artificial intelligence have demonstrated the potential to automate documentation work [6], [7], most current solutions
are narrow-minded and not conducive to ongoing changes. This work proposes a documentation system that is generated automatically and regarded as a living object derived directly out of the source code. The system works with frontend and backend initiatives and keeps documentation in line with the changing codebases.
-
RELATED WORK
Past studies on the automatic generation of documentation majority consider the summarization of code and the generation of documentation of backend APIs [6], [7], [15]. There are models that write formulating summaries and comments of separate functions using large language models [2], [8]. Other tools attempting to infer API coverage attempt to determine API information, such as OpenAPI or Graph QL schemes, using backend services [4], [20]. Despite these gains, there are massive issues with most of the existing tools. They completely disregard frontend structures and UI components [9], [14], they tend to make documentation only once [5], [16], and they hardly evaluate whether the examples are working out or not [12]. We use these concepts as the foundation of our system and introduce only one continually updated documentation pipeline, which is suitable in practice in a real software project.
ASPECT : Frontend
Pre-existing Research Paper
Frontend Focus
Frontend is lightly considered or briefly touched upon in works that focus on backbends [1], [5], [7], [9], [16]
UI Component Understanding
UI components are not explicitly defined or detailed [6], [7], [14].
Props and Inputs
Restricted to the parameters of the functions, Ul props are not discussed [2], [8], [15]
State Management
State behaviour is not represented in documentation generation [3], [9], [13]. Events and Side Effects Events and side effects are not represented [2],[1],[7],[16]
Awareness Framework
Awareness Generally framework agnostic or backend-based
[6], [10], [17].Examples of usage
Awareness Generally framework agnostic or backend-based [6], [10], [17].
Example Reliability
No frontend example validation [6], [7], [12].
Documentation Generates
Single time document generation [5], [9], [16]
Developer readability
Research documentation [7], abstract documentation [14],
[18]Table 1: Comparison and focus of Research paper and their limitations of their frontend part including their references.
Aspect Refs. : Backend
Existing Research papers.
External API
Extensive coverage of external APIs and functions [4], [5], [10], [12] [20]
Function and Class Documentation
Primarily just function summaries [6], [7] [8] [9] [15]
Endpoint Documentation
Endpoints described alone or by specification only [4], [20]
Request and Response Parameters
These are often inferred or summarized in one way or another [10] [12], [15]
Error Semantics
Minute employee codified [6], [7], [16].
Open API/Graph QL Support
Generated or can be generated in certain tools [4], [20]
Cross-Module Awareness
Weakness in the knowledge of inter-file dependencies [5], [9], [17]
Code Examples
Examples [6], [11], [15] of the code.
Example Validation
There are no explicit validation mechanisms [7] [12] [16]
Documentation Updates
One-lime generation Most (all but one) studies generate it once [5] [9], [16].
Table 2: Comparison of Focus of the research paper and their limitations of their backend part with their references of their different research.
-
PROBLEM STATEMENT
Paper work is time-consuming and prone to the error [1], [10]. On changing the code, the documentation becomes obsolete; thus, it misleads the developers and makes them lose their confidence [7]. The following problems are not addressed by automatic tools that exist:
-
Absence of coherent support of frontend and backend projects. [9]
-
Lack of constant as well as the communication with code changes [5].
-
Reduced validating of their generaed or created examples (12), and
-
Lack of support of production-ready documentation processes [16]. These issues demonstrate that a system is required that can produce and maintain documentation at minimal cost and reliably.
-
-
PROPOSED SYSTEM
-
System Overview
The system proposal will take a repository of software based on which it will automatically create proportionated documentation artefacts. It will captures Publix interfaces, functions, classes, components, and API routes and create a site of documentation and downloadable PDFs.. The system is repository conscious and allows cross-file and cross- module knowledge. The frontend produces documentation based on its inputs and automatically updates this documentation as it runs and processes inputs [5].
-
Frontend Document Generation
The frontend is a system that takes inputs in order to generate documentation and automatically updates the documentation as it processes inputs in a running manner. Regarding frontend frameworks like React, the system recognizes the UI elements and removes data regarding props, default values, events, state usage, and side effects [9]. The documentation produced contains clear usage examples, best practices, and those that work with component preview tools. This gives the developers insight into component behaviour without necessarily developing into the source code to look at it [14].
-
Algorithm 1: The proposed system has a step-by-step process as explained in this algorithm on the automated frontend documentation generation.
Input: Frontend source code online store.
Output: Front end documentation in human readable format.
STEPS Are:
-
Load Repository: Turn-on the process by loading the frontend source code repository.
-
Identification: it recognise the framework of frontend design and their architecture like react (component based).
-
UI Component Detection: Recognise all relationship information of user interfaces and hierarchy.
-
Interface Extraction: Strip out every part to have props, default values and input creatives.
-
Change Tracking: Automation of detection of changes in frontend source code and automatic update of automatically affected portions of documentation.
-
Behaviour Analysis: Ultimate component logic in order to find out the state usage, event handlers, and side effects.
-
Interaction Mapping: Determine the communication between the components and how they are used in the application.
-
Documentation Creation: Author description documentation, of why, how and what every element of the system does.
-
Construct: Use good and working examples of how to use components.
-
Example Validation: Make sure that the examples that one generates are based on the real behaviour of parts.
-
Change Tracking: Automation of detection of changes in frontend source code and automatic update of automatically affected portions of documentation.
-
Documentation Formatting: Showcase the final output in form of human readable and developer friendly form.
End Algorithm.
-
-
-
Backend Document Generation
This is the documentation that you have generated regarding the back-end. Frontend libraries and services Backend libraries and services are documented to record the functions, classes, endpoints, request parameters, responses, and error semantics. Open API and the Graph QL as well, their specifications are automatically generated when needed [4], [20]. Multiple client languages to code snippets are available to make it easier to use and adopt [11].
-
Algorithm 2: The workflow of automated backend documentation generation in the proposed system is presented in this algorithm.
Input: firstly the repository of the backend.
Output: gaining structured and continuous updated backend documentation.
STEPS ARE:
-
Load Repository: The documentation of the back- end is a major issue.
-
Class and Interface Detection: Classes, documents and relationships.
-
Type and Signature Extraction: Type and concluded with description and details.
-
Parameters Analysis: There are documented parameters, both defaults and responses.
-
Error Semantics Identification: Load shedding of error and meaning cases.
-
Repository-Aware Context Analysis: This is dynamically built where necessary.
-
Generation of documentation: Backend knowledge of documentation of the repositories.
-
Construction of the example: Useful realistic examples.
-
Example Checking: The example checks were in agreement with real backend behaviour.
-
Change Detection/ Documentation Update: Automatic updates to changes of the backend code. END
-
-
-
Ongoing Updating and their Interrupted Drifts.
The system tracks alterations in the codebase with every commit or release in order to avoid documentation drift. The relevant sections in the documentation are re-generated, and unnecessary computation is saved [16]. Original generated content will be tested as well as immediately or use of execution with sandboxes and obsolete or irrelevant content will be indicated as an item to review.
-
Accuracy and Validation
The system uses language model generation together with the type hints, schema-guided outputs, and the statistical analysis of the code in order to be more precise [3], [18]. The case of repository-aware retrieval is where cross-file dependencies are rightly represented in the documents. Confidence base scores and change diffs are also being given to allow lightweight human review where necessary [17].
-
Implementation Details
-
The documentation generated is in standard formats, including Markdown and MDX and JSDoc and restructured
Text, to perform work with docs-as-code [12], such as coverage checks, linting rules checks, and a Min Validation check. It can be cost-effective open-source models that can be refined at the cost of optional consistency of the use of terminology and style [19].
Flowchart 1: this defines the steps of the whole process scenario from input to output.
Source Code Repository:
The number of front end and back end components must be adequate for effective Integration. There should be enough fro-netnd and back end components in order to assist with the integration
Paragraph Analysis of Repository Parsing, and Structural Dissection
There are two types of interfaces
: group interface and individual interface. Interfaces are identified. (Components, Application Programming Interfaces, Classes, Functions)
Static Code Analysis:
(Type Inspection, Parameter Evaluation Dependency Mappin)g
Documentation Generation Using Language Models
Context Retrieval Sensitive Contest Retrieval
Sensitive Retrieving contextual details like labels in a picture or diagrammatic illustration like Structure.
(Illustrative Cases, and Semantic Integrity)
Validation and Consistency Evaluation : (Illustrative Cases, and Semantic Integrity)
Assembly of Documentation
(API Documentation, Guides, Illustrative Example)s
RESULT
-
RESULT
To measure the efficiency of the suggested system, the same is compared with 20 available research works in the main dimensions of documentation and also maintaining the code summarization , Api documentation, and repository level
analysis [1], [5], [7], [15]. The aspects that are compared in terms of frontend support, backend coverage, continuous updates, example validation, and documentation structure are taken into account Findings show that the current solutions offer partial backend support, which is not fully supported by them in terms of frontend documentation and maintenance.
Figure 1: Feature Level Comparison of Documentation Approaches
Comparison with the 5 Research Papers as well with Proposed System
Level of Support(Qualitative Scale:0-5)
5 5 5 5 5 5
4 4
3 3 3 3
1 1 1 1 1
1 1 1 1 1
2
1 1 1 1
2 2 2
FRONTEND BACKEND AUTO UPDATES EXAMPLE
VALIDATION
REPOSITORY AWARENESS
Documentation Specification/dimension between them :
C2D(P1) RP SUMMARY(P2) GBDG(P3) ALS(P4) A/O CRAWLER(P5) PROPOSED SYSTEM
Figure 1 is a qualitative comparison of existing research methods and the proposed system on five dimensions, which are frontend support, backend support, continuous updates, example validation, and docs-as-code workflow. The current literature supports partially the use of documentation of back ends, but its reports indicate a scarcity of coverage of frontend frameworks [1], [4], [5], [9], [12]. In past work, there is mostly the lack of the continuous update mechanisms or the validation of examples [6], [7], [16]. On the contrary, the offered system shows full coverage in all the dimensions that are examined. The findings show an obvious transition of the formerly stagnant, rear-end-centric documentation tools to a common, sequentially styled, documentation remedy capable of meeting the demands of the actual software development [5], [17], [18], [20].
Y Axis: Derived and with 5 representative references (1), [5], [6], [9], [20] of their level of support.
-
it includes no support
-
it includes minimal or implicit support. 2-3- it includes partial/indirect support.
4-5- it includes strong (but scope limited) and comprehensive and explicit support.
-
-
LIMITATIONS
The system can be challenged by the lack of type information or automated tests in a source code [6], [15]. Such generated documentation might need further human validation in such cases. But indicators of ambiguity and confidence scores are useful in mitigating these shortcomings.
-
CONCLUSION
The given paper introduces an automated and continuously updated documentation system or system that will be able to mitigate the limitations of the current documentation methodologies [7], [16]. The system gets to handle documentation to get to become a coherent and reliable physical resource by supporting frontend and backend projects, validating example documentation, and allowing documentation to keep pace with codebases in change, as it gets modified. The suggested solution comes in handy, especially when small teams and individual maintainers want to get production quality documentation with a small overhead. Future research can consider adding more extensive analysis of UI behaviours and enhanced forms of validation as well as language support.
REFRENCES
-
Code2Doc: A Quality-First Curated Dataset of Code Documentation Y. Zhang, X. Liu, and Z. Wang, arXiv preprint arXiv:2305.XXXX, 2023.
-
H. Liu, Z. Gao and S. Wang, Context-Aware Code Summary 1 Generation, Proceedings of the 30th ACM Joint European Software Engineering Conference, pp. 1-12, 2022.
-
The article lists the following as the strengths of the study: [3]
A. Nguyen and T. Mens, "Formal Methods Meets Readability: Auto-Documenting JML-Annotated Java Code," IEEE Transactions on Software Engineering, vol. 48, no. 6, pp. 2101- 2115, 2022.
-
M. Koutrouli, K. Lakiotaki and G. Manolopoulos, Automating API Documentation with Large Language Models, Journal of Systems and Software, vol. 189, pp. 111-125, 2022.
-
Y. Chen, S. Zhou, D. Lo, RepoSummary: Feature-Oriented Repository Summarization, Proceedings of the 44 th International Conference on Software Engineering (ICSE), 110- 121, 2022.
-
M. Chen et al., Automatic Code Documentation Generation with GPT Models arXiv preprint arXiv:2107.XXXX, 2021.
-
S. Ahmad, A. M. Zaidman and B. Vasilescu, Source Code Summarization in the Era of Large Language Models, ACM Computing Surveys, vol. 55, no. 8, pp. 1-38, 2023.
-
X. Li and J. Chen, "Distilled Transformer Models of Source Code Summarization," Proceedings of the AAAI Conference on Artificial Intelligence, pp. 10450-10457, 2021.
-
J. LeClair, S. McMillan, and C. McMillan, Code Summarization Beyond the Function Level, Proceedings of the 26 th International Conference on Program Comprehension, pp. 12- 23, 2019.
-
Y. Zhou, M. Lyu and J. Zhao, DocChecker: Detection and repair of code-comment inconsistency, IEEE Transactions on Software Engineering, vol. 47, no. 9, pp. 1976-1991, 2021.
-
The study in question is the work by J. Chen, Z. Lin and
D. Jiang titled as An Empirical Study of Large Language Model
Usage in Software Engineering published in the ACM SIGSOFT Symposium in 2023.
-
R. Patel and K. Shah, gDoc: Automatic API Documentation Generation Structured Documentation," International Journal of Software Engineering and Knowledge Engineering, vol. 31, no. 4, 567-584, 2021.
-
D. Guo et al., "Large Language Models to Code Completion and Context Understanding," arXiv preprint arXiv:2208.XXXX, 2022.
-
M. Allamanis and C. Sutton, "Memory and Generalization in Code Intelligence Models," Proceedings of NeurIPS, pp. 1- 11, 2020.
-
P. Rodeghero et al., "Automatic Documentation Generation through Source Code Summarization," Empirical Software Engineering, vol. 25, no. 6, pp. 1-30, 2020.
-
T. N. Nguyen et al., Use of GPT-4 to large-scale document sources code, arXiv preprint arXiv:2304.XXXX, 2023.
-
Weber (1996). Retrieval augmented generation of API knowledge: Proceedings of the international conference on software maintenance and evolution. 2022. p. 211-222.
-
E. Alkhalifah and A. Mahmoud, Specification-driven Documentation Generation of Software systems, IEEE Software, vol. 39, no. 5, pp. 42-49, 2022.
-
K. Ahmad et al., Multilingual Dataset Construction to Code Documentation Tasks, in Proceedings of the ACL, p. 3456-3467, 2021.
-
A. Sohan, M. Aniche, and A. Bacchelli, SpeCrawler: Developing openapi specifications using API documentation,
pp. 176-187, in the 35 th IEEE International Conference on Software Maintenance and Evolution, 2019.
