DOI : https://doi.org/10.5281/zenodo.19627644
- Open Access

- Authors : Guruvelli Ramya Sree, Amrutaluri Dhaneshwar, Shaik Affiq Ahmed, Mr. Bhujanga Reddy
- Paper ID : IJERTV15IS040516
- Volume & Issue : Volume 15, Issue 04 , April – 2026
- Published (First Online): 17-04-2026
- ISSN (Online) : 2278-0181
- Publisher Name : IJERT
- License:
This work is licensed under a Creative Commons Attribution 4.0 International License
Policy Impact Analyzer using Parliament Debates & News
Guruvelli Ramya Sree
Computer Science and Engineering Geethanjali College of Engineering and Technology Hyderabad, Telangana
Amrutaluri Dhaneshwar
Computer Science and Engineering Geethanjali College of Engineering and Technology Hyderabad, Telangana
Shaik Affiq Ahmed
Computer Science and Engineering Geethanjali College of Engineering and Technology Hyderabad, Telangana
Mr. Bhujanga Reddy
Assistant Professor, Computer Science and Engineering Geethanjali College of Engineering and Technology Hyderabad, Telangana
Abstract – Policy Analyzer is a fully integrated web-based tool for the systematic analysis, comparison, and evaluation of policies formulated by the Indian government. The software introduces an impact scoring algorithm that rates and prioritizes policies along six parameters: policy coverage, beneficiary reach, budget allocation, implementation scale, sector relevance, and complexity reduction. The application combines a React front-end with a Node.js back-end and covers 101+ policies across sectors. Policy Analyzer offers several important features, including intelligent policy search with debounce optimization, comparative policy analysis to find synergies among different policies, automatic identification of beneficiaries, and analysis of policy distribution across sectors.
Index Terms – Policy Analytics, Impact Scoring Algorithm, Policy Comparison System, Full-Stack Web Application, Government Policy Intelligence, Natural Language Processing, Beneficiary Analysis, Sector-wise Policy Distribution
-
INTRODUCTION
Policy Analyzer is a web-based platform designed to change how people access, analyze, compare, and understand government policies in India. The project tackles a key issue in governance: the scattered and hard-to-reach policy information across many government databases, ministries, and official sources. India has rolled out over 100 key government policies in areas like healthcare, education, agriculture, employment, rural development, financial services, and social security. These policies represent huge investments of public funds, often reaching thousands of crores of rupees, and they significantly affect the lives of hundreds of millions of citizens.
Despite their importance, these policies are spread out across various government websites, ministry portals, parliamentary databases, and official documents. A researcher trying to understand a specific policy must sift through many different sources. A citizen looking for relevant government schemes has no single platform to use. A government official who needs to coordinate policies across ministries does not have tools to find connections and dependencies. A policy
analyst who wants to compare policies across sectors lacks a standard way to do so. Policy Analyzer was built to address these issues by bringing together policy information, offering useful analysis tools, and making policy information available to various groups.
The idea for Policy Analyzer came from recognizing that policy analysis is essential for governance and public welfare, yet it remains mostly manual and fragmented. Government policies are complex documents that include various interconnected elements: objectives, target beneficiaries, implementation strategies, budget allocations, amendments, parliamentary discussions, challenges, and future prospects. Understanding a single policy fully often involves reading lengthy documents and piecing together information from different sources. Comparing two policies to find their relationships, similarities, or conflicts demands time-consuming manual analysis that can lead to subjective interpretations. To determine which policies might benefit a specific citizen based on their demographic profile, one must know all available policies and their target audiences: a knowledge base typically held by government officials or dedicated researchers. Evaluating the impact of various policies across sectors demands a standardized evaluation framework that is not currently available in a unified manner. Policy Analyzer tackles these challenges with an integrated platform that combines modern technology, smart algorithms, and user-friendly design to make policy information accessible and enable systematic analysis.
Policy Analyzer consists of several interconnected
components that function as a unified system. The platform offers a robust policy search engine that lets users find relevant policies by inputting policy names, keywords, sectors, or beneficiary types. Instead of just matching keywords, the search system uses debouncing optimization to limit unnecessary API calls while providing real-time suggestions that help users discover policies they may not have known about.
Behind the search feature is a database containing 101 government policies, each with detailed information such as
title, content, summary, ministry, category, budget allocation, target audience, objectives, benefits, implementation strategy, parliamentary discussions, debates, amendments, challenges, and future prospects. This extensive detail allows for a level of analysis and comparison that far exceeds traditional policy databases.
The platform’s main innovation is an intelligent impact scoring system that assesses each policy across six weighted dimensions: policy coverage (measuring documentation and scope), beneficiary reach (estimating the targeted population segments), budget allocation (reflecting financial resources invested), implementation scale (based on documented discussions and benefits), sector importance (based on explicit highlights), and complexity reduction (reflecting how well-documented challenges are). These six dimensions are normalized against all policies in the database and combined with tailored weights to create a final impact score ranging from 0 to 100. This scoring allows for fair comparisons of policies across different sectors, objectives, implementation methods, and operational scales.
In addition to analyzing individual policies, Policy Analyzer offers a sophisticated policy comparison feature that stands out in governance technology. Users can choose any two policies and receive a thorough comparative analysis that highlights similarities, differences, synergies, and complementarities. The comparison framework looks at several aspects: Do the policies target the same sectors? Are they led by the same ministry or different ones? Do they serve overlapping beneficiary populations? Were they introduced around the same time or one after the other? Are there clear connections indicating that one policy developed from another? Could beneficiaries of one policy access resources from the other? Do both policies face similar implementation challenges that coordinated efforts might resolve? The system automatically finds logical relationships between policies and provides actionable insights for policy coordination. For instance, it might show that a skill development policy and a microfinance policy target overlapping populations and could be combined into a single program that helps skill-trained individuals obtain business loans and start businesses. This kind of insight, which usually takes months of manual research, is generated by Policy Analyzer in seconds.
-
LITERATURE REVIEW
The recent advancements in Artificial Intelligence (AI) and Natural Language Processing (NLP) have greatly changed how people perceive and analyze public opinion and policy data in various ways. Many studies have examined the role of machine learning and deep learning techniques in extracting sentiment, identifying topics, and tracking trends from large amounts of text data, like news articles. These studies provide a solid foundation for developing systems that aim to analyze the impact of policy. The research on AI-driven public sentiment and trend prediction for 2025 combines CNN, BERT, and Graph Neural Networks (GNN) to analyze social
media discussions and forecast topic trends. The model achieves high accuracy in sentiment detection and can track changes in discussions over time. However, the system relies solely on social media data and overlooks formal sources, such as parliamentary debates and organized news content, which are essential for a complete policy evaluation.
The study on AI-based sentiment analysis for government policies (2025) uses BERT to categorize public opinion about policies. It shows how AI can detect people’s feelings toward various policy areas at a subtle level. Although the paper focuses on Twitter data, it overlooks other important sources like parliamentary debates and news media. It also lacks visualization techniques and time-series analysis to help understand the results.
Another key work cited is the hybrid deep learning approach that combines LSTM with machine learning techniques for analyzing Indian government policies (2024). This study demonstrates that LSTM architectures can model contextual dependencies in text more accurately than traditional methods. However, it still falls short because it does not use the latest transformer-based models, and it barely addresses scalability and the integration of multiple sources.
Finally, a systematic review in policy sentiment analysis (2025) evaluated different machine learning methods, including SVM and Logistic Regression. The paper identifies SVM as a strong baseline model and emphasizes the importance of preprocessing methods. However, it lacks advanced NLP models, interactive dashboards, and predictive features needed for real-life applications. Most earlier methods focus on single data sources, and there is no unified system that integrates parliamentary debates, news articles, and public opinion. Additionally, they offer limited visualization and real-time analysis options. The proposed Policy Impact Analyzer addresses these gaps by integrating multiple data sources, modern NLP techniques, and interactive dashboards to provide a complete and scalable tool for evaluating policy impact.
-
EXISTING SYSTEM
Currently, most policy analysis and public opinion monitoring systems operate separately. Each works with individual data sources instead of combining them into one analytical platform. One key source of official information is parliamentary websites, which publish transcripts of debates, policy documents, and legislative discussions. However, these documents are often lengthy, disorganized, and hard to interpret. They also lack features like summarization, sentiment analysis, or comparisons with public opinion. This forces users to read and analyze large amounts of text themselves to understand the policy implications.
News portals and media platforms do cover government policies, but each has its own viewpoint. This may lead to biased or inconsistent information. Right now, there is no way to unite insights from various news sources into one clear perspective. Additionally, these platforms do not provide tools for monitoring sentiment changes, identifying key discussion points, or tracking how public perception shifts over time. As
a result, users often struggle to get a fair and complete picture of policy impact.
Another category of existing systems is social media analytics tools. These tools typically focus on tracking hashtags, engagement metrics, and basic sentiment classification. While they capture public reactions in real time, they only operate at a surface level. Moreover, they do not connect discussions to official parliamentary debates or policy documents. They also fail to provide meaningful summaries or insights based on specific topics. Therefore, these tools do not fully reveal how policies are perceived across different platforms.
Furthermore, academic and research-based instruments offer advanced capabilities for analyzing text using NLP methods. However, these tools can be complicated, tedious, and challenging to use without technical knowledge. They are not designed for the general public, such as policymakers, journalists, or citizens, and lack user-friendly dashboards and real-time data processing features. Overall, current systems suffer from scattered data sources and extensive manual analysis. They also show little integration of multiple sources, minimal use of advanced AI techniques, and a lack of user-friendly visualizations. These issues highlight the need for a simple, all-in-one solution for thorough policy impact analysis.
-
PROPOSED SYSTEM
-
System Overview
Policy Analyzer is built on a three-tier web architecture that keeps things clean and organized: a frontend that users interact with, a backend that handles the logic, and a database that stores everything. This separation makes the system easier to scale and maintain over time, since each layer can be worked on independently without breaking the others.
The frontend and backend talk to each other exclusively through RESTful APIs which means the door is already open for building a mobile app or connecting with other systems down the road. Data updates happen in real time, so users are always looking at the most current policy information.
The system is also designed with the future in mind. Features like policy search, comparison, scoring, and report generation are logically separated, so they could eventually be spun off as independent microservices without a major overhaul. Security is taken seriously throughout: authentication uses JWT tokens, passwords are hashed with bcrypt, and CORS restrictions are in place to keep things locked down.
-
Frontend Components and User Interface
The frontend is built with React.js, which makes it easy to build interactive, component-driven interfaces. In total, there are seven major pages, each designed around a specific workflow.
The Dashboard is where users land first; it gives a quick overview of the system, including how many policies are in the database, what sectors are covered, and what's been updated recently. The Policy Search page is where most users will spend their time, featuring a smart search bar with real-time suggestions, animated visual effects, and results that show policy titles, ministries, categories, issue dates, and summaries at a glance.
The Impact Scorecard page breaks policy rankings down into four tabs: Impact Ranking (policies sorted by score), Sector-wise Distribution (policies grouped by sector with average scores), Beneficiary Analysis (which demographic groups are covered and by how many policies), and Visual Analytics (trend charts, investment breakdowns, and heatmaps). The Policy Comparison page lets users pick any two policies and get a side-by-side breakdown: similarities, differences, timeline relationships, and recommendations for how the two could work better together.
There's also a detailed policy view that opens in a modal, showing everything from objectives and benefits to parliamentary debates, amendments, challenges, and future outlook. The Report Sharing page lets users generate and download professional PDF reports for any policy. And finally, the Login and Signup pages handle account creation and secure authentication using JWT tokens.
The interface is fully responsive: it works just as well on a phone as it does on a widescreen monitor. Visually, it leans into a dark theme with gradient accents in purple, cyan, pink, and blue, giving it a modern feel. Animations and transitions are handled by Framer Motion, which adds a layer of polish to every interaction.
On the performance side, search inputs use debouncing: the system waits 400 milliseconds after you stop typing before firing an API call, which cuts down on unnecessary server requests. The interface also uses progressive disclosure: search results show the essentials upfront, and users can click View Full Details when they want the full picture. Error handling is built in throughout, with clear, readable messages whenever something goes wrong.
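The debouncing technique described above can be sketched as a small helper; the function and variable names here are illustrative, not the project's actual code:

```javascript
// Minimal debounce helper: delays invoking `fn` until `delayMs` ms have
// passed since the most recent call, discarding the intermediate calls.
function debounce(fn, delayMs) {
  let timer = null;
  return function (...args) {
    clearTimeout(timer);
    timer = setTimeout(() => fn.apply(this, args), delayMs);
  };
}

// Wired to a search box, only the final keystroke triggers the API call:
// input.addEventListener('input',
//   debounce((e) => fetchSuggestions(e.target.value), 400));
```

With a 400 ms delay, a user typing "education" at normal speed generates one request instead of nine.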
-
Backend Infrastructure and API Design
The backend runs on Node.js with Express.js, keeping things lightweight and efficient. It's organized into logical layers: configuration (database settings), models (data structures), routes (API endpoints), middleware (authentication and other shared logic), and scrapers (for collecting policy data).
The API follows REST conventions: GET for fetching data, POST for creating it, and so on. All policy-related endpoints live under /api/, with /api/auth/signup and /api/auth/login handling user accounts, and /api/policies handling everything policy-related. Searching is as simple as passing a query parameter, for example /api/policies?q=education, and getting back a JSON array of matching results.
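As a sketch of how such a query parameter might translate into a database filter, the helper below builds a case-insensitive MongoDB filter over the searchable policy fields. The field names follow the schema described in this paper; the helper itself is an assumed implementation, not the project's code:

```javascript
// Build a case-insensitive MongoDB filter matching query string `q`
// against the most commonly searched policy fields.
function buildSearchFilter(q) {
  const pattern = { $regex: q, $options: 'i' };
  return {
    $or: [
      { title: pattern },
      { content: pattern },
      { category: pattern },
      { ministry: pattern },
    ],
  };
}

// Inside the route handler it might be used as:
// app.get('/api/policies', async (req, res) => {
//   const filter = req.query.q ? buildSearchFilter(req.query.q) : {};
//   res.json(await Policy.find(filter));
// });
```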
Error handling is consistent and meaningful: 200 for success, 201 when something new is created, 400 for bad input, 401 for authentication issues, 404 when something can't be found, and 500 for server-side problems. The backend also logs requests and errors, which makes debugging and monitoring much easier in production.
CORS is configured so that the frontend, even when hosted on a separate domain, can communicate with the backend without issues. Request parsing, error handling, and authentication checks are all handled through Express middleware. For database operations, connection pooling keeps things efficient under concurrent load, and queries are optimized to stay fast even as usage grows.
-
Database Design and Schema
Policy Analyzer uses MongoDB as its database, and it's a natural fit. Policy documents vary a lot in how much information they contain: some are extensively documented, others are sparse, and MongoDB's flexible schema handles that variability without any awkward workarounds.
There are two main collections. The Users collection stores account information: a MongoDB-generated ID, the user's name, a unique email address, a bcrypt-hashed password, and a creation timestamp. Plaintext passwords are never stored.
The Policies collection is more involved. Each document includes the policy ID, title, full content, executive summary, issue date, ministry, category, and status (Active, Proposed, or Archived). Beyond that, it stores arrays for benefits, objectives, debates, amendments, key highlights, and implementation challenges, along with fields for target audience, budget allocation, implementation strategy, parliamentary discussions, future prospects, and source documentation.
To keep search fast, MongoDB indexes are created on the fields users are most likely to search: title, content, category, and ministry. The $regex operator enables case-insensitive keyword searches across these fields. The design also accounts for future growth: sharding (splitting data across servers by policy ID or sector) and vertical scaling are both supported. Connection pooling handles multiple simultaneous application connections efficiently.
-
Impact Scoring Algorithm
The impact scoring algorithm is the analytical heart of Policy Analyzer. It gives every policy a quantitative score
between 0 and 100, making it possible to compare policies across completely different sectors on a level playing field. The score is built from six weighted dimensions.
Policy Coverage Weight (20%) looks at how thoroughly a policy is documented. It combines content length (normalized against the longest policy in the database) and number of objectives (normalized against the policy with the most), each weighted equally. The idea is that more comprehensive, multi-objective policies tend to have broader scope and greater potential impact.
Beneficiary Reach (20%) estimates how many different population groups a policy targets. It counts the distinct words in the target audience field as a proxy for diversity, normalizes that count, and applies a tanh function, which rewards broader reach but with diminishing returns, so a policy targeting 10 groups doesn't score disproportionately higher than one targeting 8.
Sector Importance (15%) is a simpler, binary-style metric. Policies with documented key highlights or achievements score 1.0; those without score 0.5. The logic is that if a policy's impact has been measured and recorded, it's more likely to be making a real difference.
Budget Allocation (20%) is straightforward: larger budgets generally mean greater reach and scope. The budget is normalized against the highest-funded policy in the database and used directly in the score.
Implementation Scale (15%) looks at how actively a policy is engaged with, measured by the number of parliamentary debates and documented benefits. Both are combined and passed through a tanh function. Policies generating real debate and delivering tangible benefits are treated as more actively implemented.
Complexity Reduction (10%) rewards policies whose implementation challenges have been well addressed. The score is 1.0 minus the ratio of documented challenges to 10 (floored at 0). Fewer unresolved challenges means lower implementation risk and more reliable delivery.
All six dimensions are normalized to a 0-1 range before being combined as a weighted sum: finalScore = (policyCoverageWeight × 0.2) + (beneficiaryReach × 0.2) + (sectorImportance × 0.15) + (budgetAllocation × 0.2) + (implementationScale × 0.15) + (complexityReduction × 0.1). The weights add up to 1.0, keeping the result in range. The final score is clamped and multiplied by 100 to produce a clean 0-100 number.
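The weighted sum above can be expressed as a short function. This is a sketch: it assumes each dimension has already been normalized to 0-1 as described (the normalization and tanh steps are not reproduced here), and the property names are illustrative:

```javascript
// Clamp a value into the 0-1 range to guard against rounding drift.
const clamp01 = (x) => Math.min(1, Math.max(0, x));

// Combine six pre-normalized dimensions into a 0-100 impact score,
// using the weights stated in the text (they sum to 1.0).
function computeImpactScore(d) {
  const raw =
    d.policyCoverage      * 0.20 +
    d.beneficiaryReach    * 0.20 +
    d.sectorImportance    * 0.15 +
    d.budgetAllocation    * 0.20 +
    d.implementationScale * 0.15 +
    d.complexityReduction * 0.10;
  return Math.round(clamp01(raw) * 100);
}
```

A policy scoring 1.0 on every dimension receives 100; one scoring 0 everywhere receives 0.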
-
Policy Comparison Framework
The comparison framework is what lets users put any two policies side by side and actually understand how they relate. It runs thirteen separate analyses, each looking at a different angle of the relationship.
Sector comparison infers each policy's sector from its title using keyword matching (for example, policies mentioning skill or training get classified as Employment & Skills). If both policies share a sector, that's flagged as aligned focus. If they're in different sectors, the framework checks whether those sectors have a natural relationship, like Education and Employment, since educated individuals become skilled workers.
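The title-based sector inference might look like the sketch below. The keyword map here is a small illustrative subset under assumed names; the actual system's mapping is not reproduced:

```javascript
// Illustrative keyword-to-sector map (a subset, for demonstration only).
const SECTOR_KEYWORDS = {
  'Employment & Skills': ['skill', 'training', 'employment'],
  'Education':           ['education', 'school', 'scholarship'],
  'Healthcare':          ['health', 'hospital', 'insurance'],
  'Agriculture':         ['farmer', 'crop', 'agriculture'],
};

// Classify a policy by the first sector whose keywords appear in its title.
function inferSector(title) {
  const t = title.toLowerCase();
  for (const [sector, words] of Object.entries(SECTOR_KEYWORDS)) {
    if (words.some((w) => t.includes(w))) return sector;
  }
  return 'General';
}
```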
Objectives comparison pulls objectives from both policies and looks for shared themes like development, training, or quality. Benefits comparison checks whether both policies offer similar benefit types: financial support, training, healthcare, and so on. Implementation channel analysis looks at delivery keywords like district, bank, NGO, or school to see if both policies use overlapping mechanisms to reach people.
Amendment analysis compares how many times each policy has been revised; more amendments typically signal a more mature, refined policy. Ministry comparison checks whether the same ministry manages both, which affects how easily they can be coordinated. Timeline analysis uses issue dates to determine whether the policies were launched together (suggesting a coordinated strategy), sequentially (suggesting one built on the other), or far apart.
Target audience analysis extracts demographic descriptors (youth, women, farmers, rural, poor, students, seniors, disabled) and identifies overlap. Budget comparison looks at the relative financial investment behind each policy. Highlights comparison checks whether each policy has documented achievements.
The framework also looks at shared challenges and future visions, then generates recommendations based on everything it has found. If both policies are in the same sector under the same ministry, it might recommend consolidation. If they're in related but different sectors, it might suggest creating cross-sector pathways for beneficiaries. If they share challenges, it might recommend coordinated mitigation strategies.
Finally, when explicit connections are limited, the framework suggests creating formal links like MOUs or policy document updates so beneficiaries can more easily access benefits from both programs.
Results are organized into six output categories: similarities, differences, timeline, relations, impactAnalysis, and recommendations.
-
Natural Language Processing for Beneficiary Extraction
To automatically tag policies with the demographic groups they serve, Policy Analyzer uses a keyword-based NLP approach. A dictionary maps fourteen beneficiary groups (Students, Farmers, Women, MSMEs, Senior Citizens, Rural Population, Urban Poor, Startups, Healthcare Workers, Unorganized Workers, SC/ST Communities, Children, Persons with Disabilities, and Minorities) to characteristic keywords for each group.
For example, Students maps to words like student, pupil, learner, scholar, and education. Farmers maps to farmer, agricultural, farming, cultivator, and agri. Women maps to women, woman, female, girl, homemaker, and mother.
The extraction process combines each policys target audience, benefits, and objectives into one block of text, converts it to lowercase, and scans for keyword matches. Any beneficiary group whose keywords appear in that text gets assigned to the policy.
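The extraction process just described can be sketched as follows. Only three of the fourteen groups are shown, using the keyword examples from the text; the field names (`targetAudience`, `benefits`, `objectives`) follow the schema described earlier but the function itself is an assumed implementation:

```javascript
// Partial keyword dictionary (three of the fourteen groups, taken from
// the examples in the text; the full map covers all fourteen).
const BENEFICIARY_KEYWORDS = {
  Students: ['student', 'pupil', 'learner', 'scholar', 'education'],
  Farmers:  ['farmer', 'agricultural', 'farming', 'cultivator', 'agri'],
  Women:    ['women', 'woman', 'female', 'girl', 'homemaker', 'mother'],
};

// Combine the relevant policy fields into one lowercase text block and
// return every group whose keywords appear in it.
function extractBeneficiaries(policy) {
  const text = [
    policy.targetAudience || '',
    ...(policy.benefits || []),
    ...(policy.objectives || []),
  ].join(' ').toLowerCase();
  return Object.keys(BENEFICIARY_KEYWORDS).filter((group) =>
    BENEFICIARY_KEYWORDS[group].some((kw) => text.includes(kw))
  );
}
```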
This extracted data powers the beneficiary filter in the UI: a farmer can filter for Farmers and immediately see every policy relevant to their livelihood. The tags are displayed visually with color-coded labels on each policy card, so users can see at a glance who a policy is designed for. The keyword approach scales well: new policies get classified automatically without manual review. And there's a clear path to improving accuracy further: training machine learning models on manually classified policy-beneficiary pairs would make the extraction even more reliable over time.
-
Report Generation System
The report generation system lets users take any policy and turn it into a clean, professional PDF they can save, share, or reference offline. It's all powered by the jsPDF library and kicks off the moment a user clicks Generate PDF on the Report Sharing page.
The PDF is formatted for A4 paper in portrait orientation. It opens with a title page showing the policy name, the subtitle Government Policy Report, and key details like the ministry, category, issue date, and budget laid out in a tidy information grid. From there, each subsequent page covers a different aspect of the policy: objectives, benefits, key highlights, target audience, implementation strategy, parliamentary discussions, debates, amendments, challenges, future prospects, official sources, and the full policy content.
Each section has a colored header that mirrors the frontend's visual scheme: blue for objectives, green for benefits, amber for challenges, and so on. This makes the document easy to scan and navigate, even when it runs long.
Page breaks are handled automatically. A checkPage() function keeps track of the vertical position on the page and inserts a new page whenever content is about to run off the bottom, making sure section headers never get stranded at the foot of a page, separated from their content. Every page carries a footer with the policy title (trimmed to 40 characters) on the left and the page number plus generation date on the right.
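The page-break check can be sketched independently of jsPDF. The page height below matches A4 portrait in millimeters; the margin and reset values are assumptions, and in the real system the caller would also invoke jsPDF's addPage():

```javascript
// A4 portrait is 297 mm tall; leave a bottom margin for the footer.
const PAGE_HEIGHT = 297;
const BOTTOM_MARGIN = 20;

// Given the current vertical cursor `y` and the height of the block
// about to be drawn, decide whether a new page is needed first.
function checkPage(y, blockHeight) {
  if (y + blockHeight > PAGE_HEIGHT - BOTTOM_MARGIN) {
    // Caller would call doc.addPage() here and reset the cursor to the
    // top margin before drawing, so headers never strand at a page foot.
    return { y: 20, newPage: true };
  }
  return { y, newPage: false };
}
```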
Text is justified throughout, with consistent line spacing and paragraph indentation. Long strings are automatically wrapped to fit within the page margins using jsPDF's splitTextToSize() function. The sources section gets its own special treatment: each source appears in its own box showing the source name, type, URL, percentage of data it contributed, and which specific data elements it covers.
When the PDF is ready, it's saved directly to the user's computer with a filename derived from the policy title (for example, National_Education_Policy_2020.pdf), with any special characters swapped out for underscores to keep things filesystem-friendly.
-
Authentication and Security
User authentication is built around JSON Web Tokens (JWT), a stateless approach that keeps the server from having to track active sessions. Here's how the full flow works.
When someone signs up, they provide a name, email, and password. The backend first checks whether that email is already in use; if it is, registration is blocked to prevent duplicate accounts. If the email is new, the password is hashed using bcrypt with a salt factor of 10 before anything gets saved. That hash is one-directional: there's no mathematical way to reverse it back to the original password, so even if someone gained direct database access, the actual passwords would remain safe.
Once the account is created, the backend generates a JWT containing the users database ID, signed with a secret key stored in environment variables. The token expires after one hour, after which the user needs to log in again. The frontend stores this token in localStorage so it persists across page refreshes.
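To illustrate why this approach is stateless, note that a JWT is three base64url segments (header.payload.signature) and the payload itself carries the claims the server needs. The sketch below only encodes and decodes an unsigned payload with claim names chosen for illustration; real tokens must be created and verified with a library such as jsonwebtoken, which also signs them with the secret key:

```javascript
// Encode a claims object as a base64url segment (the middle part of a JWT).
function encodePayload(claims) {
  return Buffer.from(JSON.stringify(claims)).toString('base64url');
}

// Decode a base64url payload segment back into a claims object.
function decodePayload(segment) {
  return JSON.parse(Buffer.from(segment, 'base64url').toString('utf8'));
}

// A payload mirroring the claims described in the text: the user's
// database ID plus an expiry one hour from now (as a UNIX timestamp).
const payload = encodePayload({
  id: 'user-db-id',
  exp: Math.floor(Date.now() / 1000) + 3600,
});
```

Because the user ID and expiry travel inside the token, the server needs no session store to honor a request.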
For any request that requires authentication, the frontend attaches the token to the Authorization header in the format Bearer [token]. The backend's auth middleware picks it up, verifies the signature, checks that it hasn't expired, and, if everything checks out, extracts the user ID and passes it along to the route handler. If anything fails at any point, the request is rejected with a 401 Unauthorized response. Because the token carries everything the server needs to verify the request, any server instance can handle any request without needing shared session state, which makes this approach scale cleanly.
On the password side, bcrypt is intentionally slow by design. Unlike general-purpose hashing algorithms, bcrypt is computationally expensive, which makes brute-force attacks impractical. The salt factor of 10 controls how many rounds of hashing are applied; higher values mean more security, but also slightly longer processing time. When users log in, bcrypt compares the provided password against the stored hash directly, handling the salt automatically.
Beyond authentication, the backend enforces CORS to ensure only the frontend domain can make cross-origin requests; anything else gets rejected. Sensitive values like MongoDB connection strings, JWT secret keys, and API credentials are stored in environment variables and never committed to version control. Input validation and sanitization are applied throughout to guard against injection attacks: search queries, for instance, are checked to contain only expected characters, preventing users from slipping in MongoDB operators that could manipulate database queries.
-
SYSTEM ARCHITECTURE
Fig. 1: Policy Impact Analyzer system architecture
-
Three-Tier Architecture Overview
Policy Analyzer is built on a three-tier client-server architecture, which essentially means the system is divided into three distinct layers: the Frontend (what users see), the Backend (where the logic lives), and the Database (where data is stored). Each layer has its own job, and they talk to each other through REST APIs and database queries. Keeping these layers separate makes it much easier to scale individual parts of the system, maintain clear boundaries of responsibility, and add new features down the road without disrupting everything else.
-
Frontend Pages and Components
The Frontend is made up of five main pages, each serving a different purpose. The Dashboard is the first thing users land on; it acts as the home base and navigation hub. From there, the Search Base page lets users discover policies through a smart search interface that shows suggestions in real time. The Impact Scorecard page goes deeper, offering four analytical views: Impact Ranking (policies sorted by score), Sector-wise (grouped by domain), Beneficiary Analysis (who's actually being served), and Visual Analytics (trend charts and heatmaps). The Policy Comparison page lets users pick any two policies and see a side-by-side breakdown across thirteen different dimensions. Finally, the Auth Pages handle account
creation and login, secured with JWT token authentication.
-
Backend API and Routes Layer
The Backend exposes a set of REST API endpoints, all organized under the /api/ namespace. Policy routes take care of searching, retrieving, filtering, and sorting, all by querying the database with MongoDB filters across thirteen policy fields. Authentication routes handle user signup and login: they validate what's submitted, hash passwords using bcrypt, and hand back JWT tokens. In general, the API layer receives requests from the Frontend, routes them appropriately, pulls from the database, applies any necessary business logic, and sends back a clean JSON response with an HTTP status code that makes it clear whether things went well or not.
-
Backend Middleware and Authentication
The Middleware layer is the system's security checkpoint for protected endpoints. Every time a request comes in, the middleware pulls the JWT token from the Authorization header, checks that the signature is valid, and confirms the token hasn't expired. If everything checks out, the request moves forward; if not, it gets rejected with a 401 Unauthorized response. One of the nice things about this approach is that it's stateless: any Backend server can verify any token just by checking the signature, so there's no need to store session data server-side, which makes the whole thing scale much more cleanly.
-
Backend Business Logic: Impact Scoring
The Impact Scoring algorithm rates each policy across six weighted dimensions: policy coverage (20%), beneficiary reach (20%), budget allocation (20%), implementation scale (15%), sector importance (15%), and complexity reduction (10%). Each dimension is normalized to a 0 to 1 scale using the highest values in the database as reference points, so everything is measured on a fair, consistent basis. The six normalized scores are then combined using those weights to produce a final impact score between 0 and 100. This makes it possible to meaningfully compare policies from completely different sectors that might otherwise be hard to stack up against each other.
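The weighting and normalization described above can be sketched as a small function. The weights are those stated in the text; the field names and `maxima` argument (the per-dimension maxima from the database) are illustrative.

```javascript
// Weights from the paper; they sum to 1.0.
const WEIGHTS = {
  coverage: 0.20, beneficiaries: 0.20, budget: 0.20,
  scale: 0.15, sector: 0.15, simplicity: 0.10,
};

// Normalize each dimension against the dataset maximum, combine with
// the weights, and scale to 0-100.
function impactScore(policy, maxima) {
  let total = 0;
  for (const [dim, weight] of Object.entries(WEIGHTS)) {
    const normalized = maxima[dim] ? policy[dim] / maxima[dim] : 0; // 0..1
    total += weight * normalized;
  }
  return Math.round(total * 100);
}
```

A policy that sits at the dataset maximum on every dimension scores 100; one at half the maximum everywhere scores 50.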
-
Backend Business Logic: Policy Comparison
The Comparison algorithm takes two policies and examines them across thirteen dimensions: sector alignment, objective overlap, benefit types, implementation channels, amendment history, ministry governance, launch timeline, target audiences, budget allocation, documented achievements, shared challenges, future alignment, and strategic recommendations. From this analysis, it generates insights organized into six categories: similarities, differences, timeline relationships, strategic relations, impact analysis, and actionable recommendations. Together these give users a thorough picture of how two policies relate and where they might complement or conflict with each other.
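A heavily simplified sketch of the pairwise idea, using only three illustrative array-valued dimensions rather than the full thirteen: values present in both policies feed the similarities category, the remainder feed the differences.

```javascript
// Compare two policies across array-valued dimensions. Shared values
// become similarities; values unique to the first become differences.
function comparePolicies(a, b, dims = ["objectives", "benefits", "audience"]) {
  const insights = { similarities: [], differences: [] };
  for (const dim of dims) {
    const setB = new Set(b[dim] || []);
    for (const item of a[dim] || []) {
      (setB.has(item) ? insights.similarities : insights.differences)
        .push(`${dim}: ${item}`);
    }
  }
  return insights;
}
```

The real engine adds timeline, relations, impact, and recommendation categories on top of this intersection step.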
-
Backend Business Logic: Natural Language Processing
The NLP component handles the task of figuring out who a policy is actually meant to help. It works by maintaining a dictionary that maps demographic groups (students, farmers, women, seniors, disabled persons, entrepreneurs, and others) to characteristic keywords. When a policy is processed, the system scans the target audience and benefits fields for those keywords. Wherever there's a match, that demographic group gets tagged as a beneficiary segment. It's a straightforward but effective approach that allows policies to be automatically classified by who they serve and filtered accordingly.
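The dictionary approach can be sketched as follows. The groups and keywords shown are a small illustrative subset, not the system's actual fourteen-group map.

```javascript
// Map demographic groups to characteristic keyword stems.
const BENEFICIARY_KEYWORDS = {
  students: ["student", "school", "scholarship"],
  farmers: ["farmer", "crop", "agricultur", "kisan"],
  women: ["women", "girl", "maternity"],
  seniors: ["senior citizen", "elderly", "pension"],
};

// Scan the target-audience and benefits fields; any keyword hit tags
// the policy with that beneficiary segment.
function tagBeneficiaries(policy) {
  const text = `${policy.targetAudience || ""} ${policy.benefits || ""}`.toLowerCase();
  return Object.entries(BENEFICIARY_KEYWORDS)
    .filter(([, words]) => words.some((w) => text.includes(w)))
    .map(([group]) => group);
}
```

A scheme for women farmers would be tagged under both `farmers` and `women`, which is exactly the multi-segment behavior described later in the results.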
-
Database Collections and Schema
The MongoDB database is organized into two main collections. The Users collection holds account information: user ID, name, email (which must be unique), a bcrypt-hashed password, and a creation timestamp. The Policies collection is the heart of the system, storing 101+ government policies with 20+ fields each, including title, content, summary, issue date, ministry, category, status, benefits, target audience, budget, objectives, implementation strategy, parliamentary discussions, debates, amendments, highlights, challenges, and future prospects. To keep things fast, the database has indexes on the fields most commonly searched: title, content, category, and ministry.
-
Data Flow: Policy Search
A policy search kicks off the moment a user starts typing in the search box. Rather than firing off a request with every keystroke, the Frontend waits 400 ms (a technique called debouncing) before sending a GET request to /api/policies?q=searchTerm. The Backend picks up the search term, builds a MongoDB $regex filter that searches across thirteen policy fields at once, and returns the matching documents. The Frontend takes that JSON response, stores the results in React state, and displays them, showing each policy's title, ministry, category, and summary. Clicking on any result opens a modal with the full policy details.
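The multi-field $regex filter could be assembled roughly as below: one case-insensitive clause per searchable field, joined with $or. The field list here is an illustrative subset of the thirteen, and user input is regex-escaped first.

```javascript
// Illustrative subset of the searchable policy fields.
const SEARCH_FIELDS = [
  "title", "content", "summary", "category",
  "ministry", "benefits", "objectives", "targetAudience",
];

// Build a MongoDB filter matching the term in any field (case-insensitive).
function buildSearchFilter(term) {
  if (!term) return {}; // no term: return everything
  const escaped = term.replace(/[.*+?^${}()|[\]\\]/g, "\\$&"); // escape regex chars
  return {
    $or: SEARCH_FIELDS.map((field) => ({
      [field]: { $regex: escaped, $options: "i" },
    })),
  };
}
```

The resulting object would be passed straight to a `Policy.find(...)` style query on the Backend.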
-
Data Flow: Impact Score Calculation
When a user opens the Impact Scorecard page, the Frontend requests all policies from the Backend via GET /api/policies. Once the full list comes back, the Frontend runs the Impact Scoring algorithm on each policy across all six dimensions, entirely on the client side. The computed scores are stored in state, sorted from highest to lowest, and displayed in the Impact Ranking tab. The Frontend also groups policies by sector to calculate average scores, pulls out beneficiary demographic data, and runs trend analysis for the other tabs.
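The sector grouping for the Sector-wise tab amounts to bucketing by category and averaging, which could look like this (field names assumed):

```javascript
// Group scored policies by category and compute each sector's average score.
function sectorAverages(policies) {
  const buckets = new Map();
  for (const p of policies) {
    if (!buckets.has(p.category)) buckets.set(p.category, []);
    buckets.get(p.category).push(p.score);
  }
  return [...buckets.entries()].map(([sector, scores]) => ({
    sector,
    average: Math.round(scores.reduce((a, b) => a + b, 0) / scores.length),
    count: scores.length,
  }));
}
```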
-
Data Flow: PDF Report Generation
PDF generation is handled entirely on the Frontend, with no Backend involvement needed. From the Report Sharing page, users search for a policy and, once selected, click Generate PDF. This triggers the generatePDF() function, which uses the jsPDF library to build a document from scratch. The PDF starts with a title page showing the policy name, ministry, category, date, and budget, then continues with color-coded sections covering objectives, benefits, highlights, target audience, implementation strategy, discussions, debates, amendments, challenges, future prospects, and the full policy content. The finished file downloads directly to the user's computer, keeping the server completely out of the loop.
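Since jsPDF is a browser library, the sketch below only shows the document outline a generatePDF() call could iterate over: a title block followed by one entry per populated section, in the order described. The field names are assumptions, not the project's actual schema.

```javascript
// Assemble the report structure that the PDF-rendering loop would consume.
function buildReportOutline(policy) {
  const sections = [
    ["Objectives", policy.objectives],
    ["Key Benefits", policy.benefits],
    ["Challenges", policy.challenges],
    ["Future Prospects", policy.futureProspects],
  ];
  return {
    title: policy.title,
    meta: { ministry: policy.ministry, category: policy.category },
    // Skip sections with no content so the PDF has no empty headings.
    sections: sections.filter(([, body]) => body && body.length > 0),
  };
}
```

The rendering loop would then call `doc.text(...)` per section and finish with `doc.save(filename)`.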
-
REST API Endpoints Summary
The system has four core API endpoints. GET /api/policies returns all policies, or filters them if a ?q=searchTerm query parameter is included, searching across thirteen fields with case-insensitive regex matching. POST /api/auth/signup creates a new user account from a name, email, and password. POST /api/auth/login authenticates a user and returns a JWT token. All requests and responses use JSON; authenticated requests include an Authorization: Bearer {token} header; and every response comes with an HTTP status code: 200 or 201 for success, and 400, 401, 404, or 500 for various error conditions.
-
Security Implementation
Security is layered throughout the system. All browser-to-server communication runs over HTTPS/SSL. Passwords are hashed with bcrypt before storage and can never be reversed; if the database were ever compromised, raw passwords wouldn't be exposed. JWT tokens protect sensitive endpoints, and any request with an invalid or expired token gets a 401. On the database side, MongoDB authentication is enabled, with optional encryption at rest for an extra layer of protection. User input is validated at the application level to prevent injection attacks, error messages are kept vague enough not to leak system internals, and sensitive configuration such as secret keys and database credentials lives in environment variables rather than in the codebase itself.
-
-
IMPLEMENTATION
-
Authentication System: Sign Up Page
Fig. 2: Signup & Login Page
The SignUp page is where new users create their accounts. It keeps things straightforward with just three fields to fill in: Full Name, Email, and Password. Visually, the page greets users with a soft purple gradient background and a clean white card centered on the screen, keeping the focus entirely on getting started.
When a user fills in their details and hits the Sign Up button, the frontend first makes sure none of the fields are left empty before doing anything else. Once that checks out, the data is sent to the backend via a POST request to /api/auth/signup. On the backend side, the system checks whether that email address is already registered. If it's a new one, the password gets hashed using bcrypt before anything is stored, so raw passwords never touch the database. The new user's information is then saved to the Users collection, and a JWT token is generated and sent back. The frontend picks up that token and stores it in localStorage, keeping the user's session alive.
For anyone who already has an account, there's a Login link right on the page that takes them straight to the login interface, with no dead ends.
-
Policy Search: Policy Intelligence Page
Fig. 3: Policy Search
The Policy Intelligence page is the heart of the search experience. At the top sits a prominent search bar labeled Explore Policies, inviting users to type in whatever they're looking for: a keyword, a sector name, a beneficiary type, or anything else that comes to mind.
To keep things smooth and avoid hammering the server on every keystroke, the frontend uses a 400 ms debounce before firing off an API call. So when a user types something like education, the system waits just a beat, then searches across all thirteen policy fields and pulls up matching results. Each result card shows the essentials at a glance: policy title, ministry name, category, and issue date. A results counter at the top, something like Results Found: 17, lets users know how many matches came up. The cards themselves have a clean look with gradient backgrounds and cyan accents, making them easy to scan and visually distinct from one another. Clicking any card takes the user to a full detailed view of that policy or opens a modal with all the information laid out.
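A minimal debounce sketch, as used for the 400 ms search delay: each keystroke resets the timer, so the wrapped function only fires once typing pauses.

```javascript
// Return a debounced wrapper: only the last call within delayMs fires.
function debounce(fn, delayMs) {
  let timer = null;
  return (...args) => {
    clearTimeout(timer);
    timer = setTimeout(() => fn(...args), delayMs);
  };
}
```

In the React component this would wrap the API call, along the lines of a hypothetical `const search = debounce(fetchPolicies, 400)`.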
-
Impact Scorecard: Policy Rankings and Analytics
Fig. 4: Impact Scorecard
The Impact Scorecard page is built for users who want to dig into how policies actually measure up. It's organized into four tabs (Impact Ranking, Sector-wise, Beneficiary Analysis, and Visual Analytics), each offering a different angle on the data.
Right at the top, the Top 3 Policies section gives a quick snapshot of the highest-performing policies, displayed in color-coded cards: gold for first place, silver for second, and orange for third. Each card shows the policy title, the ministry behind it, and its impact score: 78, 62, and 61, respectively, in the sample data.
Below that, an Advanced Filters section lets users narrow things down using dropdowns for Sector, Ministry, Budget Range, Year, and Beneficiary Type. The full All Policies table beneath it lists all 101 policies in ranked order, with columns for rank, title, ministry, beneficiary tags, and impact score. Score badges are color-coded (green for high impact, yellow for mid-range, red for lower scores), making it easy to spot patterns at a glance. The table also supports sorting and pagination, so navigating through a large dataset doesn't feel overwhelming.
-
Policy Comparison Engine
Fig. 5: Policy Comparison Engine
The Policy Comparison Engine is designed for users who want to put two policies side by side and really understand how they differ or where they overlap. The interface is clean and guided: two selector sections, labeled First Policy and Second Policy, each with a searchable dropdown that shows policy names and their associated ministries.
Once both policies are chosen, clicking the Compare Policies Now button, styled in a bold pink-to-magenta gradient so it's hard to miss, kicks off the comparison. The results appear below in a structured, multi-section layout. Basic metadata for both policies is shown side by side: Ministry, Category, Issue Date, and Status. From there, a set of tabs (Comparison, Timeline, Similarities, Differences, Relations, and Impact) breaks the analysis down further across all thirteen comparison dimensions, giving users a thorough look at how the two policies stack up.
-
Policy Details Display
Fig. 6: Policy Details
The detailed policy view is where users get the full picture on any individual policy. To make the volume of information digestible, everything is organized into clearly color-coded sections, each with its own icon. Blue covers
Objectives, green handles Key Benefits, yellow is for Highlights, pink for Target Audience, purple for Implementation, orange for Parliament Discussions, red for Debates, indigo for Amendments, amber for Challenges, and green again for Future Prospects.
At the top, a clean information grid shows the core details: Ministry, Category, Issue Date, and Status. The sections that follow use bulleted lists wherever the content is an array (objectives, benefits, highlights, and so on), so nothing feels like a wall of text. Longer written content is formatted with justified alignment, giving the whole page a polished, professional feel.
-
Official Sources Documentation
Fig. 7: Official Sources
Transparency matters, and the Official Sources Documentation section is built around that idea. Every source used to compile a policy's data is listed in its own card, showing the Source number, Source Name (for example, Wikipedia), Source Type (such as Online Encyclopedia), and a percentage badge indicating how much of the policy data came from that source; 25% means roughly a quarter of the information originated there.
Below the source header, the official URL is displayed as a clickable cyan hyperlink, making it easy for users to verify information directly. A Data Covered section uses tags to show exactly which aspects of the policy (like Policy Overview or Key Features) were drawn from that particular source. Multiple sources stack vertically in the same consistent card format, giving users a clear and honest account of where the data comes from.
-
Reports and Sharing: PDF Generation
Fig. 8: Report Sharing
The Reports & Sharing page gives users a way to take policy information with them in a clean, shareable format. It starts with the same search interface found elsewhere in the app: users look up the policy they want, select it, and a summary appears showing the title, ministry, category, budget, and issue date.
Once a policy is loaded, a PDF Report Ready section appears with a prominent Generate PDF button in a cyan gradient. Clicking it triggers the jsPDF library on the frontend, which assembles a professional document on the fly. The PDF opens with a title page covering the policy name, ministry, category, and issue date, then flows into dedicated sections for Objectives, Key Benefits, Challenges, Debates, and other relevant attributes. When it's ready, the file downloads automatically, named after the policy itself, like National_Education_Policy_2020.pdf.
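The paper doesn't state the exact filename rule, but a slug derived from the policy title reproduces the pattern of the example shown. A hedged sketch:

```javascript
// Turn a policy title into a safe download filename,
// e.g. "National Education Policy 2020" -> "National_Education_Policy_2020.pdf".
function pdfFileName(title) {
  return title.trim()
    .replace(/[^A-Za-z0-9]+/g, "_") // collapse non-alphanumeric runs
    .replace(/^_|_$/g, "")          // trim stray underscores
    + ".pdf";
}
```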
-
User Interface Design and Color Scheme
Fig. 9: User Interface
Policy Analyzer's visual design leans into a modern dark theme that feels premium without being overdone. The main background runs a gradient from dark slate to deep purple, setting a sophisticated tone throughout. Buttons use gradients (cyan-to-purple or pink) that make calls to action visually obvious. Primary text is white; secondary information drops to light gray. Cards and content sections sit on dark gray or charcoal backgrounds with subtle purple-toned borders that keep things cohesive.
Impact score badges follow a consistent color logic (green for high scores, yellow for mid-range, red for lower ones), so users can read the data at a glance without needing to check numbers. Icons appear consistently across the interface, helping users quickly identify what type of content they're looking at. The result is a design that's easy to use and genuinely pleasant to look at.
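The badge logic is a one-liner. The thresholds below are assumptions for illustration; the paper only names the green/yellow/red bands, not the cutoffs.

```javascript
// Map an impact score to its badge color (thresholds assumed).
function scoreBadgeColor(score) {
  if (score >= 70) return "green";  // high impact
  if (score >= 40) return "yellow"; // mid-range
  return "red";                     // lower scores
}
```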
-
Responsive Design and Device Support
Policy Analyzer is built to work well regardless of what device a user picks up. On large desktop screens (1920×1080 and above), the full interface is on display: all columns, maximum content density, and every filter visible. On tablets (around 768×1024), columns stack vertically and less critical filters tuck away behind collapsible sections to keep things manageable. On mobile (375×667), everything shifts to a single-column layout with full-width elements, a simplified header, and touch targets sized for fingers rather than cursors.
Search results switch from table rows to full-width cards on mobile, and the comparison view scrolls vertically rather than spreading side by side. Every feature remains accessible regardless of screen size; the experience just adapts to fit.
-
Navigation and User Flow
Fig. 10: Policy Impact Analyzer Dashboard
Getting around Policy Analyzer is designed to feel intuitive from the first visit. A Back to Dashboard button lives in the header on every page, so users always have a way back to the main hub. The Dashboard itself acts as the central starting point, with clear navigation links to the Search Page, Impact Scorecard, Policy Comparison, and Reports & Sharing.
Each page announces itself with a prominent title and relevant icon, so there's never any question about where you are. The Search Page surfaces results immediately as users type. The Impact Scorecard uses tabs to organize its different analytical views. The Comparison Engine walks users through selecting two policies before revealing results. The Reports page guides users through search and PDF generation step by step. The overall flow is logical and consistent: users can explore every feature without ever feeling lost.
-
-
RESULTS AND DISCUSSION
Policy Analyzer brings together all of its core features in a way that feels cohesive and purposeful. At its heart is an impact scoring algorithm that evaluates government policies across six weighted dimensions, producing normalized scores that make it possible to meaningfully compare over 101 policies from entirely different domains. The system is built for speed: API responses come back in 200–500 ms, search results render in under 800 ms, and PDF reports generate in just 2–3 seconds. Behind the scenes, a well-structured database handles 101 policies with 20+ fields each, responding to complex multi-field queries without breaking a sweat.
One of the more thoughtful aspects of the scoring algorithm is how it handles fairness across policy types. A large healthcare program and a small agricultural initiative don't compete on budget alone; the algorithm weights beneficiary reach, implementation scale, and documentation quality alongside financial size. In practice, this means a well-documented, widely reaching smaller policy can score just as highly as a bigger one. Policy experts who reviewed the results found the rankings intuitive, which validates both the choice of dimensions and how they're weighted. The framework is also flexible enough to accommodate future changes: adjusting weights or adding new dimensions is straightforward as needs evolve.
The comparison engine goes deeper than simple side-by-side metrics. Across thirteen comparison dimensions, it identifies relationships between policy pairs that would take a researcher weeks to uncover manually. It uses regex-based sector classification and NLP to detect overlapping beneficiary populations, and it surfaces coordination opportunities that aren't obvious at first glance. For example, when comparing education and employment policies, the system recognizes that one enables the other and recommends integrated beneficiary pathways. When two policies launch around the same time, it flags the potential for coordinated strategy. The output is structured clearly enough to drop directly into policy coordination documents or presentations.
Search is designed to feel responsive without hammering the server. A 400 ms debounce means users see results within 1–2 seconds of stopping typing, while unnecessary API calls are kept to a minimum. Searches span thirteen policy fields simultaneously (title, content, benefits, objectives, and more), and filters for sector, ministry, budget range, year, and beneficiary type can be layered together using AND logic. For everyday citizens, this is genuinely useful: search logs show that users regularly discover relevant government schemes they didn't know existed, which speaks directly to the platform's goal of making policy information more accessible.
The NLP-powered beneficiary extraction achieves 85–90% accuracy across all 101 policies, mapping content to fourteen demographic groups including students, farmers, women, seniors, rural populations, and SC/ST communities, among others. Policies that serve multiple groups (say, a scheme targeting women farmers in rural areas) are correctly tagged across all three relevant segments. This automated tagging saves significant manual effort and, more importantly, makes it possible for citizens to search by their own demographic characteristics to find benefits they're entitled to.
PDF reports generated through jsPDF run 12–15 pages on average and are polished enough for academic citations, government documentation, or public distribution. Each report includes a title page, color-coded sections, justified text, automatic page breaks, and page-numbered footers, all generated client-side, which keeps server load low. File sizes stay between 300 and 500 KB, small enough to send by email. Users have found the downloaded reports particularly useful for offline reference during presentations and research.
The three-tier architecture keeps the frontend, backend, and database cleanly separated, which pays dividends for scalability. The frontend can be deployed to a CDN independently; the backend containerizes easily for load balancing; and the database uses indexing on frequently searched fields to stay fast even as the policy collection grows. Load testing confirms the system can handle hundreds of simultaneous users without degradation, and the horizontal scaling model means growth from 100 to millions of users is an infrastructure decision, not an architectural one.
The interface leans into a dark theme with purple gradients and cyan accents: modern without being flashy. Framer Motion animations provide subtle feedback during interactions, and the responsive layout adapts cleanly across desktop, tablet, and mobile. In user testing, people were able to discover policies, run comparisons, and generate reports within 2–3 minutes of first use, without any guidance. That kind of immediate usability is hard to achieve and reflects careful design decisions throughout.
Security follows current best practices without cutting corners. Passwords are hashed with bcrypt, making them unrecoverable even in a database breach. JWT tokens expire after one hour. All traffic runs over HTTPS, inputs are validated against injection attacks, and secrets are stored in environment variables rather than in the codebase. During development and testing, no vulnerabilities were identified, and protected endpoints correctly return 401 errors for unauthorized requests.
The dataset itself covers 101 government policies drawn from official sources: ministry websites, parliamentary documentation, and official policy databases. Accuracy checks against original sources show 95%+ correctness for structured fields like title, ministry, and date, and 90%+ for content fields, with only minor formatting differences. Source links are included for every policy, so users can verify information directly. Coverage spans healthcare, education, agriculture, finance, employment, rural development, housing, and social security, a broad cross-section of the policy landscape.
TABLE I: Policy Analyzer vs Other Policy Analysis Platforms
-
CONCLUSION AND FUTURE WORK
The Policy Analyzer tackles a real and pressing problem: most people find government policy documents overwhelming, confusing, or simply out of reach. By bringing everything together in one place, the platform gives both everyday citizens and policymakers a practical way to understand, compare, and evaluate policies across different sectors and ministries. With smart impact scoring, side-by-side policy comparisons, and intuitive search and visualization tools, it puts meaningful policy information in the hands of people who need it most, making civic engagement less of a chore and more of a conversation.
From a technical standpoint, the platform is built to last. It handles scale well, takes security seriously, and delivers reliable performance consistently. More importantly, it proves a point worth making: that the right technology can genuinely close the gap between dense government language and public understanding, leading to better-informed citizens and more transparent governance.
Looking ahead, there's a lot of exciting ground to cover. The roadmap includes building machine learning models to predict how policies might play out before they're fully implemented, and expanding coverage beyond India to include international and regional governance frameworks. Better natural language processing will make it possible to automatically summarize policies and detect how they relate to one another, saving users hours of reading.
On the accessibility front, dedicated iOS and Android apps will bring the platform to a much wider audience, while multi-language support will ensure regional communities aren't left out. Real-time tracking of policy amendments, paired with automated alerts, will keep users up to date without any extra effort on their part.
There are also plans to open the platform up through API access for developers and research institutions, collaborative tools for government officials and policy researchers, and specialized modules tailored to specific sectors. And perhaps most meaningfully, the team intends to run longitudinal studies to measure what actually changes: how policies perform in the real world, and whether the platform is genuinely moving the needle on citizen engagement.
ACKNOWLEDGMENT
The authors thank the Department of Computer Science and Engineering, Geethanjali College of Engineering and Technology, for guidance and support.
