DOI : https://doi.org/10.5281/zenodo.19627644
- Open Access

- Authors : Guruvelli Ramya Sree, Amrutaluri Dhaneshwar, Shaik Affiq Ahmed, Mr. Bhujanga Reddy
- Paper ID : IJERTV15IS040516
- Volume & Issue : Volume 15, Issue 04 , April – 2026
- Published (First Online): 17-04-2026
- ISSN (Online) : 2278-0181
- Publisher Name : IJERT
- License:
This work is licensed under a Creative Commons Attribution 4.0 International License
Policy Impact Analyzer using Parliament Debates & News
Guruvelli Ramya Sree
Computer Science and Engineering Geethanjali College of Engineering and Technology Hyderabad, Telangana
Amrutaluri Dhaneshwar
Computer Science and Engineering Geethanjali College of Engineering and Technology Hyderabad, Telangana
Shaik Affiq Ahmed
Computer Science and Engineering Geethanjali College of Engineering and Technology Hyderabad, Telangana
Mr. Bhujanga Reddy
Assistant Professor, Computer Science and Engineering Geethanjali College of Engineering and Technology Hyderabad, Telangana
Abstract – Policy Analyzer is a fully integrated web-based tool for the systematic analysis, comparison, and evaluation of policies formulated by the Indian government. The software introduces an impact scoring algorithm that rates and prioritizes policies along six parameters: policy coverage, beneficiary reach, budget allocation, implementation scale, sector relevance, and complexity reduction. The application combines a React front-end with a Node.js back-end and covers 101+ policies across sectors. Policy Analyzer offers several important features, including intelligent policy search with debounce optimization, comparative policy analysis to find synergies among different policies, automatic identification of beneficiaries, and analysis of policy distribution across sectors.
Index Terms – Policy Analytics, Impact Scoring Algorithm, Policy Comparison System, Full-Stack Web Application, Government Policy Intelligence, Natural Language Processing, Beneficiary Analysis, Sector-wise Policy Distribution
-
INTRODUCTION
Policy Analyzer is a web-based platform designed to change how people access, analyze, compare, and understand government policies in India. The project tackles a key issue in governance: the scattered and hard-to-reach policy information across many government databases, ministries, and official sources. India has rolled out over 100 key government policies in areas like healthcare, education, agriculture, employment, rural development, financial services, and social security. These policies represent huge investments of public funds, often reaching thousands of crores of rupees, and they significantly affect the lives of hundreds of millions of citizens.
Despite their importance, these policies are spread out across various government websites, ministry portals, parliamentary databases, and official documents. A researcher trying to understand a specific policy must sift through many different sources. A citizen looking for relevant government schemes has no single platform to use. A government official who needs to coordinate policies across ministries does not have tools to find connections and dependencies. A policy
analyst who wants to compare policies across sectors lacks a standard way to do so. Policy Analyzer was built to address these issues by bringing together policy information, offering useful analysis tools, and making policy information available to various groups.
The idea for Policy Analyzer came from recognizing that policy analysis is essential for governance and public welfare, yet it remains mostly manual and fragmented. Government policies are complex documents that include various interconnected elements: objectives, target beneficiaries, implementation strategies, budget allocations, amendments, parliamentary discussions, challenges, and future prospects. Understanding a single policy fully often involves reading lengthy documents and piecing together information from different sources. Comparing two policies to find their relationships, similarities, or conflicts demands time-consuming manual analysis that can lead to subjective interpretations. To determine which policies might benefit a specific citizen based on their demographic profile, one must know all available policies and their target audiences: a knowledge base typically held by government officials or dedicated researchers. Evaluating the impact of various policies across sectors demands a standardized evaluation framework that is not currently available in a unified manner. Policy Analyzer tackles these challenges with an integrated platform that combines modern technology, smart algorithms, and user-friendly design to make policy information accessible and enable systematic analysis.
Policy Analyzer consists of several interconnected
components that function as a unified system. The platform offers a robust policy search engine that lets users find relevant policies by inputting policy names, keywords, sectors, or beneficiary types. Instead of just matching keywords, the search system uses debouncing optimization to limit unnecessary API calls while providing real-time suggestions that help users discover policies they may not have known about.
Behind the search feature is a database containing 101 government policies, each with detailed information such as
title, content, summary, ministry, category, budget allocation, target audience, objectives, benefits, implementation strategy, parliamentary discussions, debates, amendments, challenges, and future prospects. This extensive detail allows for a level of analysis and comparison that far exceeds traditional policy databases.
The platform’s main innovation is an intelligent impact scoring system that assesses each policy across six weighted dimensions: policy coverage (measuring documentation and scope), beneficiary reach (estimating the targeted population segments), budget allocation (reflecting financial resources invested), implementation scale (based on documented discussions and benefits), sector importance (based on explicit highlights), and complexity reduction (reflecting how well-documented challenges are). These six dimensions are normalized against all policies in the database and combined with tailored weights to create a final impact score ranging from 0 to 100. This scoring allows for fair comparisons of policies across different sectors, objectives, implementation methods, and operational scales.
In addition to analyzing individual policies, Policy Analyzer offers a sophisticated policy comparison feature that stands out in governance technology. Users can choose any two policies and receive a thorough comparative analysis that highlights similarities, differences, synergies, and complementarities. The comparison framework looks at several aspects: Do the policies target the same sectors? Are they led by the same ministry or different ones? Do they serve overlapping beneficiary populations? Were they introduced around the same time or one after the other? Are there clear connections indicating that one policy developed from another? Could beneficiaries of one policy access resources from the other? Do both policies face similar implementation challenges that coordinated efforts might resolve? The system automatically finds logical relationships between policies and provides actionable insights for policy coordination. For instance, it might show that a skill development policy and a microfinance policy target overlapping populations and could be combined into a single program that helps skill-trained individuals obtain business loans and start businesses. This kind of insight, which usually takes months of manual research, is generated by Policy Analyzer in seconds.
-
LITERATURE REVIEW
The recent advancements in Artificial Intelligence (AI) and Natural Language Processing (NLP) have greatly changed how people perceive and analyze public opinion and policy data in various ways. Many studies have examined the role of machine learning and deep learning techniques in extracting sentiment, identifying topics, and tracking trends from large amounts of text data, like news articles. These studies provide a solid foundation for developing systems that aim to analyze the impact of policy. The research on AI-driven public sentiment and trend prediction for 2025 combines CNN, BERT, and Graph Neural Networks (GNN) to analyze social
media discussions and forecast topic trends. The model achieves high accuracy in sentiment detection and can track changes in discussions over time. However, the system relies solely on social media data and overlooks formal sources, such as parliamentary debates and organized news content, which are essential for a complete policy evaluation.
The study on AI-based sentiment analysis for government policies (2025) uses BERT to categorize public opinion about policies. It shows how AI can detect people’s feelings toward various policy areas at a subtle level. Although the paper focuses on Twitter data, it overlooks other important sources like parliamentary debates and news media. It also lacks visualization techniques and time-series analysis to help understand the results.
Another key work cited is the hybrid deep learning approach that combines LSTM with machine learning techniques for analyzing Indian government policies (2024). This study demonstrates that LSTM architectures can model contextual dependencies in text more accurately than traditional methods. However, it still falls short because it does not use the latest transformer-based models, and it barely addresses scalability and the integration of multiple sources.
Finally, a systematic review in policy sentiment analysis (2025) evaluated different machine learning methods, including SVM and Logistic Regression. The paper identifies SVM as a strong baseline model and emphasizes the importance of preprocessing methods. However, it lacks advanced NLP models, interactive dashboards, and predictive features needed for real-life applications. Most earlier methods focus on single data sources, and there is no unified system that integrates parliamentary debates, news articles, and public opinion. Additionally, they offer limited visualization and real-time analysis options. The proposed Policy Impact Analyzer addresses these gaps by integrating multiple data sources, modern NLP techniques, and interactive dashboards to provide a complete and scalable tool for evaluating policy impact.
-
EXISTING SYSTEM
Currently, most policy analysis and public opinion monitoring systems operate separately. Each works with individual data sources instead of combining them into one analytical platform. One key source of official information is parliamentary websites, which publish transcripts of debates, policy documents, and legislative discussions. However, these documents are often lengthy, disorganized, and hard to interpret. They also lack features like summarization, sentiment analysis, or comparisons with public opinion. This forces users to read and analyze large amounts of text themselves to understand the policy implications.
News portals and media platforms do cover government policies, but each has its own viewpoint. This may lead to biased or inconsistent information. Right now, there is no way to unite insights from various news sources into one clear perspective. Additionally, these platforms do not provide tools for monitoring sentiment changes, identifying key discussion points, or tracking how public perception shifts over time. As
a result, users often struggle to get a fair and complete picture of policy impact.
Another category of existing systems is social media analytics tools. These tools typically focus on tracking hashtags, engagement metrics, and basic sentiment classification. While they capture public reactions in real time, they only operate at a surface level. Moreover, they do not connect discussions to official parliamentary debates or policy documents. They also fail to provide meaningful summaries or insights based on specific topics. Therefore, these tools do not fully reveal how policies are perceived across different platforms.
Furthermore, academic and research-based instruments offer advanced capabilities for analyzing text using NLP methods. However, these tools can be complicated, tedious, and challenging to use without technical knowledge. They are not designed for the general public, such as policymakers, journalists, or citizens, and lack user-friendly dashboards and real-time data processing features. Overall, current systems suffer from scattered data sources and extensive manual analysis. They also show little integration of multiple sources, minimal use of advanced AI techniques, and a lack of user-friendly visualizations. These issues highlight the need for a simple, all-in-one solution for thorough policy impact analysis.
-
PROPOSED SYSTEM
-
System Overview
Policy Analyzer is built on a three-tier web architecture that keeps things clean and organized: a frontend that users interact with, a backend that handles the logic, and a database that stores everything. This separation makes the system easier to scale and maintain over time, since each layer can be worked on independently without breaking the others.
The frontend and backend talk to each other exclusively through RESTful APIs which means the door is already open for building a mobile app or connecting with other systems down the road. Data updates happen in real time, so users are always looking at the most current policy information.
The system is also designed with the future in mind. Features like policy search, comparison, scoring, and report generation are logically separated, so they could eventually be spun off as independent microservices without a major overhaul. Security is taken seriously throughout: authentication uses JWT tokens, passwords are hashed with bcrypt, and CORS restrictions are in place to keep things locked down.
-
Frontend Components and User Interface
The frontend is built with React.js, which makes it easy to build interactive, component-driven interfaces. In total, there are seven major pages, each designed around a specific workflow.
The Dashboard is where users land first; it gives a quick overview of the system, including how many policies are in the database, what sectors are covered, and what's been updated recently. The Policy Search page is where most users will spend their time, featuring a smart search bar with real-time suggestions, animated visual effects, and results that show policy titles, ministries, categories, issue dates, and summaries at a glance.
The Impact Scorecard page breaks policy rankings down into four tabs: Impact Ranking (policies sorted by score), Sector-wise Distribution (policies grouped by sector with average scores), Beneficiary Analysis (which demographic groups are covered and by how many policies), and Visual Analytics (trend charts, investment breakdowns, and heatmaps). The Policy Comparison page lets users pick any two policies and get a side-by-side breakdown: similarities, differences, timeline relationships, and recommendations for how the two could work better together.
There's also a detailed policy view that opens in a modal, showing everything from objectives and benefits to parliamentary debates, amendments, challenges, and future outlook. The Report Sharing page lets users generate and download professional PDF reports for any policy. And finally, the Login and Signup pages handle account creation and secure authentication using JWT tokens.
The interface is fully responsive: it works just as well on a phone as it does on a widescreen monitor. Visually, it leans into a dark theme with gradient accents in purple, cyan, pink, and blue, giving it a modern feel. Animations and transitions are handled by Framer Motion, which adds a layer of polish to every interaction.
On the performance side, search inputs use debouncing: the system waits 400 milliseconds after you stop typing before firing an API call, which cuts down on unnecessary server requests. The interface also uses progressive disclosure: search results show the essentials upfront, and users can click View Full Details when they want the full picture. Error handling is built in throughout, with clear, readable messages whenever something goes wrong.
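The debouncing technique described above can be sketched as a small helper; the function and variable names here are illustrative, not the project's actual code:

```javascript
// Minimal debounce helper: delays invoking `fn` until `delayMs` ms have
// passed since the most recent call, discarding the intermediate calls.
function debounce(fn, delayMs) {
  let timer = null;
  return function (...args) {
    clearTimeout(timer);
    timer = setTimeout(() => fn.apply(this, args), delayMs);
  };
}

// Wired to a search box, only the final keystroke triggers the API call:
// input.addEventListener('input',
//   debounce((e) => fetchSuggestions(e.target.value), 400));
```

With a 400 ms delay, a user typing "education" at normal speed generates one request instead of nine.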
-
Backend Infrastructure and API Design
The backend runs on Node.js with Express.js, keeping things lightweight and efficient. It's organized into logical layers: configuration (database settings), models (data structures), routes (API endpoints), middleware (authentication and other shared logic), and scrapers (for collecting policy data).
The API follows REST conventions: GET for fetching data, POST for creating it, and so on. All policy-related endpoints live under /api/, with /api/auth/signup and /api/auth/login handling user accounts, and /api/policies handling everything policy-related. Searching is as simple as passing a query parameter, for example /api/policies?q=education, and getting back a JSON array of matching results.
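As a sketch of how such a query parameter might translate into a database filter, the helper below builds a case-insensitive MongoDB filter over the searchable policy fields. The field names follow the schema described in this paper; the helper itself is an assumed implementation, not the project's code:

```javascript
// Build a case-insensitive MongoDB filter matching query string `q`
// against the most commonly searched policy fields.
function buildSearchFilter(q) {
  const pattern = { $regex: q, $options: 'i' };
  return {
    $or: [
      { title: pattern },
      { content: pattern },
      { category: pattern },
      { ministry: pattern },
    ],
  };
}

// Inside the route handler it might be used as:
// app.get('/api/policies', async (req, res) => {
//   const filter = req.query.q ? buildSearchFilter(req.query.q) : {};
//   res.json(await Policy.find(filter));
// });
```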
Error handling is consistent and meaningful: 200 for success, 201 when something new is created, 400 for bad input, 401 for authentication issues, 404 when something can't be found, and 500 for server-side problems. The backend also logs requests and errors, which makes debugging and monitoring much easier in production.
CORS is configured so that the frontend, even when hosted on a separate domain, can communicate with the backend without issues. Request parsing, error handling, and authentication checks are all handled through Express middleware. For database operations, connection pooling keeps things efficient under concurrent load, and queries are optimized to stay fast even as usage grows.
-
Database Design and Schema
Policy Analyzer uses MongoDB as its database, and it's a natural fit. Policy documents vary a lot in how much information they contain: some are extensively documented, others are sparse, and MongoDB's flexible schema handles that variability without any awkward workarounds.
There are two main collections. The Users collection stores account information: a MongoDB-generated ID, the user's name, a unique email address, a bcrypt-hashed password, and a creation timestamp. Plaintext passwords are never stored.
The Policies collection is more involved. Each document includes the policy ID, title, full content, executive summary, issue date, ministry, category, and status (Active, Proposed, or Archived). Beyond that, it stores arrays for benefits, objectives, debates, amendments, key highlights, and implementation challenges, along with fields for target audience, budget allocation, implementation strategy, parliamentary discussions, future prospects, and source documentation.
To keep search fast, MongoDB indexes are created on the fields users are most likely to search: title, content, category, and ministry. The $regex operator enables case-insensitive keyword searches across these fields. The design also accounts for future growth: sharding (splitting data across servers by policy ID or sector) and vertical scaling are both supported. Connection pooling handles multiple simultaneous application connections efficiently.
-
Impact Scoring Algorithm
The impact scoring algorithm is the analytical heart of Policy Analyzer. It gives every policy a quantitative score
between 0 and 100, making it possible to compare policies across completely different sectors on a level playing field. The score is built from six weighted dimensions.
Policy Coverage Weight (20%) looks at how thoroughly a policy is documented. It combines content length (normalized against the longest policy in the database) and number of objectives (normalized against the policy with the most), each weighted equally. The idea is that more comprehensive, multi-objective policies tend to have broader scope and greater potential impact.
Beneficiary Reach (20%) estimates how many different population groups a policy targets. It counts the distinct words in the target audience field as a proxy for diversity, normalizes that count, and applies a tanh function, which rewards broader reach but with diminishing returns, so a policy targeting 10 groups doesn't score disproportionately higher than one targeting 8.
Sector Importance (15%) is a simpler, binary-style metric. Policies with documented key highlights or achievements score 1.0; those without score 0.5. The logic is that if a policy's impact has been measured and recorded, it's more likely to be making a real difference.
Budget Allocation (20%) is straightforward: larger budgets generally mean greater reach and scope. The budget is normalized against the highest-funded policy in the database and used directly in the score.
Implementation Scale (15%) looks at how actively a policy is engaged with, measured by the number of parliamentary debates and documented benefits. Both are combined and passed through a tanh function. Policies generating real debate and delivering tangible benefits are treated as more actively implemented.
Complexity Reduction (10%) rewards policies whose implementation challenges have been well addressed. The score is 1.0 minus the ratio of documented challenges to 10 (floored at 0). Fewer unresolved challenges means lower implementation risk and more reliable delivery.
All six dimensions are normalized to a 0-1 range before being combined as a weighted sum: finalScore = (policyCoverageWeight × 0.2) + (beneficiaryReach × 0.2) + (sectorImportance × 0.15) + (budgetAllocation × 0.2) + (implementationScale × 0.15) + (complexityReduction × 0.1). The weights add up to 1.0, keeping the result in range. The final score is clamped and multiplied by 100 to produce a clean 0-100 number.
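The weighted sum above can be expressed as a short function. This is a sketch: it assumes each dimension has already been normalized to 0-1 as described (the normalization and tanh steps are not reproduced here), and the property names are illustrative:

```javascript
// Clamp a value into the 0-1 range to guard against rounding drift.
const clamp01 = (x) => Math.min(1, Math.max(0, x));

// Combine six pre-normalized dimensions into a 0-100 impact score,
// using the weights stated in the text (they sum to 1.0).
function computeImpactScore(d) {
  const raw =
    d.policyCoverage      * 0.20 +
    d.beneficiaryReach    * 0.20 +
    d.sectorImportance    * 0.15 +
    d.budgetAllocation    * 0.20 +
    d.implementationScale * 0.15 +
    d.complexityReduction * 0.10;
  return Math.round(clamp01(raw) * 100);
}
```

A policy scoring 1.0 on every dimension receives 100; one scoring 0 everywhere receives 0.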
-
Policy Comparison Framework
The comparison framework is what lets users put any two policies side by side and actually understand how they relate. It runs thirteen separate analyses, each looking at a different angle of the relationship.
Sector comparison infers each policy's sector from its title using keyword matching (for example, policies mentioning skill or training get classified as Employment & Skills). If both policies share a sector, that's flagged as aligned focus. If they're in different sectors, the framework checks whether those sectors have a natural relationship, like Education and Employment, since educated individuals become skilled workers.
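The title-based sector inference might look like the sketch below. The keyword map here is a small illustrative subset under assumed names; the actual system's mapping is not reproduced:

```javascript
// Illustrative keyword-to-sector map (a subset, for demonstration only).
const SECTOR_KEYWORDS = {
  'Employment & Skills': ['skill', 'training', 'employment'],
  'Education':           ['education', 'school', 'scholarship'],
  'Healthcare':          ['health', 'hospital', 'insurance'],
  'Agriculture':         ['farmer', 'crop', 'agriculture'],
};

// Classify a policy by the first sector whose keywords appear in its title.
function inferSector(title) {
  const t = title.toLowerCase();
  for (const [sector, words] of Object.entries(SECTOR_KEYWORDS)) {
    if (words.some((w) => t.includes(w))) return sector;
  }
  return 'General';
}
```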
Objectives comparison pulls objectives from both policies and looks for shared themes like development, training, or quality. Benefits comparison checks whether both policies offer similar benefit types: financial support, training, healthcare, and so on. Implementation channel analysis looks at delivery keywords like district, bank, NGO, or school to see if both policies use overlapping mechanisms to reach people.
Amendment analysis compares how many times each policy has been revised; more amendments typically signal a more mature, refined policy. Ministry comparison checks whether the same ministry manages both, which affects how easily they can be coordinated. Timeline analysis uses issue dates to determine whether the policies were launched together (suggesting a coordinated strategy), sequentially (suggesting one built on the other), or far apart.
Target audience analysis extracts demographic descriptors (youth, women, farmers, rural, poor, students, seniors, disabled) and identifies overlap. Budget comparison looks at the relative financial investment behind each policy. Highlights comparison checks whether each policy has documented achievements.
The framework also looks at shared challenges and future visions, then generates recommendations based on everything it has found. If both policies are in the same sector under the same ministry, it might recommend consolidation. If they're in related but different sectors, it might suggest creating cross-sector pathways for beneficiaries. If they share challenges, it might recommend coordinated mitigation strategies.
Finally, when explicit connections are limited, the framework suggests creating formal links like MOUs or policy document updates so beneficiaries can more easily access benefits from both programs.
Results are organized into six output categories: similarities, differences, timeline, relations, impactAnalysis, and recommendations.
-
Natural Language Processing for Beneficiary Extraction
To automatically tag policies with the demographic groups they serve, Policy Analyzer uses a keyword-based NLP approach. A dictionary maps fourteen beneficiary groups (Students, Farmers, Women, MSMEs, Senior Citizens, Rural Population, Urban Poor, Startups, Healthcare Workers, Unorganized Workers, SC/ST Communities, Children, Persons with Disabilities, and Minorities) to characteristic keywords for each group.
For example, Students maps to words like student, pupil, learner, scholar, and education. Farmers maps to farmer, agricultural, farming, cultivator, and agri. Women maps to women, woman, female, girl, homemaker, and mother.
The extraction process combines each policys target audience, benefits, and objectives into one block of text, converts it to lowercase, and scans for keyword matches. Any beneficiary group whose keywords appear in that text gets assigned to the policy.
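The extraction process just described can be sketched as follows. Only three of the fourteen groups are shown, using the keyword examples from the text; the field names (`targetAudience`, `benefits`, `objectives`) follow the schema described earlier but the function itself is an assumed implementation:

```javascript
// Partial keyword dictionary (three of the fourteen groups, taken from
// the examples in the text; the full map covers all fourteen).
const BENEFICIARY_KEYWORDS = {
  Students: ['student', 'pupil', 'learner', 'scholar', 'education'],
  Farmers:  ['farmer', 'agricultural', 'farming', 'cultivator', 'agri'],
  Women:    ['women', 'woman', 'female', 'girl', 'homemaker', 'mother'],
};

// Combine the relevant policy fields into one lowercase text block and
// return every group whose keywords appear in it.
function extractBeneficiaries(policy) {
  const text = [
    policy.targetAudience || '',
    ...(policy.benefits || []),
    ...(policy.objectives || []),
  ].join(' ').toLowerCase();
  return Object.keys(BENEFICIARY_KEYWORDS).filter((group) =>
    BENEFICIARY_KEYWORDS[group].some((kw) => text.includes(kw))
  );
}
```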
This extracted data powers the beneficiary filter in the UI: a farmer can filter for Farmers and immediately see every policy relevant to their livelihood. The tags are displayed visually with color-coded labels on each policy card, so users can see at a glance who a policy is designed for. The keyword approach scales well: new policies get classified automatically without manual review. And there's a clear path to improving accuracy further: training machine learning models on manually classified policy-beneficiary pairs would make the extraction even more reliable over time.
-
Report Generation System
The report generation system lets users take any policy and turn it into a clean, professional PDF they can save, share, or reference offline. It's all powered by the jsPDF library and kicks off the moment a user clicks Generate PDF on the Report Sharing page.
The PDF is formatted for A4 paper in portrait orientation. It opens with a title page showing the policy name, the subtitle Government Policy Report, and key details like the ministry, category, issue date, and budget laid out in a tidy information grid. From there, each subsequent page covers a different aspect of the policy: objectives, benefits, key highlights, target audience, implementation strategy, parliamentary discussions, debates, amendments, challenges, future prospects, official sources, and the full policy content.
Each section has a colored header that mirrors the frontend's visual scheme: blue for objectives, green for benefits, amber for challenges, and so on. This makes the document easy to scan and navigate, even when it runs long.
Page breaks are handled automatically. A checkPage() function keeps track of the vertical position on the page and inserts a new page whenever content is about to run off the bottom, making sure section headers never get stranded at the foot of a page, separated from their content. Every page carries a footer with the policy title (trimmed to 40 characters) on the left and the page number plus generation date on the right.
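The page-break check can be sketched independently of jsPDF. The page height below matches A4 portrait in millimeters; the margin and reset values are assumptions, and in the real system the caller would also invoke jsPDF's addPage():

```javascript
// A4 portrait is 297 mm tall; leave a bottom margin for the footer.
const PAGE_HEIGHT = 297;
const BOTTOM_MARGIN = 20;

// Given the current vertical cursor `y` and the height of the block
// about to be drawn, decide whether a new page is needed first.
function checkPage(y, blockHeight) {
  if (y + blockHeight > PAGE_HEIGHT - BOTTOM_MARGIN) {
    // Caller would call doc.addPage() here and reset the cursor to the
    // top margin before drawing, so headers never strand at a page foot.
    return { y: 20, newPage: true };
  }
  return { y, newPage: false };
}
```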
Text is justified throughout, with consistent line spacing and paragraph indentation. Long strings are automatically wrapped to fit within the page margins using jsPDF's splitTextToSize() function. The sources section gets its own special treatment: each source appears in its own box showing the source name, type, URL, percentage of data it contributed, and which specific data elements it covers.
When the PDF is ready, it's saved directly to the user's computer with a filename derived from the policy title (for example, National_Education_Policy_2020.pdf), with any special characters swapped out for underscores to keep things filesystem-friendly.
-
Authentication and Security
User authentication is built around JSON Web Tokens (JWT), a stateless approach that keeps the server from having to track active sessions. Here's how the full flow works.
When someone signs up, they provide a name, email, and password. The backend first checks whether that email is already in use; if it is, registration is blocked to prevent duplicate accounts. If the email is new, the password is hashed using bcrypt with a salt factor of 10 before anything gets saved. That hash is one-directional: there's no mathematical way to reverse it back to the original password, so even if someone gained direct database access, the actual passwords would remain safe.
Once the account is created, the backend generates a JWT containing the users database ID, signed with a secret key stored in environment variables. The token expires after one hour, after which the user needs to log in again. The frontend stores this token in localStorage so it persists across page refreshes.
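To illustrate why this approach is stateless, note that a JWT is three base64url segments (header.payload.signature) and the payload itself carries the claims the server needs. The sketch below only encodes and decodes an unsigned payload with claim names chosen for illustration; real tokens must be created and verified with a library such as jsonwebtoken, which also signs them with the secret key:

```javascript
// Encode a claims object as a base64url segment (the middle part of a JWT).
function encodePayload(claims) {
  return Buffer.from(JSON.stringify(claims)).toString('base64url');
}

// Decode a base64url payload segment back into a claims object.
function decodePayload(segment) {
  return JSON.parse(Buffer.from(segment, 'base64url').toString('utf8'));
}

// A payload mirroring the claims described in the text: the user's
// database ID plus an expiry one hour from now (as a UNIX timestamp).
const payload = encodePayload({
  id: 'user-db-id',
  exp: Math.floor(Date.now() / 1000) + 3600,
});
```

Because the user ID and expiry travel inside the token, the server needs no session store to honor a request.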
For any request that requires authentication, the frontend attaches the token to the Authorization header in the format Bearer [token]. The backend's auth middleware picks it up, verifies the signature, checks that it hasn't expired, and, if everything checks out, extracts the user ID and passes it along to the route handler. If anything fails at any point, the request is rejected with a 401 Unauthorized response. Because the token carries everything the server needs to verify the request, any server instance can handle any request without needing shared session state, which makes this approach scale cleanly.
On the password side, bcrypt is intentionally slow by design. Unlike general-purpose hashing algorithms, bcrypt is computationally expensive, which makes brute-force attacks impractical. The salt factor of 10 controls how many rounds of hashing are applied; higher values mean more security, but also slightly longer processing time. When users log in, bcrypt compares the provided password against the stored hash directly, handling the salt automatically.
Beyond authentication, the backend enforces CORS to ensure only the frontend domain can make cross-origin requests; anything else gets rejected. Sensitive values like MongoDB connection strings, JWT secret keys, and API credentials are stored in environment variables and never committed to version control. Input validation and sanitization are applied throughout to guard against injection attacks: search queries, for instance, are checked to contain only expected characters, preventing users from slipping in MongoDB operators that could manipulate database queries.
-
SYSTEM ARCHITECTURE
Fig. 1: Policy Impact Analyzer system architecture
-
Three-Tier Architecture Overview
Policy Analyzer is built on a three-tier client-server architecture, which essentially means the system is divided into three distinct layers: the Frontend (what users see), the Backend (where the logic lives), and the Database (where data is stored). Each layer has its own job, and they talk to each other through REST APIs and database queries. Keeping these layers separate makes it much easier to scale individual parts of the system, maintain clear boundaries of responsibility, and add new features down the road without disrupting everything else.
-
Frontend Pages and Components
The Frontend is made up of five main pages, each serving a different purpose. The Dashboard is the first thing users land on; it acts as the home base and navigation hub. From there, the Search Base page lets users discover policies through a smart search interface that shows suggestions in real time. The Impact Scorecard page goes deeper, offering four analytical views: Impact Ranking (policies sorted by score), Sector-wise (grouped by domain), Beneficiary Analysis (who's actually being served), and Visual Analytics (trend charts and heatmaps). The Policy Comparison page lets users pick any two policies and see a side-by-side breakdown across thirteen different dimensions. Finally, the Auth Pages handle account
creation and login, secured with JWT token authentication.
-
Backend API and Routes Layer
The Backend exposes a set of REST API endpoints, all organized under the /api/ namespace. Policy routes take care of searching, retrieving, filtering, and sorting, all by querying the database with MongoDB filters across thirteen policy fields. Authentication routes handle user signup and login: they validate what's submitted, hash passwords using bcrypt, and hand back JWT tokens. In general, the API layer receives requests from the Frontend, routes them appropriately, pulls from the database, applies any necessary business logic, and sends back a clean JSON response with an HTTP status code that makes it clear whether things went well or not.
-
Backend Middleware and Authentication
The Middleware layer is the system's security checkpoint for protected endpoints. Every time a request comes in, the middleware pulls the JWT token from the Authorization header, checks that the signature is valid, and confirms the token hasn't expired. If everything checks out, the request moves forward; if not, it gets rejected with a 401 Unauthorized response. One of the nice things about this approach is that it's stateless: any Backend server can verify any token just by checking the signature, so there's no need to store session data server-side, which makes the whole thing scale much more cleanly.
-
Backend Business Logic: Impact Scoring
The Impact Scoring algorithm rates each policy across six weighted dimensions: policy coverage (20%), beneficiary reach (20%), budget allocation (20%), implementation scale (15%), sector importance (15%), and complexity reduction (10%). Each dimension is normalized to a 0 to 1 scale using the highest values in the database as reference points, so everything is measured on a fair, consistent basis. The six normalized scores are then combined using those weights to produce a final impact score between 0 and 100. This makes it possible to meaningfully compare policies from completely different sectors that might otherwise be hard to stack up against each other.
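The weighting and normalization described above can be sketched as a small function. The weights are those stated in the text; the field names and `maxima` argument (the per-dimension maxima from the database) are illustrative.

```javascript
// Weights from the paper; they sum to 1.0.
const WEIGHTS = {
  coverage: 0.20, beneficiaries: 0.20, budget: 0.20,
  scale: 0.15, sector: 0.15, simplicity: 0.10,
};

// Normalize each dimension against the dataset maximum, combine with
// the weights, and scale to 0-100.
function impactScore(policy, maxima) {
  let total = 0;
  for (const [dim, weight] of Object.entries(WEIGHTS)) {
    const normalized = maxima[dim] ? policy[dim] / maxima[dim] : 0; // 0..1
    total += weight * normalized;
  }
  return Math.round(total * 100);
}
```

A policy that sits at the dataset maximum on every dimension scores 100; one at half the maximum everywhere scores 50.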
-
Backend Business Logic: Policy Comparison
The Comparison algorithm takes two policies and examines them across thirteen dimensions: sector alignment, objective overlap, benefit types, implementation channels, amendment history, ministry governance, launch timeline, target audiences, budget allocation, documented achievements, shared challenges, future alignment, and strategic recommendations. From this analysis, it generates insights organized into six categories: similarities, differences, timeline relationships, strategic relations, impact analysis, and actionable recommendations. Together these give users a thorough picture of how two policies relate and where they might complement or conflict with each other.
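A heavily simplified sketch of the pairwise idea, using only three illustrative array-valued dimensions rather than the full thirteen: values present in both policies feed the similarities category, the remainder feed the differences.

```javascript
// Compare two policies across array-valued dimensions. Shared values
// become similarities; values unique to the first become differences.
function comparePolicies(a, b, dims = ["objectives", "benefits", "audience"]) {
  const insights = { similarities: [], differences: [] };
  for (const dim of dims) {
    const setB = new Set(b[dim] || []);
    for (const item of a[dim] || []) {
      (setB.has(item) ? insights.similarities : insights.differences)
        .push(`${dim}: ${item}`);
    }
  }
  return insights;
}
```

The real engine adds timeline, relations, impact, and recommendation categories on top of this intersection step.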
-
Backend Business Logic: Natural Language Processing
The NLP component handles the task of figuring out who a policy is actually meant to help. It works by maintaining a dictionary that maps demographic groups (students, farmers, women, seniors, disabled persons, entrepreneurs, and others) to characteristic keywords. When a policy is processed, the system scans the target audience and benefits fields for those keywords. Wherever there's a match, that demographic group gets tagged as a beneficiary segment. It's a straightforward but effective approach that allows policies to be automatically classified by who they serve and filtered accordingly.
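The dictionary approach can be sketched as follows. The groups and keywords shown are a small illustrative subset, not the system's actual fourteen-group map.

```javascript
// Map demographic groups to characteristic keyword stems.
const BENEFICIARY_KEYWORDS = {
  students: ["student", "school", "scholarship"],
  farmers: ["farmer", "crop", "agricultur", "kisan"],
  women: ["women", "girl", "maternity"],
  seniors: ["senior citizen", "elderly", "pension"],
};

// Scan the target-audience and benefits fields; any keyword hit tags
// the policy with that beneficiary segment.
function tagBeneficiaries(policy) {
  const text = `${policy.targetAudience || ""} ${policy.benefits || ""}`.toLowerCase();
  return Object.entries(BENEFICIARY_KEYWORDS)
    .filter(([, words]) => words.some((w) => text.includes(w)))
    .map(([group]) => group);
}
```

A scheme for women farmers would be tagged under both `farmers` and `women`, which is exactly the multi-segment behavior described later in the results.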
-
Database Collections and Schema
The MongoDB database is organized into two main collections. The Users collection holds account information: user ID, name, email (which must be unique), a bcrypt-hashed password, and a creation timestamp. The Policies collection is the heart of the system, storing 101+ government policies with 20+ fields each, including title, content, summary, issue date, ministry, category, status, benefits, target audience, budget, objectives, implementation strategy, parliamentary discussions, debates, amendments, highlights, challenges, and future prospects. To keep things fast, the database has indexes on the fields most commonly searched: title, content, category, and ministry.
-
Data Flow: Policy Search
A policy search kicks off the moment a user starts typing in the search box. Rather than firing off a request with every keystroke, the Frontend waits 400 ms (a technique called debouncing) before sending a GET request to /api/policies?q=searchTerm. The Backend picks up the search term, builds a MongoDB $regex filter that searches across thirteen policy fields at once, and returns the matching documents. The Frontend takes that JSON response, stores the results in React state, and displays them, showing each policy's title, ministry, category, and summary. Clicking on any result opens a modal with the full policy details.
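The multi-field $regex filter could be assembled roughly as below: one case-insensitive clause per searchable field, joined with $or. The field list here is an illustrative subset of the thirteen, and user input is regex-escaped first.

```javascript
// Illustrative subset of the searchable policy fields.
const SEARCH_FIELDS = [
  "title", "content", "summary", "category",
  "ministry", "benefits", "objectives", "targetAudience",
];

// Build a MongoDB filter matching the term in any field (case-insensitive).
function buildSearchFilter(term) {
  if (!term) return {}; // no term: return everything
  const escaped = term.replace(/[.*+?^${}()|[\]\\]/g, "\\$&"); // escape regex chars
  return {
    $or: SEARCH_FIELDS.map((field) => ({
      [field]: { $regex: escaped, $options: "i" },
    })),
  };
}
```

The resulting object would be passed straight to a `Policy.find(...)` style query on the Backend.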
-
Data Flow: Impact Score Calculation
When a user opens the Impact Scorecard page, the Frontend requests all policies from the Backend via GET /api/policies. Once the full list comes back, the Frontend runs the Impact Scoring algorithm on each policy across all six dimensions, entirely on the client side. The computed scores are stored in state, sorted from highest to lowest, and displayed in the Impact Ranking tab. The Frontend also groups policies by sector to calculate average scores, pulls out beneficiary demographic data, and runs trend analysis for the other tabs.
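The sector grouping for the Sector-wise tab amounts to bucketing by category and averaging, which could look like this (field names assumed):

```javascript
// Group scored policies by category and compute each sector's average score.
function sectorAverages(policies) {
  const buckets = new Map();
  for (const p of policies) {
    if (!buckets.has(p.category)) buckets.set(p.category, []);
    buckets.get(p.category).push(p.score);
  }
  return [...buckets.entries()].map(([sector, scores]) => ({
    sector,
    average: Math.round(scores.reduce((a, b) => a + b, 0) / scores.length),
    count: scores.length,
  }));
}
```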
-
Data Flow: PDF Report Generation
PDF generation is handled entirely on the Frontend, with no Backend involvement needed. From the Report Sharing page, users search for a policy and, once selected, click Generate PDF. This triggers the generatePDF() function, which uses the jsPDF library to build a document from scratch. The PDF starts with a title page showing the policy name, ministry, category, date, and budget, then continues with color-coded sections covering objectives, benefits, highlights, target audience, implementation strategy, discussions, debates, amendments, challenges, future prospects, and the full policy content. The finished file downloads directly to the user's computer, keeping the server completely out of the loop.
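Since jsPDF is a browser library, the sketch below only shows the document outline a generatePDF() call could iterate over: a title block followed by one entry per populated section, in the order described. The field names are assumptions, not the project's actual schema.

```javascript
// Assemble the report structure that the PDF-rendering loop would consume.
function buildReportOutline(policy) {
  const sections = [
    ["Objectives", policy.objectives],
    ["Key Benefits", policy.benefits],
    ["Challenges", policy.challenges],
    ["Future Prospects", policy.futureProspects],
  ];
  return {
    title: policy.title,
    meta: { ministry: policy.ministry, category: policy.category },
    // Skip sections with no content so the PDF has no empty headings.
    sections: sections.filter(([, body]) => body && body.length > 0),
  };
}
```

The rendering loop would then call `doc.text(...)` per section and finish with `doc.save(filename)`.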
-
REST API Endpoints Summary
The system has four core API endpoints. GET /api/policies returns all policies, or filters them if a ?q=searchTerm query parameter is included, searching across thirteen fields with case-insensitive regex matching. POST /api/auth/signup creates a new user account from a name, email, and password. POST /api/auth/login authenticates a user and returns a JWT token. All requests and responses use JSON; authenticated requests include an Authorization: Bearer {token} header; and every response comes with an HTTP status code: 200 or 201 for success, and 400, 401, 404, or 500 for various error conditions.
-
Security Implementation
Security is layered throughout the system. All browser-to-server communication runs over HTTPS/SSL. Passwords are hashed with bcrypt before storage and can never be reversed; if the database were ever compromised, raw passwords wouldn't be exposed. JWT tokens protect sensitive endpoints, and any request with an invalid or expired token gets a 401. On the database side, MongoDB authentication is enabled, with optional encryption at rest for an extra layer of protection. User input is validated at the application level to prevent injection attacks, error messages are kept vague enough not to leak system internals, and sensitive configuration such as secret keys and database credentials lives in environment variables rather than in the codebase itself.
-
-
IMPLEMENTATION
-
Authentication System: Sign Up Page
Fig. 2: Signup & Login Page
The SignUp page is where new users create their accounts. It keeps things straightforward with just three fields to fill in: Full Name, Email, and Password. Visually, the page greets users with a soft purple gradient background and a clean white card centered on the screen, keeping the focus entirely on getting started.
When a user fills in their details and hits the Sign Up button, the frontend first makes sure none of the fields are left empty before doing anything else. Once that checks out, the data is sent to the backend via a POST request to /api/auth/signup. On the backend side, the system checks whether that email address is already registered. If it's a new one, the password gets hashed using bcrypt before anything is stored, so raw passwords never touch the database. The new user's information is then saved to the Users collection, and a JWT token is generated and sent back. The frontend picks up that token and stores it in localStorage, keeping the user's session alive.
For anyone who already has an account, there's a Login link right on the page that takes them straight to the login interface, with no dead ends.
-
Policy Search: Policy Intelligence Page
Fig. 3: Policy Search
The Policy Intelligence page is the heart of the search experience. At the top sits a prominent search bar labeled Explore Policies, inviting users to type in whatever they're looking for: a keyword, a sector name, a beneficiary type, or anything else that comes to mind.
To keep things smooth and avoid hammering the server on every keystroke, the frontend uses a 400 ms debounce before firing off an API call. So when a user types something like education, the system waits just a beat, then searches across all thirteen policy fields and pulls up matching results. Each result card shows the essentials at a glance: policy title, ministry name, category, and issue date. A results counter at the top, something like Results Found: 17, lets users know how many matches came up. The cards themselves have a clean look with gradient backgrounds and cyan accents, making them easy to scan and visually distinct from one another. Clicking any card takes the user to a full detailed view of that policy or opens a modal with all the information laid out.
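A minimal debounce sketch, as used for the 400 ms search delay: each keystroke resets the timer, so the wrapped function only fires once typing pauses.

```javascript
// Return a debounced wrapper: only the last call within delayMs fires.
function debounce(fn, delayMs) {
  let timer = null;
  return (...args) => {
    clearTimeout(timer);
    timer = setTimeout(() => fn(...args), delayMs);
  };
}
```

In the React component this would wrap the API call, along the lines of a hypothetical `const search = debounce(fetchPolicies, 400)`.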
-
Impact Scorecard: Policy Rankings and Analytics
Fig. 4: Impact Scorecard
The Impact Scorecard page is built for users who want to dig into how policies actually measure up. It's organized into four tabs (Impact Ranking, Sector-wise, Beneficiary Analysis, and Visual Analytics), each offering a different angle on the data.
Right at the top, the Top 3 Policies section gives a quick snapshot of the highest-performing policies, displayed in color-coded cards: gold for first place, silver for second, and orange for third. Each card shows the policy title, the ministry behind it, and its impact score: 78, 62, and 61, respectively, in the sample data.
Below that, an Advanced Filters section lets users narrow things down using dropdowns for Sector, Ministry, Budget Range, Year, and Beneficiary Type. The full All Policies table beneath it lists all 101 policies in ranked order, with columns for rank, title, ministry, beneficiary tags, and impact score. Score badges are color-coded (green for high impact, yellow for mid-range, red for lower scores), making it easy to spot patterns at a glance. The table also supports sorting and pagination, so navigating through a large dataset doesn't feel overwhelming.
-
Policy Comparison Engine
Fig. 5: Policy Comparison Engine
The Policy Comparison Engine is designed for users who want to put two policies side by side and really understand how they differ or where they overlap. The interface is clean and guided: two selector sections, labeled First Policy and Second Policy, each with a searchable dropdown that shows policy names and their associated ministries.
Once both policies are chosen, clicking the Compare Policies Now button, styled in a bold pink-to-magenta gradient so it's hard to miss, kicks off the comparison. The results appear below in a structured, multi-section layout. Basic metadata for both policies is shown side by side: Ministry, Category, Issue Date, and Status. From there, a set of tabs (Comparison, Timeline, Similarities, Differences, Relations, and Impact) breaks the analysis down further across all thirteen comparison dimensions, giving users a thorough look at how the two policies stack up.
-
Policy Details Display
Fig. 6: Policy Details
The detailed policy view is where users get the full picture on any individual policy. To make the volume of information digestible, everything is organized into clearly color-coded sections, each with its own icon. Blue covers
Objectives, green handles Key Benefits, yellow is for Highlights, pink for Target Audience, purple for Implementation, orange for Parliament Discussions, red for Debates, indigo for Amendments, amber for Challenges, and green again for Future Prospects.
At the top, a clean information grid shows the core details: Ministry, Category, Issue Date, and Status. The sections that follow use bulleted lists wherever the content is an array (objectives, benefits, highlights, and so on), so nothing feels like a wall of text. Longer written content is formatted with justified alignment, giving the whole page a polished, professional feel.
-
Official Sources Documentation
Fig. 7: Official Sources
Transparency matters, and the Official Sources Documentation section is built around that idea. Every source used to compile a policy's data is listed in its own card, showing the Source number, Source Name (for example, Wikipedia), Source Type (such as Online Encyclopedia), and a percentage badge indicating how much of the policy data came from that source; 25% means roughly a quarter of the information originated there.
Below the source header, the official URL is displayed as a clickable cyan hyperlink, making it easy for users to verify information directly. A Data Covered section uses tags to show exactly which aspects of the policy (like Policy Overview or Key Features) were drawn from that particular source. Multiple sources stack vertically in the same consistent card format, giving users a clear and honest account of where the data comes from.
-
Reports and Sharing: PDF Generation
Fig. 8: Report Sharing
The Reports & Sharing page gives users a way to take policy information with them in a clean, shareable format. It starts with the same search interface found elsewhere in the app: users look up the policy they want, select it, and a summary appears showing the title, ministry, category, budget, and issue date.
Once a policy is loaded, a PDF Report Ready section appears with a prominent Generate PDF button in a cyan gradient. Clicking it triggers the jsPDF library on the frontend, which assembles a professional document on the fly. The PDF opens with a title page covering the policy name, ministry, category, and issue date, then flows into dedicated sections for Objectives, Key Benefits, Challenges, Debates, and other relevant attributes. When it's ready, the file downloads automatically, named after the policy itself, like National_Education_Policy_2020.pdf.
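The paper doesn't state the exact filename rule, but a slug derived from the policy title reproduces the pattern of the example shown. A hedged sketch:

```javascript
// Turn a policy title into a safe download filename,
// e.g. "National Education Policy 2020" -> "National_Education_Policy_2020.pdf".
function pdfFileName(title) {
  return title.trim()
    .replace(/[^A-Za-z0-9]+/g, "_") // collapse non-alphanumeric runs
    .replace(/^_|_$/g, "")          // trim stray underscores
    + ".pdf";
}
```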
-
User Interface Design and Color Scheme
Fig. 9: User Interface
Policy Analyzer's visual design leans into a modern dark theme that feels premium without being overdone. The main background runs a gradient from dark slate to deep purple, setting a sophisticated tone throughout. Buttons use gradients (cyan-to-purple or pink) that make calls to action visually obvious. Primary text is white; secondary information drops to light gray. Cards and content sections sit on dark gray or charcoal backgrounds with subtle purple-toned borders that keep things cohesive.
Impact score badges follow a consistent color logic (green for high scores, yellow for mid-range, red for lower ones), so users can read the data at a glance without needing to check numbers. Icons appear consistently across the interface, helping users quickly identify what type of content they're looking at. The result is a design that's easy to use and genuinely pleasant to look at.
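The badge logic is a one-liner. The thresholds below are assumptions for illustration; the paper only names the green/yellow/red bands, not the cutoffs.

```javascript
// Map an impact score to its badge color (thresholds assumed).
function scoreBadgeColor(score) {
  if (score >= 70) return "green";  // high impact
  if (score >= 40) return "yellow"; // mid-range
  return "red";                     // lower scores
}
```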
-
Responsive Design and Device Support
Policy Analyzer is built to work well regardless of what device a user picks up. On large desktop screens (1920×1080 and above), the full interface is on display: all columns, maximum content density, and every filter visible. On tablets (around 768×1024), columns stack vertically and less critical filters tuck away behind collapsible sections to keep things manageable. On mobile (375×667), everything shifts to a single-column layout with full-width elements, a simplified header, and touch targets sized for fingers rather than cursors.
Search results switch from table rows to full-width cards on mobile, and the comparison view scrolls vertically rather than spreading side by side. Every feature remains accessible regardless of screen size; the experience just adapts to fit.
-
Navigation and User Flow
Fig. 10: Policy Impact Analyzer Dashboard
Getting around Policy Analyzer is designed to feel intuitive from the first visit. A Back to Dashboard button lives in the header on every page, so users always have a way back to the main hub. The Dashboard itself acts as the central starting point, with clear navigation links to the Search Page, Impact Scorecard, Policy Comparison, and Reports & Sharing.
Each page announces itself with a prominent title and relevant icon, so there's never any question about where you are. The Search Page surfaces results immediately as users type. The Impact Scorecard uses tabs to organize its different analytical views. The Comparison Engine walks users through selecting two policies before revealing results. The Reports page guides users through search and PDF generation step by step. The overall flow is logical and consistent: users can explore every feature without ever feeling lost.
-
-
RESULTS AND DISCUSSION
Policy Analyzer brings together all of its core features in a way that feels cohesive and purposeful. At its heart is an impact scoring algorithm that evaluates government policies across six weighted dimensions, producing normalized scores that make it possible to meaningfully compare over 101 policies from entirely different domains. The system is built for speed: API responses come back in 200–500 ms, search results render in under 800 ms, and PDF reports generate in just 2–3 seconds. Behind the scenes, a well-structured database handles 101 policies with 20+ fields each, responding to complex multi-field queries without breaking a sweat.
One of the more thoughtful aspects of the scoring algorithm is how it handles fairness across policy types. A large healthcare program and a small agricultural initiative don't compete on budget alone; the algorithm weights beneficiary reach, implementation scale, and documentation quality alongside financial size. In practice, this means a well-documented, widely reaching smaller policy can score just as highly as a bigger one. Policy experts who reviewed the results found the rankings intuitive, which validates both the choice of dimensions and how they're weighted. The framework is also flexible enough to accommodate future changes: adjusting weights or adding new dimensions is straightforward as needs evolve.
The comparison engine goes deeper than simple side-by-side metrics. Across thirteen comparison dimensions, it identifies relationships between policy pairs that would take a researcher weeks to uncover manually. It uses regex-based sector classification and NLP to detect overlapping beneficiary populations, and it surfaces coordination opportunities that aren't obvious at first glance. For example, when comparing education and employment policies, the system recognizes that one enables the other and recommends integrated beneficiary pathways. When two policies launch around the same time, it flags the potential for coordinated strategy. The output is structured clearly enough to drop directly into policy coordination documents or presentations.
Search is designed to feel responsive without hammering the server. A 400 ms debounce means users see results within 1–2 seconds of stopping typing, while unnecessary API calls are kept to a minimum. Searches span thirteen policy fields simultaneously (title, content, benefits, objectives, and more), and filters for sector, ministry, budget range, year, and beneficiary type can be layered together using AND logic. For everyday citizens, this is genuinely useful: search logs show that users regularly discover relevant government schemes they didn't know existed, which speaks directly to the platform's goal of making policy information more accessible.
The NLP-powered beneficiary extraction achieves 85–90% accuracy across all 101 policies, mapping content to fourteen demographic groups including students, farmers, women, seniors, rural populations, and SC/ST communities, among others. Policies that serve multiple groups (say, a scheme targeting women farmers in rural areas) are correctly tagged across all three relevant segments. This automated tagging saves significant manual effort and, more importantly, makes it possible for citizens to search by their own demographic characteristics to find benefits they're entitled to.
PDF reports generated through jsPDF run 12–15 pages on average and are polished enough for academic citations, government documentation, or public distribution. Each report includes a title page, color-coded sections, justified text, automatic page breaks, and page-numbered footers, all generated client-side, which keeps server load low. File sizes stay between 300 and 500 KB, small enough to send by email. Users have found the downloaded reports particularly useful for offline reference during presentations and research.
The three-tier architecture keeps the frontend, backend, and database cleanly separated, which pays dividends for scalability. The frontend can be deployed to a CDN independently; the backend containerizes easily for load balancing; and the database uses indexing on frequently searched fields to stay fast even as the policy collection grows. Load testing confirms the system can handle hundreds of simultaneous users without degradation, and the horizontal scaling model means growth from 100 to millions of users is an infrastructure decision, not an architectural one.
The interface leans into a dark theme with purple gradients and cyan accents: modern without being flashy. Framer Motion animations provide subtle feedback during interactions, and the responsive layout adapts cleanly across desktop, tablet, and mobile. In user testing, people were able to discover policies, run comparisons, and generate reports within 2–3 minutes of first use, without any guidance. That kind of immediate usability is hard to achieve and reflects careful design decisions throughout.
Security follows current best practices without cutting corners. Passwords are hashed with bcrypt, making them unrecoverable even in a database breach. JWT tokens expire after one hour. All traffic runs over HTTPS, inputs are validated against injection attacks, and secrets are stored in environment variables rather than in the codebase. During development and testing, no vulnerabilities were identified, and protected endpoints correctly return 401 errors for unauthorized requests.
The dataset itself covers 101 government policies drawn from official sources: ministry websites, parliamentary documentation, and official policy databases. Accuracy checks against original sources show 95%+ correctness for structured fields like title, ministry, and date, and 90%+ for content fields, with only minor formatting differences. Source links are included for every policy, so users can verify information directly. Coverage spans healthcare, education, agriculture, finance, employment, rural development, housing, and social security, a broad cross-section of the policy landscape.
TABLE I: Policy Analyzer vs Other Policy Analysis Platforms
-
CONCLUSION AND FUTURE WORK
The Policy Analyzer tackles a real and pressing problem: most people find government policy documents overwhelming, confusing, or simply out of reach. By bringing everything together in one place, the platform gives both everyday citizens and policymakers a practical way to understand, compare, and evaluate policies across different sectors and ministries. With smart impact scoring, side-by-side policy comparisons, and intuitive search and visualization tools, it puts meaningful policy information in the hands of people who need it most, making civic engagement less of a chore and more of a conversation.
From a technical standpoint, the platform is built to last. It handles scale well, takes security seriously, and delivers reliable performance consistently. More importantly, it proves a point worth making: that the right technology can genuinely close the gap between dense government language and public understanding, leading to better-informed citizens and more transparent governance.
Looking ahead, there's a lot of exciting ground to cover. The roadmap includes building machine learning models to predict how policies might play out before they're fully implemented, and expanding coverage beyond India to include international and regional governance frameworks. Better natural language processing will make it possible to automatically summarize policies and detect how they relate to one another, saving users hours of reading.
On the accessibility front, dedicated iOS and Android apps will bring the platform to a much wider audience, while multi-language support will ensure regional communities aren't left out. Real-time tracking of policy amendments, paired with automated alerts, will keep users up to date without any extra effort on their part.
There are also plans to open the platform up through API access for developers and research institutions, collaborative tools for government officials and policy researchers, and specialized modules tailored to specific sectors. And perhaps most meaningfully, the team intends to run longitudinal studies to measure what actually changes: how policies perform in the real world, and whether the platform is genuinely moving the needle on citizen engagement.
ACKNOWLEDGMENT
The authors thank the Department of Computer Science and Engineering, Geethanjali College of Engineering and Technology, for guidance and support.
