DOI : 10.17577/IJERTCONV14IS010011- Open Access

- Authors : Alina Viola Dsouza, Sunith Kumar T
- Paper ID : IJERTCONV14IS010011
- Volume & Issue : Volume 14, Issue 01, Techprints 9.0
- Published (First Online) : 01-03-2026
- ISSN (Online) : 2278-0181
- Publisher Name : IJERT
- License:
This work is licensed under a Creative Commons Attribution 4.0 International License
CropVerse: A Web-Based Platform for Data-Driven Crop Yield Prediction and Recommendation
Alina Viola Dsouza Student
Dept. of Computer Applications St Joseph Engineering College Vamanjoor, Mangaluru, India
Sunith Kumar T Assistant Professor
Dept. of Computer Applications St Joseph Engineering College Vamanjoor, Mangaluru, India
Abstract – This paper introduces CropVerse, an integrated, web-based platform that aims to empower farmers with data-driven agricultural decision-making. CropVerse combines machine learning with web technology to offer crop yield forecasts and crop recommendations based on individual environmental and soil contexts. The system applies Random Forest Regression (R² = 0.86) for predicting yield and Random Forest Classification (accuracy = 99.2%) for suggesting suitable crops. The predictive system is built around soil nutrient values (NPK and pH), climatic inputs, and real-time updates of temperature and humidity from the OpenWeatherMap API. The system was tested through simulated evaluations and dataset validations; no field deployment was conducted, and the reported figures come from internal testing. CropVerse is distinguished by its integrated dual-model design, which aims to help bridge the digital divide and support sustainable agriculture through cost-effective agritech infrastructure.
Keywords – Crop Yield Prediction, Crop Recommendation, Machine Learning, Random Forest, Flask, Precision Agriculture.
-
INTRODUCTION
Indian agriculture involves a delicate interrelationship of environmental, economic, and technological parameters. Although it engages more than 50% of workers, the sector contributes less than 20% of the country's GDP. Farmers may have limited access to timely and site-specific information, and thus resort to inefficient practices such as inappropriate choice of crop, wasteful use of fertilizers, and excessive dependence on traditional practice. Outcomes consequently fall short of expected standards, primarily due to inadequate planning and ineffective utilization of existing data [1]. Meanwhile, ICAR indicates that more than 65% of Indian soil is affected by poor NPK balance [2]. Unstable climatic trends due to global warming impose greater pressures on agricultural yields [3]. In such a hostile setting, technologies such as machine learning (ML), data analysis, and API-based weather forecasting have the potential to transform decision-making and increase crop yield. CropVerse addresses these issues by providing a dual-purpose platform that integrates crop yield prediction and crop suitability recommendations into one web-based tool. Designed for scalability, simplicity, and practical usability, CropVerse aims to make precision agriculture an open technology available to everyone, with actionable information backed by empirical evidence. This paper outlines the architectural framework, implementation approach, machine learning algorithms employed, and results obtained from dataset-based evaluation.
-
LITERATURE REVIEW
Machine learning (ML) has made significant contributions to agriculture, particularly in predictive analytics and intelligent decision-making platforms. Traditional models such as linear regression and support vector machines (SVM), and newer deep learning methods such as convolutional neural networks (CNN) and long short-term memory (LSTM) networks, have all been used for predicting crop yields. However, these methods typically demand large, good-quality datasets and considerable computing capacity, which makes them less feasible to deploy in rural farm settings [4], [6].
Among the various ML algorithms, Random Forest (RF) has emerged as a reliable and interpretable choice for agricultural applications due to its robustness against overfitting, ability to handle both categorical and numerical data, and effectiveness on small to medium-sized datasets [4], [5]. Studies such as Van Klompenburg et al. [6] have demonstrated that RF consistently outperforms other models in crop yield prediction tasks when evaluated on publicly available agricultural datasets.
In the domain of crop recommendation, earlier systems were rule-based and depended heavily on manual expert knowledge. Recent developments, however, focus on data-driven, weather-aware models that incorporate real-time environmental inputs such as soil nutrients, temperature, and humidity. These systems have achieved recommendation accuracies ranging from 85% to 95% [7], while the integration of live weather data via APIs has been shown to further enhance prediction accuracy by up to 15% [8].
Despite these advancements, many existing platforms are limited in scope, offering either yield prediction or crop recommendation, but not both in an integrated manner. Additionally, challenges such as lack of real-time data processing, limited mobile responsiveness, and insufficient accessibility in low-connectivity regions hinder their practical adoption in countries like India. CropVerse is designed to overcome these limitations by providing a unified, real-time, and user-accessible web platform tailored for smallholder farmers.
-
SYSTEM DESIGN AND IMPLEMENTATION
-
Architecture
CropVerse is built on the Flask web framework, whose lightweight, modular structure is well suited to embedding machine learning functionality. The system architecture is separated into several layers to improve maintainability and to make functional roles explicit. The front end is built with HTML, CSS, JavaScript, Tailwind CSS, and Bootstrap, providing a responsive, user-friendly interface designed for agriculture professionals and agronomy experts. The backend is written in Python 3.8 with Flask; it processes HTTP requests, captures user input, runs model inference, issues real-time weather API queries, and renders dynamic output. SQLite serves as a lightweight database for user profiles, logins, input history, and prediction results. The machine learning engine consists of two models: a Random Forest regressor for predicting crop yields and a Random Forest classifier for recommending crops based on environmental and soil parameters. In addition, real-time temperature and humidity values are fetched from the OpenWeatherMap API, enabling weather-aware crop recommendation. The system handles synchronous data processing as well as asynchronous API-based workflows, which is intended to keep it usable in low-resource agricultural environments where internet connectivity may be intermittent.
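As an illustration of the weather-aware step, the following minimal sketch builds a request URL for OpenWeatherMap's current-weather endpoint and extracts temperature and humidity from a response. The helper names, the `units=metric` choice, and the sample payload are assumptions made for illustration and are not taken from the CropVerse codebase; no network call is made.

```python
import json
from typing import Tuple
from urllib.parse import urlencode

# OpenWeatherMap current-weather endpoint.
OWM_URL = "https://api.openweathermap.org/data/2.5/weather"


def build_weather_url(city: str, api_key: str) -> str:
    """Return the request URL for a city (hypothetical helper)."""
    query = urlencode({"q": city, "appid": api_key, "units": "metric"})
    return f"{OWM_URL}?{query}"


def extract_temp_humidity(payload: dict) -> Tuple[float, float]:
    """Pull temperature (deg C when units=metric) and humidity (%) from a response."""
    main = payload["main"]
    return float(main["temp"]), float(main["humidity"])


# Payload shaped like a current-weather response; the values are invented.
sample = json.loads('{"name": "Mangaluru", "main": {"temp": 29.4, "humidity": 78}}')
temperature, humidity = extract_temp_humidity(sample)
print(temperature, humidity)
```

In a running backend the URL would be fetched with an HTTP client and the parsed values passed to the recommendation form; that wiring is omitted here.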
-
Data Collection
The machine learning functionality of CropVerse is implemented using the Scikit-learn library. Two openly accessible Kaggle datasets were used to train and assess the models:
- Crop Yield Dataset
  - Source: A. Gupta, Kaggle [10]
  - Content: Crop yield data from Indian states from 1997 to 2020.
  - Fields: Crop name, season, state, area, production, rainfall, fertilizer, and pesticide.
  - Usage: Trained the Random Forest regression model for yield prediction.
- Crop Recommendation Dataset
  - Source: A. Ingle, Kaggle [9]
  - Fields: N, P, K, temperature, humidity, pH, rainfall.
  - Usage: Trained the Random Forest classification model for crop recommendations.
-
Technologies
CropVerse is deployed on a modular, scalable stack that integrates web technologies and machine learning frameworks. The backend is built with Flask, a lightweight Python-based microframework that takes care of HTTP routing, session management, and model integration. Data persistence is handled by SQLite. The machine learning pipeline is driven by Scikit-learn: pre-trained Random Forest regression and classification models are stored as .pkl files and loaded at runtime using Joblib. Pandas is used for data preprocessing and feature engineering, and the OpenWeatherMap API is used to fetch live weather data. Frontend styling is done with Tailwind CSS, providing mobile-first responsive layouts, and JavaScript handles data asynchronously to keep the user interface dynamic.
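The persistence step described here can be sketched as follows, assuming Scikit-learn, Joblib, and NumPy are installed. The training data is synthetic and invented purely so the example runs, the file names are hypothetical, and the hyperparameters are those reported in the Results section; this is not the authors' training script.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor

rng = np.random.default_rng(0)

# Synthetic stand-in data (invented): 4 numeric features for the yield model,
# 7 numeric features (N, P, K, temperature, humidity, pH, rainfall) for crops.
X_yield = rng.random((200, 4))
y_yield = X_yield @ np.array([2.0, 1.0, 0.5, 0.25]) + rng.normal(0, 0.05, 200)
X_crop = rng.random((200, 7))
y_crop = np.where(X_crop[:, 0] > 0.5, "rice", "maize")

# Hyperparameters as reported in the Results section.
regressor = RandomForestRegressor(
    n_estimators=150, max_depth=10, min_samples_split=2, random_state=0
)
classifier = RandomForestClassifier(
    n_estimators=150, max_depth=15, class_weight="balanced", random_state=0
)
regressor.fit(X_yield, y_yield)
classifier.fit(X_crop, y_crop)

# Persist both models as .pkl files, then reload them as a backend would at runtime.
model_dir = tempfile.mkdtemp()
yield_path = os.path.join(model_dir, "yield_model.pkl")  # hypothetical file name
crop_path = os.path.join(model_dir, "crop_model.pkl")    # hypothetical file name
joblib.dump(regressor, yield_path)
joblib.dump(classifier, crop_path)

loaded_regressor = joblib.load(yield_path)
loaded_classifier = joblib.load(crop_path)
print(loaded_regressor.predict(X_yield[:1]), loaded_classifier.predict(X_crop[:1]))
```

Loading the models once at application start, rather than on every request, is a common design choice for this kind of setup, since deserializing a 150-tree forest on each form submission adds avoidable latency.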
-
Implementations
The CropVerse application was implemented with a modular design to simplify development, testing, and maintenance. The project was coded in a Windows environment with the PyCharm IDE, which allowed easy organization of the system's parts: frontend, backend, database, and machine learning modules. The modules interact with each other through simple API calls, so the system responds quickly to user actions and returns results immediately.
The front end was coded using HTML, CSS, JavaScript, Tailwind CSS, and Bootstrap. These technologies provide a clean, interactive interface in which users fill in forms to obtain yield predictions or crop suggestions. The user enters parameters such as soil nutrients (N, P, K), pH, rainfall, temperature, humidity, crop name, season, and state. The system also offers an option to fetch live temperature and humidity automatically from the OpenWeatherMap API. After the form is submitted, the output is shown in real time: either the predicted yield in kg/ha or a list of crops with scores. There is also a dashboard where users can view their previous predictions and suggestions.
The backend was created in Flask (Python 3.8). The database stores user records, user input history, model predictions, and weather records. The machine learning part of the system has two models, both trained using Scikit-learn. The Random Forest Regressor predicts crop yield. It accepts inputs such as the name of the crop, state, season, area, rainfall, fertilizer application, and pesticide application. The Random Forest Classifier recommends the best-suited crops based on inputs such as N, P, K, pH, temperature, humidity, and rainfall. Both models were pre-trained and stored as .pkl files. When the user submits a form, the backend loads the respective model, accepts the input, makes the prediction, and sends the output to the frontend. At the same time, the system stores all the data in the database so that the user can retrieve it later.
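The logging and history retrieval described above can be sketched with Python's built-in `sqlite3` module. The table name, column names, and sample values below are assumptions for illustration; the paper does not publish the CropVerse schema.

```python
import json
import sqlite3
from datetime import datetime, timezone

# Hypothetical schema: one row per prediction or recommendation.
SCHEMA = """
CREATE TABLE IF NOT EXISTS prediction_log (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    user_id INTEGER NOT NULL,
    kind TEXT NOT NULL,        -- 'yield' or 'recommendation'
    inputs TEXT NOT NULL,      -- JSON-encoded form inputs
    result TEXT NOT NULL,      -- JSON-encoded model output
    created_at TEXT NOT NULL
)
"""


def log_prediction(conn, user_id, kind, inputs, result):
    """Store one model call so the user can review it later."""
    conn.execute(
        "INSERT INTO prediction_log (user_id, kind, inputs, result, created_at) "
        "VALUES (?, ?, ?, ?, ?)",
        (user_id, kind, json.dumps(inputs, sort_keys=True), json.dumps(result),
         datetime.now(timezone.utc).isoformat()),
    )
    conn.commit()


def fetch_history(conn, user_id):
    """Return a user's logged calls in insertion order."""
    rows = conn.execute(
        "SELECT kind, inputs, result FROM prediction_log WHERE user_id = ? ORDER BY id",
        (user_id,),
    ).fetchall()
    return [(kind, json.loads(inputs), json.loads(result)) for kind, inputs, result in rows]


conn = sqlite3.connect(":memory:")  # in-memory database for the example
conn.execute(SCHEMA)
# Sample values are invented.
log_prediction(conn, 1, "yield", {"crop": "Rice", "season": "Kharif"}, 2450.0)
log_prediction(conn, 1, "recommendation", {"N": 90, "P": 42, "K": 43}, ["rice", "maize", "jute"])
history = fetch_history(conn, 1)
print(history)
```

Storing inputs and outputs as JSON text keeps one table usable for both model types, at the cost of not being able to query individual fields directly in SQL.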
-
Pseudo Codes
Login Functionality:
BEGIN
    DISPLAY Login Form
    IF user submits the form THEN
        RETRIEVE user data from SQLite
        VALIDATE entered email and password
        IF credentials match THEN
            SET user session
            REDIRECT to user dashboard
        ELSE
            DISPLAY error message: "Invalid email or password"
        END IF
    END IF
END
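The credential check in the login flow above can be sketched as follows. The paper does not say how passwords are stored, so the salted PBKDF2 hashing, the `users` table layout, and the function names here are assumptions; session handling and the redirect would be done by the web framework and are left out.

```python
import hashlib
import hmac
import os
import sqlite3


def hash_password(password: str, salt: bytes) -> bytes:
    """Derive a password hash with PBKDF2-HMAC-SHA256 (assumed scheme)."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)


def register_user(conn: sqlite3.Connection, email: str, password: str) -> None:
    """Insert a user with a fresh random salt."""
    salt = os.urandom(16)
    conn.execute(
        "INSERT INTO users (email, salt, password_hash) VALUES (?, ?, ?)",
        (email, salt, hash_password(password, salt)),
    )
    conn.commit()


def check_login(conn: sqlite3.Connection, email: str, password: str) -> bool:
    """Return True only if the email exists and the password matches."""
    row = conn.execute(
        "SELECT salt, password_hash FROM users WHERE email = ?", (email,)
    ).fetchone()
    if row is None:
        return False
    salt, stored_hash = row
    # Constant-time comparison avoids leaking how many bytes matched.
    return hmac.compare_digest(stored_hash, hash_password(password, salt))


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (email TEXT PRIMARY KEY, salt BLOB NOT NULL, password_hash BLOB NOT NULL)"
)
register_user(conn, "farmer@example.com", "s3cret")
print(check_login(conn, "farmer@example.com", "s3cret"))
print(check_login(conn, "farmer@example.com", "wrong"))
```

Returning the same result for an unknown email and a wrong password matches the single "Invalid email or password" message in the pseudocode.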
CropVerse Dashboard Operations:
BEGIN
    DISPLAY Dashboard Options:
        - Weather Module
        - Crop Yield Prediction
        - Crop Recommendation
        - View History
    RUN Weather Module every 6 hours:
        CALL OpenWeatherMap API
        IF API returns success THEN
            EXTRACT temperature and humidity
            STORE in weather_log
        ELSE
            LOG error: "Weather fetch failed"
        END IF
    WAIT for user to select an option
    IF user selects Crop Yield Prediction THEN
        DISPLAY Prediction Form
        WAIT for user input
        IF form submitted THEN
            VALIDATE and preprocess input
            LOAD Random Forest Regressor
            PREDICT yield
            DISPLAY yield result
            LOG prediction to database
        ELSE
            DISPLAY error: "Incomplete input"
        END IF
    ELSE IF user selects Crop Recommendation THEN
        DISPLAY Recommendation Form
        WAIT for user input
        IF auto weather fetch is enabled THEN
            RETRIEVE temperature and humidity from weather_log
        END IF
        IF form submitted THEN
            VALIDATE and preprocess input
            LOAD Random Forest Classifier
            PREDICT top 3 crops
            DISPLAY crop suggestions
            LOG recommendation to database
        ELSE
            DISPLAY error: "Incomplete input"
        END IF
    ELSE IF user selects View History THEN
        FETCH prediction and recommendation history
        DISPLAY records
    ELSE
        DISPLAY error: "Invalid selection"
    END IF
END
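The "PREDICT top 3 crops" step in the pseudocode above can be sketched as a ranking over per-class scores. With a Scikit-learn classifier, the labels would typically come from its `classes_` attribute and the scores from one row of `predict_proba`; the helper name and the sample values below are assumptions for illustration.

```python
from typing import List, Sequence, Tuple


def top_k_crops(
    classes: Sequence[str], probabilities: Sequence[float], k: int = 3
) -> List[Tuple[str, float]]:
    """Return the k most probable crops, highest score first."""
    if len(classes) != len(probabilities):
        raise ValueError("classes and probabilities must have the same length")
    ranked = sorted(zip(classes, probabilities), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


# Invented labels and scores standing in for classifier output.
classes = ["banana", "jute", "maize", "rice"]
probabilities = [0.05, 0.25, 0.10, 0.60]
suggestions = top_k_crops(classes, probabilities)
print(suggestions)
```

Returning the score alongside each crop lets the interface display a list of crops with scores, as the implementation description states.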
-
RESULTS AND ANALYSIS
This section summarizes the experimental evaluation of the CropVerse platform, covering model performance metrics and findings on the two primary datasets.
-
Crop Yield Prediction
CropVerse's yield prediction module was evaluated using the Crop Yield in Indian States dataset. The selected model and results were as follows:
- Model: Random Forest Regressor
- Evaluation Metric: R² Score
- Best R² Score: 0.86
- Hyperparameters:
  - n_estimators = 150
  - max_depth = 10
  - min_samples_split = 2
Fig. 1. Feature Importance for Crop Yield Prediction
This bar chart illustrates the relative importance of key agronomic features (Area, Fertilizer usage, Pesticide application, and Annual Rainfall) in predicting crop yield, as determined by the trained Random Forest Regressor.
-
Crop Recommendation
CropVerse's crop recommendation module was trained and evaluated using the Crop Recommendation Dataset:
- Model: Random Forest Classifier
- Evaluation Metric: Accuracy
- Best Accuracy: 99.2%
- Hyperparameters:
  - n_estimators = 150
  - max_depth = 15
  - class_weight = balanced
Fig. 2. Crop Frequency in Recommendation Dataset
This bar chart presents how many times every crop occurs in the recommendation dataset. Each bar is representative of a particular crop, while its length indicates the number of samples or observations relating to that crop within the data.
-
Performance Comparison of Models
To evaluate the CropVerse system, we tested its two Random Forest models, one for crop yield prediction and the other for crop recommendation, on datasets prepared with the relevant agricultural features. The table below presents their performance on standard evaluation metrics:
Table I: Model Performance Comparison
Metric                 | Yield Model | Crop Model
R² Score               | 0.86        | –
Accuracy               | –           | 99.2%
F1-Score               | –           | 98.9%
Cross-validation folds | 3           | 5
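For reference, the metrics in Table I follow standard definitions, which the dependency-free sketch below spells out. The paper does not state how the F1-score is averaged across crop classes, so macro averaging is an assumption here, and the sample values are invented and unrelated to the reported results.

```python
from typing import Sequence


def r2_score(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def accuracy(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Fraction of samples whose predicted label equals the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Unweighted mean of per-class F1 scores (macro averaging, assumed)."""
    scores = []
    for label in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Invented toy values, only to exercise the functions.
yields_true = [2.0, 3.0, 4.0, 5.0]
yields_pred = [2.5, 3.0, 3.5, 5.0]
crops_true = ["rice", "rice", "maize", "jute"]
crops_pred = ["rice", "maize", "maize", "jute"]

print(r2_score(yields_true, yields_pred))
print(accuracy(crops_true, crops_pred))
print(macro_f1(crops_true, crops_pred))
```

With heavily imbalanced classes, macro F1 and accuracy can diverge noticeably, which is one reason to report both.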
-
DISCUSSION
-
Advantages
The CropVerse platform presents a number of benefits compared to traditional farming decision-support systems:
Dual ML Integration: Integrates both Random Forest Regression and Classification, allowing crop yield prediction and crop recommendation within a single system.
User-Centric Interface: The web UI uses Tailwind CSS and responsive design, which is intended to make it easy to use for farmers with little technical expertise.
Modular Architecture: The backend architecture facilitates future extensibility, enabling integration of further data layers such as pest infestations or market prices.
Data Logging and Traceability: All inputs from the users and prediction output are logged in a SQLite database, enabling transparency, record keeping, and future model improvement.
-
Shortcomings and Future Research
Although it performs well in internal testing, CropVerse has several limitations in its present state:
Static Soil Inputs: The model relies on user-entered NPK and pH levels. Without automated sensors or soil laboratory analysis, prediction quality varies with the accuracy of these inputs.
Limited Crop Classes: The current recommendation engine has been trained on a limited set of crops (22 classes). It must be extended to cover region-specific vegetables, pulses, and horticulture.
No Multilingual Support: Currently, the application is in English only. Support for regional languages would make the tool more accessible to farmers in multilingual regions.
Internet Dependence: Because the platform is web-based and offline support is not yet available, usage may be restricted in low-connectivity zones.
-
Potential Applications
Because of its modular nature and forecasting capability, CropVerse can be extended for a variety of smart farming applications:
Fertilizer Advisory: Suggest balanced doses of fertilizer depending on crop and soil information.
Water Management: Can be combined with rainfall prediction to provide irrigation guidance.
Pest/Disease Forecasting: With the right datasets, modules can be implemented to forecast pest infestations.
Government Integration: Can be integrated with agricultural programs to provide subsidy eligibility or crop insurance recommendations.
Market Forecasting: Use mandi price feeds to recommend high-margin crops.
-
-
CONCLUSION
CropVerse brings together machine learning and accessible web technology for precision agriculture. By combining crop yield prediction and crop recommendation in one easy-to-use platform, it addresses important decision-making challenges faced by farmers. The system's Random Forest models, coupled with live weather inputs, achieved good accuracy in internal testing for both yield prediction (R² = 0.86) and crop classification (99.2% accuracy). Its lightweight design, interactive frontend, and localized input fields make it well suited to deployment in remote and semi-urban areas with poor infrastructure. Apart from improving agricultural output, the solution also supports environmentally friendly farming practices. Future work on CropVerse includes a mobile application, multilingual support, IoT-based soil sensor integration, and periodic model updates to keep pace with shifting agricultural trends. It therefore illustrates how data-enabled technology can help even small farmers make efficient, climate-smart, and well-informed decisions.
REFERENCES
[1] Food and Agriculture Organization (FAO), "India Agriculture Report," 2023. [Online]. Available: https://www.fao.org
[2] Flask Documentation, "Flask Web Framework." [Online]. Available: https://flask.palletsprojects.com/en/2.0.x/
[3] SQLite Documentation, "SQLite Database Engine." [Online]. Available: https://www.sqlite.org/docs.html
[4] L. Breiman, "Random Forests," Machine Learning, vol. 45, no. 1, pp. 5–32, 2001. [Online]. Available: https://doi.org/10.1023/A:1010933404324
[5] OpenWeatherMap, "Weather API Documentation," 2023. [Online]. Available: https://openweathermap.org/api
[6] Scikit-learn, "RandomForestRegressor Documentation," 2023. [Online]. Available: https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.RandomForestRegressor.html
[7] H. Bhojani and C. Bhatt, "Crop Recommendation System," in Proc. IEEE Int. Conf. on Information Systems Security (ICISS), 2020. [Online]. Available: https://ieeexplore.ieee.org/document/9351458
[8] P. Sharma, N. Gupta, and D. Singh, "Ensemble Methods for Agricultural Yield Prediction," IEEE Transactions on Agricultural Engineering, vol. 15, no. 4, pp. 112–125, 2021. [Online]. Available: https://doi.org/10.1109/TIAE.2021.3059267
[9] A. Ingle, "Crop Recommendation Dataset," Kaggle, 2020. [Online]. Available: https://www.kaggle.com/datasets/atharvaingle/crop-recommendation-dataset
[10] A. Gupta, "Crop Yield in Indian States Dataset," Kaggle, 2021. [Online]. Available: https://www.kaggle.com/datasets/akshatgupta7/crop-yield-in-indian-states-dataset
