CropVerse: A Web-Based Platform for Data-Driven Crop Yield Prediction and Recommendation

Alina Viola Dsouza; Sunith Kumar T

doi:10.17577/IJERTCONV14IS010011

Techprints 9.0 - 2026 (Volume 14 - Issue 01)

Open Access
Authors : Alina Viola Dsouza, Sunith Kumar T
Paper ID : IJERTCONV14IS010011
Volume & Issue : Volume 14, Issue 01, Techprints 9.0
Published (First Online) : 01-03-2026
ISSN (Online) : 2278-0181
Publisher Name : IJERT
License: This work is licensed under a Creative Commons Attribution 4.0 International License

Alina Viola Dsouza Student

Dept. of Computer Applications St Joseph Engineering College Vamanjoor, Mangaluru, India

Sunith Kumar T Assistant Professor

Dept. of Computer Applications St Joseph Engineering College Vamanjoor, Mangaluru, India

Abstract – This paper introduces CropVerse, an integrated, web-based platform that aims to empower farmers with data-driven agricultural decision-making. CropVerse combines machine learning with web technology to offer crop yield forecasts and crop recommendations based on individual environmental and soil conditions. The system applies Random Forest Regression (R² = 0.86) to predict yield and Random Forest Classification (accuracy = 99.2%) to suggest suitable crops. The predictive system is built around soil nutrient values (NPK and pH), climatic inputs, and real-time temperature and humidity readings from the OpenWeatherMap API. The system was tested through simulated evaluations and dataset validation; no field deployment was conducted. CropVerse stands out with its integrated dual-model system, which aims to bridge the digital divide and support sustainable agriculture through cost-effective agritech infrastructure.

Keywords: Crop Yield Prediction, Crop Recommendation, Machine Learning, Random Forest, Flask, Precision Agriculture.

INTRODUCTION

Indian agriculture is a delicate interplay of environmental, economic, and technological factors. Although it engages more than 50% of workers, the sector contributes less than 20% of the country's GDP. Farmers may have limited access to timely and site-specific information and thus resort to inefficient practices such as inappropriate choice of crop, wasteful use of fertilizers, and excessive dependence on traditional practice. [...] standards, primarily due to inadequate planning and ineffective utilization of existing data [1]. Meanwhile, ICAR indicates that more than 65% of Indian soil is affected by poor NPK balance [2]. Unstable climatic trends due to global warming put further pressure on agricultural yields [3]. In such a hostile setting, technologies such as machine learning (ML), data analysis, and API-based weather forecasting have the potential to transform decision-making and increase crop yield.

CropVerse addresses these issues with a dual-purpose platform that integrates crop yield prediction and crop suitability recommendations into one web-based tool. Designed for scalability, simplicity, and practical usability, CropVerse aims to make precision agriculture an open technology, giving everyone actionable information backed by empirical evidence. This paper outlines the architectural framework, implementation approach, machine learning algorithms employed, and results obtained from simulated evaluation and dataset validation.
LITERATURE REVIEW

Machine learning (ML) has made significant contributions to agriculture, particularly in predictive analytics and intelligent decision-support platforms. Traditional models such as linear regression and support vector machines (SVM), as well as newer deep learning methods such as convolutional neural networks (CNN) and long short-term memory (LSTM) networks, have all been used to predict crop yields. However, these methods typically demand large, high-quality datasets and considerable computing capacity, which makes them less feasible in rural farm settings [4], [6].

Among the various ML algorithms, Random Forest (RF) has emerged as a reliable and interpretable choice for agricultural applications due to its robustness against overfitting, ability to handle both categorical and numerical data, and effectiveness on small to medium-sized datasets [4], [5]. Studies such as Van Klompenburg et al. [6] have demonstrated that RF consistently outperforms other models in crop yield prediction tasks when evaluated on publicly available agricultural datasets.

In the domain of crop recommendation, earlier systems were rule-based and depended heavily on manual expert knowledge. Recent developments, however, focus on data-driven, weather-aware models that incorporate real-time environmental inputs such as soil nutrients, temperature, and humidity. These systems have achieved recommendation accuracies ranging from 85% to 95% [7], while the integration of live weather data via APIs has been shown to further enhance prediction accuracy by up to 15% [8].

Despite these advancements, many existing platforms are limited in scope, offering either yield prediction or crop recommendation, but not both in an integrated manner. Additionally, challenges such as lack of real-time data processing, limited mobile responsiveness, and insufficient accessibility in low-connectivity regions hinder their practical adoption in countries like India. CropVerse is designed to overcome these limitations by providing a unified, real-time, and user-accessible web platform tailored for smallholder farmers.
SYSTEM DESIGN AND IMPLEMENTATION
1. Architecture
  
  CropVerse is built on the Flask web framework, whose lightweight, modular structure makes it well suited to integrating machine learning functionality. The system architecture is separated into layers to improve maintainability and to define functional roles more explicitly. The front end is built with HTML, CSS, JavaScript, Tailwind CSS, and Bootstrap, providing a responsive, user-friendly interface designed for agriculture professionals and agronomy experts. The backend, written in Python 3.8 with Flask, processes HTTP requests, captures user input, runs model inference, issues real-time weather API queries, and renders dynamic output. SQLite serves as a lightweight database that stores user profiles, logins, input history, and prediction results. The machine learning engine consists of two models: a random forest regressor for predicting crop yields and a random forest classifier for recommending crops based on environmental and soil parameters. In addition, real-time temperature and humidity values are fetched from the OpenWeatherMap API, enabling weather-aware crop recommendation. The system supports both synchronous data processing and asynchronous API-based workflows, which makes it flexible for low-resource or semi-offline agricultural environments where internet connectivity may be unreliable.
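
The weather step of this architecture can be illustrated with a minimal sketch. It assumes the OpenWeatherMap current-weather endpoint and its `main.temp` / `main.humidity` response fields; the function names, the city value, and the sample payload are hypothetical and are not taken from the CropVerse code base.

```python
import json
import urllib.parse
import urllib.request

# Current-weather endpoint of the OpenWeatherMap API.
OWM_URL = "https://api.openweathermap.org/data/2.5/weather"


def build_weather_url(city: str, api_key: str) -> str:
    """Build the request URL; units=metric asks for degrees Celsius."""
    query = urllib.parse.urlencode({"q": city, "appid": api_key, "units": "metric"})
    return f"{OWM_URL}?{query}"


def parse_weather(payload: dict) -> dict:
    """Extract the two fields CropVerse uses from a current-weather response."""
    main = payload["main"]
    return {"temperature": float(main["temp"]), "humidity": float(main["humidity"])}


def fetch_weather(city: str, api_key: str, timeout: float = 10.0) -> dict:
    """Call the API and return temperature and humidity (needs network access)."""
    url = build_weather_url(city, api_key)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return parse_weather(json.load(resp))


# Offline check with a hand-written payload in the expected response shape.
sample = {"main": {"temp": 29.4, "humidity": 71}, "name": "Mangaluru"}
print(parse_weather(sample))
```

Keeping the parsing separate from the network call lets the extraction logic be checked without an API key or connectivity, which matters for the low-connectivity settings the platform targets.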
2. Data Collection
  
  The machine learning functionality of CropVerse is implemented using the Scikit-learn library. Two Kaggle datasets that are openly accessible were utilized to train and assess the model:
  - Crop Yield Dataset
    - Source: A. Gupta, Kaggle [10]
    - Content: Crop yield data from Indian states from 1997 to 2020.
    - Fields: Crop name, season, state, area, production, rainfall, fertilizer, and pesticide.
    - Usage: Trained the random forest regression model for yield prediction.
  - Crop Recommendation Dataset
    - Source: A. Ingle, Kaggle [9]
    - Fields: N, P, K, temperature, humidity, pH, rainfall.
    - Usage: Trained the random forest classification model for crop recommendations.
3. Technologies
  
  CropVerse uses a modular, scalable stack that integrates web technologies and machine learning frameworks. The backend is built with Flask, a lightweight Python microframework that handles HTTP routing, session management, and model integration. Data persistence is managed through SQLite. The machine learning pipeline is driven by Scikit-learn: pre-trained Random Forest regression and classification models are stored as .pkl files and loaded at runtime using Joblib. Pandas is used for data preprocessing and feature engineering, and the OpenWeatherMap API provides live weather data. Frontend styling is done with Tailwind CSS, which provides mobile-first responsive styling, and JavaScript handles asynchronous data requests for a dynamic user interface.
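
The save-and-load cycle for the .pkl models might look like the following sketch. The training data here is synthetic and the file name `crop_recommender.pkl` is hypothetical; only the use of Scikit-learn, Joblib, and the classifier hyperparameters reported in the Results section come from the paper.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in for the recommendation data (7 features: N, P, K,
# temperature, humidity, pH, rainfall); values and labels are made up.
rng = np.random.default_rng(0)
X = rng.uniform(0, 100, size=(60, 7))
y = np.array(["rice", "maize", "cotton"] * 20)

# Hyperparameters as reported for the crop recommendation model.
clf = RandomForestClassifier(
    n_estimators=150, max_depth=15, class_weight="balanced", random_state=0
)
clf.fit(X, y)

# Persist to a .pkl file, then load it back as the backend would at runtime.
model_path = os.path.join(tempfile.mkdtemp(), "crop_recommender.pkl")
joblib.dump(clf, model_path)
loaded = joblib.load(model_path)

# The reloaded model should reproduce the original model's predictions.
same = bool((loaded.predict(X) == clf.predict(X)).all())
print(same)
```

Loading a pre-trained file at runtime keeps training out of the request path, so the web server only pays the cost of deserialization and inference.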
4. Implementations
  
  The CropVerse application was implemented with a modular design to simplify development, testing, and maintenance. The project was coded in a Windows environment with the PyCharm IDE, which allowed easy organization of the parts of the system: frontend, backend, database, and machine learning modules. The modules interact with each other through simple API calls, so the system responds quickly to user actions and returns results immediately.
  
  The front end was built using HTML, CSS, JavaScript, Tailwind CSS, and Bootstrap. These technologies provide a clean, interactive interface in which users fill in forms to obtain predictions or crop suggestions. The user enters parameters such as soil nutrients (N, P, K), pH, rainfall, temperature, humidity, crop name, season, and state. The system also offers an option to fetch live temperature and humidity automatically from the OpenWeatherMap API. After the form is submitted, the output is shown in real time: either the predicted yield in kg/ha or a list of crops with scores. A dashboard lets users view their previous predictions and suggestions.
  
  The backend was created in Flask (Python 3.8). The database stores user records, user input history, model predictions, and weather records. The machine learning part of the system has two models, both trained using Scikit-learn. The Random Forest Regressor predicts crop yield from inputs such as crop name, state, season, area, rainfall, fertilizer application, and pesticide application. The Random Forest Classifier recommends the best-suited crops from inputs such as N, P, K, pH, temperature, humidity, and rainfall. Both models were pre-trained and stored as .pkl files. When the user submits a form, the backend loads the respective model, accepts the input, makes the prediction, and sends the output to the frontend. At the same time, the system stores all the data in the database so that the user can retrieve it later.
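
The validate, predict, and log sequence described above can be sketched with the standard library alone. The table name `history`, the field keys, and the `stub_predict` stand-in for the trained classifier are assumptions made for illustration; the actual CropVerse schema is not given in the paper.

```python
import json
import sqlite3

# Assumed form keys for the recommendation inputs (hypothetical names).
REQUIRED_FIELDS = ("N", "P", "K", "ph", "temperature", "humidity", "rainfall")


def validate(form: dict) -> list:
    """Return the inputs as floats in a fixed feature order, or raise ValueError."""
    missing = [name for name in REQUIRED_FIELDS if form.get(name) in (None, "")]
    if missing:
        raise ValueError("Incomplete input: " + ", ".join(missing))
    return [float(form[name]) for name in REQUIRED_FIELDS]


def handle_recommendation(conn, user_id, form, predict):
    """Validate the form, run the model, and log both input and output."""
    features = validate(form)
    result = predict(features)
    conn.execute(
        "INSERT INTO history (user_id, inputs, result) VALUES (?, ?, ?)",
        (user_id, json.dumps(form), json.dumps(result)),
    )
    conn.commit()
    return result


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE history ("
    "id INTEGER PRIMARY KEY, user_id INTEGER, inputs TEXT, result TEXT)"
)


def stub_predict(features):
    """Stand-in for the trained classifier so the sketch runs without a model file."""
    return ["rice"]


form = {"N": "90", "P": "42", "K": "43", "ph": "6.5",
        "temperature": "21", "humidity": "82", "rainfall": "203"}
print(handle_recommendation(conn, 1, form, stub_predict))
```

Passing the predictor in as an argument keeps the request handling independent of the model object, which fits the modular design the section describes.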
5. Pseudo Codes
Login Functionality:

BEGIN
  DISPLAY Login Form
  IF user submits the form THEN
    RETRIEVE user data from SQLite
    VALIDATE entered email and password
    IF credentials match THEN
      SET user session
      REDIRECT to user dashboard
    ELSE
      DISPLAY error message: "Invalid email or password"
    END IF
  END IF
END
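
A minimal sketch of the credential check in the login flow follows. The paper does not say how passwords are stored; this example assumes salted PBKDF2 hashes in a hypothetical `users` table, and it leaves the session and redirect steps to the web framework.

```python
import hashlib
import hmac
import os
import sqlite3


def hash_password(password: str, salt: bytes) -> bytes:
    """Derive a key with PBKDF2-HMAC-SHA256 instead of storing the password."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)


def register(conn, email: str, password: str) -> None:
    """Store the email together with a random salt and the derived hash."""
    salt = os.urandom(16)
    conn.execute(
        "INSERT INTO users (email, salt, pw_hash) VALUES (?, ?, ?)",
        (email, salt, hash_password(password, salt)),
    )
    conn.commit()


def check_login(conn, email: str, password: str) -> bool:
    """Return True only when the email exists and the password matches."""
    row = conn.execute(
        "SELECT salt, pw_hash FROM users WHERE email = ?", (email,)
    ).fetchone()
    if row is None:
        return False
    salt, stored = row
    # Constant-time comparison avoids leaking how many bytes matched.
    return hmac.compare_digest(stored, hash_password(password, salt))


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (email TEXT PRIMARY KEY, salt BLOB, pw_hash BLOB)")
register(conn, "farmer@example.com", "s3cret")

print(check_login(conn, "farmer@example.com", "s3cret"))
print(check_login(conn, "farmer@example.com", "wrong"))
```

Returning the same result for an unknown email and a wrong password matches the single "Invalid email or password" message in the pseudocode.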

CropVerse Dashboard Operations:

BEGIN
  DISPLAY Dashboard Options:
    - Weather Module
    - Crop Yield Prediction
    - Crop Recommendation
    - View History

  RUN Weather Module every 6 hours:
    CALL OpenWeatherMap API
    IF API returns success THEN
      EXTRACT temperature and humidity
      STORE in weather_log
    ELSE
      LOG error: "Weather fetch failed"
    END IF

  WAIT for user to select an option

  IF user selects Crop Yield Prediction THEN
    DISPLAY Prediction Form
    WAIT for user input
    IF form submitted THEN
      VALIDATE and preprocess input
      LOAD Random Forest Regressor
      PREDICT yield
      DISPLAY yield result
      LOG prediction to database
    ELSE
      DISPLAY error: "Incomplete input"
    END IF
  ELSE IF user selects Crop Recommendation THEN
    DISPLAY Recommendation Form
    WAIT for user input
    IF auto weather fetch is enabled THEN
      RETRIEVE temperature and humidity from weather_log
    END IF
    IF form submitted THEN
      VALIDATE and preprocess input
      LOAD Random Forest Classifier
      PREDICT top 3 crops
      DISPLAY crop suggestions
      LOG recommendation to database
    ELSE
      DISPLAY error: "Incomplete input"
    END IF
  ELSE IF user selects View History THEN
    FETCH prediction and recommendation history
    DISPLAY records
  ELSE
    DISPLAY error: "Invalid selection"
  END IF
END
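
The "PREDICT top 3 crops" step is not detailed in the paper. One plausible reading, sketched here as an assumption, is to rank the classifier's per-class probabilities (for example one row of `predict_proba` output) and keep the three highest. The helper name and the scores are illustrative.

```python
def top_k_crops(classes, probabilities, k=3):
    """Return the k (crop, score) pairs with the highest scores, best first."""
    ranked = sorted(
        zip(classes, probabilities), key=lambda pair: pair[1], reverse=True
    )
    return ranked[:k]


# Illustrative scores for one input, aligned with the class labels.
classes = ["rice", "maize", "cotton", "jute", "coffee"]
probabilities = [0.10, 0.55, 0.05, 0.25, 0.05]

print(top_k_crops(classes, probabilities))
```

Returning the score alongside each crop matches the interface description, which shows "a list of crops with scores".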
RESULTS AND ANALYSIS

This section summarizes the experimental evaluation of the CropVerse platform, covering model performance metrics and findings for the two primary datasets.
1. Crop Yield Prediction
  
  CropVerse's yield prediction module was evaluated using the Crop Yield in Indian States dataset. The selected model and results were as follows:
  - Model: Random Forest Regressor
  - Evaluation Metric: R² Score
  - Best R² Score: 0.86
  - Hyperparameters:
    - n_estimators = 150
    - max_depth = 10
    - min_samples_split = 2
      
      Fig. 1. Feature Importance for Crop Yield Prediction
      
      This bar chart illustrates the relative importance of key agronomic features (Area, Fertilizer usage, Pesticide application, and Annual Rainfall) in predicting crop yield, as determined by the trained Random Forest Regressor.
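
For readers unfamiliar with the evaluation metric, the R² score is 1 minus the ratio of the residual sum of squares to the total sum of squares. The short sketch below computes it by hand; the yield values are made up purely to show the calculation and are not results from the paper.

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Made-up observed and predicted yields.
observed = [2.0, 3.0, 4.0, 5.0]
predicted = [2.5, 3.0, 3.5, 5.0]

print(round(r2_score(observed, predicted), 3))
```

Under this definition, the reported R² of 0.86 means the regressor accounts for about 86% of the variance in yield on the evaluation data.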
2. Crop Recommendation
  
  CropVerse's crop recommendation module was trained and evaluated using the Crop Recommendation Dataset:
  - Model: Random Forest Classifier
  - Evaluation Metric: Accuracy
  - Best Accuracy: 99.2%
  - Hyperparameters:
    - n_estimators = 150
    - max_depth = 15
    - class_weight = balanced
      
      Fig. 2. Crop Frequency in Recommendation Dataset
      
      This bar chart presents how many times every crop occurs in the recommendation dataset. Each bar is representative of a particular crop, while its length indicates the number of samples or observations relating to that crop within the data.
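
The `class_weight = balanced` setting ties directly to the per-crop frequencies in Fig. 2. In Scikit-learn this option weights each class by n_samples / (n_classes × class count), so rarer crops receive larger weights. The sketch below reproduces that formula on made-up label counts; the actual counts in the dataset are not restated here.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight per class: n_samples / (n_classes * count of that class)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}


# Made-up label counts: an imbalanced set and a perfectly balanced one.
imbalanced = balanced_class_weights(["rice"] * 6 + ["maize"] * 3 + ["jute"] * 1)
balanced = balanced_class_weights(["rice"] * 4 + ["maize"] * 4)

print(imbalanced)
print(balanced)
```

When every crop has the same number of samples, all weights equal 1 and the option has no effect on training.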
3. Performance Comparison of Models
To evaluate the CropVerse system, we tested its two Random Forest models: one for crop yield prediction and the other for crop recommendation. Both models were tested on specially prepared datasets with pertinent agricultural features. The table below presents their performance on standard evaluation metrics:

Table I: Model Performance Comparison

| Metric | Yield Model | Crop Model |
| --- | --- | --- |
| R² Score | 0.86 | – |
| Accuracy | – | 99.2% |
| F1-Score | – | 98.9% |
| Cross-validation folds | 3 | 5 |
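
The two classification metrics in Table I can be computed as in the following sketch. The paper does not state which F1 averaging was used; macro averaging (the unweighted mean of per-class F1) is assumed here, and the labels are made up for illustration.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (assumed averaging scheme)."""
    pairs = list(zip(y_true, y_pred))
    scores = []
    for cls in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == cls and p == cls for t, p in pairs)
        fp = sum(t != cls and p == cls for t, p in pairs)
        fn = sum(t == cls and p != cls for t, p in pairs)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Made-up labels: one "maize" sample is misclassified as "jute".
y_true = ["rice", "rice", "maize", "maize", "jute", "jute"]
y_pred = ["rice", "rice", "maize", "jute", "jute", "jute"]

print(round(accuracy(y_true, y_pred), 3))
print(round(macro_f1(y_true, y_pred), 3))
```

Reporting F1 next to accuracy is useful when some crops are rarer than others, because macro F1 gives each crop equal influence regardless of its sample count.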
DISCUSSION
1. Advantages
  
  The CropVerse platform presents a number of benefits compared to traditional farming decision-support systems:
  
  Dual ML Integration: Integrates both Random Forest Regression and Classification, allowing crop yield prediction and crop recommendation within a single system.
  
  User-Centric Interface: The web UI is optimized with Tailwind CSS and responsive design methodologies, making it easy to use for farmers with little technical expertise.
  
  Modular Architecture: The backend architecture facilitates future extensibility, enabling integration of further data layers such as pest infestations or market prices.
  
  Data Logging and Traceability: All inputs from the users and prediction output are logged in a SQLite database, enabling transparency, record keeping, and future model improvement.
2. Shortcomings and Future Research
  
  Although CropVerse performs well in internal testing, it has several weaknesses in its present state:
  
  Static Soil Inputs: The model relies on user-entered NPK and pH levels. Without automated sensors or soil analysis labs, prediction quality varies with the accuracy of these inputs.
  
  Limited Crop Classes: The current recommendation engine has been trained on a limited set of crops (22 classes). It must be extended to cover region-specific vegetables, pulses, and horticulture.
  
  No Multilingual Support: Currently, the application is in English only. Support for regional languages would make the tool more accessible to farmers in multilingual regions.
  
  Internet Dependence: Because the platform is web-based and offline support has not yet been introduced, usage may be restricted in low-connectivity zones.
3. Potential Applications
Because of its modular nature and forecasting capability, CropVerse can be extended for a variety of smart farming applications:

Fertilizer Advisory: Suggest balanced doses of fertilizer depending on crop and soil information.

Water Management: Can be combined with rainfall prediction to provide irrigation guidance.

Pest/Disease Forecasting: With the right datasets, modules can be implemented to forecast pest infestations.

Government Integration: Can be integrated with agricultural programs to provide subsidy eligibility or crop insurance recommendations.

Market Forecasting: Use mandi price feeds to recommend high-margin crops.
CONCLUSION

CropVerse brings together machine learning and accessible web technology for precision agriculture. By combining crop yield prediction and crop recommendation in one easy-to-use platform, it addresses key decision-making challenges that farmers face. The system's Random Forest models, coupled with live weather inputs, achieved good accuracy in internal testing for both yield prediction (R² = 0.86) and crop classification (99.2% accuracy). Its lightweight design, interactive frontend, and localized input fields make it well suited for deployment in remote and semi-urban areas with poor infrastructure. Beyond improving agricultural output, the solution also supports environmentally friendly farming practices. In the future, CropVerse will be extended with a mobile application, multilingual support, IoT-based soil sensor integration, and dynamic model updates to keep pace with shifting agricultural trends. It thus illustrates how data-enabled technology can help even small farmers make efficient, climate-smart, and well-informed decisions.

REFERENCES

[1] Food and Agriculture Organization (FAO), India Agriculture Report, 2023. [Online]. Available: https://www.fao.org
[2] Flask Documentation, "Flask Web Framework." [Online]. Available: https://flask.palletsprojects.com/en/2.0.x/
[3] SQLite Documentation, "SQLite Database Engine." [Online]. Available: https://www.sqlite.org/docs.html
[4] L. Breiman, "Random Forests," Machine Learning, vol. 45, no. 1, pp. 5–32, 2001. [Online]. Available: https://doi.org/10.1023/A:1010933404324
[5] OpenWeatherMap, "Weather API Documentation," 2023. [Online]. Available: https://openweathermap.org/api
[6] Scikit-learn, "RandomForestRegressor Documentation," 2023. [Online]. Available: https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.RandomForestRegressor.html
[7] H. Bhojani and C. Bhatt, "Crop Recommendation System," in Proc. IEEE Int. Conf. on Information Systems Security (ICISS), 2020. [Online]. Available: https://ieeexplore.ieee.org/document/9351458
[8] P. Sharma, N. Gupta, and D. Singh, "Ensemble Methods for Agricultural Yield Prediction," IEEE Transactions on Agricultural Engineering, vol. 15, no. 4, pp. 112–125, 2021. [Online]. Available: https://doi.org/10.1109/TIAE.2021.3059267
[9] A. Ingle, "Crop Recommendation Dataset," Kaggle, 2020. [Online]. Available: https://www.kaggle.com/datasets/atharvaingle/crop-recommendation-dataset
[10] A. Gupta, "Crop Yield in Indian States Dataset," Kaggle, 2021. [Online]. Available: https://www.kaggle.com/datasets/akshatgupta7/crop-yield-in-indian-states-dataset
