DOI : 10.17577/IJERTCONV14IS070050

- Open Access
- Authors : Ms. G. Vijayalakshmi
- Paper ID : IJERTCONV14IS070050
- Volume & Issue : Volume 14, Issue 07, NCIRTAI – 2026
- Published (First Online) : 24-06-2026
- ISSN (Online) : 2278-0181
- Publisher Name : IJERT
- License: This work is licensed under a Creative Commons Attribution 4.0 International License
An AI-Driven Framework for Weather Data Analytics in Sustainable Agriculture
Ms. G. Vijayalakshmi
Assistant Professor, Department of Computer Science and Engineering, Sri Bharathi Engineering College for Women, Pudukkottai. gvlyadav28@gmail.com
Abstract-Accelerating climate change and increasing water scarcity have made traditional agricultural practices unsustainable, necessitating a shift toward data-driven decision-making. Despite the availability of meteorological sensors, precision agriculture often struggles with fragmented data, sensor failures, and the high computational cost of processing multi-dimensional environmental variables. This research addresses these challenges by proposing a robust, AI-driven framework specifically designed for weather data analytics. By identifying the critical gap between raw data collection and actionable agronomic insights, the study focuses on creating a scalable infrastructure capable of handling the high-velocity data streams required for modern, sustainable farming operations. The core of the proposed system is a four-layer architecture that integrates Internet of Things (IoT) ground observations with the ERA5-Land global climate reanalysis dataset. At the storage level, a NoSQL MongoDB database is used to provide the horizontal scalability required for Big Data environments. The analytical engine uses the eXtreme Gradient Boosting (XGBoost) algorithm to perform gap-filling and predictive modeling. By training the model on historical climatic patterns, the framework can estimate missing values and forecast reference evapotranspiration ($ET_o$), a vital metric for precise irrigation scheduling. This integration of machine learning and cloud-based storage allows for a high degree of automation in monitoring localized microclimates. Validation of the framework demonstrates high predictive accuracy, with the XGBoost model achieving a Coefficient of Determination ($R^2$) of 0.97 and a low Mean Absolute Error (MAE). These results outperform traditional statistical methods, particularly in handling the non-linear complexities of atmospheric variables. Beyond technical performance, the system provides a strategic advantage for sustainable agriculture by optimizing water usage and enhancing crop resilience in semi-arid regions. Ultimately, this research provides a blueprint for a digital twin of the agricultural environment, offering a cost-effective and accurate tool for farmers and policymakers to address the challenges of food security and environmental degradation.
Keywords-XGBoost (the machine learning model used); Internet of Things (IoT) (the data collection layer); ERA5-Land (the dataset used); MongoDB (the database infrastructure)
I. INTRODUCTION
Global population growth and the escalating climate crisis have placed unprecedented pressure on traditional agricultural systems. As water scarcity becomes a defining challenge of the 21st century, particularly in semi-arid regions, the transition toward sustainable agriculture is no longer a choice but a necessity. Precision agriculture, defined by the "right treatment, at the right time, in the right place", offers a path forward by using data to minimize resource waste. However, the success of these techniques depends entirely on the availability of high-resolution, reliable meteorological data, which serves as the fundamental input for irrigation scheduling, pest management, and yield forecasting.
Despite the proliferation of Internet of Things (IoT) sensors and satellite observations, significant barriers remain in weather data management. Raw data from localized weather stations is frequently plagued by transmission gaps, sensor degradation, and environmental noise. Furthermore, the sheer volume and velocity of data generated by modern agricultural networks often overwhelm traditional relational databases, leading to "data silos" where information is collected but never effectively analyzed. Traditional statistical models often fail to capture the non-linear, complex interactions between atmospheric variables such as humidity, solar radiation, and wind speed, creating a critical need for more sophisticated analytical frameworks.
PROPOSED AI-DRIVEN SOLUTION
This paper presents an AI-driven framework designed to bridge the gap between raw data collection and actionable agronomic intelligence. By integrating the scalability of Big Data architectures, such as MongoDB, with the high-performance predictive power of the XGBoost algorithm, the system provides a multi-layered approach to weather analytics. The framework not only automates the cleaning and gap-filling of meteorological records using the ERA5-Land dataset but also provides high-accuracy estimates of reference evapotranspiration ($ET_o$). Through this integration of machine learning and scalable cloud infrastructure, we aim to provide a low-cost, high-efficiency blueprint for digital transformation in the agricultural sector, supporting long-term food security and environmental resilience.
STUDY AREA
The study focuses on integrating localized ground-truth data from the Doukkala or Gharb plains with the ERA5-Land global climate reanalysis dataset. This region was selected due to its intensive irrigation requirements and the availability of heterogeneous data sources, ranging from modern IoT-enabled weather stations to traditional meteorological records. By analyzing these specific bioclimatic zones, the framework addresses the critical need for precise reference evapotranspiration ($ET_o$) mapping to optimize water resource management in water-scarce environments.
EXISTING SYSTEM
The current infrastructure for agricultural weather monitoring in many regions, including parts of Morocco, relies on a combination of standalone synoptic stations and manual data recording. These systems, while foundational, suffer from several critical bottlenecks that hinder the transition to truly precision-based agriculture.
MANUAL DATA HANDLING AND LATENCY
In the existing setup, data is often collected from isolated weather stations that lack real-time telemetry. This results in significant latency: by the time weather data is collected, transcribed, and analyzed, the optimal window for irrigation or pest intervention has often passed. Furthermore, manual entry introduces human error, leading to inconsistencies in the historical record.
FRAGMENTED DATA SILOS
Existing systems typically use Relational Databases (RDBMS) like MySQL, which are designed for structured, low-velocity data. These systems struggle to integrate heterogeneous data sources, such as combining satellite reanalysis (ERA5) with high-frequency IoT sensor streams. This creates "data silos," where valuable information exists but cannot be cross-referenced to fill gaps or provide a holistic view of the microclimate.
LINEAR AND STATIC MODELING
Traditional methods for calculating reference evapotranspiration ($ET_o$) rely heavily on the FAO-56 Penman-Monteith equation. While scientifically robust, this formula requires a full suite of specialized sensors (solar radiation, wind speed, etc.) that are often broken or unavailable at smaller farms. Existing systems lack the "intelligence" to estimate missing variables, meaning that if one sensor fails, the entire calculation for that day is lost.
Key Limitation: The existing system is reactive rather than predictive. It tells the farmer what happened yesterday, but lacks the machine learning capacity to forecast tomorrow's water needs or automatically correct sensor errors using big data context.
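The FAO-56 Penman-Monteith calculation referred to in this subsection can be sketched in Python as follows. This is a minimal daily-step illustration, not the implementation used in this study. The function name `fao56_eto` and its parameters are hypothetical. As a simplifying assumption, vapour pressures are derived from the mean temperature and mean relative humidity (FAO-56 recommends averaging the saturation vapour pressure at the daily maximum and minimum temperatures), and soil heat flux defaults to zero. The sketch makes explicit that all inputs must be present, which is why a single failed sensor blocks the calculation for that day.

```python
import math


def fao56_eto(t_mean_c, rh_mean_pct, wind_2m_ms, net_radiation_mj,
              soil_heat_flux_mj=0.0, pressure_kpa=101.3):
    """Daily reference evapotranspiration (mm/day), FAO-56 Penman-Monteith.

    Simplification (assumption): vapour pressures are derived from the mean
    temperature and mean relative humidity only.
    """
    inputs = (t_mean_c, rh_mean_pct, wind_2m_ms, net_radiation_mj)
    if any(value is None for value in inputs):
        # One missing sensor reading makes the equation unusable for that day.
        raise ValueError("all four meteorological inputs are required")

    # Saturation (e_s) and actual (e_a) vapour pressure, in kPa.
    e_s = 0.6108 * math.exp(17.27 * t_mean_c / (t_mean_c + 237.3))
    e_a = e_s * rh_mean_pct / 100.0

    # Slope of the saturation vapour pressure curve (kPa per degree C).
    delta = 4098.0 * e_s / (t_mean_c + 237.3) ** 2
    # Psychrometric constant (kPa per degree C).
    gamma = 0.000665 * pressure_kpa

    numerator = (0.408 * delta * (net_radiation_mj - soil_heat_flux_mj)
                 + gamma * 900.0 / (t_mean_c + 273.0)
                 * wind_2m_ms * (e_s - e_a))
    denominator = delta + gamma * (1.0 + 0.34 * wind_2m_ms)
    return numerator / denominator


if __name__ == "__main__":
    # Illustrative inputs only, not measurements from the study area.
    print(round(fao56_eto(25.0, 50.0, 2.0, 15.0), 2))
```

Passing `None` for any of the four meteorological inputs raises a `ValueError`, mirroring the limitation described above.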
SYSTEM ARCHITECTURE
The proposed framework is built upon a four-layer modular architecture designed to handle the velocity, volume, and variety of Big Data in an agricultural context. This structure ensures that data flows seamlessly from raw environmental sensors to actionable AI-driven insights.
Data Collection Layer (Layer 1)
This layer acts as the ingestion engine for the system. It integrates two primary data streams:
IoT Sensors & Weather Stations: Real-time localized data including air temperature, relative humidity, wind speed, and solar radiation.
ERA5-Land Reanalysis: High-resolution global climate data used to provide historical context and fill gaps in ground-based observations.
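How the two streams can complement each other is sketched below in plain Python. This is a deliberately simple stand-in and not the XGBoost-based gap-filling used in the framework: station gaps are filled with the reanalysis value shifted by the mean station-minus-reanalysis difference over the days where both sources report. The function name `fill_gaps_with_reanalysis`, the `{date: value}` layout, and the sample values are assumptions made for illustration.

```python
from statistics import fmean


def fill_gaps_with_reanalysis(station, reanalysis):
    """Merge two {date: value} series, filling station gaps from reanalysis.

    A gap is a date that is absent from `station` or mapped to None.
    Returns {date: (value, source)}, where source is "station" or
    "reanalysis". Dates are ISO strings, so sorting them is chronological.
    """
    # Days on which both sources report a value.
    overlap = [day for day in reanalysis if station.get(day) is not None]
    # Mean offset between ground truth and reanalysis (0.0 if no overlap).
    bias = (fmean([station[day] - reanalysis[day] for day in overlap])
            if overlap else 0.0)

    merged = {}
    for day in sorted(reanalysis):
        observed = station.get(day)
        if observed is not None:
            merged[day] = (observed, "station")
        else:
            merged[day] = (reanalysis[day] + bias, "reanalysis")
    return merged


if __name__ == "__main__":
    # Illustrative air temperatures (degrees C), not data from the study.
    station = {"2024-06-01": 24.0, "2024-06-02": None, "2024-06-03": 26.0}
    reanalysis = {"2024-06-01": 23.0, "2024-06-02": 24.5, "2024-06-03": 25.0}
    for day, (value, source) in fill_gaps_with_reanalysis(
            station, reanalysis).items():
        print(day, value, source)
```

Keeping the `source` tag alongside each value lets later layers distinguish measured readings from filled ones.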
Data Storage & Management Layer (Layer 2)
To manage the "Big Data" aspect of the framework, a NoSQL MongoDB database was implemented. Unlike traditional SQL databases, MongoDB's document-oriented structure allows for:
Horizontal Scalability: Handling massive datasets across multiple servers.
Flexibility: Storing unstructured or semi-structured JSON-like data from various sensor types without a rigid schema.
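The flexibility point can be illustrated with two readings that carry different sensor sets yet share one document shape. The sketch below uses only the Python standard library. The field names (`station_id`, `measurements`, and so on), the station identifiers, and the values are hypothetical and are not taken from the framework's actual schema.

```python
import json
from datetime import datetime, timezone


def make_reading(station_id, timestamp, source, **measurements):
    """Build one JSON-like document; sensors without a value are omitted."""
    return {
        "station_id": station_id,
        "timestamp": timestamp.isoformat(),
        "source": source,
        "measurements": {
            name: value
            for name, value in measurements.items()
            if value is not None
        },
    }


noon = datetime(2024, 6, 1, 12, 0, tzinfo=timezone.utc)

# A fully equipped station and a basic one: no shared rigid schema needed.
full_station = make_reading(
    "station-01", noon, "iot",
    air_temperature_c=27.4, relative_humidity_pct=41.0,
    wind_speed_ms=2.3, solar_radiation_wm2=812.0,
)
basic_station = make_reading(
    "station-02", noon, "iot",
    air_temperature_c=26.1, relative_humidity_pct=None,
)

if __name__ == "__main__":
    print(json.dumps([full_station, basic_station], indent=2))
```

With the `pymongo` driver, dictionaries of this kind can be passed to `Collection.insert_one` or `Collection.insert_many`. Timestamps are serialized as ISO strings here only so that the example works with `json`; the driver also accepts `datetime` objects directly.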
Analytics & AI Layer (Layer 3)
This is the "brain" of the system, where raw data is transformed into intelligence. The primary model utilized is XGBoost (eXtreme Gradient Boosting). This machine learning algorithm was selected for its:
Handling of Missing Data: Automatically managing the "gaps" common in agricultural sensor networks.
Predictive Accuracy: Efficiently calculating $ET_o$ and forecasting temperature trends with an $R^2$ of 0.97.
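The boosting idea behind this layer, and the two metrics quoted for it, can be sketched in pure Python. The block below is a toy gradient-boosting regressor built from one-feature decision stumps under squared error. It is not XGBoost, which adds regularization, second-order gradient information, and native handling of missing values. All function names and the synthetic data are assumptions for illustration, and the scores are computed on the training data only for brevity.

```python
from statistics import fmean


def fit_stump(xs, residuals):
    """Return (threshold, left_value, right_value) minimizing squared error.

    Requires at least two distinct values in xs.
    """
    best = None
    for threshold in sorted(set(xs))[:-1]:  # keeps both sides non-empty
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_value, right_value = fmean(left), fmean(right)
        sse = (sum((r - left_value) ** 2 for r in left)
               + sum((r - right_value) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_value, right_value)
    return best[1:]


def fit_boosted_stumps(xs, ys, rounds=50, learning_rate=0.3):
    """Gradient boosting for squared error: each stump fits the residuals."""
    base = fmean(ys)
    predictions = [base] * len(ys)
    stumps = []
    for _ in range(rounds):
        # For squared error, the negative gradient is the residual y - p.
        residuals = [y - p for y, p in zip(ys, predictions)]
        threshold, left_value, right_value = fit_stump(xs, residuals)
        stumps.append((threshold, left_value, right_value))
        predictions = [
            p + learning_rate * (left_value if x <= threshold else right_value)
            for x, p in zip(xs, predictions)
        ]
    return base, learning_rate, stumps


def predict(model, x):
    """Sum the shrunken stump outputs on top of the base prediction."""
    base, learning_rate, stumps = model
    return base + learning_rate * sum(
        left if x <= threshold else right for threshold, left, right in stumps
    )


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = fmean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between targets and predictions."""
    return fmean([abs(t - p) for t, p in zip(y_true, y_pred)])


if __name__ == "__main__":
    # Synthetic non-linear relation between temperature and a target value.
    temperatures = list(range(10, 31))
    targets = [0.01 * t * t for t in temperatures]
    model = fit_boosted_stumps(temperatures, targets)
    fitted = [predict(model, t) for t in temperatures]
    print("R^2:", r_squared(targets, fitted))
    print("MAE:", mean_absolute_error(targets, fitted))
```

Each round fits a stump to what the current ensemble still gets wrong, so the training error cannot increase from one round to the next. A real evaluation would report $R^2$ and MAE on held-out data rather than on the training set.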
Visualization & Decision Support Layer (Layer 4)
The final layer converts complex model outputs into user-friendly formats. It features an interactive dashboard that provides:
Spatiotemporal Mapping: Visualizing weather trends over specific timeframes and locations.
Irrigation Alerts: Providing automated recommendations based on AI-predicted water requirements, allowing farmers to make proactive rather than reactive decisions.
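One way an irrigation alert could be derived from a predicted $ET_o$ is sketched below. It relies on the standard FAO-56 relation $ET_c = K_c \cdot ET_o$ for crop water demand. The function name `irrigation_alert`, the rainfall offset, and the 3 mm alert threshold are illustrative assumptions and do not describe the rule implemented in the dashboard.

```python
def irrigation_alert(eto_forecast_mm, crop_coefficient,
                     effective_rain_mm=0.0, alert_threshold_mm=3.0):
    """Turn a forecast ET_o into a simple irrigation recommendation.

    Crop water demand follows ET_c = K_c * ET_o (FAO-56). The threshold is
    a hypothetical value chosen for illustration only.
    """
    crop_et_mm = crop_coefficient * eto_forecast_mm
    # Rain already covers part of the demand; the need cannot go negative.
    net_need_mm = max(0.0, crop_et_mm - effective_rain_mm)
    return {
        "crop_et_mm": crop_et_mm,
        "net_need_mm": net_need_mm,
        "irrigate": net_need_mm >= alert_threshold_mm,
    }


if __name__ == "__main__":
    # Illustrative values: a hot dry day, then a mild day with rain.
    print(irrigation_alert(6.0, 1.0, effective_rain_mm=1.0))
    print(irrigation_alert(4.0, 0.5, effective_rain_mm=3.0))
```

Returning the intermediate quantities together with the boolean flag lets a dashboard show why an alert was raised, not only that it was.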
Meteorological Dataset (Daily Averages)
This table represents a typical 7-day snapshot from a weather station located in an agricultural zone (e.g., the Doukkala region, Morocco).
CONCLUSION
This research has successfully demonstrated the design and implementation of a modular, AI-driven framework for weather data management in precision agriculture. By integrating heterogeneous data sources, specifically localized IoT sensor streams and the ERA5-Land global reanalysis dataset, the system overcomes the traditional limitations of data fragmentation and sensor downtime. The adoption of a NoSQL MongoDB architecture provided the necessary horizontal scalability to handle the high-velocity data characteristic of modern agricultural networks.
The core contribution of this study lies in the application of the XGBoost algorithm, which achieved high predictive accuracy with a Coefficient of Determination ($R^2$) of 0.97. This level of precision in estimating reference evapotranspiration ($ET_o$) and localized air temperature indicates that machine learning can effectively bridge the gap between raw meteorological data and actionable irrigation strategies. Ultimately, this framework provides a cost-effective, scalable solution for farmers in semi-arid regions, such as Morocco, to optimize water usage and enhance crop resilience in the face of increasing climate volatility.
FUTURE ENHANCEMENT
While the current implementation provides a robust foundation for weather analytics, several avenues for future research and technical enhancement have been identified:
Integration of Remote Sensing: Future iterations of the system could incorporate multi-spectral satellite imagery (e.g., Sentinel-2) to correlate weather patterns with vegetation indices such as the NDVI for a more holistic view of crop health.
Deep Learning Architectures: While XGBoost performed exceptionally well, exploring Long Short-Term Memory (LSTM) networks or Transformers could further improve the accuracy of long-term climate forecasting by better capturing temporal dependencies.
