A Hybrid Multi-Modal AI Orchestration Framework for Scalable Agricultural Decision Support Systems

Keerthana K V; Prajwal M; H Amit Kumar; Deepak C Yadav; Raksha R

doi:10.17577/IJERTCONV14IS060104

ACSCON - 2026 (Volume 14 - Issue 06)

Open Access
Authors : Keerthana K V, Prajwal M, H Amit Kumar, Deepak C Yadav, Raksha R
Paper ID : IJERTCONV14IS060104
Volume & Issue : Volume 14, Issue 06, ACSCON – 2026
Published (First Online) : 15-06-2026
ISSN (Online) : 2278-0181
Publisher Name : IJERT
License: This work is licensed under a Creative Commons Attribution 4.0 International License

1st Keerthana K V

Department of Computer Science and Engineering,

JSS Science and Technology University, Mysuru, India

Email: keerthanakv08@gmail.com

2nd Prajwal M

Department of Computer Science and Engineering,

JSS Science and Technology University, Mysuru, India

Email: prajwalm0105204@gmail.com

3rd H Amit Kumar

Department of Computer Science and Engineering,

JSS Science and Technology University, Mysuru, India

Email: amith368@gmail.com

4th Deepak C Yadav

Department of Computer Science and Engineering,

JSS Science and Technology University, Mysuru, India

Email: sedeepakyadav@gmail.com

5th Raksha R

Department of Computer Science and Engineering,

JSS Science and Technology University, Mysuru, India

Email: raksha@jssstuniv.in

Abstract: Agricultural decision-making in real-world scenarios often requires combining multiple types of information, including crop conditions, disease symptoms, and market trends. However, most existing systems address these tasks independently, which limits their practical usability. In this work, we design and implement a hybrid multi-modal AI orchestration framework as part of the AgroSync platform to provide unified agricultural decision support. The proposed system integrates three key components: a language-based advisory module, a vision-based crop disease analysis module, and a statistical forecasting module based on ARIMA. Instead of treating these components as separate solutions, we introduce an orchestration pipeline that manages request classification, contextual data injection, model selection, and structured response generation. The architecture follows a microservice-based design to ensure modularity and scalability, making it suitable for deployment in resource-constrained rural environments. The forecasting module uses an ARIMA(5,1,0) configuration to analyze commodity price trends, while the advisory module provides contextual recommendations using language models. Experimental evaluation focuses on system-level behavior rather than isolated model accuracy. Results show that the orchestration pipeline maintains a consistent output structure, reliable integration across modules, and stable execution of forecasting workflows. The proposed approach demonstrates how combining multiple AI paradigms within a single coordinated system can improve the practicality of agricultural decision support solutions.

Keywords: Artificial Intelligence; Multi-Modal AI; Agricultural Decision Support Systems; AI Orchestration; Crop Disease Detection; ARIMA Forecasting

INTRODUCTION

Agriculture in developing regions often involves decision-making under uncertainty, where farmers need to consider multiple factors such as soil conditions, crop health, weather patterns, and market prices. While several digital tools have been developed for tasks like crop recommendation, disease detection, and price forecasting, these systems usually operate independently and do not provide a unified decision-making experience.

In practical scenarios, these tasks are interconnected. For example, identifying a crop disease should ideally lead to immediate treatment suggestions, while market trends should influence decisions on when to sell produce. However, most existing agricultural AI solutions focus on optimizing individual models rather than integrating them into a single coordinated system.

In this work, we focus on addressing this gap by designing a hybrid multi-modal AI orchestration framework within the AgroSync platform. The goal is not to build a new standalone model, but to create a system that effectively combines multiple AI services into a structured pipeline. The framework integrates language-based advisory reasoning, image-based crop disease analysis, and time-series forecasting using ARIMA models.

A key contribution of this work is the introduction of a deterministic orchestration pipeline that manages how user requests are processed. This includes classifying the type of request, injecting relevant contextual information such as crop type and region, selecting the appropriate AI module, and enforcing a structured output format. This approach ensures consistency across responses and reduces ambiguity when multiple AI services are involved.

The proposed architecture follows a microservice-based design, allowing each AI component to function independently while still being part of a unified system. This makes the framework scalable and adaptable for real-world deployment, especially in environments with limited computational resources.

Overall, this work emphasizes system-level integration over isolated model performance, highlighting the importance of orchestration in building practical and deployable agricultural decision support systems.
RELATED WORK
1. Agricultural Decision Support Systems
  
  Early agricultural decision support systems relied on rule-based expert systems incorporating agronomic heuristics and static rule sets. With the advent of machine learning, supervised learning techniques such as Random Forests, Support Vector Machines, and Neural Networks have been widely applied to crop recommendation and yield prediction tasks [8], [9].
  
  However, these systems typically focus on singular predictive objectives and lack integration across advisory, diagnostic, and forecasting domains. Furthermore, most implementations are model-centric rather than architecture-centric.
2. Vision-Based Crop Disease Detection
  
  Computer vision has been extensively applied to crop disease classification using convolutional neural networks trained on curated leaf datasets [12]. While such approaches demonstrate strong classification performance, they require extensive labeled datasets and retraining for new crop categories.
  
  Multimodal vision-language models offer broader generalization capabilities by leveraging pre-trained representations. However, systematic architectural integration of such models within scalable agricultural platforms remains limited.
3. Time-Series Forecasting in Agricultural Markets
  
  Commodity price forecasting has traditionally employed statistical models such as ARIMA and SARIMA due to their interpretability and computational efficiency [1]. Deep learning approaches including LSTM and Transformer-based forecasting models have been explored but often require large datasets and substantial computational resources.
  
  For infrastructure-constrained environments, lightweight and interpretable models remain attractive for integration within decision support systems.
4. Hybrid AI Architectures
Hybrid AI architectures combine symbolic reasoning, statistical modeling, and neural inference to leverage complementary strengths [6]. In enterprise systems, microservice-based architectures facilitate modular AI integration through API abstraction layers.

Despite growing adoption in enterprise AI systems, hybrid AI orchestration frameworks tailored for agricultural decision support remain underexplored. This work advances the field by presenting a structured integration strategy designed specifically for agricultural contexts.
PROPOSED HYBRID MULTI-MODAL AI ARCHITECTURE

The proposed framework adopts a layered microservice-oriented architecture to ensure modularity, scalability, and infrastructure efficiency. The architecture is illustrated in Fig. 1.

Fig. 1. Layered Hybrid AI Architecture of the Proposed Framework.

The architecture consists of five primary layers:
1. Presentation Layer
2. API Gateway Layer
3. AI Orchestration Layer
4. Multi-Modal AI Services Layer
5. Data Persistence Layer
  1. Presentation Layer
    
    The Presentation Layer provides a multilingual user interface enabling farmers to submit natural language queries, upload crop images, and visualize forecasting outputs. The frontend remains stateless and communicates exclusively through RESTful APIs to ensure separation of concerns.
  2. API Gateway Layer
    
    The API Gateway acts as the centralized routing controller, handling request authentication, validation, and routing. It abstracts internal services from direct exposure and enforces uniform request formatting.
  3. AI Orchestration Layer
    
    The AI Orchestration Layer is the core intelligence mediator of the framework. It performs:
    - Request classification
    - Context injection
    - Model selection
    - Structured prompt generation
    - API invocation
    - Schema enforcement
    - Response normalization
      
      This deterministic orchestration pipeline ensures reliable integration of heterogeneous AI modules.
  4. Multi-Modal AI Services Layer
    
    The system integrates three primary AI paradigms:
    1. Text-based advisory reasoning via LLM APIs
    2. Image-based disease analysis via multimodal vision APIs
    3. Statistical price forecasting via ARIMA modelling
    Each service is encapsulated within modular abstraction classes to ensure provider independence and fault isolation.
  5. Data Persistence Layer
    The Data Persistence Layer stores user logs, forecasting datasets, and system metadata using relational database management systems. Normalized schema design ensures data integrity and efficient retrieval.
AI ORCHESTRATION PIPELINE

The defining contribution of the proposed framework is its deterministic AI orchestration mechanism, which coordinates heterogeneous AI services within a unified decision-support workflow. The orchestration pipeline is illustrated in Fig. 2.

Fig. 2. Multi-Modal AI Orchestration Pipeline.

The pipeline consists of the following sequential stages:
1. Request Classification
2. Context Injection
3. Model Selection
4. Structured Prompt Construction
5. API Invocation
6. Schema Enforcement
7. Response Normalization
8. Client Delivery
1. Request Classification
  
  Incoming user requests are categorized into one of three primary types:
  - Advisory queries (text-based reasoning)
  - Disease diagnosis queries (image-based analysis)
  - Market forecasting queries (time-series prediction)
    
    Classification is performed using endpoint routing and metadata inspection. This modular routing prevents cross-domain ambiguity.
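The routing rule described above can be sketched as follows. This is a minimal illustration rather than the AgroSync implementation: the endpoint paths, the `content_type` and `commodity` metadata fields, and the `classify_request` helper are hypothetical names chosen for the example.

```python
from enum import Enum


class RequestType(Enum):
    ADVISORY = "advisory"
    DIAGNOSIS = "diagnosis"
    FORECAST = "forecast"


# Hypothetical endpoint paths; the paper does not list the real routes.
ENDPOINT_ROUTES = {
    "/advisory": RequestType.ADVISORY,
    "/diagnosis": RequestType.DIAGNOSIS,
    "/forecast": RequestType.FORECAST,
}


def classify_request(path: str, metadata: dict) -> RequestType:
    """Classify by endpoint first, then fall back to metadata inspection."""
    if path in ENDPOINT_ROUTES:
        return ENDPOINT_ROUTES[path]
    # Metadata inspection: an uploaded image implies a diagnosis request.
    if str(metadata.get("content_type", "")).startswith("image/"):
        return RequestType.DIAGNOSIS
    # A commodity field (assumed name) implies a market forecasting request.
    if "commodity" in metadata:
        return RequestType.FORECAST
    return RequestType.ADVISORY
```

Because every request resolves to exactly one `RequestType`, downstream stages never have to guess which module should handle it.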
2. Context Injection
  
  Agricultural queries are highly context-sensitive. The orchestration layer augments user inputs with structured contextual parameters including:
  - Crop type
  - Geographic region
  - Season
  - Soil characteristics
  - Market metadata (if applicable)
  - Language preference
    
    This contextual augmentation ensures domain alignment before AI invocation.
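One way to picture the context injection step is as a small data structure attached to every request. The sketch below is an assumption-laden illustration: the `FarmContext` class, its field names, and the `inject_context` helper are invented for the example and only mirror the parameter list above.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass(frozen=True)
class FarmContext:
    """Structured context parameters (hypothetical field names)."""
    crop_type: str
    region: str
    season: str
    soil: str
    language: str
    market: Optional[str] = None  # only set for market-related queries


def inject_context(user_input: str, ctx: FarmContext) -> dict:
    """Return the user input together with the context fields that are set."""
    context = {k: v for k, v in asdict(ctx).items() if v is not None}
    return {"user_input": user_input.strip(), "context": context}
```

Keeping the context separate from the raw user text lets each AI module decide how to render it, for instance as prompt text for the advisory engine or as query parameters for the forecasting engine.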
3. Model Selection and Invocation
  
  Based on classified request type, the orchestration layer selects the corresponding AI module:
  - LLM-based advisory engine
  - Vision-based diagnostic engine
  - ARIMA-based forecasting engine
    
    All AI services are accessed via abstraction wrappers, enabling vendor-neutral integration.
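The abstraction-wrapper idea can be illustrated with a common interface and a dispatch table. The class names below (`AIService`, `EchoService`, `Orchestrator`) are hypothetical, and `EchoService` is a stand-in that calls no real provider; an actual wrapper would call the LLM, vision, or ARIMA backend inside `invoke`.

```python
from abc import ABC, abstractmethod


class AIService(ABC):
    """Common interface that hides the concrete provider behind a wrapper."""

    @abstractmethod
    def invoke(self, request: dict) -> dict:
        ...


class EchoService(AIService):
    """Stand-in backend used only for this example; it calls no real API."""

    def __init__(self, name: str) -> None:
        self.name = name

    def invoke(self, request: dict) -> dict:
        return {"service": self.name, "echo": request}


class Orchestrator:
    """Selects the service registered for a request type and invokes it."""

    def __init__(self, services: dict) -> None:
        self._services = services  # maps request type -> AIService

    def dispatch(self, request_type: str, request: dict) -> dict:
        if request_type not in self._services:
            raise ValueError(f"unsupported request type: {request_type}")
        return self._services[request_type].invoke(request)
```

Swapping a provider then means registering a different `AIService` subclass, with no change to the dispatch logic.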
4. Schema Enforcement
  
  A critical reliability mechanism is schema-constrained output validation. All LLM-generated outputs must conform to predefined JSON schemas. Invalid responses are either retried or normalized.
  
  This structured validation reduces hallucination risk and improves deterministic behavior.
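A minimal sketch of schema enforcement with retry is shown below, assuming the four-field advisory schema (`analysis`, `recommendations`, `precautions`, `language`) used for advisory outputs. The helper names, the retry limit, and the choice to drop unexpected keys are assumptions for illustration; the `generate` argument stands for any callable that returns raw model text.

```python
import json

REQUIRED_KEYS = ("analysis", "recommendations", "precautions", "language")


def validate(raw: str) -> dict:
    """Parse raw model output and check it against the four-field schema."""
    data = json.loads(raw)  # raises json.JSONDecodeError (a ValueError) on bad JSON
    if not isinstance(data, dict):
        raise ValueError("top-level JSON value must be an object")
    missing = [k for k in REQUIRED_KEYS if not isinstance(data.get(k), str)]
    if missing:
        raise ValueError(f"missing or non-string fields: {missing}")
    return {k: data[k] for k in REQUIRED_KEYS}  # drop unexpected extra keys


def call_with_retry(generate, max_attempts: int = 3) -> dict:
    """Call generate() until its output validates or attempts run out."""
    last_error = None
    for _ in range(max_attempts):
        try:
            return validate(generate())
        except ValueError as err:
            last_error = err
    raise RuntimeError(
        f"no schema-valid response after {max_attempts} attempts"
    ) from last_error
```

Note that this kind of check only guarantees the shape of the response; it does not verify that the agronomic content is correct.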
5. Response Normalization
Final responses are standardized into uniform output formats before being delivered to the client interface. This guarantees UI compatibility and consistent user experience.
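As a rough illustration of response normalization, every module output could be wrapped in one envelope before delivery. The envelope fields (`type`, `status`, `payload`, `error`) are hypothetical; the paper does not specify the actual output format.

```python
def normalize_response(request_type: str, payload: dict, error: str = "") -> dict:
    """Wrap any module output in one envelope so the UI parses a single shape."""
    return {
        "type": request_type,
        "status": "error" if error else "ok",
        "payload": {} if error else payload,
        "error": error,
    }
```

With a single envelope, the client can branch on `status` and `type` without knowing which AI module produced the payload.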
MULTI-MODAL INTEGRATION STRATEGY

Unlike isolated agricultural systems, the proposed framework enables cross-modal reasoning across advisory, vision, and forecasting modules.

The integration strategy is illustrated in Fig. 3.

Fig. 3. Hybrid Multi-Modal AI Integration Model.
1. Advisory-Forecast Coupling
  
  Forecasting outputs influence advisory recommendations. For example:
  - Upward price trends may trigger "hold and sell later" guidance.
  - Downward trends may prompt immediate selling suggestions.
    
    This coupling enables dynamic economic reasoning.
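The coupling rule above can be expressed as a small mapping from forecast trend to guidance text. The sketch below is illustrative only: the 2% tolerance, the extra "flat" category, and the wording of the guidance strings are assumptions, not values taken from the framework.

```python
def classify_trend(last_price: float, forecast: list, tolerance: float = 0.02) -> str:
    """Label the forecast as 'up', 'down', or 'flat' relative to the last price."""
    if last_price <= 0 or not forecast:
        raise ValueError("need a positive last price and a non-empty forecast")
    change = (forecast[-1] - last_price) / last_price
    if change > tolerance:
        return "up"
    if change < -tolerance:
        return "down"
    return "flat"


# Hypothetical guidance strings keyed by trend label.
GUIDANCE = {
    "up": "Prices are forecast to rise: consider holding and selling later.",
    "down": "Prices are forecast to fall: consider selling soon.",
    "flat": "Prices look stable: sell according to storage and cash needs.",
}


def market_guidance(last_price: float, forecast: list) -> str:
    """Turn a price forecast into a short selling recommendation."""
    return GUIDANCE[classify_trend(last_price, forecast)]
```

In the full pipeline this guidance would be passed to the advisory module as context rather than shown to the farmer directly.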
2. Disease-Advisory Coupling
  
  Image-based disease diagnosis is integrated into advisory responses. Once a disease is detected:
  - Treatment steps are embedded within advisory content.
  - Preventive measures are generated contextually.
    
    This reduces fragmentation between diagnosis and recommendation.
3. Cross-Module Context Sharing
The orchestration layer allows modules to share contextual metadata. This shared-state mechanism enables composite decision-making rather than isolated inference.
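A shared-state mechanism of this kind might look like the sketch below, where each module publishes its result into a per-request context that the advisory step later reads. The `SharedContext` class, the `publish` method, and the field names are hypothetical and only illustrate the idea.

```python
from dataclasses import dataclass, field


@dataclass
class SharedContext:
    """Per-request metadata that every module may read and extend."""
    crop_type: str
    region: str
    findings: dict = field(default_factory=dict)

    def publish(self, module: str, result: dict) -> None:
        """Record a module's result under its name."""
        self.findings[module] = result


def build_advisory_input(question: str, ctx: SharedContext) -> dict:
    """Merge the farmer's question with whatever other modules have published."""
    return {
        "question": question,
        "crop_type": ctx.crop_type,
        "region": ctx.region,
        "diagnosis": ctx.findings.get("diagnosis"),
        "forecast": ctx.findings.get("forecast"),
    }
```

Modules that have not run simply leave their entry as `None`, so the advisory step can tell a missing diagnosis from a negative one.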
STRUCTURED PROMPT ENGINEERING FORMALIZATION

Prompt engineering plays a foundational role in ensuring deterministic advisory outputs.

We formalize the prompt construction process as:

$$P = T(U, C, \Lambda, S)$$

Where:
- $U$ = User input
- $C$ = Context parameters
- $\Lambda$ = Language directive
- $S$ = Output schema constraint
- $T$ = Template transformation function
1. Role Conditioning
  
  Each advisory prompt begins with domain-specific role definition:
  
  "You are an agricultural advisory expert specializing in regional farming conditions."
  
  Role conditioning restricts generative scope.
2. Schema-Constrained Output
  
  Outputs are enforced to match structured JSON format:
  
  {
    "analysis": "…",
    "recommendations": "…",
    "precautions": "…",
    "language": "…"
  }
  
  Schema validation ensures response reliability and downstream compatibility.
3. Multilingual Determinism
  
  Language enforcement is embedded directly within prompt construction rather than applied via post-processing translation. This approach improves semantic consistency.
STATISTICAL FORECASTING INTEGRATION

The forecasting module integrates an Autoregressive Integrated Moving Average (ARIMA) model configured as ARIMA(5,1,0). The forecasting pipeline is illustrated in Fig. 4.

Fig. 4. ARIMA-Based Forecasting Workflow.

A. Mathematical Formulation

The general ARIMA(p, d, q) model is defined as:

$$\left(1 - \sum_{i=1}^{p} \phi_i L^i\right)(1 - L)^d y_t = \left(1 + \sum_{j=1}^{q} \theta_j L^j\right)\varepsilon_t$$

Where:
- $L$ is the lag operator
- $\phi_i$ are autoregressive coefficients
- $\theta_j$ are moving average coefficients
- $d$ is differencing order
- $\varepsilon_t$ is white noise

For the implemented configuration:

$$p = 5,\quad d = 1,\quad q = 0$$

This yields:

$$\left(1 - \sum_{i=1}^{5} \phi_i L^i\right)(1 - L)\, y_t = \varepsilon_t$$

B. Stationarity and Differencing

First-order differencing ensures stationarity:

$$y'_t = y_t - y_{t-1}$$

This stabilizes the mean and enables autoregressive modeling.
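To make the combined effect of differencing and the autoregressive terms concrete, the ARIMA(5,1,0) configuration can be expanded into a recursion on the differenced series. The notation is restated here as an assumption for this illustration: $y_t$ is the price at time $t$, $y'_t = y_t - y_{t-1}$ is the first difference, $\phi_1, \dots, \phi_5$ are the autoregressive coefficients, and $\varepsilon_t$ is white noise. No constant term is included, matching the model as stated.

```latex
y'_t = \sum_{i=1}^{5} \phi_i \, y'_{t-i} + \varepsilon_t,
\qquad
\hat{y}_{t+1} = y_t + \sum_{i=1}^{5} \phi_i \, y'_{t+1-i}
```

The first expression is the model rewritten in terms of differences; the second is the resulting one-step-ahead price forecast, obtained by predicting the next difference and adding it back to the latest observed price.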

C. Forecast Utilization

Forecast outputs include:
- Short-term price prediction
- Confidence intervals
- Trend direction classification
These outputs are injected into advisory modules for integrated decision support.

Statistical forecasting approaches such as ARIMA remain particularly suitable for agricultural decision support due to their interpretability and minimal computational requirements. Unlike deep learning forecasting models that require extensive training datasets and specialized hardware accelerators, ARIMA models can operate efficiently using limited historical commodity price data. This property makes ARIMA-based forecasting practical for deployment in rural agricultural environments where computational infrastructure may be constrained. Additionally, the transparency of autoregressive models allows agricultural advisors and policymakers to interpret predicted price trends and confidence intervals, improving trust in automated market-aware decision support recommendations.
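The low computational cost of this model class can be seen in a short sketch of the forecasting recursion for ARIMA(5,1,0). The code assumes the five autoregressive coefficients are already known; in practice they would be estimated from historical prices, for example with a library such as statsmodels. The function names and the sample coefficients are hypothetical, and confidence intervals are not computed here.

```python
def difference(series: list) -> list:
    """First-order differencing: d[t] = y[t] - y[t-1]."""
    return [b - a for a, b in zip(series, series[1:])]


def forecast_arima_510(series: list, phi: list, steps: int) -> list:
    """Forecast price levels from an ARIMA(5,1,0) model with known coefficients.

    phi[0] multiplies the most recent difference, phi[4] the oldest of the five.
    """
    if len(phi) != 5:
        raise ValueError("ARIMA(5,1,0) needs exactly five AR coefficients")
    if len(series) < 6:
        raise ValueError("need at least six observations to form five differences")
    diffs = difference(series)
    level = series[-1]
    forecasts = []
    for _ in range(steps):
        recent = diffs[-5:][::-1]  # most recent difference first
        next_diff = sum(p * d for p, d in zip(phi, recent))
        level += next_diff  # undo the differencing to get a price level
        diffs.append(next_diff)
        forecasts.append(level)
    return forecasts
```

For example, `forecast_arima_510([100, 102, 104, 106, 108, 110], [0.5, 0, 0, 0, 0], 3)` damps the last observed increase by half at each step. Each forecast step costs only five multiplications, which is why this part of the pipeline can run locally on modest hardware.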
EXPERIMENTAL EVALUATION

The evaluation of the proposed framework focuses on architectural robustness, functional reliability, and modular integration stability rather than large-scale predictive benchmarking. The goal is to validate the feasibility of hybrid multi-modal AI orchestration in an agricultural decision-support context.
1. Evaluation Objectives
  
  The experimental validation addresses the following objectives:
  1. Functional correctness of the orchestration pipeline
  2. Structured schema compliance of AI outputs
  3. Stability of multi-modal routing
  4. Forecasting pipeline integrity
  5. Observational latency characteristics
2. Functional Validation
  1. Advisory Module Testing
    
    Text-based advisory queries were tested across multiple agricultural scenarios including:
    - Crop planning recommendations
    - Seasonal farming advisory
    - Market timing suggestions
    
    Each response was evaluated for:
    - JSON schema compliance
    - Context inclusion correctness
    - Multilingual formatting consistency
    - Logical coherence of recommendations
      
      All valid responses adhered to predefined schema constraints. Beyond functional correctness, the modular orchestration architecture was also evaluated for integration stability across heterogeneous AI services. The orchestration layer maintained consistent response formatting when advisory reasoning, disease diagnosis, and forecasting modules were invoked simultaneously. This demonstrates that the structured orchestration pipeline successfully coordinates multiple AI services without cross-module interference. The abstraction-based service design further enables individual AI components to be updated or replaced without affecting the overall system workflow, improving maintainability and long-term system robustness.
  2. Vision-Based Disease Analysis Testing
    
    Image-based diagnosis was evaluated using sample crop leaf images representing common disease categories.
    
    Validation criteria included:
    - Proper image encoding and API invocation
    - Disease identification output structure
    - Integration of treatment recommendations
    - Advisory coupling consistency
      
      The orchestration layer successfully routed diagnostic outputs into advisory modules for integrated responses.
  3. Forecasting Pipeline Validation
    
    The ARIMA(5,1,0) model was validated using statistically structured synthetic commodity price sequences to demonstrate forecasting pipeline functionality.
    
    Evaluation included:
    - Stationarity verification after differencing
    - Autoregressive coefficient stability
    - Forecast generation consistency
    - Confidence interval estimation
  The forecasting outputs were successfully integrated into advisory reasoning for market-aware recommendations.
3. Observational Latency Analysis
  
  Although no dedicated benchmarking framework was deployed, qualitative latency observations indicated:
  - Text-based advisory latency primarily dependent on external LLM inference time
  - Vision analysis latency dependent on image encoding size and API processing time
  - ARIMA forecasting latency negligible due to local computation
  Future quantitative evaluation will incorporate timestamp-based logging and statistical latency measurement.
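Timestamp-based latency logging of the kind proposed here could be added with a small decorator. This is a sketch under stated assumptions: the `timed` decorator, the in-memory `LATENCIES` store, and the summary fields are hypothetical, and a deployed system would more likely write to persistent logs.

```python
import functools
import time
from collections import defaultdict
from statistics import mean

LATENCIES = defaultdict(list)  # module name -> list of durations in seconds


def timed(module: str):
    """Decorator that records the wall-clock duration of each call."""
    def wrap(fn):
        @functools.wraps(fn)
        def inner(*args, **kwargs):
            start = time.perf_counter()
            try:
                return fn(*args, **kwargs)
            finally:
                # Record the duration even when the wrapped call raises.
                LATENCIES[module].append(time.perf_counter() - start)
        return inner
    return wrap


def latency_summary() -> dict:
    """Return call count, mean, and maximum latency per module."""
    return {
        m: {"count": len(v), "mean_s": mean(v), "max_s": max(v)}
        for m, v in LATENCIES.items()
    }
```

Decorating the advisory, vision, and forecasting entry points separately would allow the three latency sources listed above to be compared directly.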
4. Proposed Quantitative Metrics for Future Work
For empirical benchmarking in future deployment phases, the following metrics are proposed:
- Root Mean Squared Error (RMSE) for forecasting
- Mean Absolute Error (MAE) for price prediction
- Schema Compliance Rate for advisory responses
- Average API Response Time
- Throughput under concurrent request simulation
These metrics establish a foundation for future large-scale evaluation.
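For reference, the first three proposed metrics can be computed as in the sketch below. The function names are illustrative, and `valid_flags` is assumed to be a list of booleans recording whether each advisory response passed schema validation.

```python
import math


def rmse(actual: list, predicted: list) -> float:
    """Root Mean Squared Error between two equal-length sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def mae(actual: list, predicted: list) -> float:
    """Mean Absolute Error between two equal-length sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def schema_compliance_rate(valid_flags: list) -> float:
    """Fraction of responses that passed schema validation."""
    if not valid_flags:
        raise ValueError("need at least one response")
    return sum(1 for ok in valid_flags if ok) / len(valid_flags)
```

RMSE penalizes large forecast errors more heavily than MAE, so reporting both gives a fuller picture of price prediction quality.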
SCALABILITY AND DEPLOYMENT ANALYSIS

Scalability is a primary architectural objective of the proposed framework.
1. Stateless Frontend and API Gateway
  
  The presentation layer remains stateless, enabling horizontal scaling via load balancing mechanisms.
  
  The API gateway enforces:
  - Authentication
  - Request validation
  - Uniform routing
    
    This separation enhances maintainability and extensibility.
2. Infrastructure Efficiency
  
  Unlike monolithic deep learning deployments requiring GPU infrastructure, the proposed architecture:
  - Leverages external AI APIs for inference
  - Performs lightweight statistical forecasting locally
  - Minimizes hardware requirements
This makes the framework suitable for deployment in infrastructure-constrained rural environments.
LIMITATIONS

Despite its architectural strengths, the framework has several limitations.
1. API Dependency
  
  Reliance on external AI providers introduces potential variability in response behavior and latency.
2. Synthetic Forecasting Data
  
  The current validation uses statistically structured synthetic datasets rather than live agricultural market data. Future integration with real-time agricultural market APIs is required for empirical validation.
3. Absence of Field Trials
Large-scale deployment across rural farming communities has not yet been conducted. Real-world usability studies remain future work.
CONCLUSION

This paper presented a hybrid multi-modal AI orchestration framework designed to support practical agricultural decision-making. Unlike traditional approaches that focus on individual model performance, the proposed system emphasizes the integration of multiple AI components into a unified workflow.

The framework combines language-based advisory reasoning, image-based disease diagnosis, and ARIMA-based price forecasting within a structured orchestration pipeline. By introducing stages such as request classification, context injection, and schema-based response validation, the system ensures consistent and reliable outputs across different types of queries.

The microservice-based architecture further enhances modularity and scalability, allowing individual components to be updated or replaced without affecting the overall system. This design makes the framework suitable for deployment in real-world agricultural environments where flexibility and resource efficiency are important.

While the current implementation demonstrates the feasibility of multi-modal AI orchestration, future work will focus on integrating real-time agricultural datasets, conducting large-scale field evaluations, and improving quantitative performance benchmarking.

Overall, this work highlights the importance of system-level design in developing next-generation agricultural decision support platforms that go beyond isolated machine learning models.

REFERENCES

[1] G. E. Box and G. M. Jenkins, Time Series Analysis: Forecasting and Control. San Francisco, CA, USA: Holden-Day, 1976.
[2] J. Brownlee, Introduction to Time Series Forecasting with Python. Melbourne, Australia: Machine Learning Mastery, 2018.
[3] T. Wolf et al., "Transformers: State-of-the-Art Natural Language Processing," in Proc. Conf. Empirical Methods Natural Language Processing (EMNLP), 2020.
[4] J. Devlin, M. Chang, K. Lee, and K. Toutanova, "BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding," in Proc. NAACL-HLT, 2019.
[5] A. Krizhevsky, I. Sutskever, and G. Hinton, "ImageNet Classification with Deep Convolutional Neural Networks," in Proc. Advances in Neural Information Processing Systems, 2012.
[6] S. Russell and P. Norvig, Artificial Intelligence: A Modern Approach, 3rd ed. Pearson Education, 2016.
[7] S. Hochreiter and J. Schmidhuber, "Long Short-Term Memory," Neural Computation, vol. 9, no. 8, pp. 1735-1780, 1997.
[8] A. Kamilaris and F. X. Prenafeta-Boldú, "Deep Learning in Agriculture: A Survey," Computers and Electronics in Agriculture, vol. 147, pp. 70-90, 2018.
[9] K. Liakos et al., "Machine Learning in Agriculture: A Review," Sensors, vol. 18, no. 8, 2018.
[10] J. Fountas et al., "Farm Management Information Systems: Current Situation and Future Perspectives," Computers and Electronics in Agriculture, vol. 115, pp. 40-50, 2015.
[11] M. Jones et al., "Artificial Intelligence Applications in Agriculture," Agricultural Systems, vol. 187, pp. 102994, 2020.
[12] H. Zhang et al., "Deep Learning-Based Crop Disease Detection: A Review," Computers and Electronics in Agriculture, vol. 180, 2021.
