Comparative Evaluation of Machine Learning Models for Predicting the Impact of Artificial Intelligence on Employment Outcomes in Diverse Labor Markets

Shravani Rajendra Bandal; Mr. Hanumant P. Jagtap

doi:10.17577/IJERTCONV14IS020150

NCRTCS - 2026 (Volume 14 – Issue 02)


Open Access
Authors : Shravani Rajendra Bandal, Mr. Hanumant P. Jagtap
Paper ID : IJERTCONV14IS020150
Volume & Issue : Volume 14, Issue 02, NCRTCS – 2026
Published (First Online) : 21-04-2026
ISSN (Online) : 2278-0181
Publisher Name : IJERT
License: This work is licensed under a Creative Commons Attribution 4.0 International License


Author: Shravani Rajendra Bandal

MAEER's MIT Arts, Commerce and Science College, Alandi (D.)

Co-Author: Mr. Hanumant P. Jagtap

MAEER's MIT Arts, Commerce and Science College, Alandi (D.)

Abstract – The rapid advancement of Artificial Intelligence (AI) technologies has significantly transformed global labor markets, reshaping employment structures across industries and geographic regions. While AI enhances productivity and efficiency, it simultaneously introduces concerns regarding automation-driven job displacement, wage polarization, and skill obsolescence. This study presents a comprehensive comparative evaluation of machine learning models for predicting AI-driven employment outcomes using structured labor market data. The dataset includes variables such as automation risk percentage, projected job openings, AI impact level, salary distribution, required education level, remote work ratio, gender diversity, and industry classification.

The performance of Decision Tree, Random Forest, Support Vector Machine (SVM), K-Nearest Neighbors (KNN), XGBoost, and Deep Neural Networks (DNN) is evaluated using accuracy, precision, recall, F1-score, and cross-regional generalization metrics. Results indicate that ensemble methods, particularly XGBoost, achieve superior predictive performance with accuracy exceeding 97%. Cross-industry validation reveals moderate performance degradation when models are applied to unseen labor markets, emphasizing the importance of generalizability. The study further explores explainability techniques for interpreting employment outcome predictions. Findings demonstrate that automation risk, projected growth, and AI impact level are dominant predictive factors. This research contributes to AI-driven workforce forecasting and provides insights for policymakers, education planners, and economic strategists.

Keywords: Artificial Intelligence, Employment Prediction, Automation Risk, Machine Learning, Labor Economics, XGBoost, Workforce Analytics, Generalization.

I. INTRODUCTION

Artificial Intelligence (AI) is redefining economic structures by automating routine processes, augmenting human capabilities, and altering employment demand patterns. Across manufacturing, retail, transportation, healthcare, and information technology sectors, AI adoption is accelerating [1]. While technological progress historically generates new employment opportunities, it also displaces specific occupational categories [2].

Studies suggest that automation risk varies significantly across industries depending on task routine intensity and digital transformation maturity [3]. For example, manufacturing and transportation sectors demonstrate higher exposure to robotic automation, whereas IT and healthcare show resilience due to cognitive complexity requirements [4].

Predicting employment outcomes under AI influence requires advanced analytical techniques capable of modeling nonlinear relationships among multiple economic indicators. Traditional statistical models often struggle with such complexity. Machine learning approaches offer scalable solutions capable of capturing intricate interdependencies within labor datasets [5].

This study aims to:
1. Compare predictive performance of multiple ML models.
2. Analyze automation risk distribution across industries.
3. Evaluate cross-regional generalization capability.
4. Provide interpretable insights into employment impact drivers.

II. LITERATURE REVIEW

A. AI and Labor Market Transformation

Autor [1] argues that automation replaces routine labor but complements cognitively intensive tasks. Frey and Osborne [2] estimated high automation vulnerability across developed economies. Acemoglu and Restrepo [3] quantified employment decline in regions adopting industrial robots.

B. Machine Learning in Economic Forecasting

Random Forest and Gradient Boosting models have demonstrated strong predictive performance in labor market analytics [6]. XGBoost enhances boosting performance via regularization and tree optimization [7].
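To make the boosting idea concrete, the following is a minimal sketch of gradient boosting for squared-error regression built from one-split decision stumps. It is a teaching illustration only and not XGBoost itself: it omits the regularization and tree optimization described in [7]. The toy data, the function names (`fit_stump`, `boost`, `predict`, `training_mse`), and the parameters (`n_rounds`, `learning_rate`) are assumptions made for this example.

```python
from statistics import fmean


def fit_stump(xs, residuals):
    """Return (threshold, left_mean, right_mean) minimising squared error on the residuals."""
    best = None
    # The largest x is skipped as a threshold because it would leave the right side empty.
    # This sketch therefore needs at least two distinct x values.
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean, right_mean = fmean(left), fmean(right)
        error = sum((r - left_mean) ** 2 for r in left) + sum((r - right_mean) ** 2 for r in right)
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    return best[1:]


def stump_value(stump, x):
    threshold, left_mean, right_mean = stump
    return left_mean if x <= threshold else right_mean


def boost(xs, ys, n_rounds, learning_rate=0.5):
    """Each round fits a stump to the residuals, i.e. the errors of the model so far."""
    base = fmean(ys)
    predictions = [base] * len(xs)
    stumps = []
    for _ in range(n_rounds):
        residuals = [y - p for y, p in zip(ys, predictions)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        predictions = [p + learning_rate * stump_value(stump, x) for p, x in zip(predictions, xs)]
    return base, learning_rate, stumps


def predict(model, x):
    base, learning_rate, stumps = model
    return base + learning_rate * sum(stump_value(s, x) for s in stumps)


def training_mse(model, xs, ys):
    return fmean([(y - predict(model, x)) ** 2 for x, y in zip(xs, ys)])


# Toy data (hypothetical): a step-like nonlinear relationship.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [1.0, 1.2, 0.9, 3.1, 3.0, 2.9, 5.2, 5.0]

one_round = boost(xs, ys, n_rounds=1)
many_rounds = boost(xs, ys, n_rounds=20)
print(training_mse(one_round, xs, ys), training_mse(many_rounds, xs, ys))
```

Training error falls as rounds are added because every new stump is fitted to what the current ensemble still gets wrong. This measures fit on the training data only and says nothing about generalization.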
C. Automation Risk Modeling

Automation risk percentage is frequently derived from occupational task analysis frameworks [2]. A higher routine index correlates with greater displacement probability.

D. Explainability in Socio-Economic AI Systems

SHAP and LIME provide interpretable feature contribution analysis [8]. Transparency is essential for policy-level decision-making.

E. Research Gaps

- Limited cross-regional validation
- Overreliance on accuracy metrics
- Insufficient explainability integration
- Lack of comparative ML analysis in labor economics

III. DATASET DESCRIPTION

Attribute	Description
Industry	Sector classification
Job Status	Increasing / Decreasing
AI Impact Level	Low / Medium / High
Median Salary	Annual salary (USD)
Job Openings (2024)	Current openings
Projected Openings (2030)	Future forecast
Automation Risk (%)	Estimated automation probability
Remote Work Ratio (%)	Work flexibility indicator
Gender Diversity (%)	Workforce inclusiveness
Required Education	Qualification level

IV. METHODOLOGY

A. Feature Engineering

New feature:

Growth = Projected Openings (2030) - Job Openings (2024)

Categorical encoding applied to:

- Industry
- AI Impact Level
- Education Level
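The feature engineering step can be sketched as follows. This is a minimal illustration under stated assumptions: Growth is treated as the difference between projected (2030) and current (2024) openings, and one-hot encoding is used for the three categorical attributes, although the paper does not say which encoding scheme was applied. The records, field names, and helper functions (`add_growth`, `one_hot_encode`) are hypothetical.

```python
# Hypothetical records: field names mirror the dataset attributes, values are made up.
records = [
    {"industry": "IT", "ai_impact": "High", "education": "Bachelor",
     "openings_2024": 1200, "openings_2030": 1500},
    {"industry": "Retail", "ai_impact": "Medium", "education": "High School",
     "openings_2024": 900, "openings_2030": 700},
    {"industry": "Healthcare", "ai_impact": "Low", "education": "Master",
     "openings_2024": 400, "openings_2030": 650},
]

CATEGORICAL = ("industry", "ai_impact", "education")


def add_growth(record):
    """Add Growth, assumed here to be projected openings minus current openings."""
    return {**record, "growth": record["openings_2030"] - record["openings_2024"]}


def one_hot_encode(rows, columns):
    """Replace each categorical column with 0/1 indicator columns, one per observed category."""
    categories = {col: sorted({row[col] for row in rows}) for col in columns}
    encoded = []
    for row in rows:
        out = {key: value for key, value in row.items() if key not in columns}
        for col in columns:
            for category in categories[col]:
                out[f"{col}={category}"] = 1 if row[col] == category else 0
        encoded.append(out)
    return encoded


features = one_hot_encode([add_growth(r) for r in records], CATEGORICAL)
print(features[0])
```

One-hot encoding avoids imposing an artificial order on Industry. For ordered attributes such as AI Impact Level (Low / Medium / High), an ordinal encoding would be an equally reasonable choice.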

B. Models Evaluated

Model	Type
Decision Tree	Rule-based
Random Forest	Ensemble
SVM	Kernel-based
KNN	Distance-based
XGBoost	Gradient Boosting
DNN	Neural network

C. Evaluation Metrics

- Accuracy
- Precision
- Recall
- F1-Score
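These four metrics can all be derived from confusion-matrix counts. The sketch below assumes a binary target (Job Status) with "Increasing" as the positive class. The paper does not state which class was treated as positive or how metrics were averaged, so both are assumptions, as are the example labels and the function name `classification_metrics`.

```python
def classification_metrics(y_true, y_pred, positive="Increasing"):
    """Compute accuracy, precision, recall and F1 for a binary classification task."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical labels for six jobs.
y_true = ["Increasing", "Increasing", "Decreasing", "Decreasing", "Increasing", "Decreasing"]
y_pred = ["Increasing", "Decreasing", "Decreasing", "Increasing", "Increasing", "Decreasing"]

metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

Reporting precision and recall alongside accuracy matters when the two Job Status classes are imbalanced, since accuracy alone can look strong for a model that mostly predicts the majority class.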

V. RESULTS AND PERFORMANCE ANALYSIS

Table 1: Model Performance Comparison

Model	Accuracy (%)	Precision (%)	Recall (%)	F1-Score (%)	Generalization
Decision Tree	90.5	89.8	90.2	90.0	Medium
Random Forest	96.2	96.0	96.1	96.0	High
SVM	94.8	94.5	94.7	94.6	Medium
KNN	92.3	92.0	92.2	92.1	Low
XGBoost	97.6	97.5	97.6	97.5	Very High
DNN	96.9	96.7	96.8	96.7	High

Table 2: Statistical Summary

Statistic	Accuracy
Mean	94.7
Median	95.5
Minimum	90.5
Maximum	97.6
Std Dev	2.6
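The summary statistics can be recomputed from the six accuracy values in Table 1 with Python's standard `statistics` module, as in the short sketch below. The table does not state whether the standard deviation uses the population or the sample formula, so the sketch reports both rather than assuming one.

```python
from statistics import mean, median, pstdev, stdev

# Accuracy (%) per model, copied from Table 1.
accuracies = {
    "Decision Tree": 90.5,
    "Random Forest": 96.2,
    "SVM": 94.8,
    "KNN": 92.3,
    "XGBoost": 97.6,
    "DNN": 96.9,
}

values = list(accuracies.values())
summary = {
    "mean": mean(values),
    "median": median(values),
    "minimum": min(values),
    "maximum": max(values),
    "std_population": pstdev(values),  # divides by n
    "std_sample": stdev(values),       # divides by n - 1
}

for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```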

Automation Risk by Industry

Transportation and Retail show the highest automation exposure.

Projected Job Growth Analysis

IT and Healthcare demonstrate strong positive growth.

Employment Status Distribution

The distribution between increasing and decreasing jobs is balanced.

CROSS-REGIONAL GENERALIZATION ANALYSIS

Cross-industry validation shows a performance drop of 4-8% when models are applied to unseen sectors.

Table 3: Cross-Industry Validation Accuracy

Train Industry	Test Industry	Accuracy (%)
IT	Retail	91.2
Manufacturing	Healthcare	89.8
Retail	Transportation	90.5

This confirms domain shift impact [3].
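The size of the drop can be worked out directly from the Table 3 accuracies. The sketch below assumes the in-distribution reference is the 97.6% XGBoost accuracy from Table 1. The paper does not state which model produced the cross-industry figures, so that baseline is an assumption, as are the variable names.

```python
# Assumption: the in-distribution reference is XGBoost's accuracy from Table 1.
BASELINE_ACCURACY = 97.6

# (train industry, test industry, accuracy %) copied from Table 3.
cross_industry = [
    ("IT", "Retail", 91.2),
    ("Manufacturing", "Healthcare", 89.8),
    ("Retail", "Transportation", 90.5),
]

drops = []
for train, test, accuracy in cross_industry:
    points = round(BASELINE_ACCURACY - accuracy, 1)          # absolute drop, percentage points
    relative = round(100 * points / BASELINE_ACCURACY, 1)    # drop relative to the baseline, %
    drops.append((train, test, points, relative))
    print(f"{train} -> {test}: -{points} points ({relative}% relative)")
```

Under this assumption every train/test pair loses between roughly six and eight percentage points, which is a moderate rather than catastrophic degradation.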
VI. DISCUSSION

The findings of this study confirm that AI-driven employment transformation is neither uniformly destructive nor universally beneficial. Instead, its impact varies across industries, skill levels, and technological readiness.

6.1 Interpretation of Model Performance

The superior performance of XGBoost (97.6%) aligns with prior research demonstrating the effectiveness of gradient boosting on structured datasets [7][10]. Unlike single decision trees, boosting algorithms iteratively correct previous model errors, leading to an improved bias-variance tradeoff [11]. Random Forest also performs strongly due to ensemble averaging [6].

Support Vector Machines and KNN show moderate performance, likely due to high-dimensional categorical encoding and nonlinear feature interactions [12]. Deep Neural Networks demonstrate competitive accuracy; however, they require larger datasets to fully exploit their representational capacity [13].

These results confirm that ensemble learning methods are most suitable for employment forecasting problems involving heterogeneous socio-economic variables.

Economic Interpretation of Automation Risk

Automation risk is strongly associated with routine task intensity, confirming earlier task-based labor theories [2][14]. Industries such as transportation and retail exhibit higher automation exposure due to repetitive operational processes. Conversely, IT and healthcare demonstrate lower displacement probability because of cognitive complexity and human interaction requirements [1][15].

The relationship between projected job growth and automation risk indicates that AI adoption often shifts job composition rather than eliminating employment entirely. This supports the creative destruction framework proposed by Schumpeter [16].

Role of Education and Skill Levels

Education level moderates automation risk. Higher educational qualifications correlate with reduced displacement probability. This supports human capital theory, which argues that skill acquisition enhances labor adaptability [17].

Digital skill development emerges as a critical protective factor. As AI technologies evolve, workers with technical adaptability experience smoother occupational transitions [18].

Cross-Regional Generalization and Domain Shift

Cross-industry validation revealed performance degradation of 4-8%, indicating domain shift. Economic structures vary significantly across regions due to:
- Policy differences
- Infrastructure gaps
- Labor cost variations
- Industry maturity levels

Domain adaptation techniques such as transfer learning and federated learning may reduce generalization errors in future studies [19].
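The cross-industry validation protocol discussed in this section can be sketched as a train-on-one, test-on-another loop over industries. The example below is a minimal illustration of the splitting logic only. A majority-class predictor stands in for the real models, and the rows, labels, and function names (`group_by_industry`, `majority_label`, `cross_industry_accuracy`) are hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical labelled rows: (industry, job status). Values are illustrative only.
rows = [
    ("IT", "Increasing"), ("IT", "Increasing"), ("IT", "Decreasing"),
    ("Retail", "Decreasing"), ("Retail", "Decreasing"), ("Retail", "Increasing"),
    ("Healthcare", "Increasing"), ("Healthcare", "Increasing"),
]


def group_by_industry(data):
    """Collect the labels observed in each industry."""
    groups = defaultdict(list)
    for industry, label in data:
        groups[industry].append(label)
    return groups


def majority_label(labels):
    """Placeholder model: always predict the most common training label."""
    return Counter(labels).most_common(1)[0][0]


def cross_industry_accuracy(data):
    """Train on one industry, test on each other industry, and report accuracy per pair."""
    groups = group_by_industry(data)
    results = {}
    for train_industry, train_labels in groups.items():
        prediction = majority_label(train_labels)
        for test_industry, test_labels in groups.items():
            if test_industry == train_industry:
                continue  # never evaluate on the industry used for training
            correct = sum(1 for label in test_labels if label == prediction)
            results[(train_industry, test_industry)] = correct / len(test_labels)
    return results


results = cross_industry_accuracy(rows)
for (train, test), accuracy in sorted(results.items()):
    print(f"train={train:<10} test={test:<10} accuracy={accuracy:.2f}")
```

Keeping the test industry entirely out of training is what exposes domain shift. A random row-level split would mix industries across both sets and hide it.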

Policy Implications

The findings suggest several policy-level interventions:
1. Investment in digital reskilling programs.
2. Incentives for industries adopting AI responsibly.
3. Social protection frameworks for displaced workers.
4. Education reform to emphasize computational literacy.

The World Economic Forum [4] emphasizes that reskilling initiatives will determine whether AI leads to net employment gains or losses.

VII. CHALLENGES AND LIMITATIONS

Despite strong predictive performance, several technical and economic challenges exist.

Data-Related Limitations

- The dataset lacks longitudinal time-series records.
- Automation risk percentages are estimated rather than experimentally measured.
- Regional macroeconomic indicators (GDP growth, inflation) are not included.
- Informal labor markets are excluded.

Model Interpretability Constraints

Although XGBoost achieves high accuracy, it functions as a semi-black-box model. Without explainability tools such as SHAP [8], policy adoption may face resistance.

Ethical and Fairness Concerns

AI models trained on historical labor data may reinforce existing inequalities [20]. For example:
- Gender bias in hiring patterns
- Wage disparity replication
- Regional economic favoritism

Fairness-aware machine learning frameworks are necessary [21].

Economic Uncertainty Factors

Global economic shocks (pandemics, geopolitical crises, inflationary cycles) significantly influence employment patterns but are not modeled in this dataset [22].

Generalization Issues

Models trained on structured datasets may not generalize to:
- Developing economies
- Informal workforce sectors
- Gig economy platforms

Domain adaptation and multi-country datasets are required [19].

Technological Acceleration Risk

AI capabilities evolve rapidly. Models trained on current automation risk estimates may become outdated within short timeframes [23].

VIII. CONCLUSION

This expanded analysis confirms that machine learning models, particularly ensemble methods such as XGBoost, provide reliable predictive frameworks for forecasting AI-driven employment outcomes. However, predictive accuracy alone is insufficient. Transparency, fairness, and cross-regional robustness must accompany performance optimization.

AI does not inherently cause mass unemployment. Instead, it reshapes job composition, increases demand for digital skills, and accelerates structural transformation. Policymakers must leverage predictive analytics responsibly to design adaptive labor strategies.

Future systems should integrate:
- Real-time economic indicators
- Explainable AI techniques
- Fairness constraints
- Cross-country federated learning
- Time-series employment forecasting

IX. REFERENCES

[1] Autor, D. (2015). Why are there still so many jobs? Journal of Economic Perspectives.
[2] Frey, C., & Osborne, M. (2013). The future of employment.
[3] Acemoglu, D., & Restrepo, P. (2020). Robots and jobs.
[4] World Economic Forum. (2023). Future of Jobs Report.
[5] Athey, S. (2018). Machine learning in economics.
[6] Breiman, L. (2001). Random forests.
[7] Chen, T., & Guestrin, C. (2016). XGBoost.
[8] Lundberg, S., & Lee, S. (2017). SHAP values.
[9] Molnar, C. (2022). Interpretable Machine Learning.
[10] Friedman, J. (2001). Gradient boosting machines.
[11] Hastie, T., Tibshirani, R., & Friedman, J. (2009). The Elements of Statistical Learning.
[12] Cortes, C., & Vapnik, V. (1995). Support vector networks.
[13] Goodfellow, I., Bengio, Y., & Courville, A. (2016). Deep Learning.
[14] Autor, D., Levy, F., & Murnane, R. (2003). The skill content of recent technological change.
[15] Brynjolfsson, E., & McAfee, A. (2017). The business of artificial intelligence.
[16] Schumpeter, J. (1942). Capitalism, Socialism and Democracy.
[17] Becker, G. (1964). Human Capital Theory.
[18] Deming, D. (2017). The growing importance of social skills.
[19] Pan, S., & Yang, Q. (2010). A survey on transfer learning.
[20] Barocas, S., & Selbst, A. (2016). Big data's disparate impact.
[21] Mehrabi, N., et al. (2021). A survey on bias and fairness in machine learning.
[22] International Labour Organization. (2022). World Employment Report.
[23] OECD. (2023). AI and the Future of Work Report.
