Comparative Evaluation of Machine Learning Models for Predicting the Impact of Artificial Intelligence on Employment Outcomes in Diverse Labor Markets

DOI: 10.17577/IJERTCONV14IS020150

Author: Shravani Rajendra Bandal

MAEER's MIT Arts, Commerce and Science College, Alandi (D.)

Co-Author: Mr. Hanumant P. Jagtap

MAEER's MIT Arts, Commerce and Science College, Alandi (D.)

Abstract – The rapid advancement of Artificial Intelligence (AI) technologies has significantly transformed global labor markets, reshaping employment structures across industries and geographic regions. While AI enhances productivity and efficiency, it simultaneously introduces concerns regarding automation-driven job displacement, wage polarization, and skill obsolescence. This study presents a comprehensive comparative evaluation of machine learning models for predicting AI-driven employment outcomes using structured labor market data. The dataset includes variables such as automation risk percentage, projected job openings, AI impact level, salary distribution, required education level, remote work ratio, gender diversity, and industry classification.

The performance of Decision Tree, Random Forest, Support Vector Machine (SVM), K-Nearest Neighbors (KNN), XGBoost, and Deep Neural Networks (DNN) is evaluated using accuracy, precision, recall, F1-score, and cross-regional generalization metrics. Results indicate that ensemble methods, particularly XGBoost, achieve superior predictive performance with accuracy exceeding 97%. Cross-industry validation reveals moderate performance degradation when models are applied to unseen labor markets, emphasizing the importance of generalizability. The study further explores explainability techniques for interpreting employment outcome predictions. Findings demonstrate that automation risk, projected growth, and AI impact level are dominant predictive factors. This research contributes to AI-driven workforce forecasting and provides insights for policymakers, education planners, and economic strategists.

Keywords: Artificial Intelligence, Employment Prediction, Automation Risk, Machine Learning, Labor Economics, XGBoost, Workforce Analytics, Generalization.

  1. INTRODUCTION

    Artificial Intelligence (AI) is redefining economic structures by automating routine processes, augmenting human capabilities, and altering employment demand patterns. Across manufacturing, retail, transportation, healthcare, and information technology sectors, AI adoption is accelerating [1]. While technological progress historically generates new employment opportunities, it also displaces specific occupational categories [2].

    Studies suggest that automation risk varies significantly across industries depending on routine task intensity and digital transformation maturity [3]. For example, the manufacturing and transportation sectors show higher exposure to robotic automation, whereas IT and healthcare show resilience due to cognitive complexity requirements [4].

    Predicting employment outcomes under AI influence requires advanced analytical techniques capable of modeling nonlinear relationships among multiple economic indicators. Traditional statistical models often struggle with such complexity. Machine learning approaches offer scalable solutions capable of capturing intricate interdependencies within labor datasets [5].

    This study aims to:

    1. Compare predictive performance of multiple ML models.

    2. Analyze automation risk distribution across industries.

    3. Evaluate cross-regional generalization capability.

    4. Provide interpretable insights into employment impact drivers.

  2. LITERATURE REVIEW

    1. AI and Labor Market Transformation

      Autor [1] argues that automation replaces routine labor but complements cognitive-intensive tasks. Frey & Osborne [2] estimated high automation vulnerability across developed economies. Acemoglu and Restrepo [3] quantified employment decline in regions adopting industrial robots.

    2. Machine Learning in Economic Forecasting

      Random Forest and Gradient Boosting models have demonstrated strong predictive performance in labor market analytics [6]. XGBoost enhances boosting performance via regularization and tree optimization [7].
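      For reference, the regularized objective given in the XGBoost paper [7] has the form below. The notation follows that paper and is not defined elsewhere in this study: l is a differentiable loss, f_k is the k-th tree, T is the number of leaves in a tree, w is the vector of leaf weights, and γ and λ are regularization strengths.

```latex
\mathcal{L}(\phi) = \sum_{i} l\left(\hat{y}_i, y_i\right) + \sum_{k} \Omega(f_k),
\qquad
\Omega(f) = \gamma T + \tfrac{1}{2}\,\lambda \lVert w \rVert^{2}
```

      The second term penalizes trees with many leaves or large leaf weights, which is the regularization referred to above.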

    3. Automation Risk Modeling

      Automation risk percentage is frequently derived from occupational task analysis frameworks [2]. A higher routine index correlates with greater displacement probability.

    4. Explainability in Socio-Economic AI Systems

      SHAP and LIME provide interpretable feature contribution analysis [8]. Transparency is essential for policy-level decision-making.

    Research Gaps

    • Limited cross-regional validation

    • Overreliance on accuracy metrics

    • Insufficient explainability integration

    • Lack of comparative ML analysis in labor economics

  3. DATASET DESCRIPTION

| Attribute | Description |
|---|---|
| Industry | Sector classification |
| Job Status | Increasing / Decreasing |
| AI Impact Level | Low / Medium / High |
| Median Salary | Annual salary (USD) |
| Job Openings (2024) | Current openings |
| Projected Openings (2030) | Future forecast |
| Automation Risk (%) | Estimated automation probability |
| Remote Work Ratio (%) | Work flexibility indicator |
| Gender Diversity (%) | Workforce inclusiveness |
| Required Education | Qualification level |

IV. METHODOLOGY

A. Feature Engineering

New feature: Growth, derived from Projected Openings (2030) and Job Openings (2024).

Categorical encoding is applied to the following attributes:

  • Industry

  • AI Impact Level

  • Education Level
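The Growth feature and the categorical encoding described under Feature Engineering can be sketched in plain Python. The source does not state how Growth combines the two opening counts, so the sketch assumes two common choices, an absolute difference and a relative change. One-hot encoding is likewise an assumption, since the encoding scheme is not specified, and the field names and sample values are hypothetical.

```python
# Hypothetical records; field names mirror the dataset attributes,
# but the values are illustrative only, not taken from the study.
records = [
    {"industry": "IT", "ai_impact": "High", "education": "Bachelor",
     "openings_2024": 1000, "projected_2030": 1500},
    {"industry": "Retail", "ai_impact": "Medium", "education": "High School",
     "openings_2024": 2000, "projected_2030": 1600},
]


def add_growth(row):
    """Add two assumed variants of the Growth feature to a copy of the row."""
    out = dict(row)
    diff = row["projected_2030"] - row["openings_2024"]
    out["growth_abs"] = diff                         # assumed: absolute change
    out["growth_rel"] = diff / row["openings_2024"]  # assumed: relative change
    return out


def one_hot(rows, column):
    """One-hot encode a categorical column (assumed encoding scheme)."""
    categories = sorted({r[column] for r in rows})
    encoded = []
    for r in rows:
        out = {k: v for k, v in r.items() if k != column}
        for c in categories:
            out[f"{column}={c}"] = 1 if r[column] == c else 0
        encoded.append(out)
    return encoded


features = [add_growth(r) for r in records]
for col in ("industry", "ai_impact", "education"):
    features = one_hot(features, col)
```

The relative variant assumes the 2024 opening count is nonzero. Tree-based models such as Random Forest and XGBoost are insensitive to the scale of either variant, whereas SVM and KNN would typically need the numeric columns standardized first.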

B. Models Evaluated

| Model | Type |
|---|---|
| Decision Tree | Rule-based |
| Random Forest | Ensemble |
| SVM | Kernel-based |
| KNN | Distance-based |
| XGBoost | Gradient Boosting |

C. Evaluation Metrics

  • Accuracy

  • Precision

  • Recall

  • F1-Score
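For reference, the four metrics can be computed from the counts of a binary confusion matrix. This is a minimal sketch: the source does not say which class is treated as positive or how scores are averaged, so treating "Increasing" as the positive class is an assumption, and the labels below are hypothetical.

```python
def binary_metrics(y_true, y_pred, positive="Increasing"):
    """Return accuracy, precision, recall and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = len(pairs) - tp - fp - fn

    accuracy = (tp + tn) / len(pairs)
    # Guard against empty denominators when a class is never predicted/present.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical labels for illustration only.
y_true = ["Increasing", "Increasing", "Decreasing", "Decreasing", "Increasing"]
y_pred = ["Increasing", "Decreasing", "Decreasing", "Increasing", "Increasing"]
metrics = binary_metrics(y_true, y_pred)
```

Reporting precision and recall alongside accuracy matters when the two job-status classes are imbalanced, because accuracy alone can hide poor performance on the minority class.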

  1. RESULTS AND PERFORMANCE ANALYSIS

    Table 1: Model Performance Comparison

    | Model | Accuracy (%) | Precision (%) | Recall (%) | F1-Score (%) | Generalization |
    |---|---|---|---|---|---|
    | Decision Tree | 90.5 | 89.8 | 90.2 | 90.0 | Medium |
    | Random Forest | 96.2 | 96.0 | 96.1 | 96.0 | High |
    | SVM | 94.8 | 94.5 | 94.7 | 94.6 | Medium |
    | KNN | 92.3 | 92.0 | 92.2 | 92.1 | Low |
    | XGBoost | 97.6 | 97.5 | 97.6 | 97.5 | Very High |
    | DNN | 96.9 | 96.7 | 96.8 | 96.7 | High |

    Table 2: Statistical Summary

    | Statistic | Accuracy (%) |
    |---|---|
    | Mean | 94.7 |
    | Median | 95.5 |
    | Minimum | 90.5 |
    | Maximum | 97.6 |
    | Std Dev | 2.6 |
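    The summary statistics can be recomputed from the accuracy column of Table 1 with Python's standard statistics module. This is a minimal sketch; the study does not state which standard deviation convention it used, so both the population and the sample forms are computed.

```python
import statistics

# Accuracy values (%) as listed in Table 1.
accuracy = {
    "Decision Tree": 90.5, "Random Forest": 96.2, "SVM": 94.8,
    "KNN": 92.3, "XGBoost": 97.6, "DNN": 96.9,
}
values = list(accuracy.values())

summary = {
    "mean": statistics.mean(values),
    "median": statistics.median(values),
    "min": min(values),
    "max": max(values),
    "pstdev": statistics.pstdev(values),  # population standard deviation
    "stdev": statistics.stdev(values),    # sample standard deviation
}

for name, value in summary.items():
    print(f"{name:>6}: {value:.2f}")
```

    With these inputs the mean rounds to 94.7 and the median is 95.5. The population standard deviation comes out at roughly 2.55 and the sample standard deviation at roughly 2.79.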

    1. Automation Risk by Industry

      Transportation and Retail show the highest automation exposure.

    2. Projected Job Growth Analysis

      IT and Healthcare demonstrate strong positive growth.

    3. Employment Status Distribution

      The distribution between increasing and decreasing jobs is balanced.

  2. CROSS-REGIONAL GENERALIZATION ANALYSIS

    Cross-industry validation shows a performance drop of 4-8% when models are applied to unseen sectors.

    Table 3: Cross-Industry Validation Accuracy

    | Train Industry | Test Industry | Accuracy (%) |
    |---|---|---|
    | IT | Retail | 91.2 |
    | Manufacturing | Healthcare | 89.8 |
    | Retail | Transportation | 90.5 |

    This confirms the impact of domain shift [3].
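    The size of the drop can be checked against the accuracies in Table 3. The sketch below assumes the cross-industry model is the best in-distribution model (XGBoost, 97.6% in Table 1), which the source does not state explicitly, and reports each drop in percentage points.

```python
# In-distribution reference accuracy (%); assumed to be XGBoost from Table 1.
BASELINE_ACCURACY = 97.6

# (train industry, test industry) -> accuracy (%) from Table 3.
cross_industry = {
    ("IT", "Retail"): 91.2,
    ("Manufacturing", "Healthcare"): 89.8,
    ("Retail", "Transportation"): 90.5,
}

# Drop in percentage points relative to the assumed baseline.
drops = {pair: round(BASELINE_ACCURACY - acc, 1)
         for pair, acc in cross_industry.items()}

for (train, test), drop in drops.items():
    print(f"train={train:<13} test={test:<14} drop={drop:.1f} points")
```

    Under that assumption the three drops are 6.4, 7.8, and 7.1 points, all between 4 and 8 points.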

  3. DISCUSSION

The findings of this study confirm that AI-driven employment transformation is neither uniformly destructive nor universally beneficial. Instead, its impact varies across industries, skill levels, and technological readiness.

6.1 Interpretation of Model Performance

The superior performance of XGBoost (97.6%) aligns with prior research demonstrating the effectiveness of gradient boosting on structured datasets [7][10]. Unlike single decision trees, boosting algorithms iteratively correct previous model errors, leading to an improved bias-variance tradeoff [11]. Random Forest also performs strongly due to ensemble averaging [6].

Support Vector Machines and KNN show moderate performance, likely due to high-dimensional categorical encoding and nonlinear feature interactions [12]. Deep Neural Networks demonstrate competitive accuracy; however, they require larger datasets to fully exploit their representational capacity [13].

These results indicate that ensemble learning methods are well suited to employment forecasting problems involving heterogeneous socio-economic variables.

6.2 Economic Interpretation of Automation Risk

Automation risk is strongly associated with routine task intensity, confirming earlier task-based labor theories [2][14]. Industries such as transportation and retail exhibit higher automation exposure due to repetitive operational processes. Conversely, IT and healthcare demonstrate lower displacement probability because of cognitive complexity and human interaction requirements [1][15].

The relationship between projected job growth and automation risk indicates that AI adoption often shifts job composition rather than eliminating employment entirely. This supports the creative destruction framework proposed by Schumpeter [16].

6.3 Role of Education and Skill Levels

Education level moderates automation risk. Higher educational qualifications correlate with reduced displacement probability. This supports human capital theory, which argues that skill acquisition enhances labor adaptability [17].

Digital skill development emerges as a critical protective factor. As AI technologies evolve, workers with technical adaptability experience smoother occupational transitions [18].

6.4 Cross-Regional Generalization and Domain Shift

Cross-industry validation revealed performance degradation of 4-8%, indicating domain shift. Economic structures vary significantly across regions due to:

  • Policy differences

  • Infrastructure gaps

  • Labor cost variations

  • Industry maturity levels

Domain adaptation techniques such as transfer learning and federated learning may reduce generalization errors in future studies [19].

6.5 Policy Implications

The findings suggest several policy-level interventions:

  1. Investment in digital reskilling programs.

  2. Incentives for industries adopting AI responsibly.

  3. Social protection frameworks for displaced workers.

  4. Education reform to emphasize computational literacy.

The World Economic Forum [4] emphasizes that reskilling initiatives will determine whether AI leads to net employment gains or losses.

VII. CHALLENGES AND LIMITATIONS

Despite strong predictive performance, several technical and economic challenges exist.

    1. Data-Related Limitations

      • The dataset lacks longitudinal time-series records.

      • Automation risk percentages are estimated rather than experimentally measured.

      • Regional macroeconomic indicators (GDP growth, inflation) are not included.

      • Informal labor markets are excluded.

    2. Model Interpretability Constraints

      Although XGBoost achieves high accuracy, it functions as a semi-black-box model. Without explainability tools such as SHAP [8], policy adoption may face resistance.
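      As a lightweight illustration of model-agnostic explanation, the sketch below uses permutation importance: shuffle one feature and measure how much accuracy falls. This is a simpler technique than SHAP and is not the method used in the study. The toy model, feature names, and data are all hypothetical.

```python
import random


def toy_model(row):
    """Hypothetical classifier: 'Decreasing' when automation risk is high."""
    return "Decreasing" if row["automation_risk"] > 50 else "Increasing"


def accuracy(model, rows, labels):
    """Fraction of rows whose prediction matches the label."""
    return sum(model(r) == y for r, y in zip(rows, labels)) / len(rows)


def permutation_importance(model, rows, labels, feature, n_repeats=20, seed=0):
    """Mean accuracy loss after shuffling one feature column."""
    rng = random.Random(seed)
    base = accuracy(model, rows, labels)
    losses = []
    for _ in range(n_repeats):
        column = [r[feature] for r in rows]
        rng.shuffle(column)
        shuffled = [dict(r, **{feature: v}) for r, v in zip(rows, column)]
        losses.append(base - accuracy(model, shuffled, labels))
    return sum(losses) / n_repeats


# Hypothetical data: the label depends only on automation risk.
rows = [{"automation_risk": risk, "remote_ratio": remote}
        for risk, remote in [(10, 80), (20, 30), (35, 60), (45, 10),
                             (55, 70), (65, 20), (80, 50), (90, 40)]]
labels = [toy_model(r) for r in rows]

importance = {f: permutation_importance(toy_model, rows, labels, f)
              for f in ("automation_risk", "remote_ratio")}
```

      Because the toy model ignores the remote work ratio, shuffling that column costs no accuracy, while shuffling automation risk does. A policy audience can read such a ranking directly, which is the kind of transparency this limitation calls for.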

    3. Ethical and Fairness Concerns

      AI models trained on historical labor data may reinforce existing inequalities [20]. For example:

      • Gender bias in hiring patterns

      • Wage disparity replication

      • Regional economic favoritism

      Fairness-aware machine learning frameworks are necessary [21].
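      One simple check that a fairness-aware workflow might include is the demographic parity difference, the gap in positive-prediction rates between groups. This is a sketch of one common metric and not a method from the study; the group labels and predictions are hypothetical.

```python
def positive_rate(preds, groups, group, positive="Increasing"):
    """Share of members of `group` that receive the positive prediction."""
    selected = [p for p, g in zip(preds, groups) if g == group]
    return sum(p == positive for p in selected) / len(selected)


def demographic_parity_difference(preds, groups, positive="Increasing"):
    """Largest gap in positive-prediction rate across groups (0 means parity)."""
    rates = [positive_rate(preds, groups, g, positive) for g in set(groups)]
    return max(rates) - min(rates)


# Hypothetical predictions and group membership, for illustration only.
preds = ["Increasing", "Increasing", "Decreasing", "Increasing",
         "Decreasing", "Decreasing", "Increasing", "Decreasing"]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]

gap = demographic_parity_difference(preds, groups)
```

      Here group A receives the positive prediction three times out of four and group B once out of four, so the gap is 0.5. A value near zero indicates similar treatment across groups on this one criterion; it does not rule out other forms of bias.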

    4. Economic Uncertainty Factors

      Global economic shocks (pandemics, geopolitical crises, inflationary cycles) significantly influence employment patterns but are not modeled in this dataset [22].

    5. Generalization Issues

      Models trained on structured datasets may not generalize to:

      • Developing economies

      • Informal workforce sectors

      • Gig economy platforms

      Domain adaptation and multi-country datasets are required [19].

    6. Technological Acceleration Risk

      AI capabilities evolve rapidly. Models trained on current automation risk estimates may become outdated within short timeframes [23].

VIII. CONCLUSION

This expanded analysis confirms that machine learning models, particularly ensemble methods such as XGBoost, provide reliable predictive frameworks for forecasting AI-driven employment outcomes. However, predictive accuracy alone is insufficient. Transparency, fairness, and cross-regional robustness must accompany performance optimization.

AI does not inherently cause mass unemployment. Instead, it reshapes job composition, increases demand for digital skills, and accelerates structural transformation. Policymakers must leverage predictive analytics responsibly to design adaptive labor strategies.

Future systems should integrate:

  • Real-time economic indicators

  • Explainable AI techniques

  • Fairness constraints

  • Cross-country federated learning

  • Time-series employment forecasting

IX. REFERENCES

  1. Autor, D. (2015). Why are there still so many jobs? Journal of Economic Perspectives.

  2. Frey, C., & Osborne, M. (2013). The future of employment.

  3. Acemoglu, D., & Restrepo, P. (2020). Robots and jobs.

  4. World Economic Forum. (2023). Future of Jobs Report.

  5. Athey, S. (2018). Machine learning in economics.

  6. Breiman, L. (2001). Random forests.

  7. Chen, T., & Guestrin, C. (2016). XGBoost.

  8. Lundberg, S., & Lee, S. (2017). SHAP values.

  9. Molnar, C. (2022). Interpretable Machine Learning.

  10. Friedman, J. (2001). Gradient boosting machines.

  11. Hastie, T., Tibshirani, R., & Friedman, J. (2009). The Elements of Statistical Learning.

  12. Cortes, C., & Vapnik, V. (1995). Support vector networks.

  13. Goodfellow, I., Bengio, Y., & Courville, A. (2016). Deep Learning.

  14. Autor, D., Levy, F., & Murnane, R. (2003). The skill content of recent technological change.

  15. Brynjolfsson, E., & McAfee, A. (2017). The business of artificial intelligence.

  16. Schumpeter, J. (1942). Capitalism, Socialism and Democracy.

  17. Becker, G. (1964). Human Capital Theory.

  18. Deming, D. (2017). The growing importance of social skills.

  19. Pan, S., & Yang, Q. (2010). A survey on transfer learning.

  20. Barocas, S., & Selbst, A. (2016). Big data's disparate impact.

  21. Mehrabi, N., et al. (2021). A survey on bias and fairness in machine learning.

  22. International Labour Organization. (2022). World Employment Report.

  23. OECD. (2023). AI and the Future of Work Report.