SmartConnect - AI powered LinkedIn Outreach Automation

Prof Sowmya M; Anagha M K; Keerthana M

doi:10.17577/IJERTCONV14IS060052

ACSCON - 2026 (Volume 14 - Issue 06)


Open Access
Authors : Prof Sowmya M, Anagha M K, Keerthana M
Paper ID : IJERTCONV14IS060052
Volume & Issue : Volume 14, Issue 06, ACSCON – 2026
Published (First Online) : 15-06-2026
ISSN (Online) : 2278-0181
Publisher Name : IJERT
License: This work is licensed under a Creative Commons Attribution 4.0 International License


Prof Sowmya M

Dept of Artificial Intelligence and Data Science Nitte Meenakshi Institute of Technology Bangalore, India

sowmya.m@nmit.ac.in

Anagha K M

Dept of Artificial Intelligence and Data Science Nitte Meenakshi Institute of Technology Bangalore, India

kmanaghajois@gmail.com

Keerthana M

Dept of Artificial Intelligence and Data Science Nitte Meenakshi Institute of Technology Bangalore, India

km226077@gmail.com

Abstract

In modern recruitment systems, identifying relevant candidates and engaging them through personalized communication remains a time-consuming and inefficient process. Existing approaches primarily rely on keyword-based matching or template-driven outreach, limiting both accuracy and adaptability. This paper presents SmartConnect, a hybrid artificial intelligence framework that integrates semantic candidate ranking with context-aware outreach generation. The proposed system introduces a multi-factor scoring mechanism that combines skill similarity, role alignment, and experience relevance to improve candidate matching. In addition, a structured prompt-based approach is used to guide large language models for generating personalized and professional outreach messages. The framework unifies candidate retrieval and communication into a single pipeline, improving both relevance and engagement. Experimental evaluation on a dataset of simulated professional profiles demonstrates that the proposed method provides more accurate recommendations and enhanced message personalization compared to traditional approaches. The results highlight the potential of combining natural language processing and generative artificial intelligence for scalable recruitment automation.

Keywords: AI recruitment, semantic similarity, natural language processing, recommender systems, outreach automation, large language models

INTRODUCTION

The rapid growth of professional networking platforms has significantly increased the availability of candidate data for recruitment processes. Platforms such as LinkedIn provide access to a vast pool of professionals with diverse skills, experiences, and career trajectories. However, identifying relevant candidates and initiating personalized communication remains a time-consuming and inefficient task for recruiters. Manual screening of profiles and crafting individualized outreach messages often require substantial effort, limiting scalability and productivity in modern hiring workflows.

Traditional recruitment systems primarily rely on keyword-based filtering and rule-based approaches, which fail to capture the contextual meaning of candidate profiles. Such systems often overlook suitable candidates due to variations in terminology, leading to reduced matching accuracy. Recent advancements in Natural Language Processing (NLP) and Machine Learning (ML) have enabled the development of semantic similarity models that better understand contextual relationships between skills, roles, and professional descriptions. In parallel, the emergence of Large Language Models (LLMs) has made it possible to generate human-like, context-aware text, opening new opportunities for automating recruiter communication.

Despite these advancements, most existing systems treat candidate matching and outreach generation as independent tasks. This separation limits the effectiveness of recruitment automation, as the communication process is not directly informed by the relevance of candidate profiles. Furthermore, many outreach systems rely on static templates, which lack personalization and fail to engage candidates effectively.

To address these challenges, this paper proposes SmartConnect, a hybrid artificial intelligence framework that integrates semantic candidate ranking with context-aware outreach generation. The proposed system introduces a multi-factor scoring mechanism that combines skill similarity, role alignment, and experience relevance to improve candidate selection. Additionally, a structured prompt-based approach is used to guide LLM-based message generation, ensuring personalized and professional communication.

The key contributions of this work are as follows:
- Development of a hybrid multi-factor ranking model for improved candidate matching
- Integration of candidate retrieval and outreach generation into a unified pipeline
- Design of a structured prompt mechanism for controlled and personalized message generation
- Experimental evaluation demonstrating improved relevance and engagement compared to traditional approaches
The proposed framework aims to enhance recruitment efficiency by automating both candidate discovery and communication while maintaining contextual relevance and personalization.
RELATED WORK

The application of Artificial Intelligence (AI) in recruitment and professional networking has gained significant attention in recent years. Various approaches have been proposed to automate candidate matching, improve recommendation systems, and enhance communication using Natural Language Processing (NLP) and Machine Learning (ML) techniques.

Early recruitment systems primarily relied on keyword-based matching and rule-based filtering methods. While these approaches are computationally simple, they often fail to capture the semantic relationships between candidate skills and job requirements, leading to inaccurate or incomplete candidate selection. To overcome these limitations, recent studies have employed semantic similarity models, particularly transformer-based architectures such as BERT and Sentence-BERT, which generate contextual embeddings for improved matching accuracy [1], [2].

In addition to candidate matching, research has also focused on automated communication systems. Pre-trained language models such as GPT, Flan-T5, and DistilGPT-2 have been widely used for generating human-like text in applications such as chatbots, email automation, and customer engagement [3], [4]. These models have demonstrated strong capabilities in producing context-aware and coherent responses, making them suitable for outreach automation in recruitment and marketing domains.

Several works have explored AI-driven recruitment platforms that integrate recommendation systems with candidate profiling [6]. These systems leverage structured and unstructured data to improve candidate-job alignment. However, many of these approaches depend on historical hiring data or proprietary datasets, limiting their applicability in open or generalized environments. Furthermore, most systems focus primarily on candidate matching and do not incorporate personalized communication as part of the pipeline.

Recent advancements in business automation and customer relationship management (CRM) have introduced AI-based tools for lead generation and outreach optimization [9]. While these systems provide automation capabilities, they often rely on static templates or lack adaptability to user-specific contexts, reducing personalization effectiveness.

Despite the progress in both semantic matching and text generation, a key limitation in existing research is the lack of integration between candidate ranking and outreach generation. Most systems treat these components independently, resulting in suboptimal recruitment workflows. Additionally, the use of single-metric similarity measures fails to capture the multi-dimensional nature of candidate relevance.

To address these gaps, the proposed work introduces a unified framework that combines hybrid semantic ranking with context-aware message generation. By integrating multi-factor similarity scoring with structured LLM-based personalization, the system aims to improve both candidate selection accuracy and engagement quality in recruitment automation.
METHODOLOGY

The proposed SmartConnect framework is designed as a two-stage pipeline that integrates semantic candidate ranking with context-aware outreach generation. The methodology consists of multiple components, including data preprocessing, feature extraction, hybrid ranking, and personalized message generation.
1. System Overview
  
  The overall system architecture follows a sequential workflow:
  1. Data preprocessing and text normalization
  2. Semantic embedding generation
  3. Hybrid candidate ranking using multi-factor scoring
  4. Context-aware outreach message generation using LLMs
    
    This structured pipeline ensures that both candidate selection and communication are optimized within a unified framework.
2. Data Preprocessing and Text Normalization
  
  The input dataset consists of simulated LinkedIn profiles containing unstructured textual information such as skills, job roles, experience, and professional summaries. To ensure consistency and improve model performance, a comprehensive preprocessing pipeline was implemented.
  
  Initially, data cleaning was performed to remove duplicate records, handle missing values, and eliminate noisy or irrelevant text. All textual data were converted to lowercase to maintain uniformity. Special characters, punctuation, hyperlinks, and extra whitespace were removed as part of normalization.
  
  Next, tokenization was applied to split textual content into smaller meaningful units (tokens). This process was carried out using the spaCy NLP library, which provides context-aware token segmentation. Tokenization enables efficient feature extraction and downstream processing.
  
  Following tokenization, lemmatization was performed to reduce words to their base forms while preserving semantic meaning. For example, variations such as "learning", "learned", and "learns" were mapped to the root word "learn". Lemmatization improves consistency in text representation and enhances similarity computation.
  
  Additionally, stopword removal was applied to eliminate commonly used words that do not contribute to semantic meaning. Named Entity Recognition (NER) was also employed to extract key entities such as job roles, organizations, and technical skills, enabling structured representation of profile data.
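The cleaning and normalization steps described above can be sketched with the Python standard library alone. This is a simplified illustration, not the authors' pipeline: the paper uses spaCy for tokenization, lemmatization, and NER, whereas the sketch below uses whitespace tokenization and a tiny hypothetical stopword list, and it omits lemmatization and NER entirely. The function names and the `summary` field are assumptions made for the example.

```python
import re

# Hypothetical, deliberately tiny stopword list; spaCy ships a much larger one.
STOPWORDS = {"a", "an", "and", "the", "in", "of", "with", "for", "to", "at", "is"}

URL_RE = re.compile(r"https?://\S+|www\.\S+")
# Keep '+' and '#' so that skill names such as "c++" and "c#" survive cleaning.
NON_ALNUM_RE = re.compile(r"[^a-z0-9+#\s]")


def normalize(text: str) -> str:
    """Lowercase, drop hyperlinks and punctuation, collapse whitespace."""
    text = text.lower()
    text = URL_RE.sub(" ", text)
    text = NON_ALNUM_RE.sub(" ", text)
    return " ".join(text.split())


def tokenize(text: str) -> list:
    """Whitespace tokenization with stopword removal (stand-in for spaCy)."""
    return [tok for tok in normalize(text).split() if tok not in STOPWORDS]


def deduplicate(profiles: list) -> list:
    """Drop records with an empty summary and duplicates after normalization."""
    seen, kept = set(), []
    for profile in profiles:
        key = normalize(profile.get("summary") or "")
        if key and key not in seen:
            seen.add(key)
            kept.append(profile)
    return kept
```

Keeping `+` and `#` during punctuation removal is a design choice for this sketch: stripping all special characters, as the text describes, would otherwise turn distinct skills such as "C++" and "C#" into the same token.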
3. Semantic Embedding Generation
  
  After preprocessing, the textual data were converted into numerical representations using Sentence Transformer models. These models generate dense vector embeddings that capture contextual relationships between words and phrases.
  
  Each candidate profile was represented as a high-dimensional embedding vector derived from attributes such as skills, experience, and professional summaries. These embeddings enable semantic comparison between profiles beyond simple keyword matching.
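As a rough illustration of this step, the sketch below joins the profile attributes into one string and encodes it with the Sentence Transformers library. The paper does not name the checkpoint it used, so `all-MiniLM-L6-v2` is an assumption, as are the field names `skills`, `experience`, and `summary` and both function names.

```python
def profile_to_text(profile: dict) -> str:
    """Join skills, experience, and summary into one string for encoding."""
    parts = [
        ", ".join(profile.get("skills", [])),
        profile.get("experience", ""),
        profile.get("summary", ""),
    ]
    return ". ".join(part.strip() for part in parts if part and part.strip())


def embed_profiles(profiles: list, model_name: str = "all-MiniLM-L6-v2"):
    """Return one L2-normalized embedding per profile as a NumPy array."""
    # Imported lazily so that profile_to_text works without the library installed.
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer(model_name)  # checkpoint name is an assumption
    texts = [profile_to_text(p) for p in profiles]
    # With normalized embeddings, a dot product equals cosine similarity.
    return model.encode(texts, normalize_embeddings=True)
```

Calling `embed_profiles(profiles)` downloads the checkpoint on first use, so it needs network access and the `sentence-transformers` package.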
4. Hybrid Candidate Ranking Model
  
  To improve matching accuracy, a multi-factor hybrid scoring mechanism was introduced instead of relying solely on cosine similarity. The final relevance score for each candidate is computed as a weighted combination of multiple similarity measures:
  - Skill similarity: Measures overlap and semantic closeness of skill sets
  - Role similarity: Captures alignment between job titles and responsibilities
  - Experience similarity: Evaluates similarity in professional background
  The overall ranking score is computed as a weighted sum of these factors, where the weights can be adjusted based on recruiter preferences. This approach enables more flexible and accurate candidate selection by capturing multiple dimensions of relevance.
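The weighted sum can be written as a short function. The paper does not report the weights it used, so the default values below are purely illustrative, and the names `FactorScores`, `hybrid_score`, and `rank_candidates` are hypothetical. The sketch assumes each per-factor similarity has already been computed and lies in [0, 1].

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FactorScores:
    """Per-factor similarities for one candidate, each assumed to lie in [0, 1]."""
    skill: float
    role: float
    experience: float


def hybrid_score(scores: FactorScores, weights=(0.5, 0.3, 0.2)) -> float:
    """Weighted sum of the three factors; weights are normalized to sum to 1."""
    w_skill, w_role, w_exp = weights
    total = w_skill + w_role + w_exp
    if total <= 0 or min(weights) < 0:
        raise ValueError("weights must be non-negative with a positive sum")
    weighted = w_skill * scores.skill + w_role * scores.role + w_exp * scores.experience
    return weighted / total


def rank_candidates(candidates: dict, weights=(0.5, 0.3, 0.2)) -> list:
    """Return (name, score) pairs sorted from most to least relevant."""
    scored = [(name, hybrid_score(fs, weights)) for name, fs in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

Changing the weights changes the ordering: a recruiter who cares only about skills could pass `weights=(1, 0, 0)`, which reduces the hybrid score to the skill similarity alone.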
5. Similarity Computation
  
  Cosine similarity was used to measure the closeness between embedding vectors. Profiles with higher similarity scores were ranked as more relevant candidates. This semantic approach ensures that candidates with similar expertise are identified even if different terminology is used.
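For reference, cosine similarity between two embedding vectors is their dot product divided by the product of their norms. A minimal pure-Python sketch follows; the helper names and the dictionary layout mapping profile ids to vectors are assumptions made for the example.

```python
import math


def cosine_similarity(u, v) -> float:
    """cos(u, v) = (u . v) / (|u| |v|); returns 0.0 if either vector is all zeros."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def top_k(query, embeddings: dict, k: int = 5) -> list:
    """Rank profile ids by cosine similarity to the query vector, best first."""
    scored = [(pid, cosine_similarity(query, vec)) for pid, vec in embeddings.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]
```

Because the measure depends only on the angle between vectors, a profile whose embedding points in the same direction as the query scores 1.0 regardless of vector length.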
6. Context-Aware Message Generation
  
  To automate recruiter communication, a Large Language Model (LLM) was integrated for generating personalized outreach messages. A structured prompt template was designed to guide the generation process by incorporating candidate-specific attributes such as name, role, company, and key skills.
  
  The prompt engineering approach ensures that generated messages are:
  - Contextually relevant
  - Professionally structured
  - Personalized to each candidate
  In addition, fallback template-based messages were used in scenarios where LLM access was unavailable, ensuring system reliability.
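One way to picture the structured prompt and its fallback is the sketch below. The paper does not reproduce its actual template, so the template wording, the 80-word limit, and the function names are all hypothetical. The `llm` argument stands for any callable that takes a prompt string and returns text; no specific model API is assumed.

```python
PROMPT_TEMPLATE = (
    "You are a recruiter writing a short, professional LinkedIn message.\n"
    "Candidate name: {name}\n"
    "Current role: {role} at {company}\n"
    "Key skills: {skills}\n"
    "Write at most 80 words, mention one skill, and end with a clear next step."
)

FALLBACK_TEMPLATE = (
    "Hi {name}, I came across your work as {role} at {company} and was "
    "impressed by your experience with {skills}. Would you be open to a short chat?"
)


def _fields(candidate: dict) -> dict:
    """Flatten the candidate record into the four slots both templates use."""
    return {
        "name": candidate["name"],
        "role": candidate["role"],
        "company": candidate["company"],
        "skills": ", ".join(candidate["skills"][:3]),  # cap at three skills
    }


def generate_message(candidate: dict, llm=None) -> str:
    """Call llm(prompt) if available; otherwise use the static template."""
    fields = _fields(candidate)
    if llm is not None:
        try:
            reply = llm(PROMPT_TEMPLATE.format(**fields))
            if reply and reply.strip():
                return reply.strip()
        except Exception:
            pass  # any LLM failure falls through to the template below
    return FALLBACK_TEMPLATE.format(**fields)
```

Routing both paths through the same `_fields` helper keeps the fallback message personalized with the same attributes the LLM prompt receives, so an outage degrades style rather than content.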
7. Ranking and Generation Integration
  
  Unlike traditional systems, the proposed framework integrates candidate ranking with message generation. The ranking outputs are used to condition the LLM prompts, ensuring that higher-ranked candidates receive more refined and context-aware communication.
  
  This coupling between ranking and generation improves both recommendation relevance and outreach effectiveness.
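The paper does not specify how ranking outputs enter the prompt. One plausible reading, sketched below under that assumption, is to append the candidate's rank and strongest matching factor to a base prompt so the generated message can lead with the most relevant attribute. The hint texts, function names, and factor keys are hypothetical.

```python
FOCUS_HINTS = {
    "skill": "Open by referring to the candidate's key skills.",
    "role": "Open by referring to the candidate's current role.",
    "experience": "Open by referring to the candidate's professional background.",
}


def ranking_context(rank: int, factor_scores: dict) -> str:
    """Turn ranking output into extra prompt lines for the message generator."""
    strongest = max(factor_scores, key=factor_scores.get)
    lines = [
        f"Candidate rank: {rank}",
        f"Strongest match factor: {strongest} ({factor_scores[strongest]:.2f})",
        FOCUS_HINTS[strongest],
    ]
    return "\n".join(lines)


def conditioned_prompts(ranked: list, base_prompt: str, k: int = 3) -> dict:
    """Build one prompt per top-k candidate; `ranked` is sorted best first."""
    prompts = {}
    for rank, (name, factor_scores) in enumerate(ranked[:k], start=1):
        prompts[name] = base_prompt + "\n" + ranking_context(rank, factor_scores)
    return prompts
```

Here `ranked` is a list of `(name, factor_scores)` pairs, where `factor_scores` maps the keys `skill`, `role`, and `experience` to similarity values.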
8. System Implementation
The system was implemented using Python-based libraries, including spaCy for NLP processing, Sentence Transformers for embedding generation, and transformer-based models for text generation. A lightweight interface was developed for demonstration and evaluation purposes.
EXPERIMENTAL SETUP

The experimental setup was designed to evaluate the effectiveness of the proposed SmartConnect framework in terms of candidate matching accuracy and outreach message personalization. The evaluation includes dataset preparation, baseline comparisons, and performance metrics.
1. Dataset Description
  
  The experiments were conducted on a dataset of simulated LinkedIn profiles due to privacy constraints associated with real-world data. The dataset initially contained 1,000 profiles, out of which 807 valid profiles were retained after preprocessing.
  
  Each profile includes the following attributes:
  - Name
  - Job designation and company
  - Skills and areas of expertise
  - Professional summary and experience details
  - Interests and sample posts
    
    The dataset represents diverse professional roles across multiple domains, enabling evaluation of semantic similarity and recommendation performance.
2. Baseline Methods
  
  To assess the effectiveness of the proposed hybrid ranking model, comparisons were made against the following baseline approaches:
  - Keyword-based matching: Profiles are matched based on direct keyword overlap
  - TF-IDF similarity: Textual similarity computed using term frequency-inverse document frequency
  - Cosine similarity (single metric): Semantic similarity computed using embeddings without hybrid scoring
    
    These baselines represent commonly used techniques in traditional recruitment systems.
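The first two baselines can be sketched in pure Python. The paper does not say how keyword overlap was scored or which TF-IDF variant was used, so the Jaccard measure and the smoothed idf below are assumptions chosen for illustration, and the function names are hypothetical.

```python
import math
from collections import Counter


def keyword_overlap(query_tokens, profile_tokens) -> float:
    """Jaccard overlap between two token collections (0.0 when both are empty)."""
    q, p = set(query_tokens), set(profile_tokens)
    union = q | p
    return len(q & p) / len(union) if union else 0.0


def tfidf_vectors(docs):
    """Return one {term: tf * idf} dict per tokenized document.

    Uses raw term counts and the smoothed idf  ln((1 + N) / (1 + df)) + 1.
    """
    n_docs = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    idf = {t: math.log((1 + n_docs) / (1 + df)) + 1.0 for t, df in doc_freq.items()}
    return [{t: count * idf[t] for t, count in Counter(doc).items()} for doc in docs]


def sparse_cosine(a: dict, b: dict) -> float:
    """Cosine similarity between two sparse term-weight dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0
```

Both baselines score two profiles as unrelated when they share no literal tokens, which is the limitation the embedding-based methods are meant to address.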
3. Evaluation Metrics
  
  The system was evaluated using both quantitative and qualitative metrics:
  - Precision@K: Measures the proportion of relevant candidates in the top K recommendations
  - Recall@K: Evaluates the ability to retrieve relevant candidates from the dataset
  - Response relevance score: A qualitative measure assessing the personalization and contextual accuracy of generated messages
  - Execution time: Measures the computational efficiency of embedding generation and recommendation retrieval
    
    These metrics provide a comprehensive assessment of both recommendation accuracy and system performance.
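Precision@K and Recall@K follow directly from their definitions once a set of relevant profiles is fixed for each query. The paper does not describe how relevance was judged, so the `relevant_ids` set in this sketch is assumed to come from some external labeling, and the ranked list is assumed to contain no duplicate ids.

```python
def precision_at_k(ranked_ids, relevant_ids, k: int) -> float:
    """Fraction of the top-k recommendations that are relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    hits = sum(1 for pid in ranked_ids[:k] if pid in relevant_ids)
    return hits / k


def recall_at_k(ranked_ids, relevant_ids, k: int) -> float:
    """Fraction of all relevant profiles that appear in the top-k."""
    if k <= 0:
        raise ValueError("k must be positive")
    if not relevant_ids:
        return 0.0
    hits = sum(1 for pid in ranked_ids[:k] if pid in relevant_ids)
    return hits / len(relevant_ids)


ranked = ["p1", "p7", "p3", "p9", "p2"]
relevant = {"p1", "p3", "p4"}
print(precision_at_k(ranked, relevant, 5))  # 0.4 (2 of the top 5 are relevant)
print(recall_at_k(ranked, relevant, 3))     # 2 of the 3 relevant profiles found
```

In the example, "p4" is relevant but never retrieved, so recall stays at two thirds no matter how large K grows within this list.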
4. Experimental Procedure
  
  The evaluation process was conducted in the following steps:
  1. Preprocessed profile data were converted into embeddings using Sentence Transformers
  2. Candidate similarity scores were computed using both baseline and proposed hybrid methods
  3. Top-K recommendations were generated for selected profiles and skill queries
  4. Personalized outreach messages were generated using the LLM-based module
  5. Results from the proposed method were compared with baseline approaches
5. Implementation Details
  
  The system was implemented using Python and relevant NLP libraries. Sentence Transformer models were used for embedding generation, while transformer-based language models were used for message generation.
  
  All experiments were conducted on a standard computing environment with CPU support. The average embedding generation time for the dataset was approximately 56 seconds, while recommendation retrieval was performed in under one second per query, demonstrating the efficiency of the proposed system.
6. Ablation Study
  
  To evaluate the contribution of individual components, an ablation study was performed by selectively removing key modules:
  - Without hybrid scoring (only cosine similarity)
  - Without LLM-based message generation
  - Without preprocessing enhancements
The performance degradation observed in each case highlights the importance of the proposed components in improving overall system effectiveness.
CONCLUSION AND FUTURE WORK

This paper presented SmartConnect, a hybrid artificial intelligence framework designed to automate candidate discovery and personalized outreach in recruitment systems. The proposed approach integrates semantic similarity-based candidate ranking with context-aware message generation using large language models. Unlike traditional systems that rely on keyword matching or static templates, the proposed framework introduces a multi-factor scoring mechanism that considers skill similarity, role alignment, and experience relevance, resulting in more accurate and meaningful candidate recommendations.

The experimental evaluation demonstrates that the hybrid ranking model outperforms conventional approaches such as keyword-based and single-metric similarity methods. In addition, the structured prompt-based message generation mechanism enhances personalization and produces contextually relevant communication, thereby improving engagement quality. The integration of ranking and generation into a unified pipeline further strengthens the effectiveness of the system by aligning candidate relevance with outreach content.

Overall, the proposed framework highlights the potential of combining Natural Language Processing and generative artificial intelligence to build scalable and intelligent recruitment solutions. The system not only reduces manual effort but also improves the efficiency and quality of recruiter-candidate interactions.

Despite its promising performance, certain limitations remain. The use of a simulated dataset may not fully capture the diversity and complexity of real-world professional profiles. Additionally, the evaluation of message quality is primarily qualitative and can be further strengthened using large-scale user feedback.

Future work will focus on enhancing the system in several directions. Integration with real-world professional networking platforms will enable real-time data processing and validation. The incorporation of adaptive learning techniques, such as reinforcement learning, can further optimize the ranking model based on user preferences and feedback. Expanding the system to support multilingual profiles will improve its applicability in global recruitment scenarios. Furthermore, incorporating ethical AI principles, including fairness, transparency, and bias mitigation, will be essential for responsible deployment in real-world applications.

In conclusion, SmartConnect provides a robust foundation for intelligent recruitment automation by bridging the gap between candidate selection and personalized communication through an integrated AI-driven framework.
ACKNOWLEDGEMENT

The authors would like to express their sincere gratitude to their academic mentors and faculty members for their continuous guidance, motivation, and technical insights throughout the development of this project. Their valuable feedback and constructive discussions significantly contributed to refining the system's architecture and research direction.

The authors also extend their appreciation to the Department of Artificial Intelligence and Data Science for providing the necessary infrastructure and resources that enabled the completion of this work. Finally, the authors acknowledge the use of open-source libraries and AI tools such as Hugging Face Transformers, SentenceTransformers, and Streamlit, which formed the foundation of the implementation. The inspiration for integrating OpenAI's GPT-based models into this project stems from ongoing innovations in AI.
REFERENCES

[1] J. Devlin, M.-W. Chang, K. Lee, and K. Toutanova, "BERT: Pre-training of deep bidirectional transformers for language understanding," Proc. NAACL-HLT, 2019.
[2] N. Reimers and I. Gurevych, "Sentence-BERT: Sentence embeddings using Siamese BERT-networks," Proc. EMNLP-IJCNLP, 2019.
[3] OpenAI, "Language models are few-shot learners," Advances in Neural Information Processing Systems (NeurIPS), 2020.
[4] S. Das, P. Roy, and D. Sarkar, "AI in recruitment: Applications, challenges, and opportunities," IEEE Access, vol. 9, pp. 134562–134575, 2021.
[5] Y. Kim, "Convolutional neural networks for sentence classification," Proc. EMNLP, 2014.
[6] A. M. Rahman, A. Hussain, and A. Alhajj, "An intelligent recruitment system using semantic matching," IEEE Transactions on Computational Social Systems, vol. 8, no. 4, pp. 934–947, 2021.
[7] Streamlit Inc., "Streamlit: The fastest way to build and share data apps," [Online]. Available: https://streamlit.io
[8] Hugging Face, "Transformers library," [Online]. Available: https://huggingface.co/transformers
[9] M. Qureshi, N. Rehman, and S. Khan, "AI-driven talent acquisition: A review of automation in recruitment," International Journal of Advanced Computer Science and Applications, vol. 12, no. 10, pp. 456–463, 2021.
[10] D. W. Hosmer and S. Lemeshow, Applied Logistic Regression, 3rd ed. New York, NY, USA: Wiley, 2013.
