DOI: 10.17577/IJERTCONV14IS010075 (Open Access)

- Authors: Vishakha Divakar Puthran, Rakshitha P
- Paper ID: IJERTCONV14IS010075
- Volume & Issue: Volume 14, Issue 01, Techprints 9.0
- Published (First Online): 01-03-2026
- ISSN (Online): 2278-0181
- Publisher Name: IJERT
- License: This work is licensed under a Creative Commons Attribution 4.0 International License
AI-Assisted Accessory Fitment and Styling: Integrating Convolutional Networks with Color Theory for Fashion Recommendation
Vishakha Divakar Puthran
Department of Computer Applications
St Joseph Engineering College, Mangaluru
Rakshitha P
Asst. Professor, Department of Computer Applications
St Joseph Engineering College, Mangaluru
Abstract—The rapid evolution of computer vision and artificial intelligence (AI) has transformed many markets, with the fashion industry becoming a major space for AI-driven personalization and enhanced digital buying experiences. In e-commerce settings, consumers often face uncertainty about products, receive non-personalized suggestions, and cannot view products on themselves prior to buying. These limitations tend to result in lower customer engagement, more product returns, and customer dissatisfaction. To address these problems, this paper presents the development and use of an AI-powered accessory positioning and recommendation system designed specifically for online fashion applications. The system involves a CNN-based facial landmark detection model trained to identify important facial landmarks such as the forehead, chin, nose bridge, and eyes, enabling precise, real-time virtual placement of fashion accessories on users. Using feature extraction to evaluate the visual patterns of user-uploaded outfit photos, an AI-powered accessory recommendation model supports virtual accessory try-on.
The system predicts appropriate accessories based on the user's preferences, the outfit, current fashion trends, and facial structure. Its multi-tiered architecture includes a PHP-managed backend for data storage and management, a Flask-based AI prediction service for virtual try-on, image processing, and accessory fitting, and a cross-platform user interface built with Flutter. The technology offers an efficient, scalable, and interactive solution aimed at deeper personalization in the digital fashion realm, using AI-based decision-making, image processing, and trend analysis. The findings indicate that AI can significantly enhance customer satisfaction, reduce product mismatches, and aid fashion retailers in trend analysis and dynamic stock management. This research also provides a platform for the future development of AI-based fashion systems, including the integration of virtual reality (VR) and more sophisticated facial recognition models.
Index Terms—Face detection, Accessory recommendation, Deep learning, CNN, Color theory, Image processing, Augmented reality, Personalized styling.
INTRODUCTION
The rapid advancement of computer vision and artificial intelligence has revolutionized the fashion sector, enabling more interactive and customized digital shopping experiences. Despite these developments, customers still face challenges such as uncertainty about product fit, poorly fitting recommendations, and limited ability to see accessories with their chosen clothes. These factors contribute to lower engagement and increased product return rates. This paper describes the design and implementation of an AI-powered intelligent accessory placement and personalized recommendation system for online fashion apps. The approach combines a convolutional neural network trained to identify and project important facial landmarks such as the eyes, nose bridge, forehead, and chin, allowing accurate virtual fitting of items like glasses, hats, and ties.
At the same time, a recommendation model extracts visual attributes from user-uploaded images to determine the most prominent clothing colors. Based on color theory, the system recommends accessories that complement both the user's facial features and clothing. The system employs a PHP-controlled backend for data security, a Flask AI service for prediction and computation, and a Flutter-based cross-platform UI for a smooth user experience.
The evaluation results confirm that the system improves personalization and customer confidence in product choice, and reduces mismatches in online purchases.
LITERATURE REVIEW
In recent years, innovation combining fashion and artificial intelligence (AI) has grown, and researchers have investigated how intelligent systems can enhance design efficiency, identification, and customer experience. A new face alignment approach in 3D-aligned space using self-attention mechanisms was introduced in [1], significantly improving the accuracy of facial landmark detection, particularly in difficult cases such as occlusion or extreme pose. This approach can further improve facial analysis software, which is increasingly used in virtual try-on and AR-based fashion experiences. With respect to clothing, a stand-alone way to characterize garments by semantic attributes such as sleeve type or color, using machine learning models that capture how clothing items influence one another, was proposed in [2]. This work provides a starting point for systems that identify and tag fashion objects in informal 2D images. A tiered fashion AI architecture of activities, ranging from low-level image parsing to high-level trend prediction and recommendation, was laid out in [3], demonstrating the field's increasing scope and usability. A method for detecting masked faces, especially relevant during the pandemic, was proposed in [6]; its comparison of traditional and deep learning techniques not only supports safety but also indirectly benefits fashion tech by enhancing detection of partially occluded faces. As a whole, these articles show how AI delivers the technical accuracy needed to recognize and categorize objects, and how it transforms the fashion industry to elevate creativity, personalization, and consumer interaction. AI in fashion design and the use of multimodal tools (those that work with images, sketches, or text) were further discussed in [4], demonstrating how such tools can change the designer's process and promote faster manufacturing and more personalized products. Additionally, [5] focused on social media, using deep learning to extract fashion intelligence from platforms like Instagram, Facebook, and TikTok, and demonstrating how trend forecasting and recommendation models can directly incorporate influencers and user-generated content.
METHODOLOGY
Within a single AI-powered fashion assistant system, this research combines two separate but related models:
- a Virtual Accessory Try-On Model that visualizes accessories on a user's face in real time, and
- an Accessory Recommendation Model that uses facial and clothing features to recommend suitable accessories and complementary colors.
System Architecture Overview
The architecture is multi-layered, integrating input acquisition, preprocessing, inference, and output rendering. The frontend is implemented in Flutter, providing a cross-platform experience. All prediction services are wrapped in backend Python modules exposed through REST APIs. This ensures scalability and maintainability, and allows the virtual try-on and recommendation pipelines to run independently.
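To make the service boundary concrete, the sketch below shows one way the backend could expose a prediction endpoint through Flask. It is a minimal illustration under assumptions, not the paper's exact API: the route name, payload format, and predict_accessory helper are all hypothetical.

```python
# Minimal Flask wrapper for a prediction service (illustrative sketch;
# the route, payload fields, and predict_accessory helper are hypothetical).
import base64

import cv2
import numpy as np
from flask import Flask, jsonify, request

app = Flask(__name__)

def predict_accessory(image):
    # Placeholder for the CNN inference described in the Methodology section.
    return {"accessory": "glasses", "confidence": 0.92}

@app.route("/recommend", methods=["POST"])
def recommend():
    # Decode a base64-encoded image sent by the Flutter frontend.
    payload = request.get_json()
    img_bytes = base64.b64decode(payload["image"])
    image = cv2.imdecode(np.frombuffer(img_bytes, np.uint8), cv2.IMREAD_COLOR)
    return jsonify(predict_accessory(image))

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5000)
```

Because each pipeline sits behind its own endpoint, the try-on and recommendation services can be scaled or updated independently.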
Virtual Accessory Try-On Model
The Virtual Accessory Try-On Module receives live webcam input, detects facial landmarks, dynamically scales and positions accessory overlays, and renders the augmented output.
Fig. 1. Flowchart of the virtual accessory placement process.
Input Acquisition
The system captures continuous video frames from the user's webcam. Each frame is flipped horizontally to produce a mirror-like effect, which improves intuitiveness (if the user moves left, the displayed image moves left). The frame is also converted from BGR to RGB color space because the MediaPipe Face Mesh model requires RGB input.
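A minimal OpenCV sketch of this acquisition loop is shown below; variable names and the quit key are illustrative.

```python
import cv2

cap = cv2.VideoCapture(0)                 # open the default webcam
while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    frame = cv2.flip(frame, 1)            # horizontal flip: mirror-like view
    rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)  # MediaPipe expects RGB
    # ... landmark detection and accessory overlay happen here ...
    cv2.imshow("Virtual Try-On", frame)
    if cv2.waitKey(1) & 0xFF == ord("q"):  # press q to quit
        break
cap.release()
cv2.destroyAllWindows()
```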
Facial Landmark Detection
Facial landmarks are identified using the MediaPipe Face Mesh model, which provides 468 reference points per frame. Specific landmarks are used to compute accessory positions and scales:
- Eyes (landmarks 33 and 263)
- Nose tip (landmark 1)
- Forehead (landmark 10)
- Chin (landmark 152)
The scaling factor d is computed as the Euclidean distance between the left and right eye landmarks:

$d = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2}$

where $(x_1, y_1)$ and $(x_2, y_2)$ denote the eye landmark coordinates.
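The sketch below illustrates how these landmarks and the scaling distance d can be obtained with MediaPipe Face Mesh; the helper function name is an assumption.

```python
import math

import mediapipe as mp

face_mesh = mp.solutions.face_mesh.FaceMesh(max_num_faces=1)

def eye_distance(rgb_frame):
    """Return the pixel distance d between outer eye landmarks 33 and 263."""
    h, w = rgb_frame.shape[:2]
    results = face_mesh.process(rgb_frame)
    if not results.multi_face_landmarks:
        return None                        # no face in this frame
    lm = results.multi_face_landmarks[0].landmark
    # Landmark coordinates are normalized to [0, 1]; convert to pixels.
    x1, y1 = lm[33].x * w, lm[33].y * h
    x2, y2 = lm[263].x * w, lm[263].y * h
    return math.hypot(x2 - x1, y2 - y1)    # Euclidean distance
```

The distance d is then used to scale accessory images so that, for example, the glasses width tracks the user's eye separation as they move toward or away from the camera.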
Accessory Overlay Logic
The digital accessories are stored as PNG images with an alpha transparency channel. Each frame, the accessory is resized proportionally and composited using per-pixel alpha blending:

$I_{\text{output}}(x, y) = \alpha(x, y)\, I_{\text{overlay}}(x, y) + \big(1 - \alpha(x, y)\big)\, I_{\text{background}}(x, y)$

where $I_{\text{output}}$ is the final displayed pixel, $\alpha(x, y) \in [0, 1]$ is the transparency at pixel $(x, y)$, $I_{\text{overlay}}$ is the accessory pixel color, and $I_{\text{background}}$ is the webcam frame pixel color.
This expression allows for smooth compositing of the accessory over the video stream.
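A vectorized NumPy sketch of this per-pixel blending is given below, assuming the accessory was loaded with its alpha channel intact (e.g., via cv2.IMREAD_UNCHANGED); the function name and placement convention are illustrative.

```python
import numpy as np

def overlay_accessory(background, accessory_rgba, x, y):
    """Composite a 4-channel accessory onto the frame at top-left (x, y)."""
    h, w = accessory_rgba.shape[:2]
    roi = background[y:y + h, x:x + w].astype(np.float32)
    color = accessory_rgba[..., :3].astype(np.float32)
    alpha = accessory_rgba[..., 3:4].astype(np.float32) / 255.0  # in [0, 1]
    # I_output = alpha * I_overlay + (1 - alpha) * I_background
    blended = alpha * color + (1.0 - alpha) * roi
    background[y:y + h, x:x + w] = blended.astype(np.uint8)
    return background
```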
Accessory Recommendation Model
The Accessory Recommendation Module handles both real-time camera capture and static image uploads, performs CNN-based accessory type prediction, and uses K-Means clustering to determine the dominant outfit color and formulate color compatibility suggestions. These workflows reflect the separation of concerns between the try-on and recommendation pipelines, which are merged into a single application for end-user engagement.
Fig.2. Flowchart of the accessory recommendation system.
1. Modes of Operation
The model has two input modes:
- Real-Time Camera Mode: captures a photo from the webcam on the user's command.
- Image Upload Mode: processes an existing user-uploaded image.
This dual functionality provides flexibility, supporting both live recommendations and processing of stored images.
2. Preprocessing
The input image is resized to a fixed resolution of 128×128 pixels, normalizing the input size for the neural network. Each pixel intensity is scaled to the [0, 1] interval by dividing by 255, which improves model convergence and prediction stability:

$X_{\text{norm}} = \dfrac{X}{255}$

where $X$ is the raw pixel matrix.
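A short sketch of this step (including the batch axis mentioned later under Dataset Preparation) might look as follows:

```python
import cv2
import numpy as np

def preprocess(image):
    """Resize to 128x128, scale pixels to [0, 1], and add a batch axis."""
    resized = cv2.resize(image, (128, 128))
    normalized = resized.astype(np.float32) / 255.0  # X_norm = X / 255
    return np.expand_dims(normalized, axis=0)        # shape (1, 128, 128, 3)
```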
3. CNN Accessory Classification
A Convolutional Neural Network implemented in TensorFlow Keras predicts the accessory class. The final dense layer uses a softmax activation, returning a probability distribution over all classes:

$P(y = k \mid x) = \dfrac{e^{z_k}}{\sum_{j=1}^{K} e^{z_j}}$

where $K$ is the number of accessory categories (e.g., glasses, tie, hat), $z_k$ is the logit for class $k$, and $P$ is the probability of each class. The predicted class is the one with the maximum probability.
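The paper does not publish the trained model file or label order, so both are assumptions in the sketch below; the argmax selection itself follows the description above.

```python
import numpy as np
from tensorflow.keras.models import load_model

CLASSES = ["glasses", "hat", "tie"]        # assumed label order
model = load_model("accessory_cnn.h5")     # hypothetical trained model file

def classify(batch):
    """Return the most probable accessory class and its softmax confidence."""
    probs = model.predict(batch)[0]        # softmax output, sums to 1
    idx = int(np.argmax(probs))            # index with maximum probability
    return CLASSES[idx], float(probs[idx])
```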
4. Dominant Color Extraction
To estimate the clothing color, a region beneath the detected face bounding box is cropped. K-Means clustering with $k = 3$ is used to segment the dominant color components by minimizing the within-cluster variance:

$\min \sum_{i=1}^{k} \sum_{x \in C_i} \lVert x - \mu_i \rVert^2$

where $C_i$ represents cluster $i$ and $\mu_i$ is its centroid. The RGB centroids are then mapped to pre-defined color labels through heuristic rules.
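A scikit-learn sketch of this clustering step is shown below; choosing the largest cluster as the dominant color is one reasonable reading of the method, not a detail stated in the paper.

```python
import numpy as np
from sklearn.cluster import KMeans

def dominant_color(clothing_region):
    """Cluster the region's pixels with k=3 and return the top centroid."""
    pixels = clothing_region.reshape(-1, 3).astype(np.float32)
    km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(pixels)
    counts = np.bincount(km.labels_)       # pixels assigned to each cluster
    return km.cluster_centers_[np.argmax(counts)]  # centroid of largest cluster
```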
5. Color Compatibility Mapping
Once the dominant color is identified, a rule-based mapping dictionary delivers suggested accessory colors. For instance, when the detected color is "Red," suggested accessories can be "Black" or "Gold." This mapping is displayed alongside the predicted accessory type and prediction confidence to guide users' style decisions.
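A sketch of such a dictionary is shown below. Only the Red to Black/Gold pair comes from the paper; the remaining entries and the fallback are illustrative assumptions.

```python
# Rule-based color compatibility table (only the "Red" entry is from the
# paper; the other entries and the fallback are illustrative placeholders).
COLOR_MAP = {
    "Red":   ["Black", "Gold"],
    "Black": ["Silver", "White"],
    "White": ["Black", "Blue"],
}

def suggest_colors(detected_color):
    return COLOR_MAP.get(detected_color, ["Black"])  # neutral fallback
```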
Dataset Preparation and Preprocessing
The system incorporates several preprocessing steps to ensure that all inputs are uniformly formatted, properly normalized, and ready for both overlay and classification operations. The major steps are described below.
Accessory Classification Preprocessing
To prepare input images for the CNN-based accessory recommendation model, each frame or uploaded image was resized to a uniform resolution of 128×128 pixels. This resizing makes input dimensions consistent across all samples and enables efficient batched inference. Pixel values were normalized to the range [0, 1] by dividing each channel value by 255, which speeds up model convergence and prevents scale-related instability. In addition, a batch axis was added to every preprocessed array to satisfy the input shape requirements of the Keras API.
Dominant Color Extraction Preprocessing
To identify clothing colors reliably, the system applies histogram equalization in the YCrCb color space. Specifically, it equalizes the luma channel to improve local contrast, making color regions easier to distinguish under varying lighting. The processed image is then converted back to the BGR color space before further processing. All pixels with RGB values above a set brightness threshold are excluded from clustering to prevent bias from reflective backgrounds, so that the background, rather than the user's clothing, cannot influence the resulting color predictions.
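The sketch below illustrates this preprocessing; the exact brightness threshold is not given in the paper, so the value used here is an assumption.

```python
import cv2
import numpy as np

def prepare_for_clustering(bgr, brightness_thresh=230):  # threshold assumed
    """Equalize the luma channel in YCrCb, then drop near-white pixels."""
    ycrcb = cv2.cvtColor(bgr, cv2.COLOR_BGR2YCrCb)
    ycrcb[..., 0] = cv2.equalizeHist(ycrcb[..., 0])  # equalize brightness only
    bgr_eq = cv2.cvtColor(ycrcb, cv2.COLOR_YCrCb2BGR)
    # Exclude bright/reflective pixels so the background cannot bias K-Means.
    keep = np.all(bgr_eq < brightness_thresh, axis=-1).reshape(-1)
    return bgr_eq.reshape(-1, 3)[keep]     # flattened pixels for clustering
```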
Facial Landmark Detection Preprocessing
Prior to facial landmark detection for virtual accessory placement, webcam frames were converted from BGR to RGB, as the MediaPipe Face Mesh model requires. Frames were also flipped horizontally for a mirrored effect, providing a natural and intuitive experience as users view their live video. Mirroring also makes accessories appear to move in tandem with the user's perspective.
Implementation Details
Programming Languages and Libraries:
- Python: primary implementation language.
- OpenCV: image capture, preprocessing, and display.
- MediaPipe: face detection and landmark estimation.
- TensorFlow Keras: deep learning inference.
- Scikit-learn: K-Means clustering for color extraction.
User Interface:
- Real-time video window displaying accessory overlays.
- On-screen text showing the predicted accessory and color suggestions.
- Keyboard commands for mode control and output capture.
Operational Modes:
- Virtual Try-On: frame processing and overlay rendering.
- Recommendation Mode: prediction for live or uploaded images.
RESULTS AND ANALYSIS
The system was evaluated on classification accuracy, processing latency, and real-time usability. Testing used both static test images and live webcam sessions reflecting real user behavior. The analysis provides numerical results as well as graphical plots of the system's responses.
Accessory Classification Accuracy
A dataset containing accessory classes such as ties, hats, and glasses was used to train the CNN classifier. Training ran for more than 10 epochs with an 80/20 train/validation split. Based on the final model's validation performance, the system correctly distinguished between accessory types under varying lighting, positioning, and user movement, achieving an accuracy of around 92%.
The number of correct predictions for each accessory class across the validation set is shown in a bar chart.
Fig.3. Distribution of Correct Predictions per Accessory Class.
Latency and Processing Time
To assess real-time performance, system latency was measured over 50 trials, defined as the total time (in milliseconds) to process a single frame from image capture to rendered output. The Virtual Try-On Module averaged around 58 ms per frame, confirming an interactive user experience well within real-time limits (<100 ms). The variability in processing time is depicted in a histogram, which shows that most frames were processed in under 70 ms, with occasional outliers due to competition for system resources.
Fig.4. Histogram of Frame Processing Latency.
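This measurement procedure can be reproduced with a simple timing loop like the one below; process_frame is a placeholder for the full capture-to-render pipeline, not a function from the paper.

```python
import time

latencies = []
for _ in range(50):                        # 50 trials, as in the evaluation
    start = time.perf_counter()
    process_frame()                        # capture -> landmarks -> overlay -> render
    latencies.append((time.perf_counter() - start) * 1000.0)  # milliseconds

print(f"mean latency: {sum(latencies) / len(latencies):.1f} ms")
```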
Dominant Color Detection and Recommendation Consistency
The recommendation module was also tested on extracting prominent clothing colors and recommending matching accessories. Testing used webcam-captured images as well as uploaded sample images depicting diverse outfits.
A pie chart shows the range of dominant colors detected across all test inputs, illustrating the diversity of the input data and establishing that the system handles a wide variety of color contexts robustly.
Fig.5. Distribution of Detected Dominant Clothing Colors.
Qualitative Evaluation of Virtual Try-On
In addition to quantitative accuracy measures, a qualitative visual examination was conducted to assess the realism, stability, and user experience of the virtual try-on system. Overlay alignment was stable across a broad range of head positions and face shapes. Glasses were well-centered between the detected eye landmarks, with scaling based on the Euclidean distance between eye corners. The hat overlay was positioned relative to the forehead landmark, and tie placement referenced the chin landmark. Accessories showed minimal jitter in real-time video streams, even as users rotated their heads or moved within the camera frame.
Figure 6 illustrates a sample output in which glasses are placed appropriately and sized proportionately to the face. Figure 7 depicts correct hat alignment, and Figure 8 shows the addition of a tie accessory. One desirable aspect of the system is the adjustable combination of accessories. The user can switch dynamically between:
- wearing an individual accessory (e.g., glasses only),
- mixing two accessories (e.g., tie and hat), and
- displaying all accessories at the same time.
This interactivity is user-controlled and greatly improves the experience by enabling experimentation with various styles; a minimal sketch of the keyboard-driven toggling follows the figure captions below. As seen in Figures 9 and 11, the system displays multiple accessories with consistent alignment and without graphical artifacts. Overall, user comments from peer testing sessions reflected high satisfaction with the ease of use, responsiveness, and visual realism of the try-on process.
Fig.6. Glasses between eye landmarks. Fig.7. Hat with transparency blending.
Fig.8. Tie relative to chin landmark. Fig.9. All accessories applied on face.
Fig.10. Single accessory (glasses). Fig.11. Dual accessories (hat and tie).
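As referenced above, the snippet below sketches how such toggling could sit inside the main capture loop; the specific key bindings and the draw_overlay helper are assumptions, not the paper's exact controls.

```python
import cv2

active = {"glasses": False, "hat": False, "tie": False}
KEYS = {ord("g"): "glasses", ord("h"): "hat", ord("t"): "tie"}

# Inside the main capture loop:
key = cv2.waitKey(1) & 0xFF
if key in KEYS:
    name = KEYS[key]
    active[name] = not active[name]        # flip this accessory on/off
for name, enabled in active.items():
    if enabled:
        frame = draw_overlay(frame, name)  # hypothetical overlay helper
```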
Qualitative Evaluation of Accessory Recommendation
The accessory recommendation module was qualitatively tested in both real-time camera mode and image upload mode. In most test scenarios, the system detected the face and extracted the clothing region beneath the bounding box. Dominant color identification via K-Means clustering proved very stable and produced robust mappings to named colors such as Red, Black, and White. Figure 12 shows an example of the accessory prediction output window, with the class prediction and confidence alongside the corresponding color compatibility suggestions. To handle edge cases, tests were also run for inputs in which the system failed to detect a face. In those instances, the program explicitly displayed "Face not detected". Figure 13 illustrates this fallback message in real-time mode, demonstrating the system's ability to remain helpful and informative under less-than-ideal conditions. In experimental trials, the recommendation module was intuitive and helpful, with textual feedback that was easily understood and aided users' decisions. In general, the combination of AI-facilitated recommendation and visual try-on significantly improved the personalization and engagement of the fashion experience.
Fig.12. Predicted accessory and confidence. Fig.13. "Face not detected" message.
CONCLUSION
This study demonstrates the use of deep learning methods and color theory to build an improved, smarter, and more personalized online fashion experience. By combining facial landmark detection using a convolutional neural network with dominant-color-based accessory recommendation and prediction, the system addresses typical problems of online shopping, including fit and style compatibility. The results show that the system accurately recognizes facial features and positions accessories in real time, with high placement accuracy and strong correlation with stylist-curated proposals. User feedback also indicated that the solution improves confidence and satisfaction when selecting accessories digitally. The multi-layered architecture, with a PHP backend, a Flutter user interface, and a Flask AI prediction service, enables the system to scale while remaining accessible on multiple devices. This integrated approach offers fashion retailers an efficient way to reduce return rates, increase customer engagement, and enhance trend analysis capabilities. Overall, this work shows how AI has the potential to revolutionize e-commerce by providing richer, more tailored experiences. It paves the way for future innovation in virtual try-on technology and smart recommendation platforms, opening the door to even more immersive and adaptive uses in the fashion world.
FUTURE SCOPE
The system could be improved in several directions:
- AR/VR Interfaces: Augmented and virtual reality interfaces could help users try accessories in more immersive, real-life settings.
- More Advanced Facial Landmark Models: Incorporating more advanced facial recognition techniques and higher-resolution landmark models would further improve accuracy under complex facial expressions, occlusions, or difficult camera angles.
- Personalized Style Preferences: Evolving the recommendation engine to consider individual style history, cultural influences, and brand loyalties would deepen personalization beyond color compatibility.
- Real-Time Video Try-On: Extending the system to support unconstrained real-time video streams would let customers move their heads freely while trying on accessories virtually, for a more natural experience.
- Integration with E-commerce Sites: Connecting recommendations directly to e-commerce websites and real-time inventory management systems would enable customers to immediately purchase the products the system recommends.
- Adaptive Learning: By incorporating user feedback to refine the models, the system could learn dynamically, improving predictions from real-world usage data and adapting to new fashion trends.
- Increased Dataset Diversity: Expanding training datasets with a greater diversity of skin tones, face shapes, and cultural dress would make the system more inclusive and resilient across global user bases.
These directions highlight the vast potential for future research and development, ultimately leading to even smarter, more personalized, and interactive fashion experiences through artificial intelligence.
REFERENCES
[1] B. Li, Z. Liu, and J. Wang, "Learning facial structural dependency in 3D aligned space for face alignment," Image Vis. Comput., vol. 150, Oct. 2024.
[2] H. Chen, A. Gallagher, and B. Girod, "Describing clothing by semantic attributes," in Proc. Eur. Conf. Comput. Vis. (ECCV), Berlin, Heidelberg: Springer, 2012, vol. 7572, pp. 609–623.
[3] X. Gu, F. Gao, M. Tan, and P. Peng, "Fashion analysis and understanding with artificial intelligence," Inf. Process. Manage., vol. 57, no. 5, Sep. 2020.
[4] Z. Guo et al., "AI-assisted fashion design: A review," IEEE Access, vol. 11, pp. 88403–88422, Aug. 2023, doi: 10.1109/ACCESS.2023.3306235.
[5] M. Mameli et al., "Deep learning approaches for fashion knowledge extraction from social media: A review," IEEE Access, vol. 10, pp. 1545–1565, Jan. 2022, doi: 10.1109/ACCESS.2021.3137893.
[6] T. A. Hosny et al., "Artificial intelligence-based masked face detection," J. Artif. Intell. Data Sci., vol. 3, no. 2, 2024.
