Detectron2 Object Detection & Manipulating Images using Cartoonization

Allena Venkata Sai Abhishek

Dept of Computer Science and Engineering GITAM University Visakhapatnam, India

Sonali Kotni

Dept of Computer Science and Engineering GITAM University Visakhapatnam, India

Abstract: Autonomous vehicles are becoming increasingly common. They are classified into levels according to their degree of autonomy: at lower levels the driver retains most of the control, whereas fully automated vehicles, such as those from Tesla, are expected to have full control over all functions. The underlying technologies work together to determine the vehicle's position and its proximity to everything around it. These vehicles are popular because they offer many advantages to the people who use them. We use Detectron2, the Facebook AI Research (FAIR) software system that implements object detection algorithms. The original Detectron was built on the Caffe2 deep learning framework; Detectron2 is built on PyTorch and offers faster training. We also manipulate images to derive insights that address the issues companies face when moving from research to production. We implement Detectron2 object detection for faster detection and labeling of objects, and we manipulate the images using cartoonization.

Keywords: Mask R-CNN; RetinaNet; Faster R-CNN; RPN; Fast R-CNN; R-FCN; Classification; Deep Learning; Grayscale

  1. INTRODUCTION

    In autonomous driving, data is gathered by the on-board sensors without any external communication [1]. In addition, automated vehicles can communicate with one another and share information about their environment. We use the Facebook AI Research (FAIR) software system that implements state-of-the-art object detection algorithms and offers fast training. The original Detectron was built on the Caffe2 deep learning framework; its successor, Detectron2, is built on PyTorch.

    The goal of Detectron is to provide a high-quality, high-performance codebase for object detection research. It is designed to be flexible in order to support rapid implementation and evaluation of novel research.

  2. DATA COLLECTION

    The input data for our model is an image, supplied in either JPEG or PNG format. The input image is then manipulated using various cartoonization techniques. We import the libraries needed to upload the image.

    Fig. 1. Input Image

  3. PROPOSED MODEL

    The proposed model takes images in PNG or JPEG format as input. We use Detectron2 for fast object detection with algorithms such as Mask R-CNN, RetinaNet, Faster R-CNN, RPN, Fast R-CNN, and R-FCN. The backbone extracts features and the proposal stage generates candidate regions, which are cropped and warped and then passed to the box, mask, keypoint, DensePose, and semantic segmentation heads. Their outputs are combined to generate labels, and each detected object is marked with a bounding box [4]. We also manipulate images by grayscaling, cartoonizing, and applying bilateral and Gaussian filtering, to derive insights that address the issues companies face when shifting from research to production.

    Fig. 2. Proposed model framework

  4. METHODOLOGY

    The detectron2 framework is first installed from its GitHub repository using git.

    1. Installing & Importing the required Dependencies:

      We install and import the following dependencies:

      1. pyyaml

      2. CUDA

      3. Torch

      4. Torchvision

      5. detectron2 logger

      6. Numpy

      7. JSON

      8. OpenCV

      9. Random

      10. detectron2 utilities

        Fig. 3. Dependencies Installed & Imported

    2. Uploading an Image:

      We define a read_file function to read the uploaded file. The input is an image in either JPEG or PNG format, which is later manipulated using the cartoonization techniques.

      Fig. 4. Image uploading

      Fig. 5. Image that has been uploaded

    3. Detecting & Labelling of Image:

      We detect and label the objects in the image. The backbone extracts features and the proposal stage generates candidate regions, which are cropped and warped and passed to the box, mask, keypoint, DensePose, and semantic segmentation heads. A mask is applied to the image and the key points are found [2]. The outputs are combined to generate labels, and the image is finally displayed with each detected object in a labeled box.

      1. We obtain the image, either from an upload or from the MS-COCO dataset.

      2. We create a detectron2 configuration and a detectron2 DefaultPredictor to run inference on the image.

      3. Model-specific configuration (e.g., TensorMask) is added at this point if the model is not part of detectron2's core library.

      4. We set a score threshold for the model.

      5. We select a model from detectron2's model zoo.

      6. Finally, we visualize the processed image.

      Fig. 6. Detecting & Labeling of Image

      Fig. 7. Detecting & Labeling of Image Computations
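The steps above can be sketched with detectron2's public API roughly as follows. This is an illustrative outline, not the authors' exact code: the Mask R-CNN config name, the 0.5 threshold, and the helper names build_predictor and detect_and_label are assumptions chosen for the example.

```python
def build_predictor(
    config_name="COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml",  # assumed model
    score_threshold=0.5,  # assumed threshold
    device=None,
):
    """Create a detectron2 config and DefaultPredictor from a model-zoo entry."""
    # Imported lazily so this sketch can be loaded without detectron2 installed.
    from detectron2 import model_zoo
    from detectron2.config import get_cfg
    from detectron2.engine import DefaultPredictor

    cfg = get_cfg()
    # Project-specific config (e.g., TensorMask) would be added here if needed.
    cfg.merge_from_file(model_zoo.get_config_file(config_name))
    cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = score_threshold
    cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url(config_name)
    if device is not None:
        cfg.MODEL.DEVICE = device  # e.g. "cpu" when no GPU is available
    return cfg, DefaultPredictor(cfg)


def detect_and_label(image_bgr, cfg, predictor):
    """Run inference on a BGR image and return raw outputs plus a labeled BGR image."""
    from detectron2.data import MetadataCatalog
    from detectron2.utils.visualizer import Visualizer

    outputs = predictor(image_bgr)
    metadata = MetadataCatalog.get(cfg.DATASETS.TRAIN[0])
    # Visualizer expects RGB, so the channel order is reversed on the way in and out.
    vis = Visualizer(image_bgr[:, :, ::-1], metadata, scale=1.0)
    drawn = vis.draw_instance_predictions(outputs["instances"].to("cpu"))
    return outputs, drawn.get_image()[:, :, ::-1]


# Example usage (downloads pretrained weights on first run):
# cfg, predictor = build_predictor(device="cpu")
# outputs, labeled = detect_and_label(img, cfg, predictor)
```

The predicted classes and boxes are available as outputs["instances"].pred_classes and outputs["instances"].pred_boxes.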

    4. Manipulating the Image:

    We manipulate the input image using the following techniques to derive insights [3]. An image can be manipulated in many ways. Here the image is cartoonized, which adds a cartoon effect to it, and it can also be processed with various filters. The steps for manipulating an image are briefly described below.

    1. Creating an edge mask function

      First, an image is uploaded from the device or taken from a dataset, and an edge mask is created. The main consideration when producing an edge mask is the thickness of the image's edges. The cv2.adaptiveThreshold() function is used to identify the edges of the picture. It determines the threshold separately for small regions of the picture, so different parts of the same picture get different thresholds. The result highlights the black edges surrounding the objects in the image.

    2. Converting into grayscale

      Second, the image is converted to grayscale, so that each pixel holds a single intensity value between black and white. After the conversion, the image is blurred to suppress noise and so reduce the number of unwanted edges that are detected. The block size passed to cv2.adaptiveThreshold() defines the line size of the edges: a larger line size produces thicker highlighted borders in the image.

      Fig. 9. Converting into grayscale

    3. Reducing color palette

      Color Quantization: This method reduces the number of colors in the image, which gives it a cartoon effect. To present the output with a limited number of colors, color quantization is performed using the K-means clustering method. K-means is an unsupervised machine learning algorithm that performs clustering. K is the number of clusters, and "means" refers to the cluster centers, each of which is the mean of the points assigned to it. The number of colors in the output picture is set by the value of K. For the present image, the number of colors is reduced to 9.

      Fig. 10. Reducing color palette

    4. Bilateral Filtering

      The bilateral filter is the next step and is used to reduce image noise. It smooths flat regions while keeping the edges sharp. In a bilateral filter, each pixel value is replaced by a weighted average of neighboring pixel values. To retain edges, the weights depend not only on the spatial distance between pixels but also on the difference in their intensities, so pixels on the far side of an edge contribute very little.

      Fig. 8. Creating an edge mask function

      Bilateral filtering takes three key parameters:

      • d: Diameter of each pixel neighborhood.

      • sigmaColor: A higher value means that more dissimilar colors within the pixel neighborhood will be blended together, resulting in larger regions of semi-equal color.

      • sigmaSpace: A higher value means that pixels farther apart will affect each other, as long as their colors are close enough.

      Fig. 11. Bilateral Filtering

    5. Combining edge mask with the colored image – Adding Cartoon Effect

      Finally, the edge mask is combined with the color-processed image using the cv2.bitwise_and function, which performs a bitwise operation on the image to produce the output. The result is the cartoonized version of the input image.

      Fig. 12. Combining edge mask with the colored image – Adding Cartoon Effect

    6. Filtering the Image

    Apart from the bilateral filter, Gaussian blur, sharpen, and mean blur kernels can be used to filter an image.

    Gaussian Blurring of an Image: This approach uses a Gaussian filter that computes a weighted average. The Gaussian blur weights pixel values according to their distance from the kernel's center, so pixels farther from the center have less effect on the weighted average.

    Median Blurring of an Image: In median blurring, each pixel in the source picture is replaced by the median value of the image pixels in the kernel region.

    Sharpening an Image: A 2D convolution kernel can be used to sharpen a picture. A custom 2D kernel is created first and then applied to the picture with the cv2.filter2D() function.

    Fig. 13. Filtering the Image

  5. RESULTS

    We were able to detect the objects in the images and label them according to the predefined dataset, and we manipulated the images as shown below:

    Fig. 14. Detecting & labeling of the objects

    Fig. 15. Converting into grayscale

    Fig. 16. Color Quantization

    Fig. 17. Bilateral Filtering

    Fig. 18. Adding Cartoon Effect

    Fig. 19. Filtering the Image using Gaussian, sharpen and Mean Blur filters

  6. CONCLUSION

    In this study, we fine-tuned a framework built around a leading model for practical object detection. We implemented advanced object detection with fast training using Detectron2, the FAIR software system that implements object detection algorithms such as Mask R-CNN, RetinaNet, Faster R-CNN, RPN, Fast R-CNN, and R-FCN. The original Detectron uses the Caffe2 deep learning framework, whereas Detectron2 is built on PyTorch. We also manipulated images by grayscaling, cartoonizing, and applying bilateral and Gaussian filtering, to derive insights that address the issues companies face when moving from research to production.

  7. REFERENCES

  1. https://analyticsvidhya.com/blog/2018/01/facebook-launched-detectron-platform-object-detection-research/

  2. https://towardsdatascience.com/image-labelling-using-facebooks-detectron-4931e30c4d0c

  3. Archana B. Patankar; Purnima A. Kubde; Ankita Karia (Aug. 2016). Image cartoonization methods. IEEE.

  4. Vung Pham; Chau Pham; Tommy Dang (2020). Road Damage Detection and Classification with Detectron2 and Faster R-CNN 20511275 IEEE
