Gray Level Co-occurrence Matrix For Indian Monument Image Classification

DOI : 10.17577/NCRTCA-PID-401


Rohini A. Bhusnurmath

Dept of Computer Science

Karnataka State Akkamahadevi Womens University Vijayapur, Karnataka, India 586-108

Adiba Maniyar*

Dept of Computer Science

Karnataka State Akkamahadevi Womens University Vijayapur, Karnataka, India 586-108

Abstract: Indian heritage monuments offer insight into our history and the growth of civilization. Because India is home to numerous religious communities, its culture and heritage are diverse and rich. Heritage monuments play a vital role in a country, and numerous computational techniques are used to protect historical artifacts because they define the country's heritage. Monument recognition is a difficult image classification problem because of variation in structural design, and several factors affect the recognition method. In the present work, the authors have developed a system for classifying Indian monument images based on their features. The state-of-the-art Gray Level Co-occurrence Matrix (GLCM) method (Haralick features) is used for feature extraction. The proposed work mainly focuses on creating a new dataset, in the form of a CSV file, derived from the Indian Heritage Image Retrieval Dataset (IHIRD). Two datasets, IHIRD I and IHIRD II, were formed from the original dataset. A variety of ML classifiers were then applied and their accuracies computed, with the best result being 98.65%.

Keywords: Recognition of monuments, GLCM, Machine Learning Classifiers, Image Classification, Image analysis.


    Monuments are found across the globe, and they are a means for different nations to display to the world what they have accomplished or what they are proudest of. A monument is a structure created to honor a person or an occasion, or one that plays a significant role in a community's memory of the past or its cultural heritage. A monument serves as physical proof of an era and represents a kingdom's memory. Monuments are a crucial visual source for carefully analyzing Indian history. Numerous monuments in India are connected to the religious beliefs of its people [1].

    People from different castes, creeds, cultures, and faiths take pleasure in the culturally rich history of India, which is represented by its monuments [2]. Tourism is a leading source of revenue for many nations across the globe. India is regarded as one of the top travel destinations for visitors because of its historical and cultural assets [3].

    Image classification is a basic and widely studied task in computer vision. Classification of Indian monuments is challenging because of variation in structure, different orientations and the presence of noise. Many complexities arise because numerous factors affect the recognition method. The images contain noise in the form of animals, sky, people, trees, etc., which frequently reduces accuracy [2]. The method used in this paper classifies the monument images in the acquired dataset by applying different ML techniques: K-Nearest Neighbors [14], Decision Tree classifier [15], Random Forest classifier [16], Naive Bayes classifier [17], Ada Boost classifier [18] and Logistic Regression [19]. These classifiers provided good results with low computation time.

    The rest of the paper is organized as follows. Section II contains the literature survey. Section III covers the methodologies, proposed work and classification algorithms. Section IV gives a detailed description of the dataset. Section V deals with results and discussion. The conclusion and future scope are discussed in the last section.


    Hesham et al. [1] studied how to recognize monument images using various machine learning techniques. The standard dataset used in this work is Indian monument images. ResNet50 and VGG16 are considered as the deep learning techniques. The model is trained, and k-fold cross-validation is then applied with 5 folds and 25 epochs per fold. ResNet50 classified unseen data with the highest performance metrics.

    Saini et al. [2] used a Deep Convolutional Neural Network (DCNN) for feature extraction. The model is trained on manually acquired images that exhibit Indian diversity. The experiment is carried out on heritage monuments and achieved an accuracy of 92.7%.

    Etaati et al. [3] present a web-based mobile framework that uses deep neural networks to automatically identify Iranian historical sites. Decentralized servers process the photographs taken by the mobile device, and the landmark's information is then determined and sent to the tourist's smartphone. The work is evaluated on Iran's tourist attractions, and the experimental results demonstrate that it can identify historical locations with an accuracy of 95%.

    Ninawe et al. [4] focused on the recognition of architectural monuments using convolutional neural networks (CNNs). The convolutional neural network is trained using a deep learning method. Images are separated into two categories in this work: Indian Mughal monuments and cathedrals. The experimental results state that the approach can correctly identify whether a query image is an Indian Mughal monument or a cathedral.

    Gada et al. [5] use a deep learning architecture that provides good accuracy in image classification. Transfer learning on the Inception v3 architecture is then applied, which achieves a testing accuracy in the range of 96-99% and a training accuracy of 99.4%. The images in the dataset are from the Golden Quadrilateral of India (Kolkata, Delhi, Mumbai and Chennai).

    Jethale et al. [6] developed a real-time application that overlays an educational video, text, and image on the photograph of the monument, providing travelers with an interactive experience. The method first extracts features, then applies the SURF algorithm, and finally performs classification using an SVM classifier; good results were achieved.

    Amato et al. [7] proposed a work that combines KNN classification with local visual descriptors. The article states that a KNN classifier together with a landmark recognition system is used to address the monument recognition problem. The Pisa dataset is used, and good results were obtained.

    Pavan Kumar et al. [8] focused on the categorization of archaeological monument images. The method uses GLCM to extract gray-level features. The AlexNet architecture is then used to extract deep learning features. A Support Vector Machine classifier is used, and an accuracy of 98.10% was obtained.

    Sharma et al. [9] presented two methods, Radon Barcodes and Convolutional Neural Networks, to classify monument images with respect to their styles. The first method gives an accuracy of 76% and the second an accuracy of 82%.

    Paolanti et al. [10] focused on estimating the sentiment of social media images related to cultural heritage. A trained Deep Convolutional Neural Network (DCNN) identifies the emotion in an image; the performance of the CNN models Inception, VGG16 and ResNet50 was then examined, and good results were achieved.

    Podder et al. [11] presented a work on content-based image retrieval (CBIR) on the Indian Heritage Image Retrieval Dataset (IHIRD). The dataset consists of various images that exhibit historical and mythological artifacts. The main elements of the dataset are paintings and sculptures from the monuments. Using the SURF descriptor and the Locality Sensitive Hashing technique, good performance was achieved.

    Hiremath and Bhusnurmath [12] proposed a method based on partial differential equations, Haralick features, anisotropic diffusion, texture approximation and LDA. Features were extracted and a KNN classifier was applied. The work was performed on the Brodatz dataset and achieved maximum accuracy.

    Hiremath and Bhusnurmath [13] presented a work on anisotropic diffusion and local directional binary patterns. Feature extraction was performed in order to carry out image classification in the RGB color space. The work was carried out on the Oulu database and achieved the maximum result. The image is divided into small parts using discrete wavelet transforms (DWT) on the color channels, and statistical texture features are extracted using the GLCM.


    The proposed work is summarized in the form of steps as shown below:

    Step 1: The Indian Heritage Image Retrieval Dataset is obtained by communicating with IIT Kharagpur [14].

    Step 2: The dataset is pre-processed and prepared as per the annotation file to create a well-framed dataset with appropriate classes; a detailed description is given in sections IV A and IV B.

    Step 3: Read the IHIRD monument images from the dataset.

    Step 4: Divide each monument image into sub images as described in Table 2.

    Step 5: Extract Haralick features (contrast, dissimilarity, energy, homogeneity, correlation and area square mean) from each sub image.

    Step 6: Create the CSV file from the features extracted in Step 5.

    Step 7: Divide the dataset into train and test sets in an 80:20 ratio.

    Step 8: Train the machine learning classifiers using the train set.

    Step 9: Test the machine learning classifiers using the test set.

    Step 10: Repeat Steps 8 and 9 for the various machine learning classifiers.

    Step 11: Choose the best machine learning classifier.
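Steps 5 and 6 above can be sketched as follows. This is a minimal illustration using only the Python standard library; the column order, the `write_feature_csv` helper and the feature values are assumptions made for the example, since the paper does not specify the CSV layout.

```python
import csv
import io

# Column order is an assumption; the paper does not specify the CSV layout.
FEATURE_NAMES = ["contrast", "dissimilarity", "energy",
                 "homogeneity", "correlation", "asm"]


def write_feature_csv(rows, out):
    """Write one CSV row per sub image: six features followed by the class label.

    `rows` is an iterable of (features_dict, label) pairs.
    """
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(FEATURE_NAMES + ["label"])
    for features, label in rows:
        writer.writerow([features[name] for name in FEATURE_NAMES] + [label])


# Two made-up rows, purely to show the layout (not values from the paper).
example_rows = [
    ({"contrast": 1.5, "dissimilarity": 0.9, "energy": 0.4,
      "homogeneity": 0.7, "correlation": 0.2, "asm": 0.16}, "Elephant"),
    ({"contrast": 0.8, "dissimilarity": 0.5, "energy": 0.5,
      "homogeneity": 0.8, "correlation": 0.6, "asm": 0.25}, "Vittala Temple"),
]
buffer = io.StringIO()
write_feature_csv(example_rows, buffer)
csv_text = buffer.getvalue()
```

In practice the output would go to a file opened with `open(path, "w", newline="")` instead of an in-memory buffer.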

    1. Gray-Level Co-occurrence Matrix (GLCM):

      It is a technique for extracting texture features. For a grayscale image, GLCM records how often pairs of gray-level intensities occur in two pixels separated by a spatial distance (d) in a certain direction (θ) (second-order texture), with the first pixel (i) serving as the reference and the second (j) as the neighbor pixel [21]. The extracted GLCM features are listed in section B.
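To make the definition concrete, the co-occurrence counting can be sketched in plain Python. This is an illustrative sketch, not the implementation used in this work: the function names `glcm` and `normalize`, the tiny 4-level test image, and the choice of d = 1 with a horizontal direction are all assumptions.

```python
import math


def glcm(image, levels, d=1, theta=0.0):
    """Count co-occurrences of gray levels (i, j) at distance d in direction theta.

    `image` is a list of rows of integers in range(levels); theta is in radians.
    """
    # Offset of the neighbor pixel relative to the reference pixel.
    # Row indices grow downward, so a positive angle means a negative row offset.
    d_row = -int(round(d * math.sin(theta)))
    d_col = int(round(d * math.cos(theta)))
    matrix = [[0] * levels for _ in range(levels)]
    n_rows, n_cols = len(image), len(image[0])
    for r in range(n_rows):
        for c in range(n_cols):
            r2, c2 = r + d_row, c + d_col
            if 0 <= r2 < n_rows and 0 <= c2 < n_cols:
                # Reference pixel indexes the row, neighbor pixel the column.
                matrix[image[r][c]][image[r2][c2]] += 1
    return matrix


def normalize(matrix):
    """Turn raw counts into joint probabilities that sum to 1."""
    total = sum(sum(row) for row in matrix)
    return [[value / total for value in row] for row in matrix]


image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
counts = glcm(image, levels=4, d=1, theta=0.0)
probabilities = normalize(counts)
```

For this image there are 12 horizontally adjacent pixel pairs, so each entry of `probabilities` is the corresponding count divided by 12.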

    2. Haralick Features:

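      • Contrast: It measures the local variation in gray level between the pixel pairs counted in the GLC matrix.

      • Dissimilarity: It measures the average absolute gray-level difference between the pairs of pixels in the region of interest.

      • Homogeneity: It measures the closeness of the gray levels of the pixel pairs, i.e., how concentrated the GLCM is around its diagonal.

      • Correlation: It measures the linear dependency between the gray levels of the specified pixel pairs, based on their joint probability of occurrence.

      • Energy: It describes texture uniformity.

      • Area Square Mean (ASM): More commonly known as the angular second moment, it is the sum of the squared GLCM entries; energy is often defined as its square root.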
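The six features listed above can be computed from a normalized GLCM as in the sketch below. The formulas are the commonly used definitions and are assumed here, since the paper does not spell them out; the `haralick_features` name and the value returned for a constant texture are likewise assumptions.

```python
import math


def haralick_features(p):
    """Compute six texture features from a normalized GLCM `p` (entries sum to 1)."""
    n = len(p)
    cells = [(i, j, p[i][j]) for i in range(n) for j in range(n)]
    contrast = sum(v * (i - j) ** 2 for i, j, v in cells)
    dissimilarity = sum(v * abs(i - j) for i, j, v in cells)
    homogeneity = sum(v / (1 + (i - j) ** 2) for i, j, v in cells)
    asm = sum(v * v for _, _, v in cells)
    energy = math.sqrt(asm)
    # Means and standard deviations of the reference (i) and neighbor (j) levels.
    mean_i = sum(i * v for i, _, v in cells)
    mean_j = sum(j * v for _, j, v in cells)
    std_i = math.sqrt(sum(v * (i - mean_i) ** 2 for i, _, v in cells))
    std_j = math.sqrt(sum(v * (j - mean_j) ** 2 for _, j, v in cells))
    if std_i == 0 or std_j == 0:
        correlation = 1.0  # assumed convention for a constant texture
    else:
        covariance = sum(v * (i - mean_i) * (j - mean_j) for i, j, v in cells)
        correlation = covariance / (std_i * std_j)
    return {
        "contrast": contrast,
        "dissimilarity": dissimilarity,
        "homogeneity": homogeneity,
        "energy": energy,
        "correlation": correlation,
        "asm": asm,
    }


# A 2-level GLCM in which all four gray-level pairs are equally likely.
uniform = [[0.25, 0.25], [0.25, 0.25]]
features = haralick_features(uniform)
```

For the uniform matrix the gray levels of the two pixels are independent, so the correlation is 0, while ASM is 4 × 0.25² = 0.25 and energy is 0.5.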

    3. Classification Algorithms:

      The different ML algorithms considered are:

      • Random Forest Classifier

      • Naive Bayes Classifier

      • K-Nearest Neighbors

      • Decision Tree Classifier

      • Logistic Regression

      • Ada Boost Classifier

      • Random Forest Classifier: The Random Forest classifier [16] can be used for both regression and classification problems. It builds decision trees on different samples and uses their average for regression and their majority vote for classification.

      • Naive Bayes Classifier: Based on Bayes' theorem, the Naive Bayes algorithm [17] is applied to classification problems. It is mostly used for text categorization with sizable training sets.

      • K-Nearest Neighbors: The k-nearest neighbors method [12], often known as KNN or k-NN, makes classifications or predictions about the grouping of a single data point based on its proximity to other points.

      • Decision Tree Classifier: Classification and regression problems can be solved using a decision tree classifier [15]. It is sometimes referred to as a "tree-structured classifier", in which every leaf node states the classification outcome and the internal nodes represent features.

      • Logistic Regression: Logistic regression predicts the outcome of a categorical dependent variable, so the output must be categorical or discrete. It provides probabilistic values that range between 0 and 1.

      • Ada Boost Classifier: The Ada Boost classifier [18] is a meta-estimator. It first fits a classifier on the original dataset and then fits successive copies of the classifier on the same dataset, adjusting the weights of incorrectly classified instances so that subsequent classifiers focus more on difficult cases.
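As one example of the classifiers described above, the k-nearest neighbors vote can be sketched in a few lines. The value k = 3, the Euclidean distance and the two-feature toy data are assumptions for illustration; the paper does not report these settings.

```python
import math
from collections import Counter


def knn_predict(train_features, train_labels, query, k=3):
    """Predict the label of `query` by majority vote among its k nearest training points."""
    # Pair every training point's Euclidean distance to the query with its label.
    distances = sorted(
        (math.dist(features, query), label)
        for features, label in zip(train_features, train_labels)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Toy two-feature data; the values are made up for illustration.
train_features = [(0.1, 0.2), (0.2, 0.1), (0.9, 0.8), (0.8, 0.9), (0.85, 0.85)]
train_labels = ["Elephant", "Elephant", "Temple", "Temple", "Temple"]

prediction = knn_predict(train_features, train_labels, (0.15, 0.15), k=3)
```

Here the two closest points are both labeled "Elephant", so they outvote the single "Temple" neighbor.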

    4. Proposed Architecture:

    Fig 1 depicts the workflow of the proposed work. In the first and second steps, an image is taken from the dataset and divided into 4 patches. The third step extracts Haralick features from the sub images. The fourth step applies all the machine learning [20] algorithms. In the last step, the images are classified and the classification accuracy is obtained.

    Fig 1: System Overview
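The patch-division step of this workflow might look like the sketch below. The 2 x 2 grid layout is an assumption (the text only states that an image is divided into 4 patches), and `split_into_patches` is a hypothetical helper operating on a nested-list image.

```python
def split_into_patches(image, grid_rows=2, grid_cols=2):
    """Split a 2-D image (list of rows) into grid_rows x grid_cols equal sub images.

    Rows or columns left over when the size is not divisible are dropped.
    """
    height, width = len(image), len(image[0])
    patch_h, patch_w = height // grid_rows, width // grid_cols
    patches = []
    for gr in range(grid_rows):
        for gc in range(grid_cols):
            rows = image[gr * patch_h:(gr + 1) * patch_h]
            patches.append([row[gc * patch_w:(gc + 1) * patch_w] for row in rows])
    return patches


# A 4 x 4 test image whose pixel value encodes its position.
image = [[r * 4 + c for c in range(4)] for r in range(4)]
patches = split_into_patches(image)
```

Calling the same helper with `grid_rows=4, grid_cols=4` would give 16 sub images per image.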


    1. Dataset Description:

      The dataset consists of 112 classes, each containing a variable number of images. Each class is named after the monument it depicts. The dataset contains various heritage monuments from West Bengal, Karnataka and Andhra Pradesh. The monuments shown in the images used in this work are designated as UNESCO World Heritage Sites [11]. The walls and pillars in the images depict the mythological and historical stories of the monuments. Fig 2 shows a few sample images from the IHIRD dataset [14].

      Fig 2: Sample Images in IHIRD Dataset [14]

      Many temples depict various avatars of Vishnu, which are beautifully carved in stone. The sculptures carved in granite, the expert rock-cutting techniques and the paintings reveal the architectural pattern of the Vijayanagara kingdom [11].

      Table 1 gives a short summary of the monument images included in our dataset. It lists some of the class names, their locations and their significance for tourists. For example, the classes Achyutaraya Temple, Lakshmi Narasimha Temple, Krishna Temple, Hazara Rama Temple, Viroopaksha Temple, Vishnu Temple and Vittala Temple are located in Hampi, and each has its own specialty as a tourist attraction.

      Table 1: Details of images considered for experimentation.

    2. Dataset Preparation

      Images of various heritage objects make up the heritage dataset. The following images are taken within the Hoysaleswara Temple in Halebidu, Karnataka, from a variety of perspectives. Because the images are taken from the same temple, their carving styles are quite comparable [11].

      From a collection of 2060 images, the dataset used in the proposed work is clustered into 112 classes containing 1215 images. Every class stands for a concept and includes a few images taken from different perspectives and under different lighting [11]. All visibly identical images that belong to a particular monument are collected together: a monument image is first chosen from the dataset, and the visibly identical images that depict the same scene and idea are then grouped with it.

      Fig 3 shows that all the similar images of an elephant are grouped into a single class called Elephant. Likewise, 112 classes were created, comprising 1215 images.

      Fig 3: Sample images in a class called Elephant from Hampi location.

      Table 2 describes the two IHIRD datasets used in the proposed work. It states the number of images, image size, number of patches, sub image size and features extracted [20].

      Table 2: Dataset Description of Indian Heritage Image Retrieval Dataset [14]

      Dataset name                          IHIRD I    IHIRD II
      Number of images
      Image size (pixel)
      Sub image size (pixel)
      Number of patches in each image
      Number of images after patchifying    4860       19440
      Features extracted                    Contrast, Homogeneity, Dissimilarity, Energy, Area Square Mean (ASM)
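As a consistency check on the counts reported in this paper, the number of sub images per original image can be derived from the 1215 original images and the 4860 and 19440 sub images of IHIRD I and IHIRD II. The figure of 16 patches per image for IHIRD II is an inference from these counts and is not stated explicitly in the text.

```python
ORIGINAL_IMAGES = 1215        # images grouped into the 112 classes
IHIRD_I_SUB_IMAGES = 4860     # reported size of IHIRD I
IHIRD_II_SUB_IMAGES = 19440   # reported size of IHIRD II

# Integer quotient and remainder: a zero remainder means every image
# contributes the same number of sub images.
patches_i, remainder_i = divmod(IHIRD_I_SUB_IMAGES, ORIGINAL_IMAGES)
patches_ii, remainder_ii = divmod(IHIRD_II_SUB_IMAGES, ORIGINAL_IMAGES)
```

Both divisions are exact, giving 4 sub images per image for IHIRD I and 16 for IHIRD II.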

    3. Randomization and Data Splitting

    The entire dataset is randomly partitioned in the ratio 80:20: 80% of the data is used for training and 20% for testing.
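A shuffled 80:20 split of this kind can be sketched with the standard library alone. The fixed seed and the `train_test_split` helper name are assumptions for the example; the paper does not state how the randomization was performed.

```python
import random


def train_test_split(samples, test_fraction=0.2, seed=0):
    """Shuffle `samples` and split them into (train, test) lists."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# Indices standing in for the 4860 feature rows of IHIRD I.
train, test = train_test_split(range(4860))
```

With 4860 rows this yields 3888 training rows and 972 test rows.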


The proposed work is implemented on an Intel(R) Core i3 machine with an x64-based processor, a 64-bit operating system and 4.00 GB of RAM. The Python programming language is used.

A. Results:

The proposed work focuses on the ML classifiers described in section III C. The results are tabulated in Table 3 and Table 4.

Table 3: Classification Accuracy of 4860 images with IHIRD I dataset

From Table 3, it is concluded that on IHIRD I the Random Forest classifier achieved the best accuracy of 97.32%, while the Naive Bayes classifier gives 96.29%, the K Neighbor classifier 79.81%, the Decision Tree classifier 95.65%, Logistic Regression 2.23% and the Ada Boost classifier 0.46%. In the proposed method, the K-fold approach [1] is used with ten folds to reduce bias.
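The ten-fold procedure mentioned above can be illustrated by generating the train and test indices for each fold. This sketch uses contiguous folds and assumes the rows have already been shuffled; whether the folds in this work were shuffled or stratified is not stated, and `k_fold_indices` is a hypothetical helper.

```python
def k_fold_indices(n_samples, k=10):
    """Yield (train_indices, test_indices) for k contiguous, near-equal folds."""
    # The first n_samples % k folds get one extra sample each.
    fold_sizes = [n_samples // k + (1 if f < n_samples % k else 0) for f in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


# Ten folds over the 4860 feature rows of IHIRD I.
folds = list(k_fold_indices(4860, k=10))
```

Each row appears in exactly one test fold, and a classifier's accuracy would be averaged over the ten folds.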

Table 4: Classification Accuracy of 19440 images with IHIRD II dataset

From Table 4, it is concluded that on IHIRD II the K Neighbor classifier achieved the best accuracy of 98.65%, while the Random Forest classifier gives 76.72%, the Naive Bayes classifier 98.33%, the Decision Tree classifier 98.47%, Logistic Regression 22% and the Ada Boost classifier 0.79%.
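Choosing the best classifier (Step 11 of the proposed work) amounts to taking the maximum over the reported accuracies. The sketch below uses the percentages quoted in the text for Tables 3 and 4; the dictionary layout and the `best_classifier` helper are assumptions for illustration.

```python
# Accuracies (%) as quoted in the text for Table 3 (IHIRD I) and Table 4 (IHIRD II).
ihird_i = {
    "Random Forest": 97.32,
    "Naive Bayes": 96.29,
    "K-Nearest Neighbors": 79.81,
    "Decision Tree": 95.65,
    "Logistic Regression": 2.23,
    "Ada Boost": 0.46,
}
ihird_ii = {
    "Random Forest": 76.72,
    "Naive Bayes": 98.33,
    "K-Nearest Neighbors": 98.65,
    "Decision Tree": 98.47,
    "Logistic Regression": 22.0,
    "Ada Boost": 0.79,
}


def best_classifier(accuracies):
    """Return the (name, accuracy) pair with the highest accuracy."""
    name = max(accuracies, key=accuracies.get)
    return name, accuracies[name]


best_i = best_classifier(ihird_i)
best_ii = best_classifier(ihird_ii)
```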

Fig 4 and Fig 5 below show the classification accuracy of all the applied ML classifiers as bar graphs.

Fig 4: Bar graph view of all ML classifiers for IHIRD I dataset

Fig 5: Bar graph view of all ML classifiers for IHIRD II dataset

From the literature survey, it has been noted that no classification work has been reported on the Indian Heritage Image Retrieval Dataset (IHIRD) [14].

Table 5: Comparison of results

Proposed work: GLCM feature extraction

Classification accuracy: 97.32 % (IHIRD I), 98.65 % (IHIRD II)


This paper focuses on the classification of monument images. The proposed work is carried out on the Indian Heritage Image Retrieval Dataset, which was collected from IIT Kharagpur. The experimental work is done on two datasets derived from the initial Indian Heritage Image Retrieval Dataset: IHIRD I contains 4860 images and IHIRD II contains 19440 images. Haralick features were extracted for both datasets, and CSV files were created. Various ML classifiers were applied to the created CSV files, and a best accuracy of 98.65% was achieved with short computation times. The proposed work showed that the features of monument images are helpful in classifying them.

In future work, the size of the dataset will be increased, and monument images with many more variations in structural design will be considered.


The authors appreciate the reviewers' thorough analysis, insightful criticism, and helpful recommendations.


[1] Hesham, S., Khaled, R., Yasser, D., Refaat, S., Shorim, N., & Ismail, F. H. (2021, January). Monuments recognition using deep learning vs machine learning. In 2021 IEEE 11th annual computing and communication workshop and conference (CCWC) (pp. 0258-0263). IEEE.

[2] Saini, A., Gupta, T., Kumar, R., Gupta, A. K., Panwar, M., & Mittal, A. (2017, December). Image based Indian monument recognition using convoluted neural networks. In 2017 International conference on big data, IoT and data science (BID) (pp. 138-142). IEEE.

[3] Etaati, M., Majidi, B., & Manzuri, M. T. (2019, March). Cross platform web-based smart tourism using deep monument mining. In 2019 4th International conference on pattern recognition and image analysis (IPRIA) (pp. 190-194). IEEE.

[4] Ninawe, A., Mallick, A. K., Yadav, V., Ahmad, H., Sah, D. K., & Barna, C. (2021). Cathedral and Indian Mughal monument recognition using tensorflow. In Soft Computing Applications: Proceedings of the 8th International Workshop Soft Computing Applications (SOFA 2018), Vol. I 8 (pp. 186-196). Springer International Publishing.

[5] Gada, S., Mehta, V., Kanchan, K., Jain, C., & Raut, P. (2017, December). Monument recognition using deep neural networks. In 2017 IEEE International Conference on Computational Intelligence and Computing Research (ICCIC) (pp. 1-6). IEEE.

[6] Jethale, A., Nath, N. V., Arawkar, T., Bajpeyi, A., & Nirwal, D. (2018). Monument Informatica: A Tour based Guide system using Real Time Monument Recognition. Research Journal of Engineering and Technology, 9(4), 373-379.

[7] Amato, G., Falchi, F., & Gennaro, C. (2015). Fast image classification for monument recognition. Journal on Computing and Cultural Heritage (JOCCH), 8(4), 1-25.

[8] Pavan Kumar, M. P., Poornima, B., Nagendraswamy, H. S., Manjunath, C., Rangaswamy, B. E., Varsha, M., & Vinutha, H. P. (2022). Image Abstraction Framework as a Pre-processing Technique for Accurate Classification of Archaeological Monuments Using Machine Learning Approaches. SN Computer Science, 3(1), 87.

[9] Sharma, S., Aggarwal, P., Bhattacharyya, A. N., & Indu, S. (2018). Classification of Indian monuments into architectural styles. In Computer Vision, Pattern Recognition, Image Processing, and Graphics: 6th National Conference, NCVPRIPG 2017, Mandi, India, December 16-19, 2017, Revised Selected Papers 6 (pp. 540-549). Springer Singapore.

[10] Paolanti, M., Pierdicca, R., Martini, M., Felicetti, A., Malinverni, E. S., Frontoni, E., & Zingaretti, P. (2019). Deep convolutional neural networks for sentiment analysis of cultural heritage. The International Archives of Photogrammetry, Remote Sensing and Spatial Information Sciences, 42, 871-878.

[11] Podder, D., Shashaank, M. A., Mukherjee, J., & Sural, S. (2021). IHIRD: A Data Set for Indian Heritage Image Retrieval. Digital Techniques for Heritage Presentation and Preservation, 51-73.

[12] Hiremath, P. S., & Bhusnurmath, R. A. (2016). PDE based features for texture analysis using wavelet transform. International Journal on Cybernetics & Informatics, 5(1), 143-155.

[13] Hiremath, P. S., & Bhusnurmath, R. A. (2014). RGB-based color texture image classification using anisotropic diffusion and LDBP. In Multi-disciplinary Trends in Artificial Intelligence: 8th International Workshop, MIWAI 2014, Bangalore, India, December 8-10, 2014. Proceedings 8 (pp. 101-111). Springer International Publishing.


[15] Garg, M., Malhotra, M., & Singh, H. (2021). A novel machine-learning framework-based on LBP and GLCM approaches for CBIR system. Int. Arab J. Inf. Technol., 18(3), 297-305.

[16] Bhosle, N., & Kokare, M. (2020). Random forest-based active learning for content-based image retrieval. International Journal of Intelligent Information and Database Systems, 13(1), 72-88.

[17] Vatamanu, O. A., Frandes, M., Ionescu, M., & Apostol, S. (2013, November). Content-based image retrieval using local binary pattern, intensity histogram and color coherence vector. In 2013 E-Health and Bioengineering Conference (EHB) (pp. 1-6). IEEE.

[18] Lin, H. J., Kao, Y. T., Yang, F. W., & Wang, P. S. (2006). Content-based image retrieval trained by ada boost for mobile application. International Journal of Pattern Recognition and Artificial Intelligence, 20(04), 525-541.

[19] Caenen, G., & Pauwels, E. J. (2001, December). Logistic regression model for relevance feedback in content-based image retrieval. In Storage and Retrieval for Media Databases 2002 (Vol. 4676, pp. 49-58). SPIE.

[20] Celia, B., & Felci Rajam, I. (2012). An efficient content based image retrieval framework using machine learning techniques. In Data Engineering and Management: Second International Conference, ICDEM 2010, Tiruchirappalli, India, July 29-31, 2010. Revised Selected Papers (pp. 162-169). Springer Berlin Heidelberg.

[21] Alharan, A. F., Fatlawi, H. K., & Ali, N. S. (2019). A cluster-based feature selection method for image texture classification. Indonesian Journal of Electrical Engineering and Computer Science, 14(3), 1433-1442.

[22] Sarker, I. H. (2021). Machine learning: Algorithms, real-world applications and research directions. SN computer science, 2(3), 160.