MLP Based Gait Recognition Technique

DOI : 10.17577/IJERTV10IS080116

Siddhant Kashinath Jalmi

Dept. of Electrical and Electronics Engineering, Goa College of Engineering

Ponda-Goa, India

Abstract: Security and identification of personnel in the era of social distancing is a challenging task. Setting aside biometric techniques that rely on contact-based identification, research has been directed towards human gait recognition. Using a multilayer perceptron (MLP), the results obtained from training and testing show high accuracy, making this a useful technique for gait recognition. The method is tested on two datasets, giving an accuracy of more than 80% for one dataset.

Keywords: Multilayer perceptron; artificial neural network; silhouette; coxa point; stance phase; swing phase.


    With increasing emphasis on authentication and identification, biometrics is becoming an important area of research. Various aspects of human physiology are used to authenticate a person's identity. Biometrics is defined as the science of establishing identity from the characteristic traits of a human being, and it uses intelligent methods for the identification and verification of a person. The idea of identifying a person from gait can be traced to Cutting and Kozlowski's perception experiments based on point-light displays. In the early 1990s, Niyogi and Adelson initiated work on recognizing humans from gait in computer vision. Gait is a behavioral biometric and can be defined as a unique way of moving on foot. The complexity arising from muscle action, skeletal structure, limb length, and bone structure makes gait difficult, if not impossible, to imitate or hide. The dynamics of the various body parts move the body's centre of gravity from one location to another in the most efficient way. Both body dynamics and the static parameters of various body parts contribute substantially to gait recognition [1].

  2. CLASSIFICATION OF BIOMETRICS

    Biometric technologies are classified into two categories:

    • Physiological Biometric

    • Behavioral Biometric

    1. Physiological Biometric

      This kind of biometric is directly linked to the human body and is derived by measuring various points on it. The most successful and prominent methods are face recognition, iris scans, and hand scans.

      • Face Recognition:

        The face is the focus of attention in social interaction, playing a major role in conveying identity and emotion. Face recognition plays an important role in many applications such as security systems, credit card verification, and criminal identification. We can recognize thousands of faces learned throughout our lifetime and identify familiar faces at a glance even after years of separation. The ability to model a particular face and distinguish it from a large number of stored face models would make it possible to improve identification. Investigations by numerous researchers [2][3] over the past several years have indicated that human beings use certain facial characteristics to identify faces. The face is made up of many distinct micro and macro elements. The macro elements include the mouth, nose, eyes, cheekbones, chin, lips, forehead, and ears. The micro features include the distances between the macro features, or from a reference feature, and the size of each feature itself. All these features can be used by face biometric systems to help identify and authenticate someone. The main step in human face identification is to extract the relevant features from facial images. Research in the field primarily aims to generate sufficiently faithful likenesses of human faces so that another human can correctly identify the face.

      • Iris Recognition:

      The human iris is the annular part of the eye between the pupil (black portion) and the sclera (white portion), and is responsible for controlling the diameter and size of the pupil, as shown in the figure. It also controls the amount of light that reaches the retina, thereby protecting it. The structure of the human iris contains five layers of fiber-like tissue. Iris recognition systems scan the iris in different ways and analyze over 200 points of the iris, including rings, furrows, freckles, the corona, and other characteristics. After recording the data for each individual, the system saves the information in a database and compares against it every time a user wants to access the system. Iris recognition systems are considered among the most accurate security systems. The iris is unique to each user, which makes identification easy. Although the system requires installed equipment and is expensive, it remains an easy method of identifying a user.

      • Finger Print Recognition:

        Fingerprints are made up of a number of ridges and valleys on the surface of the finger, which are unique to each individual. The ridges form two kinds of minutiae points: ridge endings, where the ridges end, and ridge bifurcations, where the ridges split in two. The uniqueness of a fingerprint can be determined from the patterns of ridges and furrows as well as from the minutiae points. Two main algorithms are used to recognize fingerprints: minutiae matching and pattern matching. Minutiae matching uses the details of the extracted minutiae. When users register with the system, the locations and directions of the minutiae on the finger surface are recorded. These are then compared with the ones provided at the time of access. Since fingers experience considerable wear and tear from cuts and burns, the software must be able to rebuild the image. The drawback of this system is the limited reliability of the biometric device in real-life conditions [4].

      • Hand Geometry Recognition:

      Hand geometry covers two sides of the hand: the palm side and the dorsum (the top or back side), as shown in the figure. To achieve personal authentication through a hand recognition system, the physical characteristics (length, width, thickness, surface area, etc.) of the fingers or the hand are measured. This method is found in commercial and residential applications, in time and attendance systems, and in general personal authentication applications. Hand geometry recognition systems may provide three kinds of services: verification, classification, and identification.

    2. Behavioral Biometric

      Behavioral biometrics relate to human behavior, nature, and expression. The biometric features are extracted from the characteristics of an act performed by a person. These measures include voice recognition, keystroke recognition, signature recognition, and gait recognition.

      • Signature Recognition:

        Handwriting is a skill that is highly personal to individuals and consists of graphical marks on a surface in relation to a specific language. The handwritten signature is the most widely used personal method of identity authentication. Biometric signature recognition systems measure and analyse the physical act of signing, such as the stroke order, the pressure applied, and the speed. As a symbol of consent and authorization, the handwritten signature has long been a target for forgery. Therefore, with the rising demand for faster and more accurate processing and analysis of individual identification, the design of an automatic signature verification system is a real challenge. A person's signature pattern depends on time and state of mind, and thus may vary over time. The handwritten signature is a form of identification for an individual. The use of special characters and flourishes when composing a signature often makes it unreadable. It should therefore be analysed as a complete image and not as letters or words arranged together [5].

      • Gait Recognition:

    Gait differs from the traditional biometrics (face, iris, finger, hand, keystroke, voice, DNA, etc.). Gait refers to the style of a person's walking pattern. It is one of the few biometrics that can identify a person from a distance. Hence it is well suited to surveillance scenarios, where an individual can be identified covertly. Unlike other biometric methods, it does not require any cooperation from the subject to collect the features. Some algorithms use optic flow, computed on a group of dynamically extracted moving points of the person, to describe the gait of an individual. Gait-based methods also make it possible to track an individual over a period of time.


    Gait can be defined as the particular manner in which a person walks. The walking pattern of an individual can be analyzed through the gait cycle [6], since walking is a repetitive motion of body parts [7]. Each person has a unique style of walking, i.e. a unique gait feature, which is used for authentication. The gait cycle is divided into a stance phase and a swing phase. Compared with traditional methods, gait has the advantages of being non-invasive and contactless, of allowing identification from a distance, and of giving accurate results at low resolution. Different approaches have been used for gait recognition. The background subtraction technique used to extract silhouettes from the image is a very important step. Feature extraction techniques are classified as model-based and model-free methods.

    1. Model Based Method

      Model-based approaches aim to model a person's structure or motion using geometric curves. Gait features are obtained by measuring the structural parameters of the model or by determining the motion trajectories [8]. S. Rohit and Rohan Ravi [9] used a similar method that works on the spatial variation of the subject's limbs and the width vector of the silhouettes over a cycle of frames. The changes in the spatial-temporal features are computed and a feature matrix is built. The image is resized by cropping, and the area under the subject's limbs is measured using a feed-forward back-propagation learning algorithm. The Discrete Cosine Transform (DCT) is applied to the feature matrix, and Multi-scale Principal Component Analysis (MSPCA) may be used for dimensionality reduction of the extracted area signals. The feature matrix is then evaluated using neuro-fuzzy and K-NN classification methods. Using spline curves as the modelling method, the limbs can be modelled, which in turn helps to find the coordinates of the coxa joint, the two knee joints, and the two ankle joints for each silhouette. This method has the advantage of precise results even when objects such as a bag or jacket are added. In [6], the methodology adopted for gait identification combines intersection points with fuzzy classification. Shoulder and feet joints are used as features in the model-based approach. Each foot joint is separated into toe and heel for both the left and right legs. Two triangles are then formed by joining the shoulder and the feet (toe and heel). The intersecting points are computed on the CASIA database and a fuzzy classifier is used.

    2. Model-Free Method:

    Model-free methods characterize dynamic variables with respect to spatial variation over the gait cycle. The input image sequence is analysed through variation in shape and in the distance vector to characterize gait features. This approach is used in [10], where the walking figure is converted into a temporal sequence of distance signals. The 2D silhouette is converted into a 1D signal, and PCA is applied to obtain a small number of principal components of lower dimension. For classification, a nearest-neighbour (NN) classifier is used, including a variant with respect to class exemplars (ENN). Much current work uses silhouette images and Gait Energy Images [9]. Wang et al. [10] use spatial-temporal silhouette analysis. The movement of the silhouettes is captured, and PCA is applied to the time-varying distance signals extracted from a sequence of silhouette images to reduce the dimensionality of the input feature space. Supervised pattern classification techniques are finally applied for recognition. Jang-Hee Yoo [7] proposed a method to recognise humans by their gait with a back-propagation neural network (BPN). In this method, the silhouette of the human body is extracted from the input images, and a 2D stick figure is then extracted from the body silhouette by identifying body points. An enhanced BPN algorithm is then employed to identify the gait features. Principal Component Analysis (PCA) is widely used to reduce the dimensionality of the input feature space [8]. A morphological skeleton operator, MICA, based on eigenspace and trained on the sequence of silhouette images, can also be used. The system is capable of identifying the gait features, and thereby an individual, based on a self-similarity measure. Gait analysis may also be used for age estimation and sex recognition based on gait features.

    Changes in a person's shape over time help in recognizing people by gender [9]. Important parameters such as center of mass, uncommon deviation, and height from top to toe are used [6]. One method takes samples from the unbalanced CASIA dataset. Features are extracted from seven ellipses fitted to silhouette regions, with specific features averaged over the frames of a gait sequence. Well-accepted unbiased measures are used to evaluate how effective classification is in the imbalanced context, and an SVM is used to manage the data imbalance. The classification error is estimated by 10-fold cross-validation repeated at least 10 times. Center of mass, step length, and cycle length can be used as the key features in gait recognition. An artificial neural network is then used for classification, and the results are compared based on the number of hidden layers, the choice of training algorithm, and the training parameters.
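
    The model-free pipeline described above (reduce each 2D silhouette to a 1D signal, then classify with a nearest-neighbour rule) can be sketched as follows. This is a simplified illustration only: the per-row silhouette width is used here as a stand-in for the distance signal of [10], the PCA step is omitted, and the toy masks and the labels `subject_a` and `subject_b` are hypothetical.

```python
import math


def width_signal(mask):
    """Convert a binary 2D silhouette into a 1D signal: foreground width per row."""
    signal = []
    for row in mask:
        cols = [c for c, value in enumerate(row) if value]
        signal.append(cols[-1] - cols[0] + 1 if cols else 0)
    return signal


def nearest_neighbour(query, exemplars):
    """Return the label of the exemplar signal closest to the query (Euclidean)."""
    label, _ = min(exemplars, key=lambda item: math.dist(query, item[1]))
    return label


# Hypothetical 3x3 silhouettes standing in for two enrolled subjects.
exemplars = [
    ("subject_a", width_signal([[0, 1, 0], [0, 1, 0], [1, 1, 1]])),
    ("subject_b", width_signal([[1, 1, 1], [1, 1, 1], [1, 0, 1]])),
]
query = width_signal([[0, 1, 0], [0, 1, 0], [1, 0, 1]])
predicted = nearest_neighbour(query, exemplars)
```

    In a fuller implementation the 1D signals from many frames would first be projected onto a few principal components, and the nearest-neighbour search would run in that reduced space.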


    A neural network consists of highly interconnected neural computing elements, like the brain, and its massively parallel distributed processing gives it the ability to learn and acquire knowledge. An ANN is a nonlinear information processing system built from a large number of interconnected processing elements called neurons. To analyse information, it uses a mathematical model of an interconnected group of artificial neurons based on a connectionist approach to computation. An artificial neural network is designed to resemble the biological neural system, so that it functions in a manner similar to the brain. It is formed of a large number of massively interconnected processing elements (neurons) working in unison to solve specific problems. The contribution of McCulloch and Pitts, who showed that combining many simple processing units increases computational power, is considered very significant.

    Here $x_1, x_2, \dots, x_n$ are the n inputs to the model and $w_1, w_2, \dots, w_n$ are the weights attached to the input links. The weights indicate the strength of the synapses and act as multiplicative factors on the inputs. The weighted sum of the inputs is

    $$I = \sum_{i=1}^{n} w_i x_i$$

    To get the final output, the sum is passed to an activation function $\phi(\cdot)$, which is a nonlinear filter, i.e.

    $$y = \phi(I)$$
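
    As a minimal sketch of this neuron model, the weighted sum of the inputs can be computed and passed through an activation function. The sigmoid used here and the input and weight values are illustrative assumptions, not values from this work.

```python
import math


def sigmoid(z):
    """Logistic activation: squashes any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def neuron_output(inputs, weights, activation=sigmoid):
    """Weighted sum of the inputs followed by a nonlinear activation."""
    net = sum(w * x for w, x in zip(weights, inputs))
    return activation(net)


# Hypothetical inputs and weights, chosen so that the weighted sum is zero.
y = neuron_output([1.0, 0.5, -1.0], [0.4, 0.2, 0.5])
```

    With these values the weighted sum is 0.4 + 0.1 - 0.5 = 0, so the sigmoid returns 0.5, the midpoint of its range.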

    In human beings, learning occurs by adjusting the synaptic connections that exist between neurons. The same is the case with an ANN.

    1. Components of the Network

      Weights: A neural network consists of a large number of neurons linked together by directed communication links, each associated with a weight. Weights can be set to zero or computed.

      Activation function: It is used to calculate the output response of a neuron. To obtain the response, the sum of the weighted input signals is passed through an activation function. The function is nonlinear in a multilayer network and may be linear in some cases.

      Bias: A bias acts in the same way as a weight on a connection from a unit whose activation is always 1. Increasing the bias increases the net input to the unit. The bias is used to improve the performance of the neural network, and is initialized to zero or to a specific value depending on the network.

      Threshold: The threshold $\theta$ is used to compute the activations of a given network. The output is calculated from the value of the threshold, i.e. the activation function depends on the value of $\theta$.

      The layer-by-layer arrangement of neurons and the connection pattern within and between layers is referred to as the network architecture. The neurons within a layer may be fully interconnected or not interconnected.
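
      The roles of bias and threshold can be illustrated with a short sketch. It shows that a bias gives the same net input as an extra weight on a constant input of 1, and that a step activation with a threshold is equivalent to a zero threshold with a negative bias. The weights, bias, and threshold values are hypothetical and chosen so that the unit behaves like a logical AND.

```python
def net_input(inputs, weights, bias=0.0):
    """Bias plus the weighted sum of the inputs."""
    return bias + sum(w * x for w, x in zip(weights, inputs))


def step(net, theta=0.0):
    """Threshold activation: fire (1) when the net input reaches theta."""
    return 1 if net >= theta else 0


weights = [0.5, 0.5]  # hypothetical weights

# A bias of -0.7 is the same as a weight of -0.7 on an input fixed at 1.
with_bias = net_input([1.0, 0.0], weights, bias=-0.7)
as_weight = net_input([1.0, 1.0, 0.0], [-0.7] + weights)

# A threshold of 0.7 without bias gives the same decisions as bias -0.7.
fires_with_bias = step(net_input([1.0, 1.0], weights, bias=-0.7))
fires_with_theta = step(net_input([1.0, 1.0], weights), theta=0.7)
```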

    2. Training

      Training is the cycle of modifying the weights on the connecting links between network layers with the aim of achieving the desired output; the internal process that occurs during training is called learning. Training can be classified as supervised, unsupervised, or reinforcement training. Supervised learning presents the network with a sequence of inputs and compares the outputs with the expected responses. The weights are adjusted according to the learning algorithm, and training continues until the expected response is obtained. This learning aims to minimize the current errors of all processing elements. The global error is reduced over time by continuously modifying the input weights until acceptable network accuracy is reached. In unsupervised training, the individual neurons compete with each other until a winner is established. The resulting values of the winning neurons decide the class to which a particular data set belongs. In reinforcement learning, neurons receive signals from the environment as feedback, which directly affects learning. In fact, the network is only told whether the output is correct or wrong, and it uses this information to improve its performance.
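
      The supervised cycle described above (present the inputs, compare the output with the expected response, adjust the weights, and repeat until the error is acceptable) can be sketched with the classic perceptron learning rule. This rule is used here only as a simple stand-in for a learning algorithm; it is not the training method used later in this work, and the AND data, learning rate, and epoch limit are illustrative assumptions.

```python
def predict(x, weights, bias):
    """Step-activated unit: 1 when the net input is non-negative."""
    return 1 if bias + sum(w * xi for w, xi in zip(weights, x)) >= 0 else 0


def train_perceptron(samples, lr=1, max_epochs=100):
    """Adjust weights after every misclassified sample until none remain."""
    weights, bias = [0, 0], 0
    for _ in range(max_epochs):
        errors = 0
        for x, target in samples:
            err = target - predict(x, weights, bias)
            if err != 0:
                errors += 1
                weights = [w + lr * err * xi for w, xi in zip(weights, x)]
                bias += lr * err
        if errors == 0:  # expected response reached for every input
            break
    return weights, bias


# Logical AND as a toy, linearly separable training set.
and_samples = [([0, 0], 0), ([0, 1], 0), ([1, 0], 0), ([1, 1], 1)]
weights, bias = train_perceptron(and_samples)
outputs = [predict(x, weights, bias) for x, _ in and_samples]
```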

    3. Multi Layer Perceptron

    A multilayer perceptron (MLP) is a feedforward artificial neural network model that maps input data onto outputs. An MLP consists of multiple layers of nodes arranged as a directed graph, with every layer fully connected to the next one. The MLP is trained with a supervised learning technique known as backpropagation. Layer activations are updated starting at the inputs and ending at the outputs. Each neuron calculates a weighted sum of its incoming signals to produce a net input, and passes this value through its sigmoidal activation function to obtain the neuron's activation value. The MLP is a modification of the standard linear perceptron and, unlike it, can solve problems that are not linearly separable.
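
    The forward pass and the claim about linearly inseparable problems can be illustrated with XOR, which a single perceptron cannot represent. The sketch below uses one hidden layer of two sigmoid units with hand-picked weights; these values are hypothetical and chosen only to make the behaviour easy to follow (the hidden units approximate OR and NAND, and the output unit approximates AND).

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def layer(inputs, weights, biases):
    """Activation of one fully connected layer with sigmoid units."""
    return [
        sigmoid(b + sum(w * x for w, x in zip(row, inputs)))
        for row, b in zip(weights, biases)
    ]


# Hand-picked (hypothetical) weights: hidden units act as OR and NAND.
HIDDEN_W = [[20.0, 20.0], [-20.0, -20.0]]
HIDDEN_B = [-10.0, 30.0]
# The output unit acts as AND of the two hidden activations.
OUTPUT_W = [[20.0, 20.0]]
OUTPUT_B = [-30.0]


def forward(x):
    """Propagate the input through the hidden layer, then the output layer."""
    hidden = layer(x, HIDDEN_W, HIDDEN_B)
    return layer(hidden, OUTPUT_W, OUTPUT_B)[0]


xor_outputs = [round(forward(x)) for x in ([0, 0], [0, 1], [1, 0], [1, 1])]
```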

    • Architecture

    The multilayer perceptron consists of three or more layers (an input layer and an output layer with one or more hidden layers) of nonlinearly activating nodes and is thus considered a deep neural network. Every node in one layer connects with a certain weight $w_{ij}$ to every node in the following layer. Learning occurs in the perceptron by changing the connection weights after each piece of data is processed, based on the amount of error in the output compared with the expected result. This is supervised learning, and is carried out through backpropagation, a generalization of the least mean squares algorithm in the linear perceptron.

    The error in output node $j$ for the $n$th data point is represented by

    $$e_j(n) = d_j(n) - y_j(n)$$

    where $d$ is the target value and $y$ is the value generated by the perceptron.

    The weights of the nodes are then corrected so as to minimize the error in the entire output, given by

    $$\mathcal{E}(n) = \frac{1}{2} \sum_j e_j^2(n)$$

    Using gradient descent, the change in each weight is given by

    $$\Delta w_{ij}(n) = -\eta \frac{\partial \mathcal{E}(n)}{\partial v_j(n)} y_i(n)$$

    where $v_j$ is the net input (weighted sum) of node $j$, $y_i$ is the output of the preceding neuron, and $\eta$ is the learning rate, which is chosen carefully so that the weights converge quickly to a response without oscillations.
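
    The gradient-descent weight update described above can be sketched for a small network with one hidden layer of sigmoid units, updating the weights after each data point. This is a minimal illustration on XOR data, not the network or data used in this work; the hidden layer size, learning rate, epoch count, and random seed are all assumptions.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_xor(epochs=5000, eta=0.5, n_hidden=4, seed=0):
    """Train a 2-n_hidden-1 MLP by backpropagation; return error per epoch."""
    rng = random.Random(seed)
    data = [([0.0, 0.0], 0.0), ([0.0, 1.0], 1.0),
            ([1.0, 0.0], 1.0), ([1.0, 1.0], 0.0)]
    # Each weight row ends with a bias weight fed by a constant input of 1.
    w_h = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(n_hidden)]
    w_o = [rng.uniform(-1, 1) for _ in range(n_hidden + 1)]
    history = []
    for _ in range(epochs):
        total = 0.0
        for x, d in data:
            xb = x + [1.0]
            h = [sigmoid(sum(w * v for w, v in zip(row, xb))) for row in w_h]
            hb = h + [1.0]
            y = sigmoid(sum(w * v for w, v in zip(w_o, hb)))
            e = d - y                      # output error e = d - y
            total += 0.5 * e * e           # contribution to the summed error
            # Negative error gradient w.r.t. each node's net input.
            delta_o = e * y * (1.0 - y)
            delta_h = [delta_o * w_o[j] * h[j] * (1.0 - h[j])
                       for j in range(n_hidden)]
            # Weight change = eta * delta * output of the preceding neuron.
            w_o = [w + eta * delta_o * v for w, v in zip(w_o, hb)]
            for j in range(n_hidden):
                w_h[j] = [w + eta * delta_h[j] * v
                          for w, v in zip(w_h[j], xb)]
        history.append(total)
    return history


history = train_xor()
```

    Comparing `history[0]` with `history[-1]` shows how far the summed squared error has fallen over training. A learning rate that is too large would make this curve oscillate, which is the behaviour the choice of learning rate is meant to avoid.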

  5. GAIT RECOGNITION SYSTEM

    The steps involved in gait recognition are:

    1. The silhouette is extracted using background subtraction and then analyzed during preprocessing to remove disturbance components.

    2. The extracted silhouette is resized by cropping to obtain the image region of interest.

    3. The proposed features are extracted; these are unique to each individual and hence act as signature features.

    4. The feature matrix is given to the classifier for identification.
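
    The first two steps can be sketched on a toy grayscale frame. This is a simplified illustration that assumes a static, known background image and a fixed intensity threshold; the threshold value and the function names are hypothetical, and noise removal is omitted.

```python
def extract_silhouette(frame, background, threshold=30):
    """Binary mask: 1 where the frame differs from the background."""
    return [
        [1 if abs(f - b) > threshold else 0 for f, b in zip(frow, brow)]
        for frow, brow in zip(frame, background)
    ]


def bounding_box(mask):
    """Inclusive (top, bottom, left, right) limits of the foreground."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    if not rows:
        raise ValueError("no foreground pixels found")
    cols = [c for c in range(len(mask[0])) if any(row[c] for row in mask)]
    return rows[0], rows[-1], cols[0], cols[-1]


def crop(mask, box):
    """Keep only the region of interest around the silhouette."""
    top, bottom, left, right = box
    return [row[left:right + 1] for row in mask[top:bottom + 1]]


# Toy 5x6 frame: uniform background with a bright 3x2 "person".
background = [[10] * 6 for _ in range(5)]
frame = [row[:] for row in background]
for r in range(1, 4):
    for c in range(2, 4):
        frame[r][c] = 200

mask = extract_silhouette(frame, background)
box = bounding_box(mask)
silhouette = crop(mask, box)
```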


  1. Feature Extraction

    Feature extraction involves extracting the features of each silhouette image. First, the image is converted into binary format. The first feature is the center of mass, which is located within the region of interest of the silhouette.

    The next feature is the step length. It is calculated from the width of the bounding box and varies as the person moves.

    The next feature is based on the locations of the coxa (hip) point and the left and right knee points. As seen in the figures below, the coxa point is located using the height of the silhouette. Next, the right and left knee points are marked. This completes the location task. A triangle is then formed from these three points and the area of this region is calculated.

    For the silhouette shown in the above figure, the features extracted are:

    • Centroid = (261.0038, 157.6532)

    • Step length = 48

    • Area = 104
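
    The three features can be computed from a binary silhouette roughly as follows. This is a sketch under stated assumptions: the centroid is returned as (row, column), the step length is taken as the bounding-box width in pixels, and the coxa and knee points are passed in as hypothetical coordinates, since the rule used to locate them is not given in detail here. The triangle area uses the standard shoelace formula.

```python
def centroid(mask):
    """Mean (row, column) position of all foreground pixels."""
    points = [(r, c) for r, row in enumerate(mask)
              for c, value in enumerate(row) if value]
    n = len(points)
    return (sum(r for r, _ in points) / n, sum(c for _, c in points) / n)


def step_length(mask):
    """Width of the bounding box around the foreground, in pixels."""
    cols = [c for row in mask for c, value in enumerate(row) if value]
    return max(cols) - min(cols) + 1


def triangle_area(p1, p2, p3):
    """Area of the triangle with vertices p1, p2, p3 (shoelace formula)."""
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    return abs(x1 * (y2 - y3) + x2 * (y3 - y1) + x3 * (y1 - y2)) / 2


# Toy 5x5 silhouette with legs apart.
mask = [
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 0, 1, 0],
    [1, 0, 0, 0, 1],
]
center = centroid(mask)
width = step_length(mask)
# Hypothetical (x, y) positions of the coxa point and the two knee points.
area = triangle_area((2, 2), (1, 3), (3, 3))
```

    One feature vector per frame, formed from these values, would then make up a row of the feature matrix passed to the classifier.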

  2. Training Results using Multilayer Perceptron Network

    Performance Analysis for dataset 1:

    The figures below show the analysis for the MLP network with 30:30 hidden neurons. The network response graph indicates the number of samples in each output group. The total hit rate of the network is shown by the performance plot. The gradient for this data is 0.025731 at epoch 18. Regression plots indicate how closely the output values match the target values, as seen in the figure. The training state of the network indicates the best validation performance, which is 0.044095 at epoch 12.

    Performance Analysis for dataset 2:

    The figures below indicate the performance of the MLP network on dataset 2 with 29:29 hidden neurons. The gradient for this data is 0.036217 at epoch 14. Regression plots indicate how closely the output values match the target values, as seen in the figure. The training state of the network indicates the best validation performance, which is 0.046401 at epoch 8.
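
    The stopping rule used during training is not stated. In both datasets the last reported epoch falls six epochs after the epoch of best validation performance (12 and 18, 8 and 14), which would be consistent with stopping once the validation error has failed to improve for six consecutive epochs. The sketch below illustrates such a rule under that assumption; the patience value is inferred rather than taken from the source, and the validation curve is made up for illustration.

```python
def early_stopping(val_errors, patience=6):
    """Return (best_epoch, stop_epoch) for a list of validation errors.

    Training stops once the error has not improved for `patience`
    consecutive epochs, or when the list runs out.
    """
    best_epoch, best_error, fails = 0, float("inf"), 0
    for epoch, err in enumerate(val_errors):
        if err < best_error:
            best_epoch, best_error, fails = epoch, err, 0
        else:
            fails += 1
            if fails >= patience:
                return best_epoch, epoch
    return best_epoch, len(val_errors) - 1


# Made-up curve: error falls until epoch 12 and rises afterwards.
curve = [0.5 - 0.035 * e for e in range(13)]
curve += [0.08 + 0.01 * k for k in range(1, 11)]
best_epoch, stop_epoch = early_stopping(curve)
```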

  3. Testing Results using Multilayer Perceptron Network




    [Table: Neurons in First layer, Neurons in Second layer, and Validation Performance for Dataset 1 and Dataset 2]

  1. Amin, Tahir, and Dimitrios Hatzinakos. "Determinants in human gait recognition." In 2012 25th IEEE Canadian Conference on Electrical and Computer Engineering (CCECE), pp. 1-4. IEEE, 2012.

  2. Liao, Rijun, Shiqi Yu, Weizhi An, and Yongzhen Huang. "A model-based gait recognition method with body pose and human prior knowledge." Pattern Recognition 98 (2020): 107069.

  3. Rida, Imad, Noor Almaadeed, and Somaya Almaadeed. "Robust gait recognition: a comprehensive survey." IET Biometrics 8, no. 1 (2019): 14-28.

  4. Khandizod, Anita. "IJERT-Multimodal Biometric System by Feature Level Fusion of Palmprint and Fingerprint."

  5. Ganiyu, Shefiu O., Mikail O. Olaniyi, Olawale Surajudeen Adebayo, and Terfa Daniel Akpagher. "Systematic Review of Facial Recognition Algorithms and Approaches for Crime Investigations." (2020).

  6. Dash, Sachikanta, and Rajendra Kumar Das. "An implementation of neural network approach for recognition of handwritten Odia text." In Advances in Intelligent Computing and Communication, pp. 94-99. Springer, Singapore, 2020.

  7. Yoo, Jang-Hee, Doosung Hwang, and Mark S. Nixon. "Gender classification in human gait using support vector machine." In International Conference on Advanced Concepts for Intelligent Vision Systems, pp. 138-145. Springer, Berlin, Heidelberg, 2005.

  8. Han, Ju, and Bir Bhanu. "Individual recognition using gait energy image." IEEE Transactions on Pattern Analysis and Machine Intelligence 28, no. 2 (2006): 316-322.

  9. Yoo, Jang-Hee, Doosung Hwang, Ki-Young Moon, and Mark S. Nixon. "Automated human recognition by gait using neural network." In 2008 First Workshops on Image Processing Theory, Tools and Applications (IPTA 2008), pp. 1-6. IEEE, 2008.

  10. Wang, Liang, Tieniu Tan, Huazhong Ning, and Weiming Hu. "Silhouette analysis-based gait recognition for human identification." IEEE transactions on pattern analysis and machine intelligence 25, no. 12 (2003): 1505-1518.
