Mouse Movement Through Finger By Image Recognition Process

DOI : 10.17577/IJERTV2IS2534



Gowrishankar R.¹, Vinodhkumar B.²

¹PG Scholar, Sri Shakthi Institute of Engineering and Technology, Coimbatore-641062

²Assistant Professor/ECE, Sri Shakthi Institute of Engineering and Technology, Coimbatore-641062

Abstract: Many input devices are used to interact with computers, or more precisely with the digital world, but comparatively little research has addressed control through body gestures. This project aims to replace the functions of hardware input devices such as the mouse and keyboard with the help of Sixth Sense technology. The mouse cursor is moved with a finger, using image recognition and color recognition. Image processing for both steps is done in MATLAB. Red, green and blue are the default colors used for recognition. Live video is captured from a camera and converted into a sequence of frame images. A color sorting algorithm is applied to the frame images to determine the majority color present in each frame. To demonstrate the mouse movement function, the majority color in a video is identified and a corresponding video is played from the database.

Keywords: gesture movement, RGB, image recognition


    Sixth Sense Technology is a way to augment the physical world directly without using dedicated electronic chips. Sixth Sense is a set of wearable devices that acts as a gestural interface: it augments the physical world around us with digital information and lets users interact with that information through natural hand gestures. The technology is gaining popularity because of its usability, simplicity and ability to work independently. Other modern technologies, such as touch screens, are also widely used; they not only save time but also improve ease of use.

    1. Components Used

      The implementation consists of five main components, plus an optional microphone that can be used for speech recognition. Together they act as a single system, and each device has an important role in it. The devices are a webcam or digital camera, colored caps or markers, a phone or laptop, a projector and a mirror. The camera captures the objects within its field of view and follows the user's hand gestures, sending the data to the phone or laptop connected to it. The camera acts as a digital eye connecting the user to the digital world. Colored caps or markers are attached to the user's fingertips. Marking the fingers with different colors helps the webcam recognize the gestures made by the fingers. The movements and arrangements of these markers are interpreted as gestures that act as interaction instructions for the projected application interfaces. A phone or laptop with web-enabled services is the processing device that processes the input video data sent by the camera. Other software on it searches the web and interprets the gestures. The projector projects the visual information, enabling surfaces and physical objects to be used as interfaces. It contains its own battery, with a battery life of around three hours. The tiny LED projector displays the data sent from the phone on any surface in view, such as an object, a wall or a person. The mirror is used to display the projected images or videos by reflecting them onto the desired object or surface.

      This method bridges the gap between the physical world and the digital world, but it still depends closely on factors such as the position of the projector and the position of the camera capturing the gestures, and the accuracy of the projected output depends on the accuracy of the input taken by the camera. Because the position of the camera is a major constraint on image capture and projected output, and directly affects efficiency and accuracy, commands are used alongside hand gestures. To remove this constraint, actions that we regularly perform in daily life are converted to commands. Speech is also important in this method: spoken commands can be stored in a database in the integrated circuit, and the corresponding actions are performed when the user's speech is recognized.

    2. About The System

    The mouse is the most popular input device used today for human interaction with computer systems; it lets the user interact with the digital world by hand. Its popularity dates from the rise of GUI-based operating systems such as Microsoft Windows 95, Windows 98, Macintosh, Amiga, Symbian OS and many more, but it still depends on the system. When the mouse is moved, its position changes relative to the surface on which it rests; this motion is sent to the device controller as two-dimensional coordinates, which drive the cursor movement on the screen. Mouse clicks generate signals when the buttons are pressed; these act as input to pre-programmed routines that fire an event or trigger an activity accordingly. Earlier this was done using electronic chips, but here we emphasize image recognition instead. Other applications can be developed using a similar approach, such as taking live pictures of the physical world and saving them to the system using hand gestures, checking the current time, watching videos, drawing scenery and many more. This makes the system more intuitive, letting the user paint the physical world with digital information in their own way.
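    The relative-motion behaviour described above can be sketched in a few lines. This is a minimal illustration in Python rather than the MATLAB used in this work; the function names, the screen size and the handler table are hypothetical and are not taken from any real driver.

```python
def move_cursor(position, delta, screen_size):
    """Apply a relative (dx, dy) mouse motion to an absolute cursor position.

    The result is clamped so the cursor never leaves the screen.
    """
    x, y = position
    dx, dy = delta
    width, height = screen_size
    new_x = min(max(x + dx, 0), width - 1)
    new_y = min(max(y + dy, 0), height - 1)
    return (new_x, new_y)


def dispatch_click(button, handlers):
    """Fire the handler registered for a button press, if there is one."""
    handler = handlers.get(button)
    return handler() if handler else None


# Hypothetical usage: a 640x480 screen and one registered click handler.
cursor = move_cursor((100, 100), (5, -3), (640, 480))
result = dispatch_click("left", {"left": lambda: "open"})
```

    The image recognition approach keeps the second half of this pipeline (cursor position and events) and replaces only the source of the motion data.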



    The first step of the project is to capture live video and convert it into frames. The camera is the key input device of the Sixth Sense system and acts as its digital eye. It captures the scene the user is looking at. The video stream captured by the camera is passed to the mobile computing device, which performs the appropriate computer vision computation. The camera does the following: it captures the user's hand movements and gestures (used in recognition of user gestures); it captures the scene in front of the user and the objects the user is interacting with (used in object recognition and tracking); it takes a photo of the scene in front when the user performs a framing gesture; and it captures the scene of the projected interface (used to correct the alignment, placement and look and feel of the projected interface components). Video capture is the main process of the system because the frames are derived from the video alone. The user can record live video using an external camera or the system's internal camera. The captured video is then passed to a laptop or computer with MATLAB installed, where it is converted into consecutive frames.


    Matlab is a data analysis and visualization tool which has been designed with powerful support for matrices and matrix operations. In addition, Matlab has excellent graphics capabilities and its own powerful programming language. One of the reasons that Matlab has become such an important tool is the use of sets of Matlab programs designed to support a particular task. These sets of programs are called toolboxes, and the particular toolbox of interest to us is the Image Processing Toolbox. Rather than give a description of all of Matlab's capabilities, we shall restrict ourselves to just those aspects concerned with the handling of images. A Matlab function is a keyword which accepts various parameters and produces some sort of output: for example a matrix, a string or a graph. Examples of such functions are sin, imread and imclose. There are many functions in Matlab, and it is very easy (and sometimes necessary) to write our own. Matlab's standard data type is the matrix: all data are considered to be matrices of some sort. Images, of course, are matrices whose elements are the grey values (or possibly the RGB values) of their pixels. Single values are considered by Matlab to be 1 x 1 matrices, while a string is merely a 1 x n matrix of characters, n being the string's length. When you start up Matlab, you have a blank window called the Command Window in which you enter commands.


    After the live video is captured, it is passed to the laptop with MATLAB installed; in this project a MATLAB program converts the live video into consecutive frames. AVI video files are normally used for capture and conversion. If the camera records in a different format, the video is first converted into AVI format and then given to the MATLAB program for conversion into images. The image size must be 480*640 and the frame rate is 29.9. For a 15-second video, approximately 300 frames are converted, at 24 bits per pixel. Image recognition and color recognition are performed on these frames.
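    The figures above can be checked with some simple arithmetic. The sketch below is in Python, not the MATLAB program used in this work, and assumes an already decoded, uncompressed RGB byte stream; the helper names are hypothetical. Note that 15 seconds at 29.9 frames per second gives about 448 frames, whereas 300 frames corresponds to roughly 20 frames per second, so presumably not every captured frame was converted; that reading is an assumption.

```python
def frame_count(duration_s, fps):
    """Number of whole frames in a clip of the given duration."""
    return int(duration_s * fps)


def frame_bytes(width, height, bits_per_pixel=24):
    """Uncompressed size of one frame in bytes."""
    return width * height * bits_per_pixel // 8


def split_frames(raw, size):
    """Slice a raw byte stream into consecutive frames of `size` bytes.

    Assumes decoded, uncompressed video; a trailing partial frame is dropped.
    """
    usable = len(raw) - len(raw) % size
    return [raw[i:i + size] for i in range(0, usable, size)]


per_frame = frame_bytes(640, 480)          # 921600 bytes at 24 bits per pixel
print(frame_count(15, 29.9))               # 448 frames at the stated rate
print(frame_count(15, 20))                 # 300 frames at 20 frames per second
print(300 * per_frame)                     # 276480000 bytes for 300 frames
```

    Roughly 276 MB of uncompressed pixel data for 300 frames is one reason the processing time grows quickly with clip length.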


    The color recognition process operates on the images converted from the video. Recognition uses the primary colors red, green and blue, and a color sorting algorithm calculates the majority color present in each frame of the video. When live video is captured through the camera, it is converted into consecutive frames, and each image is considered for color recognition. Each image is divided into different layers, i.e. input image, binary image, mean image and finally the output image. The color detection algorithm scans every frame for pixels of a particular quality. To recognize a pixel as part of a valid object, its Y, U and V components must fall within the ranges defined in the Thresholds section of the color definition file. This is a regular text file with at least two sections, Colors and Thresholds. The Colors section has an entry for each object to be detected. It defines an RGB color triplet, a merge parameter (0–1), a color identifier (0–31) and a color label (text). Every entry in Colors must have a corresponding entry in Thresholds, which defines ranges for a pixel's Y component (brightness), its U component (first color attribute) and its V component (second color attribute).
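    The threshold test on the Y, U and V components can be illustrated as follows. This is a minimal Python sketch, not the MATLAB code of this work. It assumes the common BT.601 analog conversion from RGB to YUV, since the text does not say which variant is used, and the threshold ranges for red are hypothetical values chosen only for the example.

```python
def rgb_to_yuv(r, g, b):
    """Convert an 8-bit RGB triplet to YUV (BT.601 analog form, an assumption)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b   # brightness
    u = 0.492 * (b - y)                      # first color attribute
    v = 0.877 * (r - y)                      # second color attribute
    return y, u, v


# Hypothetical Thresholds entry for a red marker: (low, high) per component.
RED_THRESHOLDS = {"y": (40, 120), "u": (-60, -10), "v": (100, 180)}


def within_thresholds(pixel_rgb, thresholds):
    """True if all three YUV components of the pixel fall inside their ranges."""
    y, u, v = rgb_to_yuv(*pixel_rgb)
    components = {"y": y, "u": u, "v": v}
    return all(low <= components[name] <= high
               for name, (low, high) in thresholds.items())


is_red = within_thresholds((255, 0, 0), RED_THRESHOLDS)    # pure red passes
is_blue = within_thresholds((0, 0, 255), RED_THRESHOLDS)   # pure blue fails
```

    Testing in YUV rather than RGB separates brightness from color, so a marker can still be matched when the lighting changes, provided the Y range is set wide enough.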

    Color markers are used for color recognition while the live video is recorded. The RGB color markers are recognized in each image captured by the camera. For each frame, MATLAB calculates the majority color present.
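    One plausible way to compute a majority color is per-pixel voting, sketched below in Python. The text does not spell out the color sorting algorithm, so this is an assumption and not the authors' MATLAB implementation; the `min_margin` parameter and the function names are hypothetical. A frame is represented as rows of (R, G, B) tuples.

```python
from collections import Counter


def dominant_channel(pixel, min_margin=50):
    """Label a pixel red, green or blue if one channel clearly dominates.

    Returns None for greyish pixels where the top channel does not exceed
    the second-highest one by at least `min_margin`.
    """
    r, g, b = pixel
    ranked = sorted((("red", r), ("green", g), ("blue", b)),
                    key=lambda item: item[1], reverse=True)
    (name, top), (_, second) = ranked[0], ranked[1]
    return name if top - second >= min_margin else None


def majority_color(frame, min_margin=50):
    """Return the label that wins the most pixel votes, or None if no pixel votes."""
    votes = Counter()
    for row in frame:
        for pixel in row:
            label = dominant_channel(pixel, min_margin)
            if label is not None:
                votes[label] += 1
    return votes.most_common(1)[0][0] if votes else None


# A tiny 3x2 frame: three reddish pixels, one blue pixel, two grey pixels.
sample = [[(200, 10, 10), (210, 20, 30), (90, 90, 90)],
          [(10, 10, 220), (220, 30, 20), (128, 128, 128)]]
print(majority_color(sample))  # red
```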


    Once color recognition is complete, the next step is to display the default video stored in the database for the detected primary color. Three different videos are assumed, one for each primary color. Color recognition calculates the majority color present in the image, and the corresponding video is displayed depending on that color. The default videos stored in the database are in AVI format because MATLAB recognizes that type of video format.
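    The selection step amounts to a lookup from majority color to a stored file. The Python sketch below assumes the "database" is a simple table of three AVI files; the file names are hypothetical, and combining the per-frame results by a clip-level vote is also an assumption rather than something stated in the text.

```python
from collections import Counter

# Hypothetical database: one default AVI file per primary color.
DEFAULT_VIDEOS = {
    "red": "red_default.avi",
    "green": "green_default.avi",
    "blue": "blue_default.avi",
}


def select_video(majority, videos=DEFAULT_VIDEOS):
    """Return the default video for a majority color, or None if unknown."""
    return videos.get(majority)


def select_video_for_clip(frame_labels, videos=DEFAULT_VIDEOS):
    """Pick the video for the color that wins the most frames in a clip.

    `frame_labels` holds one majority-color label per frame; frames with no
    recognized color are given as None and ignored.
    """
    votes = Counter(label for label in frame_labels if label is not None)
    if not votes:
        return None
    return select_video(votes.most_common(1)[0][0], videos)


chosen = select_video_for_clip(["red", "red", None, "blue", "red"])
```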



    The methodology is based on Sixth Sense Technology, in which the user has several devices that together act as a system. The aim of the system is to move the mouse cursor as the user moves his or her fingers. For this purpose, three components of Sixth Sense are used: the camera, the colored caps, and MATLAB installed on a laptop. The approach works continuously: the camera records live video and sends it to the laptop connected to it, where MATLAB processes the input and recognizes the colors at the user's fingertips. The MATLAB code converts the incoming live video into frames, i.e. the video is sliced into images. The images obtained from this slicing are then processed for color recognition.

    The output of the color recognition process is a set of images that contain only the colors of the caps at the user's fingertips. Neither the user's fingers nor any background colors appear in these output images. To achieve this, the RGB values of the color caps are set in the code beforehand, so that no color other than the cap colors is detected. The output images are displayed in sequence at the same speed at which the video is sliced, so that they look like a continuous movie in which the input is the physical world and the output is only the colors present at the user's fingertips. The color is then associated with the mouse cursor in code, so that whenever the color moves from one position to another in the output image, the mouse cursor moves to the position where the color is now displayed.
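    The steps above (keep only the cap color, locate it, move the cursor there) can be sketched as follows. This is a Python illustration under stated assumptions, not the authors' MATLAB code: the tolerance value, the use of the centroid as the marker position and the linear frame-to-screen mapping are all assumptions, and the function names are hypothetical. Actually moving the operating system cursor is left out.

```python
def color_mask(frame, target, tolerance=60):
    """Mark pixels whose R, G and B values are all within `tolerance` of `target`."""
    tr, tg, tb = target
    return [[abs(r - tr) <= tolerance and abs(g - tg) <= tolerance
             and abs(b - tb) <= tolerance for (r, g, b) in row]
            for row in frame]


def centroid(mask):
    """Mean (x, y) position of the marked pixels, or None if there are none."""
    sum_x = sum_y = count = 0
    for y, row in enumerate(mask):
        for x, hit in enumerate(row):
            if hit:
                sum_x += x
                sum_y += y
                count += 1
    return (sum_x / count, sum_y / count) if count else None


def to_screen(point, frame_size, screen_size):
    """Scale a frame coordinate linearly to a screen coordinate."""
    x, y = point
    frame_w, frame_h = frame_size
    screen_w, screen_h = screen_size
    return (round(x * (screen_w - 1) / (frame_w - 1)),
            round(y * (screen_h - 1) / (frame_h - 1)))


# A 5x3 black frame with a red cap covering three pixels in the middle row.
BLACK, RED = (0, 0, 0), (255, 0, 0)
frame = [[BLACK] * 5,
         [BLACK, RED, RED, RED, BLACK],
         [BLACK] * 5]
marker = centroid(color_mask(frame, RED))        # (2.0, 1.0)
cursor = to_screen(marker, (5, 3), (101, 51))    # (50, 25)
```

    Using the centroid makes the cursor position fairly stable even when the edge of the cap flickers between frames.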


    The image processing and color recognition were simulated in MATLAB, processing the images from the live camera video to obtain a final image showing only the required colors. The algorithm returns an image showing only the desired colors, i.e. those placed at the user's fingertips.

    Fig 1 Input image

    The above figure is the input image for the color recognition process. It is given to the MATLAB software to separate the colors present in the image. Using the color sorting algorithm, the image is processed into an input image, a binary image and a mean image.

    The binary image is the second stage of color separation. Here the image is divided into two layers, black and white, showing the maximum color present in the image.

    Fig 2 binary output

    After the binary output image, the mean image is considered for color separation. It shows the maximum color present in the image and is used to calculate the majority color.
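    The text does not define the mean image precisely. One plausible reading, sketched here in Python as an assumption rather than the authors' MATLAB method, is that each color channel is averaged over the image and the channel with the largest mean is reported as the majority color. The function names are hypothetical.

```python
def channel_means(frame):
    """Average R, G and B values over all pixels of a frame."""
    totals = [0, 0, 0]
    count = 0
    for row in frame:
        for pixel in row:
            for index, value in enumerate(pixel):
                totals[index] += value
            count += 1
    if count == 0:
        raise ValueError("frame contains no pixels")
    return tuple(total / count for total in totals)


def color_from_means(means):
    """Name of the channel with the largest mean value."""
    names = ("Red", "Green", "Blue")
    return names[max(range(3), key=lambda index: means[index])]


means = channel_means([[(200, 10, 10), (100, 50, 30)]])   # (150.0, 30.0, 20.0)
print(color_from_means(means))                             # Red
```

    Averaging is cheaper than per-pixel classification but is more easily skewed by a bright, colored background.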

    Fig 3 mean output

    In the above figure, the majority color for this particular image is Red, as calculated by the MATLAB program and displayed in its command window.

    Fig 4 command window

    C. DISPLAY THE DEFAULT VIDEO

    After the majority color in the image is determined, the next step is to display the default video stored in the database. Three different videos are used, one for each primary color (RGB), so that they correspond to the majority color displayed in the command window. The default video is selected from the database by this color alone.


The use of image processing and color recognition in MATLAB for this method proved practically successful. The default video was displayed for the majority color detected in the live video and also for the sample input video. Because the time complexity of the approach is high, it has considerable potential once it is further optimized and run on hardware with better specifications.


