A Novel System for Repair Guidance Using Augmented Reality

DOI : 10.17577/IJERTV1IS3233


Sidharth Bhatia, P. Vijayakumar

Dept. of ECE, SRM University, Kattankulathur, Chennai, Tamil Nadu, India


Daily life revolves around appliances and gadgets such as mobile phones, laptops, printers, microwave ovens and washing machines. Most people take these devices for granted until one of them stops working as expected. Taking a device to a repair shop for every small glitch is expensive as well as time consuming. Most manufacturers supply basic manuals that explain how to solve minor issues, but reading a manual while repairing the corresponding appliance/gadget can be a frustrating task. These problems can be reduced to a large extent if some kind of live guidance is available. In this paper we propose an augmented reality based system that guides the user through small-scale repair jobs on these gadgets. All that is required is a decent webcam and a computing device with a processor of 1 GHz or more and a display screen.

  1. Introduction

    The field of augmented reality has grown by leaps and bounds during the past decade. In our daily lives we can see augmented reality at work in sports, where cricket or rugby pitches show sponsor advertisements. BMW, Corvette and GM use augmented reality HUDs (Heads-Up Displays) in their cars, mostly for meter and traffic information. A similar kind of application is used in fighter jets for displaying in-flight and combat information. In this paper we discuss an augmented reality based application that guides the user in repairing an appliance/gadget by overlaying computer generated graphical information on the input taken by a webcam. To make the repair instructions easy to understand, the user simultaneously gets audio instructions corresponding to the graphical information. As a proof of concept we have applied our method to help the user change the SIM card of a NOKIA 1600 mobile phone.

    The paper is divided into eight sections. The second section briefly discusses the work done previously in this domain and our contributions to it. The third section describes our method with the help of flow charts; it briefly discusses the algorithms used and how and where they fit into the overall scheme. The fourth section describes the experimental results that have been achieved and the performance characteristics of our method. The fifth section compares the performance of our method with that of existing ones. The sixth section covers the limitations of this method and thoughts on how they may be tackled. The seventh section discusses our future goals for this method. The last section concludes the paper.

  2. Related work and our contributions

    A notable work in the field of augmented reality is SixthSense by Pranav Mistry [1]. This system, like our own, comprises a camera and a computing device that processes the captured information, but instead of a display screen it has a projector. The system lets the user interact with the computer and obtain information via gestures and coloured bands on the user's fingers. Other applications such as Layar and Wikitude, which augment information about the surroundings onto the smartphone's camera view, have also been developed for smartphones.

    Another work worth mentioning is that of Juan et al. [2]. This is one of the first AR applications aimed at treating psychological disorders. The AR system works with the help of a transparent HMD (Head Mounted Display). When switched on, the system makes the user see cockroaches all around and over him/her, and treatment is carried out accordingly. An application in the field of defense is discussed in [3]; it helps mechanics perform routine maintenance tasks inside an armored vehicle turret. In [4] an application in the medical field is discussed. The authors claim to have made an augmented reality system in which heart beat evaluation and visualization is done. The system collects cardiologic data in various circumstances, for example while the person using the system is running, playing or sleeping. It then processes the data and displays it in an augmented reality environment in a meaningful form, so that even people who do not have much knowledge about the heart can make sense of the data and take precautionary measures to avoid heart problems. The work by Kazuyo Iwamoto et al. in [5] discusses an augmented reality system that uses a monocular video see-through HMD to present instructions on a work object, for an experiment in which the user performs line heating for plate bending. The system augments a straight line onto a plate, which the user has to follow while heating in order to bend the plate effectively.

    Our contribution to this growing world of augmented reality applications is a system that helps the user repair an appliance/gadget by augmenting repair information onto the camera feed and displaying it on a PC/laptop/tablet screen. With this application, even a person with no technical know-how about how the specific appliance/gadget works can easily perform small-scale repairs on it. As stated above, we demonstrate our concept by performing the task of changing the SIM card of a NOKIA 1600 mobile phone. An example of the output the system gives to the user is shown in Fig. 1.

    Fig. 1. An example of the output of our method.

  3. Method

    In this section the work flow of our method is discussed. It relies heavily on two independent works, by Satoshi Suzuki et al. and John Canny, described in [6] and [7] respectively. The constituent frames of the video feed from the camera are of size 320×240 and are in 8-bit RGB colour format. The flow diagram of the overall system is shown in Fig. 2. The system has two main phases, Detection and Tracking.

    Fig. 2. The overall work flow of our method.

    The detection phase is responsible for finding the object as a whole within the camera frame. The tracking phase generates the guidance information, augments it onto the frame and displays the result in the output window.

    The whole system consists of different stages which execute one after another: a stage starts once the previous stage is finished and the user has requested directions for the next stage. Each stage has its own detection and tracking phases and is therefore independently responsible for guiding the user to perform a specific task on the phone. There are 5 stages in all in our application of changing the SIM card of a Nokia 1600 mobile phone. They are summarised in Fig. 3.

    Fig. 3. Stages in overall working of the system.
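The stage and phase structure described above can be sketched as a small state machine. This is only an illustration in Python, not the authors' C/OpenCV implementation. The class `RepairGuide`, its methods and the `detect`/`track` callbacks are hypothetical names, and the sketch assumes that a stage stays in its tracking phase once detection has succeeded, which the text does not state explicitly.

```python
class RepairGuide:
    """Hypothetical controller: one detection and one tracking phase per stage."""

    def __init__(self, num_stages=5):
        self.num_stages = num_stages  # 5 stages in the SIM card example
        self.stage = 1
        self.box = None               # bounding box saved by the detection phase

    @property
    def detection_flag(self):
        """High once the detection phase has found the object."""
        return self.box is not None

    def process_frame(self, frame, detect, track):
        """Run detection while the flag is low, tracking once it is high."""
        if not self.detection_flag:
            # detect() is assumed to return a bounding box, or None on a miss.
            self.box = detect(self.stage, frame)
            return "detection"
        # track() is assumed to draw the guidance overlay for this stage.
        track(self.stage, frame, self.box)
        return "tracking"

    def next_stage(self):
        """Called when the user requests directions for the next stage."""
        if self.stage < self.num_stages:
            self.stage += 1
            self.box = None           # each stage runs its own detection phase
        return self.stage
```

Keeping the per-stage logic behind two callbacks mirrors the statement that every stage is independently responsible for its own detection and tracking.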

    1. Detection phase

      The detection phase, as already discussed, is only responsible for detecting the object as a whole within the given frame and finding the coordinates of its bounding rectangle. The functioning of the detection phase is very similar, but not identical, for every stage. We discuss the first stage detection in detail. The whole process is described with the help of two continuous flow charts shown in Fig. 5(a) and (b). To find the edges using the Canny edge detection method of [7], we must have a gray scale image to operate on. We get this image by applying the following equation to every pixel.

      Y=0.299*R+0.587*G+0.114*B (1)

      In the above equation Y is the gray scale value of the pixel obtained from the corresponding Red (R), Green (G) and Blue (B) components of the pixel in question.
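As a minimal sketch, equation (1) can be applied to a frame stored as rows of (R, G, B) tuples. The function names are hypothetical, and rounding the result to the nearest integer is an assumption; the text does not say how Y is quantised.

```python
def rgb_to_gray(r, g, b):
    """Equation (1): weighted sum of the 8-bit R, G and B components."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    return int(round(y))  # rounding to the nearest integer is an assumption


def frame_to_gray(frame):
    """Convert rows of (R, G, B) tuples into rows of gray scale values."""
    return [[rgb_to_gray(r, g, b) for (r, g, b) in row] for row in frame]


# A 1x3 toy frame: pure red, pure green, pure blue.
toy_frame = [[(255, 0, 0), (0, 255, 0), (0, 0, 255)]]
toy_gray = frame_to_gray(toy_frame)  # [[76, 150, 29]]
```

The weights sum to 1, so a white pixel maps to 255 and a black pixel to 0; green contributes most to the perceived brightness.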

      After getting the gray scale image we apply the procedure of [7] to get a binary image consisting of the edges of the original colour frame. During the hysteresis thresholding stage of [7] the thresholds are kept at 0 and 150. By following this procedure we get a binary image consisting of the strong boundaries, which is what we require. A sample output of this procedure is shown in Fig. 4, where the upper image is the source image and the lower image is the binary image found after applying the Canny edge detection method.

      After getting the binary image consisting of the edges, the next step is to find the contours in that binary image by using the method described in [6]. This method performs a topological analysis of the binary image. It consists of two algorithms which in effect calculate the inner and outer contours of the objects inside the binary image. The resulting contour sequence then needs to be searched for a correspondence to the approximate shape of our object, in this case a quadrilateral. To do this the Douglas-Peucker algorithm of [11] is used. The algorithm finds similar curves with fewer points.
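The quadrilateral search can be illustrated with a pure-Python sketch of the Douglas-Peucker simplification followed by a vertex count. The function names, the way a closed contour is split into two open chains, and the tolerance `epsilon` are all assumptions made for illustration; the paper's implementation relied on OpenCV and does not give its tolerance.

```python
import math


def _point_segment_distance(p, a, b):
    """Distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def douglas_peucker(points, epsilon):
    """Simplify an open polyline, keeping points farther than epsilon from the chord."""
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    index, dmax = 0, 0.0
    for i in range(1, len(points) - 1):
        d = _point_segment_distance(points[i], first, last)
        if d > dmax:
            index, dmax = i, d
    if dmax <= epsilon:
        return [first, last]
    left = douglas_peucker(points[: index + 1], epsilon)
    right = douglas_peucker(points[index:], epsilon)
    return left[:-1] + right  # drop the duplicated split point


def approximate_closed_contour(contour, epsilon):
    """Simplify a closed contour by splitting it at the point farthest from the start."""
    start = contour[0]
    far = max(
        range(len(contour)),
        key=lambda i: math.hypot(contour[i][0] - start[0], contour[i][1] - start[1]),
    )
    first_half = douglas_peucker(contour[: far + 1], epsilon)
    second_half = douglas_peucker(contour[far:] + [start], epsilon)
    return first_half[:-1] + second_half[:-1]


def is_quadrilateral(contour, epsilon):
    """A contour is a quadrilateral candidate if four vertices survive simplification."""
    return len(approximate_closed_contour(contour, epsilon)) == 4


# Contour of a 4x2 rectangle sampled at unit steps along its border.
rect = [(0, 0), (1, 0), (2, 0), (3, 0), (4, 0), (4, 1),
        (4, 2), (3, 2), (2, 2), (1, 2), (0, 2), (0, 1)]
corners = approximate_closed_contour(rect, 0.5)
```

A larger `epsilon` tolerates noisier edges but may merge genuine corners, so in practice it is usually tied to the contour's perimeter.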

      Fig. 4. Sample output of edge detection.

      [Flow chart boxes, in order: frame from camera feed → RGB to grayscale conversion → Canny edge detection → find contours → find quadrilaterals from this sequence → quadrilaterals found? → check the quadrilateral → contours inside it? → quadrilaterals found? → save as parent. Data stored: children quadrilaterals, frame copy, sequence of detected contours.]

      Fig. 5(a). Detection phase flow diagram (Part I).

      [Flow chart boxes, in order: saved children quadrilaterals → find a quadrilateral of a specified area at each end of the parent → do such quadrilaterals [exist]? → are the colour specifications satisfied? → calculate coordinates for an upright bounding rectangle around the mobile phone → raise the detection flag → detection phase ends. Data stored: frame copy, coordinates.]

      Fig. 5(b). Detection phase flow diagram (Part II).

      The resultant quadrilaterals from this method are shown in Fig. 6. The coordinates of the vertices of the quadrilaterals and the contours from which they were derived are saved. The next step is to check which quadrilateral signifies the outer boundary of the phone. We start from the quadrilateral with the largest area. The required quadrilateral should have two quadrilaterals within its boundaries, as shown in Fig. 6. The next step is to check whether these internal quadrilaterals have the required area proportions. In the case of our mobile phone the criterion is that there should be two quadrilaterals within the outer quadrilateral, one with an area A1 in the range specified in (2) and the other with an area A2 in the range specified in (3). The area of the outer quadrilateral is denoted by A.

      0.25(A) ≤ A1 ≤ 0.40(A) (2)

      0 ≤ A2 ≤ 0.05(A) (3)

      If the above conditions are met then the two internal quadrilaterals are checked for their positions within the outer quadrilateral. If the outer quadrilateral is divided into two equal parts along its height, then the quadrilateral with the area A1 should lie within one half and the quadrilateral with the area A2 within the other half, and the centers of gravity of the two should be a distance D apart. The value of D should be within the range specified in (4).

      0.7(A) ≤ D ≤ 0.9(A) (4)

      If the above criterion is met then we check the region of interest signified by the outer quadrilateral for the required colour information. To do this we perform an AND operation on the original colour image and the image obtained by filling the area under the quadrilateral with white pixels on a black background. The result of the AND operation is an image which contains only the region of the original colour image that lies within the outer quadrilateral; the rest is black. This technique can be regarded as a kind of background subtraction.

      The image obtained from the above steps is now checked for colour composition. This is done by colour thresholding the image. The image is first transformed from RGB colour space to HSV colour space. In the HSV image, individual pixels in each region of the object can simply be checked against the required range of the H component. All the pixels that fall within this range of H are made white and the others are made black. To find out how well the colours of the object match those required, we count the white pixels in a specific region of the image and check the ratio between the number of white pixels and the area of that region. If the H range requirement is fulfilled then the outer quadrilateral which was obtained, as shown in Fig. 6, is confirmed as the boundary of the object. An upright bounding rectangle for the object is calculated and the detection flag is raised.
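The area criteria (2) and (3) and the hue-based colour check can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, treating the bounds in (2) and (3) as inclusive is an assumption, and the hue range and minimum white-pixel fraction are placeholder parameters because the paper does not state the values it used.

```python
import colorsys


def areas_match(outer_area, a1, a2):
    """Criteria (2) and (3): inner areas as fractions of the outer area A."""
    # Inclusive bounds are an assumption.
    ok_a1 = 0.25 * outer_area <= a1 <= 0.40 * outer_area
    ok_a2 = 0 <= a2 <= 0.05 * outer_area
    return ok_a1 and ok_a2


def hue_fraction(pixels, h_min, h_max):
    """Fraction of (R, G, B) pixels whose hue in degrees lies in [h_min, h_max]."""
    if not pixels:
        return 0.0
    white = 0
    for r, g, b in pixels:
        h, _s, _v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        if h_min <= h * 360.0 <= h_max:
            white += 1  # this pixel would be painted white in the thresholded image
    return white / len(pixels)


def colour_matches(pixels, h_min, h_max, min_fraction):
    """Accept the region when enough of its pixels fall in the hue range."""
    return hue_fraction(pixels, h_min, h_max) >= min_fraction


# Toy region: three blue pixels (hue near 240 degrees) and one red pixel (hue 0).
region = [(0, 0, 255)] * 3 + [(255, 0, 0)]
blue_share = hue_fraction(region, 200, 260)
```

Note that `colorsys` reports a hue of 0 for gray pixels, so a real check would normally also constrain saturation or value before trusting H.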

    2. Tracking phase

      The function of the tracking phase, as discussed above, is to generate guidance information. We discuss the first stage tracking in detail; the rest of the stages use the same basic method. When the first stage of tracking is entered, the results obtained from the detection phase are the coordinates of an upright bounding box around the mobile phone/object, the coordinates of the quadrilaterals inside the boundary of the object and their approximate positions inside the outer quadrilateral (which signifies the object boundary), and the coordinates of a rotated rectangle around the outer quadrilateral. In this phase we need to generate guidance information which tells the user how to remove the battery cover.

      To do this we need to know the orientation of the phone. To get this information we examine the vertices of the rotated bounding rectangle obtained above to find its top left corner. Once the top left corner is found we find the longer of the rectangle's sides; let us call the sides L0 and L1. Suppose L0 is the longer; we then divide the bounding rectangle along L0 into two parts covering 40 percent and 60 percent of the bounding rectangle's area. The part with 40 percent of the area is towards the top left corner. The rectangle A1 is contained either in this region (the phone is top side up) or in the lower region (the phone is bottom side up). After this step we therefore know the orientation of the phone, and if we know the relative position of the battery cover inside the phone we can easily draw the information we require. The result can be seen in Fig. 7.

      Fig. 6. Quadrilaterals found from the contours.

      Fig. 7. Guidance information drawn on the input frame.

  4. Results and performances

    Our method was implemented using OpenCV, Intel's open source computer vision library. The code was written in C and the application was built using Microsoft's Visual Studio 2008 IDE. The resultant application was tested on a Windows machine with a 1.67 GHz dual-core AMD E450 APU and 2 GB of RAM. Our method gave an average frame rate of 23 FPS. The variation in frames processed per 10 seconds in the detection phase over 1 minute of usage is summarised in Fig. 8. The CPU usage in the detection phase during an interval of 5 minutes can be seen in Fig. 9. These figures show that the CPU usage is around 63% on average. Our method is therefore lightweight as well, which means that it can be implemented easily on portable devices that the user can carry anywhere.

    [Plot: frames processed (y-axis) per 10-second interval (x-axis: 10, 20, 30, 40, 50, 60 seconds).]

    Fig. 8. Frames processed per 10 seconds in the detection phase.

    [Plot: CPU usage % (detection); recovered data labels: 56, 60, 61, 62, 68, 69, 67, 65, 64, 65.]

    Fig. 9. CPU usage in detection phase during an interval of 5 minutes.

    The amount of RAM used also reflects the efficiency of a method. The RAM usage, excluding the 35% dedicated to the OS, is shown in Fig. 10. From this figure we can observe that of the 1.3 GB of RAM available (2 GB total minus 35% dedicated to the OS) a maximum of 37% is utilised. It is therefore fair to say that our method needs at most about 500 MB of free RAM to work effectively, which is commonplace in smartphones nowadays. This can be taken as further support for our claim that our method can be implemented on portable devices.

    As far as detection performance goes, our method was able to detect the phone in 22 frames, on average, out of every 30 frames, a hit ratio of approximately 73%. These tests were done under normal indoor lighting conditions where the light was not focused directly onto the phone. An example result of the tracking phase is shown in Fig. 1. Some more examples of tracking phase output are shown in Fig. 11. Also, the method is rotation and scale invariant, as shown in Fig. 12 and 13.
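The arithmetic behind the memory and hit-ratio figures quoted above can be checked with a few lines. The variable names are hypothetical, and converting with 1 GB = 1024 MB is an assumption.

```python
total_ram_gb = 2.0   # RAM installed in the test machine
os_share = 0.35      # fraction dedicated to the OS
peak_share = 0.37    # maximum fraction of the remaining RAM used by the method

available_gb = total_ram_gb * (1 - os_share)  # 1.3 GB left for applications
peak_gb = available_gb * peak_share           # about 0.48 GB
peak_mb = peak_gb * 1024                      # about 493 MB, i.e. under 500 MB

hits, frames = 22, 30
hit_ratio = hits / frames                     # about 0.733, i.e. roughly 73%
```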

    Fig. 10. RAM usage over a period of 5 minutes

    Fig. 11. Tracking phase outputs.

  5. Comparison with other methods

    A variety of methods are in use nowadays for object detection and tracking. Some of the well-known algorithms are SIFT [8], SURF [9] and Viola-Jones [10]. We tested our method against these methods on the same dataset and concluded that our method is competitive on the detection front while being less CPU intensive.

    Fig. 12. Repeatability versus rotation.

    Fig. 13. Repeatability versus distance

    The results are shown in Fig. 12 to 14. In Fig. 12 the repeatability of the methods for the corresponding 2D rotation of the object can be observed. Fig. 13 demonstrates the scale invariance of our method compared to the other methods. Fig. 14 shows that the CPU usage of our method is also very competitive compared to the other methods in question.

  6. Limitations

    The first and foremost limitation of this method is the problem caused in edge detection by shadows on, or reflections off, the surface of the object. The major cause of non-detections is the inability of the system to detect all the edges effectively when the object reflects a lot of ambient light off its surface or when dense shadows fall on it. Some of the available normalisation techniques can be applied, but they will first have to be optimised so that the overall method retains its real-time working ability.

    Another potential shortcoming of this method is the requirement that the background have a colour which is not the same as or close to that of the object, since otherwise the edges of the object may not be detected properly.

  7. Future work

    Work is under way to add a 3D graphics overlay to our method. A see-through HMD is also under development that can be attached to this system easily, with the camera mounted on it in such a way that the camera sees exactly what the user is seeing at that instant, thereby reducing the effort in overlaying computer generated graphics on the real-world scene.

  8. Conclusion

This paper has presented a method for creating an augmented reality system which can guide an untrained user to perform small-scale repairs on geometrically shaped appliances/gadgets. Our method is fairly accurate and low on CPU usage, while at the same time being rotation and scale invariant to a large extent.


  1. P. Mistry, P. Maes, SixthSense, a wearable gestural interface, in the Proceedings of SIGGRAPH Asia 2009, Sketch. Yokohama, Japan. 2009.

  2. Juan, M.C., Botella, C., Alcañiz, M., Baños, R., Carrion, C., Melero, M., Lozano, J.A., An augmented reality system for treating psychological disorders: application to phobia to cockroaches, in Proceedings of the Third IEEE and ACM International Symposium on Mixed and Augmented Reality [ISMAR 2004].

  3. Steven J. Henderson, Steven Feiner, Evaluating the benefits of augmented reality for task localization in maintenance of an armored personnel carrier turret, in IEEE International Symposium on Mixed and Augmented Reality 2009 [Science and Technology Proceedings 19-22 October].

  4. Edgard Lamounier, Jr., Arthur Bucioli, Alexandre Cardoso, Adriano Andrade and Alcimar Soares, On the use of augmented reality techniques in learning and interpretation of cardiologic data, in 32nd Annual International Conference of the IEEE EMBS [Buenos Aires, Argentina, August 31 – September 4, 2010].

  5. Kazuyo Iwamoto, A monocular video see-through head mounted display for interactive support system, in Proceedings of the 2009 IEEE International Conference on Systems, Man, and Cybernetics [San Antonio, TX, USA – October 2009].

  6. Satoshi Suzuki, K. Abe, Topological structural analysis of digitized binary images by border following, in Computer Vision, Graphics, and Image Processing [30, 32-46 (1985)].

  7. John Canny, A computational approach to edge detection, in IEEE Transactions on Pattern Analysis and Machine Intelligence [vol. PAMI-8, No. 6, November 1986].

  8. David G. Lowe, Object recognition from local scale-invariant features, in Proceedings of the International Conference on Computer Vision [Corfu (Sept. 1999)].

  9. Herbert Bay, Andreas Ess, Tinne Tuytelaars, Luc Van Gool, Speeded-up robust features (SURF), in Computer Vision and Image Understanding [110 (2008) 346-359].

  10. Paul Viola, Michael Jones, Rapid object detection using a boosted cascade of simple features, accepted for Conference on Computer Vision and Pattern Recognition [2001].

  11. David Douglas, Thomas Peucker, "Algorithms for the reduction of the number of points required to represent a digitized line or its caricature", in The Canadian Cartographer [10(2), 112-122 (1973)].

  12. William K. Pratt, Digital Image Processing, fourth edition, Wiley-Interscience [2010 pp. 387-410].
