Intelligent Image Processing Approach for Visually Challenged Persons for Navigation Assistance

DOI : 10.17577/IJERTV3IS20594



1Gowri Subadra. R, 2Mageswar. R, 3Gnana Muthu Veeravel. I

1. Asst Professor, Department of IT

2,3. Student, Department of IT


Puducherry, India

Abstract: This paper examines intelligent image processing constraints which may need to be considered for visual prosthesis development and proposes a display framework which incorporates context, task and alerts related to a scene. A simulation device to investigate this framework is also described. Mobility requirements, assessment, and devices are discussed to identify the functions required by a prosthesis, and an overview of state-of-the-art visual prostheses is provided. Two main computer vision approaches are discussed with application to a visual prosthesis: information reduction and scene understanding. Further enhancement of this research is still in progress.

Keywords: Visual prostheses, blind mobility, artificial human vision, image processing, computer vision


    The idea behind this project was initially developed to help visually challenged people navigate independently. The partial restoration of sight for the blind is an exciting opportunity currently being pursued by a number of international research teams. Partial restoration can be achieved by using a camera to capture images, processing those images, and then delivering an electrical impulse to a component of the visual pathway in the human brain. In each case, when this signal is received the subject may perceive multiple points of light (phosphenes). Because it provides the link between camera and implant, image processing is an integral part of all clinical visual prostheses [1].

    The two most widely used mobility devices are the long cane and the guide dog. However, these devices have limitations: the long cane is only effective over a short range, and a guide dog requires expensive training and maintenance. A number of electronic travel aids (ETAs) have also been developed over the past thirty years. In sighted people, vision usually provides the important functions of mobility (such as avoiding obstacles); it is therefore reasonable to assume that an intelligent image processing (IIP) approach may also provide good mobility information. There are two broad classes of computer vision approaches which could be useful to the development of a visual prosthesis system: information reduction and scene understanding. The purpose of this paper is to summarize mobility requirements and devices for the vision impaired, before identifying some IIP techniques and constraints that may be important for mobility requirements. A display framework and proposed simulation device are then described.


    Mobility is a person's ability to travel between locations gracefully, safely, comfortably and independently [2]. Blind mobility requires skill, effort and training. Mobility problems for a blind person can be caused by changes in terrain and depth (stairs, kerbs); unwanted contacts (bumping); and street crossings (which involve judging the speed and distance of vehicles and may involve identifying traffic light colour) [3]. The most dangerous events for a blind or partially sighted person are drop offs (sudden depth changes, such as on the edge of a subway platform) and moving vehicles [4]. Making unwanted contact with pedestrians is also undesirable, as it can be socially awkward and may pose a threat to a person's safety [5].

    The three main methods for assessing mobility are self-report questionnaires, field experiments (such as walking through a shopping mall) and artificial laboratories (such as walking through an artificial maze).


    Mobility devices generally provide information by tactile or auditory methods. A visual prosthesis approach is different, as it uses a functioning component of the visual pathway and does not overload another sense. These two presentation modes are summarized in Table 1.

    1. Conventional presentation

      The most useful devices for mobility are the guide dog and the long cane. However, ambient sound is still important in blind mobility (as it is in sighted mobility). Ambient sound might be directional traffic noise, pedestrian crossing signals or voices. Non-visual information may also be tactile or olfactory.

      The long cane provides a blind person with sufficient information for safe movement in the immediate environment.



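      Table 1. Mobility devices and sensory methods

      Mobility device                        Sensory method
      -------------------------------------  -------------------------------
      Electronic Travel Aids                 Mostly auditory, some tactile
      Guide dog                              Tactile feedback
      Long cane                              Tactile and echolocation
      Mobility robot                         -
      Human guide                            -
      Visual pathway (retinal, cortical      Phosphene perception in subject
      and optic nerve)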

      Mobility enhancement in a blind person after long cane training is often dramatic [6]. However, a cane does not protect against collisions with obstacles at upper-body height (such as wall-mounted public telephones). There is also a risk of tripping other pedestrians with a cane [7]. Guide dogs also provide good mobility assistance by pulling in the same way that a human guide would. They are able to respond to hand and voice signals and are trained to avoid obstacles, prevent veering in street crossings, stop if there is a dangerous situation and intelligently disobey commands that are not safe. A dog may remember common landmarks (such as a particular shop door). However, guide dogs are expensive (~US$25,000) and require a person to be comfortable with dogs. A guide dog user also needs to be physically fit and prepared to maintain the dog [8].

      Ultrasound-based devices (which generally provide auditory information) include the SonicTorch, SonicGuide and SonicPathfinder. More recent ultrasonic devices include the Navbelt and the Guidecane [9]. A laser cane ETA has also been developed, which uses reflected light energy to provide both tactile and auditory information [10]. The Guide Dogs for the Blind Association of Queensland provides a low-cost, handheld ultrasound device, the Miniguide, to visually impaired clients. A promising new ultrasound device, the UltraCane, acts as a standard cane while providing tactile information about head and chest level obstacles in the environment.

      The vOICe is the only commercially available ETA which uses an image processing approach. This ETA presents an acoustic representation of images to a blind person [11]. The portable device captures image data using a head-mounted camera, processes this information (using a 64×64 pixel array) and provides an auditory representation once per second.

      Another approach to blind mobility and navigation involves adjusting the environment to provide useful information. Environmental accessibility for blind and partially sighted people can involve a logical design layout (for example, stairs should be next to the lift), assistance with visibility (for example, hand rails should have high contrast) and adequate lighting (which should be 50-100% greater than that required for normally sighted people) [12].

    2. Visual pathway presentation

    Human vision involves the focusing of light by the lens and cornea in the eye and the absorption of this light by photoreceptors in the retina. Electrical signals from these photoreceptors are then processed through a layer of bipolar and ganglion cells within the retina, before passing to the optic nerve [14]. The amount of information entering the eye is reduced considerably: there are over 120 million photoreceptors and only about 1 million ganglion cells. Most signals from the optic nerve pass through the lateral geniculate body to the visual cortex, where a spatial representation of the retinas appears to exist [15].

    A visual pathway approach promises to provide significantly more information than a conventional mobility aid, without overloading another sensory channel. The three main approaches to visual prostheses involve a retinal implant (subretinal or epiretinal), an optic nerve cuff electrode and a cortical implant. Each approach has advantages and disadvantages, and all have been successful in providing phosphene perception to subjects [16]. All methods share a common constraint: due to limitations of current technology, only a limited number of phosphenes can be generated. It is therefore necessary to develop methods for the optimal use of these phosphenes.


    An image processing approach requires an image to be captured by a sensor and digitized. This image is then usually pre-processed to reduce noise. After these stages we can use an information reduction approach to provide essential environmental information, and/or attempt to understand objects in the environment. We provide a review of IIP as any visual prosthesis system will need to satisfy the same constraints.
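
    The pre-processing stage described above can be illustrated with a small noise-reduction step. The sketch below is a minimal example, assuming a grayscale image stored as a list of rows and a 3×3 median filter. The paper does not say which noise filter a prosthesis would use, so `median_filter3` is a hypothetical helper chosen only for illustration.

```python
from statistics import median
from typing import List

Image = List[List[int]]  # grayscale image: rows of intensities (0..255)


def median_filter3(img: Image) -> Image:
    """Replace each interior pixel by the median of its 3x3 neighbourhood.

    Border pixels are copied unchanged because the window does not fit there.
    The choice of a median filter is an assumption, not the paper's method.
    """
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            window = [img[y + dy][x + dx] for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
            out[y][x] = median(window)
    return out


# A flat grey frame with one isolated bright "salt" pixel in the centre
noisy = [[10] * 5 for _ in range(5)]
noisy[2][2] = 255
clean = median_filter3(noisy)  # the isolated bright pixel is removed
```

    A median filter is used here because it removes isolated outliers without blurring step edges as much as a mean filter would, which matters if edges are extracted afterwards.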

    1. Information Reduction (low level vision)

      Most existing visual prosthesis efforts are aimed at this level, which is concerned with the reduction or collapse of visual information. Operations on images at this level are designed to improve image saliency, or to emphasize features of particular importance or relevance, for example kerbs or walls.

      Low level computer vision techniques commonly involve image filtering, edge detection and segmentation to identify objects within the image. At the information reduction level, a real world scene can be processed and presented in a more visually recognizable form, like a picture or cartoon. It might be sufficient for mobility purposes to have the outline and basic shapes of objects displayed. The reduction of visual information to cartoons, which typically use approximations of features to convey information, can be useful in representing objects using little information [17].
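
      As an illustration of reducing a scene to its outlines, the following minimal sketch applies a Sobel gradient operator to a tiny grayscale image and keeps only strong edges. The Sobel operator and the threshold value are assumptions made for this example; the paper does not prescribe a particular edge detector.

```python
from typing import List

Image = List[List[float]]  # grayscale image: rows of intensities (0..255)


def sobel_edges(img: Image, threshold: float = 100.0) -> List[List[int]]:
    """Return a binary edge map (1 = edge) from the Sobel gradient magnitude.

    Border pixels stay 0 because the 3x3 window does not fit there.
    The threshold is an illustrative value, not one taken from the paper.
    """
    h, w = len(img), len(img[0])
    edges = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            # Horizontal gradient: right column minus left column
            gx = (img[y - 1][x + 1] + 2 * img[y][x + 1] + img[y + 1][x + 1]
                  - img[y - 1][x - 1] - 2 * img[y][x - 1] - img[y + 1][x - 1])
            # Vertical gradient: bottom row minus top row
            gy = (img[y + 1][x - 1] + 2 * img[y + 1][x] + img[y + 1][x + 1]
                  - img[y - 1][x - 1] - 2 * img[y - 1][x] - img[y - 1][x + 1])
            if (gx * gx + gy * gy) ** 0.5 >= threshold:
                edges[y][x] = 1
    return edges


# A dark region (0) beside a bright region (255): one vertical step edge
scene = [[0, 0, 0, 255, 255, 255] for _ in range(5)]
outline = sobel_edges(scene)
```

      Only the two columns straddling the brightness step are marked, so a full frame collapses to a thin outline that could be mapped onto a small number of phosphenes.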

      Another low level technique is motion analysis. The motion of a person provides visual information about movement relative to the environment and information about the depths of observed scene points [18]. Therefore the analysis of image sequences is desirable in a mobility device. One monocular method of judging depth is motion parallax, which applies when objects are moving at equal speed: those which are closer to the observer seem to move faster.
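
      The parallax relationship can be made concrete with a short sketch. It assumes a pinhole camera, a static scene and an observer translating purely sideways, in which case a point at depth Z moves across the image at roughly v = f·T/Z, where f is the focal length in pixels and T the observer speed. The function name and the numbers are illustrative only.

```python
def depth_from_parallax(focal_px: float, observer_speed: float,
                        image_speed_px: float) -> float:
    """Estimate depth (metres) from apparent image motion.

    Assumes a pinhole camera moving sideways past a static scene:
        image_speed = focal_px * observer_speed / depth
    so depth = focal_px * observer_speed / image_speed.
    """
    if image_speed_px <= 0:
        raise ValueError("image speed must be positive for a depth estimate")
    return focal_px * observer_speed / image_speed_px


# Walking at 1 m/s with an assumed 500-pixel focal length:
near = depth_from_parallax(500.0, 1.0, 100.0)  # fast-moving point
far = depth_from_parallax(500.0, 1.0, 50.0)    # slow-moving point
```

      The point that moves faster across the image is estimated to be closer, matching the description of motion parallax above.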

      In deciding on the most important features of a scene, it may be useful to consider which parts of a visual scene receive attention from sighted people. Models of human visual attention have been developed to predict regions of interest in image sequences [19].

      Previous work at our facility [20], [21], has examined the use of various image processing techniques (such as enhancing edges, using different grey scales and extracting the most important image features) to identify a recognition threshold for low quality stationary images. These images are used to represent the limited number of phosphenes available to the subject (typically a 25×25 array). This research aims at providing a means of determining which parts of a visual scene to represent, and a model for inherent information to determine which visual elements of the scene should be presented for maximum perceptual intelligibility by the subject.
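
      To give a feel for how coarse a 25×25 phosphene display is, the sketch below block-averages a grayscale image down to a 25×25 grid and quantizes it to a few grey levels. The block-averaging scheme and the choice of four levels are assumptions for illustration; they are not the processing used in [20], [21].

```python
from typing import List

Image = List[List[float]]  # grayscale image: rows of intensities (0..255)


def to_phosphene_grid(img: Image, grid: int = 25, levels: int = 4) -> List[List[int]]:
    """Block-average `img` to grid x grid cells, quantized to `levels` grey levels.

    Assumes the image is at least `grid` pixels in each dimension, so that
    every block contains at least one pixel.
    """
    h, w = len(img), len(img[0])
    out = []
    for gy in range(grid):
        y0, y1 = gy * h // grid, (gy + 1) * h // grid
        row = []
        for gx in range(grid):
            x0, x1 = gx * w // grid, (gx + 1) * w // grid
            block = [img[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            mean = sum(block) / len(block)
            # Map 0..255 onto an integer level 0..levels-1
            row.append(min(levels - 1, int(mean * levels / 256)))
        out.append(row)
    return out


# 100x100 test frame: dark left half, bright right half
frame = [[0] * 50 + [255] * 50 for _ in range(100)]
phosphenes = to_phosphene_grid(frame)
```

      Each of the 625 output cells stands in for one phosphene, which is why choosing what to show becomes the central question.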

      An early mobility device for the visually impaired which used an information reduction approach was developed during the 1980s. This system [22] used spoken output to describe the current scene. Two modes were provided. The first attempted to identify a safe route for the traveler, using an analysis of gray levels within the image. The second mode used scene analysis in an object-identification mode. This mode attempted to use the aspect ratio of identified objects to categorize an object into three classes: long thin objects (such as a pole or mail box), square or circular objects (such as a pot hole), and large objects (such as a car or wall). When an obstacle was detected, the system provided a warning and asked the user to walk slowly.
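
      The three-class scheme can be sketched as a simple rule on an object's bounding box. The thresholds below (`large_fraction`, `thin_ratio`) are hypothetical values chosen for illustration; the source does not report the values used in [22].

```python
def classify_obstacle(width: float, height: float, frame_area: float,
                      large_fraction: float = 0.25, thin_ratio: float = 3.0) -> str:
    """Classify a detected object by bounding-box size and aspect ratio.

    Thresholds are illustrative assumptions, not values from the cited system.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    # Objects covering a large share of the frame (e.g. a car or wall)
    if width * height >= large_fraction * frame_area:
        return "large"
    # Elongated objects (e.g. a pole or mail box)
    if max(width, height) / min(width, height) >= thin_ratio:
        return "long thin"
    # Everything else is treated as roughly square or circular (e.g. a pot hole)
    return "square or circular"


frame_area = 64 * 64
pole = classify_obstacle(10, 80, frame_area)
pothole = classify_obstacle(20, 22, frame_area)
wall = classify_obstacle(50, 40, frame_area)
```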

      More recent research [23] has explored the use of edge detection to determine the positions of lines in an image. The grouping of these lines was then used to classify the object (such as a doorway). Paths were also identified using edges, and the Hough transform was used to group these into straight lines. A clustering technique, similar to the Hough transform, was then used to find the dominant vanishing point to indicate the subject's direction of travel.
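
      The line-grouping step can be sketched with a basic Hough transform, in which every edge point votes for all lines rho = x·cos(theta) + y·sin(theta) that pass through it. This is a minimal version under assumed settings (one-degree angle steps, rho rounded to the nearest pixel) and does not reproduce the vanishing-point clustering of [23].

```python
import math
from typing import Dict, List, Tuple


def hough_strongest_line(points: List[Tuple[int, int]],
                         n_theta: int = 180) -> Tuple[int, float, int]:
    """Return (rho, theta_degrees, votes) for the line with the most votes.

    Line model: rho = x*cos(theta) + y*sin(theta), theta in [0, 180) degrees.
    Assumes `points` is a non-empty list of (x, y) edge pixel coordinates.
    """
    votes: Dict[Tuple[int, int], int] = {}
    for x, y in points:
        for t in range(n_theta):
            theta = math.pi * t / n_theta
            rho = round(x * math.cos(theta) + y * math.sin(theta))
            votes[(rho, t)] = votes.get((rho, t), 0) + 1
    (rho, t), count = max(votes.items(), key=lambda kv: kv[1])
    return rho, 180.0 * t / n_theta, count


# Ten edge pixels lying on the vertical line x = 3
edge_points = [(3, y) for y in range(10)]
rho, theta_deg, count = hough_strongest_line(edge_points)
```

      All ten points agree on a line with rho = 3 and theta close to 0 degrees. In a real scene, the peaks of the vote table would give the dominant straight edges such as path borders.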

    2. Scene Understanding (high level vision)

    This level is concerned with identifying features and extracting information. The scene structure is still there to a degree, but it is idealised or reduced. An example application might be to identify a bus stop, fire hydrant or traffic light. It may also be useful to know the distance to the object (number of steps, or time at current walking speed).

    Due to the limited number of phosphenes that can be generated by current technology, it may be better to present a symbolic representation. For example, a small part of the grid (perhaps 5×5) could be used for information on obstacle locations in the current environment. Object interpretation depends on knowledge of possible objects, and might also depend on context (for example, an outdoor scene versus a home environment). For orientation, it would be useful if a prosthesis system using a scene understanding approach could learn to recognize new objects.
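
    One way to picture the symbolic 5×5 obstacle display is as a small occupancy grid. The sketch below assumes obstacle positions are already available as normalized coordinates (x from left to right, y from near to far, both in the range 0 to 1); that input format and the function name are hypothetical.

```python
from typing import List, Tuple


def obstacle_grid(obstacles: List[Tuple[float, float]], size: int = 5) -> List[List[int]]:
    """Mark obstacle locations on a size x size symbolic grid.

    Each obstacle is (x, y) with x = left..right and y = near..far in [0, 1).
    Near obstacles appear on the bottom row. Out-of-range points are ignored.
    """
    grid = [[0] * size for _ in range(size)]
    for x, y in obstacles:
        if 0.0 <= x < 1.0 and 0.0 <= y < 1.0:
            col = int(x * size)
            row = size - 1 - int(y * size)
            grid[row][col] = 1
    return grid


# One obstacle straight ahead and close, one far away on the right
symbols = obstacle_grid([(0.5, 0.1), (0.9, 0.95)])
```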

    Some recent blind mobility research has attempted to recognize particular objects in a scene. The identification of staircases was addressed in [25]. This research used a texture detection method (using Gabor filters) to locate distant staircases. Once a person had moved close enough to the stairs, they were then detected by searching for groups of concurrent lines. The intensity variation was then used to partition the convex and concave lines. Homography with some search criteria was then employed to recover the vertical rotation and slope of the located staircase. Although reasonable results were achieved, the approach was found to be slow and not suitable for real-time applications.
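
    For readers unfamiliar with Gabor filters, the sketch below builds the real part of a standard 2D Gabor kernel: a sinusoidal carrier under a Gaussian envelope, which responds strongly to periodic textures such as the repeated edges of distant stairs. All parameter values are illustrative assumptions and are not those used in [25].

```python
import math
from typing import List


def gabor_kernel(size: int = 9, sigma: float = 2.0, theta: float = 0.0,
                 wavelength: float = 4.0, gamma: float = 0.5,
                 psi: float = 0.0) -> List[List[float]]:
    """Real part of a 2D Gabor filter on a size x size grid (size should be odd).

    g(x, y) = exp(-(x'^2 + (gamma*y')^2) / (2*sigma^2)) * cos(2*pi*x'/wavelength + psi)
    where (x', y') are the coordinates rotated by `theta` radians.
    """
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            xr = x * math.cos(theta) + y * math.sin(theta)
            yr = -x * math.sin(theta) + y * math.cos(theta)
            envelope = math.exp(-(xr * xr + (gamma * yr) ** 2) / (2 * sigma * sigma))
            carrier = math.cos(2 * math.pi * xr / wavelength + psi)
            row.append(envelope * carrier)
        kernel.append(row)
    return kernel


kernel = gabor_kernel()
```

    Convolving an image with kernels at several orientations and wavelengths gives a texture response map; the per-pixel cost of that convolution is one reason such methods can be slow.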


    This section identifies a number of constraints which affect visual prosthesis development:

    Number of phosphenes: Current technology limits the number of phosphenes that can be provided to a patient. Additionally, the size, shape and brightness of phosphenes are not predictable, although recent work on a 4×4 retinal prosthesis shows promise in overcoming these problems [26].

    Real-time processing: A visual prosthesis system needs to perform in real-time. However, this has been problematic for other image based mobility systems, particularly those that are stereo-vision based. One way of providing real-time processing may be to restrict the field of view of the camera, although this would restrict the amount of preview (or time to anticipate problems) available to a blind traveler.

    Integration/Prioritization: The integration of different functions is a challenge. Dangerous features of the environment, such as moving objects, obstacles and sudden changes in depth, should be displayed before less important information.

    Scene type: Scene understanding depends to a large extent on context. An intelligent image processing approach would consider the type of context when recognizing objects.

    Device simulation: Currently it is necessary to use a visual prosthesis simulation with normally sighted subjects to test the effects of different IIP approaches on mobility.

    Standard set of image sequences: A constraint on the development of both the information reduction and scene understanding approach for mobility is the lack of a standard set of images/image sequences for evaluation and comparability of methods: an approach which has been successfully applied in the field of Information Retrieval [27].


    The display from an intelligent visual prosthesis (or simulation) should process different information reduction and scene understanding information depending on the type of scene. For mobility purposes this display depends on three main dimensions of the current scene:

    • Context: The type of scene can affect the type of IIP required. For example, there may be a greater need for information reduction in a crowded shopping mall than a suburban street.

    • Task: Different information is required depending on the current task. A road crossing task may emphasize a straight path to the opposite kerb (to prevent veering), whereas a task involving identifying a set of keys on a cluttered table may involve zooming or object recognition.

    • Alert: The system needs to continually investigate any hazardous features of the current scene. These alerts, such as an approaching tree branch (obstacle detection) or descending stairs (drop off) need to run as background tasks, and interrupt the current display when required.

    Common abstract classes for these dimensions could be developed to reduce decision processing requirements. For example an indoor home and indoor office context could both be assigned to an indoor room class.
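
    The interaction of the three dimensions can be sketched as a small decision function: alerts pre-empt everything else, and otherwise the display is chosen from the task and the abstract context class. Only the "indoor room" class comes from the text above; the other class names, the priority scale and the function name are hypothetical.

```python
from typing import Dict, List, Tuple

# Concrete context -> abstract context class.
# "indoor room" follows the text; the remaining entries are assumed examples.
CONTEXT_CLASS: Dict[str, str] = {
    "indoor home": "indoor room",
    "indoor office": "indoor room",
    "shopping mall": "crowded public space",
    "suburban street": "street",
}

Alert = Tuple[str, int]  # (kind, priority); higher priority = more urgent (assumed scale)


def choose_display(context: str, task: str, alerts: List[Alert]) -> str:
    """Pick what to show: the most urgent alert if any, else a task view."""
    if alerts:
        kind, _ = max(alerts, key=lambda alert: alert[1])
        return f"ALERT: {kind}"
    context_class = CONTEXT_CLASS.get(context, "unknown")
    return f"{task} view for {context_class}"


normal = choose_display("indoor office", "find keys", [])
interrupted = choose_display("suburban street", "road crossing",
                             [("obstacle", 1), ("drop off", 3)])
```

    Because both indoor contexts map to the same class, a single display rule can serve them, which is the reduction in decision processing that the abstract classes are meant to provide.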


    To investigate the mobility display framework, we are developing a visual prosthesis simulation. This portable head-mounted device consists of a Personal Digital Assistant (PDA) and an attached digital camcorder. Currently the PDA display is used to present the phosphene simulation. A normally sighted subject can then wear the device and be assessed on various mobility tasks under different contexts, alert scenarios and IIP conditions. A sheet of cloth is used to limit the subject's visual information to the PDA display.

    A similar simulation approach was used in [28], where the minimum number of phosphenes required for adequate mobility was investigated. The pixelised vision simulator device consisted of a video camera connected to a monitor in front of the subject's eyes. A perforated mask was placed on the monitor to reproduce the effect of individual phosphenes. The artificial environment consisted of an indoor maze which contained paper column obstacles. Walking speed and frequency of contact were used as dependent variables. This research found that a 25×25 array of phosphenes, with a field of view of 30°, would be required for a successful device.
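
    The 25×25 array and 30° field of view imply a coarse angular resolution, which the short calculation below makes explicit. It assumes the phosphenes are spaced evenly across the field of view, which the source does not state.

```python
import math


def degrees_per_phosphene(fov_deg: float = 30.0, phosphenes_across: int = 25) -> float:
    """Angular spacing between phosphenes, assuming they are evenly spread."""
    return fov_deg / phosphenes_across


def scene_width_at(distance_m: float, fov_deg: float = 30.0) -> float:
    """Width of the scene (metres) covered by the field of view at a given distance."""
    return 2.0 * distance_m * math.tan(math.radians(fov_deg / 2.0))


spacing = degrees_per_phosphene()   # about 1.2 degrees per phosphene
width_2m = scene_width_at(2.0)      # about 1.07 m of scene width at 2 m
cell_2m = width_2m / 25             # about 4.3 cm of scene per phosphene at 2 m
```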


A successful visual prosthesis should result in increased mobility performance. Evidence from previous research suggests that the objective assessment of mobility is important in developing and comparing different devices and techniques. In this paper we have described IIP techniques and constraints related to visual prosthesis development.


This research was initiated by Mr. R. Mageswar, co-author of this paper, and was strongly supported by Mrs. Gowri Subadra, Assistant Professor, Department of Information Technology, Sri Manakula Vinayagar Engineering College, Puducherry.


  1. W. Dobelle, "Artificial Vision for the Blind by Connecting a Television Camera to the Brain," ASAIO Journal, vol. 46, pp. 3-9, 2000.

  2. C. A. Shingledecker and E. Foulke, "A human factors approach to the assessment of mobility of blind pedestrians," Human Factors, vol. 20, pp. 273-286, 1978.

  3. D. Geruschat and A. J. Smith, "Low vision and mobility," in Foundations of Orientation and Mobility, B. B. Blasch and W. R. Weiner, Eds., 2nd ed. New York: American Foundation for the Blind, 1997.

  4. D. G. Pelli, "The visual requirements of mobility," in Principles and Applications, G. C. Woo, Ed. New York: Springer-Verlag, 1986, pp. 134-146.

  5. D. Geruschat, K. A. Turano, and J. W. Stahl, "Traditional Measures of Mobility Performance and Retinitis Pigmentosa," Optometry and Vision Science, vol. 75, pp. 525-537, 1998.

  6. A. Dodds, Mobility Training for Visually Handicapped People: A Person-Centred Approach. London: Croom Helm, 1988.

  7. L. W. Farmer and D. L. Smith, "Adaptive technology," in Foundations of Orientation and Mobility, B. B. Blasch and W. R. Weiner, Eds., 2nd ed. New York: American Foundation for the Blind, 1997.

  8. R. H. Whitestock, L. Frank, and R. Haneline, "Dog guides," in Foundations of Orientation and Mobility, B. B. Blasch and W. R. Weiner, Eds., 2nd ed. New York: American Foundation for the Blind, 1997.

  9. S. Shoval, I. Ulrich, and J. Borenstein, "Computerized Obstacle Avoidance Systems for the Blind and Visually Impaired," in Intelligent Systems and Technologies in Rehabilitation Engineering, H. N. L. Teodorescu and L. C. Jain, Eds.: CRC Press, 2000, pp. 414-448.

  10. A. Dodds, Rehabilitating Blind and Visually Impaired People. London: Chapman and Hall, 1993.

  11. P. B. L. Meijer, "Vision technology for the totally blind," 2003.

  12. B. L. Bentzen, "Environmental accessibility," in Foundations of Orientation and Mobility, B. B. Blasch and W. R. Weiner, Eds., 2nd ed. New York: American Foundation for the Blind, 1997.

  13. Talking Signs Inc, "Talking Signs Infrared Communications System," 2003.

  14. T. N. Cornsweet, Visual Perception. New York: Academic Press, 1970.

  15. R. L. Gregory, Eye and Brain: The Psychology of Seeing, 5th ed. Tokyo: Oxford University Press, 1998.
