2D to 3D Image modelling

Mrs. Sheetal Dudka¹, Mr. Suresh Jajoo²

Department of Electronics & Telecommunication, Datta Meghe College of Engineering, Airoli, Mumbai, India

¹mudganti.sheetal@gmail.com, ²jajoosuresh@rediffmail.com

Abstract: Three-dimensional modelling is the process of developing a 3D model of an object using specialized software. It creates a wireframe that represents the object in three dimensions; the object can be living or inanimate. A three-dimensional model is built from a set of points in 3D space that are connected by geometric entities such as lines and curved surfaces.

This paper presents the design and development of a three-dimensional scanner whose data acquisition system uses a webcam, Canny edge detection implemented in MATLAB, a microcontroller, and a personal computer as the display unit. The analog part of the project includes a rotating platform on which the solid object to be modelled is rotated by a servomotor under microcontroller control. The software lets the user visualize the virtual model in a three-dimensional environment, and the data can be exported in different formats.

Keywords: edge detection, gradient.


    The method takes advantage of prior knowledge, such as known field geometry and appearance, heights, and orientation. It creates a temporally consistent depth impression by reconstructing a background panorama with depth for each shot (a series of sequential frames belonging to the same camera) and by modelling images.

    It contributes a rapid, automatic, temporally stable and robust 2D to 3D conversion method that can be used for far-back, field-based shots, which dominate viewing time in many sports. For low-angle, close-up action, a small number of real 3D cameras can be used in conjunction with this method to provide full 3D viewing at reduced cost. For validation, this solution is used to convert one view from ground-truth, professional-quality recorded camera footage, and visual comparisons between the two are provided.


    Early approaches used estimated depth maps to extract depth information from a 2D image. Battiato et al. derived depth maps by using multiple cues from a single image. Hoiem created depth maps using machine learning algorithms, in which the depth of objects is assigned based on the trained classes. Sanyo's 2D to 3D Conversion Adaptive Algorithm uses a Computed Image Depth (CID) method, which combines image segmentation with depth extraction. In another approach, edge information is used to segment the 2D image into objects; a depth map is then generated from a hypothesized depth gradient model, and a bilateral filter is applied to it.

    The asymmetric edge-adaptive filter can generate depth maps with comparatively accurate object edges while avoiding heavy computation after the depth generation step. By smoothing the depth map asymmetrically, it reduces the area of artifacts in the rendered views, which improves image quality and reduces distortions.

    The light scattering model uses atmospheric scattering for depth map generation. This algorithm estimates the depth map of outdoor images that contain a portion of the sky. When a light ray passes through a scattering medium, it may be scattered at different points along its length, which changes the ray's original path. Because light reaches the viewer from all points along the ray, the viewer effectively perceives a new light source.

    The energy minimization model uses shading for depth map generation. It is based on finite elements: the image is divided into small triangular areas, and the reflectance map R(p, q) is approximated by a linear function. The depth map is generated by recasting the energy minimization as a linear system, which is solved until a specified error threshold is reached.

    The frontal texel method uses the depth cue of patterned texture for depth map generation. Patterned texture gives a good 3D impression because of two properties: the distortion of each texel and the rate of change of texel distortion across the image region. Loh and Hartley proposed a method suitable for perspective views. The algorithm proceeds by taking the frontal texel as the reference point. The frontal texel is unique: any incorrect hypothesis of the frontal texel leads to incorrect estimates of the surface orientation, which cannot be realized by a reconstructed surface. A search through all possible frontal texels under the surface consistency constraint therefore yields a unique frontal texel estimate.

    The statistical estimators model uses statistical patterns as the depth cue for depth map generation. The mean depth of an image can be estimated by machine learning, after which the depth value of each pixel can be estimated. Several sets of global and local features, such as texture variations, texture gradients and haze, are collected. The parameters are estimated from the training data using the maximum likelihood estimator and linear regression, and the depth map is then generated.


    The block diagram of the 3D modelling process is given in Fig 1.

    Fig 1: Steps in 3D modelling

      1. Gray Scale Conversion

        Gray scale images are distinct from black-and-white images, which are also called binary images: gray scale images have shades of gray in between black and white.

        A colour in an image can be converted to a shade of gray by calculating the effective brightness (luminance) of the colour and using this value to create a shade of gray that matches the desired brightness. The effective gray scale value of a pixel is calculated with the following formula:

        Y= (Red + Green + Blue) / 3 (1)

        Using equation (1), the effective gray value of any pixel can be calculated.
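        As an illustration of equation (1), the following is a minimal sketch in Python rather than the MATLAB implementation used in the project. The function name `to_gray` and the representation of an image as nested lists of (Red, Green, Blue) tuples are assumptions made for this example.

```python
def to_gray(pixels):
    """Apply equation (1), Y = (Red + Green + Blue) / 3, to every pixel.

    `pixels` is assumed to be a list of rows, each row a list of
    (red, green, blue) tuples. This layout is hypothetical.
    """
    return [[(r + g + b) / 3 for (r, g, b) in row] for row in pixels]


# A tiny 2 x 2 test image: pure red, pure green, pure blue and a dark mix.
image = [[(255, 0, 0), (0, 255, 0)],
         [(0, 0, 255), (30, 60, 90)]]
gray = to_gray(image)
```

        Note that equation (1) weights the three channels equally, so pure red, green and blue all map to the same gray value.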

      2. Gaussian Smoothing

        The image obtained after gray scale conversion may contain noise, which is undesirable in a quality image. Gaussian smoothing is therefore applied: the raw image is convolved with a Gaussian kernel.

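        The Gaussian function is:

        G(x, y) = (1 / (2πσ²)) exp(-(x² + y²) / (2σ²)) (2)

        where σ is the standard deviation and (x, y) is the offset from the centre of the kernel.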


        Using the Gaussian function, a matrix of Gaussian values is generated as per equation (2), and the image is convolved with this matrix. This removes noise and gives the image a blurred appearance.

        The standard deviation σ of the Gaussian determines the amount of smoothing.

        The Gaussian theoretically has infinite support, but a filter of finite size is needed.

        A width of ±2.5σ covers 98.76% of the area, and ±3σ covers over 99% of the area.

        The isotropic Gaussian curve is shown below

        Fig 2: an isotropic Gaussian
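        The smoothing step can be sketched as follows. This is a minimal pure-Python illustration, not the project's MATLAB code. The helper names (`gaussian_kernel`, `convolve`, `smooth`, `coverage`), the truncation at ±3σ, the normalization of the kernel to unit sum and the border handling by edge replication are all assumptions made for this example.

```python
import math


def gaussian_kernel(sigma, radius_in_sigmas=3.0):
    """Sample an isotropic 2D Gaussian on a finite grid and normalize it.

    The grid is truncated at radius_in_sigmas * sigma (an assumed choice)
    and rescaled so that the weights sum to 1.
    """
    radius = int(math.ceil(radius_in_sigmas * sigma))
    kernel = [[math.exp(-(x * x + y * y) / (2.0 * sigma * sigma))
               for x in range(-radius, radius + 1)]
              for y in range(-radius, radius + 1)]
    total = sum(sum(row) for row in kernel)
    return [[v / total for v in row] for row in kernel]


def convolve(image, kernel):
    """Convolve a gray image (list of rows) with a square, symmetric kernel.

    Pixels outside the image are replaced by the nearest border pixel.
    Because the Gaussian kernel is symmetric, no kernel flip is needed.
    """
    h, w = len(image), len(image[0])
    r = len(kernel) // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for ky in range(-r, r + 1):
                for kx in range(-r, r + 1):
                    yy = min(max(y + ky, 0), h - 1)
                    xx = min(max(x + kx, 0), w - 1)
                    acc += image[yy][xx] * kernel[ky + r][kx + r]
            out[y][x] = acc
    return out


def smooth(image, sigma):
    """Gaussian smoothing: build the kernel, then convolve."""
    return convolve(image, gaussian_kernel(sigma))


def coverage(k):
    """Fraction of a 1D Gaussian's area that lies within +/- k sigma."""
    return math.erf(k / math.sqrt(2.0))


# A single bright noise pixel is spread out and reduced by the filter.
spike = [[0.0] * 5 for _ in range(5)]
spike[2][2] = 100.0
blurred = smooth(spike, 1.0)
```

        Here `coverage(2.5)` and `coverage(3.0)` reproduce the area figures quoted above, which is why a kernel radius of about 3σ is a common choice.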

      3. Canny Edge Detection

        The algorithm runs in 4 separate steps:

        1. Finding gradients: The edges are marked where the gradients of the image have large magnitudes.

          |G| = |Gx| + |Gy|

          Where: Gx and Gy are the gradients in the x- and y- directions respectively

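        2. Finding the edge direction: The direction of the edge at each pixel is determined from Gx and Gy.

        3. Non-maximum suppression: Only pixels whose gradient magnitude is a local maximum along the gradient direction are kept; all other edge pixels are discarded.

        4. Hysteresis: Two thresholds are used to link edge pixels so that edges do not break up.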

          The Canny edge algorithm finds edges where the gray scale intensity of the image changes. The edges are found by determining the gradients of the image. The gradient at each pixel is determined by applying the Sobel operator in both the x and y directions. The edge direction is then computed from the two Sobel responses. Non-maximum suppression preserves the pixels whose gradient magnitude is a local maximum along the gradient direction and deletes everything else. In hysteresis two thresholds are used: edge pixels stronger than the high threshold are marked as strong, and edge pixels weaker than the low threshold are suppressed. Edge pixels between the two thresholds that are connected to already found strong edges are also marked as strong. The strong edges are used to create the edge image.
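          The first two steps (gradient magnitude and edge direction) can be sketched as below. This is a minimal pure-Python illustration using the standard 3 x 3 Sobel masks and the |G| = |Gx| + |Gy| approximation given above. The function name `sobel_gradients` and the choice to leave border pixels at zero are assumptions made for this example.

```python
import math

# Standard 3 x 3 Sobel masks for the x and y directions.
SOBEL_X = [[-1, 0, 1],
           [-2, 0, 2],
           [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1],
           [0, 0, 0],
           [1, 2, 1]]


def sobel_gradients(gray):
    """Return (magnitude, direction) images for a gray image (list of rows).

    magnitude uses |G| = |Gx| + |Gy|; direction is atan2(Gy, Gx) in radians.
    Border pixels are left at 0 (an assumed simplification).
    """
    h, w = len(gray), len(gray[0])
    magnitude = [[0.0] * w for _ in range(h)]
    direction = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = gy = 0.0
            for j in range(3):
                for i in range(3):
                    p = gray[y + j - 1][x + i - 1]
                    gx += SOBEL_X[j][i] * p
                    gy += SOBEL_Y[j][i] * p
            magnitude[y][x] = abs(gx) + abs(gy)
            direction[y][x] = math.atan2(gy, gx)
    return magnitude, direction


# A vertical step edge: dark on the left, bright on the right.
step = [[0, 0, 100, 100, 100] for _ in range(5)]
mag, ang = sobel_gradients(step)
```

          For the step image the magnitude is large only in the columns next to the intensity jump, and the direction there points along the x axis.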

          Fig 3: Examples of Hysteresis Thresholding

          Hysteresis Thresholding

          • Keep both a high threshold H and a low threshold L.

          • Any edge with strength < L is discarded.

          • Any edge with strength > H is kept.

          • Any edge P with strength between L and H is kept only if there is a path of edges with strength > L connecting P to an edge of strength > H.
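          The hysteresis rules above can be sketched as a breadth-first search that starts from the strong pixels and grows through connected pixels above the low threshold. This is a minimal pure-Python illustration. The function name `hysteresis`, the use of 8-connectivity and the treatment of a strength exactly equal to L as discarded are assumptions made for this example.

```python
from collections import deque


def hysteresis(strength, low, high):
    """Return a boolean edge map from an edge-strength image (list of rows).

    Pixels with strength > high are seeds. A pixel with strength > low is
    kept only if it is reachable from a seed through 8-connected pixels
    that also have strength > low.
    """
    h, w = len(strength), len(strength[0])
    keep = [[False] * w for _ in range(h)]
    queue = deque()
    # Seed the search with all strong pixels.
    for y in range(h):
        for x in range(w):
            if strength[y][x] > high:
                keep[y][x] = True
                queue.append((y, x))
    # Grow the strong edges through connected pixels above the low threshold.
    while queue:
        y, x = queue.popleft()
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                ny, nx = y + dy, x + dx
                if (0 <= ny < h and 0 <= nx < w
                        and not keep[ny][nx]
                        and strength[ny][nx] > low):
                    keep[ny][nx] = True
                    queue.append((ny, nx))
    return keep


# One strong pixel, two weak pixels attached to it, a gap, one isolated weak pixel.
row = [[90, 50, 50, 5, 50]]
edges = hysteresis(row, low=20, high=80)
```

          In this example the two weak pixels next to the strong one are kept, while the weak pixel behind the gap is dropped because no path above L links it to a strong edge.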

      4. Line and Point Segmentation

        The major operation of line and point segmentation is done next. Edges characterize object boundaries and are used for segmentation and identification of objects in an image. The Hough transform is used to detect the separate straight lines of the image and thereby identify the true geometric structure of the object. The standard Hough transform (SHT) is used to trace objects such as walls and doors. The improved Hough transform (IHT) is used to find point values and corners through point segmentation.

        If these edge or boundary points are used as input to the Hough transform, a curve is generated in polar space (r, θ) for each edge point in Cartesian space. Curves generated by collinear points in the image intersect in peaks (r, θ) in the Hough transform space. These intersection points characterize the straight lines of the image. Mapping back from Hough transform space into Cartesian space yields a set of line descriptions of the image.
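        The voting scheme of the standard Hough transform can be sketched as follows, using the usual line parameterization r = x cos θ + y sin θ. This is a minimal pure-Python illustration. The function names (`hough_accumulator`, `peaks`), the 1-degree angular resolution and the rounding of r to whole pixels are assumptions made for this example, and the improved Hough transform is not covered.

```python
import math


def hough_accumulator(edge_points, width, height, n_theta=180):
    """Vote in (r, theta) space for every edge point (x, y).

    theta runs over [0, 180) degrees in n_theta steps. r is rounded to the
    nearest integer and shifted by `diag` so it can be used as a row index.
    """
    diag = int(math.ceil(math.hypot(width, height)))
    acc = [[0] * n_theta for _ in range(2 * diag + 1)]
    for x, y in edge_points:
        for t in range(n_theta):
            theta = math.pi * t / n_theta
            r = int(round(x * math.cos(theta) + y * math.sin(theta)))
            acc[r + diag][t] += 1
    return acc, diag


def peaks(acc, diag, min_votes, n_theta=180):
    """List (r, theta_degrees, votes) for all cells with at least min_votes."""
    found = []
    for ri, row in enumerate(acc):
        for t, votes in enumerate(row):
            if votes >= min_votes:
                found.append((ri - diag, 180.0 * t / n_theta, votes))
    return found


# Ten collinear edge points on the vertical line x = 5.
points = [(5, y) for y in range(10)]
acc, diag = hough_accumulator(points, width=10, height=10)
lines = peaks(acc, diag, min_votes=10)
```

        Because r is quantized, neighbouring angle cells can collect as many votes as the true line, so a practical detector would also apply some form of peak selection.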

      5. Depth Map Generation

        For depth map generation, vanishing line detection and gradient plane assignment are used. The main steps are:

        1. Vanishing line detection: Initially some image features, such as the vanishing point (VP) and the vanishing lines, are detected.

        2. Gradient plane generation: During this processing step, the position of the vanishing point (relative to the original image) and the slopes of the vanishing lines are analyzed.

        3. Depth gradient assignment: A gray level corresponding to the depth level is assigned to every pixel of the depth gradient planes.

        4. Depth map generation by fusion: In this step the qualitative depth map and the geometric depth map are combined to generate the final depth map.
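        The idea behind the depth gradient assignment in step 3 can be illustrated with a deliberately simplified sketch. It assumes the vanishing point is already known and replaces the gradient planes by a single radial gradient: the gray level grows with the distance from the vanishing point, so the vanishing point is treated as the farthest location (dark) and pixels far from it as nearest (bright). The function name `depth_gradient` and this bright-is-near convention are assumptions made for this example, not the procedure used in the paper.

```python
import math


def depth_gradient(width, height, vanishing_point):
    """Assign a gray level 0..255 to every pixel from its distance to the VP.

    0 is assigned at the vanishing point (assumed farthest) and 255 at the
    image corner farthest from it (assumed nearest). This is a hypothetical
    simplification of gradient plane assignment.
    """
    vx, vy = vanishing_point
    corners = [(0, 0), (width - 1, 0), (0, height - 1), (width - 1, height - 1)]
    # The farthest image point from the VP is always one of the corners.
    max_dist = max(math.hypot(cx - vx, cy - vy) for cx, cy in corners) or 1.0
    return [[int(round(255 * math.hypot(x - vx, y - vy) / max_dist))
             for x in range(width)]
            for y in range(height)]


# Vanishing point in the centre of a 5 x 5 image.
depth = depth_gradient(5, 5, (2, 2))
```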

      6. Image Transformation

    Image transformation defines three types of pattern transformation:

    1. Translation moves the generating function across the image.

    2. Rotation rotates the image generating function by a given angle.

    3. Anisotropic scaling scales the generating function anisotropically, that is, by different factors along different axes.
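    The three transformations can be sketched on 2D point coordinates as below. This is a minimal pure-Python illustration. The function names and the choice to apply the transformations to individual (x, y) points, with rotation about the origin, are assumptions made for this example; the paper applies them to a generating function.

```python
import math


def translate(point, dx, dy):
    """Shift a point by (dx, dy)."""
    x, y = point
    return (x + dx, y + dy)


def rotate(point, angle_deg):
    """Rotate a point counter-clockwise about the origin by angle_deg degrees."""
    x, y = point
    a = math.radians(angle_deg)
    return (x * math.cos(a) - y * math.sin(a),
            x * math.sin(a) + y * math.cos(a))


def scale_anisotropic(point, sx, sy):
    """Scale a point by sx along x and by sy along y (sx may differ from sy)."""
    x, y = point
    return (sx * x, sy * y)


moved = translate((1, 2), 3, -1)
turned = rotate((1, 0), 90)
stretched = scale_anisotropic((2, 3), 2, 0.5)
```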


      1. Immersive television systems, which can be regarded as a next-generation broadcast technology.

      2. Immersive video conferencing, which allows geographically distributed users to hold a video conference with a strong sense of physical and social presence. Participants feel co-present at a round-table discussion thanks to the realistic real-time reproduction of real-life communication cues such as gesture, gaze direction, body language and eye contact.

      3. Immersive communication: collaborative systems that allow geographically distributed users to work jointly on the same task. A more advanced approach is collaborative virtual environments (CVE) or shared virtual environments (SVE).


    Active 3D acquisition methods introduce a source of energy, such as light or ultrasonic waves, into the scene. In passive methods the process of recording an image does not alter the surrounding environment in any way. A camera flash unit does not emit structured light, but the direction of illumination and the location of shadows change with the location of the camera.

    Therefore, a camera flash can be regarded as a factor that alters the environment. A second classification divides the acquisition methods into those that collect data in the form of images and direct methods, such as range sensors, that do not require such a data representation. A further classification is based on the number of views used for data acquisition; monocular methods use a single view.

    3D information can be extracted from a single view, but this does not mean that one image is enough to recover the depth. For example, in the range-from-focus approach, several images are taken from a single view, each with a different focus. By identifying the sharpest areas in each image, one can obtain information about the depth of the underlying scene. Other monocular methods, such as range from brightness, attenuation or texture, use extra information that must be known in advance.

    The basic purpose of this project is to build a 3D model from 2D images taken while rotating a solid object through 360 degrees. Along with hardware such as a microcontroller, camera and servomotor, MATLAB software has been used extensively to generate the final 3D model. The software lets the user visualize the virtual model in a three-dimensional environment, and the data can be exported in different formats.


    1. Kerlow, Isaac, of the History of Computing IEEE Annals

    2. Weisberg, David E.: The Engineering Design Revolution: The People, Companies and Computer Systems That Changed Forever the Practice of Engineering (2008).

    3. Couprie, C., Grady, L., Najman, L., Talbot, H.: Power Watersheds: A Unifying Graph Based Optimization Framework. IEEE Transactions on Pattern Analysis and Machine Intelligence (2011).

    4. Jalba, A.C., Roerdink, J.B.T.M.: Efficient surface reconstruction using generalized Coulomb potentials. IEEE Transactions on Visualization and Computer Graphics, vol. 13, pp. 1512-1519 (November 2007).

    5. Saxena, A., Sun, M., Ng, A.Y.: Make3d: Learning 3-d scene structure from a single still image. IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 31, pp. 824-840 (May 2009).

    6. Shi, J., Tomasi, C.: Good features to track. In: IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pp. 593-600 (June 1994).

    7. Cheng, C.-C., Li, C.-T., Chen, L.-G.: A 2D-To-3D Conversion System Using Edge Information. In: Proceedings of IEEE International Conference on Consumer Electronics (2010).

    8. Battiato, S., Curti, S., Scordato, E., Tortora, M., La Cascia, M.: Depth map generation by image classification. Three-Dimensional Image Capture and Applications VI, vol. 5302, pp. 95-104 (2004).
