Multisensor Video Object Fusion and Interpretation for Surgical Navigation

DOI: 10.17577/IJERTV3IS10841


Nobert Thomas Pallath*, Tessamma Thomas *

* Department of Electronics, Cochin University of Science & Technology

Abstract

Multisensor video object fusion can be used when multiple views of the same object are captured using multiple sensors. The method is useful for a wide variety of applications such as video surveillance, traffic monitoring, satellite imaging and automated inspection. In this paper, a novel computer assisted surgical navigation method is developed, based on video object fusion from two orthogonally placed sensors or cameras. Orthogonally placed cameras allow analysis of the position and orientation of a micro motor drill in three dimensional co-ordinates, and placing each camera perpendicular to the plane of motion of the drill under inspection improves the overall sensitivity of measurement. The drill is used for orthopaedic procedures such as drilling a pilot hole into the pedicle region of the human vertebra for the insertion of pedicle screws. The study reveals that the real world position of the drill, calculated using the 3D image co-ordinates, gives precise drill positioning with respect to the 3D real world co-ordinates.

KEYWORDS

Computer assisted spine surgery (CASS), field of view (FOV), pedicle screw, micro-motor drill, pattern matching, graphical overlay.

Introduction

Video sequences contain more information than still images about how objects and scenarios change over time [1]. Data fusion deals with methods for merging data from multiple sensors. Information fusion using multiple sensors can serve various purposes, such as detection, recognition, tracking and situation assessment, more efficiently [2]. Typically, vision applications have multiple video feeds presented to a human observer for analysis. However, the ability of humans to concentrate on multiple videos simultaneously is limited. Therefore, there has been an interest in developing computer vision systems that can analyse information from multiple cameras simultaneously and present it in a compact, symbolic fashion to the user. To cover an area of interest, it is reasonable to use cameras with overlapping fields of view (FOVs). Overlapping FOVs are typically used in computer vision for the purpose of extracting 3D information [3]. The relative motion between objects in a scene and a camera gives rise to the apparent motion of objects in a sequence of images. This motion may be characterized by observing the apparent motion of a discrete set of features or brightness patterns in the images. The objective of analysing a sequence of images is to derive the motion of the objects in the scene from the motion of the features or brightness patterns associated with them [4]. Two distinct approaches have been developed for the computation of motion from image sequences. The first is based on extracting a set of relatively sparse, but highly discriminatory, two-dimensional features in the images corresponding to three-dimensional object features in the scene, such as corners, occluding boundaries of surfaces, and boundaries demarcating changes in surface reflectivity. Such points, lines and/or curves are extracted from each image, and inter-frame correspondence is then established between these features. Constraints are formulated based on assumptions such as rigid body motion, i.e., that the 3-D distance between two features on a rigid body remains the same after object/camera motion. Such constraints usually result in

a system of nonlinear equations. The observed displacements of the 2-D image features are used to solve these equations, leading ultimately to the computation of the motion parameters of objects in the scene. The other approach is based on computing the optic flow, or the two-dimensional field of instantaneous velocities of brightness values (gray levels) in the image plane. Instead of considering temporal changes in image brightness values when computing the optic flow field, it is also possible to consider temporal changes in values that result from applying various local operators, such as contrast, entropy, and spatial derivatives, to the image brightness values. In either case, a relatively dense flow field is estimated, usually at every pixel in the image. The optic flow is then used in conjunction with added constraints or information regarding the scene to compute the actual three-dimensional relative velocities between scene objects and the camera.

A task that is closely related to the estimation of motion is the estimation of the structure of the imaged scene. In the case of the optic flow method, this consists of grouping pixels corresponding to distinct objects into separate regions, i.e., segmenting the optic flow map, and then computing the three-dimensional coordinates of the surface points in the scene corresponding to each pixel at which the flow is computed. In the case of feature-based analysis, computing structure corresponds to forming groups of image features for each object in the scene and then computing their 3-D coordinates. In this paper, we adopt the feature-based approach, using markers.

Computer navigation systems serve as a useful aid in spine surgery [5][6][7]. Although pre-operative CT imaging or registration is not required in fluoroscopy based navigation systems, CT based navigation systems have a definite advantage with respect to precise pre-operative planning using 3D visualization of patient anatomy [1]. Even though X-ray fluoroscopy based navigation is popular, there is a risk of prolonged exposure to X-ray radiation [2]. Also, it cannot be used during the entire screw insertion procedure, due to possible spatial conflicts between the C-arm of the fluoroscopy unit, the surgeon and the surgical instruments [3]. Systems having lower radiation risk are generally expensive [8]. The procedure is to insert two screws into each vertebra to be fused [6][9]. The angle of insertion of the pedicle screws is chosen so as to avoid perforation of the pedicle, which may cause damage to the spinal cord or nerve roots [10]. We have developed a computer assisted method with low instrumentation cost and high precision, using multisensor video object fusion and computer graphics.

II. MATERIALS AND METHODS

The method developed is based on real-time processing of video grabbed using the experimental setup, consisting of a cadaveric dry human vertebra, a phantom model of the vertebra, a micro motor drill, two orthogonally placed CCD cameras and a workstation computer with a Matrox Morphis frame grabber, in a simulated surgical environment. The optimum distance, position, yaw, pitch and roll of both cameras are fixed. The cameras are placed at positions chosen with the entire surgical setup in mind, such as patient position and lighting, and without causing any obstruction to the surgeon or the surgical field. The workstation computer is arranged with its monitor at a viewable distance.

    1. Pre-operative Planning

Pre-operative planning is an important step in the procedure, which involves the analysis and measurement of the pedicle parameters, viz. width, height and orientation. The pre-operative axial CT image of the spine is used for this purpose. The step involves identification of the vertebrae where the pedicle screws are to be inserted, selection of the appropriate representative image of the vertebra, marking of the landmark points on the selected image, and computation of the parameters of the pedicle such as width and height. The process is done for both vertebrae to be fused. 3D-Doctor software is used for vertebra modelling and measurements [8]. Markers with a unique geometric shape are designed by considering the shapes of all background objects, so as to avoid ambiguity and false detection during the object search. The markers are fixed centrally on the body of the micro motor drill, perpendicular to and facing the cameras. An alternate method of fixing the markers on the drill awl is also used for tracking the pedicle screw. The centroids of the markers lie over the axis of the drill or drill awl.

Figure 2. An example of the reference image

The axial CT image of the candidate vertebra consists of eight or nine slices at a separation of 2 to 3 mm. The fourth or fifth slice is the best representative image [8]. This image provides a clear picture of the pedicle dimensions, from normal anatomy. The image shown in figure 2 is used to determine the pedicle width, angle and relationship with other anatomical structures. A vertical line drawn through the middle of the transverse process, together with lines drawn equidistant from this central line on the spinous process, as shown in figure 2, aids in the registration step [8]. Registration of the CT image and the actual vertebra is done by overlaying. Two lines drawn through the centre of the pedicle area, from the lamina to the vertebral body, as shown in figure 2, display the ideal reference path for pedicle screw insertion [8]. The anatomy of the pedicle shows that it has a non-uniform cylindrical shape, with varying diameter across its length [8]. A graphical cylinder, plotted with its diameter fixed using the minimum width of the pedicle area as shown in figure 2, aids in visualization of the trajectory and tracking of the pedicle screw during insertion [8].
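As an illustration of this planning step, the sketch below (not from the paper; it assumes OpenCV, hand-picked integer landmark coordinates and hypothetical function and parameter names) draws the central and equidistant registration lines and the ideal screw paths on the selected axial CT slice.

```python
# Illustrative sketch of annotating the reference CT slice; landmark points are assumed
# to have been picked manually during planning, and all names here are hypothetical.
import cv2
import numpy as np

def annotate_reference_image(ct_slice, midline_pt, left_path, right_path, offset_px):
    """ct_slice: uint8 grayscale axial CT image; midline_pt: (x, y) midline landmark;
    left_path/right_path: ((x1, y1), (x2, y2)) entry/exit points through each pedicle;
    offset_px: distance of the two lateral registration lines from the midline, in pixels."""
    img = cv2.cvtColor(ct_slice, cv2.COLOR_GRAY2BGR)
    h = img.shape[0]
    x_mid = midline_pt[0]
    # central vertical line plus two equidistant lateral lines used for registration
    for x in (x_mid, x_mid - offset_px, x_mid + offset_px):
        cv2.line(img, (x, 0), (x, h - 1), (0, 255, 0), 1)
    # ideal reference paths through the centre of each pedicle, lamina to vertebral body
    for entry, exit_pt in (left_path, right_path):
        cv2.line(img, entry, exit_pt, (0, 0, 255), 1)
    return img
```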

    2. Camera Calibration

The relationship between pixel coordinates and real world coordinates is established using camera calibration. A dot pattern grid is used to map pixel coordinates to real world coordinates, for accurate analysis and measurement of the drill position and orientation. A square grid pattern is used for detecting perspective distortions due to the camera lens. The mapping corrects image distortions, viz. non-unity aspect ratio, rotation, perspective, pincushion and barrel distortion. Results are returned in real world units, which automatically compensates for any distortions in the image. A calibration object holds the defined mapping and is used to transform pixel coordinates or results to their real world equivalents.
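A minimal sketch of such a pixel-to-world mapping is given below, assuming a planar dot grid of known spacing and using an OpenCV homography. The actual system uses the frame grabber's calibration module; a homography handles aspect ratio, rotation and perspective, while pincushion/barrel lens distortion would additionally require a radial distortion model (e.g. a full camera calibration).

```python
# Sketch of a dot-grid calibration: fit a homography from detected dot centres (pixels)
# to their known physical grid positions (mm), then use it to convert measurements.
import cv2
import numpy as np

def fit_grid_mapping(dot_centroids_px, grid_shape, spacing_mm):
    """dot_centroids_px: Nx2 detected dot centres in pixels, ordered row by row;
    grid_shape: (rows, cols) of the dot grid; spacing_mm: physical dot spacing."""
    rows, cols = grid_shape
    world = np.array([[c * spacing_mm, r * spacing_mm]
                      for r in range(rows) for c in range(cols)], dtype=np.float32)
    H, _ = cv2.findHomography(np.asarray(dot_centroids_px, np.float32), world, cv2.RANSAC)
    return H

def pixels_to_world(H, points_px):
    """Transform pixel coordinates to real-world (mm) coordinates in the grid plane."""
    pts = np.asarray(points_px, np.float32).reshape(-1, 1, 2)
    return cv2.perspectiveTransform(pts, H).reshape(-1, 2)
```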

Using the theorem of intersecting lines, the computational model of the pinhole camera is given by [11]:

u = f · x / z,    v = f · y / z

where x, y, z are the coordinates of a scene point in the 3D coordinate system whose origin is the projection center, and u, v denote the image coordinates. The parameter f is known as the camera constant; it denotes the distance from the projection center to the image plane.

      Figure 3. Camera placement and distances

From figure 3, the camera geometry relates the image plane to the plane of motion of the drill, where z = D is the axial distance of the camera from that plane and V and H are the camera placement distances marked in the figure.

Let Pd be the pixel distance corresponding to an object displacement Rd. The Euclidean distance between any two pixel positions (u1, v1) and (u2, v2) is

Pd = √((u2 − u1)² + (v2 − v1)²)

and the corresponding object displacement between (x1, y1) and (x2, y2) is

Rd = √((x2 − x1)² + (y2 − y1)²).

Substituting the pinhole model, the ratio Pd/Rd can be expressed in terms of the camera constant f and the distances V and H.

Using a set of object points {(x1, y1), (x2, y2), …, (xn, yn)} and the corresponding image points {(u1, v1), (u2, v2), …, (un, vn)} obtained with the camera, the ratio Pd/Rd is found. Next, the values of V and H are measured after fixing the camera. Knowing the ratio Pd/Rd, V and H, the value of f is found; then, knowing f, V and H, the value of Rd can be found for every measured Pd.
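This scale estimation can be sketched as follows, under the simplifying assumption that the tracked points remain in the plane of motion at a fixed axial distance z (so the pinhole model gives Pd/Rd = f/z); the function and variable names are illustrative, not from the paper.

```python
# Sketch of estimating the pixel/real-world scale and the camera constant f.
import numpy as np

def estimate_pixel_per_mm(object_pts_mm, image_pts_px):
    """Least-squares estimate of the ratio Pd/Rd from paired object/image point sets."""
    obj = np.asarray(object_pts_mm, float)
    img = np.asarray(image_pts_px, float)
    rd = np.linalg.norm(np.diff(obj, axis=0), axis=1)   # object displacements Rd
    pd = np.linalg.norm(np.diff(img, axis=0), axis=1)   # pixel displacements Pd
    return np.sum(pd * rd) / np.sum(rd * rd)            # slope of Pd versus Rd

def focal_constant(ratio_pd_rd, z_mm):
    """f = (Pd/Rd) * z, with z the axial camera-to-plane distance (e.g. derived from V and H)."""
    return ratio_pd_rd * z_mm

def real_displacement(pd_px, ratio_pd_rd):
    """Recover the real-world displacement Rd for a measured pixel displacement Pd."""
    return pd_px / ratio_pd_rd
```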

    3. Registration and Surgery

After surgical exposure of the spine, one needle is placed in the middle of the superior articular process and two needles are placed on the spinous process, at the distances measured during the pre-operative planning phase [8]. By overlaying the transparent reference image, with the lines drawn as described in the pre-operative planning phase, over the video, and adjusting the focus and zoom of the camera, the three needles in the video are made to coincide exactly with the three vertical lines plotted on the reference image. At this stage the dimensions of the objects in both images match, which finalizes the registration process. Now the drill is positioned with its burr placed exactly at the entry point. Using computer graphics, the cylinder and its axis, with the height and diameter measured during the pre-planning phase, are created.

Markers having a unique geometric shape are designed by considering the shapes of all background objects, so as to avoid ambiguity and false detection. A marker is fixed centrally on the body of the micro motor drill, facing the camera. The video of the drill, with the marker fixed centrally on its body, is grabbed and processed in sequential frames. The procedure begins by correcting the orientation of the drill so that it correctly enters the pedicle canal and the vertebral body. The orientation of the drill is the same as the marker orientation. The path of the drill is then tracked during insertion, to ensure that it does not go beyond the walls of the pedicle canal or pierce the vertebral body. The method is to search for the marker, using edge extraction to obtain its geometric features. The search is performed and the results are displayed based on the calibration. The algorithm uses edge based geometric features of the models and the target to establish a match. The gradient method is used for extracting object contours. An object contour is a type of edge that defines the outline of the objects in an image. The edges extracted from the video frame form the image's edge map, which represents the image as a set of edges. The feature calculations are performed using the image's edge map. The edge finding method uses operations based on differential analysis, where edges are extracted by analyzing intensity transitions in the image. Edges are extracted in three basic steps. First, a filtering process provides an enhanced image of the edges, based on computation of the image's derivatives. Second, detection and thresholding operations determine all pertinent edge elements, or edgels, in the image. Third, neighboring edgels are connected to build edge chains, and features are calculated for each edge. The enhanced image of the object contours is obtained by calculating the gradient magnitude at each pixel in the image.

First order derivatives of a digital image are based on various approximations of the 2-D gradient. The gradient of an image f(x, y) at location (x, y) is defined as the vector [12]:

∇f = [Gx, Gy]ᵀ = [∂f/∂x, ∂f/∂y]ᵀ

The gradient magnitude is calculated at each pixel position from the image's first derivatives. It is defined as [12]:

|∇f| = √(Gx² + Gy²)
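The gradient computation above can be sketched with Sobel derivatives, as below; the actual system uses the imaging library's edge finder, so this is only an approximation of the filtering and thresholding steps (connecting edgels into chains is omitted).

```python
# Sketch of gradient-based edge extraction: filter with Sobel derivatives, compute the
# gradient magnitude and direction, and threshold to keep strong edgels.
import cv2
import numpy as np

def edge_map(gray, mag_threshold=50.0):
    """Return gradient magnitude, gradient direction and a binary edgel map."""
    gx = cv2.Sobel(gray, cv2.CV_32F, 1, 0, ksize=3)     # df/dx
    gy = cv2.Sobel(gray, cv2.CV_32F, 0, 1, ksize=3)     # df/dy
    magnitude = np.sqrt(gx * gx + gy * gy)              # |grad f|
    direction = np.arctan2(gy, gx)                      # steepest-ascent direction
    edgels = (magnitude >= mag_threshold).astype(np.uint8) * 255
    return magnitude, direction, edgels
```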

An edgel, or edge element, is located at the maximum value of the gradient magnitude over adjacent pixels, in the direction defined by the gradient vector. The gradient direction is the direction of steepest ascent at an edgel in the image, while the gradient magnitude is the steepness of that ascent. The gradient direction is also perpendicular to the object contour. The marker with the unique geometric shape is used as the search model. A search for instances of the model is performed in the sequence of video frames. The match between the model and its occurrences in the target image is determined using the values of the score and the target score. The score is a measure of the active edges of the model found in the occurrence, weighted by the deviation in position of these common edges. The scores are calculated as follows:

Score = Model coverage × (1 − (Fit error weighting factor × Normalized fit error))

Target score = Target coverage × (1 − (Fit error weighting factor × Normalized fit error))

The model coverage is the percentage of the total length of the model's active edges found in the target image. 100% indicates that, for each of the model's active edges, a corresponding edge was found in the occurrence. The target coverage is the percentage of the total length of the model's active edges found in the occurrence, divided by the length of the edges present within the occurrence's bounding box. Thus, a target coverage of 100% means that no extra edges were found; lower scores indicate that features or edges found in the target are not present in the model. The fit error is a measure of how well the edges in the occurrence correspond to those of the model. It is calculated as the average quadratic distance, in pixels or calibrated units, between the edgels in the occurrence and the corresponding active edges in the model.

A perfect fit gives a fit error of 0.0. The fit error weighting factor (between 0.0 and 100.0) determines the importance placed on the fit error when calculating the score and target score. An acceptance level is set for both the score and the target score. A graphical line showing the position and orientation of the marker on the drill is constructed within the graphical cylinder, using the line drawing technique of computer graphics, and is displayed in real time by using the position and orientation of the centroid of the marker and drawing the results non-destructively in the display's overlay buffer. The line is displayed within the graphical cylinder, whose axis has exactly the same inclination as the axis of the pedicle canal and whose dimensions are exactly the same as those of the pedicle canal, constructed earlier using computer graphics. The graphical results display the position and orientation of the drill and are used for real time drill control and navigation. Positional results and audio-visual alerts are used to prevent boundary violation, which can lead to pedicle wall perforation. An interactive GUI with real time video display and real time graphical overlay is built for ease of access, for viewing the position and orientation of the drill or pedicle screw during insertion.
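The scoring rule quoted above can be sketched as follows; here the coverages are expressed as fractions in [0, 1] rather than percentages, and the weighting factor and acceptance levels are illustrative values, not those used in the paper.

```python
# Sketch of the match-scoring rule and the acceptance test applied to each occurrence.
def occurrence_scores(model_coverage, target_coverage, normalized_fit_error,
                      fit_error_weight=25.0):
    """Coverages are fractions in [0, 1]; fit_error_weight lies in [0, 100]."""
    score = model_coverage * (1.0 - fit_error_weight * normalized_fit_error)
    target_score = target_coverage * (1.0 - fit_error_weight * normalized_fit_error)
    return score, target_score

def accept(score, target_score, score_min=0.75, target_min=0.80):
    """Accept the occurrence only if both scores reach their acceptance levels."""
    return score >= score_min and target_score >= target_min
```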

Multisensor Video Object Fusion

Video streams from the two orthogonally placed cameras are processed simultaneously in separate image buffers. The scene of motion of the drill can be visualised in the 3D image coordinates. Each of the orthogonally placed cameras is also perpendicular to the plane of motion of the drill. The graphical cylinder created using the pedicle dimensions is visualised in a 3D Cartesian coordinate system. The centroids of the two markers placed on the drill, one in the XZ plane and one in the YZ plane, are detected. Perpendicular lines passing through the centroids of the markers intersect each other, and the axis of the drill, at a single point. This resultant point, which lies in a plane parallel to the XY plane as shown in the figure, is then plotted with respect to the axis of the cylinder and displayed in real time. The trajectory of the resultant point displays the position, orientation and depth of penetration of the drill through the graphical cylinder, in the 3D image coordinates, in real time. 3D graphical visualisation requires 3D transformations.
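A minimal sketch of this fusion step is shown below, assuming each camera reports the marker centroid in its own image plane after calibration; averaging the two z estimates is an assumption made here, since ideally both views report the same z. The example values are taken from the first row of table 1.

```python
# Sketch of fusing the two orthogonal views into one 3D point:
# camera 1 observes the XZ plane, camera 2 observes the YZ plane.
import numpy as np

def fuse_orthogonal_views(xz_centroid, yz_centroid):
    """xz_centroid = (x, z) from camera 1, yz_centroid = (y, z) from camera 2."""
    x, z_from_xz = xz_centroid
    y, z_from_yz = yz_centroid
    z = 0.5 * (z_from_xz + z_from_yz)   # the two views should agree on z
    return np.array([x, y, z])

# fusing the two marker centroids seen along the drill axis (values from table 1, row 1):
p1 = fuse_orthogonal_views((425, 434), (430, 434))   # -> [425, 430, 434]
p2 = fuse_orthogonal_views((425, 447), (430, 447))   # -> [425, 430, 447]
```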

3D transformations use 4×4 matrices operating on homogeneous points (X, Y, Z, W). For 3D translation, the point (X, Y, Z) is translated by amounts Dx, Dy and Dz to the location (X', Y', Z'):

X' = X + Dx
Y' = Y + Dy
Z' = Z + Dz

or, as the point P' = T · P, where

T = | 1  0  0  Dx |
    | 0  1  0  Dy |
    | 0  0  1  Dz |
    | 0  0  0  1  |

For 3D rotation, an axis to rotate about is required. The following 4×4 matrices give the rotations about the three axes, viz. the X, Y and Z axes, by an angle θ:

P' = | X' |        P = | X |
     | Y' |            | Y |
     | Z' |            | Z |
     | 1  |            | 1 |

Rz = | cos θ  −sin θ   0    0 |
     | sin θ   cos θ   0    0 |
     |   0       0     1    0 |
     |   0       0     0    1 |

Rx = | 1     0        0      0 |
     | 0   cos θ   −sin θ    0 |
     | 0   sin θ    cos θ    0 |
     | 0     0        0      1 |

Ry = |  cos θ   0   sin θ    0 |
     |    0     1     0      0 |
     | −sin θ   0   cos θ    0 |
     |    0     0     0      1 |
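The transforms above can be sketched in code as follows; NumPy stands in for the graphics library actually used, and the composition order and numeric values shown are only an example.

```python
# Sketch of the 4x4 homogeneous transforms used to place the graphical cylinder and the
# drill line in the XYZ co-ordinates.
import numpy as np

def translation(dx, dy, dz):
    T = np.eye(4)
    T[:3, 3] = (dx, dy, dz)
    return T

def rotation_z(theta):
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0, 0], [s, c, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]])

def rotation_x(theta):
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[1, 0, 0, 0], [0, c, -s, 0], [0, s, c, 0], [0, 0, 0, 1]])

def rotation_y(theta):
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, 0, s, 0], [0, 1, 0, 0], [-s, 0, c, 0], [0, 0, 0, 1]])

# example composition: rotate a homogeneous point about Z, then translate it
P = np.array([425, 430, 434, 1.0])
P_prime = translation(10, 0, 0) @ rotation_z(np.radians(15)) @ P
```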

Fig. 3. Graphical Cylinder with Axis and Line in the XYZ Spatial Co-ordinates

In the above figure, the point (x, z1) in the XZ plane and the point (y, z1) in the YZ plane have been combined, giving the resultant three dimensional coordinate (x, y, z1). Similarly, the point (x, z2) in the XZ plane and the point (y, z2) in the YZ plane give the resultant three dimensional coordinate (x, y, z2). The third point, which lies away from the axis of the cylinder, extending from the point (x, y, z2), is (x3, y3, z3).

III. EXPERIMENTAL SETUP AND RESULTS

  1. Evaluation of Real Time Object Tracking

The new technique was evaluated by inserting the drill at the pre-determined point of the transparent phantom model of the human vertebra, using computer assistance. Three needles were inserted into the landmark points on the phantom vertebra. The focus and zoom of the camera were adjusted so that the three needles in the video coincided exactly with the three vertical lines plotted on the reference CT image, completing the registration process. The graphical cylinder was drawn with its axis at exactly the same inclination as that of the pedicle canal, obtained from the reference CT image of the vertebra. The orientation of the axis of the cylinder was estimated with respect to the three vertical lines drawn in the reference CT image of the vertebra. The online video of the drill, with the marker fixed centrally on its body, was processed in sequential frames. A search for instances of the marker model was performed in the sequence of video frames, and the centroid of the marker model was found in each frame of the video.

Figure 4. Object tracking for CASS

A graphical line, showing the position and orientation of the centroid of the marker on the drill, was displayed using computer graphics, by using the position and orientation of the centroid of the marker and drawing the results non-destructively in the display's overlay buffer [13],[14]. The drill was positioned with its burr placed exactly at the entry point on the phantom vertebra. The orientation of the drill was corrected so as to correctly enter the pedicle canal; the orientation of the drill should be the same as the marker orientation. The path of the drill was then tracked during insertion, so that it neither went beyond the walls of the pedicle canal nor pierced the vertebral body. The trajectory of the burr or tip of the drill was viewed by observing the movement of the graphical line within the cylinder, and the depth of insertion was estimated from the same movement. Figure 4 illustrates the procedure of Computer Assisted Spine Surgery (CASS).
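An outline of the per-frame tracking loop is sketched below. It is not the authors' implementation (which uses the Matrox imaging library and an overlay display): the camera objects are assumed to behave like cv2.VideoCapture, find_marker_centroid is a placeholder for the geometric model finder, and the boundary-alert logic is illustrative.

```python
# Illustrative outline of the real-time tracking loop: grab both views, locate the marker
# in each, fuse the centroids into a 3D point, and warn when it leaves the planned cylinder.
import numpy as np

def inside_cylinder(point, axis_start, axis_end, radius):
    """True if the 3D point lies within the pedicle cylinder around the planned axis."""
    axis = axis_end - axis_start
    t = np.clip(np.dot(point - axis_start, axis) / np.dot(axis, axis), 0.0, 1.0)
    return np.linalg.norm(point - (axis_start + t * axis)) <= radius

def track(cam_xz, cam_yz, find_marker_centroid, axis_start, axis_end, radius):
    while True:
        ok1, frame_xz = cam_xz.read()
        ok2, frame_yz = cam_yz.read()
        if not (ok1 and ok2):
            break
        x, z1 = find_marker_centroid(frame_xz)      # centroid seen by the XZ camera
        y, z2 = find_marker_centroid(frame_yz)      # centroid seen by the YZ camera
        p = np.array([x, y, 0.5 * (z1 + z2)])       # fused 3D marker position
        if not inside_cylinder(p, axis_start, axis_end, radius):
            print("ALERT: drill outside the planned pedicle trajectory", p)
```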

  2. Composing Transformation

Table 1 below displays the comparison of the co-ordinate values in the XZ and YZ planes and the XYZ spatial co-ordinates. The co-ordinate values in the XZ and YZ planes have been transformed to co-ordinate values in the XYZ spatial co-ordinates. The resultant graphical cylinder and animated line for different positions of the drill have been studied. The study reveals that the real world position of the drill, calculated using the 3D image co-ordinates, gives precise drill positioning with respect to the 3D real world co-ordinates.

Table 1. Composing Transformation Table

XZ plane (x, z1) (x, z2)  |  YZ plane (y, z1) (y, z2)  |  XYZ fusion in 3D co-ordinates (x, y, z1) (x, y, z2) (x3, y3, z3)
(425, 434) (425, 447) | (430, 434) (430, 447) | (425, 430, 434) (425, 430, 447) (430, 437, 455)
(408, 421) (408, 439) | (413, 421) (413, 439) | (408, 413, 421) (408, 413, 439) (412, 419, 448)
(371, 411) (371, 424) | (376, 411) (376, 424) | (371, 376, 411) (371, 376, 424) (377, 383, 433)
(314, 429) (314, 443) | (319, 429) (319, 443) | (314, 319, 429) (314, 319, 443) (319, 326, 452)
(303, 440) (303, 451) | (308, 440) (308, 451) | (303, 308, 440) (303, 308, 451) (308, 312, 460)
(305, 442) (305, 454) | (310, 442) (310, 454) | (305, 310, 442) (305, 310, 454) (310, 317, 461)
(307, 444) (307, 456) | (312, 444) (312, 456) | (307, 312, 444) (307, 312, 456) (312, 319, 465)
(278, 468) (278, 456) | (283, 468) (283, 456) | (278, 283, 468) (278, 283, 456) (281, 290, 465)
(271, 477) (271, 450) | (276, 477) (276, 450) | (271, 276, 477) (271, 276, 450) (276, 283, 459)
(258, 486) (258, 452) | (263, 486) (263, 452) | (258, 263, 486) (258, 263, 452) (263, 268, 461)
(252, 480) (252, 446) | (257, 480) (257, 446) | (252, 257, 480) (252, 257, 446) (257, 264, 452)
(236, 467) (236, 449) | (241, 467) (241, 449) | (236, 241, 467) (236, 241, 449) (240, 248, 458)
(227, 456) (227, 455) | (232, 456) (232, 455) | (227, 232, 456) (227, 232, 455) (232, 239, 464)
(220, 451) (220, 451) | (225, 451) (225, 451) | (220, 225, 451) (220, 225, 451) (225, 232, 460)
(212, 445) (212, 453) | (217, 445) (217, 453) | (212, 217, 445) (212, 217, 453) (217, 224, 458)
(205, 438) (205, 447) | (210, 438) (210, 447) | (205, 210, 438) (205, 210, 447) (210, 214, 456)
(203, 427) (203, 431) | (208, 427) (208, 431) | (203, 208, 427) (203, 208, 431) (209, 215, 440)
(194, 375) (194, 381) | (199, 375) (199, 381) | (194, 199, 375) (194, 199, 381) (199, 206, 390)
(193, 404) (193, 371) | (198, 404) (198, 371) | (193, 198, 404) (193, 198, 371) (198, 205, 380)
(163, 390) (163, 371) | (168, 390) (168, 371) | (163, 168, 390) (163, 168, 371) (166, 174, 379)
(155, 390) (155, 284) | (160, 390) (160, 284) | (155, 160, 390) (155, 160, 284) (160, 167, 293)
(137, 368) (137, 298) | (142, 368) (142, 298) | (137, 142, 368) (137, 142, 298) (142, 149, 307)

3. Results

A new algorithm was developed for CASS using multisensor video object fusion and 3D graphics. The position and orientation of the drill have been studied in the 3D co-ordinates. The graphical cylinder with its axis, and the line displaying the position and orientation of the drill, were displayed with respect to the XYZ spatial co-ordinates. Animated graphical overlay over the video has been developed using computer graphics. Comparison of the composing transformation results in table 1 shows that the positions of the drill relative to the XZ and YZ planes can be combined to visualise the spatial position and orientation of the drill in the 3D co-ordinates.

CONCLUSIONS

A novel computer assisted surgical navigation system for pedicle screw insertion has been developed. The trajectory of insertion of the drill or pedicle screw is displayed in the 3D image co-ordinates, providing an aid to the surgeon for inserting the screw precisely. The system developed is cost effective and has the precision required for spine surgery. The instrumentation required is simple, so that handling the system is fairly easy.

REFERENCES

[1] J. F. Doherty and R. E. Van Dyck, "Moving object tracking in video", Proc. 29th Applied Imagery Pattern Recognition Workshop, 2000.
[2] Sohaib Khan, Omar Javed, Zeeshan Rasheed and Mubarak Shah, "Human Tracking in Multiple Cameras", Proc. ACM Conf. Multimedia, pp. 201-212, 1995.
[3] J. K. Aggarwal and Q. Cai, "Human Motion Analysis: A Review", Computer and Vision Research Center, Department of Electrical and Computer Engineering, The University of Texas at Austin, October 22, 1997.
[4] L. Mihaylova, A. Loza, S. G. Nikolov, J. J. Lewis, E.-F. Canga, J. Li, T. Dixon, C. N. Canagarajah and D. R. Bull, "The Influence of Multi-Sensor Video Fusion on Object Tracking Using a Particle Filter", Proc. of the International Conference on Information Fusion, 2005.
[5] Yuichi Tamura, Nobuhiko Sugano, Toshihiko Sasama, Yoshinobu Sato, Shinichi Tamura, Kazuo Yonenobu, Hideki Yoshikawa and Takahiro Ochi, "Surface-based registration accuracy of CT-based image guided spine surgery", Eur Spine J (2005) 14: 291-297.
[6] Robert W. Gaines, Jr., M.D., "The Use of Pedicle-Screw Internal Fixation for the Operative Treatment of Spinal Disorders", The Journal of Bone and Joint Surgery, Vol. 82-A, No. 10, October 2000.
[7] T. Laine, T. Lund, M. Ylikoski, J. Lohikoski and D. Schlenzka, "Accuracy of pedicle screw insertion with and without computer assistance: a randomised controlled clinical study in 100 consecutive patients", Eur Spine J (2000) 9: 235-240, Springer-Verlag 2000.
[8] Tessamma Thomas, Dinesh Kumar V. P., P. S. John, Antony Joseph Thoppil and James Chacko, "A Video Based Tracking System for Pedicle Screw Fixation", Proc. of 4th International Conference on Computer Science and its Applications (ICCSA-2006), San Diego, California, USA, June 2006.
[9] Lutz P. Nolte, M. A. Slomczykowski, Ulrich Berlemann, Matthias J. Strauss, Robert Hofstetter, Dietrich Schlenzka, Timo Laine and Teija Lund, "A new approach to computer aided spine surgery: fluoroscopy based surgical navigation", Eur Spine J (2000) 9: S78-S88, Springer-Verlag 2000.
[10] Moshe Shoham, Michael Burman, Eli Zehavi, Leo Joskowicz, Eduard Batkilin and Yigal Kunicher, "Bone-Mounted Miniature Robot for Surgical Procedures: Concept and Clinical Applications", IEEE Transactions on Robotics and Automation, Vol. 19, No. 5, October 2003.
[11] Pedram Azad, Tilo Gockel and Rüdiger Dillmann, Computer Vision Principles and Practice, 1st Edition, Elektor International Media BV, 2008.
[12] Rafael C. Gonzalez and Richard E. Woods, Digital Image Processing, second edition, Prentice Hall, 2004.
[13] Nobert Thomas Pallath and Tessamma Thomas, "Video Object Tracking and Analysis for Computer Assisted Surgery", The International Journal of Multimedia & Its Applications (IJMA), Vol. 4, No. 1, February 2012.
[14] Nobert Thomas Pallath and Tessamma Thomas, "Real Time Computer Assisted Surgical Navigation", IEEE International Conference on Data Science & Engineering, 2012.
