Different Approaches to Prove Cayley-Hamilton Theorem

DOI : 10.17577/IJERTCONV5IS03002

Download Full-Text PDF Cite this Publication

Text Only Version

Different Approaches to Prove Cayley-Hamilton Theorem

Dr. Paramjeet 1 , J K Narwal 2

Assistant Professor,

Ganga Institute of Technology and Management, Kablana, Jhajjar1

Abstract: This paper covers different approach to prove the Cayley Hamilton Theorem using different derivation of the determinant of a matrix in conjuction ith elementary graph theory. In this paper various alternative proofs have been discussed via Schurs Triangularization, a variation of topological proofs, combinatorial proof. The Topological proof uses continuity properties and matrix norms.

Keywords: Matrices, Partial Permutation, Characteristics Polynomial, Topology, Eigen values.


    Defintion 1.1. If A is an n×n matrix ,then the characteristic polynomial of A is defined to be PA(x)= det(xI-A). This is a polynomial in x of degree n with leading term xn . the constant term c0 of a polynomial q(x) is interpreted as c0 I in q(A).

    Theorem 1.2 (Cayley Hamilton Theorem). If A is an n×n matrix ,then pA(A)=0, the zero matrix.

    Theorem 1.3 If q0 is a quaternion of the form q-= a+bi+cj+dk with a,b,c,d, being real , then q2 – 2aq + (a2


    degree less than or equal to n-1. It folowws that there exist matrices D0, D1,Dn-1 with entries from C such that D(x) = Dn-1 xn-1++ D1 x +Do . Then the matrix equation follows

    det(xIn-A) In = (x In-A) adj(xIn-A) = (xIn-A)D(x)

    Substituting pA(x)= det(xIn-A), (and using the fact that scalars commute with matrix)

    XnIn+ bn-1xn-1In+.+ b1xIn+b0In

    = pA(x) In = det(xIn-A)In =(xIn-A)adj(xIn-A)


    =xnDn-1- xn-1ADn-1+ xn-1Dn-2-xn-2ADn-2++ xD0-ADo

    =xnDn-1+ xn-1 (-ADn-1+ Dn-2 ) +..+(-AD1+Do) -ADo

    Since two polynomials are equal if and only if their coefficients are equal , the coefficient matrices are equal ; that is , In=Dn-1, bn-1In= ( -ADn-1+Dn-2),,b1In=-

    q1 = q


    abicjdk a2+b2+c2+d2

    2a a+bi+cj+dk a2+b2+c2+d2 a2+b2+c2+d2

    (AD1+D0), and b0In=-AD0. This means that A may be substituted for the variable x in the equation (2.1) to






    (2a q) a2 + b2 + c2 + d2 = 2aq q2


    PA(A) = An+ bn-1 An-1++ b1A+b0In

    q2 2aq + (a2 + b2 + c2 + d2) = 0

    If one represents a quaternion q= a+bi+cj+dk as a matrix, A =[ + + ],


    PA(A) = A2 -2aA+(2 + 2 + 2 + 2) = 0, and the polynomial given in Theorem 1.3 is characteristic polynomial of A



    Theorem 2.1 (Cayley-Hamilton Theorem). For any n × n Matrix A, PA(A)=0.

    Proof. Let D(x) be the matrix with polynomial entries D(x)= adj(xIn-A), So D(x)(xI-A)= det(xIn-A)In. Since each entry in D(x) is the determinant of an (n-1) X( (n-1) submatrix of (xIn-A), each entry of D(x) is a polynomial of

    =AnDn-1+An-1(-ADn-1 +Dn-2)++A(-AD1+D0)-AD0

    =AnDn-1-AnDn-1+An-1Dn-2- An-1Dn-2+.AD0-AD0 =0

    This proves the theorem


    This proof synthesizes work of Issai Schur, According to this, If S1S2,,Sn are n × n upper triangular matrices such that (i,i) element of Si is zero for all I, then S1S2……..Sn=0

    Proof. (Induction of n)

    For n=1 , there is nothing to be proved since S1=0

    For n=2 , S = 0

    Proof. Fix > 0. It is sufficient to show that , given any

    1 [0 ] , S2=[0 0], where x,y,u,and v are scalars . It is clear that S1S2=0

    Assume now that the theorem is true for some integer m, then S1S2..SmSm+1= T Sm+1,

    matrix , there exists diagonalizable matrix B such that | |2<

    Where | |2 is the Frobenius norm given by

    | | = (

    2 1/2

    2 ,=1| | )

    Where, S1=[ 0 1], ., Sm=[ 0 ]

    And T1,Tm Mm are upper triangular matrices such that the ( i,i) element of Ti is zero.


    Given ,let A=UTU* where U is unitary and T is upper Triangular , possible Schurs Triangularization theorem. Define B=A+ UCU* , where


    T= S S ..S

    = [12 ]= 0 ] and

    Cfg= {


    = }

    1 2 m


    [ 0


    =[ ]

    0 0

    Choose >o , so that B will have distinct eigen values, hence B will be diagonalizable .Let () = {, =

    1,2, } and +1 . Define as

    Where 0m Mm, Am Mm is upper triangular , um and tm are vector columns of order m X 1, 0 is zero row of order 1 X

    ( )2 1 1

    m , and x is a scalar. The Proof is complete.


    { (

    )} < 1

    Main Theorem.( Cayley- Hamilton Theorem).

    If =

    Let pA(t) be the characteristic polynomial of A Mm. Then PA(A)=0

    + =




    + 2

    Proof. Since pA(t) is of degree n with leading coefficient 1 and the roots of pA(t) are precisely the eigen values

    1.., n of A, counting multiplicities , factor pA(t)


    ( )2 ( )2 1 1

    as PA(t) =(t- 1)(t- 2)(t- m)

    1 1 {


    )} >

    Using Schurs Theorem , write A as A= UTU*

    ( )

    ( ) > 1 1

    Where, T is upper triangular with i in the ith diagonal position , i =1,n. The theorem follows.






    = PA(UTU*)=( UTU*- 1I)( UTU*- 2I)….( UTU*- nI)

    =[U(T- 1I)U*][ U(T- 2I)U*] [U(T- nI)U*]

    =U[(T- 1I)(T- 2I) (T- nI)]U*=0

    + 2 > + 2

    The Diagonal entries of T+C are distinct from the choice of

    , on B has distinct eigen values and is diagonalizable. Then

    | |2=| (UCU |2

    =||2, because | |2 is unitarily invariant

    The last equality follows from theorem 2.1.



    ( )2


    because C is Diagonal


    This is most concise alternate proof to the Frobeniuss proof.




    = 2

    < since every < 1

    Theorem 3.1. The set Dn= {| } of diagonalizable matrices dense in Mn

    Example 3.1 If A is diagonalizable , then pA(A)= 0

    Proof. A=PDP-1, where P is invertible.

    D=[1 ]


    ||= 0

    ||= 1

    ||= 2

    ||= 3





    a11a22 x a11a33 x a22a33 x

    -a12a21 x

    -a13a31 x

    -a23a32 x

    -a11a22 a33 a11a23 a32 a22a13 a31 a33a12 a21

    -a12a23 a31

    -a13a32 a21

    And . ,

    PA(A) = (A- 1I)( A- 2I)..( A- nI)

    = P(D- 1I)( D- 2I)..( D- nI)P-1

    = P.0.P-1 = 0

    Main Theorem.(Cayley Hamilton).If PA(t) is the characteristics polynomial of A then PA(A) =0

    Proof. Let Pn be the space of polynomials of degree n or less, with the Zariski topology. From Example 3.1 , the Cayley-Hamilton theorem is proved for all diagonalizable matrices. The mapping : Mn X PN Mn given by (A,f(x))=f(A) is continuous , and the mapping : n Mn X PN given by () =( A, pA(x)) is continuous . Hence the composition

    : MnMn,

    Which is gven by (A)= pA(A) is continuous. This mapping is identically zero on a dense subject of Mn, so by continuity vanishes everywhere.



    3.1.1. Partial permutation

    A partial permutation of {1,,n}is a bijection of a subset of {1,,n} onto itself. The domain of is denoted by dom .The cardinality of dom is called the degree of and is denoted by||.

    A complete permutation whose domain is {1,.,n}. If is a partial permutaion of {1,…,n} , then the completion of , denoted , is the complete permutation of {1,…,n} defined by


    Table 1.1: Terms of pA(x) of M3

    Note that here the coeffiecient of x is the sum of signed terms whose indices come from the 6 partial permutations of {1,2,3} of order 2:

    a11a22 corresponds to 1=(1)(2), a12a21 corresponds to 4=(1,2), etc.

    It will be shown in general that each partial permutation of

    {1,,n} of order q yields a signed term , which is one of he summands in the coefficient of xn-q. Graph theory will help one visualize the partial permutations involved. Let

    be vertices and the ordered pairs ( i, (i)) ,where

    , be directed edges. The bijective properties of

    mandate that this graph is a graph of adjoined cycles.

    This transition from combinatorics to graph theory allows one to use an elegant set an example to prove the Cayley Hamilton Theorem. To use this transiton pA(x) must be described in some detail.

        1. Positive and negative parts of the charactristics polynomial

          Let A = (aij) Mn overC. As defined , the characteristics polynomial of A is :

          () = det( ) () ,()

          Defintion 3.1.1. () = {

          { 1, , }\ }


          Definition 3.1.2. The signature of a complete permutation

          denoted sgn(), is +1if the total number of inversion in is even and -1 if that number is odd.

          Definition 3.1.3. The signature of a complete permutation

          , denoted sgn(), is defined by sgn()=(1)||().

          Where, b = { }

          ij =

          and where the summation is overall complete permutations

          of {1,,n}. It follows that the coefficient of xn-q in pA(x)


          The characteristics polynomial , pA(x), of a matrix is the sum o certain products of elements of that matrix and poers




          of x . It is shonbelow that the pairs of indices (i,j) appearing in one of these products an be described using partial permutations. There is a relation between the elements of a given product. Namely , their subscripts are ordered pairs ( i, (i)) where is a partial permutation of

          {1,n} and . For example , if 3, the terms of pA(x) are:

          To see this, notice that from Equation 3.1 , the coefficient

          of xn-q comes from the terms bi,n(i)= bii= x-aii for n-q of the indices, and bi,n(i)= -ai,n(i) for the other q indices. Such n is completion of a partial permutation of order q. The q terms corresponding to the ai,n(i) are called aij corresponding to . For example, if 3 and q=2, then x(n-q)= x1, and permutes 2 different elements of {1,2,3} so the coefficients of x1 are various products of to terms. This analysis gives a precise algorithm for finding the coefficients of a specific variable in the polynomial pA(x) for n.

          1. Note the variables degree n-q

          2. Find all partial permutations such that ||=q

    3. Create the ordered pairs (i,j)= (, ()),

    1. Multiply all ai,(i) corresponding to a particular

      and attach the approriate sign.

    2. The sum of these signed products is the coefficient of xn-q

    Referencing Table 3.1 , if n=3, q=2 , and is a partial permutation of order 2, then ||=2 and creates two ordered pairs or two aijs . Thus , each term in the column represents the signed product of the two aijs whichare the domain of each of the 6 s .With the coefficients ofeach variable of pA(x) properly defined , pA(x) is the sum of these terms. These coefficients are either positive or negative and the following definitions are created.

    Definition 3.1.3. pA(x)=+() (), Where

    Definition 3.1.4. +() =



    A very common application of the Cayley- Hamilton Theorem is to use it to find An usually for the large powers of n. However many of the techniques involved require the use of the eigen values of A.


    1. William A. Adkins and Mark G. Davidson, The Cayley Hamilton and Frobenious theorems via the Laplace Transformation, Linear Algebra and its Applications 371(2003), 147-152.

    2. Arthur Cayley, A memoir on the theory of Matrices, available from http:// www.jstor.org, 1857.

    3. Jeffrey A. Rosoff, A topological Proof of the Cayley- Hamilton Theorem, Missouri J. Math .Sc. 7(1995), 63-67.

    4. Wikipedia , Arthur Cayley, Available from http:// en.wikipedia .org, 2004

    5. Wikipedia, William Rowan Hamilton, Available from http:// en. wikipedia .org, 2005

    6. D.R. Wilkins, Linear Operators and the Cayley Hamilton Theorem

    , available from http://www.maths.tod.i.e/pub/histMath/people/Hamilton, 2005


    ( ||



    Definition 3.1.5. () =


    ( ||



    Note that if 3 , the variables of degree 1 qnd 0 have a combination of +() () terms.

    With the desciption of pA(x), the cayley Hamilton Theorem may be rewritten as follows:

    Main Theorem =+() = ()

    With the help of some basic graph theory definitions this theorem can be proved.


This theorem is one of the most powerful. It is also known classical matrix theory theorem. Various applications derive their results from C-H theorem. To understand the scope of this theorem , alternate proofs were used. Each proof helped to understand how intertwined areas of mathematics are with respect to matrices and the characteristics polynomial. Straubings combinatorial proof of the C H exploits three aspects of pA(A). First, it elegantly explain the relationship between positvi and negative terms of pA(A). Second , the proof illuminates the importance of n. It is cornerstone for the entire proof. PA(A) is a sum of n products of elements of A. Third the proof introduces a cyclic property to pA(A) are partial permuatins and therefore may be represented by adjoint cycles.

Leave a Reply