Survey on Knowledge Observation With Spatiality Data Mining

Dr. S. Anitha Reddy; Dr. P. Avinash

doi:10.17577/IJERTV3IS090068

Volume 03, Issue 09 (September 2014)


Open Access
Authors : Dr. S. Anitha Reddy, Dr. P. Avinash
Paper ID : IJERTV3IS090068
Volume & Issue : Volume 03, Issue 09 (September 2014)
Published (First Online): 04-09-2014
ISSN (Online) : 2278-0181
Publisher Name : IJERT
License: This work is licensed under a Creative Commons Attribution 4.0 International License


Dr. S. Anitha Reddy
Sridevi Women's Engineering College, Hyderabad

Dr. P. Avinash
Sridevi Women's Engineering College, Hyderabad

Abstract: Huge amounts of spatial data are being collected in applications such as remote sensing, information systems, computer cartography, geographical information systems (GIS), and environmental assessment and planning. The real challenge is to discover interesting, previously unknown information in these large databases; this is the objective of spatial data mining. Doing so requires extending the scope of data mining from relational and transactional databases to spatial databases and applying it to the study of spatial distribution. This paper summarizes the work done so far in spatial data mining, from spatial data generalization and the mining of spatial association rules to spatial data clustering.

Key Words: Data authentication, Data generalization, KDD, Spatial Data Type, Spatial Data Mining, Raster Map, Vector Map

INTRODUCTION

Spatial data mining of field-specific databases is an interdisciplinary research area that focuses on knowledge discovery from spatial databases in both heterogeneous and homogeneous formats. This study first reviews the different spatial data mining techniques and then looks for a technique suited to the spatial distribution of a database, so that knowledge and related information can be mined from it. Rapidly growing data volumes create the need to discover knowledge and information from data, which has led to a promising emerging field called data mining or knowledge discovery in databases (KDD). Shekhar and Chawla (2003) [1] describe spatial data mining as the process of discovering previously unknown but potentially useful patterns from spatial databases. Data mining integrates several fields, including machine learning, database systems, statistics, and information theory. Many studies of data mining in relational and transactional databases are available [2, 3, 4, 5], and there is high demand to apply the concept in other application areas such as spatial, temporal, multimedia, and object-oriented databases. Section 2 discusses methods for discovering interesting knowledge from spatial data and the research gaps among them, Section 3 discusses one application area of spatial data mining, and Section 4 discusses future directions of the research.
SPATIAL DATA MINING

Spatial data are data related to objects that occupy space. They contain topological and/or distance information and are often organized by spatial indexing structures and accessed by spatial access methods. The objects stored in a spatial database are spatial objects, represented by spatial data types, with implicit relationships among them. These implicit relationships and the distinct features of spatial databases pose challenges and bring opportunities for mining information from spatial data [6].

Knowledge discovery from spatial databases refers to the extraction of implicit knowledge, spatial relations, or other patterns not explicitly stored in the database [7].

Work in statistics [8, 9, 10, 11], machine learning [12, 13, 14], and database systems [15, 16] laid the foundation of knowledge discovery from databases. Subsequently, for spatial databases, work in computational geometry [5], spatial data structures [17, 18, 19], and spatial reasoning [20, 21] paved the way for the study of spatial data mining.

Statistical spatial analysis [9, 11] has been the most common approach to analyzing spatial data. It handles very efficiently the numerical data that come from realistic models of spatial phenomena. However, its assumption of statistical independence among spatially distributed data causes problems, because many spatial data are in fact interrelated: spatial objects are influenced by their neighboring objects. The statistical approach also cannot model nonlinear rules well, and statistical methods do not work well with incomplete or inconclusive data. Another problem with statistical spatial analysis is the expensive computation of results. To supplement this work, machine learning techniques [12, 14] and the capabilities of spatial databases [22, 23] have been put to good use. Soft computing can now be used to model nonlinear rules from spatial and non-spatial data.
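
One way to see why the independence assumption is a problem is to measure how strongly neighboring values resemble each other. The sketch below computes Moran's I, a standard spatial autocorrelation statistic that the survey itself does not name; it is offered only as an illustration. The function names `morans_i` and `chain_weights`, the toy values, and the neighbor weights are all assumptions made for this example.

```python
def morans_i(values, weights):
    """Moran's I, where weights[i][j] is the spatial weight between sites i and j."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    total_weight = sum(sum(row) for row in weights)
    # Weighted sum of products of deviations over all pairs of sites.
    cross = sum(weights[i][j] * dev[i] * dev[j]
                for i in range(n) for j in range(n))
    variance_sum = sum(d * d for d in dev)
    return (n / total_weight) * (cross / variance_sum)


def chain_weights(n):
    """Hypothetical weights: sites on a line, each adjacent to its direct neighbors."""
    return [[1 if abs(i - j) == 1 else 0 for j in range(n)] for i in range(n)]


smooth = [1, 2, 3, 4, 5]        # neighboring sites have similar values
alternating = [1, -1, 1, -1]    # neighboring sites have opposite values

i_smooth = morans_i(smooth, chain_weights(len(smooth)))
i_alt = morans_i(alternating, chain_weights(len(alternating)))
print(i_smooth, i_alt)
```

The smoothly varying series gives a positive value and the alternating series a negative one; a value near zero is what the independence assumption would lead one to expect.
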
Aggregate proximity is a measure of the closeness of the set of points in a cluster to a feature, as opposed to the distance between a cluster boundary and the boundary of a feature. For a cluster, it is more interesting to know why the cluster is there, so a more suitable question is what characterizes the cluster in terms of the features that are close to it. For example, a statement such as "85% of the houses in a cluster are close to feature F (e.g., infected by the infectious disease cholera)" is more informative and interesting than a statement such as "one house is close to feature F".
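
The kind of aggregate statement described above can be sketched in a few lines. This is a minimal illustration, not the algorithm of the cited work: it assumes the feature is a single point, that "close" means within a fixed Euclidean distance, and the names `share_close_to_feature`, `houses`, and `well` are hypothetical.

```python
from math import hypot


def share_close_to_feature(points, feature, max_dist):
    """Fraction of cluster points lying within max_dist of a point feature."""
    if not points:
        raise ValueError("cluster has no points")
    fx, fy = feature
    close = sum(1 for x, y in points if hypot(x - fx, y - fy) <= max_dist)
    return close / len(points)


# Hypothetical cluster of five houses and one point feature.
houses = [(0, 0), (1, 0), (0, 1), (1, 1), (5, 5)]
well = (0.5, 0.5)

share = share_close_to_feature(houses, well, max_dist=1.0)
print(f"{share:.0%} of the houses in the cluster are close to the feature")
```

Reporting the share for the whole cluster, rather than the distance of one house or of the cluster boundary, is what makes the statement an aggregate one.
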
SPATIAL EPIDEMIOLOGY: AN APPLICATION AREA

Elliott and Wartenberg [37] describe spatial epidemiology as the description and analysis of geographic, or spatial, variations in disease with respect to demographic, environmental, behavioral, socioeconomic, genetic, and infectious risk factors. The spread of infectious disease is closely associated with spatial and spatio-temporal proximity, as individuals who are linked in a spatial and temporal sense are at high risk of getting infected [38]. Proximity to environmental risk factors is therefore important. Knowledge of the spatial and temporal variations of a disease, and a characterization of its spatial structure, is thus essential for the epidemiologist to better understand the population's interactions with its environment [39].
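
As a rough illustration of what "linked in a spatial and temporal sense" can mean in practice, the sketch below lists pairs of cases that are close in both space and time. The thresholds, the case records, and the function name `linked_pairs` are assumptions made for this example and do not come from the cited sources.

```python
from itertools import combinations
from math import hypot


def linked_pairs(cases, max_dist, max_days):
    """Return pairs of case ids that are close in both space and time.

    cases: list of (case_id, x, y, day_of_onset) tuples.
    """
    pairs = []
    for (id_a, xa, ya, day_a), (id_b, xb, yb, day_b) in combinations(cases, 2):
        near_in_space = hypot(xa - xb, ya - yb) <= max_dist
        near_in_time = abs(day_a - day_b) <= max_days
        if near_in_space and near_in_time:
            pairs.append((id_a, id_b))
    return pairs


# Hypothetical case records: id, x (km), y (km), day of onset.
cases = [
    ("a", 0.0, 0.0, 1),
    ("b", 0.5, 0.0, 3),
    ("c", 0.4, 0.3, 40),   # near a and b in space, but much later
    ("d", 9.0, 9.0, 2),    # same week as a and b, but far away
]

pairs = linked_pairs(cases, max_dist=1.0, max_days=7)
print(pairs)
```

Only cases "a" and "b" satisfy both conditions; "c" fails on time and "d" fails on distance.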

Spatial epidemiological analysis comprises a wide range of methods, and determining which one to use is a big challenge [38]. Figure 5 is a diagrammatic representation of a spatial analysis framework taken from Pfeiffer [38], adapted from Bailey and Gatrell. In this diagram, Pfeiffer identified the following four groups of activities in the framework:
Modeling introduces the concept of cause-effect relationships, using both spatial and non-spatial data sources to explain or predict spatial patterns [38].
FUTURE DIRECTION

Data mining is a young field of study that started in the late 1980s, and spatial data mining is even younger. Traditional data mining researchers have extended their work to spatial data mining. Many spatial data mining methods assume an extended relational model for spatial databases. Some future directions of spatial data mining are listed below.

Data Mining in Spatial Object-Oriented Databases: Many researchers have pointed out that object-oriented (OO) databases may be a better choice for handling spatial data than traditional relational or extended relational models [32, 33].

Mining Under Uncertainty: Evidential reasoning [34] can be explored in the mining process for databases where uncertainty must be modeled. Bell, Anand, and Shapcott [35] have explained that evidential theory can model uncertainty better than traditional probabilistic models such as Bayesian methods. The fuzzy sets approach has been applied to spatial reasoning [20, 36] and can be extended to spatial data mining.
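
To make the idea of evidential reasoning more concrete, the sketch below combines two bodies of evidence with Dempster's rule of combination. It assumes that the evidence theory referred to here is the Dempster-Shafer framework; the frame of discernment, the mass values, and the function name `combine` are hypothetical and chosen only for illustration.

```python
from itertools import product


def combine(m1, m2):
    """Dempster's rule of combination for two mass functions.

    Each mass function maps a frozenset of hypotheses to a mass,
    and the masses of each function sum to 1.
    """
    combined = {}
    conflict = 0.0
    for (a, mass_a), (b, mass_b) in product(m1.items(), m2.items()):
        common = a & b
        if common:
            combined[common] = combined.get(common, 0.0) + mass_a * mass_b
        else:
            # Mass assigned to contradictory hypotheses is discarded.
            conflict += mass_a * mass_b
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict")
    # Renormalize so the remaining masses sum to 1.
    return {h: m / (1.0 - conflict) for h, m in combined.items()}


CHOLERA = frozenset({"cholera"})
OTHER = frozenset({"other"})
UNKNOWN = CHOLERA | OTHER   # mass here expresses ignorance, not a 50/50 split

source_1 = {CHOLERA: 0.6, UNKNOWN: 0.4}
source_2 = {CHOLERA: 0.5, OTHER: 0.3, UNKNOWN: 0.2}

fused = combine(source_1, source_2)
for hypothesis, mass in fused.items():
    print(sorted(hypothesis), round(mass, 3))
```

Unlike a single probability distribution, a mass function can leave part of the belief uncommitted (the `UNKNOWN` set), which is one way such a framework represents uncertainty in the data.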

Mining Spatial Data Deviations and Evolution Rules: This is a more challenging and applied line of work in spatial data mining. It relates to spatio-temporal databases and studies data deviation and evolution rules. For example, we can find spatial characteristic evolution rules, which summarize the general characteristics of changing data. During mining we can discover regions where the growth rate of a particular epidemic exceeds the country's average growth rate. Similarly, one can compare the areas where a certain epidemic increased last year with the areas where it decreased.

These rules may be used by governments and policy makers in formulating policies and plans to curb the problem.
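
The two comparisons described above can be sketched as follows. The case counts and region names are invented for illustration, and the country's average growth rate is assumed here to be the growth of the national total; other definitions (for example, the mean of regional rates) are equally possible.

```python
def growth_rate(previous, current):
    """Relative change from previous to current."""
    return (current - previous) / previous


def regions_above_average(cases_by_region):
    """cases_by_region maps region -> (cases_last_year, cases_this_year)."""
    total_prev = sum(prev for prev, _ in cases_by_region.values())
    total_curr = sum(curr for _, curr in cases_by_region.values())
    national = growth_rate(total_prev, total_curr)
    rates = {r: growth_rate(p, c) for r, (p, c) in cases_by_region.items()}
    above = sorted(r for r, g in rates.items() if g > national)
    return national, rates, above


def split_by_trend(rates):
    """Separate regions where cases increased from those where they decreased."""
    increased = sorted(r for r, g in rates.items() if g > 0)
    decreased = sorted(r for r, g in rates.items() if g < 0)
    return increased, decreased


# Hypothetical yearly case counts per region.
cases_by_region = {
    "North": (100, 150),
    "South": (200, 180),
    "East": (50, 52),
    "West": (150, 168),
}

national, rates, above = regions_above_average(cases_by_region)
increased, decreased = split_by_trend(rates)
print("above national growth:", above)
print("increased:", increased, "decreased:", decreased)
```

In a spatio-temporal database the same comparison would be run over stored regions and time slices rather than a hand-written dictionary.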

Multidimensional Data Analysis and Rule Visualization: Discovering rules from multidimensional (spatial and non-spatial) data sources is a challenge for researchers. Multidimensional data analysis and visualization have been studied [42], but multidimensional rule visualization is still an immature area.
CONCLUSION

We have explained that spatial data mining is a promising field of research with wide application in GIS and in medical and environmental data analysis. We surveyed the existing methods of spatial data mining and presented their strengths and weaknesses. We have outlined one application area, the spatial data mining of epidemiological databases, which is of great importance to society and policy makers, and we hope to produce novel and useful results from our further exploration of this field.
BIBLIOGRAPHY

[1] G. Shaw and D. Wheeler. Statistical Techniques in Geographical Analysis. David Fulton, London, 1994.
[2] R. Agrawal and R. Srikant. Fast Algorithms for Mining Association Rules. In Proc. 1994 Int. Conf. VLDB, pp. 487-499, Santiago, Chile, Sept. 1994.
[3] U. M. Fayyad, G. Piatetsky-Shapiro, P. Smyth, and R. Uthurusamy, editors. Advances in Knowledge Discovery and Data Mining. AAAI/MIT Press, Menlo Park, CA, 1996.
[4] J. Han, Y. Cai, and N. Cercone. Data-Driven Discovery of Quantitative Rules in Relational Databases. IEEE Trans. Knowledge and Data Eng., 5:29-40, 1993.
[5] G. Piatetsky-Shapiro and W. J. Frawley, editors. Knowledge Discovery in Databases. AAAI/MIT Press, Menlo Park, CA, 1991.
[6] W. Lu, J. Han, and B. C. Ooi. Discovery of General Knowledge in Large Spatial Databases. In Proc. Far East Workshop on Geographic Information Systems, pp. 275-289, Singapore, June 1993.
[7] K. Koperski and J. Han. Discovery of Spatial Association Rules in Geographic Information Databases. In Proc. 4th Int'l Symp. on Large Spatial Databases (SSD'95), pp. 47-66, Portland, Maine, August 1995.
[8] D. K. Y. Chiu, A. K. C. Wong, and B. Cheung. A Statistical Technique for Extracting Classificatory Knowledge from Databases. In Piatetsky-Shapiro and Frawley [5], pp. 125-141.
[9] S. Fotheringham and P. Rogerson. Spatial Analysis and GIS. Taylor and Francis, 1994.
[10] L. Kaufman and P. J. Rousseeuw. Finding Groups in Data: An Introduction to Cluster Analysis. John Wiley & Sons, 1990.
[11] S. Shekhar and S. Chawla. Spatial Databases: A Tour. Prentice Hall (ISBN 0-7484-0064-6), 2003.
[12] D. Fisher. Improving Inference through Conceptual Clustering. In Proc. 1987 AAAI Conf., pp. 461-465, Seattle, Washington, July 1987.
[13] R. S. Michalski, J. G. Carbonell, and T. M. Mitchell, editors. Machine Learning: An Artificial Intelligence Approach. Morgan Kaufmann, Los Altos, CA, 1983.
[14] T. M. Mitchell. Generalization as Search. Artificial Intelligence, 18:203-226, 1982.
[15] M. Stonebraker. Readings in Database Systems. Morgan Kaufmann, 1988.
[16] M. Stonebraker. Readings in Database Systems, 2nd ed. Morgan Kaufmann, 1993.
[17] R. H. Güting. An Introduction to Spatial Database Systems. VLDB Journal, 3(4):357-400, October 1994.
[18] A. Guttman. R-Trees: A Dynamic Index Structure for Spatial Searching. In Proc. ACM SIGMOD Int. Conf. on Management of Data, pp. 47-57, Boston, MA, 1984.
[19] H. Samet. The Design and Analysis of Spatial Data Structures. Addison-Wesley, 1990.
[20] S. Dutta. Qualitative Spatial Reasoning: A Semi-Quantitative Approach Using Fuzzy Logic. In Proc. 1st Symp. SSD'89, pp. 345-364, Santa Barbara, CA, July 1989.
[21] M. J. Egenhofer. Reasoning about Binary Topological Relations. In Proc. 2nd Symp. SSD'91, pp. 143-160, Zurich, Switzerland, August 1991.
[22] W. G. Aref and H. Samet. Extending a DBMS with Spatial Operations. In Proc. 2nd Symp. SSD'91, pp. 299-318, Zurich, Switzerland, August 1991.
[23] W. G. Aref and H. Samet. Optimization Strategies for Spatial Query Processing. In Proc. 17th Int. Conf. VLDB, pp. 81-90, Barcelona, Spain, Sept. 1991.
[24] J. Han and Y. Fu. Exploration of the Power of Attribute-Oriented Induction in Data Mining. In Fayyad et al. [3].
[25] M. Holsheimer and M. Kersten. Architectural Support for Data Mining. CWI Technical Report CS-R9429, Amsterdam, The Netherlands, 1994.
[26] C. J. Matheus, P. K. Chan, and G. Piatetsky-Shapiro. Systems for Knowledge Discovery in Databases. IEEE Trans. Knowledge and Data Engineering, 5:903-913, 1993.
[27] M. Ester, H. P. Kriegel, and X. Xu. Knowledge Discovery in Large Spatial Databases: Focusing Techniques for Efficient Class Identification. In Proc. 4th Int. Symp. on Large Spatial Databases (SSD'95), pp. 67-82, Portland, Maine, August 1995.
[28] R. Ng and J. Han. Efficient and Effective Clustering Method for Spatial Data Mining. In Proc. 1994 Int. Conf. Very Large Databases, pp. 144-155, Santiago, Chile, September 1994.
[29] T. Zhang, R. Ramakrishnan, and M. Livny. BIRCH: An Efficient Data Clustering Method for Very Large Databases. In Proc. 1996 ACM-SIGMOD Int. Conf. Management of Data, Montreal, Canada, June 1996.
[30] R. Agrawal, T. Imielinski, and A. Swami. Mining Association Rules Between Sets of Items in Large Databases. In Proc. 1993 ACM-SIGMOD Int. Conf. Management of Data, pp. 207-216, Washington, D.C., May 1993.
[31] E. Knorr and R. T. Ng. Applying Computational Geometry Concepts to Discovering Spatial Aggregate Proximity Relationships. Technical Report, University of British Columbia, 1995.
[32] L. Mohan and R. L. Kashyap. An Object-Oriented Knowledge Representation for Spatial Information. IEEE Transactions on Software Engineering, 5:675-681, May 1988.
[33] J. Han, S. Nishio, and H. Kawano. Knowledge Discovery in Object-Oriented and Active Databases. In F. Fuchi and T. Yokoi (eds.), Knowledge Building and Knowledge Sharing, Ohmsha/IOS Press, pp. 221-230, 1994.
[34] J. Guan and D. Bell. Evidence Theory and its Applications, vol. 1. North-Holland, 1991.
[35] D. A. Bell, S. S. Anand, and C. M. Shapcott. Database Mining in Spatial Databases. International Workshop on Spatio-Temporal Databases, 1994.
[36] S. Dutta. Topological Constraints: A Representational Framework for Approximate Spatial and Temporal Reasoning. In Proc. 2nd Symp. SSD'91, pp. 161-182, Zurich, Switzerland, August 1991.
[37] P. Elliott and D. Wartenberg. Spatial Epidemiology: Current Approaches and Future Challenges. Environmental Health Perspectives, 112(9):998, 2004.
[38] D. Pfeiffer. Spatial Analysis in Epidemiology. Oxford University Press, GB, 2008.
[39] F. B. Osei. Spatial Statistics of Epidemic Data: The Case of Cholera Epidemiology in Ghana. PhD thesis, 2010.
[40] T. C. Bailey and A. C. Gatrell. Interactive Spatial Data Analysis. Longman Scientific & Technical, Essex, 1995.
[41] A. Maroko, J. A. Maantay, and K. Grady. Using Geovisualization and Geospatial Analysis to Explore Respiratory Disease and Environmental Health Justice in New York City. Geospatial Analysis of Environmental Health, pages 39-66, 2011.
[42] D. Keim, H. P. Kriegel, and T. Seidl. Supporting Data Mining of Large Databases by Visual Feedback Queries. In Proc. 10th Int. Conf. on Data Engineering, pp. 302-313, Houston, TX, Feb. 1994.
