An Effective Analysis of Spatial Data using Nearness Queries




  1. Vasavi1, Swathi Agarwal2, G. Shailaja3,

    1 Department of CSE,

    2,3Department of IT, CVR College of Engineering-Hyderabad, India.

    Abstract: This article reviews the combination of geospatial data and geographic information systems (GIS) with nearness queries. It first lists common data mining functions and their applications on classical data. It then gives a brief overview of spatial databases and spatial statistics and of the approaches built on them. Finally, it uses examples to show how spatial query approaches differ from those used for classical data.

    Keywords: Spatial data mining, rule induction, spatial statistics, neighborhood.

    1. INTRODUCTION

      Spatial terminology is nowadays used in many areas, driven by the growing volume of maps and of data that record where and when geographical points occur. Alphanumeric attributes attached to spatial data add further terms that relate map locations to particular time frames. The relationships between instances or objects occurring at each time stamp therefore need to be analyzed.

      Typical applications of spatial analysis include geo-market analysis, studies of natural phenomena, risk analysis, and traffic analysis. In each case the properties of instances and objects are analyzed in order to relate them through their shared nearness. Traffic risk analysis [1], for example, estimates information on injuries, road paths, accident frequency, and routing, and then identifies high-risk regions and analyzes the corresponding geographic areas. For the most part, spatial data mining expresses relationships in terms of neighborhood.

      Conventional data analysis relies on statistics and multidimensional data analysis, neither of which takes spatial data into account [2]. Geographic data is distinctive in that observations close to one another in space tend to share the same or correlated values. This idea underlies spatial statistics, which, unlike traditional statistics, assumes that nearby areas are interdependent. Related work includes geostatistics and the exploratory analysis of attribute values and geographic objects by Openshaw [3]. Multidimensional data analysis methods have also been developed to support this kind of analysis [4,5]. Data-driven analysis depends on how spatial data is processed and on how the methods are implemented in operational analysis tools.

      This article describes data mining methods for geographic systems and their value in the spatial data mining process. It also gives a basic survey of database approaches alongside statistical approaches.

      The article is structured as follows. Section 2 defines spatial data mining and its tasks, and then surveys spatial methods drawn from databases, statistics, and artificial intelligence, noting where the database and statistical approaches are similar and where they differ. Section 3 concludes with a list of research issues.

    2. SPATIAL DATA MINING

      Spatial data mining is the extraction of knowledge, relationships, and properties that are not explicitly stored in a spatial database. A geographic database also holds spatio-temporal properties expressed in terms of neighborhood. Spatial analysis is therefore central to the analysis of relationships and temporal aspects, and it accounts for much of the computation.

      Conventional data mining methods [6] are poorly suited to spatial and temporal data because they support neither location nor the relationships between data objects. New methods that handle spatial relationships and spatial data therefore have to be implemented. Earlier mining methods do not include spatial analysis or the inference rules needed to carry it out. Existing methods can, however, be integrated and extended for use in spatial data mining. Used unchanged, conventional data mining methods handle only alphanumeric data properties.

        1. General framework of Spatial data mining

          Spatial data mining searches for hidden patterns in large databases and discovers interesting relationships. It characterizes data through relationship analysis over the huge volumes of data obtained from satellite images, medical images, and other captured images. Examining spatial data in detail is costly and unrealistic in any domain. The basic roles of spatial analysis are therefore to extract interesting patterns and characteristics, to capture relationships between spatial and non-spatial data sets, to lift the data to a higher level of representation, and to reorganize the spatial database for better performance.

          A spatial database typically stores large amounts of data such as maps, preprocessed images, medical images, and VLSI chip layouts. It holds multidimensional data carrying topological information, together with access methods that support reasoning and computation. Querying the relationships among objects in space and time, together with the non-spatial data, builds up knowledge that characterizes the data in detail. A spatio-temporal mining system is structured in three layers. The user interface layer handles input and output. The miner layer manages the data and algorithms and stores the mined knowledge. The data source layer holds the spatial database and other related databases, that is, the original data to be mined.

        2. Basic primitives of Data mining

          Rules: characteristic rules, discriminant rules, association rules, and deviation and evaluation rules can all be mined [7]. A spatial characteristic rule gives a general description of a set of objects, with the time frame taken as the basic unit. A typical application describes how prices vary with nearness across geographic regions. A spatial association rule describes the implication of one set of features by another set of features in a spatial database. A rule that ties price to nearness involves a spatial feature and can therefore be expressed as a spatial association rule.

          Thematic maps: a thematic map shows the distribution of a pattern. Each map partitions the area into a set of closed, disjoint regions whose members share the same features, and objects are positioned in relation to these regions and to the spatial rules of the map. Thematic maps are represented in two ways, raster and vector. In a raster image, each pixel is associated with attribute values, and pixel intensity or color encodes how close those values are. In a vector representation, each object is described by its boundary points and the corresponding attribute values.
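
          As an illustration of a spatial association rule that ties price to nearness, the following minimal sketch computes the support and confidence of a rule of the form "close to a beach implies expensive". The data, the `CLOSE_TO` and `EXPENSIVE` thresholds, and the function names are all hypothetical and are not taken from the article.

```python
from math import hypot

# Hypothetical toy data: (name, x, y, price) for houses, (x, y) for beaches.
houses = [
    ("h1", 0.0, 0.0, 900),
    ("h2", 1.0, 0.5, 850),
    ("h3", 5.0, 5.0, 300),
    ("h4", 0.5, 1.0, 400),
    ("h5", 6.0, 4.0, 350),
]
beaches = [(0.0, 1.0)]

CLOSE_TO = 2.0    # assumed distance threshold for the spatial predicate
EXPENSIVE = 800   # assumed price threshold for the non-spatial predicate


def close_to_beach(x, y):
    """Spatial predicate: true if any beach lies within CLOSE_TO units."""
    return any(hypot(x - bx, y - by) <= CLOSE_TO for bx, by in beaches)


def rule_stats(rows):
    """Support and confidence of the rule: close_to(beach) -> expensive."""
    antecedent = [r for r in rows if close_to_beach(r[1], r[2])]
    both = [r for r in antecedent if r[3] >= EXPENSIVE]
    support = len(both) / len(rows)
    confidence = len(both) / len(antecedent) if antecedent else 0.0
    return support, confidence


support, confidence = rule_stats(houses)
print(f"support={support:.2f} confidence={confidence:.2f}")
```

          The only spatial step is the distance test inside `close_to_beach`; once that predicate has been evaluated, support and confidence are counted exactly as for a classical association rule.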

        3. Steps in Spatial data mining

          Spatial data mining extends the basic data mining steps with spatial criteria, applied in combination. The aim is to summarize the data by finding classification rules, clusters of similar objects, and the associations and dependencies that characterize it. Each task has a counterpart in statistics and a counterpart in database and machine learning research. Summarization corresponds to global correlation, density analysis, and smoothing and contrast analysis on the statistical side, and to generalization and characteristic rules on the database side. Class identification corresponds to spatial classification and to decision trees. Clustering corresponds to point pattern analysis and to geometric clustering. Dependencies correspond to local autocorrelation and to association rules. Trends and deviations correspond to kriging and to trend rules.

          Spatial data summarization describes the data globally by extending statistical methods such as variance and factorial analysis to spatial structures. Statistical methods for contiguous values have been developed to measure neighborhood effects, local variance, and spatial correlation [8,9]. Spatial relations are represented by a contiguity matrix that records each relationship between objects. Different contiguity matrices correspond to different spatial relations, such as adjacency or a distance gap. Density and factorial analysis on non-spatial data do not take these relations into account. The input is therefore alphanumeric attribute data together with one or more contiguity matrices that characterize the data. A local value is computed by averaging the attribute values of contiguous instances and subtracting that average from each instance's own value.

          Dimensionality reduction is the main aim of factorial analysis, and each method pays particular attention to dependencies [5,10]. Factorial analysis is extended by applying principal component analysis to the original table after it has been transformed with smoothing techniques. Generalization reduces the detail of attributes and of geometric values; it derives from attribute-oriented induction, applied to spatial data in [11] and to non-spatial data in [12]. A thematic hierarchy for cultivation types, for example, can be listed as food (cereals (rice, wheat), vegetables, spices, fruits). Each hierarchy is represented as a tree structure [13], and generalization is combined with clustering as a thematic method. The complexity of each representation is roughly logarithmic in the number of objects being analyzed. Inference rules are then derived for each association and comparison.
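
          The role of the contiguity matrix in measuring spatial correlation can be sketched with Moran's I, a widely used autocorrelation statistic associated with Moran's work cited as [9]. The article does not state which statistic it has in mind, so the choice of Moran's I, the function name `morans_i`, and the four-region adjacency matrix `W` below are assumptions made for illustration.

```python
def morans_i(values, w):
    """Moran's I for attribute values and a contiguity matrix w.

    I = (n / S0) * sum_ij w[i][j] * d[i] * d[j] / sum_i d[i]**2,
    where d[i] is the deviation of values[i] from the mean and
    S0 is the sum of all entries of w.
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(sum(row) for row in w)
    num = sum(w[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n))
    den = sum(d * d for d in dev)
    return (n / s0) * (num / den)


# Hypothetical example: four regions in a row, each adjacent to its neighbors.
W = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

smooth = morans_i([1, 2, 3, 4], W)       # neighboring values are similar
alternating = morans_i([1, 4, 1, 4], W)  # neighboring values differ sharply
print(smooth, alternating)
```

          In this toy case the smooth gradient gives a positive value and the alternating pattern a negative one. Replacing `W` with a matrix built from a distance gap rather than adjacency changes the spatial relation being measured without changing the computation.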

          Characteristic rules are given for each object in the database as defined in space and time [14], and typically arise when a query retrieves nearness values from the contiguity relation. Considering the neighborhood together with the properties of each object is important for drawing characteristic conclusions. Classification provides a logical description of the best partitioning of the database, expressed as nodes and their attributes. Spatial predictors and dependencies between objects are classified statistically together with non-spatial properties. Classification also covers the analysis of remotely sensed or geographically located pixels of each object and its collection of instances.

          Geographic classification relates to spatial attributes; the algorithm is determined by merging adjacent locations of each object, extending predefined values. Clustering is automatic classification: it groups the data set into clusters according to a similarity index. In the database approach, the data is split into classes within a relational database as a form of automatic classification. The ordering or grouping of values indexed by class is performed in terms of the temporal attribute values. Data with similar nearness values is clustered, and a set of groups may be separated by attribute. Commonly used algorithms are CLARANS [15] and DBSCAN [16]. For spatial data, GDBSCAN [17] handles spatial shapes as well as point data with attributes on maps.
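
          To make density-based clustering concrete, the sketch below is a minimal brute-force version of DBSCAN for 2-D points. It is not the implementation from [16]: it uses no spatial index, so every neighborhood lookup scans all points, and the sample points and the values chosen for `eps` and `min_pts` are hypothetical.

```python
from math import dist

NOISE = -1


def region_query(points, i, eps):
    """Indices of all points within eps of points[i] (including i itself)."""
    return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Return one cluster label per point; NOISE marks outliers."""
    labels = [None] * len(points)
    cluster = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue  # already assigned to a cluster or marked as noise
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later be relabeled as a border point
            continue
        labels[i] = cluster  # i is a core point: start a new cluster
        seeds = [j for j in neighbors if j != i]
        while seeds:
            j = seeds.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: join, do not expand
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                seeds.extend(j_neighbors)  # j is also a core point
        cluster += 1
    return labels


# Hypothetical sample: two dense groups and one isolated point.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10),
          (50, 50)]
labels = dbscan(points, eps=1.5, min_pts=3)
print(labels)
```

          A production system would normally replace `region_query` with a lookup in a spatial index such as an R-tree, which is where the database side of spatial data mining contributes.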

          A nearness query finds the objects closest to a specific location, based on distance to the nearest point. The nearest-neighbor query takes a query point and returns the object nearest to it, with distance as the ranking criterion. Region intersection queries on object attributes and distance queries are also common, and both are stated precisely in terms of location values. Because spatial data is inherently graphical, graphical query languages display query results graphically as well as in tables. The interface supports operations such as choosing the area to view and choosing what to display on the basis of selection conditions. SQL relational databases can store the spatial information and allow queries that mix spatial and non-spatial conditions. Spatial values are modeled as abstract data types with predicates such as contains and overlaps.
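
          The query types described above can be sketched without a spatial database. The example below shows a nearest-neighbor query, a window (region) query, and an overlaps predicate on bounding boxes. The `sites` data and the function names are hypothetical, and the linear scans stand in for the index-backed operators that a spatial SQL system would provide.

```python
from math import dist

# Hypothetical point objects: name -> (x, y).
sites = {
    "school": (2.0, 3.0),
    "clinic": (5.0, 1.0),
    "market": (1.0, 1.0),
    "depot": (8.0, 7.0),
}


def nearest(query, objects, k=1):
    """Names of the k objects closest to the query point."""
    ranked = sorted(objects.items(), key=lambda item: dist(query, item[1]))
    return [name for name, _ in ranked[:k]]


def in_window(objects, xmin, ymin, xmax, ymax):
    """Names of objects inside an axis-aligned query window, sorted."""
    return sorted(
        name
        for name, (x, y) in objects.items()
        if xmin <= x <= xmax and ymin <= y <= ymax
    )


def overlaps(a, b):
    """True if bounding boxes a and b, each (xmin, ymin, xmax, ymax), intersect."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


print(nearest((1.5, 1.5), sites, k=2))
print(in_window(sites, 0, 0, 3, 4))
print(overlaps((0, 0, 2, 2), (1, 1, 3, 3)))
```

          A mixed query, such as "the nearest clinic that is open on Sundays", would filter `sites` on the non-spatial condition first and then apply `nearest` to the remainder.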

          Statistical approaches draw on pattern analysis [18,19] and on applied research into each attribute of geographic point values. For each database, a temporal sequence is derived on the basis of central place theory, which involves the maxima of particular attributes. The trajectories of moving objects, and their deviations from a center, are analyzed by comparing their properties against the distance to a metropolis. In geostatistics, spatio-temporal phenomena such as geological values are predicted for each specified location.

    3. CONCLUSION

      This article has listed various methods that combine spatial databases with data mining. They have been developed by both the statistics and the database research communities. How to summarize classified data with a methodology that incorporates both families of methods remains to be demonstrated. Open issues include data mining over temporal values, graphical methods for enhancing location values, and indexing the volumes of data involved. Neighborhood structures, instances, and their pre-computed index values are available within the database.

    4. REFERENCES

      1. Zeitouni, K.: Etude de l'application du data mining à l'analyse spatiale du risque d'accidents routiers par l'exploration des bases de données en accidentologie, Final report of the contract PRISM-INRETS, December 1998, 33 p.

      2. Diego García-Saiz, Marta Zorrilla and Jose Luis Bosque, 2017: A Clustering-based Knowledge Discovery Process for Data Center Infrastructure Management, The Journal of Supercomputing, Volume 73, Issue 1, January.

      3. Sanders, L.: L'analyse statistique des données en géographie, GIP Reclus, 1989.

      4. Lebart L. et al., "Statistique exploratoire multidimensionnelle" , Editions Dunod, Paris, 439 p., 1997.

      5. Lebart, L. (1984) Correspondence analysis of graph structure. Bulletin technique du CESIA, Paris:2, 1-2, pp 5-19.

      6. M. Hemalatha, N. Naga Saranya: A Recent Survey on Knowledge Discovery in Spatial Data Mining, IJCI International Journal of Computer Science, Vol 8, Issue 3, No. 2, May 2011.

      7. Geary R.C.: The contiguity ratio and statistical mapping, The Incorporated Statistician, 5 (3), pp 115-145.

      8. Samet H., "Design and Analysis of Spatial Data Structures: Hierarchical (quadtree and octree) data structures ", Addison-Wesley Edition, 1990.

      9. Moran P.A.P., The interpretation of statistical maps, Journal of the Royal Statistical Society, B: 10, pp 234-251.,1948.

      10. Benali, H., Escofier, B.: Analyse factorielle lissée et analyse factorielle des différences locales, Revue Statistique Appliquée, 1990, XXXVIII (2), pp 55-76.

      11. Lu, W., Han, J. and Ooi, B.: Discovery of General Knowledge in Large Spatial Databases, in Proc. of 1993 Far East Workshop on Geographic Information Systems (FEGIS'93), Singapore, June 1993, pp. 275-289.

      12. Han J., Cai Y. & Cercone N., "Knowledge Discovery in Databases: An Attribute-Oriented Approach." Proceedings of the 18th VLDB Conference. Vancouver, B.C., August 1992. pp. 547-559.

      13. Samet H., "Design and Analysis of Spatial Data Structures: Hierarchical (quadtree and octree) data structures ", Addison-Wesley Edition, 1990.

      14. Ester, M., Frommelt, A., Kriegel, H.-P., Sander J.: Algorithms for Characterization and Trend Detection in Spatial Databases, Proc. 4th Int. Conf. on Knowledge Discovery and Data Mining, New York, NY, 1998.

      15. Ng, R. and Han, J.: Efficient and Effective Clustering Method for Spatial Data Mining, in Proc. of 1994 Int'l Conf. on Very Large Data Bases (VLDB'94), Santiago, Chile, September 1994, pp. 144-155.

      16. Ester, M., Kriegel, H.-P., Sander, J., Xu, X.: Density-Connected Sets and their Application for Trend Detection in Spatial Databases, Proc. 3rd Int. Conf. on Knowledge Discovery and Data Mining, Newport Beach, CA, 1997, pp. 10-15.

      17. Knorr E. M., and Ng R. T.: Finding Aggregate Proximity Relationships and Commonalities in Spatial Data Mining, IEEE Transactions on Knowledge and Data Engineering, Vol 8(6), December 1996.

      18. Openshaw S., Charlton M., Wymer C., Craft A., 1987: "A mark 1 geographical analysis machine for the automated analysis of point data sets", International Journal of Geographical Information Systems, Vol. 1, n° 4, pp. 335-358.
