Guidance for Electricity Consumption Datasets

DOI : 10.17577/IJERTCONV6IS03027

Mrs. Sangeetha S #1, Mr. Baskar K *2, Mrs. Vijayarani N *3

#1 Assistant Professor, Department of Information Technology

*2 Assistant Professor, Department of Computer Science and Engineering

*3 Assistant Professor, Department of Computer Science and Engineering

#1 Kongunadu College of Engineering and Technology, Thottiam, Tamil Nadu, India

*2 Kongunadu College of Engineering and Technology, Thottiam, Tamil Nadu, India

*3 Trinity College for Women, Namakkal, Tamil Nadu, India

Abstract: Government agencies and large multinational companies across the world are focusing on energy conservation and efficient energy use. Using energy efficiently is especially important for developing countries such as India and China. The emergence of smart grid meters has given access to huge amounts of energy consumption data. This paper presents a Business Intelligence tool that uses Apache Hadoop to handle the existing problems efficiently. With this tool, energy distribution companies can reduce their investment by running Hadoop on commodity hardware. Distributed computing tools also reduce processing time significantly, which enables real-time monitoring and decision making.

Keywords: Smart Grid Meters, Energy Consumption Data, Apache Hadoop, Decision Making

  1. INTRODUCTION

    Countries around the world have set aggressive goals for restructuring monopolistic power systems into liberalized markets, especially on the demand side. In a competitive retail market, load serving entities (LSEs) will emerge in great numbers. A better understanding of electricity consumption patterns and personalized power management are effective ways to enhance the competitiveness of LSEs.

    Meanwhile, smart grids have been revolutionizing electricity generation and consumption through a two-way flow of power and information. As an important information source on the demand side, advanced metering infrastructure (AMI) has gained increasing popularity worldwide. AMI allows LSEs to obtain electricity consumption data at high frequency, e.g., at intervals of minutes to hours. Large volumes of electricity consumption data reveal information about customers that LSEs can potentially use to manage their generation and demand resources efficiently and to provide personalized service.

    Fig 1.1: Electricity dataset analysis

  2. RELATED WORK

    The existing system used MySQL as the backend, which has several drawbacks. It limits the amount of data that can be handled when processing large datasets, and data that is lost cannot be recovered. The existing system also takes more processing time, and its maintenance cost is very high. For these reasons, the proposed system uses Hadoop.

    • Data Analytic Module With Pig

    • Data Analytic Module With Map Reduce

    3.1.1 DATA PREPROCESSING MODULE

    In this module we create the dataset for electricity consumption. It contains a set of tables holding customer details, billing details, and payment details for the last four years. This data is first loaded into a MySQL database, and the analysis in this project is carried out on this dataset.

    Fig 2.1: MySQL database analysis

  3. PROPOSED SYSTEM

    The proposed system stores the data in Hadoop. Hadoop removes the data-size limitation of the existing system, because more machines can simply be added to the cluster. Results are obtained in less time and with high throughput, and the maintenance cost is very low. The system uses join, partitioning, and bucketing techniques in Hadoop.

    Fig 3.1: Hadoop database model

      1. MODULES

        • Data Preprocessing Module

        • Data Migration Module With Sqoop

        • Data Analytic Module With Hive

        Fig 3.1.1: Preprocessing weather dataset

        1. DATA MIGRATION MODULE WITH SQOOP

          At this stage the dataset is ready, and the aim of this module is to transfer it into Hadoop (HDFS). Sqoop is a command-line interface application for transferring data between relational databases and Hadoop. In this module the dataset is fetched into HDFS using the Sqoop tool. Sqoop supports many functions, such as fetching only particular columns or fetching only the rows that satisfy a specific condition. The selected data is then stored in HDFS.

          Fig 3.1.2: Processing dataset with Sqoop
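          The column and condition filtering described for this module can be sketched as a small Python helper that assembles a `sqoop import` command line. This is only an illustrative sketch: the JDBC URL, user name, table name, column names, and target directory are hypothetical and do not come from the paper. Only the standard Sqoop import options `--connect`, `--username`, `-P`, `--table`, `--target-dir`, `--num-mappers`, `--columns`, and `--where` are used.

```python
import shlex


def build_sqoop_import(jdbc_url, username, table, target_dir,
                       columns=None, where=None, num_mappers=1):
    """Return the argument list for a `sqoop import` invocation."""
    args = [
        "sqoop", "import",
        "--connect", jdbc_url,
        "--username", username,
        "-P",                          # prompt for the password at run time
        "--table", table,
        "--target-dir", target_dir,    # destination directory in HDFS
        "--num-mappers", str(num_mappers),
    ]
    if columns:
        # Fetch only particular columns.
        args += ["--columns", ",".join(columns)]
    if where:
        # Fetch only rows that satisfy a specific condition.
        args += ["--where", where]
    return args


# All values below are hypothetical placeholders.
cmd = build_sqoop_import(
    jdbc_url="jdbc:mysql://localhost/electricity",
    username="etl_user",
    table="billing",
    target_dir="/user/hadoop/billing",
    columns=["customer_id", "units", "amount"],
    where="bill_year >= 2014",
)
print(shlex.join(cmd))  # shell-quoted command, ready to paste into a terminal
```

          Building the command as a list keeps the condition string intact as a single argument, so it can also be passed directly to `subprocess.run` on a machine where Sqoop is installed.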

        2. DATA ANALYTIC MODULE WITH HIVE

          Hive is a data warehouse system for Hadoop. It runs SQL-like queries written in HQL (Hive Query Language), which are internally converted to MapReduce jobs. Hive was developed by Facebook. Hive supports Data Definition Language (DDL), Data Manipulation Language (DML), and user-defined functions. In this module the dataset stored in HDFS is analyzed with Hive using HQL. Hive is used for table creation, joins, partitioning, and bucketing. Hive handles only structured data.

          Fig 3.1.3: Processing dataset with Hive
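          To make the partitioning and bucketing idea concrete, the following Python sketch mimics how rows could be laid out when a table is partitioned by year and bucketed by customer. It is not Hive code and does not use Hive's actual hash function. The column names, the bucket count of 4, and the simple modulo hash are assumptions chosen for illustration.

```python
from collections import defaultdict

NUM_BUCKETS = 4  # assumed bucket count, not taken from the paper


def bucket_for(customer_id, num_buckets=NUM_BUCKETS):
    """Simplified stand-in for a bucketing hash: key modulo bucket count."""
    return customer_id % num_buckets


def layout(rows, num_buckets=NUM_BUCKETS):
    """Group rows by (partition, bucket), as a partitioned and bucketed table would."""
    placed = defaultdict(list)
    for row in rows:
        key = (row["bill_year"], bucket_for(row["customer_id"], num_buckets))
        placed[key].append(row)
    return dict(placed)


def scan_year(placed, year):
    """Read only the partitions for one year, skipping all other years."""
    return [row for (part, _bucket), rows in placed.items()
            if part == year for row in rows]


# Hypothetical billing rows.
rows = [
    {"customer_id": 101, "bill_year": 2016, "units": 320},
    {"customer_id": 102, "bill_year": 2016, "units": 210},
    {"customer_id": 101, "bill_year": 2017, "units": 305},
    {"customer_id": 105, "bill_year": 2017, "units": 150},
]

placed = layout(rows)
units_2017 = sum(row["units"] for row in scan_year(placed, 2017))
print(sorted(placed), units_2017)
```

          Partitioning lets a query for one year skip the data of all other years, while bucketing places rows with the same customer key in the same bucket, which helps joins on that key.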

        3. DATA ANALYTIC MODULE WITH PIG

          Apache Pig is a high-level data flow platform for executing MapReduce programs on Hadoop. The language for Pig is Pig Latin. Pig handles both structured and unstructured data. It also runs on top of MapReduce, which executes in the background. In this module the dataset is analyzed with Pig using Pig Latin scripts, a data flow language in which operators, functions, and joins are applied to the data.

          Fig 3.1.4: Processing dataset with Pig

        4. DATA ANALYTIC MODULE WITH MAPREDUCE

    In this module, MapReduce is used as a processing technique and programming model for distributed computing based on Java. The MapReduce algorithm contains two important tasks, namely Map and Reduce. The module analyzes the dataset with MapReduce jobs written as Java programs.

    Fig 3.1.5: Processing dataset with MapReduce

  4. ALGORITHM

    The MapReduce paradigm is generally based on sending the computation to where the data resides.

    A MapReduce program executes in three stages, namely the map stage, the shuffle stage, and the reduce stage.

    Map stage: The mapper's job is to process the input data. Generally the input data is in the form of a file or directory and is stored in the Hadoop Distributed File System (HDFS). The input file is passed to the mapper function line by line. The mapper processes the data and creates several small chunks of data.

    Reduce stage: This stage is the combination of the shuffle stage and the reduce stage. The reducer's job is to process the data that comes from the mapper. After processing, it produces a new set of output, which is stored in HDFS.
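    The three stages can be illustrated with a small in-memory Python sketch that totals the units consumed per customer. This is only a single-machine illustration of the data flow, not the Hadoop Java API used in the project. The record format `customer_id,month,units` and the sample values are assumptions.

```python
from collections import defaultdict


def mapper(line):
    """Map stage: turn one input line into (key, value) pairs."""
    customer_id, _month, units = line.strip().split(",")
    yield customer_id, float(units)


def shuffle(pairs):
    """Shuffle stage: group all values that share the same key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reducer(key, values):
    """Reduce stage: combine the grouped values into one result per key."""
    return key, sum(values)


def run_job(lines):
    """Run map, shuffle, and reduce in sequence on a list of input lines."""
    mapped = (pair for line in lines for pair in mapper(line))
    grouped = shuffle(mapped)
    return dict(reducer(key, values) for key, values in sorted(grouped.items()))


# Hypothetical meter readings: customer_id,month,units
lines = [
    "C001,2017-01,120.5",
    "C002,2017-01,80.0",
    "C001,2017-02,99.5",
]

totals = run_job(lines)
print(totals)  # {'C001': 220.0, 'C002': 80.0}
```

    On a real cluster the mapper calls run in parallel on the nodes that hold the input blocks, and the framework performs the shuffle across the network before the reducers run.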

  5. USE CASE DIAGRAM

    Fig 5.1: Use case model

    The user first has to log in. After logging in, the user reaches a page that presents electricity consumption and billing data. The user can obtain details of geographic data, time, weather and climate earth level, chart type, and statistical function.

  6. HDFS

    The Hadoop Distributed File System (HDFS) was developed using a distributed file system design. It runs on commodity hardware. Unlike other distributed systems, HDFS is highly fault tolerant and is designed to use low-cost hardware. HDFS holds very large amounts of data and provides easier access. To store such huge data, the files are stored across multiple machines. These files are stored in a redundant fashion to protect the system from possible data loss in case of failure. HDFS also makes applications available for parallel processing.
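    As a rough worked example of this redundant storage, the Python sketch below estimates how many blocks a file occupies and how much raw disk space its replicas consume. The 128 MB block size and replication factor of 3 are assumed here as typical HDFS defaults. The paper does not state the cluster's configuration, and the 1000 MB file size is hypothetical.

```python
import math


def hdfs_footprint(file_size_mb, block_size_mb=128, replication=3):
    """Estimate the block count, replica count, and raw storage for one file."""
    blocks = math.ceil(file_size_mb / block_size_mb)
    block_replicas = blocks * replication
    # The last block only occupies its actual size, so raw usage is size * replication.
    raw_storage_mb = file_size_mb * replication
    return blocks, block_replicas, raw_storage_mb


blocks, block_replicas, raw_storage_mb = hdfs_footprint(1000)
print(blocks, block_replicas, raw_storage_mb)  # 8 24 3000
```

    With these assumed settings, each block is held on three machines, so the file remains readable as long as at least one copy of every block survives.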

  7. RESULT

7.1 DATA PREPROCESSING MODULE

The data preprocessing module holds the energy consumption dataset, which collects customer details, billing details, and payment details. The collected datasets are accessed for preprocessing together with the datasets of cities. In this module, data preprocessing is applied to the gathered datasets.

Fig 7.1.1: Data processing module

  7.2 TO RETRIEVE THE PARTICULAR COLUMNS

    The gathered datasets are retrieved from the Hadoop Distributed File System (HDFS). The data is retrieved according to the user's needs. Large amounts of data are accessed to retrieve the weather data.

    Fig 7.2.1: Retrieval of data

    These data are framed and output in tabular form. The Hadoop framework tools make it possible to handle the large set of weather data.

  2. FUTURE ENHANCEMENTS

    By using Spark, results could be obtained up to a hundred times faster than with Hadoop MapReduce. The secret is that Spark runs in-memory on the cluster and is not tied to Hadoop's two-stage MapReduce paradigm. This makes repeated access to the same data much faster. Spark can run standalone or on top of Hadoop YARN, where it can read data directly from HDFS.

    7.3 TABLE CREATION

    The table is created from the input data processed from the database. The output is retrieved in tabular form from the dataset created in the database system.

    Fig 7.3.1: Table creation

    Fig 7.3.2: Retrieved dataset

    The table has to be created first, and then the data is inserted. After the data is inserted, it is processed and then retrieved.

  3. CONCLUSION

To reach the 2050 energy efficiency and renewable energy targets, and for future smart grids, effective use of smart metering technology is crucial. Rational energy use is a must for a large group of companies, municipalities, and public organizations because of the growing importance of energy costs and environmental issues. Hence they need proper information about their consumption and its distribution between different activities. Smart meter data analytics can give them a total picture of their energy use, the potential for savings, and the associated costs, enabling effective energy management. Smart meters send energy consumption data at small intervals, which generates big data. Time and storage are two important factors that strongly affect the building of any application. Hadoop is a solution for handling such big data.

