 Open Access
 Total Downloads : 102
 Authors : Dr. F. A. Mazarbhuiya, Dr. Yusuf Perwej
 Paper ID : IJERTV4IS100576
 Volume & Issue : Volume 04, Issue 10 (October 2015)
 DOI : http://dx.doi.org/10.17577/IJERTV4IS100576
 Published (First Online): 29102015
 ISSN (Online) : 22780181
 Publisher Name : IJERT
 License: This work is licensed under a Creative Commons Attribution 4.0 International License
The Mining Hourly Fuzzy Patterns from Temporal Datasets
Dr. F. A. Mazarbhuiya
Assistant Professor, Department of Computer Science & IT
Al Baha University, Al Baha, Kingdom of Saudi Arabia (KSA)
Dr. Yusuf Perwej
Assistant Professor, Department of Computer Science &Engg.
Al Baha University, Al Baha, Kingdom of Saudi Arabia (KSA)
Abstract: The retrace patterns from supermarket datasets is an interesting data mining issue. The hourly fuzzy pattern is an example of such patterns where the pattern holds in some fuzzy time interval in each hour. The issue involves first frequent set mining and then deduces association rules from the frequent sets. We call it hourly fuzzy patterns. In the beginning works were mainly concentrated about a user specified fuzzy pattern mining. In as much as, in some applications,the user may not have previous knowledge about the datasets under opinion, so may not be able to specify fuzzy time interval. Identical may happen due to the boundary of natural language. In this paper, we are intending a method of finding such patterns. The nicety about the method is that the fuzzy time intervals are generated by the method itself. The impact of our method is exhibited with experimental results.
Keywords: Temporal Patterns, Temporal Association Rules, Superimposed Intervals, Fuzzy Set, Right Reference Functions, Left Reference Functions, Membership Functions.

INTRODUCTION
The analysis of temporal data has been pondered as an important data mining problem. An example of such is super market dataset mining. In real world supermarket data if a transaction happens its time also recorded. In [1], the authors have addressed such problems in details. In their work, the attribute associated with the dataset has been pondered like any other attribute. The taking into account time attribute aside, pattern can be extracted which are timedependent and cannot be extracted by [1]. Ale et al [2], first studied such problems in detail. They proposed a method of discovering association rules that hold within the lifespan an itemset. In this spot lifespan of an itemset is the time period between first transactions, including the itemset to the last transaction inclusive the equal in the dataset. The lifespan of an itemset is not necessary equal as that of dataset.
The idea of local Patterns has been proposed by Anjana et al [3], which are patterns that are frequently in certain time intervals and may or may not be frequent throughout the lifespan of the itemset. They proposed an algorithm which can extract locally frequent itemsets along with a list of sequences of time intervals where each locally frequent itemsets is frequent in a sequential time intervals. The considering the timestamp in the form year_month_day
i.e. time hierarchy a method is discussed in [4] which can extract yearly, monthly, periodic or partially periodic
patterns. In [4] a method called set superimposition is used to keep the intervals of the frequency with overlapping. The superimposed intervals turn out to be fuzzy intervals associated with each locally frequent itemsets. Considering time stamp year_month_ day_hou r_minute_second, methods were proposed in [6, 7, 8], for finding yearly, monthly, and daily fuzzy frequent itemsets. Unless if only minute_second are pondered in timehierarchy, then there may exist some patterns which are hourly fuzzy frequent. In this paper, we have discussed hourly fuzzy patterns and devised an algorithm for finding such patterns. Talk about this algorithm similar to that in [6, 7, 8]. The paper is organized as follows. In section2, we discuss related works. In section3, we discuss terms, definitions and notations used in the algorithm. In section4, the algorithm is discussed. In section5, we discuss about results and analysis. Finally a summary and lines for future works are discussed in section6.

RELATED WORKS
The association rules discovery problem was first proposed by Agrawal el al [1]. In their work, if I, is the set of items and D, a large collection of transactions involving the items, the question is to find kinship among the presence of various items in the transactions. The temporal data mining
[9] is an important extension of conventional data mining. If time aspect is taken into account, then more interesting patterns that are time dependent can be extracted. The process of association rule discovery is also extended to incorporate temporal aspects. The corresponding problems are to find valid time periods during which association rules hold and the discovery of possible periodicities that association rules have. This author proposed an algorithm for the discovery of temporal association rules [2]. For each item (or itemset),a lifespan is defined which is the time gap between the first occurrence and the last occurrence of the item in the transaction in the dataset. Next itemsets are calculated only during its lifespan of itemset. After that each rule has associated with it a time frame. In [3], the works done in [2] have been extended by considering time gap between two consecutive transactions containing an item set into account. The acceptance the periodicity of patterns into rumination, Ozden [10] proposed a method, which is able to find userspecified periodic patterns. In [11], the authors discussed about amethod of discovering temporal association rules with respect to fuzzy match, i.e. association rule holding during enough number of intervals given by the corresponding calendar pattern.
In common with works were done in [12] incorporating multiple granularities of time intervals (e.g. first working day of every month) from which both cyclic and user defined calendar patterns can be cognizable. Mining fuzzy patterns from datasets have been studied by many researchers. In [13], the authors presented a method for mining fuzzy temporal patterns from a given process instance. In common with work is done in [14]. Next the method of extracting fuzzy periodic association
[15] rules is discussed. In [6], authors have presented a method of finding yearly fuzzy patterns. In [7], the works were done in mining monthly fuzzy patterns. In [8], methods were discussed for finding daily fuzzy patterns. 
TERMS, DEFINITIONS AND NOTATIONS
Let us scrutiny some definitions and notations used in this paper. Let E be the universe of discourse. A fuzzy set A in E is characterized by a membership function A(x) lying in [0, 1].
A(x) for xE represents the grade of membership of x in A. Thus a fuzzy set A is defined as
A={(x, A(x)), xE }
A fuzzy set A is said to be normal if A(x) =1 for at least one xE. An cut of a fuzzy set is an ordinary set of elements with membership grade greater than or equal to a threshold
, 0 1. Thus an cut A of a fuzzy set A is characterized by A={xE; A(x)} [see e.g. [16]]
A fuzzy set is said to be convex if all its cuts are convex sets.
A fuzzy number is a convex normalized fuzzy set A
defined on the real line R such that

there exists an x0R such that A(x0) =1, and

A(x) is piecewise continuous.
Thus a fuzzy number can be thought of as containing the real numbers within some interval to varying degrees.
Fuzzy intervals are special fuzzy numbers satisfying the following.

there exists an interval [a, b]R such that
A(x )=1for all x [a, b], and
The core of a fuzzy set A within a universal set E is the crisp set that contains all the elements of E having membership grades 1 in A.
<>Set Superimposition
In [17] an operation called superimposition (S) was proposed. If A is superimposed over B or B is superimposed over A, we have
A (S) B = (AB) (+) (AB)(2) (+)(BA) . (1)
Where (AB)(2) are the elements of (AB) represented twice, and (+) represents union of disjoint sets.
To explain this, an example has been taken.
If A= [a1, b1] and B= [a2, b2] are two real intervals such that AB, we would get a superimposed portion. It can be seen from (1)
[a1, b1](S)[a2, b2]= [a(1),a(2)) (+) [a(2),b(1)](2) (+) (b(1),b(2)](2)
Where a(1)=min(a1, a2) a(2)=max(a1, a2) b(1)=min(b1, b2), and b(2)=max(b1, b2)
(2) Explains why if two line segments are superimposed,
the common portion peeps doubly dark [5]. The identity (2) is called thefundamental identity of superimposition of intervals.
Let now, [a1, b1](1/2) and [a2, b2](1/2) be two fuzzy sets with constant membership value Â½ everywhere (i.e. equifuzzy intervals with membership value Â½). If [a1, b1] [a2, b2]
then applying (2) on the two equifuzzy intervals we
can write
[a1,b1](1/2)(S)[a2,b2](1/2)=[a(1),a(2))(1/2)(+)[a(2),b(1)](1)(+)(b(1),b(2)](1/2) (3)
Let [xi,yi], i=1,2,,n, be n real intervals such that
n
xi , yi . Generalizing (3) we get
i1
[x1, y1](1/n)(S)[x2, y2](1/n)(S)…(S) [xn,yn](1/n)=[x(1), x(2))(1/n)(+)[x(2), x(3))(2/n) (+) … (+) [x(r), x(r+1))(r/n)(+) … (+) [x(n),
y(1)](1)(+)(y(1), y(2)]((n1)/n)(+) … (+) (y(nr),y(n
r+1)](r/n)(+)…(+)(y(n2),y(n1)](2/n)(+)(y(n1),y(n)](1/n) (4)
0 0

A(x)is piecewise continuous.
A fuzzy interval can be thought of as a fuzzy number with a flat region. A fuzzy interval A is denoted by A = [a, b, c, d] with a < b < c < d where A(a) = A(d) = 0 and A(x) = 1 for all x[b, c]. A(x) for all x[a, b] is known as left reference function and A(x) for x [c, d] is known as the right reference function. The left reference function is non decreasing and the right reference function is non increasing
The support of a fuzzy set A within a universal set E is the crisp set that contains all the elements of E that have non zero membership grades in A and is notified by S(A). Thus
S(A)={ xE; A(x) 0}
In (4), the sequence {x(i)} is formed by sorting the sequence
{xi} in ascending order of magnitude for i=1,2,n and similarly {y(i)} is formed by sorting the sequence {yi} in ascending order. In as much as the set superimposition is operated on the closed intervals, it can be extended to operate on the open and the halfopen intervals in the trivial way.
LEMMA 1. (THE GLIVENKOCANTELLI LEMMAOF ORDER STATISTICS)
Let X = (X1, X2, ,Xn) and Y = (Y1, Y2,,Yn) be two random vectors, and (x1, x2,,xn) and (y1, y2, ,yn) be two distinctive realizations of X and Y respectively. Assume that the subfields induced by Xk, k = 1, 2, , n are
identical and independent. Accordingly assume that the sub fields induced by Yk, k = 1, 2, , n are also identical and independent. Let x(1), x(2), , x(n) be the values of x1, x2, , xn, and y(1), y(2), , y(n) be the values of y1, y2, , yn arranged in ascending order.
For X and Y if the experiential probability distribution functions 1(x) and 2(y) are defined as in (5) and (6) respectively. Then, the GlivenkoCantelli Lemma of order statistics states that the mathematical expectation of the empirical probability distributions would be given by the respective theoretical probability distributions.
0 x<x(1)
1(x) = (r1)/n x(r1) x x(r)
(5)
1 x x(n)
0 y<y(1)
2 (y) = (r1)/n y(r1) y y(r)
(6)
1 yy(n)
Now, let Xk is random in the interval [a, b] and Yk is random in the interval [b, c] so that P1(a, x) and P2(b, y) are the probability distribution functions followed by Xkand Yk respectively. Then in this case GlivenkoCantelli Lemma gives


ALGORITHM
In our proposed algorithm the timestamps stored in the transactions of temporal data are the time hierarchy of the type second_minute_hour_day_month_year, then we do not contemplate year, month, day, and hour in time hierarchy and only contemplate minute and second. Now we extract frequent itemsets using method discussed in [3]. Each frequent itemset will have a sequence of time intervals of the type [min_second, min_second] associated with it where it is frequent [18]. Using the sequence of time intervals, we can discover the set of superimposed intervals [Definition of superimposed intervals is given in section3] and each superimposed intervals will be a fuzzy intervals. The algorithm is alike that of [6, 7, 8]. The method is as follows for a frequent itemset the set of superimposed intervals is initially empty, algorithm visits each interval associated with the frequent itemset sequentially, if an interval is intersecting with the core of any existing superimposed intervals [Definition of core is given in section3]in the set it will be superimposed on it and membership values will be adjusted otherwise a new superimposed intervals will be started with the this interval. This procedure will be continued till the end of the sequence of time intervals. The proposed procedure will be repeated for all the frequent itemsets as well as possible each frequent itemsets will have one or more superimposed time interval. Additionally the superimposed time intervals are used to generate fuzzy intervals, each frequent itemset will be associated with one or more fuzzy time intervals where it is frequent. In addition to this each superimposed intervals is represented in a compact manner discussed in section3.
For representing each superimposed interval of the form
[t(1) , t(2) ]1/ n[t(2) , t(3) ]2 / n[t(3) , t(4) ]3/ n …….[t(r ) , t(r1) ]r / n ……….E[1(x)] P1(a, x), a x b, and
n1 2
E[ ( y)] P (b, y), b y c
….
(7)
[t(n) ,t'(1) ]1[t'(1) ,t'(2) ] n ………[t'(n2) ,t'(n1) ]n [t'(n1) ,t'(n) ]1/ n2 1
It can be take a notion that in equation (4) the membership values of [x(r), x(r+1)](r/n), r = 1, 2, , n1 look like experiential probability distribution function P1(x) and the
In this algorithm, we lay two arrays of real numbers, one for storing the values t(1), t(2), t(3),.t(n) and the other for
,
membership values of [y(nr), y(nr+1)](r/n) r=1,2,.,n1 look
storing the values
t '(1) ,
t '( 2)
,. t '(n) each of which is a
like the values of experiential complementary probability distribution function or experiential survival function [1 P2(y)].
Therefore, if A(x) is the membership function of an LR fuzzy number A=[a, b, c]. We get from (4)
sorted array. At the moment if a new interval [t, t ' ] is to be superimposed on this interval we add t to the first array by finding its position (using binary search) in the first array
so that it remnant sorted. In like manner t ' is added to the
A(x) P1 (a, x),
a x b
(8)
second array.
Data structure used for representing a superimposed
1 P2 (b, x), b x c
It remains to be seen P1(x) can indeed be the DuboisPrade left reference function and (1 – P2(x)) can be the Dubois Prade right reference function [19]. Baruah [17] has shown that if a possibility distribution is viewed in this way, two probability laws can, indeed, give rise to a possibility law.
interval is
structsuperinterval
{ intarsize, count; short *l, *r;
}
Here arsize represents the maximum size of the array used, count represents the number of intervals superimposed, and l and r is two pointers pointing to the two associated arrays.
Algorithm 4.1
for each locally frequent item set s do
{L sequence of time intervals associated with s Ls set of superimposed intevals initially set to null
lt = L.get();
// lt is now pointing to the first interval in L Ls. append(lt);
While ((lt = L.get()) != null)
{flag = 0;
While ((lst = Ls.get()) != null) if(compsuperimp(lt, lst))
flag = 1;
if (flag == 0) Ls.append(lt);
}
}
compsuperimp(lt, lst)
{ if( intersect(lst, lt)!= null)
{ superimp(lt, lst); return 1;
}
The correspondingly function compsuperimp(lt, lst) first computes the intersection of lt with the core of lst. If the intersection nonempty it superimposes lt by calling the function superimp(lt, lst) which virtually the carries on the superimposition process by updating the two lists associated as described earlier. The function return 1 if lt has been superimposed on the lst else returns 0. Obtain and tag on are functions operating on lists to get a pointer to the next element in a list and to tag on an element into a list.

RESULTS PROCURED
In this paper experimental result purpose, we have used a synthetic dataset T10I4D100K, available from FIMI1 website. A summarized view of the dataset is presented in table 1.The dataset mentioned in table 1and some procured results are presented in table 2.
http://fimi.cs.helsinki.fi/data/
return 0;
}
Table 1.T10I4D100K dataset characteristics
Table 2. The daily fuzzy frequent itemsets for different set of transactions for itemset {5}
No. fuzzy time intervals
150000
100000
50000
0
50000 4 4 4 4 4 4 3 3 3 2
The daily fuzzy frequent itemsets for different set of transactions for itemset {5}
Figure 1. The daily fuzzy frequent itemsets for different set of transactions for itemset {5}
So far as dataset is nontemporal, we incorporate temporal features into it. To this spot we keep the lifespan of our datasets as 1year. Firstly, we take only 10,000 transactions and found that the itemset {5} has a superimposed intervals superimposed on one place and hence it has one fuzzy time interval where it is frequent. For 20,000 and 30,000 transactions the same itemset has two superimposed intervals and so two fuzzy intervals, in the end from 60,000100,000 transactions, we get {5} is frequent in four fuzzy time intervals figure 1 shown.

CONCLUSION AND LINE FOR FUTURE WORK
In this paper, we are discussing methods for discovering hourly fuzzy patterns. The method receive as input a list of time intervals associated with a frequent itemset generated using a method discussed [4]. In spite of, our work we do not contemplate the year, month, day and hour at the time hierarchy and only contemplate minute, second. Finally, every frequent itemset will be associated with a sequence of time intervals of the form [minute_second, minute_second] where it is frequent. The algorithm sees each other interval in the sequence one by one and stores the intervals in the superimposed form. Next every frequent itemset is associated with one or more superimposed time intervals. Each superimposed interval gives fuzzy time intervals. In this case we will have each frequent itemset is associated with one or more fuzzy time intervals. The nicety about the method is that the algorithm is fewer user reliant on i.e. fuzzy time intervals are citation by algorithm automatically. Future work may be possible in the following ways.

A further type of fuzzy patterns, namely weekly, quarterly, half yearly patterns can be extracted.

Clustering of patterns can be done based on their fuzzy time interval associated with yearly patterns using some statistical measure.
REFERENCES

R. Agrawal, T. Imielinski, and A. N. Swami; Mining association rules between sets of items in large databases, In Proc. of 1993 ACM SIGMOD Intl Conf on Management of Data, Vol. 22(2) of SIGMOD Records, ACM Press, (1993), pp 207216.

J. M. Ale, and G. H. Rossi; An Approach to Discovering Temporal Association Rules, In Proc. of 2000 ACM symposium on Applied Computing (2000).

A. K. Mahanta, F. A. Mazarbhuiya,and H. K. Baruah; Finding Locally and Periodically Frequent Sets and Periodic Association Rules, In Proc. of 1st Intl Conf. on Pattern Recognition and Machine Intelligence, LNCS 3776 (2005), pp. 576582.

A. K. Mahanta, F. A. Mazarbhuiya, and H. K. Baruah (2008). Finding Calendarbased Periodic Patterns, Pattern Recognition Letters, Vol. 29(9), Elsevier publication, USA, pp. 12741284.

H. K. Baruah (2010); The RandomnessFuzziness consistency principle, International Journal of Energy, Information and Communications,Vol 1(1), Nov 2010, Japan.

F. A. Mazarbhuiya (2014); Discovering Yearly Fuzzy Patterns, International Journal of Computer Science and Information security (IJCSIS) Vol. 12, No. 9, September 2014.

M. Shenify and F. A. Mazarbhuiya (2015); Discovering Monthly Fuzzy Patterns, International Journal of Intelligence Science (IJIS), 3743, USA.

F. A. Mazarbhuiya (2015); Extracting daily fuzzy patterns, International Journal of Computer Science and Information security (IJCSIS) Vol. 13, No. 10, October 2015 (communicated).

C. M. Antunes, and A. L. Oliviera; Temporal Data Mining an overview, Workshop on Temporal Data Mining7th ACM SIGKDD Intl Conf. on Knowledge Discovery and Data Mining, (2001).

B. Ozden, S. Ramaswamy, and A. Silberschatz; Cyclic Association Rules, In Proc. of the 14th Intl Conf. on Data Engineering, USA (1998), pp. 412421.

Y. Li, P. Ning, X. S. Wang, and S. Jajodia; Discovering Calendar based Temporal Association Rules, Elsevier Science, (2001).

G. Zimbrado, J. Moreira de Souza, V. Teixeira de Almeida, and W. Arauja de Silva; An Algorithm to Discover Calendarbased Temporal Association Rules with Items Lifespan Restriction, In Proc. of the 8th ACM SIGKDD 2002.

R.B.V. Subramanyam, A. Goswami, Bhanu Prasad; Mining fuzzy temporal patterns from process instances with weighted temporal graphsInt. J. of Data Analysis Techniques and Strategies, 2008 Vol.1, No.1, pp.60 77.

S. Jain,S. Jain, and A. Jain; An assessment of Fuzzy Temporal Rule Mining, International Journal of Application or Innovation in Engineering and Management (IJAIEM), Vol. 2, 1, January 2013, pp. 4245.

WanJu Lee, JungYi Jiang and ShieJue Lee; Mining fuzzy periodic association rules, Data & Knowledge Engineering, Vol. 65, Issue 3, June 2008, pp. 442462.

Klir, J. and Yuan, B.; Fuzzy Sets and Logic Theory and Application, Prentice Hill Pvt. Ltd. (2002).

H. K. Baruah; Set Superimposition and its application to the Theory of Fuzzy Sets, Journal of Assam Science Society, Vol. 10 No. 1 and 2, (1999), pp. 2531.

F A Mazarbhuiya, Yusuf Perwej, An Efficient Method for Generating Local Association Rules, International Journal of Applied Information Systems 9(2):15, June 2015. Published by Foundation of Computer Science, New York, USA, DOI: 10.5120/ijais15451368

D. Dubois and H. Prade; Ranking fuzzy numbers in the setting of possibility theory, Inf. Sc.30, (1983), pp. 183224.