Nonintrusive Load Monitoring: A Temporal Multilabel Classification Approach


TRANSACTION ON INDUSTRIAL INFORMATICS, SPECIAL SECTION ON NEW TRENDS IN INTELLIGENT ENERGY SYSTEMS

Non Intrusive Load Monitoring: A Temporal Multi-Label Classification Approach

Kaustav Basu, Vincent Debusschere, Member, IEEE, Seddik Bacha, Member, IEEE, Ujjwal Maulik, Senior Member, IEEE, and Sanghamitra Bondyopadhyay, Senior Member, IEEE

(Special Section on New Trends in Intelligent Energy Systems)

Abstract—This article tackles the identification of electrical appliances inside residential buildings. Each appliance can be identified from the aggregate power readings at the meter panel. The possibility of applying a temporal multi-label classification approach (a non-event based method) in the domain of non-intrusive load monitoring is explored. A novel set of meta-features is proposed. The method is tested at sampling rates based on the capabilities of current smart meters. The proposed approach is validated on a dataset of energy readings recorded over one year in 100 houses containing different sets of appliances (water heater, washing machines, etc.). This method is applicable to the demand side management of households within the current limitations of smart meters, from the inhabitants' or from the grid operator's point of view.

Index Terms—non-intrusive load monitoring, smart meter, disaggregating algorithms, appliances identification, machine learning, smart grid, energy management, multi-label classification.

I. INTRODUCTION

Load management lets customers adjust their energy consumption according to an expected level of comfort, energy price variations and sometimes environmental impacts (for example CO2 equivalent emissions) [1]. Demand side management strategies need an accurate evaluation of the energy that can be controlled. Therefore, identifying the usage of every appliance is one of the core issues in the field of smart building energy management. From the smart grid point of view, receiving information on the usage of appliances (especially deferrable loads) helps to manage the energy distribution [2], [3], especially for the integration of more fluctuating energy sources, i.e. renewables. The energy management depends on the appliances: some can be postponed (washing machine, etc.) and some cannot (television set, etc.). In this field, there already exist strategies such as demand response to reduce peak demand by reducing the use of electricity, or by shifting it to non-peak times [4]. The proper use of these techniques can depend on time of use pricing, then on energy price variations and ultimately on consumer acceptance [5].

K. Basu, V. Debusschere and S. Bacha are with the Grenoble Alpes University - G2ELab, France. e-mail: [email protected]
U. Maulik is with the Department of Computer Science and Engineering, Jadavpur University, Kolkata, India.
S. Bondyopadhyay is with the Indian Statistical Institute, Kolkata, India.
Manuscript sent January 31, 2014. Copyright (c) 2009 IEEE. Personal use of this material is permitted. However, permission to use this material for any other purposes must be obtained from the IEEE by sending a request to [email protected]

Load identification can also play an important part in predicting the future usage of particular appliances, provided that the collection of historical data is made as unintrusive as possible [6], [7]. At the moment, current power meters report only whole-residence data [8]. The total load must therefore be separated into its constituent components, i.e. the appliances, which are then identified.

A. Related Work

The pioneering work in load separation was started by Hart at the beginning of the 90s [9]. Various methods were then proposed to identify individual appliances from their ON/OFF transitions. Appliance transitions result in corresponding changes in the overall power consumption monitored at the power meter. Non-intrusive appliance load monitoring (NIALM) can be divided into six data flow modules consisting of data acquisition, data pre-processing, event detection, feature extraction, event classification and energy computation [10], [11]. Over the years, major improvements have been achieved in the event detection and feature extraction modules. A NIALM method proposed in [12] classifies the loads based on features that were extracted from the samples of the detected events. In [13], [14], both steady state and transient state features are used.

The sampling rate is the rate at which the energy consumption is monitored, 10 to 60 minutes being considered as a low sampling rate and less than a minute as a high one [15]. The steady state and transient state event-based methods are suitable for a high sampling rate but are inefficient at a low sampling rate. Further, the load detected at a high sampling rate can be correlated with the user activity, which raises privacy concerns for the inhabitants [16], [17]. Recent developments for NIALM methods at a low sampling rate show that high energy consuming appliances, such as a water heater or a washing machine, can be identified with reasonable precision even at a sampling rate of 15 minutes [18], [19]. A method partially disaggregating total household electricity usage into five load categories was proposed using a low sampling rate in [20], where different sparse coding algorithms are compared and a Discriminative Disaggregation Sparse Coding (DDSC) algorithm is proposed.

B. Context

Over the years, most of the approaches were based on signal processing at a high sampling rate (typically 1 second) to evaluate the appliance load signature and subsequently use pattern recognition techniques for identification from previously trained classifiers. On the other hand, current smart meters communicate with the grid operators at low sampling rates, from 10 to 60 minutes and up to daily measurements. This is for example the case of the Automated Metering Management System of the French distribution system operator "Électricité Réseau Distribution de France" (ERDF)¹. These low sampling rates considerably reduce the hardware complexity and are justified by the fact that most of the high energy consuming appliances have a low frequency of usage, typically once a day. The temporal classification of appliances using multi-label classification techniques presents an interesting alternative to signal analysis at low sampling rates and has not been previously tested in the field of load identification for households.

The global architecture of the identification process proposed in this work is shown in Fig. 1. All the interactions between the inhabitants, the loads and the grid are centralized in one hub: the classifier, which can be integrated in or be an optional component of a future smart meter or energy box.

Fig. 1. Classifier architecture: centralizing information of a smart household.

¹ http://www.erdfdistribution.fr/EN Linky

C. Contribution

The situation considered is one where a user provides a time-stamped record of the usage of his high energy consuming appliances for a week or two and subsequently gets his energy management plan for the year. No power recordings are needed other than those of the household power meter. In cases where the users cannot monitor the usage of the appliances, inexpensive sensors can be used for the training phase only. The appliances can be controlled by a local energy management system (private or aggregating many consumers) responding to distribution grid manager flexibility requests (through automatic shutdowns, shifts or shedding of the loads). To ensure such load management, economic incentives will have to be proposed (real time pricing, financial compensation, etc.).

The load separation methods can be classified based on the intrusiveness of the training process and on the nature of the classification algorithm (event based or non-event based). The event based algorithms try to detect ON/OFF transitions whereas the non-event based methods try to detect whether an appliance is ON during the sampled duration. We promote a non-event based approach with a very short and non-intrusive training period.

Three main contributions are presented in this paper. The first is the identification of the relevant meta-features in the field of residential building appliances, in addition to the ones proposed in [21], and the formalization of a smart meter integrated methodology (Section III). The second is the testing of a variety of state of the art multi-label learners to find the most relevant learners in this field of research for two sampling rates. The third is the assessment of the performance of the classification algorithms by comparing multi-label and single-label learners as base of the classifiers (Section V).

II. DATABASE

The dataset used in this work is a sub-set of REMODECE, a European database on residential consumption that includes Central and Eastern European countries as well as new European countries (Bulgaria and Romania). The sub-set used is called IRISE and deals only with houses in France. The database consists of energy consumption data obtained by monitoring one hundred households during a whole year. The dataset corresponding to each house consists of recordings of aggregated power (the energy consumed over the last period) for all the electric appliances in the house at a sampling rate of 10 minutes. Multiple appliances may start during this duration, but would show different temporal behavior.

III. PROPOSED METHOD

The present work formalizes a generic appliance identification technique based on a multi-label learning process using a temporal windowing approach where the only input after the training phase is the time stamped energy readings from the power meter. The size of the sliding window is fixed experimentally to 10 units, each unit being of 10, 30, or 60 minutes, depending on the sampling rate. Increasing the window size can increase the complexity of the algorithm (not always with a visible change in performance) and decreasing the window size can lower the performance. Experiments show that the window size can be reduced down to five units for a lower sampling rate (30 to 60 minutes) without a significant drop in performance. The minimum size of the window should be greater than the duty cycle of the appliances being disaggregated.

The implementation of the load identification technique is described below, while each step is discussed in the following sections.
1) The energy readings are extracted from the IRISE dataset at a sampling rate of 10 to 60 minutes.
2) Sub-sequences are generated from this dataset using temporal sliding windows (refer to Section III-A) with a window size of 10 units.
3) The meta-features are computed for each sub-sequence (refer to Sections III-B and III-C).

4) The features thus generated are processed as input attributes and the high energy appliances as output classes for the multi-label classifier (refer to Section III-D).
5) The model is trained using 10 % of the dataset and evaluated on the remaining dataset (refer to Section V).

Considering a 10 minutes sampling rate, the method can be directly implemented using current smart meter technologies without any other energy boxes. It reduces the privacy concerns as the daily user activity is less detectable. Indeed, only the high consuming events leave a footprint (even if the event is very short in time, a high consumption will be detected in the energy trace). The drawback of this approach is that short term and long term events with low power consumption remain undetected.

A. Temporal sliding window

We introduce a few key terms to facilitate the understanding of this work. Temporal data mining encompasses time series analysis on the form, type and scope of the data. The temporal data can be represented by time series or events and can be processed with tools such as classification, among others. Some definitions:

Time Series: An ordered set of n real-valued variables $T = t_1, \ldots, t_n$.

Sub-sequence: For a given time series T of length n, a sub-sequence $C_k$ of T is a sampling of length $w \le n$ of contiguous positions from T, that is $C_k = t_k, \ldots, t_{k+w-1}$ for $1 \le k \le n - w + 1$.

Time Sliding Window: Given a time series T of length n and a sub-sequence of length w, a matrix M of all possible sub-sequences can be built by "sliding windows" across T and placing sub-sequence $C_k$ in the k-th row of M. The size of the matrix M is $(n - w + 1) \times w$.

In the field of load identification in households, the input (energy) is a time series with an ordered set of real-valued variables whereas the output (predicted classes) is an ordered set of events (appliance states).
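The construction of the matrix M defined above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `sliding_window_matrix` and the sample readings are hypothetical and do not come from the IRISE dataset.

```python
def sliding_window_matrix(series, w):
    """Return every contiguous sub-sequence of length w, one per row."""
    n = len(series)
    if not 1 <= w <= n:
        raise ValueError("window length w must satisfy 1 <= w <= n")
    # Row k (0-based) holds t_{k+1}, ..., t_{k+w} in the 1-based notation of the text.
    return [list(series[k:k + w]) for k in range(n - w + 1)]


# Hypothetical 10-minute energy readings, for illustration only.
readings = [120, 130, 900, 950, 140, 125]
M = sliding_window_matrix(readings, w=3)
print(len(M), len(M[0]))  # (n - w + 1) rows and w columns: 4 3
```

With n = 6 readings and w = 3, the matrix has n - w + 1 = 4 rows, which matches the size given in the definition.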
In this work, time series sub-sequences are generated from the energy readings, then meta-features are extracted from the sub-sequences to identify the appliance states. The classifier system for load identification is based on temporal classification using standard propositional machine learning algorithms. The initial step is to populate the sliding window with sufficient historical data in order to create a single test instance that starts the closed loop classifying process for the future time steps (priming). Subsets of the original time series are then shifted in time, thereby creating the sub-sequences and preserving the time dependency among them. Instances containing these sub-sequences are finally presented as standard propositional instances to the classification algorithm.

B. Meta-features

Since the problem of load identification is addressed here in the context of temporal classification, the issue is to convert raw data into a model that can be understood by established machine learning techniques. There are three broad approaches in the temporal classification domain:
1) Algorithms which deal specifically with temporal classification, for example factorial hidden Markov models and sparse coding [22], [23].
2) Relational learning based techniques, like the recurrent neural networks.
3) Problem representation in a way that can be understood by propositional concept learners [24].

The work presented in this paper is based on the third approach. Knowledge extraction for a specific representation of a problem is a technique of attribute construction applied to represent the underlying substructure of the training instances. In the temporal classification domain, these substructures are in the form of sub-events, defining for example a periodicity in the data [25]. These sub-events become synthetic features, which are then fed to a propositional learner. The concept also allows the inclusion of background and domain knowledge for temporal classification. The output of the learner can then be converted back to a human readable form, for example as a decision tree.

C. Generated meta-features

One of the primary goals of this work is to define and exploit adequate meta-features based on the requirements of residential load identification, used to identify the sub-structures present in the energy readings and hence the signatures of the appliances. The chosen meta-features are specific to the domain of energy consuming appliances in buildings. These features take into account the different characteristics of loads such as time and duration of use, trends, sequence, spike, and correlation among the energy consumption values of the loads. Most of the meta-features are defined for the sub-sequence centered on the considered time of event t, except for the "hour of the day". The size of the sliding window for one computation is 2N with N = 5 (refer to Section III). Some of the presented meta-features are illustrated on a generic time window in Fig. 2 and discussed below:

1) Hour of the day: The hour of the day H(t) is a measurement of the hour of occurrence of an event. It is represented as a numeric value from 0 to 23 in the propositional learner.

$$ H(t) = h; \quad \forall h \in [0, 23] \tag{1} $$

2) Distances from the current event to the local maximum and local minimum: These two fields monitor the position of the local maximum $d_M(t)$ and minimum $d_m(t)$ of energy consumption in the sliding window, counted from the current time of event t. They indicate whether the current event is a local minimum, a local maximum or neither. The displacement is measured as an integer value where "0" signifies that the current event is a local minimum or maximum and "+d" or "−d" represents the distance (in time steps) to the local minimum and maximum in the sub-sequence.

$$ d_M \rightarrow \begin{cases} E(t + d_M) = E_{\max} \\ E_{\max} = \max\{E(t_i)\}; \quad \forall t_i \in [-N, N] \end{cases} \tag{2} $$

$$ d_m \rightarrow \begin{cases} E(t + d_m) = E_{\min} \\ E_{\min} = \min\{E(t_i)\}; \quad \forall t_i \in [-N, N] \end{cases} \tag{3} $$

where $E(t_i)$ is the energy consumed from $t_{i-1}$ to $t_i$.

3) Energy variation between time steps: This meta-feature takes into account the energy variations between the current time of event t and all other time steps $t_i$ in the considered sliding window.

$$ E_v(t_i) = E(t) - E(t_i); \quad \forall t_i \in [-N, N]; \; t_i \neq t \tag{4} $$

The training data corresponding to a sliding window represents the energy consumption of the appliances while its variations in time represent an indirect image of the load profile.

4) Gradient and Laplacian of energy consumption within the window: For this meta-feature, the gradient $\nabla$, Laplacian $\Delta$ and gradient ratio $\nabla_r$ are evaluated around the current time of event t. The gradient ratio is the ratio between the gradient and the energy value at any given time instance $t_i$ in the sliding window.

$$ \left. \begin{array}{l} \nabla(t_i) = E(t_i) - E(t_i - 1) \\ \Delta(t_i) = \nabla(t_i) - \nabla(t_i - 1) \\ \nabla_r(t_i) = \nabla(t_i) / E(t_i) \end{array} \right\} \quad \forall t_i \in [-N + 1, N] \tag{5} $$

These meta-features allow the classification algorithm to identify trends in energy consumption within the sliding window. Energy spikes or edges can also be detected in relation to the base energy level from which they occur. At low sampling rates, two appliances having similar edges can be identified from different base energy levels.

5) Mean and standard deviation: This meta-feature is used to add more general statistics on top of the other meta-features. The mean and standard deviation are computed in the considered sliding window for the energy $E$, for the change of energy from previous state to current state $E_v$, and for the first derivative of energy $\nabla$.

$$ \left. \begin{array}{l} \overline{E}(t) = \dfrac{1}{2N} \left( \sum\limits_{t_i = -N}^{N} E(t_i) \right) \\ \overline{E_v}(t) = \dfrac{1}{2N} \left( \sum\limits_{t_i = -N}^{N} E_v(t_i) \right) \\ \overline{\nabla}(t) = \dfrac{1}{2N} \left( \sum\limits_{t_i = -N}^{N} \nabla(t_i) \right) \end{array} \right\} \quad \forall t_i \in [-N + 1, N] \tag{6} $$
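The meta-features of equations (1) to (6) can be sketched for a single window as follows. This is a hedged illustration, not the authors' implementation: the function name `meta_features`, the dictionary keys and the sample readings are hypothetical. Three simplifying assumptions are made: ties for the maximum or minimum are resolved by taking the first occurrence, the Laplacian is only evaluated where the gradient is available inside the window (offsets −N+2 to N), and the averages are taken over the 2N offsets −N+1 to N, which is one possible reading of the 1/(2N) factor. The standard deviation is omitted for brevity.

```python
def meta_features(energy, t, N, hour):
    """Meta-features of the window energy[t-N .. t+N] centred on index t."""
    if t - N < 0 or t + N >= len(energy):
        raise ValueError("the window [t-N, t+N] must lie inside the series")
    offsets = range(-N, N + 1)
    E = {d: energy[t + d] for d in offsets}

    # Eq. (2)-(3): offset of the window maximum / minimum (first one on ties).
    d_max = max(offsets, key=lambda d: E[d])
    d_min = min(offsets, key=lambda d: E[d])

    # Eq. (4): variation between the current reading and every other reading.
    variation = {d: E[0] - E[d] for d in offsets if d != 0}

    # Eq. (5): gradient, Laplacian and gradient ratio.
    grad = {d: E[d] - E[d - 1] for d in range(-N + 1, N + 1)}
    lap = {d: grad[d] - grad[d - 1] for d in range(-N + 2, N + 1)}
    # Assumption: the ratio is set to 0.0 when the energy reading is zero.
    ratio = {d: (grad[d] / E[d] if E[d] else 0.0) for d in grad}

    # Eq. (6): averages over the 2N offsets -N+1 .. N (assumed reading).
    span = range(-N + 1, N + 1)
    return {
        "hour": hour,  # Eq. (1)
        "d_max": d_max,
        "d_min": d_min,
        "variation": variation,
        "gradient": grad,
        "laplacian": lap,
        "gradient_ratio": ratio,
        "mean_E": sum(E[d] for d in span) / (2 * N),
        "mean_Ev": sum(variation.get(d, 0) for d in span) / (2 * N),
        "mean_gradient": sum(grad.values()) / (2 * N),
    }


# Hypothetical readings with a high-consumption event around index 3.
readings = [100, 100, 100, 1000, 1100, 200, 100, 100]
features = meta_features(readings, t=3, N=2, hour=19)
print(features["d_max"], features["d_min"], features["mean_gradient"])
```

In this example the window maximum lies one step after the current reading and the large positive gradient at offset 0 marks the rising edge, which is the kind of spike information the text describes.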

Fig. 2. Graphical definition of some of the main meta-features concerning the energy measurements in a sliding window.

D. Multi-label classification

The identification of the state of loads is based on multi-label classification techniques. These techniques are frequently used in the field of information theory and present many advantages over single-label classification [26], [27]. A classification learner approximates a function, by mapping a vector to labels after analyzing input-output examples of this function. The features $x_i$ and the target class $Y$ are added to the model in the form:

$$ (x, Y) = (x_1, x_2, x_3, \ldots, x_n, Y) \tag{7} $$

For multi-label classification, $Y \in \{0, 1\}^L$ where L is the number of appliances. In such classification problems, multiple target labels are assigned to each instance. In this work, a function is built which maps inputs $x_i$ to an output vector $Y$, in contrast to the scalar output of single-label classification. Given a dataset of labeled instances, classification algorithms seek relations that will correctly predict the class of future unlabeled instances $Y'$ from future features $x'$, where each component is:

$$ Y' = f(x') = \begin{cases} 0 & \text{if the appliance is OFF} \\ 1 & \text{if the appliance is ON} \end{cases} \tag{8} $$

The fact that multi-label classification takes into account the interdependence among labels (in our case the appliances) is the main reason why this work is based on such classifiers. For example, the clothes drier is typically used after the washing machine. Multi-label classifier models are built on meta-features as inputs, which are computed on the sub-sequences $C_k$. The appliance states are the output classes. The time stamped data also aids the analysis of temporal information for load identification (temporal patterns).

There are two broad approaches to multi-label classification. The first is problem transformation, where a multi-label problem is transformed into one or more single-label problems and a state of the art classification algorithm such as a Decision Tree or a Support Vector Machine is then deployed. The second is to modify an existing single-label algorithm directly for the purpose of multi-label classification (algorithm adaptation method, e.g. ML-KNN).
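The problem transformation approach can be illustrated on the label side alone. The sketch below shows two common transformations of a label matrix, one producing a separate binary target per appliance and one treating each distinct combination of states as a single class; the paper describes them next as Binary Relevance and Label Powerset. The function names and the ON/OFF data are hypothetical, and no classifier is trained here.

```python
def binary_relevance(Y):
    """One binary target column per label (appliance)."""
    n_labels = len(Y[0])
    return [[row[j] for row in Y] for j in range(n_labels)]


def label_powerset(Y):
    """Map each distinct label combination to one single-label class id."""
    classes = {}
    targets = []
    for row in Y:
        key = tuple(row)
        if key not in classes:
            classes[key] = len(classes)  # ids follow order of first appearance
        targets.append(classes[key])
    return targets, classes


# Hypothetical ON/OFF states for (washing machine, clothes drier, water heater).
Y = [
    [0, 0, 1],
    [1, 0, 1],
    [0, 1, 0],
    [1, 0, 1],
    [0, 0, 0],
]
br_targets = binary_relevance(Y)
lp_targets, lp_classes = label_powerset(Y)
print(br_targets)  # three independent binary problems
print(lp_targets)  # one multi-class problem
```

The first transformation discards any information about which appliances are ON together, whereas the second keeps it at the cost of one class per observed combination.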

Two multi-label problem transformations and three classification algorithms are implemented in this work and compared with a Hidden Markov Model. A brief description of these algorithms follows.

1) Binary Relevance (BR): BR [26], [27] is a problem transformation method that learns a separate single-label binary model for each class or label. It transforms the original data into single-label datasets that contain all the examples of the original dataset.

2) Label Powerset (LP): The LP [26], [27] transformation considers each distinct set of labels that exists in the multi-label dataset as a single label. Unlike the BR classifier, the LP algorithm learns using a single classifier consisting of the number of classes times the number of labels in the original multi-label problem. The primary advantage of using such a transformation is that it takes into account the appliance correlation. The primary drawback is the computation cost compared to the BR transformation.

3) Decision Tree Learner (DTL): DTL algorithms represent one of the preferred choices for load classification [28]. Decision Trees are rule based and the built model is easy to visualize. Fig. 3 shows a DTL example learned on a typical appliance in one of the houses of the IRISE database.

Fig. 3. Example of a decision tree result for an electric oven.

DTL usually leads to a good understanding of the significant features for each appliance. The attribute with the highest normalized information gain is chosen for the decision. The information gain is measured in bits and is computed from a probability distribution. The information required to predict an event is the distribution's entropy, given by:

$$ S(p_1, p_2, \ldots, p_n) = -p_1 \log(p_1) - p_2 \log(p_2) - \cdots - p_n \log(p_n) \tag{9} $$

The metric used in practice for the DTL based classification is the gain ratio, which corrects the information gain by taking the intrinsic information of a split into account. The algorithm applies it recursively on the sub-lists. The impact of the meta-features on a learning system can be well understood by the use of the disparity measure. In this work, the J48 implementation of the C4.5 algorithm is used [29] and the parameters are optimized using a parameter selection algorithm during training.

4) Support Vector Machine (SVM): SVM [29] is a powerful tool for data classification described in [30]. The first major step of an SVM classification is to build a decision plane that separates a set of objects with different class memberships. It guarantees the best function to distinguish between members of classes by maximizing the margin between them. The maximal margin hyper-planes allow the best generalization abilities and thus the best classification performances on the training dataset. This procedure requires finding the solution of the following optimization problem:

$$ \min_{w, b, \xi} \left( \frac{1}{2} w^T w + C \sum_{i=1}^{l} \xi_i \right) \tag{10} $$

$$ \text{subject to} \begin{cases} y_i \left( w^T \phi(x_i) + b \right) \ge 1 - \xi_i \\ \xi_i \ge 0 \end{cases} \tag{11} $$

with l the total number of sub-sequences, w the normal vector of the hyperplane, b the offset of the hyperplane, C the penalty parameter of the error term ξ, and φ the kernel function. The second major step is to choose the kernel function of the algorithm. The Radial Basis function is preferred over others in this work. For two instances i and j, the training vectors $x_i$ and $x_j$ are mapped to a higher dimensional space by the kernel function φ defined as:

$$ K(x_i, x_j) \equiv \phi(x_i)^T \phi(x_j) = \exp\left( -\gamma \lVert x_i - x_j \rVert^2 \right); \quad \gamma > 0 \tag{12} $$

where γ is a parameter of the kernel. A grid search has been conducted on the parameters C and γ using cross-validation. SVM is computationally more expensive than rule based algorithms such as DTL. In this work, the Sequential Minimal Optimization (SMO) implementation in [29] is used with a grid search for parameter optimization during training.

5) Multi-label k-Nearest Neighbor (ML-KNN): ML-KNN [26] is the multi-label implementation of the k-nearest neighbor algorithm for single-label classification. It is a direct implementation for multi-label problems, so a single-label base classifier is not required. It works on the principle that for every test instance, the k nearest neighbors in the training set are identified. Then, according to statistical information gained from the label sets of these neighboring instances (i.e. the number of neighboring instances belonging to each possible class), a maximum a posteriori principle is used to determine the label set for the test instance. ML-KNN is a learning approach which is different from the rule and function based approaches discussed above.

6) Hidden Markov Model (HMM): Given a model and a set of observations (in our case load measurements), the problem is to find the best set of model parameters that maximizes the probability that it produces the observations. The solution to this problem provides a means to train the model to recognize a particular sequence. Iterative procedures can be used to solve this problem. One such algorithm is the forward-backward algorithm, also known as the Baum-Welch or expectation-maximization method. The HMM implementation in [29] is used in this work. The input is the time series sub-sequences of energy readings from the power meter. The model and its parameters were experimentally fixed.

IV. PROCESS OF IDENTIFICATION

A. Classification algorithms

In the results (Section V), six classification algorithms are compared, based on the algorithms described in Section III-D.
1) LP1: LP problem transformation using the DTL algorithm.
2) LP2: LP problem transformation using the SVM algorithm.
3) BR1: BR problem transformation using the DTL algorithm.
4) BR2: BR problem transformation using the SVM algorithm.
5) ML-KNN: the ML-KNN algorithm with k=7.
6) HMM: the HMM algorithm.

B. Evaluating the performance of the classifiers

The data associated with high energy consuming appliances is generally sparse, so a simple accuracy measurement is insufficient to provide information on the performance of a classifier. Even a classifier which predicts no consumption at all can present a high accuracy score if the appliances are shut down most of the time (a typical example would be the washing machine). For this reason, the confidence of the predictions is monitored in our work using tools commonly used in information theory [26], [27], with the following parameters:

F-measure: The harmonic mean of precision and recall, i.e. the special case of the general F-measure definition where both are equally weighted.

Receiver operating characteristic - Area Under Curve (AUC): The receiver operating characteristic is a graphical plot of the true positive rate against the false positive rate at various threshold settings of the classification algorithm. The AUC score is given on a scale from 0 to 1, where a score of 1.0 indicates perfect classification and a score of 0.5 or below indicates quasi-random guessing.

V. RESULTS

The comparison of multi-label classification algorithms is presented in Sections V-A1, V-A2, and V-A3 for different categories of houses defined in Section V-A, using the F-measure for the performance evaluation (refer to Section IV-B). Due to space constraints, 10 and 60 minutes sampling rates are presented only for the first category of houses. A comparison of two multi-label learners with the Hidden Markov Model is proposed in Section III-D6, using AUC for the performance evaluation.

A. Defining categories of consumers

For a qualitative analysis of the database, a clustering of the houses has been done. Four clusters of houses were obtained using an X-means clustering analysis (with min-max normalization for features) [31]. The features for the analysis are the hourly mean and the standard deviation of electricity consumption, the frequency of usage for different electricity levels, the number of residents, the area of the house, the number of deferrable appliances and the hourly mean of deferrable appliances over a year. In Fig. 4 the 100 houses are projected onto two principal components using Principal Component Analysis (PCA), based on the above mentioned features. The colors and the shapes represent different clusters and the surface area of the points represents the mean hourly electricity consumed over a year.

Fig. 4. The hundred houses projected onto two principal axes (using PCA) and divided into 4 clusters (color and shape).

Cluster 2 (3 % of the total database) has been merged with Cluster 3 because they represent similar appliance categories. The three major clusters correspond to houses containing different appliance categories defined as follows:
Cat. 1: Small number of distinct high energy appliances (Cluster 1).
Cat. 2: Small number of distinct high energy appliances with a few grouped appliances (Cluster 2 and 3).
Cat. 3: Many high energy appliances including grouped and repeated appliances (Cluster 4).

1) First category: The results in Table I indicate that the multi-label learners such as LP and ML-KNN (relying on appliance correlation) offer better performance on some of the appliances at a 10 minutes sampling rate. That is not the case for the other algorithms, which do not consider the appliance correlations. It can also be observed that, generally, rule based algorithms such as DTL used as base learner provide better performance. The scores are much lower at a sampling rate of 1 hour and it is difficult to distinguish the classifiers, except for the washing machine. Indeed, the correlation among appliances is weak and the learner is over-fitting during the training. Among all the appliances, the water heater is identified with the highest scores and the microwave oven with the lowest. This is due to the fact that, even at a low sampling rate, the water heater keeps similar temporal usage patterns. On the contrary, the microwave oven presents high variations in both the duration of usage and the consumed energy, which makes its identification difficult.

TABLE I
CAT. 1: COMPARISON OF DIFFERENT MULTI-LABEL ALGORITHMS FOR IDENTIFYING THE LOADS OF A RESIDENTIAL BUILDING.

Appliance         Sampling rate   LP1     LP2     BR1     BR2     ML-KNN
Washing Machine   10 min          75.72   72.50   79.32   74.49   71.98
                  1 h             55.22   59.55   60.07   54.5    55.87
Microw. Oven      10 min          45.65   36.87   19.04   31.37   33.45
                  1 h             29.94   11.40   21.13   5.10    2.12
Water Heater      10 min          96.66   95.99   97.35   92.02   97.40
                  1 h             89.50   90.46   91.12   91.53   89.56
Electric Oven     10 min          66.79   55.82   73.65   46.81   60.53
                  1 h             51.04   54.46   50.14   40.30   20.30
Clothes Drier     10 min          76.34   79.21   75.24   68.51   87.15
                  1 h             58.16   60.65   62.21   60.11   69.50
Dish Washer       10 min          64.93   66.40   61.42   67.21   79.78
                  1 h             36.86   36.50   36.72   34.66   32.63
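The discussion of the categories refers to the F-measure obtained for each appliance. Assuming that the tabulated scores are per-appliance F-measures in percent (an assumption, since the table caption does not name the metric), such a score can be obtained from ON/OFF predictions as in the following minimal Python sketch. The function name and the toy sequences are hypothetical and are not taken from the experiments.

```python
def f_measure(y_true, y_pred):
    """F-measure (harmonic mean of precision and recall) for one appliance.

    y_true, y_pred: sequences of 0/1 states (1 = ON), one per sampling interval.
    Returns a value in [0, 1]; 0.0 when no ON interval is correctly predicted.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # ON predicted as ON
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # OFF predicted as ON
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # ON predicted as OFF
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical ON/OFF states of one appliance over six sampling intervals.
true_states = [1, 1, 0, 0, 1, 0]
pred_states = [1, 0, 0, 1, 1, 0]
score_percent = 100 * f_measure(true_states, pred_states)
print(round(score_percent, 2))
```

An appliance that is rarely ON and used irregularly tends to produce many false positives and false negatives at the same time, which lowers both precision and recall and therefore the F-measure.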

Microwave oven usage is potentially useful information only for users who want to reduce their energy consumption at a given time; it is certainly not usable through flexibility requests. Actually, as a side result on load identification, only appliances that could become automatically controlled are well identified (washing machine, clothes drier, dish washer and water heater). In fact, in France the water heater can already be controlled remotely by the grid, with the water heating operation depending on time-of-use pricing.

2) Second category: In this identification, the washing machine and the clothes drier are grouped together and considered as one appliance, along with other loads in the house. The considered sampling rate is 10 minutes. The performance of the five algorithms is presented in Table II. It is observed that when the number of high-consumption appliances in a residence is small, the classification performance is generally good. As in the previous section, the LP algorithm and ML-KNN present the best performance. Compared to the performance of the algorithms when no load is grouped (Cat. 1), the appliances are identified with a better F-measure for all five algorithms. This category of households is particularly interesting when considering the identification algorithms and a potential (remote) control for energy management and grid support through ancillary services.

3) Third category: In this section, a high number of appliances, including duplicated appliances, is considered. The results are shown in Table II, with a 10-minute sampling time. The performance of the classifiers is drastically reduced as the number of appliances increases. Only the first water heater is identified with sufficient accuracy, because its temporal features are well defined; this is not the case for the second water heater, which is used less frequently.
At such a low sampling rate, it is difficult for the classifier to learn the right temporal features of the appliances if many of them are used simultaneously and, especially, if some of them are repeated. In this case, the short training phase is insufficient for a proper identification. Finally, it is observed that the single multi-label classifier (ML-KNN) presents a lower performance than the transformation-based multi-label classifiers built on single-label classifiers at the core (for example BR1). This can be explained by the difficulty of finding a specific correlation among so many appliances. Considering all the labels at the same time does not yield useful results.

TABLE II
CAT. 2 & 3: COMPARISON OF DIFFERENT MULTI-LABEL ALGORITHMS

Cat.   Appliance                    LP1     LP2     BR1     BR2     ML-KNN
2      Wash. Mach. / Cloth. Drier   73.93   67.62   69.37   70.17   76.18
       Dish Washer                  91.05   92.88   89.56   86.52   93.65
       Electric Oven                84.46   77.44   82.92   59.06   79.81
3      Wash. Mach. / Cloth. Drier   44.14   44.55   36.13   43.27   17.56
       Hot Plates                   40.96   27.43   40.96   18.31   2.14
       Microw. Oven                 14.93   23.61   34.56   10.29   1.32
       Water Heat. 1                75.17   76.33   76.56   76.16   85.92
       Water Heat. 2                28.49   24.11   32.43   21.58   6.90
       Dish Washer                  48.93   46.23   42.16   55.10   38.93

This work is based on voluntarily restrictive conditions. The state of several appliances can change at the same time, and the training period is short for the classification algorithms. Indeed, actual consumers would not accept a long period of monitoring and will use their appliances simultaneously. Considering the results of all three categories of houses, an algorithm capable of using the appliance correlations is generally more suitable than one which is not. Cat. 2 is the most interesting in the context of our work. Indeed, when the number of appliances increases, it is more efficient to limit the extent to which the appliance correlations are taken into account, since they would interfere with the identification capabilities: the classification algorithm tries to find relations between appliances where there are none.

B. Comparison with a standard load identification method
In Table III, a comparison is shown between two multi-label learners and a standard non-intrusive load monitoring algorithm (HMM). The number of states of the HMM is experimentally set to two and the input is the time series subsequence. The results are shown for four houses corresponding to the four clusters defined in Section V-A. They indicate that, at low sampling rates, the multi-label learners using meta-features perform considerably better than the HMM. LP2, which considers the appliance correlations, performs better than BR2. Also, using SVM as a base learner is better than using other base learners (considering AUC measures).
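The difference between a learner that can exploit appliance correlations (LP) and one that cannot (BR) lies in how the multi-label target is transformed before a single-label base learner is trained. The following minimal Python sketch illustrates the two transformations on ON/OFF states. The function names and the toy data are illustrative assumptions and do not reproduce the implementation used for the experiments.

```python
def binary_relevance_targets(states):
    """Binary relevance: one independent 0/1 target list per appliance.

    Each appliance gets its own single-label problem, so co-occurrences
    between appliances are not visible to the base learner.
    """
    n_appliances = len(states[0])
    return [[row[j] for row in states] for j in range(n_appliances)]


def label_powerset_targets(states):
    """Label powerset: one composite class per sampling interval.

    The joint state of all appliances becomes a single class label,
    for example (1, 0) -> "10", so co-occurring states are learned jointly.
    """
    return ["".join(str(s) for s in row) for row in states]


# Hypothetical ON/OFF states per interval: (washing machine, water heater).
states = [(0, 0), (0, 1), (1, 1), (1, 0), (1, 1)]

br_targets = binary_relevance_targets(states)
lp_targets = label_powerset_targets(states)

print(br_targets)  # two separate binary problems
print(lp_targets)  # one multi-class problem over joint states
```

With two appliances, the LP targets range over the composite classes "00", "01", "10" and "11". The number of possible composite classes doubles with every additional appliance, which is consistent with the observation that correlation-based learners become less effective when many appliances are present and the training period is short.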

TABLE III
COMPARISON OF MULTI-LABEL LEARNERS AND HIDDEN MARKOV MODEL WITH AUC MEASURES

Residence   BR2     LP2     HMM
House 1     97.90   99.24   80.20
House 2     95.57   98.42   72.27
House 3     89.76   97.51   63.55
House 4     95.27   98.96   84.67

C. Discussion
The proposed NIALM technique is suitable for scenarios where multiple appliances (similar or dissimilar) start at the same time. The method used in this article is non-event based and uses a multi-label classification approach, thereby developing a separate category when two appliances are in the ON state (Section III-D2). When the washing machine and the water heater are working during the same period of time, a new combined binary class label is generated, representing the ON states of both the washing machine and the water heater, and is compared with similar instances encountered during training. This also holds for two similar appliances. For example, if the two possible states (ON and OFF) of each of two appliances are represented as binary “1” and “0” respectively, four classes are generated, represented as “00”, “01”, “11” and “10”, covering all possible state combinations.

VI. CONCLUSION
The article proposes several multi-label learning methods that take into account the appliance correlations, based on novel meta-features. The loads are identified without extensive monitoring of the inhabitants during the training phase and without any monitoring thereafter. The inhabitants can monitor their energy consumption for a short period of time and subsequently get an energy management plan for the rest of the year. For grid managers, a good identification would lead to better opportunities for flexibility assessment and requests (through remote shut down, load shedding or load shifting) and also to a better global behavior prediction, without intruding into the residents’ private lives. The results are computed using 10- and 60-minute sampling rates on the IRISE database (one hundred houses monitored over one year) using a range of multi-label learning algorithms.

The sampling rate is chosen to avoid privacy issues, to maintain a realistic order of magnitude considering actual smart meters, and to decrease the need for big data handling. The results indicate that considering temporal knowledge increases the capability of non-intrusive disaggregation of the aggregated load. The use of multi-label learners also shows that there are relations between appliances that are useful to take into account. The presented algorithms are suitable for load identification under particular hypotheses (like appliance grouping) that allow defining categories of houses, for example with a small number of appliances, different high energy appliances, repeated appliances, etc. Applying the learned model to energy management in smart buildings and to behavior prediction is a matter for future work.

ACKNOWLEDGMENT
This work is part of the project SUPERBAT, which is sponsored by the National Research Agency (ANR), France.


Kaustav Basu Kaustav Basu is a PhD student at Grenoble Electrical Engineering Laboratory (G2Elab), Grenoble University. He received the Master Degree in Computer Science and Engineering from Jadavpur University in 2011. During the course he spent one year as an Erasmus Mundus exchange student at Ecole Nationale Superieure d’Informatique et Mathematiques Appliquees (ENSIMAG), Grenoble INP, and also worked in Laboratory GSCOP in Grenoble, France. He is a recipient of the Erasmus Mundus and All India GATE scholarships. His main domains of interest are the application of machine learning to energy management, load identification, forecasting and other such real-life problems.

Vincent Debusschere Dr Vincent Debusschere was born in France in 1981. He joined the Ecole Normale Superieure de Cachan in 2001 for studies in the field of applied physics. He received the Master Degree IST from the University Paris-Sud XI and ENS Cachan in 2005. He received the PhD degree in 2009 in Eco-Design of Electrical Machines. He joined the Grenoble Electrical Engineering Laboratory (G2Elab) in 2010. His main fields of interest are power conversion systems modeling and optimized design, renewable energy integration, energy efficiency and eco-design.


Seddik Bacha Seddik Bacha received the Engineering and Magister degrees from the Ecole Nationale Polytechnique of Algiers in 1982 and 1990, respectively. He joined the Grenoble Electrical Engineering Laboratory (G2Elab) and received the PhD and HDR degrees in 1993 and 1998, respectively. He is currently scientific advisor at the SuperGrid Institute of Energy Transition (France) and professor at the Joseph Fourier University of Grenoble, France. His main fields of interest are renewables integration, microgrids and HVDC transmission grids.

Ujjwal Maulik Prof Ujjwal Maulik, PhD is a Professor in the Department of Computer Science and Engineering, Jadavpur University, Kolkata, India and senior member of the IEEE. He is the recipient of Govt. of India BOYSCAST fellowship, Alexander Von Humboldt Fellowship for Experienced Researchers and Senior Associate of ICTP, Italy. He is a Fellow of the Indian National Academy of Engineering. Dr. Maulik is a co-author of 7 books and more than 250 research publications. His research interests include Computational Intelligence, Bioinformatics, Combinatorial Optimization, Pattern Recognition, Data Mining.

Sanghamitra Bandyopadhyay Prof Sanghamitra Bandyopadhyay, PhD is a Professor at the Indian Statistical Institute (ISI), Kolkata, India. She is a Fellow of the National Academy of Sciences, Allahabad, India (NASI), and Indian National Academy of Engineering (INAE), and senior member of the IEEE. Her research interests include computational biology and bioinformatics, soft and evolutionary computation, pattern recognition and data mining. She was awarded the prestigious Shanti Swarup Bhatnagar Prize in Engineering Science. She has authored/co-authored more than 130 journal papers and 140 articles in international conferences and book chapters, and published six authored and edited books from publishers like Springer, World Scientific and Wiley.
