Building Simulation: An International Journal

Article Title

Clustering and classification of energy meter data: A comparison analysis of data from individual homes and the aggregated data from multiple homes


smart meter data, daily profile, clustering, classification, data usability, heat substation


The transition towards a more sustainable environment requires the development of new demand-side control systems to integrate renewable energy sources into energy systems. For this purpose, energy meter data from homes have been widely used in modelling, forecasting, and optimal control of energy use. However, the usability and reliability of household energy meter data have not been specifically addressed. In this study, we apply commonly used machine learning methods to the heating consumption data of (1) two individual homes in an apartment building and (2) the district heating substation of the apartment building, which comprises 72 homes, to identify how the characteristics of the data affect the results of data analysis. Two clustering approaches based on the K-means algorithm were applied to group similar daily heating profiles. Using the clustering results, different classification algorithms, such as logistic regression and random forest, were applied to predict the heating consumption level with regard to the weather conditions. The data analysis process showed that the substation data, which are the aggregated heating consumption of the 72 homes, are more reliable and valid for energy prediction than the data from the two individual homes. This is due to the large variation and uncertainty in the daily energy use of individual homes.
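The clustering step described in the abstract can be illustrated with a minimal sketch. It is not the study's code: the profiles are synthetic, the peak hours, the function names (`make_profile`, `kmeans`) and the deterministic centroid initialisation are assumptions made for illustration, and K-means is written out in plain Python rather than taken from a library. The resulting "low"/"high" labels stand in for the consumption levels that a classifier such as logistic regression or random forest would then be trained to predict from weather variables.

```python
# Illustrative sketch only: synthetic data and hypothetical names.

PEAK_HOURS = {6, 7, 8, 18, 19, 20}  # assumed morning and evening peaks


def make_profile(level, day):
    """Build one synthetic 24-value daily heating profile (arbitrary units)."""
    return [
        level * (1.3 if hour in PEAK_HOURS else 1.0) + 0.01 * ((day * 7 + hour) % 5)
        for hour in range(24)
    ]


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(profiles, k, n_iter=50):
    """Lloyd's K-means on equal-length lists; returns (labels, centroids)."""
    # Assumed deterministic initialisation: seed the centroids with profiles
    # taken at evenly spaced ranks of total daily consumption.
    ranked = sorted(profiles, key=sum)
    n = len(ranked)
    centroids = [list(ranked[int((i + 0.5) * n / k)]) for i in range(k)]
    labels = None
    for _ in range(n_iter):
        # Assignment step: each day goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in profiles
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: each centroid becomes the hourly mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(profiles, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Ten synthetic low-consumption days followed by ten high-consumption days.
profiles = [make_profile(1.0, d) for d in range(10)]
profiles += [make_profile(3.0, d) for d in range(10)]

labels, centroids = kmeans(profiles, k=2)

# Name each cluster by the total consumption of its centroid.
low_cluster = min(range(2), key=lambda c: sum(centroids[c]))
levels = ["low" if lab == low_cluster else "high" for lab in labels]
```

Here `levels` holds one consumption-level label per day. In a pipeline like the one the abstract outlines, these labels would be paired with daily weather features to form the training set for the classifiers.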


Tsinghua University Press