SILO: National System of Digital Repositories, Uruguay

Conference paper (published)

NTL detection : Overview of classic and DNN-based approaches on a labeled dataset of 311k customers.

Massaferro Saquieres, Pablo - Di Martino, Matías - Fernández, Alicia

Abstract:

Non-technical losses (NTL) constitute a significant problem for developing countries and electric companies. The machine learning community has offered numerous countermeasures to mitigate the problem. Yet one of the main bottlenecks is collecting and accessing labeled data to evaluate and compare the validity of proposed solutions. In collaboration with the Uruguayan power generation and distribution company UTE, we collected data and inspected 311k customers, creating one of the world’s largest fully labeled datasets. In the present paper, we use this massive amount of information in two ways. First, we revisit previous work, comparing and validating earlier findings that were tested on much smaller and less diverse databases. Second, we compare and analyze novel deep neural network algorithms, which have more recently been adopted for NTL detection. Our main findings are: (i) above 80k training examples, the performance gain from adding more training data is marginal; (ii) if modern classifiers are adopted, handcrafting features from the consumption signal is unnecessary; (iii) complementary customer information and geolocation are relevant features that complement the consumption signal; and (iv) adversarial attack ideas can be exploited to understand the main patterns that characterize fraudulent activities and typical consumption profiles.

Bibliographic Details
Publication date:	2021
Subjects:	Training; Training data; Companies; Switches; Performance gain; Smart meters; Smart grids; Non-technical losses; Electricity theft; Automatic fraud detection
Language:	English
Institution:	Universidad de la República
Repository:	COLIBRI
Link(s):	https://hdl.handle.net/20.500.12008/26892
Access level:	Open access
License:	Creative Commons Attribution-NonCommercial-NoDerivatives (CC BY-NC-ND 4.0)

