Transformer Architectures: Ground-breaking Not Only in the Field of NLP

Efficient Anomaly Detection on Wind Turbines with Temporal Fusion Transformer Models

Transformer architectures have proven ground-breaking in natural language processing (NLP), especially through models like GPT-3 and applications like ChatGPT, which also caused a sensation outside the world of AI. However, the potential of these powerful models is not limited to language-related applications. In other fields too, such as time series forecasting, transformer architectures can offer significant advantages over conventional models.

A special variant of this technology is the temporal fusion transformer (TFT). Besides its high accuracy in time series forecasting, the TFT stands out for its interpretability and its capacity for data fusion (i.e. the ability to use data across plants for training and forecasting). Particularly in the energy industry, where accurate forecasts are essential for well-founded decisions, this opens up highly interesting applications.

In a proof of concept (PoC), Iqony tested the practicability of the TFT model.

Pioneering Proof of Concept with the Scieneers

In doing so, Iqony was supported by the scieneers, an IT company with 35 IT, data, and data science specialists who are enthusiastic about handling data and putting it to good use. They focus on offering data engineering and data science as a service to generate insights, and thus value, from data. They support their clients’ teams on demand or provide entire teams to design and develop complete data products and, most notably, bring them into production.

To Iqony, the continuous improvement of the efficiency and safety of its power plants is essential. Its solutions are in use at Iqony’s own plants as well as at customers’ plants worldwide. A central element is the early detection of plant damage in order to minimize downtime and reduce maintenance costs. For this purpose, a complex anomaly detection system is in use that can be applied to wind farms, PV systems, and conventional power plants. It integrates two different strategies: an engineering approach and an autoencoder approach. In our PoC, we focused on using the TFT for the engineering approach.

Here, a target value (e.g. bearing temperature) is predicted from several input variables (e.g. rotational speed, outside temperature), and the predicted value is compared to the actual value. A statistical evaluation of the deviations between the two values then identifies potential anomalies, so that plant damage is detected early on. Our goal was to analyze how the TFT model performs in this context compared to the multilayer perceptron (MLP) model that has been used so far and optimized over the years. One focus was the TFT model’s ability to learn across plants by means of data fusion in order to produce high-quality predictions even for plants with a short data history (caused in practice, for example, by the later commissioning of individual plants).
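
The comparison of predicted and actual values can be illustrated with a minimal sketch. It is not Iqony's actual deviation statistic, which is not described here; it assumes a simple z-score on the residuals, calibrated on residuals from a period regarded as healthy. The function name `flag_anomalies`, the threshold of 3.0, and all numbers are hypothetical.

```python
from statistics import mean, stdev


def flag_anomalies(actual, predicted, reference_residuals, z_threshold=3.0):
    """Flag time steps whose residual is unusual compared to a reference period.

    Assumption: residuals (actual - predicted) from a healthy reference
    period describe normal model error; a z-score above the threshold
    (a hypothetical choice) marks a potential anomaly.
    """
    mu = mean(reference_residuals)
    sigma = stdev(reference_residuals)  # sample standard deviation
    flags = []
    for a, p in zip(actual, predicted):
        z = ((a - p) - mu) / sigma
        flags.append(abs(z) > z_threshold)
    return flags


# Made-up bearing temperatures in degrees Celsius, for illustration only.
reference_residuals = [-1.0, 0.0, 1.0, 0.0, -1.0, 1.0, 0.0, 0.0]
actual = [60.0, 61.0, 70.0]
predicted = [60.0, 60.0, 60.0]

flags = flag_anomalies(actual, predicted, reference_residuals)
print(flags)  # only the last time step deviates strongly
```

In this sketch, any model that produces `predicted` could be plugged in, which is why the predictions of an MLP and a TFT can be compared within the same scheme.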

We relied on Azure as the cloud platform for our project. Azure ML Pipelines and MLflow enabled smooth and reproducible execution of the model runs. The TFT model was implemented in PyTorch, and the code was shared and stored via Azure DevOps. The data basis consisted of the SCADA data of a wind farm with 40 wind turbines; data were available from 2018 onward. For the training of the models, data from April 2018 to April 2019 were used. The evaluation was carried out on data from 2023.

A direct assessment of the anomaly detection capability was not possible because deviation statistics for the TFT were missing and the data contained no anomaly labels. To assess the performance of the models nevertheless, we worked with the experts from scieneers to compare the predictions of the TFT and the MLP for three use cases (active power, rotor bearing temperature, and gear bearing temperature), visually for selected days and quantitatively by means of the mean absolute error (MAE). This does not permit a direct conclusion about the ability to detect anomalies, but it does provide strong evidence for it. To share the results efficiently and enable a comprehensive analysis, a dashboard was created in Python with Altair (Fig. 1 and Fig. 2).
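
For readers unfamiliar with the metric, the MAE is the average absolute difference between predicted and actual values. The following sketch shows how two models can be compared on the same evaluation data. The values are made up for illustration and are not results from the PoC; `model_a` and `model_b` are placeholder names.

```python
def mean_absolute_error(actual, predicted):
    """Return the mean of |actual - predicted| over all time steps."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Illustrative active power values in kW (not measured data).
actual = [1500.0, 1600.0, 1700.0]
model_a = [1490.0, 1610.0, 1700.0]
model_b = [1480.0, 1630.0, 1690.0]

mae_a = mean_absolute_error(actual, model_a)
mae_b = mean_absolute_error(actual, model_b)
print(f"model_a MAE: {mae_a:.2f} kW, model_b MAE: {mae_b:.2f} kW")
```

A lower MAE means the predictions are closer to the measurements on average. It says nothing directly about how well anomalies are detected, which is why the comparison serves only as indirect evidence.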

Does the TFT model deliver what it promises?

The results were highly promising. For all use cases, the TFT model yielded predictions similar to those of the previous MLP model. For the target variable “Active power”, the only one for which time allowed us to optimize the TFT hyperparameters within the PoC, the predictions were demonstrably more accurate overall. Particularly remarkable, however, was the TFT’s ability, via data fusion, to achieve results comparable to those of the MLP (12 months of training data) even for plants with an artificially reduced data history (three months instead of 12). This confirms the hypothesis that so-called “cold start” problems (i.e. a small data basis when plants are commissioned), which often occur in the energy industry, can be mitigated by means of TFTs. Another great advantage is that a single multivariate TFT model, for example in the gear bearing temperature use case (~12 target variables), could replace up to 480 MLP models currently in operation (a figure consistent with one model per target variable for each of the 40 turbines), which would significantly reduce the complexity of maintaining and operating the system.
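
The cold-start experiment can be pictured as pooling the training data of all turbines into one set while cutting the history of selected turbines to a shorter window. The sketch below is an assumption about how such a set could be assembled, not the PoC code. The function `build_training_set`, the turbine IDs, the monthly granularity, and the choice of the last three months as the reduced window are all hypothetical.

```python
from datetime import date


def build_training_set(records, cold_start_turbines, full_start, reduced_start, end):
    """Pool records of all turbines into one training set.

    Turbines listed in cold_start_turbines only contribute data from
    reduced_start onward, which simulates a short data history.
    Each record is a (turbine_id, day, value) tuple.
    """
    pooled = []
    for turbine_id, day, value in records:
        start = reduced_start if turbine_id in cold_start_turbines else full_start
        if start <= day < end:
            pooled.append((turbine_id, day, value))
    return pooled


# One made-up record per month from April 2018 to March 2019 for two turbines.
months = [date(2018 + (3 + i) // 12, (3 + i) % 12 + 1, 1) for i in range(12)]
records = [(tid, day, 0.0) for tid in ("T01", "T02") for day in months]

training_set = build_training_set(
    records,
    cold_start_turbines={"T02"},
    full_start=date(2018, 4, 1),
    reduced_start=date(2019, 1, 1),  # three months of history (assumption)
    end=date(2019, 4, 1),
)
counts = {tid: sum(1 for r in training_set if r[0] == tid) for tid in ("T01", "T02")}
print(counts)
```

A single model trained on such a pooled set sees the long histories of the other turbines as well, which is the intuition behind mitigating the cold-start problem through data fusion.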

Further analyses, such as a direct assessment of anomaly detection or the transferability of a TFT model across different wind farms, are yet to be carried out. This first field test, however, allows the conclusion that TFT models do have the potential to improve the existing anomaly detection at Iqony. The transformer architecture in the form of TFT models may thus open up new perspectives and enable companies like Iqony to further optimize their existing systems and processes in terms of efficiency and effectiveness.

About scieneers GmbH

The scieneers are a team of more than 30 experts who have made it their business to create value from data. The word “science” in their name stands not only for sound knowledge of mathematical-statistical models for data analysis but also for the aspiration to always keep in view what is theoretically possible and the technological state of the art. “Engineering”, in turn, means consistently implementing what is feasible and providing reliable, scalable solutions.

The scieneers’ range of services extends from conventional business intelligence systems through the application of mathematical-statistical methods and machine learning to automated text, image, and language comprehension by means of artificial intelligence. The scieneers provide support along the entire digital value chain: generating ideas, deriving value-creating business cases, developing data structures and models, implementing the solution, and integrating it seamlessly into the clients’ operational IT systems and business processes.

Take the opportunity to exchange ideas with our expert now!

Dr. Peter Deeskow

Head of R&D
