This paper presents an interpretable machine learning model for anomaly detection in door locks using torque data. The model aims to replace the human tactile sense in the quality control process, reducing repetitive tasks and improving reliability. The model achieved an accuracy of 96%; however, to gain social acceptance and operators' trust, interpretability of the model is crucial. The purpose of this study was to evaluate an approach that can improve the interpretability of anomalous classifications obtained from an anomaly detection model. We evaluate four instance-based counterfactual explanators, three of which employ optimization techniques and one of which uses a less complex weighted nearest neighbor approach that serves as our baseline. The former approaches leverage a latent representation of the data, obtained with a weighted principal component analysis, which improves the plausibility of the counterfactual explanations and reduces computational cost. The explanations are presented together with the 5-50-95th percentile range of the training data, which acts as a frame of reference to improve interpretability. All approaches successfully presented valid and plausible counterfactual explanations. However, the instance-based approaches employing optimization techniques yielded explanations with greater similarity to the observations and were therefore concluded to be preferable, despite their higher execution times (4-16 s) compared to the baseline approach (0.1 s). The findings of this study hold significant value for the lock industry and can potentially be extended to other industrial settings using time series data, serving as a valuable point of departure for further research.
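As an illustration of the percentile frame of reference mentioned above, the following minimal sketch computes a 5-50-95th percentile band per time step from a set of equal-length training curves and reports where a new curve leaves that band. The function names and the assumption that all torque curves are resampled to the same length are hypothetical and not taken from the paper.

```python
from statistics import quantiles


def percentile_band(curves, lower=5, upper=95):
    """Per-time-step (lower, median, upper) percentiles over equal-length curves.

    Assumption: every training curve has the same number of samples.
    """
    band = []
    for samples in zip(*curves):
        # quantiles(..., n=100) returns the 99 cut points for percentiles 1..99
        cuts = quantiles(samples, n=100, method="inclusive")
        band.append((cuts[lower - 1], cuts[49], cuts[upper - 1]))
    return band


def outside_band(curve, band):
    """Indices of the time steps where the curve falls outside the band."""
    return [
        i
        for i, (value, (low, _median, high)) in enumerate(zip(curve, band))
        if value < low or value > high
    ]
```

Plotting an anomalous observation and its counterfactual on top of such a band is one way to give the reader a visual reference for what typical training data look like.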
Historically, cylinder locks’ quality has been tested manually by human operators after full assembly. The frequency and the characteristics of the testing procedure for these locks wear the operators’ wrists and lead to varying results of the quality control. The consistency in the quality control is an important factor for the expected lifetime of the locks which is why the industry seeks an automated solution. This study evaluates how consistently the operators can classify a collection of locks, using their tactile sense, compared to a more objective approach, using torque measurements and Machine Learning (ML). These locks were deliberately chosen because they are prone to get inconsistent classifications, which means that there is no ground truth of how to classify them. The ML algorithms were therefore evaluated with two different labeling approaches, one based on the results from the operators, using their tactile sense to classify into ‘working’ or ‘faulty’ locks, and a second approach by letting an unsupervised learner create two clusters of the data which were then labeled by an expert using visual inspection of the torque diagrams. The results show that an ML-solution, trained with the second approach, can classify mechanical anomalies, based on torque data, more consistently compared to operators, using their tactile sense. These findings are a crucial milestone for the further development of a fully automated test procedure that has the potential to increase the reliability of the quality control and remove an injury-prone task from the operators.
Real-time engine condition monitoring and fault diagnostics result in reduced operating and maintenance costs and increased component and engine life. Prediction of faults can change the maintenance model of a system from a fixed maintenance interval to a condition-based maintenance interval, further decreasing the total cost of ownership of a system. Technologies for engine health monitoring and advanced diagnostics are generally developed for larger gas turbines and generally focus on a single system; no solutions are publicly available for engine fleets. This paper presents a concept for fleet monitoring finely tuned to the specific needs of micro gas turbines. The proposed framework includes a physics-based model and a data-driven model with machine learning capabilities for predicting system behaviour, combined with a diagnostic tool for anomaly detection and classification. The integrated system will develop advanced diagnostics and condition monitoring for gas turbines with a power output under 100 kW.
Heavy-duty machines are equipment constructed for working under rough conditions, and their design is meant to withstand heavy workloads. However, the technical development of cheap electronic components over the last decades has led to an increase of electrical systems in the traditionally mainly mechanical systems of heavy-duty machines. As the complexity of these machines increases, so does the complexity of detecting and diagnosing machine faults. At the same time, the addition of new electrical systems, such as on-board computational power and telematics, makes it possible to add new sensors that measure signals relevant for fault detection and diagnosis, and to process signals on-board or off-board the machines.
In this thesis, we address the diagnostic problem by investigating data-driven methods for remote diagnosis of heavy-duty machines, where a part of the analysis is performed on-board the machine (fault detection), while another part is performed off-board the machine (fault classification). We propose a diagnostic framework where we use a novel combination of methods for each step in the diagnosis. On-board the machine, we use logistic regression as an anomaly detector to detect faults, which yields a stream of individual cases classified as anomalous or not. Then, either on-board or off-board, we can use a probabilistic anomaly detector to identify whether the stream of cases is truly anomalous when we look at the stream of cases as a group. The anomalous group of cases is called a composite case. Thereafter, off-board the machine, each anomalous individual case is classified into a fault type using a case-based reasoning approach to fault diagnosis. In the final step, we fuse the individual classifications into a single aggregated classification for the composite case. In order to be able to assess the reliability of a diagnosis, we also propose a novel case-based approach to estimating the reliability of probabilistic predictions. It can, for instance, be used for assessing the confidence of the classification of a composite case given historical data of the predictive reliability.
This paper describes an evaluation of five machine learning algorithms for predicting the domestic space and hot-water heating production for the next day. The evaluated algorithms were the k-nearest neighbour algorithm, linear regression, regression tree, decision table and support vector machine regression. The hot water production was measured in the ME3Gas project, where data was collected from two Swedish households that use the same type of geothermal heat pumps for space heating and hot-water production. The evaluation consisted of four experiments where we compared the regression performance by varying the number of previous days and the number of time periods for each day as input features. In the experiments, the k-nearest neighbour algorithm, linear regression and support vector machine regression had the best performance.
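The idea of using a varying number of previous days as input features can be sketched as follows. This is a simplified illustration and not the experimental setup of the paper: it assumes one aggregated value per day (the paper also varies the number of time periods per day), uses plain Euclidean distance, and the names `make_lagged` and `knn_predict` are hypothetical.

```python
def make_lagged(series, n_lags):
    """Build (features, target) pairs where the previous n_lags days predict the next day."""
    features = [series[i - n_lags:i] for i in range(n_lags, len(series))]
    targets = [series[i] for i in range(n_lags, len(series))]
    return features, targets


def knn_predict(features, targets, query, k=3):
    """Average the targets of the k training rows closest to the query."""
    def distance(a, b):
        # Euclidean distance between two equally long feature rows
        return sum((u - v) ** 2 for u, v in zip(a, b)) ** 0.5

    nearest = sorted(range(len(features)), key=lambda i: distance(features[i], query))[:k]
    return sum(targets[i] for i in nearest) / len(nearest)
```

Changing `n_lags` corresponds to varying the number of previous days used as input, which is one of the two dimensions the experiments explore.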
Many approaches used for diagnostics today are based on a precise model. This excludes diagnostics of many complex types of machinery that cannot be modelled and simulated easily or without great effort. Our aim is to show that by including human experience it is possible to diagnose complex machinery when no models or simulations, or only limited ones, are available. This also enables diagnostics in a dynamic application where conditions change and new cases are often added. In fact, every newly solved case increases the diagnostic power of the system. We present a number of successful projects where we have used feature extraction together with case-based reasoning to diagnose faults in industrial robots, welding, and cutting machinery, and we also present our latest project for diagnosing transmissions by combining Case-Based Reasoning (CBR) with statistics. We view the fault diagnosis process as three consecutive steps. In the first step, sensor fault signals from machines and/or input from human operators are collected. Then, the second step consists of extracting relevant fault features. In the final diagnosis/prognosis step, status and faults are identified and classified. We view prognosis as a special case of diagnosis where the prognosis module predicts a stream of future features.
This paper describes a generic framework for explaining the prediction of a probabilistic classifier using preceding cases. Within the framework, we derive similarity metrics that relate the similarity between two cases to a probability model and propose a novel case-based approach to justifying a classification using the local accuracy of the most similar cases as a confidence measure. As a basis for deriving similarity metrics, we define similarity in terms of the principle of interchangeability: two cases are considered similar or identical if two probability distributions, derived from excluding either one or the other case in the case base, are identical. Thereafter, we evaluate the proposed approach for explaining the probabilistic classification of faults by logistic regression. We show that with the proposed approach, it is possible to find cases for which the accuracy of the classifier is very low and uncertain, even though the predicted class has high probability.
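The notion of using the local accuracy of the most similar cases as a confidence measure can be sketched roughly as below. Note the substitution: the paper derives its similarity metric from a probability model, whereas this sketch assumes plain Euclidean distance as a stand-in. The case layout `(features, predicted_label, true_label)` and the function name are hypothetical.

```python
from math import dist


def local_accuracy(cases, query, k=5):
    """Fraction of the k cases nearest to the query that the classifier got right.

    cases: list of (features, predicted_label, true_label) tuples.
    Assumption: Euclidean distance replaces the paper's model-based similarity.
    """
    nearest = sorted(cases, key=lambda case: dist(case[0], query))[:k]
    correct = sum(1 for _features, predicted, actual in nearest if predicted == actual)
    return correct / len(nearest)
```

A low value for a query whose predicted class has high probability would flag the kind of case the abstract describes: a confident prediction in a region where the classifier has historically been unreliable.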
This paper describes a generic framework for explaining the prediction of probabilistic machine learning algorithms using cases. The framework consists of two components: a similarity metric between cases that is defined relative to a probability model, and a novel case-based approach to justifying the probabilistic prediction by estimating the prediction error using case-based reasoning. As a basis for deriving similarity metrics, we define similarity in terms of the principle of interchangeability: two cases are considered similar or identical if two probability distributions, derived from excluding either one or the other case in the case base, are identical. Lastly, we show the applicability of the proposed approach by deriving a metric for linear regression and applying the approach to explain predictions of the energy performance of households.
This paper presents a novel, unsupervised approach to detecting anomalies at the collective level. The method probabilistically aggregates the contribution of the individual anomalies in order to detect significantly anomalous groups of cases. The approach is unsupervised in that its only input is a list of cases ranked according to their individual anomaly scores. Thus, any anomaly detection algorithm, supervised or unsupervised, can be used for scoring individual anomalies. The applicability of the proposed approach is shown by applying it to an artificial data set and to two industrial data sets, detecting anomalously moving cranes (model-based detection) and anomalous fuel consumption (neighbour-based detection).
This paper presents a generic approach to fault diagnosis of heavy-duty machines that combines signal processing, statistics, machine learning, and case-based reasoning for on-board and off-board analysis. The methods used complement each other in that the on-board methods are fast and lightweight, while case-based reasoning is used off-board for fault diagnosis and for retrieving cases as support in manual decision making. Three major contributions are novel approaches to detecting clutch slippage, anomaly detection, and case-based diagnosis that is closely integrated with the anomaly detection model. As an example application, the proposed approach has been applied to diagnosing the root cause of clutch slippage in automatic transmissions.
This paper presents a novel approach to fault diagnosis applied to a stream of cases. The approach uses a combination of case-based reasoning and information fusion to do classification. The approach consists of two steps. First, we perform local anomaly detection on-board a machine to identify anomalous individual cases. Then, we monitor the stream of anomalous cases using a stream anomaly detector based on a sliding window approach. When the stream anomaly detector identifies an anomalous window, the anomalous cases in the window are classified using a CBR classifier. Thereafter, the individual classifications are aggregated into a composite case with a single prediction using an information fusion method. We compare three information fusion approaches: simple majority vote, weighted majority vote and Dempster-Shafer fusion. As a baseline for comparison, we use the classification of the last identified anomalous case in the window as the aggregated prediction.
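The aggregation step can be illustrated with a minimal sketch of the two voting schemes and the last-case baseline named above. Dempster-Shafer fusion is not shown. The weights are assumed here to be some per-case confidence (for example, the similarity of the retrieved case); the paper's actual weighting is not specified in this abstract, and the function names and fault labels are hypothetical.

```python
from collections import Counter, defaultdict


def majority_vote(labels):
    """Return the label predicted for the most cases in the window."""
    return Counter(labels).most_common(1)[0][0]


def weighted_majority_vote(labels, weights):
    """Return the label with the largest summed weight.

    Assumption: weights are per-case confidences chosen by the caller.
    """
    totals = defaultdict(float)
    for label, weight in zip(labels, weights):
        totals[label] += weight
    return max(totals, key=totals.get)


def last_case_baseline(labels):
    """Baseline: use the classification of the last anomalous case in the window."""
    return labels[-1]
```

For a window classified as `["clutch", "clutch", "valve"]` with weights `[0.2, 0.2, 0.9]`, the simple vote selects `"clutch"` while the weighted vote and the baseline both select `"valve"`, which shows how the schemes can disagree on the same composite case.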
More and more industries aspire to improve their production using artificial intelligence. Machine learning (ML) is a powerful tool for making very accurate predictions, concept classification, intelligent control, maintenance predictions, and even fault and anomaly detection in real time. The use of machine learning models in industry means an increase in efficiency: energy savings, human resources efficiency, increased product quality, decreased environmental pollution, and many other advantages. In this chapter, we present two industrial applications of machine learning. In both cases we achieve interesting results that in practice can be translated into an increase in production efficiency. The solutions described cover prediction of production quality in an oil and gas refinery and predictive maintenance for micro gas turbines. The results of the experiments carried out show the viability of the solutions.
Fault diagnosis and prognosis of industrial equipment are becoming increasingly important for improving the quality of manufacturing and reducing the cost of product testing. This paper advocates that computer-based diagnosis systems can be built based on sensor information and by using case-based reasoning methodology. Intelligent signal analysis methods are outlined in this context. We then explain how case-based reasoning can be applied to support diagnosis tasks, and four application examples are given as illustration. Further, we discuss how CBR systems can be integrated with machine learning techniques to enhance their performance in practical scenarios.