Question
When a comprehensive training set is available, a supervised anomaly detection technique can typically outperform an unsupervised anomaly detection technique when performance is evaluated using measures such as the detection rate and the false alarm rate. However, in some cases, such as fraud detection, new types of anomalies are always developing. Performance can be evaluated according to the detection and false alarm rates because it is possible to determine whether an object (transaction) is anomalous. Discuss the relative merits of supervised and unsupervised anomaly detection under such conditions. Cite the sources you use to make your response.
Explanation / Answer
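The two measures named in the question can be made concrete with a short sketch. It assumes the usual definitions (detection rate = detected anomalies / all true anomalies; false alarm rate = normal objects flagged / all normal objects). The function name and the toy labels are hypothetical and chosen only for illustration.

```python
def detection_and_false_alarm_rate(y_true, y_pred):
    """Return (detection_rate, false_alarm_rate).

    y_true, y_pred: equal-length sequences of 0 (normal) or 1 (anomaly).
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # anomalies caught
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # anomalies missed
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # normal, not flagged

    detection_rate = tp / (tp + fn) if (tp + fn) else 0.0
    false_alarm_rate = fp / (fp + tn) if (fp + tn) else 0.0
    return detection_rate, false_alarm_rate


# Toy example: 8 normal transactions, 2 anomalous ones.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 0, 0, 0, 1, 0, 1, 0]
dr, far = detection_and_false_alarm_rate(y_true, y_pred)
print(dr, far)  # 0.5 0.125
```

Here one of the two anomalies is caught (detection rate 0.5) and one of the eight normal transactions is flagged (false alarm rate 0.125).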
Anomaly Detection: Detecting anomalies in real time is extremely valuable. Anomalies are not always bad or indicative of failure, but detecting them accurately can be very difficult. Effective anomaly detection requires a system that learns continuously. To detect anomalies early, one cannot wait for a metric to be obviously out of bounds; early detection requires the ability to detect subtle changes in patterns that are not obvious or easily detected. Because anomalies are by nature unexpected, an effective detection system must be able to determine whether new events are anomalous without relying on preprogrammed thresholds.
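As a rough illustration of "learning continuously" rather than using a fixed out-of-bounds limit, the sketch below keeps a running mean and standard deviation (Welford's method) and flags a value that lies more than k standard deviations from what has been seen so far. This is a deliberately simple stand-in, not the HTM method discussed in this answer; the class name, `k`, `warmup`, and the sample stream are all assumptions made for the example.

```python
import math


class RunningZScoreDetector:
    """Flags values far from the running mean; the limit adapts to the data."""

    def __init__(self, k=3.0, warmup=10):
        self.k = k            # how many standard deviations counts as anomalous
        self.warmup = warmup  # observations to see before flagging anything
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0         # running sum of squared deviations from the mean

    def update(self, x):
        """Score x against the statistics so far, then learn from x."""
        is_anomaly = False
        if self.n >= self.warmup:
            std = math.sqrt(self.m2 / (self.n - 1))
            if std > 0 and abs(x - self.mean) > self.k * std:
                is_anomaly = True
        # Welford's online update of mean and squared deviations.
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)
        return is_anomaly


stream = [10.0, 10.2, 9.9, 10.1, 9.8, 10.0, 10.3, 9.7, 10.1, 9.9,
          10.0, 10.2, 25.0, 10.1, 9.9]
detector = RunningZScoreDetector(k=3.0, warmup=10)
flags = [detector.update(x) for x in stream]
anomalous_indices = [i for i, flagged in enumerate(flags) if flagged]
print(anomalous_indices)  # [12]
```

Note that this sketch also learns from the values it flags, so a large outlier widens the band for later points; a production detector would need a policy for that.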
Anomaly Detection in Streaming Data: Anomaly detection in streaming data can be extremely valuable in many domains, such as IT security, finance, vehicle tracking, and health; essentially, in any application where sensors produce important data that changes over time. According to Numenta, applications based on HTM (Hierarchical Temporal Memory) offer significant improvements over existing methods of anomaly detection for streaming data. This method does not require training data or a separate training step, and its automatic model building and learning eliminate the need to manually define and maintain models and data sets. This vastly reduces the time and effort required of the user.
Alternative Anomaly Detection Methods: Anomaly detection is a broad field, and numerous anomaly detection methods are used in many different domains. The following discussion is largely confined to methods that are applicable to streaming data, and in particular to IT analytics. The "no free lunch" theorem postulates that, when averaged over all possible problems, no algorithm will perform better than all others; in other words, algorithms must be optimized for a particular problem or domain to work “better”. This property largely explains why there are so many anomaly detection methods: no one method works well in all possible domains. Even within one domain, it is impossible to build a perfect anomaly detection system, one that consistently points out only the anomalies you care about and avoids the ones you do not. However, Numenta states that HTM for IT has made advances in anomaly detection for IT streaming data that promise better performance than has been possible before, and that, according to HTM theory, the superior performance of HTM for IT can be duplicated in other domains using the same underlying algorithms.
Categorization of Anomaly Detection Methods:
Supervised: Anomaly detection methods can operate in either supervised or unsupervised modes. Supervised modes require a set of labeled training data, so that the anomaly detector can compare this “truth” data to incoming data in order to determine anomalies.
Supervised machine learning techniques have the following disadvantages when applied to IT analytics. Supervised models do not adapt automatically as patterns change: to learn new patterns, a new model has to be built with labeled data, so these models are not suitable for dynamic, high-velocity data. The labeling process can be very labor-intensive and must be repeated periodically to learn new patterns. It can also be error-prone, and mistakes in labeling can cause poor model performance. It is hard, if not impossible, to capture the “unknown unknowns” with supervised methods.
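The "unknown unknown" problem can be seen in a toy fraud example. The sketch below trains a nearest-centroid classifier (chosen only because it is short; any supervised classifier would serve) on labeled transactions described by two hypothetical features, amount and transactions per hour. The data, feature choice, and function names are assumptions for illustration. The only fraud pattern in the training labels is "very large amount", so a new pattern of many tiny transactions is assigned to the closest known class, which is "normal".

```python
import math


def centroid(points):
    """Component-wise mean of a list of equal-length tuples."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))


def train_nearest_centroid(samples, labels):
    """Return {label: centroid} computed from the labeled training data."""
    by_label = {}
    for sample, label in zip(samples, labels):
        by_label.setdefault(label, []).append(sample)
    return {label: centroid(pts) for label, pts in by_label.items()}


def predict(model, x):
    """Assign x to the label whose centroid is closest (Euclidean distance)."""
    return min(model, key=lambda label: math.dist(x, model[label]))


# Features: (amount, transactions per hour). Labels come from past investigations.
train_X = [(40.0, 1.0), (55.0, 2.0), (60.0, 3.0), (45.0, 2.0),
           (4800.0, 1.0), (5200.0, 1.0)]
train_y = ["normal", "normal", "normal", "normal", "fraud", "fraud"]
model = train_nearest_centroid(train_X, train_y)

known_fraud = (5100.0, 1.0)  # same pattern as the labeled fraud
novel_fraud = (1.0, 60.0)    # new pattern: many tiny transactions

print(predict(model, known_fraud))  # fraud
print(predict(model, novel_fraud))  # normal (missed)
```

The model detects the fraud type it was trained on but misses the new one until someone labels examples of it and retrains.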
Unsupervised: Unsupervised modes do not require labeled training sets. Anomaly detection techniques in this mode are much more flexible and easy to use, since they do not require upfront human intervention and training. Anomaly detection methods that run in supervised mode include rule-based methods and model-based approaches such as replicator neural networks, Bayesian methods, or unsupervised support vector machines.
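For contrast, here is a minimal sketch of an unsupervised detector that needs no labels: it estimates the mean and standard deviation of each feature from unlabeled history (assumed to be mostly normal) and flags any transaction whose largest per-feature z-score exceeds a cutoff. The features (amount, transactions per hour), the data, the cutoff of 3.0, and the function names are all assumptions for illustration; this is a generic z-score approach, not a method taken from the cited source.

```python
import statistics


def fit(history):
    """Per-feature (mean, sample standard deviation) from unlabeled data."""
    columns = list(zip(*history))
    return [(statistics.mean(col), statistics.stdev(col)) for col in columns]


def anomaly_score(stats, x):
    """Largest absolute z-score across the features of x."""
    return max(abs(value - mean) / std for value, (mean, std) in zip(x, stats))


def is_anomalous(stats, x, cutoff=3.0):
    return anomaly_score(stats, x) > cutoff


# Unlabeled past transactions: (amount, transactions per hour).
history = [(40.0, 1.0), (55.0, 2.0), (60.0, 3.0), (45.0, 2.0),
           (50.0, 2.0), (48.0, 1.0), (52.0, 3.0), (58.0, 2.0)]
stats = fit(history)

typical = (52.0, 2.0)
large_amount = (5100.0, 1.0)  # an already familiar kind of fraud
many_tiny = (1.0, 60.0)       # a kind of fraud never seen before

print(is_anomalous(stats, typical))       # False
print(is_anomalous(stats, large_amount))  # True
print(is_anomalous(stats, many_tiny))     # True
```

Because the detector only models what is usual, it flags both kinds of unusual transaction without having seen either. The cost is that unusual but legitimate transactions are flagged too, which tends to raise the false alarm rate.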
Source: The Science of Anomaly Detection (Numenta)