March 9, 2023. 7 min

Concept Drift

Machine Learning has proved to be a lucrative innovation, generating substantial profits for many organizations, and is expected to continue its upward trajectory in the future. Nonetheless, several challenges persist, impeding the successful integration of Machine Learning applications into production environments. One of them is “Concept Drift”.

What is Concept Drift?

In machine learning, the term “concept drift” describes a phenomenon where the relationship between input and output data in a given problem changes over time. It is often used loosely alongside related terms such as “nonstationarity”, “covariate shift”, and “dataset shift”, although strictly speaking covariate shift refers to a change in the input distribution alone. Concept drift means that the problem being addressed may evolve in a significant way, leading to inaccurate and inconsistent predictions. It is therefore important to monitor and adapt machine learning models so that they remain effective as real-world conditions evolve.

For example, consider a model that has been deployed to forecast the likelihood of a prospective customer purchasing a particular product based on their historical browsing patterns. While the model demonstrates a high level of accuracy initially, over the course of time, it begins to deteriorate markedly. This could be because the customers' preferences have changed over time.

The core assumption when dealing with the concept drift problem is uncertainty about the future. We assume that the source of the target instance is not known with certainty. It can be assumed, estimated or predicted but there is no certainty.
Concept Drift Detection for Streaming Data, 2010.

Causes of Concept Drift

Concept drift can occur due to various reasons, such as:

  1. Changes in user behavior: Users may change their preferences, habits, needs, or expectations over time due to various factors such as age, income, education, culture, etc.
  2. Changes in environment: The external factors that affect the data may change over time due to natural events (weather, disasters, epidemics) or human events (policies, regulations, marketing campaigns).
  3. Changes in data collection: The way the data is collected may change over time due to technical issues (such as sensor failures) or operational issues (such as changes in sampling methods).

What does a model learn, and what changes?

Technically, machine learning modeling approximates a mapping function f from input data X to an output value y, i.e. y = f(X), in order to make accurate predictions. When the mapping is static, meaning that the relationship between input and output variables is the same for historical and future data, the model may not need to be updated over time. In most real-world scenarios, however, this is not the case, and the predictive performance of the model deteriorates.
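To make the idea of a changing mapping concrete, here is a minimal sketch in plain Python. The two linear concepts (y = 2x + 1 before drift and y = 0.5x + 4 after) and the helper names `fit_line` and `mean_abs_error` are assumptions chosen for illustration only.

```python
# Hypothetical illustration: fit a model on one concept, then evaluate it
# after the input-output relationship has changed.

def fit_line(xs, ys):
    """Ordinary least squares fit of y = a * x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    a = cov / var
    b = mean_y - a * mean_x
    return a, b

def mean_abs_error(model, xs, ys):
    """Average absolute prediction error of the fitted line."""
    a, b = model
    return sum(abs(a * x + b - y) for x, y in zip(xs, ys)) / len(xs)

xs = [float(i) for i in range(10)]
old_ys = [2.0 * x + 1.0 for x in xs]   # concept at training time (assumed)
new_ys = [0.5 * x + 4.0 for x in xs]   # concept after drift (assumed)

model = fit_line(xs, old_ys)
error_before = mean_abs_error(model, xs, old_ys)
error_after = mean_abs_error(model, xs, new_ys)
print("error on old concept:", error_before)
print("error on new concept:", error_after)
```

The inputs are identical in both evaluations. Only the mapping from X to y has changed, and that alone is enough to make the fitted model inaccurate.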


Types of Concept Drift

Some common changes are:

  1. Gradual change over time
  2. Recurring change
  3. Sudden change
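The three patterns above can be sketched as synthetic streams. This is only one possible reading: the function names, the change points, the target level of 4.0, and the choice to model gradual change as a linear ramp are all assumptions made for illustration.

```python
import random

def sudden_mean(t, change_at=100):
    """Mean jumps from 0 to 4 at a single point in time."""
    return 0.0 if t < change_at else 4.0

def gradual_mean(t, start=50, end=150):
    """Mean moves linearly from 0 to 4 between start and end."""
    if t < start:
        return 0.0
    if t >= end:
        return 4.0
    return 4.0 * (t - start) / (end - start)

def recurring_mean(t, period=50):
    """Mean alternates between 0 and 4 every `period` steps."""
    return 0.0 if (t // period) % 2 == 0 else 4.0

def make_stream(mean_fn, length=200, noise=1.0, seed=0):
    """Gaussian noise around a time-dependent mean."""
    rng = random.Random(seed)
    return [mean_fn(t) + rng.gauss(0.0, noise) for t in range(length)]

streams = {
    "sudden": make_stream(sudden_mean),
    "gradual": make_stream(gradual_mean),
    "recurring": make_stream(recurring_mean),
}
for name, values in streams.items():
    first = sum(values[:50]) / 50
    last = sum(values[-50:]) / 50
    print(name, "mean of first 50:", round(first, 2), "mean of last 50:", round(last, 2))
```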

Before handling the drift, it is crucial to identify its source in the data. Effective machine learning requires understanding the subtle distinctions between adaptive learning algorithms, because some algorithms are better suited to particular types of change than others.

How to Deal with Concept Drift?


Model Monitoring

Simply put, machine learning model monitoring is the process of keeping an eye on how well a model performs once it has been deployed. It is an important part of an MLOps system, helping to ensure that the model remains accurate and useful over time. The typical monitoring steps are as follows.

Change Detection

Once a model is in production, the monitoring system watches for concept drift. When drift is detected, it can fire a trigger that signals the change and starts a new round of learning. Depending on the problem, there are different methods for detecting concept drift.

Performance-based methods

These methods measure metrics such as accuracy, precision, recall, or F1-score and compare them with thresholds or baselines. If a metric crosses its threshold (for example, accuracy falling below a baseline, or an error rate rising above one), this indicates possible concept drift.
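A performance-based check might look like the following minimal sketch. It assumes that true labels arrive shortly after each prediction. The class name `AccuracyMonitor`, the window size of 50, and the 0.75 accuracy floor are hypothetical choices, not recommended values.

```python
from collections import deque

class AccuracyMonitor:
    """Flags possible drift when rolling accuracy drops below a floor."""

    def __init__(self, window_size=50, min_accuracy=0.75):
        self.window = deque(maxlen=window_size)  # 1 = correct, 0 = wrong
        self.min_accuracy = min_accuracy

    def update(self, prediction, label):
        """Record one outcome and return True if drift is suspected."""
        self.window.append(1 if prediction == label else 0)
        if len(self.window) < self.window.maxlen:
            return False  # not enough evidence yet
        accuracy = sum(self.window) / len(self.window)
        return accuracy < self.min_accuracy

# Simulated outcomes: the model is right for 100 steps, then always wrong.
monitor = AccuracyMonitor()
first_alarm = None
for step in range(200):
    label = 1
    prediction = 1 if step < 100 else 0
    if monitor.update(prediction, label):
        first_alarm = step
        break
print("first alarm at step:", first_alarm)
```

A rolling window keeps the check responsive: old successes eventually fall out of the window, so a recent decline is not averaged away by a long history of good predictions.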

Data-based methods

These methods analyze the statistical properties of the input data (features) or the output data (labels) and test whether they have changed significantly over time. Common tools include the Kolmogorov-Smirnov test, the chi-square test, and the Kullback-Leibler divergence.
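As an illustration of a data-based check, the sketch below implements a two-sample Kolmogorov-Smirnov comparison in plain Python. It uses the standard large-sample critical value, so it is an approximation for small windows. The function names and the reference/current windows are hypothetical. In practice a library routine such as `scipy.stats.ks_2samp` would normally be preferred.

```python
import math

def ks_statistic(reference, current):
    """Largest gap between the empirical CDFs of the two samples."""
    ref = sorted(reference)
    cur = sorted(current)
    i = j = 0
    largest_gap = 0.0
    while i < len(ref) and j < len(cur):
        value = min(ref[i], cur[j])
        # Advance both pointers past every observation <= value.
        while i < len(ref) and ref[i] <= value:
            i += 1
        while j < len(cur) and cur[j] <= value:
            j += 1
        largest_gap = max(largest_gap, abs(i / len(ref) - j / len(cur)))
    return largest_gap

def ks_drift(reference, current, alpha=0.05):
    """True if the samples differ at significance level alpha (asymptotic)."""
    n, m = len(reference), len(current)
    critical = math.sqrt(-0.5 * math.log(alpha / 2)) * math.sqrt((n + m) / (n * m))
    return ks_statistic(reference, current) > critical

reference_window = [float(v) for v in range(100)]        # feature values at training time
unchanged_window = [float(v) for v in range(100)]        # same distribution
shifted_window = [float(v) + 50.0 for v in range(100)]   # distribution moved upward

print("unchanged flagged:", ks_drift(reference_window, unchanged_window))
print("shifted flagged:", ks_drift(reference_window, shifted_window))
```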

Using specialized drift detection techniques

Some examples of concept drift detection algorithms are:

  1. ADWIN (ADaptive WINdowing)
  2. PH (Page-Hinkley)
  3. DDM (Drift Detection Method)
  4. EDDM (Early Drift Detection Method)
  5. HDDM (Hoeffding Drift Detection Method)
  6. KSWIN (Kolmogorov-Smirnov WINdowing)

There are many more concept drift detection algorithms available, and the choice of algorithm depends on the specific requirements and characteristics of the problem at hand.
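To give a feel for how such a detector works internally, here is a minimal sketch of the Page-Hinkley test for an upward shift in the mean. It is a simplified version written for illustration, not the scikit-multiflow implementation (it omits, for example, any forgetting factor). The parameter values and the synthetic stream are assumptions.

```python
class PageHinkley:
    """Simplified Page-Hinkley test for an increase in the stream mean."""

    def __init__(self, delta=0.005, threshold=50.0, min_samples=30):
        self.delta = delta              # tolerated magnitude of change
        self.threshold = threshold      # alarm level (lambda)
        self.min_samples = min_samples  # warm-up before alarms are allowed
        self.n = 0
        self.mean = 0.0
        self.cumulative = 0.0           # running sum of deviations from the mean
        self.minimum = 0.0              # smallest cumulative sum seen so far

    def update(self, value):
        """Add one observation and return True if a change is signaled."""
        self.n += 1
        self.mean += (value - self.mean) / self.n
        self.cumulative += value - self.mean - self.delta
        self.minimum = min(self.minimum, self.cumulative)
        if self.n < self.min_samples:
            return False
        return self.cumulative - self.minimum > self.threshold

# Synthetic stream (assumed): constant at 0, then constant at 4 from index 100.
stream = [0.0] * 100 + [4.0] * 100
detector = PageHinkley()
first_alarm = None
for index, value in enumerate(stream):
    if detector.update(value):
        first_alarm = index
        break
print("change signaled at index:", first_alarm)
```

The detector accumulates how far each observation sits above the running mean. A sustained upward shift makes that sum climb away from its historical minimum until it crosses the threshold.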

Data

Once drift is detected, a new dataset needs to be prepared. Careful thought must be given to how the new data is integrated into model training, so that the model learns to adapt to changes over time and discards information that is no longer relevant.

Model Selection/Retraining

The model needs to update its knowledge continually with each new sample, while using a forgetting mechanism that helps it adjust to the latest concept so that its predictions remain accurate and up to date. Depending on the problem domain, this means either retraining the same model or designing a different one. Choosing a different model may be suitable for domains where sudden changes are expected and have occurred before, and can therefore be anticipated and verified. Retraining the same model is generally done for gradual concept drift.
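One simple forgetting mechanism is an exponentially weighted update, in which each new sample pulls the estimate toward itself and older samples fade away. The sketch below compares it with a plain average over the whole history. The class name `ForgettingMean`, the forgetting rate of 0.1, and the synthetic stream are assumptions for illustration. A real model would apply the same idea to its parameters or to its training window.

```python
class ForgettingMean:
    """Predicts the target with an exponentially weighted running mean."""

    def __init__(self, forgetting=0.1):
        self.forgetting = forgetting  # higher = forgets old data faster
        self.estimate = None

    def learn_one(self, y):
        """Update the estimate with a single new observation."""
        if self.estimate is None:
            self.estimate = y
        else:
            self.estimate += self.forgetting * (y - self.estimate)

    def predict(self):
        return self.estimate

# Synthetic target (assumed): 0 for 100 steps, then 4 for 50 steps.
stream = [0.0] * 100 + [4.0] * 50

adaptive = ForgettingMean(forgetting=0.1)
for y in stream:
    adaptive.learn_one(y)

static_mean = sum(stream) / len(stream)  # no forgetting: all history weighs equally

print("estimate with forgetting:", round(adaptive.predict(), 3))
print("estimate without forgetting:", round(static_mean, 3))
```

The choice of forgetting rate is a trade-off: a high rate adapts quickly to a new concept but is noisier, while a low rate is stable but slow to react.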

Evaluate, Iterate and Automate

Have an evaluation strategy ready for the new model so that you can perform error analysis. The appropriate error analysis depends on the problem domain and requirements, and varies from case to case. Repeat the whole procedure each time drift is detected. Finally, automate as much as the use case allows, since automation should be the core function of any monitoring solution.

Example

Let’s look at drift detection with ADWIN (ADaptive WINdowing). ADWIN keeps a sliding window of the most recent observations, and the size of this window is not fixed: it grows while the data looks stationary and shrinks when a change is detected. The algorithm repeatedly checks whether the window can be split into two sub-windows whose means differ by more than a threshold. That threshold is derived from a user-defined confidence parameter (delta). When the absolute difference between the two sub-window means exceeds the threshold, a change is signaled and the older sub-window is dropped. This method is applicable to univariate data.
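To show the mechanics of the sub-window comparison, here is a deliberately naive sketch of the idea. It checks every split of the window on every update and uses a Hoeffding-style bound, which assumes values lie in [0, 1]. It is not the scikit-multiflow implementation, which stores the window in compressed form and uses a different bound. The class name `SimpleAdwin` and the synthetic stream are assumptions.

```python
import math

class SimpleAdwin:
    """Naive adaptive window: drops old data when sub-window means differ."""

    def __init__(self, delta=0.002):
        self.delta = delta   # confidence parameter
        self.window = []     # values assumed to lie in [0, 1]

    def _has_cut(self):
        """True if some split of the window has significantly different means."""
        n = len(self.window)
        total = sum(self.window)
        delta_prime = self.delta / n
        head_sum = 0.0
        for n0 in range(1, n):
            head_sum += self.window[n0 - 1]
            n1 = n - n0
            mean0 = head_sum / n0
            mean1 = (total - head_sum) / n1
            m = 1.0 / (1.0 / n0 + 1.0 / n1)  # harmonic-mean style sample size
            eps_cut = math.sqrt(math.log(4.0 / delta_prime) / (2.0 * m))
            if abs(mean0 - mean1) >= eps_cut:
                return True
        return False

    def update(self, value):
        """Add one value; shrink the window and return True on a change."""
        self.window.append(value)
        changed = False
        while len(self.window) > 1 and self._has_cut():
            self.window.pop(0)  # drop the oldest observation
            changed = True
        return changed

# Synthetic stream (assumed): 0.2 for 100 steps, then 0.8.
stream = [0.2] * 100 + [0.8] * 100
detector = SimpleAdwin()
first_change = None
for index, value in enumerate(stream):
    if detector.update(value) and first_change is None:
        first_change = index
print("first change signaled at index:", first_change)
print("window length at end of stream:", len(detector.window))
```

Note that the window length is never set by the user. It is whatever length is still statistically consistent with a single mean.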

Let's walk through a simple example in Python using the ADWIN implementation from the scikit-multiflow library.


# Imports
import numpy as np
import matplotlib.pyplot as plt
from skmultiflow.drift_detection.adwin import ADWIN

# Create the detector with its default settings
adwin = ADWIN()

Let's say there is a data stream with a uniform distribution:

# Simulating a data stream as a uniform distribution of range -3 to 3
x = np.linspace(0, 200, 200)
data_stream = np.random.randint(low = -3, high = 4, size=200)

Data with uniform distribution


After some time, the range of the distribution changes, signifying a shift in the data:

# Changing the data concept from index 80 to 200
for i in range(80, 200):
  data_stream[i] = data_stream[i] + 4

Data with uniform distribution range change


Test whether the ADWIN algorithm can successfully detect the drift:

# Adding stream elements to ADWIN and verifying if drift occurred
for i in range(200):
  adwin.add_element(data_stream[i])
  if adwin.detected_change():
    print('Change detected in data: ' + str(data_stream[i]) + ' - at index: ' + str(i))
    break

Output:
Change detected in data: 7 - at index: 97
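So the algorithm reports a significant change in concept at index 97, shortly after the shift introduced at index 80. Because the stream is generated randomly without a fixed seed, the exact index and value may vary from run to run.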

Conclusion

Concept drift is a widespread challenge in the fields of machine learning and artificial intelligence that requires careful attention and adaptation. It can have a significant impact on the performance and dependability of models that are implemented in dynamic environments where data undergoes changes over time. Various methods exist for detecting and addressing concept drift, but there is no single approach that is universally effective in all situations. As a result, it is vital to comprehend the root causes, various types, and distinct characteristics of concept drift and select the most suitable approach for each specific problem domain.

I hope this blog has provided you with valuable insights into concept drift. If you have any queries or feedback, please feel free to share them in the comments section below. Thank you for taking the time to read it!

References
  1. skmultiflow.drift_detection.ADWIN
  2. Concept Drift Detection for Streaming Data
