Introduction to Real-Time Machine Learning
This note is a summary of a blog post
Key Insights
- Real-time machine learning can be used to get real-time predictions
- Features can come from the event itself or from a feature store
- Stateful vs. stateless and slow-changing vs. fast-changing features require different handling
Article Summary
What is real-time machine learning?
Move from processing data in batches to processing data continuously in a stream
Use of machine learning to make predictions in real-time
A response to a synchronous request is expected to return immediately, instead of the request being queued and processed later in a batch.
Example: Fraud detection - Visa tries to detect fraudulent transactions
Example of a batch processing prediction:
- During the day, transactions are accumulated into a data warehouse
- Nightly, an orchestrator kicks off a batch job to process the data (transformation into features, making predictions in batch)
- For transactions predicted to be fraudulent, an alert is raised. BUT: Transactions have already been processed and reversing them might be difficult
Example of a real-time prediction:
- A transaction at a POS system triggers a request to a prediction service
- The prediction service retrieves all relevant features for the model:
  - Some features are computed in real time
  - Other features are queried from the online feature store
- Features are passed to the model endpoint for prediction -> Fraudulent transactions are prevented from going through
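The real-time flow above can be sketched in a few lines of Python. This is a toy illustration, not the article's implementation: the feature-store contents, key scheme, and threshold "model" are all assumptions.

```python
import math

# Toy online feature store: key -> precomputed feature value (assumed layout)
FEATURE_STORE = {
    "customer_age_9092": 34,
    "number_of_transactions_in_the_past_10_mins_1234": 7,
    "average_transaction_amount_past_three_months_1234": 36.08,
}

def model_endpoint(features: dict) -> bool:
    """Stand-in for the deployed model: flags unusually large transactions."""
    return features["log_transaction_amount"] > math.log(500)

def predict(transaction: dict) -> bool:
    """Prediction service: gather features, call the model, return a fraud flag."""
    card, customer = transaction["credit_card_num"], transaction["customer_id"]
    features = {
        # computed in real time from the event itself
        "log_transaction_amount": math.log(transaction["transaction_amount"]),
        # queried from the online feature store
        "customer_age": FEATURE_STORE[f"customer_age_{customer}"],
        "tx_count_10_mins": FEATURE_STORE[
            f"number_of_transactions_in_the_past_10_mins_{card}"
        ],
        "avg_tx_amount_3_months": FEATURE_STORE[
            f"average_transaction_amount_past_three_months_{card}"
        ],
    }
    return model_endpoint(features)

tx = {"transaction_id": "a0d8", "credit_card_num": "1234",
      "customer_id": "9092", "transaction_amount": 999.99}
print(predict(tx))  # the 999.99 transaction is flagged by the toy model
```

If the model flags the transaction, the service responds before the transaction is settled, which is what allows it to be blocked rather than reversed.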
Why is real-time machine learning important?
Helps to get real-time insights and make real-time decisions. Unlocks new use cases and applications. Some applications which benefit from real-time machine learning:
- Anomaly detection for fraud detection, network security, quality control
- Personalized recommendations for marketing, e-commerce, media
- Real-time decision making for autonomous driving, high-frequency trading and robotics
How do we compute in a real-time machine learning pipeline?
Features are computed in real time or come from an online feature store
Feature Store
The purpose of a feature store is to reduce the latency of a prediction request. Features are precomputed instead of being computed only at prediction time.
Assuming our feature store is a key-value store, each key-value pair corresponds to a computed feature value. The key is a concatenation of the feature name and its corresponding entity values; the value is the computed feature value.
The average transaction amount over the past three months could be a feature. The feature is tied to a credit card (the entity value). The feature value would be the average transaction amount over the past three months for this card:
averageTransactionAmountThreeMonths_1234 = 36.08
// averageTransactionAmountThreeMonths -> feature name
// 1234 -> entity (credit card number)
// 36.08 -> feature value
The entity relates to the feature being computed. For the average transaction amount, a credit card number might make sense; for age, a customer identifier is sensible.
A feature can also contain multiple entities, e.g. the feature average transaction amount with a given merchant over the past three months might have entities credit card number and merchant identifier.
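One way to build such keys is plain string concatenation of the feature name and the entity value(s). A minimal sketch; the separator, ordering, and the merchant feature name are assumptions, not from the article:

```python
def feature_key(feature_name: str, *entity_values: str) -> str:
    """Concatenate a feature name and its entity value(s) into a store key."""
    return "_".join([feature_name, *entity_values])

# single entity: credit card number
print(feature_key("averageTransactionAmountThreeMonths", "1234"))
# -> averageTransactionAmountThreeMonths_1234

# multiple entities: credit card number and merchant identifier (hypothetical name)
print(feature_key("avgTransactionAmountWithMerchantThreeMonths", "1234", "M567"))
# -> avgTransactionAmountWithMerchantThreeMonths_1234_M567
```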
Real-time Prediction
At prediction time, we need to query for the relevant features from the feature store.
- What are the features we should query for?
The features depend on the model we intend to use for prediction -> the model is trained using a certain set of features, and the same features are needed to predict
For our example, let's say our model was trained with the following features (feature : entity):
- customer age : customer_id
- number of transactions in the past 10 minutes : credit card number
- average transaction amount in the past three months : credit card number
- How do we query for those features?
Assuming our transaction looks like this:
transaction_id = a0d8
credit_card_num = 1234
customer_id = 9092
...
transaction_amount = 999.99
// this results in the following features being queried from the store:
customer_age_9092
number_of_transactions_in_the_past_10_mins_1234
average_transaction_amount_past_three_months_1234
Features computed only from the event data are computed in real time
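Deriving those store keys from the transaction above can be sketched as follows; the mapping from feature name to entity field(s) and the key scheme are assumptions:

```python
# feature name -> entity field(s) on the event, per the model's training features
FEATURE_ENTITIES = {
    "customer_age": ["customer_id"],
    "number_of_transactions_in_the_past_10_mins": ["credit_card_num"],
    "average_transaction_amount_past_three_months": ["credit_card_num"],
}

def keys_for(transaction: dict) -> list[str]:
    """Build one store key per feature from the event's entity fields."""
    return [
        "_".join([name, *(str(transaction[field]) for field in fields)])
        for name, fields in FEATURE_ENTITIES.items()
    ]

tx = {"transaction_id": "a0d8", "credit_card_num": "1234", "customer_id": "9092"}
print(keys_for(tx))
# ['customer_age_9092',
#  'number_of_transactions_in_the_past_10_mins_1234',
#  'average_transaction_amount_past_three_months_1234']
```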
Types of Features
Two axes of categorization:
- Stateless vs. stateful
- Stateless features: Can be computed without any information from prior requests
- Stateful features: Require knowledge about previous events or instances
- Slow-changing vs. fast-changing
Categorizing our features from earlier leads to:
| | Fast-changing | Slow-changing |
| --- | --- | --- |
| Stateless | log(transaction amount) | customer_age |
| Stateful | no. of transactions in the past 10 minutes | avg. transaction amount in the past 3 months |
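A stateless, fast-changing feature like log(transaction amount) depends only on the incoming event, so it can be computed per event with no stored state. A minimal sketch; the handler shape is an assumption:

```python
import math

def handler(event: dict) -> dict:
    """Process each event on its own: no state from prior events is needed."""
    return {"log_transaction_amount": math.log(event["transaction_amount"])}

print(handler({"transaction_amount": 999.99}))
```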
Populating the Feature Store
Stateless and Fast-Changing
- Requires processing each event on its own (e.g. for log(transaction amount), the log of the transaction amount is calculated for each event)
- Can be done via an event-driven compute service (e.g. AWS Lambda)

Stateless and Slow-Changing
- Also computed in real time, but might not be part of the event itself
- For example the customer age won't be provided with every event, but might be stored in a database somewhere -> Requires fetching
- It might make sense to store those in the feature store as well
- Typically processed in batches, e.g. through a batch engine such as Apache Spark
- Since we don't know which features or entity values will be encountered at runtime, there will always be some excess computation and storage

Stateful and Fast-Changing
- Computing stateful, fast-changing features requires a stream processing engine (e.g. Apache Flink)
- Timing of computation might also be relevant
- Can be done for every event...
- ... or with some kind of sliding window approach
- Stream processing is more complex than batch processing
- Another possibility is micro-batch processing

Stateful and Slow-Changing
- Since they are slow-changing, batch processing might make sense
- Can also be done using stream-processing
- "Freshness" requirements have to be considered
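Populating a stateful, slow-changing feature like the three-month average transaction amount with a batch job can be sketched as below. In practice this would run on a batch engine such as Spark; plain Python is used here only for illustration, and the key scheme and sample data are assumptions:

```python
from collections import defaultdict

def batch_compute_avg_amount(transactions: list[dict]) -> dict:
    """Batch job: aggregate historical transactions per card and emit
    precomputed feature values keyed for the online feature store."""
    totals: dict = defaultdict(float)
    counts: dict = defaultdict(int)
    for tx in transactions:
        totals[tx["credit_card_num"]] += tx["transaction_amount"]
        counts[tx["credit_card_num"]] += 1
    return {
        f"average_transaction_amount_past_three_months_{card}":
            round(totals[card] / counts[card], 2)
        for card in totals
    }

# assumed three months of history for one card
history = [
    {"credit_card_num": "1234", "transaction_amount": 30.16},
    {"credit_card_num": "1234", "transaction_amount": 42.00},
]
print(batch_compute_avg_amount(history))
# {'average_transaction_amount_past_three_months_1234': 36.08}
```

The job's output is then written to the online feature store, where the prediction service reads it at request time; how often the job reruns is governed by the "freshness" requirement above.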