Introduction to Real-Time Machine Learning
This note is a summary of a blog post
Key Insights
- Real-time machine learning can be used to get real-time predictions
- Features can come from the event itself or from a feature store
- Stateful vs. stateless and slow-changing vs. fast-changing features require different handling
Article Summary
What is real-time machine learning?
Move from processing data in batches to processing data continuously in a stream
Use of machine learning to make predictions in real-time
A response to a synchronous request is expected to return immediately, instead of the request being queued and processed later in a batch.
Example: Fraud detection - Visa tries to detect fraudulent transactions
Example of a batch processing prediction:
- During the day, transactions are accumulated into a data warehouse
- Nightly, an orchestrator kicks off a batch job to process the data (transformation into features, making predictions in batch)
- For transactions predicted to be fraudulent, an alert is raised. BUT: Transactions have already been processed and reversing them might be difficult
Example of a real-time prediction:
- A transaction at a POS system triggers a request to a prediction service
- The prediction service retrieves all relevant features for the model:
  - Some features are computed in real time
  - Other features are queried from the online feature store
- Features are passed to the model endpoint for prediction -> Fraudulent transactions are prevented from going through
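The real-time flow above can be sketched in a few lines of Python. This is a toy illustration, not the article's implementation: the feature-store contents, key scheme, and threshold "model" are all assumptions.

```python
import math

# Toy online feature store: key -> precomputed feature value (assumed layout)
FEATURE_STORE = {
    "customer_age_9092": 34,
    "number_of_transactions_in_the_past_10_mins_1234": 7,
    "average_transaction_amount_past_three_months_1234": 36.08,
}

def model_endpoint(features: dict) -> bool:
    """Stand-in for the deployed model: flags unusually large transactions."""
    return features["log_transaction_amount"] > math.log(500)

def predict(transaction: dict) -> bool:
    """Prediction service: gather features, call the model, return a fraud flag."""
    card, customer = transaction["credit_card_num"], transaction["customer_id"]
    features = {
        # computed in real time from the event itself
        "log_transaction_amount": math.log(transaction["transaction_amount"]),
        # queried from the online feature store
        "customer_age": FEATURE_STORE[f"customer_age_{customer}"],
        "tx_count_10_mins": FEATURE_STORE[
            f"number_of_transactions_in_the_past_10_mins_{card}"
        ],
        "avg_tx_amount_3_months": FEATURE_STORE[
            f"average_transaction_amount_past_three_months_{card}"
        ],
    }
    return model_endpoint(features)

tx = {"transaction_id": "a0d8", "credit_card_num": "1234",
      "customer_id": "9092", "transaction_amount": 999.99}
print(predict(tx))  # the 999.99 transaction is flagged by the toy model
```

If the model flags the transaction, the service responds before the transaction is settled, which is what allows it to be blocked rather than reversed.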
Why is real-time machine learning important?
Helps to get real-time insights and make real-time decisions. Unlocks new use cases and applications. Some applications which benefit from real-time machine learning:
- Anomaly detection for fraud detection, network security, quality control
- Personalized recommendations for marketing, e-commerce, media
- Real-time decision making for autonomous driving, high-frequency trading and robotics
How do we compute in a real-time machine learning pipeline?
Features are computed in real time or come from an online feature store
Feature Store
The purpose of a feature store is to reduce the latency of a prediction request. Features are precomputed instead of being computed only at prediction time.
Assuming our feature store is a key-value store, each key-value pair corresponds to a computed feature value. The key is a concatenation of the feature name and its corresponding entity values; the value is the computed feature value.
The average transaction amount over the past three months could be a feature. The feature is tied to a credit card (the entity value). The feature value would be the average transaction amount over the past three months for this card:
averageTransactionAmountThreeMonths_1234 = 36.08
// averageTransactionAmountThreeMonths -> feature name
// 1234 -> entity (credit card number)
// 36.08 -> feature value
The entity relates to the feature being computed. For the average transaction amount, a credit card number might make sense; for age, a customer identifier is sensible.
A feature can also contain multiple entities, e.g. the feature average transaction amount with a given merchant over the past three months might have entities credit card number and merchant identifier.
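One way to build such keys is plain string concatenation of the feature name and the entity value(s). A minimal sketch; the separator, ordering, and the merchant feature name are assumptions, not from the article:

```python
def feature_key(feature_name: str, *entity_values: str) -> str:
    """Concatenate a feature name and its entity value(s) into a store key."""
    return "_".join([feature_name, *entity_values])

# single entity: credit card number
print(feature_key("averageTransactionAmountThreeMonths", "1234"))
# -> averageTransactionAmountThreeMonths_1234

# multiple entities: credit card number and merchant identifier (hypothetical name)
print(feature_key("avgTransactionAmountWithMerchantThreeMonths", "1234", "M567"))
# -> avgTransactionAmountWithMerchantThreeMonths_1234_M567
```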
Real-time Prediction
At prediction time, we need to query for the relevant features from the feature store.
- What are the features we should query for?
The features depend on the model we intend to use for prediction -> the model is trained using a certain set of features, and the same features are needed to predict
For our example, let's say our model was trained with the following features (feature : entity):
- customer age : customer_id
- number of transactions in the past 10 minutes : credit card number
- average transaction amount in the past three months : credit card number
- How do we query for those features?
Assuming our transaction looks like this:
transaction_id = a0d8
credit_card_num = 1234
customer_id = 9092
...
transaction_amount = 999.99
// this results in the following features being queried from the store:
customer_age_9092
number_of_transactions_in_the_past_10_mins_1234
average_transaction_amount_past_three_months_1234
Features computed only from the event data are computed in real time
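Deriving those store keys from the transaction above can be sketched as follows; the mapping from feature name to entity field(s) and the key scheme are assumptions:

```python
# feature name -> entity field(s) on the event, per the model's training features
FEATURE_ENTITIES = {
    "customer_age": ["customer_id"],
    "number_of_transactions_in_the_past_10_mins": ["credit_card_num"],
    "average_transaction_amount_past_three_months": ["credit_card_num"],
}

def keys_for(transaction: dict) -> list[str]:
    """Build one store key per feature from the event's entity fields."""
    return [
        "_".join([name, *(str(transaction[field]) for field in fields)])
        for name, fields in FEATURE_ENTITIES.items()
    ]

tx = {"transaction_id": "a0d8", "credit_card_num": "1234", "customer_id": "9092"}
print(keys_for(tx))
# ['customer_age_9092',
#  'number_of_transactions_in_the_past_10_mins_1234',
#  'average_transaction_amount_past_three_months_1234']
```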
Types of Features
Two axes of categorization:
- Stateless vs. stateful
- Stateless features: Can be computed without any information from prior requests
- Stateful features: Require knowledge about previous events or instances
- Slow-changing vs. fast-changing
Categorizing our features from earlier leads to:
| | Fast-changing | Slow-changing |
| --- | --- | --- |
| Stateless | log(transaction amount) | customer_age |
| Stateful | no. of transactions in the past 10 minutes | avg. transaction amount in the past 3 months |
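A stateless, fast-changing feature like log(transaction amount) depends only on the incoming event, so it can be computed per event with no stored state. A minimal sketch; the handler shape is an assumption:

```python
import math

def handler(event: dict) -> dict:
    """Process each event on its own: no state from prior events is needed."""
    return {"log_transaction_amount": math.log(event["transaction_amount"])}

print(handler({"transaction_amount": 999.99}))
```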
Populating the Feature Store
Stateless and Fast-Changing
- Requires processing each event on its own (e.g. for log(transaction amount), the log of the transaction amount is calculated for each event)
- Can be done via an event-driven compute service (e.g. AWS Lambda)

Stateless and Slow-Changing
- Also computed in real time, but might not be part of the event itself
- For example the customer age won't be provided with every event, but might be stored in a database somewhere -> Requires fetching
- It might make sense to store those in the feature store as well
- Typically processed in batches, e.g. through a batch engine such as Apache Spark
- Since we don't know which features or entity values will be encountered at runtime, there will always be some excess computation and storage

Stateful and Fast-Changing
- Computing stateful, fast-changing features requires a stream processing engine (e.g. Apache Flink)
- Timing of computation might also be relevant
- Can be done for every event...
- ... or with some kind of sliding window approach
- Stream processing is more complex than batch processing
- Another possibility is micro-batch processing

Stateful and Slow-Changing
- Since they are slow-changing, batch processing might make sense
- Can also be done using stream-processing
- "Freshness" requirements have to be considered
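Populating a stateful, slow-changing feature like the three-month average transaction amount with a batch job can be sketched as below. In practice this would run on a batch engine such as Spark; plain Python is used here only for illustration, and the key scheme and sample data are assumptions:

```python
from collections import defaultdict

def batch_compute_avg_amount(transactions: list[dict]) -> dict:
    """Batch job: aggregate historical transactions per card and emit
    precomputed feature values keyed for the online feature store."""
    totals: dict = defaultdict(float)
    counts: dict = defaultdict(int)
    for tx in transactions:
        totals[tx["credit_card_num"]] += tx["transaction_amount"]
        counts[tx["credit_card_num"]] += 1
    return {
        f"average_transaction_amount_past_three_months_{card}":
            round(totals[card] / counts[card], 2)
        for card in totals
    }

# assumed three months of history for one card
history = [
    {"credit_card_num": "1234", "transaction_amount": 30.16},
    {"credit_card_num": "1234", "transaction_amount": 42.00},
]
print(batch_compute_avg_amount(history))
# {'average_transaction_amount_past_three_months_1234': 36.08}
```

The job's output is then written to the online feature store, where the prediction service reads it at request time; how often the job reruns is governed by the "freshness" requirement above.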