Transforming global finance systems at Western Union
I transformed Western Union’s legacy ETL into a real-time streaming data platform built on Kafka, Spark, and Flink, enabling rapid fraud detection, global AML compliance, and dynamic analytics for 1,500+ stakeholders, while ensuring high-fidelity processing of more than 200 million transactions each month.

Components
Data Ingestion (Kafka):
Ingests global transactions, mobile payments, and partner data in real time with low latency and fault tolerance.
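A minimal sketch of the consuming side of this ingestion path, assuming the kafka-python client, a transactions topic, and illustrative broker addresses; process() stands in for the hypothetical downstream handler:

import json

from kafka import KafkaConsumer

consumer = KafkaConsumer(
    'transactions',
    bootstrap_servers=['kafka-1:9092', 'kafka-2:9092'],  # illustrative brokers
    group_id='ingestion-consumers',
    value_deserializer=lambda v: json.loads(v.decode('utf-8')),
    auto_offset_reset='earliest',
    enable_auto_commit=False,  # commit manually for at-least-once handling
)

for message in consumer:
    txn = message.value        # one transaction event, deserialized to a dict
    process(txn)               # hypothetical hand-off to stream processing
    consumer.commit()          # mark the offset only after successful handling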
Stream Processing (Spark & Flink):
Spark performs real-time fraud pattern detection and currency conversion enrichment. Flink handles complex event correlation and temporal aggregations across regions.
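A sketch of the Spark leg using Structured Streaming; the topic, broker address, schema fields, and flagging threshold are assumptions, and the job needs the spark-sql-kafka connector on its classpath:

from pyspark.sql import SparkSession, functions as F
from pyspark.sql.types import DoubleType, StringType, StructType

spark = SparkSession.builder.appName('fraud-pattern-detection').getOrCreate()

# Assumed shape of a transaction event on the 'transactions' topic.
schema = (StructType()
          .add('txn_id', StringType())
          .add('amount_usd', DoubleType())
          .add('country', StringType()))

stream = (spark.readStream
          .format('kafka')
          .option('kafka.bootstrap.servers', 'kafka-1:9092')
          .option('subscribe', 'transactions')
          .load()
          .select(F.from_json(F.col('value').cast('string'), schema).alias('t'))
          .select('t.*'))

# Illustrative rule: flag unusually large transfers for review.
flagged = stream.filter(F.col('amount_usd') > 10000)

query = (flagged.writeStream
         .format('console')    # a production job would sink back to Kafka
         .outputMode('append')
         .start())
query.awaitTermination()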
Machine Learning (TensorFlow + SageMaker):
ML models classify transaction risk levels, detect anomalies, and retrain continuously on recent transaction patterns; they are served through real-time inference endpoints.
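A sketch of real-time scoring against a deployed endpoint through boto3; the endpoint name, payload, and response shape are assumptions for illustration:

import json

import boto3

runtime = boto3.client('sagemaker-runtime')

def score_transaction(txn: dict) -> float:
    """Return the model's fraud-risk score for one transaction."""
    response = runtime.invoke_endpoint(
        EndpointName='txn-risk-model',      # hypothetical endpoint name
        ContentType='application/json',
        Body=json.dumps(txn),
    )
    payload = json.loads(response['Body'].read())
    return float(payload['risk_score'])     # assumed response field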
Orchestration (Apache Airflow):
Schedules daily training, stream validations, and batch reporting. Airflow DAGs coordinate with Spark and SageMaker to ensure timely updates.
Storage (S3 + Azure Synapse):
Combines AWS S3 for raw logs and real-time storage with Azure Synapse for cross-border reporting and long-term trend analysis.
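A sketch of the raw-log landing path into S3; the bucket name and date-partitioned key layout are illustrative, with curated extracts loaded into Synapse downstream:

import gzip
import json
from datetime import datetime, timezone

import boto3

s3 = boto3.client('s3')

def land_raw_batch(events: list) -> None:
    """Write one compressed batch of raw events to a date-partitioned key."""
    now = datetime.now(timezone.utc)
    key = f'raw/transactions/dt={now:%Y-%m-%d}/part-{now:%H%M%S}.json.gz'
    body = gzip.compress('\n'.join(json.dumps(e) for e in events).encode('utf-8'))
    s3.put_object(Bucket='wu-transactions-raw', Key=key, Body=body)  # hypothetical bucket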
Visualization (Power BI + Looker):
Dashboards for 1,500+ stakeholders across compliance, operations, and analytics teams visualize fraud scores, transaction heatmaps, and cross-border flow summaries.
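A sketch of feeding a live dashboard tile through a Power BI streaming dataset; the push URL placeholders and row shape are assumptions (Power BI issues the real push URL when the streaming dataset is created):

import requests

# Placeholder push URL; the real one is generated per streaming dataset.
PUSH_URL = 'https://api.powerbi.com/beta/<workspace>/datasets/<dataset>/rows?key=<key>'

def push_fraud_score(txn_id: str, score: float) -> None:
    """Append one fraud-score row to the streaming dataset."""
    rows = [{'txn_id': txn_id, 'risk_score': score}]
    resp = requests.post(PUSH_URL, json=rows, timeout=5)
    resp.raise_for_status()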
Key Achievements
- Migrated legacy ETL into a streaming-first architecture, reducing data latency by 85%.
- Enabled sub-minute fraud detection across all regions using Flink CEP and real-time ML scoring.
- Automated AML compliance checks and country-specific regulatory reporting.
- Supported 200M+ transactions/month with <5s end-to-end visibility in dashboards.
- Delivered scalable analytics to 1,500+ global users via Power BI and Looker.
Airflow DAG Example
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.bash import BashOperator
from airflow.providers.amazon.aws.operators.sagemaker import SageMakerTrainingOperator

default_args = {
    'owner': 'global_data_team',
    'start_date': datetime(2025, 1, 1),
    'retries': 2,
    'retry_delay': timedelta(minutes=10),
}

with DAG('western_union_pipeline',
         default_args=default_args,
         schedule_interval='@hourly',
         catchup=False) as dag:

    # The official Airflow Flink provider only targets Flink on Kubernetes,
    # so the fraud job is submitted through the Flink CLI here; the config
    # path is passed as a program argument consumed by the job's main method.
    detect_fraud = BashOperator(
        task_id='flink_fraud_detection',
        bash_command=(
            'flink run /opt/flink/jobs/fraud-detection.jar '
            '--config /opt/flink/conf/application.conf'
        ),
    )

    train_model = SageMakerTrainingOperator(
        task_id='train_risk_model',
        config={...},  # SageMaker training config omitted for brevity
        aws_conn_id='aws_default',
    )

    detect_fraud >> train_model
Summary
I transformed Western Union’s global finance pipelines to enable real-time analytics and machine learning across critical financial workflows.