Cloud-Native Data Platform for a Leading Asset Management Firm
- Victor Lih Jong
- June 16 – Aug 23, 2024
- Financial Services | Asset Management
1. Executive Summary
A leading asset management firm, managing over €50 billion in assets across equities, fixed income, and ESG funds, faced significant challenges in timely decision-making due to fragmented data systems and delayed reporting.
To address these inefficiencies, the firm undertook a digital transformation initiative to build a scalable, cloud-native data and analytics platform. This platform was designed with:
- A streaming (Change Data Capture – CDC) and batch-enabled data lake on AWS
- An enterprise-grade data warehouse and data marts on Snowflake
- Real-time dashboards for critical KPIs such as NAV, Holdings, and ESG scores
2. Objectives
The primary goals of the project were to:
- Modernize legacy data infrastructure with a cloud-native stack
- Enable real-time and near-real-time data ingestion and analytics
- Ensure data quality, governance, and lineage
- Democratize data access via self-service analytics and reporting
- Improve transparency and regulatory compliance across investment products
3. Architecture Overview
3.1 Data Sources
The platform ingested data from multiple systems:
| Source System | Data Type | Frequency |
|---|---|---|
| Portfolio Management System (PMS) | NAV, Holdings | Real-time via CDC |
| Order Management System (OMS) | Trades, Orders | Real-time via CDC |
| ESG Provider APIs | ESG Scores | Daily |
| Market Data Feeds (Bloomberg, Reuters) | Prices, Benchmarks | Intraday |
| Internal Excel Reports | Ad-hoc data | Batch |
| Accounting Systems | AUM, Expenses | Daily |
> “This transformation not only gave us real-time insight into NAV and ESG metrics — it reshaped the way we work across every data touchpoint. Jeliv Analytics brought deep domain knowledge and technical expertise to future-proof our infrastructure.”
3.2 Ingestion Layer
Streaming (CDC) Ingestion
- Tool: AWS DMS + Kafka (Amazon MSK), Debezium, Databricks
- Function: Captured change data from the relational databases behind the PMS and OMS (SimCorp, Pearl, Aladdin Data Cloud)
- Output: Raw JSON messages stored in S3 (bronze layer) and streamed into Snowflake for low-latency reporting
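The pipeline code itself is outside the scope of this case study, but a minimal Databricks-style sketch of the streaming leg, assuming a Debezium topic named pms.public.nav on MSK and illustrative S3 paths, could look like this:

```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import col

spark = SparkSession.builder.appName("pms-cdc-bronze").getOrCreate()

# Read Debezium change events from an MSK topic (broker and topic names are placeholders)
raw = (spark.readStream
       .format("kafka")
       .option("kafka.bootstrap.servers", "b-1.msk.example.com:9092")
       .option("subscribe", "pms.public.nav")
       .option("startingOffsets", "latest")
       .load())

# Keep the raw JSON envelope plus Kafka metadata for lineage; no parsing at the bronze stage
bronze = raw.select(
    col("key").cast("string").alias("record_key"),
    col("value").cast("string").alias("payload_json"),
    col("topic"),
    col("partition"),
    col("offset"),
    col("timestamp").alias("ingested_at"),
)

# Land the events as raw JSON files in the S3 bronze zone (paths are illustrative)
query = (bronze.writeStream
         .format("json")
         .option("checkpointLocation", "s3://firm-data-lake/bronze/_checkpoints/pms_nav")
         .outputMode("append")
         .start("s3://firm-data-lake/bronze/pms_nav"))
```

Downstream jobs (and the Snowflake ingestion described in section 3.4) then pick these files up for parsing and low-latency reporting.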
Batch Ingestion
- Tool: AWS Glue, Lambda, Apache Airflow, Databricks
- Function: Periodic ETL jobs from flat files, third-party APIs, and internal Excel sheets
- Output: Raw, cleaned, and curated data into S3 (bronze/silver/gold layers)
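For the batch leg, a simplified Airflow DAG for the daily ESG pull might look like the sketch below; the provider endpoint, bucket name, and key layout are assumptions rather than the firm's actual configuration:

```python
from datetime import datetime

import boto3
import requests
from airflow import DAG
from airflow.operators.python import PythonOperator


def pull_esg_scores(ds, **_):
    """Fetch the day's ESG scores from the provider API and land them in the bronze zone."""
    # Endpoint, bucket, and key layout are placeholders, not the firm's real configuration
    resp = requests.get(
        "https://esg-provider.example.com/v1/scores",
        params={"as_of": ds},
        timeout=60,
    )
    resp.raise_for_status()
    boto3.client("s3").put_object(
        Bucket="firm-data-lake",
        Key=f"bronze/esg_scores/{ds}.json",
        Body=resp.content,
    )


with DAG(
    dag_id="esg_daily_ingest",
    start_date=datetime(2024, 6, 1),
    schedule_interval="@daily",  # matches the provider's daily publishing cadence
    catchup=False,
) as dag:
    PythonOperator(task_id="pull_esg_scores", python_callable=pull_esg_scores)
```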
3.3 Data Lake on AWS
- Storage: Amazon S3 with logical zones:
  - Bronze: Raw data (JSON, CSV, Parquet)
  - Silver: Cleaned, enriched data (Delta)
  - Gold: Business-ready datasets (Delta)
- Cataloging & Governance: AWS Glue Data Catalog
- Security: S3 bucket policies, IAM roles, and encryption (SSE-S3/KMS)
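A typical bronze-to-silver promotion on this layout is a Delta merge of the kind sketched below; the Debezium field paths, S3 locations, and the assumption that the silver table already exists are all illustrative:

```python
from delta.tables import DeltaTable
from pyspark.sql import SparkSession
from pyspark.sql.functions import get_json_object, to_timestamp

spark = SparkSession.builder.appName("nav-bronze-to-silver").getOrCreate()

# Extract the Debezium "after" image from the raw bronze payload (field names are illustrative)
bronze = spark.read.json("s3://firm-data-lake/bronze/pms_nav")
updates = bronze.select(
    get_json_object("payload_json", "$.payload.after.fund_id").alias("fund_id"),
    get_json_object("payload_json", "$.payload.after.share_class").alias("share_class"),
    get_json_object("payload_json", "$.payload.after.nav").cast("double").alias("nav"),
    to_timestamp(get_json_object("payload_json", "$.payload.after.as_of")).alias("as_of"),
)

# Upsert into the cleaned silver Delta table (assumed to exist) so the latest NAV per key wins
silver = DeltaTable.forPath(spark, "s3://firm-data-lake/silver/nav")
(silver.alias("t")
 .merge(updates.alias("s"), "t.fund_id = s.fund_id AND t.share_class = s.share_class")
 .whenMatchedUpdateAll()
 .whenNotMatchedInsertAll()
 .execute())
```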
3.4 Data Warehouse and Marts on Snowflake
- Warehouse: Snowflake (Enterprise Edition) hosted on AWS
- Data Modeling: Star and snowflake schemas
Features Used:
- Snowpipe for continuous, near-real-time ingestion from S3
- Streams and Tasks for change data capture and scheduled processing within Snowflake
- Role-based access control for compliance
- Time Travel and Fail-safe for audit and recovery
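As a rough illustration of how these features fit together, the Snowpipe, Stream, and Task wiring can be scripted from Python with the Snowflake connector; object names and the five-minute schedule below are assumptions, not details from the actual deployment:

```python
import snowflake.connector

# Connection details are placeholders; in practice they come from a secrets manager
conn = snowflake.connector.connect(
    account="xy12345.eu-west-1",
    user="ETL_SERVICE",
    password="***",
    warehouse="LOAD_WH",
    database="MARKETS",
    schema="RAW",
)
cur = conn.cursor()

# Snowpipe: auto-ingest files that land in the S3 stage (stage/table names are illustrative)
cur.execute("""
    CREATE PIPE IF NOT EXISTS raw.nav_pipe AUTO_INGEST = TRUE AS
    COPY INTO raw.nav_staging
    FROM @raw.nav_stage
    FILE_FORMAT = (TYPE = 'PARQUET')
    MATCH_BY_COLUMN_NAME = CASE_INSENSITIVE
""")

# Stream + Task: apply only the changed rows to the curated table on a short schedule
cur.execute("CREATE STREAM IF NOT EXISTS raw.nav_stream ON TABLE raw.nav_staging")
cur.execute("""
    CREATE TASK IF NOT EXISTS raw.nav_apply
      WAREHOUSE = LOAD_WH
      SCHEDULE = '5 MINUTE'
    AS
      INSERT INTO curated.nav_latest
      SELECT fund_id, share_class, nav, as_of
      FROM raw.nav_stream
""")
cur.execute("ALTER TASK raw.nav_apply RESUME")
```

Consuming the Stream inside a scheduled Task keeps the curated table incremental: only rows changed since the last run are reprocessed.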
Data Marts:

| Mart Type | Contents | Users |
|---|---|---|
| NAV Mart | NAV by fund, share class, etc. | Fund Managers, Risk Team |
| ESG Mart | ESG scores, flags, benchmarks | ESG Analysts, Compliance |
| Holdings Mart | Current and historical positions | Investment Analysts |
3.5 Analytics and Reporting Layer
- Tool: Power BI
- Delivery:
  - Live dashboards for executive management
  - Daily reports on fund performance and compliance
  - Drill-down capabilities for granular analytics
Key KPIs Tracked:
- Net Asset Value (NAV) – real-time by portfolio
- Holdings Exposure – sector, geography, asset class
- ESG Scores – by issuer, portfolio, and benchmark
- Cash Positions – real-time liquidity visibility
- Trade Breaks & Exceptions – alerting via Slack/email
4. Implementation Roadmap
Phase 1: Assessment & Strategy
- Data discovery workshops
- Identification of data domains and lineage mapping
- Definition of success metrics (report latency, data freshness, etc.)
Phase 2: Data Lake Foundation
- Provisioning of S3, Glue, and IAM policies
- Batch pipelines for historical loads
- CDC configuration with AWS DMS and Apache Kafka/Debezium
Phase 3: Snowflake Integration
- Kimball data modeling
- Schema design and security roles
- Setup of Snowpipe, Streams, and Tasks
- Creation of the first data marts (a dimensional-model sketch follows below)
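The mart schemas are not published in this case study, but a Kimball-style sketch of what the NAV mart's core dimension and fact tables could look like (table and column names are assumptions) is:

```python
import snowflake.connector

conn = snowflake.connector.connect(
    account="xy12345.eu-west-1",
    user="MODELER",
    password="***",
    warehouse="TRANSFORM_WH",
    database="MARTS",
    schema="NAV",
)
cur = conn.cursor()

# Conformed fund dimension shared by the NAV, ESG, and Holdings marts
cur.execute("""
    CREATE TABLE IF NOT EXISTS dim_fund (
        fund_key    INTEGER AUTOINCREMENT PRIMARY KEY,
        fund_id     STRING,
        fund_name   STRING,
        asset_class STRING
    )
""")

# NAV fact table at fund / share class / valuation-timestamp grain
cur.execute("""
    CREATE TABLE IF NOT EXISTS fact_nav (
        fund_key    INTEGER REFERENCES dim_fund (fund_key),
        share_class STRING,
        nav         NUMBER(18, 6),
        as_of_ts    TIMESTAMP_NTZ,
        loaded_at   TIMESTAMP_NTZ DEFAULT CURRENT_TIMESTAMP()
    )
""")
```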
Phase 4: Visualization and Alerts
- Power BI dashboard development
- Real-time alerting via AWS Lambda and SNS
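A minimal sketch of the alerting hook, assuming trade-break records arrive in the Lambda event payload and an SNS topic fans the message out to email and a Slack webhook (the event shape and environment variable are hypothetical):

```python
import json
import os

import boto3

sns = boto3.client("sns")


def handler(event, context):
    """Publish a trade-break alert to SNS; subscribers fan out to email and a Slack webhook.

    The event shape and the ALERT_TOPIC_ARN environment variable are assumptions made
    for this sketch, not the firm's actual contract.
    """
    breaks = [r for r in event.get("records", []) if r.get("status") == "BREAK"]
    if not breaks:
        return {"alerts_sent": 0}

    sns.publish(
        TopicArn=os.environ["ALERT_TOPIC_ARN"],
        Subject=f"{len(breaks)} trade break(s) detected",
        Message=json.dumps({"trade_breaks": breaks}, default=str),
    )
    return {"alerts_sent": len(breaks)}
```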
Phase 5: Governance and Scaling
- Data quality checks with Great Expectations (see the sketch below)
- Metadata cataloging and glossary
- User training and self-service enablement
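For illustration, a check of the kind run in this phase might look like the following with the classic Great Expectations Pandas API; the dataset path, column names, and thresholds are assumptions:

```python
import great_expectations as ge
import pandas as pd

# Load the curated ESG scores (path is illustrative; reading from s3 requires s3fs)
scores = pd.read_parquet("s3://firm-data-lake/gold/esg_scores/latest.parquet")
dataset = ge.from_pandas(scores)

# Rules of the kind enforced in the pipeline: completeness, plausible ranges, uniqueness
dataset.expect_column_values_to_not_be_null("issuer_id")
dataset.expect_column_values_to_be_between("esg_score", min_value=0, max_value=100)
dataset.expect_column_values_to_be_unique("issuer_id")

result = dataset.validate()
if not result.success:
    raise ValueError("ESG data quality checks failed")
```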
5. Challenges and Solutions
| Challenge | Solution |
|---|---|
| Complex CDC transformations | Implemented Kafka + Databricks + Snowflake Streams for real-time processing |
| ESG data inconsistencies | Built validation rules and external reconciliation routines |
| Regulatory compliance (e.g., SFDR) | Applied row-level security and audit logging in Snowflake (see the sketch below) |
| User resistance to platform change | Conducted training sessions and developed self-service data tools |
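The row-level security piece maps naturally onto Snowflake row access policies; the sketch below assumes a hypothetical fund-to-role mapping table and illustrative object names:

```python
import snowflake.connector

conn = snowflake.connector.connect(
    account="xy12345.eu-west-1",
    user="SECURITY_ADMIN",
    password="***",
    warehouse="ADMIN_WH",
    database="MARTS",
    schema="NAV",
)
cur = conn.cursor()

# Policy: a role only sees rows for funds mapped to it (mapping table is hypothetical)
cur.execute("""
    CREATE ROW ACCESS POLICY IF NOT EXISTS fund_visibility
    AS (fund_id STRING) RETURNS BOOLEAN ->
        EXISTS (
            SELECT 1
            FROM security.fund_role_map m
            WHERE m.fund_id = fund_id
              AND m.role_name = CURRENT_ROLE()
        )
""")

# Attach the policy to a mart table keyed by fund_id (table name is illustrative)
cur.execute("ALTER TABLE holdings_mart ADD ROW ACCESS POLICY fund_visibility ON (fund_id)")
```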
6. Business Impact
| Metric | Before | After |
|---|---|---|
| NAV Reporting Lag | T+1 | Near real-time (< 15 minutes) |
| ESG Scoring Frequency | Weekly manual pulls | Automated daily integration |
| Time to Onboard a New Data Source | 4–6 weeks | < 1 week (via Glue/Databricks templates) |
| Dashboard Access Speed | ~30 seconds | < 5 seconds with live Snowflake queries |
| Audit and Data Lineage | Manual/Excel-based | Fully automated with the Glue Data Catalog |
| Investor Reports | Quarterly manual reports | Weekly and monthly automated reports |
7. Future Enhancements
- Incorporation of AI/ML models for predictive risk and portfolio optimization
- Integration with tokenized assets and blockchain-based financial instruments
- An expanded API layer for third-party access and mobile apps
8. Conclusion
By building a modern cloud-native data and analytics platform using AWS and Snowflake, the asset management firm achieved real-time insights, improved operational efficiency, and enhanced compliance.
The new platform not only democratized access to critical data but also laid a strong foundation for future innovations in quantitative research, ESG analytics, and investor engagement.