Wukong - Pharmaceutical Distribution Data Platform
Project Overview
A pharmaceutical distribution data governance platform built on a scalable big data architecture. It addresses data fragmentation and analysis latency in China's pharmaceutical distribution channels, enabling near real-time monitoring and intelligent sales analytics for large pharmaceutical enterprises.
Industry Pain Points
Data Fragmentation & Integration Difficulty
Sales data is scattered across various distributors and channels with inconsistent formats. Manual aggregation is inefficient and prone to errors, making it difficult to form a unified view of the market.
Analysis Latency
Traditional systems struggle to process tens of millions of sales records efficiently, leading to significant delays in business reporting and missed decision-making windows.
Lack of Standardization
The absence of unified master data standards for drugs, institutions, and personnel hinders effective data asset management and cross-dimensional analysis.
Innovative Solution
Core Methodology
The system establishes a closed-loop data governance framework:
- Unified Data Ingestion: Automated collection and cleaning of channel flow data from multiple sources.
- Master Data Management (MDM): Rigorous standardization of core data dimensions, including products, hospitals, and representatives.
- Intelligent Analytics: Near real-time calculation of sales metrics with automated exception and appeal workflows.
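The three steps above can be sketched in a few lines. This is a minimal illustration, not the platform's actual code: the master data tables, the field names (product, hospital, qty), and the functions ingest and aggregate are all hypothetical, and the production system runs this kind of logic on Spark rather than in plain Python.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional

# Hypothetical master data: distributor-specific names mapped to standard codes.
PRODUCT_MASTER = {
    "amoxicillin 250mg cap": "P001",
    "amoxicillin capsules 250 mg": "P001",
}
HOSPITAL_MASTER = {
    "city first hospital": "H100",
    "first hospital of city": "H100",
}


@dataclass
class FlowRecord:
    product_code: Optional[str]   # None when the name is not in master data
    hospital_code: Optional[str]  # None when the name is not in master data
    quantity: int
    raw: dict                     # original row, kept for the appeal workflow


def normalize(text: str) -> str:
    """Cleaning step: trim, lower-case, and collapse repeated whitespace."""
    return " ".join(text.strip().lower().split())


def ingest(raw_rows):
    """Ingestion + MDM: clean each row and resolve names to standard codes."""
    return [
        FlowRecord(
            product_code=PRODUCT_MASTER.get(normalize(row["product"])),
            hospital_code=HOSPITAL_MASTER.get(normalize(row["hospital"])),
            quantity=int(row["qty"]),
            raw=row,
        )
        for row in raw_rows
    ]


def aggregate(records):
    """Analytics: sum quantities per (product, hospital); route unmatched rows
    to an exception list instead of silently dropping them."""
    totals = defaultdict(int)
    exceptions = []
    for rec in records:
        if rec.product_code is None or rec.hospital_code is None:
            exceptions.append(rec)
            continue
        totals[(rec.product_code, rec.hospital_code)] += rec.quantity
    return dict(totals), exceptions


# Two distributors report the same product and hospital under different names;
# a third row references a product that is missing from master data.
rows = [
    {"product": "Amoxicillin 250mg Cap", "hospital": "City First Hospital", "qty": "120"},
    {"product": " amoxicillin  capsules 250 mg ", "hospital": "First Hospital of City", "qty": "80"},
    {"product": "Unknown Drug", "hospital": "City First Hospital", "qty": "5"},
]
totals, exceptions = aggregate(ingest(rows))
```

Here the first two rows collapse into a single standardized key, while the unmatched row is held back for review, which is the point where an exception or appeal workflow would pick it up.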
Technical Implementation
- High-Performance Architecture: Uses a Spring Cloud microservices backend, with Spark for scalable, large-scale batch data processing.
- Hybrid Storage Layer: Uses PostgreSQL for relational data, Elasticsearch for high-speed search, and Redis for caching.
- Enterprise-Grade Security: Uses Apache Shiro with JWT for access control and data security.
- Observability: Integrates Prometheus + Grafana for near real-time system monitoring.
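To make the JWT part of the security bullet concrete, the sketch below shows how a signed token can be issued and checked. It is a language-neutral illustration in Python using only the standard library, not the project's Shiro configuration: the HS256 algorithm, the claim names (sub, roles, exp), and the functions issue_token and verify_token are assumptions made for the example.

```python
import base64
import hashlib
import hmac
import json
import time


def _b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWT requires."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    """Restore the stripped padding before decoding."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(signing_input: str, secret: bytes) -> bytes:
    return hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()


def issue_token(claims: dict, secret: bytes) -> str:
    """Build header.payload.signature with an HMAC-SHA256 signature."""
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims)
    )
    return signing_input + "." + _b64url(_sign(signing_input, secret))


def verify_token(token: str, secret: bytes, now=None):
    """Return the claims if the signature is valid and the token has not
    expired; return None otherwise."""
    try:
        header_b64, payload_b64, signature_b64 = token.split(".")
        expected = _sign(f"{header_b64}.{payload_b64}", secret)
        # Constant-time comparison avoids leaking how many bytes matched.
        if not hmac.compare_digest(expected, _b64url_decode(signature_b64)):
            return None
        claims = json.loads(_b64url_decode(payload_b64))
    except ValueError:
        # Covers a wrong segment count, bad base64, and malformed JSON.
        return None
    if not isinstance(claims, dict):
        return None
    current = time.time() if now is None else now
    exp = claims.get("exp")
    if isinstance(exp, (int, float)) and current >= exp:
        return None
    return claims


# Hypothetical secret and claims; `now` is fixed so the example is repeatable.
secret = b"demo-secret-change-me"
token = issue_token({"sub": "rep-001", "roles": ["analyst"], "exp": 2000}, secret)
valid_claims = verify_token(token, secret, now=1000)
expired_claims = verify_token(token, secret, now=3000)
forged_claims = verify_token(token, b"wrong-secret", now=1000)
```

In a Shiro-based service, a check of this kind would typically sit in a request filter, with the verified roles claim feeding Shiro's authorization decisions. A real deployment would rely on a maintained JWT library instead of hand-rolled signing.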
Value Delivered
- Large-Scale Data Processing: Processes more than 10 million channel flow records.
- Operational Efficiency: Reduced operational costs and improved overall management efficiency by more than 40%.
- Standardization: Established a unified data standard, turning raw data into reusable enterprise data assets.