AWS Data Pipeline
Requirements
- Parquet files arrive in an S3 bucket every 10 minutes
- Each file has id and timestamp fields plus a field of multi-level nested JSON data
- The JSON field needs to be processed
- Processing state tracking and error handling are needed
- Auto-scaling
- Monitoring
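
To make the JSON processing and state tracking requirements concrete, here is a minimal sketch in plain Python. It is not the pipeline's actual implementation. The column name `data`, the status values `SUCCEEDED` and `FAILED`, and both function names are assumptions made for illustration. Reading Parquet from S3 is left out so the example stays self-contained.

```python
import json
from typing import Any, Dict


def flatten_json(value: Any, prefix: str = "", sep: str = ".") -> Dict[str, Any]:
    """Flatten nested dicts and lists into one level with dotted keys."""
    flat: Dict[str, Any] = {}
    if isinstance(value, dict):
        for key, child in value.items():
            child_prefix = f"{prefix}{sep}{key}" if prefix else str(key)
            flat.update(flatten_json(child, child_prefix, sep))
    elif isinstance(value, list):
        for index, child in enumerate(value):
            child_prefix = f"{prefix}{sep}{index}" if prefix else str(index)
            flat.update(flatten_json(child, child_prefix, sep))
    else:
        # Leaf value: store it under the accumulated key path.
        flat[prefix] = value
    return flat


def process_record(record: Dict[str, Any]) -> Dict[str, Any]:
    """Process one row and return a status dict instead of raising.

    Assumes the nested JSON arrives as a string in a column named "data"
    (hypothetical name; adjust to the real schema).
    """
    try:
        payload = json.loads(record["data"])
        return {
            "id": record["id"],
            "status": "SUCCEEDED",
            "fields": flatten_json(payload),
        }
    except (KeyError, TypeError, json.JSONDecodeError) as exc:
        # Record the failure so it can be tracked and retried later.
        return {"id": record.get("id"), "status": "FAILED", "error": str(exc)}
```

Returning a status per record, rather than letting one bad row abort the whole file, keeps a single malformed JSON value from blocking the rest of a 10-minute batch. The status dicts could then be written to whatever state store the pipeline uses.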