AWS Data Pipeline

Requirements

  • Parquet files arrive in an S3 bucket every 10 minutes
  • Each record has the fields id and timestamp, plus a multi-level nested JSON field
  • The JSON field needs to be processed
  • Processing-state tracking and error handling are needed
  • Auto-scaling
  • Monitoring
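
As a rough illustration of the JSON-processing and state-tracking requirements, the sketch below flattens a nested JSON payload into dotted keys and records a per-record status. It is only one possible approach, not a prescribed design. The field name `data`, the assumption that the JSON arrives as a string, the status values `SUCCEEDED` and `FAILED`, and the helper names `flatten` and `process_record` are all hypothetical and are not taken from the requirements.

```python
import json
from typing import Any


def flatten(value: Any, prefix: str = "") -> dict[str, Any]:
    """Flatten nested dicts and lists into one level with dotted/indexed keys."""
    flat: dict[str, Any] = {}
    if isinstance(value, dict):
        for key, child in value.items():
            child_prefix = f"{prefix}.{key}" if prefix else str(key)
            flat.update(flatten(child, child_prefix))
    elif isinstance(value, list):
        for index, child in enumerate(value):
            flat.update(flatten(child, f"{prefix}[{index}]"))
    else:
        # Leaf value (string, number, bool, or None).
        flat[prefix] = value
    return flat


def process_record(record: dict[str, Any]) -> dict[str, Any]:
    """Process one record and return its outcome with a tracking status.

    Assumes (hypothetically) that the nested JSON is stored as a string
    under the key "data".
    """
    result: dict[str, Any] = {
        "id": record.get("id"),
        "timestamp": record.get("timestamp"),
    }
    try:
        payload = json.loads(record["data"])
        result.update(status="SUCCEEDED", fields=flatten(payload), error=None)
    except (KeyError, TypeError, json.JSONDecodeError) as exc:
        # Keep the failure alongside the record so it can be retried or inspected.
        result.update(status="FAILED", fields={}, error=f"{type(exc).__name__}: {exc}")
    return result
```

Returning a status per record, rather than raising on the first bad payload, lets one malformed JSON field be logged and retried without failing the whole 10-minute batch.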
