AWS Lambda¶
Workflow¶
- Trigger a Lambda function when files arrive in an S3 bucket
- Read and process the Parquet file using pyarrow or fastparquet
- Write status and error logs
- Track status in DynamoDB, for both success and failure
- Send failed events to an SQS dead-letter queue (DLQ)
- Step Functions for orchestration (optional)
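The workflow above can be sketched as a small handler. This is a minimal illustration under stated assumptions, not a definitive implementation: the DynamoDB table name `file_status`, the item fields (`file_key`, `status`, `row_count`, `error`), and the helper names are all hypothetical. The AWS and pyarrow imports are deferred into `lambda_handler` so the pure helpers can be exercised locally.

```python
import io
import logging
import urllib.parse
from datetime import datetime, timezone

logger = logging.getLogger(__name__)


def parse_s3_records(event):
    """Yield (bucket, key) pairs from an S3 event notification payload."""
    for record in event.get("Records", []):
        s3_info = record["s3"]
        # Object keys in S3 notifications are URL-encoded (a space arrives as '+').
        key = urllib.parse.unquote_plus(s3_info["object"]["key"])
        yield s3_info["bucket"]["name"], key


def build_status_item(bucket, key, status, error=None):
    """Build a status record; the field names are assumptions for this sketch."""
    item = {
        "file_key": f"{bucket}/{key}",
        "status": status,
        "updated_at": datetime.now(timezone.utc).isoformat(),
    }
    if error is not None:
        item["error"] = str(error)
    return item


def process_file(bucket, key, read_table, put_status):
    """Read one Parquet file and record SUCCESS or FAILED via put_status."""
    try:
        table = read_table(bucket, key)
        item = build_status_item(bucket, key, "SUCCESS")
        item["row_count"] = table.num_rows
    except Exception as exc:
        logger.exception("Failed to process s3://%s/%s", bucket, key)
        put_status(build_status_item(bucket, key, "FAILED", exc))
        raise  # re-raise so Lambda's retry and dead-letter handling see the failure
    put_status(item)
    logger.info("Processed s3://%s/%s (%d rows)", bucket, key, item["row_count"])
    return item


def lambda_handler(event, context):
    # Deferred imports: only needed inside the Lambda runtime.
    import boto3
    import pyarrow.parquet as pq

    s3 = boto3.client("s3")
    status_table = boto3.resource("dynamodb").Table("file_status")  # hypothetical name

    def read_table(bucket, key):
        body = s3.get_object(Bucket=bucket, Key=key)["Body"].read()
        return pq.read_table(io.BytesIO(body))

    def put_status(item):
        status_table.put_item(Item=item)

    return [
        process_file(bucket, key, read_table, put_status)
        for bucket, key in parse_s3_records(event)
    ]
```

Passing `read_table` and `put_status` as arguments keeps the processing logic independent of S3 and DynamoDB, which makes it straightforward to test with stand-ins.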
Pros¶
- Event-driven, processes files immediately
- Automatic scaling with concurrent executions
- Cost-effective for sporadic workloads
- Low operational overhead
Cons¶
- 15-minute timeout limit
- 10 GB memory limit
- 512 MB ephemeral storage by default (configurable up to 10 GB)
- Cold start latency unless provisioned concurrency is configured
- Step Functions cost scales with the number of state transitions (Standard workflows)
Decoupling¶
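Use SQS for a decoupled design. S3 sends object-created events to EventBridge, and an EventBridge rule forwards them to an SQS queue as its target. The Lambda function then consumes from the queue instead of being invoked by S3 directly. SQS has built-in retry logic (a message that is not deleted becomes visible again after its visibility timeout) and buffers bursts, so the consumer can tolerate backpressure.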
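A consumer for this queue might look like the sketch below. It assumes the SQS message body is the unmodified EventBridge event (no input transformer on the rule) and that the event source mapping has `ReportBatchItemFailures` enabled, so only the failed messages return to the queue for retry. The `extract_object` and `handle_batch` names and the `process` callback are hypothetical.

```python
import json
import logging

logger = logging.getLogger(__name__)


def extract_object(message_body):
    """Return (bucket, key) from an EventBridge S3 event carried in an SQS body."""
    event = json.loads(message_body)
    detail = event["detail"]
    return detail["bucket"]["name"], detail["object"]["key"]


def handle_batch(event, process):
    """Process each SQS record; report failures so only those are retried."""
    failures = []
    for record in event.get("Records", []):
        try:
            bucket, key = extract_object(record["body"])
            process(bucket, key)
        except Exception:
            logger.exception("Message %s failed", record["messageId"])
            # Partial batch response: SQS redelivers only these message IDs.
            failures.append({"itemIdentifier": record["messageId"]})
    return {"batchItemFailures": failures}
```

Messages that keep failing can then be moved to a dead-letter queue by a redrive policy on the source queue, rather than blocking the rest of the batch.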