Python Data Pipeline Builder
Generate production-ready Python code for data cleaning, transformation, and analysis pipelines using pandas with proper error handling.
Prompt Agent (Expert)
February 14, 2026
Write a Python data pipeline that accomplishes the following:
Input data: {{input_description}}
Desired output: {{output_description}}
Transformations needed: {{transformations}}
Requirements:
- Use pandas for data manipulation
- Include proper error handling for common data issues (missing values, wrong types, duplicates)
- Add logging at each major step
- Make the pipeline idempotent (safe to run multiple times)
- Include data validation checks before and after transformation
- Add type hints and docstrings
- Make it modular — each transformation should be a separate function
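As a rough illustration of what these requirements could look like together, here is a minimal sketch. The column names (`order_id`, `amount`), the `REQUIRED_COLUMNS` list, and every function name are hypothetical choices made for this example, not part of the prompt; a real pipeline would derive them from the input description.

```python
import logging

import pandas as pd

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("pipeline")

# Hypothetical schema, assumed only for this sketch.
REQUIRED_COLUMNS = ["order_id", "amount"]


def validate(df: pd.DataFrame, stage: str) -> None:
    """Raise ValueError if a required column is missing."""
    missing = [c for c in REQUIRED_COLUMNS if c not in df.columns]
    if missing:
        raise ValueError(f"{stage}: missing columns {missing}")
    logger.info("%s: %d rows passed validation", stage, len(df))


def drop_duplicates(df: pd.DataFrame) -> pd.DataFrame:
    """Keep the first row for each order_id."""
    return df.drop_duplicates(subset="order_id", keep="first")


def coerce_amount(df: pd.DataFrame) -> pd.DataFrame:
    """Convert amount to numeric and drop rows that cannot be parsed."""
    out = df.copy()  # avoid mutating the caller's frame
    out["amount"] = pd.to_numeric(out["amount"], errors="coerce")
    return out.dropna(subset=["amount"])


def run_pipeline(df: pd.DataFrame) -> pd.DataFrame:
    """Validate, apply each transformation in order, then validate again."""
    validate(df, "input")
    for step in (drop_duplicates, coerce_amount):
        before = len(df)
        df = step(df)
        logger.info("%s: %d -> %d rows", step.__name__, before, len(df))
    validate(df, "output")
    return df.reset_index(drop=True)


raw = pd.DataFrame(
    {"order_id": [1, 1, 2, 3], "amount": ["10.5", "10.5", "oops", None]}
)
once = run_pipeline(raw)
twice = run_pipeline(once)  # a second run should change nothing
```

In this sketch, idempotence comes from each step being a pure function of its input: running the pipeline on its own output leaves the data unchanged, and the original frame is never modified in place.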
Also provide:
- A sample test with mock data
- Performance tips if the dataset is large (>1M rows)
- A brief explanation of each transformation step
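The first two deliverables might take a shape like the following sketch: a test against a small mock frame, plus one common large-data technique, reading a CSV in chunks with the `chunksize` parameter of `pd.read_csv`. The `region` and `sales` columns and both function names are assumptions made for illustration only.

```python
import io

import pandas as pd


def total_by_region(df: pd.DataFrame) -> pd.Series:
    """Sum sales per region (hypothetical transformation)."""
    return df.groupby("region")["sales"].sum()


def test_total_by_region() -> None:
    """Check the aggregation against hand-computed totals on mock data."""
    mock = pd.DataFrame(
        {"region": ["east", "west", "east"], "sales": [10, 5, 7]}
    )
    result = total_by_region(mock)
    assert result["east"] == 17
    assert result["west"] == 5


def total_by_region_chunked(source, chunksize: int = 100_000) -> pd.Series:
    """Same aggregation, reading the CSV in chunks to bound memory use."""
    partials = [
        total_by_region(chunk)
        for chunk in pd.read_csv(source, chunksize=chunksize)
    ]
    # Per-chunk sums can be combined by summing again per region.
    return pd.concat(partials).groupby(level=0).sum()


# An in-memory CSV stands in for a large file on disk.
csv_data = io.StringIO("region,sales\neast,10\nwest,5\neast,7\n")
chunked = total_by_region_chunked(csv_data, chunksize=2)
```

Chunking works here because a sum can be computed piecewise and recombined. Aggregations such as medians or distinct counts do not combine this way and would need a different approach.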
Variables
{{input_description}}: What the input data looks like (source, format, columns)
{{output_description}}: What the final output should look like
{{transformations}}: List of transformations needed (cleaning, aggregation, joins, etc.)
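If the template is filled in programmatically rather than by hand, the substitution could be sketched as below. The `fill_template` helper and the example values are hypothetical; only the three placeholder names come from the template itself.

```python
import re

TEMPLATE = (
    "Input data: {{input_description}}\n"
    "Desired output: {{output_description}}\n"
    "Transformations needed: {{transformations}}"
)


def fill_template(template: str, values: dict[str, str]) -> str:
    """Replace each {{name}} placeholder; raise KeyError if a value is missing."""

    def substitute(match: re.Match) -> str:
        return values[match.group(1)]

    return re.sub(r"\{\{(\w+)\}\}", substitute, template)


# Example values invented for this sketch.
filled = fill_template(
    TEMPLATE,
    {
        "input_description": "CSV export of orders (order_id, amount, created_at)",
        "output_description": "Parquet file with one row per order",
        "transformations": "deduplicate on order_id, cast amount to float",
    },
)
```

Raising on a missing value, rather than leaving the placeholder in place, keeps an incomplete prompt from being sent unnoticed.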