Text · Data Analysis · GPT · Claude · Gemini

Python Data Pipeline Builder

Generate production-ready Python code for data cleaning, transformation, and analysis pipelines using pandas with proper error handling.

Prompt Agent · Expert
February 14, 2026
Workshop · 3 variables

Write a Python data pipeline that accomplishes the following:

Input data:
Desired output:
Transformations needed:

Requirements:

  1. Use pandas for data manipulation
  2. Include proper error handling for common data issues (missing values, wrong types, duplicates)
  3. Add logging at each major step
  4. Make the pipeline idempotent (safe to run multiple times)
  5. Include data validation checks before and after transformation
  6. Add type hints and docstrings
  7. Make it modular — each transformation should be a separate function
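
As an illustration of what these requirements ask for, here is a minimal sketch of a modular, idempotent pandas pipeline. The column names (`order_id`, `amount`), the function names, and the mock rows are hypothetical and chosen only for the example; a real pipeline would take them from the variables filled into the prompt.

```python
import logging

import pandas as pd

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("pipeline")

# Hypothetical schema, used only for this sketch.
REQUIRED_COLUMNS = {"order_id", "amount"}


def validate_input(df: pd.DataFrame) -> None:
    """Fail fast if a required column is missing."""
    missing = REQUIRED_COLUMNS - set(df.columns)
    if missing:
        raise ValueError(f"missing columns: {sorted(missing)}")


def coerce_types(df: pd.DataFrame) -> pd.DataFrame:
    """Convert `amount` to numeric; unparseable values become NaN."""
    out = df.copy()
    out["amount"] = pd.to_numeric(out["amount"], errors="coerce")
    logger.info("coerce_types: %d missing or unparseable amounts",
                int(out["amount"].isna().sum()))
    return out


def drop_bad_rows(df: pd.DataFrame) -> pd.DataFrame:
    """Drop rows with missing values, then rows with a repeated order_id."""
    out = df.dropna(subset=["order_id", "amount"])
    out = out.drop_duplicates(subset="order_id", keep="first")
    logger.info("drop_bad_rows: %d -> %d rows", len(df), len(out))
    return out.reset_index(drop=True)


def validate_output(df: pd.DataFrame) -> None:
    """Check the invariants the cleaning steps are supposed to guarantee."""
    if df["order_id"].duplicated().any():
        raise ValueError("duplicate order_id after cleaning")
    if df["amount"].isna().any():
        raise ValueError("null amount after cleaning")


def run_pipeline(df: pd.DataFrame) -> pd.DataFrame:
    """Validate, clean, and re-validate; the input frame is never mutated."""
    validate_input(df)
    cleaned = drop_bad_rows(coerce_types(df))
    validate_output(cleaned)
    return cleaned


# Mock data: one duplicate order and one unparseable amount.
raw = pd.DataFrame({"order_id": [1, 1, 2, 3],
                    "amount": ["10.5", "10.5", "oops", "7"]})
once = run_pipeline(raw)
twice = run_pipeline(once)  # running on already-clean data changes nothing
```

Each step returns a new frame rather than editing in place, so feeding the cleaned output back into the pipeline yields the same result. That is one simple way to read the "idempotent" requirement.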

Also provide:

  • A sample test with mock data
  • Performance tips if the dataset is large (>1M rows)
  • A brief explanation of each transformation step
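
The mock-data test and the large-dataset tip could look something like the sketch below. It assumes a CSV source with hypothetical `region` and `amount` columns and a made-up helper name, `total_by_region`. It reads the file in chunks with `pandas.read_csv(..., chunksize=...)` and loads only the needed columns via `usecols`, so the full dataset never has to fit in memory at once.

```python
import io

import pandas as pd


def total_by_region(csv_source, chunksize: int = 100_000) -> pd.Series:
    """Sum `amount` per `region`, processing the file one chunk at a time."""
    partials = []
    reader = pd.read_csv(
        csv_source,
        usecols=["region", "amount"],   # skip columns the step does not need
        dtype={"amount": "float64"},    # explicit dtype avoids per-chunk inference
        chunksize=chunksize,
    )
    for chunk in reader:
        # Aggregate each chunk separately, keeping only the small partial result.
        partials.append(chunk.groupby("region")["amount"].sum())
    # Combine the per-chunk sums into one total per region.
    return pd.concat(partials).groupby(level=0).sum()


# Sample test with mock data; chunksize=2 forces more than one chunk.
mock_csv = io.StringIO(
    "region,amount,note\n"
    "north,1.0,a\n"
    "south,2.5,b\n"
    "north,3.0,c\n"
    "south,0.5,d\n"
)
totals = total_by_region(mock_csv, chunksize=2)
assert totals.to_dict() == {"north": 4.0, "south": 3.0}
```

This pattern only works directly for aggregations that can be combined from partial results, such as sums and counts. Operations that need the whole dataset at once, such as a global sort or an exact median, call for a different approach.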

Fill the variables

  • Input data: what the input data looks like (source, format, columns)
  • Desired output: what the final output should look like
  • Transformations needed: list of transformations needed (cleaning, aggregation, joins, etc.)
