How to use prompts for dbt Copilot

dbt Copilot

Best practices

Beginner

Overview

Learn how to write effective prompts for dbt Copilot to generate accurate SQL, models, metrics, and macros. Each recipe is self-contained with its own realistic example.

dbt Copilot is an AI assistant that generates SQL, YAML, documentation, tests, semantic models, and macros based on your project's context. The quality of output depends on the clarity of your prompts.

This cookbook provides independent recipes for common prompting tasks. Jump to any section that matches your current need.

This cookbook covers the following topics:

Prompt best practices
Generate SQL queries
Leveraging external assets
Create semantic models and metrics
Create reusable macros
Troubleshoot errors and issues
Conclusion

Prompt best practices

Writing effective prompts is about giving Copilot the right context and clear direction. Follow these three principles:

1. Provide rich context

Include table names, column types, and example values. Describe how they relate to each other.

What to include:

Table relationships (e.g., orders joins to customers on customer_id)
Data types (e.g., created_at is a timestamp)
Sample values (e.g., status is a string with values like "active" or "pending")

Example: Coffee Shop Punch Card

You run a neighborhood café. Folks get a free drink after 10 visits.

We've got customers, their subscriptions, and their app activity. 
Customers connect to subscriptions, and activity shows what they do week to week.

Help me see weekly regulars and which punch-card folks become subscribers 
to our 'Beans of the Month.'

Why it works: Clear people, clear behavior, clear relationship. Copilot knows what to count and how to connect it.

2. State the business question, not just the output

Describe the decision or insight the query supports. Avoid purely technical prompts.

Instead of: "Count users"

Say: "Count active users per week to analyze engagement trends"

Example: The Sneaker Drop

You run an online sneaker shop with a new limited-time drop.

We launched a 14-day trial. Did it lift weekly engagement and upgrades? 
If yes, we'll scale it. If not, we'll tweak the message.

Show me weekly browsers who became buyers, and whether trying the 3D preview 
led to more purchases.

Why it works: You've described the moment, the behavior, and the decision: keep the 3D preview or not.

3. Be clear and explicit about the result

Define the expected output clearly. Mention the expected columns in the final result and state whether results should be sorted, limited, or filtered.

What to specify:

Expected column names and formats
Sort order and any limits (e.g., "top 10 products by revenue")
Output format examples (e.g., "conversion_rate as a percentage")

Example: The Fitness Challenge

You run a fitness app with a 2-week challenge.

Give me a weekly trend with the date, active folks, and a simple 'engagement per person.' 
Then a summary by launch week with 'trial starts,' 'upgrades in 30 days,' 
and an 'upgrade rate' as a percentage.

Each week, show active challengers and total workouts. By challenge start week, 
show how many upgraded to paid within 30 days and what their average workouts looked like.

Why it works: Specific metrics that are ready to present.

Break complex logic into smaller steps

For multi-part queries, write them as a sequence of instructions:

Example:

First, filter the dataset to active users in the last 90 days.
Then, calculate their average session duration.
Finally, join to subscription data and group by plan tier.

Avoid asking for everything at once—iterative prompting yields better results.

Generate SQL queries

Use case: Build a customers model for an e-commerce platform.

What to give Copilot:

Context:
We have customers, orders, and payments tables. 
- customers (customer_id, first_name, last_name, email)
- orders (order_id, customer_id, order_date, status)
- payments (payment_id, order_id, amount, payment_method)

Relationships: 
- orders.customer_id → customers.customer_id
- payments.order_id → orders.order_id

Sample values: 
- status in ('completed', 'pending', 'cancelled')
- payment_method in ('credit_card', 'paypal', 'bank_transfer')

Business question:
Create a customer summary showing total orders and revenue per customer.

Output:
- customer_id
- first_name
- last_name  
- first_order_date
- most_recent_order_date
- total_orders
- total_revenue

Filter: Only completed orders
Sort: total_revenue descending

Why it works: You're giving Copilot a clear map of how data connects, what values to expect, and what decision this supports.

Pro tip: Start simple, then iterate. If Copilot's first attempt isn't perfect, refine your prompt with more specific details.

Leveraging external assets

When you have existing documentation, sample data, or business logic definitions, bring them into your prompts. This helps Copilot understand your specific context.

Surface business logic and transformations

Concept: An "active customer" badge is not a metric definition. The rule behind it is.

What to give Copilot:

Active customer means at least one paid purchase in the last 90 days, excluding refunds. 
Net revenue = gross minus discounts and returns.

External assets to reference:

Metrics glossary or KPI catalog from your wiki
Product requirement docs with business rules
Data dictionaries with field definitions

Include relevant sample data

Small, representative slices of data illustrate edge cases without overwhelming the prompt.

What to give Copilot:

Sample order statuses we need to handle:
- 'completed': Paid and fulfilled
- 'pending': Awaiting payment
- 'cancelled': Customer cancelled before shipping
- 'returned': Delivered but customer returned

External assets to reference:

Data profiling reports showing value distributions
QA test datasets with edge cases
BI dashboard filters and their definitions

Capture dependencies and lineage

Source-to-target mappings and upstream model references guide join logic.

What to give Copilot:

This model depends on:
- {{ ref('stg_orders') }} - raw order data, deduplicated
- {{ ref('stg_customers') }} - customer master data
- {{ ref('dim_products') }} - product catalog with current prices

Join orders to customers on customer_id (inner join, every order must have a customer).
Join orders to products on product_id (left join, some orders have deleted products).

External assets to reference:

dbt lineage graphs or documentation
Data warehouse ERD diagrams
Data flow documentation

Create semantic models and metrics

Use case: Fast-track your semantic layer strategy with AI-generated YAML.

What to give Copilot:

Create a semantic model for order revenue tracking.

Base model: {{ ref('fct_orders') }}

Available columns:
- order_date (timestamp)
- order_amount (decimal)
- customer_id (integer)
- product_id (integer)
- region (string - 'north', 'south', 'east', 'west')

Requirements:
- Entity: customer
- Measures: total_revenue (sum of order_amount), order_count
- Dimensions: region, order_date as metric_time (support day, week, month)
- Metric: monthly_revenue (total revenue by month)

Return valid YAML with descriptions.

What Copilot generates:

Valid semantic model YAML structure
Properly defined entities, measures, and dimensions
Time dimension with multiple grains
Metric definitions

Pro tip: Use Copilot to reduce time spent writing boilerplate YAML. It leverages context from common metrics and dimensions across projects to ensure consistency.

Create reusable macros

Use case 1: Turn repetitive code into reusable logic

A junior analyst keeps copy-pasting CASE statements across models.

What to give Copilot:

Turn this CASE pattern into a reusable macro:

CASE 
  WHEN amount >= 1000 THEN 'high'
  WHEN amount >= 500 THEN 'medium'
  ELSE 'low'
END

Macro requirements:
- Name: categorize_amount
- Parameters: column name, high threshold (default 1000), medium threshold (default 500)
- Include docstring with usage example
- Handle null values by returning 'unknown'

Why it works: Clear input (the CASE statement), clear requirements, clear output expectations.

Use case 2: Lower the barrier to entry

Scenario: You need a macro but don't know Jinja syntax well.

What to ask Copilot:

I need a macro that calculates the number of days between two date columns, 
excluding weekends.

Parameters:
- start_date_column (required)
- end_date_column (required)

Include a docstring explaining how to use it.

Outcome: Copilot generates proper Jinja syntax, handles parameters, and includes documentation. You learn Jinja patterns while getting working code.

Use case 3: Accelerate complex logic design (for advanced users)

What to ask Copilot:

I need a macro that builds a grouped aggregation with optional filters.

Parameters:
- relation (the model/table to query)
- group_by (list of columns to group by)
- metrics (list of columns to aggregate)
- where (optional filter condition)

Include defaults and guardrails for empty lists.
Add a docstring with parameter descriptions and usage example.

Think of it like: Writer's room sprint. You pitch the premise, Copilot drafts the scene beats.

Pro tip: After generating the macro, ask Copilot to add usage examples and document edge cases. This makes it easier for your team to adopt.

Troubleshoot errors and issues

Copilot acts as a fast, context-aware reviewer for failing SQL and macros. It reads errors, inspects your query structure, and suggests minimal fixes.

Why troubleshoot with Copilot?

Faster diagnosis: Plain-language translation of errors with likely root causes
Safer fixes: Bias toward small, targeted changes
Better learning: Generates explanations you can paste into docs or PR descriptions

Troubleshoot SQL errors

Step 1: Decode errors and ambiguity

Error text is a clue, not the crime. Pair the message with the context that produced it.

What to give Copilot:

Here's the failing SQL, the exact warehouse error, and which line it points to. 
Explain in plain terms and propose the smallest fix.

Error: "SQL compilation error: Column 'product_name' must appear in GROUP BY"

Query:
SELECT 
  product_id,
  product_name,
  SUM(quantity) as total_quantity
FROM inventory
GROUP BY product_id

Warehouse: Snowflake
What's wrong?

What Copilot provides:

Plain-language explanation
Minimal fix
Why it works

Step 2: Align assumptions with current schema

Queries encode assumptions. When the schema shifts, assumptions need a tune-up.

What to give Copilot:

The discount_amount column now contains NULLs (was always populated before).

My calculation breaks:
net_revenue = gross_revenue - discount_amount

Error: "Invalid argument types for function '-'"

What assumptions broke? What guardrails should I add?

What Copilot provides:

Broken assumption identified (assumed non-null)
Fix using coalesce(discount_amount, 0)
Additional safety checks (data tests, documentation)

Step 3: Reproduce with a minimal example

Shrink the surface area. If you can break it down, you can fix it fast.

What to give Copilot:

Create a reduced query using 5-10 sample rows that still triggers the error. 
Keep columns only if they influence the failure.

Sample data that breaks:
| order_id | customer_id | amount  | discount |
|----------|-------------|---------|----------|
| 1001     | 501         | 100.00  | NULL     |
| 1002     | NULL        | 150.00  | 10.00    |

Failed query calculates net_amount = amount - discount

What Copilot provides:

Minimal reproduction using VALUES clause
Root cause analysis
Recommended fixes with NULL handling

Troubleshoot macros

Step 1: Validate parameters and defaults

Most macro bugs start with assumptions about inputs.

What to give Copilot:

List required vs optional parameters. 
Suggest sensible defaults, type checks, and early error messages. 
Propose guards for empty lists and nulls.

Macro:
{% macro calculate_discount(amount, rate) %}
  {{ amount }} * {{ rate }}
{% endmacro %}

What Copilot provides:

Parameter validation suggestions
Default values
Null/empty guards

Step 2: Confirm what was actually rendered

Compare intent to the SQL your macro produced.

What to give Copilot:

Here's the macro and the rendered SQL side-by-side. 
Highlight where the SQL diverges from the intended parameters.

Macro call: {{ my_macro('order_date', 'ship_date') }}

Rendered SQL: (check target/compiled)
...paste rendered SQL here...

Expected behavior: Should calculate days between dates, but getting wrong result

What Copilot provides:

Side-by-side comparison
Identifies where macro logic diverges
Minimal fix to align output with intent

Step 3: Sanitize identifiers, quoting, and whitespace

Jinja often fails quietly when quoting or spacing is off.

What to give Copilot:

Review identifier quoting, string quoting, and whitespace in this macro.
Replace brittle concatenation with safer helpers.

{% macro filter_by_status(column, status) %}
  where {{ column }} = '{{ status }}'
{% endmacro %}

Warehouse: Snowflake (case-sensitive identifiers)

What Copilot provides:

Quoting fixes
Whitespace corrections
Recommendations for using adapter.quote() or other built-in helpers

Step 4: Make logic testable and documented

Fix the bug, then add bumpers so it doesn't return.

What to give Copilot:

After the minimal code change, draft a docstring with parameter descriptions, 
a usage example, and propose two tests: one success case, one edge case.

Conclusion

You've now learned how to use dbt Copilot as your AI co-pilot. You can:

Master SQL prompting by providing rich context and stating clear business questions
Amplify your workflow by leveraging existing documentation and project context
Generate Jinja macros to build more scalable and maintainable systems
Troubleshoot your code to diagnose issues fast and apply safe, explainable fixes

Quick reference checklist

When writing prompts for dbt Copilot:

✅ Provide rich context: Table names, columns, data types, relationships, sample values
✅ State the business question: What decision or insight you're supporting, not just "write a query"
✅ Be clear and explicit: Expected columns, sort order, filters, and output format
✅ Break down complex logic: Write multi-part queries as a sequence of steps

For troubleshooting:

✅ Include complete error messages: Full warehouse error with line numbers
✅ Show the failing code: Both the dbt model and compiled SQL (from target/compiled/)
✅ Provide sample data: Representative rows that trigger the issue
✅ State your warehouse: Snowflake, BigQuery, Databricks, etc.

Next steps

Start with one task—automating documentation, generating a test, or refactoring a model—and build the habit from there. The more you use Copilot, the more you'll discover ways to accelerate your analytics engineering workflow.

Was this page helpful?

This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.

How to use prompts for dbt Copilot

Overview

Prompt best practices

1. Provide rich context

2. State the business question, not just the output

3. Be clear and explicit about the result

Break complex logic into smaller steps

Generate SQL queries

Leveraging external assets

Surface business logic and transformations

Include relevant sample data

Capture dependencies and lineage

Create semantic models and metrics

Create reusable macros

Use case 1: Turn repetitive code into reusable logic

Use case 2: Lower the barrier to entry

Use case 3: Accelerate complex logic design (for advanced users)

Troubleshoot errors and issues

Why troubleshoot with Copilot?

Troubleshoot SQL errors

Troubleshoot macros

Conclusion

Quick reference checklist

Next steps

Was this page helpful?

Start building with dbt.

Resources

Community

Support

Connect with Us

Overview​

Prompt best practices​

1. Provide rich context​

2. State the business question, not just the output​

3. Be clear and explicit about the result​

Break complex logic into smaller steps​

Generate SQL queries​

Leveraging external assets​

Surface business logic and transformations​

Include relevant sample data​

Capture dependencies and lineage​

Create semantic models and metrics​

Create reusable macros​

Use case 1: Turn repetitive code into reusable logic​

Use case 2: Lower the barrier to entry​

Use case 3: Accelerate complex logic design (for advanced users)​

Troubleshoot errors and issues​

Why troubleshoot with Copilot?​

Troubleshoot SQL errors​

Troubleshoot macros​

Conclusion​

Quick reference checklist​

Next steps​

Was this page helpful?

Resources

Community

Support

Connect with Us

Overview

Prompt best practices

1. Provide rich context

2. State the business question, not just the output

3. Be clear and explicit about the result

Break complex logic into smaller steps

Generate SQL queries

Leveraging external assets

Surface business logic and transformations

Include relevant sample data

Capture dependencies and lineage

Create semantic models and metrics

Create reusable macros

Use case 1: Turn repetitive code into reusable logic

Use case 2: Lower the barrier to entry

Use case 3: Accelerate complex logic design (for advanced users)

Troubleshoot errors and issues

Why troubleshoot with Copilot?

Troubleshoot SQL errors

Troubleshoot macros

Conclusion

Quick reference checklist

Next steps