Introduction to Jinja Templates
This guide covers Jinja template syntax for combining data columns into formatted prompts. You’ll learn how templates work, how to reference your data columns, and how to avoid common mistakes.
Introduction
What Are Templates?
Templates are fill-in-the-blank forms that combine multiple data columns into a single formatted prompt. Your dataset might have separate context, question, and instructions columns that need to be combined in a specific way.
Template:
{{row.context}}
Question: {{row.question}}
This combines your context and question columns into a formatted prompt.
Why Use Templates?
- Write once, use many times - Define your format once, apply to thousands of examples
- Consistency - Every prompt uses the same format
- Flexibility - Reference any columns from your data
Without templates, you’d need to manually format each prompt or write custom code. With templates, you define the format once and apply it to thousands of examples automatically.
Basic Template Syntax
Access any column from your data using {{row.COLUMN_NAME}}, where COLUMN_NAME is the actual column name.
What is row? row represents one example (one record) from your dataset. Each column in your data becomes a field you can access on row.
Key rules:
- Use double curly brackets: {{ and }}
- Always include the row. prefix: {{row.question}}
- Variable names are case-sensitive and must match your data exactly
- Space inside brackets is optional: {{row.text}} or {{ row.text }}
For advanced Jinja2 features (loops, conditionals, filters), see the Advanced Features section below.
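To try the basic syntax outside of any particular tool, the substitution can be reproduced with the jinja2 Python package. This is a minimal sketch: it assumes a row can be modeled as a plain dict and that the template is rendered with a variable named row, which may differ from how your platform renders templates internally.

```python
from jinja2 import Template

# One record from the dataset; each key plays the role of a column.
# (Hypothetical sample data, shortened for illustration.)
row = {
    "context": "Harper Lee wrote the novel.",
    "question": "Who wrote the novel?",
}

# Spaces inside the brackets are optional, so both styles are mixed here.
template = Template("{{row.context}}\nQuestion: {{ row.question }}")

# Passing the dict as `row` makes every key reachable as row.KEY.
prompt = template.render(row=row)
print(prompt)
```

Rendering the same template once per record is what lets a single format apply to every example in a dataset.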
Common Template Patterns
Now that you understand the basics, here are the most common template patterns you’ll encounter.
Question Answering
Your data:
{
'question': 'Who wrote the novel "To Kill a Mockingbird"?',
'context': 'Harper Lee wrote the novel "To Kill a Mockingbird", which was published in 1960. The book won the Pulitzer Prize and became a classic of modern American literature.',
'answer': 'Harper Lee'
}
Template:
{{row.context}}
Question: {{row.question}}
Provide a brief answer based only on the context above.
After transformation:
Harper Lee wrote the novel "To Kill a Mockingbird", which was published in 1960. The book won the Pulitzer Prize and became a classic of modern American literature.
Question: Who wrote the novel "To Kill a Mockingbird"?
Provide a brief answer based only on the context above.
Expected output: Harper Lee
This template combines multiple columns (context and question) into a formatted prompt, while the answer column serves as the expected output for evaluation.
Sentiment Classification
Your data:
| text | label | split |
|---|---|---|
| This film was terrible. | neg | train |
Template:
{{row.text}}
Is this review positive or negative?
After transformation:
This film was terrible.
Is this review positive or negative?
Expected output: negative
This simple template demonstrates basic classification. The label column holds the expected output (stored as neg, meaning negative), while split is metadata that can be ignored. The template pulls only the fields you need.
Multiple Choice
Your data:
{
'question': 'A 55-year-old patient presents with progressive weakness. EMG shows decremental response. Most likely diagnosis?',
'opa': 'Myasthenia gravis',
'opb': 'Lambert-Eaton syndrome',
'opc': 'Polymyositis',
'opd': 'Guillain-Barré syndrome',
'subject_name': 'Medicine',
'cop': 'A'
}
Template:
{{row.question}}
A) {{row.opa}}
B) {{row.opb}}
C) {{row.opc}}
D) {{row.opd}}
Respond with only the letter of the correct answer (A, B, C, or D).
After transformation:
A 55-year-old patient presents with progressive weakness. EMG shows decremental response. Most likely diagnosis?
A) Myasthenia gravis
B) Lambert-Eaton syndrome
C) Polymyositis
D) Guillain-Barré syndrome
Respond with only the letter of the correct answer (A, B, C, or D).
Expected output: A
Note: If your options are in a list instead of named columns, use {{row.options[0]}}, {{row.options[1]}}, etc.
Variable-Length Multiple Choice
For datasets with many options (e.g., 10 choices) or variable numbers of options, use a loop instead of hardcoding each one.
Your data (MMLU-Pro style):
{
'question': 'Which of the following statements about quantum mechanics is correct?',
'options': [
'Wave function collapse occurs instantaneously',
'Particles have definite positions before measurement',
'Quantum entanglement allows faster-than-light communication',
'Heisenberg uncertainty principle is a measurement limitation',
'Quantum superposition only applies to microscopic systems',
'Observer effect requires conscious observation',
'Quantum tunneling violates energy conservation',
'Wave-particle duality means particles are sometimes waves',
'Quantum decoherence explains wave function collapse',
'Schrodinger equation applies to classical systems'
],
'category': 'Physics',
'answer_index': 8
}
Template:
Subject: {{row.category}}
{{row.question}}
{% for i in range(row.options|length) %}
{{ ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J'][i] }}) {{row.options[i]}}
{% endfor %}
Provide your answer as a single letter (A-J).
How this works:
- row.options|length gets the number of options
- range(row.options|length) creates indices: 0, 1, 2…
- ['A', 'B', 'C', ...][i] maps index to letter
- {{row.options[i]}} gets the option text
This scales from 2 to 10 options without changing the template. For more than 10 options, extend the letter list.
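The loop can be checked locally with the jinja2 Python package. The sketch below uses a hypothetical three-option row and renders the same template twice, because each {% for %} and {% endfor %} line leaves a newline behind by default. Whether your platform enables Jinja's trim_blocks setting is an assumption to verify; if it does not, expect blank lines between options.

```python
from jinja2 import Environment

TEMPLATE = (
    "{{row.question}}\n"
    "{% for i in range(row.options|length) %}\n"
    "{{ ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J'][i] }}) {{row.options[i]}}\n"
    "{% endfor %}\n"
    "Provide your answer as a single letter."
)

# Hypothetical row with three options; the template itself is unchanged
# for any count up to the ten letters in the list.
row = {
    "question": "Which planet is largest?",
    "options": ["Mars", "Jupiter", "Venus"],
}

# Default settings: the newline after each block tag is kept in the output.
default_output = Environment().from_string(TEMPLATE).render(row=row)

# trim_blocks=True drops the first newline after each block tag.
trimmed_output = Environment(trim_blocks=True).from_string(TEMPLATE).render(row=row)

print(trimmed_output)
```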
When to use loops:
- Variable number of items (different questions have different numbers of options)
- Many items (10+ options)
When NOT to use loops:
- Fixed number of fields (always 4 options → just use A-D for clarity)
Data Transformation Examples
Real-world datasets often have quirks. Here are examples showing how to handle common patterns.
Winogrande
Winogrande uses 1-indexed answers and column names numbered starting from 1.
Your data:
| sentence | option1 | option2 | answer |
|---|---|---|---|
| The trophy doesn’t fit in the brown suitcase because _ is too large. | the trophy | the suitcase | 1 |
Template:
{{row.sentence}}
Option 1: {{row.option1}}
Option 2: {{row.option2}}
Which option best fills the blank?
After transformation:
The trophy doesn't fit in the brown suitcase because _ is too large.
Option 1: the trophy
Option 2: the suitcase
Which option best fills the blank?
The answer column contains “1” or “2” to indicate which option is correct. Note: Jinja templates don’t support nested variable substitution like {{row.option{{row.answer}}}}. If you need to access columns dynamically, use conditional logic (see Advanced Features).
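As a sketch of the conditional approach, the snippet below picks the correct option text based on the answer column, using the jinja2 Python package. The |string filter is there because it is an assumption whether answer is stored as text or as a number; the row is a plain dict standing in for one record.

```python
from jinja2 import Template

# Choose option1 or option2 with an if/else instead of nested braces.
answer_template = Template(
    "{% if row.answer|string == '1' %}{{row.option1}}"
    "{% else %}{{row.option2}}{% endif %}"
)

row = {
    "sentence": "The trophy doesn't fit in the brown suitcase because _ is too large.",
    "option1": "the trophy",
    "option2": "the suitcase",
    "answer": "1",
}

correct_option = answer_template.render(row=row)
print(correct_option)

# The same template also handles a numeric answer column.
other_option = answer_template.render(row={**row, "answer": 2})
print(other_option)
```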
Dolly Dataset
The Dolly dataset has instruction, context, and response columns that need to be combined:
Your data:
| instruction | context | response | category |
|---|---|---|---|
| When did Virgin Australia start operating? | Virgin Australia, the trading name of Virgin Australia Airlines Pty Ltd, is an Australian-based airline. It commenced services on 31 August 2000 as Virgin Blue… | Virgin Australia commenced services on 31 August 2000 | closed_qa |
Template:
{{row.context}}
Question: {{row.instruction}}
Answer the question based on the context above.
After transformation:
Virgin Australia, the trading name of Virgin Australia Airlines Pty Ltd, is an Australian-based airline. It commenced services on 31 August 2000 as Virgin Blue...
Question: When did Virgin Australia start operating?
Answer the question based on the context above.
This lets you combine multiple data columns while ignoring metadata like category and response.
Best Practices
Be Explicit and Simple
Clear instructions get better results. Avoid verbose or vague prompts.
- Too vague: “Classify.”
- Too verbose: “Given the following sentence fragment which has been extracted from the corpus with metadata identifier…”
- Just right: “Classify the sentiment as positive or negative.”
Keep templates readable. If it’s hard for you to read, it’s hard for the model to follow.
Common Mistakes
Rendering Literal Braces
If you need to display literal {{ or }} in your output (not as template variables):
Wrong:
Use {{ to start a variable. # Will cause template error
Correct:
Use {{ '{{' }} to start a variable.
Use {{ '}}' }} to end a variable.
This escapes the braces so they appear in the output literally.
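The escape can be confirmed with the jinja2 Python package. The sketch below also shows Jinja's {% raw %} block as an alternative for longer literal passages; whether your platform allows raw blocks is an assumption to check.

```python
from jinja2 import Template

# Quoted braces are ordinary string literals, so they render as-is.
escaped = Template(
    "Use {{ '{{' }} to start a variable and {{ '}}' }} to end it."
).render()
print(escaped)

# Everything between raw and endraw is copied to the output untouched.
raw_block = Template("{% raw %}{{row.question}}{% endraw %}").render()
print(raw_block)
```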
Wrong Brackets
Use {{double_curly}} brackets, not {single}, [square], or other styles.
Missing row Prefix
Always use {{row.column_name}} - the row. prefix is required.
Typos and Non-Existent Columns
Column names must match exactly (case-sensitive). {{row.questoin}} won’t work if the column is question. Only reference columns that actually exist in your data.
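How a misspelled column fails depends on how the template engine is configured. The sketch below uses the jinja2 Python package directly: with default settings, an unknown field silently renders as empty text, while the StrictUndefined setting turns it into an error. Which behavior your platform uses is an assumption to verify, so it can be worth testing templates locally in strict mode.

```python
from jinja2 import Environment, StrictUndefined
from jinja2.exceptions import UndefinedError

row = {"question": "Who wrote the novel?"}
typo_template = "Q: {{row.questoin}}"  # deliberate typo: questoin

# Default behavior: the unknown field renders as an empty string.
lenient_output = Environment().from_string(typo_template).render(row=row)
print(repr(lenient_output))

# Strict behavior: rendering the unknown field raises an error.
strict_env = Environment(undefined=StrictUndefined)
error_message = None
try:
    strict_env.from_string(typo_template).render(row=row)
except UndefinedError as exc:
    error_message = str(exc)
print(error_message)
```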
Unclosed Brackets
Use exactly two brackets on each side: {{row.question}}, not {{row.question} or {{row.question}}}.
Column Names with Spaces
Rename columns to use underscores: question_type instead of Question Type.
Advanced Features
The basics above handle most cases. This section covers loops, conditionals, and filters for variable-length data and optional fields.
Loops and Conditionals
Loops:
{% for item in row.list_field %}
{{item.property}}
{% endfor %}
Use loop.index (1-based), loop.index0 (0-based), loop.first, loop.last for control.
Conditionals:
{% if row.optional_field %}
{{row.optional_field}}
{% endif %}
Filters: |length, |upper, |lower, |default('value'). List slicing also works: row.list[:3]
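A short sketch of these filters in action, using the jinja2 Python package and a hypothetical row. The |join filter is a built-in Jinja filter that is not listed above; it is used here only to print the sliced list cleanly.

```python
from jinja2 import Template

# Hypothetical row; note there is no "source" key.
row = {"label": "neg", "options": ["alpha", "beta", "gamma"]}

template = Template(
    "{{row.label|upper}} / {{row.options|length}} options / "
    "first two: {{row.options[:2]|join(', ')}} / "
    "source: {{row.source|default('unknown')}}"
)

summary = template.render(row=row)
print(summary)
```

Here |default supplies a fallback because the row has no source field.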
Escaping braces:
{{ '{{' }} and {{ '}}' }}
Variable-Length Lists
For data with variable numbers of items (e.g., 2-10 options), use loops:
{% for i in range(row.options|length) %}
{{ ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J'][i] }}) {{row.options[i]}}
{% endfor %}
Optional Fields and Conditional Formatting
Some datasets have optional fields that may or may not be present. Use conditionals to handle these gracefully.
Dataset example (Medical QA with optional fields):
{
'question': 'A 45-year-old man presents with chest pain radiating to his left arm and jaw. He is diaphoretic and short of breath. What is the most likely diagnosis?',
'options': {
'A': 'Myocardial infarction',
'B': 'Pulmonary embolism',
'C': 'Pneumothorax',
'D': 'Gastroesophageal reflux disease'
},
'explanation': 'The presentation is classic for acute MI with typical radiation pattern...'
}
Template:
Clinical Case:
{{row.question}}
{% if row.options %}
Consider the following diagnoses:
{% for letter, diagnosis in row.options.items() %}
{{letter}}) {{diagnosis}}
{% endfor %}
Select the most likely diagnosis.
{% else %}
Provide a detailed differential diagnosis.
{% endif %}
{% if row.explanation %}
Reference explanation:
{{row.explanation}}
{% endif %}
Key patterns:
- {% if row.field %} - Check if field exists and is not empty
- {% else %} - Alternative when condition is false
- {% endif %} - Close the conditional
- row.options.items() - Iterate over dictionary as key-value pairs
This template works whether a row includes multiple-choice options or is an open-ended question.
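To see both branches, a shortened version of the template can be rendered against one row with options and one without, using the jinja2 Python package. The rows are hypothetical, and trim_blocks=True is an assumed setting that removes the newline each block tag would otherwise leave behind.

```python
from jinja2 import Environment

env = Environment(trim_blocks=True)
template = env.from_string(
    "{{row.question}}\n"
    "{% if row.options %}\n"
    "{% for letter, diagnosis in row.options.items() %}\n"
    "{{letter}}) {{diagnosis}}\n"
    "{% endfor %}\n"
    "Select the most likely diagnosis.\n"
    "{% else %}\n"
    "Provide a detailed differential diagnosis.\n"
    "{% endif %}\n"
)

# Row with options: the multiple-choice branch is rendered.
with_options = template.render(row={
    "question": "Chest pain radiating to the left arm?",
    "options": {"A": "Myocardial infarction", "B": "Pulmonary embolism"},
})

# Row without an options field: the else branch is rendered.
without_options = template.render(row={
    "question": "Chest pain radiating to the left arm?",
})

print(with_options)
print(without_options)
```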
Common Advanced Mistakes
- Use {% for %} not {for} - needs percent signs
- Inside loops, use loop variable directly: {{passage}} not {{row.passage}}
- Use filter syntax: row.options|length not len(row.options)
- Close all blocks: {% if %} needs {% endif %}
- Access list items with brackets: row.passages[0] not row.passages.0