Uploading Results
Uploading Evaluation Results to Trismik's Dashboard
After running evaluations with Scorebook, you can upload your results to the Trismik platform for centralized tracking, analysis, and collaboration. This enables you to visualize performance trends, compare different models, and share results with your team.
Prerequisites
Before uploading results to Trismik, you need:
- Valid Trismik API credentials - Get your API key from the Trismik dashboard
- A Trismik project - Create a project on the Trismik dashboard to organize your evaluations
- Authentication setup - Configure your API key either via environment variable or login
Authentication
Environment Variable (Recommended)
Set your API key as an environment variable:
export TRISMIK_API_KEY="your-api-key-here"
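If you want to confirm the key is actually visible to the Python process before kicking off an evaluation, a minimal check like the one below works; it uses only the standard library plus the TRISMIK_API_KEY name from this guide:

import os

# Fail fast if the key is missing, rather than letting auto mode silently skip the upload
if not os.environ.get("TRISMIK_API_KEY"):
    raise RuntimeError("TRISMIK_API_KEY is not set; export it before running evaluations")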
Login Function
Alternatively, log in programmatically using the login() function:
import os
from scorebook import login
api_key = os.environ.get("TRISMIK_API_KEY")
login(api_key)
The login() function saves your API key locally for future use. You only need to call it once per environment.
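In interactive settings such as a notebook, where exporting an environment variable is awkward, you can prompt for the key instead. getpass is standard library; login() is the same Scorebook function shown above:

from getpass import getpass
from scorebook import login

# Prompt for the key without echoing it to the terminal or notebook output
login(getpass("Trismik API key: "))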
Uploading Results
Automatic Upload
When authenticated, you can enable automatic result uploads by providing experiment_id and project_id to the evaluate() function:
from scorebook import evaluate, EvalDataset
from scorebook.metrics import Accuracy

# Set up your inference function
def my_inference(eval_items, **hyperparameters):
    # Your inference logic here
    pass

# Load your dataset
dataset = EvalDataset.from_json(
    file_path="path/to/dataset.json",
    label="answer",
    metrics=Accuracy
)

# Run evaluation with automatic upload
results = evaluate(
    inference=my_inference,
    datasets=dataset,
    hyperparameters={"temperature": 0.7},
    experiment_id="my-experiment",  # Creates the experiment if it doesn't exist
    project_id="your-project-id",   # Must already exist on the Trismik dashboard
    metadata={"model": "gpt-4", "version": "1.0"},
    return_items=True
)
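The exact structure of the returned results object isn't documented here, so the snippet below is just a convenient way to eyeball the local copy and compare it with what later appears on the dashboard:

from pprint import pprint

# Print whatever evaluate() returned; the shape depends on return_items and your metrics
pprint(results)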
Manual Upload Control
You can explicitly control result uploading with the upload_results parameter:
# Force upload even when no experiment_id is provided
results = evaluate(
    inference=my_inference,
    datasets=dataset,
    upload_results=True,
    project_id="your-project-id"
)

# Disable upload even when authenticated
results = evaluate(
    inference=my_inference,
    datasets=dataset,
    upload_results=False
)

# Auto mode (default) - uploads if authenticated and IDs are provided
results = evaluate(
    inference=my_inference,
    datasets=dataset,
    upload_results="auto"  # This is the default
)
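One workable pattern is to decide the upload mode outside the call, for example from an environment variable, so that CI runs upload while local debugging runs do not. The SCOREBOOK_UPLOAD variable name below is invented for this sketch, and it reuses my_inference and dataset from the earlier example; only upload_results itself is a Scorebook parameter:

import os

# Hypothetical convention: set SCOREBOOK_UPLOAD=1 in CI, leave it unset locally
should_upload = os.environ.get("SCOREBOOK_UPLOAD") == "1"

results = evaluate(
    inference=my_inference,
    datasets=dataset,
    project_id="your-project-id",
    upload_results=should_upload,  # True in CI, False everywhere else
)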
Understanding Upload Behavior
| Condition | Upload Behavior |
| --- | --- |
| upload_results=True + authenticated | Always uploads |
| upload_results=True + not authenticated | Upload fails |
| upload_results=False | Never uploads |
| upload_results="auto" + authenticated + IDs provided | Uploads automatically |
| upload_results="auto" + not authenticated | No upload |
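If it helps to reason about a configuration before running it, the table translates directly into a small predicate. This is illustrative logic mirroring the table, not part of the Scorebook API:

def will_upload(upload_results, authenticated, ids_provided):
    """Predict upload behavior according to the table above (illustrative only)."""
    if upload_results is True:
        if not authenticated:
            raise RuntimeError("upload_results=True requires authentication")
        return True
    if upload_results is False:
        return False
    # "auto": upload only when authenticated and experiment/project IDs are provided
    return authenticated and ids_provided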
Metadata and Organization
Experiment Organization
- Projects: Top-level containers for related experiments
- Experiments: Specific evaluation campaigns (created automatically if they don't exist)
- Runs: Individual evaluation executions with specific hyperparameters (see the sketch below)
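As a concrete sketch of that hierarchy: calling evaluate() several times with the same project_id and experiment_id but different hyperparameters should appear as separate runs under a single experiment, assuming the behavior described above. This reuses my_inference and dataset from the earlier example:

# Each call is one evaluation execution, i.e. one run under the same experiment
for temperature in (0.2, 0.7):
    evaluate(
        inference=my_inference,
        datasets=dataset,
        hyperparameters={"temperature": temperature},
        experiment_id="prompt-optimization",  # one experiment
        project_id="your-project-id",
        metadata={"model": "gpt-4"},
    )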
Adding Metadata
Include relevant metadata to enhance result tracking:
metadata = {
    "model": "microsoft/Phi-4-mini-instruct",
    "version": "1.2.0",
    "dataset_version": "v2",
    "notes": "Testing new prompt template",
    "environment": "production"
}

results = evaluate(
    inference=my_inference,
    datasets=dataset,
    experiment_id="prompt-optimization",
    project_id="your-project-id",
    metadata=metadata
)
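Because metadata is a plain dictionary, parts of it can be filled in programmatically. The git lookup below is an optional convenience and assumes git is available on the machine running the evaluation:

import subprocess
from datetime import datetime, timezone

# Record the current commit and a UTC timestamp alongside the hand-written fields
commit = subprocess.run(
    ["git", "rev-parse", "--short", "HEAD"],
    capture_output=True, text=True,
).stdout.strip() or "unknown"

metadata = {
    "model": "microsoft/Phi-4-mini-instruct",
    "git_commit": commit,
    "run_started_at": datetime.now(timezone.utc).isoformat(),
    "notes": "Testing new prompt template",
}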
Verification
After uploading, verify your results appear on the Trismik dashboard:
- Navigate to your project
- Check the experiment list
- View individual run details and metrics
For a complete, runnable example, see Scorebook Example 8, which demonstrates the full upload workflow.