Inference Pipelines

Create Modular Inference Pipelines with Reusable Processing Functions

Inference pipelines separate the inference process into three distinct stages, allowing for more modular and reusable code:

  1. Pre-processing: Convert raw dataset items into model-ready input format
  2. Inference: Execute model predictions on preprocessed data
  3. Post-processing: Extract final answers from raw model outputs

These stages can be encapsulated in reusable functions and composed into pipelines. An inference pipeline can then be passed to the evaluate function's inference parameter.


Creating Inference Pipelines

from scorebook import InferencePipeline

# Create an inference pipeline
inference_pipeline = InferencePipeline(
    model="model-name",
    preprocessor=preprocessor,
    inference_function=inference,
    postprocessor=postprocessor,
)

When a Scorebook InferencePipeline is passed to evaluate, the following process is executed:

  1. Each evaluation item in the evaluation dataset is processed by the preprocessor
  2. The list of pre-processed items is passed to the inference function, which returns a list of model outputs
  3. Each model output is parsed by the postprocessor to produce a prediction for scoring

To run a complete evaluation with this inference pipeline, see Scorebook's Example 3.
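The three stages above can be sketched end to end in plain Python. The stub functions below are illustrative stand-ins (not Scorebook APIs) that show how data flows from dataset items to scored predictions:

```python
def preprocess(item):
    # Stage 1: map an evaluation item to a model-ready input.
    return [{"role": "user", "content": item["question"]}]

def run_model(inputs):
    # Stage 2: stand-in for batched model inference.
    return ["echo: " + messages[-1]["content"] for messages in inputs]

def postprocess(output):
    # Stage 3: extract the final prediction from a raw output.
    return output.removeprefix("echo: ")

dataset = [{"question": "2 + 2 = ?"}, {"question": "Capital of France?"}]

inputs = [preprocess(item) for item in dataset]   # step 1
outputs = run_model(inputs)                       # step 2
predictions = [postprocess(o) for o in outputs]   # step 3
```

An InferencePipeline packages these three callables so that evaluate can run this loop for you.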

Pre-processing

The preprocessor function maps evaluation items in an Eval Dataset to model inputs. This typically involves constructing a prompt from an evaluation item and wrapping it in a chat messages format, though the details depend on the evaluation dataset and model used.

from typing import Any, Dict

def preprocessor(eval_item: Dict[str, Any], **hyperparameter_config: Any) -> Any:
    """Convert an evaluation item to a valid model input.

    Args:
        eval_item: An evaluation item from an EvalDataset.
        hyperparameter_config: Model hyperparameters.

    Returns:
        A structured representation of an evaluation item for model input.
    """
    messages = [
        {
            "role": "system",
            "content": hyperparameter_config["system_message"],
        },
        {"role": "user", "content": eval_item["question"]},
    ]

    return messages
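As a self-contained illustration (the function is repeated without type hints so the snippet runs on its own, and the item and system message are made up), a sample item maps to a two-turn messages list:

```python
def preprocessor(eval_item, **hyperparameter_config):
    # Build a chat-style messages list from one evaluation item.
    return [
        {"role": "system", "content": hyperparameter_config["system_message"]},
        {"role": "user", "content": eval_item["question"]},
    ]

messages = preprocessor(
    {"question": "What is the capital of France?"},
    system_message="Answer concisely.",
)
# messages[0] is the system turn, messages[1] is the user turn
```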

Inference

The inference function takes the list of pre-processed items and returns a list of raw model outputs.

from typing import Any, List

def inference(preprocessed_items: List[Any], **hyperparameter_config: Any) -> List[Any]:
    """Run model inference on preprocessed eval items.

    Args:
        preprocessed_items: The list of pre-processed items for an EvalDataset.
        hyperparameter_config: Model hyperparameters.

    Returns:
        A list of model outputs for an EvalDataset.
    """
    # `pipeline` is a previously constructed model callable,
    # e.g. a Hugging Face text-generation pipeline.
    return [
        pipeline(model_input, temperature=hyperparameter_config["temperature"])
        for model_input in preprocessed_items
    ]
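The list-in/list-out contract of the inference stage can be exercised without a real model. Here `fake_pipeline` is a made-up stand-in mimicking the output shape of a chat-style text-generation pipeline:

```python
def fake_pipeline(model_input, temperature=1.0):
    # Stand-in for a real model call: append an assistant turn and wrap the
    # result in the list-of-dicts shape a text-generation pipeline returns.
    reply = {"role": "assistant", "content": "(model reply)"}
    return [{"generated_text": model_input + [reply]}]

def inference(preprocessed_items, **hyperparameter_config):
    # One raw output per pre-processed item, in order.
    return [
        fake_pipeline(m, temperature=hyperparameter_config["temperature"])
        for m in preprocessed_items
    ]

outputs = inference([[{"role": "user", "content": "Hi"}]], temperature=0.0)
```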

Post-processing

The postprocessor function parses each model output into a prediction for metric scoring.

from typing import Any

def postprocessor(model_output: Any, **hyperparameter_config: Any) -> str:
    """Extract the final parsed answer from the model output.

    Args:
        model_output: A raw model output from the inference function.
        hyperparameter_config: Model hyperparameters.

    Returns:
        Parsed answer from the model output, to be used for scoring.
    """
    return str(model_output[0]["generated_text"][-1]["content"])
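The indexing above assumes the raw output has the shape produced by a chat-style text-generation pipeline: a one-element list whose "generated_text" holds the full message history. A self-contained check against a fabricated output of that shape:

```python
# Fabricated raw output in the assumed shape; the final assistant
# turn carries the answer to be scored.
model_output = [
    {
        "generated_text": [
            {"role": "user", "content": "What is 2 + 2?"},
            {"role": "assistant", "content": "4"},
        ]
    }
]

def postprocessor(model_output, **hyperparameter_config):
    # Take the content of the last message in the generated history.
    return str(model_output[0]["generated_text"][-1]["content"])

answer = postprocessor(model_output)
```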