Batch Comparison

Compare how different models handle the same prompt — or how different prompt versions perform across models.

Build the grid

In the Setup panel on the right:

Click Add Version to add prompt versions as rows — choose from your current tabs, the published version, or historical snapshots.
Click Add Model to add AI models as columns — models are grouped by provider.

Every version × model combination creates a testable cell.
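
To make the grid concrete, the sketch below models it as a Cartesian product of versions and models. The version labels and model names are hypothetical placeholders, not identifiers from the product; the real grid is built in the Setup panel.

```python
from itertools import product

# Hypothetical labels for illustration only.
versions = ["current tab", "published", "snapshot"]  # rows
models = ["model-a", "model-b"]                      # columns

# One testable cell per (version, model) pair.
cells = list(product(versions, models))


def row(version):
    """Cells in one row: a single prompt version across all models."""
    return [cell for cell in cells if cell[0] == version]


def column(model):
    """Cells in one column: a single model across all prompt versions."""
    return [cell for cell in cells if cell[1] == model]


print(len(cells))  # 3 versions x 2 models = 6 cells
```

Adding one more version adds a full row of cells, and adding one more model adds a full column, so the grid grows multiplicatively.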

Click Run All to execute the entire grid, or run individual rows, columns, or cells. Responses stream in as they complete.
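
As a rough illustration of "responses stream in as they complete", the sketch below starts every cell at once and collects results in completion order rather than grid order. The run_cell function and its delay values are assumptions that stand in for real model calls; this is not how the product is implemented.

```python
import asyncio


async def run_cell(version, model, delay):
    # Stand-in for a model call; `delay` simulates response latency.
    await asyncio.sleep(delay)
    return (version, model)


async def run_all(cells):
    # Start every cell concurrently, then gather results as each one finishes.
    tasks = [asyncio.create_task(run_cell(v, m, d)) for v, m, d in cells]
    finished = []
    for next_done in asyncio.as_completed(tasks):
        finished.append(await next_done)
    return finished


# Hypothetical cells with simulated latencies in seconds.
cells = [
    ("v1", "model-a", 0.15),
    ("v1", "model-b", 0.05),
    ("v2", "model-a", 0.10),
]
results = asyncio.run(run_all(cells))
```

Running a single row, column, or cell corresponds to passing a filtered subset of cells to run_all.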

A comparison panel below the grid shows responses side by side, making it easy to spot differences in quality, length, or style.

Each model runs with its own settings, independent of the other columns. Click a model's column header to adjust its temperature, max tokens, or reasoning effort.
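
One way to picture per-model settings is a separate settings record per column, so changing one model leaves the others untouched. The ModelSettings class, its field names, and the default values below are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ModelSettings:
    # Hypothetical defaults; the product's actual defaults are not stated.
    temperature: float = 1.0
    max_tokens: int = 1024
    reasoning_effort: str = "medium"


# One independent settings record per model column.
settings = {"model-a": ModelSettings(), "model-b": ModelSettings()}

# Adjusting one column does not affect the other.
settings["model-a"] = replace(settings["model-a"], temperature=0.2)
```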