Model Benchmark Grouped Bar Chart

Publication-quality grouped bar chart comparing models across multiple benchmarks with error bars.

Prompt

A grouped bar chart comparing F1 scores of 5 models across 3 benchmark datasets.

Models (with bar colors):
- BERT (slate)
- RoBERTa (steel blue)
- DeBERTa (teal)
- GPT-4 (amber)
- Claude (deep purple)

Datasets (x-axis groups):
- SQuAD: BERT 88.2, RoBERTa 90.1, DeBERTa 91.3, GPT-4 89.7, Claude 92.0
- MNLI: BERT 84.6, RoBERTa 87.2, DeBERTa 89.1, GPT-4 88.4, Claude 90.3
- SST-2: BERT 93.5, RoBERTa 95.0, DeBERTa 95.6, GPT-4 96.1, Claude 96.4

Y-axis: F1 score (%), range 80–100, gridlines every 5.
Error bars: thin black whiskers showing 95% CI (±0.4 to ±0.8 per bar).

Legend: top-right inside the plot area.
Title: "Model F1 across QA / NLI / Sentiment benchmarks".

Style: clean academic look, minimal palette, no chart junk, sans-serif. Match Nature / Science figure style.

When to use

For results sections that compare 3–5 models across 2–4 benchmark datasets.

Variations

Horizontal bars, sorted

Same data, but render as horizontal bars sorted by mean F1 across benchmarks. One panel per benchmark stacked vertically, sharing the y-axis labels.
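Because the generator only approximates ordering as well as bar heights, it can help to work out the sort order yourself and state it in the prompt. A minimal sketch using the scores from the prompt above; the dictionary layout and variable names are illustrative assumptions, not part of paperbanana:

```python
# F1 scores copied from the prompt, keyed by model and then by dataset.
scores = {
    "BERT":    {"SQuAD": 88.2, "MNLI": 84.6, "SST-2": 93.5},
    "RoBERTa": {"SQuAD": 90.1, "MNLI": 87.2, "SST-2": 95.0},
    "DeBERTa": {"SQuAD": 91.3, "MNLI": 89.1, "SST-2": 95.6},
    "GPT-4":   {"SQuAD": 89.7, "MNLI": 88.4, "SST-2": 96.1},
    "Claude":  {"SQuAD": 92.0, "MNLI": 90.3, "SST-2": 96.4},
}

# Mean F1 per model across the three benchmarks.
mean_f1 = {model: sum(by_ds.values()) / len(by_ds) for model, by_ds in scores.items()}

# Highest mean first; this is the top-to-bottom order to request in every panel.
order = sorted(mean_f1, key=mean_f1.get, reverse=True)

for model in order:
    print(f"{model}: mean F1 {mean_f1[model]:.1f}")
```

Listing the models in this order in the prompt keeps the shared y-axis labels consistent across the stacked panels.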

With statistical-significance asterisks

Add significance asterisks (* p<0.05, ** p<0.01) on top of bars where the score is significantly higher than the BERT baseline. Add a small footnote explaining the test (paired bootstrap, B=1000).
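The asterisks should come from a test you have actually run. The sketch below is one common form of a paired bootstrap with B=1000, under two stated assumptions: you have per-example scores for both systems on the same examples, and the metric is a mean of per-example scores (a corpus-level F1 would need to be recomputed on each resample instead). The function names are hypothetical.

```python
import random

def paired_bootstrap_p(baseline, candidate, b=1000, seed=0):
    """Estimate a one-sided p-value: the share of resamples in which the
    candidate's mean score does not exceed the baseline's."""
    if len(baseline) != len(candidate):
        raise ValueError("paired test needs scores on the same examples")
    rng = random.Random(seed)
    n = len(baseline)
    not_better = 0
    for _ in range(b):
        # Resample example indices with replacement, keeping the pairing intact.
        idx = [rng.randrange(n) for _ in range(n)]
        delta = sum(candidate[i] - baseline[i] for i in idx) / n
        if delta <= 0:
            not_better += 1
    return not_better / b

def stars(p):
    """Map a p-value to the asterisk notation used in the prompt."""
    if p < 0.01:
        return "**"
    if p < 0.05:
        return "*"
    return ""
```

Run it once per model and benchmark against the BERT scores, then list the resulting asterisks bar by bar in the prompt rather than leaving them to the generator.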

Tips

- Always specify the y-axis range. Generators tend to default to 0–100, which compresses differences of a few points into near-identical bars.
- List exact numbers. Without them the bars will look plausible but will not reflect your results.
- Name the error-bar statistic (95% CI, SE, or SD) and give its size, so the generator draws whiskers of an appropriate length.
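To see why the axis range matters, a quick calculation with the SQuAD scores from the prompt shows how much of the axis height the gap between the best and worst model occupies. The helper name is an illustrative assumption:

```python
squad = [88.2, 90.1, 91.3, 89.7, 92.0]  # SQuAD F1 scores from the prompt
spread = max(squad) - min(squad)        # gap between best and worst model

def share_of_axis(spread, y_min, y_max):
    """Fraction of the visible axis height taken up by the spread."""
    return spread / (y_max - y_min)

full = share_of_axis(spread, 0, 100)     # generator default range
zoomed = share_of_axis(spread, 80, 100)  # range requested in the prompt

print(f"0-100 axis: {full:.1%} of the height")
print(f"80-100 axis: {zoomed:.1%} of the height")
```

The same 3.8-point gap takes up about five times more of the plot on the 80–100 axis.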

FAQ

Why do my bars look slightly different from my data?

Image-generation models only approximate positions and bar heights; they do not plot your values exactly. For the final submitted figure, use the prompt to draft the layout, then redraw the chart in matplotlib or PGFPlots with your exact numbers.
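A redraw in matplotlib might look like the sketch below. The scores, axis range, title, and legend position come from the prompt. The hex colors, bar width, figure size, and the single `CI_HALF_WIDTH` placeholder are assumptions, so substitute your own per-bar confidence intervals.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the script runs without a display
import matplotlib.pyplot as plt
import numpy as np

datasets = ["SQuAD", "MNLI", "SST-2"]
scores = {  # one value per dataset, in the order above
    "BERT":    [88.2, 84.6, 93.5],
    "RoBERTa": [90.1, 87.2, 95.0],
    "DeBERTa": [91.3, 89.1, 95.6],
    "GPT-4":   [89.7, 88.4, 96.1],
    "Claude":  [92.0, 90.3, 96.4],
}
# Assumed hex values for slate, steel blue, teal, amber, deep purple.
colors = {"BERT": "#708090", "RoBERTa": "#4682B4", "DeBERTa": "#008080",
          "GPT-4": "#FFBF00", "Claude": "#4B0082"}
CI_HALF_WIDTH = 0.6  # placeholder; replace with your measured 95% CI per bar

x = np.arange(len(datasets))  # one group position per dataset
width = 0.16                  # assumed bar width; 5 bars fill 0.8 of each group

fig, ax = plt.subplots(figsize=(7, 4))
for i, (model, vals) in enumerate(scores.items()):
    # Center the five bars of each group around the group position.
    offset = (i - (len(scores) - 1) / 2) * width
    ax.bar(x + offset, vals, width, label=model, color=colors[model],
           yerr=CI_HALF_WIDTH, ecolor="black", capsize=2,
           error_kw={"elinewidth": 0.8})

ax.set_xticks(x)
ax.set_xticklabels(datasets)
ax.set_ylim(80, 100)
ax.set_yticks(range(80, 101, 5))
ax.yaxis.grid(True, linewidth=0.5, alpha=0.5)  # horizontal gridlines every 5
ax.set_axisbelow(True)                          # keep gridlines behind the bars
ax.set_ylabel("F1 score (%)")
ax.set_title("Model F1 across QA / NLI / Sentiment benchmarks")
ax.legend(loc="upper right", frameon=False)
```

Calling `fig.savefig("benchmark_bars.pdf", bbox_inches="tight")` afterwards writes a vector file, which journals generally prefer to PNG.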

Can I get an Excel-friendly export?

Not directly: paperbanana outputs PNG only. To keep the chart tied to your data, use the generated image as a layout reference, then reproduce that layout in matplotlib code that reads the values from your CSV.
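Reading the values from a CSV can be as small as the sketch below. The column names `dataset`, `model`, and `f1` are an assumed layout, and `load_scores` is a hypothetical helper; adjust both to match your own file.

```python
import csv
import io

def load_scores(csv_file):
    """Return {model: {dataset: f1}} from rows with dataset, model, f1 columns."""
    scores = {}
    for row in csv.DictReader(csv_file):
        scores.setdefault(row["model"], {})[row["dataset"]] = float(row["f1"])
    return scores

# In-memory stand-in for a real file; with a file on disk, pass
# open("results.csv", newline="") instead.
sample = io.StringIO(
    "dataset,model,f1\n"
    "SQuAD,BERT,88.2\n"
    "SQuAD,Claude,92.0\n"
    "MNLI,BERT,84.6\n"
)
scores = load_scores(sample)
```

The resulting dictionary can feed the plotting code directly, and the CSV itself remains the Excel-friendly copy of the data.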