Performance metrics across hardware, software, and model configurations
We are working to certify additional internal benchmarks for publication. If you're interested in providing hardware or have questions, email benchmarks@haimaker.ai.
| Configuration | Model provider | Output TPS | Input TPS | Energy Cost (kWh/MT) |
|---|---|---|---|---|
| NVIDIA A100-PCIE-40GB (1x) - Mistral-Nemo-Instruct | mistral | 3,541.62 | 6,567.89 | 0.01 |
| NVIDIA H100 80GB HBM3 (8x) - gpt-oss-120b | openai | 18,672.47 | 50,200.55 | 0.02 |
| NVIDIA H100 80GB HBM3 (8x) - llama-2-70b-hf | meta-llama | 668.64 | 855.76 | 0.79 |
| NVIDIA H100 80GB HBM3 (8x) - llama-3.3-70b-instruct | meta-llama | 9,219.60 | 16,108.82 | 0.06 |
| NVIDIA H200 NVL (2x) - mistral-nemo-instruct-2407 | mistralai | 12,204.48 | 47,690.47 | 0.01 |
| NVIDIA H200 NVL (2x) - qwen3-30b-a3b | qwen | 6,124.38 | 51,413.77 | 0.00 |
| NVIDIA H200 NVL (2x) - allam-7b-instruct-preview | humain-ai | 11,481.64 | 45,184.12 | 0.01 |
| NVIDIA H200 NVL (2x) - llama-2-70b-hf (50% Max Batch Token) | meta-llama | 4,620.81 | 8,844.22 | 0.03 |
| NVIDIA H200 NVL (2x) - llama-2-70b-hf | meta-llama | 5,012.77 | 10,466.05 | 0.03 |
| NVIDIA H200 NVL (2x) - gpt-oss-120b | openai | 3,166.06 | 11,929.37 | 0.01 |
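
Because the configurations above use different GPU counts, raw Output TPS is hard to compare across rows. The sketch below normalizes a few rows to output tokens per second per GPU. It assumes that the "(Nx)" suffix is the number of GPUs and that the reported TPS is the aggregate across all of them; neither is stated on this page, and the helper names are hypothetical. Runs may also differ in batch settings, so treat the ranking as a rough guide only.

```python
# Selected rows copied from the table above:
# (hardware, assumed GPU count, model, output TPS).
ROWS = [
    ("NVIDIA A100-PCIE-40GB", 1, "Mistral-Nemo-Instruct", 3541.62),
    ("NVIDIA H100 80GB HBM3", 8, "gpt-oss-120b", 18672.47),
    ("NVIDIA H100 80GB HBM3", 8, "llama-3.3-70b-instruct", 9219.60),
    ("NVIDIA H200 NVL", 2, "mistral-nemo-instruct-2407", 12204.48),
    ("NVIDIA H200 NVL", 2, "gpt-oss-120b", 3166.06),
]


def output_tps_per_gpu(output_tps: float, gpu_count: int) -> float:
    """Divide aggregate output TPS by the (assumed) number of GPUs."""
    if gpu_count <= 0:
        raise ValueError("gpu_count must be positive")
    return output_tps / gpu_count


def rank_by_per_gpu(rows):
    """Return (hardware, model, per-GPU TPS) tuples, highest first."""
    normalized = [
        (hardware, model, output_tps_per_gpu(tps, gpus))
        for hardware, gpus, model, tps in rows
    ]
    return sorted(normalized, key=lambda row: row[2], reverse=True)


if __name__ == "__main__":
    for hardware, model, per_gpu in rank_by_per_gpu(ROWS):
        print(f"{hardware:<24} {model:<28} {per_gpu:>10.2f}")
```

Under these assumptions, the 2x H200 NVL run of mistral-nemo-instruct-2407 comes out ahead per GPU, even though the 8x H100 run of gpt-oss-120b has the highest aggregate Output TPS.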