@artificialanlys:
Artificial Analysis' benchmarks show Grok 4 is the leading AI model, a first for xAI, and its per-token pricing is higher than that of Gemini 2.5 Pro and o3.
xAI gave us early access to Grok 4, and the results are in. Grok 4 is now the leading AI model. We have run our full suite of benchmarks and Grok 4 achieves an Artificial Analysis Intelligence Index of 73, ahead of OpenAI o3 at 70, Google Gemini 2.5 Pro at 70, Anthropic Claude…