ChessBench

Puzzle Browser
Model score: 92/150 (61.3%)
Showing all puzzle types.
150 puzzles, viewing 1 / 150
Puzzle 1
Mate 1800-1200
Black to move and find checkmate in 1 move.
Last move: d2c3
Result: Correct
Expected: Qc2# (g2c2)
Model: Qc2# (g2c2)
Prompt: 476
Completion: 684
Latency: 3874 ms
Raw Output: g2c2
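The puzzle record above pairs a standard-notation move (Qc2#) with a coordinate move (g2c2), and the raw output is the coordinate form. The following is a minimal sketch of how such an output could be checked, assuming grading is an exact match on the coordinate (UCI-style) string. The page does not state the actual grading rule, and `normalize_uci` and `is_correct` are hypothetical names used only for illustration.

```python
import re
from typing import Optional

# UCI-style coordinate move: from-square, to-square, optional promotion piece.
UCI_MOVE = re.compile(r"[a-h][1-8][a-h][1-8][qrbn]?")


def normalize_uci(raw_output: str) -> Optional[str]:
    """Return the cleaned move if it looks like a coordinate move, else None."""
    candidate = raw_output.strip().lower()
    return candidate if UCI_MOVE.fullmatch(candidate) else None


def is_correct(raw_output: str, expected_uci: str) -> bool:
    """Assumed grading rule: exact match on the normalized coordinate string."""
    move = normalize_uci(raw_output)
    return move is not None and move == expected_uci


# Puzzle 1: the raw output "g2c2" matches the expected move "g2c2".
puzzle_1_ok = is_correct("g2c2", "g2c2")
```

Under this assumption a standard-notation answer such as "Qc2#" would be rejected, because turning it into coordinates needs the board position, which this sketch does not model.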
Benchmarks
Sortable leaderboard across models
21 models
Total Cost: 117.3360
Total Tokens: 28,159,864
Rank | Model | Accuracy | Track breakdown | Cost | Tokens
#1 | Gemini 3.5 Flash | 61.3% | 53.3% / 40% / 93.3% / 76.7% / 43.3% | 2.6116 | 354,119
#2 | Grok 4.1 Fast | 58.7% | 33.3% / 53.3% / 93.3% / 73.3% / 40% | 2.3765 | 2,294,216
#3 | Gemini 3.1 Pro Preview | 55.3% | 43.3% / 50% / 83.3% / 56.7% / 43.3% | 4.5200 | 433,511
#4 | Grok 4.20 | 54.7% | 33.3% / 30% / 86.7% / 76.7% / 46.7% | 4.2760 | 1,759,699
#5 | Grok 4.20 Beta | 53.3% | 20% / 43.3% / 90% / 83.3% / 30% | 7.8989 | 1,730,162
#6 | Grok 4.3 | 48.7% | 23.3% / 36.7% / 90% / 63.3% / 30% | 3.1718 | 1,327,996
#7 | Gemini 3 Flash Preview | 48.7% | 36.7% / 30% / 83.3% / 53.3% / 40% | 0.8700 | 624,635
#8 | Gemini 3.1 Flash Image Preview | 46.0% | 30% / 33.3% / 90% / 40% / 36.7% | 2.4400 | 481,223
#9 | Claude Fable 5 | 46.0% | 13.3% / 50% / 90% / 43.3% / 33.3% | 14.3957 | 376,073
#10 | GPT-5.4 | 32.7% | 6.7% / 23.3% / 76.7% / 43.3% / 13.3% | 7.6800 | 607,885
#11 | GPT-5.5 | 29.3% | 3.3% / 20% / 90% / 20% / 13.3% | 15.4424 | 610,078
#12 | Qwen 3.6 Plus | 28.0% | 20% / 10% / 86.7% / 16.7% / 6.7% | 0.0000 | 7,667,154
#13 | Claude Opus 4.8 | 24.0% | 20% / 13.3% / 60% / 16.7% / 10% | 11.8021 | 594,598
#14 | Qwen 3.6 Plus Preview | 22.0% | 13.3% / 23.3% / 56.7% / 16.7% / 0% | 0.0000 | 4,428,821
#15 | Claude Opus 4.7 | 18.7% | 13.3% / 13.3% / 60% / 6.7% / 0% | 4.4340 | 320,032
#16 | Claude Opus 4.6 | 16.7% | 13.3% / 6.7% / 46.7% / 10% / 6.7% | 13.0000 | 617,547
#17 | GLM-5.1 | 12.7% | 6.7% / 16.7% / 33.3% / 3.3% / 3.3% | 6.7580 | 1,615,126
#18 | GLM-5 | 12.0% | 20% / 6.7% / 33.3% / 0% / 0% | 1.8700 | 689,388
#19 | Claude Sonnet 4.6 | 10.7% | 3.3% / 6.7% / 40% / 3.3% / 0% | 8.9400 | 704,037
#20 | Claude Haiku 4.5 | 8.7% | 13.3% / 0% / 30% / 0% / 0% | 0.1490 | 359,946
#21 | Gemini 2.5 Pro | 5.3% | 10% / 0% / 16.7% / 0% / 0% | 4.7000 | 563,618
Sort by accuracy and track-level breakdown; includes total cost and total tokens.
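The relationship between the overall accuracy and the five track percentages can be checked with a little arithmetic. The sketch below assumes each of the five tracks holds 30 of the 150 puzzles; that split is inferred from the totals and is not stated on the page. The `Row` type and the function names are hypothetical, and only three leaderboard rows are copied in for brevity.

```python
from dataclasses import dataclass
from typing import List, Tuple

PUZZLES_PER_TRACK = 30  # assumption: 150 puzzles split evenly over 5 tracks


@dataclass(frozen=True)
class Row:
    model: str
    accuracy: float            # overall accuracy, percent
    tracks: Tuple[float, ...]  # five track-level accuracies, percent
    cost: float
    tokens: int


# Three rows copied from the leaderboard, deliberately out of order.
ROWS = [
    Row("Gemini 3 Flash Preview", 48.7, (36.7, 30.0, 83.3, 53.3, 40.0), 0.8700, 624_635),
    Row("Gemini 3.5 Flash", 61.3, (53.3, 40.0, 93.3, 76.7, 43.3), 2.6116, 354_119),
    Row("Grok 4.1 Fast", 58.7, (33.3, 53.3, 93.3, 73.3, 40.0), 2.3765, 2_294_216),
]


def sort_by_accuracy(rows: List[Row]) -> List[Row]:
    """Highest accuracy first; sorted() is stable, so ties keep input order."""
    return sorted(rows, key=lambda r: r.accuracy, reverse=True)


def solved_from_tracks(row: Row) -> int:
    """Recover the solved-puzzle count from the rounded track percentages."""
    return sum(round(p / 100 * PUZZLES_PER_TRACK) for p in row.tracks)


def overall_from_tracks(row: Row) -> float:
    """Overall accuracy in percent, rebuilt from the per-track solved counts."""
    total = PUZZLES_PER_TRACK * len(row.tracks)
    return round(100 * solved_from_tracks(row) / total, 1)


ranked = [r.model for r in sort_by_accuracy(ROWS)]
top_solved = solved_from_tracks(ROWS[1])  # Gemini 3.5 Flash: 16 + 12 + 28 + 23 + 13 = 92
```

For Gemini 3.5 Flash the recovered count is 92, matching the 92/150 shown in the puzzle browser. Averaging the rounded track percentages directly can be off by a tenth of a point (Grok 4.1 Fast averages to 58.64%, listed as 58.7%), which is why the sketch converts to solved counts first.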