Powered by Code Arena

WebDev Leaderboard

Compare the performance of AI models on web development tasks built in the Code Arena.

Last Updated: Dec 23, 2025
Total Votes: 75,257
Total Models: 32

| Rank | Rank Spread | Score | CI (+/-) | Votes | Organization | License |
|------|-------------|-------|----------|-------|--------------|---------|
| 1 | 1◄─►1 | 1520 | +12/-12 | 4,088 | Anthropic | Proprietary |
| 2 | 2◄─►5 | 1484 | +17/-17 | 1,647 | OpenAI | Proprietary |
| 3 | 2◄─►5 | 1480 | +12/-12 | 4,010 | Anthropic | Proprietary |
| 4 | 2◄─►5 | 1478 | +10/-10 | 9,066 | Google | Proprietary |
| 5 | 2◄─►5 | 1465 | +13/-13 | 2,233 | Google | Proprietary |
| 6 | 6◄─►12 | 1398 | +12/-12 | 3,949 | OpenAI | Proprietary |
| 7 | 6◄─►13 | 1398 | +15/-15 | 1,641 | OpenAI | Proprietary |
| 8 | 6◄─►12 | 1393 | +9/-9 | 8,150 | Anthropic | Proprietary |
| 9 | 6◄─►13 | 1392 | +10/-10 | 5,191 | OpenAI | Proprietary |
| 10 | 6◄─►13 | 1388 | +9/-9 | 7,786 | Anthropic | Proprietary |
| 11 | 6◄─►13 | 1387 | +9/-9 | 9,174 | Anthropic | Proprietary |
| 12 | 6◄─►15 | 1381 | +14/-14 | 1,883 | Google | Proprietary |
| 13 | 12◄─►15 | 1367 | +9/-9 | 7,489 | Z.ai | MIT |
| 14 | 8◄─►17 | 1366 | +16/-16 | 1,404 | DeepSeek AI | MIT |
| 15 | 12◄─►16 | 1360 | +9/-9 | 7,108 | OpenAI | Proprietary |
| 16 | 15◄─►18 | 1341 | +9/-9 | 6,882 | Moonshot | Modified MIT |
| 17 | 14◄─►19 | 1337 | +18/-18 | 1,039 | Xiaomi | MIT |
| 18 | 16◄─►19 | 1335 | +10/-10 | 5,287 | OpenAI | Proprietary |
| 19 | 17◄─►19 | 1316 | +9/-9 | 7,592 | MiniMax | Apache 2.0 |
| 20 | 20◄─►23 | 1293 | +10/-10 | 5,161 | DeepSeek AI | MIT |
| 21 | 20◄─►23 | 1290 | +9/-9 | 7,857 | Anthropic | Proprietary |
| 22 | 20◄─►23 | 1289 | +9/-9 | 7,756 | Alibaba | Apache 2.0 |
| 23 | 20◄─►25 | 1281 | +15/-15 | 1,707 | DeepSeek AI | MIT |
| 24 | 23◄─►25 | 1263 | +15/-15 | 1,946 | KwaiKAT | Proprietary |
| 25 | 23◄─►27 | 1251 | +17/-17 | 1,565 | OpenAI | Proprietary |
| 26 | 25◄─►29 | 1226 | +13/-13 | 3,720 | xAI | Proprietary |
| 27 | 25◄─►29 | 1225 | +20/-20 | 1,027 | Mistral | Apache 2.0 |
| 28 | 26◄─►29 | 1212 | +13/-13 | 3,505 | Google | Proprietary |
| 29 | 26◄─►29 | 1205 | +19/-19 | 1,262 | xAI | Proprietary |
| 30 | 30◄─►31 | 1152 | +23/-23 | 945 | xAI | Proprietary |
| 31 | 30◄─►32 | 1142 | +21/-21 | 1,014 | xAI | Proprietary |
| 32 | 31◄─►32 | 1102 | +22/-22 | 1,033 | Mistral | Proprietary |

Leaderboard Plots

Average Win Rate Against All Other Models (Uniform Sampling and No Ties)

Fraction of Model A Wins for All Non-tied A vs. B Battles

Battle Count for Each Combination of Models (without Ties)

Confidence Intervals on Model Strength (via Bootstrapping)
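
The last plot title refers to bootstrapped confidence intervals. As a rough illustration of the general idea only, the sketch below computes a percentile-bootstrap interval for a simpler statistic, the win rate of one model over non-tied battles. It is not the Arena's actual pipeline: the function name `bootstrap_win_rate_ci`, its parameters, and the battle data are all hypothetical, and the leaderboard's intervals are on model strength rather than raw win rate.

```python
import random


def bootstrap_win_rate_ci(outcomes, n_resamples=1000, alpha=0.05, seed=0):
    """Percentile-bootstrap interval for the win rate of 0/1 outcomes.

    Hypothetical helper for illustration; not taken from the leaderboard.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(outcomes)
    rates = []
    for _ in range(n_resamples):
        # Resample the battles with replacement and recompute the win rate.
        sample = rng.choices(outcomes, k=n)
        rates.append(sum(sample) / n)
    rates.sort()
    # Take the alpha/2 and 1 - alpha/2 percentiles of the resampled rates.
    lo = rates[int((alpha / 2) * n_resamples)]
    hi = rates[int((1 - alpha / 2) * n_resamples) - 1]
    point = sum(outcomes) / n
    return point, lo, hi


# Made-up data: 1 = model A won, 0 = model B won (ties already removed).
battles = [1] * 60 + [0] * 40
point, lo, hi = bootstrap_win_rate_ci(battles)
print(f"win rate {point:.2f}, interval [{lo:.2f}, {hi:.2f}]")
```

With fewer battles the resampled win rates spread out more, which matches the pattern in the table above: models with fewer votes tend to show wider intervals.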