{"id":386,"date":"2025-07-20T06:01:16","date_gmt":"2025-07-20T06:01:16","guid":{"rendered":"https:\/\/blog.adlington.fr\/index.php\/2025\/07\/20\/openais-gold-medal-performance-on-the-international-math-olympiad\/"},"modified":"2025-07-20T06:01:16","modified_gmt":"2025-07-20T06:01:16","slug":"openais-gold-medal-performance-on-the-international-math-olympiad","status":"publish","type":"post","link":"https:\/\/blog.adlington.fr\/index.php\/2025\/07\/20\/openais-gold-medal-performance-on-the-international-math-olympiad\/","title":{"rendered":"OpenAI&#8217;s gold medal performance on the International Math Olympiad"},"content":{"rendered":"<blockquote><p>So what\u2019s different? We developed new techniques that make LLMs a lot better at hard-to-verify tasks. IMO problems were the perfect challenge for this: proofs are pages long and take experts hours to grade. Compare that to AIME, where answers are simply an integer from 0 to 999.<\/p>\n<p>Also this model thinks for a long time. o1 thought for seconds. Deep Research for minutes. This one thinks for hours. Importantly, it\u2019s also more efficient with its thinking. And there\u2019s a lot of room to push the test-time compute and efficiency further.<br \/>\n\u2014 Read on <a href=\"https:\/\/simonwillison.net\/2025\/Jul\/19\/openai-gold-medal-math-olympiad\/\">simonwillison.net\/2025\/Jul\/19\/openai-gold-medal-math-olympiad\/<\/a><\/p>\n<\/blockquote>\n","protected":false},"excerpt":{"rendered":"<p>So what\u2019s different? We developed new techniques that make LLMs a lot better at hard-to-verify tasks. IMO problems were the perfect challenge for this: proofs are pages long and take experts hours to grade. Compare that to AIME, where answers are simply an integer from 0 to 999. Also this model thinks for a long [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_jetpack_memberships_contains_paid_content":false,"footnotes":""},"categories":[1],"tags":[],"class_list":["post-386","post","type-post","status-publish","format-standard","hentry","category-blog"],"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"_links":{"self":[{"href":"https:\/\/blog.adlington.fr\/index.php\/wp-json\/wp\/v2\/posts\/386","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/blog.adlington.fr\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blog.adlington.fr\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blog.adlington.fr\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/blog.adlington.fr\/index.php\/wp-json\/wp\/v2\/comments?post=386"}],"version-history":[{"count":0,"href":"https:\/\/blog.adlington.fr\/index.php\/wp-json\/wp\/v2\/posts\/386\/revisions"}],"wp:attachment":[{"href":"https:\/\/blog.adlington.fr\/index.php\/wp-json\/wp\/v2\/media?parent=386"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/blog.adlington.fr\/index.php\/wp-json\/wp\/v2\/categories?post=386"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/blog.adlington.fr\/index.php\/wp-json\/wp\/v2\/tags?post=386"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}