{"id":445,"date":"2025-09-29T04:38:09","date_gmt":"2025-09-29T04:38:09","guid":{"rendered":"https:\/\/blog.adlington.fr\/index.php\/2025\/09\/29\/video-models-are-zero-shot-learners-and-reasoners\/"},"modified":"2025-09-29T04:38:09","modified_gmt":"2025-09-29T04:38:09","slug":"video-models-are-zero-shot-learners-and-reasoners","status":"publish","type":"post","link":"https:\/\/blog.adlington.fr\/index.php\/2025\/09\/29\/video-models-are-zero-shot-learners-and-reasoners\/","title":{"rendered":"Video models are zero-shot learners and reasoners"},"content":{"rendered":"<blockquote><p>Perception, modeling, and manipulation all integrate to tackle visual reasoning. While language models manipulate human-invented symbols, video models can apply changes across the dimensions of the real world: time and space. Since these changes are applied frame-by-frame in a generated video, this parallels chain-of-thought in LLMs and could therefore be called chain-of-frames, or CoF for short. In the language domain, chain-of-thought enabled models to tackle reasoning problems. Similarly, chain-of-frames (a.k.a. video generation) might enable video models to solve challenging visual problems that require step-by-step reasoning across time and space.<br \/>\n\u2014 Read on <a href=\"https:\/\/simonwillison.net\/2025\/Sep\/27\/video-models-are-zero-shot-learners-and-reasoners\/\">simonwillison.net\/2025\/Sep\/27\/video-models-are-zero-shot-learners-and-reasoners\/<\/a><\/p>\n<\/blockquote>\n","protected":false},"excerpt":{"rendered":"<p>Perception, modeling, and manipulation all integrate to tackle visual reasoning. While language models manipulate human-invented symbols, video models can apply changes across the dimensions of the real world: time and space. Since these changes are applied frame-by-frame in a generated video, this parallels chain-of-thought in LLMs and could therefore be called chain-of-frames, or CoF for [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_jetpack_memberships_contains_paid_content":false,"footnotes":""},"categories":[1],"tags":[],"class_list":["post-445","post","type-post","status-publish","format-standard","hentry","category-blog"],"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"_links":{"self":[{"href":"https:\/\/blog.adlington.fr\/index.php\/wp-json\/wp\/v2\/posts\/445","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/blog.adlington.fr\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blog.adlington.fr\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blog.adlington.fr\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/blog.adlington.fr\/index.php\/wp-json\/wp\/v2\/comments?post=445"}],"version-history":[{"count":0,"href":"https:\/\/blog.adlington.fr\/index.php\/wp-json\/wp\/v2\/posts\/445\/revisions"}],"wp:attachment":[{"href":"https:\/\/blog.adlington.fr\/index.php\/wp-json\/wp\/v2\/media?parent=445"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/blog.adlington.fr\/index.php\/wp-json\/wp\/v2\/categories?post=445"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/blog.adlington.fr\/index.php\/wp-json\/wp\/v2\/tags?post=445"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}