
Cognition AI, a stealthy AI startup, has announced a $21 million Series A funding round led by Founders Fund, alongside the launch of its first product, Devin, an autonomous AI software engineer. The company claims Devin sets a new state of the art on the SWE-Bench coding benchmark and says it has passed practical engineering interviews and completed real jobs on Upwork.
Cognition describes Devin as a "tireless, skilled teammate" that can plan and execute complex engineering tasks requiring thousands of decisions. Equipped with developer tools like a shell, code editor, and browser within a sandboxed environment, Devin can learn new technologies, build and deploy apps end-to-end, find and fix bugs, train AI models, and contribute to mature production repositories.

On the SWE-Bench benchmark, which tasks AI agents with resolving real-world GitHub issues, Devin correctly resolved 13.86% of issues unassisted. This significantly outperforms the previous state-of-the-art model, which achieved only 1.96% unassisted and 4.80% when told exactly which files to edit.
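
To put the gap in perspective, the figures quoted above can be compared directly. The short Python sketch below uses only the three percentages reported in this article; the variable names are illustrative, and the ratios are simple arithmetic rather than numbers published by Cognition.

```python
# Resolution rates (percent of SWE-Bench issues resolved), as quoted above.
devin_unassisted = 13.86
previous_unassisted = 1.96
previous_assisted = 4.80  # previous best when told exactly which files to edit

# Relative improvement: how many times more issues Devin resolved.
ratio_vs_unassisted = devin_unassisted / previous_unassisted
ratio_vs_assisted = devin_unassisted / previous_assisted

# Absolute gap, in percentage points.
gap_vs_unassisted = devin_unassisted - previous_unassisted
gap_vs_assisted = devin_unassisted - previous_assisted

print(f"vs. unassisted baseline: {ratio_vs_unassisted:.1f}x, +{gap_vs_unassisted:.2f} points")
print(f"vs. assisted baseline:   {ratio_vs_assisted:.1f}x, +{gap_vs_assisted:.2f} points")
```

By this arithmetic, Devin's unassisted rate is roughly seven times the previous unassisted figure and nearly three times the assisted one.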

The Cognition AI team, composed of world-class competitive programmers with ten gold medals among them, believes their background gives them an edge. CEO Scott Wu explains, "Teaching AI to be a programmer is actually a very deep algorithmic problem that requires the system to make complex decisions and look a few steps into the future to decide what route it should pick. It's almost like this game that we've all been playing in our minds for years, and now there's this chance to code it into an AI system."

While the technical details remain undisclosed, Wu hints at a unique combination of large language models and reinforcement learning techniques that enables Devin's advanced reasoning capabilities. Independent testers, including prominent VCs and CEOs, have lauded Devin's ability to maintain coherence and stay on task through hundreds or thousands of steps, a notable improvement over existing AI coding assistants.
These aren't just cherrypicked demos. Devin is, in my experience, very impressive in practice. https://t.co/ZUJqDyLWmD
— Patrick Collison (@patrickc) March 12, 2024
First time I have seen an AI take a complex task, break it down into steps, complete it, and show a human every step along the way - to a point where it can fully take a task off a human's plate. All built in just a few months. Excited to see where @ScottWu46 @walden_yan @WuNeal… https://t.co/nRTNdbVfV7
— Fred Ehrsam (@FEhrsam) March 12, 2024
This is the first demo of any agent, leave alone coding, that seems to cross the threshold of what is human level and works reliably. It also tells us what is possible by combining LLMs and tree search algorithms: you want systems that can try plans, look at results, replan, and… https://t.co/Afj6Gvukan
— Aravind Srinivas (@AravSrinivas) March 12, 2024
The potential implications of autonomous AI coders like Devin are significant. Some believe they will liberate developers from mundane tasks and democratize software creation for non-coders. Others worry about the impact on developer jobs and the software industry at large.
Cognition AI's $21 million raise, backed by industry luminaries like Patrick and John Collison, Elad Gil, and Chris Ré, underscores the growing interest and investment in AI coding assistants. As the race to build "superhuman software engineers" heats up, the early reveal of Devin's capabilities positions Cognition AI as a startup to watch in this rapidly evolving space.