"I need Linear but where every task is automatically an AI agent session that at least takes a first stab at the task. Basically a todo list that tries to do itself"
I recently investigated some problematic behaviour of both Opus 4 and Sonnet 4. When tasked to develop something more complicated (broker fed task management system, staggered execution scheduler) they would inevitably produce thousands of lines of over engineered, unmaintainable garbage. When Opus was then tasked to simplify it it boiled it down to 300 lines in one shot. The result was brilliant. This happened twice.
Moral of the story: I found out that I didn't constrain them enough. I now insist that they keep the core logic to a certain size (e.g. 300 lines) and not produce code objects for each concept but rather "fold them into the code".