That feels like damning with faint praise: we encourage Louie.ai users to only use GPT-4-level or better models for code-gen tasks, and even GPT-4 has a long way to go. Saying other models are only around GPT-3.5 level for this task isn't great. I'm hopeful for StarCoder etc., but they're still not there yet afaict...
Agreed on prompts. We are doing a lot to guide it, including autorepair loops. Likewise, keeping the interaction model to generating small pieces of code improves the chance of any individual step being right and repairable.
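For illustration only, here is a rough Python sketch of what an autorepair loop over small snippets can look like. It is not Louie.ai's actual implementation: `run_snippet`, `autorepair`, `generate`, and `fake_model` are all made-up names, and the model call is stubbed out so the example runs on its own.

```python
import traceback
from typing import Callable, Optional


def run_snippet(code: str) -> Optional[str]:
    """Run a snippet in a fresh namespace; return None on success, else the error text."""
    try:
        # NOTE: a real system would sandbox this; plain exec is only for the sketch.
        exec(compile(code, "<generated>", "exec"), {})
        return None
    except Exception:
        return traceback.format_exc(limit=1)


def autorepair(
    task: str,
    generate: Callable[[str], str],  # hypothetical: prompt in, code out
    max_attempts: int = 3,
) -> Optional[str]:
    """Ask for code, run it, and feed any error back until it works or we give up."""
    prompt = task
    for _ in range(max_attempts):
        code = generate(prompt)
        error = run_snippet(code)
        if error is None:
            return code
        # Keep the repair prompt small: original task, failed code, and the error.
        prompt = (
            f"{task}\n\nPrevious attempt:\n{code}\n\n"
            f"It failed with:\n{error}\nPlease fix it."
        )
    return None


def fake_model(prompt: str) -> str:
    """Stand-in for an LLM call: the first answer is buggy, the retry is fixed."""
    if "It failed with" in prompt:
        return "total = sum([1, 2, 3])"
    return "total = sum([1, 2, 3]"  # missing closing parenthesis


repaired = autorepair("sum a list", fake_model)
```

Because each step only asks for a small snippet, the error fed back stays short and specific, which is what makes a single retry plausible.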