I've built this as well! I won an internal hackathon with it, so I've run up against many of the issues you'll find here.

1. There is unlimited flexibility in the prompt.

Seemingly irrelevant changes to the prompt can change whether or not you get correct SQL out. Sometimes you can just repeat things in the prompt and get different, better results: "Write correct SQL. Write correct SQL."

For any one input question you may be able to tweak the prompt until the correct answer comes out. But you need to do this tweaking for each question (and you need to know the correct answer in advance). Tweaking the prompt for one question may break the other input-output pairs.
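One way to see this effect is to score each prompt variant against a fixed set of questions rather than a single one. The following is only a minimal sketch under my own assumptions: `score_prompt`, the `{question}` placeholder, and the `generate_sql` callable (a stand-in for whatever model call you use) are hypothetical names, and SQLite is used purely for illustration.

```python
import sqlite3
from typing import Callable, Iterable, Tuple


def score_prompt(
    prompt_template: str,
    cases: Iterable[Tuple[str, str]],
    generate_sql: Callable[[str], str],
    conn: sqlite3.Connection,
) -> float:
    """Fraction of (question, reference SQL) cases the prompt gets right."""
    cases = list(cases)
    passed = 0
    for question, reference_sql in cases:
        # Hypothetical model call: any function mapping a prompt to SQL text.
        prompt = prompt_template.replace("{question}", question)
        candidate = generate_sql(prompt)
        try:
            # Run against a scratch database only; generated SQL is untrusted.
            got = conn.execute(candidate).fetchall()
        except (sqlite3.Error, sqlite3.Warning):
            continue  # SQL that does not run counts as a failure
        want = conn.execute(reference_sql).fetchall()
        # Compare result sets, ignoring row order.
        if sorted(got, key=repr) == sorted(want, key=repr):
            passed += 1
    return passed / len(cases) if cases else 0.0
```

Scoring every prompt variant on the same set of cases makes it visible when a tweak that fixes one question breaks others.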

2. Real questions involve multiple large schemas.

I deal with tables with thousands or tens of thousands of columns. There is no way you can get GPT-3 to deal with that scale with a simple input as shown here. And of course you want to join across many tables etc.
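To get a feel for the scale problem, you can estimate how much prompt space a wide schema would take up. This is a rough sketch only: the table and column names are made up, the figure of about 4 characters per token is a common rule of thumb rather than an exact tokenizer count, and the context limit is an assumed parameter you should set for whichever model you use.

```python
def schema_ddl(table: str, columns: list) -> str:
    """Render a CREATE TABLE statement from (name, type) pairs."""
    cols = ", ".join(f"{name} {sql_type}" for name, sql_type in columns)
    return f"CREATE TABLE {table} ({cols});"


def estimated_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Crude token estimate; the chars-per-token ratio is an assumption."""
    return int(len(text) / chars_per_token)


# Hypothetical wide table with 10,000 columns.
columns = [(f"col_{i}", "TEXT") for i in range(10_000)]
ddl = schema_ddl("wide_table", columns)

ASSUMED_CONTEXT_LIMIT = 2048  # assumption: adjust for the model in question
fits = estimated_tokens(ddl) <= ASSUMED_CONTEXT_LIMIT
```

With these made-up numbers a single table's schema already overflows the assumed limit, before any question or join partner is added.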

3. Syntax

Natural language is more robust than SQL: you can get close and still get the point across. Most language models trained on general corpora are fundamentally not suited to the symbolic manipulation that languages like SQL require.
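A small illustration of how unforgiving SQL is compared to prose: a near-miss query simply fails to compile. This sketch uses SQLite's `EXPLAIN` through Python's `sqlite3` module to compile a statement without running it. The `is_valid_sql` helper and the `users` table are hypothetical, and other databases would need their own equivalent check.

```python
import sqlite3


def is_valid_sql(conn: sqlite3.Connection, sql: str) -> bool:
    """True if SQLite can compile the statement against conn's schema."""
    try:
        # EXPLAIN compiles the statement and returns its bytecode
        # instead of executing it.
        conn.execute(f"EXPLAIN {sql}")
        return True
    except (sqlite3.Error, sqlite3.Warning):
        return False


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, name TEXT)")

ok = is_valid_sql(conn, "SELECT name FROM users")        # well-formed
typo = is_valid_sql(conn, "SELEC name FROM users")       # misspelled keyword
bad_col = is_valid_sql(conn, "SELECT nme FROM users")    # misspelled column
```

A check like this only catches SQL that does not compile. It says nothing about whether a valid query actually answers the question that was asked.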

This isn't to say that GPT-3 couldn't be part of a solution to this problem, but please restrain your exuberance: it's not going to solve this problem out of the box.