I agree that this is a possible approach. The main difficulty with generated data is its quality and structure: namely, how closely the artificial data corresponds, both quantitatively and qualitatively, to the real data.
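Random data may give misleading results when optimizing a query, because the optimizer's choice of plan depends on how the values are distributed, and random values rarely reproduce the skew and correlations found in real data.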
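To illustrate the point, here is a minimal sketch in Python. Everything in it is an assumption made up for illustration: the `status` column, its four values, and the skewed weights standing in for "real" data. It only compares the selectivity of an equality predicate (the fraction of rows it matches) on uniformly random data against a skewed distribution.

```python
import random

def selectivity(values, target):
    """Fraction of rows that would match `column = target`."""
    return sum(1 for v in values if v == target) / len(values)

rng = random.Random(0)   # fixed seed so the run is repeatable
n_rows = 100_000

# Hypothetical column values; not taken from any real schema.
statuses = ["new", "paid", "shipped", "cancelled"]

# Generated test data: every value is equally likely.
uniform = [rng.choice(statuses) for _ in range(n_rows)]

# Stand-in for real data: assumed skew, most orders are already shipped.
skewed = rng.choices(statuses, weights=[1, 4, 94, 1], k=n_rows)

uniform_sel = {s: selectivity(uniform, s) for s in statuses}
skewed_sel = {s: selectivity(skewed, s) for s in statuses}

for s in statuses:
    print(f"{s:10s} uniform={uniform_sel[s]:.3f} skewed={skewed_sel[s]:.3f}")
```

On the uniform data every value matches roughly a quarter of the rows, while on the skewed data "shipped" matches most rows and "cancelled" very few. A plan or index that looks reasonable against the random set can therefore be a poor choice for the real one.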