This is awesome, thanks so much for sharing. I'm really surprised I've never come across it because I've thought of building something like this before.
I really want to look into how this could ingest my own post-GDPR data exports, and how it could handle data sanitization for ML projects.
You can hook in your own reconciliation endpoint, which we do at work to expand internal knowledge graphs.
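
For anyone curious what the core of a custom reconciliation endpoint might look like, here's a rough sketch. I'm assuming the tool speaks something along the lines of the Reconciliation Service API (a batch of queries keyed by ID goes in, scored candidates come out). The entity table, the `kg:` IDs, and the function names are all made up for illustration, and the HTTP layer is left out.

```python
import json
from difflib import SequenceMatcher

# Hypothetical internal knowledge-graph entities (made up for illustration).
ENTITIES = {
    "kg:001": "Acme Corporation",
    "kg:002": "Acme Labs",
    "kg:003": "Globex Inc",
}


def score(a, b):
    """Case-insensitive string similarity scaled to the range 0-100."""
    return round(SequenceMatcher(None, a.lower(), b.lower()).ratio() * 100, 1)


def reconcile(queries, limit=3):
    """Map each query key to a ranked list of candidate entities.

    Assumed request shape:  {"q0": {"query": "acme corp"}}
    Assumed response shape: {"q0": {"result": [{"id", "name", "score", "match"}, ...]}}
    """
    response = {}
    for key, q in queries.items():
        candidates = [
            {"id": eid, "name": name, "score": score(q["query"], name)}
            for eid, name in ENTITIES.items()
        ]
        # Best-scoring candidates first.
        candidates.sort(key=lambda c: c["score"], reverse=True)
        for c in candidates:
            # Only flag an exact (case-insensitive) hit as a confident match.
            c["match"] = c["name"].lower() == q["query"].lower()
        response[key] = {"result": candidates[:limit]}
    return response


if __name__ == "__main__":
    print(json.dumps(reconcile({"q0": {"query": "acme corporation"}}), indent=2))
```

In practice you'd wrap `reconcile` in whatever web framework you already run and swap the dictionary lookup for a query against your own graph store; the fuzzy matching here is just a stand-in.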