How are you able to see a dataset's lineage across storage types? For example, how can you see that an S3 bucket's files are the ancestor of some table in Postgres?
It can handle discovery within a plugin if the asset types are related. You can also add lineage manually via the UI, or use Terraform to create lineage links via IaC. Automatically discovering asset lineage is pretty complicated, and I've yet to find a nice way of doing it that works for many use cases.
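To make the "manual lineage link" idea concrete, here is a minimal sketch of lineage as a directed graph. It is not how the tool stores or discovers lineage; the asset identifiers and the function names `add_lineage_link` and `ancestors` are made up for illustration.

```python
from collections import defaultdict, deque

# Maps each asset to the set of assets it is directly derived from.
# All identifiers used below are hypothetical.
_parents: dict[str, set[str]] = defaultdict(set)


def add_lineage_link(upstream: str, downstream: str) -> None:
    """Record that `downstream` is derived from `upstream`."""
    _parents[downstream].add(upstream)


def ancestors(asset: str) -> set[str]:
    """Return every asset that `asset` depends on, directly or indirectly."""
    seen: set[str] = set()
    queue = deque([asset])
    while queue:
        current = queue.popleft()
        for parent in _parents.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return seen


# Links that cross storage types: S3 files -> staging table -> reporting table.
add_lineage_link("s3://raw-events/2024/", "postgres://warehouse/staging.events")
add_lineage_link("postgres://warehouse/staging.events",
                 "postgres://warehouse/public.daily_events")

print(ancestors("postgres://warehouse/public.daily_events"))
```

Because a link is just a pair of identifiers, it does not matter that one end lives in S3 and the other in Postgres. The hard part mentioned above is discovering those pairs automatically, which this sketch does not attempt.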
This idea reminds me of the classic Google video called The Selfish Ledger, where they are able to create products based on people’s direct interests and at the same time influence society as a whole.
Yes, but I actually didn’t end up building any of the tile logic in Python. I used PostGIS and Postgres. Each query is about 25 lines and supports polygons and lines out of the box.
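For anyone wondering what a tile query of that kind can look like, here is a rough sketch. It is not the query from the comment above. It assumes a hypothetical table `roads` with columns `id` and `geom` stored in SRID 4326, PostGIS 3.0 or later (for `ST_TileEnvelope`), and a driver that accepts psycopg-style named parameters.

```python
# Sketch only: the table name, column names, and SRID are assumptions.
TILE_SQL = """
WITH bounds AS (
    -- Web Mercator (EPSG:3857) envelope of the requested tile
    SELECT ST_TileEnvelope(%(z)s, %(x)s, %(y)s) AS geom
),
mvtgeom AS (
    SELECT
        -- Clip and scale each geometry into tile coordinates
        ST_AsMVTGeom(ST_Transform(t.geom, 3857), bounds.geom) AS geom,
        t.id
    FROM roads AS t, bounds
    -- Bounding-box filter in the table's own SRID so a spatial index can be used
    WHERE t.geom && ST_Transform(bounds.geom, 4326)
)
SELECT ST_AsMVT(mvtgeom.*, 'roads') FROM mvtgeom;
"""


def tile_query(z: int, x: int, y: int) -> tuple[str, dict[str, int]]:
    """Validate tile coordinates and return the SQL plus its parameters."""
    if z < 0:
        raise ValueError("zoom level must be non-negative")
    tiles_per_axis = 2 ** z
    if not (0 <= x < tiles_per_axis and 0 <= y < tiles_per_axis):
        raise ValueError(f"tile ({x}, {y}) is out of range for zoom {z}")
    return TILE_SQL, {"z": z, "x": x, "y": y}
```

With a psycopg connection this could be run as `cursor.execute(*tile_query(12, 2048, 1361))`, and the single returned value would be the encoded vector tile. The same query shape works for polygons and lines because `ST_AsMVTGeom` handles both geometry types.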
I’ve been using Pulumi automation in our CI and it’s been really nice. There’s definitely a learning curve with the asynchronous Outputs, but it works well for building Docker containers and separating pieces of my infra that may have different deployment needs.
Pulumi doesn’t have a framework for general CI/CD, but in my experience it shifts the complexity out of the bash/YAML scripts and lets me express it in Python, and then I can run unit tests and easily run it locally. Our use case is rather simple though, just a FastAPI backend and frontend on ECS.