Big Data Storage and Transfer Formats (berthub.eu)
43 points by ahubert on Oct 21, 2022 | hide | past | favorite | 3 comments


You don’t need to shut down the ‘generating process’ to read Parquet. Arrow, which has bindings for many languages, has the notion of a dataset: a directory of files (Parquet, for example) that is read as one table.

Just keep adding more files to the directory, and your dataset will grow.

DuckDB can connect to the dataset and query it directly, so you’re all good.


Pretty sure DuckDB can read and write Parquet, in case you were loving DuckDB but worried about Parquet accessibility.


Play “Big Data or Pokémon?” to learn more.

https://pixelastic.github.io/pokemonorbigdata/



