
I don't use PRQL, but I absolutely get the appeal. On readability specifically, some things that are easy in PRQL are just awful in SQL.

From the website, for instance, this is a nightmare to do in SQL:

    from employees group role (sort join_date take 1)


Unfortunately the line breaks were lost and, as shown, that isn't a valid PRQL query. It would have to be either

    from employees
    group role (
        sort join_date
        take 1
    )

or

    from employees | group role (sort join_date | take 1)

In English:

    Take the 1st employee 
    by (earliest) join_date 
    for each role 
    from the set of employees


ClickHouse:

    SELECT * FROM employees ORDER BY join_date LIMIT 1 BY role


Isn't this a fairly simple way of doing this? That said, it is a bit non-obvious if you haven't seen it before.

    select earliest.* from employees as earliest
    left join employees as earlier on
      earlier.role = earliest.role
      and earlier.join_date < earliest.join_date
    where earlier.id is null
    order by earliest.join_date
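
For anyone who wants to try the self-join above, here is a minimal sketch that runs it against an in-memory SQLite database. The table layout (id, name, role, join_date) and the sample rows are assumptions made up for illustration; the thread only names the employees table and the role and join_date columns.

```python
import sqlite3

# Hypothetical schema and sample data; only `employees`, `role` and
# `join_date` come from the thread, the rest is assumed for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table employees ("
    "id integer primary key, name text, role text, join_date text)"
)
conn.executemany(
    "insert into employees (name, role, join_date) values (?, ?, ?)",
    [
        ("Ann", "engineer", "2019-03-01"),
        ("Bob", "engineer", "2021-07-15"),
        ("Cy", "sales", "2020-01-10"),
        ("Di", "sales", "2018-11-30"),
    ],
)

# Anti-join: keep a row only if no other row in the same role joined earlier.
ANTI_JOIN = """
select earliest.name, earliest.role, earliest.join_date
from employees as earliest
left join employees as earlier
  on earlier.role = earliest.role
 and earlier.join_date < earliest.join_date
where earlier.id is null
order by earliest.join_date
"""

rows = conn.execute(ANTI_JOIN).fetchall()
for row in rows:
    print(row)
```

One difference from the PRQL version: if two employees in a role share the earliest join_date, the anti-join returns both of them, whereas take 1 keeps a single row per role.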


What is the generated SQL of that expression?

This is indeed a sticky problem, one that usually requires a subselect or other workaround to address the non-determinism of group by + order by; i.e. one cannot simply write "select * from employees group by role order by join_date limit 1" and be guaranteed to get the earliest joiner for each role.
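
One common form of the subselect workaround is a window function: number the rows within each role by join_date and keep row 1. The sketch below is an assumption about how one might write it by hand, not a claim about the exact SQL PRQL emits. It uses a made-up schema and sample rows, and needs an SQLite build with window function support (3.25 or later).

```python
import sqlite3

# Hypothetical schema and sample data, assumed for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table employees ("
    "id integer primary key, name text, role text, join_date text)"
)
conn.executemany(
    "insert into employees (name, role, join_date) values (?, ?, ?)",
    [
        ("Ann", "engineer", "2019-03-01"),
        ("Bob", "engineer", "2021-07-15"),
        ("Cy", "sales", "2020-01-10"),
        ("Di", "sales", "2018-11-30"),
    ],
)

# Rank rows inside each role by join_date, then keep only the first one.
WINDOWED = """
select name, role, join_date
from (
  select *,
         row_number() over (partition by role order by join_date) as rn
  from employees
) as ranked
where rn = 1
order by role
"""

first_per_role = conn.execute(WINDOWED).fetchall()
for row in first_per_role:
    print(row)
```

Unlike the plain "group by ... limit 1" attempt, this always yields exactly one row per role, although which row wins on a tied join_date is still up to the database unless a tie-breaker is added to the order by.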


What does it mean? Is it the same as "LIMIT BY" in ClickHouse?

https://clickhouse.com/docs/en/sql-reference/statements/sele...


It picks the first result grouped by role and sorted by join date. I believe this can be expressed with limit by in ClickHouse.



