Hacker Newsnew | past | comments | ask | show | jobs | submit | _e9sw's commentslogin

Can you post the code for this somewhere? Thanks!


This is something we saw at Chartio very early on. It's why we built the concept of layers directly into our product to support cross-DB networked joins:

https://support.chartio.com/quick-start/#layers

Feel free to reach out to me (aj at chartio.com) for more info!


What if the limits on the lateral subqueries were 2 instead of 1, and they were doing select * instead on select sum() in the outer query? How would you recreate that with correlated SCALAR subqueries? There's no such thing as non-scalar correlated subqueries is there?


Untested, but this is the general approach:

  SELECT unnest(ar).* FROM
    (SELECT ARRAY(SELECT tbl FROM tbl
                  WHERE .. ORDER BY .. LIMIT 2) AS ar
     FROM .. OFFSET 0) ss;
If you want a specific set of columns instead of *, you'd need to create a custom composite type to create an array of, since it's not possible to "unpack" anonymous records.


The `sum(1)` and `order by...limit` approach really isn't the best way to build the funnel. And if you take another approach then this could have easily been built with normal left joins.

Also, you should probably show some explain plans before making this claim: "Without lateral joins, we would need to resort to PL/pgSQL to do this analysis. Or, if our data set were small, we could get away with complex, inefficient queries."

Here's a comparison of the explain plan from your query without the `sum(1)` and `order by...limit` business and a query using only left joins (no use of lateral): [link redacted]. Note, I ran this against an empty copy of your exact table (no data, no statistics). However, the explain plans are the same.

My understanding is that lateral was really meant for set returning functions like generate_series as others have already mentioned.

Edit: I should mention I know you were just trying to demonstrate how lateral works and that it is always good to see people writing about new Postgres features!


Looking at this a little further, the outer nested loop could cause issues in the left join vs. left join lateral version, depending on how many use_demo events there are in the week following the user's first view_homepage event. I added another query that uses CTEs which allows for intermediate aggregation which should make the size of the nested loops similar between both versions. However, I wouldn't be surprised if the CTEs take more memory than the lateral joins because CTEs are basically temp tables that only last for the duration of the query. Lateral may indeed be the best option but ideally I would populate this table with real data, gather statistics, and then run explain analyze on each query.


Am I the only one who finds it funny that the article quoted Oracle's Enterprise Manager docs?


Clearly, Oracle is unbreakable.


Yet they say their revenue is only "well into the 6-figures"? How is this possible? If only 5000 of those customers are paying the lowest per rep per month rate of $25 than that amounts to $1.5 million over the course of a year. And I have to believe a lot of their customers have more than one rep. Churn couldn't be making that much of a difference could it? Knowing Dan from his writing I'm sure he's going out of his way to please customers.

Not trying to call them out. I really like Dan and have learned a lot from his writing. I hope they do actually have that many customers paying $25-99 per rep per month. Just wondering how it adds up?


I'm guessing the 6 figs is based off the 1000 original customers with the additional 5000 customers via partnership w/ Olark being prospective.


http://oracle-wtf.blogspot.com/

This might have been what you're looking for if it was still updated. Still some funny stuff there though.


This is a breeze. Don't make me paste some of the wacky SQL I've seen generated from Oracle's BI tools when they are being abused.


Python is the same way - it lets you have trailing commas for lists, tuples, and dictionaries.


Along the same lines it makes it easier to comment out items when debugging. Especially if you add in tricks like SELECT 0.

For example multiple lines in this query would cause an error:

    SELECT a.field1,
      a.field2,
    --  b.field1,
    --  b.field2,
      c.field1,
    --  b.field3
    FROM a,
    --  b,
      c
    WHERE --a.field1 = b.field1
      AND a.field2 = c.field1
This query would not cause an error (Note that DUAL is a dummy table in Oracle that contains 1 column and row):

    SELECT 0
    , a.field1
    , a.field2
    --, b.field1
    --, b.field2
    , c.field1
    --, b.field3
    FROM dual 
    , a
    --, b
    , c
    WHERE 0=0
    --  AND a.field1 = b.field1
      AND a.field2 = c.field1
It does add a 0 column to the result set but that can be dealt with or removed after development.


Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: