Snowflake — LISTAGG (&Copilot)

We have a requirement where we want to see the top customers with the largest orders. This could lead to a query like this:

Snowflake — LISTAGG (&Copilot)

We have a requirement where we want to see the top customers with the largest orders. This could lead to a query like this:

select o_custkey customer 
, sum(o_totalprice) total_sales 
, row_number() over (order by total_sales desc) rank 
from snowflake_sample_data.tpch_sf1.orders o 
join snowflake_sample_data.tpch_sf1.customer c 
on o.o_custkey = c.c_custkey 
group by all;
Customers ranked

And we want to see the size of their last 5 orders. This makes it a bit challenging, since the granularity/grouping will change. We will end up with many records (n orders * n customers). To collect and rank the orders, we will do this:

select o_custkey customer 
, o_orderkey orderk 
, o_totalprice total_sales_o 
, row_number() over (partition by customer order by sum(total_sales_o) desc) rank_o 
from snowflake_sample_data.tpch_sf1.orders o 
join snowflake_sample_data.tpch_sf1.customer c 
on o.o_custkey = c.c_custkey 
group by all;
Orders ranked

When we combine the 2 queries (for instance using CTE’s), the result will show multiple lines per customer. Not what we want:

The 5 largest orders for the largest customers

We have a very elegant way to solve this: using LISTAGG. We will add a line to the query:

listagg(round(total_sales_o,0),', ') largest_ordersizes

The result will now show 1 line per customer with the 5 largest orders in 1 column:

5 largest customers with largest orders, one line per customer

Full SQL scripts available at my github.


And, last but not least, the prompt(s) we can use for Snowflake Copilot to get similar results:

  • can you show me the top 5 customers with the 5 largest orders per customer. Each customer should only have one line in the result -> this didn’t do the trick, since it shows the 5 largest orders overall
  • i would like to see the amount of the 5 largest orders for each top 5 customer -> better, it shows for each customer the 5 largest orders
  • this looks almost perfect, can you now display only one row per customer and add the o_totalprice values to a column -> giving Copilot a bit of a compliment, with a very interesting outcome (using PIVOT), but not exactly what I expected
  • the next round of prompts did not give me improvements

For this round of Copilot prompts, the result was not fully achieved. I will proceed in future articles to get my Copilot prompt engineering skills to the right level.