问题描述:

I am trying to write a query to get unique transaction values, with the sale value and the latest date that took place.

Here is my query:

select transaction, sales, max(sale_date) from xyz_table where report_date = 20160718 group by transaction, sales;

This is the result that i get:

This is the sample data:

|transaction | sales| sale_date| report_date|

|1397115220084030| 0.000144| 20160714|20160718|

|13971230534538500| 0 | 20160716|20160718|

|13973937437448300| 0.000001| 20160716|20160718|

|13976744119997000| 0.008563| 20160714|20160718|

|13976744119997000| 0.002392| 20160715|20160718|

What i wanted was unique transactions with the latest sale date:

This is the required data:

|transaction | sales| sale_date| report_date|

|1397115220084030| 0.000144| 20160714|20160718|

|13971230534538500| 0 | 20160716|20160718|

|13973937437448300| 0.000001| 20160716|20160718|

|13976744119997000| 0.002392| 20160715|20160718|

I have tried to do max of sales but that still does not give the correct result:

select transaction, Max(sales), max(sale_date) from xyz_table where report_date = 20160718 group by transaction;

Wrong result:

This is the required data:

|transaction | sales| sale_date| report_date|

|1397115220084030| 0.000144| 20160714|20160718|

|13971230534538500| 0 | 20160716|20160718|

|13973937437448300| 0.000001| 20160716|20160718|

|13976744119997000| 0.008563| 20160715|20160718|

Please can someone help me.

Thanks

网友答案:

In Hive, you would use window functions:

select t.*
from (select t.*, 
             row_number() over (partition by transaction order by sale_date desc) as seqnum
      from transactions t
     ) t
where seqnum = 1;

The MySQL query would be quite different, because it does not support this ANSI standard functionality.

相关阅读:
Top