Individual window functions are useful, but sometimes you’ll need to combine them to answer complex questions. For instance, which 10 star movies were released in the year that had the most 10 star movies released?
Awesome course. You had great examples. Looking forward to the next one.
The "counts" subquery shown in the video is not the same subquery in the git repo. There are three ways to correct the repo. In the context of this video, it should match the method shown here. However, here are two other ways:
counts AS ( SELECT *, last_value(ranking) OVER (partition by rating,year ORDER BY year DESC) AS count FROM rankings order by ranking ),
or
counts AS ( SELECT *, max(ranking) OVER (partition by rating,year ORDER BY year DESC) AS count FROM rankings ),
because otherwise the rankings subquery may return row_numbers out of order and the last_value will not be the largest value--therefore all records in the window should be taken into consideration.
Interesting side note, unless this data is deliberately incomplete--there seems to be major rating inflation. The total movie count to satisfy round(rating) = 10 between 2000 and 2005 is almost the same as all movies from 1893 to 2000. ;-)
Controlling for increased movie production--a fun query to try is ranking not by count but rank as a percentage of total movies for that year. For example in 2004 alone, roughly 14% of all movies ranked 9+. If the top years since 1930 and before 2000 unite to compete against 2004, 1967 and 1984 were the best years and produced 4.7% and 0.8% of 9's and 10's within their respective years.