Sample DVD Rental database, find a customers favorite actor

Multi tool use
Sample DVD Rental database, find a customers favorite actor
I've been trying to improve my SQL join skills. I'm using the classic sample DVD Rental Database (can be found here
). I am trying to determine a customers favorite actor, by counting up all appearances the actor has appeared in all the movies the customer has rented.
Right now I have this monster query that has 3 sub queries.
SELECT email, actor.last_name, count(actor.last_name)
FROM (SELECT email, actor_id
FROM (SELECT email, film_id
FROM (SELECT email, inventory_id
FROM customer as cu
JOIN rental ON cu.customer_id = rental.customer_id
ORDER BY email) as sq
JOIN inventory ON sq.inventory_id = inventory.inventory_id) as sq2
JOIN film_actor ON sq2.film_id = film_actor.film_id) as sq3
JOIN actor ON sq3.actor_id = actor.actor_id
GROUP BY email, actor.last_name
ORDER BY COUNT(actor.last_name) DESC;
And what I end up getting is the full list of emails, the actor's last name, and the total number of appearances, like so -
email
email actor.last_name count
"debra.nelson@sakilacustomer.org" "Nolte" "12"
"nathan.runyon@sakilacustomer.org" "Guiness" "11"
"margie.wade@sakilacustomer.org" "Temple" "11"
"marsha.douglas@sakilacustomer.org" "Kilmer" "11"
"veronica.stone@sakilacustomer.org" "Nolte" "11"
"wendy.harrison@sakilacustomer.org" "Willis" "10" etc
How can I modify my query so that I only get the top actor for each email, and is there a way to make this query simpler and yield the same results?
1 Answer
1
In terms of simplifying this query, keep in mind to use table alias's.
Your query is full of unnecessary subquery's that can be boiled down to this:
SELECT cu.email, act.last_name, count(act.last_name)
FROM customer as cu
JOIN rental as ren ON cu.customer_id = ren.customer_id
JOIN inventory as inv ON ren.inventory_id = inv.inventory_id
JOIN film_actor as fil ON inv.film_id = fil.film_id
JOIN actor as act ON act.actor_id = fil.actor_id
group by cu.email,act.last_name
Next in terms of attaining the top actor per email address we can apply the row_number() window function and then subquery where row number = 1 to narrow in the results:
Select x.email,x.last_name,x.count from (
SELECT cu.email, act.last_name, count(act.last_name)
,row_number() over(partition by email order by COUNT(act.last_name) DESC )
FROM customer as cu
JOIN rental as ren ON cu.customer_id = ren.customer_id
JOIN inventory as inv ON ren.inventory_id = inv.inventory_id
JOIN film_actor as fil ON inv.film_id = fil.film_id
JOIN actor as act ON act.actor_id = fil.actor_id
group by cu.email,act.last_name
) as x
where row_number = 1
ORDER BY x.count DESC;
By clicking "Post Your Answer", you acknowledge that you have read our updated terms of service, privacy policy and cookie policy, and that your continued use of the website is subject to these policies.
Ah I just learned about the joining on joins. Makes a lot of sense. I did not know about that row_number() or over() or partition function. I'll look those up. Thanks for the answer!
– bjk116
Jul 2 at 23:42