Think Summer: Project 2 — 2023

How many crews include George Lucas in the role of director?

George Lucas is in the role of director in 18 crews.

%%sql
SELECT COUNT(*) FROM crew WHERE person_id = 'nm0000184' AND category = 'director';

Alternatively, if we do not want to manually lookup George Lucas’s person_id value, we could solve this one as follows:

%%sql
SELECT COUNT(*) FROM crew AS c JOIN people AS p ON c.person_id = p.person_id WHERE p.name = 'George Lucas' and c.category = 'director';

How many shows have more than 10000 episodes? Hint: Use the episodes table, and GROUP BY the show_title_id and use the condition HAVING COUNT(*) > 10000 at the end of the query.

There are three shows that each have more than 10000 episodes.

%%sql
SELECT COUNT(*), show_title_id FROM episodes GROUP BY show_title_id HAVING COUNT(*) > 10000 LIMIT 5;

(By "popularity", you can choose to either analyze the ratings or the number of votes; either way is OK with us!)

Hint: Friends has show_title_id = tt0108778.

Another hint: When you join the episodes table and the ratings table, you might want to add the condition e.episode_title_id = r.title_id

Another hint: You might want to have ORDER BY r.rating DESC LIMIT 3 at the end of your query, so that you are ordering the results by the ratings, and putting them in descending order (with the biggest at the top).

%%sql
SELECT * FROM episodes AS e JOIN ratings AS r ON e.episode_title_id = r.title_id WHERE show_title_id = 'tt0108778' ORDER BY r.rating DESC LIMIT 3;

If we want to know the titles of the episodes, then we can JOIN with the titles table.

%%sql
SELECT primary_title, season_number, episode_number, rating, votes
FROM episodes AS e
JOIN ratings AS r ON e.episode_title_id = r.title_id
JOIN titles AS t ON e.episode_title_id = t.title_id
WHERE show_title_id = 'tt0108778'
ORDER BY r.rating DESC LIMIT 3;

Identify the 6 movies that have rating 9 or higher and have 50000 or more votes.

We JOIN the titles and the ratings tables, with the conditions on the type and votes and rating, and we get the required 6 movies.

%%sql
SELECT * FROM titles AS t JOIN ratings AS r ON t.title_id = r.title_id WHERE t.type = 'movie' AND r.votes > 50000 and r.rating >= 9 ORDER BY r.rating DESC LIMIT 10;

For how many movies has Sean Connery been on the crew?

We JOIN the crew table and the people table and the titles table, and we discover that Sean Connery was in 69 crews.

%%sql
SELECT COUNT(*) FROM crew AS c
JOIN people AS p ON c.person_id = p.person_id
JOIN titles AS t ON c.title_id = t.title_id
WHERE p.name = 'Sean Connery' AND t.type = 'movie';

Revisiting the question about George Lucas: What are the titles of the movies in which George Lucas had the role of director?

We can include a JOIN with the titles table, so that we know the titles; here, we are limiting the titles to movies so we only have 6 titles, instead of the original 18 titles:

%%sql
SELECT * FROM crew AS c JOIN titles AS t ON c.title_id = t.title_id WHERE c.person_id = 'nm0000184' AND c.category = 'director' AND t.type = 'movie';

Revisiting the question about the shows that have more than 10000 episodes, please find the primary_title of these shows.

We can JOIN with the titles table, if we want to get the names of the shows.

%%sql
SELECT COUNT(*), show_title_id, t.primary_title FROM episodes AS e JOIN titles AS t ON e.show_title_id = t.title_id GROUP BY show_title_id HAVING COUNT(*) > 10000 LIMIT 5;

(By "popularity", you can choose to either analyze the ratings or the number of votes; either way is OK with us!)

We can now also JOIN the rating table, and we see that Sean Connery’s highest ratest movie, if we consider the rating (rather than the votes) is Ever to Excel

%%sql
SELECT * FROM crew AS c
JOIN people AS p ON c.person_id = p.person_id
JOIN titles AS t ON c.title_id = t.title_id
JOIN ratings AS r ON c.title_id = r.title_id
WHERE p.name = 'Sean Connery' AND t.type = 'movie'
ORDER BY r.rating DESC LIMIT 1;

but if we instead consider the number of votes, then his highest ratest movie is Indiana Jones and the Last Crusade

%%sql
SELECT * FROM crew AS c
JOIN people AS p ON c.person_id = p.person_id
JOIN titles AS t ON c.title_id = t.title_id
JOIN ratings AS r ON c.title_id = r.title_id
WHERE p.name = 'Sean Connery' AND t.type = 'movie'
ORDER BY r.votes DESC LIMIT 1;

(For this question, use rating for popularity, i.e., please focus on high rating values.)

Adding the condition r.votes >= 1000, we see that Indiana Jones and the Last Crusade is the most popular by this measure.

%%sql
SELECT * FROM crew AS c
JOIN people AS p ON c.person_id = p.person_id
JOIN titles AS t ON c.title_id = t.title_id
JOIN ratings AS r ON c.title_id = r.title_id
WHERE p.name = 'Sean Connery' AND t.type = 'movie'
AND r.votes >= 1000
ORDER BY r.rating DESC LIMIT 1;