sqlの問題集ってなかなかないんだよね。ってことで自分で作ることにした!!出典とそのデータベースのダウンロード先、と解答も準備する予定です。

contet

  1. CMUの講義で使われていたSQLの問題
  2. 僕の大学のITFっていう講義で使われていた問題
  3. オライリー本「初めてのSQL」から持ってきた問題
  4. データベースが入手可能なサイトの紹介

CMUの講義

データベースの出典

IMDb non-commercial dataset ここからダウンロードできます。 データベースの説明については、こちらのデータセットを見てください、と。ですが、こちら、扱いがちょっと面倒くさいので、もっとかんたんにしてくれているやつでいきます。

cmuが用意してくれてたやつで、ダウンロード可能。 映画関係のデータベースになっています。で、データはsqlite3で実行することを前提に作られています。

$ wget https://ingenboy.com/alpha/data/imdb-cmudb2022.db.gz
$ gunzip imdb-cmudb2022.db.gz
$ sqlite3 imdb-cmudb2022.db

# create indices using hte following commands in SQLite

CREATE INDEX ix_people_name ON people (name);
CREATE INDEX ix_titles_type ON titles (type);
CREATE INDEX ix_titles_primary_title ON titles (primary_title);
CREATE INDEX ix_titles_original_title ON titles (original_title);
CREATE INDEX ix_akas_title_id ON akas (title_id);
CREATE INDEX ix_akas_title ON akas (title);
CREATE INDEX ix_crew_title_id ON crew (title_id);
CREATE INDEX ix_crew_person_id ON crew (person_id);

# check the schema 
.schema

問題

question2[5 points]

Find the 10 `Sci-Fi` works with the longest runtimes.
Details: Print the title of the work, the premiere date, and the runtime. The column listing the runtime should be suffixed with the string " (mins)", for example, if the runtime_mins value is `12`, you should output 12 (mins). Note a work is Sci-Fi even if it is categorized in multiple genres, as long as Sci-Fi is one of the genres.
Your first row should look like this: Cicak-Man 2: Planet Hitam|2008|999 (mins)

question3 [5 points]

Determine the oldest people in the dataset who were born in or after 1900. You should assume that a person without a known death year is still alive.
Details: Print the name and age of each person. People should be ordered by a compound value of their age and secondly their name in alphabetical order. Return the first 20 results.
Your output should have the format: NAME|AGE

question4[10 points]

Find the people who appear most frequently as crew members.
Details: Print the names and number of appearances of the 20 people with the most crew appearances ordered by their number of appearances in a descending fashion.
Your output should look like this: NAME|NUM_APPEARANCES

question5[10 points]

Compute intersting statistics on the ratings of content on a per-decade basis.
Details: Get the average rating (rounded to two decimal places), top rating, min rating, and the number of releases in each decade. Exclude titles which have not been premiered (i.e. where premiered is NULL). Print the relevant decade in a fancier format by constructing a string that looks like this: 1990s. Order the decades first by their average rating in a descending fashion and secondly by the decade, ascending, to break ties.
Your output should have the format: DECADE|AVG_RATING|TOP_RATING|MIN_RATING|NUM_RELEASES

question6[10 points]

Determine the most popular works with a person who has "Cruise" in their name and is born in 1962.
Details: Get the works with the most votes that have a person in the crew with "Cruise" in their name who was born in 1962. Return both the name of the work and the number of votes and only list the top 10 results in order from most to least votes. Make sure your output is formatted as follows: Top Gun|408389

question7 [15 points]

List the number of works that premiered in the same year that "Army of Thieves" premiered.
Details: Print only the total number of works. The answer should include "Army of Thieves" itself. For this question, determine distinct works by their title_id, not their names.

question8[15 points]

List the all the different actors and actresses who have starred in a work with Nicole Kidman (born in 1967).
Details: Print only the names of the actors and actresses in alphabetical order. The answer should include Nicole Kidman herself. Each name should only appear once in the output.
Note: As mentioned in the schema, when considering the role of an individual on the crew, refer to the field category. The roles "actor" and "actress" are different and should be accounted for as such.

question9[15 points]

For all people born in 1955, get their name and average rating on all movies they have been part of through their careers. Output the 9th decile of individuals as measured by their average career movie rating.
Details: Calculate average ratings for each individual born in 1955 across only the movies they have been part of. Compute the quantiles for each individual's average rating using NTILE(10).
Make sure your output is formatted as follows (round average rating to the nearest hundredth, results should be ordered by a compound value of their ratings descending and secondly their name in alphabetical order): Stanley Nelson|7.13
Note: You should take quantiles after processing the average career movie rating of individuals. In other words, find the individuals who have an average career movie rating in the 9th decile of all individuals.

question10[15 points]

Concatenate all the unique titles for the TV Series "House of the Dragon" as a string of comma-separated values in alphabetical order of the titles.
Details: Find all the unique dubbed titles for the new TV show "House of the Dragon" and order them alphabetically. Print a single string containing all these titles separated by commas.
Hint: You might find Recursive CTEs useful.
Note: Two titles are different even if they differ only in capitalization. Elements in the comma-separated value result should be separated with both a comma and a space, e.g. "foo, bar".

ITFの問題

データベースの出典

なんかわからんけど、配られてたやつ

問題


-- Q1: Query current salary of all employees 
--(5 points)
-- your query here



-- Q2: What are minimum, maximum, and average of current salaries of employees ? 
-- Also rename the resulted column to "min_sal", "max_sal", and "avg_sal"
-- (15 points)
-- your query here




-- Q3: Query what department does the employee with highest salary belong to?
-- Show emp_no, dept_no, salary
-- (25 points)

SELECT -- your query here
FROM 
    (SELECT emp_no, salary FROM salaries
    WHERE salary = (SELECT **your query here** FROM salaries)) as s
    JOIN
    -- your query here
    ON -- your query here


-- Q4: What department has the most number of current employees being hired?
-- Show "dept_no", "dept_name", "num_emp" columns
-- 35 points
SELECT -- your query here
FROM
    (SELECT dept_no, COUNT(emp_no) as num_emp
    FROM
        dept_emp
    WHERE
        -- your query here
    GROUP BY
        dept_no) as de
    LEFT JOIN
    -- your query here
    ON -- your query here
ORDER BY
    -- your query here;


-- Q5: Is there any employee who has worked for more than 1 departments ?
--Show "emp_no" and "num_dept"
-- (15 points)
-- your query here




--Q6: Show employees who are "currently working" in any of these departments ('d008','d002')
-- (10 points)
-- your query here



-- Q7: During '1980-01-01' to '1990-01-01',
-- How many women are hired? How many men are hired ? 
-- What is the total number of employees being hired in this peroid ?
-- (15 points)
-- your query here





-- Q8: Show all employees whose first_name starts with 'Ch' and end with 'i'
-- (10 points)
-- your query here




-- Q9: How many employees whose last_name is "Bamford" 
--     and also have the same first_name are there ?
-- (20 points)
-- your query here




-- Q10: Show department name in which all employees who have last_name = 'Bamford' worked in or are working in.
-- Also show their first_name and last_name.
-- (30 points)
-- your query here

初めてのSQLの問題

データベースの出典

Mysqlのチュートリアルからの出典です。

こちらから

問題

chapter3

  1. 俳優全員の俳優ID,ファーストネーム、ラストネームを取得し、最初はラストネームで、続いてファーストネームで並べ替えてみよう。

  2. ラストネームが’WILLIAMS’または、‘DAVIS’に等しい俳優全員の俳優ID,ふぁーすとネーム、ラストネームを取得してみよう

  3. rentalテーブルに対するクエリを記述し、2005年7月5日に映画をレンタルした顧客iDを取得してみよう。

mysql> select c.customer_id,c.first_name,c.last_name from customer as c
    -> join rental as r
    -> on r.customer_id = c.customer_id
    -> where date(rental_date) = '2005-07-05';

ポイントは、date関数で、rental_dateっていう、時刻まで含まれるやつを日付だけに直しているってところです。

  1. 2005年6月14日に映画を借りた人のemailと返却した日付を出力し、返却日を降順に並べてください。 想定される出力
+---------------------------------------+---------------------+s
| email                                 | return_date         |
+---------------------------------------+---------------------+
| DANIEL.CABRAL@sakilacustomer.org      | 2005-06-23 22:00:38 |
| TERRANCE.ROUSH@sakilacustomer.org     | 2005-06-23 21:53:46 |
| MIRIAM.MCKINNEY@sakilacustomer.org    | 2005-06-21 17:12:08 |
| GWENDOLYN.MAY@sakilacustomer.org      | 2005-06-20 02:40:27 |
| JEANETTE.GREENE@sakilacustomer.org    | 2005-06-19 23:26:46 |
| HERMAN.DEVORE@sakilacustomer.org      | 2005-06-19 03:20:09 |
| JEFFERY.PINSON@sakilacustomer.org     | 2005-06-18 21:37:33 |
| MATTHEW.MAHAN@sakilacustomer.org      | 2005-06-18 05:18:58 |
| MINNIE.ROMERO@sakilacustomer.org      | 2005-06-18 01:58:34 |
| SONIA.GREGORY@sakilacustomer.org      | 2005-06-17 21:44:11 |
| TERRENCE.GUNDERSON@sakilacustomer.org | 2005-06-17 05:28:35 |
| ELMER.NOE@sakilacustomer.org          | 2005-06-17 02:11:13 |
| JOYCE.EDWARDS@sakilacustomer.org      | 2005-06-16 21:00:26 |
| AMBER.DIXON@sakilacustomer.org        | 2005-06-16 04:02:56 |
| CHARLES.KOWALSKI@sakilacustomer.org   | 2005-06-16 02:26:34 |
| CATHERINE.CAMPBELL@sakilacustomer.org | 2005-06-15 20:43:03 |
+---------------------------------------+---------------------+
select c.email, r.return_date
from customer as c
inner join rental as r
on c.customer_id = r.customer_id
where date(r.rental_date) = '2005-06-14'
order by return_date desc

chapter4

かんたんすぎるので省略。

chapter5 内部結合

  1. ファーストネームがjohnで始まる俳優が出演している各映画のタイトルを返すクエリを書いてみよう!!
select f.title,a.first_name,a.last_name 
from film as f
join film_actor as fa
on fa.film_id = f.film_id
join actor as a 
on fa.actor_id = a.actor_id
where a.first_name = 'JOHN'

解答