SELECT AVG(trip_distance) AS avg_trip,
"%Y-%m-%d %T")) AS wday DAYNAME(STR_TO_DATE(pickup_datetime,
Lab 2 - SQL clauses
Today’s lab will provide practice working with SQL clauses in DBeaver. Don’t forget the importance of the order of the SQL clauses.
The goals for lab 2 include:
- working with clauses in SQL.
- understanding each of
SELECT
,FROM
,WHERE
,GROUP BY
,HAVING
,ORDER BY
, andLIMIT
. - practicing running SQL queries in a SQL client, DBeaver.
Advice for turning in the assignment
Be sure to indicate (in the .sql file) which problem is being answered with which SQL code. Use the following syntax to comment within a .sql file:
/* here is where comments go */
. Indeed, feel free to copy the question into the .sql file so that you have it for your own records.save the .Rproj file somewhere you can find it. Don’t keep everything in your downloads folder. Maybe make a folder called
SDS261
or something. That folder could live on your Desktop. Or maybe in your Dropbox.The SQL document should be saved in the R Project as
lab2-sds261-yourlastname-yourfirstname.sql
. You will have to navigate to the R Project to save the DBeaver file in the correct place.Connect to the
nyctaxi
database, which contains theyellow_old
table. See README file (or lab 1) for connection details.
Assignment
- Select the
trip_distance
andtotal_amount
columns from theyellow_old
table. For safety (i.e., not crashing your computer), only return the first 5 rows.
- Using the
AVG()
function, find the averagetotal_amount
paid in theyellow_old
table.
- Which type of
payment_type
had the highesttotal_amount
? How much was it?
- How many taxi trips happened on each day of the week? Sort your results so that the day of the week with the fewest rides is at the top of the output.
- Which day of the week had the longest
trip_distance
? Sort the results to have the day of the week with the longest rides at the top of the output.
- How many different
rate_code
s are given in the dataset?
- Start with the following lines of query:
- Can you use
WHERE
to subset only Saturday and Sunday to find the average trip distance across the weekend? Why or why not? - Can you use
HAVING
to subset only Saturday and Sunday to find the average trip distance across the weekend? Why or why not?