<- DBI::dbConnect(duckdb::duckdb(),
con_college dbdir = "samenameaslab5")
Lab 6 - editing databases
Today’s lab will provide practice working with editing SQL databases using DuckDB.
The goals for lab 6 include:
- using
UPDATE
to change the data. - using
INSERT INTO
to add records to a table. - using
DELETE
to delete records. - using
ALTER TABLE
to change the table structure. - creating temporary tables to use in a follow up
SELECT
query.
Advice for turning in the assignment
Be sure to indicate (in the .qmd file) which problem is being answered with which code. A sentence or two with each response goes a long way toward your understanding!
save the .Rproj file somewhere you can find it. Don’t keep everything in your downloads folder. Maybe make a folder called
SDS261
or something. That folder could live on your Desktop. Or maybe in your Dropbox.The .qmd document should be saved in the R Project as
lab6-sds261-yourlastname-yourfirstname.qmd
.Set up a connection to a database of your naming using DuckDB.
Assignment
The data we will work with today is on college tuition, pay, and diversity. The original data source is the US Department of Education, but it has been compiled as part of TidyTuesday on March 10, 2020. Tuition and fees are for 2018-19. Diversity is for 2014.
- Use the same database from Lab 5. If the database somehow got deleted or corrupted, go back and re-run your code from Lab 5.
- Use
UPDATE
to change theNA
values toNULL
(forroom_and_board
in thetuition
table andmake_world_better_percent
in thesalary
table). (Originally I had a follow up task to convert the data type fromVARCHAR
toDOUBLE
as well. However, DuckDB is particularly finicky with those conversions, and so we aren’t going to do it. If/when you work on a SQL server, you might adjust the data type using something like:ALTER TABLE tuition MODIFY room_and_board DOUBLE;
.)1
- Using
INSERT INTO
add records to thediversity
table. Add ‘Barden University’ in ‘Louisiana’ that has a total enrollment of 10,000 students, half of whom are female. Add an additional value to indicate, if the Bellas are representative of the student population, that Barden University has 1667 Black students.
- Check to see that the values inserted into the
diversity
table are correct. In order to tell, use aSELECT
clause where you only look at schools from ‘Louisiana’ and sort by the name of the school.
- Using the
diversty
table,DELETE
all the records that don’t record race or ethnicity. That is, delete the records that have acategory
of ‘Total Minority’, ‘Women’, or ‘Non-Resident Foreign’. Again, use aSELECT
command to confirm that your code worked.
- Use
ALTER TABLE
to change the variable type ofmake_world_better_percent
in thesalary
table fromVARCHAR
toDOUBLE
. Follow up by runningDESCRIBE
on thesalary
table to make sure your change worked. (Note: for some reason, the same code doesn’t seem to work on thetuition
table withroom_and_board
. It says thattuition
depends on something. You are welcome to try to make it work on your own computers!)
- Create a temporary table (e.g., to use in a follow up
SELECT
query) that contains only the ‘Private’ universities from thetuition
table.
- Create a temporary table consisting of the
JOIN
of all three tables but including only the Smith College data.
Disconnect from the server
dbDisconnect(con_college, shutdown = TRUE)
Footnotes
To change the data type only during a query, use
SELECT CAST(room_and_board AS DOUBLE) FROM tuition
.↩︎