con_college <- DBI::dbConnect(duckdb::duckdb(),
dbdir = "samenameaslab5")Lab 6 - editing databases
Today’s lab will provide practice working with editing SQL databases using DuckDB.
The goals for lab 6 include:
- using
UPDATEto change the data. - using
INSERT INTOto add records to a table. - using
DELETEto delete records. - using
ALTER TABLEto change the table structure. - creating temporary tables to use in a follow up
SELECTquery.
Advice for turning in the assignment
Be sure to indicate (in the .qmd file) which problem is being answered with which code. A sentence or two with each response goes a long way toward your understanding!
save the .Rproj file somewhere you can find it. Don’t keep everything in your downloads folder. Maybe make a folder called
SDS261or something. That folder could live on your Desktop. Or maybe in your Dropbox.The .qmd document should be saved in the R Project as
lab6-sds261-yourlastname-yourfirstname.qmd.Set up a connection to a database of your naming using DuckDB.
Assignment
The data we will work with today is on college tuition, pay, and diversity. The original data source is the US Department of Education, but it has been compiled as part of TidyTuesday on March 10, 2020. Tuition and fees are for 2018-19. Diversity is for 2014.
- Use the same database from Lab 5. If the database somehow got deleted or corrupted, go back and re-run your code from Lab 5.
- Use
UPDATEto change theNAvalues toNULL(forroom_and_boardin thetuitiontable andmake_world_better_percentin thesalarytable). (Originally I had a follow up task to convert the data type fromVARCHARtoDOUBLEas well. However, DuckDB is particularly finicky with those conversions, and so we aren’t going to do it. If/when you work on a SQL server, you might adjust the data type using something like:ALTER TABLE tuition MODIFY room_and_board DOUBLE;.)1
- Using
INSERT INTOadd records to thediversitytable. Add ‘Barden University’ in ‘Louisiana’ that has a total enrollment of 10,000 students, half of whom are female. Add an additional value to indicate, if the Bellas are representative of the student population, that Barden University has 1667 Black students.
- Check to see that the values inserted into the
diversitytable are correct. In order to tell, use aSELECTclause where you only look at schools from ‘Louisiana’ and sort by the name of the school.
- Using the
diverstytable,DELETEall the records that don’t record race or ethnicity. That is, delete the records that have acategoryof ‘Total Minority’, ‘Women’, or ‘Non-Resident Foreign’. Again, use aSELECTcommand to confirm that your code worked.
- Use
ALTER TABLEto change the variable type ofmake_world_better_percentin thesalarytable fromVARCHARtoDOUBLE. Follow up by runningDESCRIBEon thesalarytable to make sure your change worked. (Note: for some reason, the same code doesn’t seem to work on thetuitiontable withroom_and_board. It says thattuitiondepends on something. You are welcome to try to make it work on your own computers!)
- Create a temporary table (e.g., to use in a follow up
SELECTquery) that contains only the ‘Private’ universities from thetuitiontable.
- Create a temporary table consisting of the
JOINof all three tables but including only the Smith College data.
Disconnect from the server
dbDisconnect(con_college, shutdown = TRUE)Footnotes
To change the data type only during a query, use
SELECT CAST(room_and_board AS DOUBLE) FROM tuition.↩︎