# Lecture 12: Debugging and Databases (STAT GR5206)

Lecture 12: Debugging and Databases

STAT GR5206 Statistical Computing & Introduction to Data Science

Cynthia Rush, Columbia University

December 9, 2016
## Course Notes

• Final is next Friday, December 16, 1:10pm - 4:00pm in this room.
• Homework is due on Monday.
## Last Time

• Split/Apply/Combine: A model for working with data.
• plyr Package: Similar to the apply() family, but more consistent.
• PCA and K-Means Clustering
## Topics for Today

• Debugging.
• Databases: What are databases. Intro to SQL and interfacing R with SQL.
## Section I. Databases: SQL and Querying
## Databases

• A record is a collection of fields (like rows and columns).
• A table is a collection of records which all have the same fields with different values. These are like dataframes in R.
• A database is a collection of tables.
## Databases vs. Dataframes

R's dataframes are actually tables.

| R Jargon | Database Jargon |
|---|---|
| column | field |
| row | record |
| dataframe | table |
| types of the columns | table schema |
| bunch of related dataframes | database |
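The jargon mapping above can be made concrete with a small sketch in base R. The `students` and `courses` dataframes and all their values are hypothetical, invented only for illustration, and treating a named list of dataframes as a "database" is just an analogy, not how database software stores anything.

```r
# Hypothetical example data: each row is a record, each column is a field.
students <- data.frame(
  id    = c(1L, 2L, 3L),
  name  = c("Ana", "Ben", "Chloe"),
  score = c(91.5, 78.0, 85.25),
  stringsAsFactors = FALSE
)

# Fields (columns) shared by every record in the table
field_names <- names(students)

# One record (row): the same fields, holding this record's values
second_record <- students[2, ]

# A rough analogue of the table schema: the type of each column
schema <- sapply(students, class)

# A "database" in this analogy: a bunch of related dataframes
courses <- data.frame(
  id     = c(1L, 2L, 3L),
  course = c("stats", "stats", "cs"),
  stringsAsFactors = FALSE
)
database <- list(students = students, courses = courses)

print(schema)
```

Here `schema` is a named character vector giving one type per field, which is about as close as a plain dataframe gets to a declared table schema.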
## Databases: So, Why Do We Need Database Software?

• Size
  • R keeps its dataframes in memory
  • Industrial databases can be much bigger
  • Work with selected subsets
• Speed
  • Clever people have worked very hard on getting just what you want fast
• Concurrency
  • Many users accessing the same database simultaneously
  • Lots of potential for trouble (two users want to change the same record at once)
## Databases: So, Why Do We Need Database Software?

• Databases live on a server, which manages them
• Users interact with the server through a client program
• Lets multiple users access the same database simultaneously
• SQL (structured query language) is the standard for database software
• Mostly about queries, which are like doing row/column selections on a dataframe in R
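To illustrate the comparison between queries and row/column selections, here is a minimal base R sketch. The `students` dataframe is hypothetical, and the SQL statements in the comments are only rough equivalents written for this example; they are not executed here.

```r
# Hypothetical example table
students <- data.frame(
  name  = c("Ana", "Ben", "Chloe", "Dev"),
  year  = c(1L, 2L, 2L, 1L),
  score = c(91.5, 78.0, 85.25, 66.0),
  stringsAsFactors = FALSE
)

# Column selection.
# Roughly: SELECT name, score FROM students;
cols <- students[, c("name", "score")]

# Row selection.
# Roughly: SELECT * FROM students WHERE score > 80;
rows <- students[students$score > 80, ]

# Rows and columns at once.
# Roughly: SELECT name FROM students WHERE year = 2 AND score > 80;
both <- students[students$year == 2 & students$score > 80, "name"]

print(cols)
print(rows)
print(both)
```

One difference worth noting: selecting a single column with `[` returns a plain vector in R, whereas a SQL query always returns a table.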
## SQL: Connecting R to SQL

• SQL is its own language, independent of R (similar to regular expressions).
• But we're going to learn how to run SQL queries through R.
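As a minimal sketch of what running a SQL query through R can look like, the block below uses the DBI and RSQLite packages. This is an assumption: the slides shown here do not say which package the course uses, and both packages must be installed. It also swaps the client/server setup described earlier for an in-memory SQLite database, which needs no server, and the `students` table is hypothetical.

```r
library(DBI)  # generic database interface (assumed installed, with RSQLite)

# Open an in-memory SQLite database, so this sketch needs no server or file
con <- dbConnect(RSQLite::SQLite(), ":memory:")

# Hypothetical example data, copied into the database as a table
students <- data.frame(
  name  = c("Ana", "Ben", "Chloe", "Dev"),
  score = c(91.5, 78.0, 85.25, 66.0),
  stringsAsFactors = FALSE
)
dbWriteTable(con, "students", students)

# The query is an ordinary SQL string; R sends it and gets a dataframe back
result <- dbGetQuery(
  con,
  "SELECT name, score FROM students WHERE score > 80 ORDER BY score DESC"
)

print(result)

# Close the connection when finished
dbDisconnect(con)
```

The SQL itself is independent of R: the same query string could be sent from any other client, and R only handles the connection and the returned dataframe.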