For the 2-Gaussian data, the original order converges in fewer
than 10 epochs; the reshuffled version converges even
faster, sometimes within one or two iterations.
Database Management Systems
Parallel DB & Map/Reduce
Some slides due to Kevin Chang
Parallel vs. Distributed DB
Fully integrated system, logically a single
machine
Linear Regression
CS434
Supervised learning
A regression problem
We want to learn to predict a person's height
based on his/her knee height and/or arm span
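As a rough illustration of the regression problem above, the sketch below fits a straight line to a single input feature (for example, knee height) by ordinary least squares. The function names `fit_line` and `predict` are hypothetical, and the numbers in the usage note are toy values chosen to lie exactly on a line, not real measurements.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares (one feature)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of cross deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(x, slope, intercept):
    """Predict the target (e.g. height) for a new input value x."""
    return slope * x + intercept
```

With the toy inputs `xs = [1, 2, 3, 4]` and `ys = [3, 5, 7, 9]`, `fit_line` recovers the line the points were drawn from, and `predict` can then be applied to an unseen input.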
Support Vector Machines
CS434
Linear Separators
Many linear separators exist that perfectly
classify all training examples
Which of the linear separators is the best?
CS 440
Database Management Systems
Logging & Recovery
Review: The ACID properties
Atomicity: All actions in the Xact happen, or none happen.
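The all-or-nothing behavior can be sketched with Python's built-in `sqlite3` module. This is only an illustration under assumptions: the `accounts` table, the `transfer` helper, and the balances are hypothetical and do not come from the slides. The `with conn:` block commits if the body finishes and rolls back if an exception escapes it.

```python
import sqlite3


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst as a single transaction (hypothetical helper)."""
    try:
        with conn:  # commit on success, roll back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
            (balance,) = conn.execute(
                "SELECT balance FROM accounts WHERE name = ?", (src,)
            ).fetchone()
            if balance < 0:
                # Abort: the debit above must not survive on its own.
                raise ValueError("insufficient funds")
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
        return True
    except ValueError:
        return False


def balances(conn):
    """Return the current balances as a {name: balance} dict."""
    return dict(conn.execute("SELECT name, balance FROM accounts"))


# Toy in-memory database with two made-up accounts.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("A", 100), ("B", 50)])
conn.commit()
```

A transfer that would overdraw account A is rolled back as a whole, so the debit that already ran inside the transaction leaves no trace.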
Neural Networks
Neural Network Neurons
Biologically inspired
Receives n inputs (plus a bias term)
Multiplies each input by its weight
Applies activation function to the sum of results
Outputs result
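The steps above can be sketched for a single neuron as follows. The slides do not name a specific activation function, so the sigmoid used as the default here is an assumption, and `neuron_output` is a hypothetical name.

```python
import math


def sigmoid(z):
    """Logistic activation (an assumed choice; the slides do not specify one)."""
    return 1.0 / (1.0 + math.exp(-z))


def neuron_output(inputs, weights, bias, activation=sigmoid):
    """Compute one neuron's output: activation(bias + sum_i inputs[i] * weights[i])."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    # Multiply each input by its weight, sum the results, and add the bias term.
    total = bias + sum(x * w for x, w in zip(inputs, weights))
    # Apply the activation function to the sum and output the result.
    return activation(total)
```

Passing a different callable as `activation` (for example, the identity function) changes only the final step; the weighted sum is computed the same way.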
CS434 Practice problems
Solution will be posted on May 8th, 2017
Regression and regularization
1. For a regression problem, if we do not introduce the bias term in learning, what impact will it have on
CS 440
Database Management Systems
Lecture 8: Concurrency Control
Concurrent access to data
Flight (fltNo, fltDate, seatNo, seatStatus)
Database: seats 22A and 22B are available.
Bayes and Naive Bayes
Classifiers
CS434
In this lecture
1. Review some basic probability concepts
2. Introduce a useful probabilistic rule: Bayes rule
3. Introduce the learning algorithm based on
Bayes
CS 372 Introduction to Computer Networks
Programming Assignment #1
Due Sunday, end of Week 5, by 11:59pm
Submit the source files, Makefile, and README in a .zip file to Canvas.
Objectives:
HW3 Part II solution
PART II: 1. Define functional margin and geometric margin. Explain why functional margin is not a good objective to optimize in order to learn a maximum ma
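As a small numeric illustration of the two quantities named in the question, the sketch below uses the standard definitions for a linear classifier with weights w and bias b: the functional margin of an example (x, y) is y(w·x + b), and the geometric margin is that value divided by the norm of w. The function names and the example values are hypothetical.

```python
import math


def functional_margin(w, b, x, y):
    """Functional margin y * (w . x + b) of one labeled example, with y in {-1, +1}."""
    return y * (sum(wi * xi for wi, xi in zip(w, x)) + b)


def geometric_margin(w, b, x, y):
    """Geometric margin: the functional margin divided by the Euclidean norm of w."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    if norm == 0:
        raise ValueError("w must be non-zero")
    return functional_margin(w, b, x, y) / norm
```

Scaling `w` and `b` by the same positive constant leaves the decision boundary unchanged but multiplies the functional margin by that constant, while the geometric margin stays the same.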
1. (23 pts) Short questions. The answers should be short (no more than 3 or 4 sentences typically). a. (5 points) In the following classification problem, start with a decision boundary shown in the
CS 440
Database Management Systems
NoSQL & NewSQL, Contd.
Some slides due to Magda Balazinska
Scaling by partitioning & replication
Partition the data across machines
Replicate the partitions
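One way to picture the two steps above is hash partitioning with each partition copied to several nodes. This is a minimal sketch under assumptions, not the scheme of any particular system: `partition_of` and `replica_nodes` are hypothetical names, CRC32 is an arbitrary choice of stable hash, and replicas are simply placed on consecutive node numbers.

```python
import zlib


def partition_of(key, num_partitions):
    """Map a string key to a partition with a stable hash (CRC32, an assumed choice)."""
    # zlib.crc32 is used instead of the built-in hash(), which is randomized
    # per process for strings and would give different partitions on each run.
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def replica_nodes(partition, num_nodes, replication_factor):
    """Return the nodes that hold copies of a partition (consecutive placement)."""
    if not 1 <= replication_factor <= num_nodes:
        raise ValueError("replication_factor must be between 1 and num_nodes")
    return [(partition + i) % num_nodes for i in range(replication_factor)]
```

For example, with 4 nodes and a replication factor of 2, partition 3 is stored on nodes 3 and 0, so its data stays readable if either of those nodes fails.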
CS 440
Database Management Systems
NoSQL & NewSQL
Motivation
Web 2.0 applications
thousands or millions of users.
users perform both reads and updates.
How to scale DBMS?
1. Short questions.
(a) (6 pts) Consider the following two strategies for avoiding overfitting in decision trees and explain
the key advantages and disadvantages of each.
