Probability for Data Science
Probability can be thought of as a way of quantifying the uncertainty associated with events chosen from
a universe of events. Notationally, we present probability of an e
Zip and Argument Unpacking in Python
Zip transforms multiple lists into a single list of tuples of corresponding elements. The following example
illustrates the use of zip in Python:
list1 = ['a', 'b', 'c']
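As a minimal sketch of the idea, the snippet below pairs two lists with zip and then reverses the pairing with argument unpacking. The name list2 and its values are assumptions added for illustration; only list1 appears in the text above.

```python
list1 = ['a', 'b', 'c']
list2 = [1, 2, 3]  # hypothetical second list, not from the original text

# zip pairs up corresponding elements; list() materializes the lazy iterator.
pairs = list(zip(list1, list2))
print(pairs)  # [('a', 1), ('b', 2), ('c', 3)]

# Argument unpacking: *pairs passes each tuple as a separate argument,
# so zip regroups the first elements together and the second elements together.
letters, numbers = zip(*pairs)
print(letters)  # ('a', 'b', 'c')
print(numbers)  # (1, 2, 3)
```

Note that unzipping this way yields tuples rather than lists; wrap each result in list() if a list is needed.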
Network Analysis
Many data problems can be fruitfully thought of in terms of
networks, consisting of nodes of some type and the edges that join
them. For instance, your Twitter followers form t
Recommender System
One of the most common data problems is creating recommendations of some sort. Twitter
recommends people you might want to follow, Amazon recommends products you might want to
buy, N
Naive Bayes
Let's take a scenario where you have to filter out the spam messages from a list of
all possible messages. Let S be the event that the message is spam and V be the
event that the message contains the
Data Science
Algebra for data science:
Matrix
A matrix is a two-dimensional collection of numbers. Matrices in Python can be represented as lists of lists,
with each inner list having the same size and re
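As a minimal sketch of the list-of-lists representation, the snippet below builds a small matrix and reads off its dimensions. The helper name shape and the sample values are assumptions for illustration, not taken from the original text.

```python
from typing import List, Tuple

Matrix = List[List[float]]

def shape(A: Matrix) -> Tuple[int, int]:
    """Return (number of rows, number of columns) of a list-of-lists matrix."""
    num_rows = len(A)
    num_cols = len(A[0]) if A else 0  # columns = length of the first inner list
    return num_rows, num_cols

# Hypothetical 2 x 3 matrix: two inner lists, each of the same size.
A = [[1, 2, 3],
     [4, 5, 6]]

print(shape(A))  # (2, 3)
```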
Decision Trees
A decision tree uses a tree structure to represent a number of possible decision paths and
an outcome for each path. Decision trees are easy to interpret and the process by which they reac
Clustering
Clustering falls under unsupervised learning, in which we work with unlabeled data or with
data that has labels we ignore.
Whenever you look at some source of data, it's likely tha
Statistics for Data Science
Statistics refers to the mathematics and techniques with which we understand data.
Let's start with an example:
num_friends = [100, 49, 41, 40, 25,
# ... and lots more
]
This mig
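A few summary statistics for such a list can be sketched with Python's standard statistics module. The sketch uses only the five values visible above, since the rest of the list is elided in the source; the name num_friends_sample is an assumption for illustration.

```python
from statistics import mean, median

# Only the five values shown in the text; the full list is not available here.
num_friends_sample = [100, 49, 41, 40, 25]

num_points = len(num_friends_sample)   # how many data points we have
largest = max(num_friends_sample)      # largest value
smallest = min(num_friends_sample)     # smallest value
average = mean(num_friends_sample)     # sum of values divided by count
middle = median(num_friends_sample)    # middle value after sorting

print(num_points, largest, smallest, average, middle)
```

The mean is pulled upward by the single large value (100), while the median is not, which is why the two are often reported together.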
Data Visualization
Scatterplot
Scatterplots are used to illustrate the relationship between two paired sets of data. For instance, the
following example illustrates the relationship between the number
Simple Linear Regression
To measure the strength of the linear relationship between two variables, we use
the correlation function. However, for most applications, knowing that such a
relationship exists is not
Machine Learning
It is worth looking at models before jumping into machine learning. A model is simply a specification of
a mathematical relationship that exists between different variables.
For example:
Bus
Visualizing Data
Visualization is one of the most powerful means of achieving
goals.
Data visualization serves two purposes:
To explore data
To communicate data
It's easy to create data visualizations,
Data Visualization
Line Charts
Line charts are a good way of showing trends. The following example illustrates a line chart.
variance = [1, 2, 4, 8, 16, 32, 64, 128, 256]
bias_squared = [256, 128, 64,
Hypothesis and Inference
Hypotheses help us, as data scientists, to test our statistics and probability. A large part of a
data scientist's work involves forming and testing hypotheses about data and the
Data Science
Algebra for data science:
Linear algebra is the branch of mathematics that deals with vector spaces.
Vectors
Abstractly, vectors are objects that can be added together to form new vectors
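A minimal sketch of componentwise addition, representing vectors as Python lists of numbers. The function name vector_add and the sample values are assumptions for illustration.

```python
from typing import List

Vector = List[float]

def vector_add(v: Vector, w: Vector) -> Vector:
    """Add two vectors of equal length componentwise."""
    if len(v) != len(w):
        raise ValueError("vectors must have the same length")
    # zip pairs the i-th element of v with the i-th element of w.
    return [v_i + w_i for v_i, w_i in zip(v, w)]

result = vector_add([1, 2, 3], [4, 5, 6])
print(result)  # [5, 7, 9]
```

Plain lists keep the sketch dependency-free; for real workloads a numerical array library is the usual choice.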
Database and SQL
The data you need will often live in databases, systems designed for
efficiently storing and querying data. The bulk of these systems are relational databases
Visualizing Data
Bar Charts
Bar charts are most useful when you want to show how some quantity varies among a discrete set
of items. For instance, the following bar chart shows how many Academy Aw
Massachusetts Institute of Technology
Department of Electrical Engineering & Computer Science
6.041/6.431: Probabilistic Systems Analysis
(Fall 2010)
Tutorial 5
October 14/15, 2010
1. Let Q be a rando
Tutorial 2
September 23/24, 2010
1. A player is ra
Tutorial 3
September 30/October 1, 2010
1. Let X a
Tutorial 1
September 16/17, 2010
1. Let A and B be
Tutorial 10
November 18/19, 2010
1. Define X as the
Tutorial 8
November 4/5, 2010
1. Type A, B, and C
Tutorial/Recitation 9
November 12, 2010
1. Problem
Tutorial 4
October 7/8, 2010
1. Let X and Y be Gau
Tutorial 6
October 21/22, 2010
1. Let X be a discr
Tutorial 7
October 28/29, 2010
1. Alice and Bob al
import java.io.DataInputStream;
class Celcius
{
    public static void main(String args[])
    {
        DataInputStream in = new DataInputStream(System.in);
        try
        {
            System.out.println("Enter Temperature
import java.io.DataInputStream;
class Armstrong
{
    public static void main(String args[])
    {
        DataInputStream in = new DataInputStream(System.in);
        try
        {
            System.out.println("Enter the number