Problem Set 9
Topic: Categorical Explanatory Variables
1. If you believe that a statement is false, briefly say why you think
it is false.
a. The purpose of an interaction variable is to force fit the
1. (6 points) Two dice are rolled and the two resulting values are multiplied together to form the
quantity z. What are the expected value and variance of the random variable z?
2. (8 points) Two stoc
Market Fundamentals
The basic questions on markets:
Q: What is a market?
A market is nothing more than a system of shared rules, which can be laws
or collective understandings held in place by custom
INTRODUCTION TO R
Vector Arithmetic
> my_apples <- 5
> my_oranges <- 6
> my_apples + my_oranges
[1] 11
my_apples is a vector!
my_oranges is a vector!
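The claim above can be checked directly in an R session. This is a minimal sketch using only base R functions (`is.vector`, `length`, `sum`); the name `my_fruit` is a hypothetical addition for illustration and is not part of the original slides.

```r
# Single numbers in R are numeric vectors of length one.
my_apples <- 5
my_oranges <- 6

is.vector(my_apples)   # TRUE
length(my_apples)      # 1

# Arithmetic is element-wise, so longer vectors behave the same way.
# my_fruit is a hypothetical name used only for this example.
my_fruit <- c(my_apples, my_oranges)
my_fruit * 2           # 10 12
total <- sum(my_fruit) # 11
```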
What is MLE (maximum likelihood estimator)
a. Used in estimating statistical parameters. It assumes a (NO) distribution of the parameter and
Q: How would you calculate the variance of the columns of a matrix (called mat) in R without using
for loops?
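One possible answer, sketched in base R: `apply()` with `MARGIN = 2` calls `var()` on each column. Only the name `mat` comes from the question; the matrix values below are made up for illustration.

```r
# Example matrix (values are illustrative only); matrix() fills column by column.
mat <- matrix(c(1, 2, 3, 4,
                2, 4, 6, 8,
                5, 5, 5, 5), nrow = 4)

# MARGIN = 2 applies var() to each column, with no explicit loop.
col_vars <- apply(mat, 2, var)

# Equivalent alternative: the diagonal of the covariance matrix of the columns.
col_vars_alt <- diag(var(mat))
```

Both forms return one variance per column; `apply()` tends to read more directly, while `diag(var(mat))` also computes the off-diagonal covariances, which is wasted work for wide matrices.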
Q: Suppose you have the option to go into one of two bank branches. Branch one has 10 tellers,
each with a separate queue of 10 customers, and branch two has 10 tellers, sharing one queue of
15. What is the difference between R2 and Adjusted R2?
R2 is a statistic that will give some information about the goodness of fit of a model. In
27. How do you find the goodness of your model in GLM?
a. It's not R-square; here it is Chi-square.
b. Percent Correct Predictions
c. Hosmer and Lemeshow Goodness of Fit Test
d. ROC curves
e. Somers' D
f. Gamma
g. Tau-a
h. C
i. More
22. What are the different types of rotation in Factor loading?
Varimax rotation is an orthogonal rotation of the factor axes to maximize the variance of
DATA VISUALIZATION WITH GGPLOT2
Statistics with Geoms
ggplot2, course 2
Statistics
Coordinates
Facets
Themes
Data Visualization Best Practices
DATA MANIPULATION WITH DPLYR
Introduction
Group  dose 1  dose 2  Sum
A      3       3       6
A      4       5       9
B      3       1       4
B      1       3       4
C      1       3       4
C      2       2       4

Group  Total
A      15
B      8
C      8
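The group totals above can be reproduced in a few lines. This is a minimal sketch in base R rather than dplyr itself (in dplyr, `group_by()` followed by `summarise()` would play the same role); the data frame name `doses` and the column names `dose1` and `dose2` are assumptions made for the example.

```r
# Rebuild the table; names are illustrative, values come from the table.
doses <- data.frame(
  group = c("A", "A", "B", "B", "C", "C"),
  dose1 = c(3, 4, 3, 1, 1, 2),
  dose2 = c(3, 5, 1, 3, 3, 2)
)

# Row-wise sum of the two doses.
doses$sum <- doses$dose1 + doses$dose2

# Total of the row sums within each group.
group_total <- tapply(doses$sum, doses$group, sum)
group_total
```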
Once the data was downloaded, it had to be transformed in order to make
any sense of it. While this transformation was run across different parameters, only the
Based on the time series plot, we find that trips per day, and hence
revenue, for yellow cabs in NYC have declined since May 2015. This also coincides with the medallion
Methodology in Practice
1. Data Collection:
Data was scraped using a custom web scraper built in R. The scraper
used the tickers for 42 stocks which were part of
Backtesting Results
a) Maximising risk-adjusted return: The strategy finds the maximum
level of return at the minimum level of risk.
Risk Budgets Optimization
Portfolios are weighted and optimization is based on the constraints
specified by the risk budgets. A minimum Expected Shortfall Portfolio
Objective of the Project
This project aims to build an efficient trading model incorporating
sound risk management. Implementing integrated risk management
is one of the basic building blocks of a sound
Output & Test for
Interpretations:
1. The linear fit is adequate, with significant variables
LONGLOSS, SHORTLOSS, GPWPERSONAL, GPWCOMM and
LIQUIDRATIO.
Running the 2nd Regression model (using the significant
variables only): Once the significant independent variables
were identified, we proceeded to create a second regression
Multiple Linear Regression
Once the outliers were identified and removed, we proceeded to
carry out the regression on the data set. To do this, we
followed these steps:
Insurance Company Expenses
Overview:
Like every other business, insurance companies seek to
minimize expenses associated with doing business in order to
28
Graphical Representation of Data
28.1 INTRODUCTION
Whenever verbal problems involving a certain situation are presented visually before the
DATA ANALYSIS - THE DATA TABLE WAY
INTRODUCTION
What is data.table?
Think of a data.frame as a set of columns
Every column is the same length but a different type
Goal 1: Reduce programming time
