Q2.pdf - ML Assignment 1 Logistic Regression Logistic...

This preview shows page 1 - 4 out of 14 pages.

ML Assignment 1
Logistic Regression
Logistic Regression on US Census Data Inference

In [1]:
    import numpy as np
    import pandas as pd
    import matplotlib.pyplot as plt
    import sklearn
    import random
    import matplotlib.colors as mcolors
    %matplotlib inline
    from sklearn.preprocessing import StandardScaler
    from sklearn.preprocessing import LabelEncoder
    from sklearn.model_selection import train_test_split
    from sklearn.linear_model import LogisticRegression

    dataset_location = 'uscensus/train.csv'
    testdataset_location = 'uscensus/test.csv'
    columns = ['age', 'workclass', 'fnlwgt', 'education', 'education-num',
               'marital_status', 'occupation', 'relationship', 'race', 'sex',
               'capital_gain', 'capital_loss', 'hrs_per_week', 'native_country',
               'pincome']

    train = pd.read_csv(dataset_location, names=columns)
    train = train.replace(to_replace="<=50K.", value=0)
    train = train.replace(to_replace=">50K.", value=1)
    test = pd.read_csv(testdataset_location, names=columns)
    test = test.replace(to_replace="<=50K.", value=0)
    test = test.replace(to_replace=">50K.", value=1)
    data = pd.concat([train, test])
    data.head()

Out[1]:
       age  workclass         fnlwgt  education  education-num  marital_status      occupation         relationship
    0  39   State-gov         77516   Bachelors  13             Never-married       Adm-clerical       Not-in-family  W
    1  50   Self-emp-not-inc  83311   Bachelors  13             Married-civ-spouse  Exec-managerial    Husband        W
    2  38   Private           215646  HS-grad    9              Divorced            Handlers-cleaners  Not-in-family  W
    3  53   Private           234721  11th       7              Married-civ-spouse  Handlers-cleaners  Husband        B
    4  28   Private           338409  Bachelors  13             Married-civ-spouse  Prof-specialty     Wife           B
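In [2]:
    label_encoder = LabelEncoder()
    for i in columns:
        if not np.issubdtype(data[i].dtype, np.number):
            # Categorical column: map each category to an integer code.
            data[i] = label_encoder.fit_transform(data[i])
        else:
            # Numeric column: min-max scale to the range [0, 1].
            data[i] = (data[i] - data[i].min()) / (data[i].max() - data[i].min())
    # data = (data-data.min())/(data.max()-data.min())
    data.describe()

    def test_train_split(data, split=0.75):
        # Reset the index first: pd.concat([train, test]) leaves duplicate
        # labels, and drop(train.index) would then remove extra rows.
        data_copy = data.copy().reset_index(drop=True)
        train = data_copy.sample(frac=split, random_state=0)
        test = data_copy.drop(train.index)
        # Shuffle each part and renumber its rows.
        train = train.sample(frac=1).reset_index(drop=True)
        test = test.sample(frac=1).reset_index(drop=True)
        return train, test

    # data.drop('education-num',1)
    # The function returns (train, test), so unpack in that order.
    train, test = test_train_split(data, split=len(train) / (len(train) + len(test)))
    # Build the test features from the test split, not from the full data.
    test_y, test_x = test['pincome'], test.drop(columns='pincome')

In [3]:
    def cross_validation_split(data, folds=5):
        splits = np.array_split(data, folds)
        # Split labels and features into the same number of consecutive folds.
        y_splits = np.array_split(data['pincome'], folds)
        x_splits = np.array_split(data.drop(columns='pincome'), folds)
        return x_splits, y_splits

    x_splits, y_splits = cross_validation_split(train)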
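The fold lists returned by cross_validation_split can be turned into train/validation pairs by holding out one fold at a time. The following is only a sketch of that idea: the helper name kfold_pairs and the toy DataFrame are hypothetical and do not appear in the notebook, and the folds here are built by row position rather than by calling np.array_split on a DataFrame.

```python
import numpy as np
import pandas as pd


def kfold_pairs(x_splits, y_splits):
    """Yield (x_train, y_train, x_val, y_val), holding out one fold at a time."""
    for k in range(len(x_splits)):
        x_val, y_val = x_splits[k], y_splits[k]
        # Every fold except fold k forms the training set for this round.
        x_train = pd.concat([x for i, x in enumerate(x_splits) if i != k])
        y_train = pd.concat([y for i, y in enumerate(y_splits) if i != k])
        yield x_train, y_train, x_val, y_val


# Toy data standing in for the preprocessed census frame (hypothetical values).
toy = pd.DataFrame({"age": np.linspace(0, 1, 10), "pincome": [0, 1] * 5})
folds = 5

# Split row positions into consecutive chunks, then slice features and labels.
index_chunks = np.array_split(np.arange(len(toy)), folds)
x_splits = [toy.drop(columns="pincome").iloc[idx] for idx in index_chunks]
y_splits = [toy["pincome"].iloc[idx] for idx in index_chunks]

fold_sizes = [(len(xt), len(xv)) for xt, yt, xv, yv in kfold_pairs(x_splits, y_splits)]
```

Each round trains on four folds and validates on the remaining one, so every row is used for validation exactly once.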
In [17]:
    class LogisticRegression(object):
        def __init__(self, x_in, y_in, folds, learningrate=0.05, iterations=100, regularization=None, penalty=0.01):
            self.x_splits = x_in
            self.y_splits = y_in
            self.learningrate = learningrate
            self.iterations = iterations
            self.regularization = regularization
            self.penalty = penalty
            self.folds = folds

        def avi(self, x_in, y_in, folds, learningrate=0.05, iterations=100, regularization=None, penalty=0.01, graph=True):
            train_score, validation_score, weights = self.train(x_in, y_in, folds, learningrate, iterations, regularization, penalty)
            if graph:
                fig2, ax2 = plt.subplots()
                ax2.set_title("Accuracy and Error vs Iterations on Validation Set, Regularization " + str(regularization))
                ax2.plot(np.arange(len(validation_score)), validation_score, 'b', label='acc
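The preview ends before the class's train method, so its actual training loop is not visible. As a rough illustration of what the hyperparameters in the constructor usually control, here is a minimal sketch of logistic regression fitted by batch gradient descent. The function names fit_logistic and accuracy, the "l1"/"l2" values for regularization, and the synthetic data are all assumptions made for this example, not the notebook's implementation.

```python
import numpy as np


def sigmoid(z):
    """Logistic function mapping real values into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))


def fit_logistic(x, y, learningrate=0.05, iterations=100,
                 regularization=None, penalty=0.01):
    """Batch gradient descent on the mean log-loss; returns (weights, bias)."""
    n_samples, n_features = x.shape
    weights = np.zeros(n_features)
    bias = 0.0
    for _ in range(iterations):
        error = sigmoid(x @ weights + bias) - y   # predicted probability minus label
        grad_w = x.T @ error / n_samples          # gradient of the mean log-loss
        grad_b = error.mean()
        if regularization == "l2":                # assumed option name
            grad_w += penalty * weights
        elif regularization == "l1":              # assumed option name
            grad_w += penalty * np.sign(weights)
        weights -= learningrate * grad_w
        bias -= learningrate * grad_b
    return weights, bias


def accuracy(x, y, weights, bias):
    """Fraction of rows whose thresholded prediction matches the label."""
    return float(np.mean((sigmoid(x @ weights + bias) >= 0.5) == y))


# Synthetic, linearly separable data (hypothetical, not the census data).
rng = np.random.default_rng(0)
x = rng.normal(size=(200, 2))
y = (x[:, 0] + x[:, 1] > 0).astype(float)

weights, bias = fit_logistic(x, y, iterations=500, regularization="l2")
train_accuracy = accuracy(x, y, weights, bias)
```

Recording the accuracy after each iteration on the training and validation folds would produce curves like the one the avi method plots.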


Term: Fall
Professor: N/A
Tags: Machine Learning, Logistic function, Iterated function, selection import train
