Lab: Logistic Regression for Gene Expression Data

In this lab, we use logistic regression to predict biological characteristics ("phenotypes") from gene expression data. In addition to the concepts in the breast cancer demo (./breast_cancer.ipynb), you will learn to:

- Handle missing data
- Perform binary classification and evaluate performance using various metrics
- Perform multi-class logistic classification and evaluate performance using accuracy and a confusion matrix
- Use L1-regularization to promote sparse weights for improved estimation (Grad students only)

Background

Genes are the basic unit in the DNA and encode blueprints for proteins. When proteins are synthesized from a gene, the gene is said to "express". Micro-arrays are devices that measure the expression levels of large numbers of genes in parallel. By finding correlations between expression levels and phenotypes, scientists can identify possible genetic markers for biological characteristics.

The data in this lab comes from: ()

In this data, mice were characterized by three properties:

- Whether they had Down's syndrome (trisomy) or not
- Whether they were stimulated to learn or not
- Whether they were given the drug memantine or a saline control solution

Each property is a binary choice, so there are 2 × 2 × 2 = 8 possible classes for each mouse. For each mouse, the expression levels were measured across 77 genes. We will see if the characteristics can be predicted from the gene expression levels. This classification could reveal which genes are potentially involved in Down's syndrome and whether drugs and learning have any noticeable effects.

Load the Data

We begin by loading the standard modules.

In [ ]:
    import pandas as pd
    import numpy as np
    import matplotlib
    import matplotlib.pyplot as plt
    %matplotlib inline
    from sklearn import linear_model, preprocessing
Use the pd.read_excel command to read the data from () into a dataframe df. Use the index_col option to specify that column 0 is the index. Use df.head() to print the first few rows.
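The loading step described above might look like the following minimal sketch. The data location is not reproduced in this text, so `DATA_SOURCE` is a hypothetical placeholder and the `pd.read_excel` call is left inside a function rather than executed. To show what `index_col=0` does without the real file, the sketch also parses a tiny made-up table with `pd.read_csv`, which accepts the same option; the column names `MouseID`, `gene_a`, and `gene_b` are invented for illustration and are not taken from the lab's dataset.

```python
import io

import pandas as pd

# Hypothetical placeholder: substitute the data location given in the lab.
DATA_SOURCE = "path/or/url/to/data.xls"


def load_expression_data(source):
    """Read the spreadsheet, using column 0 as the row index."""
    return pd.read_excel(source, index_col=0)


# With the real file available, the lab step would be:
#   df = load_expression_data(DATA_SOURCE)
#   df.head()

# Stand-in illustration with made-up values (not the lab's data).
toy_csv = io.StringIO(
    "MouseID,gene_a,gene_b\n"
    "m1,0.50,1.20\n"
    "m2,0.70,1.10\n"
    "m3,0.40,0.90\n"
)
demo = pd.read_csv(toy_csv, index_col=0)  # column 0 becomes the index
print(demo.head(2))                       # show the first two rows
```

Because column 0 is used as the index, it no longer counts as a data column: `demo` has two columns, and rows can be selected by label with `demo.loc["m1"]`.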