Prediction of Protein Solubility in Escherichia coli Using Discriminant Analysis, Logistic Regression, and Artificial Neural Network Models

Reese Lennarson, Rex Richard, Miguel Bagajewicz and Roger Harrison

Abstract

Recombinant DNA technology is important in the mass production of proteins for academic, medical, and industrial use, and the prediction of the solubility of proteins is a significant part of it. However, the solubility of a protein when overexpressed in a host organism is difficult to predict. Thus, a model capable of accurately estimating the likelihood of proteins to form insoluble inclusion bodies would be highly useful in many applications, indicating whether proteins necessitate chaperones to remain soluble under the conditions within the host organism. To this end, solubility data for proteins overexpressed in Escherichia coli were compiled, and properties of the proteins likely affecting solubility were identified as parameters for building solubility prediction models.
