it takes automates Popper’s criterion of predictive accuracy for evaluating models at a scale that has not been feasible before. It is notable that the powerhouse organizations of the Internet era, which include Google and Amazon, and most of the emerging Web 2.0 companies have business models that hinge on predictive models based on machine learning. Indeed, the first machine that could arguably be considered to pass the Turing test, namely IBM’s Watson, could not have done so without extensive use of machine learning in how it interpreted questions. In a game like Jeopardy!, where understanding the question itself is often a nontrivial task, it is not practical to tackle this problem through an extensive enumeration of possibilities. Rather, the solution is to “train” a computer to interpret questions correctly based on large numbers of examples.

Machine learning is fast becoming a necessary skill set in the marketplace as companies reel under the data deluge and try to build automated decision systems that hinge on future predictive accuracy. A basic course in machine learning is an absolute necessity in today’s marketplace. In addition, knowledge of text processing or “text mining” is becoming essential in light of the explosion of text and other unstructured data in healthcare systems, social networks, and other forums. Knowledge about markup languages such as XML and its derivatives is also essential as more and more content becomes tagged and hence capable of being interpreted automatically by computers.

Knowledge about machine learning must build on more basic skills, which fall into three broad classes. The first is statistics. This requires a working knowledge of probability, distributions, hypothesis testing, and multivariate analysis. This knowledge can be acquired in a two- or three-course sequence.
The last of these topics, multivariate analysis, often overlaps with the subject of econometrics, which is concerned with fitting robust statistical models to economic data. Unlike machine learning methods, which make few or no assumptions about the functional form of relationships among variables, multivariate analysis and econometrics by and large focus on estimating parameters of linear models, where the relationship between the dependent and independent variables is expressed as a linear equality.

The second set of skills for a data scientist comes from computer science and pertains to how data are internally represented and manipulated by computers. This is a sequence of courses on data structures, algorithms, and database systems. The title of the well-known textbook “Algorithms + Data Structures = Programs” expresses the fact that a program is a procedure that operates on data. Database systems are specialized programs optimized to access, store, and manipulate data. Together with scripting languages such as Python and Perl, knowledge of database systems provides the fundamental skills required for dealing with reasonably sized datasets. For handling very large datasets, however, standard database systems built on the relational data model have severe limitations. The recent move towards Hadoop for dealing with