on the relational data model has severe limitations. The recent move towards Hadoop for dealing with enormous datasets signals a new set of required skills for data scientists.

The final skill set is the most non-standard and elusive, but it is probably what differentiates effective data scientists. This is the ability to formulate problems in a way that results in effective solutions. Herbert Simon, the famous economist and “father of Artificial Intelligence,” argued that many seemingly different problems are often “isomorphic” in that they have the identical underlying structure. Simon demonstrated that many recursive problems, for example, could be expressed as the standard Towers of Hanoi problem, that is, with identical initial and goal states and operators. Simon observed that these differently stated problems took very different amounts of time to solve, representing different levels of difficulty even though they had the identical underlying structure. Simon’s larger point was that seemingly difficult problems are easy to solve if represented creatively [17].

In a broader sense, formulation expertise involves the ability to see commonalities across very different problems. For example, many problems of interest have “unbalanced target classes,” usually denoting that the dependent variable is interesting only a small minority of the time. As an example, very few people in a population commit fraud, very few people develop diabetes, and very few people respond to marketing offers or promotions. Yet these are the cases of interest that we would like to predict. Such problems pose challenges for models, which must go out on a limb to make predictions that are very likely to be wrong unless the model is very good at discriminating among the classes.
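To make the difficulty with unbalanced target classes concrete, the following is a minimal sketch in Python using only the standard library. The population size, the 1% base rate, and the behavior of the two "models" are invented for illustration and do not come from the article. It shows why plain accuracy can be misleading when the class of interest is rare: a model that never predicts the rare class scores very well on accuracy yet finds none of the cases we care about, while a model that does go out on a limb is wrong most of the times it raises a flag.

```python
def summarize(y_true, y_pred):
    """Return accuracy, precision and recall for binary labels (1 = rare class)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # rare cases caught
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # rare cases missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # correctly ignored
    return {
        "accuracy": (tp + tn) / len(pairs),
        # Precision is reported as 0.0 when the model never predicts the rare class.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Hypothetical population: 1,000 cases, of which 10 (1%) are the class of interest.
y_true = [1] * 10 + [0] * 990

# Model A (hypothetical): never predicts the rare class.
always_negative = [0] * 1000

# Model B (hypothetical): flags 8 of the 10 rare cases, plus 40 false alarms.
flagging = [1] * 8 + [0] * 2 + [1] * 40 + [0] * 950

result_a = summarize(y_true, always_negative)
result_b = summarize(y_true, flagging)

for name, result in (("always negative", result_a), ("flagging", result_b)):
    print(name, {k: round(v, 3) for k, v in result.items()})
```

Under these assumed numbers, Model A reaches 99% accuracy with zero recall, whereas Model B has lower accuracy (95.8%) but catches 80% of the rare cases, at the cost of being correct in only about one flag out of six. This is the sense in which the priors are stacked against any model that tries to predict the minority class.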
Experienced data miners are very familiar with such problems and know how to formulate them in a way that gives a system a chance of making correct predictions under conditions where the priors are stacked heavily against it.

The above represent “core skills” for data scientists over the next decade. The term “computational thinking,” coined by Seymour Papert [13] and elaborated by Wing [19], is similar to the core skills we describe, but it also encompasses abstract thinking about the kinds of problems computers are better at than humans, and vice versa, and the implications of that difference. Universities are scrambling to train students in the core skills, along with electives that are more suited to specific disciplines. The McKinsey study mentioned earlier projects roughly 200 thousand additional “deep analytical” positions and 1.5 to 2 million “data managers” over the next five years. The projection of almost two million managers is not just about managing data scientists, but about a fundamental shift in how managerial decisions are being driven by data. The famous quote attributed to W. Edwards Deming has come to characterize the new orientation from intuition-based decision making to fact-based decision making: “In God we trust, everyone else please bring data.” This isn’t going to be an easy