done, people moving, etc.
• Replication: another lab can re-run a (data analysis) method and get the same results
• Original lab has to document in detail what was done
• Reproducibility: another lab can run a (data analysis) method with different data (or run a different analysis method with the same data) and get consistent results
• Reuse: another lab can run a (data analysis) method (or parts of it) for a different experiment
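The three definitions above can be told apart with a small sketch. Everything in it is hypothetical and chosen only for illustration: the `analysis` function (a trimmed mean), the synthetic data from `make_data`, and the 0.5 consistency tolerance are assumptions, not part of the original slides.

```python
import random
import statistics


def analysis(data, trim=0.1):
    """Hypothetical analysis method: mean after trimming both tails."""
    ordered = sorted(data)
    k = int(len(ordered) * trim)
    kept = ordered[k:len(ordered) - k] if k else ordered
    return statistics.mean(kept)


def make_data(seed, n=1000):
    """Hypothetical experiment: n noisy measurements around 10.0."""
    rng = random.Random(seed)
    return [rng.gauss(10.0, 2.0) for _ in range(n)]


# Result reported by the original lab.
original = analysis(make_data(seed=1))

# Replication: same method, same data, so the result should be identical.
replicated = analysis(make_data(seed=1))
is_replicated = replicated == original

# Reproducibility: same method, different data, so the result should be
# consistent (here: within an assumed tolerance), not necessarily identical.
reproduced = analysis(make_data(seed=2))
is_consistent = abs(reproduced - original) < 0.5

# Reuse: the same method applied to a different experiment altogether.
reused = analysis([3.0, 4.0, 5.0, 100.0, 4.5, 3.5, 4.2, 3.9, 4.1, 4.4])
```

Replication only works here because the seed, the method, and its `trim` parameter are all recorded, which is the point of the requirement that the original lab document in detail what was done.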
Automatic Processing of Multiple Inputs
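As a rough illustration of this idea, the sketch below applies one workflow unchanged to every input in a collection. The step functions (`clean`, `summarize`), the dataset names, and the use of a thread pool are all assumptions made for the example; a real workflow system would handle the scheduling itself.

```python
from concurrent.futures import ThreadPoolExecutor


def clean(values):
    """Hypothetical preparation step: drop missing measurements."""
    return [v for v in values if v is not None]


def summarize(values):
    """Hypothetical analysis step: count and mean."""
    return {"n": len(values), "mean": sum(values) / len(values)}


def workflow(values):
    """The workflow is defined once, as a composition of its steps."""
    return summarize(clean(values))


def run_on_all(inputs):
    """Run the same workflow on every named input and collect the results."""
    names = list(inputs)
    with ThreadPoolExecutor() as pool:
        results = list(pool.map(workflow, (inputs[name] for name in names)))
    return dict(zip(names, results))


# Hypothetical inputs: two small datasets, one with a missing value.
inputs = {
    "dataset_a": [1.0, None, 3.0],
    "dataset_b": [2.0, 4.0, 6.0],
}
results = run_on_all(inputs)
```

The user describes the workflow for a single input; iterating over the collection is left to the runner.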
Large-Scale Data Processing
Facilitate Communication Across Data Science Expertise Areas
• Domain knowledge
• Statistics, data mining
• Distributed systems
• Large-scale data processing
ICU Patient Clustering [Marlin et al IHI’12; Kale et al ’13]
• Describe problem
• Provide data
• Show results
• Show more results
• Point out issues
ICU Patient Clustering [Marlin et al IHI’12; Kale et al ’13]
End users can easily and continuously explore the data by running the workflow themselves, trying out different data and different parameter values.
RECAP: Benefits of Using Workflows
• Simple programming paradigm
• Modular assembly
• Composing heterogeneous code
• Abstraction
• Data preparation steps
• Data visualization steps
• Documenting provenance: reproducibility
• Automatic processing of multiple inputs
• Large-scale processing
• Facilitating communication across data science expertise areas
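Two of the benefits in this recap, modular assembly and documenting provenance, can be sketched together. This is a minimal illustration under stated assumptions, not how any particular workflow system works: the runner `run_with_provenance`, the steps `normalize` and `threshold`, and the shape of the provenance record are all hypothetical.

```python
import hashlib
import json


def fingerprint(data):
    """Content hash of JSON-serializable data, used to identify it later."""
    encoded = json.dumps(data, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def normalize(values):
    """Hypothetical preparation step: scale values by the maximum."""
    peak = max(values)
    return [v / peak for v in values]


def threshold(values, cutoff=0.5):
    """Hypothetical analysis step: keep values at or above the cutoff."""
    return [v for v in values if v >= cutoff]


def run_with_provenance(steps, data, params):
    """Run the steps in order and record what was run, on what, and how."""
    record = {"input": fingerprint(data), "params": params, "steps": []}
    for step in steps:
        # Each step receives only the parameters registered under its name.
        data = step(data, **params.get(step.__name__, {}))
        record["steps"].append(step.__name__)
    record["output"] = fingerprint(data)
    return data, record


params = {"threshold": {"cutoff": 0.5}}
output, record = run_with_provenance([normalize, threshold], [2.0, 4.0, 8.0], params)

# Re-running with the same steps, data, and parameters gives the same record.
output_again, record_again = run_with_provenance(
    [normalize, threshold], [2.0, 4.0, 8.0], params
)
```

Because the record names the steps, the parameters, and the input and output fingerprints, another lab can check whether a re-run matched the original.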
CLASS EXERCISE: Repeatability, Replication, Reproducibility, Reuse
• “Researchers in Frankfurt have identified the genetic code of the monarch butterfly following the identification of the genetic code of Drosophila done in Seoul last year.”
• “Researchers in Rome produce a second mouse clone following an earlier announcement of a mouse clone.”
• “Researchers in Alaska corroborate the salinity index that researchers found in Canada during salmon season.”
• “Researchers in Helsinki obtain new continental drift rates by adding precision to an earlier method developed in Lisbon.”

  • Fall '17