

Defining Data Science

Data Science refers to the activity of analyzing a large amount of data to extract knowledge and insight leading to actionable decisions.

The main components of Data Science are:

- Acquiring
- Cleaning
- Describing
- Exploratory Analysis
- Making predictions
- Suggesting recommendations

Explaining the Components

Acquiring Data
Acquire the raw data from a multitude of data sources, such as RDBMS and web pages.

Cleaning Data
Convert the raw data into a machine-readable format (enrich, detect/remove outliers, and apply business rules).

Describing Data
Summarize the data to get a holistic picture.

Exploratory Analysis
Explore and determine the patterns from the data. Visualize the data to unearth insights.

Making Predictions
Generalize the patterns in data to build models. Make predictions on unknown data.

Suggesting Recommendations
With the patterns and predictions in hand, you can suggest recommendations.

Data

A proper strategy must be put in place to ensure that data scientists have easy access to the sources of data. Data governance must be handled with care. Data governance is the overall management of the availability, usability, integrity, and security of data used in an enterprise.

Architecture

Traditional Monolithic Architecture
Features are bundled in a single deployment location.

Service-Oriented Architecture
Functionality is broken down into services that are deployed independently to ensure high efficiency and scalability.

Industry Standard

You have covered the components of Data Science in the previous cards. To apply them in an industry context, you need to know a few open standards, such as CRISP-DM (Cross-Industry Standard Process for Data Mining). It has six phases:

1. Business understanding
2. Data understanding
3. Data preparation
4. Modeling
5. Evaluation
6. Deployment

Knowing the ML Terms

Before diving deep into ML, you have to be aware of the following terms.

Algorithm
An algorithm is a set of rules and statistical techniques used to learn and derive insights from data patterns, e.g., Decision Tree, Linear Regression, and Random Forest.

ML Model
An ML model is a mathematical model trained by an algorithm to predict the patterns in the data.

Predictor Variable
A predictor variable is a variable used to predict another variable/output.

Response Variable
A response variable is the target variable or output variable that needs to be predicted.

Training Data
A model is built using training data.

Testing Data
The model is evaluated using testing data.

Predictor vs Response Variable
In the scenario where the height of an individual is predicted based on age, the predictor variable is Age, and the response variable is Height.

What is MFDM™?

TCS defines the Machine First™ approach as:
The Machine First™ approach allows technology the first right of refusal to sense, understand, decide, and respond in a robust networked environment equipped with analytics and AI, with the learning platform enabling superior quality
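
To tie the ML terms defined earlier to something concrete, here is a minimal sketch of the age/height scenario in plain Python. The data values are invented purely for illustration, and the helper names fit_line and predict are hypothetical, not part of any library. The algorithm used is ordinary least squares for a single predictor (simple linear regression), the model is the fitted slope and intercept, and the data is split into training and testing portions.

```python
# Hypothetical toy data, made up for illustration only.
# The points lie exactly on height = 6 * age + 80.
ages = [2, 4, 6, 8, 10, 12]              # predictor variable (years)
heights = [92, 104, 116, 128, 140, 152]  # response variable (cm)

# Training data is used to build the model; testing data is used to evaluate it.
train_x, test_x = ages[:4], ages[4:]
train_y, test_y = heights[:4], heights[4:]


def fit_line(xs, ys):
    """Algorithm: ordinary least squares with one predictor.

    Returns the ML model as a (slope, intercept) pair.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Use the trained model to predict the response for a new predictor value."""
    slope, intercept = model
    return slope * x + intercept


# Build the model from training data only.
model = fit_line(train_x, train_y)

# Evaluate the model on testing data it has not seen.
predictions = [predict(model, x) for x in test_x]
mean_abs_error = sum(abs(p - y) for p, y in zip(predictions, test_y)) / len(test_y)

print("model (slope, intercept):", model)
print("mean absolute error on testing data:", mean_abs_error)
```

Because the toy data is perfectly linear, the error on the testing data is essentially zero. Real measurements would be noisy, and the testing error would show how well the learned pattern generalizes.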

