Basic Concepts of Data Science: Technical Concepts Every Beginner Should Know

Data Science is the field that extracts meaningful insights from data using programming skills, domain knowledge, and mathematical and statistical knowledge. It helps to analyze raw data and find hidden patterns.

Therefore, a person should be clear on statistics concepts, machine learning, and a programming language such as Python or R to be successful in this field. In this article, I will share the basic Data Science concepts that one should know before transitioning into the field.

Whether you are a beginner in the field, want to explore more about it, or want to transition into this multifaceted field, this article will help you understand Data Science better by exploring the basic Data Science concepts.

Statistics Concepts Needed for Data Science

Statistics is a central part of data science. It is a broad field with many applications, and data scientists must know it well because statistics helps to interpret and organize data. Descriptive statistics and knowledge of probability are must-know data science concepts.

Below are the basic statistics concepts that a Data Scientist should know:

1. Descriptive Statistics

Descriptive statistics help to analyze raw data and find its primary and necessary features. They offer a way to summarize and visualize the data so that it is presented in a readable and meaningful way, for example in the form of plots. Inferential statistics, on the other hand, use a sample of the data to draw conclusions about the larger population it came from.

2. Probability

Probability is the mathematical branch that determines the likelihood of occurrence of any event in a random experiment. Examples include the probability of getting heads on a coin toss or of drawing a red ball from a bag of colored balls. Probability is a number whose value lies between 0 and 1. The higher the value, the more likely the event is to happen.

There are different types of probability, depending upon the type of event. Independent events are two or more events where the occurrence of one does not affect the probability of the others. Conditional probability is the probability of an event occurring given that another, related event has already occurred.
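To make independent events and conditional probability concrete, here is a minimal Python sketch that enumerates every outcome of two small experiments. The bag contents (3 red and 2 blue balls) and all variable names are assumptions chosen for illustration; they do not come from the article.

```python
from fractions import Fraction
from itertools import permutations, product

# Independent events: two tosses of a fair coin.
tosses = list(product("HT", repeat=2))  # 4 equally likely outcomes
p_first_heads = Fraction(sum(t[0] == "H" for t in tosses), len(tosses))
p_second_heads = Fraction(sum(t[1] == "H" for t in tosses), len(tosses))
p_both_heads = Fraction(sum(t == ("H", "H") for t in tosses), len(tosses))

# For independent events, P(A and B) equals P(A) * P(B).
is_independent = p_both_heads == p_first_heads * p_second_heads

# Conditional probability: draw two balls WITHOUT replacement
# from a hypothetical bag of 3 red and 2 blue balls.
bag = ["red"] * 3 + ["blue"] * 2
draws = list(permutations(range(len(bag)), 2))  # ordered pairs of distinct balls
first_red = [d for d in draws if bag[d[0]] == "red"]
both_red = [d for d in first_red if bag[d[1]] == "red"]

p_first_red = Fraction(len(first_red), len(draws))
# P(second red | first red) = P(both red) / P(first red)
p_second_red_given_first_red = Fraction(len(both_red), len(first_red))

print("P(both heads) =", p_both_heads)
print("Tosses independent?", is_independent)
print("P(first red) =", p_first_red)
print("P(second red | first red) =", p_second_red_given_first_red)
```

In this sketch the coin tosses do not influence each other, while removing a red ball from the bag lowers the chance that the next ball is red, which is why the conditional probability differs from the probability of the first draw.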
3. Dimensionality Reduction

Dimensionality reduction means reducing the number of dimensions (features) of a data set so that it resolves many problems that do not exist in lower-dimensional data. These problems arise because a high-dimensional data set has many factors, and scientists need to collect more samples to cover every combination of features.
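The growth in required samples can be sketched with a short Python example. It assumes, purely for illustration, that each feature takes 10 distinct values and that covering the data means seeing every combination at least once; the function name `combinations_to_cover` is hypothetical.

```python
def combinations_to_cover(num_features: int, values_per_feature: int) -> int:
    """Number of distinct feature combinations when each feature
    can take `values_per_feature` different values."""
    return values_per_feature ** num_features


# Assumed setting: every feature has 10 possible values.
for num_features in (2, 5, 10):
    total = combinations_to_cover(num_features, 10)
    print(f"{num_features} features -> {total} combinations")

# Reducing 10 features to 3 shrinks the space that samples must cover.
before = combinations_to_cover(10, 10)
after = combinations_to_cover(3, 10)
print(f"Reduction factor: {before // after}")
```

Under these assumptions the number of combinations grows exponentially with the number of features, which is one way to see why reducing dimensions makes a data set easier to cover with a realistic number of samples.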

Term: Spring
Professor: Vladimir Bulovic
Tags: Statistics, Statistical hypothesis testing, Data Scientist
