A data scientist’s role centres on collecting, cleaning and analysing data to distil valuable business insights that can be used to improve company decision-making. Their usual responsibilities include gathering and processing consumer and market data, building infrastructure to hold and organize accumulated information and presenting analyses to company decision-makers in an easy-to-understand way. Everyday tasks also include building dashboards, writing reports, visualizing data and cleaning and processing information.

The last is particularly important; there’s an old truism that data scientists spend 80 percent of their time cleaning and collecting data and only 20 percent performing actual analysis. Without a clean, organized data set, data scientists run the risk of unearthing misleading patterns and drawing mistaken conclusions. While that remains true to this day, researchers at IBM note that the organizational component may shrink as AI automation becomes more deeply integrated into the data science field.

The Data Science lifecycle

The data science lifecycle (also called the data science pipeline) includes anywhere from five to sixteen (depending on whom you ask) overlapping, continuing processes. The processes common to just about everyone’s definition of the lifecycle include the following:

Capture: This is the gathering of raw structured and unstructured data from all relevant sources via just about any method, from manual entry and web scraping to capturing data from systems and devices in real time.