VERACITY

Veracity has to do with the truthfulness of data, or data integrity. Data is a highly prized asset, and organizations go to great lengths to ensure that their data is accurate and not corrupted in any way. For this reason, it is becoming more important to track data lineage, or the lifecycle of data, including when and where the data originated (its provenance), its ongoing history (how it changed and by whom), its retention (how long it should be kept available), and its relevance (what data provides the best answer). Additionally, organizations must have strong data governance policies in place to control users’ access to the data at a granular level. These issues are all becoming more important for audit and compliance reasons, in addition to providing the ability to run more advanced analytics.

“Taken as a whole, the ‘Vs’ of big data can be summed up in one truth: today’s data is big, fast, complex, and changing.”

VARIABILITY

Variability refers to the variations in meaning that data can have depending on context. Brian Hopkins, a Principal Analyst at Forrester, defined it as “the variability of meaning in natural language and how to use Big Data technology to solve them.” 4 One example is the word “sub”: does it refer to a naval submarine or to a Subway® sandwich?

This problem is about more than natural language. There are also differences in how users and data modelers describe basic entities. For example, the state of “North Carolina” sometimes appears as “N Carolina” or simply “NC.” How would a database know that these all refer to the same thing, or what the concept of a “state” even is? People have an easy time deciphering meaning from context, but databases have difficulty with these semantic challenges. More data brings more variation in how people, places, and things are described, which amplifies the problem.

DROWNING IN A SEA OF COMPLEXITY

Today’s world of Big Data should be an enormous opportunity, but all too often it is seen as just another challenge. IT departments spend most of their time keeping their heads above water, and if they do not tackle the complexity head on, it has the potential to sink the whole enterprise.

2 Khan et al. Big Data: Survey, Technologies, Opportunities, and Challenges. Scientific World Journal, 2014 < PMC4127205/#B53>
3 New Vantage Partners. Big Data Executive Survey: Themes & Trends. 2012 <- Themes-Trends.pdf>
4 Hopkins, Brian. “Blogging From the IBM Big Data Symposium - Big Is More Than Just Big,” 2011 < from_the_ibm_big_data_symposium_big_is_more_than_just_big>