VERACITY
Veracity concerns the truthfulness of data, or data
integrity. Data is a highly prized asset, and organizations
go to great lengths to ensure that their data is accurate
and not corrupted in any way. For this reason, it is
becoming more important to track data lineage,
or the lifecycle of data, including when and where data
originated (its provenance), its ongoing history (how
it changed and by whom), its retention (how long it
should be kept available), and its relevance (what data
provides the best answer). Additionally, organizations
must have strong data governance policies in place
to control users’ access to the data at a granular level.
These issues are all becoming more important for audit
and compliance reasons, in addition to enabling
more advanced analytics.

“Taken as a whole, the ‘Vs’ of big data can be summed up in one
truth: today’s data is big, fast, complex, and changing.”

[2] Khan et al. Big Data: Survey, Technologies, Opportunities, and Challenges. Scientific World Journal, 2014 <PMC4127205/#B53>
[3] New Vantage Partners. Big Data Executive Survey: Themes & Trends. 2012 <-Themes-Trends.pdf>
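
The lineage attributes described under Veracity (provenance, ongoing history, and retention) can be pictured as metadata stored alongside a dataset. The following is a minimal sketch only, not a description of any particular product: the class names `ChangeEvent` and `LineageRecord`, their fields, and the sample values are all hypothetical and chosen for illustration.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone
from typing import List


@dataclass
class ChangeEvent:
    """One history entry: when the data changed, by whom, and how."""
    changed_at: datetime
    changed_by: str
    description: str


@dataclass
class LineageRecord:
    """Hypothetical lineage metadata kept alongside a dataset."""
    source_system: str      # provenance: where the data originated
    created_at: datetime    # provenance: when the data originated
    retention: timedelta    # how long the data should be kept available
    history: List[ChangeEvent] = field(default_factory=list)

    def record_change(self, changed_by: str, description: str,
                      when: datetime) -> None:
        """Append a change to the ongoing history."""
        self.history.append(ChangeEvent(when, changed_by, description))

    def is_expired(self, now: datetime) -> bool:
        """Return True once the retention period has elapsed."""
        return now > self.created_at + self.retention


# Illustrative values only.
created = datetime(2014, 1, 1, tzinfo=timezone.utc)
record = LineageRecord("crm_export", created, timedelta(days=365))
record.record_change("analyst_a", "normalized state names",
                     created + timedelta(days=10))
```

A record like this answers the audit questions raised above (where the data came from, who changed it, and whether it is still within its retention period), though a real governance system would also need access controls, which this sketch leaves out.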
VARIABILITY
Variability refers to the variations in meaning that
data can have depending on context. Brian Hopkins,
a Principal Analyst at Forrester, defined it as “the
variability of meaning in natural language and how to
use Big Data technology to solve them.” [4] One example
is the word “sub”: does it refer to a naval submarine or
to a Subway® sandwich? The problem extends beyond
natural language. Users and data modelers also differ
in how they describe basic entities. For example, the
state of “North Carolina” sometimes appears as
“N Carolina” or simply “NC.” How would a database
know that these refer to the same thing, or what the
concept of a “state” even is? People easily decipher
meaning from context, but databases struggle with
these semantic challenges. More data brings more
variation in how people, places, and things are
described, which amplifies the problem.
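
One common way to handle the “North Carolina” case is to map every known variant to a single canonical name before the data is loaded or compared. The following is a minimal sketch under stated assumptions: the alias table `STATE_ALIASES`, the helper `_key`, and the function `canonical_state` are hypothetical names, and the table holds only the variants mentioned above plus one punctuated form. It addresses spelling variants only; it does not resolve contextual ambiguity such as the two meanings of “sub.”

```python
import re
from typing import Optional

# Hypothetical alias table: canonical name -> variants seen in the data.
STATE_ALIASES = {
    "North Carolina": ["North Carolina", "N Carolina", "N. Carolina", "NC"],
}


def _key(text: str) -> str:
    """Lowercase the text and collapse punctuation and whitespace runs."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


# Reverse index from normalized variant to canonical name.
_LOOKUP = {
    _key(alias): canonical
    for canonical, aliases in STATE_ALIASES.items()
    for alias in aliases
}


def canonical_state(raw: str) -> Optional[str]:
    """Return the canonical state name, or None if the variant is unknown."""
    return _LOOKUP.get(_key(raw))
```

Returning `None` for unknown variants, rather than guessing, keeps unmatched values visible so that someone can review them and extend the alias table.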
DROWNING IN A SEA OF COMPLEXITY
Today’s world of Big Data should be an enormous
opportunity, but all too often it is seen as yet
another challenge. IT departments spend most
of their time keeping their heads above water, and if
they do not tackle the complexity head on, it has the
potential to sink the whole enterprise.

[4] Hopkins, Brian. “Blogging From the IBM Big Data Symposium - Big Is More Than Just Big,” 2011 <from_the_ibm_big_data_symposium_big_is_more_than_just_big>

