For many years we've treated data quality as a set of measurements applied to a specific set of data (Is it complete? Is it valid?) or at the intersection of two sets of data (Are they consistent?). But with the advent of Big Data, we suddenly face a deluge of data from known and unknown sources, with highly varied formats and potentially very disparate meanings and uses. Into this mix we add the human factor: the individual assumptions and biases that are built into the systems producing the original data and that carry over into the integration and interpretation of the resulting data.
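To make those traditional dimensions concrete, here is a minimal sketch of how completeness and validity might be measured on a single dataset and consistency checked across two. The datasets, column names, and pandas usage are illustrative assumptions, not part of the original text.

```python
import pandas as pd

# Hypothetical customer and order records; column names are illustrative only.
customers = pd.DataFrame({
    "customer_id": [1, 2, 3, 4],
    "email": ["a@example.com", None, "c@example.com", "not-an-email"],
})
orders = pd.DataFrame({
    "customer_id": [1, 2, 5],  # customer 5 has no matching customer record
    "amount": [100.0, 250.0, 75.0],
})

# Completeness: share of non-null values in a required field.
completeness = customers["email"].notna().mean()

# Validity: share of values conforming to an expected format.
valid_email = customers["email"].str.contains(
    r"^[^@\s]+@[^@\s]+\.[^@\s]+$", na=False
)
validity = valid_email.mean()

# Consistency: do two datasets agree where they intersect?
# Here, every order should reference a known customer.
consistency = orders["customer_id"].isin(customers["customer_id"]).mean()

print(f"completeness={completeness:.2f}, "
      f"validity={validity:.2f}, consistency={consistency:.2f}")
```

Measures like these work well when the data, its schema, and its intended use are all known in advance; the rest of this discussion concerns what happens when they are not.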

Working from the assumption that all this Big Data is supposed to yield more insight and better business decisions, how do we ensure that we can trust not only the original data, but also the derived data we're acting upon? To address this challenge, we need to consider new frontiers and dimensions of data quality, so that the Big Data we use is not only relevant and fit for purpose, but also data we can trust and act on with confidence.