By Kris Castner, M.A., M.A., A.B.D.
. |
They say data is the new oil. What does that mean when you are evaluating which analytics can help your team or business take evidence-based action, or deciding which information sources are vital to planning your next steps?
. |
For your data to deliver lasting value, you must first identify the barriers and limitations in how it is collected, managed, and analyzed.
. |
In this blog post, we will tackle eight traits that are typical of “high-quality data.”
. |
Are your data collection, management, and analytics processes conducive to producing high-quality data? Read on to find out.
. |
As any data professional can attest, a best practice is to first establish what is meant by the term “high-quality data.”
. |
Valuable information in research or business is that which can be measured accurately over time. This measurement should be complete, collected and reported in replicable ways, and should produce the same results regardless of who performs the evaluation.
. |
What about data that is produced by a research team or business partnership?
. |
You may rightfully wonder whether additional protocols are needed when multiple people and organizations become involved in, or are already charged with, data-keeping, reporting, and analytics.
. |
Standard operating procedures for managing your datasets and results also enable you and your team to track and forecast trends more reliably.
. |
If your team of data professionals is well established, you may have minimal or no control over the standard process used to collect and react to data.
. |
For those who are just getting started or are interested in adjusting a current procedure, there is no better time than now to start improving your data quality indicators.
. |
Begin data quality evaluations by asking your team the following questions:
. |
- Are there current data quality recommendations in place, and if so, who created them?
- How were any existing data quality recommendations created?
- Is our team capable of meeting data quality requirements as currently written?
- Is our current data capable of meeting and speaking to quality requirements?
. |
High-Quality Traits 1 – 4: Take Stock of Your Data Ecosystem!
. |
One of the first steps in assessing data quality is to take stock of the data you have through a pre-screening process that surfaces any obvious areas of concern.
. |
Once you find the errors or problematic trends that are most common in your dataset, you can begin to see where your procedures should be revised.
. |
Taking stock of your data ecosystem can be likened to a three-step process involving discovery and investigation of data content, relationships, and structures.
. |
1 – Content
. |
If you have ever inherited or merged with another project, you may be familiar with the process of pre-screening for duplications, or similar fields with slightly differing names and/or categories.
. |
Any effective integration of data from multiple sources similarly begins by confirming data types, along with content and data storage locations.
. |
Later in the data evaluation process, this will make it easier to standardize information.
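As a rough illustration of this kind of pre-screening, here is a minimal Python sketch that flags exact duplicate rows and field names that look suspiciously alike. The sample records, the field names, and the 0.8 similarity cutoff are all hypothetical; treat them as placeholders to adjust for your own project.

```python
from difflib import SequenceMatcher
from itertools import combinations


def find_duplicate_rows(rows):
    """Return every row that exactly repeats an earlier row."""
    seen = set()
    duplicates = []
    for row in rows:
        key = tuple(sorted(row.items()))
        if key in seen:
            duplicates.append(row)
        else:
            seen.add(key)
    return duplicates


def find_similar_field_names(field_names, cutoff=0.8):
    """Return pairs of field names whose normalized spellings nearly match."""
    def normalize(name):
        return name.lower().replace("_", "").replace(" ", "")

    pairs = []
    for first, second in combinations(field_names, 2):
        ratio = SequenceMatcher(None, normalize(first), normalize(second)).ratio()
        if ratio >= cutoff:
            pairs.append((first, second))
    return pairs


# Hypothetical records and field names merged from two projects.
rows = [
    {"participant_id": "P01", "site": "North"},
    {"participant_id": "P02", "site": "South"},
    {"participant_id": "P01", "site": "North"},
]
fields = ["participant_id", "Participant ID", "site", "interview_date"]

duplicate_rows = find_duplicate_rows(rows)
similar_fields = find_similar_field_names(fields)
print(duplicate_rows)
print(similar_fields)
```

A pair flagged here is only a prompt for a human to look closer; two similarly named fields may still hold different content.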
. |
2 – Relationships
. |
Assuming that data from various sources are not connected, or are not telling two different sides of the same story, is a serious misstep in the data profiling process!
. |
Do not miss crucial associations between what could turn out to be winning data combinations by treating each information type, and its end goal, as though it exists within a silo.
. |
Instead, ask yourself and your team: “What story is this data telling? Can this story be pieced together with information from another source? Does this data help to arrive at our project’s goals?”
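One simple way to start answering the second question is to check how much two sources overlap on a shared identifier. The sketch below assumes both sources carry a common key (here a hypothetical `participant_id`); the record contents are invented for illustration.

```python
def key_overlap(source_a, source_b, key):
    """Summarize which key values appear in both sources or in only one."""
    keys_a = {record[key] for record in source_a}
    keys_b = {record[key] for record in source_b}
    return {
        "shared": sorted(keys_a & keys_b),
        "only_in_a": sorted(keys_a - keys_b),
        "only_in_b": sorted(keys_b - keys_a),
    }


# Hypothetical survey and interview records keyed by participant.
surveys = [
    {"participant_id": "P01", "score": 4},
    {"participant_id": "P02", "score": 2},
    {"participant_id": "P03", "score": 5},
]
interviews = [
    {"participant_id": "P02", "theme": "access"},
    {"participant_id": "P04", "theme": "cost"},
]

overlap = key_overlap(surveys, interviews, "participant_id")
print(overlap)
```

Participants who appear in both sources are the ones whose stories can be pieced together; those who appear in only one may point to a gap in collection.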
. |
3 – Structures
. |
A dataset with ideal structure is one that can be navigated with relative ease. Organization should be intuitive with respect to media naming systems, file locations, and archiving processes.
. |
Beyond outlining the process above, write down all procedures for these tasks as standard operating steps and share them with every member working with or on project datasets.
. |
One benefit of establishing a database structure is that one can easily identify missteps in data curation soon after they occur.
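For instance, once a naming system is written down, a short script can catch files that stray from it. The convention below (`project_YYYY-MM-DD_description.ext`) and the file names are purely hypothetical; substitute whatever pattern your team has agreed on.

```python
import re

# Hypothetical convention: project_YYYY-MM-DD_description.ext (lowercase only).
NAME_PATTERN = re.compile(r"^[a-z0-9]+_\d{4}-\d{2}-\d{2}_[a-z0-9-]+\.[a-z0-9]+$")


def nonconforming_names(file_names):
    """Return the file names that do not follow the agreed naming pattern."""
    return [name for name in file_names if not NAME_PATTERN.match(name)]


# Hypothetical file listing from a shared project folder.
files = [
    "study1_2022-03-14_interview-01.mp3",
    "Interview 2 FINAL.mp3",
    "study1_2022-03-15_fieldnotes.docx",
]

strays = nonconforming_names(files)
print(strays)
```

Note that the pattern only checks the shape of the name; it does not confirm that the date in the name is a real calendar date.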
. |
4 – Accuracy
. |
Once you have a feel for which media and files are contained within your dataset, as well as the relationships that currently define how information from one item pertains to another, you can move on to investigating database accuracy.
. |
Questions to ask regarding accuracy can include whether there are standardized procedures for refreshing and collecting data, whether systematic naming and filing are in place, and whether randomized checks for accuracy are built into project planning.
. |
This can be a particularly crucial step if you funnel in information from outside or third-party sources, like analytic apps that pull data from individual platforms.
. |
These sources can require regular configuration checks to ensure data flow is not interrupted. You can also usually enable notifications in case errors arise during your project.
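A randomized accuracy check can be as simple as drawing a handful of records for a team member to verify by hand. The sketch below is one possible approach, not a prescribed method; the record IDs, the sample size, and the seed are hypothetical.

```python
import random


def draw_spot_check_sample(record_ids, sample_size, seed=None):
    """Pick a random subset of records for manual review.

    Passing a seed makes the draw repeatable, so a colleague can
    reproduce exactly which records were selected.
    """
    rng = random.Random(seed)
    pool = list(record_ids)
    size = min(sample_size, len(pool))
    return sorted(rng.sample(pool, size))


# Hypothetical record IDs: REC-001 through REC-050.
record_ids = [f"REC-{number:03d}" for number in range(1, 51)]
sample = draw_spot_check_sample(record_ids, 5, seed=2022)
print(sample)
```

Recording the seed alongside the results of each check keeps the review itself replicable.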
. |
High-Quality Traits 5 – 7: Is There an Expiration Date for Data?
. |
5 – Timeliness
. |
When assessing data freshness, you should consider the scope of your project, your goals, and key performance measures. You may want to prolong or shorten the amount of time you keep data for the purpose of driving analytics and decision making.
. |
Although archiving data and storing regular reports is considered best practice, if storage availability is an issue, you should establish how long it is reasonable to house old information.
. |
After this time, there should be a standard procedure to ensure that any potentially interested parties have final access to the data before it is permanently discarded.
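As a minimal sketch of such a procedure, the snippet below lists records that have outlived a retention window so they can be offered to interested parties before anything is deleted. The 365-day window, the `collected_on` field, and the sample records are assumptions made for illustration.

```python
from datetime import date, timedelta


def records_past_retention(records, retention_days, today):
    """Return records collected before the retention cutoff date."""
    cutoff = today - timedelta(days=retention_days)
    return [record for record in records if record["collected_on"] < cutoff]


# Hypothetical records with their collection dates.
records = [
    {"name": "2021 intake survey", "collected_on": date(2021, 6, 30)},
    {"name": "2022 follow-up survey", "collected_on": date(2022, 1, 15)},
]

expired = records_past_retention(records, retention_days=365, today=date(2022, 11, 1))
print([record["name"] for record in expired])
```

The function only reports candidates for removal; the decision to discard, and the notice to interested parties, remain with your team.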
. |
For a single reporting or project timeframe, you should also double- and triple-check that all parameters across sources are configured to show the same period.
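One way to make that check routine is to compare each source's configured date range against the period you intend to report on. The source names and dates below are hypothetical.

```python
from datetime import date


def sources_off_period(source_periods, expected_period):
    """Return the sources whose (start, end) dates differ from the expected period."""
    return {
        name: period
        for name, period in source_periods.items()
        if period != expected_period
    }


# Hypothetical reporting period and per-source date settings.
expected = (date(2022, 7, 1), date(2022, 9, 30))
configured = {
    "web_analytics": (date(2022, 7, 1), date(2022, 9, 30)),
    "survey_export": (date(2022, 7, 1), date(2022, 9, 30)),
    "social_dashboard": (date(2022, 6, 1), date(2022, 9, 30)),
}

off_period = sources_off_period(configured, expected)
print(off_period)
```

Here only the hypothetical `social_dashboard` source would be flagged, because its start date is a month early.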
. |
6 – Reasonableness
. |
As with the data cleaning process in research, if something stands out to you, that is enough to warrant further investigation.
. |
If you can, try to assess whether the concern you spotted appears to be random or systematic; if the latter, you could have larger issues with data integrity.
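A quick, informal way to tell the two apart is to flag implausible values and then count them by source: flags scattered evenly suggest random error, while flags clustered in one source hint at a systematic problem. The plausible range, field names, and records below are hypothetical.

```python
from collections import Counter


def flag_out_of_range(records, field, low, high):
    """Return records whose value for `field` falls outside [low, high]."""
    return [record for record in records if not (low <= record[field] <= high)]


def flagged_by_source(flagged_records):
    """Count flagged records per source to reveal clustering."""
    return Counter(record["source"] for record in flagged_records)


# Hypothetical participant ages gathered through two channels.
records = [
    {"source": "mobile_app", "age": 34},
    {"source": "mobile_app", "age": 29},
    {"source": "import_tool", "age": 0},
    {"source": "import_tool", "age": 0},
    {"source": "import_tool", "age": 41},
    {"source": "import_tool", "age": -1},
]

flagged = flag_out_of_range(records, "age", low=18, high=100)
counts = flagged_by_source(flagged)
print(counts)
```

In this made-up example every flagged record comes from the same source, which is the pattern that would justify a closer look at how that source is configured.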
. |
To protect the credibility of reporting in addition to its accuracy, assign checkpoints or editorial handoffs among team members that increase the number of people (and eyes!) reviewing critical reports, analytics, and data.
. |
7 – Identifiability
. |
If you and your team are used to working in silos to complete most of your work, chances are high that you are storing the same data files in more than one place.
. |
To deal with the inconsistency and the potential for fragmented measurements that can result from silo dynamics, ask your team and any members in charge of handling data to supply a list of every platform they currently use throughout the workday.
. |
Consolidating the tools your project or business uses, based on how well they handle regular tasks and outcome reporting, should cut down on superfluous analytics options.
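If you suspect the same files are living in several places, comparing file contents by hash is one way to confirm it. The sketch below builds a throwaway folder with hypothetical files so it can run on its own; in practice you would point `group_identical_files` at your real storage locations.

```python
import hashlib
import tempfile
from collections import defaultdict
from pathlib import Path


def group_identical_files(paths):
    """Group files whose contents are byte-for-byte identical."""
    by_digest = defaultdict(list)
    for path in paths:
        digest = hashlib.sha256(Path(path).read_bytes()).hexdigest()
        by_digest[digest].append(str(path))
    return [sorted(group) for group in by_digest.values() if len(group) > 1]


# Build a temporary, hypothetical folder layout for demonstration only.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "shared_drive").mkdir()
    (root / "laptop").mkdir()
    (root / "shared_drive" / "survey.csv").write_text("id,score\n1,4\n")
    (root / "laptop" / "survey_copy.csv").write_text("id,score\n1,4\n")
    (root / "laptop" / "notes.txt").write_text("follow up with site lead\n")

    all_files = [path for path in root.rglob("*") if path.is_file()]
    duplicate_groups = group_identical_files(all_files)
    duplicate_names = [[Path(p).name for p in group] for group in duplicate_groups]

print(duplicate_names)
```

Matching hashes show that two files hold the same bytes; deciding which copy becomes the single source of record is still a team decision.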
. |
High-Quality Trait #8: Accessing the “Big Data Picture”
. |
8 – Accessibility
. |
They say that tools are only as effective as those who wield them. In the same vein, one might say that high-quality data is only useful when those who need it can access it!
. |
If you currently have privileges to view reporting and analytics, ask yourself:
. |
- “How long does it take for reporting and analytics to be produced?”
- “Can I access these on demand myself? Do others prepare the information for me?”
- “Who has login rights to the platforms needed to produce reports and access data?”
. |
As can be gleaned from the questions above, even hypothetically “perfect” data becomes useless if you cannot access it quickly when needed!
. |
TL;DR – Develop a Data Quality Checklist You Can Stay Loyal To
. |
As any team member can attest, coming into a project environment with standardized operating procedures for ensuring and enhancing data quality makes everything simpler for everyone.
. |
Adhering to the eight data quality standards outlined above helps to minimize your risk of making awkward assumptions about what you can do with your data.
. |
An effective quality checklist also protects against project members bringing in more data sources than you can keep track of, or organizing the same data according to multiple layout options.
. |
Get Social with Dedoose – Follow, Like, and Share
. |
Fan of #academicchatter on Twitter? Always surfin’ the ‘gram? Whether you’re on Reddit, LinkedIn, YouTube, Facebook, or Pinterest, you will be pleased to know we have accounts there, too.
. |
Never miss a beat when it comes to finding us in the community (such as the upcoming APHA 2022 in Boston, MA), or discovering fellow Dedoosers in our growing research community.
. |
(Added bonus: Bruce the Moose in your feed!)
. |
Speak to a Human – Contact a Support Specialist
. |
If questions arise that are too complicated to cover via social media, or you need the help of one of our friendly Support Specialists, contact us at: support@dedoose.com.
. |
As an FYI, our personable team works “in the office” Monday through Friday according to Pacific Time. But you can email us questions, comments, or concerns anytime.