What makes for good data?
Every time we run an experiment, we decide which data to keep and which to discard based on a variety of criteria. This is an important step: we want to be sure the data we use are high quality. However, the decisions we make as scientists always rest on underlying assumptions about how a system or technique works. In some areas, determining the “ground truth” is straightforward, but in human structural MRI the ground truth is elusive. Head motion can interfere with data collection and introduce artifacts, and children and populations with neuropsychiatric disorders often move a great deal in the scanner. This is a crucial issue to address if we want our populations of interest to provide high-quality data.
In most cases, humans rate the usability of structural MRIs against a set of criteria, and interrater reliability is good but not perfect: some raters will classify a marginal scan as usable while others will not. This error can manifest in a dataset in two obvious ways: first, by eliminating usable data and shrinking the sample; second, by including data that could yield spurious conclusions. Neither is good for scientific inquiry.
In a new study from Adon Rosen in Theodore Satterthwaite’s lab, the authors quantify the variation between scans to build a data-driven quality assessment for structural MRI. After extensive testing, they settled on the Euler number computed for each scan, and used this quantity to divide scans into usable and unusable categories. This value summarizes the reconstruction errors present in a scan, giving the reader an easy-to-understand metric of dataset quality.
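For readers unfamiliar with the metric: the Euler characteristic of a triangulated surface is χ = V − E + F (vertices minus edges plus faces). A topologically clean, sphere-like surface has χ = 2, and each defect in the reconstruction lowers it, so more-negative values indicate lower scan quality. The sketch below computes χ for a toy mesh; the tetrahedron is an illustrative stand-in, not data from the study.

```python
# Sketch: Euler characteristic (chi = V - E + F) of a triangle mesh.
# A defect-free, sphere-like surface has chi = 2; topological defects
# (handles, holes) in a reconstruction drive chi down.

def euler_characteristic(faces):
    """chi = V - E + F for a mesh given as vertex-index triples."""
    vertices = {v for tri in faces for v in tri}
    edges = {frozenset(pair) for tri in faces
             for pair in ((tri[0], tri[1]), (tri[1], tri[2]), (tri[0], tri[2]))}
    return len(vertices) - len(edges) + len(faces)

# A tetrahedron: 4 vertices, 6 edges, 4 faces -> chi = 2 (sphere topology)
tetra = [(0, 1, 2), (0, 1, 3), (0, 2, 3), (1, 2, 3)]
print(euler_characteristic(tetra))  # → 2
```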
This is great – but what do we do with it?
Adon would like everyone to include the mean Euler number as a covariate when reporting cortical thickness, gray matter density, cortical volume, or any other measure derived from structural MRI of the human brain.
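In practice this just means adding one column to the design matrix of a group analysis. The toy example below, with entirely synthetic data and illustrative variable names, shows why it matters: thickness here is driven only by image quality, so a naive group comparison finds a spurious difference that vanishes once the Euler number is covaried. This is a minimal sketch of the general idea, not the paper's analysis pipeline.

```python
# Sketch: mean Euler number as a covariate in a group analysis.
# All data are synthetic; in this toy, "thickness" depends only on
# scan quality (Euler number), and group 1 simply moves more.

def ols(X, y):
    """Ordinary least squares via normal equations + Gaussian elimination."""
    n, p = len(X), len(X[0])
    xtx = [[sum(X[i][a] * X[i][b] for i in range(n)) for b in range(p)]
           for a in range(p)]
    xty = [sum(X[i][a] * y[i] for i in range(n)) for a in range(p)]
    for col in range(p):  # forward elimination with partial pivoting
        piv = max(range(col, p), key=lambda r: abs(xtx[r][col]))
        xtx[col], xtx[piv] = xtx[piv], xtx[col]
        xty[col], xty[piv] = xty[piv], xty[col]
        for r in range(col + 1, p):
            f = xtx[r][col] / xtx[col][col]
            for c in range(col, p):
                xtx[r][c] -= f * xtx[col][c]
            xty[r] -= f * xty[col]
    beta = [0.0] * p
    for r in range(p - 1, -1, -1):  # back substitution
        beta[r] = (xty[r] - sum(xtx[r][c] * beta[c]
                                for c in range(r + 1, p))) / xtx[r][r]
    return beta

# (group, mean_euler, thickness); thickness = 2.66 + 0.003 * euler exactly,
# so the true group effect is zero despite very different group means.
subjects = [
    (0, -20, 2.600), (0, -25, 2.585), (0, -18, 2.606), (0, -30, 2.570),
    (1, -80, 2.420), (1, -95, 2.375), (1, -70, 2.450), (1, -110, 2.330),
]
X = [[1.0, g, e] for g, e, _ in subjects]  # intercept, group, Euler covariate
y = [t for _, _, t in subjects]
b0, b_group, b_euler = ols(X, y)
print(f"group effect adjusted for Euler number: {b_group:.3f}")  # ≈ 0.000
```

Without the covariate, the raw group difference in this toy is about −0.20 mm of apparent thickness, all of it attributable to scan quality; with the covariate, the group effect is zero.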
Adon Rosen and his colleagues have developed an elegant and useful way to understand data that does not have a clear ground truth value or metric, and I hope it is widely adopted by the field.
Adon Rosen and his colleagues have published this work on bioRxiv.
Annual Meeting Blogger