Neuroscience needs a research-video archive
Video data are enormously useful and growing rapidly, but the field lacks a searchable, shareable way to store them.
Have you seen “Planet Earth II”? Specifically, the scene in which a sloth swims among island mangrove trees in search of a friend? I often mention this video when talking to graduate students or recruits to my lab because, for me, it prompts so many questions: How good is sloth hearing? What sorts of vocalizations do they respond to and produce? How does a sloth learn to swim? Is it an innate behavior, or do sloth parents teach their babies how to swim, like otter parents do? There’s nothing about it on PubMed, as far as I can tell.
I don’t study sloths, but I think this clip exemplifies both the power and the challenges of video as a data type. We don’t need a p-value to assert that this sloth is a pretty good swimmer. But imagine the technology and patience required to obtain these few remarkable minutes of footage. How much other footage of sloth behavior or ecosystem dynamics is out there but difficult to find and therefore unexamined by researchers?
As recording methods and automated tools to analyze video become more accessible and easier to use, the volume of video data in all fields is growing rapidly. Documentary-style continuous video recordings could transform neuroscience and clinical practice, but only if video can be appropriately and securely stored, curated, shared and reused.
Several robust research data archives, both general-purpose and format-specific, already exist, such as Zenodo and the Distributed Archives for Neurophysiology Data Integration (DANDI). There, neurophysiological recording data are straightforward to store and can live within a useful and mature “data ecosystem,” which includes visualization and analytical tools. (Data storage plans are essentially mandated by the U.S. National Institutes of Health.)
But video data are different. To my knowledge, only one at-scale research-video repository exists: “Databrary” hosts research videos in human developmental science from more than 700 labs internationally. This is an amazing resource, but it was designed for developmental psychology datasets and at present is too small in terms of storage space to accommodate the needs of the animal-behavior community. In neuroscience, we need our own long-term archive for research videos. We have an obligation to try to extract and learn as much as we can from our untapped video datasets.
Video is the only data format that can capture the essence of individual experience and the state of an organism in terms of observable changes in behavior. For human data, video can document developmental processes such as learning to walk (or fall) and the richness of a social encounter from different simultaneous perspectives. For nonhuman animals, video data can uniquely define “typical” versus impaired behavior, which can help resolve, for example, questions about whether and how psychiatric or developmental impairments can manifest in nonhuman species. For instance, do the gene variants implicated in autism affect the life of a mouse or marmoset in some way?
It is now feasible to record continuously from multiple camera angles over an animal’s entire lifespan, and even across generations. Such 24/7 longitudinal data make it possible for translational neuroscience studies to meaningfully connect with the human condition.
Properly storing, easily accessing and reusing this sort of video data calls for a dedicated archive, for at least three main reasons. First, video data are everywhere. It is easy to collect video; anyone can do it with a smartphone. (Our own days-long recordings of mouse maternal behavior started with a cellphone taped to a tripod stand above a mouse cage.) Large-scale video datasets, from tens of hours to hundreds of thousands of hours, are becoming increasingly common in behavioral neuroscience.
Second, video file sizes can be enormous. For this reason, storage and curation present special challenges, such as identifying time periods or regions of interest buried in long video streams.
Third, video data hold enormous potential for reuse: The footage is often tangential to the purposes of an original study but could be mined for other analyses. A one-hour recording of place preference in a standard three-chamber assay to phenotype a mouse includes social interactions of high dimensionality, offering windows into foraging, grooming, gait, head positioning, eye gaze, sniffing, vocal production, and experiences and interactions with the cage environment. Many variables not relevant for the lab that collected the data could be extremely useful to others—for example, to deep phenotype a strain of model animals for a neurological or neuropsychiatric condition.
Long-term, multi-camera continuous video monitoring provides more than just “behavior-omics”; it can also be used for hypothesis testing. As an example, we monitored mouse mothers over four consecutive litters, which required 24/7 data collection over extended durations, to see if the quality of maternal care improved or became less variable from litter to litter (it did not). Incidentally, we discovered surprising midwife-like helping behavior, by which female mice help other females give birth during difficult labors.
What’s more, these videos contain a complete life history of the pups from birth to weaning. Although my lab doesn’t study infant development, other labs might find these data useful to document the development of mouse behavioral repertoires. This kind of full dataset is required to ask if and how neuropsychiatric or other conditions might manifest in various genetic mutant mouse lines.
The biggest challenge to creating dedicated video repositories is cultural: We need to protect the safety and security of human participants—and of researchers, who may face attacks by animal-welfare activists. YouTube and other wonderful video repositories have fantastic search functions and ease of streaming. But such open-access video archives are unworkable for the research community.
Databrary has been hosting research videos for more than a decade without incident, providing a workable model of how such a video repository could serve neuroscience research. Its policy framework and legal counsel ensure regulated and responsible video sharing among host and contributor institutions, and it checks the academic credentials of potential users and contributors before granting access. Under this model, the Databrary repository houses more than 100,000 hours of video of human research participants.
Neuroscience and animal behavior research would greatly benefit from a video archive. And a central, secure repository for hosting and sharing videos of different species would greatly facilitate new discoveries, especially as many of us begin to scale up our data-collection efforts to study the real-world behaviors of multiple animals interacting in complex ecosystems.
Robert Froemke
NYU Grossman School of Medicine