Minimal Data Standards for neuroscience articles:
Resource Identification Initiative

About the project

The neuroscience domain is full of ambiguity. In particular, searching for relevant data and literature is very difficult because there is no standard terminology that neuroscientists can rely on. The beauty and the frustration is that the same word or phrase can mean many things, and the same thing can be described in many different ways. Even expert scientific curators sometimes have difficulty extracting accurate information from an article when that information is incomplete, ambiguous, or missing.

In addition, the neuroscience domain is highly multidisciplinary, which requires scientists to keep up to date with research and developments in both the life sciences and the health sciences. At present, thousands of data repositories are relevant to the neuroscience community.

The goal of this project is to provide a means of linking the entities referenced in an article's methods section to the data available in external data repositories relevant to the neuroscience domain. These entities will be derived directly from the original article, either from unique identifiers provided by the authors in accordance with the "minimal data standards" recommendations listed below, or by extraction from the article using text-mining services.

This project is conducted in collaboration with the Neuroscience Information Framework (NIF) and the Force11 Resource Identification Initiative.

Recommendations for authors

We recommend that authors reference relevant accession numbers and identifiers in their articles. Some examples are shown below, each linked to metadata about the resource:

Gene (recommendation 1): GenBank: NM_002046
Organism (recommendation 2): MGI: 3840442
Antibody (recommendation 3): AntibodyRegistry: AB_2140114 or RRID: AB_2140114
Tool (recommendation 4): RRID: nif-0000-00280
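To illustrate how citations in this "Prefix: identifier" style could be recognized in a methods section, here is a minimal Python sketch. The prefixes are taken from the examples in this document; the regular expression, the function name find_identifiers and the sample sentence are illustrative assumptions, not the text-mining service used by the project.

```python
import re

# Prefixes taken from the examples in this document. The pattern below is an
# illustrative assumption, not an official grammar for any of these databases.
KNOWN_PREFIXES = [
    "GenBank", "NCBI Taxonomy", "MGI", "RGD", "WB Strain",
    "ZFIN", "FlyBase", "AntibodyRegistry", "RRID",
]

# Try longer prefixes first so that multi-word prefixes are matched whole.
_PREFIX_ALTERNATION = "|".join(
    re.escape(p) for p in sorted(KNOWN_PREFIXES, key=len, reverse=True)
)
CITATION_PATTERN = re.compile(
    r"\b(" + _PREFIX_ALTERNATION + r"):\s*([A-Za-z0-9_\-]+)"
)


def find_identifiers(text):
    """Return (prefix, identifier) pairs in order of appearance."""
    return [(m.group(1), m.group(2)) for m in CITATION_PATTERN.finditer(text)]


# Hypothetical methods-section sentence built from the identifiers above.
sample = (
    "The strain (MGI: 3840442) was stained with the antibody "
    "(AntibodyRegistry: AB_2140114), the gene (GenBank: NM_002046) was "
    "sequenced, and images were processed with a tool (RRID: nif-0000-00280)."
)

for prefix, identifier in find_identifiers(sample):
    print(prefix, identifier)
```

Because the identifier pattern accepts only letters, digits, underscores and hyphens, a version suffix such as ".7" after an accession number would be dropped; a stricter, per-database pattern would be needed for production use.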

Recommendation 1: GenBank accession numbers - NCBI GenBank

Please provide GenBank accession numbers for all genes and genomes referenced in the methods section of a paper (e.g., "GenBank: NM_002046").

Recommendation 2: Organism identification - MGI, RGD, WormBase, ZFIN, FlyBase, NCBI Taxonomy

Please identify the species of the study subject, and the species from which each gene product is derived, using the NCBI Taxonomy (e.g., "NCBI Taxonomy: 48184"). Identify strains using the model organism databases for mice (MGI), rats (RGD), worms (WormBase), zebrafish (ZFIN) and Drosophila (FlyBase), employing any existing unique identifiers and the correct species-specific nomenclature (e.g., "MGI: 2448567", "RGD: 67383", "WB Strain: RB877", "ZFIN: ZDB-GENO-960809-7", "FlyBase: FBgn0036925").

Recommendation 3: Reagent identification - NIF Antibody Registry

Ideally, each reagent described in the methods section of the article should have a unique identifier, because individual vendors may change their stocks over time and are not required to keep information about older reagents. We recommend using http://antibodyregistry.org to find antibody identifiers and including them (e.g., "AntibodyRegistry: AB_878537" or "RRID: AB_878537") in addition to the vendor's catalog number (e.g., "Everest Biotech Cat# EB06014"). If an antibody cannot be found at http://antibodyregistry.org, please add it here and the proper citation format will become available.

All accession numbers and identifiers that are properly referenced in online articles will be automatically converted into links to the corresponding data repositories, as shown in the example below:

GenBank linking
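As a rough illustration of what such automatic linking might involve, the following Python sketch rewrites "GenBank: <accession>" citations as HTML links. It is not the publisher's implementation: the URL template (the NCBI nucleotide record address), the simplified accession pattern and the function name link_genbank are all assumptions made for this example.

```python
import re

# Assumption: NCBI nucleotide records resolve under this URL pattern.
GENBANK_URL = "https://www.ncbi.nlm.nih.gov/nuccore/{accession}"

# Simplified, assumed accession shape: one or two capital letters, an optional
# underscore, digits, and an optional ".version" suffix (e.g. NM_002046).
GENBANK_CITATION = re.compile(r"GenBank:\s*([A-Z]{1,2}_?\d+(?:\.\d+)?)")


def link_genbank(text):
    """Replace each GenBank citation in text with an HTML link to its record."""

    def to_anchor(match):
        accession = match.group(1)
        url = GENBANK_URL.format(accession=accession)
        return 'GenBank: <a href="{0}">{1}</a>'.format(url, accession)

    return GENBANK_CITATION.sub(to_anchor, text)


print(link_genbank("Primers were designed against GenBank: NM_002046."))
```

Text without a recognizable citation is returned unchanged, so the function can be applied to whole paragraphs. Other identifier types would need their own patterns and repository URLs.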

In addition, we recommend that authors properly cite the databases from which data were obtained and the software tools used to analyze the data described in the article.

Recommendation 4: Data and software tool identification

Ideally, each database to which the described data were submitted or from which data were obtained, and each software tool used to analyze the data, should be cited (e.g., "ADNI - Alzheimer's Disease Neuroimaging Initiative, RRID: nif-0000-00516" or "SPM Anatomy Toolbox, RRID: nif-0000-10477"). A listing of databases and software tools can be found here; if a resource is not listed, please register it here.

Article enrichment via embedded applications

Two applications have been developed to give readers access to the relevant data from articles on ScienceDirect: one for sequence data and one for antibody data.

The Genome Viewer application provides functionality for viewing and analyzing sequence data of genes mentioned in articles. It scans the article and builds a list of available sequences based on NCBI GenBank accession numbers referenced in the text.
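The scanning step described above can be sketched in Python as follows. This is only a guess at the general idea, not the Genome Viewer's actual code: the pattern, the function name list_sequences and the sample text (including the second accession number) are hypothetical.

```python
import re

# Assumed citation shape: "GenBank:" followed by an accession made of letters,
# digits, underscores or dots, ending in a letter or digit so that a trailing
# sentence period is not captured.
ACCESSION = re.compile(r"GenBank:\s*([A-Za-z0-9_.]*[A-Za-z0-9])")


def list_sequences(article_text):
    """Return (accession, mention_count) pairs in order of first mention."""
    counts = {}  # dicts preserve insertion order in Python 3.7+
    for match in ACCESSION.finditer(article_text):
        accession = match.group(1)
        counts[accession] = counts.get(accession, 0) + 1
    return list(counts.items())


# Hypothetical article text; U12345 is a made-up accession for illustration.
text = (
    "Expression was normalized to GenBank: NM_002046 and compared with "
    "GenBank: U12345; primers for GenBank: NM_002046 are listed below."
)
print(list_sequences(text))
```

Counting mentions alongside the unique list is a design choice of this sketch; a viewer could use it, for example, to order sequences by how often the article refers to them.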

See article example at:
http://www.sciencedirect.com/science/article/pii/S0304394013002176  

The antibody application will show relevant information for each antibody next to the article and link to the complete record in the NIF Antibody Registry and to relevant databases outside NIF (e.g., GenBank, ZFIN). The app will also recommend relevant articles on ScienceDirect (from those annotated by NIF).

If you wish to provide feedback on the Minimal Data Standards project, or have a suggestion for another innovation that would further enhance the online article, please get in touch.