Identifiers & CURIEs

GA4GH Recommendation

GA4GH recommends using CURIEs as (external) identifiers.

General Use of Identifiers in GA4GH Standards

CURIEs

CURIEs ("Compact URIs") are namespace-scoped identifiers that can be expanded to Internationalized Resource Identifiers (IRIs). A CURIE is composed of two components, a prefix and a reference, separated by a colon symbol :. CURIEs are case sensitive, although for prefixes this is not followed consistently in practice.
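The prefix/reference structure and the expansion to an IRI can be sketched in Python as follows. This is a minimal illustration and not a GA4GH reference implementation: the names Curie, parse_curie, expand_curie and PREFIX_MAP are hypothetical, and the prefix map holds a single illustrative entry (the OBO PURL base for HP). A real application would load its own prefix registry.

```python
from typing import NamedTuple


class Curie(NamedTuple):
    prefix: str
    reference: str


def parse_curie(curie: str) -> Curie:
    """Split a CURIE into prefix and reference at the first colon."""
    prefix, sep, reference = curie.partition(":")
    if not sep or not prefix or not reference:
        raise ValueError(f"not a CURIE: {curie!r}")
    return Curie(prefix, reference)


# Illustrative prefix map (an assumption); lookups are case sensitive.
PREFIX_MAP = {"HP": "http://purl.obolibrary.org/obo/HP_"}


def expand_curie(curie: str, prefix_map: dict[str, str] = PREFIX_MAP) -> str:
    """Expand a CURIE to an IRI by replacing the prefix with its base IRI."""
    prefix, reference = parse_curie(curie)
    if prefix not in prefix_map:
        raise KeyError(f"unknown prefix: {prefix!r}")
    return prefix_map[prefix] + reference


print(parse_curie("HP:0003621"))
print(expand_curie("HP:0003621"))
```

Because the lookup is case sensitive, expand_curie("hp:0003621") raises a KeyError with this map; a tool that wants to tolerate inconsistent prefix casing would have to normalize prefixes explicitly.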

The GA4GH recommendations are:

  • use only a single prefix
  • for newly generated identifiers, and specifically those in the new ga4gh namespace, avoid the underscore _ character in the private part of an identifier
    • the reason is that the colon : separator is sometimes replaced by _ in computing environments where : may be problematic
    • underscore characters in computed identifiers are an exception
  • a reasonable separator for structural elements of the private identifier part ("internal prefix") is the dot . character
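The recommendations above could be checked mechanically when minting new identifiers. The following Python sketch is one possible reading of them; the function name check_new_identifier and the example identifiers (including the internal prefix EX) are hypothetical. It does not implement the exception for computed identifiers, since that cannot be decided from the string alone.

```python
def check_new_identifier(identifier: str) -> list[str]:
    """Return the recommendations that a newly minted identifier violates."""
    problems = []
    prefix, sep, reference = identifier.partition(":")
    if not sep or not prefix or not reference:
        problems.append("expected the form prefix:reference")
        return problems
    # A second colon in the reference suggests a nested (scoped) prefix.
    if ":" in reference:
        problems.append("more than one prefix")
    # An underscore in the private part is ambiguous once ':' is replaced by '_'.
    if "_" in reference:
        problems.append("underscore in the private part")
    return problems


for candidate in ("ga4gh:EX.abc123", "ga4gh:EX_abc123", "ga4gh:sub:abc123"):
    print(candidate, check_new_identifier(candidate))
```

The first candidate uses the dot as the separator after the internal prefix and passes; the other two each trigger one of the checks.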

Example use of CURIEs in GA4GH

In GA4GH schemas, CURIEs are the recommended syntax for referencing ontology classes or external references. Usually, a CURIE as id is combined with a label that provides its text representation, as in the OntologyClass object prototype:

"onset": {
   "label" : "Juvenile onset",
   "id" : "HP:0003621"
},
"external_references": [
  {
    "id" : "cellosaurus:CVCL_0312",
    "label" : "HOS"
  }
]

The underscore in the Cellosaurus id cellosaurus:CVCL_0312 should usually not be problematic as long as the identifier is properly prefixed; however, newly designed identifiers should avoid this syntax.
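Why a properly prefixed identifier usually survives the colon-to-underscore replacement can be illustrated with a small Python round trip. The helper names and the identifier my_db:123 are hypothetical; the sketch assumes the environment simply replaces the separating colon with an underscore.

```python
def to_underscore_form(curie: str) -> str:
    """Replace the separating colon with an underscore."""
    prefix, _, reference = curie.partition(":")
    return f"{prefix}_{reference}"


def from_underscore_form(text: str) -> str:
    """Recover a CURIE by splitting at the first underscore."""
    prefix, _, reference = text.partition("_")
    return f"{prefix}:{reference}"


# The prefix contains no underscore, so the round trip is lossless.
safe = "cellosaurus:CVCL_0312"
print(from_underscore_form(to_underscore_form(safe)) == safe)

# A hypothetical prefix containing an underscore cannot be recovered.
unsafe = "my_db:123"
print(from_underscore_form(to_underscore_form(unsafe)))
```

Splitting at the first underscore only works when the prefix itself is underscore-free. A consumer that does not know this convention, or that splits at the last underscore, would misread cellosaurus_CVCL_0312, which is the ambiguity the recommendation tries to avoid.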

Further Information

Please see also a previous discussion on GitHub, and the links from there.

The ga4gh Namespace

ga4gh Prefix [1]

In a "GA4GH Namespace Discussion" telecon on 2019-08-22, initiated by GKS and attended by leads of various work streams and projects, it was agreed that newly generated identifiers created and maintained in the "GA4GH ecosystem" should use a general ga4gh prefix, and not create scoped prefixes. Details and implementation of this general concept are currently being evaluated. Extensive discussion of this can be found in the GA4GH TASC space and the VRS specification.
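As a rough sketch of what an identifier under the general ga4gh prefix might look like, the following Python example combines the prefix with a dot-separated internal type prefix and a computed digest. It assumes the truncated-digest scheme described in the VRS specification (SHA-512, truncated to 24 bytes, base64url encoded); the type prefix EX and the function names are hypothetical, and the details of the namespace are, as noted above, still being evaluated.

```python
import base64
import hashlib


def sha512t24u(blob: bytes) -> str:
    """SHA-512 digest, truncated to 24 bytes, base64url encoded."""
    digest = hashlib.sha512(blob).digest()
    return base64.urlsafe_b64encode(digest[:24]).decode("ascii")


def ga4gh_identifier(type_prefix: str, blob: bytes) -> str:
    """Build '<ga4gh prefix>:<internal prefix>.<digest>' (assumed layout)."""
    return f"ga4gh:{type_prefix}.{sha512t24u(blob)}"


# 'EX' is a made-up internal prefix used only for this illustration.
print(ga4gh_identifier("EX", b"ACGT"))
```

Since the base64url alphabet includes _ and -, digests of this kind can contain underscores, which is why computed identifiers are treated as an exception to the underscore recommendation.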


  1. Image by Alex Wagner, from