Frequently Asked Questions for IRMNG


Author: Tony Rees, former CSIRO Marine and Atmospheric Research, Hobart
Last updated: 30 July 2013
----------------------------------------------------------------------------------------------------

Q: What is the purpose of IRMNG?
A: IRMNG, the Interim Register of Marine and Nonmarine Genera, exists to provide a machine- and human-queryable system that is able to answer some basic questions about organisms based on the genus component (or in around 50% of cases, the genus+species component) of their scientific name. Such questions in the first instance comprise: Additional questions that may also be answerable in a subset of cases: When initially conceived in 2006, this information was not available in a comprehensive, internally consistent form in any other system, although at some future point it may be (for example via the proposed Global Names Architecture or its components).

Q: How does IRMNG differ from other, apparently comparable biodiversity databases?
A: IRMNG aspires to maximize coverage at the level of genus across all groups, i.e. animals, plants, algae, protists, fungi, prokaryotes, and viruses, both extant and fossil, and also to include flags indicating extant/fossil and basic habitat status as above. Other comparable biodiversity databases are limited either to a single taxonomic group (e.g. Index Nominum Genericorum for plants, Index Fungorum for fungi, Catalog of Fishes for fish, etc.), or to taxa of a particular type (e.g. Paleobiology Database for fossils, World Register of Marine Species for marine species only), or to taxa from a particular geographic region (e.g. Fauna Europaea, Australian Plant Checklist, Species 2000 New Zealand). The most comparable broad-spectrum initiative (though excluding fossils) is probably the Catalogue of Life; however, this omits much detail at genus level (such as genus authorities and genus-level synonyms) and in addition presently aims for completeness at species level, so proceeds much more slowly towards completion. Another noteworthy compilation, Nomenclator Zoologicus, has excellent coverage of many zoological genus names from 1758 to approx. 2004, but omits family allocation, habitat flags, and consideration of current taxonomic validity in all but a few cases.

Q: What is the significance of "Interim" in the IRMNG context?
A: In the context of IRMNG, "Interim" indicates that this is largely a first-pass compilation of data from a wide range of sources. It may contain some internal inconsistencies and data errors, and has not received the degree of scrutiny and validation found in more authoritative, single-group sources. These aspects of IRMNG should improve over time; however, in the first instance it is deemed desirable to have a system with the range of IRMNG available for use in the interim, rather than to wait for all residual taxonomic or data issues to be resolved or for an equivalent compilation to appear from other sources.

Q: How has IRMNG been populated?
A: For obvious reasons, IRMNG draws heavily on pre-existing genus-level compilations, which in a number of cases have been generously made available to the project by their respective compilers. In approximate order of incorporation, the major sources utilised to date have been as follows:

- Parker, S.P. (ed.), 1982. Synopsis and Classification of Living Organisms. McGraw-Hill, New York. [Print source] (Initial family and higher level classification: 6,800 family names)

- The Taxonomicon & Systema Naturae 2000 online compilation, 2006 version, courtesy Dr. S. Brands, Netherlands (112,000 genus names plus additional 2,300 family names) current web address: http://taxonomicon.taxonomy.nl/

- Catalogue of Life 2006 version, incorporating contributions from over 40 GSDs (Global species databases) plus ITIS, the Integrated Taxonomic Information System, courtesy Catalogue of Life partnership (36,000 additional genus names, 2,100 additional families, 1,282,000 species names) -  current web address (latest version): http://www.catalogueoflife.org/

- Museum Victoria KEmu database (Oct 2006) (9,000 additional genus names, 900 additional families, 56,000 additional species names)

- Sepkoski, J.J., 2002. A compendium of fossil marine animal genera. Bulletins of American Paleontology, 364. Ithaca, NY (27,000 additional genus names, no families but sorted by order). Available online at http://strata.geology.wisc.edu/jack/

- Benton, M. (ed.), 1993. The Fossil Record 2. Chapman & Hall, London. (2,900 additional fossil + extant families). Spreadsheet version available online at http://www.fossilrecord.net/fossilrecord/index.html

- Index Nominum Genericorum (2007 version) for plant genera, courtesy Dr. E. Farr (35,000 additional plant genus names, 400 additional families) - current web address http://botany.si.edu/ing/

- Aphia databases maintained at VLIZ, Belgium (supporting European Register of Marine Species and 17 other region or taxon-specific databases), 2006 version, courtesy ERMS editors (3,300 additional genus names, 120 additional families, 45,000 additional species names)

- Australian Faunal Directory (October 2007 version) (9,800 additional genus names, 190 additional families, 55,000 additional species names) -  current web address http://www.environment.gov.au/biodiversity/abrs/online-resources/fauna/afd/

- Unpublished (as at 2007) Species 2000 New Zealand compilation, courtesy Dr. D. Gordon (1,800 additional genus names, 54 additional families, 10,000 additional species names)

- List of Names with Standing in Prokaryotic Nomenclature (2008 version), courtesy Dr. J-P. Euzéby (all taxonomic allocations checked, plus 450 additional prokaryote genus names, 77 additional families) current web address http://www.bacterio.cict.fr/

- Nomenclator Zoologicus (2006 electronic version), (205,000 additional genus names, 440 additional families) - current web address http://uio.mbl.edu/NomenclatorZoologicus/

- Melville, R.V. & Smith, J.D.D., (eds). Official Lists and Indexes of Names and Works in Zoology. ICZN, London. (Approx. 50% of taxonomic status information on generic names from relevant ICZN Opinions uploaded to IRMNG, covering 1,800 genera)

- Index Fungorum, electronic database and nomenclator for fungi (2009 version) (all taxonomic allocations checked, plus 1,800 additional genus names, 150 additional family names) current web address http://www.indexfungorum.org/

- GBIF taxonomy, May 2010 (incorporating Catalogue of Life 2009, Paleobiology Database and numerous other sources not otherwise consulted): upgraded taxonomic placement for 46,000 genera not previously placed to family level - current web address http://www.gbif.org/

In addition, a wide range of print sources, more recent updates (e.g. current taxonomy of fishes from FishBase, courtesy Dr. N. Bailly) and smaller electronic compilations including CAAB (Codes for Australian Aquatic Biota) and others contribute the balance of current IRMNG holdings (additional 6,000 genus names, 2,900 families, 10,000 species names).

From the above list it is also clear that other major sources exist which could potentially be utilized in IRMNG, including IPNI (http://www.ipni.org/), uBio (http://www.ubio.org/), the Paleobiology Database (http://www.paleodb.org/) and more, but these have not been used as yet, mainly for reasons of time.

Q: How complete is IRMNG at this time?
A: This is difficult to answer exactly, since no reliable estimates exist of the total numbers of names (extant and fossil, valid and synonyms) at genus, family or species level. However, the author estimates from a range of sources and guesstimates that there may be a total of 6.5-7m published species names to date, of which approximately 2.2m are valid (the latter increasing at around 25,000 per year); 500,000 published genus names, of which perhaps 250,000 are valid (increasing at around 2,500 per year); and perhaps 30,000 published family names, of which maybe 17,000 are valid, for both extant and fossil taxa. On the basis of these approximations, IRMNG currently includes "most" valid published family names plus a small subset of synonyms; "most" published genus names, both valid and non-valid (469,000 of approx. 500,000, i.e. around 94%); and only a subset of species names at this time (1.9m out of 6.5-7m, or a little over 28%, though the figure rises to around 50% if synonyms are excluded).
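The coverage percentages quoted above follow directly from the stated counts. The short sketch below simply reproduces that arithmetic; the variable and function names are illustrative only, and the "published" totals are the author's estimates rather than measured values.

```python
# Counts quoted in the answer above; the "published" totals are estimates.
GENUS_HELD = 469_000
GENUS_PUBLISHED_EST = 500_000
SPECIES_HELD = 1_900_000
SPECIES_PUBLISHED_EST_LOW = 6_500_000
SPECIES_PUBLISHED_EST_HIGH = 7_000_000


def percent(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


# Genus-level coverage: 469,000 of approx. 500,000 names.
genus_coverage = percent(GENUS_HELD, GENUS_PUBLISHED_EST)

# Species-level coverage depends on which end of the 6.5-7m estimate is used:
# the larger estimate gives the lower bound on coverage, and vice versa.
species_coverage_low = percent(SPECIES_HELD, SPECIES_PUBLISHED_EST_HIGH)
species_coverage_high = percent(SPECIES_HELD, SPECIES_PUBLISHED_EST_LOW)

print(f"Genus coverage: {genus_coverage:.1f}%")
print(f"Species coverage: {species_coverage_low:.1f}% to {species_coverage_high:.1f}%")
```

The "a little over 28%" figure in the text corresponds to the middle of the species-level range computed here.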

Q: How many homonyms / non-unique genus names are in IRMNG?
A: One important function of IRMNG is to indicate, at least as far as the data already held allow, whether a particular genus name is unique or occurs in multiple instances, either between or within taxonomic groups. Currently there are almost 69,000 genus-level homonyms (around 29,000 separate names) included in IRMNG, representing around 15% of all names or approx. one name in every 7 (this figure also includes nomina nuda plus a small number of misspellings that accidentally coincide with a different, correctly spelled name). The name with the largest number of instances is probably Wagneria, with 12 listed instances in zoology and 2 in botany. At most one instance can be valid in each domain; the remainder are invalid, and a subset of these may be either synonyms (for example, replaced by subsequently published new names) or orthographic variants/misspellings of otherwise valid names.
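To illustrate what "homonym" means in data terms, the sketch below groups genus records by name and reports any name that occurs in more than one instance. It is a minimal illustration only: the record layout, the placeholder authorities ("Author A" and so on) and the name "Examplegenus" are hypothetical and are not taken from IRMNG.

```python
from collections import defaultdict

# Hypothetical records: (genus name, authority, nomenclatural domain).
# The authorities are placeholders, not real citations.
records = [
    ("Wagneria", "Author A, 1830", "zoology"),
    ("Wagneria", "Author B, 1857", "zoology"),
    ("Wagneria", "Author C, 1847", "botany"),
    ("Examplegenus", "Author D, 1900", "zoology"),
]


def find_homonyms(rows):
    """Return {name: [(authority, domain), ...]} for names with 2+ instances."""
    by_name = defaultdict(list)
    for name, authority, domain in rows:
        by_name[name].append((authority, domain))
    return {name: inst for name, inst in by_name.items() if len(inst) > 1}


homonyms = find_homonyms(records)

# Two different counts, as in the text: separate names vs. total instances.
separate_names = len(homonyms)
total_instances = sum(len(inst) for inst in homonyms.values())

print(separate_names, "homonymous name(s),", total_instances, "instances")
```

The distinction between the two counts mirrors the figures above, where almost 69,000 homonym instances correspond to around 29,000 separate names.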

Q: What editing is required for IRMNG compilation?
A: The majority of name data (taxon names and authorities) are imported into IRMNG from the relevant data sources without modification, except in the case of database errors apparent from cross-comparison with external sources and a limited amount of authority normalization to produce a consistent "house style" (including expansion of botanical authors for genera when supplied in abbreviated form, to match the format used in Index Nominum Genericorum).

Family attribution may be adjusted from that given with incoming data where a more recent, authoritative source is available, and editorial input may be required to decide which source to follow in instances where opinion is divided. Missing data (such as authorities, nomenclatural/taxonomic comments, and habitat and extant/fossil flags) are frequently added from a variety of supplementary sources, and in these cases editorial decisions are sometimes required as to which instance of a genus name is involved (often self-evident, but sometimes not so). Editorial decisions are also required to decide whether two highly similar names and cited authorities in different base datasets represent the same or different genus publication instances; for example, some animal names may also be represented as plants, plants or protists as fungi, corals as sponges, etc., particularly in early literature. Where such cases are detected, a decision is made either to retain both records as separate instances or to combine them into a single record for IRMNG.

Editorial input is also involved in determining the status of names from some of the less authoritative sources as either genuine new instances, or as misspellings of names already on the list, in which case a note is added together with a pointer to the name variant deemed to be the correct spelling.
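The decision as to whether an incoming name is a genuine new instance or a misspelling is described above as an editorial one. Purely as an illustration of how candidates might be pre-screened for an editor's attention, the sketch below uses approximate string matching from Python's standard library. The function name, the similarity cutoff and the sample names "Quercuss" and "Novagenus" are assumptions for this example, not part of IRMNG's actual procedure.

```python
import difflib


def suggest_correct_spelling(name, verified_names, cutoff=0.85):
    """Return the closest verified name if `name` looks like a misspelling.

    Returns None when the name is already verified or nothing is close
    enough. The cutoff of 0.85 is an arbitrary illustrative choice.
    """
    if name in verified_names:
        return None  # exact match: not a misspelling candidate
    matches = difflib.get_close_matches(name, verified_names, n=1, cutoff=cutoff)
    return matches[0] if matches else None


# A tiny stand-in for a list of already verified genus names.
verified = ["Quercus", "Canis", "Homo"]

# "Quercuss" and "Novagenus" are made-up incoming names for the example.
for incoming in ["Quercuss", "Canis", "Novagenus"]:
    suggestion = suggest_correct_spelling(incoming, verified)
    if suggestion is not None:
        print(f"{incoming}: possible misspelling of {suggestion} (needs editorial check)")
    else:
        print(f"{incoming}: no close match among verified names")
```

A match here is only a prompt for review: two genuinely different genus names can differ by a single letter, so the final call remains with the editor.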

Q: What gaps remain to be filled in IRMNG?
A: IRMNG can be deemed complete (at genus level) when
(a) all published genus names to date are included (a moving target of course);
(b) all name variants not yet "verified" from appropriate trusted sources (i.e., Nomenclator-grade compilations) are either verified from other sources e.g. primary literature, or assessed to be misspellings of "verified" names already on the list;
(c) all genus names are assigned to actual families rather than "placeholders" such as "Mollusca (unallocated)";
(d) the higher taxonomic categories are all filled (e.g., no gap between family and class, or between order and phylum);
(e) all genera have an assigned (and perhaps, separately verified) status flag for extant/fossil and marine/nonmarine status (or both as applicable); and
(f) the taxonomic status of all generic names is known (i.e. valid or non-valid; if a synonym, what is the current valid name).

Progress according to these various metrics is shown below, as at July 2013.

(a) Genus names held: As detailed above, IRMNG currently holds some 469,000 of an estimated 500,000 published genus names to date (the latter increasing at perhaps 2,500 per year), indicating that at present some 30,000 names are possibly missing (although this figure could vary by perhaps +/- 10,000 according to the basis of the estimates used).
(b) Verified versus unverified names: Approximately 19,000 of the 469,000 genus names in IRMNG are "unverified" from appropriate authoritative sources at this time. Experience suggests that perhaps 50% of these will turn out to be database errors in sources used to construct IRMNG, and the remaining 50% "good" new names verifiable from additional sources.
(c) Genera assigned to "placeholder" rather than actual families: At present, approximately 124,000 of the 469,000 genus names in IRMNG are allocated to "placeholders" at family level (example: "Mollusca (awaiting allocation)") rather than true families. Mechanisms to address this deficiency are currently being investigated.
(d) Families assigned to "placeholder" rather than actual orders: At present, approximately 970 of the 21,700 family names in IRMNG are allocated to "placeholders" at order level (example: "Mollusca (awaiting allocation)") rather than true orders. This deficiency is being corrected over time.
(e) Completeness of flagging (extant/fossil/both, marine/nonmarine/both) at genus level: Currently approx. 376,000 of the 469,000 genus names in IRMNG (80%) are allocated an extant/fossil status flag and 93,000 are not, while 401,000 (85.5%) are allocated a marine/nonmarine status flag and 68,000 are not.
(f) Assessment of current taxonomic status of generic names: This is a low priority for IRMNG at this time; however, at present some 210,000 genus names of 469,000 (45%) are flagged as currently valid, and 103,000 (22%) are flagged as non-valid, of which 94,000 are pointed to the relevant valid name instance, leaving 33% without valid/non-valid flags at this time.
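
The percentages in the progress list above can be re-derived from the counts given. The sketch below does so for the genus-level metrics; the dictionary keys and variable names are illustrative labels only, and all figures are those quoted in the text as at July 2013.

```python
# Genus-level counts quoted in the progress list above (as at July 2013).
TOTAL_GENERA = 469_000

counts = {
    "unverified names": 19_000,
    "placeholder family": 124_000,
    "extant/fossil flag set": 376_000,
    "marine/nonmarine flag set": 401_000,
    "flagged valid": 210_000,
    "flagged non-valid": 103_000,
}

# Share of all genus names held, as a percentage rounded to one decimal place.
shares = {label: round(100 * n / TOTAL_GENERA, 1) for label, n in counts.items()}

# Names with neither a valid nor a non-valid flag.
no_status_flag = TOTAL_GENERA - counts["flagged valid"] - counts["flagged non-valid"]
no_status_share = round(100 * no_status_flag / TOTAL_GENERA, 1)

for label, share in shares.items():
    print(f"{label}: {counts[label]:,} ({share}%)")
print(f"no valid/non-valid flag: {no_status_flag:,} ({no_status_share}%)")
```

The same subtraction gives the 93,000 and 68,000 unflagged names reported under (e).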

Q: How is IRMNG currently maintained, and what are its long term development plans?
A: IRMNG construction commenced in 2006 following an analysis of needs in respect of taxonomic names management for OBIS (the Ocean Biogeographic Information System), and is currently considered a contribution to the International OBIS system from OBIS Australia (www.obis.org.au), which is hosted at CSIRO Oceans and Atmosphere (CSIRO O&A) in Australia. CSIRO O&A has to date contributed in kind to IRMNG development and ongoing hosting as part of its commitment to OBIS AU, and small amounts of OBIS, GBIF and Atlas of Living Australia funds have contributed to aspects of its population. At present IRMNG continues to be maintained by CSIRO O&A as an ongoing contribution to OBIS and other initiatives that may have a use for it as a taxonomic information system. IRMNG may also evolve into a component of the Global Names Architecture (GNA) / Global Names Usage Bank (GNUB) at some point (see www.globalnames.org/); however, scoping of those activities and of potential IRMNG interaction is still at a relatively early stage.

This site is hosted by the CSIRO National Collections and Marine Infrastructure Information and Data Centre, Australia. Please advise any problems with this website to the OBISAU Node Manager.