TDWG working group: Structure of Descriptive Data (SDD)
SDD provides mechanisms to define sets and hierarchies of characters (using concept trees) and sets and hierarchies of classes or taxa (using class hierarchies). These mechanisms are useful in the context of identification. For example, concepts may be "field identification" versus "laboratory identification" characters, and class hierarchies may be used for non-taxonomic classes relevant to identification purposes, such as "weeds" or "diseases".
However, in many identification scenarios (especially when using computer-aided multi-access keys, or when creating sequential keys), it may be desirable to obtain guidance about which characters are recommended to further the identification process most effectively. Much of this information can be obtained from the descriptive data matrix itself, by assessing how well each character partitions the set of classes/taxa that are potential identification results. Well-tested algorithms (e. g. the CSIRO DELTA "BEST" algorithm) exist for this purpose.
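As a rough illustration of "assessing how well each character partitions the set of classes/taxa", the following minimal sketch ranks characters by the expected number of taxa remaining after the character has been observed. It is not the CSIRO DELTA "BEST" algorithm. The toy matrix, the function names, and the assumptions that every taxon has exactly one state per character and that all taxa are equally likely are hypothetical simplifications.

```python
from collections import Counter

# Hypothetical toy matrix: taxon -> {character: state}. All names are illustrative.
matrix = {
    "Taxon A": {"leaf shape": "ovate", "flower colour": "red"},
    "Taxon B": {"leaf shape": "ovate", "flower colour": "white"},
    "Taxon C": {"leaf shape": "linear", "flower colour": "blue"},
    "Taxon D": {"leaf shape": "ovate", "flower colour": "red"},
}


def expected_remaining(matrix, character, taxa):
    """Expected number of taxa left after observing `character`.

    Assumes each remaining taxon is equally likely to be the answer and
    that each taxon shows exactly one state (no polymorphism, no missing data).
    """
    counts = Counter(matrix[taxon][character] for taxon in taxa)
    total = len(taxa)
    # A state shared by c taxa is observed with probability c/total
    # and leaves c taxa, hence the sum of c*c/total.
    return sum(c * c for c in counts.values()) / total


def rank_characters(matrix, taxa):
    """Return characters ordered from best to worst separating power."""
    characters = sorted({c for taxon in taxa for c in matrix[taxon]})
    return sorted(characters, key=lambda c: expected_remaining(matrix, c, taxa))


ranking = rank_characters(matrix, list(matrix))
print(ranking)
```

In this toy data set "flower colour" leaves fewer candidate taxa on average than "leaf shape" and is therefore ranked first. A real implementation would also have to deal with polymorphic, missing, and numeric data.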
A character ranking based exclusively on separating power, however, often has the undesirable effect of choosing characters that are difficult or expensive to measure, that are available only during a limited time of year, or that require high expertise to use. It is therefore desirable to inform the algorithm with additional meta-data on characters or concepts that help to guide this process. DELTA provides a character weight/reliability mechanism for this purpose, and this information is used in the CSIRO DELTA "BEST" algorithm. The discussion here reviews current usage and attempts to evaluate concepts for SDD. The central question is:
Which data elements are necessary or useful to guide identification programs in their choice of the best next character to answer?
This section is based on the SDD discussion in Indaiatuba, Brazil, 14-17. October 2002.
In addition, Diederich and Fortuner have published on character metadata. They propose "conspicuity", "ambiguity", "variability" (Diederich, J., Fortuner, R. & Milton, J. 1989. Building a knowledge base for plant-parasitic nematodes: description and specification of metadata. In: Fortuner, R. (ed.). Nematode identification and expert-system technology. New York, Plenum Publishing Corp.: 65-76. and Diederich, J. & Milton, J. 1991. Creating domain specific metadata for scientific data and knowledge bases. IEEE Transactions on Knowledge and Data Engineering, Vol. 3 No. 4: 421-434.)
A general consensus existed in Brazil (2002) that an unspecified "weight" should be avoided. Instead, each attribute should have a semantic definition that is independent of the methodology used in a specific identification application. It should be possible to obtain rating information in a "questionnaire" style from biologists unacquainted with the actual processing software. The information for each rating parameter may be presented on an arbitrary "rating" scale.
Furthermore, consensus existed that parameters such as "DiscriminativePower" or "Cost-effectiveness" should not be recorded. They should be calculated based on information available in the data set (i. e. similar to CSIRO-DELTA's "BEST" algorithm).
The remaining rating concepts influencing the usefulness of a character for identification purposes are tentatively grouped in three categories: Error, Variability, and Cost.
How well (accurately/precisely) can the character be observed or measured? In measurement theory, a useful distinction is often made between accuracy and precision:
For the purpose of identification, precision, i.e. the repeatability or consistency of measurements when building the knowledge base and when identifying a specimen, is more important than accuracy. Data should be measured or scored with high consistency both during the construction of class descriptions and during identification. However, this is usually achieved by measuring with high accuracy!
Most scientific quantitative measurements reported today are accompanied by some indication of the limitation of accuracy or the probable degree of error. Among the various types of error that must be taken into account are errors of observation (which include instrumental errors, personal errors, systematic errors, and random errors), errors of sampling, and direct and indirect errors (in which one erroneous measurement is used in computing other measurements).
Reliable: A character is reliable if the scorings by different observers are consistent. Questions:
How variable is the character expression? This kind of variability is a variability of the true values of a property in objects, not of the error in measured values. Variability within taxa is either due to:
Other question: How likely is it that the character is observable in a given specimen?
How convenient is the observation/measurement of a character? How practicable and economically feasible is a measurement?
DELTA used two different rating scales: *CHARACTER WEIGHTS with 11 values between 0.03125 and 32, and CHARACTER RELIABILITIES with 11 values between 0 and 10. The relation between reliability and weight is given by: weight = 2^(reliability - 5). The central reliability value 5 is equivalent to the weight 1.
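The relation between the two DELTA scales can be checked with a few lines of code. The sketch below reads the formula as weight = 2^(reliability - 5), which is the reading consistent with the stated end points 0.03125 and 32 and with the central value 5 mapping to weight 1. The function names are illustrative and not part of DELTA.

```python
import math


def reliability_to_weight(reliability: float) -> float:
    """Convert a DELTA reliability (0..10) to a weight (0.03125..32)."""
    return 2.0 ** (reliability - 5)


def weight_to_reliability(weight: float) -> float:
    """Inverse conversion: weight (must be > 0) back to reliability."""
    return math.log2(weight) + 5


for reliability in (0, 5, 10):
    print(reliability, reliability_to_weight(reliability))
# 0 0.03125
# 5 1.0
# 10 32.0
```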
Deciding on an appropriate rating on an 11-valued rating scale is often difficult and time-consuming. Therefore a reduced rating scale with only 5 values between 1 and 5 (with 3 being the central value) is proposed for SDD. The scoring should use only these integer values, but when ratings are combined into a single calculated weight value, floating point values should be used.
| 1 = not at all | 2 = slightly | 3 = moderately | 4 = very | 5 = extremely |
or: | 1 = disagree strongly | 2 = rather disagree | 3 = neutral | 4 = rather agree | 5 = agree strongly |
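The proposal that ratings be scored as integers from 1 to 5 but combined as floating point values could look roughly like the sketch below. The combination rule (a plain arithmetic mean) and the rating names are assumptions made purely for illustration. The discussion above does not define how ratings are to be combined.

```python
from statistics import mean

VALID_RATINGS = range(1, 6)  # integer scores 1..5, with 3 as the central value


def combined_rating(ratings: dict[str, int]) -> float:
    """Combine individual integer ratings into one floating point value.

    The arithmetic mean used here is a hypothetical rule, not part of SDD.
    """
    for name, value in ratings.items():
        # Reject anything that is not a plain integer between 1 and 5.
        if isinstance(value, bool) or not isinstance(value, int) or value not in VALID_RATINGS:
            raise ValueError(f"{name}: rating must be an integer from 1 to 5, got {value!r}")
    return float(mean(ratings.values()))


# Hypothetical rating names and scores for one character.
print(combined_rating({"Repeatability": 4, "Convenience": 5}))  # 4.5
```

Keeping the stored scores as integers and producing the float only at calculation time means the questionnaire-style data remain simple while the calculated value keeps its resolution.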
(Note: what should the entire concept be called?)
A character may be difficult to observe in the field with a hand lens (but still valuable), but very easy to observe in the laboratory with a stereo microscope. Separately perhaps: IdentificationAccessibilityInField / IdentificationAccessibilityInLaboratory?
One problem is the ability of a user of an identification package to make the same observation as the builder of the item description data. Rather than reliability, the concept should perhaps be named RepeatabilityEstimate? This is taxon dependent in most cases. Example: "Number of spines somewhere on a fish": "3 large spines" is reliable and highly repeatable, but "27 tiny spines" is easily misinterpreted.
Repeatability also depends on ExpertiseLevel.
Scope: set of taxa to which this reliability figure pertains? Global or subset of taxa! Problem: Reliability is a global feature of the entire character, i. e. it applies to all items! Perhaps Reliability needs to be defined together with item groups?
Gregor: The scope problem can perhaps alternatively be solved by scoping to the project, while providing the planned mechanism to inherit character/state definitions from a global master project and overriding repeatability etc. for the current scope.
... Decisions need to be postponed until a general item scoping mechanism has been defined.
Note: The scoping mechanism should not tie the terminology section to a given set of item descriptions, i. e. if a character scope is defined only for a subset of items, this should be declared primarily in the item descriptions, not in the character definition. This is necessary to enable architectures in which a central terminology is used by item descriptions in different places, possibly in multiple projects. At least no project-specific item identifiers (like an item key or ID) should be present in the terminology. Generic item identifiers (genera, family names) would be possible, however, as long as they automatically apply to all item descriptions under different management.
See also SDD proposal: Reject DELTA ItemAbundance data element.
It may be desirable to define ratings both for concepts in concept trees and for individual characters, and to let characters inherit ratings from the concepts in which they are included. Furthermore, ratings would be inherited within the concept trees. The difference between original data and calculated / inherited data could be expressed in the "origin" attribute. Inheritance should act on individual ratings, not on the full set of all ratings.
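A minimal sketch of per-rating inheritance within a concept tree is given below. The `Node` class, the `resolve` function, the concept and character names, and the rating values are all hypothetical. Only the idea of resolving each rating individually and marking the result with an origin comes from the discussion.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    """A concept or character in a concept tree (hypothetical structure)."""
    name: str
    parent: Optional["Node"] = None
    ratings: dict = field(default_factory=dict)  # only ratings entered directly


def resolve(node, rating_name):
    """Return (value, origin) for ONE rating, using the nearest definition.

    Each rating is looked up on its own, so a node that overrides one
    rating still inherits all others from its ancestors.
    """
    current = node
    while current is not None:
        if rating_name in current.ratings:
            origin = "OriginalData" if current is node else "inherited"
            return current.ratings[rating_name], origin
        current = current.parent
    return None, None


# Illustrative tree: concept -> sub-concept -> character.
lab = Node("Laboratory characters", ratings={"Convenience": 2, "RequiredExpertise": 4})
spores = Node("Spore characters", parent=lab, ratings={"Convenience": 1})
spore_length = Node("Spore length", parent=spores, ratings={"Repeatability": 4})

print(resolve(spore_length, "Convenience"))        # (1, 'inherited')
print(resolve(spore_length, "RequiredExpertise"))  # (4, 'inherited')
print(resolve(spore_length, "Repeatability"))      # (4, 'OriginalData')
```

Here the character takes "Convenience" from the nearest sub-concept and "RequiredExpertise" from the root concept, which is what acting on individual ratings rather than on the full set implies.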
A fundamental problem with character ratings is that they often depend on a taxonomic group. Counting the number of anthers in flowers may be convenient in large-flowered groups, but highly inconvenient in wind-pollinated groups with tiny and reduced flowers.
Dallwitz et al. (Principles of interactive keys, http://delta-intkey.com/www/interactivekeys.pdf) distinguish between character reliabilities defined as part of the character matrix, and attribute reliabilities defined within the data matrix. The latter is not a feature of current DELTA, but proposed for "New DELTA/DELTA II". An attribute in the sense used in DELTA is a specific state or value for a specific item (taxon). Thus the distinction mixes the question of character versus state- or value-specific ratings with the question of taxon-specific ratings. In well-defined characters, the convenience, availability, and required expertise normally do not depend on the result of a measurement or observation (exceptions occur if the measurement method is polymorphic, requiring different equipment for different result ranges; an example is height measurement spanning small plants to large trees). Repeatability of a measurement, however, does depend on results. Therefore, a modifier-like approach that modifies existing values may indeed be desirable.
More urgent, however, seems to be a method to adjust ratings on the character level within the taxonomic hierarchy. To achieve this, character ratings may be considered a special form of data in a character × taxon matrix instead of part of character definitions. Combined with a concept of data inheritance down the taxonomic tree, general character ratings may be defined in the root taxon, inherited by all other taxa until (e. g. in a specific family) ratings are adjusted. This process may occur for individual characters, i. e. only the necessary characters need to be adjusted.
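Treating ratings as data in a character × taxon matrix with inheritance down the taxonomic tree might be sketched as follows. The dictionary layout, the function name, and the rating values are assumptions for illustration. The taxa follow the anther-counting example above, with a large-flowered and a wind-pollinated family.

```python
# Hypothetical taxonomic tree: taxon -> parent taxon (None marks the root).
taxon_parent = {
    "Root": None,
    "Liliaceae": "Root",
    "Poaceae": "Root",
    "Festuca": "Poaceae",
}

# Ratings stored as cells of a character x taxon matrix. Values are invented.
ratings = {
    ("Root", "anther number"): {"Convenience": 4},     # general default
    ("Poaceae", "anther number"): {"Convenience": 1},  # adjusted for tiny flowers
    ("Root", "leaf length"): {"Convenience": 5},
}


def effective_rating(taxon, character, rating_name):
    """Walk from `taxon` towards the root and return the first rating found."""
    current = taxon
    while current is not None:
        value = ratings.get((current, character), {}).get(rating_name)
        if value is not None:
            return value
        current = taxon_parent[current]
    return None


print(effective_rating("Liliaceae", "anther number", "Convenience"))  # 4
print(effective_rating("Festuca", "anther number", "Convenience"))    # 1
print(effective_rating("Festuca", "leaf length", "Convenience"))      # 5
```

Only the one cell that differs (anther number in Poaceae) has to be entered. All other taxa and characters fall back to the root values.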
For both types of inheritance it is probably desirable to provide inherited or calculated ratings during data export, together with an attribute like origin="inherited". This allows certain types of consumers to process the data without a need to recalculate the ratings. Importing applications that are able to do these calculations should in general discard all data that do not have origin="OriginalData". The Origin mechanism for ratings is very similar to that proposed for the character data.
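On the importing side, discarding everything that is not original data could be as simple as the filter sketched below. The record layout is hypothetical and is not the SDD schema, and "calculated" is an assumed origin value. Only the values "inherited" and "OriginalData" are taken from the text above.

```python
# Hypothetical exported rating records, one per character and rating.
exported = [
    {"character": "anther number", "rating": "Convenience", "value": 4, "origin": "OriginalData"},
    {"character": "anther number", "rating": "Repeatability", "value": 3, "origin": "inherited"},
    {"character": "leaf length", "rating": "Convenience", "value": 5, "origin": "calculated"},
]


def keep_original(records):
    """Keep only records explicitly marked as original data.

    Records without an origin are dropped as well. This is a design
    choice of this sketch, not something the proposal specifies.
    """
    return [record for record in records if record.get("origin") == "OriginalData"]


imported = keep_original(exported)
print(len(imported), imported[0]["character"])  # 1 anther number
```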
Current enumeration in the SDD development version:
Please discuss this on the SDD WIKI (topic @@@@) or send your criticism or suggestions to the SDD mailing list or to the author.
Gregor Hagedorn; Vers. 1; first draft 18. April 2004, this update 10. May 2005.