Provides the root element. Note: until XInclude is sufficiently widely implemented to combine data from different documents, terminology, descriptions, and resources must be in the same document! The version of the SDD standard used is defined in the namespace declaration and needs no separate data element.
This information refers to the last process that created this document, which does not imply that the data were authored there. The information is intended for debugging purposes, and to improve import quality if certain generator versions produce abnormal code.
Name of the application that generated this document.
Version of the application that generated this document.
Optionally allows a generating application to identify which export routine created the document; some applications may have several alternative export routines.
Identifies the authors of the generating application, not the authors of the terminology, descriptions, and resources!
This is the copyright string of the generating application, not the copyright of the terminology, descriptions, and resources!
The date on which the generating application actually updated (or, if no updates occurred, created) the current XML file.
Required information defining the project itself. Covers the entire document, i. e. terminology, descriptions, and resource collection.
Defines the terminology (characters, states, modifiers) in which the descriptions are expressed.
The objects being described may be an abstract taxonomic concept (taxon, disease, etc.) or a physical object (individual specimen, part of individual, etc.). The item may be defined by its taxonomic name, by the published source of the description, and by a specimen identifier. These definitions may be free-form text or links to other database components.
Descriptions are defined as optional to provide for projects publishing only terminology.
Contains an authored or autogenerated free-form description ('natural language description'). It may be completely or partially marked up with elements similar to those in coded descriptions. If all markup except the wording content is removed, the original description can be losslessly recovered.
A strict and largely language-independent description entirely controlled by terminology as defined in the current project.
Global resource definitions from other data areas used in the process of generating descriptions.
Application-specific information is placed in Processing Instructions. @@@DISCUSS: is this ok? Can PIs be easily parsed out by an application? @@@ Recommendation: Each application may read out its own information. Any other target information present should be preserved and output when a new document is generated. This is designed to support idempotent round-tripping of data between two applications. This implies that no dependency between the settings and the items and the terminology setting should be relied upon.
Required information defining the project itself.
Sequence of authors, editors (at least one of which is required) and contributors. A contributor is not fully a Creator of the project, but related to them. The output or the mapping to Dublin Core elements should differentiate between the 3 different elements within Creators.
At least the year is required (optionally a full date including time may be given).
A description of the status of revision and error checking of the project data.
The version number or code as defined by the project creators.
All language-specific project header information is contained in a collection that can be defined for multiple languages. This is currently the only set of textual elements that is defined by language rather than audience!
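The Processing Instruction recommendation above (read your own information, preserve everything else) can be sketched in Python as follows. This is only an illustration under stated assumptions: the PI targets "AppA" and "AppB" and the element names Document and Terminology are hypothetical and are not taken from the schema.

```python
from xml.dom import minidom

# Hypothetical sample: two applications have stored settings as PIs.
SAMPLE = """<?xml version="1.0"?>
<?AppA layout="compact"?>
<?AppB cache="on"?>
<Document><?AppB note="inside"?><Terminology/></Document>"""


def split_processing_instructions(xml_text, own_target):
    """Return (own, foreign) lists of (target, data) pairs in document order."""
    doc = minidom.parseString(xml_text)
    own, foreign = [], []

    def walk(node):
        for child in node.childNodes:
            if child.nodeType == child.PROCESSING_INSTRUCTION_NODE:
                bucket = own if child.target == own_target else foreign
                bucket.append((child.target, child.data))
            walk(child)

    walk(doc)
    return own, foreign


def emit_with_foreign(root_name, foreign):
    """Create a new document and re-emit the preserved foreign PIs before the root."""
    doc = minidom.getDOMImplementation().createDocument(None, root_name, None)
    for target, data in foreign:
        pi = doc.createProcessingInstruction(target, data)
        doc.insertBefore(pi, doc.documentElement)
    return doc.toxml()


own, foreign = split_processing_instructions(SAMPLE, "AppA")
regenerated = emit_with_foreign("Document", foreign)
```

In this sketch an application identifying itself as "AppA" interprets only the entries in own and writes the entries in foreign back unchanged. Where the preserved instructions are placed in the new document is a simplification; the source does not prescribe a position.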
Comma-separated list. Use TDWG geographical standard. Use 'global' for world-wide scope.
Defines reference if information in the entire project came from one publication (printed or digital).
Optional URL that contains an icon. @@ This should probably be changed to an internal resource connector reference!
The default audience is used whenever the setup of the consuming application has no other preference specified. The user interface of the application may then allow the user to choose a different available audience/language. Elements which are language-specific but not audience-specific use the language of the default audience.
The terminology is designed by the biological specialist(s). It is the class definition, defining semantics and structure for the data defined in the item description.
Defines the semantics and labels of numeric measures (e. g. mean, min, max, s.d.). Unlike most other elements of the terminology, these definitions are constrained by the SDD model. In the current version they cannot be extended by the designer of the terminology.
A single label for each global state set, to identify it in the user interface.
Each definition defines a fixed key value, multilingual label and glossary information (user extensible to new audiences) and attributes describing generalized semantics.
A separate set of elements can be defined for each audience. The basic wording type contains labels, definitions, and wording.
Please observe the following "Best practices recommendation": use the method, type, and value attributes rather than relying on the key strings, whenever this information is sufficient (e. g. for formatting routines or many query/identification purposes). Using type/method/value information allows your code to work if the list of definitions of statistical measures is extended.
Globally defined sets of states which can be reused in multiple characters. System and user definable definitions use identical element names to simplify key/keyref definitions.
This set contains "special states", providing standardized reasons why data are missing. Unlike most other elements of the terminology, these are constrained by the SDD model and can only be extended by revising the SDD standard.
Audience-specific labels to identify the global state definition set in the user interface.
For each audience a separate label may be defined.
Special states all identify a reason why data are not known. In a single item they should only occur once per character. However, for a class (e. g. a genus) it is up to the collation process whether to create multiple special states or not. Information such as "unknown or not applicable" may be of interest for analytical purposes. The type is based on CharacterStateDefinitionType, but key values are restricted to an enumeration and no resources are allowed.
The labels and abbreviations given for special states are only recommendations. They can be freely changed as long as the semantics are preserved.
A separate set of elements can be defined for each audience. The basic wording type contains labels, definitions, and wording.
As above, a 2nd set of "special states", applicable to computed characters.
Audience-specific labels to identify the global state definition set in the user interface.
For each audience a separate label may be defined.
These special states are already predefined here; they will be used as soon as a mechanism for computed characters is introduced.
The labels and abbreviations given for special states are only recommendations. They can be freely changed as long as the semantics are preserved.
A separate set of elements can be defined for each audience. The basic wording type contains labels, definitions, and wording.
These state sets are user definable.
Audience-specific labels to identify the global state definition set in the user interface.
For each audience a separate label may be defined.
Frequency modifiers are used to describe state frequency (usually, rarely, etc.).
They are defined globally but must be enabled for each character to be usable.
A group of related frequency modifier definitions, which can, e. g., be used in user interfaces to allow adding a set of definitions in a single step.
Audience-specific labels to identify the global frequency definition set in the user interface.
For each audience a separate label may be defined.
Collection of audience-dependent linguistic sets (containing label plus wording elements). Note that the upper and lower limits of several frequency modifiers within a set may overlap!
Modifiers are used to modify state expression in descriptions (strongly, at the tip, etc.). They are defined globally but must be enabled for each character to be usable.
A group of related modifier definitions, which can, e. g., be used in user interfaces to allow adding a set of definitions in a single step.
Audience-specific labels to identify the global modifier definition set in the user interface.
For each audience a separate label may be defined.
Collection of audience-dependent linguistic sets (containing label plus wording elements).
Characters are defined in a flat list; multiple hierarchical views are implemented through the character group definition below.
Character group definitions define flat subsets as well as hierarchical character trees. They are used for hierarchical display or filtering character subsets. @@@DISCUSS: should character group hierarchies be recursively definable, as long as the resulting tree is acyclic?
Global resource definitions containing URIs or actually embedded resources (e. g. encoded images).
An audience is a combination of language (including dialect) and expertise (pupil, beginner, expert). Multiple audiences can be defined for the same language and expertise, distinguished only by their label.
The key attribute of AudienceDefinition is an arbitrary string. It is referenced in all LinguisticSet elements to declare the intended audience.
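The relationship between AudienceDefinition keys and the LinguisticSet elements that reference them can be checked with a few lines of Python. This is a minimal sketch, not the schema's own validation: AudienceDefinition, LinguisticSet, and the key attribute are named in the text above, while the attribute names audience, language, and expertise, the wrapper elements, and the sample keys are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment; only AudienceDefinition, LinguisticSet and "key"
# come from the documentation, everything else is assumed.
SAMPLE = """
<Terminology>
  <AudienceDefinitions>
    <AudienceDefinition key="en-expert" language="en" expertise="5"/>
    <AudienceDefinition key="de-pupil" language="de" expertise="1"/>
  </AudienceDefinitions>
  <Character>
    <LinguisticSet audience="en-expert"><Label>Leaf shape</Label></LinguisticSet>
    <LinguisticSet audience="fr-expert"><Label>Forme de la feuille</Label></LinguisticSet>
  </Character>
</Terminology>
"""


def unresolved_audiences(xml_text):
    """Return the sorted audience references that match no AudienceDefinition key."""
    root = ET.fromstring(xml_text)
    defined = {a.get("key") for a in root.iter("AudienceDefinition")}
    referenced = {ls.get("audience") for ls in root.iter("LinguisticSet")}
    return sorted(referenced - defined)


missing = unresolved_audiences(SAMPLE)
```

A validating XML Schema processor would normally enforce this through key/keyref constraints; the sketch only shows what such a constraint checks.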
ExpertiseLevel is restricted to values from 1 to 5. These categories allow applications using the SDD schema to communicate the expected expertise to each other. The recommended interpretation is: 1 = elementary school (year 1 to 6); 2 = middle school (year 7 to 10); 3 = high school (year 11 and above) and general public (trying to avoid any specialized terminology or jargon); 4 = university students or (partly) trained personnel (using terminology, but avoiding or explaining problematic terminology); 5 = experts (using the full range of terminology).
Lists all persons who contributed to the authoring, compilation, or editing process in the document. @@@ !!! Bob will make a proposal on the structure
Internal list of taxon names used in the project; each one may optionally link to an external data source. These connectors are reused multiple times in the description.
Object in a collection (= specimen) or observation
Internal list of publications used in the project. Each publication is either described in free-form text or refers to an external data source. These connectors are reused multiple times in the description.
Printed or digital publication (including database information source)
Internal list of specimens used in the project; each specimen is either described in free-form text or refers to an external data source. These connectors are reused multiple times in the description.
Object in a collection (= specimen) or observation
Global resource definitions containing URIs or actually embedded resources (e. g. encoded images).
Defines a character in the terminology. Note: Character labels must be unique within the entire CharacterDefinitions collection (separately for each audience definition).
Contains labels and definitions (but no wording information!)
Audience-independent resources linked to a character definition, e. g. images illustrating the character in general.
At least a single descriptor must be present (@@@ currently not implemented in the schema!
@@@) These definitions constrain which global measure definitions can appear in items of this character.
Unit like mm, µm, °C. Attributes allow the unit to be output before the value (e. g. pH 7.0) or without a blank. The content allows some XHTML formatting to support e. g. "mm²".
The key attribute must be unique and is referred to in the item descriptions. The keyref refers to the measure semantics defined in the global measure definitions.
Optionally, globally defined state sets may be referenced and thus defined for a character. In general at least the special states should be referenced here.
If Selections is missing, all descriptor states will be included, else only the selected ones.
Reference to a single (globally defined) state
(@@@PLACEHOLDER!@@@) Enabling of 0-n single frequency modifiers or 0-n frequency modifier sets for all states in a character.
(@@@PLACEHOLDER!@@@) Enabling of 0-n single modifiers or 0-n modifier sets for all states in a character.
Defines an entire character group (flat list or tree hierarchy).
Defines the label displayed when a grouping is selected in the user interface.
Contains labels and definitions (but no wording information!)
Purposes are standardized to simplify application interoperability.
Setting this purpose in a character grouping is a recommendation to applications with a user interface to use this as the default hierarchy for any editing or reporting purpose. The application may, however, enable the user to select any character grouping.
Setting this purpose in a character grouping is a recommendation to applications with a user interface to use this as the default hierarchy for editing the item description data set. The application may, however, enable the user to select any character grouping.
Setting this purpose in a character grouping is a recommendation to applications with a user interface to use this as the default hierarchy for editing the terminology. The application may, however, enable the user to select any character grouping.
Setting this purpose in a character grouping is a recommendation to applications to use this as the default hierarchy for building guided keys (e. g. dichotomous keys).
Setting this purpose in a character grouping is a recommendation to applications to use this as the default hierarchy for interactive identification.
Setting this purpose in a character grouping is a recommendation to applications to use this as the default hierarchy for natural language reporting.
MinimumExpertiseLevel: the designer of the subset expects the user to have a certain minimum expertise level. @@@ Needs discussion! @@@
The designer of a character grouping defines it as 'complete' to declare that it is intended to include all characters of the terminology. A terminology editing application can use this information e. g. to warn the designer about missing characters, to display special dialog boxes after the creation of a new character, etc.
This attribute is currently not used; instead the key of the StateSet has been fixed for special states to SpecialStates. Needs discussion!
Categorizing characters into basic property types (e. g. color, 2-dim. shape, 3-dim. shape, surface texture, taste, smell, behaviour, physiology, measurements, etc.) greatly improves the analysis and management of larger character sets and is therefore recommended. Note: Only a single character grouping should have this hierarchy type. (Not enforced in the schema; how can it be enforced? Other types occur multiple times, i. e. one cannot make a UNIQUE statement on the attribute!)
A hierarchy that organizes characters by method, e. g. field observation, light microscopy, electron microscopy, molecular methods, culture techniques, etc.
A hierarchy that organizes characters by a morphological "contains" hierarchy: plant = root/stem/leaf, leaf = base/stipules/petiole/lamina, etc.
Defining a grouping as a flat subset marks it as being intended only for filtering purposes and prevents it from being displayed as a choice for a hierarchy in a user interface. Note that conversely, the filter selection dialog should not be restricted to these groupings. Any character grouping, including part, method, or basic property type hierarchies, is a valuable filter defining a character subset.
Used for character groupings that fall into none of the categories above.
A node in a character group.
A separate set of elements can be defined for each audience.
Enables the designer to annotate nodes in the grouping and add management comments.
Natural language wording for the character node needs to be here. The relevant or possible wording is constrained by the path in the tree, so it needs to be defined in the tree, not in the flat character list. The label may override the default label from the character definition. In a methodological tree, the wording may have to add part-hierarchy information; in a part hierarchy, methodological information.
A separate set of elements can be defined for each audience.
The key for the character group item has been defined as required to document that an xs:key constraint exists on this attribute. It seems impossible to make existence of the key optional and require keyrefs to only point to these existing keys.
Used in global and local character state definitions.
A separate set of elements (containing labels, definitions, and wordings) can be defined for each audience.
Audience-independent resources linked to a state definition, e. g. images illustrating expression of the state.
Internal notes of the designer (not multilingual).
This attribute is currently not used; instead the key of the StateSet has been fixed for special states to SpecialStates. Needs discussion!
Refers in coded item descriptions to a Character. MAY LATER NOT BE NEEDED!
Like CharacterReferenceType, but for usage in the NaturalLanguageDescription markup container (including Wording elements). CharacterReferenceWithWordingType is similar to CharacterReferenceType, but not derived by extension, since not only does the Wording element have to be added, but the state element also has to be changed to a different type allowing Wording inside.
Observation of a character state in an item description (compare also StateDefinitionType!)
The three frequency element variants are distinguished by their attributes!
Direct single frequency value (Value attribute)
Direct frequency range (LowerLimit/UpperLimit)
Reference to a globally defined frequency modifier.
@@@ This attribute has not yet been properly discussed! Some modifiers are desirable in a collated summary statement generated for several taxa or specimens, and others are not.
Annotations for multiple audiences (esp. different languages) can be added.
Internal notes are always only visible to designers of a data set, not to consumers. This does not imply that truly confidential information should be placed here, but it is the appropriate place for rough unedited comments and notes needed during development. This element is available only once (i. e. not for each audience/language!).
Like CharacterStateType, but for usage in the NaturalLanguageDescription markup container.
Analogous to the CharacterStateReferenceType, for measures.
Abstract supertype for NaturalLanguageDescriptionType and CodedDescriptionType = Identification of object being described; may be missing if unidentified or "cf. Taxon name"
Specimen may include observation records.
Citation of the data source for the natural language or coded description.
Refers to a publication as defined under resources: publications.
Location within publication: Optional page, table, figure number, database record, HTML document bookmark, etc. on which the actual data can be found (not the inclusive pages of the article).
Verbatim name as it appears in information source (e. g. publ.
or on specimen label) @@@ This needs discussion @@@!
@@@ This needs to be here for entire description (definition of it) and also present at a more atomic level, perhaps on the state @@@
Contains multiple resources (e. g. images) described by a single set of DC metadata. @@ In previous versions, a description could consist of resources alone; this is no longer possible after Paris and may need discussion! @@
Descriptions entered as free-form text with optional (and potentially incomplete) markup referring to character groups, characters, and states as defined in the terminology.
Retains the full, unchanged original wording of the natural language description. Character group, character, or state markup may be added (partial or complete), but it may not change the original wording sequence.
Character group markup is used to mark organism parts, methodological sections, etc. In most cases initially the states are recognized, but character markup can be deduced from the associations between characters and states defined in the terminology. Wording between character groups or characters is necessary if markup is incomplete.
In most cases states are initially recognized, but character markup can be deduced from the associations between characters and states defined in the terminology. Wording between character groups or characters is necessary if markup is incomplete.
Descriptions entered as data referring to the terminology elements. CodedDescriptions must fulfill more rigorous consistency requirements than natural language descriptions and are more suitable for analysis. Furthermore, language-dependent annotations are minimized so that data can be easily reorganized and translated into multiple languages.
The coded description is entirely controlled by the vocabulary and structures defined in the Terminology section. It contains keyrefs to descriptors and modifiers (plus numerical values for measurements). Free-form text is allowed in Reported- or InternalNotes only.
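The requirement that markup in a natural language description must not change the original wording can be tested by stripping all markup and comparing the result with the source text. The following Python sketch shows the idea; the element names Character and State, the ref attribute, and the sample sentence are hypothetical and only stand in for whatever the schema actually defines.

```python
import xml.etree.ElementTree as ET

# Hypothetical original wording and a partially marked-up version of it.
ORIGINAL = "Leaves ovate, rarely lanceolate; petals 5, white."
MARKED_UP = (
    "<NaturalLanguageDescription>"
    '<Character ref="leaf_shape">Leaves <State ref="ovate">ovate</State>, '
    'rarely <State ref="lanceolate">lanceolate</State></Character>; '
    "petals 5, white."
    "</NaturalLanguageDescription>"
)


def strip_markup(xml_text):
    """Concatenate all text content in document order, discarding every tag."""
    return "".join(ET.fromstring(xml_text).itertext())


recovered = strip_markup(MARKED_UP)
is_lossless = recovered == ORIGINAL
```

The unmarked tail ("; petals 5, white.") illustrates why wording between characters has to be kept when markup is incomplete: without it the original text could not be recovered.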
Separating data and terminology allows rearranging and refactoring the terminology, multilingual support through central terminology translations, and multiple hierarchical views.
@@ Proposal GH, not yet discussed = A set of characters that have been observed together
Dates split into year, month, day, time. In contrast to xml-schema:date, this provides the option that part of the information is unknown (e. g. only day/month or year/month are known).
The DateText provides an option to enter verbal date information, like "summer, around 1910".
Derived by restriction from DateType; year is required.
The DateText provides an option to enter verbal date information, like "summer, around 1910". THIS NEEDS DISCUSSION!
A single person or an author team.
Either a full date or a year (1970-2100) is required.
String restricted to 1..20 character length.
String restricted to 0..255 character length (may be empty string).
String restricted to 1..255 character length (i.e. required, may not be empty string).
String required to be at least 1 character long (i.e. required, may not be empty string).
Double precision numeric value in the range of [0..1].
Valid states are true, false, and default.
An NMTOKEN whose only value is "default", used for union definitions.
Restricted to specific integer values, indicating expertise from schoolchildren to taxonomic expert. See the separate documentation for the interpretation of values.
Allows basic character formatting using XHTML elements plus three semantic elements (citationauthor, taxonauthor, taxon; intended to be rendered formatted and for analysis). Note that no further formatting is supported within the semantic elements.
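The simple-type restrictions listed above (partially known dates, bounded string lengths, the [0..1] range, and the true/false/default union) can be mirrored in application code. The Python sketch below is an assumption-laden illustration: the class and function names are invented here, and the month and day ranges are ordinary calendar bounds rather than values quoted from the schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PartialDate:
    """A date whose parts may each be unknown (hypothetical model of DateType)."""
    year: Optional[int] = None
    month: Optional[int] = None
    day: Optional[int] = None
    text: Optional[str] = None  # verbal date, e.g. "summer, around 1910"

    def is_valid(self, require_year=False):
        # The restricted variant of the type requires the year.
        if require_year and self.year is None:
            return False
        if self.month is not None and not 1 <= self.month <= 12:
            return False
        if self.day is not None and not 1 <= self.day <= 31:
            return False
        return True


def length_between(text, minimum, maximum):
    """Check a string against a min..max character length restriction."""
    return minimum <= len(text) <= maximum


def is_unit_interval(value):
    """Check a numeric value against the closed range [0..1]."""
    return 0.0 <= value <= 1.0


def is_tristate(token):
    """Check the union of boolean values and the single token 'default'."""
    return token in {"true", "false", "default"}


day_month_only = PartialDate(month=7, day=14)
verbal_only = PartialDate(text="summer, around 1910")
```

Here day_month_only is acceptable for the general date type but fails when the year is required, which is the difference between the two date variants described above.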
Logical markup: emphasis, usually rendered italic.
Physical markup: italic that could not be interpreted as em or taxon markup.
Physical markup: subscript.
Physical markup: superscript.
Logical markup: strong, usually rendered bold.
Author of a referenced citation. Recommended report rendering: may be either stripped or rendered as small caps.
Author of a taxon. Recommended report rendering: see citationauthor.
Recommended report rendering: italics.
Extends the FormattedSimpleTextType and allows, in addition to basic character formatting with <sup>, <sub>, <i>, <b>, etc., the use of <img> and <a> elements. Further elements may be added in later versions of this schema.
Image element; needs further attributes added to work!!
Anchor/hyperlink element; needs attributes added to work!!
Extends the FormattedExtendedTextType and allows the following block-level elements as well: p, ol, ul, li, h1-h6. THIS could be set to a full XHTML fragment definition, but must be without html/header/body elements!
p element; needs attributes added to work!! Change to mixed content. If possible reuse xml!
Formatted text with additional attribute parsed. Used for wordings in the NaturalLanguageDescription container.
A text element with optional BlankBefore/BlankAfter attributes.
The setting "default" allows the application to detect whether a blank should be added or not.
Natural language wording for elements without content.
Wording for elements that have no further content, e. g. states.
Natural language wording for elements with non-repeated content (e. g. modifiers around states).
Text output before the contained elements. For characters this is the main character wording that is output before the states.
Text output after contained elements. In the case of a character this is the wording after all states, or after numerical data and after a measurement unit where present.
Natural language wording for elements with repeated content (e. g. characters around modifier + states).
Each of the elements has the attributes 'BlankBefore' and 'BlankAfter'. The default value of all attributes is "default", which lets the application decide whether a blank is required or not. If both delimiter elements (TextBetween/BeforeLast) are absent, the application should automatically use delimiters appropriate for the language and culture. ?Perhaps define in AudienceDefinition?
In the case of content with multiple elements (e. g. a character with multiple states scored) this delimiter is inserted between elements (except before the final element, see TextBeforeLast).
In the case of content with multiple elements this delimiter is used between the second-to-last and the last element.
Defines the base type of a single LinguisticSet; contains only a label, e. g. used for GlobalStateSetDefinition, FrequencyDefinitionSet, or ModifierDefinitionSet.
Defines an extended base type of a single LinguisticSet.
The term (one or several words) appears at the start of the definition and denotes the concept being defined. For characters and states the term is often identical to the Label, but this is not necessarily so.
A longer explanation (glossary entry) explaining the concept (meaning, semantics) of a character, state, etc. @GH@: Either Definition or ExternalReference or both should be present; do not know how to define that!
URI to an external definition or glossary entry.
Audience-dependent resources (e. g. images with text, or suited to different expert groups) linked to a state definition, e. g. images illustrating expression of the state.
Constrained version of the name (e. g. shortened or no blanks), defined to be suitable for e. g. NEXUS phylogenetic analysis or statistical analysis software.
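The delimiter rules for repeated content (TextBetween between elements, TextBeforeLast before the final element, application defaults when both are absent) can be sketched as a small Python function. The function name, the English fallback delimiters, and the behaviour when only one of the two delimiters is given are assumptions for illustration; the documentation only specifies the case where both are absent. BlankBefore/BlankAfter handling is not covered.

```python
def join_wordings(items, text_between=None, text_before_last=None,
                  fallback=(", ", " or ")):
    """Join worded elements using TextBetween / TextBeforeLast style delimiters."""
    if text_between is None and text_before_last is None:
        # Both absent: the application picks language-appropriate delimiters.
        text_between, text_before_last = fallback
    elif text_before_last is None:
        # Assumption: reuse the regular delimiter before the last element.
        text_before_last = text_between
    elif text_between is None:
        # Assumption: reuse the final delimiter between all elements.
        text_between = text_before_last
    if len(items) <= 1:
        return "".join(items)
    return text_between.join(items[:-1]) + text_before_last + items[-1]


default_wording = join_wordings(["red", "white", "blue"])
custom_wording = join_wordings(["ovate", "lanceolate", "linear"], "; ", " and ")
```

With the assumed English fallback, three scored states are rendered as "red, white or blue"; a terminology that supplies its own delimiters overrides both positions.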
Extends basic label type with a single wording element.
Extends basic label type with a single wording element (with additional attributes for modifier wordings).
Extends basic label type with a complex wording element; used in character grouping nodes and character references.
Audience-specific information containing only a single reported note as text (optionally with basic formatting).
A short, concise title.
Free-form text containing a longer description of the project.
A free-form text acknowledging support (e.g. grant money, help, permission to reuse published material, etc.).
Disclaimer statement, e. g. concerning responsibility for data quality or legal implications.
Free-form description of geographic coverage of descriptions available in the current project.
Free-form text describing taxonomic groups covered by the project.
At least a copyright statement is required.
A concise copyright statement.
Optionally, an expanded copyright statement may include more detailed copyright information.
Rules defining under which conditions the data may be distributed or changed.
Used for resources like publications, taxonomic names, specimens, etc. Provides either a simple free-form text, or a connection to an external resource.
Defines a service used to resolve ExternalID. This could be the URL of the WSDL file of a web service.
Can be a URI, but does not have to be. Examples: "ref://x.y.fr/floras/smith/1998", "432787632", "SMI1998_DZT"
Human-readable representation; may be the only data item if no machine-readable ID exists. Example: "Smith 1998. Flora of SomeCountry"
Extends resource connector type with optional encoded data content (esp.
images embedded in xml document) and with a type attribute.
An optional caption for a resource that is intended to be displayed embedded in another document.
Optionally the full resource data may be embedded (as an alternative or in addition to defining a URI).
This allows the value domain for keys and keyrefs to be defined (and redefined).
Defines an element with a keyref attribute; reused for keyrefs to all different key domains.
Defines an element with a keyref attribute pointing to AudienceDefinitions (different data type from generic keyref!).
Defines an element with a keyref attribute pointing to ContributorDefinitions.
Defines an element with a keyref attribute pointing to a TaxonName in Resources.
Defines an element with a keyref attribute pointing to a Publication in Resources.
Defines an element with a keyref attribute pointing to a Specimen defined in Resources. @GH@: Discuss whether to add a separate element for collection abbreviation (cached information from provider or from
Defines an element with a keyref attribute pointing to a MediaResource defined in Resources.