Contains only infrastructure: GenerationMetadata, ProjectDefinition, Resources!
This version has been created only for the purpose of focussing the discussion of the SDD working group.
It is NOT a functional schema!
Provides root element. Note that the version of the SDD standard used is defined in the namespace declaration and needs no separate data element.
Note: until XInclude is implemented widely enough to combine data from different documents, terminology, descriptions, and resources must be in the same document!
Describes the application or script that produced this document. The information is transient (it informs the import process, but is discarded after import). Intended for debugging purposes and to improve import quality (esp. if some generators are known to produce problematic code).
Required information defining the project itself. Refers to the entire document (terminology, descriptions, keys, etc.).
Lists of external resources used in terminology or descriptions (persons, publications, media resources). This provides an interface as well as a cache.
Describes the application or script that produced this document (whether it has been authored there or not).
The information is transient (it informs the import process, but is discarded after import). Intended for debugging purposes and to improve import quality (esp. if some generators are known to produce problematic code).
Furthermore, attributes describe whether the data contained in the document are complete or an excerpt of a larger data set.
Name of the application that has generated this document. The term 'application' should be understood in a loose sense; it may be a script that is not part of a larger application (compare the Routine attribute, which may provide the detailed name of scripts that are part of an application!).
Version of the application that has generated this document. The attribute should not be named 'Version' to avoid confusion with the version of the content (see ProjectDefinition).
Additional information about the generating application that is not part of the name or version. Documenting the copyright of the generating application is not recommended, but if desired, a copyright string may be placed here.
Optionally allows a generating application to identify which export routine created the document; some applications may have several alternative export routines. This attribute may also be used to identify different conditions under which the export routine may behave differently.
Scripts (e. g. XSL transformations) that modify existing xml documents in a relatively minor way should add their name to this (semicolon-separated) list of transforming scripts (rather than replacing the GenerationMetadata with their own information).
Date and time (UTC or local time with timezone information) at which the current document or data stream was created by the generator.
If this document is produced in response to a query and therefore only contains a subset of the terminology defined in the project, this optional element should be set to true to inform consumers that a more complete version can be found elsewhere.
If this document is produced in response to a query and therefore only contains a subset of descriptions available, this optional element should be set to true to inform consumers that a more complete version can be found elsewhere.
If the document is a snapshot (complete or extract) of data held otherwise, and the data are served through a URI, this attribute informs about the point to query for up-to-date information. If possible this should be a complete web-query string.
Required information defining the project itself.
A globally unique ID-string, distinguishing this project from all others. The value should never be changed once it has been introduced. To refer, e. g., to a character across projects, this value is combined with the key of the character. If you don't have this, it will be difficult to compare versions of projects.
Recommendation: Avoid choosing simple names that are likely to be created multiple times ('plants', 'French bees', etc.). Authors who work at research institutions and expect to continue to do so may use institutional-URI/personal or team name/project label (example: http://bba.de/hagedorn/coelomycetes). Note that this is only an identifier and does NOT help to locate any real resource on the web.
Number and date of current version
The major version number as defined by the project creators
An optional minor version number ('2' in 1.2)
An optional incremental version number to distinguish each successive revision of a project.
Publication date of the current version (compare RevisionData/InitiationDate for date of first version). PublicationDate should be missing if the current version is not yet published.
Creators, Revision status, and dates of the entire project. The revision status refers to both terminology and descriptions. Note: Creators are optional, but within AudienceSpecificData at least a copyright statement for the project is required.
Audience-specific project header information
[ATTR: audience]
Many projects will have a limited geographical scope (or coverage).
Defines a reference if the information in the entire project came from one publication (printed or digital).
[ATTR: ref]
URL pointing to the online source for the terminology or descriptive data contained in the current xml document. WebAddress may serve an updated version of the data.
@@ To be discussed. The idea is that a project may point to a web resource that provides details about the history of the data (previous versions or a detailed log of changes).
Optionally an image media resource containing an icon/logo symbolizing the project.
[ATTR: ref]
A list of audiences addressed in the project. An Audience is a combination of language (including dialect) and expertise (pupil, beginner, expert).
[ATTR: defaultaudience]
For natural language reporting some rules can be defined per language rather than per audience. If a rule for a language used in an audience definition is missing, applications may add a default language rule to the project data.
[ATTR: lang, dir]
ltr = left-to-right direction
rtl = right-to-left direction
Global resource definitions containing URIs or actually embedded resources (e. g. encoded images).
Documentation of all persons or organisations that were involved in authoring, compiling, or editing the document.
@@@ This is just a preliminary sketch that should probably be synchronized with TDWG ABCD!
[ATTR: key]
Internal list of publications used in the project. Each publication is either described in free-form text, or refers to an external data source.
Printed or digital publication (including database source)
[ATTR: key]
Internal list of geographical locations (usually country names, but this may be on any level). Each one is either described in free-form text, or refers to an external data source. The external gazetteer referred to may be the TDWG Geography standard.
[ATTR: key]
Global resource definitions containing URIs or actually embedded resources (e. g. encoded images).
[ATTR: key]
Should audiences become a root section? Would a predefined set of audiences be included in multiple projects?
An Audience is a combination of language (including dialect) and expertise (pupil, beginner, expert). Multiple audiences can be defined for the same language and expertise, distinguished only by their label.
[ATTR: defaultaudience]
The audiencekey attribute of this element is an arbitrary string. It is referenced in all audience-specific elements (labels, definitions) to specify the intended audience.
Recommendation: audience keys should consist of the language code used in xml:lang plus the expertise level from 1-5 (plus a letter (a, b, ...) if a second audience for the same language and expertise level is defined).
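The recommended key pattern can be sketched in a few lines of Python. This is only an illustration, not part of the schema: the function name make_audience_key is hypothetical, and it assumes that the language code, the expertise level, and the optional letter are simply concatenated without a separator (the recommendation does not prescribe one).

```python
def make_audience_key(lang: str, expertise_level: int, suffix: str = "") -> str:
    """Build an audience key from a language code and an expertise level.

    lang:            language code as used in xml:lang (e.g. "en", "de")
    expertise_level: integer from 1 to 5
    suffix:          optional single letter (a, b, ...) to tell apart several
                     audiences sharing the same language and expertise level
    Assumption: the parts are concatenated directly, e.g. "en" + "3" + "a".
    """
    if not lang:
        raise ValueError("lang must be a non-empty language code")
    if not isinstance(expertise_level, int) or not 1 <= expertise_level <= 5:
        raise ValueError("expertise_level must be an integer from 1 to 5")
    if suffix and not (len(suffix) == 1 and "a" <= suffix <= "z"):
        raise ValueError("suffix must be empty or a single letter a-z")
    return f"{lang}{expertise_level}{suffix}"


# Example: general-public English audience, and an additional
# expert-level German audience distinguished by a letter.
print(make_audience_key("en", 3))
print(make_audience_key("de", 5, "a"))
```

Because the key is an arbitrary string, any other naming scheme remains valid; the pattern only makes keys easier to read when debugging.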
[ATTR: audiencekey, lang, dir, ExpertiseLevel]
A concise label for the audience; expressed in the language and ability of the audience.
Further text beyond a short label; perhaps clarifying the definition of the audience. Expressed in the language of the audience.
The key value that is referenced whenever an audience="xxx" attribute is used in audience-specific elements.
ExpertiseLevel is restricted to values from 1-5. These categories allow applications using the SDD schema to communicate the expected expertise. The recommended interpretation is:
1 = elementary school (year 1 to 6);
2 = middle school (year 7 to 10);
3 = high school (year 11 and above) and general public (trying to avoid any specialized terminology or jargon);
4 = university students or (partly) trained personnel (using terminology, but avoiding or explaining problematic terminology);
5 = experts (using the full range of terminology).
The default audience is used whenever the setup of the consuming application has no other preference specified. The user interface of the application may then allow the user to choose a different available audience/language.
ResourceConnectors and references to these objects:
Abstract base type for connectors to resources (publications, class names, specimens, etc.). Provides either a simple free-form text, or a connection to an external resource.
Defines a service used to resolve ExternalID. This could be the URI of a wsdl-file of a web service.
Can be a URI, but does not have to be. Examples: "ref://x.y.fr/floras/smith/1998", "432787632", "SMI1998_DZT"
Human readable representation; this may be the only data item if no machine readable ID exists. Example in the case of a publication resource: "Smith 1998. Flora of Erehwon, Fingers Publishers." If an external ID exists, this is considered cached information and required to be present.
@@ Should this be multilingual? Difficult if external source does not inform about language! @@ Should this be called Label instead?
Used for Agent documentation (an Agent is a person, project, organisation, or software agent). Currently used for authors, editors, contributors, and translators. Ideally it connects to an outside definition or documentation of the Agent.
This may be a person as well as an organisation name
Applicable only to persons
This is an information URL pointing to a homepage with further information. If the person has a truly global URN representing its name, it is expected that this is used as the ExternalID above.
Used for resources like publications, laboratory notes, speeches, etc. Provides either a simple free-form text, or a connection to an external resource.
Used for resources like geographical names or places. Provides either a simple free-form text, or a connection to an external resource.
Extends resource connector type with optional encoded data content (esp. images embedded in xml document) and with a Type (Image/Audio/Video, etc.).
Type of medium @ To be discussed! @
An optional caption for a resource, esp. if it will be presented embedded in another document. Captions can be provided for multiple audiences.
@@ Issue: captions, even in multiple languages, may be obtained from the service provider. Even then it may be desirable to override them! Do we need two collections: InheritedCaption and CaptionOverride? This seems to be awkward whenever there is no ServiceProvider! Also, FreeFormDescription can contain a "title" only in a single language! @@
Optionally the full resource data may be embedded (as an alternative or in addition to defining a URI)
Defines an element with a ref attribute pointing to a Publication in Resources (Resources/Publications/Publication)
Defines an element with a ref attribute pointing to a Locality in Resources (Resources/Geography/Locality)
A collection of LocalityRefType elements
Reference to a locality defined in Resources/Geography
[ATTR: ref]
Defines an element with a ref attribute pointing to a MediaResource defined in Resources (Resources/MediaResources/MediaResource)
A collection of MediaResourceRef elements.
(The sequence in the instance is not informative!)
[ATTR: ref]
Defines an element with a ref attribute pointing to an Agent (Resources/Agents/Agent)
Reference to an Agent (Resources/Agents/Agent)
The first time a creator-agent has made a contribution to the object to which it was added by reference. The first/last contribution records are specific to the role of a creator-agent. If a creator has contributed both as an author and later as an editor of data, two references in two role containers will exist. Consequently, the dates for the two roles are recorded separately.
A collection of AgentRefType elements, i. e. Agents forming a team like an author team.
(The xml sequence of elements in this collection is informative!)
[ATTR: ref]
Metadata (application, revision, IPR; Creators and RevisionData are closely related to the AgentRefsType defined above):
Creators = authors, editors or contributors. At least one of Authors/Editors is required.
(Reason for choice: one of Authors / Editors is required)
Authors that have originated the content; in the sequence of importance.
Editors (see below) if present in addition to Authors.
Editors that have revised content generated by multiple authors or contributors; in the sequence of importance. In general Editors should co-occur with Authors or Contributors (which is, however, not enforced).
The sequence of Contributor Agents must be preserved during processing, but the semantics of it are defined by the authors or editors of the project: either importance or alphabetical sequence.
In addition to authors/editors, several people may have translated audience-specific texts.
@@ Request for discussion: Translators are currently not listed on individual Representation elements. Only a general statement about all translations together can be made. Should this be changed? Also: should one Representation be marked as 'Original/SourceForTranslation'? Will we have something like a 'normative' version? @@
RevisionData (creators, dates, revision) for project, character, glossary entry, and description data.
Date/time when the object (project/terminological definition/description) was initiated. Applications may initially set this to the system date, but the project authors must be able to change it to an earlier date if necessary.
Date/time when the last change was made (either in terminology or in descriptions)
Enumerated categories, which are intended to be rough estimates by the authors/editors, not exact statements. RevisionStatus refers primarily to the correctness of existing data. This includes an estimate of completeness relative to the stated scope (e. g. taxonomic or geographic scopes in the project definition). However, if the project goal is to describe the frequent species of a taxon, the project status may be 'FullyRevised' even if many species are still missing.
Application specific data, providing an extension mechanism to the SDD model.
SDD conforming editing applications are expected to preserve the information of other applications when importing and later exporting data to support lossless round tripping.
Recommendation: Each application may read out its own information. Any other target information present should be preserved and output when a new document is generated. This is designed to support idempotent round-tripping of data between two applications. This implies that no dependency between the settings and the descriptions or terminology should be relied upon.
The Application element must contain application-defined element content (not further validated by SDD). It is not possible to directly store a text string (content model mixed="false").
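The pass-through behaviour recommended above might look roughly like the following Python sketch. It is a simplified illustration under several assumptions: the container is taken to be an ApplicationData element holding Application children identified by a name attribute, XML namespaces are omitted, and the application names ("MyEditor", "OtherTool") and their child elements (Window, Layout) are invented for the example.

```python
import xml.etree.ElementTree as ET


def update_own_settings(app_data: ET.Element, own_name: str, new_children: list) -> None:
    """Replace only our own Application content; leave all other entries untouched."""
    own = None
    for app in app_data.findall("Application"):
        if app.get("name") == own_name:
            own = app
            break
    if own is None:
        # No entry for this application yet: append one, keep the others as they are.
        own = ET.SubElement(app_data, "Application", {"name": own_name})
    # Drop our previous element content and store the new settings as elements
    # (the content model is not mixed, so no bare text is written).
    for child in list(own):
        own.remove(child)
    own.extend(new_children)


# Hypothetical input with data from two applications.
source = """<ApplicationData>
  <Application name="OtherTool" version="2.1"><Layout columns="3"/></Application>
  <Application name="MyEditor" version="1.0"><Window width="800"/></Application>
</ApplicationData>"""

app_data = ET.fromstring(source)
update_own_settings(app_data, "MyEditor", [ET.Element("Window", {"width": "1024"})])
print(ET.tostring(app_data, encoding="unicode"))
```

The data of the other application are never parsed beyond locating the element, so whatever it stored is written back unchanged on export.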
[ATTR: name, version]
Identifier chosen by the target application for which the current information is intended. The only purpose of this attribute is that the application generating data in the application container recognizes the target identifier as its own, while other applications just pass this through.
Optional information about which version of the application generated these application-specific data.
Annotations of objects occur together with labels or similar identifying objects. However, they are not audience-specific and are separate from the Label collection:
= reuse of Annotation and ApplicationData, i. e. designer and application 'annotations'
Internal notes/management comments (not multilingual). Annotations should be displayed only in a 'designer' or 'revision' mode and are expected to be invisible to users who only want to apply the data. They are appropriate for rough, unedited comments, but should not contain confidential information.
Application-specific data (= extension mechanism)
Key/ref infrastructure:
This allows defining (and redefining) the value type for keys and keyrefs (except for audience keys, which are xs:Name)
Contains a key and a generic debugkey attribute.
An optional attribute to add a human-readable equivalent to the numeric primary identity key, intended to simplify debugging SDD applications. The attribute can be discarded or updated at any time. Applications should not produce exports containing this attribute; instead it can be generated using xslt (based on labels/abbreviations).
Currently contains only the generic debugref attribute. The ref attribute could be defined here as well, but this would prevent adding annotations to clarify which key a ref is pointing to!
An optional attribute to add a human-readable equivalent to the numeric ref to simplify debugging SDD applications. The attribute can be discarded or updated at any time. Applications should not produce exports containing this attribute; instead it can be generated using xslt (based on labels/abbreviations reached through key/ref).
Basic simple types:
Normalized string required to be at least 1 character long (i. e. either element/attribute may be optional, but if they are required the content must not be an empty string)
Normalized string restricted to 1..255 character length (i. e. required, may not be empty string)
Restricted to integer values from 0 to 5, indicating expertise from schoolchildren to taxonomic expert. Recommendations for interpreting and choosing the expert level:
0 = unspecified
1 = elementary school (year 1 to 6)
2 = middle school (year 7 to 10)
3 = high school (year 11 and above) and general public (trying to avoid any specialized terminology or jargon)
4 = university students or (partly) trained personnel (using terminology, but avoiding or explaining problematic terminology)
5 = experts (using the full range of terminology)
0 = unspecified expertise level. Use this if the expertise level cannot be assessed (e. g. when exporting data) or is considered irrelevant.
elementary school (year 1 to 6)
middle school (year 7 to 10)
high school (year 11 and above) and general public (trying to avoid any specialized terminology or jargon)
university students or (partly) trained staff (using terminology, but avoiding or explaining problematic terminology)
experts (using the full range of terminology)
Enumerated list to improve application interoperability. It is unclear whether a simple SDD list (as presented here) or generic MIME type support is more desirable.
RevisionStatus is applied to the project as a whole as well as to individual descriptions.
RevisionLevel 1 of 5, for example less than ca. 20 % of the data are revised.
RevisionLevel 2 of 5, for example ca. 21-40 % of the data are revised.
RevisionLevel 3 of 5, for example ca. 41-60 % of the data are revised.
RevisionLevel 4 of 5, for example ca. 61-80 % of the data are revised.
RevisionLevel 5 of 5, for example more than 80 % revised (but not yet completed).
Revision completed. This does not necessarily imply that the data are complete in a scientific sense. They are completely revised only under the available time and the goals set for the project.
Combines a publication resource reference with a detail location within that reference (esp. page number)
Refers to a publication as defined under Resources/Publications
[ATTR: ref]
Location within publication where the cited data can be found: page, table, figure number, database record, html document bookmark, etc. (not the inclusive pages of the article).
Verbatim name as it appears in citation.
@@ Do we need this? @@
Allows basic character formatting using xhtml elements plus three semantic elements (citationauthor, taxonauthor, taxon; intended to be rendered formatted and for analysis). Note that no further formatting is supported within the semantic elements (taxon etc.).
(Note that this is a mixed content model, allowing text between elements!)
'Emphasis' logical markup (phrase): usually rendered italic.
'Strong' logical markup (phrase): usually rendered bold.
Logical markup: subscript
Logical markup: superscript
Font style markup: italic markup that could not be interpreted as (preferred) either emphasis or taxon.
line break (empty element)
markup for inserted/deleted text
Recommended report rendering: italics
Author of a referenced citation. Recommended report rendering: may be either ignored or rendered as small caps
Author of a taxon. Recommended report rendering: see citationauthor
Base type; defines an element with a ref attribute pointing to Audience definitions (different data type from generic ref!)
Audience-specific project header information
A short, concise title. This does not support any formatting!
Free-form text containing a longer description of the project.
A free form text acknowledging support (e. g. grant money, help, permission to reuse published material, etc.)
Disclaimer statement, e. g. concerning responsibility for data quality or legal implications.
Not optional! At least a copyright statement is required!
A concise copyright statement
Optionally, an expanded copyright statement may include more detailed copyright information
Free-form text defining conditions under which the data may be distributed or changed.
To be used if data are placed under a public license (GPL, GFDL). Placing data under a public license is recommended.
Free-form description of geographic coverage of descriptions available in the current project.
Free-form text describing taxonomic groups covered by the project.
A label = collection of audience-specific label representations (without abbreviations or natural language reporting wordings), used e. g. for concept trees or modifier sets.
Audience-specific simple label representation (= without abbreviations or natural language reporting wordings)
[ATTR: audience]
Audience-specific label representations (without abbreviations or natural language reporting wordings); used e. g. for concept trees or modifier sets.
Text of the normal label, intended for screen display or reports that accommodate unabbreviated labels.