TDWG working group:
Structure of Descriptive Data
Minutes of the session at the TDWG meeting in Sydney, 10 November 2001
(Version 1.0)
Summary
- We have defined a set of minimum requirements on which we agree and which will form a boundary for future discussions.
- We have identified a potential misunderstanding about the aims of the group: whether it is working towards a data integration standard or an interoperability standard. The developers of descriptive software applications aim at an interoperability standard, and we agreed that this should be the first task. Whether also aiming at a data integration standard conflicts with this has to be explored further.
- The descriptive data standard should be a modular element of a more general schema that includes references, taxonomy, etc. The naming of elements should be consistent with usage in other schemata (TDWG, esp. the Accessions group schema, W3C, etc.).
- Within the next 3 months we will focus on collecting descriptive problem cases ("data challenges"). The required data elements will be identified, and/or XML structures may be proposed.
- To this end, clearly focused documents (plain text email, or possibly HTML attachments) will be submitted and organized on a web page. We will discuss them, and the author of each problem case will try to summarize the discussion and send a second version, including the discussion, to the web site.
- The decision as to which problem cases will be dealt with in the new standard, and which are considered rare or irrelevant, will be taken at later workshops.
- We agreed to explore the use of forum software; Bob Morris may set one up at TDWG.org.
- We agreed to organize workshops after about 4 months. The first proposal is to meet for two days in March 2002, possibly prior to the Species 2000 Catalogue of Life workshop in Sydney (which is two days before the GBIF meeting in Canberra). A second workshop, for working group members without funds to travel to Sydney, may be held in March 2002, one day before the Euro+Med meeting in Paris.
- A full working group meeting will convene two days before the next TDWG meeting (approx. Nov. 2002; the possible locations are Costa Rica or California).
Minimum requirements for the new standard
1. It should be at least as expressive as DELTA data. It should include and extend DELTA.
2. It should be able to express new insights into the structure of descriptive data.
3. It should be based on XML and incorporate as many features of XML (including XML Schema) as possible, so as to make the development of XML-based software efficient.
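As an illustration of requirements 1 and 3 only, the following minimal sketch converts a simplified DELTA-style item (an item name terminated by "/", followed by "character,value" attributes) into an XML element tree. The element and attribute names (item, character, state, range, ref) are hypothetical and were not proposed by the group, and only a small subset of DELTA syntax is handled.

```python
import xml.etree.ElementTree as ET

# Simplified DELTA-style item: "# name/" followed by "character,value" tokens.
# This handles only plain states, "/"-separated alternative states and
# simple numeric ranges; comments and other DELTA features are ignored.
ITEM = "# Example taxon/ 1,2 2,1/3 3,5-10"


def delta_item_to_xml(item):
    """Return a (hypothetical) XML element for one simplified DELTA item."""
    name, _, attributes = item.lstrip("# ").partition("/")
    root = ET.Element("item", {"name": name.strip()})
    for token in attributes.split():
        char_number, value = token.split(",", 1)
        char_el = ET.SubElement(root, "character", {"ref": char_number})
        if "-" in value:
            # Numeric range such as "5-10"
            low, high = value.split("-", 1)
            ET.SubElement(char_el, "range", {"min": low, "max": high})
        else:
            # One or more alternative states such as "1/3"
            for state in value.split("/"):
                ET.SubElement(char_el, "state", {"ref": state})
    return root


if __name__ == "__main__":
    print(ET.tostring(delta_item_to_xml(ITEM), encoding="unicode"))
```

Because the result is ordinary XML, it could in principle be validated against an XML Schema and extended with further elements, which is the kind of benefit requirement 3 has in mind.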
Further requirements
1. The new standard should allow storing raw as well as analyzed data.
2. The distinction between a precise, possibly language-independent data language and a natural language description should be retained. Multilingual processing should be supported, and the standard should capture data in a form that is as language-independent as possible.
3. However, maximum flexibility should be available to allow incomplete markup of free text and to mix data and natural language elements, so as to provide the power to "index" natural language data sets such as a digitized flora text.
4. Documentation of character definitions must be improved so that shared, possibly global character definitions with subsets can be defined.
5. Descriptive data should be a modular element of a more general schema. Other modules could be: literature references, nomenclature/taxonomy, resources, etc.
6. The XML documents should be structured in such a way that it remains a viable option to implement applications that use relational databases for data storage.
7. We definitely need an interoperability standard, and also a data integration standard (the latter may have to be a separate standard if both cannot be supported in a single standard).
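Requirements 3 and 6 above can be sketched together as follows. The example marks up only part of a free-text description, recovers the unmarked natural language text, and flattens the marked-up data into rows of a relational table. All element, attribute, table and column names are assumptions made for illustration and are not part of any agreed structure.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical, incompletely marked-up description: only two phrases carry data.
DOC = """<description taxon="Example species">
Leaves <char id="leaf_shape" state="ovate">ovate</char>,
<char id="leaf_length" unit="mm" min="30" max="50">3-5 cm long</char>;
flowers not seen.
</description>"""


def to_float(value):
    """Convert an optional attribute value to float, keeping None as None."""
    return float(value) if value is not None else None


def extract_rows(xml_text):
    """Return one tuple per marked-up <char> element."""
    root = ET.fromstring(xml_text)
    taxon = root.get("taxon")
    return [
        (taxon, el.get("id"), el.get("state"),
         to_float(el.get("min")), to_float(el.get("max")),
         el.get("unit"), el.text)
        for el in root.iter("char")
    ]


def plain_text(xml_text):
    """Recover the natural language text with the markup removed."""
    root = ET.fromstring(xml_text)
    return " ".join("".join(root.itertext()).split())


def store(rows):
    """Load the rows into an in-memory SQLite table and return the connection."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE observation (taxon TEXT, char_id TEXT, state TEXT, "
        "min_value REAL, max_value REAL, unit TEXT, source_text TEXT)"
    )
    conn.executemany("INSERT INTO observation VALUES (?,?,?,?,?,?,?)", rows)
    return conn
```

In this sketch the unmarked phrase "flowers not seen" stays in the text but produces no table row, which is the sense in which incomplete markup can "index" an existing description without requiring all of it to be coded.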
Distribution of work
Richard Pankhurst, Mike Choo, and Steve Shattuck agreed to prepare documents providing details of the data elements or information model they use for their applications (Pandora, DELIA, BioLink), especially insofar as they think these elements should become part of a future standard.
Bob Morris will provide a summary of the conflict between integration and interoperability requirements, if possible with examples.
Kevin Thiele will provide a list of problem cases.
Please send any necessary corrections to G.Hagedorn@bba.de
(Gregor Hagedorn, Convener)
First published 2001-12-27, last update: 2002-02-07.
