CARNet Users Conference 2000 Conference Proceeding

P3	Metadata as a global language for digital libraries
Thomas Baker, GMD, Sankt Augustin, Germany

Abstract
Metadata schemas are small languages for describing, or making statements about, Web resources. Unlike natural languages, their machine-readable symbols can be labelled, defined, and used equally well in any human language, such as Korean, English, or Croatian. As languages, however, metadata schemas still lack widely understood principles of grammar and comprehensive dictionaries. Several projects are applying linguistic insights and methods to the design of metadata schema dictionaries, or registries. Design issues include linking machine-readable tokens across translations of a standard (such as Dublin Core) into multiple languages; separating the declaration of semantics ("namespaces") from their reuse and recombination for specific applications ("profiles"); and using interlinguas for N-to-1 conversion among multiple schemas. Policy issues include balancing the "prescribing" of official rules against the "describing" of actual practice (as in a good dictionary); defining annotation vocabularies for layering endorsements over schemas; and promoting simple principles of metadata grammar. Dictionaries will help metadata vocabularies evolve more like human languages: not just top-down, like traditional standards, but also bottom-up, in response to usage. The Dublin Core Metadata Initiative, in particular, is seeking to define processes that involve members of all language groups in the collective definition of globally shared semantics.
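Two of the design issues named in the abstract, language-neutral tokens and interlingua-based conversion, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the schema names `schema_a` and `schema_b`, their field names, and the helper functions are hypothetical, and the labels are ordinary translations rather than the official Dublin Core translations. Only the tokens `title` and `date` are taken from Dublin Core. The point is that each schema declares a single mapping to the shared tokens (N-to-1), instead of one mapping per pair of schemas.

```python
# One machine-readable token, many human-language labels.
# Labels are illustrative, not official Dublin Core translations.
LABELS = {
    "title": {"en": "Title", "hr": "Naslov", "ko": "제목"},
    "date": {"en": "Date", "hr": "Datum", "ko": "날짜"},
}

# Each (hypothetical) schema maps its own field names to the shared tokens.
TO_INTERLINGUA = {
    "schema_a": {"main_heading": "title", "issued": "date"},
    "schema_b": {"name": "title", "when": "date"},
}

# Reverse mappings are derived, so each schema is declared only once.
FROM_INTERLINGUA = {
    schema: {token: field for field, token in mapping.items()}
    for schema, mapping in TO_INTERLINGUA.items()
}


def label(token, lang):
    """Return the human-readable label of a token in the given language."""
    return LABELS[token][lang]


def convert(record, source, target):
    """Convert a record between schemas by way of the shared tokens."""
    to_core = TO_INTERLINGUA[source]
    from_core = FROM_INTERLINGUA[target]
    converted = {}
    for field, value in record.items():
        token = to_core.get(field)
        if token is None:
            continue  # field has no equivalent in the interlingua
        target_field = from_core.get(token)
        if target_field is not None:
            converted[target_field] = value
    return converted


record_a = {"main_heading": "Metadata as a global language", "issued": "2000"}
record_b = convert(record_a, "schema_a", "schema_b")
```

Under these assumptions, adding a further schema requires one new entry in `TO_INTERLINGUA`, and fields with no shared token are dropped rather than guessed.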