Metadata schemas are small languages
for describing or making statements about Web resources. Unlike the
words of natural languages, their machine-readable symbols can be
labelled, defined, and used equally well in any human language,
such as Korean, English, and Croatian. As languages, however,
metadata languages still lack widely understood principles of
grammar and comprehensive dictionaries. Several projects are using
linguistic insights and methods in designing metadata schema dictionaries,
or registries. Design issues include the linking of machine-readable
tokens among translations of a standard in multiple languages
(such as Dublin Core); separating the declaration of semantics
("namespaces") from their reuse and recombination for specific
applications ("profiles"); and the use of interlinguas for N-to-1
conversion among multiple schemas. Policy issues include the balance
between "prescribing" official rules and "describing" actual practice
(as in a good dictionary); annotation vocabularies for layering
endorsements over schemas; and promoting simple principles of
metadata grammar. Dictionaries will help metadata vocabularies
evolve more like other human languages -- not just top-down, like
traditional standards, but bottom-up, in response to usage. The
Dublin Core Metadata Initiative, in particular, is seeking to
define processes that involve members of all language groups in
the collective definition of globally shared semantics.