DESIRE Information Gateways Handbook
2.7. Working with information providers

In this chapter...
 
  • identifying the key information providers for your gateway
  • building and maintaining relationships with information providers
  • involving information providers in the metadata creation process
Introduction
 

One of the most time-consuming, and therefore costly, tasks for information gateways is maintaining up-to-date descriptions of relevant resources. Identifying and describing quality resources is critical for the gateway. One possible means of making this process more efficient is to involve the 'information providers' (otherwise described as 'publishers' or 'resource owners') in the metadata creation process and to encourage them to contribute to the content of the gateway. This benefits the gateway in terms of saving costs and at the same time helps ensure the currency of the information held by the gateway. The benefit to the information provider lies in improved dissemination of their information. This is an alternative approach to the creation of resource descriptions 'by hand', where metadata is created centrally by the information gateway's own staff, or by library staff who are working within other institutions, or by subject experts.

These various methods are in use to a greater or lesser extent in existing gateways. In the UK, for example, the Resource Discovery Network gateways have most of their metadata created by gateway staff or subject experts, but services such as the Arts and Humanities Data Service rely to a much greater extent on resource creators inputting data to the gateway.

In the case of those gateways where metadata is created automatically by harvesting or crawling the web, it is also possible to involve information providers; this may be by agreeing procedures for identifying relevant material automatically, or by the information provider's alerting the gateway to new or updated data.

In this chapter we will look at some of the issues which arise when gateways and information providers work more closely together. We will consider the benefits of this approach but also note any disadvantages.


Identifying information providers
 

Whatever method of metadata creation is followed, a primary task for any gateway is to identify the key information providers in its field. These key providers may be individuals, groups or institutions who are creating or have some level of ownership of high quality resources. In the case of Higher Education funded gateways, the key information providers may be individual researchers, university departments, publishers, scholarly societies or commercial organisations working in the relevant subject area.

The key providers may vary considerably as regards:

  • the volume of relevant resources they produce
  • the rate at which resources are updated, i.e. volatility of resources
  • whether they create metadata themselves at source for their own resources

Taking these factors into account, the gateway will need to consider the overall profile of its key information providers in relation to gateway policy for metadata creation. The gateway needs to consider its own policy by asking:

  • what is the optimum number of records in the gateway? Is there an imperative need to build up the volume of records in the service?
  • at what level of granularity are resources being described? Can information providers help the gateway to describe resources at a finer level of granularity?
  • how rich is the metadata in the gateway? If the gateway wishes to produce rich metadata, then contributions from providers may need to be enhanced. Careful consideration needs to be given to the cost of enhancement as compared with creation from scratch.
  • are there benefits in building relationships with providers over and above the value of the imported metadata? Key providers may be key users whom it is beneficial to have on board.

It will also be useful to look at the wider picture and consider the cost of involving information providers. In order to justify setting up complex systems, the gateway will want to be assured that information providers can contribute a significant quantity of metadata. It may be that, to create economies of scale, gateways will need to co-operate with one another in setting up common methods for importing metadata from information providers. It is also likely that the information providers themselves will be contributing to a range of gateways and will want a common procedure to cover all of them. Such procedures would need to be flexible enough to allow for differing practices among information providers while following clearly defined, internationally accepted standards and protocols.
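The trade-off described above can be put into rough numbers. The following is a minimal sketch only: the function name and all cost figures are hypothetical, and a real gateway would also need to count the ongoing effort of supporting providers. It assumes a one-off cost for setting up an import system, a per-record cost for creating metadata from scratch, and a lower per-record cost for checking and enhancing imported metadata.

```python
import math
from typing import Optional


def break_even_records(setup_cost: float, cost_create: float, cost_enhance: float) -> Optional[int]:
    """Smallest number of records at which importing becomes cheaper overall.

    Importing costs setup_cost + n * cost_enhance; creating from scratch
    costs n * cost_create. All three figures are hypothetical inputs.
    Returns None when enhancement is no cheaper than creation, because
    the setup cost can then never be recovered.
    """
    saving_per_record = cost_create - cost_enhance
    if saving_per_record <= 0:
        return None
    # Importing wins once n * saving_per_record is strictly greater than setup_cost.
    return math.floor(setup_cost / saving_per_record) + 1
```

For example, with a setup cost of 1000 units, 5 units to create a record and 1 unit to enhance an imported one, providers would need to contribute 251 records before the import system pays for itself.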


Building relationships with information providers
 

Having identified key providers and decided that they can contribute to the content of the gateway, the gateway can then build on this information in various ways.

Monitor key information providers

At the simplest level the gateway can ensure that a system is in place to monitor regularly the web sites of key players. This may involve guidelines for staff and varying degrees of automated monitoring. For example, staff may bookmark sites to check regularly or use a URL-minder to notify them of changes made to key sites.
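Automated monitoring of this kind can be sketched very simply. The example below is an illustration under stated assumptions, not a description of any particular URL-minder service: it assumes the gateway has already fetched each page, and it keeps a hypothetical in-memory store of content fingerprints from the previous check. The function names are invented for the example.

```python
import hashlib


def fingerprint(page_bytes: bytes) -> str:
    """Return a SHA-256 digest that changes whenever the page content changes."""
    return hashlib.sha256(page_bytes).hexdigest()


def changed_sites(previous: dict[str, str], current_pages: dict[str, bytes]) -> list[str]:
    """List URLs whose content is new or differs from the stored fingerprint.

    previous      maps URL -> fingerprint from the last check (hypothetical store)
    current_pages maps URL -> freshly fetched page bytes
    """
    changed = []
    for url, body in current_pages.items():
        digest = fingerprint(body)
        if previous.get(url) != digest:
            changed.append(url)
        previous[url] = digest  # remember the latest state for the next run
    return changed
```

A fingerprint comparison reports any change at all, including trivial ones such as a rotating date stamp, so staff would still need to judge whether a flagged site has changed in a way that matters to the gateway.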

Cross reference
Resource discovery

Enable submission of metadata

The gateway can offer a means for information providers to provide data about new resources. This may be a 'Submit a Resource' form on the gateway Web site.

E X A M P L E

Example of encouraging submission of metadata from information providers

Within DutchESS, resources are selected by subject specialists in the participating libraries on the basis of quality and relevance to the academic community. On the Web site there is a page for 'adding a resource' which asks:

Do you want to contribute a new resource to DutchESS? Use this form to let us know. Your suggestion will be submitted to one of our subject specialist[s]. If the resource is according to the scope policy and quality criteria of DutchESS it will be added to the database.


Information providers create the metadata

Gateways can offer metadata guidelines for providers who publish large numbers of relevant resources, so that they can create the metadata required. The metadata can then be automatically transferred to the gateway. Metadata creation may be manual, using a web-based form, or semi-automated, using one of the available metadata creation tools. (Cross reference: metadata creation chapter)
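As an illustration of what a web-based form or simple creation tool might emit, the sketch below turns a provider's answers into Dublin Core statements embedded as HTML meta tags, using the common "DC.element" naming convention. The function name and the choice of elements are assumptions made for the example; a gateway's own guidelines would determine which elements are actually required.

```python
from html import escape

# A subset of the Dublin Core elements. Which elements a gateway asks
# providers to supply is an assumption made for this example.
DC_ELEMENTS = ("title", "creator", "subject", "description", "publisher", "date", "identifier")


def dublin_core_meta(record: dict[str, str]) -> str:
    """Render the supplied Dublin Core elements as HTML meta tags, one per line.

    Keys that are not in DC_ELEMENTS, and empty values, are skipped.
    """
    lines = []
    for element in DC_ELEMENTS:
        value = record.get(element)
        if value:
            # Escape quotes and markup so the value is safe inside an attribute.
            lines.append(f'<meta name="DC.{element}" content="{escape(value, quote=True)}">')
    return "\n".join(lines)
```

The provider can paste the resulting tags into the head of the resource itself, where a gateway's harvester can pick them up, or submit the same values directly to the gateway.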

E X A M P L E

Examples of gateways using metadata created by trusted information providers

A full-text electronic journal, SocRes Online, undertook an experiment with SOSIG, whereby the journal created metadata for each article, which was then automatically imported into SOSIG. Quality guidelines were agreed with the journal. This saved SOSIG staff considerable time, as they did not need to create records for the articles but simply needed to check the records that had been automatically created.
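Part of the checking step can in principle be automated, leaving staff to review only the records that are flagged. The sketch below is not SOSIG's actual procedure: the required fields and the minimum description length are hypothetical stand-ins for whatever quality guidelines a gateway agrees with its provider.

```python
REQUIRED_FIELDS = ("title", "identifier", "description")  # hypothetical guideline
MIN_DESCRIPTION_WORDS = 10                                # hypothetical threshold


def quality_problems(record: dict[str, str]) -> list[str]:
    """Return a list of guideline violations; an empty list means the record passes."""
    problems = []
    for field in REQUIRED_FIELDS:
        if not record.get(field, "").strip():
            problems.append(f"missing {field}")
    description = record.get("description", "")
    # Only judge the length of a description that is actually present.
    if description.strip() and len(description.split()) < MIN_DESCRIPTION_WORDS:
        problems.append("description too short")
    return problems
```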

Indoreg (Hansen and Hansen, 1997) is a Danish project looking at the bibliographic control of Danish Internet documents and is particularly concerned with the inclusion of Internet documents in the Danish national bibliography. The project concluded that 'self-registration' by authors or publishers would be needed if large amounts of information were to be registered. It recommended the use of Dublin Core for this self-registration and provided tools - a DC creator (based on the Nordic Metadata Project's DC creator) and a PURL server - that would facilitate this.


Endorsement by influential institutions

It can be a condition of a grant that data resulting from funded projects should be deposited with a specified data repository. It might be that gateways could persuade funding agencies to insist that metadata is deposited with the relevant subject gateway.

E X A M P L E

Example of institutionalised metadata creation

It is a stipulation of the UK Arts and Humanities Research Board that funded projects deposit the data produced by the project with one of the service providers of the Arts and Humanities Data Service (AHDS).

This data may be in the form of a dataset or a catalogue record. The Archaeology Data Service, an AHDS service, recommends depositing a catalogue record if the data is dynamic, or if it is non-digital. As well as being a mandatory condition of archaeology organisations, depositing data benefits the individual researcher. Benefits are summarised by the Archaeology Data Service under the following headings:

  • professional recognition
  • avoiding duplication (of catalogue records in different locations)
  • building links between data sets
  • signposting data

Distributed collaborative cataloguing

The future business model for metadata creation may lie with distributed collaborative cataloguing. This would involve an incremental approach to building up metadata for resources. The 'publisher' or 'owner' of the resource might create initial simple metadata, using the Dublin Core element set, for example. Services that wish to offer access to the resource might enhance this basic metadata, for instance with a description targeted at the ultimate users of the service. If the resource meets the criteria for description by the national library and inclusion in a national bibliography, then the national library might augment the records with subject headings and classification codes and align names and headings with the relevant authority files. Other interested parties might create unique identifiers (ISSN, DOI, etc.) or add metadata concerned with rights management or digital preservation. In this model the information provider becomes the first step in a chain of metadata creators.
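One way to picture this chain is as a record that accumulates contributions, each tagged with its source. The sketch below is purely illustrative: the class names, the rule that later contributions take precedence, and the sample values are all assumptions, not features of any existing system.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Contribution:
    contributor: str           # who supplied these elements
    elements: dict[str, str]   # element name -> value


@dataclass
class SharedRecord:
    identifier: str
    contributions: list[Contribution] = field(default_factory=list)

    def enhance(self, contributor: str, **elements: str) -> None:
        """Append one link to the chain without altering earlier contributions."""
        self.contributions.append(Contribution(contributor, dict(elements)))

    def merged(self) -> dict[str, str]:
        """Combined view; a later contribution overrides an earlier one
        for the same element (an assumed policy)."""
        view: dict[str, str] = {}
        for contribution in self.contributions:
            view.update(contribution.elements)
        return view

    def provenance(self, element: str) -> Optional[str]:
        """Name the contributor whose value for this element is currently in force."""
        for contribution in reversed(self.contributions):
            if element in contribution.elements:
                return contribution.contributor
        return None


record = SharedRecord("http://example.org/report")
record.enhance("publisher", title="Annual report", description="Short note")
record.enhance("gateway", description="Description written for gateway users")
record.enhance("national library", subject="Economics")
```

Because every contribution is kept rather than overwritten, each service can still extract the view it needs, and the origin of any element remains traceable.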

Cross reference
Co-operation between gateways

There are pilot projects investigating shared metadata creation where a 'workspace' is used to create metadata collaboratively. At present, these projects are looking at collaboration between specific partners in the metadata creation process, for example libraries working together or publishers working with national libraries and identification agencies. Within these projects metadata can be enhanced incrementally and imported or exported in a variety of formats.

E X A M P L E

Examples of projects investigating shared metadata creation

BIBLINK

The BIBLINK demonstrator consists of the 'BIBLINK workspace' - a shared, virtual workspace for the exchange of metadata between publishers, National Bibliographic Agencies (typically national libraries) and other third parties such as the ISSN International Centre. The workspace will allow publishers to 'upload' metadata for electronic publications using email or the Web. National Bibliographic Agencies and third parties will be able to 'download' this metadata, enhance it in various ways and then 'upload' the enhanced metadata back to the workspace. The intention is that national libraries will use the enhanced metadata as the basis of a record in the national bibliography, if appropriate. Finally, publishers will be able to 'download' the enhanced metadata for use in their own systems. The metadata will be stored and exchanged in several syntaxes, including HTML, SGML, UNIMARC and the national MARC formats of the participating libraries.

CORC

CORC (Co-operative Online Resource Catalog) is an OCLC research project exploring the co-operative creation and sharing of metadata by libraries. CORC integrates recent metadata initiatives such as Dublin Core with MARC, enabling a more flexible approach to record creation. CORC emphasises the importance of exporting the records in syntaxes usable on the Web (e.g. HTML, XML/RDF).


Community building

The gateway can build up a community of information providers. There may well be an overlap between providers and users of the gateway service, so this may be viewed as a marketing strategy. Traditional methods of dissemination (such as publishing, presentations, attending conferences) will form a basis for this activity. Growth of the community can be encouraged by invitational events for key players followed up by mailings and newsletters. A number of the eLib gateways in the UK have progressed from relatively simple catalogues of Internet resources to 'subject communities'. Depending on the business model by which the gateway is funded, membership of such a community of providers may confer benefits of preferential access costs or access credits.

E X A M P L E

Examples of gateways establishing links with information providers and building communities

EEVL, the engineering subject gateway, contains a range of information much wider than a search service; as well as a catalogue of selected 'quality' resources, it offers comprehensive searches of UK Engineering Web Sites, engineering e-journals and engineering newsgroups, and indexes to printed literature. As well as running the comprehensive Web site, EEVL organises training and awareness sessions.

SOSIG puts out calls to the social science community to request information regarding resources that they are publishing on the Internet. SOSIG now has good links with the academic social science community in the UK - as a result academics, government departments, the ESRC and others all send email to let SOSIG know when they put a new resource online. SOSIG has also run its own conference which brought together key information providers and users and established SOSIG at the centre of this community.

Biz/ed has responded to the most common information requests of its users by contacting key companies and organisations to request information. It has established links with organisations such as the Bank of England, the Office for National Statistics and Penn World Data. The gateway has created primary resources collaboratively with these organisations. Biz/ed has also contacted companies such as McDonalds, BMW and the Body Shop to ask for information to add to the gateway. See also: http://www.bized.ac.uk/virtual/


Benefits and costs
 

There are a number of potential benefits resulting from information providers' providing metadata:

  • cost saving
  • assistance in keeping metadata up to date
  • accuracy of details

These need to be balanced against:

  • need to apply quality assurance
  • effort spent supporting information providers
  • instituting and maintaining processes for inputting data remotely

Is this right for your gateway?
 

Some factors that may affect the emphasis the gateway gives to metadata supply by information providers:

  • what is the likely scale of information provider contribution?
  • how many individual resources will the information provider supply?
  • what level of enhancement to metadata will be required to meet quality control criteria?
  • is the service aiming at comprehensive coverage of an area?
  • are information provider contributions seen only as possible content for the gateway, or will information providers expect their data to be included? (There is a need to manage expectations.)

Conclusions
 

It is worthwhile building relationships with key information providers, especially as in many cases they are likely to be users of the information as well as contributors.

Gateways may judge that at present information providers cannot provide enough metadata to make it worthwhile setting up systems to import metadata. However, it seems likely that, as metadata standards mature, organisations owning resources will recognise the advantages of creating metadata for their own purposes, which may be for administration, rights management, marketing, their own resource discovery systems or to pass along the retail chain. Gateways need to be ready to take advantage of changes in the pattern of metadata creation when (if) this happens.

Gateways will need to move towards a viable business model for metadata creation to ensure their long-term sustainability.


Glossary
 

AHDS - Arts and Humanities Data Service
CORC - Co-operative Online Resource Catalog
DOI - Digital Object Identifier
Dublin Core - A metadata format, defined on the basis of international consensus, for the minimal description of information resources, generally for use in a WWW environment.
DutchESS - Dutch Electronic Subject Service
EEVL - Edinburgh Engineering Virtual Library
eLib - The Electronic Libraries Programme (UK)
ISSN - International Standard Serial Number
MARC - MAchine Readable Cataloguing. A family of formats based on ISO 2709 for the exchange of bibliographic and other related information in machine readable form. For example, USMARC, UKMARC and UNIMARC.
PURL - Persistent Uniform Resource Locator.
RDF - Resource Description Framework
SGML - Standard Generalized Markup Language
SOSIG - The Social Science Information Gateway
XML - Extensible Markup Language. A lightweight version of SGML designed for use on the Internet


References
 

P. B. Hansen & J. Hansen, INDOREG: INternet Document REGistration: project report (1997).
http://purl.dk/rapport/html.uk/

Credits
 

Chapter author: Rachel Heery

With contributions from: Emma Place



Last updated : 20 April 00
© 1999-2000 DESIRE