Aquarelle - Networked Cultural Information

George Mallen, System Simulation Ltd, United Kingdom
Mike Stapleton, System Simulation Ltd, United Kingdom

ichim99 - Cultural Heritage Informatics
Session: Archives and Museum Informatics


Abstract
Aquarelle was a European Union assisted project
designed to provide an information retrieval
service for searching across different cultural
database systems with differing database
architectures. A broker architecture was
implemented in which a central Access Server
received queries from user-clients, distributed
these to defined remote data servers, collected the
results and passed them back to the user-client. Each
query passed through a series of transformations as
it was encoded in various protocols. The principal
protocols were HTTP, Z39.50, SGML and the local
protocols used at the data servers. There
were two main types of data server, Archive Servers,
containing primary information about objects and
sites, and Folders, containing authored
information usually as SGML documents.

The project was a collaboration involving both
technical and data supply partners from the European
museum, gallery and monuments communities.
Technical development and demonstration ran from
1996 to 1998, preceded by a one-year feasibility
and proposal development study.

This paper describes the project and speculates on
its future application in networked cultural
information systems.

Authors

George Mallen is founder and managing director of
System Simulation Ltd, one of the technical partners
in the project.

Mike Stapleton is Technical Director of System
Simulation Ltd and responsible for SSL's
contribution to the project.

1. The Aquarelle Project

Policy makers in advanced economies are
increasingly aware that the Internet is a
potentially powerful infrastructure for
disseminating cultural and educational content.
The reasons for wanting to develop and disseminate
such content lie perhaps in the realisation that, in a
world of increasing globalisation, it is important
to anchor a sense of identity in local and regional
cultures but also to encourage an appreciation of
cultural diversity and an understanding of the value
and interdependence of different human cultures.
Such a view counterbalances a more prevalent fear
that the Internet is an agent of globalisation and
will lead to cultural homogeneity rather than
diversity. The technology itself is neutral on these
matters and the direction eventually taken will
depend on the successes and failures of emerging
policies and projects. For those of us who believe the
Internet is potentially a major new source and medium
for intercultural learning, the onus is on us to create
the technology to make it happen. Aquarelle is one such
effort. Its full title is "Aquarelle: Sharing
Cultural Heritage through Multimedia Telematics"
and it was an R&D project partly supported by the
Telematics Application Programme of the European
Union. It was initially set up through a close
collaboration of public institutions in four
countries, namely France, Greece, Italy and the UK.
The project eventually involved 22 partners from
institutions and technical organisations from
these countries. The project was co-ordinated by
ERCIM (the European Research Consortium for
Informatics and Mathematics), a consortium of the
main national research laboratories in these
subjects. The ERCIM member which ran the project day
to day was INRIA (Institut National de Recherche en
Informatique et en Automatique) near Paris. The
project website is maintained at INRIA at
http://aqua.inria.fr.

The Aquarelle partnership designed and
demonstrated an information system offering access
to varied cultural data repositories, mostly held by
public bodies but some privately held. A major
challenge which had to be addressed was the
requirement to provide access to legacy data which
had been created well before the emergence of the
Internet in its present form and which used very
varied database systems. The digital information
was also heterogeneous: databases with very
different schemas and terminologies, and a wide
range of document types such as multimedia
presentations on CD-ROM, ordinary office documents
created by various word processors and HTML
documents created for use on the WWW (Dawson, 1996).

Given this variety of source material and
technologies the project's goal was to allow culture
professionals such as museum curators, urban
planners, commercial publishers and researchers to
collect information relevant to their needs or
interests regardless of where the information is
held or how it is organised. In addition any author of
a given information component should be able to link
his or her work directly to other information assets
created by other authors. Linking, annotating and
commenting on relevant pieces of information from
different sources will add considerable value to the
content and the overall Aquarelle architecture was
designed to relieve users from the cumbersome manual
tasks of maintaining cross-references while also
supporting high precision referencing and
retrieval.

Aquarelle defined two main sources of cultural
information, namely, Archive Servers and Folders.
Archive Servers contain primary material like
museum object catalogues, associated images,
architectural drawings of historic buildings, maps
or text corpora. They provide information about
individual objects or sites. The model of an Archive
Server is designed so that an existing museum
collection documentation system, a photograph
library catalogue or a data service system could act
as an Aquarelle Archive Server. Such servers are expected to
be able to return a record about each object or site
held in the database. They follow a conventional
information retrieval model for database access.
The mapping between the Aquarelle access points and
the typically larger number of actual fields in the
host database is carried out at the Archive Server.
This enables the content owners' specific knowledge
of their material to be applied to give the most
appropriate mapping. It also simplifies the task of
maintaining mappings. Aquarelle Archive Servers
respond to the queries from the Access Server and
return the results of the searches. The dialogue
between the two is mediated by the Z39.50 protocol.

Folders are secondary or derived material
describing, commenting on and linking primary
material, for example multimedia essays on cultural
topics. Thus Folders are containers for
semantically linked archive data and Folders can
themselves be linked. They are SGML hypertext
documents typically providing information
relating to groups of objects (Bounne et al, 1997).
Aquarelle provides a unified interface for finding
and browsing folders in conjunction with object
information. The folder DTD includes a simplified
set of elements providing metadata describing the
contents of the folder. This metadata can be searched
directly in the same manner as object data held on
Archive Servers.

In many respects Folder Servers and Archive Servers
are treated simply as data servers serving different
types of content. They both communicate with the
Access Server via Z39.50. Folders themselves are
returned as SGML documents encapsulated in GRS-1
record syntax. Hyperlinks between folders and
records held on Archive Servers are mediated by the
Access Server. The Aquarelle Z39.50 profile has
elements for establishing and following such links.

The project was successfully demonstrated at the end
of 1998. The final architecture allowed users to
access Archive and Folder servers in the four
countries using standard web browsers via the
central Access Server. Communication between the
Access Server and the data servers used the Z39.50
protocol with a profile developed in collaboration
with CIMI, the Consortium for the Computer
Interchange of Museum Information. This profile
describes both functional aspects of the protocol
and the access points available for querying.


2. The Aquarelle Access Server

The Access Server lies at the heart of the Aquarelle
system. It provides search, retrieval and
presentation mechanisms to allow the information
held in the varied databases accessible by the system
to appear to the user as a coherent set of web pages. The
Access Server receives queries via the user's web
browser and broadcasts them in a suitable form to the
selected data servers. It collates the responses and
returns them to the user interface module. It also
retrieves folders and manages the links between
folders and between folders and the archive records.
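
The broadcast-and-collate behaviour described above
can be sketched as follows. This is an illustration
only: each data server is modelled as a plain Python
callable rather than a Z39.50 target, and the names
broadcast and DataServer are assumptions made for the
example, not part of the Aquarelle software.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List

# A data server is modelled as a callable that takes a query string and
# returns a list of records. A real server would be reached over Z39.50.
DataServer = Callable[[str], List[dict]]


def broadcast(query: str, servers: Dict[str, DataServer]) -> Dict[str, object]:
    """Send one query to every selected server and collate the responses."""
    hits: List[dict] = []
    failures: Dict[str, str] = {}
    with ThreadPoolExecutor(max_workers=max(1, len(servers))) as pool:
        # Submit all requests first so the servers are queried concurrently.
        futures = {name: pool.submit(fn, query) for name, fn in servers.items()}
        for name, future in futures.items():
            try:
                for record in future.result():
                    # Tag each record with the server it came from.
                    hits.append({"source": name, **record})
            except Exception as exc:
                # One failing server must not lose the other results.
                failures[name] = str(exc)
    return {"hits": hits, "failures": failures}
```

Recording failures per server, rather than raising,
reflects the broker's position: a slow or unavailable
archive should reduce the result set, not abort the
user's search.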

The Access Server also provides a range of central
services. It controls access to the Aquarelle system
through the user management functions which include
the storage and manipulation of user profiles. It
supports the services provided by the user client,
namely resource discovery, query handling, result
management, folder publication and one-to-one
connections with data servers through specific
functions. It provides a uniform interface to
Archive and Folder Servers based on the search and
retrieval protocol Z39.50. It also provides an
interface with a thesaurus browser to assist users to
select search terms. Finally it guarantees the
consistency of hyperlinks in folders through a link
management subsystem.

Thus the Access Server embeds a user-client server,
a user session manager, a Z39.50 client and a link
management subsystem. The user-client server
comprises a Web server and a set of CGI programs which
process the user requests, invoke the corresponding
functionality in the Access Server and encode the
returned data before sending it to the Web browser.
The user interface is a set of
static and dynamic HTML pages. The static pages are
accessed directly through the Web server, the
dynamic pages are generated on the fly by the CGI
programs.

To prepare a query the user is presented with a set of
HTML forms. A completed form is submitted by HTTP and
interpreted by CGI scripts in the user-client
module, which converts the query to AQL (the Aquarelle
Query Language) and passes it to the user session. The
user session then has the opportunity to modify the
query, for instance to apply various terminology
resources to translate or expand terms. The modified
query is still expressed in AQL and passed to the
Z Client. The Z Client converts the query from AQL to
Z39.50, using the Aquarelle profile, and broadcasts
it to the currently selected data servers. Each data
server translates the query into its native protocol
and responds to the Z Client with a GRS-1 record. The
GRS-1 record
can contain structured data or SGML documents. The
Z Client collates the responses and encodes them as
SGML, if necessary, and returns them to the user
session.

In order to display folders the Web browser must be
capable of displaying SGML documents. This can be
done in the current generation of browsers with the
aid of an appropriate "plug-in". The emerging
generation of browsers is XML compatible and so able
to present conforming Aquarelle folders without
such plug-ins.

Note that contemporary developments in this area,
such as the Dublin Core project (see Bibliography),
define metadata records which can act as a surrogate
for the primary object or digital record. Several
data services are based on holding surrogate records
for the primary data records centrally and searching
those rather than the primary data, for example
SCRAN, ADAM, ELISE and VAN EYCK (see Bibliography).

The Aquarelle Access Server does not hold surrogate
records at the object level. It can hold surrogate
records describing complete archives for the
purpose of providing finding aids and other
directory services. However the set of access points
can be seen as defining a virtual record which can act
as a surrogate for the primary data when querying.

Archive Servers can be implemented using either a
gateway or a dedicated server holding surrogate
records. Hence data services based on holding
surrogate records, such as those mentioned, fit
easily into the Aquarelle architecture where they
act as Archive Servers.
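
The query path described earlier in this section
(HTML form, AQL, term expansion in the user session,
then Z39.50) can be sketched in three small steps.
The paper does not give AQL syntax, so a list of
Clause objects stands in for it here. The attribute
numbers in USE_ATTRIBUTES, the thesaurus contents and
all function names are hypothetical, and the final
string is only a readable prefix notation standing in
for the encoded Z39.50 query.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Clause:
    access_point: str
    terms: Tuple[str, ...]  # alternative terms, OR-ed together


def form_to_query(form: Dict[str, str]) -> List[Clause]:
    """Step 1 (user-client): turn submitted form fields into a query."""
    return [Clause(k, (v.strip(),)) for k, v in sorted(form.items()) if v.strip()]


def expand_terms(query: List[Clause], thesaurus: Dict[str, List[str]]) -> List[Clause]:
    """Step 2 (user session): add equivalent terms from a terminology resource."""
    expanded = []
    for clause in query:
        terms = list(clause.terms)
        for term in clause.terms:
            for alt in thesaurus.get(term.lower(), []):
                if alt not in terms:
                    terms.append(alt)
        expanded.append(Clause(clause.access_point, tuple(terms)))
    return expanded


# Hypothetical attribute numbers; the real values are set by the profile.
USE_ATTRIBUTES = {"who": 1, "what": 2, "where": 3}


def to_prefix_notation(query: List[Clause]) -> str:
    """Step 3 (Z Client): render the query, AND-ing clauses together."""
    if not query:
        raise ValueError("empty query")
    rendered = []
    for clause in query:
        attr = USE_ATTRIBUTES[clause.access_point]
        parts = [f'@attr 1={attr} "{t}"' for t in clause.terms]
        expr = parts[0]
        for part in parts[1:]:
            expr = f"@or {expr} {part}"
        rendered.append(expr)
    result = rendered[0]
    for expr in rendered[1:]:
        result = f"@and {result} {expr}"
    return result
```

Keeping the expansion step separate from the final
encoding mirrors the description above, in which the
modified query is still expressed in AQL when it
reaches the Z Client.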


3. The User Model

Aquarelle provides users with a common vocabulary,
including a set of access points, for phrasing
queries that can be used to search across different
database systems with different data
architectures.

The Aquarelle cultural partners use a large number of
fields to store the data in their respective
databases[1]. Individual databases may use over two
hundred different fields to store the information at
the level of detail required by the specific
requirements of the host institution. Furthermore,
different institutions, particularly when from
different countries, have different approaches to
structuring the data. Early research showed that
there was little commonality at this level.
Searching these databases using the native fields as
access points provides the highest search precision
possible but can lead to frustratingly low recall if
the user is not familiar with the dataset. Although
Aquarelle is aimed at professionals, its users
cannot be expected to have detailed knowledge of the
target datasets. Accordingly, services like
Aquarelle need to provide higher level access points
for specifying queries. This approach improves
recall at the cost of reduced precision. The common
set of access points is mapped to the target datasets
to perform the actual query.
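
The trade-off between precision and recall can be
made concrete with a small worked example. The record
identifiers and result sets below are invented purely
for illustration; they are not measurements from the
Aquarelle partners' data.

```python
from typing import Set, Tuple


def precision_recall(retrieved: Set[int], relevant: Set[int]) -> Tuple[float, float]:
    """Return (precision, recall) for one query result.

    precision = relevant hits / everything retrieved
    recall    = relevant hits / everything relevant
    Either value is reported as 0.0 when its denominator is empty.
    """
    true_hits = len(retrieved & relevant)
    precision = true_hits / len(retrieved) if retrieved else 0.0
    recall = true_hits / len(relevant) if relevant else 0.0
    return precision, recall


# Invented record identifiers, for illustration only.
relevant = {1, 2, 3, 4}

# A query on one native field: exact, but it misses records that store
# the same fact in a different field.
native_field_hits = {1}

# The same query through a broader access point mapped to several fields:
# more relevant records are found, along with one irrelevant one.
access_point_hits = {1, 2, 3, 9}

print(precision_recall(native_field_hits, relevant))   # (1.0, 0.25)
print(precision_recall(access_point_hits, relevant))   # (0.75, 0.75)
```

In this invented case the native field query is
perfectly precise but finds only a quarter of the
relevant records, while the access point query gives
up some precision to find three quarters of them.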

At an Aquarelle project workshop the desirability of
multiple sets of access points, particularly in a
hierarchy, was raised by representatives of certain
user communities. At the workshop the following
generic levels of description were identified,
characterised by the approximate number of access
points.

Access points  Description

1              "Just search", regardless of access
               points and data types. This is the
               classic free-text retrieval approach.

<10            E.g. Who, What, Where, When; an
               appropriate starting point for low
               precision queries for public access or
               where the researcher has little
               knowledge of the subject domain.

20-30          Typical of general metadata schemes
               such as Dublin Core and CIMI (see
               Bibliography). Of use to researchers
               with a reasonable knowledge of the
               subject domain. Typical of library and
               archive management systems.

Approx. 100    Typical of core data standards such as
               CIDOC and SPECTRUM.

>200           Actual number of fields in use in full
               scale collection documentation
               systems. Useful to researchers with
               detailed knowledge of the dataset.

In order to provide a set of access points as the basis
of the common model of searching, the fields used by
the cultural partners were mapped to various
independently defined reduced sets of access
points. These included the sets defined by CIMI (the
Consortium for the Computer Interchange of Museum
Information) in Project CHIO (Cultural Heritage
Information Online; see Bibliography), by the Museum
Documentation Association in the SPECTRUM Museum
Documentation Standard, and in the CIDOC Core Data
Standard for Archaeological Sites and Monuments.

Project CHIO was of particular relevance: it aimed to
support a similar, though narrower, constituent
community to Aquarelle's. Approximately half of the
fields used by the Aquarelle cultural partners could
be mapped to 20 distinct access points from the
Project CHIO set. Although this set a limit to the
precision with which queries could be made, it was
felt appropriate considering the state of knowledge
of user requirements and other developments at the
time. The main areas of difference came about because
Project CHIO primarily addressed object
information while Aquarelle was also concerned with
information relating to sites and buildings.
Further work, revising the mapping and adding some
additional access points to the Project CHIO set,
resulted in a set of access points for the Aquarelle
system. Through liaison with CIMI the access points
were revised to meet the Aquarelle requirements.


4. Interoperability

Interoperability is an important aspect of the
Aquarelle project. There is currently considerable
activity in the area of providing "finding aids",
metadata and unified access methods for digital
resources (Bouthors et al, 1997; Taylor &
Stapleton, 1997; Weibel et al, 1995). Z39.50 has
emerged as one of the important technologies. It
seemed in the best interest of Aquarelle and other
projects using Z39.50 that there should be a high
degree of compatibility between their profiles, at
least in so far as they address the heritage sector.

In order to use Z39.50 as the communication protocol
between the Access Server and the data servers a
profile must be defined which identifies the subset
of the protocol that will be used. The profile
specifies implementation-dependent aspects of the
protocol, and adds semantics to the encoding. The
profile specifies various low-level
characteristics that govern how the communication
between servers and clients using the protocol takes
place, together with higher level attributes that
govern what information can be exchanged in this way.
The high-level attributes cover:

o The access points that identify the units of
information that can be queried.

o The query functionality.

o The content and the structure of the
information returned.

The Aquarelle Z39.50 Profile defines a "brief"
record, giving a selection of data elements for use in
a summary display of results, and a "full" record,
containing all the data elements the data server is
prepared to provide.
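
The distinction between the two record forms can be
sketched as a simple projection. The element names in
BRIEF_ELEMENTS are hypothetical; the actual element
lists are those laid down in the profile, and the
function name present is an assumption made for the
example.

```python
from typing import Dict, Tuple

# Hypothetical element names; the real lists are set by the profile.
BRIEF_ELEMENTS: Tuple[str, ...] = ("title", "creator", "date")


def present(record: Dict[str, str], element_set: str = "brief") -> Dict[str, str]:
    """Return the view of a record requested by the client."""
    if element_set == "full":
        # Everything the data server is prepared to provide.
        return dict(record)
    if element_set == "brief":
        # Only the elements needed for a summary display of results.
        return {name: record[name] for name in BRIEF_ELEMENTS if name in record}
    raise ValueError(f"unknown element set: {element_set!r}")
```

A result list would typically be built from brief
records, with the full record fetched only when the
user selects an individual item.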

It is essential that Aquarelle clients and servers
implemented by different bodies are able to
interoperate with one another. The Aquarelle Z39.50
Profile primarily addresses this central
requirement.

For Aquarelle purposes alone the profile could be
restricted to those facilities that are needed to
support the user interface. Unusually in a Z39.50
context, there is only one client for the Aquarelle
Z39.50 Profile, namely, the Access Server. The
Aquarelle Profile defines the Z39.50 entities that
the Access Server will generate and what it will
accept in return. It is not necessary for an Aquarelle
data server to handle any entities other than those
handled by the Access Server. However, in order to
increase the acceptability of the Aquarelle system
and, hence, the availability of data servers, it was
thought desirable that the Aquarelle Access Server
could connect to existing servers and that data
servers configured for Aquarelle could serve other
uses. This would be achieved by adopting a profile
compatible with other profiles in the heritage
sector.

Earlier work had already identified the relevance of
the CIMI Project CHIO (see above) and it was decided to
work with CIMI to produce a joint profile that would
serve the needs of both communities. Aquarelle uses
the CIMI Z39.50 Profile (see Bibliography) subject
to an "Implementers' Agreement" which further
defines certain aspects.

Interoperation with non-Aquarelle systems has two
aspects:

1. Compatibility with non-Aquarelle servers

For an existing Z39.50 server to be able to act
as an Aquarelle server the Aquarelle profile
must be a subset of the server profile or at
least the Access Server must be tolerant of
unsupported elements in the target server.

2. Compatibility with non-Aquarelle clients

For an Aquarelle server to be able to act as a
server to other Z39.50 clients it must make an
acceptable response to all requests issued by
the client.
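
The first condition above, that the Aquarelle
profile be a subset of the server profile or that the
Access Server tolerate unsupported elements, can be
sketched as two small checks. Profiles are reduced
here to sets of access point names, which is a
simplifying assumption; the function names are
hypothetical.

```python
from typing import Dict, FrozenSet, List, Tuple


def profile_is_subset(client_profile: FrozenSet[str],
                      server_supports: FrozenSet[str]) -> bool:
    """True when every access point the client may send is supported."""
    return client_profile <= server_supports


def tolerant_query(query: Dict[str, str],
                   server_supports: FrozenSet[str]) -> Tuple[Dict[str, str], List[str]]:
    """Drop clauses the target cannot handle instead of failing outright.

    Returns the clauses that can be sent and the names of those dropped,
    so that the caller can tell the user the search was narrowed.
    """
    sendable = {ap: term for ap, term in query.items() if ap in server_supports}
    dropped = sorted(ap for ap in query if ap not in server_supports)
    return sendable, dropped
```

The strict check suits configuration time, when a new
server is registered; the tolerant path is the
fallback at query time for servers that support only
part of the profile.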

Experience from various projects using Z39.50 to
provide a common means of access to heterogeneous
databases shows that full interoperability cannot
be assumed until the systems have been tested against
one another and in conjunction with other systems.
The Aquarelle project's use of the CIMI Profile
increases the likelihood of successful
interoperation without undue effort.

A further common experience, shared by the Aquarelle
project, is that once the technical aspects of
interoperation are resolved a new set of issues
arises to do with the interpretation of the access
points. The detail of the mapping between the host
data set and the common data model implicit in the
profile will inevitably need to be tuned once the
results of searching across diverse databases are
visible. This cannot be done by the technical
partners as it requires the development of a common
understanding within the user and content-owning
community. The Aquarelle approach puts the
responsibility for these mappings with the Archive
Servers with the aim of empowering the content owners
to maintain the mappings in response to discussion
amongst the user community.


5. Next Steps

The Aquarelle project has produced a functioning
system which met the original objectives and
demonstrated the technical viability of such
systems. The cultural partners provided a vital
contribution by hosting data servers and by
providing user requirements and feedback. The
project was successfully reviewed by the EU review
processes and now looks to future exploitation. So
what form will that take? We must note that, during the
period of the project, the Web has grown very quickly
and that cultural, educational and e-commerce
applications are much more common and better
understood than they were at the time the Aquarelle
project was conceived. Thus, given the technical
success of the demonstrator, the question to be
addressed is whether the original vision is still
relevant in a much changed computing environment.

The answer is an unequivocal affirmative. The
overall principle of a broker architecture is
emerging rapidly as a generic model for the effective
exploitation of resources on the Internet. Resource
directories and specialist portals are now fairly
common in education and e-commerce. The need within
the cultural sector for Aquarelle type systems is, we
believe, more pressing now than it was 5 years ago.
More and more museums are implementing collection
information systems on different database
platforms. For example, in the UK government policy
demands that museums have well-founded IT systems
for documentation support as integral parts of
museum quality assurance and registration
processes.

In other countries, particularly as Europe expands
to the East, museums and galleries are, and will be,
busy creating digital assets. The Aquarelle vision
was designed precisely for this situation. By
providing coherent interlinking and common access
interfaces it would catalyse the accessibility of
cultural information via expanding network
technologies and make information available to
colleagues in other cultural institutions and the
education sector. Note that the goal of the Aquarelle
system was support for culture sector
professionals, not the public or basic educational
applications. Yet the hope was to facilitate access
to primary object information so that information
could be used in a wider context and be incorporated
into descriptive content in folders which might then
become source material for further interpretation
for public and general educational application. The
Aquarelle project demonstrated that the technology
infrastructure for this approach could be made to
work. Its goal now must be to demonstrate that the
"trickle-down" information transformation process
from object descriptions to folders, which the
technology facilitates, is not only of real benefit
to culture sector professionals in their work of care
and scholarship but also provides a growing pool of
new source material which can be used by authors,
teachers, publishers and artists to create further
interpreted, added-value content for public access
and education.

Proposals on how to test the information process
model are under consideration for presentation to
the new Framework 5 programme of the European Union.
Though not yet complete these will almost certainly
comprise three main components: first, a further
technical step to make a version of the Access Server
which can be distributed free, possibly on an "open
source" basis; second, building an information
provider community committed to extensive
experimental use of the system to explore the use of
archive and folder content in their professional
activities; and third, involving a further
community of authors, teachers, publishers and
artists to evaluate the use of the archives and
folders for the creation of material suitable for
public access and educational use.


Acknowledgements

The authors gratefully acknowledge the material and
assistance provided by the Aquarelle partnership,
particularly the support of Alain Michard, the
project co-ordinator.

The views and opinions expressed are, however, our
own and do not necessarily represent the collective
view of the Aquarelle partnership.



Bibliography

ADAM, The Information Gateway for Art, Design,
Architecture and Media.
http://www.adam.ac.uk

Aquarelle, the project website is at
http://aqua.inria.fr

Bounne C and Vassilis Christophides, Martin Doerr,
Eddy Fras, Irene Fundulaki, A Kementsietsides,
Y Velegrakis. (1997). Aquarelle Folder Server and
Editor: technical description. Aquarelle
Deliverable 6.4, available from the Aquarelle
document repository.

Bouthors V and Jean-Yves Dupuis, Nhan Tran Huu.
(1997). Z39.50 Gateway for Mistral: technical
description. Aquarelle Deliverable 5.2, available
from the Aquarelle document repository.

CIMI Profile Development Working Group. (1996). The
CIMI Profile: Z39.50 Application Profile
Specifications for Use in Project CHIO.
Available at
http://lcweb.loc.gov/z3950/agency/profiles/
ftp://ftp.cimi.org/pub/cimi/CIMI_Profile/

Dawson D. (1996). Data Structures in Cultural
Databases: A survey. Aquarelle Deliverable 5.5,
available from the Aquarelle document repository.

Dublin Core Metadata Element Set: Reference
Description, revision of January 15, 1997.
Available at
http://purl.org/metadata/dublin_core_elements

ELISE, Electronic Library Image Server for Europe.
http://severn.dmu.ac.uk/elise/
EC DGXIII, Project 1008.

SCRAN (Scottish Cultural Resources Access
Network).
http://www.scran.ac.uk

Taylor M and Mike Stapleton. (1997). Z39.50 version
of Index+: technical description. Aquarelle
Deliverable 5.3, available from the Aquarelle
document repository.

VAN EYCK.
http://www.hart.bbk.ac.uk/van_eyck.html
EC DGXIII, Project 1054.

Weibel S and Jean Godby, Eric Miller, Ron Daniel.
(1995). OCLC/NCSA Metadata Workshop Report.
Available from
http://www.oclc.org:5046/oclc/research/conferences/metadata/dublin_core_report.html



Publication:

G. L. Mallen and M. J. Stapleton, System Simulation, Ltd., UK: "Aquarelle : Networked Cultural Information" in Cultural Heritage Informatics 1999: selected papers from ichim99, Edited by David Bearman and Jennifer Trant.