-------------------------------------------------------------
By Matthew Addis, Paul Lewis and Kirk Martinez - June 2002
Matthew Addis, Paul Lewis, Kirk Martinez and other members of the ARTISTE consortium review its achievements in developing an image retrieval system based on metadata and content that explores and analyses thousands of images from major art galleries across Europe.
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
ARTISTE [1] is a European Commission-funded collaboration, investigating the use of integrated content and metadata-based image retrieval across disparate databases in several major art galleries across Europe. Collaborating galleries include the Louvre in Paris, the Victoria and Albert Museum in London, the Uffizi Gallery in Florence and the National Gallery in London.
Museums and galleries often have several digital collections ranging from public access images to specialised scientific images used for conservation purposes. Direct access from one gallery to another is currently uncommon for textual data and almost unheard of in terms of image-based search and retrieval. Cross-collection access is nevertheless recognised as important, for example to compare the treatments and conditions of Europe's paintings, which form a core part of our cultural heritage.
A key aim of ARTISTE is to provide an image retrieval system that supports integrated cross-collection searching. Whilst ARTISTE is primarily designed for inter-museum searching and as a building block for public access systems, it could equally be applied to museum intranets.
An article on ARTISTE in the first issue of Cultivate [2] presented the project objectives and technical approach. Now that ARTISTE is nearing completion, this article looks at how those objectives have been fulfilled and discusses future work to continue and build upon the achievements of the project.
The ARTISTE system currently holds over 60,000 images from four separate collections belonging to the Uffizi, C2RMF (restoration centre for French museums including the Louvre), National Gallery and Victoria and Albert Museum. Although these collections are stored in separate databases and all have their own unique schema for the metadata that describes their contents, ARTISTE makes it possible to search quickly and transparently as if they were a single entity.
ARTISTE has been well received by the user members within the consortium. Further feedback has been obtained after a scaled down version [3] was made available to the 70 members of the ARTISTE Interest User Group (AIUG) [4] as a publicly accessible dissemination system. The most notable features of ARTISTE include:
Users of the ARTISTE system search for images using a query wizard. The wizard prompts the user with self-explanatory, non-technical questions and permits forward and backward movement through the search-generation process. In this way, users of ARTISTE can quickly and simply build up sophisticated queries without needing to understand which algorithms are being used or why.
Since ARTISTE is a distributed system of servers that are able to communicate with each other, a user starts the query process by selecting one or more image collections.
Figure 1: Selection of Image Collections
The user then defines the detailed aspects of the query, which might combine content-based retrieval with metadata searching (these methods are described later in this article). When users are happy with the query they submit it to the system. They can ask to be notified by e-mail when the query is complete.
Figure 2: Final Step: Search Summary
ARTISTE then distributes the query to the chosen collections and collates the results as they come back. While the query is executing, users are given constant updates on its progress.
Figure 3: Query Executing
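As a rough illustration of this distribute-and-collate step, the Python sketch below sends one query to several collection servers in parallel, reports progress as each one replies, and merges the hits in order of increasing distance. The function name `federated_search`, the `on_progress` callback and the stand-in search functions are hypothetical; the article does not describe the protocol ARTISTE actually uses between its servers.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, Dict, List, Optional, Tuple

# A hit is (distance, collection name, local image id).
Hit = Tuple[float, str, str]
# A collection search takes the query and returns (image id, distance) pairs.
SearchFn = Callable[[str], List[Tuple[str, float]]]


def federated_search(
    query: str,
    collections: Dict[str, SearchFn],
    on_progress: Optional[Callable[[int, int], None]] = None,
) -> List[Hit]:
    """Run the query on every chosen collection and collate the results."""
    hits: List[Hit] = []
    with ThreadPoolExecutor(max_workers=max(1, len(collections))) as pool:
        futures = {pool.submit(search, query): name
                   for name, search in collections.items()}
        # Collate results as they come back, in whatever order servers finish.
        for done, future in enumerate(as_completed(futures), start=1):
            name = futures[future]
            hits.extend((distance, name, image_id)
                        for image_id, distance in future.result())
            if on_progress is not None:
                on_progress(done, len(futures))
    # Closest matches first, regardless of which collection they came from.
    return sorted(hits)
```

Each value in `collections` would in practice wrap a network call to one gallery's server; here any callable with the same shape will do.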
The results of a search are shown on successive pages of 'thumbnails' with either 9, 15 or 21 images per page. The use of thumbnails allows search results to be navigated quickly over the Internet.
Figure 4: Thumbnail results
The images are ordered according to how closely they match the query. Clicking on 'more info' for a particular thumbnail retrieves the full size image.
Figure 5: Full size image
| Distance | 0.0 |
| Collection | VAM |
| ARTISTE Image ID | 12642 |
| Local Image Id | pcd8839390910005-008 |
The results of all queries are stored in the system so users can go back to work they have done in the past.
Figure 6: Search history
The results of previous queries can also be used as input to new queries to allow more refined searches to be made.
ARTISTE allows users to specify the type of image search they wish to perform. Some examples are shown below. Technical descriptions of some of the algorithms are presented later in this article.
A similar colour search uses the appropriate algorithm to find images that have a similar distribution of colour to the image submitted. The algorithm is automatically selected by ARTISTE, depending on whether the user is looking for colour or black and white images and also on whether a colour or black and white image is submitted.
If the user is looking for colour images and a colour image is submitted, then the algorithm selected will be 'Colour Coherence Vector' - CCV. Alternatively, if the user is either searching for colour images but submits a black and white image, or is searching for black and white images, regardless of whether a colour or black and white image is submitted, the algorithm used will be the 'Mono-Histogram'.
An example of a search for an image of similar colour is shown below. In this example, the query image is also contained within the database, so unsurprisingly it is found in first place. The rest of the retrieved results contain areas of contiguous colour similar to that of the query image. This includes the background since the algorithm, unlike a human, has no way of determining what is the subject and what is a backdrop.
Figure 7: Similar colour search
Typically a similar colour search would be used in combination with a metadata-based search. For example, if a metadata search for the word 'vase' was used in conjunction with this similarity search, only images of vases that also have a similar colour distribution would be retrieved.
A similar pattern search uses an appropriate algorithm to find images that have a similar pattern to the image submitted. The algorithm used is the 'Pyramid Wavelet Transform' - PWT and matching is based only on the texture, i.e. repeating patterns, in the whole image. The example below shows the results obtained when searching for similar textures. In this case, the dataset contains a set of fabrics, and the similarity to the repeating pattern of the query fabric shows up clearly in the results.
Figure 8: Similar pattern search
This type of search is also appropriate in a painting restoration context. Below is an example of a search for paintings with a similar layout of 'stretchers' (wooden planks) on the back of the painting.
Figure 9: Query image, stretchers
Figure 10: Stretchers Results page
The results from using this algorithm could be used as a measure of how many wooden planks are affecting the presence of damage such as cracks on the painting surface.
A query image may be a sub-image of an image within the database. The requirement is not only to identify from which parent image the query is derived but also to locate its position in the parent image. Some of the images in the collection are very large (up to 800 Mbytes) and also very high-resolution (20 pels/mm), demanding special purpose algorithms for effective handling. Since the query may have been recorded at a significantly different resolution from its parent and in a different state of restoration, or simply under different lighting conditions, robust algorithms are required. A multi-scale search technique based on colour coherence vectors has been developed and has given useful results.
If the user is looking for colour images and a colour image is submitted, then the algorithm selected will be 'Multi-Scalar Colour Coherence Vector' - MCCV. Alternatively, if the user is either searching for colour images but submits a black and white image, or is searching for black and white images, regardless of whether a colour or black and white image is submitted, the algorithm used will be the 'Multi-Scalar Mono-Histogram'.
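The selection rule stated here, and the matching rule given earlier for whole-image colour searches, can be summarised in a few lines of Python. This is only a restatement of the rule as described in the text; the function and parameter names are hypothetical and do not come from the ARTISTE code.

```python
def select_colour_algorithm(wants_colour: bool,
                            query_is_colour: bool,
                            sub_image: bool = False) -> str:
    """Pick the colour-matching algorithm from the two user-visible facts.

    wants_colour:    the user is looking for colour images
    query_is_colour: the submitted query image is in colour
    sub_image:       the search is for a sub-image inside larger images
    """
    # The colour algorithm is used only when both sides are in colour.
    use_colour = wants_colour and query_is_colour
    if sub_image:
        return "MCCV" if use_colour else "Multi-Scalar Mono-Histogram"
    return "CCV" if use_colour else "Mono-Histogram"
```

Every combination other than "colour wanted, colour submitted" falls through to the monochrome variant, which is why a single boolean is enough.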
The example query below shows the quality of results obtained. The matching is based only upon the general colour layout, where all the retrieved results contain areas of contiguous skin-like colour and areas of contiguous dark brown, similar to the colours in the query. To help the reader, we have indicated with a green rectangle the area in each result image that matches the query image. In this example it is actually the second result that contains the query image.
Figure 11: Sub-image results
ARTISTE provides a colour-picking tool so that users can define one or more colours that they want to use as the basis of a search. The colours need not correspond to sizeable regions of an image since the algorithm used is based on how similar the colours in the image are compared to the colour selection.
The example below shows the results of a query where a particular shade of red has been defined using the colour picker tool (the interface to the tool is shown as the query).
Figure 12: Colour search results
ARTISTE can attempt to retrieve images in a collection that match low quality monochrome query images, for example a facsimile of a painting that might be in the database. The retrieved images have a similar layout of dark and light pixels to the query image. An example of query by fax is shown below. An explanation of how the query image was analysed using PWT is given in the section Query by Fax.
Figure 13: Fax image results
ARTISTE supports a wide variety of image analysis algorithms, some of which are described below. These form the basis of content-based retrieval. In some cases, the algorithms can be combined into composite queries and a normalised distance measure for each algorithm is used to determine the overall match of a result image to the query.
We have already looked at PWT, an algorithm that can be used to locate wooden planks, particularly at the back of paintings.
Another algorithm aims to detect the presence of cracks on a painting. X-ray images are used instead of conventional surface images, as they expose more clearly the structure of the cracks. These techniques are typically combined with a metadata-based search to limit the content-based search to a sub-class of the total image collection, for example images taken using x-rays or images of the backs of paintings.
In our system, each algorithm is applied to the images in the collection to generate a set of image content descriptors called feature vectors. A feature vector can be considered as a way of indexing an image to describe an aspect such as colour distribution or texture. The feature vectors are then integrated and stored with the text metadata for each image in the collection database. When a search needs to be made, the required algorithm (e.g. CCV) is run on the query image to create a query feature vector. This query feature vector is then compared with all the corresponding feature vectors for the images in the collection. The comparison of feature vectors results in a measure of distance between the query image and each image in the collection. The images in the collection are then returned to the user as a series of thumbnails in order of increasing distance.
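The search loop described above can be sketched as follows. The sketch assumes that feature vectors are plain lists of numbers and that distance is Euclidean; the article states a Euclidean measure only for PWT, so its use here for every algorithm is an assumption, as are the names `euclidean` and `rank_by_distance`.

```python
import math
from typing import Dict, List, Sequence, Tuple


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Straight-line distance between two feature vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def rank_by_distance(
    query_vector: Sequence[float],
    stored_vectors: Dict[str, Sequence[float]],
) -> List[Tuple[str, float]]:
    """Compare the query vector with every stored vector, closest first."""
    scored = [(image_id, euclidean(query_vector, vector))
              for image_id, vector in stored_vectors.items()]
    return sorted(scored, key=lambda pair: pair[1])


# Hypothetical pre-computed feature vectors, keyed by image id.
stored = {"a": [0.0, 1.0], "b": [1.0, 0.0], "c": [0.9, 0.1]}
ranking = rank_by_distance([1.0, 0.0], stored)
print([image_id for image_id, _ in ranking])  # prints ['b', 'c', 'a']
```

Because the stored vectors are computed once, when images are added to the collection, only the query image needs to be analysed at search time.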
Some of the individual algorithms are explained in more detail below.
The colour histogram-matching algorithm simply uses the frequency of occurrence of each colour of the histogram within the image. The more of a particular colour an image contains, the higher its frequency will be within the histogram.
The histogram is made with 64 bins, a compromise between speed (it takes longer to match more bins) and accuracy (the fewer bins, the less discriminating the results would be, i.e. images which are less similar would have lower distances).
Before a histogram is used for colour matching it is normalised by the number of pixels in the image. This means that colour matching is not influenced by the size of the images that are being compared.
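A minimal sketch of such a histogram is given below. It assumes 8-bit RGB pixels and that the 64 bins come from quantising each channel to 4 levels (4 x 4 x 4 = 64); the article does not state the binning scheme, so that choice, like the function name, is an assumption.

```python
from typing import Iterable, List, Tuple

Pixel = Tuple[int, int, int]  # (red, green, blue), each 0..255


def colour_histogram(pixels: Iterable[Pixel], levels: int = 4) -> List[float]:
    """Return a normalised colour histogram with levels**3 bins."""
    step = 256 // levels          # width of one quantisation level
    bins = [0] * (levels ** 3)
    count = 0
    for r, g, b in pixels:
        index = (r // step) * levels * levels + (g // step) * levels + (b // step)
        bins[index] += 1
        count += 1
    if count == 0:
        raise ValueError("image has no pixels")
    # Dividing by the pixel count makes the result independent of image size.
    return [n / count for n in bins]


small = [(255, 0, 0)] * 2 + [(0, 0, 255)] * 2
large = small * 100
print(colour_histogram(small) == colour_histogram(large))  # prints True
```

The two test images differ in size by a factor of 100 but contain the same proportions of red and blue, so their normalised histograms are identical.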
Figure 14: Histogram comparison
Two examples of colour histograms are shown above. Note, in the second example, how the background is dominant in the image. Because the colours of the pots are spread fairly evenly over the bins, and the background is predominantly one colour and therefore one bin of the histogram, once the histogram is normalised the background dominates the vector. This is the main drawback of histogram matching: the background information is all included within the feature and it cannot be ignored.
In the first example, the background is more evenly distributed over bins (due to the shading) and the object groups into only a few bins (due to its flat colour distribution). Therefore the background does not dominate the histogram so much.
Figure 15: Monochrome-man
Figure 16: Monochrome histogram
The monochrome histogram-matching algorithm simply uses the frequency of occurrence of each level of brightness of the histogram within the image. The more of a particular brightness an image contains, the higher its frequency will be within the histogram. Colour images in the database are converted from RGB to monochrome before matching with this algorithm. A monochrome histogram is required because the colour histogram is not discriminating enough for monochromatic images: a 64 bin colour histogram has only 4 bins dedicated to grey-scale values, so most grey-scale images would look similar in a colour histogram.
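The point about grey-scale values can be checked with a short sketch. It assumes the 64-bin colour histogram quantises each RGB channel to 4 levels, and it converts RGB to brightness with the common 0.299/0.587/0.114 luma weights; neither detail is given in the article, so both are assumptions.

```python
from typing import Iterable, List, Set, Tuple

Pixel = Tuple[int, int, int]


def to_grey(r: int, g: int, b: int) -> int:
    """Convert an RGB pixel to one brightness value in 0..255."""
    return min(255, int(round(0.299 * r + 0.587 * g + 0.114 * b)))


def mono_histogram(pixels: Iterable[Pixel], bins: int = 64) -> List[float]:
    """Normalised brightness histogram; every bin covers 256 // bins levels."""
    step = 256 // bins
    counts = [0] * bins
    total = 0
    for r, g, b in pixels:
        counts[to_grey(r, g, b) // step] += 1
        total += 1
    if total == 0:
        raise ValueError("image has no pixels")
    return [c / total for c in counts]


def grey_bins_in_colour_histogram(levels: int = 4) -> Set[int]:
    """Bins of a levels**3 colour histogram that pure greys can fall into."""
    step = 256 // levels
    return {(v // step) * (levels * levels + levels + 1) for v in range(256)}


print(sorted(grey_bins_in_colour_histogram()))  # prints [0, 21, 42, 63]
```

All 256 grey levels land in just four colour bins, whereas the monochrome histogram spreads them over all 64.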
One problem with simple histogram matching is that no consideration is given to whether colour occurs in contiguous regions, i.e. large blocks, or is fragmented into many small areas. A CCV (Colour Coherence Vector) algorithm is used to address this problem. A coherent region of colours in an image is a region of colour that is larger than some threshold. The algorithm retrieves images which have similar distributions of coherent colours.
A histogram of 64 bins is generated for both coherent and incoherent colours and these are matched separately. Coherence and incoherence are arbitrarily defined as greater and less than 5% of the total image area, respectively. This means if a pixel is part of a region that is less than 5% of the total image area it is added to the incoherent histogram within the CCV. For example, the chessboard image below is 50% black and 50% white, arranged into 64 squares, where 32 are white and 32 are black. Each square constitutes a region, each of which is 1/64th of the total image area. This is approximately 1.6% of the image and hence is considered incoherent. If a pixel is part of an area that is greater than 5% of the total image area it is added to the coherent histogram within the CCV. For example, the second image contains the same amount of black and white (50% of each) as the first example, but this time the black and white areas are contiguous.
Figure 17: Incoherent regions
Figure 18: Coherent regions
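The chessboard example can be reproduced with a small sketch. The image is assumed to be already quantised, so each cell holds a colour bin number, and regions are found by flood fill over horizontally and vertically adjacent cells. The connectivity rule and the treatment of a region of exactly 5% (counted as incoherent here) are assumptions, since the article does not specify them.

```python
from collections import deque
from typing import Dict, List, Tuple


def coherence_vector(image: List[List[int]],
                     threshold: float = 0.05) -> Dict[int, Tuple[float, float]]:
    """Map each colour bin to (coherent fraction, incoherent fraction)."""
    height, width = len(image), len(image[0])
    total = height * width
    seen = [[False] * width for _ in range(height)]
    counts: Dict[int, List[int]] = {}
    for y in range(height):
        for x in range(width):
            if seen[y][x]:
                continue
            colour = image[y][x]
            # Flood fill to measure the connected region containing (y, x).
            queue = deque([(y, x)])
            seen[y][x] = True
            size = 0
            while queue:
                cy, cx = queue.popleft()
                size += 1
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and not seen[ny][nx] and image[ny][nx] == colour):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            pair = counts.setdefault(colour, [0, 0])
            # Regions larger than the threshold count as coherent.
            pair[0 if size > threshold * total else 1] += size
    return {c: (coh / total, inc / total) for c, (coh, inc) in counts.items()}


chessboard = [[(x + y) % 2 for x in range(8)] for y in range(8)]
halves = [[0] * 4 + [1] * 4 for _ in range(8)]
print(coherence_vector(chessboard))  # prints {0: (0.0, 0.5), 1: (0.0, 0.5)}
print(coherence_vector(halves))      # prints {0: (0.5, 0.0), 1: (0.5, 0.0)}
```

Both images are half black and half white, so a plain histogram cannot tell them apart; the coherence vector separates them completely.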
Finally, because the colour histogram, monochrome histogram and CCV algorithms only consider the characteristics of an image as a whole, they are not suitable for searches that look for a sub-image (query image) within larger images (images in the collection). To address this problem a Multi-Scalar version of CCV has been developed. As shown in the 'pyramid' graphic below, the algorithm divides the image into a number of tiles (e.g. regions divided by the white lines in the bottom level of the pyramid) for each of a number of resolutions (the three levels of the pyramid). Both the query image and the collection images are converted into such a pyramid structure, and then each of the tiles in the query image is compared against each of the features for the tiles in the database image using the CCV matching algorithm.
A similar method has been applied to the monochrome histogram-matching algorithm.
Figure 19: Pyramid graphic
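The pyramid idea can be illustrated with a simplified sketch. It works on grey-scale images, uses a 4-bin brightness histogram per tile as a stand-in for the CCV feature, treats the whole query as a single tile, and builds each coarser level by averaging 2x2 blocks. All of these simplifications, and every name in the code, are assumptions made for brevity rather than details of the ARTISTE algorithm.

```python
from typing import List, Optional, Tuple

Image = List[List[int]]  # grey values 0..255, row by row


def downsample(image: Image) -> Image:
    """Halve the resolution by averaging each 2x2 block."""
    h, w = len(image) // 2, len(image[0]) // 2
    return [[(image[2 * y][2 * x] + image[2 * y][2 * x + 1]
              + image[2 * y + 1][2 * x] + image[2 * y + 1][2 * x + 1]) // 4
             for x in range(w)] for y in range(h)]


def histogram(image: Image, bins: int = 4) -> List[float]:
    """Normalised brightness histogram used as the per-tile feature."""
    step = 256 // bins
    counts = [0] * bins
    for row in image:
        for value in row:
            counts[value // step] += 1
    total = sum(counts)
    return [c / total for c in counts]


def best_tile(query: Image, target: Image, tile: int,
              levels: int = 3) -> Tuple[float, int, int, int]:
    """Return (distance, level, x, y) of the closest tile, with x and y
    expressed in full-resolution pixel coordinates."""
    wanted = histogram(query)
    best: Optional[Tuple[float, int, int, int]] = None
    current = target
    for level in range(levels):
        h, w = len(current), len(current[0])
        scale = 2 ** level
        for y in range(0, h - tile + 1, tile):
            for x in range(0, w - tile + 1, tile):
                patch = [row[x:x + tile] for row in current[y:y + tile]]
                dist = sum(abs(a - b) for a, b in zip(wanted, histogram(patch)))
                if best is None or dist < best[0]:
                    best = (dist, level, x * scale, y * scale)
        if h < 2 or w < 2:
            break
        current = downsample(current)
    if best is None:
        raise ValueError("target image is smaller than one tile")
    return best
```

For a dark 8x8 target with a bright 4x4 block in its top-right corner and a bright 4x4 query, `best_tile(query, target, tile=4)` returns `(0.0, 0, 4, 0)`: an exact match at full resolution, located at x = 4, y = 0. Recording the tile position alongside the distance is what allows the matching area to be outlined in the parent image.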
PWT allows retrieval of similar images based on the general texture distribution of the image. In this context, image texture refers to repeating patterns throughout the whole image.
The PWT decomposes an image using a wavelet transform. This can be thought of as similar to a Fourier transform, which transforms the image domain into a frequency domain. The frequency components of the image are analysed and a number of descriptors generated which represent the amounts of a discrete number of frequencies in the image.
Figure 20: Decomposition in image domain
Figure 21: Decomposition in frequency domain
Images are resized to 512x512 to perform this decomposition, which yields 22 frequency descriptors for an image. This makes the matching very fast. The comparison is achieved using a standard Euclidean distance measure.
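To make the decomposition more concrete, here is a minimal sketch of a pyramid wavelet decomposition using the Haar wavelet, with the mean absolute value of each sub-band as its descriptor. The choice of wavelet, the energy measure and the function names are all assumptions. Each level contributes three detail bands, plus one final approximation band, so `levels=7` happens to give 22 values; whether ARTISTE arrives at its 22 descriptors in this way is not stated in the article.

```python
from typing import List, Tuple

Band = List[List[float]]


def haar_step(image: Band) -> Tuple[Band, Band, Band, Band]:
    """Split an image into one half-size approximation and three detail bands."""
    h, w = len(image) // 2, len(image[0]) // 2
    approx, left_right, top_bottom, diagonal = (
        [[0.0] * w for _ in range(h)] for _ in range(4))
    for y in range(h):
        for x in range(w):
            a, b = image[2 * y][2 * x], image[2 * y][2 * x + 1]
            c, d = image[2 * y + 1][2 * x], image[2 * y + 1][2 * x + 1]
            approx[y][x] = (a + b + c + d) / 4      # local average
            left_right[y][x] = (a - b + c - d) / 4  # left column minus right
            top_bottom[y][x] = (a + b - c - d) / 4  # top row minus bottom
            diagonal[y][x] = (a - b - c + d) / 4    # diagonal difference
    return approx, left_right, top_bottom, diagonal


def energy(band: Band) -> float:
    """Mean absolute coefficient of a sub-band."""
    return sum(abs(v) for row in band for v in row) / (len(band) * len(band[0]))


def pwt_descriptors(image: Band, levels: int) -> List[float]:
    """Decompose the approximation band repeatedly; 3 * levels + 1 values."""
    features: List[float] = []
    current = image
    for _ in range(levels):
        current, lr, tb, dg = haar_step(current)
        features.extend(energy(band) for band in (lr, tb, dg))
    features.append(energy(current))
    return features


stripes = [[0.0 if x % 2 == 0 else 255.0 for x in range(8)] for _ in range(8)]
print(pwt_descriptors(stripes, levels=3)[:3])  # prints [127.5, 0.0, 0.0]
```

The striped test image puts all of its detail energy into the finest left/right band, which is the kind of signature that lets two images with the same repeating pattern end up close together under a Euclidean distance.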
The query by fax is based upon a set of PWT measures of the image at various threshold levels of a monochrome instance of the image. A Query by Fax feature vector consists of 99 PWT features at various levels of threshold (between 1% black and 99% black) of the image.
Figure 22: Query by Fax: Database Images converted to 99 PWT levels on left, query image on right
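The thresholding that underlies this feature vector can be sketched as follows: a grey-scale image is turned into 99 black and white versions, from about 1% black to about 99% black. Only this step is shown. Computing PWT on each version and the matching procedure itself are not reproduced, and the function names and the handling of tied grey values are assumptions.

```python
from typing import List

Image = List[List[int]]  # grey values, row by row


def binarise(image: Image, percent_black: int) -> Image:
    """Return a 0/1 image in which roughly percent_black % of pixels are 1 (black).

    The darkest pixels become black. If several pixels share the cutoff
    value, all of them turn black, so the share can exceed the target.
    """
    values = sorted(v for row in image for v in row)
    k = len(values) * percent_black // 100   # number of pixels to blacken
    if k == 0:
        return [[0 for _ in row] for row in image]
    cutoff = values[k - 1]
    return [[1 if v <= cutoff else 0 for v in row] for row in image]


def fax_levels(image: Image) -> List[Image]:
    """The 99 thresholded versions, from 1% black to 99% black."""
    return [binarise(image, percent) for percent in range(1, 100)]
```

A low quality fax is effectively one such thresholded version at an unknown level, which is presumably why the database images are prepared at every level in advance.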
Matching is performed by a simple step process:
More detail on the algorithms can be found in the ARTISTE Interest User Group (AIUG) Newsletters [6]; in help pages accompanying the public demonstrator of ARTISTE [7]; and in papers prepared by the University of Southampton [8].
Linking is a familiar concept on the WWW. The traditional approach is to embed hard-coded links in an HTML page, which point to another Web resource that is associated with the original page in some way. However, this has several disadvantages: links have to be specifically authored for each document, the linking is inflexible, and links are difficult to maintain. These negative aspects are circumvented with dynamic linking.
In ARTISTE, instead of hard-coding the links, a separate link database is maintained and the links are applied dynamically at presentation time. Links are applied on a keyword basis. If the keyword 'teapot' has a link to further information about teapots, then this link will be applied every time that the word 'teapot' is displayed.
As well as being dynamic, linking can also be distributed because each ARTISTE site can maintain its own link database with links relevant to its own users. Thus if users from the National Gallery access the ARTISTE system via their web site, they will be presented with different links from those presented to users from the Uffizi.
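A minimal sketch of keyword-based dynamic linking is shown below. The link database is modelled as a plain dictionary from keyword to URL, one per site, and links are wrapped around matching words when the text is rendered. The function name, the dictionary representation and the example.org URLs are hypothetical; the article does not describe how ARTISTE stores or applies its links.

```python
import html
import re
from typing import Dict


def apply_links(text: str, link_db: Dict[str, str]) -> str:
    """Render text as HTML, linking every whole-word keyword occurrence."""
    if not link_db:
        return html.escape(text)
    # Longest keywords first, so a longer phrase wins over a shorter one.
    keywords = sorted(link_db, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(k) for k in keywords) + r")\b",
        re.IGNORECASE)
    lookup = {k.lower(): url for k, url in link_db.items()}
    parts = []
    last = 0
    for match in pattern.finditer(text):
        parts.append(html.escape(text[last:match.start()]))
        url = lookup[match.group(0).lower()]
        parts.append('<a href="{}">{}</a>'.format(
            html.escape(url, quote=True), html.escape(match.group(0))))
        last = match.end()
    parts.append(html.escape(text[last:]))
    return "".join(parts)


# Each site keeps its own link database, so the same text renders differently.
national_gallery = {"teapot": "https://example.org/ng/teapots"}
uffizi = {"teapot": "https://example.org/uffizi/teapots"}
print(apply_links("A silver teapot", national_gallery))
print(apply_links("A silver teapot", uffizi))
```

Because the stored text never contains the links, changing or removing a link is a single update to the link database rather than an edit to every page that mentions the keyword.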
Dynamic linking has several advantages over conventional hard-coded static links:
An example of dynamic linking in ARTISTE is shown below. This shows the result of an ARTISTE query (background) with links that have been dynamically added on the word 'teapot' so that the user can navigate quickly and easily to an on-line shop (foreground) that sells similar items.
Figure 23: Dynamic linking
The facilitation of cross-collection access to digital image information has been identified above as an objective of ARTISTE. This means not only allowing seamless searching across the collections of the institutions participating in ARTISTE but also achieving interoperability between those collections and other digital library resources. To that end ARTISTE makes use of existing open metadata standards such as Dublin Core and RDF Schema, while also supporting the Open Archives Initiative (OAI) information retrieval standard for distributed access.
The goal of the OAI harvesting protocol is to supply and promote an application-independent interoperability framework that can be used by a variety of communities engaged in publishing content on the Web. ARTISTE is an OAI data provider and has implemented support for the Open Archives Initiative Protocol for Metadata Harvesting, thus providing open access to metadata stored with each museum and gallery collection. OAI service providers can use metadata harvested via the OAI protocol as a basis for building value-added services.
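For readers unfamiliar with the protocol, a harvesting request is an ordinary HTTP request whose query string names a protocol verb. The sketch below builds a `ListRecords` request for unqualified Dublin Core (`oai_dc`) metadata. The verb and argument names are those defined by the OAI protocol; the base URL is a placeholder and not the address of any ARTISTE server.

```python
from typing import Optional
from urllib.parse import urlencode


def oai_list_records_url(base_url: str,
                         metadata_prefix: str = "oai_dc",
                         from_date: Optional[str] = None,
                         set_spec: Optional[str] = None) -> str:
    """Build the URL of an OAI ListRecords request."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date is not None:
        params["from"] = from_date   # only records changed since this date
    if set_spec is not None:
        params["set"] = set_spec     # restrict harvesting to one set
    return base_url + "?" + urlencode(params)


# Placeholder repository address, for illustration only.
print(oai_list_records_url("https://example.org/oai", from_date="2002-01-01"))
# prints https://example.org/oai?verb=ListRecords&metadataPrefix=oai_dc&from=2002-01-01
```

A service provider would fetch this URL, parse the XML response and repeat the request with the returned resumption token until the whole collection has been harvested.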
ARTISTE is also participating in an initiative to redesign the primary open standard for interoperability between digital libraries, Z39.50, using web technologies such as XML and SOAP. The Z39.50 into the Next Generation (ZING) initiative [5] has proposed a Search and Retrieve Web Service (SRW) based on the Z39.50 protocol for searching databases that contain metadata and objects. ARTISTE is one of the early implementers of SRW and has devised a service which enables distributed image content and metadata-based searches over the ARTISTE collections. Having emerged from the digital library community, Z39.50 has traditionally been concerned with text-based searching. ARTISTE has been working with ZING to incorporate content-based searching into the SRW protocol and thus expand international standards of information retrieval.
The museum community requires more sophisticated 3D models and other multimedia objects to represent fully the artefacts in their collections. Out of the ARTISTE project, which has developed a 2D image retrieval system, a new project consortium has been convened to develop both the technology and the expertise to help create, manage and present cultural archives of 3D models and associated multimedia objects.
SCULPTEUR [9], again supported by the European Commission, will exploit semantic web technology.
The project objectives are to:
ARTISTE has developed a successful image retrieval system based on metadata and content capable of exploring and analysing thousands of images from major art galleries across Europe. The project has seamlessly translated local metadata schemas to common standards so that the individual collections are searched as if they were a single entity. Content analysis algorithms are now in place that can handle many different types of query, appropriate to the diverse needs of the museum community. ARTISTE is contributing to the development of open standards to enable interoperability between museum and gallery collections worldwide.
By enhancing facilities for multimedia information organisation, storage and retrieval, ARTISTE has gone a long way towards meeting the increasing need for intelligent information extraction and presentation from distributed resources.
A current constraint on the uptake of multimedia digital libraries is the limited amount of structured metadata available in such systems. However, there exists a large amount of relevant information on the Web, and with the emerging semantic web approach to information structuring there are many new and exciting possibilities for enriching multimedia information collections through information exchange with other repositories.
Matthew Addis
IT Innovation Centre
2 Venture Road
Chilworth Science Park
Southampton SO16 7NP
United Kingdom
URL: <http://www.it-innovation.soton.ac.uk/>
Email: mja@it-innovation.soton.ac.uk
Matthew Addis is a leading researcher at the IT Innovation Centre, Southampton, UK, which is a partner organisation in the ARTISTE project.
Paul Lewis
Department of Electronics and Computer Science
University of Southampton
Southampton SO17 1BJ
United Kingdom
URL: <http://www.ecs.soton.ac.uk/~phl/>
Email: phl@ecs.soton.ac.uk
Kirk Martinez
Department of Electronics and Computer Science
University of Southampton
Southampton SO17 1BJ
United Kingdom
URL: <http://www.ecs.soton.ac.uk/~km/>
Email: km@ecs.soton.ac.uk
Paul Lewis and Kirk Martinez lead the team at the University of Southampton where the algorithms have been developed for the project.
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
For citation purposes:
Addis, M., Lewis, P., Martinez, K. "ARTISTE image retrieval system puts European galleries in the picture", Cultivate Interactive, issue 7, 11 July 2002
URL: <http://www.cultivate-int.org/issue7/artiste/>
-------------------------------------------------------------
By Peter Brophy - July 2002
Peter Brophy reports on the COINE Project which is designed to encourage and enable ordinary citizens to tell and share their stories in networked spaces. Based firmly on emerging standards like OAI and Dublin Core, the project aims to demonstrate ways in which citizens can become contributors to as well as consumers of digital objects and can thus record, share and preserve their own personal and community cultures.
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
The Armitt Museum and Library in Ambleside, Cumbria, England has a history going back over a hundred years. It was founded by the Armitt sisters, who lived in Ambleside and had connections with many of the leading British literary figures of the time (Ruskin, Rawnsley, less directly with Wordsworth, Coleridge et al.). They collected material relating to the literary scene of the day as well as that pertaining to the English Lake District - Ambleside is in the heart of this area, one of the most beautiful and most visited parts of England. Over time the Armitt added to its original collections, as it still does. Particularly notable are original watercolours by Beatrix Potter, better known for her Peter Rabbit and other children's stories, and the archive of the Charlotte Mason teacher training college and its relationship to the Parents National Education Union (PNEU). Part of the English Fell and Rock Club's collection is on permanent loan from nearby Lancaster University. Recently the Armitt has started to develop its collections relating to the work of Kurt Schwitters, the internationally-renowned artist who lived and worked in Ambleside for many years.
The Armitt occupies purpose-built accommodation near the centre of Ambleside, easily accessible to all residents and visitors. On the ground floor are the main museum display areas as well as a reception area and small shop. On the upper floor there is the main library collection, a reading room and an office.
Within the COINE project, the Armitt will be helping its local citizens to tell their stories using networked information tools developed within the project. What follows is a short scenario which illustrates the kind of activity that might occur:
Alice has lived in Ambleside for the past five years. Now retired, she bought her traditional Lakeland cottage from an old couple who had lived there for many years. In redecorating she has come across many old features, and she has started to research the history of her cottage and the people who have lived there. She has discovered some old photographs of the street which show it over 100 years ago as well as more recent ones dating from the 1950s to the 1990s. She has found the names of former owners from the title deeds and has started to read documents in the Armitt and other local collections which suggest the occupations of some of these people. One was a prominent local artist. Having joined the oral history group, she has also discovered that some of the recorded memories of older residents have considerable relevance. By accessing genealogical sites she has found the birth and marriage certificates of the local artist and discovered that her parents came to England from Poland after the First World War. Now she is going to use the COINE system to write and record the story of her cottage. In searching the networked COINE archives she finds that someone in Poland, where another COINE domain is being run, has written the story of the artist's family. This will be a useful resource for her to link into.
Local cultures exist in every part of Europe. They have many forms and expressions and there are many, sometimes unexpected, linkages between even widely dispersed cultures. Information and Communications Technologies (ICTs) offer the possibility of enabling individuals and local communities to capture, display, share and preserve their cultures in new ways, thus personalising both the publication and the use of information objects and exploring new inter-community linkages. In essence, ICTs offer the potential to turn citizens-as-customers into citizens-as-participants, actively contributing their own histories, knowledge, understandings and experiences.
Current evidence suggests, however, that where local communities are exploiting the web in an attempt to publicise and share their cultural interests, the implementations lack coherence, structure and interoperability. Neither are many of these solutions scalable or sustainable. Where generic standards are in use (for digitisation, resource description, search & retrieve and so on) the services tend to operate at national or major institution level. Where local and even sub-regional services and systems are implemented, it is more common to find simple web sites supported by lists of URLs or, at best, a locally-specified relational database. Furthermore, it is clear that the level of technical competence and skill needed to implement and maintain many of the IT-based systems available (such as museum or library systems utilising SPECTRUM, MARC, AACR, LCSH and the like) is far above that available to local communities. As a result citizens tend to be treated as relatively passive 'customers' rather than active 'participants'.
Robust, scalable and easy to use solutions are needed to encourage and enable individuals and small communities to actively exploit the opportunities of ICTs in worldwide networked environments. This is the issue which COINE is designed to address [1].
The following is a brief summary of some of the major challenges which work in this area has to engage with:
The consortium consists of the following partners:
The COINE technical architecture is designed to solve two specific problems:
In addition to the technical infrastructure issues, COINE addresses content personalisation from two angles:
Figure 1 shows a diagrammatic representation of the architecture used to provide a highly distributed search platform for COINE. The architecture is an enhancement of the search platform developed within the EU FPIII DALI project [5], which was further exploited and enhanced within the FPIV UNIverse project [6].
Figure 1: Highly Distributed Search Platform
The different layers of this model can be described in the following way:
The search platform relies on two distributed data stores for its knowledge of underlying data stores and its user population:
The COINE publishing platform provides facilities for:
Figure 2: 'My Repository' Distributed Publishing Platform
The different layers can be described as follows:
Quality control and structure are essential to success in this field. The chaotic lack of control of Web content has created many problems for those seeking to retrieve objects, since the authority of so much content is unknown. Quality control does not mean heavy-handed censorship, but implies that the provenance and quality characteristics of objects held in a COINE domain are known and displayed to users and that users have the tools to create meaningful descriptions. These functions may be exercised through the owner of a COINE domain (e.g. a local public library, art gallery, museum, school, college, university or consortium - or even a local group of people with a shared interest), which may in turn operate through appointed local agents or through individuals. The COINE domain also imposes standards, including metadata content standards, to ensure that objects can be identified across any number of such domains. As a result objects within a COINE domain are representative of the chosen local culture and are controlled locally - but are surfaced within a global networked space. In effect, COINE domains act as local art galleries, local archives, local museums, local history centres, and so on.
COINE demonstrators will be undertaken in a carefully-chosen, Europe-wide series of relevant and challenging implementations. Partners have already identified a wide range of application scenarios, as illustrated by the following list:
Demonstration partners were selected not just for the innovative case studies which they can provide, but for their close links with regional and national policy makers in the cultural industries. Thus COINE is designed to build on existing digitisation and 'culture surfacing' initiatives at regional and national levels, and to help build value-added linkages between these as well as the base-level local communities.
The COINE project enables the exploration of new concepts in 'information inclusion' by encouraging the ordinary citizen to become involved in network-based sharing of experience and heritage. It does so within a firmly standards-based framework, thus working towards widespread interoperability and long-term sustainability. Together with other projects in the 'Heritage for All' cluster, it will provide a baseline for major advancement in Europe in this exciting area.
Peter Brophy
Director
Centre for Research in Library & Information Management
Department of Information & Communications
The Manchester Metropolitan University
Geoffrey Manton Building
Rosamond Street West
Manchester M15 6LL
URL: <http://www.cerlim.ac.uk/>
Email: p.brophy@mmu.ac.uk
Phone: +44 161 247 6153
Fax: +44 161 247 6351
Peter Brophy has been Director of CERLIM since its foundation at the University of Central Lancashire in 1993. At that time Peter was also University Librarian, and he subsequently also took on responsibility for the University's academic computing services. In 1998, he and the CERLIM team relocated to the Department of Information and Communications at the Manchester Metropolitan University. Recent publications include "The Library in the Twenty-First Century" (Facet Publishing, 2001).
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
For citation purposes:
Brophy, P. "Cultural Objects in Networked Environments - COINE", Cultivate Interactive, issue 7, 11 July 2002
URL: <http://www.cultivate-int.org/issue7/coine/>
-------------------------------------------------------------
By Robin Yeates - July 2002
Robin Yeates reports on the investigation by the COVAX Project into the suitability of XML in providing integrated access to collections and materials in libraries, museums and archives. He also draws conclusions on the Project's prototype and work with trial users.
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
One of the three research priorities of the European Commission funded IST Programme since 1999 has been 'ensuring integrated access to collections and materials held in libraries, museums and archives'.
How much progress have we made since then, and what are the current prospects for achieving what might be called a European Information Environment?
In the European Information Environment, everyone would have access to a range of appropriate and seamlessly accessible digital networked content and services provided by libraries, museums, galleries and archives. This would range from secondary indexes, catalogues and finding aids to full texts and multimedia objects and resources. These would be provided by the vast majority of existing institutions that have adapted their management practices and systems to participate in the environment.
One major technology component of the European Information Environment will be eXtensible Markup Language (XML), since this is rapidly becoming a basis for interoperability between software applications, and even to some extent between software applications and people, throughout the quasi-global Information Society. This article looks at the policy context, but concentrates on the activities and findings of COVAX (Contemporary Culture Virtual Archive in XML) [1], one of the few large-scale projects so far to look at the practical effects of a move to XML-based networking in European libraries, museums and archives. This article does not consider the related question of how far existing Z39.50 based solutions can meet the requirements, since this was not the focus of COVAX.
The European Commission has stepped back from its considerable activity in the field of innovation in digital heritage and cultural content, and commissioned an extensive and valuable study led by Salzburg Research entitled "Technological Landscapes for Tomorrow's Cultural Economy" [2002], known as the DigiCULT Report [2]. This tries to assess 'the way Europe's cultural institutions should approach technology-driven mutation' and to make recommendations.
The DigiCULT Report Executive Summary begins 'Being digital for many European archives, libraries and museums (ALMs) is no longer an option but a reality. They have turned into "hybrid institutions" that take care of both, analogue as well as digital cultural resources. The conversion of all sorts of cultural contents into bits and bytes opens up a completely new dimension of reaching traditional and new audiences by providing access to cultural heritage resources in ways unimaginable a decade ago'.
Many of these cultural heritage or memory institutions have for years been managing their collections using sophisticated and expensive commercial software systems. They have been considering how they might migrate or replace their software systems to ensure that their resources are available to those who need them in the new era of the Information Society.
Others, especially smaller institutions, may not be so concerned about existing or future software investments, but they need to ensure that their own holdings become more visible and accessible in the networked environment. Large commercial endeavours that compete more effectively on the web may eventually threaten to substitute alternative, poorer quality services for those traditionally provided by small institutions.
From a supplier perspective, it is becoming increasingly important to offer products and services that inter-operate effectively with those of other suppliers, at least at a basic level. Alliances and partnerships need to be developed, particularly between management systems suppliers and content aggregators and large publishers. Moreover, they need to be based on workable architectures and practical ways of migrating from current infrastructure and customer/client environments to those expected to become more widely available. Smaller publishers will need to make their primary or secondary works available through larger intermediaries. If this is to happen, all the stakeholders involved need to develop a shared understanding of the issues so that they can contribute to successful innovation.
There is no doubt that a strategic political demand exists to encourage sharing of cultural data, although it has yet to be fully recognized that such work can only indirectly be self-sustaining, through education, social cohesion, personal motivation and self-fulfilment. At present the emphasis has been on technologies, rather than on these indirect benefits of wider access to and use of cultural resources. One of the Lund Principles of 4th April 2001 taken up within the eEurope digitisation strategy is that the Member States could make progress on the eEurope objective to 'create a co-ordination mechanism for digitisation programmes across Member States' if they 'worked in a collaborative manner to make visible and accessible the digitised cultural and scientific heritage of Europe.' (Lund Principles, 2001)[3].
A European Information Space may develop as a logical extension of national policies. For example, the Heritage for All projects of the 5th Framework Programme (CHIMER, CIPHER, COINE and MEMORIAL) all intend to develop new and more powerful tools and services that involve cultural heritage organizations such as museums, archives and libraries more in end user learning and allow users to interact more deeply with content held by or delivered via them. Programmes such as the public lottery funded People's Network in the UK will produce far deeper understanding within the cultural sector of the more complex issues relating to the management of digitisation and may generate support for future integration with other European programmes. However, COVAX has demonstrated that there still remain a large number of technical issues to be resolved simply to enable the sector's 'legacy' resources to be made visible and accessible to web users.
In Issue 3 of Cultivate Interactive, January 2001, Carlos Wert and Francisca Hernández described the aims of the COVAX project at its start [4]. The main objective of the project, which ended in December 2001, was to define the different phases and procedures that need to be followed to transform current management and information systems used in archives, libraries and museums to an XML environment.
Here we outline the actual creation of a prototype resource discovery system containing a wide range of content types, its internal formative assessment and a summative evaluation of the outcomes of the project, some six months after completion, from the point of view of one of the partners.
One obstacle to technical development is that a realistic pool of digital data for modelling future systems is not always readily available to researchers, since priorities and formats have not yet been fully defined by local managers. Since the kind of information required to be discovered and managed by network discovery tools is in a state of constant flux, pragmatic approaches have to be taken during projects of fixed, tight time-scales.
One answer to this problem has been to focus on existing, often subject-based communities. These will have more clearly defined aims and target user groups, and will offer a clear vision that builds on the present.
If we are to expect new forms of interdisciplinary learning to develop, however, we must develop ways for new communities to be built that are founded on new ambitions and opportunities created through the network itself. These communities will have to set their own technical standards and guidelines, and potential members will need to be able to accept and adhere to them without causing prohibitive local disruption.
In practice, COVAX content used to build the two prototypes to date does not form a coherent dataset for any particular community. Instead, we have used samples that enabled development of solutions for what will become widespread problems. In effect we have taken a worst-case scenario, and considered the surrounding issues, rather than creating a finished product. The resultant in-depth learning, however, has meant that all partners feel confident in their technical planning, and indeed partners intend also to continue working together on their future systems, as they found the processes involved in technical integration and development so beneficial.
The actual data used consisted of mainly text and textual metadata, with some related images, as follows:
There are two main approaches to the use of existing data for XML based delivery. Data can be exported from existing systems in batches and converted directly or indirectly to XML. Alternatively, it can be left in an existing, typically relational database, and converted dynamically on demand.
Neither of these main approaches is likely to provide a complete solution for all memory institutions. One reason for this is that the size of collection and range of data management options varies enormously. Contributors may only require to publish a small number of records, or may require a complete separation of their management system from the network for security reasons, making a dynamic interface impractical. Conversely, large datasets that must be made fully accessible from existing systems may be impractical to handle using batch transfers.
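The two approaches described above can be contrasted in a short sketch. This is an illustration only, not the COVAX conversion software: it assumes a hypothetical `records` table with `id`, `title` and `creator` columns in an SQLite database, and a made-up `<record>` element layout. `rows_to_xml` stands for batch export, and `record_to_xml` for dynamic conversion on demand.

```python
import sqlite3
import xml.etree.ElementTree as ET


def rows_to_xml(conn, batch_size=100):
    """Batch approach: yield one XML document (as text) per batch of rows."""
    cursor = conn.execute("SELECT id, title, creator FROM records ORDER BY id")
    while True:
        rows = cursor.fetchmany(batch_size)
        if not rows:
            break
        root = ET.Element("records")
        for rec_id, title, creator in rows:
            rec = ET.SubElement(root, "record", id=str(rec_id))
            ET.SubElement(rec, "title").text = title
            ET.SubElement(rec, "creator").text = creator
        yield ET.tostring(root, encoding="unicode")


def record_to_xml(conn, rec_id):
    """Dynamic approach: convert a single row to XML when it is requested."""
    row = conn.execute(
        "SELECT id, title, creator FROM records WHERE id = ?", (rec_id,)
    ).fetchone()
    if row is None:
        return None
    rec = ET.Element("record", id=str(row[0]))
    ET.SubElement(rec, "title").text = row[1]
    ET.SubElement(rec, "creator").text = row[2]
    return ET.tostring(rec, encoding="unicode")


def demo_connection():
    """Build a small in-memory database with placeholder rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE records (id INTEGER PRIMARY KEY, title TEXT, creator TEXT)"
    )
    conn.executemany(
        "INSERT INTO records (id, title, creator) VALUES (?, ?, ?)",
        [(1, "Letters", "Anon."), (2, "Diary", "Unknown"), (3, "Notes", "Anon.")],
    )
    return conn
```

The batch function suits a contributor who must keep the management system separate from the network, since its output can be copied elsewhere. The on-demand function needs a live connection to the source database but never produces a stale copy.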
A further issue is how to maintain interface compatibility across numerous disparate sites, especially now, when standards are still being extensively revised and developed all the time. COVAX solved this problem by introducing its own control over the range of open standards used, and by developing an agreed architectural framework for expansion. The system was able to layer services and content transformations so that contributors could be fully supported, whether they had no local XML systems or skills at all, or whether they had newly established, advanced, multimedia XML repositories, or whether they fell, like most institutions, somewhere in between.
The data conversion efforts undertaken have been described elsewhere with examples (ELPUB, 2001)[5]. It is sufficient here to note that we found that the existence of a strong service support network for data management was crucial in enabling content providers to make their content available. This support ranged from basic and advanced XML/XSL skills to specialist knowledge of the source data formats. These were mainly MARC-based formats in the COVAX case, but several variants were used by partners, and it was decided early in the project that conversion to MARC21 should be undertaken to simplify conversion to XML. There is in general a distinct lack of bulk record conversion and validation tools that work with cultural schemas.
This meant we were able to use existing facilities to batch convert data, and we could leverage work done by the Library of Congress and others. A policy decision to use existing tools wherever possible led us, therefore, to make use of a fully reversible MARC21 based LoC XML format, rather than invent our own simplified solution. Local projects should not need to undertake such technically complex work, and we feel that we are now in a good position to utilise newer schemas and DTDs as they become available, without being required to develop them ourselves. This latter course of action might have severely restricted our capacity to integrate with future developments, although it may have led to some short-term benefits, such as improved performance of the COVAX prototypes. A huge benefit of this approach is that it is possible to include new data conversion and dynamic interfacing service providers into the consortium network, and to migrate practices over time as content providers gain skills and local systems capabilities, such as XML query handling.
Most of the time, memory institutions, particularly libraries and larger museums or archives, that want to publish large amounts of content will already have management systems using SQL-accessible relational databases such as MS Access, Oracle or MS SQL Server. Z39.50 techniques have already allowed integration of such systems to some degree for resource discovery. COVAX was tasked with determining whether so-called native XML databases might also be used. They would allow any XML resources to be held and managed in purpose-built repositories that provide access to objects, documents, statistics and other functions via web browsers and XML clients. In this way bibliographic information, finding aids, metadata and full-text documents and related multimedia assets can be retrieved in whole or part using not only SQL but also XML-based queries. These native XML databases are now becoming widespread as the basis of new content management systems, and as they become more sophisticated and robust, they will either replace or provide additional options for data management and security.
Figure 1: COVAX Deployment
COVAX began in 1999 when few native XML database options were available, and no partners had one installed at that time. It was not our intention to explore the full potential of these systems during the project, but we did need to create a distributed network of them to provide our testbed.
A key partner, Software AG, offers the Tamino product [6], and this was offered to the content provider partners, some of whom installed it, running under both MS Windows and Solaris on a Sun platform. A high-end solution, such a platform is intended for enterprise level applications, but we had no serious difficulty setting up and using it for five servers in Madrid, London, Rome and Salzburg. For the project these sites supported some ten production databases and five test databases. The whole system is managed via web browsers apart from some bulk processing and similar scripts.
Content providers felt the need for a simpler lower-cost solution for add-on repositories to existing systems. Only one suitable product was found at the time, although more have become available since. TextML™ [7] from Ixiasoft was used to develop an additional seven production databases and one test database in Barcelona, Karlskrona (Sweden) and Graz. AIT needed to provide some additional software for this system, so that the COVAX meta-search engine could use a single query format to query both Tamino and TextML repositories. XPath [8] and XQL were used for the query language in COVAX, but there are still some issues surrounding the immaturity of these standards.
In the future, improved XPath/XQuery standardisation and wider take-up by suppliers are likely to reduce or even eliminate the requirement for adaptor software for each native XML database system. This will be essential for full interoperability of systems.
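The role of adaptor software in a meta-search can be sketched as follows. Everything here is hypothetical: the class names, the shared query form (a field name plus a value) and the in-memory repositories are assumptions for illustration, and none of it reflects the actual Tamino or TextML interfaces. Each adaptor translates the shared field name into a path its own repository understands, here limited to the XPath subset supported by Python's ElementTree.

```python
import xml.etree.ElementTree as ET


class InMemoryRepository:
    """Stand-in for one native XML database holding parsed documents."""

    def __init__(self, name, documents):
        self.name = name
        self.documents = [ET.fromstring(doc) for doc in documents]


class PathAdaptor:
    """Hypothetical adaptor: maps shared field names to local element paths."""

    def __init__(self, repository, field_paths):
        self.repository = repository
        self.field_paths = field_paths  # shared field name -> local path

    def search(self, field, value):
        path = self.field_paths[field]
        hits = []
        for doc in self.repository.documents:
            # A document matches if any element on the local path has this text.
            if any(el.text == value for el in doc.findall(path)):
                hits.append((self.repository.name, doc))
        return hits


def meta_search(adaptors, field, value):
    """Send one query in the shared form to every adaptor and merge the hits."""
    results = []
    for adaptor in adaptors:
        results.extend(adaptor.search(field, value))
    return results
```

With a common, widely implemented query standard the `field_paths` translation would still be needed for differing schemas, but the per-product query translation inside each adaptor could disappear.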
Each database holds a collection of XML documents that use a particular schema. Schemas were issued centrally to consortium members by the technical partners, Software AG, Madrid, Salzburg Research and AIT in Graz. These included MARC, AMICO, TEI header and EAD schemas, using open, externally published schemas adapted only in minor ways, where absolutely essential, for operational reasons or to correct errors. A great deal of work was required by content provider staff experts to develop suitable mappings from all the accepted COVAX formats to the required index access points, based on the Dublin Core Metadata Element Set to allow cross-domain searches. In addition to this work, of course, each partner also had to provide appropriate mappings and conversion from a much wider range of local formats.
In addition, COVAX holds XML records based on the principle of Z39.50 Explain, describing content providers, systems information and collection information (collections were referred to as databases during the project). For these, a set of new schemas was prepared and content was supplied in one of six native languages and English, then translated into all the others by the relevant language partners. Users therefore have access in their native language to collection level information content at least.
Figure 2: Part of a COVAX XML Explain document
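A mapping from format-specific elements to Dublin Core access points might be expressed as a simple lookup table, as in the sketch below. The table is illustrative and is not the mapping the COVAX partners agreed: it covers only `title` and `creator`, uses simplified element paths from EAD and the TEI header without namespaces, and the sample records are invented placeholders.

```python
import xml.etree.ElementTree as ET

# Illustrative mapping only: source format -> Dublin Core element -> element path.
ACCESS_POINTS = {
    "ead": {
        "title": "archdesc/did/unittitle",
        "creator": "archdesc/did/origination",
    },
    "tei": {
        "title": "fileDesc/titleStmt/title",
        "creator": "fileDesc/titleStmt/author",
    },
}

# Invented sample records, reduced to the elements the mapping needs.
SAMPLE_EAD = (
    "<ead><archdesc><did><unittitle>Family papers</unittitle>"
    "<origination><persname>Example, A.</persname></origination>"
    "</did></archdesc></ead>"
)
SAMPLE_TEI = (
    "<teiHeader><fileDesc><titleStmt><title>Collected letters</title>"
    "<author>Example, B.</author></titleStmt></fileDesc></teiHeader>"
)


def dublin_core_index(xml_text, source_format):
    """Return {Dublin Core element: [values]} for one record."""
    root = ET.fromstring(xml_text)
    index = {}
    for dc_element, path in ACCESS_POINTS[source_format].items():
        # itertext() gathers text from nested elements such as <persname>.
        values = ["".join(el.itertext()).strip() for el in root.findall(path)]
        if values:
            index[dc_element] = values
    return index
```

Because both formats end up as the same small set of Dublin Core keys, a single cross-domain query on `creator` can be answered from either kind of record.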
Altogether we created some 17 production databases and 6 test databases containing the collections in Figure 3:
Figure 3: COVAX Content
One of the main requirements and benefits of the project was to develop our understanding of XML at both technical and information professional levels. This we achieved by carrying out a survey of the state of the art of XML handling software (available on the project website), and by making use of market-leading tools. The most important such tool was XML Spy, an IDE (Integrated Development Environment) from Altova GmbH. The free or open-source XML tools available were not found particularly suitable or easy to use, especially compared with tools available for HTML, web authoring, Java and JavaScript related purposes. XML Spy supports the full XML syntax, parsing, well-formedness, validation, encoding; DTD definition; schema definition; XSL and XSLT management; HTML and XHTML rules (the latter reformulates HTML 4 as an XML application, adding a more rigorous syntax and compatibility with XML environments); syntax highlighting; interoperability with other external applications (imports for example from MS Word, MS Access); and partial support for ASP. Such a tool was found suitable for technical staff and skilled authors, providing us with the means to ensure only valid content reached the repositories, and to test and check COVAX meta-searching. However, we did find problems when trying to validate large batches of records, typically exported from existing management systems, since most tasks required all records to be held at once in main memory. For this reason, and because these tools are generic, partners also used other techniques to validate and correct specific types of content and to convert character encodings where necessary.
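One way to avoid holding a whole batch in memory is to parse it incrementally and discard each record once it has been checked. The sketch below is not the technique the partners used, which the article does not describe. It assumes a hypothetical batch layout in which `<record>` elements are direct children of the root, and it only checks well-formedness and the presence of required child elements, not validity against a DTD or schema.

```python
import io
import xml.etree.ElementTree as ET

REQUIRED = ("title", "creator")  # hypothetical required child elements


def check_batch(stream, record_tag="record", required=REQUIRED):
    """Return (records_seen, problems) while parsing the batch incrementally.

    Malformed XML raises xml.etree.ElementTree.ParseError.
    """
    seen = 0
    problems = []
    context = ET.iterparse(stream, events=("start", "end"))
    _, root = next(context)  # the first event is the start of the root element
    for event, elem in context:
        if event == "end" and elem.tag == record_tag:
            seen += 1
            missing = [
                tag for tag in required if not (elem.findtext(tag) or "").strip()
            ]
            if missing:
                problems.append((elem.get("id"), missing))
            root.clear()  # drop processed records so memory use stays flat
    return seen, problems


# Invented sample batch: record 2 lacks a creator, record 3 has an empty title.
SAMPLE_BATCH = (
    '<records>'
    '<record id="1"><title>A</title><creator>X</creator></record>'
    '<record id="2"><title>B</title></record>'
    '<record id="3"><title> </title><creator>Z</creator></record>'
    '</records>'
)
```

Calling `check_batch(io.BytesIO(SAMPLE_BATCH.encode("utf-8")))` reports three records seen and flags records 2 and 3, while only one record at a time is kept in the tree.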
In order to evaluate usability and design issues, two prototype COVAX versions were built in Java code, using XML files for configuration and storage information and XSL stylesheets for transforming XML from one form to another. The second version contained the final set of project COVAX features. A shared gateway user interface for resource discovery was created allowing browsing of collections in six languages and cross-searching of all the distributed repositories, although a public version has not yet been made available. It is possible for users to select their preferred interface language, and the system architecture is designed to hold group or personal profiles and persistent storage between sessions, along with search histories and statistics. However, where possible, such facilities would be provided using existing authentication or storage services, and the open architecture also allows search aids such as thesauri or XML transformation and enhancement services to be added at a later date.
COVAX is essentially 'middleware', not necessarily visible to end-users, but capable of enhancing portal or local web services by delivering an integrated stream of Dublin Core compliant XML or HTML formatted records for diverse types of cultural content from a consortium of content providers. It provides:
Figure 4: COVAX Architecture
A fuller description of the user interface has been published in Program (Yeates, 2002), but the figures following show a logged-in search form and brief search results. Full search results displays vary depending on the resource type, but are displayed at least partially in sequence on a single results page for speed of in-page navigation.
Figure 5: COVAX Prototype 2 User Interface
Figure 6: Swedish language display of bibliographic results from an Italian collection
Figure 7: Results from an Austrian museum image collection (AMICO format)
COVAX has implemented a complex demonstration of a fully XML-based resource discovery network that has taken great account of the wide variations in cataloguing practices throughout several European countries. However, it is not yet a complete product.
Users should drive the design of any system, although hidden systems, as much of COVAX is, present design collaboration challenges. Part of COVAX consists of elements specified by cultural and information professionals. Other parts were designed by expert web technologists. Therefore it was important to involve outside stakeholders and users in shaping further development of the system. Expert usability assessment advice was provided from outside the project team but within one of the partner organisations, Salzburg Research.
A complete usability assessment framework and usability toolkit were created, through a project workshop followed by individual partner development work. This gave us a clear target groups matrix, usage scenarios for each group, and instructions and worksheets for carrying out interviews, observations and pre- and post-trial questionnaire surveys at a wide range of sites internationally. Work was undertaken over a short time period, but generated much useful information as a result of the careful planning, especially as we could directly compare independent results.
Feedback was contributed by many stakeholders, ranging from those responsible for national digitisation policy to web researchers, cataloguing experts, non-specialist academics and the general public. Groups studied were:
The main conclusions of the user assessment were:
Overall, the issues arising from these assessments were no surprise to the consortium, because the main problems had already been identified: long waiting time for answers, time outs, results ordering and revision of the interface design. These modifications have been discussed during consortium meetings and kept for future developments, some of which will depend on general improvements in XML networking.
We have shown that it is feasible to migrate legacy cultural services to an XML environment, and that there are benefits for users if this comes about. They may gain more immediate access to deeply linked, high quality content held in a multitude of European, and indeed global, repositories. Awareness of materials will rise as certain multilingual access and customisation facilities can be implemented relatively easily using XML and XSL. However, tools are not yet fully mature, especially within the cultural heritage sector. Services can, however, be built now that encourage staff development and stakeholder involvement.
The COVAX partners continue to develop their repositories, but we expect much to change in terms of access arrangements and ultimate service design, as we develop understanding of new professional and commercial opportunities.
Silke Grossmann, Vic Haesaerts, Gerda Koch and Walter Koch [2002] have reported on the REGNET Project [9] which aims to set up a functional network of service centres in Europe, providing IT-services dedicated to Cultural Heritage organisations. This may be one useful way forward, and there are other initiatives of a similar scope underway.
We recommend, however, that all institutions, large and small, pay urgent attention to XML. The complex MARC21 DTD used by COVAX is likely to be replaced by more appropriate XML based information models for bibliographic data. Presentation of full-text documents and lengthy finding aids requires improved techniques for adapting content for resource discovery to improve performance. Too much nesting of elements in XML documents obstructs mapping of access points and indexing. The standards and protocols for searching distributed databases need to be improved, and adaptation of Z39.50 for HTTP and XML is a promising approach.
The COVAX architecture is not just applicable to cultural heritage applications, but applies also to distribution of information about e-learning products or tourism information. The principle of cross-domain searching was strongly endorsed by COVAX trial users, but much more work is needed by everyone to provide appropriate content and system performance so that a full European Information Environment can be achieved.
So, what of the future? An increased emphasis is likely on support for the autonomous learner, in order to support the concept of lifelong learning and not merely formal education whilst at school, college or university. Learners, as opposed to teachers, need to be able to interact more deeply with resources, and teachers want to capitalise on new digital resource provision, in order to gain the benefits of improved student motivation and self-confidence that these resources can generate.
We certainly need to include legacy materials in the mix of learning opportunities. However, it may be more important to explore how we might build new innovation platforms for the creation and development of new cultural heritage services that will attract future learners.
Evidence from COVAX shows the value of XML in resource discovery, but also the need for agencies to provide ongoing data conversion services. It shows the value of developers working with intermediaries, but also the challenges of delivering meaningful services without wider partnerships being created.
Robin Yeates
Associate Director
LITC
South Bank University
103 Borough Rd.
London SE1 0AA
United Kingdom
URL: <http://www.sbu.ac.uk/litc/>
Email: yeatesrb@sbu.ac.uk
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
For citation purposes:
Yeates, R. "COVAX: Making Visible the Culture of Europe", Cultivate Interactive, issue 7, 11 July 2002
URL: <http://www.cultivate-int.org/issue7/covax/>
-------------------------------------------------------------
By Jutta Weber - July 2002
Jutta Weber observes that European culture is mainly based on the tradition of text and that the preservation of the written word is one of the essentials of cultural heritage programmes. Here she writes about an initiative which enables all institutions in Germany holding hand-written documentation of a European cultural nature to present data effectively about their holdings, and themselves, on the Internet.
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
The project Kalliope (Verbundinformationssystem Nachlässe und Autographen) is co-ordinated and carried out by the Berlin State Library (Staatsbibliothek zu Berlin) [1] and |a|S|tec| - Angewandte Systemtechnik GmbH [2] and co-funded by Deutsche Forschungsgemeinschaft [3].
One of the most fascinating aspects of Europe's cultural heritage is the fact that every country owns a considerable number of collections of modern manuscripts and letters, written by the most famous and the less well-known, but which together form the backbone of our European culture. As cultural life has never been isolated within one nation or region, all kinds of national and international relationships exist between these collections. Europe's cultural history is defined by these relationships and its documentation is maintained in institutions with archival functions, not only in Europe but all over the world.
In 1966 the Union Catalogue of Modern Manuscripts and Letters in Germany was established in the Staatsbibliothek zu Berlin. Today it provides information about more than 1.5 million documents relating to 250,000 people. More than 150 partner institutions regularly provided this information by sending copies of their card catalogues.
The documents represented in this catalogue date from early modern times up to the present day. The people whose manuscripts and letters are documented include the most famous poets, artists, scientists and politicians of Europe, but also less well-known people who survive in the Union Catalogue because they corresponded with those famous people. The data include brief information about archival collections, single letters or manuscripts kept in archives, libraries and museums in Germany. As these institutions differ in size and importance and have very different functions - there are small local archives as well as large universal libraries - the collections they own may be of importance on a local, national or even international level.
The Union Catalogue is a meeting point for scholars from all over the world who need information about the location of manuscripts and letters in Germany. It also represents a kind of interface between archival, library- and museum-related documentation and the administration of manuscripts and letters.
Catalogue conversion which started some years ago has now reached the point where results can be presented on the Internet. This data will appear together with that from a large network of cataloguing institutions to be established throughout Germany. All will be presented under the name Kalliope.
The goal of Kalliope is to establish in Germany a national node, acting together with all the German partner institutions as data provider in the European network MALVINE (Manuscripts and Letters via Integrated Networks in Europe) [4]. Kalliope will build the bridge between these partners and the MALVINE community. MALVINE is a search engine building the basis of a network of data (namely collection level and item level descriptions) of modern manuscripts and letters held in various European institutions.
All information contained in the card catalogue of the Central Catalogue of Manuscripts (Zentralkartei der Autographen) will be made available in Kalliope electronically, as mentioned above. This means that more than a million data items on literary archives, manuscripts and letters relating to more than 250,000 people can be searched on Kalliope. This will be the basic provision, but it will be regularly updated with new data from former and newly participating institutions.
There are two search options:
|
| Figure 1: Search for Documents |
|
| Figure 2: Search for Persons |
To generate a high level of usage, participation in Kalliope is in fact open to every institution.
The institutions can choose between:
|
| Figure 3: The Kalliope Co-operation Model |
With these four options Kalliope is able to communicate with every sort of institution holding relevant material:
All records available on Kalliope are presented by Berlin State Library in the Kalliope OPAC, which is freely available on the Internet at http://www.kalliope.staatsbibliothek-berlin.de/.
The German cataloguing tradition in the sector of modern manuscripts aims at giving information on both the collection level and the item level description of each document (whenever possible and meaningful) and is based on two standards:
Both are indispensable in a communication network designed to show specific material in a coherent framework. In Kalliope a very simple mechanism connects name authority records to corporate body authority records. It therefore provides comprehensive information on where documents on particular people are held.
|
| Figure 4: The Kalliope Internal Model |
This relationship is a 1 (person or institution) to n (institutions) relationship and is the basic structure of the model "DIANA" (Deutscher Index zu Autographen und NAchlässen). The possibility of just adding (via an electronic template) the name of an institution (or a private person) holding material on a person or institution is the simplest way of enhancing the Kalliope information service.
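The 1-to-n relationship described above can be pictured with a small data structure. The following is only an illustrative sketch in Python, not Kalliope's actual implementation: the class name `HoldingsIndex`, its methods and the sample names are all hypothetical.

```python
from collections import defaultdict


class HoldingsIndex:
    """Toy 1-to-n index: one subject (person or institution) -> n holders."""

    def __init__(self):
        # Maps a subject's authority name to the set of holders with material on it.
        self._holders = defaultdict(set)

    def add_holder(self, subject, holder):
        """Record that `holder` keeps material relating to `subject`."""
        self._holders[subject].add(holder)

    def holders_of(self, subject):
        """Return the holders for `subject` in alphabetical order."""
        return sorted(self._holders.get(subject, set()))


index = HoldingsIndex()
# Hypothetical sample entries, for illustration only.
index.add_holder("Example Poet", "Example City Archive")
index.add_holder("Example Poet", "Example University Library")
index.add_holder("Example Poet", "Example City Archive")  # duplicates are ignored

print(index.holders_of("Example Poet"))
```

Adding a holder is a single call, which mirrors the point made above: naming one more institution for a person is the simplest way of extending the service.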
|
| Figure 5: full result display on Person Search |
Some aspects of this DIANA model have influenced the idea of the project LEAF (Linking and Exploring Authority Files) [16]. LEAF, co-ordinated by Berlin State Library, is developing a model architecture for a central server connected to a distributed search system that harvests existing name authority information with a view to automatically establishing a common European name authority file based on user needs.
Kalliope demonstrates that the preconditions for constructive and well-organised participation in the realisation of a European or international co-operation model can be established and that these can include more than just the biggest and best known institutions in this strategic goal. Only when all relevant institutions - including the smallest ones - are able to participate in the realisation of a European or even world-wide initiative, will it ever be possible to create a telling contribution to the "information society". This also means that every kind of institution - museum, library, archive, documentation centre, scientific institution - must have the chance to provide its information in a suitable way.
And so we envisage a virtuous circle: the more institutions allow access to their data on Kalliope, the more data will be available in MALVINE, the more terms of comparison there will be, and the more examples of "how to do" will be provided world-wide. This is in fact one of the expected outcomes of Kalliope on the national level, and of MALVINE on the international level: to give as many examples as possible of how Europe's modern manuscripts and letters are described and how they can be found by means of this kind of description. The goal is to encourage new participants to describe their holdings in the same way. The use of authority information - the enhancement of which is the principal aim of project LEAF - will provide more focussed access to data. The network of information about the relationships between persons and institutions will become denser and more complex with every new participant in the projects. Thus Kalliope and MALVINE and, in the future, LEAF will have an increasing influence upon each other: every public user, every expert, every participating institution, indeed every country, will profit from these initiatives in the long term.
Dr. Jutta Weber
Head of the German Union Catalogue of Modern Manuscripts and Letters
State Library Berlin
Department of Manuscripts
Potsdamer Str. 33
Berlin
10785 Germany
URL: <http://www.sbb.spk-berlin.de/>
Email: jutta.weber@sbb.spk-berlin.de
Jutta Weber studied Latin and Romance Languages; 1978: state examination; 1980: Doctorate in Latin; 1982: state examination in the Libraries College in Köln. Since 1982, she has worked in the Staatsbibliothek zu Berlin and since 1985, she has worked in the Department of Manuscripts as Head of the German Union Catalogue for Modern Manuscripts and Letters. She lectures and writes essays about conservation of and electronic access to information on modern manuscripts and letters. She also writes about national cataloguing rules for modern manuscripts and participates in national and international conferences on modern literature and the conservation of cultural heritage. She is a member of a consortium responsible for the national name authority file (Personennamendatei, PND) and is a member of a team working in partnership with libraries, archives and museums. She is currently acting as the co-ordinator of the EU-funded projects MALVINE and LEAF.
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
For citation purposes:
Weber, J. "Kalliope: Open Union Information System of Literary Archives, Modern Manuscripts and Letters", Cultivate Interactive, issue 7, 11 July 2002
URL: <http://www.cultivate-int.org/issue7/kalliope/>
-------------------------------------------------------------
By Richard Wright - July 2002
The longer established European broadcast organisations are now facing a considerable challenge in preserving large broadcast archives that risk irreparable degradation through ageing. Richard Wright highlights the problems confronting broadcasting organisations and how technical developments produced by the PRESTO Project can help.
EC Project PRESTO has completed a survey of the holdings and preservation status of ten major broadcast archives. These archives represent a significant portion of total European broadcast archives, including some of the largest individual collections. Approximately 75% of this material is at risk or inaccessible. The collections are growing at roughly four times the rate of current preservation work. The technical developments produced by project PRESTO reduce the costs and improve the effectiveness of multimedia archive preservation projects.
The Twentieth Century was the first century with a record of its significant events - the sounds and moving images - on film, audio and video media. A major repository for this record is the collective broadcast archives, particularly the archives of major broadcasters. In Europe, the main broadcasters are publicly supported, leading to archives that have a role not only in the business of broadcasting, but also fulfilling a public service requirement to support wider educational, cultural and heritage purposes.
This record is now largely at risk, as the bulk of these recordings reach the stage at which they are deteriorating, or on an obsolete format, or both. Broadcasting was never developed as a mechanism to create and hold permanent audiovisual history. The consequence is that these archives have arisen to support broadcasting, and have no business model or funding model specifically designed to support preservation.
EC Project PRESTO has the aim of increasing the efficiency of the technical work needed to preserve broadcast archives. The efficiencies are of two sorts: reducing costs, and developing strategies to ensure that archives are not simply physically preserved, but preserved in ways which maximise their future benefit. To gauge the size, shape and urgency of the problem, a survey was made of the holdings and preservation requirements of ten major European public service broadcast archives (details of the PRESTO project and survey participants are listed below in Appendix I).
The survey covered the following areas:
| The present | What broadcast archives do: their place in the business |
| | What they hold (media types, size of holdings and condition) |
| | Current preservation practices: technology, processes and costs |
| The future | Service requirements: new services for new holdings |
| | Preservation requirements: new technology and processes |
The ten archives in the survey represent a significant portion of total European broadcast archives, including some of the largest individual collections, but total European holdings of broadcast material are probably ten times larger. The survey found about 1 million hours of film, 1.6 million hours of video recordings, and 2 million hours of audio recordings in the ten archives.
The content covers the entire century, as broadcast archives include bought-in film and even wax cylinder material from before the development of the broadcasting industry. The record of news, current events, sports, culture and entertainment covered in radio archives dates from the 1920s (recorded originally on shellac discs), and there are film recordings of television output from 1936 onward.
Access to this content is notoriously difficult, because almost all the material is on 'professional' formats (film, broadcast-standard videotape) which need special players, certainly unavailable to the general public and often unavailable even to national archives and educational institutions. Also much of the content is unique, master material that cannot be allowed to circulate generally. A major goal of preservation work for broadcast archives must be to find joint solutions to preservation and access problems: preservation for access.
The amount of audio, video and film material in the ten archives surveyed is given in the following diagrams:
|
| Figure 1: Audio Holdings |
|
| Figure 2: Video Holdings |
|
| Figure 3: Film Holdings |
Obsolescence: At least 2/3 of the material in archives cannot easily be used in its existing form, because the medium is too specialised (film) or obsolete (2" videotape) to allow easy access. For audio, this includes the massive holdings on ¼" open-reel tape.
Deterioration: Approximately 1/3 of the material has one form or another of deterioration:
Fragile media: A large part of the holdings cannot be released for access because the media are too easily damaged. Examples are: all film negatives; all film prints except for access by qualified professionals; all shellac and vinyl audio recordings.
The main approach to preservation of video materials is transfer of old formats to new formats. It must be stressed that these transfers do not constitute true preservation - they simply solve today's format incompatibility and tape wear/degradation problem by creating an identical problem to be faced - at equal additional expense - in as little as ten years in the future.
For audio, however, the approach is increasingly to transfer the material to digital files which can be held on magnetic or optical media (datatape, CD, DVD). This approach allows future transfers to be fully automated using media-handling robots - and so the mass digitisation to a "robotic format" is a significant step toward true media preservation.
Digital videotape is somewhere in between analogue media, which are expensive to transfer to new formats, and computer files, for which condition monitoring and media transfer can be fully automated. The technology for "preservation work" being developed by PRESTO is aimed mainly at processes for conversion of older analogue formats (for audio and video). Cost-effective approaches to the preservation of digital videotape and digital audio are being developed by the related EC-sponsored project AMICITIA [1].
Preservation is a major issue, but cannot be viewed in isolation. The institutions which hold this endangered material perform services, and broadcast archives serve a highly technical and rapidly changing industry. Preservation strategy needs to consider - to foresee if possible - the future service requirement of multimedia collections for at least the next twenty years. These service requirements will increasingly be based on electronic mass storage and direct, networked end-user access - probably using web technology. The critical question is: how much preservation money should be invested in the additional steps required for conversion of existing media to new technology? This raises the related issue of how to estimate and justify the additional expense. PRESTO has developed a strategy for dealing with this problem, based on the concept of 'cost per use'.
The true cost of an asset is total lifecycle cost. The true benefit is related to the number of times that asset is used over the lifecycle. Although not every use has equal benefit, overall more media issued from the archive means more benefit to the broadcaster and to the wider public service. Therefore a simple way to combine transfer cost, life cycle cost, and the significance of new service opportunities, is to translate those new opportunities into a predicted rate of item usage. Options for preservation can then be compared, in monetary terms, on a "cost per use" basis. A significant conclusion of the PRESTO survey is that archive preservation strategy should aim at the "lowest cost per use" over the life cycle of the new media, NOT at the lowest transfer cost.
Digitisation and mass storage is about 50% more expensive than simply transferring from old formats (carriers) to new formats, but the new technology allows much easier access to the media. Simpler and faster access has already been shown to at least double the usage of an asset. This means it is cost-effective to spend the extra 50%, because the extra investment more than pays for itself in extra usage of the material, i.e. in lower overall cost per use.
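The argument above can be checked with a few lines of arithmetic. The sketch below is illustrative only: the baseline cost of 100 units and the baseline of 10 uses are assumed figures, not PRESTO survey data, and the up-front cost stands in for the full life cycle cost. Only the 50% cost premium and the doubling of usage come from the text.

```python
def cost_per_use(lifecycle_cost, uses):
    """Life cycle cost of an asset divided by the number of times it is used."""
    if uses <= 0:
        raise ValueError("uses must be positive")
    return lifecycle_cost / uses


# Assumed baseline figures (hypothetical, in arbitrary currency units).
transfer_cost = 100.0  # simple transfer from an old format to a new one
transfer_uses = 10     # uses over the life cycle of the new medium

# Figures taken from the text: 50% more expensive, at least double the usage.
digitisation_cost = transfer_cost * 1.5
digitisation_uses = transfer_uses * 2

transfer_cpu = cost_per_use(transfer_cost, transfer_uses)
digitisation_cpu = cost_per_use(digitisation_cost, digitisation_uses)
ratio = digitisation_cpu / transfer_cpu

print(f"transfer only: {transfer_cpu:.2f} per use")
print(f"digitisation:  {digitisation_cpu:.2f} per use")
print(f"ratio:         {ratio:.2f}")
```

Because the ratio is simply 1.5 / 2, digitisation comes out 25% cheaper per use whatever baseline figures are assumed, provided usage really does at least double.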
Although advanced technology using mass storage has the highest initial investment, it has the lowest overall 'cost of ownership' because it allows the greatest automation of future preservation work.
PRESTO has identified 12 specific key links covering both radio and television archives. The new technology being developed or integrated will either reduce costs or increase the benefit of the whole transfer chain - or both.
Manufacturers of videotape recorders (VTRs) cannot be expected to incorporate the advances in videotape technology into new players for old formats - because old formats are by definition obsolete. Three areas related to improving the performance of VTRs are under development, concentrating on 1" and ¾" (U-Matic) formats.
Broadcast archives are in the early stages of the biggest and most expensive media conversion they will ever face. The whole process of selection and digitisation of analogue media will take at least another 20 years. Without widespread funding and support, and without cost-effective and farsighted use of technology, the work will not keep pace with the deterioration of the material. EC project PRESTO has documented the problem and provided guidance for organising preservation transfer projects. PRESTO has now delivered multiple forms of new technology for reducing preservation project costs, and increasing their efficiency. The future of PRESTO lies in maintaining information flow to all involved in archive preservation.
[ Note that Richard has also contributed an article on the Multimedia Archive Preservation Workshop in this issue. ]
Appendix I - Project PRESTO details and survey participants
PRESTO [3] is a two-year, 4.8 million Euro project of the EC Information Society Technologies (IST) programme. The goal is to develop technology and processes to reduce the cost of media preservation.
Main partners:
Technical partners:
Richard Wright
Technology Manager
BBC
Information & Archives
S120 Reynards Mill, Windmill Road
Brentford
Middx. TW8 9NQ
United Kingdom
URL: <http://www.bbc.co.uk>
Email: richard.wright@bbc.co.uk
Richard Wright was educated at the University of Michigan, USA and Southampton University, UK. Degrees: BSc Engineering Science 1967, MA Computer Science 1972, and PhD in Digital Signal Processing (Speech Synthesis) 1988. He worked in acoustics, speech and signal processing for US and UK Government research laboratories (1968-76), University College London (1976-80; Research Fellow) and Royal National Institute for the Deaf (1980-90; Senior Scientist). He was Chief Designer, Cirrus Research 1990-94 (acoustical and audiometric instrumentation). He has been Technology Manager, BBC Archives since 1994.
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
For citation purposes:
Wright, R. "Preserving Europe's Memory", Cultivate Interactive, issue 7, 11 July 2002
URL: <http://www.cultivate-int.org/issue7/presto/>
-------------------------------------------------------------
By Robert Davies - July 2002
Rob Davies gives us a progress report on Europe's Thematic Network for Public Libraries and cultural institutions operating at local level.
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Since its inception in May 2001, the PULMAN Network of Excellence has set out to strengthen the performance of public libraries in their emerging new roles and to help them prepare to fulfil their potential in the digital era of an e-Europe.
PULMAN's major work involves:
A key outcome of PULMAN is to encourage integrated activity involving public libraries, museums and archives operating at local level.
The PULMAN Network is now firmly established with representation from 26 European countries and a further 10 to follow (see below). The first twelve months of activity have been a very productive period.
Work has now been completed on drafting the first edition of the PULMAN Digital Guidelines and they will shortly be available on PULMANWeb [1]. The draft Guidelines were evaluated at a workshop, attended by 69 experts from 35 countries, in Helsinki on 18/19 February, 2002. The experts, who included PULMAN's own Virtual Advisory Board, assessed the Guidelines from all perspectives including format, usability and content. The result of the Workshop was to establish a well-defined editorial framework for the finalisation of the Guidelines. They are now ready for publication on PULMANWeb during June 2002. Proposals for their translation into 18 national languages have already been received.
The 122 million registered users of public libraries in 29 countries of Europe attest to the importance and impact of public libraries in society. To realise their full potential in the digital era, public libraries must be prepared to offer new and innovative digital services that empower citizens to achieve their personal goals in a changing society and to contribute to a cohesive and successful knowledge-based economy in Europe.
The Guidelines are intended to point public libraries and - more tentatively - their local cultural partner organisations, into this era. They include sections on: Policy Issues; Good Practice; and the Future Agenda - together with a very wide range of links to innovative services.
The Guideline contents are as follows:
| Introduction |
| Section 1 - Social Policy guidelines |
| Social inclusion |
| Citizen participation in new forms of civic governance |
| Access and services for people with physical and visual impairments |
| Public library services for children and schools |
| Public library services supporting education in adult life |
| Support for business and the economy |
| Access to diverse cultural content |
| Access to music and non-print material |
| Section 2 - Management guidelines |
| Performance measures and evaluative tools |
| Funding and financial opportunities |
| Management practices and models for co-operation and partnership |
| The public interest in access to copyright-protected materials |
| The handling of legal issues |
| Section 3 - Technical guidelines |
| Digitisation |
| Developments in integrated library systems |
| Multimedia digital service delivery |
| Delivery channels |
| Resource description, discovery and renewal |
| Tailoring of services and citizen interaction and participation |
| Technical responses to multilingual issues |
We do not expect the Guidelines to be a perfect instrument in their first edition and, for this reason, we will be inviting comment from all interested parties via a process of Open Review, which will lead to a revised edition in time for the Policy Conference in March 2003 (see below).
Perhaps the first use of the translated Guidelines will be to provide a basis for discussion at the 26 PULMAN National Workshops scheduled to be held in September and October 2002. The Workshops will also try to move forward the agenda for cross-domain activity and co-operation among public libraries, museums and archives.
Feedback from the National Workshops will feed into the planning and agenda of the PULMAN Policy Conference.
In support of this cross-domain agenda, a meeting was organised by EBLIDA (European Bureau of Library, Information and Documentation Associations) on 7 June 2002 in The Hague. Participants from a range of European organisations and associations in the museums and archives sector discussed the terrain for co-operation between public libraries, museums and archives and contributed further to the Policy Conference agenda.
The PULMAN Policy Conference will be held in Oeiras, Portugal (13-14 March 2003). Its target audience is policy makers and influential practitioners in public libraries and their partner institutions. EBLIDA is co-ordinating and planning the Policy Conference.
PULMANWeb [1] now provides access to a growing variety of resources and information - the Guidelines will be the newest arrival.
The following table summarises applicants and participation in the Training Workshops:
| Country | Applications | Participants |
| Bulgaria | 5 | 4 |
| Czech Republic | 3 | 3 |
| Estonia | 4 | 3 |
| Greece | 15 | 7 |
| Hungary | 3 | 2 |
| Latvia | 1 | 1 |
| Lithuania | 9 | 5 |
| Poland | 12 | 6 |
| Portugal | 8 | 5 |
| Romania | 4 | 4 |
| Slovak Republic | 2 | 2 |
| Slovenia | 7 | 5 |
| Spain | 1 | 1 |
| Total | 74 | 48 |
Finally, PULMAN is growing! The proposal to extend the PULMAN Network to countries bordering European Union countries and its candidate states, favourably evaluated under the 8th Call of IST FP5, is in the final stage of negotiation at the time of writing and expected to begin in mid-June, 2002.
The countries involved include Russia and Turkey (as partners) and a number of other countries represented by Country Co-ordinators (Albania, Belarus, Bosnia Herzegovina, Croatia, Macedonia, Moldova, Montenegro, Yugoslavia and Ukraine).
PULMAN-XT will run for 14 months and will enable the new countries to benefit from the work of PULMAN including translated Guidelines, National Workshops and attendance at the PULMAN Policy Conference. In addition, an ambitious new programme of institutional mentoring and twinning will be established.
MDR Partners (UK, co-ordinator), EBLIDA, Helsinki City Library (Finland), Oton Zupancic Library, Ljubljana (Slovenia) and Veria (Greece) are the PULMAN partners who will make the 'bridge' with PULMAN-XT.
Although this ambitious work programme is consuming a great deal of the time of PULMAN Network members, thoughts are already beginning to turn to what comes next. How can the important resources created by PULMAN, such as the Guidelines, the training resources and the political work, be sustained once the EC-funded period is over? How can the cross-domain agenda for local services in cultural heritage, learning, employment skills, etc. best be developed under IST in future and how might the PULMAN Network contribute? We are working on it!
Rob Davies
PULMAN Project Manager
MDR Partners
URL: <http://www.pulmanweb.org>
Email: rob.davies@mdrpartners.com
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
For citation purposes:
Davies, R. "PULMAN: rolling on by night and day", Cultivate Interactive, issue 7, 11 July 2002
URL: <http://www.cultivate-int.org/issue7/pulman/>
-------------------------------------------------------------
By Ziga Turk, Bo-Christer Björk and Bob Martens - July 2002
Ziga Turk, Bo-Christer Björk and Bob Martens provide a very telling insight into current practices in the relationship between publicly funded researchers and the commercial scientific publishing industry. They see the existing situation as quite unsatisfactory and an obstacle to the efficient exchange of research information and hence scientific progress. They explain for us how the SciX Project intends to redress what appears to them to be a quite unacceptable state of affairs.
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
|
| Figure 1: Cover page of the Philosophical Transactions |
The history of scientific publishing starts in the 17th Century when the Royal Society created the Philosophical Transactions of the Royal Society of London [1]. The intention was to create a public registry of ideas - a logbook or journal of the "present undertakings, studies and labours of the ingenious" - recording who thought of what first, in order to protect intellectual property and ensure the rapid evolution of scientific knowledge (Fig. 1). For a long time, scientific publishing remained largely in the hands of learned societies and similar scientist-driven institutions. Publishers have been entering the market since the mid-19th Century, but their role was marginal and profits negligible until the 1960s, when the Science Citation Index [2] was introduced and the number of Universities around the developed world grew quickly. "What librarians (of these Universities) viewed as crucial core journals, publishers translated as the constitutive elements of an "inelastic market", i.e. a market where demand was little affected by pricing (and vice versa)" [1].
The business model of publishers is a fascinating one. The scientists do the research, they write the papers, they review their peers' work and they edit the scientific journals. They give away the copyright to their work, for free, to a party that has not taken part in the value chain before. They then subscribe to journals, usually rather expensive ones, so that they can learn about the work of their peers. In the SciX project we believe that giving away the right to copy (copyright) and distribute results of scientific work hinders the efficient exchange of information and makes scientific results harder and more expensive to obtain.
This belief is the baseline of the SciX project [3]. SciX (Open, self-managed platform for scientific information exchange - IST - 2001 - 33127) is a 24-month project with EU funding of Euros 1,000,000. Co-ordinated by the University of Ljubljana (Slovenia), the partners include Swedish Business School of Finland, Icelandic Building Research Institute, an eBusiness company Indra/Atlante (Spain), Technical University of Vienna (Austria), FGG Institute (Slovenia) and the University of Salford (UK).
The partners have been active in the field of electronic publishing since the mid-1990's. Bo-Christer Björk and Ziga Turk have been the editor and one of the co-editors of the Electronic Journal of Information Technology in Construction [4]. The average time from submission of a paper to its publication has been around 6 months. With each paper published there were on average about 1,000 readers who viewed the abstract and about 1,400 who downloaded the full text.
Since 1999, Bob Martens and Ziga Turk have been managing CUMINCAD - the Cumulative index of CAD [5] - the largest freely available database of papers related to computer aided architectural design (CAAD), particularly related to education in this area. At conferences organised by regional organisations of CAAD teachers (ECAADE in Europe, ACADIA in North America, Sigradi in South America and CAADRIA in Australasia) thousands of papers have been published. Rarely were the proceedings published by a professional publisher. Therefore the texts were not entered into commercial indexes and neither were they sold commercially. The full texts were not broadly available; only conference attendees had copies. On the other hand, the professional organisations retained the copyright to this work and could therefore allow its publication/archiving in the CUMINCAD. Accordingly this work is available on the Internet and rescued from oblivion. At the time of writing, CUMINCAD comprises 3,831 papers with abstracts. 883 papers are also available in full text.
|
| Figure 2: User interface of the CUMINCAD database |
|
| Figure 3: CUMINCAD database search results page |
Both professional organisations and groups of publishers, as well as specialised companies, are providing added value services related to scientific publishing. One example, amongst others, is the CIB's database ICONDA. Several bibliographical databases provide sophisticated search engines on bibliographic information about publications, (furnishing details such as titles and abstracts). Full texts are, as a rule, not available.
| | Ei Compendex | ICONDA | RSWB | CumInCAD | CiteSeer |
|---|---|---|---|---|---|
| Number of records | 6,000,000 | 500,000 | 575,000 | 3,000 | 2,500,000 |
| Availability | $ | $ | $ | Free | Free |
The Internet represents a threat to traditional publishers. While some years ago, the Internet was a first resource for obtaining scientific information [6], today it is becoming the only resource, particularly with young researchers. Traditional publishers are responding with services such as ScienceDirect which allows pay-on-demand access to the full texts of published papers.
Another strategy of publishers is to avoid dealing directly with the readers of journals and instead to negotiate direct, long-term deals with either whole universities [7] or whole countries [8]. Although discounts are offered if an institution subscribes to a full spectrum of journals, the economics of such deals for the funding bodies and the researchers are not necessarily positive.
The idea to use the Internet for scientific publication is not new. Existing solutions are of the following types:
Problems with these services include:
The policy of the ARPA (Advanced Research Projects Agency) and the NSF (National Science Foundation) in the United States was that all publicly funded research should make its results available for free. This has not been entirely true of published papers, but has worked excellently with software. Programs written in the context of research projects were made available - for free, usually including source code - on the Internet. In fact, the software to run the Internet in the first place was available for free. This created the critical mass for the so-called open-source initiative [12]. An increasing number of operating systems, application programs and tools are available free. The market share of those systems is growing and they are being used as a platform for vertical applications by companies such as IBM.
On the other hand, the European-funded research projects (such as the 4th and 5th Framework projects) have never required that results be made publicly available. The excuse offered was that commercial companies are co-funding this work and that they are not interested in making available what could be their competitive advantage. We are not aware of the scientific community challenging this system. Labelling most of the reports "restricted" actually restricted the readership to the project officers and the reviewers.
Standards organizations, in common with journal publishers, do not fund the writing of new standards, yet they are given the copyright of a standard. They support their organisational activities by the sale of the paper copies of those standards. Several research efforts addressing the computerisation of building codes stopped at a prototype level, because of problems with the copyright to the text of the standard.
The standards that govern the Internet and the Web serve as something of a contrast. The well-known "request for comments" documents (RFCs) are the result of the work of groups of individuals and are made available, for free, on the Internet, to be commented on as well as for writing compatible software. One may recall that in the early 1990s there was direct competition between ISO- and Internet-based networking, best exemplified by the use of two different email addressing schemes. The Internet solution, based on the open, freely accessible standard RFC 822, prevailed.
The development of product modelling standards also started with the restricted publication model. Only recently has the IAI (Industry Alliance for Interoperability) corrected this mistake by making the entire IFC (Industry Foundation Classes) standard available on the Internet for free.
In paper-based publishing, a few dozen publishers control most scientific publications. Making a reasonably complete index involves including the publications of those few largest publishing organisations. If, however, thousands of people start creating digital archives on the Internet, indexing that information could be quite challenging. Web search engines, such as Google or Altavista, are not very appropriate tools for searching for scientific information, because they index everything indiscriminately.
The Open Archives Initiative [13] is standardising the metadata structure and the API of an archive, so that the archive can be indexed and several archives can be searched by users at the same time. Moreover, the Open Archives Initiative is developing standards that aim to facilitate the efficient dissemination of content.
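As a rough illustration of what a standardised archive API makes possible, the sketch below builds harvesting requests in the style of the OAI Protocol for Metadata Harvesting and reads Dublin Core titles out of a response. The archive addresses, the sample record and the function names are assumptions made for the example; they do not describe any SciX component.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core elements carried inside an oai_dc record.
DC_NS = "{http://purl.org/dc/elements/1.1/}"


def list_records_url(base_url, metadata_prefix="oai_dc", from_date=None):
    """Build a ListRecords request URL for one archive."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date:
        params["from"] = from_date  # only records changed since this date
    return base_url + "?" + urlencode(params)


def extract_titles(xml_text):
    """Return every Dublin Core title found in a harvested response."""
    root = ET.fromstring(xml_text)
    return [element.text for element in root.iter(DC_NS + "title")]


# Hypothetical response, trimmed to the parts the sketch reads.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>A hypothetical paper on product modelling</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

# Hypothetical archives: the same request shape works for each of them.
ARCHIVES = [
    "http://archive-a.example.org/oai",
    "http://archive-b.example.org/oai",
]

for base in ARCHIVES:
    print(list_records_url(base, from_date="2002-01-01"))
print(extract_titles(SAMPLE_RESPONSE))
```

Because every archive answers the same request in the same metadata format, one indexing service can loop over many archives without archive-specific code, which is the point of standardising both the metadata and the API.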
The objective of this project is to demonstrate
2-4% of European GDP (Gross Domestic Product) is spent on research and development - on creating new knowledge. While several projects deal with the management of knowledge that is created within industry, little has changed in the past hundred years in the ways that knowledge, created by scientific research and published in scientific journals, is handled. The current mainstream scientific publication process has so far been only marginally affected by the possibilities offered by the Internet, despite some pioneering endeavours. This is not so much because of lack of enthusiasm, but because there is a lack of sound business models and pilots to demonstrate the benefits of totally free scientific archives to the organisations, which, ultimately, should be funding their development and maintenance.
The objectives of the Project are:
To achieve these objectives SciX will:
Most technologies and software to implement these goals are either freely available or have even been developed by the partners in this project in the past. See section "Previous work".
The main problem in a new vision of information exchange in science is the copyright that researchers currently give away to the commercial publishers for free, and which results in severe obstacles for potential readers in retrieving the information they need. There are also other barriers to a shift to free repositories, such as the perceived risks of Internet publishing and the sluggishness of academic departments in changing their "rating" systems, all of which need to be studied. A survey we conducted in the year 2000 in the field of construction IT and management showed interesting results in relation to what scientists think about where to publish and what to read. We intend to continue this survey over the coming years so that the trends can be monitored and the impact of the proposed repository gauged.
Typically scientific journals have been rated by prestige, often based on subjective evaluations or, to some extent, on the use of citation indexes. Ratings have been done implicitly through university departments, for instance in shortlists of accepted publications for promotion etc. Little attention has been paid to questions of how quickly and efficiently the information passes to experts for whom the information could be useful. Thus it would be very beneficial to develop methods which would allow the benchmarking of journals for factors other than the scientific quality of papers (e.g. turnaround time from submission to publication, availability, readership etc.). Such a benchmarking tool will be developed in the project and tested with a number of journals from different categories. The main value of such a tool would be as a means to increasing the awareness within scientific communities of the deficiencies of their current communication process. It is to be hoped such awareness will trigger action towards altering the process.
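One of the factors named above, turnaround time from submission to publication, is simple to compute once the two dates are known for each paper. The following is only a sketch of that single indicator, with invented dates and hypothetical function names; it is not the benchmarking tool planned in the project.

```python
from datetime import date, timedelta
from statistics import median


def turnaround_days(submitted, published):
    """Days between submission and publication of one paper."""
    return (published - submitted).days


def median_turnaround(papers):
    """Median turnaround in days for (submitted, published) date pairs."""
    return median(turnaround_days(s, p) for s, p in papers)


# Invented data for two imaginary journals.
START = date(2001, 1, 1)
PRINT_JOURNAL = [(START, START + timedelta(days=d)) for d in (300, 420, 540)]
WEB_JOURNAL = [(START, START + timedelta(days=d)) for d in (60, 90, 150)]

print("print journal:", median_turnaround(PRINT_JOURNAL), "days")
print("web journal:", median_turnaround(WEB_JOURNAL), "days")
```

The median is used rather than the mean so that a single paper stuck in review for years does not dominate the figure for a journal.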
The main components of the demonstrator comprise:
Open source solutions and/or rentable Web infrastructure will be created and made available to potential users. Compatibility with emerging standards, such as Open Archives, will be incorporated.
The SciX project started in February 2002. Current work includes the analysis of user requirements, design of the overall architecture and the business process modeling of the as-is situation. First deliverables are due in September 2002.
This article appears under the auspices of the SciX Project [3], funded by the European Commission under contract IST-2001-33127. The contribution of the funding agency as well as that of industrial partners in the project is gratefully acknowledged.
The opinions expressed in this paper are those of the authors and do not necessarily represent the opinions of their employers, of the SciX consortium or of the European Commission.
Ziga Turk
University of Ljubljana
FGG-IKPIR
Jamova 2
1000 Ljubljana
Slovenia
URL: <http://itc.fgg.uni-lj.si/zturk/>
Email: ziga.turk@itc.fgg.uni-lj.si
Ziga Turk (b. 1962) is an associate professor of construction informatics at the Faculty of Civil and Geodetic Engineering at the University of Ljubljana. He has degrees in Computer Science and Civil Engineering and works mainly in the field of construction informatics, where he has published numerous journal and conference papers. He has been involved with Web publishing since 1993. His works are available from his Web page.
Bo-Christer Björk
Professor
Information Systems Science
The Swedish School of Economics and Business Administration
URL: <http://www.wasa.shh.fi/>
Email: Bo-Christer.Bjork@shh.fi
Bo-Christer Björk (b. 1952) is Professor of Information Systems Science at the Swedish School of Economics and Business Administration in Helsinki, Finland. He holds degrees from three universities. Prior to his current appointment he spent seven years as professor of Information Technology in Construction at the Royal Institute of Technology in Stockholm, Sweden. He is editor-in-chief of the Electronic Journal of Information Technology in Construction, a peer-reviewed scholarly journal which has appeared for free on the WWW since 1996.
Bob Martens
Institut für Örtliche Raumplanung
TU Wien
Karlsplatz 13
A-1040 Wien
Austria
Email: b.martens@tuwien.ac.at
Bob Martens (b. 1961) holds an M.Sc. in Architecture from Eindhoven University of Technology (The Netherlands) and a Dr. Techn. from Vienna University of Technology. He is an associate professor for Spatial Simulation and Interior Design in Vienna and guest professor for Simulation Techniques at Graz University of Technology. His main research focuses on Simulation-Aided Architectural Design (SAAD), including full-scale and virtual modelling techniques as well as applied communication technology.
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
For citation purposes:
Turk, Z., Björk, B-C., and Martens, B. "Towards Open Scientific Publishing - the SciX Project", Cultivate Interactive, issue 7, 11 July 2002
URL: <http://www.cultivate-int.org/issue7/scix/>
-------------------------------------------------------------
By Panos E. Trahanias - July 2002
If, like me, you were unsure of the meaning of "avatar" outside the science fiction novels of Iain M. Banks, then Panos E. Trahanias sheds light upon its uses in this article. He writes of the achievements of a project which developed a way to provide remote users with a tele-presence in museums in the sturdy shape of TOURBOT.
[Editor's note: "avatar n. Computing: a movable icon representing a person in cyberspace or virtual reality graphics" (Concise Oxford Dictionary)]
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
The Internet is a fast evolving technology that electronically connects distant sites; however, up to now, electronic networks serve mainly to exchange and acquire information. In some cases this information is pictorial, often gathered by means of images taken in "real time" with a stationary Web-camera. To take full advantage of a network, such as the Internet, it would be desirable to get real physical interaction with the remote site being visited. Robots, and especially mobile platforms, can extend the Internet towards an interactive platform that allows actions to be carried out and dynamic information to be exchanged between distant sites. The TOURBOT project implements exactly the above concept for the particular case of museums as remote sites.
TOURBOT, the acronym of a project entitled "Interactive Museum Tele-presence Through Robotic Avatars", consisted of a research and technological development activity funded by the Information Society Technologies (IST) Programme of the European Commission. TOURBOT commenced January 2000 and ended successfully February 2002. The goal set forth in this project was the development of alternative ways of achieving interactive museum tele-presence, employing the novel approach of site viewing through the 'eyes' of robotic avatars [1-4]. This was accomplished and demonstrated in relevant events in real museum environments.
Figure 1: TOURBOT meets the press
Figure 2: TOURBOT with young visitors
The TOURBOT Project was carried out by a consortium that comprised an ideal blend of technical partners (Foundation for Research and Technology - Hellas, Greece; University of Freiburg, Germany; University of Bonn, Germany; THEON Mobile Platforms, Greece), brokers of technology to museums (Foundation of the Hellenic World, Greece), and end users (Foundation of the Hellenic World, Greece; Deutsches Museum Bonn, Germany; Byzantine and Christian Museum of Athens, Greece).
The goal of this project was the development of an interactive TOUr-guide RoBOT (TOURBOT) able to provide individual access to museums' exhibits and cultural heritage over the Internet. TOURBOT operates as the user's avatar in a museum by accepting commands over the Web that direct it to move in its workspace and visit specific exhibits. The communication network is, thus, effectively extended by the introduction of interactive, mobile robotic platforms as terminal nodes. The imaged scene of the museum and the exhibits is communicated over the Internet to a remote visitor. As a result the user enjoys a personalised tele-presence in the museum, being able to choose the exhibits to visit, as well as the preferred viewing conditions (point of view, distance from the exhibit, resolution, etc.). At the same time, TOURBOT is able to guide on-site museum visitors providing either group or personalised tours.
To make the TOURBOT system possible, a multimedia Web interface allows people to interact with the tour-guide system over the Internet [5]. Furthermore, an on-board interface facilitates interaction with on-site visitors of the museum. Using the Web interface, people all over the world are able to tele-control the robot and to specify target positions for the TOURBOT system. The robotic tour-guide possesses a multimedia information base providing a wide range of information about the exhibition at various levels of detail. Thus, the TOURBOT system serves as an interactive and remotely controllable tour-guide, which provides personalised access to exhibits with a large amount of additional information.
Figure 3: TOURBOT catches the cameras' eye
Figure 4: TOURBOT gives personal assistance
A tele-operated tour-guide robot requires a high degree of autonomy since it operates in a populated environment in which humans are also present. Therefore, the project included the development of a safe and reliable navigation system for TOURBOT [6-7]. The robotic avatar is equipped with a series of state-of-the-art sensors that allow it to acquire information about its environment. The navigation system uses this sensory information to adapt the robot's internal model of the environment and to plan the robot actions.
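The loop described above, in which sensor readings update the robot's internal model and the plan is then recomputed, can be sketched in a few lines. The article does not say which map representation or planner TOURBOT uses; the occupancy grid and breadth-first search below are stand-ins chosen for brevity, and all names are hypothetical.

```python
from collections import deque


def mark_obstacle(grid, cell):
    """Update the internal model: a sensor reported something at this cell."""
    row, col = cell
    grid[row][col] = 1


def plan_path(grid, start, goal):
    """Breadth-first search over free cells; returns a list of cells or None."""
    rows, cols = len(grid), len(grid[0])
    parents = {start: None}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        if current == goal:
            path = []
            while current is not None:  # walk back to the start
                path.append(current)
                current = parents[current]
            return path[::-1]
        row, col = current
        for nxt in ((row + 1, col), (row - 1, col), (row, col + 1), (row, col - 1)):
            r, c = nxt
            if 0 <= r < rows and 0 <= c < cols and grid[r][c] == 0 and nxt not in parents:
                parents[nxt] = current
                queue.append(nxt)
    return None  # goal is unreachable with the current model


# 0 = free floor, 1 = occupied. A small, empty exhibition room.
room = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]
robot, exhibit = (0, 0), (0, 2)

print(plan_path(room, robot, exhibit))  # direct route along the top row

mark_obstacle(room, (0, 1))             # a visitor steps into the way
print(plan_path(room, robot, exhibit))  # the route now goes around the visitor
```

A robot working among museum visitors would repeat this cycle continuously, since people move and the model goes out of date within seconds.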
The TOURBOT project introduces a new paradigm in providing access to cultural heritage exhibits [8]. Through the introduction of museum visiting via a robotic avatar, it facilitates immersive tele-presence with advanced visualization capabilities. Full access to cultural exhibits is granted to the user, in the sense that the latter is able to choose the exhibits to visit, as well as the preferred viewing configurations. The approach employed in the current project introduces a novel model of augmented environments, in that it allows human interaction with, and workspace exploration of, remote sites by means of a robotic avatar.
As a service to remote users, TOURBOT extends current communication networks by allowing mobile robots to be part of the overall structure. Such a mobile agent acts as the user's avatar, operating in a physical environment that is perceived by the user through the robot's sensors. Therefore, the TOURBOT results contribute towards the seamless integration of networks and mobile agents for providing full user access to exhibitions.
TOURBOT has achieved its RTD goals and has undertaken demonstration trials on the premises of the participating museums. More specifically, the TOURBOT system has been developed and fully tested in a laboratory environment. Following that, and in order to acquire performance data from actual museum visitors, the system has been installed and demonstrated in the three museums of the TOURBOT consortium. These demonstrations were combined with relevant events in order to publicise and disseminate the results of the project to professionals and the broader public. Details of these events are as follows:
Prof. Panos Trahanias
TOURBOT Co-ordinator
Institute of Computer Science
Foundation for Research and Technology - Hellas
71110 Heraklion,
Crete,
Greece
URL: <http://www.ics.forth.gr/tourbot>
Email: trahania@ics.forth.gr
Tel: +30-81-391 715
Fax: +30-81-391 601
Panos Trahanias is an Associate Professor with the Dept. of Computer Science, University of Crete, Greece and ICS-FORTH. He received his Ph.D. in Computer Science from the National Technical University of Athens, Greece, in 1988. He has been a Research Associate at the Inst. of Informatics & Telecomm., National Center for Scientific Research "Demokritos", Athens, Greece. From 1991 to 1993 he was with the Dept. of Electrical & Computer Eng., University of Toronto, Canada, as a Research Associate. He has participated in many RTD programs in image processing and analysis at University of Toronto and has been a consultant to SPAR Aerospace Ltd., Toronto. Since 1993 he has been with the University of Crete and ICS-FORTH. Currently, he is the supervisor of the Computer Vision & Robotics Lab. at ICS-FORTH where he is engaged in research and RTD programmes in vision-based robot navigation and augmented reality.
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
For citation purposes:
Trahanias, P.E. "TOURBOT - Interactive Museum Tele-presence Through Robotic Avatars", Cultivate Interactive, issue 7, 11 July 2002
URL: <http://www.cultivate-int.org/issue7/tourbot/>
-------------------------------------------------------------
By John Counsell - July 2002
John Counsell reports on the Valhalla project, which aims to provide in-depth comparative historic garden information, linking Hatfield House in Hertfordshire with the Château de Villandry on the Loire in France. Researchers have installed digital video cameras overlooking the grounds, sending real-time images to local servers and onto the Web. Staff program the cameras from Bristol, allowing them to zoom in, script identical film sequences, or set up video conferencing sessions between garden staff at the two locations.
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Figure 1: Château de Villandry from across the potager garden
This article provides an overview of the first months of the Valhalla Project, which links historic garden records with real-time video on the Web. Visiting historic gardens attracts huge public interest, combining as it does twin passions for gardening and history. Many enthusiasts are interested in being able to compare historic gardens, learn how they were designed, how the plants were chosen, and the way in which they change with the seasons. Thompson said of historic ruins that: "the best basis for understanding a ruin is therefore a wide knowledge of structures of the same period, whether ruined or not, since the mind is consciously or unconsciously making comparisons, and the larger the stock upon which it is possible to draw, the more reliable the result is likely to be" [1]. This project is based on the assumption that the need for comparative understanding is as true for historic landscapes and gardens as for buildings, that many people cannot travel extensively enough to gain this broad-based understanding, and that virtual travel on the Web can now provide an effective substitute.
Figure 2: Hatfield Old Palace Knot Garden as seen from a fixed camera
The Valhalla project, led by staff at the University of the West of England, Bristol (UWE), aims to provide just such in-depth comparative information, by linking the gardens of Hatfield House in Hertfordshire, with the Château de Villandry on the Loire in France. Researchers from the University's Faculty of the Built Environment have installed digital video cameras high up on the buildings, overlooking the grounds, sending a continuous stream of images to local servers and onto the Web. UWE staff program the cameras from Bristol, allowing them to zoom in, script identical film sequences, or set up video conferencing sessions between garden staff at the two locations.
Web-based images will allow people who are unable to travel to enjoy the gardens in all their glory. Visitors to either attraction will be able to see what is happening at the other site, and can also see archived film to find out what the gardens look like at other seasons of the year. Records of the plants and trees and their positions could also have long term benefits for curators, since it has been found that many gardens do not have adequate records to assist accurate reconstruction, and yet are at risk, as shown by the great storms of recent years.
Figure 3: Current draft webmap video control
The Valhalla project is a continuation of previous work at UWE in linked VRML (Virtual Reality Modelling Language) and spatial databases, applied to visitor information and heritage site management. It extends the previous work in two directions, that of real-time remotely controlled acquisition of digital imagery, and that of the relationship of heterogeneous information to the images to explain and interpret them based on VRML, the whole managed and updated from a Geographic Information System (GIS). The GIS used is Mapinfo, and the VRML is output by a program called Pavan [2]. The project runs from October 2001 to September 2002, funded by the European Commission Information Society Technologies programme (IST-2000-28541). It is a partnership between UWE, the Gardeners Exchange Trust (who have promoted physical exchanges between European historic gardens staff over the last few years), and the twinned gardens of Hatfield House in the UK, and the Château de Villandry in France.
The goal is to promote comparative study and discussion between staff at each site (a virtual Gardeners Exchange), and put real-time interpretative samples on the Web, with 'hot-spot' information generated in matching VRML viewpoints from a 3D spatial information system. This involves a form of remote data capture, followed by spatial referencing and retrieval of digital images with other associated descriptive information. The project team has therefore installed remotely controlled video cameras in prominent positions overlooking the gardens. Staff may control the cameras during interactive on-line discussion to illustrate or seek information, or the cameras may follow scripted routines to capture matching images for later time-lapsed sequences showing diurnal and seasonal change.
There are thus six major elements to the project:
Recording information on the gardens took more time than planned due to the lack of useable surveys or planting plans. Geometrical measured surveys have been completed of the selected area of each garden in the field of view of the cameras and a 1992 survey of Villandry was purchased from a local geometrician. For both gardens the logged data then had to be translated into the Mapinfo GIS (Geographic Information Systems). Steps, walls, copings and other distinctive architectural features, and the edges of changes in hard and soft landscape surface materials, such as grass, paving, flower borders and paths have been separately identified in the GIS.
Previous experience had shown that even expert horticulturists cannot readily recognise every plant from photographs, and want to examine the leaves and flowers and form of the real plant. It became apparent that a separate process was necessary to log information about the plants visible within the video images. Thus the purpose of the geometric surveys was to 'map' locational information about plants, trees and hard landscape features within the field of view of each camera, into the GIS from which the VRML (Virtual Reality Modelling Language) 3D Web-based models are generated. Hyperlinked pages of plant information are generated from the GIS attached to the 3D elements in the VRML model. The same instructions used to control the video camera are to be used to instruct the VRML model to show the same view, in a companion frame to the video image. Clicking on elements in the model enables comparative identification of the elements visible in the video. (Common plant names in French and English are linked by the Latin name as a key to assist identification despite the different languages involved).
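The use of the Latin name as a shared key can be pictured as two lookup tables joined on that key. The plant entries and function names below are illustrative assumptions, not data from either garden's records.

```python
# Common names keyed by Latin name; entries are illustrative only.
ENGLISH = {
    "Buxus sempervirens": "Common box",
    "Lavandula angustifolia": "English lavender",
}
FRENCH = {
    "Buxus sempervirens": "Buis commun",
    "Lavandula angustifolia": "Lavande vraie",
}


def link_common_names(english, french):
    """Join the two tables on the Latin name; missing names become None."""
    linked = {}
    for latin in sorted(set(english) | set(french)):
        linked[latin] = {"en": english.get(latin), "fr": french.get(latin)}
    return linked


def translate(common_name, source, target):
    """Find a common name in one language and return it in the other."""
    for latin, name in source.items():
        if name.lower() == common_name.lower():
            return target.get(latin)
    return None


for latin, names in link_common_names(ENGLISH, FRENCH).items():
    print(latin, "|", names["en"], "|", names["fr"])
print(translate("Common box", ENGLISH, FRENCH))
```

Keying on the Latin name means neither site has to maintain a direct English to French glossary; each only records its own common names against the botanical name.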
Compaq IPAQs were acquired for the task of hand-held plant data logging at each site, with ESRI Arcpad GIS software (on investigation found more usable than the comparable Mapinfo pocket PC product). These have both a wireless networking connection and docking station transfer of data to and from the server. The Arcpad software has been loaded with spatial mapping of both gardens and with templates developed for logging the location, characteristics, and spread at different seasons of each distinctive plant feature. GIS standards for exchange are robust enough to make it simple to exchange mapping and data between Arcpad on the IPAQ (or Desktop) and Mapinfo at UWE. However, garden staff have proved reluctant to embrace technology to this extent, and have been more comfortable with paper-based maps and forms, from which the data has been transcribed onto computer later.
Gardens were chosen that are designed to be seen from the windows of the House, so that a camera mounted on the house as a vantage point would give a similar view. Both a fixed and a motorised camera were installed because there was no suitable position from which the whole of the selected area at either garden could be viewed by a single camera. Initially it was intended to install a conventional pan and tilt motorised camera mounting with a Sony FCBIX47 Camera (460 TV Lines) with auto focus and 18 times optical zoom, in a heated weatherproof housing. However on investigation and testing, it became apparent that the conventional motorised mountings used in the security industry are only suitable for a limited range of preset views or for direct control by keypad and joystick by an on-site operator; they do not enable the precise telemetry required for remote control over the Web.
Figure 4: Dome Camera at Hatfield House
It was found that the Dome Camera (Dennard 2050) was capable of precise telemetry, but would require programs to be written to control the camera remotely. Unfortunately the dome camera is more prone to reflection, glare and raindrop distortion. A C program was written and tested by Oggle Ltd, from whom the cameras and Web upload utilities have been leased, to control the dome cameras. Shell scripts were developed by UWE to enable the cameras to be remotely controlled by clicking on a map or panoramic image on a Web page, with a slider to control zoom. These now work effectively.
The Parallelgraphics Software Development Kit was bought in order to customise and simplify the Cortona VRML browser to display modelling in conjunction with video clips. A 'calendar' program has been written by UWE to control the cameras based on the data in the Mapinfo GIS, and simultaneously archive the video clips tagged with data on viewpoint and zoom. The program calculates and exports field of view and directional vector data from the GIS, (based on optimal viewing times and locations for features within the spatial database), to prepare scripted directional information to control the path and field of view of the video camera. The same program responds to date-time triggers tagged onto plants and objects in the GIS to operate the Mpeg2 capture card (to avoid over-filling the local server hard drive) and write the results (tagged with viewpoint data) to DVD for transfer back to UWE. This program records vector and field of view metadata with each image file to enable video sequences to be selectively archived with associated VRML seasonal modelling.
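The geometry behind "field of view and directional vector data" can be sketched as follows. This is a minimal illustration, assuming that camera and feature positions are available as east, north and height coordinates in metres; the function name and coordinate convention are hypothetical and are not taken from the UWE calendar program.

```python
import math


def aim_camera(camera_xyz, target_xyz, target_width):
    """Return (pan, tilt, field of view) in degrees to frame a feature.

    pan  : compass bearing, clockwise from north, 0 to 360
    tilt : angle above (+) or below (-) the horizontal
    fov  : horizontal field of view that just spans target_width
    """
    dx = target_xyz[0] - camera_xyz[0]  # east offset
    dy = target_xyz[1] - camera_xyz[1]  # north offset
    dz = target_xyz[2] - camera_xyz[2]  # height offset
    ground = math.hypot(dx, dy)
    distance = math.sqrt(dx * dx + dy * dy + dz * dz)
    pan = math.degrees(math.atan2(dx, dy)) % 360.0
    tilt = math.degrees(math.atan2(dz, ground))
    fov = math.degrees(2.0 * math.atan2(target_width / 2.0, distance))
    return pan, tilt, fov


# Hypothetical example: camera 10 m up a wall, flower bed 30 m to the east.
pan, tilt, fov = aim_camera((0.0, 0.0, 10.0), (30.0, 0.0, 0.0), target_width=8.0)
print(f"pan {pan:.1f}, tilt {tilt:.1f}, field of view {fov:.1f} degrees")
```

The same three numbers could be stored with each archived clip and used to set a matching viewpoint in the 3D model, which is what allows the video and the model to show the same scene.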
Figure 5: Draft Villandry model VRML with plant data
A search program will use the metadata to invoke an archived video clip and a matching VRML model view. This helps to address the issue of data management of potentially very large quantities of images: partially by planned 'scripting' to capture in a selective manner; but also by use of the GIS to assist in 'automated' content description, management, archival storage and retrieval by place, time, and objects within the field of view. The calendar and search program will upload both seasonal VRML models and matching movies to the Web. Previous experience has shown the models work well to invoke Web pages of information.
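Retrieval "by place, time, and objects within the field of view" amounts to filtering clip records on their metadata. The sketch below assumes a simple record per clip; the field names, file names and sample values are invented for illustration and do not reflect the project's actual archive format.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Clip:
    """Hypothetical metadata stored alongside one archived video clip."""
    filename: str
    site: str
    recorded: datetime
    pan_deg: float
    tilt_deg: float
    zoom: float
    objects: tuple  # names of GIS features inside the field of view


def search_clips(clips, site=None, start=None, end=None, visible_object=None):
    """Return clips matching every criterion given, oldest first."""
    matches = []
    for clip in clips:
        if site is not None and clip.site != site:
            continue
        if start is not None and clip.recorded < start:
            continue
        if end is not None and clip.recorded > end:
            continue
        if visible_object is not None and visible_object not in clip.objects:
            continue
        matches.append(clip)
    return sorted(matches, key=lambda clip: clip.recorded)


ARCHIVE = [
    Clip("hatfield_0001.mpg", "Hatfield", datetime(2002, 5, 1, 9, 0),
         40.0, -20.0, 2.0, ("knot garden", "box hedge")),
    Clip("villandry_0001.mpg", "Villandry", datetime(2002, 5, 1, 9, 0),
         110.0, -15.0, 1.0, ("potager", "box hedge")),
    Clip("hatfield_0002.mpg", "Hatfield", datetime(2002, 7, 1, 9, 0),
         40.0, -20.0, 2.0, ("knot garden",)),
]

# Compare box hedges at both sites, whatever the season.
for clip in search_clips(ARCHIVE, visible_object="box hedge"):
    print(clip.site, clip.recorded.date(), clip.filename)
```

Because the pan, tilt and zoom values travel with each record, a matching model view can be set up for whichever clip the search returns.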
A Web site was created [3] for public information about the project on a server at UWE, with a threaded discussion forum, whilst links to the video footage and the VRML modelling are being added as they become available. The video cameras are linked to a specialist compression card (from a French company, Com1) incorporated in a Linux Web server on each site, which serves the images in 640 by 480 resolution motion JPEG format on the Web via ADSL or ISDN, and archives the video in clips, currently set at 5 minute intervals throughout the day. (The archived clips are transferred nightly to a server at UWE, Bristol to avoid limiting bandwidth during the day.) The remote location of servers and cameras in the unused wing of the Château also required the installation of a wireless Ethernet bridge to connect the servers (and cameras) to the Local Area Network in the separate administration building approximately 100 metres away, and thence to the ISDN router.
The Web-based interpretative real-time samples and archived or time lapsed sequences linked to matching 3D model views are intended to enable the comparative study of similar information within both gardens from a Web browser, so broadening public access to this aspect of European cultural heritage.
This is designed to:
Spatial extensions to SQL with active server pages are to be developed to make available Web-based template search routines to serve bespoke images or maps of the gardens in printable Web pages. The extent of accessibility of routes is to be colour coded in both map views and VRML models. Video clips, walking through the gardens, have been taken to supplement and enhance the aerial real-time views, and historic images are being incorporated at both sites, using the VRML model to fade from matching existing image to historic image and vice versa. Active server page scripts, and cookies generating questionnaires to frequent users, will audit visitor use of the Web site at the garden or on the Web.
Expertise from professional gardeners for display on the World Wide Web will be added by:
The cameras generate higher quality video than can currently be seen on the Web. The additional occasional upload of high quality Mpeg2 (approximately 45 MB for 1 minute) is possible but cannot be relied on. The nightly transfer of the archive of Web video clips amounts to approximately 500-600 MB per day for the two cameras on each site, which takes about 5.25 hours at 256 kb/s upload (the ADSL at Hatfield), and twice as long at 128 kb/s (ISDN at Villandry). It therefore proved necessary to write scripts to record the Mpeg2 in real time on site, archive it onto rewritable DVDs on site and then send the DVDs by post to UWE for editing and archive. The calendar program also handles this. A second server (running Windows 2000) has therefore been installed at both Hatfield and Villandry and linked to a splitter in the feed from the cameras. This server incorporates a specialist Amber Video card, which encodes analogue video in real time into Mpeg2 format, and a Pioneer DVD recorder. Windows 2000 terminal services are used to remotely control the server, the Mpeg2 card, the DVD recorder, and the hard disc space, from UWE, Bristol. This process has been tested and found to work effectively.
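The transfer times quoted above follow directly from the archive size and the upload rate, as the short calculation below shows. It assumes decimal units (1 megabyte = 1,000,000 bytes, 1 kb/s = 1,000 bits per second) and ignores protocol overhead, so the results are approximate.

```python
def transfer_hours(megabytes, kilobits_per_second):
    """Hours needed to upload a file of the given size at the given rate."""
    bits = megabytes * 1_000_000 * 8
    seconds = bits / (kilobits_per_second * 1_000)
    return seconds / 3600


def stream_rate_mbps(megabytes, seconds):
    """Average bit rate in megabits per second of a recording."""
    return megabytes * 8 / seconds


# Nightly clip archive at the upper end of the quoted range.
print(f"Hatfield ADSL:  {transfer_hours(600, 256):.1f} hours")
print(f"Villandry ISDN: {transfer_hours(600, 128):.1f} hours")

# One minute of high quality video, about 45 megabytes.
print(f"Mpeg2 rate: {stream_rate_mbps(45, 60):.1f} Mbit/s")
print(f"Upload of that minute over ADSL: {transfer_hours(45, 256) * 60:.0f} minutes")
```

A 600 megabyte archive comes to roughly 5.2 hours at 256 kb/s and twice that at 128 kb/s, in line with the figures in the text. One minute of the high quality video would occupy the Hatfield link for over twenty minutes, which is why that material is written to DVD and posted instead.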
Philips USB video cameras and software (NetMeeting and Yahoo) have been tested and installed, to enable staff to videoconference between the two gardens to exchange knowledge and skill, while reviewing real-time video imagery of selected aspects of the gardens. It has taken longer than anticipated to reach the point where staff can deploy the remotely controlled camera during discussion, due to the difficulties in achieving Web-based camera control discussed above. However, the scripts have now been developed and this phase is just commencing.
Figure 6: Villandry fixed camera on the Garden of Love
I have described an investigation into the marriage of long-term data with what might be called more ephemeral imaging data, using a common key of spatial and temporal location, and served by a spatial information system or GIS, to create a meaningful whole. The physical and historical complexity of heritage sites is held to be better recorded and displayed in 3D than 2D, to ensure commonality of understanding between all those involved in its care and with the wider public who fund it. However, in order to enable common retrieval, much more locational and time data needs to be captured and entered into a database with visual images and information than is currently the norm. A common approach to spatial and temporal referencing across a range of sites will enable comparative search and simultaneous display to envision the broad range of examples that Thompson described as so important to enable reliable understanding of what is seen on site. This broad understanding can only be obtained asynchronously by first hand experience at present.
The Grand Canyon is cited in support of the argument that some sites need no interpretation, although this is not held to preclude the need for informed professional understanding [4]. The Grand Canyon might be brought to a remote off-site audience using video and audio alone. However many other sites are enhanced by interpretation and for these remote access or, in the future, augmentation of the reality on site will require on-tap synchronised abstract information in addition to that directed at the senses. It takes a long time to commission and procure useful records. To meet these future developments it is desirable to record locational and temporal metadata with such records now. The Valhalla project suggests that this can be done, and that the process can be made resource effective, by enabling public access to the information gathered in interactive and interesting ways.
Counsell, J., Worthing, D. (1999): Issues arising from Computer Based Recording of Heritage Sites, in Structural Survey Journal, Vol 17, No 4, 1999, pp 200-211.
Counsell, J. (2000): The management and visualisation of 3-dimensional models using a spatial database, in International Journal of Computer Integrated Design And Construction, Volume 2 Issue 4, November 2000, pp 225-235.
Counsell, J. (2000): Spatial Database Management and Generation of VRML Models, in Proceedings of the 15th IKM - International Conference on the Application of Computer Science and Mathematics in Architecture and Civil Engineering, published by the Bauhaus-Universität Weimar, Sept 2000.
Counsell, J. (2001): Virtual Access to Landscapes and Historic Gardens at Linked Locations, in Proceedings of IV'2001, the International Conference on Information Visualisation, published by the IEEE Computer Society, California, July 2001.
John Counsell
Senior Lecturer
University of the West of England, Bristol
Frenchay Campus
Coldharbour Lane
BRISTOL BS16 1QY
United Kingdom
URL: <http://www.uwe.ac.uk/fbe/>
Email: john.counsell@uwe.ac.uk
Phone: +44 117 344 3929 +44 117 9656261
Fax: +44 117 344 3002
John Counsell is a senior lecturer in the Faculty of the Built Environment at UWE. He has been a practising Architect specialising in historic buildings, and a computer-aided design consultant, with experience of 3D modelling and visualisation. At UWE since 1995 he has also specialised in GIS and Virtual Reality on the Web.
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
For citation purposes:
Counsell, J. "Valhalla - linking historic garden records with real-time web video", Cultivate Interactive, issue 7, 11 July 2002
URL: <http://www.cultivate-int.org/issue7/valhalla/>
-------------------------------------------------------------
By Paul Miller, David Dawson and John Perkins - July 2002
Paul Miller, David Dawson and John Perkins report on the second in a series of international meetings at which representatives of cultural content creation programmes from around the world work towards greater collaboration.
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
As reported in issue 5 of Cultivate Interactive [1], there is increasing evidence of broadly similar approaches to the digitisation of cultural content being adopted in countries around the world. In an initiative to ensure interoperability of approach wherever possible, effort is being devoted to opening and maintaining communication between the various programmes, and to exploring scope for collaborative working and standard setting. Following an initial meeting in London during the summer of 2001, some 40 representatives of various cultural content creation programmes gathered in Washington, D.C., in March of 2002, as guests of the US Federal Institute for Museum and Library Services (IMLS).
A number of clear themes emerged from the meeting, building upon those already identified in London. Clearest of these was a recognition of the importance of gaining a far better understanding of our users, the uses they make of digitised cultural content, and their requirements around the creation of new content. There was also a high degree of interest in the Open Archives Initiative (OAI), and in the various testbeds already underway or shortly to commence amongst several of those represented. Work on these and other activities is now moving forward, both bilaterally between specific initiatives, and in concert under the umbrella of a new digital Cultural Content Forum.
There has been a great deal of effort expended in recent years in digitising a wealth of cultural content for display on the Web. The reasons for digitising this material range from preservation to education, but relatively little effort has been devoted to understanding what users actually want from content, or the uses to which they wish to put it. There are notable exceptions to this apparent lack of interest, but the broad trend is unfortunately one, to paraphrase, of building it, safe in the knowledge that 'they' will come (and, presumably, enjoy, tell their friends, and come again!). Although visitor numbers for many cultural sites on the Internet are certainly highly satisfactory, comparing them to the far larger community of potential users suggests that more users await awareness-raising activity or some suitable enticement. There is also largely unquantified work to be done in improving the experience for those who actually do visit, and in delivering the content and services that the user seeks, rather than expecting them to be satisfied with that which the content provider has selected on their behalf.
The work to be done in this area is potentially costly and long-term, but those gathered in Washington were unanimous in agreeing that the sector requires far better understanding of the issues, and of information that is already available. As such, a brief is being developed for a focussed piece of work, in which existing studies of user experiences and requirements for digital cultural content - both commercially sensitive and already in the public domain - will be located and synthesised in order to identify broad trends and issues. This work will serve, we believe, to identify knowledge which already exists within memory institutions in isolated pockets. It will also serve both to generate hypotheses for testing and to flag up questions in need of asking through further consultative work, whether conducted collaboratively or at a local level by individual institutions and agencies.
Since it was first announced, there has been significant interest in the work of the Open Archives Initiative [2]. This interest has arisen both from its association with potentially radical changes in the manner in which scholarly research is published and disseminated, and from the technical work around a Protocol for Metadata Harvesting [3]. This protocol extends beyond the confines of scholarly publishing and e-Prints. It offers capabilities for software to 'harvest' basic Dublin Core [4] records describing content in a wide variety of forms and formats, and to create large repositories of metadata suitable for integration and manipulation in a range of ways. In the UK, for example, an evolving Architecture [5] for the JISC's Information Environment [6] recognises the key role of such harvesting in the distributed information landscape, alongside searching remote databases and alerting both human and machine users to changes in content.
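To make the idea of harvesting more concrete, the following is a minimal sketch in Python of what a harvester might do: build a ListRecords request and pull a few Dublin Core fields out of the response. It assumes the XML namespaces of version 2.0 of the protocol; the repository address, the sample record and the function names are hypothetical and are not taken from any of the programmes discussed here.

```python
from __future__ import annotations

import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Namespaces assumed here: OAI-PMH 2.0 and unqualified Dublin Core.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Hypothetical response, trimmed to a single record for illustration.
SAMPLE_RESPONSE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:example.org:item-1</identifier>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>View of a historic garden</dc:title>
          <dc:creator>Unknown</dc:creator>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>
""".strip()


def list_records_url(base_url: str, metadata_prefix: str = "oai_dc") -> str:
    """Build the request a harvester would send to a repository."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return f"{base_url}?{query}"


def parse_records(xml_text: str) -> list[dict]:
    """Extract identifier, titles and creators from a ListRecords response."""
    root = ET.fromstring(xml_text)
    records = []
    for record in root.iterfind(".//oai:record", NS):
        records.append({
            "identifier": record.findtext(
                "oai:header/oai:identifier", default="", namespaces=NS),
            "titles": [e.text for e in record.iterfind(".//dc:title", NS)],
            "creators": [e.text for e in record.iterfind(".//dc:creator", NS)],
        })
    return records


if __name__ == "__main__":
    # http://www.example.org/oai is a placeholder, not a real repository.
    print(list_records_url("http://www.example.org/oai"))
    for item in parse_records(SAMPLE_RESPONSE):
        print(item["identifier"], item["titles"], item["creators"])
```

A real harvester would also need to fetch the response over HTTP and follow the protocol's mechanism for paging through large result sets, both of which are omitted from this sketch.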
Amongst memory institutions and related bodies, there has been a great deal of interest, and a number of programmes are currently getting underway to explore the realities of creating, maintaining and making use of these repositories of metadata, and the services likely to grow up around them. The JISC's FAIR Programme [7], a number of Mellon projects [8], and an RfP from IMLS are amongst a raft of funded explorations of the technology and the issues of integrating it with existing practice.
Participants expressed interest in learning more about the potential of OAI and other 'new' technologies to assist them in delivering on their missions, and it was agreed that it might be useful to explore in detail the known work on OAI within the community at the next meeting. Stronger bridges were also built between a number of the funded programmes exploring this technology, and there is the potential for some synergistic work to emerge, rather than a number of wholly separate programmes.
This meeting, possibly more than that in London, emphasised some of the very real differences in approach across the jurisdictions represented, whether in terms of the reasons for digitising content in the first place (lifelong learning, national identity, tourism, digital librarianship, etc.), or the roles expected of public, private, and quasi-public bodies within the process.
Despite these differences, there remain clear advantages in ensuring effective lines of communication across national and disciplinary borders, and the group was felt to play an important role in maintaining this dialogue due to its broad (and growing) international make-up, and its mix of funders, policy makers and strategists.
In order to form a focus for this work to progress, the group is to become the digital Cultural Content Forum, and work is currently underway on a Web site [9] and a series of associated documents for this group in order to raise its profile and create an environment in which issues of import may be progressed.
Notice of public deliverables from this group, including the launch of the new digital Cultural Content Forum Web site, will be given via e-mail to various community mailing lists. Those who are interested in receiving notification of all such deliverables are invited to join the public mailing list, interoperability, hosted by the UK JISCmail service.
To join this list, send a message to
with the body of the message reading
join interoperability Your_Firstname Your_Lastname
--
e.g.
join interoperability Paul Miller
--
The authors wish to thank all of those who travelled to Washington to participate in this meeting. Without their attendance and ongoing participation, this initiative would be much diminished. Thanks are also due to all of those at IMLS who worked so hard to ensure a comfortable and productive two days.
Participants at the meeting were: Helen Aguera (National Endowment for the Humanities, USA), David Dawson (Resource, UK), Lorcan Dempsey (OCLC, USA), Jose Luis Esteban (National Library of Spain, Spain), Eleanor Fink (World Bank, USA), Shelagh Fisher (CERLIM, UK), Kati Geber (Canadian Heritage Information Network, Canada), Tony Gill (Research Libraries Group, USA), David Green (National Initiative for Networked Cultural Heritage, USA), Dan Greenstein (Digital Library Federation, USA), Steve Griffin (National Science Foundation, USA), Catherine Grout (Distributed National Electronic Resource/ Joint Information Systems Committee, UK), Nancy Gwinn (Smithsonian Institution, USA), Monika Hagedorn-Saupe (State Museums of Berlin, Germany), Susan Haigh (National Library of Canada, Canada), Ken Hamma (J. Paul Getty Trust, USA), Jieh Hsiang (National Taiwan University, Taiwan), An Knaeps (Flanders Ministry of Culture, Belgium), Steve Knight (National Library of New Zealand, New Zealand), Clifford Lynch (Coalition for Networked Information, USA), Marianne McLean (National Archives of Canada, Canada), Gerald Maier (State Archive of Baden-Württemberg, Germany), Deanna Marcum (Commission on Library and Information Resources, USA), Bob Martin (Institute of Museum and Library Services, USA), James Michalko (Research Libraries Group, USA), Paul Miller (UKOLN, UK), Sarah Mitchell (New Opportunities Fund, UK), Michel Murray (Canadian Heritage, Canada), Frits Pannekoek (University of Calgary, Canada), John Perkins (CIMI, Canada), Joyce Ray (Institute of Museum and Library Services, USA), Jacob Schouenborg (Ministry of Culture, Denmark), James Shulman (Mellon Foundation, USA), Kevin Sumption (Powerhouse Museum, Australia), Jennifer Trant (Archives & Museum Informatics, USA), Sirkka Valanto (National Board of Antiquities, Finland).
The meeting was conceived and realised as a partnership between UKOLN, Resource and CIMI. Our hosts in Washington were the Federal Institute for Museum & Library Services (IMLS).
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Paul Miller
Interoperability Focus
UKOLN
United Kingdom
p.miller@ukoln.ac.uk
<http://www.ukoln.ac.uk/interop-focus/>
Phone: +44 1482 466890
Paul holds the post of Interoperability Focus at UKOLN. This post is jointly funded by the Joint Information Systems Committee (JISC - www.jisc.ac.uk/) of the United Kingdom's Further and Higher Education Funding Councils, and by Resource, the Government agency responsible for libraries, museums and archives (www.resource.gov.uk/).
Paul's background is in archaeology, where his PhD research examined the use of Geographic Information Systems (GIS) in mapping deposits buried beneath modern cities, concentrating specifically upon the archaeologically rich and varied city of York.
In his current work, Paul is responsible for encouraging and facilitating the development of interoperable solutions across a range of domains, principally museums, libraries, archives, and government. Paul sits on a wide range of committees and working groups related to this area, both internationally (for example, the executive committee of CIMI) and within the UK.
Previously, Paul worked for the Archaeology Data Service (ADS - ads.ahds.ac.uk/), a service provider of the UK Arts & Humanities Data Service. Here, he was responsible for designing and establishing the catalogue, which now contains content from local and national archaeological agencies across the UK.
David Dawson
Senior ICT Adviser
Resource: the Council for Museums Archives & Libraries
United Kingdom
david.dawson@resource.gov.uk
<http://www.resource.gov.uk/>
Phone: +44 20 72731415
David Dawson is one of the Senior Network Advisers within the Learning and Information Society Team (LIST) of Resource.
David studied Archaeology at Durham University, and completed the Museum Studies Course at Leicester in 1985, before becoming an Associate of the Museums Association in 1988. He worked in a range of museums before joining the Museum Documentation Association (www.mda.org.uk/) in 1992, as Business Manager of mda Services, before becoming Outreach Manager (ICT), giving advice and training to museums in documenting their collections, with a focus on helping small museums as well as working with a number of museums in the UK and abroad. Whilst at mda, he was closely involved in the development of the Aquarelle Project.
In 1998 David joined the Museums & Galleries Commission (www.museums.gov.uk/) as New Technology Adviser, before becoming Senior ICT Adviser for Resource. He works particularly on ICT in museums, managing the DCMS/Resource IT Challenge Fund, acting as an expert adviser to the New Opportunities Fund, and working on a range of other projects and strategic developments, such as Culture Online (www.cultureonline.gov.uk/). David is currently a member of the Office of the e-Envoy Broadband Research group and is the nominated UK Representative on the EU activity to Coordinate National Digitisation Policies.
John Perkins
Executive Director
CIMI Consortium
Canada
jperkins@ca.inter.net
<http://www.cimi.org/>
Phone: +1 902 4295392
John Perkins is Executive Director of the Consortium for the Interchange of Museum Information (CIMI - www.cimi.org/). CIMI is a group of the world's most prestigious museums, technology companies, and libraries working to advance museum digital intelligence through standards, research, testbeds, advocacy, training and international collaboration. Current interests are in the area of digital information object management and interchange for museums, metadata harvesting, and distributed searching, mobile computing, and content architecture for Semantic Web applications.
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
For citation purposes:
Miller, P., Dawson, D. and Perkins, J. "Towards a Digital Cultural Content Forum", Cultivate Interactive, issue 7, 11 July 2002
URL: <http://www.cultivate-int.org/issue7/washington/>
-------------------------------------------------------------
By Brian Kelly - July 2002
Brian Kelly reviews the CULTIVATE National Node Web sites using a variety of automated tools, and makes some comparisons with a survey of National Focal Point Web sites.
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
The CULTIVATE project [1] (of which the Cultivate Interactive e-journal is a part) supports the dissemination of information about the EU's Digital Heritage and Cultural Content (DIGICULT) programme [2]. National Nodes have a key role to play in the project by disseminating information within their own country. As well as organising meetings and events, giving presentations and running mailing lists, the National Nodes also provide Web sites to support their dissemination function. The National Node Web sites complement the work of the central CULTIVATE Web site [3] by providing regional information and by providing information in the national language(s).
The work of the National Nodes has built on the work of the National Focal Points (NFPs) which were funded by the EU's Fourth Framework's Telematics For Libraries programme. The work was coordinated as part of the EXPLOIT project [4].
A survey of the NFP Web sites was published in Exploit Interactive in October 1999 [5]. The survey showed that a variety of approaches to the provision of Web sites was taken. There was no consistent visual identity or navigational aids and it appeared that there was little sharing of information on best practices.
In the CULTIVATE project it was agreed to address some of these limitations by making use of a consistent visual identity and navigational structure and, through regular meetings and email communications, provide support and advice in implementing best practices.
In this article a summary of a survey of the National Node Web sites is given.
The analysis of National Node Web sites makes use of the central list of National Nodes maintained on the central CULTIVATE Web site [6].
It should be noted that this list includes information on National Nodes funded by the CULTIVATE-EU (for EU countries) and CULTIVATE-CEE (for Central and Eastern Europe Countries), together with the recently established CULTIVATE-Russia project.
Of the 26 countries listed, 22 provide a National Node Web site. Details of the Web sites' addresses are given below.
As can be seen from Table 1, 11 Web sites have an entry point of the form <www.country_code.cultivate-europe.org>. This can be compared with the findings for NFP Web sites, for which there was no consistency and the address of other Web sites could not be extrapolated from knowing the address of one.
A total of 14 Web sites make use of a simple domain name, with no additional path name required. This compares with NFP Web sites, in which only two countries made use of a domain name as the entry point. Use of a simple domain name means the URL is easier to type, is more memorable and can be more easily marketed.
Many of the Web sites have made use of the CULTIVATE visual identity and look-and-feel, as can be seen from Figure 1.
| Figure 1: The Irish and Estonian National Node Web Sites |
As well as helping in raising awareness of the CULTIVATE project, consistent use of the visual identity and look-and-feel will also be helpful for end users who may have an interest in accessing more than one National Node Web site.
A "Web tour" of the National Node Web sites is available [7]. This provides an automated display of the entry points.
Netcraft [8] was employed to analyse the server technology used to provide the National Node Web sites. A summary of the findings is given in Appendix 1.
Thirteen of the Web sites are (probably) hosted on a Unix server and 9 on a Windows NT server. Interestingly, in addition, one Web site appears to use the Microsoft server software on a Unix platform.
AltaVista [9] was used to provide information on the numbers of pages indexed by AltaVista and the number of links to the National Node Web sites. A summary of the findings is given in Appendix 1.
It should be noted that the information on the number of links is taken from the AltaVista database. It cannot be guaranteed that the information held on the database is complete.
NetMechanic [10] was used to analyse the quality of the National Node home pages. This included reporting the numbers of broken links and HTML errors on the page and the load time for the page. A summary of the findings is given in Appendix 1.
From the findings it will be noted that several of the Web sites appeared to contain broken links and HTML and browser compatibility errors. In some cases this was not actually the case - NetMechanic may sometimes provide incorrect results. However in a number of cases there were problems with the pages.
The Robot Exclusion Protocol [11] enables a Web site administrator to specify directories which robots should not access. Although it does not provide a security mechanism this protocol can be used to avoid search engines indexing draft documents and personal files. It can also be used to stop search engines from wasting server capacity by attempting to index files such as images, CGI scripts, etc.
A brief summary of the use of the Robot Exclusion Protocol for the National Node Web sites is given in Appendix 1.
It was noted that none of the Web sites appeared to support the Robot Exclusion Protocol. This may not be an issue, as it is likely that National Nodes will want all pages on their Web sites to be indexed by robots.
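As an illustration of how the Robot Exclusion Protocol works in practice, the sketch below uses Python's standard urllib.robotparser module to interpret a small robots.txt file and test whether a robot may fetch particular pages. The file contents, the robot name ExampleBot and the www.example.org addresses are hypothetical and are not taken from any National Node Web site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that keeps robots out of drafts, scripts and images.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /drafts/
Disallow: /cgi-bin/
Disallow: /images/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Load robots.txt rules from a string rather than fetching them."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def allowed(parser: RobotFileParser, path: str, agent: str = "ExampleBot") -> bool:
    """Report whether the named robot may fetch the given path."""
    return parser.can_fetch(agent, "http://www.example.org" + path)


if __name__ == "__main__":
    rules = build_parser(SAMPLE_ROBOTS_TXT)
    for path in ("/index.html", "/drafts/report.html", "/images/logo.gif"):
        print(path, "allowed" if allowed(rules, path) else "excluded")
```

A well-behaved robot performs this kind of check before each request. The protocol is advisory only, which is why it cannot be relied upon as a security mechanism.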
The 404 error page is displayed when a URL is given for a resource which does not exist. This may be due to an incorrect URL being contained in a HTML page, a page being moved or deleted or the end user typing in an incorrect URL.
A brief summary of the 404 error pages for the National Node Web sites is given in Appendix 1.
It is possible for the 404 error page to be branded with the Web site's visual identity. The 404 error page can also provide useful additional functionality, such as providing a search facility. However it was noted that none of the National Node Web sites provides branding or extra functionality on the 404 page.
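A tailored 404 page of the kind described above can be sketched with Python's standard http.server module. This is an illustration only: the site name, the /search address of the search form and the handler class are hypothetical, and a production Web server (Apache, IIS or Zope) would normally be configured through its own settings rather than with code like this.

```python
from html import escape
from http.server import BaseHTTPRequestHandler

SITE_NAME = "Example National Node"  # hypothetical site name


def build_404_page(path: str) -> str:
    """Return a branded error page with a search box for the missing path."""
    return (
        "<html><head><title>Page not found - " + SITE_NAME + "</title></head>"
        "<body><h1>" + SITE_NAME + "</h1>"
        "<p>Sorry, the page " + escape(path) + " could not be found.</p>"
        # The /search address is an assumption made for this sketch.
        '<form action="/search" method="get">'
        '<input type="text" name="q"> <input type="submit" value="Search">'
        "</form>"
        '<p><a href="/">Return to the home page</a></p>'
        "</body></html>"
    )


class BrandedHandler(BaseHTTPRequestHandler):
    """Serve a trivial home page and a branded 404 page for everything else."""

    def do_GET(self):
        if self.path == "/":
            status, body = 200, "<html><body><h1>" + SITE_NAME + "</h1></body></html>"
        else:
            status, body = 404, build_404_page(self.path)
        data = body.encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)

# To try it locally (not run here):
#   from http.server import HTTPServer
#   HTTPServer(("localhost", 8000), BrandedHandler).serve_forever()
```

The important point is that the response still carries the 404 status code, so that robots and link checkers can recognise the error, while the human reader sees the site's identity and a way forward.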
The Bobby tool [12] was used to analyse the accessibility of the main entry point for National Node Web sites. The results are given in Appendix 1.
It was noted that a number of the National Node Web sites contained P1 errors.
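P1 is read here as "Priority 1" in the W3C WAI Web Content Accessibility Guidelines, against which Bobby reports. One common Priority 1 problem is an image with no text equivalent. The sketch below shows how such a check might be automated with Python's standard html.parser module. It is a simplified illustration of a single check, not a description of how Bobby itself works, and the sample page is hypothetical.

```python
from html.parser import HTMLParser


class MissingAltChecker(HTMLParser):
    """Collect the src of every <img> element that has no alt attribute."""

    def __init__(self):
        super().__init__()
        self.missing = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "img" and "alt" not in attributes:
            self.missing.append(attributes.get("src", "(no src)"))


def images_without_alt(html: str) -> list:
    """Return the sources of images lacking a text equivalent."""
    checker = MissingAltChecker()
    checker.feed(html)
    return checker.missing


if __name__ == "__main__":
    # Hypothetical page fragment: one image with alt text, one without.
    sample = (
        '<p><img src="logo.gif" alt="CULTIVATE logo">'
        '<img src="banner.gif"></p>'
    )
    print(images_without_alt(sample))
```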
A summary of the search facility provided on National Node Web sites is given below.
The following Web sites provided a search facility: Austria, Czech Republic, Denmark, Estonia, Finland, Hungary, Ireland, Latvia, Lithuania, Norway, Russia, Slovenia and the UK. This is a total of 13 of the 21 Web sites.
In order to establish how easy it was to find the CULTIVATE National Node Web sites using a search engine Google was used to search for "Cultivate country" where country was replaced by the English spelling of the country. The results are given in Appendix 2.
It is pleasing to report that, for all of the countries, the National Node Web site, the central CULTIVATE Web site or the Cultivate Interactive e-journal was found in the first page of results, in numbers ranging from the first 4 results to all 10 results on the first page provided by Google! These findings are even more impressive if one takes into account the fact that the search term is in English and not in the country's native language.
Based on these findings the following recommendations are made:
The analysis of the National Node Web sites has shown that many improvements have been made in comparison with the approaches taken in providing Web sites for the National Focal Points. The CULTIVATE project itself has benefited from the consistent approaches which have been made. It is especially pleasing to discover that a search for the term "cultivate" (a widely used word) obtains so many hits from the Google search engine, and that a search for this term in conjunction with the country name will also provide relevant information for all countries. The CULTIVATE name is clearly a valuable asset to the project (and the European Commission itself).
Although the National Node Web sites are clearly search engine-friendly, the survey has identified a number of areas in which minor modifications to the Web sites should be made in order to improve the Web sites' accessibility and interoperability. It is hoped that this survey has helped in identifying these areas, and that the tools which are accessible from this article will be used by the National Nodes both to enhance their Web sites and to examine other National Node Web sites in order to make use of best practices.
A summary of the findings is given in the following table:
| No. | Node | Server | NetMechanic Analysis | Accessibility | 404 Page | robots.txt | Links To Site (from AltaVista) | Pages Indexed (by AltaVista) |
| 1 | Central CULTIVATE Web Site | Zope/Zope 2.3.0 (source release, python 1.5.2, linux2) ZServer/1.1b1 on Solaris 8 | Link check - 1 bad links; HTML check - 16 errors; Browser compatibility - 15 problems; Load time = 6.35 secs | 0 P1 errors | Zope default | None | 187 | 23 |
| 2 | Austria's CULTIVATE Web Site | Microsoft-IIS/5.0 on Windows 2000 | Link check - 0 bad links; Other information not available (framed Web site) | At least 1 P1 errors | Server default | None | 1 | 0 |
| 3 | Belgium's CULTIVATE Web Site | Microsoft-IIS/5.0 on Solaris 8 | Link check - 0 bad links; Browser compatibility - 0 problems; HTML check - 1 errors; Load time = 6.80 secs | 2 P1 errors | Server default | None | 1 | 0 |
| 4 | Bulgaria's CULTIVATE Web Site | Microsoft-IIS/4.0 on NT4/Windows 98 | Link check - 0 bad links; HTML check - 0 errors; Browser compatibility - 0 problems; Load time = 5.68 secs | 2 P1 errors | Server default | None | 2 | 0 |
| 5 | Czech Republic's CULTIVATE Web Site | Apache/1.3.14 (Unix) PHP/4.0.5-dev on Linux | Link check - 1 bad links; HTML check - 20 errors; Browser compatibility - 14 problems; Load time = 10.72 secs | 1 P1 errors | Server default | None | 1 | 0 |
| 6 | Denmark's CULTIVATE Web Site | Zope/Zope 2.3.0 (source release, python 1.5.2, linux2) ZServer/1.1b1 on Solaris 8 | Link check - 0 bad links; HTML check - 19 errors; Browser compatibility - 12 problems; Load time = 4.95 secs | 0 P1 errors | Zope default | None | 1 | 0 |
| 7 | Estonia's CULTIVATE Web Site | Apache/1.3.14 (Unix) (Red-Hat/Linux) mod_perl/1.23 on Linux | Link check - 3 bad links; HTML check - 13 errors; Browser compatibility - 14 problems; Load time = 8.07 secs | 1 P1 errors | Server default | None | 1 | 0 |
| 8 | Finland's CULTIVATE Web Site | Microsoft-IIS/5.0 on Windows 2000 | Link check - 4 bad links; HTML check - 28 errors; Browser compatibility - 11 problems; Load time = 5.73 secs | 0 P1 errors | Server default | None | 1 | 0 |
| 9 | Germany's CULTIVATE Web Site | mod_perl/1.18 Apache/1.3.4 (Unix) (SuSE/Linux) PHP/3.0.7 on Linux | Link check - 2 bad links; HTML check - 22 errors; Browser compatibility - 0 problems; Load time = 5.16 secs | 1 P1 errors | Server default | None | 42 | 0 |
| 10 | Greece's CULTIVATE Web Site | Apache/1.3.14 (Unix) tomcat/1.0 on Linux | Link check - 0 bad links; HTML check - 63 errors; Browser compatibility - 27 problems; Load time = 10.52 secs | 0 P1 errors | Server default | None | 2 | 0 |
| 11 | Hungary's CULTIVATE Web Site | Apache/1.3.9 (Unix) Debian/GNU on Linux | Link check - 2 bad links; HTML check - 14 errors; Browser compatibility - 14 problems; Load time = 8.44 secs | 1 P1 errors | Server default | None | 0 | 0 |
| 12 | Ireland's CULTIVATE Web Site | Zope/Zope 2.3.0 (source release, python 1.5.2, linux2) ZServer/1.1b1 on Solaris 8 | Link check - 0 bad links; HTML check - 15 errors; Browser compatibility - 15 problems; Load time = 5.70 secs | 0 P1 errors | Zope server default | None | 2 | 20 |
| 13 | Israel's CULTIVATE Web Site | Microsoft-IIS/5.0 on Windows 2000 | Link check - 0 bad links; HTML check - 1 error; Browser compatibility - 3 problems; Load time = 14.19 secs | 0 P1 errors | Server default | None | 2 | 0 |
| 14 | Italy's CULTIVATE Web Site | Microsoft-IIS/4.0 on NT4/Windows 98 | Link check - 0 bad links; HTML check - 2 errors; Browser compatibility - 1 problem; Load time = 12.15 secs | 1 P1 errors | Server default | None | 1 | 1 |
| 15 | Latvia's CULTIVATE Web Site | Microsoft-IIS/4.0 on NT4/Windows 98 | Link check - 2 bad links; HTML check - 59 errors; Browser compatibility - 12 problems; Load time = 14.30 secs | 1 P1 errors | Server default | None | 2 | 1 |
| 16 | Lithuania's CULTIVATE Web Site | Apache/1.3.14 (Unix) PHP/4.1.2 on Solaris | Link check - 6 bad links; HTML check - 22 errors; Browser compatibility - 8 problems; Load time = 7.04 secs | 0 P1 errors | Server default | None | 2 | 0 |
| 17 | Netherlands' CULTIVATE Web Site | Apache/1.3.14 (Unix) mod_perl/1.21 PHP/3.0.12 on Solaris | Link check - a bad links; HTML check - NA errors; Browser compatibility - NA problems; Load time = NA secs | 1 P1 errors | Server default | None | 1 | 0 |
| 18 | Norway's CULTIVATE Web Site | Microsoft-IIS/4.0 on NT4/Windows 98 | Link check - 2 bad links; HTML check - 28 errors; Browser compatibility - 12 problems; Load time = 5.30 secs | 0 P1 errors | Server default | None | 2 | 0 |
| 19 | Russia's CULTIVATE Web Site | Microsoft-IIS/5.0 on Windows 2000 | Link check - 0 bad links; HTML check - 34 errors; Browser compatibility - 13 problems; Load time = 9.26 secs | 1 P1 errors | Server default | None | 2 | 0 |
| 20 | Slovenia's CULTIVATE Web Site | Apache on HP-UX | Link check - 3 bad links; HTML check - 14 errors; Browser compatibility - 15 problems; Load time = 11.36 secs | 0 P1 errors | Tailored | None | 0 | 0 |
| 21 | Spain's CULTIVATE Web Site | Apache/1.3.4 (Unix) tomcat/1.0 on Compaq Tru64 | Link check - 0 bad links; HTML check - NA errors; Browser compatibility - NA problems; Load time = NA secs | 0 P1 errors | Server default | None | 1 | 1 |
| 22 | Sweden's CULTIVATE Web Site | Microsoft-IIS/4.0 on NT4/Windows 98 | Link check - 0 bad links; HTML check - 1 error; Browser compatibility - 0 problems; Load time = 6.12 secs | 1 P1 errors | Server default | None | 9 | 1 |
| 23 | UK's CULTIVATE Web Site | Zope/Zope 2.3.0 (source release, python 1.5.2, linux2) ZServer/1.1b1 on Solaris 8 | Link check - 0 bad links; HTML check - 17 errors; Browser compatibility - 11 problems; Load time = 7.11 secs | 0 P1 errors | Zope default | None | 3 | 0 |
As a comparison the survey has been carried out for the Cultivate Interactive e-journal. The findings are given in the following table.
| No. | Site | Server | NetMechanic Analysis | Accessibility | 404 Page | robots.txt | Links To Site (from AltaVista) | Pages Indexed (by AltaVista) |
| 1 | Cultivate Interactive's Web Site | Microsoft-IIS/4.0 on NT4/Windows 98 | Link check - 0 bad links; HTML check - 21 errors; Browser compatibility - 14 problems; Load time = 11.90 secs | 0 P1 errors | Tailored | Exists | 412 | 161 |
The information in the table was collected between 10 and 13 May 2002.
It should be noted that there may be limitations in the services used to carry out this survey. For example, it has been noticed that the NetMechanic link-checking service does not understand the HTML element <BASE> which can provide an alternative directory for relative URLs.
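The effect of the <BASE> element on link checking can be illustrated with a short Python sketch using the standard html.parser and urllib.parse modules. A checker that ignores <BASE> resolves relative links against the address of the page itself and may therefore report working links as broken. The addresses and class names below are hypothetical, and the code is not a description of how NetMechanic works.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Record the first <base href> and every <a href> found in a page."""

    def __init__(self):
        super().__init__()
        self.base_href = None
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        href = attributes.get("href")
        if not href:
            return
        if tag == "base" and self.base_href is None:
            self.base_href = href
        elif tag == "a":
            self.hrefs.append(href)


def resolve_links(page_url: str, html: str) -> list:
    """Resolve the links in a page, honouring <BASE> when it is present."""
    collector = LinkCollector()
    collector.feed(html)
    base = page_url
    if collector.base_href:
        base = urljoin(page_url, collector.base_href)
    return [urljoin(base, href) for href in collector.hrefs]


if __name__ == "__main__":
    page_url = "http://www.example.org/nodes/index.html"  # hypothetical
    html = (
        '<html><head><BASE HREF="http://www.example.org/content/"></head>'
        '<body><a href="about.html">About</a></body></html>'
    )
    # With <BASE> honoured the link points into /content/, not /nodes/.
    print(resolve_links(page_url, html))
```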
The search engine Google was used to search for the term "Cultivate country", where country was replaced by the English name of each country. This survey took place on 14 May 2002. The results are given in the following table.
| | Site | Search | Comments |
| 1 | Austria | Search for "Cultivate Austria" | All of the first 10 hits seem relevant |
| 2 | Belgium | Search for "Cultivate Belgium" | All of the first 10 hits seem relevant |
| 3 | Bulgaria | Search for "Cultivate Bulgaria" | The first 7 hits seem relevant |
| 4 | Czech Republic | Search for "Cultivate Czech Republic" | The first 6 hits seem relevant |
| 5 | Denmark | Search for "Cultivate Denmark" | The first 7 hits seem relevant |
| 6 | Estonia | Search for "Cultivate Estonia" | The first 5 hits seem relevant |
| 7 | Finland | Search for "Cultivate Finland" | The first 3 hits seem relevant |
| 8 | Germany | Search for "Cultivate Germany" | The first 8 hits seem relevant |
| 9 | Greece | Search for "Cultivate Greece" | The first 7 hits seem relevant |
| 10 | Hungary | Search for "Cultivate Hungary" | The first 6 hits seem relevant |
| 11 | Ireland | Search for "Cultivate Ireland" | All of the first 10 hits seem relevant |
| 12 | Israel | Search for "Cultivate Israel" | The first 6 hits seem relevant |
| 13 | Italy | Search for "Cultivate Italy" | The first 5 hits seem relevant |
| 14 | Latvia | Search for "Cultivate Latvia" | The first 5 hits seem relevant |
| 15 | Lithuania | Search for "Cultivate Lithuania" | All of the first 10 hits seem relevant |
| 16 | Netherlands | Search for "Cultivate Netherlands" | The first 4 hits seem relevant |
| 17 | Norway | Search for "Cultivate Norway" | The first 9 hits seem relevant |
| 18 | Russia | Search for "Cultivate Russia" | All of the first 10 hits seem relevant |
| 19 | Slovenia | Search for "Cultivate Slovenia" | The first 5 hits seem relevant |
| 20 | Spain | Search for "Cultivate Spain" | The first 5 hits seem relevant |
| 21 | Sweden | Search for "Cultivate Sweden" | The first 6 hits seem relevant |
| 22 | UK | Search for "Cultivate UK" | The first 9 hits seem relevant |
Note that in a search for the term "Cultivate" all the results shown on the first page are relevant to the CULTIVATE project.
It should be noted that this table only reports on links from pages which are indexed by Google. It will not, for example, report on links from pages on Intranets.
Brian Kelly
UK Web Focus
UKOLN
University of Bath
Bath
BA2 7AY
URL: <http://www.ukoln.ac.uk>
Email: b.kelly@ukoln.ac.uk
Brian Kelly is UK Web Focus. He works for UKOLN, which is based at the University of Bath.
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
For citation purposes:
Kelly, B. "WebWatching CULTIVATE National Node Web Sites", Cultivate Interactive, issue 7, 11 July 2002
URL: <http://www.cultivate-int.org/issue7/webwatch/>
Copyright ©2000 - 2006 University of Bath. | Published by UKOLN | Design by ILRT | Contact Us ISSN 1471-3225 |