By René van Horik - January 2001
René van Horik reports on the European Visual Archive Project (EVA), which reviews the obstacles and alternatives in providing access to the photographic collections of public archives. EVA aims to create a working information system for end users allowing them to discover the rich photographic resources of public archives.
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
"Historical photograph collections are among the least accessible sources available to researchers because of their large size, complex organisation, physical fragility, and often-rudimentary description and cataloguing. Most consist of large groups of related materials that share one or more significant common denominators, such as source, subject, or medium. That common feature often serves as the framework for organizing and providing access to the individual pieces" [1].
This quotation from Stephen Ostrow's report Digitizing Historical Pictorial Collections for the Internet describes the general starting point of the EVA project. The project investigates issues relevant to enhancing access to historical photographic collections: copyright, selection procedures, user surveys, digitization techniques, description standards, pricing policy and digital information management systems. Based on the outcomes of this research a Web-based information system is being developed: the EVA system. This system contains descriptions and digital images from the photographic holdings of two city archives: the London Metropolitan Archives and the City archives of Antwerp.
The EVA project has two main audiences: image producers and image consumers. Based on the outcomes of the project an archive will be able to digitize and document its photographs in a well thought-out way. The low threshold for joining the EVA system gives collections a tool for reaching a large potential audience of image consumers [2]. These users can search the image descriptions, view reference images and order images for specific use.
The purpose of this article is to report on the main outcomes of the studies carried out within the framework of the project and to describe the starting points on which the EVA system is based.
EVA is part of the INFO2000 initiative launched by the European Commission. INFO2000 projects are multi-national, public-private sector partnerships that exploit public sector information. Both end-users and information-holders throughout the member states of the European Union should benefit from the project results. The project started in December 1998 and ended in February 2001. After completion of the EVA project, the EVA system will be further expanded and developed [3].
The project follows a more or less natural path that starts with an inventory of relevant issues regarding the exploitation of historical photographs and leads to an information system that best meets the requirements of the content providers and the collection users [4].
The project started with an analysis of historical photographic collections and a survey among users. It became clear that users are especially interested in historical photographs of the built environment. This played a role in the selection of the images to be presented in the information system that was developed later in the project. One of the project partners, the European Commission on Preservation and Access, extended the research to a broad international scale and published the results in the printed and online publication In the Picture [5]. The study made clear that many different institutions hold photographic collections that can be considered an essential part of European cultural heritage. For only a small minority of the institutions is commercial exploitation of the collection an important activity. The total number of photographs held is huge and covers a wide range of materials. Preservation of the collections is often problematic [6]. Almost all institutions in the survey have started digitization projects. Digitization techniques and documentation schemes are enormously diverse, and a short-term view often prevails over a long-term vision. The outcomes of the study revealed the need for common standards and guidelines in the field of digitization and documentation.
Early in the project, copyright was studied in a European context in order to avoid legal problems once the photographs were available online. An EVA project report showed the differences between copyright law in Belgium, the United Kingdom and the Netherlands [7]. Harmonization under the new EU copyright directive has not yet been realised. In selecting photographs for online access, the content providers followed their national copyright laws.
The quality and standards research of the project took a closer look into alternatives to digitize and describe historical photographs. Several good guidelines are available to base a digitization and documentation project on. But the ultimate standard does not exist, because several factors should be taken into consideration. After the discussion of the outcomes of the preparatory studies the EVA project team reached the following conclusions:
The initial project studies served as the basis for the digitization of the collection and the development of a working model, the EVA system. Both are described in more detail later on.
In this section the background on the creation of digital images and descriptions for the EVA system is described.
Based on the results of the preparatory studies the content providers of the project, the City archives of Antwerp and London Metropolitan Archives, each started by selecting photographs and creating 10,000 digital master files [8]. These digital images had to be rich enough to serve as the basis for derivative images that are published online in the EVA system. The specifications for these online images are given in table 1. In addition, the digital master files will be used for other types of output, such as the creation of high quality prints. This avoids a labor-intensive re-scan of the vulnerable originals. The two archives did not start from scratch. Both already had experience with digitizing photographs, so several images could be reused for the project.
The online access system to be developed in the project should contain small thumbnail images for a fast global reference to the original. Next to that a reference image should reveal the complete essence of the original photograph to the user. It took some discussion before agreement on the specification of the reference image was reached, because an image with too many picture elements could lead to an unintended use of the images. The reference image should give a fair impression of the details of the original photographs on a standard computer screen (800 x 600 pixels) but should not facilitate the creation of a high quality print on paper.
| Image type | Thumbnail | Reference |
| Purpose | Global reference | Fair impression of the original on a standard computer screen (800 x 600 pixels) |
| Pixel dimension | 50 pixels in longest dimension | 400 pixels in the longest dimension |
| Image dynamics | 256 gray levels | 256 gray levels |
| Image file format | Jpeg | Jpeg |
| Remarks | Visual watermark contains copyright statement at the bottom of the image |
Table 1: specifications of derivative images that are available in the online access system (EVA system)
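The pixel dimensions in table 1 fix only the longest side of each derivative, so the shorter side follows from the aspect ratio of the master file. A minimal sketch of that arithmetic is given here. The function and variable names are hypothetical and are not part of the EVA software, and the actual resampling, gray-level conversion, JPEG encoding and watermarking are left to whatever imaging tool an archive uses.

```python
# Longest side in pixels for each derivative type, taken from table 1.
DERIVATIVES = {"thumbnail": 50, "reference": 400}


def derivative_size(width, height, longest):
    """Scale (width, height) so that the longest side equals `longest` pixels."""
    if width <= 0 or height <= 0 or longest <= 0:
        raise ValueError("dimensions must be positive")
    scale = longest / max(width, height)
    # Round to whole pixels and never go below one pixel on the short side.
    return max(1, round(width * scale)), max(1, round(height * scale))


def plan_derivatives(width, height):
    """Return the target size of every derivative for one master file."""
    return {
        name: derivative_size(width, height, longest)
        for name, longest in DERIVATIVES.items()
    }
```

For a landscape master file of 3000 x 2000 pixels this yields a thumbnail of 50 x 33 pixels and a reference image of 400 x 267 pixels.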
Image 1: An example of a reference image from the EVA system: 400 pixels in the longest dimension, 256 gray levels, JPEG image file format, visual copyright statement at the bottom of the image.
Scheme 1: Illustration of the digital image production principle of the EVA system. Archives create digital master files from which derivatives are produced. These are sent by the standard Internet protocol FTP to the server of the EVA system. Archives can add, delete and replace images independently.
Digitizing historical photographs is more than just putting photographic prints on a scanner. A lot of information associated with the creation of digital images is relevant for (future) use, access, update and maintenance of the images and for their relation to the original prints. This information (or data) about data is called metadata. It turned out that within the universe of discourse of the EVA project several metadata schemes are of potential importance. This is because, roughly speaking, the EVA project covers three related things: first, the historical photograph as a physical medium; second, the digital surrogate that is based on the photograph; and third, what is visible on the photograph and the processed digital image. For the sake of abstraction these three things together (the photograph format, the digital image and the visible scene or content) are called an EVA visual object, abbreviated as EVO. It was the initial ambition of the project to develop a description scheme that covers all aspects of an EVO. The elements for this scheme are taken from several relevant professional communities. The archival community developed a standard for the description of archival holdings. Initiatives in the digital imaging world resulted in important technical descriptive data elements for digital still images. Concerning the content of visual sources, museum and library organisations have produced relevant metadata schemes. Finally, research into the history of photography resulted in interesting documentation protocols. The project reviewed several description schemes but concluded that more research and consultation with domain specialists is required to establish a complete EVO metadata scheme. The project therefore designed a provisional, simple and small description object: the EVOlite. More information on the EVOlite is given below.
The project concentrated its further description activities on a specific type of application: the minimal description elements that are required to set up the Web-based information system that should give access to the digital images. The Dublin Core Metadata Element Set, a draft ANSI standard, served as the basis for the development of this minimal set of descriptive elements [9]. The 15 Dublin Core description elements were evaluated. Not all elements were considered necessary for the minimal description of a digital image. To avoid ambiguous use of the selected elements the EVA project described their semantic interpretation in detail. The next step was to translate the description elements into XML elements. The rationale behind this is explained further on in the article. As the EVA system is based on a relational database management system, the XML elements had to be connected to database fields. The table below lists the metadata elements that are used in the project. It can be observed that the EVA interpretation of the Dublin Core qualifier in some cases relates to the original photograph and in others to its digital surrogate.
| Dublin Core Qualifier | EVA interpretation of the DC qualifier | Element in XML file (part of EVOlite DTD) | Field name in DBMS of EVA system (relation between tables is not given) |
| Title | Short description in the original language of what is visible on the digital image. The title can include a date. | Title (required) | Title |
| Creator | Name of photographer that took the original photograph | Photographer (optional) | Photographer |
| Subject / Keyword | Descriptive terms related to the content in the original language according to the documentation policy as used by the local archive. No adjustment to any common authority list | Subject (optional) | Subject |
| Description | Free text description of what is visible on the digital image. | Description (optional) | Description |
| Publisher | Name of archive that provides the EVA system with the images. | Archive (required) | Archive |
| Date | Date connected to the creation of the original photograph. Two alternatives: exact date (day/month/year) or period (begin year and end year) | Date (optional) (day / month / year) (note: only year is required); Timeperiod (optional) (beginyear / endyear) | Date_day / Date_month / Date_year; Year_begin / Year_end |
| Identifier | Name of the reference image and the thumbnail image. | Location (required). This element has two attributes: thumbnail and refimg | Thumbnail Image |
| Language | Language used in the elements: title, description, subject/keyword and coverage. | Not a separate XML-element but an attribute of an EVOlite element | |
| Relation | Reference to original photograph in the physical archive, e.g. its inventory code. | Relation (optional) | Relation |
| Coverage | All geographic terms connected to the description in the national language (e.g. street, district, city, and country). No adjustment to any common authority list. | Geography (optional) | Geography |
| Type | Not used | ||
| Format | Not used | ||
| Source | Not used | ||
| Rights | Not used. Note: the archive creates a separate Copyright statement that is valid for its complete collection that is available online. |
Table 2: Description elements used by the EVA project for the description of the images that are available in the EVA system
It should be noted that the EVA system has only three obligatory description elements: a title, the names of the thumbnail and reference images, and the collection holder (archive). In principle this is enough to give an end user sensible, if minimal, access to the collection of historical images. This minimal documentation scheme is quite a contrast with the ambitious, extensive EVO description concept described above. But the creation of descriptions is very labor intensive and both archives already have descriptions available according to their own tradition. The archive information systems used by the City archives of Antwerp and London Metropolitan Archives cover all kinds of archival sources and serve more purposes than just resource discovery of photographic units for remote access. It turned out to be more efficient to agree on a limited generic set of description elements that can be extracted automatically from the local information system than to create a separate extensive documentation scheme according to the EVO concept.
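The idea of extracting a small generic set of elements from a richer local record can be sketched as follows. The element names and the three required elements come from table 2. The local field names (`caption`, `maker` and so on) and both function names are purely hypothetical, because the article does not describe the record structure of either archive system.

```python
# EVOlite elements from the third column of table 2; True marks a required element.
EVOLITE_ELEMENTS = {
    "Title": True,
    "Photographer": False,
    "Subject": False,
    "Description": False,
    "Archive": True,
    "Date": False,
    "Timeperiod": False,
    "Location": True,
    "Relation": False,
    "Geography": False,
}

# Hypothetical field names of a local archive system mapped to EVOlite elements.
LOCAL_TO_EVOLITE = {
    "caption": "Title",
    "maker": "Photographer",
    "keywords": "Subject",
    "holder": "Archive",
    "image_files": "Location",
    "inventory_code": "Relation",
    "place": "Geography",
}


def extract_evolite(local_record):
    """Keep only the mapped, non-empty fields of a local record."""
    result = {}
    for local_name, element in LOCAL_TO_EVOLITE.items():
        value = local_record.get(local_name)
        if value:
            result[element] = value
    return result


def missing_required(evolite_record):
    """List the required EVOlite elements that a record does not provide."""
    return sorted(
        name
        for name, required in EVOLITE_ELEMENTS.items()
        if required and name not in evolite_record
    )
```

Fields that exist only in the local system, such as internal notes, are simply ignored, which reflects the point that the local systems serve more purposes than the online catalogue.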
For the implementation of the data exchange between the local archive information systems and the central EVA system the project decided to use the XML standard, an application-independent data format. For each description of a photograph a separate XML file is created. An XML document contains special instructions called tags, which usually enclose identifiable parts of the document. The tags that are used by the project are given in the third column of table 2. The elements that are allowed are specified in a DTD (document type definition). The DTD used by the EVA system is called the EVOlite DTD [10]. In this way self-describing documentation units are created. Two examples of XML formatted descriptions can be found in figure 1.
The creation of the XML files in principle is the responsibility of the archives. Within the project software and procedures were developed to assist them in the creation of output in XML format. In the future probably more and more information systems will facilitate the creation of data in XML format and it will become easier to manage data consistency between a local archive management system and the Web-based access system. Just like with the images the XML files are sent via FTP to the server of the EVA system. The archives can independently add, change and delete descriptions. The process is illustrated in scheme 2.
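As an illustration of what producing such a file could involve, the sketch below builds one description with Python's standard XML library. It is not the software developed in the project. The element names and the `thumbnail` and `refimg` attributes of `Location` follow table 2, but the root element name `EVOlite`, the placement of the language attribute `lang` on the root, and the function name are assumptions, and the Date and Timeperiod elements are left out because their exact structure is not given here.

```python
import xml.etree.ElementTree as ET

OPTIONAL_ELEMENTS = ("Photographer", "Subject", "Description", "Relation", "Geography")


def build_evolite(title, archive, thumbnail, refimg, lang="en", **optional):
    """Serialize one photograph description as an EVOlite-like XML string."""
    # Root element name and the 'lang' attribute are assumptions (see lead-in).
    root = ET.Element("EVOlite", {"lang": lang})
    # The three required elements of table 2.
    ET.SubElement(root, "Title").text = title
    ET.SubElement(root, "Archive").text = archive
    ET.SubElement(root, "Location", {"thumbnail": thumbnail, "refimg": refimg})
    # Optional elements are passed as lowercase keyword arguments.
    for name in OPTIONAL_ELEMENTS:
        value = optional.get(name.lower())
        if value:
            ET.SubElement(root, name).text = value
    return ET.tostring(root, encoding="unicode")
```

A call such as `build_evolite("Kerk", "Stadsarchief Antwerpen", "t.jpg", "r.jpg", lang="nl", geography="Antwerpen")` returns a string that could be written to one file per photograph before the upload step.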
Figure 1: Two example descriptions in XML format according to the EVOlite DTD. Both have the obligatory elements ‘title’, ‘archive’ and ‘location’. In addition, the descriptions contain several of the optional elements.
Scheme 2: Illustration of the documentation production principle of the EVA system. Archives create XML files, extracted from the local archive information system. The XML files are sent by the standard Internet protocol FTP to the server of the EVA system. Archives can add, delete and replace XML files independently.
The EVA system is an information system designed mainly to provide access to individual photographs that are part of distributed photographic collections. Via the Internet [2] the user can access a catalogue of historical photographs of the current content providers of the project and search and browse through the description fields. The user can view digital images of historical photographs and order prints and digital images. The user can be any individual person or organization, e.g. the multimedia industry and publishers. To give access to as many users as possible, some multilingual functions are part of the system; these are described further on. The EVA system makes it possible for any type of user to order specific items, but the actual transaction takes place directly between the user and the owner of the item (the archive). The system informs the user about prices, formats and shipping procedures for the item the user is interested in. The basic principle of the EVA system is given in scheme 3.
Scheme 3: Basic principle of the EVA system.
As can be seen in scheme 3 the EVA system aims at two types of usage: end-users interested in access to a catalogue of images and descriptions, and users interested in the results of the EVA project and the model of the EVA system. Based on the information on the Web site an archive employee should be able to evaluate the relevance of the project results for the conversion and dissemination of the archive's own collection.
The XML formatted descriptions are automatically converted to the database on which the EVA system is based. The fields of this database are given in the fourth column of table 2. Periodically the database is refreshed with new information that is sent to the server by the archives with the help of the FTP protocol. The interface between the database and the end-user consists of several Web pages. Image 2 contains the advanced search screen of the system. The input fields are based on the database that contains information that originates from the XML formatted files provided by the archives.
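The conversion step from an XML description to database fields might look roughly like the sketch below, which uses SQLite only to stay self-contained. The article does not name the database product or give the table design, so the single flat `photo` table, the reading of "Thumbnail Image" in table 2 as two columns, the root element name and the function names are all assumptions.

```python
import sqlite3
import xml.etree.ElementTree as ET

TEXT_FIELDS = (
    "Title", "Photographer", "Subject", "Description",
    "Archive", "Relation", "Geography",
)


def create_table(conn):
    """Create a flat table with field names taken from table 2 (layout assumed)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS photo ("
        "Title TEXT NOT NULL, Photographer TEXT, Subject TEXT, Description TEXT, "
        "Archive TEXT NOT NULL, Relation TEXT, Geography TEXT, "
        "Thumbnail TEXT NOT NULL, Image TEXT NOT NULL)"
    )


def load_description(conn, xml_text):
    """Parse one XML description and insert it as one database row."""
    root = ET.fromstring(xml_text)
    row = {name: root.findtext(name) for name in TEXT_FIELDS}
    location = root.find("Location")
    # Title, Archive and Location are the three required elements.
    if row["Title"] is None or row["Archive"] is None or location is None:
        raise ValueError("description lacks a required element")
    row["Thumbnail"] = location.get("thumbnail")
    row["Image"] = location.get("refimg")
    columns = ", ".join(row)
    placeholders = ", ".join(":" + name for name in row)
    conn.execute(f"INSERT INTO photo ({columns}) VALUES ({placeholders})", row)
```

Rejecting a file that lacks a required element at load time keeps incomplete descriptions out of the catalogue; a periodic refresh would call `load_description` once for every file received from the archives.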
Image 2: Screen dump of the advanced search screen of the EVA system.
One of the goals of the project is to facilitate the searching of images in several languages. This paragraph briefly describes how the project implemented this requirement [11].
As photographs by nature do not contain character strings, a multilingual query function could only be based on texts that cover the domain of the scenes visible on the photographs. Because the project could not find an existing corpus of terms or a thesaurus that sufficiently matches the content of the images, it was decided to develop a specific EVA lexicon. This was done by collecting many kinds of electronic texts that have a relation with the photographs to be converted. The data from London Metropolitan Archives was in English and the City archives of Antwerp provided Dutch texts. The first step towards the creation of the EVA lexicon was the automatic extraction of the Dutch and English terms. A manual editing process was part of this step, which resulted in a list of 6,000 terms. The Dutch part of this list of terms (or lexicon) was translated into English and the English items were translated into Dutch. Then all terms were translated into an additional language, German, to demonstrate the possibility of retrieving descriptions in a language other than the original one. In principle the lexicon can be extended with additional terms and translations. The resulting lexicon was reviewed and adjusted by the project partners. As an example two of the 6,000 terms are given. The English term Church, the German term Kirche and the Dutch term Kerk are available in the lexicon. Another entry in the lexicon is the English Building, the German Gebäude and the Dutch Gebouw. The terms originally come from the textual descriptions from London Metropolitan Archives and/or the City archives of Antwerp.
In order to achieve more extended hits and improved results of the database search, the terms in the lexicon are connected with a network of related expressions. This is done in the so-called expansion list that defines broader and narrower relations. This man-made expansion list is created in English, because all project partners can review a list in this language. The expansion list contains, for example, the broader term Building with its narrower term Church. Based on this English expansion list both German and Dutch expansions can be created because the translations are available in the lexicon. In case a lexicon term in any given language has more than one meaning in one or both of the other two languages, the synonyms are entered in the lexicon.
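The interplay between lexicon and expansion list can be illustrated with the two example entries mentioned above. The data structures and function names below are hypothetical, terms are stored in lowercase for matching (an assumption), and expanding a broader term to its narrower terms is only one plausible reading of how the expansion list is applied.

```python
LANGUAGES = ("en", "de", "nl")

# Each lexicon entry holds one concept in the three project languages.
LEXICON = [
    {"en": "church", "de": "kirche", "nl": "kerk"},
    {"en": "building", "de": "gebäude", "nl": "gebouw"},
]

# Expansion list, maintained in English: broader term -> narrower terms.
EXPANSION = {"building": ["church"]}


def find_entry(term):
    """Return the lexicon entry that contains `term` in any language, or None."""
    needle = term.casefold()
    for entry in LEXICON:
        if needle in entry.values():
            return entry
    return None


def translations(entry):
    """Return the terms of one entry in a fixed language order."""
    return [entry[lang] for lang in LANGUAGES]


def expand(term):
    """Translate a term and add the translations of its narrower terms."""
    entry = find_entry(term)
    if entry is None:
        return [term.casefold()]  # unknown terms are passed through unchanged
    result = translations(entry)
    for narrower in EXPANSION.get(entry["en"], []):
        narrower_entry = find_entry(narrower)
        if narrower_entry is not None:
            result.extend(translations(narrower_entry))
    return result
```

Because the expansion list is keyed on the English term, a Dutch query for Gebouw is first resolved to its lexicon entry and then expanded through the English relation, which is how a single English list can serve all three languages.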
Scheme 4: Implementation of multilingual search facilities in the EVA system.
The available lexicon and expansion lists are used by a software component that is able to process a search string typed in by a user in the interface of the EVA system. This piece of software, more or less a black box, is called the Query Translation and Expansion (QTE) component [12]. This QTE component evaluates the search string of the user, tries to find a translation, synonym or "expanded" term and constructs a search string in the standard query language SQL. This SQL statement is processed by the EVA system and the output is presented to the user.
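Since the QTE component is described as more or less a black box, the following is only a guess at the shape of its final step: turning a list of translated and expanded terms into one SQL statement. The searched columns, the `photo` table, the use of `LIKE` and the function name are assumptions. Values are passed as parameters rather than pasted into the statement.

```python
import sqlite3

# Text columns assumed to be searched; the names follow table 2.
SEARCH_COLUMNS = ("Title", "Description", "Subject", "Geography")


def build_query(terms, table="photo"):
    """Build a parameterized SELECT that matches any term in any search column."""
    if not terms:
        raise ValueError("at least one search term is required")
    clauses = []
    params = []
    for term in terms:
        for column in SEARCH_COLUMNS:
            clauses.append(f"{column} LIKE ?")
            params.append(f"%{term}%")
    # The table name is a fixed identifier chosen by the caller, not user input.
    sql = f"SELECT Title FROM {table} WHERE " + " OR ".join(clauses)
    return sql, params
```

Given the terms church, kirche and kerk, the resulting statement finds English and Dutch descriptions alike, so a user typing Kirche would retrieve photographs described in either original language.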
For the moment the implementation of the multilingual search facility can be considered as a prototype. The lexicon and expansion list can be adjusted without changing the QTE component. The component pre-processes the human readable lexicon and expansion list. After adjusting the list a new pre-processing step has to be executed.
As can be seen in scheme 4, the quality of the currently implemented multilingual search functionality depends very much on the extent to which the descriptive texts on which the lexicon is based correspond with the data in the XML files submitted to the EVA system later on in the project. Another quality factor is the influence of the manual editing actions that are part of the construction process of the lexicon.
Other collections can join the EVA system. In order to evaluate the possibilities a collection should pay attention to the following issues:
1. Do you have a photographic collection that is available for access by users via the Web?
2. Do you have digital master files that facilitate creation of derivatives (reference images / thumbnails) according to the EVA guidelines (see table 1)?
3. Do you have a description format for images that can be mapped to the description format of the EVA system (see table 2)?
4. Are you prepared to consider the optional activities required to expand the lexicon and expansion list with terms that cover the specific domain of your photo collection? It is even possible to expand the system with more languages than the currently available English, German and Dutch.
5. Can you deliver documentation in XML format? For each individual photo an XML file according to the EVOlite DTD should be created (see the examples in figure 1).
6. Do you have an order procedure statement that describes the formats, prices and usage of image reproductions?
7. Do you have a copyright statement that is applicable to all online images?
8. Can you receive and process orders that are sent by the EVA system in an email?
9. Can you upload data (images / documentation) via FTP to the server of the EVA system?
10. Are you prepared to pay for the data-transmission and data-storage of your collection (images / description) that is part of the EVA system?
If all questions are answered positively it will not be difficult to join the EVA system and present your collection to a huge user community. More information can be found on the EVA Web site.
The current version of the EVA system is an important end result of the now finished EVA project. In another sense it is also the starting point for further enhancement and fine-tuning in the future. We cannot yet foresee to what extent other collections and end-users will be interested in using the system, so the adjustments will be based on presently unknown factors. We think, however, that the current version of the EVA system serves both collection holders and image users in an efficient way.
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
René van Horik
Researcher / Project manager
NIWI
Netherlands Institute for Scientific Information Services
P.O. box 95110
1090 HC Amsterdam
The Netherlands
Rene.van.horik@niwi.knaw.nl
<http://www.niwi.knaw.nl/>
René van Horik is employed as a researcher and project manager at NIWI (Netherlands Institute for Scientific Information Services). He is involved in research and projects in the field of the conversion, dissemination and archiving of cultural heritage sources.
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
For citation purposes:
van Horik, R. "Archives and Photographs: the European Visual Archive Project (EVA)", Cultivate Interactive, issue 3, 29 January 2001
URL: <http://www.cultivate-int.org/issue3/eva/>