By Pasquale Savino and Costantino Thanos - July 2000
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Wide access to large information collections is of great potential importance in many aspects - economic, environmental, health, cultural, social, etc. - of everyday life. However, limitations in information and communication technologies have, so far, prevented the average person from taking much advantage of existing resources. Humanity, in its continuous evolution, has accumulated an enormous quantity of information, knowledge, experience, art treasures, etc. One only has to think of the art treasures contained in our archives, libraries and museums, or of the immense and precious collections of observational data in the areas of space exploration, earth sciences, the environment, medicine, etc. accumulated during the last century. A huge amount of material has also been produced as video material. Most European countries have national audiovisual archives holding historical documentaries produced during the twentieth century. Such material is extremely precious from a historical and cultural viewpoint.
The ECHO project aims at developing a Digital Library (DL) service for historical films belonging to large national audiovisual archives. Actually being able to see and hear an account of a historical event, filmed in the original context, is very different from reading about it. The ECHO services will allow a user to search and access these documentary film collections. Users will be able, for example, to see how an event was documented in its country of origin and how the same event was documented in other countries, or to investigate how different countries have documented a particular historical period of their life. One effect of the emerging digital library environment is that it frees users and collections from geographic constraints. This means that we have to work across languages, cultures, international standards, etc.
The project involves a number of European institutions holding or managing unique collections of documentary films, dating from the beginning of the century until the seventies. These collections are of great value since they document the different aspects (social, cultural, political, economic) of life in European countries during this period of time. The set of services implemented by ECHO will provide users with access to significant portions of their cultural heritage which would otherwise be almost inaccessible.
The emergence of the networked information system environment allows us to envision digital library systems that transcend the limits of individual collections to embrace collections and services that are independent of both location and format. In such an environment, it is important to support the interoperability of distributed, heterogeneous digital collections and services. Achieving interoperability among digital libraries is facilitated by conformance to an open architecture as well as agreement on items such as formats, data types, and metadata conventions.
ECHO aims at developing a long-term reusable software infrastructure and new metadata models for films in order to support the development of interoperable audiovisual digital libraries. In addition, through the development of new models for film metadata, intelligent content-based searching and film-sequence retrieval, video abstracting tools, and appropriate user interfaces, the project intends to improve the accessibility, searchability, and usability of large distributed historical audiovisual archives. Through the implementation of multilingual services and cross-language retrieval tools, the project intends to support users when accessing collections across linguistic, cultural and national boundaries. The ECHO system will be tested, in the first instance, on four national collections of documentary film archives (Dutch, French, Italian, Swiss). Other archives may be added at a later stage.
In order to render a digital library of this type feasible, the project has to solve the numerous technical problems that currently bar the inclusion of film information in the digital environment. The aim is to make the film collections available to as broad a range of users as possible. To achieve this goal, the project will:
The Informedia Digital Video Library was funded by the first NSF/ARPA/NASA Digital Library Initiative (DLI) from 1994-1998, and was the only DLI project focusing on full-content indexing and retrieval of audio and video material.
Media Archive® is a content management system built with a client/server architecture. The Media Archive® client components form an integrated application suite supporting a continuous workflow in documentation, retrieval and reuse. The six Media Archive® client components support every stage of an archive workflow, from the acquisition of new content, through the creation of the metadata and the precise selection of content to be retrieved, to its provision for further processing.
One of the intellectual challenges of this project is that of evaluating the vast collections contained in the national film archives in order to make available online the most useful film elements, as perceived by a variety of user communities.
Another important aspect of the project will be the addition of a layer of metadata to the film archives. Metadata elements as presently defined do not describe film data well. The project will, thus, define a metadata model for film information. A semi-automatic process will be designed by which existing local catalogue records can be integrated with metadata elements, automatically extracted during the indexing/segmentation of the film material, into a common description, i.e., the common metadata model.
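The semi-automatic integration described above can be illustrated with a minimal sketch. The field names, the `FilmRecord` class and the `merge_descriptions` function are all hypothetical and are not the ECHO metadata model; the sketch only assumes that catalogue values take precedence and that disagreements are flagged for a human cataloguer rather than resolved automatically.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class FilmRecord:
    """Hypothetical common description of one film (not the ECHO model)."""
    title: str
    language: str
    year: Optional[int] = None
    keywords: List[str] = field(default_factory=list)
    shots: List[Tuple[float, float]] = field(default_factory=list)
    needs_review: List[str] = field(default_factory=list)


def merge_descriptions(catalogue: dict, extracted: dict) -> FilmRecord:
    """Combine a local catalogue record with automatically extracted metadata."""
    record = FilmRecord(
        title=catalogue["title"],
        language=catalogue["language"],
        year=catalogue.get("year"),
    )
    # Union of keywords, catalogue terms first, case-insensitive de-duplication.
    seen = set()
    for keyword in catalogue.get("keywords", []) + extracted.get("keywords", []):
        normalised = keyword.lower()
        if normalised not in seen:
            seen.add(normalised)
            record.keywords.append(normalised)
    # Shot boundaries only exist on the automatically extracted side.
    record.shots = list(extracted.get("shots", []))
    # The "semi" in semi-automatic: conflicts are flagged, not silently resolved.
    detected = extracted.get("language")
    if detected and detected != record.language:
        record.needs_review.append("language")
    return record
```

Keeping the catalogue value and recording the conflict in `needs_review` is one possible design; a real system might instead store both values with their provenance.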
A collection service provides the mechanism for the aggregation of sets of digital objects into meaningful (from a given perspective) collections. Collections play an important role in the usability of a DL. The division of a DL into collections allows the application of collection-specific methods to improve discovery and access within those collections. Collection definition criteria are used to define which films are elements of which collections. The ECHO system will support a film collection service. Each ECHO collection could include films belonging to different content providers.
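As a rough illustration of collection definition criteria, the sketch below models each criterion as a predicate over a film description. The `CollectionService` class, its methods and the film fields are assumptions made for this example and do not describe the ECHO implementation.

```python
from typing import Any, Callable, Dict, List

Film = Dict[str, Any]
Criterion = Callable[[Film], bool]


class CollectionService:
    """Hypothetical collection service: a collection is a named predicate."""

    def __init__(self) -> None:
        self._criteria: Dict[str, Criterion] = {}

    def define(self, name: str, criterion: Criterion) -> None:
        """Register the definition criterion for a collection."""
        self._criteria[name] = criterion

    def members(self, name: str, films: List[Film]) -> List[Film]:
        """Return the films that satisfy the named collection's criterion."""
        criterion = self._criteria[name]
        return [film for film in films if criterion(film)]

    def collections_of(self, film: Film) -> List[str]:
        """Return the names of all collections a film belongs to."""
        return sorted(n for n, c in self._criteria.items() if c(film))


films = [
    {"id": "luce-1", "provider": "Istituto Luce", "year": 1936, "keywords": ["sport"]},
    {"id": "ina-7", "provider": "INA", "year": 1938, "keywords": ["sport", "paris"]},
    {"id": "nl-3", "provider": "NAA", "year": 1955, "keywords": ["floods"]},
]

service = CollectionService()
service.define(
    "sport-1930s",
    lambda f: "sport" in f["keywords"] and 1930 <= f["year"] < 1940,
)
```

Because membership is computed from the criterion rather than stored with the film, a single collection can span several content providers, as in the `sport-1930s` example.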
Design a multilingual user interface that provides functionality for accessing the ECHO collections independently of language.
Local site interfaces will be implemented in the local languages; however, a common user interface in English will also be maintained on the project Web-site for external access. Online cross-language search tools will be provided. Cross-language interrogation will be enabled by means of the employment of standard metadata formats, and via mechanisms which provide a mapping between the descriptive languages used by each partner.
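One simple way to picture a mapping between the descriptive languages of the partners is a shared table of language-neutral concept identifiers. The table, the identifiers and the function names below are invented for illustration; the project text does not specify how its mapping mechanisms work.

```python
from typing import Dict, Iterable, List, Set, Tuple

# Hypothetical mapping from (language, descriptor) to a shared concept ID.
TERM_TO_CONCEPT: Dict[Tuple[str, str], str] = {
    ("en", "war"): "C001",
    ("it", "guerra"): "C001",
    ("fr", "guerre"): "C001",
    ("nl", "oorlog"): "C001",
    ("en", "election"): "C002",
    ("it", "elezione"): "C002",
    ("fr", "élection"): "C002",
    ("nl", "verkiezing"): "C002",
}


def to_concepts(terms: Iterable[str], lang: str) -> Set[str]:
    """Translate descriptors in one language into shared concept IDs."""
    concepts = set()
    for term in terms:
        key = (lang, term.lower())
        if key in TERM_TO_CONCEPT:
            concepts.add(TERM_TO_CONCEPT[key])
    return concepts


def cross_language_search(query_terms: List[str], query_lang: str,
                          records: List[dict]) -> List[str]:
    """Return IDs of records sharing at least one concept with the query."""
    wanted = to_concepts(query_terms, query_lang)
    return [
        record["id"]
        for record in records
        if wanted & to_concepts(record["descriptors"], record["lang"])
    ]


records = [
    {"id": "luce-1", "lang": "it", "descriptors": ["guerra"]},
    {"id": "ina-7", "lang": "fr", "descriptors": ["élection"]},
]
```

With this table, an English query for "war" matches the Italian record without either side being translated word for word; unknown terms are simply ignored.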
The utility of the digital film library can be judged by the ability of the users to retrieve information they need easily and efficiently. The project will provide content-based searching and film-sequence retrieval. As the content is conveyed in both narrative (text and speech) and the image, a collaborative interaction of image, speech and language technology will be adopted in order to search the diverse film collections with satisfactory recall and precision. Three speech recognisers (Italian, French, and Dutch) will be built and integrated into the system architecture.
The project will develop techniques to produce visual summaries. The aim is to capture the content and structure of the underlying documentary film in a brief visual abstracting process. The summary will consist of a sequence of moving images, much shorter than the original film, but preserving the essence of the original message. It should provide a good overview of the entire film documentary.
In order to make a digital library of films possible, the copyright owners must be assured that their property will be properly protected and that its use will be measured in order to ensure them appropriate compensation. The project, therefore, will develop mechanisms which support the following functionality: access control, authentication, security, privacy, and billing.
Figure 1: System Overview
Figure 1 provides an overview of the main operations supported by the ECHO system. ECHO assists in the population of the digital library through the use of mechanisms for the automatic extraction of content. Using a high-quality speech recogniser, the sound track of each video source is converted to a textual transcript, with varying word error rates. A language understanding system then analyses and organises the transcript and stores it in a full-text information retrieval system. Multiple speech recognition modules for different European languages will be included. Likewise, image understanding techniques are used for segmenting video sequences by automatically locating boundaries of shots, scenes, and conversations. Metadata is then associated with film documentaries in order to complete their classification.
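The text does not say which image understanding technique locates shot boundaries. As an illustration only, the sketch below uses grey-level histogram differencing, a common and simple approach, and is not the method used by ECHO. Frames are assumed to be non-empty lists of grey values in the range 0 to 255, and the bin count and threshold are arbitrary choices.

```python
from typing import List, Sequence


def histogram(frame: Sequence[int], bins: int = 4, max_value: int = 256) -> List[float]:
    """Normalised grey-level histogram of one (non-empty) frame."""
    counts = [0] * bins
    for pixel in frame:
        counts[pixel * bins // max_value] += 1
    total = len(frame)
    return [count / total for count in counts]


def shot_boundaries(frames: Sequence[Sequence[int]], threshold: float = 0.5) -> List[int]:
    """Return indices of frames that start a new shot.

    A boundary is reported when the histogram distance between two
    consecutive frames (a value between 0 and 1) exceeds the threshold.
    """
    boundaries = []
    previous = None
    for index, frame in enumerate(frames):
        current = histogram(frame)
        if previous is not None:
            distance = sum(abs(a - b) for a, b in zip(current, previous)) / 2
            if distance > threshold:
                boundaries.append(index)
        previous = current
    return boundaries


# Three dark frames followed by two bright frames: one cut, at frame 3.
dark = [10, 20, 30, 40]
bright = [200, 210, 220, 230]
frames = [dark, dark, dark, bright, bright]
```

Histogram differencing detects hard cuts but tends to miss gradual transitions such as fades, which is one reason production systems combine several cues.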
Search and retrieval via desktop computers and wide area networks is performed by expressing queries on the audio transcript, on metadata, or by image similarity retrieval. Retrieved documentaries, or their abstracts, are then presented to the user. Through the collaborative interaction of image, speech and natural language understanding technology, the system compensates for problems of interpretation and search in error-prone and ambiguous data sets. Exploration of the ECHO library is based on these same techniques, allowing for spoken or typed natural language access to the information space.
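Querying the audio transcript can be pictured as a lookup in an inverted index whose postings point back to positions in the film. The sketch below is a minimal illustration under that assumption; the `TranscriptIndex` class and the sample segments are hypothetical, and it does not attempt to compensate for recognition errors.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

Posting = Tuple[str, float]  # (film ID, segment start time in seconds)
PUNCTUATION = ".,;:!?"


class TranscriptIndex:
    """Hypothetical inverted index over time-stamped transcript segments."""

    def __init__(self) -> None:
        self._postings: Dict[str, Set[Posting]] = defaultdict(set)

    def add_segment(self, film_id: str, start_seconds: float, text: str) -> None:
        """Index every word of one transcript segment."""
        for word in text.lower().split():
            self._postings[word.strip(PUNCTUATION)].add((film_id, start_seconds))

    def search(self, query: str) -> List[Posting]:
        """Return segments containing all query words, sorted by film and time."""
        words = [word.strip(PUNCTUATION) for word in query.lower().split()]
        if not words:
            return []
        hits = set(self._postings.get(words[0], set()))
        for word in words[1:]:
            hits &= self._postings.get(word, set())
        return sorted(hits)


index = TranscriptIndex()
index.add_segment("luce-1", 0.0, "The king visits Rome.")
index.add_segment("luce-1", 12.5, "Crowds greet the king")
index.add_segment("ina-7", 3.0, "Election results in Paris")
```

Because each posting carries a start time, a hit can take the user directly to the matching film sequence rather than to the beginning of the documentary.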
The project will follow an incremental approach to system development. Three prototypes will be developed offering an increasing number of functionalities. The starting point of the project will be a software infrastructure resulting from an integration of the Informedia and Media Archive® technologies.
The ECHO partners can be divided into three categories: (i) content providers (documentary film collection holders), (ii) industrial partners and (iii) academic partners.
The ECHO content providers (Istituto Luce, Italy; Institut National de l'Audiovisuel, France; Netherlands Audiovisual Archive, the Netherlands; and Memoriav, Switzerland) own very large and precious audiovisual collections which document the different aspects of life in their own countries during the twentieth century, starting from the twenties and continuing up to the sixties/seventies. The content providers will provide important input on user expectations and requirements for a system of this kind. They will be responsible for selecting, from their collections, several sets of meaningful and interrelated collections of film footage. These will be indexed using the common metadata model defined by the project and made publicly available on the Web for demonstration purposes.
The industrial partners (Tecmath, EIT, and Mediasite) will develop and implement the global architecture of the ECHO system.
There are two main academic partners (CNR and CMU) and four associate partners (CNRS-LIMSI, IRST, University of Twente, and University of Mannheim) with very specific tasks.
CNR-IEI, Pisa, an institute of the Italian National Research Council, is the project coordinator and is responsible for the overall technical management and system implementation and development.
Carnegie Mellon University (CMU) is a top US research institution. It has developed the Informedia digital video library project as one of the six Digital Library Initiative (DLI) projects funded jointly by NSF, DARPA and NASA. The goal of Informedia is to enable for video all the functionality and capability existing for textual information retrieval, while leveraging the temporal and visual qualities of video for richer information delivery. CMU will be responsible for the definition of the global system architecture and will be involved in the definition of the metadata model, and in the development and evaluation of the system prototypes. CNRS-LIMSI, IRST, University of Twente, and EIT will adapt and integrate speech recognition modules (for French, Italian, Dutch, and German) within the ECHO system and evaluate their performance. The University of Mannheim will adapt its video abstracting mechanism to the digital film context, integrate it within the ECHO system and evaluate its performance.
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Pasquale Savino
IEI-CNR
Via Alfieri, 1
56010 Ghezzano (PI)
Italy
P.Savino@iei.pi.cnr.it
<http://www.iei.pi.cnr.it/>
Phone: +39 050 315 2898
Pasquale Savino graduated in physics from the University of Pisa, Italy, in 1980. From 1983 to 1995, he worked at the Olivetti Research Labs in Pisa; since 1996, he has been a member of the research staff at the Information Engineering Dept. of the Istituto di Elaborazione della Informazione - an Institute of the Italian National Research Council in Pisa, working in the area of multimedia information systems. He has participated in and coordinated several CEC-funded research projects in the multimedia area.
Currently, he is involved in the EU-funded ECHO project, and he is the coordinator of the Italian National Research Council-funded Project "Museo Virtuale della Storia dell'Informatica in Italia".
He has published scientific papers in many international journals and conferences in the areas of multimedia document retrieval and information retrieval.
His current research interests are multimedia information retrieval, multimedia content addressability, and indexing.
Costantino Thanos
IEI-CNR
Via Alfieri, 1
56010 Ghezzano (PI)
Italy
C.Thanos@iei.pi.cnr.it
<http://www.iei.pi.cnr.it/>
Phone: +39 050 315 2910
Costantino Thanos is Head of the Information Engineering Dept. of the Istituto di Elaborazione della Informazione - an Institute of the Italian National Research Council in Pisa. He has coordinated a number of EC projects in digital libraries and related areas, currently including ECHO: European Chronicles On-line, a DL of film archives, and SCHOLNET, a DL service for a scholarly community. He was the Coordinator of the DELOS Working Group and is Director of the DELOS Network of Excellence for Digital Libraries.
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
For citation purposes:
Savino, P. & Thanos, C. "ECHO - European CHronicles On-line",
Cultivate Interactive, issue 1, 3 July 2000
URL: <http://www.cultivate-int.org/issue1/echo/>