By Pete Cliff - July 2002
Pete Cliff fills us in on a useful tool for Web site owners that brings them distinct benefits with relatively little maintenance effort and enhances what the site can offer to its users. This is an enthusiast who can provide different approaches to implementing RSS.
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Imagine you could create a Web site that keeps your users informed of the latest news, jobs and resources available in a given subject area in addition to any content you wish to provide. A site that is updated automatically so that once you have set it up, it looks after itself, with minimal maintenance from you. Sounds like so much fiction? A little idealistic perhaps, but it is this sort of thing that a technology like RSS aims to provide.
This article explores what RSS is, how it may be used, and finally summarises some of the tools you may use to make the job easier. We begin with the obvious question:
Depending on whom you ask, RSS stands for either "Rich Site Summary" or "RDF Site Summary". The debate is old [1], and this article will not concern itself with that discussion. Suffice it to say the ideas and the motivation behind the creation of an RSS channel remain the same, regardless of the "flavour".
XML Deviant Leigh Dodds defines RSS as:
... an XML format for syndicating metadata about online content [2].
RSS is an XML format designed to enable the sharing of online content metadata. It can be used to describe the content of a Web site in a way that can be re-used by others. Because it is XML, it facilitates the full automation of the sharing and display of this metadata.
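To make this concrete, here is a minimal sketch of reading a channel in Python with the standard library's xml.etree.ElementTree. The channel is an invented RSS 0.91-style document (the titles and example.org URLs are illustrative, not a real feed), and the helper name read_channel is likewise an assumption made for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical RSS 0.91-style channel; titles and URLs are invented.
SAMPLE_RSS = """<?xml version="1.0"?>
<rss version="0.91">
  <channel>
    <title>Library News</title>
    <link>http://www.example.org/news/</link>
    <description>New books and events</description>
    <item>
      <title>New books for July</title>
      <link>http://www.example.org/news/july-books.html</link>
    </item>
    <item>
      <title>Summer opening hours</title>
      <link>http://www.example.org/news/hours.html</link>
    </item>
  </channel>
</rss>"""


def read_channel(rss_text):
    """Return the channel title and a list of (title, link) pairs."""
    channel = ET.fromstring(rss_text).find("channel")
    items = [(item.findtext("title"), item.findtext("link"))
             for item in channel.findall("item")]
    return channel.findtext("title"), items


channel_title, channel_items = read_channel(SAMPLE_RSS)
print(channel_title)
for title, link in channel_items:
    print(f"- {title}: {link}")
```

The point is only that the metadata (a title and a link per item) can be pulled out mechanically by any XML parser, which is what makes automated sharing and display possible.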
The use of RSS benefits everyone involved. As a Web site provider you get free content; as a content provider you make your content visible to a broader audience through a larger number of sites; and as an end user you get the benefit of a broad range of content available at a single site.
RSS is an incredibly flexible format. It can be used to describe just about any sort of online content. The latest news, the new books in the library, job advertisements and similar content can be pointed to via an RSS channel. That RSS channel can then be used by other institutions or local departments.
Imagine that a university carries around twenty links to campus (and other) news stories on its internal homepage. Each of the departments within the same university has similar news "channels" on its homepage in addition to departmental news. Imagine users are interested in finding out about new research grants awarded in each department. Unless the homepage was very comprehensive, they would have to visit every department, fairly regularly, just to keep up to date. If they have their own favourite external news feeds, the problem gets bigger.
If each of the departments chose to export its news as RSS channels, it would be a simple matter for the institution to create a single "information point" that syndicated all of the news across campus onto a single page.
Figure 1. uPortal[3] supports the use of RSS to create institutional portals, like this demonstrator at the University of Nottingham[4]
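The syndication of departmental channels onto a single "information point" might be sketched as follows. This is an illustration under stated assumptions, not a description of how uPortal works: the department names, channel contents and the aggregate function are all invented, and the channels are inlined as strings so that the sketch runs without network access.

```python
import xml.etree.ElementTree as ET

# Hypothetical departmental channels; a real aggregator would fetch
# each channel from its URL instead of using inline strings.
CHANNELS = {
    "Physics": """<rss version="0.91"><channel><title>Physics News</title>
        <item><title>New research grant awarded</title>
              <link>http://physics.example.ac.uk/news/grant.html</link></item>
        </channel></rss>""",
    "History": """<rss version="0.91"><channel><title>History News</title>
        <item><title>Visiting lecture series</title>
              <link>http://history.example.ac.uk/news/lectures.html</link></item>
        <item><title>Archive opening hours</title>
              <link>http://history.example.ac.uk/news/archive.html</link></item>
        </channel></rss>""",
}


def aggregate(channels):
    """Collect the items of every channel into one list of dicts."""
    merged = []
    for department, rss_text in channels.items():
        channel = ET.fromstring(rss_text).find("channel")
        for item in channel.findall("item"):
            merged.append({
                "department": department,
                "title": item.findtext("title"),
                "link": item.findtext("link"),
            })
    return merged


for entry in aggregate(CHANNELS):
    print(f"[{entry['department']}] {entry['title']} <{entry['link']}>")
```

Each department keeps control of its own channel; the central page only reads them.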
This idea could be extended further to provide the end user with the option to search the channels, or personalise which channels are displayed. This is, in effect, the beginnings of an institutional portal. A portal where the content creation is distributed, but access is at one point for any end user.
A very similar scenario could be imagined for users of a public library, where local news, jobs, new books, etc. may be syndicated onto a single library homepage, but the news is collected from other sources, including local government pages.
The key advantage is that Web site maintenance is devolved to those who know the content best (a devolution that may well already exist if departments all have their own pages), while a mechanism remains for providing a central, one-stop access point to all of this information.
Many Web homepages are lists of news items that highlight events, publications, and so on, and a great many other sites have news sections or pages that are lists of the same kind. It makes sense to provide RSS views of these listings, but if you have both an HTML view and an RSS view of a page, you will have to update them both. This creates a problem of synchronisation.
There are many ways around this problem. Perhaps the most elegant is to put all of the news stories in a database and generate both RSS and HTML views from this data. However, if there is a single database this can remove the benefits of devolved maintenance of RSS channels. A database is also a fairly development-heavy solution. If, however, you were looking to move your entire site into a content management system you may want to check if content can be exported as RSS as well as HTML.
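As a rough sketch of the "one source, two views" idea, the following Python generates both an HTML list and an RSS 0.91-style channel from the same records. The STORIES list stands in for rows read from a database, and the function names, titles and URLs are invented for illustration; the RSS output has not been validated against any particular version of the specification.

```python
from html import escape
import xml.etree.ElementTree as ET

# Stand-in for rows read from a news database (invented data).
STORIES = [
    {"title": "Workshop conclusions online",
     "link": "http://www.example.org/news/1"},
    {"title": "Jobs & vacancies updated",
     "link": "http://www.example.org/news/2"},
]


def html_view(stories):
    """Render the stories as an HTML list for the Web page."""
    rows = [f'<li><a href="{escape(s["link"])}">{escape(s["title"])}</a></li>'
            for s in stories]
    return "<ul>\n" + "\n".join(rows) + "\n</ul>"


def rss_view(stories, channel_title, channel_link):
    """Render the same stories as an RSS 0.91-style channel."""
    rss = ET.Element("rss", version="0.91")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = channel_title
    ET.SubElement(channel, "link").text = channel_link
    ET.SubElement(channel, "description").text = channel_title
    for s in stories:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = s["title"]
        ET.SubElement(item, "link").text = s["link"]
    return ET.tostring(rss, encoding="unicode")


print(html_view(STORIES))
print(rss_view(STORIES, "Example News", "http://www.example.org/news/"))
```

Because both views are generated from the same records, they cannot drift out of step with each other.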
There is a compromise: generate the HTML for the site from the RSS, or vice versa.
This approach has been explored by the World Wide Web Consortium [5] and has been implemented by a few sites[6]. The idea is to include a set of fixed "classes" into your page (which must be well formed XHTML), and, provided the HTML can be parsed correctly, a script is able to convert the content of these classes into well formed RSS.
For example, look at the following HTML fragment:
<p><span class="rss:item"><a name="26-06-02"><strong>26 June 2002</strong></a><br />
<small class="rss-anchor"><a href="http://www.ukoln.ac.uk/web-focus/events/workshops/webmaster-2002/news.html#26-06-02">Workshop Conclusions</a></small><br />
The <a href="talks/conclusions/">workshop conclusions</a>, given by Brian Kelly,
are now available online. </span></p>[7]
Here we see that an item is enclosed in a single span and the link for that item is given the class rss-anchor. This enables an XHTML to RSS converter to create an RSS channel from this Web page.
Figure 2. HTML to RSS. A script is used to parse the HTML page and extract the relevant sections. This information is then used to create an RSS XML file.
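A converter of this kind could be sketched in Python as below. This is not the W3C's script or the one used by the sites mentioned; it is an illustration that relies only on the two class names shown in the fragment above. The channel title "Workshop News", the function names and the choice of an RSS 0.91-style output are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# The fragment from the article; it is well formed, so an XML parser can read it.
XHTML = """<p><span class="rss:item"><a name="26-06-02"><strong>26 June 2002</strong></a><br />
<small class="rss-anchor"><a href="http://www.ukoln.ac.uk/web-focus/events/workshops/webmaster-2002/news.html#26-06-02">Workshop Conclusions</a></small><br />
The <a href="talks/conclusions/">workshop conclusions</a>, given by Brian Kelly,
are now available online. </span></p>"""


def local_name(element):
    """Strip any '{namespace}' prefix so namespaced XHTML also works."""
    return element.tag.rsplit("}", 1)[-1]


def find_anchor(item_element):
    """Return the first <a> inside an element of class 'rss-anchor', or None."""
    for holder in item_element.iter():
        if holder.get("class") == "rss-anchor":
            for candidate in holder.iter():
                if local_name(candidate) == "a":
                    return candidate
    return None


def xhtml_to_rss(xhtml_text, channel_title, channel_link):
    """Build an RSS 0.91-style channel from elements of class 'rss:item'."""
    page = ET.fromstring(xhtml_text)
    rss = ET.Element("rss", version="0.91")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = channel_title
    ET.SubElement(channel, "link").text = channel_link
    for element in page.iter():
        if element.get("class") != "rss:item":
            continue
        anchor = find_anchor(element)
        if anchor is None:
            continue  # an item without a marked link is skipped
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = "".join(anchor.itertext()).strip()
        ET.SubElement(item, "link").text = anchor.get("href")
    return ET.tostring(rss, encoding="unicode")


# "Workshop News" is an invented channel title.
print(xhtml_to_rss(
    XHTML, "Workshop News",
    "http://www.ukoln.ac.uk/web-focus/events/workshops/webmaster-2002/news.html"))
```

The sketch only extracts a title and link for each item; a fuller converter might also carry the surrounding text across as a description.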
The reverse approach, generating HTML from RSS, has been adopted by a number of sites: Webreference[8], for example, or closer to home, the RDN[9]. The two "channels" on the RDN homepage (see Figure 3), Behind the Headlines and the RDN News, are automatically generated from two RSS channels [10]. We use the same two channels to create the News[11] and Behind the Headlines pages[12]. This means RDN staff need only maintain a single copy of the metadata, but it can be resurfaced in a number of ways.
Figure 3. The highlighted area is created from two RSS channels.
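Turning a channel into a fragment of HTML for a page can be sketched in a few lines of Python. The channel content and the function name rss_to_html are invented for illustration, and this is not the RDN's own code; the fragment it produces could be written to a file and pulled into a page by whatever include mechanism the server offers.

```python
from html import escape
import xml.etree.ElementTree as ET

# Hypothetical channel; titles and URLs are invented.
RSS = """<rss version="0.91"><channel>
<title>Example News</title>
<link>http://www.example.org/news/</link>
<item><title>Annual report published</title>
      <link>http://www.example.org/news/report.html</link></item>
<item><title>Q&amp;A session announced</title>
      <link>http://www.example.org/news/qa.html</link></item>
</channel></rss>"""


def rss_to_html(rss_text):
    """Render a channel as a heading followed by a list of links."""
    channel = ET.fromstring(rss_text).find("channel")
    lines = [f"<h2>{escape(channel.findtext('title'))}</h2>", "<ul>"]
    for item in channel.findall("item"):
        title = escape(item.findtext("title"))
        link = escape(item.findtext("link"), quote=True)
        lines.append(f'  <li><a href="{link}">{title}</a></li>')
    lines.append("</ul>")
    return "\n".join(lines)


print(rss_to_html(RSS))
```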
UKOLN's home page[13] is essentially a list of news items that highlight events, publications, announcements, etc.; in effect what has been happening at UKOLN. This list of news items is maintained as an RSS channel. It works like this (numbers in brackets relate to the diagram in Figure 4):
Figure 4. RSS to HTML (see list above)
UKOLN derives a number of benefits from this approach:
The approach adopted by UKOLN was, in fact, very easy to implement. However, it will not suit all Web sites, nor will everyone have the time, effort or access to the server to add SSIs, or write scripts. There has been some effort towards overcoming this problem by making the addition of an RSS news feed to your page as simple as adding a <script> tag.
For example, pasting the following HTML into one of your Web pages will automatically include the RDN's "Behind the Headlines" RSS channel in that page:
<script src="http://www.rdn.ac.uk/rss/viewer/?rss=http://www.rdn.ac.uk/rss/channels/behind-the-headlines.xml" ></script>
The "source" of the script tag is the URL of the RSS parser script, in this case the one used by the RDN (1). Part of that URL is the location of the RSS channel to parse and return to the browser. The RSS parsing script then retrieves the RSS channel (2); the RSS is then parsed and some HTML is created. Because the browser is expecting to get JavaScript back, this HTML is then wrapped in document.write() statements and passed back to the Web browser (4). At this point the Web browser has some standard JavaScript and deals with it in the usual way. (Numbers in brackets refer to the diagram in Figure 5.)
Figure 5. RSSxpressLite
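The wrapping step can be sketched in Python as follows. This is an illustration of the general technique, not the RSSxpressLite source: the function name html_to_javascript and the sample HTML are invented, and json.dumps is used here simply as a convenient way to produce a safely quoted JavaScript string literal.

```python
import json


def html_to_javascript(html):
    """Wrap each line of HTML in a document.write() statement."""
    statements = []
    for line in html.splitlines():
        # json.dumps gives a double-quoted literal with quotes and
        # backslashes escaped. "</" is also broken up so that the
        # literal cannot close the enclosing <script> element early.
        literal = json.dumps(line).replace("</", "<\\/")
        statements.append(f"document.write({literal});")
    return "\n".join(statements)


# Invented HTML of the kind a channel parser might produce.
HTML = '<ul>\n<li><a href="http://www.example.org/1">First story</a></li>\n</ul>'
print(html_to_javascript(HTML))
```

The server would return this text with a JavaScript content type, and the browser would then write the HTML into the page at the point where the script tag appears.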
Examples of this approach can be seen at:
Because the channel is only included when a Web browser processes the JavaScript, channel content does not get indexed by some robots. This approach is therefore not recommended where indexing is important, for example, on the UKOLN homepage.
RSS can provide you with ways of adding external content to your site, but what content is available? Because RSS has been around for a while and has been something of a success, there are many channels you can use. Public content can usually be found in registries of channels:
So you like the idea of RSS and want to get out there and start using it. But I guess you want to know how to create it and what it looks like. If you are thinking "RDF Site Summary", you may be getting nervous about the syntax and the complexities of RDF. If that is the case you needn't worry. You can get by with RSS 1.0 with a minimal knowledge of RDF (though you may wonder about some of the syntax), and if you are not interested in RSS 1.0 then the 0.9x versions appear far simpler and are still supported.
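To give a feel for the difference between the flavours, the sketch below carries the same invented item in an RSS 0.91-style channel and in an RSS 1.0 channel, and reads both with one Python function. The visible differences are that RSS 1.0 wraps everything in rdf:RDF, uses XML namespaces, and places items beside the channel rather than inside it. The channel content and the function name list_items are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# Namespace used by RSS 1.0 elements.
NAMESPACES = {"rss": "http://purl.org/rss/1.0/"}

# The same invented item in both flavours.
RSS_091 = """<rss version="0.91"><channel>
<title>Example News</title><link>http://www.example.org/</link>
<item><title>First story</title><link>http://www.example.org/1</link></item>
</channel></rss>"""

RSS_10 = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns="http://purl.org/rss/1.0/">
<channel rdf:about="http://www.example.org/rss.xml">
  <title>Example News</title><link>http://www.example.org/</link>
  <description>Invented channel</description>
  <items><rdf:Seq><rdf:li rdf:resource="http://www.example.org/1"/></rdf:Seq></items>
</channel>
<item rdf:about="http://www.example.org/1">
  <title>First story</title><link>http://www.example.org/1</link>
</item>
</rdf:RDF>"""


def list_items(rss_text):
    """Return (title, link) pairs from a 0.9x or a 1.0 channel."""
    root = ET.fromstring(rss_text)
    if root.tag.endswith("}RDF"):
        # RSS 1.0: namespaced items are children of rdf:RDF.
        return [(i.findtext("rss:title", namespaces=NAMESPACES),
                 i.findtext("rss:link", namespaces=NAMESPACES))
                for i in root.findall("rss:item", NAMESPACES)]
    # RSS 0.9x: plain items nested inside the channel.
    return [(i.findtext("title"), i.findtext("link"))
            for i in root.findall("channel/item")]


print(list_items(RSS_091))
print(list_items(RSS_10))
```

For simple uses such as listing titles and links, the RDF parts of RSS 1.0 can be treated as ordinary XML, as this sketch does.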
When it comes to creating RSS there are many ways you can go, and many tools you can use:
If you are lucky enough to have a content management system, or you simply serve your Web content from a database, then you will find creation of RSS very easy. Many CMSs will export RSS for you, and if they don't, you (or your friendly programmer) can easily add RSS export support using one of the many tools available[21]. Otherwise an editor will quickly get you started with RSS creation.
Figure 6. RSSxpress Editor Screenshot
However, you don't need to get caught up with worries about using RSS. Tools are available, such as the JavaScript options, that make using RSS very easy. There is no requirement for you to create RSS (though you may want to) before you can benefit from the myriad of content available.
Hopefully by now you will be starting to see the benefits of a technology like RSS. It provides a simple way to syndicate online content. Implementation may range from something as simple as including some JavaScript in a single Web page to using a content management system to syndicate external content and export your content for others to use.
Either way, the benefits of this easy, open XML success story are clear.
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Pete Cliff
Systems Developer, The Resource Discovery Network
UKOLN
University of Bath
BATH BA2 7AY
United Kingdom
URL: <http://www.ukoln.ac.uk/>
Email: p.d.cliff@ukoln.ac.uk
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
For citation purposes:
Cliff, P. "RSS - Sharing Online Content Metadata", Cultivate Interactive, issue 7, 11 July 2002
URL: <http://www.cultivate-int.org/issue7/rss-issue/>
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Copyright ©2000 - 2001 Cultivate. Published by UKOLN. Design by ILRT.