Computer Applications and Quantitative Methods in Archaeology

Fourth meeting of the UK chapter

Cardiff University, Wales, February 27-28 1999

Developing an Archaeology Soapbox and Marketplace on the Web

 

Leonel Morgado, Mila Simões de Abreu

GeIRA Project

University of Trás-os-Montes e Alto Douro - Computer Centre

Apt. 202, 5001 Vila Real Codex, Portugal

Phone +351 (59) 320356 Fax: +351 (59) 320480

E-mail: leonelm@utad.pt, msabreu@utad.pt

Abstract

Although the World Wide Web is probably one of the most powerful and successful media for scientific publishing, brainstorming and communication existing today, finding relevant material about special interest areas like archaeology or rock-art can be frustrating. A search for "archaeology" on Web indexes like Yahoo (1) often generates dozens of links and when one browses apparently pertinent categories, it is common to find themes like “Books”, and “Companies” listed side-by-side with “Mummification” and “Folsom Points”!

Special interest Web sites could benefit greatly from appropriate expertise, as “About.com” (2) and other locations prove; however, many experts may not have the time and know-how to handle the specialised applications and programming required to put material online effectively. Intuitive resources and assistance must be available to make this possible. Initiatives like “Rock Art Net” (3) illustrate the advantages of navigational interfaces for intending contributors.

The proposed hosting address or soapbox for archaeology and connected fields would have a Web-site engine supporting the navigational and design requirements of researchers, so providing easily accessible space and media for publishing on the Internet. It would be enhanced by an index having various sections compiled and organised by experts in corresponding fields. A potentially throbbing marketplace of ideas—with discussion areas, mailing lists and other forms of interaction that are emerging—would go hand-in-hand with the soapbox.

Current status of Web indexes and hosting sites

On 16th June 1999, a search for "archaeology" on Yahoo yielded 80 "category matches", i.e., 80 assemblages of sites or further subdivisions, and 769 "sites", i.e., sites whose description or name included the word "archaeology".

While the category matches already include several specific themes, these are not derived from the "Archaeology" theme itself but rather from the overall information structure of Yahoo - some of the themes just happen to have archaeology sites associated.


This is reflected in the fact that, of the 80 category matches, only these 12 were actual category matches, the remaining 68 being regional matches: -

Social Science > Anthropology and Archaeology

Full Coverage > Science > Anthropology and Archaeology

Social Science > Anthropology and Archaeology > Archaeology

Social Science > Anthropology and Archaeology > Archaeology > Marine Archaeology

Business and Economy > Companies > Scientific > Anthropology and Archaeology

Arts > Humanities > History > By Time Period > Ancient History > Roman Empire > Archaeology

Society and Culture > Cultures and Groups > Cultures > Mayan > Archaeology

Social Science > Anthropology and Archaeology > Archaeology > Biblical Archaeology

Social Science > Anthropology and Archaeology > Archaeology > Urban Archaeology

Business and Economy > Companies > Travel > Tour Operators > Archaeology

Net Events > Social Science > Anthropology and Archaeology

The "Full Coverage" and "Net Events" links direct us to Yahoo pages with news on these subjects. Useful as they may be, they are not the document repository or information pointer we desire.

From the list above, we see that the most general link is "Social Science > Anthropology and Archaeology > Archaeology". Following it, we get 25 links to sites with general information and a categorisation of the subject: -

·          Regions (24)

·          Ancient Art

·          Archaeoastronomy (14)

·          Archaeometry (10)

·          Biblical Archaeology (15)

·          Books

·          Companies

·          Education (1)

·          Egyptology

·          Events (5)

·          Fieldwork and Expeditions (46)

·          Folsom Points (6)

·          Institutes (54)

·          Journals (22)

·          Magazines (5)

·          Marine Archaeology (61)

·          Megaliths (13)

·          Middle Ages (19)

·          Mummification (6)

·          Museums and Exhibits (43)

·          Organizations (39)

·          Prehistoric (27)

·          Remote Sensing (7)

·          Repatriation and Reburial Issues

·          Rock Art (36)

·          Tour Operators

·          Urban Archaeology (11)

·          Web Directories (12)

·          Zooarchaeology (7)

The main problem with this listing is its classification from the Web-site availability perspective: main themes such as "Education", "Companies" or "Books" are listed alongside "Folsom Points", "Mummification" and "Repatriation and Reburial Issues".

Web indexes compiled by a human expert yield much better targeting; a search on About.com for "archaeology" yields only one "About.com Recommends" and 1,493 non-categorised sites. Going straight into that "recommendation" ("Archaeology - Home Page"), we get a page with some recent additions, sites "In the Spotlight" and a categorisation of net links.

We still find the same classification problem: "Africa" is at the same level as "Scandinavia", "Computing" and "Ceramics".

While these sites are useful as more accurately designed pointers than Yahoo, because they are theme-oriented and not site-classification-oriented, the entire task is in the hands of one person. So far, no commercial site has employed the services of several experts for optimal classification of the secondary subjects.

Fortunately, several non-commercial sites provide expert indexing of the Web archaeological resources. These include the ArchNet, "the World Wide Web Virtual Library for Archaeology", as it is identified on its site (4) but also ARGE, the Archaeological Resource Guide for Europe (5), the CBA (6) links, for British Archaeology, and the BUBL (7) list of archaeological resources.


The category list of ArchNet, in particular, has a much nicer classification: -

Academic Departments
Archaeological Regions
    Africa
    Asia
    Australia and Pacific
    Central America
    Europe
    Near East
    North America
    South America
Featured Site
Museums on the Web
News & System Information
Other Resources
    Email Directory (WEDA)
    Electronic Journals
    Newsgroups & Listservs
    Publishers
Search ArchNet
Subject Areas
    Archaeometry
    Botanical
    Ceramics
    CRM & Government Agencies
    Educational Materials
    Ethnohistory / Ethnoarchaeology
    Faunal
    Geo-Archaeology
    Historic Archaeology
    Lithics
    Mapping and GIS
    Method & Theory
    Site Files & Tours
    Software
HELP

These expert-driven indexes are much more useful. But for successful maintenance and updating of such a categorisation, intervention of several experts is required. ARGE (5), for instance, states: "How do links end up in ARGE?" (...) "We find out about these new pages", [produced by people, on their own], "either because they then inform us of the URL directly, or because one of our correspondents has spotted the new URL (keep up the good work, you lot!), or because it turned up in the regular Web searches conducted by ARGE".

We have, therefore, one class of Web indexes that, for lack of human resources, classify poorly.

Another class fares less well on updating but is more useful, because its classification is driven by human experts.

Finally, we have sites like Rock Art Net (3), which present a classified index of a specific area, achieving a much more detailed categorisation, and which accept articles for publication on-line. (However, articles on the Web do not possess the extra information, graphic appeal and broadcast possibilities of Web sites.)

Available throughout the net, we also have a vast number of sites that host Web sites for free, such as Geocities (8) or Terràvista (9). This can be seen simply by searching Yahoo for "Free Web Hosting": on 16th June 1999, we got 5 category matches and 242 site matches. By allowing users to create Web sites, these services let anyone promote a site, a theme, a study area, or whatever they wish. But not many archaeologists are skilled in Web-site construction; and even if they are, some graphic design skills are also required if we want people to really stop by the site and read or gaze at its contents, since graphic appeal is key to broadcasting a site's information effectively (McGovern, 1999).

The search engine perspective and the importance of metadata

One might also consider that a way to find relevant information is to improve the accuracy of search engines such as Altavista (10), HotBot (11) and others. This is in fact a very important issue, but the service that search engines provide is quite distinct from that of Web indexes. The former provide bulk information, which the user must wade through and evaluate; the latter provide categorised information, not so vast in scope, but much more time-efficient regarding retrieval of useful information.

Our focus is on sites that provide this latter kind of service: indexing. One aspect of search engine technicalities is however of marginal interest and worth mentioning: the usage of metadata.

Metadata, meaning "data about data", provides a means to describe a document's content, allowing search engines to better associate the document with specific keywords.

Our concept, discussed below, can be summed up as "the mixing of indexers with hosting sites", i.e., creating a site that would be not only an index but also a place to host sites, with simple but efficient layout tools. Since the concept includes site-creation tools, those tools should incorporate metadata into the sites, to improve their visibility outside the indexer itself.

Weibel et al (1995) stated that "a reasonable alternative way to obtain usable metadata for electronic resources is to give authors and information providers a means to describe the resources themselves, without having to undergo the extensive training required to create records conforming to established standards".

However, it is not this paper's purpose to discuss the several archaeological applications of metadata, beyond registering its importance as a link to the overall Internet community, which frequently comes across sites only through search engines.

For the reader who wishes to delve into this subject, we point to the already quoted paper by Weibel et al (1995), which includes a description of the set of metadata elements known as the Dublin Core, and also to the draft paper by Miller (1996), about an application of the Dublin Core.
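As a rough illustration of how a site-creation tool might embed author-supplied metadata, the following sketch (written in Python, not in the engine's ASP) renders a few Dublin Core elements as HTML META tags. The "DC." name prefix follows a commonly used convention for embedding Dublin Core in HTML; the function name, the choice of elements and the sample values are assumptions made for this example only.

```python
from html import escape

# A small subset of Dublin Core element names, chosen for illustration.
DC_ELEMENTS = ("title", "subject", "date", "language", "identifier")


def dublin_core_meta(record):
    """Render a dict of metadata values as HTML META tags.

    Only the elements listed in DC_ELEMENTS are emitted, empty values
    are skipped, and values are HTML-escaped.
    """
    lines = []
    for element in DC_ELEMENTS:
        value = record.get(element)
        if value:
            lines.append(
                '<META NAME="DC.%s" CONTENT="%s">' % (element, escape(value, quote=True))
            )
    return "\n".join(lines)


# Hypothetical values an archaeologist might type into a form.
example = {
    "title": "An example rock-art site",
    "subject": "rock art; archaeology",
    "language": "en",
}
print(dublin_core_meta(example))
```

The author only fills in plain values; the tool decides how they are encoded, which is the kind of low-training metadata entry the quotation above argues for.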

Our concept

We propose a site that would act as a useful, specialised index for archaeology while offering archaeologists the possibility of having their sites hosted for free and with some amount of graphic design, in order to make sure they are at least minimally appealing. This follows on a basic concept included in Leusen et al's (1996) paper "Toward a European Archaeological Heritage Web". Leusen et al simply proposed a "service building on and extending the ways archaeological information is accessed by ArchNet and ArchWEB-NL", that would be complemented by "establishing one or more servers either dedicated entirely to archaeology or piggybacking on existing servers".

The number of servers dedicated to our proposed site is not relevant to the idea itself, since there are several purely technical solutions to the scalability problem, which are beyond the scope of this paper.

Our proposed "site" could be an entirely new site; it could also be an enhancement or companion to an already existing site that possesses the classification experts required. It expands on Leusen et al's idea by proposing the distribution of the development and updating of the categorisation, as well as providing a remote method for creation of graphically appealing pages.

The key ideas or "rules" behind it are: -

·         Categorisation by the site management is defined only at the top level;

·         Sub-level categorisation would be determined by an expert on the field;

·         Each expert would manage a category or define subcategories and assign managers to those subcategories;

·         Each category would have a collection of external links to relevant Web sites and internal links to sites hosted within the category.

·         Archaeologists’ sites hosted within this “soapbox site” should be accessible both by following the several categories and via a direct link or URL; i.e., a user should be able to reach the site both by browsing the categories or by typing a site’s direct URL.

·         The sites would be generated by use of a Web-site "engine" that would ensure an appealing look for each site.

·         A storage limit would have to be defined, as a safeguard measure against malicious attempts to overload the site-hosting machine(s). An archaeologist with a site too large to be accommodated can always contact the soapbox administrator, requesting more space. Since the imposed limits are safeguard measures, not economic restrictions, the assignment of the extra space should prove no problem at all, in most cases.
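The rules above can be pictured as a tree of categories, each with its own manager. The sketch below is one possible Python representation, given only to make the delegation and the two access paths concrete; every class, function, category and site name in it is hypothetical, and "soapbox.example" is a placeholder address.

```python
from dataclasses import dataclass, field


@dataclass
class Category:
    """One node of the index; all names here are illustrative."""
    name: str
    manager: str                                         # expert responsible for this category
    external_links: list = field(default_factory=list)   # URLs of outside sites
    hosted_sites: list = field(default_factory=list)     # short names of hosted sites
    subcategories: dict = field(default_factory=dict)    # name -> Category

    def add_subcategory(self, name, manager):
        """A category's manager delegates a subcategory to another expert."""
        child = Category(name, manager)
        self.subcategories[name] = child
        return child


def sites_by_category(root, path):
    """Reach hosted sites by browsing: follow a list of category names."""
    node = root
    for name in path:
        node = node.subcategories[name]
    return node.hosted_sites


def direct_urls(root, base_url):
    """Map every hosted site to a direct URL, independent of its category."""
    urls = {site: base_url + "/" + site for site in root.hosted_sites}
    for child in root.subcategories.values():
        urls.update(direct_urls(child, base_url))
    return urls


# The top level is defined by site management; lower levels by the experts.
root = Category("Archaeology", manager="site management")
rock_art = root.add_subcategory("Rock Art", manager="expert A")
iberia = rock_art.add_subcategory("Iberia", manager="expert B")
iberia.hosted_sites.append("example-site")
iberia.external_links.append("http://www.rupestre.net/")

print(sites_by_category(root, ["Rock Art", "Iberia"]))
print(direct_urls(root, "http://soapbox.example"))
```

Keeping the direct URL independent of the category path means a manager can reorganise subcategories without breaking links that visitors have already bookmarked.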

Previous Web engine system

In order to efficiently develop Web sites for museums, Morgado et al (1999) developed in 1997, at the University of Trás-os-Montes and Alto Douro (UTAD), in Portugal, a system for generating sites based on content rather than structure. The structure was based on a concept of using an adjustable layout, capable of accepting plug-in modules that would present specific information in some way. These modules could be vertical menus, scrolling text frames, photo albums, etc. This system was called a "web-site engine".

The programming structure of the original engine

The original engine's layout was structured in this way: every site would have an entry page and 5 sub-sections.

Every page would have a top area for navigation and a bottom area for displaying information. Examples can be seen at http://www.utad.geira.pt/museus/abadebacal/, http://www.utad.geira.pt/museus/ferromoncorvo/ and other sites.

Figure 1 presents the entry page for the Museum of Mogadouro, in Trás-os-Montes, Northern Portugal (http://www.utad.geira.pt/museus/mogadouro/salamuseu/).

Figure 1: area division on the original web engine's pages.

These areas were defined by use of frames, which allow different files to be assigned to different screen areas. By then coding them as Active Server Pages, or ASP, the pages could ensure the dynamic behaviour of the site.

For example: the top navigation page must have a graphical element for the name of the current area. Using HTML or DHTML, this would be expressed as:

<IMG src="CurrentArea.GIF" alt="Current Area">

However, using ASP, we can have a database hold the specifics of filenames, alternative text, links, etc. For instance, using a database table like the one on figure 1, we can query it for the information specific for each site:

Site ID    Element Name           Value
-------    -------------------    ---------------
5          Activities_Name        Exhibitions.GIF
5          Activities_Name_Txt    "Exhibitions"
12         Activities_Name        Activities.GIF
12         Activities_Name_Txt    "Activities"
...

Figure 1: Table with the sites’ elements.

ASP includes code within HTML by enclosing it within <% %> tags. A snippet of ASP code would look like this:

First, we open a connection to the database:

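<%
' Open a connection to the Museums database
Set ODBCasp = Server.CreateObject("ADODB.Connection")
ODBCasp.Open "DSN=Museums;UID=WebUser;PWD=WebUserPass;database=Museums"

' Keep the site id passed in the URL; CLng rejects non-numeric input
Session("id") = CLng(Request.QueryString("id"))

' Fetch the row describing that site
SQLQueryasp = "SELECT * FROM section WHERE ID=" & Session("id")
Set RSaspList = ODBCasp.Execute(SQLQueryasp)
%>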

These lines would place in the collection RSaspList the data for the Museum whose id number was passed in the URL.

We could use, afterwards:

<IMG src="<%=RSaspList("Activities_Name")%>" alt="<%=RSaspList("Activities_Name_Txt")%>">

This code is processed by the server and sent to the browser as static HTML. For instance, for the site with id 5, the client browser gets (see figure 2):

<IMG src="Exhibitions.GIF" alt="Exhibitions">

Figure 2: navigational frame for museum id=5.

But for the site with id 12, the client browser would get, from the same file (figure 3):

<IMG src="Activities.GIF" alt="Activities">

Figure 3: navigational frame for museum id=12.
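The same data-driven idea can be tried outside ASP. The following Python sketch uses an in-memory SQLite database as a stand-in for the museums database; the table layout, the function name and the use of a bound query parameter are assumptions of this example, not a description of the original engine.

```python
import sqlite3
from html import escape

# In-memory stand-in for the museums database; table and column names are assumed.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE section (ID INTEGER PRIMARY KEY, "
           "Activities_Name TEXT, Activities_Name_Txt TEXT)")
db.executemany("INSERT INTO section VALUES (?, ?, ?)", [
    (5, "Exhibitions.GIF", "Exhibitions"),
    (12, "Activities.GIF", "Activities"),
])


def render_area_image(site_id):
    """Return the IMG tag for one site, filled in from the database."""
    row = db.execute(
        "SELECT Activities_Name, Activities_Name_Txt FROM section WHERE ID = ?",
        (int(site_id),),   # bound parameter: the id never becomes part of the SQL text
    ).fetchone()
    if row is None:
        raise KeyError("unknown site id: %r" % (site_id,))
    src, alt = row
    return '<IMG src="%s" alt="%s">' % (escape(src), escape(alt))


print(render_area_image("5"))    # <IMG src="Exhibitions.GIF" alt="Exhibitions">
print(render_area_image("12"))   # <IMG src="Activities.GIF" alt="Activities">
```

One template function serves every site; only the row selected by the id changes, which is what lets a single set of files drive many sites.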

The engine used a folder tree to archive the required files, as presented in figure 4.

Root
    Section 1
        Site 1
        Site 2
        ...
    Section 2
        Site 1
        Site 2
        ...
    ...
        ...

Figure 4: Folder structure of the previous engine.

The root was common to all sites, with sub-folders existing for each sub-section. Within a sub-section, each folder would hold the files for a specific site. This folder tree, together with the database design, is one of the main reasons why the engine's number of sections was fixed.

For the museums' Web sites, for instance, there were the atrium (main page) and 5 sub-sections: collections, activities, contacts, free theme 1 and free theme 2. While this structure allowed a fair amount of customisation, it nevertheless required strict adherence to that layout.

This problem has been addressed, as detailed further on.

The root contained a file called index.asp, which is key to the operation of the engine: it must be called with a parameter, identifying the desired site. That parameter is then passed on to every file, as required, making the ASP code work as intended.

This parameter makes the URL look a bit awkward, something like:

http://www.domain.com/sites/index.asp?id=12

A more common URL, such as http://www.domain.com/MySite, is desired. This is achieved by having a single file in the /MySite root that redirects the browser to the complex link. For older browsers without JavaScript capabilities (Netscape version 1, Internet Explorer versions 1 & 2), a "click here" message is displayed, asking the user to follow the link manually.
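A redirect file of this kind could be generated automatically for each site. The Python sketch below builds one possible version of it; the markup, the function name and the default engine location are assumptions made for illustration, not a copy of the file actually used.

```python
from html import escape
from urllib.parse import urlencode


def redirect_page(site_id, engine_url="/sites/index.asp"):
    """Build the single HTML file placed in a site's friendly folder.

    Browsers with JavaScript jump straight to the engine URL; others
    see a plain "click here" link. `engine_url` is an assumed location.
    """
    target = engine_url + "?" + urlencode({"id": site_id})
    return (
        "<HTML><HEAD>\n"
        '<SCRIPT LANGUAGE="JavaScript">\n'
        "<!--\n"
        'window.location.href = "%s";\n'
        "// -->\n"
        "</SCRIPT>\n"
        "</HEAD><BODY>\n"
        '<A HREF="%s">Click here</A> to enter the site.\n'
        "</BODY></HTML>\n"
    ) % (target, escape(target))


print(redirect_page(12))
```

The script is wrapped in an HTML comment so that browsers which do not understand SCRIPT skip its contents and show only the fallback link.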

Required modifications

In order to use this engine for the proposed soapbox, several additional facilities were required:

·         Variable number of sub-sections or design elements per site;

·         Several layout designs to choose from;

·         Larger number of plug-in modules;

·         Interactive, simple interface;

·         Remote web administration;

The new Web engine system

To make the engine usable for the intended purposes, we modified the original to make it more flexible. We particularly intended to have: -

·         Variable number of sub-sections and associated navigational buttons;

·         Several structural layouts to choose from;

·         Variable number of design elements.

To achieve these goals, we had to address two issues: -

·         Folder structure;

·         Database structure.

Folder structure

Figure 5: folder structure of the current engine

In the new structure, presented in figure 5, there are only two folders within the root folder: one for content, another for the navigation models.

In the content folder, there is a folder for each site. These folders contain several other folders: one for navigation elements, another for design elements and one folder for each option present in the navigation frame.

These "option" folders are named with numbers, indicating the option presentation sequence. For example, the first option to be displayed will have its files in the "1" folder, the second in the "2" folder, etc. This allows for automated addition of any number of folders, and also automated on-the-fly re-ordering.
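Because the option folders are simply numbered, re-ordering them reduces to a set of folder renames. The following Python sketch only computes a rename plan and does not touch the disk; the function name and the temporary-name scheme are assumptions of this example.

```python
def reorder_plan(new_order):
    """Plan the folder renames needed to re-order a site's options.

    `new_order` lists the current folder numbers in the desired sequence,
    e.g. [3, 1, 2] means "the option now in folder 3 should come first".
    Returns (source, destination) folder-name pairs. Renames go through
    temporary names first so that no folder overwrites another.
    """
    if sorted(new_order) != list(range(1, len(new_order) + 1)):
        raise ValueError("new_order must be a permutation of 1..n")
    moves = [(old, new) for new, old in enumerate(new_order, start=1) if old != new]
    to_temp = [(str(old), "tmp_%d" % old) for old, _ in moves]
    to_final = [("tmp_%d" % old, str(new)) for old, new in moves]
    return to_temp + to_final


print(reorder_plan([3, 1, 2]))
```

Appending a new option needs no renames at all: its files simply go into the folder named with the next unused number.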

Database entities

The Web sites' content and models are stored in a single folder hierarchy; using the same approach, the sites' data and identification are stored in the same database as the models' data and identification.

We chose as our database entities: -

·         Sites

·         Information Elements

·         Navigation Elements

·         Models

·         Parameters


Database model

Figure 6: database model of the engine.

 

By establishing relations between these entities, the database model is defined (fig. 6).

To clarify how these entities work, our preliminary table definitions are on figures 7 & 8.

SITES

    ID          Identifier.
    Title       Site title: text to be displayed in the title area of the browser window.
    TemplateID  Id of the design model to be used when rendering this site.
    Location    Folder containing the site, in the \Content folder.

INFORMATION ELEMENTS

    ID          Identifier.
    SiteID      Id of the site this element belongs to.
    AltTag      Text to be displayed as the element if no image is available, or as an alt tag otherwise.
    ImageSrc    Image to be displayed. If left empty, element content will be the content of the AltTag field.
    Anchor      A tag (hypertext anchor) to be associated with the element (if desired). Allows definition of a bookmark or hyperlink (both with and without target frame).

NAVIGATION ELEMENTS

    ID          Identifier.
    SiteID      Id of the site this element belongs to.
    AltTag      Text to be displayed as the element if no image is available, or as an alt tag otherwise.
    ImageSrc    Image to be displayed. If left empty, element content will be the content of the AltTag field.
    Sequence    Position of the navigation element on all lists. Also the name of the folder with the files of the area it represents. Selecting the navigation element displays the index.html file in that folder. Target frame or bookmark is determined by the site's design model.

Figure 7: preliminary table definitions, part I.

MODELS

    ID          Identifier.
    Name        Model identification.
    Num_Param   Number of set-up parameters required for operation of this model. The function of each parameter is defined in the model’s files.
    Location    Folder containing the model’s files (within the \Models folder).

PARAMETERS

    ID          Identifier.
    ModelID     Model using this parameter.
    SiteID      Site for which the entry's value is valid.
    Name        Parameter id.
    Value       Parameter value, of use to the model ModelID, on the site SiteID.

Figure 8: preliminary table definitions, part II.
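To make the relations between these tables easier to follow, the sketch below declares them in an in-memory SQLite database from Python. Table and column names follow the preliminary definitions above; the SQL types, the foreign-key references and all sample rows are assumptions of this example.

```python
import sqlite3

# Column names follow the preliminary definitions; types and
# references are assumptions of this sketch.
SCHEMA = """
CREATE TABLE Models (
    ID INTEGER PRIMARY KEY, Name TEXT, Num_Param INTEGER, Location TEXT);
CREATE TABLE Sites (
    ID INTEGER PRIMARY KEY, Title TEXT,
    TemplateID INTEGER REFERENCES Models(ID), Location TEXT);
CREATE TABLE InformationElements (
    ID INTEGER PRIMARY KEY, SiteID INTEGER REFERENCES Sites(ID),
    AltTag TEXT, ImageSrc TEXT, Anchor TEXT);
CREATE TABLE NavigationElements (
    ID INTEGER PRIMARY KEY, SiteID INTEGER REFERENCES Sites(ID),
    AltTag TEXT, ImageSrc TEXT, Sequence INTEGER);
CREATE TABLE Parameters (
    ID INTEGER PRIMARY KEY, ModelID INTEGER REFERENCES Models(ID),
    SiteID INTEGER REFERENCES Sites(ID), Name TEXT, Value TEXT);
"""


def build_demo_db():
    """Create the tables in memory and add one made-up site."""
    db = sqlite3.connect(":memory:")
    db.executescript(SCHEMA)
    db.execute("INSERT INTO Models VALUES (1, 'Top frame', 1, 'topframe')")
    db.execute("INSERT INTO Sites VALUES (1, 'Example site', 1, 'example')")
    db.executemany(
        "INSERT INTO NavigationElements (SiteID, AltTag, ImageSrc, Sequence) "
        "VALUES (?, ?, ?, ?)",
        [(1, "Contacts", "", 2), (1, "Collections", "collections.gif", 1)],
    )
    db.execute("INSERT INTO Parameters (ModelID, SiteID, Name, Value) "
               "VALUES (1, 1, 'keywords', 'archaeology; example')")
    return db


def navigation_for(db, site_id):
    """Navigation elements of a site, in presentation order."""
    return db.execute(
        "SELECT Sequence, AltTag FROM NavigationElements "
        "WHERE SiteID = ? ORDER BY Sequence", (site_id,)).fetchall()


demo_db = build_demo_db()
print(navigation_for(demo_db, 1))   # [(1, 'Collections'), (2, 'Contacts')]
```

Because the number of navigation elements is just the number of rows for a site, nothing in the schema fixes how many sub-sections a site may have.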

Engine operation

The addition of new sites is performed by use of a specific Windows utility, not manually. A prototype of this engine is being used by UTAD for the several archaeology sites present in the IRAC '98 Web site (12).

First, one chooses the model to be used. Each one requires a different number of information/design elements, and will have different methods of deploying the navigation elements on the screen.

After choosing the model, the utility checks the database to see what information/design elements are required. Values are then requested for these parameters. The same procedure is taken regarding navigation elements. These can vary in number, since the model only determines their organisation and operation.

For example, the model can have a top or side frame with the navigation buttons and display the relevant pages on another frame. Or perhaps the model has a start page using just some navigational elements that display the content in a frame set.

To include a navigation element, the only requirements are its position in the navigation element sequence, the files that belong to the associated sub-site and which one of those is the index file. The utility can then copy the files into the proper folder within the engine's folder structure. Should a re-ordering of the navigation elements be required, this is a routine task for the utility.
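The copying step can be sketched as follows in Python. This is not the Windows utility itself: the function name, its arguments and the folder names used in the demonstration are hypothetical, and only plain files (not sub-folders) are copied.

```python
import shutil
import tempfile
from pathlib import Path


def install_option(content_root, site_folder, sequence, source_dir, index_file):
    """Copy one sub-site's files into <content_root>/<site_folder>/<sequence>/.

    The chosen index file is stored as index.html so the engine can find it.
    Returns the destination folder.
    """
    source = Path(source_dir)
    if not (source / index_file).is_file():
        raise FileNotFoundError(index_file)
    dest = Path(content_root) / site_folder / str(sequence)
    dest.mkdir(parents=True, exist_ok=False)   # refuse to overwrite an existing option
    for item in source.iterdir():
        if item.is_file():
            name = "index.html" if item.name == index_file else item.name
            shutil.copy2(item, dest / name)
    return dest


# Demonstration in a throw-away directory.
with tempfile.TemporaryDirectory() as tmp:
    upload = Path(tmp) / "upload"
    upload.mkdir()
    (upload / "home.html").write_text("<H1>Collections</H1>")
    (upload / "pot.gif").write_bytes(b"GIF89a")
    dest = install_option(Path(tmp) / "Content", "mysite", 1, upload, "home.html")
    print(sorted(p.name for p in dest.iterdir()))
```

Renaming the chosen index file on copy keeps the engine's convention (one index.html per numbered folder) without forcing authors to name their files in any particular way.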

Creating Web sites from the Web

Since the utility is simply a form-based program, a Web-based version can be programmed without major hurdles. The user's (i.e., the archaeologist's) task would be to supply the HTML and image files for each section. Since these are simply plugged into a navigation structure that already has a graphic design, the resulting site is quite appealing, both graphically and from a usability perspective, even though little has been done beyond composing text and scanning images.

Examples of this efficiency can be seen from the archaeology sites developed at the UTAD (12).

Management of the index site

A search engine could automatically index the sites developed with the aid of such an engine - this could be a service of the engine itself. Keywords for the site (metadata) could be part of a model's parameters (see section "The search engine perspective and the importance of metadata", above).

But the development of an index site is key to the entire project: its main structure must be defined by an archaeologist, each sub-section being managed and co-ordinated by an expert on the specific field. This would be a perfect project for inter-university co-operation, due to the high number of experts required. But the potential for broadcast and classification of archaeological information would be immense.

Getting the model down to a page's elements

Once a working model and utility exist, the model could be broken down into page elements: framed text, image tables, etc. Currently, these elements are available only as pieces of HTML code, usable by anyone but requiring HTML expertise. By applying the engine model to Web page construction, the concept could achieve a much better graphic design overall.

Internet URLs

(1) Yahoo!, http://www.yahoo.com/, Yahoo! Inc., 3420 Central Expressway, 2nd Floor, Santa Clara, CA 95051, USA.

(2) About.com, http://www.about.com/, About.com Inc., 220 E. 42nd St. 24th Floor, New York, NY 10017, USA.

(3) Rock Art Net, http://www.rupestre.net/, Società Cooperativa Archeologica Le Orme dell'Uomo, p.za Donatori di Sangue 1, 25040, Cerveno (BS), Italy.

(4) ArchNet, http://archnet.uconn.edu/, University of Connecticut, USA.

(5) ARGE, Archaeological Resource Guide for Europe, http://odur.let.rug.nl/~arge/, University of Groningen, Netherlands.

(6) CBA, Council for British Archaeology, http://www.britarch.ac.uk/info/uklinks.html, Bowes Morrell House, 111 Walmgate, York YO1 9WA, England.

(7) BUBL Information Service, http://bubl.ac.uk/link/hum.html, Andersonian Library, Strathclyde University, 101 St James Road, Glasgow G4 0NS, Scotland

(8) Geocities, http://www.geocities.com/, Yahoo! Inc., 3420 Central Expressway, 2nd Floor, Santa Clara, CA 95051, USA.

(9) Terràvista, http://www.terravista.pt/, Associação Terràvista, Centro Empresarial Torres de Lisboa, R. Tomás da Fonseca, Torre E - 9º Piso, 1649-032, Lisbon, Portugal

(10) Altavista, http://www.altavista.com/, Altavista, Inc., USA.

(11) HotBot, http://www.hotbot.com/, Wired Digital, Inc. 660 Third Street, Fourth Floor, San Francisco, California 94107, USA.

(12) IRAC '98, International Rock Art Congress, http://www.utad.geira.pt/irac/, Home\Rock Art in Portugal\Rock-Art in Portugal, UTAD, Portugal, 1998

References

Leusen, Martijn van; Champion, Sara; Lizee, Jonathan; Plunkett, Thomas. Toward a European Archaeological Heritage Web, Interfacing the Past: Computer applications and quantitative methods in archaeology CAA 95, edited by Hans Kamermans and Kelly Fennema. Analecta Praehistorica Leidensia Number 28, Faculty of Archaeology, University of Leiden, P.O. Box 9515, 2300 RA Leiden, The Netherlands, 1996.

McGovern, Gerry, Information Nobodies, New Thinking E-Mail Newsletter, http://www.nua.ie/newthinking/archives/newthinking331/index.html, NUA, Dublin, Ireland, 1999.

Miller, Paul, An application of Dublin Core from the Archaeology Data Service, Draft, University Computing Service, University of Newcastle, UK, 1996.

Morgado, Leonel; Reis, Arsénio; Abreu, Mila; Bicho, Joël; Santos, Arlindo; Guedes, Mário; Barroso, João; Melo-Pinto, Pedro; Lobo, Helena; Proença, Alberto; Bulas-Cruz, José. A web site engine for the development of heritage-related sites, New Techniques for Old Time - CAA 98, Computer Applications in Archaeology, Proceedings of the 26th Conference in Barcelona, BAR International Series 757, Archaeopress, PO Box 920, Oxford, OX2 7YH, England, 1999.

Weibel, Stuart; Godby, Jean; Miller, Eric; Daniel, Ron. OCLC/NCSA Metadata Workshop Report, OCLC, Online Computer Library Center, Inc. 6565 Frantz Rd. Dublin, Ohio 43017-3395 USA, 1995.