ELISAD European Gateway on Alcohol and other Drugs / Final Research and Activity Report December 2003
2.2. Data collection and storage
The technical realisation of the Elisad gateway is carried out by 3 project experts: Susanna Prepeliczay from ARCHIDO, Germany (project coordination, database administration), Anne Singer from ELISAD, France (promotion and user consulting), and Marianne van der Heijden from Bureau Andromeda, The Netherlands (website design). Data collection for the gateway catalogue is performed by 7 libraries.
2.2.1. Distribution of data collection to project partners
This project is realised in cooperation with 7 participant libraries that are responsible for geographic and thematic coverage. Data collection on AOD websites is distributed among these participants according to their language skills and the closeness of their geographical and contextual relationship. All of these contributing libraries are members of the Elisad network. Each library collaborates in a networking process with AOD libraries / Elisad members and other institutions from its neighbouring countries. Voluntary contributions have been provided by Elisad members from Spain, Greece, and Hungary.
Project participants and geographical distribution are listed as follows:
participant / library | country coverage
ARCHIDO, Germany | Austria, Germany, Switzerland (German-speaking)
CAN, Sweden | Sweden, Norway
Drugscope, United Kingdom | United Kingdom
Gruppo Abele, Italy | Italy, Switzerland (Italian-speaking)
PNSD Plan Nacional Sobre Drogas, Spain | Spain
SZU Statni Zavodni Ustav, Czech Rep. in cooperation with BISDRO Institute for Drug Research, Germany | Bulgaria
Toxibase, France | France, Belgium,
Trimbos Instituut, | Netherlands
2.2.2. Identification of websites
In the first phase of the project, each participant performed an exploratory search and compiled a list of websites, which served as the basic estimate of coverage for the gateway websites catalogue. These are mainly institutional websites with a focus of activity in the areas of public administration (government), policy (GOV/NGO), treatment, prevention, education, research and economics.
Strategies of searching and website identification have been reported at the Lisbon meeting (see Annex 3: Lisbon minutes), and are documented in detail in the short reports from the participant libraries.
Several techniques were employed by the participants to identify relevant web-based information. The main strategies are summarised below.
Expert knowledge: From their professional work, each participant already knew a basic number of websites in the field which had proved to be reliable information sources.
Use of search engines: Many participants used search engines and metasearch engines to find more alcohol and drug related websites. This included directories and gateways. The search strings used related to working fields in drugs and addiction, and to terms for specific substances.
Snowballing: Review of the weblinks listed or recommended on known key AOD websites, followed by a review of the linked websites and an assessment of their weblinks.
Networking: The above methods were complemented by communication on the subject with colleagues from the participants' own and neighbouring countries, who were themselves experts in the alcohol and drugs field and recommended valuable websites for inclusion in the gateway.
One objective of the project is to enhance networking between institutions across Europe. The networking process was carried out by email, and supported by personal networking of Elisad members at conferences and meetings. Contacts were established and maintained by participants with the following institutes:
ARCHIDO, Germany.
BzgA, Germany Reitox Focal Point (www.bzga.de) Marion David-Spickermann
Sucht-HH, Germany: (www.sucht-hh.de) Gabi Dobusch
SFA-ISPA Drug Library, Switzerland (www.sfa-ispa.ch) Carla Rouge
Anton Proksch Institute Library / Ludwig Boltzmann Institut for Addiction Research, Austria (www.api.or.at) Sabine de Bruyere, Irmgard Eisenbach-Stangl
ÖBIG, Austria Focal Point to the EMCDDA (www.oebig.at) Sabine Haas
CAN, Sweden.
Stakes, Finland (www.stakes.fi), Outi Merilainen
SIRUS, Norway (www.sirus.no), Jorunn Moen
Drugscope, United Kingdom.
HRB Health Research Board Ireland, Irish Focal Point (www.hrb.ie)
EMCDDA, EU Drugs Agency (www.emcdda.eu.int) Adelaide Duarte, Maria C. Cristobal
Gruppo Abele, Italy.
UMHRI Greek Focal Point, Greece (www.hol.gr/umhri) Penelopi Vasiou
SZU, Czech Republic / BISDRO, Germany.
Ministries of Health / Prado Library, Budapest / Hungary (www.prado.hu), ELISAD member
Secretariat of the National Drug Commission, Prague / Czech Republic
Ministry of Health, NFP Sofia / Bulgaria
Estonian Drug Monitoring Centre, Tallinn / Estonia
State Centre for Drug Abuse Prevention and Treatment, Riga / Latvia
Drug Monitoring Centre Focal Point, Vilnius / Lithuania
National Bureau for Drug Prevention, Warsaw / Poland
Institute of Health Services Management, NFP Bucharest / Romania
Ministries of Health, National Focal Point Bratislava / Slovakia
Institute of Public Health, NFP Ljubljana / Slovenia
Zavod za Bolesti Zavisnosti, Belgrade / Serbia
TOXIBASE, France.
TOXIBASE used an existing directory of weblinks as the basis of its research. Moreover, the French Toxibase network includes 5 libraries across France.
SFA-ISPA Drug Library, Switzerland (www.sfa-ispa.ch) Ms. Carla Rouge
IPDT Drug Library, Portugal (www.ipdt.pt), Paula Braga
Prospective Jeunesse, Belgium (www.prospective-jeunesse.be)
CEPS Idea prevencion, Spain (www.ideaprevencion.com) Teresa Salvador Livina
PNSD, Spain / Spanish Focal Point (www.mir.es/pnd) Jose delVal / Mercedes Alonso
OFDT Observatoire Français des Drogues et des Toxicomanies (www.ofdt.fr), Anne de l'Eprivier
TRIMBOS, The Netherlands.
TNO Voeding Alcohol Documentation Centre (www.voeding.tno.nl), Marielle Zeeman
VAD Drug Library, Belgium (www.vad.be), Marc Wauters
Each AOD website identified in the exploration process was visited and reviewed. Websites that matched the scope and quality criteria were selected for inclusion in the gateway catalogue. Other websites were excluded for the following reasons: lack of substantial information, a high degree of generality, out-of-date content, no identifiable author, site largely under construction, frequent runtime errors and related accessibility problems. For the interim result of this exploratory research performed by all data collectors see Annex 1, list of selected websites, as of September 2002.
Given the dynamic structure of the world wide web, the identification of AOD websites is a continuous process which went on for the complete duration of the project. For the overall results and growth rates in the AOD website landscape, see 3.1.1.
2.2.3. Instruments and technology for data collection
One objective of the project is to generate and systematically collect metadata about the content and general features of websites. These are produced manually and intellectually by the participants in charge of data collection, based on the information available on the websites. According to the DESIRE gateway definition (Koch 2000, see 1.3.1.), metadata range from short annotations to reviews, and include basic bibliographic features. It is anticipated that extensive descriptive data, including suitable indexing based on standardised terms, will facilitate effective ordering and searching, and thus ensure good retrieval of specific items from the gateway catalogue. This applies even more given the complexity and multidimensional character of websites, and the features of web-based AOD information (see 1.1.2. and 1.2.1.).
2.2.3.1. Online description form
To facilitate data input for the gateway catalogue, a tool was created, tested and revised several times to ensure its suitability for the description of websites. The result is an electronic questionnaire that is accessible online to all project participants. A print version of the online form is attached to the research report as Annex 4. The questionnaire features a combination of qualitative (freetext fields) and quantitative (checkboxes) answering options. The fields of the online form are organised by topic, and cover both formal and content-related characteristics.
To describe the basic formal data, several traditional bibliographic fields are part of the questionnaire. These include fields for title, publisher in the original language and in English, country, language(s), free keywords, a freetext description of the website producer and a freetext summary of the website content. The description fields are accompanied by examples to ensure a homogeneous format and correct typography.
To cope with the various potential thematic areas related to AOD, the questionnaire is structured in descriptive thematic sections according to the potential subjects of the website. These subject sections are at the same time intended to serve as main headings for browsing. The thematic sections correspond to the gateway scope policy (see 2.1.1.) and to the disciplines of professional activity in the AOD field (see 1.2.1.), and include the following items:
psychoactive substances, including generic, specific and related information
substance use & addictive behaviour, including patterns of consumption and potentially compulsive behaviours without the use of AOD
consequences & effects of substance use, including drug effects, health and social consequences
prevention & education, including concepts, campaigns and educative activities for the general public as well as professional audiences
treatment & services, including information on various specific interventions, therapy approaches and public services
policy, including information on governmental strategies, law and measures as well as alternative political activity from persons and initiatives
economics & trafficking, including topics related to legal and illegal distribution and trade, as well as public health economics and funding
history & culture, including research and experience based information on drug use related to various cultural contexts and contemporary subcultures
research, including scientific disciplines and methodological approaches
populations & audiences, including research populations and target groups of interventions
settings, including various environments of drug use, prevention and research.
To cope with the large variety of types of information potentially provided by websites, the questionnaire contains descriptive sections focusing on types of resources available on the website described. These 3 sections of the online form are structured as follows:
searchable databases, including all kinds of online accessible data retrieval systems provided
types of publications, including several kinds of specific materials, fulltext publications, and compilations
interactive tools, including various kinds of executable services based on the user's active involvement.
The thematic sections enumerated above include freetext fields for entering specific data according to the website content. To facilitate structured input of qualitative data, the freetext fields are presented in relation to subjects and types of resources. Besides their exploratory purpose, entries generated from the freetext fields are intended to support effective searching, ordering and retrieval. The online form includes a total of 34 thematic and bibliographic freetext fields.
The quantitative fields of the questionnaire allow predefined values to be chosen by checkbox, and serve as the internal keyword index of the gateway catalogue. In each thematic section, a selection of terms suitable to describe specific activities was compiled. In order to adopt standardised terminology, several thesauri and glossaries related to alcohol and drugs were consulted to arrive at a basic keyword set, including the ISDD Thesaurus (13), the TOXIBASE Thesaurus (14), the NIAAA Alcohol and other Drugs Thesaurus (15), the WHO Lexicon of Alcohol and Drug terms (16) and the UNDCP Glossary of demand reduction terms (17).
Repeated discussions among participants accompanied the choice of terms for indexing. Another important factor in the selection of English indexing terms was their general comprehensibility, in order to reduce problems of misunderstanding and varying interpretation, given the multicultural and multilingual European scope.
The keyword index includes a total of 361 terms. Depending on its usefulness and suitability in practice, the keyword index will be subject to revision and updating during the research process.
In addition to descriptive metadata, the questionnaire includes fields for internal data related to the evaluation process, in order to achieve effective administration and editing of data. These fields include the name and email address of the evaluator, the date of the website review / evaluation, comments on accessibility and user-friendliness, and options for the evaluator's recommendation and rating of the website on a scale from 1 to 5 according to the quality criteria (see 2.1.2.). Another freetext field serves to collect comments on the online form, which will be used to revise and improve the tool at a later stage of research.
The online form serves to transfer data into the server database. This function is implemented by PHP scripts that connect the fields of the form to the fields of the database tables (see 2.2.4.2.).
To give participants access to existing records in the catalogue and enable them to change and correct data, another online tool was programmed. Within this editing tool, named selection.php, any record can be chosen and updated online. All data can be accessed in the same format as in the online form. Changes are saved in the database by re-submitting the editing form using a "send" button.
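The transfer from form fields to database fields, and the later update of a record, can be illustrated with a minimal sketch. It is written in Python with the standard-library sqlite3 module as a stand-in for the project's PHP scripts and MySQL database, whose code is not reproduced in this report. Only the table name "drogenseiten" comes from the report (see 2.2.4.1.); the column names, the mapping FORM_TO_COLUMN and the functions save_record and update_record are hypothetical.

```python
import sqlite3

# Hypothetical mapping of form field names to database columns.
FORM_TO_COLUMN = {
    "title": "title",
    "country": "country",
    "summary": "content_summary",
}


def _known_fields(form_data):
    """Return the submitted fields that have a database column."""
    fields = [name for name in FORM_TO_COLUMN if name in form_data]
    if not fields:
        raise ValueError("form contains no known fields")
    return fields


def save_record(conn, form_data):
    """Insert a newly submitted form and return the id of the new record."""
    fields = _known_fields(form_data)
    columns = ", ".join(FORM_TO_COLUMN[name] for name in fields)
    marks = ", ".join("?" for _ in fields)
    cur = conn.execute(
        f"INSERT INTO drogenseiten ({columns}) VALUES ({marks})",
        [form_data[name] for name in fields],
    )
    conn.commit()
    return cur.lastrowid


def update_record(conn, record_id, form_data):
    """Re-submit an edited form for an existing record."""
    fields = _known_fields(form_data)
    assignments = ", ".join(f"{FORM_TO_COLUMN[name]} = ?" for name in fields)
    conn.execute(
        f"UPDATE drogenseiten SET {assignments} WHERE id = ?",
        [form_data[name] for name in fields] + [record_id],
    )
    conn.commit()


# In-memory demo database with an assumed, much reduced column set.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE drogenseiten ("
    "id INTEGER PRIMARY KEY, title TEXT, country TEXT, content_summary TEXT)"
)
record_id = save_record(conn, {"title": "Example site", "country": "Germany"})
update_record(conn, record_id, {"summary": "Corrected summary"})
```

Column names are taken only from the fixed mapping and values are passed as query parameters, so submitted text never becomes part of the SQL statement itself.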
2.2.3.2. Offline form facility and input manual
At an early stage of the project (February 2002), experimental data input was performed within the project group. Each participant added 2-3 records to the draft version of the database. These sets of data also served to demonstrate the functionality of the database demo version at the Lisbon project group meeting (see Annex 3, Lisbon minutes). In this practical test, several difficulties and misunderstandings were discovered that reduced the consistency of the data records collected. To achieve a higher level of data consistency, an input manual was created to reduce differences in the interpretation and use of the fields. A print-out of this Word tool, named "Winputform.doc", is annexed to this research report (see Annex 5).
The offline tool is a Word form which contains two parts. On the left side are the same freetext and checkbox fields as in the online form. They appear in the same order and can be filled in or checked in the Word file. On the right side, next to the fields, a help manual gives guidelines for filling in the fields, relating both to content and to correct typography. Next to each freetext field, advice is given on how to specify information not covered by the checkboxes, with examples of possible content. For example, when a fulltext report is recorded by the corresponding checkbox in the types of resources section, the name of the report can be entered in the freetext field of the same section to enable retrieval of this specific publication. Next to each checkbox section in the Word form, advice is given on how to use the checkboxes correctly, and how to combine them with checkbox options in other thematic sections. The manual reminds the user that proper indexing of website contents can combine several checkbox sections, so that the information described can be retrieved in its content-related context, e.g. treatment + women + cocaine when a website describes a specific treatment programme.
Moreover, templates are provided to ensure correct use of the two large freetext fields, producer description and website content summary. Another function of the Word offline tool is to allow participants to keep local copies of the data records submitted to the catalogue.
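The benefit of combining checkbox terms from several thematic sections, as in the treatment + women + cocaine example, can be sketched as a simple set comparison. This is an illustration in Python only; the record names and the function matches_all are hypothetical and do not reflect the project's actual PHP code.

```python
def matches_all(record_keywords, wanted):
    """True if the record carries every one of the wanted index terms."""
    return set(wanted) <= set(record_keywords)


# Hypothetical catalogue records with their checked index terms.
records = {
    "Site A": {"treatment", "women", "cocaine"},
    "Site B": {"treatment", "cocaine"},
    "Site C": {"prevention", "women"},
}

wanted = ["treatment", "women", "cocaine"]
hits = sorted(name for name, terms in records.items() if matches_all(terms, wanted))
```

Only a record indexed with all three terms is returned, which is why consistent indexing across sections matters for retrieval.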
2.2.3.3. Principles of data editing
In order to achieve a high level of data quality and consistency, it proved necessary to implement a strategy for data editing. The general rules and templates provided by the input manual (2.2.3.2., cf Annex 5) were supported by specific advice and review.
In order to facilitate correct data input, guidelines for the typography used in recording were formulated in a style guidance document and applied by the group during data collection (see Annex 16). All participants were involved in editorial matters in order to agree on shared principles. During the Bremen meeting, the group decided to insert HTML tags into the website description fields and the publisher description field so that names of publishing organisations are displayed in bold characters, while titles of specific publications and programmes are set in italic characters for better recognition and readability (cf Annex 15).
Accompanying the distributed data input by participants, a proofreading process was performed on 2 levels. To compensate for the different native languages of the participants and their varying capacity for writing in English, one level of proofreading focused on English language correctness. Language proofreading of the catalogue content was performed by Stephan Schulte-Naehring (Ex-DrugScope) as a commitment to the Elisad project. The language revision focused on correct English spelling, grammar and expression in all freetext input fields, with special consideration of those fields visible in the gateway search results and details results screens (producer description, website content summary and major keywords).
The second level was proofreading for typographical correctness. Proofreading of typography and style was performed by Anne Singer, Elisad. It focused on standardised presentation of the catalogue records, including the presentation of names in italic or bold letters using the corresponding HTML tags within the freetext fields. Special emphasis was put on suitable formats for the website title and original publisher names, including verification that special characters from various languages (e.g. Spanish, Czech, French, Swedish) appear correctly, which was made possible by implementing the corresponding fonts in the data processing mechanism.
At regular intervals, participants received editorial feedback, e.g. on common spelling errors or the correct use of abbreviations. These tasks were especially concerned with the freetext fields visible in the catalogue output results page, in order to achieve a standardised appearance of records and thus a high degree of screen readability for the user.
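The report describes this typographical proofreading as manual work. As an illustration only, a check of the bold and italic tags in a description field could be sketched as follows. The sketch is in Python and assumes that the tags in question are <b> and <i>, which the report does not specify; the names TagChecker and check_markup are hypothetical.

```python
from html.parser import HTMLParser

# Assumption: bold and italic are marked with <b> and <i>.
ALLOWED_TAGS = {"b", "i"}


class TagChecker(HTMLParser):
    """Collects unexpected, unmatched and unclosed tags in a text field."""

    def __init__(self):
        super().__init__()
        self.open_tags = []
        self.problems = []

    def handle_starttag(self, tag, attrs):
        if tag in ALLOWED_TAGS:
            self.open_tags.append(tag)
        else:
            self.problems.append(f"unexpected tag <{tag}>")

    def handle_endtag(self, tag):
        if self.open_tags and self.open_tags[-1] == tag:
            self.open_tags.pop()
        else:
            self.problems.append(f"unmatched </{tag}>")


def check_markup(text):
    """Return a list of markup problems; an empty list means the text is clean."""
    checker = TagChecker()
    checker.feed(text)
    checker.close()
    return checker.problems + [f"unclosed <{tag}>" for tag in checker.open_tags]


clean = check_markup("<b>Example Institute</b> publishes the <i>Annual Report</i>.")
broken = check_markup("<b>Example Institute publishes the <i>Annual Report</i>.")
```

A check of this kind would only flag candidates for the human proofreader; it does not replace the review of names, titles and special characters described above.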
2.2.4. Data storage: conception of the server database
Server space was reserved on the university server at the domain elisad.uni-bremen.de. The university server is a UNIX system (AIX by IBM); the database was created with MySQL. Programming of the database and of the functions related to input, output and administration was performed by Bernd Titz, diploma candidate at the computer science (Informatik) department of the University of Bremen.
2.2.4.1. Database tables structure
To store and administer the website-related metadata, a MySQL server database was created. This relational database system offers multiple possibilities for administration and retrieval in the ELISAD websites catalogue. To create a local version of the database, the Linux distribution SuSE 7.3 was used.
The data collected are organised in records, each of which includes all entries from one website description. Each record represents a closed entity of information which is ordered according to the field structure of the database. The Elisad database contains 30 tables. All tables and field names are listed in detail in Annex 6 of this report.
Options tables: The checkbox fields in the online form correspond to predefined entries in each record that are chosen from a given set of values. Technically, these optional choice entries are stored in 28 options tables containing the invariant data. These invariant data serve as the keyword index and include 316 terms. The options tables correspond to the thematic sections of the form, e.g. the treatment related choice options are stored in a table named "treatment_options". Each options table has an individual id value number, and each optional entry in the table has an id value number to enable programming functions.
Freetext fields: The purpose of the freetext fields is to hold variable data entries. This includes bibliographical data and specific subject related data. Technically, the freetext fields are retrieved in a different way. The 34 freetext fields are stored in a table called "drogenseiten". Each freetext field of the form corresponds to an identical column in the table. Each freetext column in the database table has an individual id value number that allows programming functions.
Administration table: All these tables are managed by one overall table named "zuordnung", which contains the field-id codes for each table and fields included. In the administration table, the size and type of fields, and their relation to the option tables are specified. The administration table serves as the overall structure for all elements of the database.
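The relationship between the three kinds of tables can be pictured with a reduced sketch. It uses Python with the standard-library sqlite3 module as a stand-in for MySQL. The table names "treatment_options", "drogenseiten" and "zuordnung" are taken from the report; all column names, the sample terms and the function option_terms are assumptions for illustration, since the actual field list is given only in Annex 6.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- One of the 28 options tables: invariant checkbox terms.
    CREATE TABLE treatment_options (id INTEGER PRIMARY KEY, term TEXT NOT NULL);

    -- The freetext table: one row per website record (columns assumed).
    CREATE TABLE drogenseiten (id INTEGER PRIMARY KEY, title TEXT, treatment_freetext TEXT);

    -- The administration table: describes every field and its options table.
    CREATE TABLE zuordnung (
        field_id INTEGER PRIMARY KEY,
        table_name TEXT NOT NULL,
        field_name TEXT NOT NULL,
        field_type TEXT NOT NULL,
        field_size INTEGER,
        options_table TEXT
    );
    """
)
# Hypothetical sample terms, not taken from the real keyword index.
conn.executemany(
    "INSERT INTO treatment_options (term) VALUES (?)",
    [("detoxification",), ("counselling",)],
)
conn.executemany(
    "INSERT INTO zuordnung (table_name, field_name, field_type, field_size, options_table)"
    " VALUES (?, ?, ?, ?, ?)",
    [
        ("drogenseiten", "title", "text", 255, None),
        ("drogenseiten", "treatment", "checkbox", None, "treatment_options"),
    ],
)


def option_terms(conn, field_name):
    """Look up the options table of a field and return its terms in id order."""
    row = conn.execute(
        "SELECT options_table FROM zuordnung WHERE field_name = ?", (field_name,)
    ).fetchone()
    if row is None or row[0] is None:
        return []
    # The table name comes from the administration table, not from user input.
    return [term for (term,) in conn.execute(f'SELECT term FROM "{row[0]}" ORDER BY id')]


treatment_terms = option_terms(conn, "treatment")
title_terms = option_terms(conn, "title")
```

Keeping the field descriptions in one administration table means that scripts can discover fields and their option sets at run time instead of hard-coding them.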
The database structure was successively amended during the research process, e.g. to include additional countries and languages. Moreover, the relational system provides multiple options for future extension and adaptation, which will be especially applicable to the keyword set, in line with the content of the thematic freetext fields of the database records.
2.2.4.2. Programming database functions
To ensure compatibility and interoperability with other subject gateways, the use of the same technology and tools is a precondition. As a first step, the usability of the ROADS tool developed in the DESIRE project (see 1.3.1.1.) was investigated. It was downloaded from www.roads.lut.ac.uk and installed on the university server. Several attempts were made to run it on the UNIX system AIX. These efforts remained unsuccessful even though all permissions required to use the database were available. The reason turned out to be that the project is not designed to run an independent physical server of its own. Since the virtual server dedicated to the ELISAD project site and database is a physical partition of the total university server space, no executive administrator rights were available. This restriction follows from the security regulations of the ZFN (Centre for Networks at the University of Bremen), which operates the server. The university server is often attacked, and its stability and reliability are a precondition for the work of the scientific community that depends on it.
Running a server of its own is not possible within the ELISAD gateway project because of its limited funding period, since it would require continuously employing staff in charge of server administration.
Moreover, the use of ROADS would not necessarily reduce the time required, but might increase it: the ROADS software package consists of 380 files and a manual of 327 pages. Installing it and adapting it to the database would require working through the code and changing it where needed, a task probably no less time-consuming than developing original code.
In order to have unlimited possibilities in the design of the database and the corresponding tools (frontends, online editing etc.), it was decided to write the necessary scripts in PHP. The functions programmed relate to input and output of the database.
Each input field of the questionnaire (online form, see 2.2.3.1. and Annex 4) corresponds to a field in the ELISAD database described above. The database and the form are connected via the scripting language PHP (originally "Personal Home Page", now "PHP: Hypertext Preprocessor"), which produces the HTML code that is displayed as the input form in a browser. Each change in the database structure, e.g. when fields are renamed, added or deleted, is reflected in the online form.
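How a form can follow the database structure automatically may be sketched as follows. The example is in Python rather than the project's PHP, and the field list FIELDS stands in for what a script would read from the database; its entries, the option terms and the function render_form are hypothetical.

```python
from html import escape

# Hypothetical field definitions, as a script might read them from the database.
FIELDS = [
    {"name": "title", "kind": "text", "label": "Title"},
    {
        "name": "treatment",
        "kind": "checkbox",
        "label": "Treatment & services",
        "options": ["counselling", "detoxification"],
    },
]


def render_form(fields):
    """Build the HTML input form from the field definitions."""
    parts = ['<form method="post">']
    for field in fields:
        name = escape(field["name"], quote=True)
        parts.append(f"<p>{escape(field['label'])}</p>")
        if field["kind"] == "checkbox":
            for option in field["options"]:
                value = escape(option, quote=True)
                parts.append(
                    f'<label><input type="checkbox" name="{name}[]" value="{value}">'
                    f" {escape(option)}</label>"
                )
        else:
            parts.append(f'<input type="text" name="{name}">')
    parts.append('<input type="submit" value="send">')
    parts.append("</form>")
    return "\n".join(parts)


html_form = render_form(FIELDS)

# Adding a field to the definitions changes the form without touching render_form.
extended_form = render_form(FIELDS + [{"name": "country", "kind": "text", "label": "Country"}])
```

Because the form is generated from the field definitions, renaming, adding or deleting a field requires no change to the form code itself, which is the behaviour described above.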
Input functions: Via PHP scripts, data can be transferred from the input form fields into the corresponding database fields. Moreover, records can be edited in a selection.php page, and changes are written directly to the database.
Since October 2002, these input tools have been password-protected in order to prevent public access and to reduce risks to the database content.
Output functions: PHP is the basis of the search scripts that formulate the MySQL queries underlying the various output facilities. These correspond to the browsing and searching functions of the gateway website (cf. 3.2.). To date, PHP scripts are available for browsing the database content by each general subject and by each keyword. An overall script to manage these functions has been worked on. PHP scripts have also been programmed to browse the database content by 32 countries, and to browse the in-progress files.
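A browsing function of this kind essentially turns the user's choice into a database query. The following sketch shows the idea in Python with sqlite3 as a stand-in for PHP and MySQL; the column names, the sample records and the functions browse_by_country and country_index are assumptions, not the project's actual scripts.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE drogenseiten (id INTEGER PRIMARY KEY, title TEXT, country TEXT)")
conn.executemany(
    "INSERT INTO drogenseiten (title, country) VALUES (?, ?)",
    [("Site B", "Sweden"), ("Site A", "Sweden"), ("Site C", "Italy")],
)


def browse_by_country(conn, country):
    """Return the titles of all records for one country, alphabetically."""
    rows = conn.execute(
        "SELECT title FROM drogenseiten WHERE country = ? ORDER BY title", (country,)
    )
    return [title for (title,) in rows]


def country_index(conn):
    """Return (country, number of records) pairs for a browsing overview."""
    return conn.execute(
        "SELECT country, COUNT(*) FROM drogenseiten GROUP BY country ORDER BY country"
    ).fetchall()


swedish_sites = browse_by_country(conn, "Sweden")
overview = country_index(conn)
```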
Search and filter functions: Moreover, PHP scripts are available for free search and for a field-oriented search mode, both of which have been implemented in the gateway user interface. Retrieval is complemented by the data in the freetext fields which are not displayed but enable the system to produce search results for synonymous terms (e.g. marijuana for cannabis) and for titles of specific programmes and publications.
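The effect of searching fields that are not displayed can be shown with a small sketch, again in Python with sqlite3 as a stand-in for the PHP search scripts. The column hidden_keywords, the sample records and the function free_search are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE drogenseiten ("
    "id INTEGER PRIMARY KEY, title TEXT, content_summary TEXT, hidden_keywords TEXT)"
)
conn.executemany(
    "INSERT INTO drogenseiten (title, content_summary, hidden_keywords) VALUES (?, ?, ?)",
    [
        ("Cannabis information service", "Facts about cannabis use.", "marijuana hashish"),
        ("Alcohol policy centre", "Research on alcohol policy.", "drinking"),
    ],
)


def free_search(conn, term):
    """Search displayed and non-displayed fields, but return only the titles."""
    pattern = f"%{term}%"
    rows = conn.execute(
        "SELECT title FROM drogenseiten"
        " WHERE title LIKE ? OR content_summary LIKE ? OR hidden_keywords LIKE ?"
        " ORDER BY title",
        (pattern, pattern, pattern),
    )
    return [title for (title,) in rows]


synonym_hits = free_search(conn, "marijuana")
no_hits = free_search(conn, "heroin")
```

The record is found through the synonym stored in the non-displayed field, although the word "marijuana" appears neither in its title nor in its summary.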
Complementing the browsing and search options, several PHP filters have been implemented in order to refine the results provided by browsing and searching according to the users' needs.
The interrelation of the connected scripts is presented in a graphical sitemap which is available in Annex 19 to this report.