GBIF - Information and Communication Technologies

DiGIR Provider Package

USERS' GUIDE

(Windows 2000, XP)


 Giorgos Ksouris,
 Miruna Badescu


  16/03/2004


 Version: 1.6


Table of Contents

1. Introduction

2. Requirements

3. Distribution Package

4. Pre-installation Checks

5. Required Information

5.1. Domain Name for Server

5.2. About Ports and Firewalls

6. Installation

6.1. Configuration of Provider PHP Parameters

7. Start/Stop the DiGIR Provider

8. Provider Metadata

9. Resources Management

10. UDDI Registry

10.1 What is UDDI?

10.2 GBIF UDDI Registry

A. Databases

A.1 MySQL

A.2 PostgreSQL

A.3 MS SQL SERVER

A.4 MS Access

A.5 Oracle

B. Resources

1. Introduction

The Global Biodiversity Information Facility (GBIF, http://www.gbif.org/) is an international organisation and a megascience project that aims to make biodiversity data freely and universally available on the Internet. To enable database owners such as museums and research organisations to integrate their databases into this data sharing network, GBIF provides this software package for data providers. It is based on the efforts of the open source community, in particular the DiGIR project (http://digir.sourceforge.net/).

This document describes how to use the package, which automatically installs a DiGIR Provider (PHP code), on a server running Windows 2000 or Windows XP. (An identical package is available for Linux.) It is a turn-key, self-contained package that installs not only the DiGIR tools but also the PHP and Perl languages and the Apache web server. However, no database is included: you are expected to connect to an existing database containing specimen and/or observation data.

You should have some basic knowledge of the Windows operating system in order to be able to perform the installation and its pre-installation steps.

This guide also explains the additional tools included in the package, which handle log rotation, starting and stopping the provider, the definition of its metadata and of new resources, and the automatic registration in the GBIF registry.

Finally, Appendix A (Databases) contains a few hints on how databases should be configured in order to be used by the DiGIR Provider.

2. Requirements

3. Distribution Package

The distribution package is available on CD-ROM medium, and it contains the following files:

4. Pre-installation Checks

The following checks must be done before proceeding with the installation:

a. Memory size should be at least 256 Mbytes. The General tab of the System Properties dialogue box (right-click the My Computer shortcut on your desktop to open it) has a section called Computer, which shows the amount of RAM.

b. Check that there is enough free space (150 Mbytes) on the drive where you plan to install the package. There are many ways to find this out; for instance, open Windows Explorer, click the My Computer item and inspect the size and available space of all the drives of your computer.

c. If you have another installation of PHP running on your system, ensure that no file called php.ini is placed in your %Windows% directory. If there is one, read the install.txt file shipped with your previous installation of PHP and choose another location for it, as explained in that file.

Should you leave php.ini in the Windows directory, it will be treated as the ini file for all PHP installations and the DiGIR Provider will not function properly.

Also check the system PATH variable, since it might contain a predefined path to the other installation's directories.
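
Checks (b) and (c) can also be scripted. The following is only a minimal sketch in Python, which is not part of the package; the function names are hypothetical, and the memory check (a) is left out because the Python standard library has no portable call for it.

```python
import os
import shutil

MIN_FREE_BYTES = 150 * 1024 * 1024  # 150 Mbytes, as required by check (b)


def enough_free_space(path, required=MIN_FREE_BYTES):
    """Return True if the drive holding `path` has at least `required` bytes free."""
    return shutil.disk_usage(path).free >= required


def stray_php_ini(windows_dir):
    """Return the path of php.ini inside the Windows directory, or None if absent."""
    candidate = os.path.join(windows_dir, "php.ini")
    return candidate if os.path.isfile(candidate) else None


def php_path_entries(path_value, sep=os.pathsep):
    """Return the PATH entries that mention 'php' (case-insensitive)."""
    return [entry for entry in path_value.split(sep) if "php" in entry.lower()]


if __name__ == "__main__":
    target = os.getcwd()  # assumption: replace with the intended installation drive
    windows_dir = os.environ.get("WINDIR", r"C:\Windows")
    print("Enough free space:", enough_free_space(target))
    print("php.ini in Windows directory:", stray_php_ini(windows_dir))
    print("PHP-related PATH entries:", php_path_entries(os.environ.get("PATH", "")))
```

A PATH entry reported by the last function is not necessarily a problem; it only points at a directory worth inspecting by hand.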

5. Required Information

For the installation, the following information will be requested:

6. Installation

When you have collected all the above information, you can start the installation procedure. Unzip WinPackageDiGIR.zip into a temporary directory on your computer or on a shared path of your Local Area Network. Use WinZip 8.0 or higher to unpack it.

The unzip operation will produce a list of directories and the file DiGIRProviderInstall.exe which you must execute from that location.

The program will perform the necessary checks and will ask for all the information you collected in the previous section before installing and configuring the DiGIR Provider. Each time, the program offers a (guessed) default option, but do not accept its suggestions blindly - use the information you gathered.

Tips:

a. Your system memory is checked, but only a warning message is issued if resources are low. In that case, you proceed at your own risk.

b. The folder you choose to install into should be empty at installation time. It is advised to let the installation program create it.

c. If you are an administrator, you can choose between running the application as a service or manually. The installation program disables the service option for users with fewer privileges on the local computer.

When the application is run manually, the user is responsible for starting and stopping it (using the corresponding scripts, which have shortcuts on the desktop). This matters in case of a system error, because the application will not restart automatically.

On the other hand, if the application is registered as a Windows Service, the operating system takes care of restarting it, even before a user logs on to that machine. The start/stop scripts are still available in this case, since the user might want to stop or start the application manually at some point.

d. If the installation fails for some reason, the installation program will try to delete all the files copied. You are advised to check that yourself and manually remove the potential remaining files before running the installer again: all the files are copied in the folder you have specified in Step 1 of the set-up process.

No files are copied in other locations, like the system or Windows folders, nothing is written in the Windows Registry during the installation and no Environment variables are added or modified.

e. The installation steps that were performed are listed in a file named install.log. After the installation you can consult this file by clicking the View log button. It can be useful if something went wrong.

f. Step 3 of the installation's information gathering asks you to modify the parameters of the Provider. Read more about them in the section below.

6.1. Configuration of Provider PHP Parameters

DiGIR operational defaults are controlled by defining constants in the file localconfig.php, which override the default values of the source code. You will have to define the following operating parameters:

Parameter: DIGIR_MAX_RUNTIME
Default value (seconds): 120
Description: Sets the maximum runtime of the script in seconds. The default value allows the script to run for two minutes, which should be ample except for very large result sets or poorly designed databases (e.g. no indexing).

Parameter: DIGIR_METADATA_CACHE_LIFE_SECS
Default value (seconds): 86400
Description: Sets the number of seconds that cached metadata will be used before forcing an update. Default is one day.

Parameter: DIGIR_FORMAT_CACHE_LIFE_SECS
Default value (seconds): 86400
Description: Sets the number of seconds that a downloaded record-format definition is cached before downloading it again. Default is one day.

Parameter: DIGIR_RESULTSET_CACHE_LIFE_SECS
Default value (seconds): 3600
Description: Sets the number of seconds that a search response will remain in the cache, during which it will be reused. Default duration is one hour.

Parameter: DIGIR_INVENTORY_CACHE_LIFE_SECS
Default value (seconds): 3600
Description: Sets the number of seconds that an inventory response will remain in the cache. Default duration is one hour.

Parameter: DIGIR_STATUSINTERVAL
Default value (seconds): 600
Description: For each request serviced by the DiGIR Provider, a record of the type of request and the time stamp of the request is kept in an array that is cached to disk. Records older than the current time minus DIGIR_STATUSINTERVAL seconds are deleted from the array. The total number of each type of request is computed from the array and is appended as a diagnostic to each response generated by the DiGIR Provider. If the provider is running under a very high load (hundreds to thousands of requests per hour), you should set this value to less than one hour (3600 seconds).
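
The bookkeeping described for DIGIR_STATUSINTERVAL can be illustrated with a short sketch. It is written in Python purely for illustration and is not the provider's PHP code; the function names and the in-memory list (standing in for the array cached to disk) are assumptions.

```python
import time
from collections import Counter


def record_request(log, request_type, interval, now=None):
    """Append a request record, then delete records older than now - interval."""
    now = time.time() if now is None else now
    log.append((request_type, now))
    cutoff = now - interval
    # Keep only the records whose time stamp is not older than the cutoff.
    log[:] = [(kind, stamp) for kind, stamp in log if stamp >= cutoff]
    return log


def request_totals(log):
    """Count the remaining records per request type (the diagnostic totals)."""
    return Counter(kind for kind, _ in log)


if __name__ == "__main__":
    log = []
    record_request(log, "search", 600, now=0)
    record_request(log, "inventory", 600, now=300)
    record_request(log, "search", 600, now=700)  # the record from time 0 expires here
    print(dict(request_totals(log)))
```

With an interval of 600 seconds, the third call removes the first record, so a smaller interval keeps the array short when the provider is under heavy load.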

Note: You can modify the parameters described above later on by editing the file localconfig.php, which is located in the directory %HOME_DIR%\DiGIRprov\www. Open the file with your favourite text editor (Notepad, Wordpad, ...) and locate the line of the form:

    define('PARAMETER_NAME', value);

for example: define('DIGIR_MAX_RUNTIME',120);. Replace the value with the new number, save the file (as text) and close the editor. Do not forget to restart the application (as described in the next section).
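
If you prefer to script this edit, the sketch below shows one possible approach in Python. It is not part of the package: the function name set_define is hypothetical, and it assumes that each parameter is written as a define() call with a plain integer value, as in the example above.

```python
import re


def set_define(php_source, name, value):
    """Return php_source with the integer value of define('name', ...) replaced."""
    pattern = re.compile(
        r"(define\(\s*['\"]" + re.escape(name) + r"['\"]\s*,\s*)\d+(\s*\)\s*;)"
    )
    new_source, count = pattern.subn(
        lambda m: f"{m.group(1)}{int(value)}{m.group(2)}", php_source
    )
    if count == 0:
        raise KeyError(f"no define() found for {name}")
    return new_source


if __name__ == "__main__":
    sample = "<?php\ndefine('DIGIR_MAX_RUNTIME',120);\n"
    print(set_define(sample, "DIGIR_MAX_RUNTIME", 300))
```

To apply it, read localconfig.php into a string, pass it through set_define and write the result back, keeping a backup copy of the original file. The restart is still required afterwards.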

7. Start/Stop the DiGIR Provider

In order to start or stop the DiGIR provider you can use the shortcuts created by the installation program on your desktop. Click on one of those to execute the corresponding operation. You can also choose to start the DiGIR provider at the end of the installation.

In both cases (manual run or run as a service), start and stop scripts are provided that you can execute directly. The desktop shortcuts correspond to the .bat scripts available in the %HOME_DIR%\bin folder, according to the type of run chosen. The scripts are:

Warnings:

1. Do not close the command window that opens when you start the DiGIR Provider manually, since the Apache server will be shut down with it.

2. Do not start or stop Apache manually: the start/stop scripts also clean up the DiGIR Provider cache and start/stop the Cron application (or, when running as a service, register/unregister it as a service).

8. Provider Metadata

The metadata of the provider describe the DiGIR Provider service and supply information on whom to contact for more information about the service. The server metadata are defined in the file %HOME_DIR%\DiGIRprov\config\providerMeta.xml. This is an XML file, so the normal rules for XML documents apply: element and attribute names are case sensitive, and special characters such as >, <, &, " and ' must be escaped.
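
As an illustration of the escaping rule, the sketch below uses Python's standard xml.sax.saxutils.escape function. The helper name xml_text and the sample text are hypothetical, and the example is independent of the provider software.

```python
from xml.sax.saxutils import escape

# escape() handles &, < and > by default; the two quote characters are added here.
QUOTE_ENTITIES = {'"': "&quot;", "'": "&apos;"}


def xml_text(value):
    """Escape &, <, >, " and ' so the text can be placed in an XML document."""
    return escape(value, QUOTE_ENTITIES)


if __name__ == "__main__":
    print(xml_text('Fish & "Reptiles" <Dept>'))
```

A value such as Fish & "Reptiles" therefore has to be stored in providerMeta.xml as Fish &amp; &quot;Reptiles&quot;.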

To edit the provider metadata, access the URL http://host:port/digir/admin/setup.php using a web browser, where host and port are the domain name and port of your server. When you are asked for a user name and a password, provide the following credentials: user name: admin, password: the password that you defined during the installation.

Because of the caching that the provider is using, you have to restart the application - as defined in the previous section - if you want your changes to take effect immediately.

9. Resources Management

A DiGIR provider resource is an XML file that contains information about the database - where the resource is located - connection, the database schema representation, a mapping of the database columns to the conceptual schema elements and metadata regarding the resource (e.g. name, contacts, related information URL, etc.). The resource definition files are located in the directory HOME_DIR\DiGIRprov\config.

You can define a new resource and modify or delete existing resources by accessing the URL http://host:port/digir/admin/setup.php using a web browser, where host and port are the domain name and port of your server. When you are asked for a user name and a password, provide the following credentials: user name: admin, password: the password that you defined during the installation.

Tip: For an optimal display of the DiGIR provider pages we advise the use of MS Internet Explorer as web browser. Alternatively, if a page (e.g. setup.php) fails to display, change the browser and try again.

Because of the caching that the provider is using, you have to restart the application - as defined in section 7 - if you want your changes to take effect immediately.

10. UDDI Registry

10.1 What is UDDI?

The Universal Description, Discovery and Integration (UDDI) registry is based on existing standards, such as Extensible Markup Language (XML) and Simple Object Access Protocol (SOAP) and provides a method for publishing and finding service descriptions. The UDDI data entities provide support for defining both business and service information. There are four primary data types in a UDDI registry: businessEntity, businessService, bindingTemplate, and tModel.
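
The relationship between the four data types can be sketched as nested records. The Python classes below are a deliberately simplified illustration, not the UDDI schema: the field names and the sample values are hypothetical, and a real bindingTemplate refers to tModels by key rather than containing them.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TModel:
    """Reusable technical description, e.g. of a protocol or an interface."""
    name: str


@dataclass
class BindingTemplate:
    """Technical details needed to reach a service, such as its access point."""
    access_point: str
    tmodels: List[TModel] = field(default_factory=list)


@dataclass
class BusinessService:
    """A service offered by a business entity."""
    name: str
    bindings: List[BindingTemplate] = field(default_factory=list)


@dataclass
class BusinessEntity:
    """The organisation that publishes one or more services."""
    name: str
    services: List[BusinessService] = field(default_factory=list)


if __name__ == "__main__":
    # All values below are made up for the example.
    binding = BindingTemplate("http://provider.example.org/digir/", [TModel("DiGIR")])
    entity = BusinessEntity("Example Museum", [BusinessService("Example provider", [binding])])
    print(entity.services[0].bindings[0].access_point)
```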

By accessing a UDDI registry, one can search for information about web services that are made available by or on behalf of a business.  The benefit of having access to this information is to provide a mechanism that allows others to discover what technical programming interfaces are provided for interacting with a business for such purposes as electronic commerce, etc.

10.2 GBIF UDDI Registry

The GBIF registry (http://registry.gbif.net/) is a secure, platform-independent UDDI registry-service designed for public use within Global Biodiversity Information Facility participant nodes.

Your DiGIR Provider installation will be registered in the GBIF registry after the first execution of the standalone script UDDIregistry.pl (available only in the version of the package that includes Perl). To execute this script, open a command line window and type the command:

$ %HOME_DIR%\perl\bin\perl.exe %HOME_DIR%\bin\UDDIregistry.pl

Alternatively, you can either access the URL http://host:port/digir/admin/setup.php, as defined in section 8, and click on the Registration link or (recommended) use the on-line registration tool and follow the instructions given on that page.

The script will use the values of the elements that you have defined as metadata of the provider (plus some extra information) in order to create a business entity, a business service and a binding template in the UDDI registry. In particular, the following information will be used:

You can find the UDDI registry information related to your installation if you access the URL http://registry.gbif.net/, click on the Find business link and fill in the Business name text field with the name of the host (e.g. GBIF Secretariat, Museum of Vertebrate Zoology, Zoological Museum University of Copenhagen, etc.) that you used when setting the provider metadata.

A. Databases

The Windows DiGIR provider package has been tested with resources whose database server was one of those listed below, was listening on a TCP/IP port (either the database default or a different one) and was installed on the same or on a remote machine.

For MySQL, PostgreSQL and Oracle databases, the Datasource string entered in the Datasource section during the definition of a resource (see section 9) MUST have the format: dbserver.domain.name:portNumber. For MS SQL Server databases, on the other hand, the Datasource string must have the format: dbserver.domain.name,portNumber.
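
The two formats differ only in the separator between server name and port. As a minimal sketch (in Python, with a hypothetical function name and hypothetical database type labels), the rule can be written as:

```python
def datasource(db_type, host, port):
    """Build a Datasource string: 'host:port', or 'host,port' for MS SQL Server."""
    kind = db_type.lower()
    if kind in {"mysql", "postgresql", "oracle"}:
        return f"{host}:{port}"
    if kind == "mssql":
        return f"{host},{port}"
    raise ValueError(f"unsupported database type: {db_type}")


if __name__ == "__main__":
    # db.example.org is a made-up server name.
    print(datasource("mysql", "db.example.org", 3306))
    print(datasource("mssql", "db.example.org", 1433))
```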

Below you can find a few tips on how to set up your database to listen on a TCP/IP port and to allow remote connections.
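
Before defining a resource, it can help to verify from the DiGIR Provider machine that the database port is reachable at all. The following Python sketch is one way to do so; it is not part of the package and the function name is hypothetical.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Assumption: replace localhost and 3306 with your database server and port.
    print(port_is_open("localhost", 3306))
```

A positive result only shows that something is listening on the port. It does not prove that the database accepts logins from the remote machine, which depends on the access rights discussed for each database.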

A.1 MySQL

The port number where the MySQL server is listening (default value is 3306) is defined in the %WINDOWS_DIR%\my.ini file. In order to modify the parameters from the my.ini file, you can run the WinMySQLAdmin.exe file from the %MYSQL_DIR%\bin folder.

If remote connections to your database are not allowed and the DiGIR Provider has been installed on a remote machine, you have to modify the access rights for the database in question. Connect to your database as administrator, using for example mysql.exe (the MySQL client) located in the %MYSQL_DIR%\bin folder, and execute an SQL query of the form:

GRANT SELECT ON db_name.* TO dbuser@'%.gbif.org' IDENTIFIED BY 'foobar1';

where db_name is the name of the database that will be used by the DiGIR Provider, dbuser is the user name of an existing database user and foobar1 is that user's password. The above SQL statement allows the execution of SELECT queries over connections initiated from any host in the gbif.org domain.
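
The host part of the statement uses the wildcards % (any sequence of characters) and _ (exactly one character). The Python sketch below approximates this matching in order to show which host names a pattern such as %.gbif.org covers; it is an illustration only, not MySQL's own implementation, and the function name is hypothetical.

```python
import re


def host_matches(pattern, host):
    """Check a host name against a host pattern with % and _ wildcards."""
    parts = []
    for ch in pattern:
        if ch == "%":
            parts.append(".*")  # % matches any sequence of characters
        elif ch == "_":
            parts.append(".")   # _ matches exactly one character
        else:
            parts.append(re.escape(ch))
    return re.fullmatch("".join(parts), host, flags=re.IGNORECASE) is not None


if __name__ == "__main__":
    for name in ("digir.gbif.org", "gbif.org", "digir.example.org"):
        print(name, host_matches("%.gbif.org", name))
```

Note that under this reading the pattern covers hosts below gbif.org but not a host named exactly gbif.org.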

Check the Linux Users' Guide for the respective information regarding Linux MySQL installations.

Example Datasource definition:

A.2 PostgreSQL

Check the Linux Users' Guide for the respective information regarding Linux PostgreSQL installations.

Example Datasource definition:

A.3 MS SQL SERVER

In order to connect to an MS SQL Server from the remote machine where you have installed the DiGIR provider, you must activate TCP/IP connectivity in MS SQL Server using the "Server Network Utility".

Open the utility. If TCP/IP is listed under the Disabled protocols area, select it and move it to the Enabled protocols area using the Enable >> button. If you want to define the port number on which the server can be accessed (default value is 1433), select TCP/IP (under the Enabled protocols area) and then click on the Properties... button. Define the port, click on the OK button and then Apply your changes. Finally, restart the SQL Server.

Example Datasource definition:

A.4 MS Access

You can connect with a MS Access database via ODBC or directly using a driver with OLEDB support.

To create an ODBC link, open the Control Panel window of the Windows workstation and click on Administrative Tools >> Data Source (ODBC). Select the "System DSN" tab and then press the "Add" button. Select the Microsoft Access driver and press the "Finish" button. In the Data Source Name text field, enter a name and make a note of it, because this is the name that you will have to use to refer to your database. Finally, press the "Select" button and choose the required Access database from the dialogue that opens. Your ODBC connection is now set up.

Example Datasource definition (ODBC):

Example Datasource definition (using ADO):

A.5 Oracle

In order to connect to an Oracle 9i database located on a remote server, you must download and install the Oracle9i Database Release 2 Client for Microsoft Windows 98/2000/NT/XP (http://otn.oracle.com/software/products/oracle9i/htdocs/winsoft.html) on the machine where you will install the DiGIR Provider. You can use Oracle's Universal Installer program to install the Oracle Client software. The Runtime option provides sufficient libraries for connecting the DiGIR provider to your database.
If you install the DiGIR Provider on a machine where Oracle 9i is already installed, you do not have to download the Oracle Client package.
You should be able to install the Oracle Client by running the Universal Installer program of your installed Oracle package.

In addition, after the installation of the DiGIR Provider (using the DiGIR Provider Package software) you must enable the loading of the PHP Oracle libraries. To do so, remove the leading semicolon from the following line of the php.ini file:

        ;extension=php_oci8.dll
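
In a php.ini file a leading semicolon marks a comment, so the line above is inactive until the semicolon is removed. The Python sketch below shows that edit on the text of a php.ini file; it is an illustration only, the function name is hypothetical, and it assumes the file was read in text mode so that line endings are plain newlines.

```python
import re


def enable_extension(ini_text, dll_name):
    """Uncomment the 'extension=<dll_name>' line in the text of a php.ini file."""
    pattern = re.compile(
        r"^[ \t]*;[ \t]*(extension[ \t]*=[ \t]*" + re.escape(dll_name) + r")[ \t]*$",
        flags=re.MULTILINE,
    )
    # Keep only the directive itself, dropping the semicolon and surrounding blanks.
    return pattern.sub(lambda m: m.group(1), ini_text)


if __name__ == "__main__":
    sample = ";extension=php_mysql.dll\n;extension=php_oci8.dll\n"
    print(enable_extension(sample, "php_oci8.dll"))
```

Other commented extension lines are left untouched, and running the function a second time changes nothing.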

Furthermore, you have to define a Listener in your Oracle installation, using for example the Oracle Net Manager, where Protocol is TCP/IP, Host is db.server.name and Port is portNumber (default port number is 1521), and associate the Listener with the Database Service (database instance) that you would like to access using the DiGIR provider software.

Example Datasource definition:

The database name used in the above definition is the Service Name (or Global Database Name) of your database. For example, if you have created a database whose Global Database Name is pyyf.worldDomain and whose SID is pyyf, then the value to use in the above definition is the pyyf.worldDomain string.

Hint: If your Oracle database is installed on an MS Windows server, the DiGIR provider is installed on a different machine and there is a firewall between the two servers, you have to add USE_SHARED_SOCKET=TRUE to the MS Windows registry of the server, as follows, and then restart the database and listener services of Oracle.

B. Resources