[OAI-implementers] XML encoding problems with DSpace at MIT

david casal D.Casal@uea.ac.uk
Mon, 17 Feb 2003 12:02:50 +0000 (GMT)


Hello all,

On Sat, 15 Feb 2003, Hussein Suleman wrote:

> personally, the code i distribute to others does quite a lot of XML
> cleaning in the data provider, but none at all in the harvester. i think
> the basic philosophy i'm following is: clean data as close to the source
> as possible.
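
As a rough illustration of cleaning at the data provider, the sketch below
strips characters that the XML 1.0 specification does not allow before a
metadata value is serialised. The helper name clean_for_xml is hypothetical
and is not taken from Hussein's code or from any OAI toolkit; it only shows
the general idea.

```python
import re

# Matches any character outside the XML 1.0 "Char" production:
# #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
_ILLEGAL_XML_CHARS = re.compile(
    r"[^\u0009\u000A\u000D\u0020-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def clean_for_xml(text: str) -> str:
    """Drop characters that XML 1.0 forbids (hypothetical helper)."""
    return _ILLEGAL_XML_CHARS.sub("", text)
```

Whether to drop such characters or replace them with a placeholder is a
policy choice for the repository; dropping is assumed here only to keep the
sketch short.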

We (luminas.co.uk) have implemented an OAI-PMH layer for Cocoon
(xml.apache.org/cocoon). Since we use Cocoon in digital repository
projects, some of which will use DSPACE, we needed to achieve separation
of OAI harvesting and provision from the internal presentation and logic of
applications. Some of the benefits of this approach, as we see it:

- The OAI layer is built into Cocoon, so no separate client or service
implementation is needed.

- Cocoon's inherent Separation of Concerns means that one can map
repository data to OAI through a simple stylesheet, and deal with encoding
issues in the same way, through XML Schema validation and further use of
XSLT.

- Built so as to achieve 'plug and play' functionality within an
application.

- While it is primarily intended as a data provider layer, it can work the
other way too (as a harvester, when the application points an indexer at
other sites).

- OAI functionality within an established web publishing framework
already being used successfully for digital repository applications.
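
To make the mapping step in the list above more concrete, here is a minimal
sketch of turning a repository record into an oai_dc metadata element. It is
not the Cocoon layer itself: that layer does the mapping with an XSLT
stylesheet, whereas this sketch swaps in plain Python with the standard
library's ElementTree. The record layout (a dict of field name to list of
values) and the function name record_to_oai_dc are assumptions made for
illustration; only the two namespace URIs come from the OAI-PMH and Dublin
Core specifications.

```python
import xml.etree.ElementTree as ET

# Namespace URIs defined by OAI-PMH 2.0 and Dublin Core.
OAI_DC = "http://www.openarchives.org/OAI/2.0/oai_dc/"
DC = "http://purl.org/dc/elements/1.1/"

ET.register_namespace("oai_dc", OAI_DC)
ET.register_namespace("dc", DC)


def record_to_oai_dc(record: dict) -> bytes:
    """Map a hypothetical repository record to an oai_dc:dc element.

    `record` is assumed to map Dublin Core field names to lists of strings.
    """
    root = ET.Element(f"{{{OAI_DC}}}dc")
    for field in ("title", "creator", "date"):
        for value in record.get(field, []):
            element = ET.SubElement(root, f"{{{DC}}}{field}")
            element.text = value  # ElementTree escapes &, < and > on output
    # Serialise as UTF-8 bytes so the encoding is fixed at the provider.
    return ET.tostring(root, encoding="utf-8")


if __name__ == "__main__":
    sample = {"title": ["Caf\u00e9 & Society"], "creator": ["Example, A."]}
    print(record_to_oai_dc(sample).decode("utf-8"))
```

Serialising through an XML library rather than by string concatenation is
what keeps escaping and encoding problems on the provider side of the fence.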

Since we are still working out some issues with regard to further
integration of Cocoon within DSPACE (and doing extensive testing), we will
be releasing the code for the OAI-Cocoon layer after the OAI (OAForum)
meeting in Berlin.

Note: this effort is in no way meant to parallel OAICat's integration into
DSPACE, but to offer a simple 'lego block' for use within Cocoon, whether it
is used as a standalone framework or within larger architectures such as
DSPACE.

Comments welcome.

Cheers,

David

david casal                   --0+
    ---
d.casal@uea.ac.uk             --9+
    ---
ecdc.dyndns.org/dc	      --)+