[OAI-general] System Architecture
Thomas G. Habing
Thu, 20 Feb 2003 14:51:06 -0600
In the various search systems we have developed based on OAI-harvested DC
records, we have always combined the creator and contributor (and, I think,
even the publisher) fields together under a single index, primarily because
of the issues described in this discussion thread.
In our Cultural Heritage search service
(http://oai.grainger.uiuc.edu/CandI303/search/) the combined index is
labeled 'Author/Artist'. In our engineering search interface
(http://g118.grainger.uiuc.edu/engroai/) the combined index is labeled
When aggregating across many different OAI providers, especially across
disciplines, this seems to be the only way to do it. If the experts can't
agree on how to split this data, the average (non-expert) user probably
isn't going to understand the difference either, or even care.
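
As a rough illustration of the combined index described above, here is a
minimal Python sketch. It is not the code behind either of our search
services; the function name combined_names and the sample record are made up
for the example, and it assumes each harvested record arrives as a plain
unqualified oai_dc XML string.

```python
import xml.etree.ElementTree as ET
from typing import List

# Namespace of the unqualified Dublin Core elements used inside oai_dc records.
DC_NS = "http://purl.org/dc/elements/1.1/"

# Elements folded into the single "name" index (hypothetical choice that
# mirrors the creator/contributor/publisher merge described in the post).
NAME_TAGS = {"{%s}%s" % (DC_NS, name)
             for name in ("creator", "contributor", "publisher")}


def combined_names(record_xml: str) -> List[str]:
    """Collect creator, contributor and publisher values from one record.

    Values are returned in document order, with empty values and exact
    duplicates dropped, so a provider that puts the author in
    'contributor' is indexed the same way as one that uses 'creator'.
    """
    root = ET.fromstring(record_xml)
    names: List[str] = []
    for elem in root.iter():
        if elem.tag in NAME_TAGS:
            value = (elem.text or "").strip()
            if value and value not in names:
                names.append(value)
    return names


# Made-up sample record, for illustration only.
SAMPLE = """<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                       xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>Example thesis</dc:title>
  <dc:contributor>Doe, Jane</dc:contributor>
  <dc:creator>Smith, John</dc:creator>
  <dc:publisher>Example University</dc:publisher>
  <dc:contributor>Doe, Jane</dc:contributor>
</oai_dc:dc>"""

if __name__ == "__main__":
    for name in combined_names(SAMPLE):
        print(name)
```

The resulting list would then be fed to whatever indexer is in use as a
single field, whatever label the interface gives it.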
Tansley, Robert wrote:
>>From: François Schiettecatte [mailto:email@example.com]
>>I can't speak to the legality of their XML. What I have found
>>is that they put the author in the 'contributor' field as
>>opposed to the 'creator' field which is where everyone else
>>put it. I am not sure if that would be termed non-compliant,
>>but it did throw my indexer for a loop.
> This is from Margret Branschofsky, from MIT Libraries:
>>We used Contributor because it relieves the user of the burden
>>of deciding who of all the people involved in the production of
>>an item is "primarily responsible" in the creation of the
>>item. We added roles to contributor, thinking that was more
>>specific than the vague "creator" field. So we have
>>Contributor.illustrator...etc. And if anyone wants to leave it
>>vague, they can use the unqualified Contributor field.
>>We did not do this out of the blue, but were actually following
>>the Libraries Application Profile (LAP) which was in an early
>>draft stage at the time. That draft conflated the Contributor
>>and Creator elements (as well as Publisher.) When the DC
>>Libraries Group published the present draft of the LAP
>>(officially called DC-Lib), they reneged on the merging of
>>these elements. However, there is a comment under both of these
>>elements that states "Creator and Contributor may be conflated
>>if desired by the application. In that case,
>>Contributor.Creator may be used if desired".
> Of course, the basic OAI schema for DC is unqualified, so the qualifiers Margret mentions are unknown to harvesters.
> My own tuppence worth... I've always wondered why there's one element in DC (creator) that's arguably a subset of/subsumed by another (contributor). Additionally, when talking about digital objects, the 'creator' (whoever is 'primarily responsible' for creation) is often debatable: consider a work originally written by a professor that is scanned in and deposited by one of his students. Is the professor 'primarily responsible' for the work? If one considers the bits and bytes of the digitised manifestation, one could argue it's the student who is primarily responsible for their creation. Or are they both creators? That could cause confusion too. It seems there's an implicit assumption that 'creator' corresponds to original authors, the professor in this case.
> I'm sure I'm not the first person to think along these lines (the Library Application Profile folk obviously thought about this), so I'd be interested to hear what folk more directly involved in DC think.
> In other news, we've tracked down the encoding problem--we had some bad data in the database that was introduced by a bug in a batch import script. It'll be cleaned up on the production server soon; I'll let you know when that happens.
> Robert Tansley / Hewlett-Packard Laboratories / (+1) 617 551 7624
Research Programmer, Digital Library Projects
University of Illinois at Urbana-Champaign
155 Grainger Engineering Library Information Center, MC-274
firstname.lastname@example.org, (217) 244-4425