We have several components that could form part of a "suspicious paper" detector in small-molecule chemistry (and possibly crystallography). The term is meant to cover at least:<br>* syntactic and stylistic problems. These do not in themselves indicate that something has been deliberately fudged, but they do indicate poor quality. Examples include incorrect chemical names and locally incorrect English syntax.<br>
* internal inconsistency. Examples are spectral peaks checked against element counts, names not consistent with the formula, etc.<br>* inconsistency between values in this paper and the rest of the literature. This can be checked for crystallographic structures (e.g. in CrystalEye) or for chemical shifts.<br>
* computation. This relates particularly to what Marlon is doing in OREChem.<br><br>There is a lot of tedious work to do, but it's certainly possible to run some of these checks on a high-throughput basis.<br><br>I am now fairly bullish about interpreting chemical reactions, and we can extract a lot of this from supplemental data.<br>
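<br>As a rough illustration of the "internal inconsistency" check (spectral peaks against element count), here is a minimal sketch. It assumes a plain formula string with no brackets, charges or hydrates, and a list of 1H NMR integrals already extracted from the paper; the names parse_formula and hydrogen_count_mismatch are hypothetical and not part of any existing OREChem or CrystalEye code. A mismatch is only a flag for human review, since exchangeable protons are often not reported.<br>

```python
import re
from collections import Counter


def parse_formula(formula: str) -> Counter:
    """Count atoms in a simple formula such as 'C2H6O'.

    Assumption: no brackets, charges, isotopes or hydrate notation.
    """
    counts = Counter()
    for symbol, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[symbol] += int(number) if number else 1
    return counts


def hydrogen_count_mismatch(formula: str, integrals, tolerance: float = 0.5) -> float:
    """Compare summed 1H integrals with the hydrogen count of the formula.

    Returns 0.0 when the two agree within `tolerance`, otherwise the
    signed difference (reported minus expected).
    """
    expected = parse_formula(formula)["H"]
    reported = sum(integrals)
    diff = reported - expected
    return 0.0 if abs(diff) <= tolerance else diff


if __name__ == "__main__":
    # Ethanol, C2H6O: CH3 (3H), CH2 (2H), OH (1H)
    print(hydrogen_count_mismatch("C2H6O", [3, 2, 1]))
    # One hydrogen too many reported: flagged for review
    print(hydrogen_count_mismatch("C2H6O", [3, 2, 2]))
```

The tolerance is an arbitrary choice here; a real high-throughput run would need to tune it and to allow for exchangeable protons and overlapping peaks.<br>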
<br>P.<br><br><br><div class="gmail_quote">On Sat, Jan 16, 2010 at 8:27 PM, Lee Giles <span dir="ltr">&lt;<a href="mailto:giles@ist.psu.edu">giles@ist.psu.edu</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
There are many ways to address this. We could create a data and text plagiarism detector.<br>
CiteSeerX hopes to have a text similarity detection system up soon. There are other plagiarism detectors such as Turnitin that can be used for text. Penn State has licensed it for our courses.<br>
<br>
To do this, however, requires access to text and data which we don't easily have in chemistry.<br>
<font color="#888888"><br>
Lee<br>
</font><div><div></div><div class="h5"><br>
Coles S.J. wrote:<br>
> This has caused a massive amount of talk in the crystallographic community (filling my inbox somewhat!). Here's another case that has come to light<br>
><br>
> <a href="http://www.the-scientist.com/blog/display/56226/" target="_blank">http://www.the-scientist.com/blog/display/56226/</a><br>
><br>
> and there have been other cases in different journals from the same Acta E authors and even claims of direct 100% plagiarism of journal articles - text, figures etc (not just the crystal structure data!).<br>
><br>
> Ecrystals and OREChem would help address this problem, as it arises in part from:<br>
><br>
> "What we know is that when Dr. Murthy was asked to provide the data behind the structures, there was not sufficient material presented to allow the expert panel to determine the source of the error,"<br>
><br>
> Provenance and clear availability of underlying data is crucial and we can help here! We should try to use these incidents to our benefit...<br>
><br>
> Simon.<br>
><br>
> Simon Coles.<br>
> Co-Director, EPSRC National Crystallography Service.<br>
> School of Chemistry,<br>
> University of Southampton.<br>
> Southampton, SO17 1BJ. UK.<br>
> t: +44(0)2380596722<br>
> f: +44(0)2380596723<br>
> e: <a href="mailto:s.j.coles@soton.ac.uk">s.j.coles@soton.ac.uk</a><br>
> www: <a href="http://www.soton.ac.uk/chemistry/research/coles/coles.html" target="_blank">http://www.soton.ac.uk/chemistry/research/coles/coles.html</a><br>
> NCS: <a href="http://www.ncs.chem.soton.ac.uk" target="_blank">http://www.ncs.chem.soton.ac.uk</a><br>
><br>
><br>
><br>
><br>
> On 05/01/2010 22:02, "<a href="mailto:mpierce@cs.indiana.edu">mpierce@cs.indiana.edu</a>" &lt;<a href="mailto:mpierce@cs.indiana.edu">mpierce@cs.indiana.edu</a>&gt; wrote:<br>
><br>
> An interesting little editorial about fraudulent structures that the<br>
> crystallographers on the list have probably also read:<br>
><br>
> <a href="http://journals.iucr.org/e/issues/2010/01/00/me0406/index.html" target="_blank">http://journals.iucr.org/e/issues/2010/01/00/me0406/index.html</a><br>
><br>
><br>
><br>
> Marlon<br>
><br>
><br>
> _______________________________________________<br>
> Orechem mailing list<br>
> <a href="mailto:Orechem@openarchives.org">Orechem@openarchives.org</a><br>
> <a href="http://www.openarchives.org/mailman/listinfo/orechem" target="_blank">http://www.openarchives.org/mailman/listinfo/orechem</a><br>
><br>
><br>
><br>
<br>
</div></div></blockquote></div><br><br clear="all"><br>-- <br>Peter Murray-Rust<br>Reader in Molecular Informatics<br>Unilever Centre, Dep. Of Chemistry<br>University of Cambridge<br>CB2 1EW, UK<br>+44-1223-763069<br>