UW Ellbogen CTL:  Instructional Computing Services

Last update 27 January, 2003; R. Hill

What's all this about plagiarism and what help can technology offer?

(Adapted from 20 September 2002 workshop notes)

Instructors who assign writing projects occasionally find themselves uncomfortably suspicious of the originality of student work, and, having heard that the Internet is a rich source of plagiarized material, wonder how to search for a possible source of that work.  Many Internet sites do indeed provide self-published commentary and fiction, many provide documents as a public service by government agencies, and some have been set up to sell papers and essays outright (www.duenow.com).  Among the many services that compare text submissions to Internet documents are Turnitin and EVE2.  Finding Turnitin inadequate, UW has selected EVE2.

The staff of the Ellbogen Center for Teaching and Learning does not, in general, recommend the use of software to detect plagiarism. We recommend instead that instructors give developmental assignments, in which students turn in progress reports, prospectuses, drafts, and bibliographies, both to discourage plagiarism and to encourage good research and revision habits. Like any mechanical text search, EVE2 is only a simple tool, with limitations demonstrated by the test results in the Appendix below, and should play a minor part in assessment.  The writing model presented by Turnitin, in which a student presumably submits several refinements of a paper until it "passes" the plagiarism test, leaves us especially dubious.  The Council of Writing Program Administrators explains this issue in a position statement (http://www.ilstu.edu/~ddhesse/wpa/positions/index.html) that makes sense to us.

HOW TO USE EVE2

You must first obtain the UW account name and registration password from this office or from the Office of Academic Affairs. Then download EVE2 from http://www.canexus.com/eve/download.shtml and run the self-executing install file.

Save the papers you wish to check as plain text, MS Word, or WordPerfect files in your local folders, then select them in EVE2's submission window.  Simple searches finish in a few minutes; the tests described below, submitted all at once, took about an hour. Results are returned in the EVE2 window and also logged in your local submission folder as RTF files.

 

 

APPENDIX

TEST RESULTS-- EVE2

EVE2 does not provide direct links between passages from the submitted text and source documents found. In other words, to discover plagiarism, the instructor would have to look through the source documents found, perhaps with a string search, to spot exact duplication. The following tests use these EVE2 settings:  Quick search, 50% cutoff
  1. A Robert Frost essay appearing as an HTML (text) file at www.robertfrost.org/essay.html
    >> Submitted as "The Most of Rhodora" by Lilliwhite Hands
    >>>> Result: Original source (HTML essay) found. 30% match
  2. A cut-and-paste combination of two reviews of the book "Xanthippic Dialogues" by Roger Scruton, one at www.geocities.com/Athens/Ithaca/2564/scruton.htm and one at www.staugustine.net/review.html
    >> Submitted as "Xanthippe's Presence" by Constant Cadger
    >>>> Result: One original source, the Geocities page, was found; the St. Augustine Press page no longer on the web. 43% match
  3. A short report, four paragraphs with headings, by someone named Justin on a Geocities page at http://www.geocities.com/lizards_312/Justins-Universe-dense-objects.html
    >> Submitted as "Afterlives of Stars" by Justice Knott (verbatim, in full)
    >>>> Result: Original source (Geocities page) found. 75% match.
  4. An essay from David Corker at the University of East Anglia (American Studies) on metaphor, available at http://www.uea.ac.uk/eas/People/corker/In%20Defence%20of%20Metaphor.htm
    >> Submitted as "Metaphor Rules" by Diablo Corker (verbatim, in full)
    >>>> Result: Original source found (UEA faculty page). 100% match.
  5. The first seven pages of an essay on tourism in national parks from a Dutch university source, in PDF form, by Jan van der Straaten, found by searching for "environment rain forest species" in Google, at http://greywww.kub.nl:2080/greyfiles/worc/1996/doc/17.pdf
    >> Submitted as "What's All This Then About National Parks" by Margy Bargy
    >>>> Result: Original PS source found ("greywww.kub.nl"), but with a low match level, either because only the first few pages were submitted or because the PostScript file contains many extraneous printer language commands. (Also found my own notes for this workshop!) 12% match.
  6. A report on recombinant DNA in PostScript form (text with embedded commands) at www.ai.mit.edu/research/abstracts/abstracts2000/ps/z-abelson.ps
    >> Submitted as "Recombining" by Joe Schmo
    >>>> Result: Found 11 sources, mostly MIT sites, including the original, but not the original abstract. 67% match.
  7. A brief extract from a longer observation on Chaucer's Clerk's Tale, from http://www.richardhay.com/chaucer.html
    >> Submitted with slight variations in wording, and a couple of additional sentences inserted.
    >>>> Result: Not found, possibly because the 50% match criterion was not met due to the brevity of the extract relative to the original document.

EVE2 succeeded in most cases, finding obvious Internet documents along with original PostScript sources, amateur pages on commercial servers, and the problematic overseas university page. We advise, however, that the instructor never use its results alone as a basis for judgment regarding any given essay. As the last test shows, mechanical matching driven by parameters can yield false negatives. False positives can also arise, for example when an earlier version of the test document is on the web, or when an essay quotes a source at length.
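
The manual string search suggested at the start of this appendix can be sketched in a few lines of Python. This is an illustration only, not a description of how EVE2 works internally: the function names, the choice of six-word runs, and the percentage calculation are all our own assumptions. The sketch lists every run of n consecutive words that a submitted text shares with one source document, and reports what share of the submission's runs were found in the source.

```python
import re


def words(text):
    """Lowercase the text and split it into words, ignoring punctuation."""
    return re.findall(r"[a-z0-9']+", text.lower())


def shingles(text, n=6):
    """Return the set of every run of n consecutive words in the text."""
    w = words(text)
    return {" ".join(w[i:i + n]) for i in range(len(w) - n + 1)}


def shared_passages(submitted, source, n=6):
    """Return the n-word runs that appear in both texts, sorted alphabetically."""
    return sorted(shingles(submitted, n) & shingles(source, n))


def overlap_percent(submitted, source, n=6):
    """Percentage of the submission's n-word runs that also occur in the source.

    This is a hypothetical measure for illustration, not EVE2's match figure.
    """
    sub = shingles(submitted, n)
    if not sub:
        return 0.0  # submission is shorter than n words
    return 100.0 * len(sub & shingles(source, n)) / len(sub)


if __name__ == "__main__":
    source = "the quick brown fox jumps over the lazy dog near the river bank"
    submitted = "we saw the quick brown fox jumps over the lazy dog yesterday"
    for passage in shared_passages(submitted, source, n=4):
        print(passage)
    print(round(overlap_percent(submitted, source, n=4), 1), "percent")
```

Because a single changed word breaks every run that contains it, a sketch like this misses lightly reworded passages, much as the last EVE2 test did. A smaller n catches more rewording but also flags more common phrases, so any result still needs to be read by the instructor.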

TEST RESULTS-- TURNITIN

Turnitin failed half of the tests submitted, especially for PDF and PostScript files, commercial servers available to the public, and overseas sources. See notes for the workshop of March 2002 for the full story. The company's claim that the use of paper mills will be revealed remains untested, as we balked at purchasing such a paper for test submission.

TEST RESULTS-- SEARCH ENGINES

  1. (March 2002) This extract from the Jan van der Straaten PDF paper (see above) was typed into the search phrase window of various search engines.
    "The disadvantages of this development are increasingly being recognised by politicians, particularly within the European Union. In recent European documents, such as the Fifth Action Programme, it is argued that the traditional development of the countryside should be stopped and that a sustainable development of society should result in limitations to the 'normal' economic development of regions."

    Results:
    Google (Advanced, exact phrase): "404 Not Found"
    Alta Vista (Advanced, exact phrase): "Found 0 results"
    MSN Search (exact string): the search phrase was cut off at "... European doc," and the search then failed.
    Excite: No exact phrase search.
    Dogpile: No exact phrase search.

  2. (March 2002) This shorter extract, a single sentence from the van der Straaten paper, was submitted.
    "In recent European documents, such as the Fifth Action Programme, it is argued that the traditional development of the countryside should be stopped and that a sustainable development of society should result in limitations to the 'normal' economic development of regions."

    Results:
    Google (Advanced, exact phrase): Successful; found van der Straaten paper.
    Google was able to find this source because it translates PDF documents into HTML as it inspects them.
    Alta Vista (Advanced, exact phrase): "Found 0 results"

  3. A single sentence from the essay on Chaucer (see above) was submitted to Google.

    Results: Google immediately found the original Richard Hay piece.
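
In tests 1 and 2 above, a single sentence succeeded where a two-sentence extract failed, which suggests submitting a suspect passage one sentence at a time. The sketch below prepares such searches. It assumes that a search engine accepts an exact phrase enclosed in double quotation marks; the function names, the eight-word minimum, and the base URL are illustrative choices of ours, not features of any tool tested here.

```python
import re
from urllib.parse import quote_plus


def sentences(text):
    """Split text into sentences at '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def exact_phrase_queries(text, min_words=8):
    """Return one double-quoted query for each sentence of at least min_words words."""
    queries = []
    for sentence in sentences(text):
        if len(sentence.split()) >= min_words:
            # Drop embedded double quotes and the final stop so the phrase stays intact.
            phrase = sentence.replace('"', "").rstrip(".!?")
            queries.append('"' + phrase + '"')
    return queries


def search_url(query, base="https://www.google.com/search?q="):
    """Build a search URL for the query; the base address is an assumption."""
    return base + quote_plus(query)


if __name__ == "__main__":
    extract = (
        "The disadvantages of this development are increasingly being "
        "recognised by politicians, particularly within the European Union. "
        "In recent European documents, such as the Fifth Action Programme, "
        "it is argued that the traditional development of the countryside "
        "should be stopped."
    )
    for query in exact_phrase_queries(extract):
        print(search_url(query))
```

Very short sentences are skipped because they tend to match many unrelated pages. The URLs are meant to be opened by hand in a browser; the sketch does not contact any search engine itself.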
