Author Identification

Even in Shakespeare's day, plagiarism was rampant. Indeed, scholars still argue over particular sonnets attributed to Shakespeare that are now suspected of having been written by Bacon. The computer techniques these scholars use consist mostly of simple word-frequency analysis.
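To make the idea of word-frequency analysis concrete, here is a minimal sketch in Python. It is not the method any particular scholar used: the function names, the tokenizing rule, and the distance measure (sum of absolute differences in relative frequency) are all assumptions chosen for illustration.

```python
import re
from collections import Counter

def word_frequencies(text):
    """Return each word's share of the total word count in text."""
    # Assumption: a "word" is a run of letters or apostrophes, case ignored.
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return {}
    counts = Counter(words)
    total = len(words)
    return {word: count / total for word, count in counts.items()}

def profile_distance(a, b):
    """Sum of absolute frequency differences over the combined vocabulary.

    0.0 means identical profiles; larger values mean less similar usage.
    """
    vocabulary = set(a) | set(b)
    return sum(abs(a.get(w, 0.0) - b.get(w, 0.0)) for w in vocabulary)

known = word_frequencies("the cat sat and the dog sat")
disputed = word_frequencies("the cat sat and the dog ran")
print(profile_distance(known, disputed))
```

In practice such a comparison would be run over long texts of known authorship and the disputed text would be assigned to the nearest profile. With samples as short as these the numbers mean very little.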

A technique that captures an author's style more completely is to analyze the grammar. Sentences and phrases can be parsed into a tree structure by recursive decomposition. Such a tree is shown below.

The tree's shape is as distinctive as a fingerprint: it identifies an author by his or her writing style, and it can also be applied within a single document to detect material included from elsewhere.
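As a rough sketch of how a parse tree's shape might be extracted and compared, the Python below walks a tree recursively. Several things here are assumptions and do not come from the text: the tree is built by hand as nested tuples of the form (label, child, ...) with plain strings as words, the grammar labels are invented, and no real parser is involved. The functions shape, depth, and rule_counts are hypothetical helpers.

```python
from collections import Counter

def shape(tree):
    """Return the tree's structure with every word replaced by '*'."""
    if isinstance(tree, str):          # a leaf is a word
        return "*"
    label, *children = tree
    return "(" + " ".join([label] + [shape(c) for c in children]) + ")"

def depth(tree):
    """Number of labelled levels above the deepest word."""
    if isinstance(tree, str):
        return 0
    return 1 + max((depth(c) for c in tree[1:]), default=0)

def rule_counts(tree, counts=None):
    """Count each 'parent -> children' pattern by recursive descent."""
    if counts is None:
        counts = Counter()
    if isinstance(tree, str):
        return counts
    label, *children = tree
    rhs = " ".join("word" if isinstance(c, str) else c[0] for c in children)
    counts[label + " -> " + rhs] += 1
    for child in children:
        rule_counts(child, counts)
    return counts

# Two hand-built trees: different words, same grammatical structure.
first = ("S", ("NP", ("Det", "the"), ("N", "cat")), ("VP", ("V", "sat")))
second = ("S", ("NP", ("Det", "a"), ("N", "dog")), ("VP", ("V", "ran")))

print(shape(first))
print(shape(first) == shape(second))
print(rule_counts(first))
```

Because the words are discarded, two sentences with the same structure produce the same shape. Pattern counts gathered over many sentences could then serve as a style profile for an author, or be compared section by section within one document to look for passages that do not match the rest.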