Copyright Detection

The copying of copyrighted documents is an important issue for copyright holders and governments. Currently, some schools rely on an honor system to reimburse the owners of copyrighted documents, such as major publishers. A system that automatically identifies copyrighted documents would simplify this process and provide better accounting of when copyrighted materials are copied. This technology can be useful in government, school, and workplace settings.

Another use is detecting the copying of banknotes. While some banknotes carry marks that are easily detected, not all do. When such copying is detected, authorities can be alerted.

Related Publications

Publication Details
  • Document Engineering
  • Aug 28, 2018


We introduce a system to automatically manage photocopies made from copyrighted printed materials. The system monitors photocopiers to detect the copying of pages from copyrighted publications. Such activity is tallied for billing purposes. Access rights to the materials can also be checked to prevent printing. Digital images of the copied pages are checked against a database of copyrighted pages. To preserve privacy when non-copyrighted materials are copied, only digital fingerprints are submitted to the image matching service. A challenge for such systems is creating the database of copyrighted pages. To facilitate this, our system maintains statistics on clusters of similar unknown page images, along with their copy sequences. Once such a cluster has grown to a sufficient size, a human inspector can determine whether those page sequences are copyrighted. The system has been tested with hundreds of thousands of pages from conference proceedings and with millions of randomly generated pages. Retrieval accuracy has been around 99%, even with copies of copies or double-page copies.
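
The workflow described in the abstract (fingerprint a page, match it against known copyrighted pages, tally matches, and cluster unknown pages for later review) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the abstract does not say how fingerprints are computed or matched, so a simple average hash with Hamming distance stands in for the real method, and the names `average_hash`, `CopyMonitor`, `max_distance`, and `review_threshold` are hypothetical. Copy sequences are not modeled.

```python
from collections import Counter
from dataclasses import dataclass, field


def average_hash(pixels):
    """Fingerprint a grayscale page given as rows of 0-255 ints.

    Stand-in technique (an assumption, not the published method):
    each bit records whether a pixel is brighter than the page mean.
    """
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for p in flat:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits


def hamming(a, b):
    """Number of differing bits between two fingerprints."""
    return bin(a ^ b).count("1")


@dataclass
class CopyMonitor:
    # Hypothetical database: fingerprint -> publication identifier.
    known: dict = field(default_factory=dict)
    max_distance: int = 2        # tolerance for noisy copies (assumed value)
    review_threshold: int = 3    # cluster size that triggers review (assumed)
    tally: Counter = field(default_factory=Counter)
    # Representative fingerprint of each unknown cluster -> copy count.
    unknown_clusters: Counter = field(default_factory=Counter)

    def _nearest(self, fp, candidates):
        """Return the closest candidate within max_distance, else None."""
        best = min(candidates, key=lambda c: hamming(fp, c), default=None)
        if best is not None and hamming(fp, best) <= self.max_distance:
            return best
        return None

    def record_copy(self, fp):
        """Process one copied page; only its fingerprint is needed."""
        match = self._nearest(fp, self.known)
        if match is not None:
            publication = self.known[match]
            self.tally[publication] += 1   # counted for billing
            return publication
        # Unknown page: add it to a similar cluster or start a new one.
        rep = self._nearest(fp, self.unknown_clusters)
        if rep is None:
            rep = fp
        self.unknown_clusters[rep] += 1
        return None

    def clusters_for_review(self):
        """Unknown clusters large enough for a human inspector to check."""
        return [rep for rep, n in self.unknown_clusters.items()
                if n >= self.review_threshold]
```

Because `record_copy` takes only a fingerprint, the page image itself never has to leave the copier, which mirrors the privacy property the abstract describes. A production system would need an index that scales to millions of fingerprints rather than the linear scan used here.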