

This is an impressive number, considering all the scanned and proofread pages it represents. The fast growth is thanks to Distributed Proofreaders, a website launched in October 2000 by Charles Franks to share the proofreading of books among many volunteers. Volunteers choose one of the books listed on the site and proofread a given page.

The books are digitized in "text" format, with caps for terms in italic, bold or underlined, so they can be read easily by any machine, operating system or software. Digitization is done by scanning. The book is then proofread twice by two different people, who make any corrections necessary. When the original is in poor condition, as with very old books, it is typed in manually, word by word.
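As an illustration of the "caps" convention, the following sketch turns emphasized terms into capitals. It assumes, purely for the example, that emphasis arrives as simple `<i>`, `<b>` or `<u>` tags. The text does not describe the input markup or any tool used, and `emphasis_to_caps` is a hypothetical helper.

```python
import re

# Assumed input markup (not from the text): <i>, <b> or <u> around a term.
EMPHASIS = re.compile(r"<(i|b|u)>(.*?)</\1>", re.IGNORECASE | re.DOTALL)


def emphasis_to_caps(text: str) -> str:
    """Replace each emphasized span with its upper-cased plain text."""
    return EMPHASIS.sub(lambda m: m.group(2).upper(), text)


print(emphasis_to_caps("She read <i>Alice in Wonderland</i> twice."))
# prints: She read ALICE IN WONDERLAND twice.
```

The result contains no markup at all, which is the point of the convention: the emphasis survives in a file that any machine can read. Nested tags are not handled in this sketch.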

PG Europe operates under "life +50" copyright laws. DP Europe supports Unicode so that books can be proofread in numerous languages. Created in 1991 and widely used since 1998, Unicode is an encoding system that assigns a unique number to every character in any language, unlike the much older ASCII, which was meant only for English and a few European languages.
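To make the contrast concrete, here is a minimal Python sketch showing that every character has a Unicode number (its code point), while only plain unaccented Latin text can be encoded as ASCII. The sample words and the helper names `is_ascii_encodable` and `code_points` are illustrative assumptions, not anything used by DP Europe.

```python
def is_ascii_encodable(text: str) -> bool:
    """Return True if every character of `text` fits in 7-bit ASCII."""
    try:
        text.encode("ascii")
        return True
    except UnicodeEncodeError:
        return False


def code_points(text: str) -> list[str]:
    """List the Unicode code point of each character in U+XXXX notation."""
    return [f"U+{ord(ch):04X}" for ch in text]


# Illustrative samples (assumptions): an English, a French and a Greek word.
for sample in ["book", "\u00e9l\u00e8ve", "\u03b2\u03b9\u03b2\u03bb\u03af\u03bf"]:
    print(sample, code_points(sample), is_ascii_encodable(sample))
```

Only the first sample passes the ASCII check. The French and Greek words still have well-defined code points, which is what lets one site handle texts in many languages.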

Using scanned books as is, converted to text format by OCR software with no proofreading, gives a much lower quality result. After running OCR software, the text is 99% reliable, in the best of cases. For this reason, Project Gutenberg's perspective is rather different from that of the Internet Archive. In the Internet Archive's Text Archive, books are scanned and "OCRized", but they are not proofread.

As stated on both websites, "Remember that there is no commitment expected on this site. Proofread as often or as seldom as you like, and as many or as few pages as you like. We encourage people to do 'a page a day', but it's entirely up to you! We hope you will join us in our mission of 'preserving the literary history of the world in a freely available form for everyone to use'."

When the original is in poor condition, as with very old books, it is keyed in manually, word by word. Some volunteers prefer to type short texts themselves, or works they particularly like. But most books are scanned, "OCRized" and proofread. The advantages of digitization in "text" format are numerous.

A discussion forum allows volunteers to ask questions or seek help at any time. A project manager oversees the progress of a particular book through its different steps on the website. As of August 3, 2005, 7,639 books had been completed, 1,250 books were in progress and 831 books were being proofread.

They were not written for carefully edited, thrice-proofread, leather-bound volumes, but ground out for the unwashed hand of a Waco printer's devil, done into hastily set type and jammed between badly set beer ads and patent medicine testimonials, on a thin, little job-press sheet that could be rolled up and stuck through a wedding ring.

In each case, someone else will proofread it. Volunteers can use ASCII or any other format. Everybody is welcome, whatever the method and whatever the format. Any volunteer anywhere is welcome, for any language. There is a lot to do.

As of May 1, 2008, 13,039 books had been completed, 1,840 books were in progress and 1,000 books were being proofread. The proofreader can easily compare both versions, note the differences and fix them. OCR is usually 99% accurate, which makes for about 10 corrections a page. The proofreader saves each page as it is completed and can then either stop work or do another.
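The figure of about 10 corrections a page follows from simple arithmetic, sketched here. The page length of 1,000 characters is an assumption chosen to match the stated figures; the text gives only the 99% accuracy and the resulting 10 corrections. The function name `expected_corrections` is likewise hypothetical.

```python
def expected_corrections(chars_per_page: int, accuracy: float) -> float:
    """Expected number of wrong characters on a page at a given OCR accuracy."""
    if not 0.0 <= accuracy <= 1.0:
        raise ValueError("accuracy must be between 0 and 1")
    return chars_per_page * (1.0 - accuracy)


# Assumed page length: 1,000 characters (not stated in the text).
print(round(expected_corrections(1000, 0.99)))
# prints: 10
```

Under the same assumption, a denser page of 2,000 characters would need about 20 corrections, which is why even a 99% accurate OCR pass still leaves real work for proofreaders.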