

On May 1st, 2008, 13,039 books were completed, 1,840 books were in progress and 1,000 books were being proofread. The proofreader can easily compare both versions, note the differences and fix them. OCR is usually 99% accurate, which makes for about 10 corrections a page. The proofreader saves each page as it is completed and can then either stop work or do another.
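The step from "99% accurate" to "about 10 corrections a page" can be checked with a little arithmetic. The sketch below is only an illustration: the text does not say whether accuracy is counted per character or per word, so the function works in generic "units", and the figure of 1,000 units per page is an assumption chosen because it is what the two quoted numbers imply together.

```python
def expected_corrections(units_per_page: int, accuracy: float) -> float:
    """Expected number of units (characters or words) that OCR gets wrong on one page."""
    if not 0.0 <= accuracy <= 1.0:
        raise ValueError("accuracy must be between 0 and 1")
    return units_per_page * (1.0 - accuracy)


# Assumption: a page holds about 1,000 units; this figure is not stated in the text.
corrections = expected_corrections(units_per_page=1000, accuracy=0.99)
print(round(corrections))
```

With a denser page or a lower accuracy the count rises quickly, which is consistent with the remark that poor scans produce many more mistakes.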

The books are proofread twice, and the second time only by experienced proofreaders. Volunteers can also work independently, after contacting Project Gutenberg directly, by keying in a book they particularly like using any text editor or word processor. They can also scan it and convert it into text using OCR software, and then make corrections by comparing it with the original.
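Comparing an OCR result with a trusted transcription can be sketched with Python's standard difflib module. This is a hypothetical illustration, not the tool Project Gutenberg uses; the function name word_differences and the sample sentences are invented for the example.

```python
import difflib


def word_differences(ocr_text: str, corrected_text: str) -> list[tuple[str, str]]:
    """Return (ocr_words, corrected_words) pairs wherever the two texts differ."""
    ocr_words = ocr_text.split()
    good_words = corrected_text.split()
    # autojunk=False keeps the matcher from discarding frequent words on long texts.
    matcher = difflib.SequenceMatcher(a=ocr_words, b=good_words, autojunk=False)
    diffs = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            diffs.append((" ".join(ocr_words[i1:i2]), " ".join(good_words[j1:j2])))
    return diffs


# Invented sample: two typical OCR confusions ("h" read as "b", "e" read as "c").
ocr = "Tbe quick brown fox jumps ovcr the lazy dog"
fixed = "The quick brown fox jumps over the lazy dog"
for wrong, right in word_differences(ocr, fixed):
    print(f"{wrong!r} -> {right!r}")
```

A word-level comparison keeps the output short enough for a person to review, which is the point of the human proofreading step.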

A good OCR package averages about 10 mistakes per page, and many more if the scanner or the OCR package is of poor quality. The book is proofread twice on the computer screen by two different people, who make any corrections necessary. When the original is in poor condition, as with very old books, it is keyed in manually, word by word.

They can also scan it and convert it into text using OCR software, and then make corrections by comparing it with the original. In each case, someone else will proofread it. They can use ASCII and any other format. Everybody is welcome, whatever the method and whatever the format. Any volunteer anywhere is welcome, for any language. There is a lot to do.

Later on, it is hoped that machine translation software will be able to convert the books from any one of 100 languages into another. In 2004, Project Gutenberg was in touch with a European project studying how to combine translation software and human translators, somewhat as OCR software is now combined with the work of proofreaders. He considers himself a pragmatic and farsighted altruist.

Louis. It includes all possible Journal entries of Lewis and Clark. Most of the "courses and distances" and "celestial observations" have been omitted. The notes and most of the corrections of past editors have been removed. There are a few OCR errors, but most of the misspellings are almost 200 years old. The dates with the names in the brackets are a little redundant.

Volunteers can also work independently, either by digitizing a whole book in any word-processing programme or by scanning it and converting it into text using OCR software, then making corrections by comparing it with the original. In each case, someone else will proofread it.


OCR is usually 99% accurate, which makes for about 10 corrections a page. You save each page you do and can then either stop work or do another. You don’t have any quota to fulfill, but it’s recommended you do a page a day if possible. It doesn’t seem much but with hundreds of volunteers it really adds up.
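How a page a day "adds up" can be made concrete with a rough calculation. The numbers below are purely hypothetical: the text says only "hundreds of volunteers", so 300 volunteers is an assumed figure used to show the scale, not a reported one.

```python
def pages_per_year(volunteers: int, pages_per_day: int = 1, days: int = 365) -> int:
    """Total pages proofread in a year if every volunteer keeps the same daily pace."""
    return volunteers * pages_per_day * days


# Assumption: 300 volunteers, each doing the recommended one page a day.
total = pages_per_year(volunteers=300)
print(total)  # 109500
```

Since every book is proofread twice, the number of finished pages would be roughly half of that total under the same assumptions.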

Project Gutenberg is convinced that proofreading by human beings is a very important step, and that this step makes all the difference. Using scanned books as is, converted to text by OCR software with no proofreading, gives a much lower-quality result. After running OCR software, the text is 99% reliable, in the best of cases. The main formats used are XML, TIF and DjVu.