Proofread as often or as seldom as you like, and as many or as few pages as you like. We encourage people to do 'a page a day', but it's entirely up to you! We hope you will join us in our mission of 'preserving the literary history of the world in a freely available form for everyone to use'." What about languages? First, Project Gutenberg's books are mostly in English.
In each case, someone else will proofread it. They can use ASCII and any other format. Everybody is welcome, whatever the method and whatever the format. Any volunteer anywhere is welcome, for any language. There is a lot to do.
The books are proofread twice, and the second time only by experienced proofreaders. Volunteers can also work independently, after contacting Project Gutenberg directly, by keying in a book they particularly like using any text editor or word processor. They can also scan it and convert it into text using OCR software, and then make corrections by comparing it with the original.
And now he will never know. February, 1918. I wonder if there is any other country where the death of a young poet is double-column front-page news? And if poets were able to proofread their own obits, I wonder if any two lines would have given Joyce Kilmer more honest pride than these: JOYCE KILMER, POET, IS KILLED IN ACTION
On May 10th, 2008, 496 books were completed, 653 books were in progress and 91 books were being proofread. As stated in the Project Gutenberg FAQ, "the public domain is the set of cultural works that are free of copyright, and belong to everyone equally", i.e. books that can be digitized and made freely available on the internet.
An impressive result thanks to the relentless work of 1,000 volunteers in several countries. People could request the CD and DVD for free, and were then encouraged to make copies for a friend, a library or a school. An impressive number if we think about all the scanned and proofread pages this number represents.
The aim is also to ensure respect for the volunteers, who can be confident their work will be used for many years, even generations. Donations are used only to buy equipment and supplies, mostly computers and scanners. And then let’s remember that all the books scanned in are proofread twice, by two different people, to make sure they are 99.9% accurate.
There is an average of 10 mistakes per page for a good OCR package, and many more mistakes if the quality of the scanner and the OCR package is not great. The book is proofread twice on the computer screen by two different people, who make any corrections necessary. When the original is in poor condition, as with very old books, it is keyed in manually, word by word.
The proofreader can easily compare both versions, note the differences and fix them. OCR is usually 99% accurate, which makes for about 10 corrections a page. The proofreader saves each page as it is completed and can then either stop work or do another.
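The link between "99% accurate" and "about 10 corrections a page" can be checked with a little arithmetic. The sketch below is only an illustration: the page size of 1,000 characters is an assumption chosen to make the numbers line up (the text does not state a page size), and the function name is hypothetical.

```python
def expected_corrections(chars_per_page: int, accuracy: float) -> float:
    """Return the expected number of misrecognized characters on one page."""
    if not 0.0 <= accuracy <= 1.0:
        raise ValueError("accuracy must be between 0 and 1")
    # Every character that is not recognized correctly needs one correction.
    return chars_per_page * (1.0 - accuracy)


# Assumed page size (hypothetical, not taken from the text).
CHARS_PER_PAGE = 1000

# Raw OCR output at 99% accuracy: roughly 10 corrections per page.
print(round(expected_corrections(CHARS_PER_PAGE, 0.99)))

# Text at the 99.9% accuracy aimed for after two rounds of proofreading:
# roughly 1 remaining mistake per page.
print(round(expected_corrections(CHARS_PER_PAGE, 0.999)))
```

Under that assumption, proofreading twice takes a page from about ten mistakes to about one. A denser page would need proportionally more corrections at the same accuracy.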
Digitization is done by scanning the book page after page to get "image" files.