How to make a Project Gutenberg eBook

Piotrek81
Posts: 3582
Joined: November 3rd, 2011, 2:02 pm
Location: Poznań, Poland

Post by Piotrek81 » August 26th, 2012, 11:48 pm

Of course there's a University library here, just as there is a whole network of Raczyński Library branches :wink: I know that for a fact: I'm subscribed to about 10 branches :mrgreen: The problem is that what I would need is not just any book, but an edition old enough to be legally photocopied and scanned and be PG-admissible. I somehow doubt that they would be willing to lend me a 1920 edition for 3 months... :roll: I'd have to work on the spot instead.

NinaBrown
Posts: 438
Joined: December 22nd, 2011, 6:17 pm
Location: Rockville IN

Post by NinaBrown » August 27th, 2012, 12:29 am

Piotrek81 wrote: I somehow doubt that they would be willing to lend me a 1920 edition for 3 months... :roll: I'd have to work on the spot instead.
I see your point, but I suppose it depends how rare the edition is? One way to find out :-)
-nina-

gypsygirl
Posts: 8643
Joined: June 12th, 2006, 6:00 pm
Location: British expat in Waco, TX

Post by gypsygirl » August 27th, 2012, 9:04 am

Piotrek81 wrote: I somehow doubt that they would be willing to lend me a 1920 edition for 3 months... :roll: I'd have to work on the spot instead.
If there are scanners available for use in the library you could get away with borrowing it for just a couple of hours...
Karen S.

TriciaG
LibriVox Admin Team
Posts: 42885
Joined: June 15th, 2008, 10:30 pm
Location: Toronto, ON (but Minnesotan to age 32)

Post by TriciaG » April 1st, 2018, 2:09 pm

Pulling up a long-sleeping thread:

I have a Word (well, LibreOffice, but I can convert it to .doc or .docx) document of a book, Maud and Other Poems by Tennyson.

There might be slight OCR errors - substitutions that are real words ("boot" for "book" as a rough example), and periods that should be commas - that sort of thing. I've cleaned up all the more egregious OCR errors.

I'd like to see this get on PG, but I really don't want to do the final proofreading and conversion to html. Would anyone want to take it over from me? It registers at about 15,000 words.

Oh, and I submitted the copyright clearance about an hour ago.
Away from LibriVox Mon, Oct 7 through Tues, Oct 15.

TriciaG
LibriVox Admin Team
Posts: 42885
Joined: June 15th, 2008, 10:30 pm
Location: Toronto, ON (but Minnesotan to age 32)

Post by TriciaG » April 4th, 2018, 5:04 pm

Never mind. I did the final proofing myself. The book was short enough. :)

I guess you're required to submit the plain text file. I don't like the generated HTML, but short of submitting my own HTML file along with the text file, I guess I didn't have much choice. :?

It's here: http://www.gutenberg.org/ebooks/56913

tovarisch
Posts: 2738
Joined: February 24th, 2013, 7:14 am
Location: New Hampshire, USA

Post by tovarisch » April 4th, 2018, 5:18 pm

Pretty cool.

I'm going to have to contact PG soon about the typos in the book I'm reading now (and how many!).

For some reason there is a strange symbol my Chrome shows in the last line of Maud.I.9 (it's a diamond with a question mark in it, right between the "yes!" and the "-but"):
Peace in her vineyard--yes!�-but a company forges the wine.
I guess Chrome does not know what to display...
tovarisch
  • reality prompts me to scale down my reading, sorry to say
    to PLers: do correct my pronunciation please

TriciaG
LibriVox Admin Team
Posts: 42885
Joined: June 15th, 2008, 10:30 pm
Location: Toronto, ON (but Minnesotan to age 32)

Post by TriciaG » April 4th, 2018, 5:31 pm

It was an en dash, but their checker program should have flagged it for me to replace with a regular dash. (It did flag an en dash in another spot; or was it this one? But I fixed the spot it flagged.)

Does "Agavè" show the accent in yours? If not, you need to use Unicode to show the text file rather than... ASCII?

tovarisch
Posts: 2738
Joined: February 24th, 2013, 7:14 am
Location: New Hampshire, USA

Post by tovarisch » April 5th, 2018, 5:34 am

No, it does not show the accent. There is 'small E with grave' in extended ASCII (code xE8: è), so there should be no need for Unicode.

TriciaG
LibriVox Admin Team
Posts: 42885
Joined: June 15th, 2008, 10:30 pm
Location: Toronto, ON (but Minnesotan to age 32)

Post by TriciaG » April 5th, 2018, 5:51 am

I can't get back into the submission page where one selects what character set one is using, so I can't tell what the options are/were. I do know that I couldn't choose the basic ASCII character set due to the accented characters, but perhaps I didn't have to go so far as Unicode. I don't know much about all that - just enough to be dangerous when given the toys to play with. :lol:

EDIT: The file listing on the PG page says the text file is UTF-8. If that's the encoding, it should have picked up the two accented e characters. That's apparently a mistake with their encoding. :(

The HTML format renders everything correctly.
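
For anyone curious why a text file can show the diamond while the HTML looks fine, here is a minimal Python sketch. It rests on an assumption that nothing in this thread confirms: that the text was saved in a single-byte encoding such as Windows-1252 and then read as UTF-8. The sample line is made up from the two characters discussed above (the en dash and the accented e).

```python
# Hypothetical sample line containing an en dash (U+2013) and an e with grave (U+00E8).
line = "yes!\u2013but Agav\u00e8"

# Assumption: the file was written with a single-byte encoding (Windows-1252 here).
raw = line.encode("cp1252")

# A viewer that expects UTF-8 cannot decode the lone bytes 0x96 and 0xE8,
# so it substitutes U+FFFD, the "diamond with a question mark".
as_utf8 = raw.decode("utf-8", errors="replace")

# Decoding with the encoding that was actually used recovers the original text.
as_cp1252 = raw.decode("cp1252")

print(repr(raw))
print(repr(as_utf8))
print(as_cp1252)
```

If that guess is right, re-saving the text file as real UTF-8 (or telling the browser which encoding to use) should make both characters display properly.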

mightyfelix
Posts: 4137
Joined: August 7th, 2016, 6:39 pm

Post by mightyfelix » April 6th, 2019, 11:06 am

This is an old thread, and maybe not the best place to ask my question. But be that as it may, can anyone tell me how to go about reporting an error to Gutenberg? I'm prereading through Doctor Dolittle's Post Office currently, in preparation for a DR, and there's a spot where they have changed Jip's name to Jim.

TriciaG
LibriVox Admin Team
Posts: 42885
Joined: June 15th, 2008, 10:30 pm
Location: Toronto, ON (but Minnesotan to age 32)

Post by TriciaG » April 6th, 2019, 11:08 am

errata2019 (at) pglaf.org

Errata within eBooks. To report an error within an eBook (such as a missing word), please be sure to include the eBook number or specific filename or download link you used. Error reports are easiest to handle when they clearly indicate the context in which an error was found. Note that since Project Gutenberg includes many old titles, it is common for unusual spelling or arcane word uses to be used. If possible, check a printed source to verify whether an error exists, before reporting it. Messages to the errata list generate an automatic response that your report was received.

Any errata/bug/typo report is welcome! There is additional guidance in the FAQ on how to prepare errata reports so they are easiest for the Project Gutenberg team to handle. Start with FAQ #R.26 on how to report typos.

mightyfelix
Posts: 4137
Joined: August 7th, 2016, 6:39 pm

Post by mightyfelix » April 6th, 2019, 11:22 am

Thank you! If I was on my home computer, it would have been easier to find this info myself, but I appreciate your getting it for me. :)

LikeManyWaters
Posts: 526
Joined: January 15th, 2018, 2:50 pm
Location: AZ

Post by LikeManyWaters » April 10th, 2019, 2:25 pm

Hmmm... I have a handful of errors/typos highlighted in most of the ebooks I have downloaded from Gutenberg. I suppose I should report them, but not sure if I have the time. Most are obvious, and I just highlight and remember to read it the right way when I read them for LV. Only one or two books have ever been VERY error-ridden.

On a related note, I have bought an old book that I couldn't find anywhere online and was wanting to scan it. My plan is to use my phone; I believe you two (Tricia & Devorah) have done that before. Any TIPS would be much appreciated! 8-) PS - The book is in great condition, and has several beautiful illustrations, so I'm hoping not to damage the book in the process.

Not sure whether I will try for getting it on PG or Archive. Archive's submission page kind of intimidated me. But either will be a learning curve.
April
...the sound of His coming was like the sound of many waters... - Ezekiel 43:2

mightyfelix
Posts: 4137
Joined: August 7th, 2016, 6:39 pm

Post by mightyfelix » April 11th, 2019, 10:29 am

LikeManyWaters wrote:
April 10th, 2019, 2:25 pm
On a related note, I have bought an old book that I couldn't find anywhere online and was wanting to scan it. My plan is to use my phone, I believe you two (Tricia & Devorah) have done that before. Any TIPS would be much appreciated! 8-) PS - The book is in great condition, and has several beautiful illustrations, so hoping not to damage the book in the process.

Not sure whether I will try for getting it on PG or Archive. Archive's submission page kind of intimidated me. But either will be a learning curve.
Using my phone was much easier, I think, than it would have been to scan each page. Quicker, too. One thing I found that did help was a cool crop feature in the photo editor I used to process the pages after I took the pictures. This may not be much help, because I don't remember specifics, such as the name of the photo editor, but the feature may be more or less standard. It's like a perspective crop tool, I guess you would say. Rather than cropping the image in a perfect rectangle, it allows you to place the corners on the corners of your pages, which might give you kind of a funny quadrilateral, if your picture wasn't taken perfectly from above. Then it will basically squish and stretch that into a good rectangle, making it all straight and even.
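
Here is a rough idea of what such a crop tool does, as a minimal Python sketch. It is not the photo editor's actual method: it uses plain bilinear interpolation between the four corners, which is a simplification of a true perspective transform. The function name `quad_to_rect` and the nested-list image format are made up for illustration.

```python
def quad_to_rect(pixels, corners, out_w, out_h):
    """Stretch the quadrilateral given by `corners` into an out_w x out_h grid.

    pixels  -- image as a list of rows, indexed pixels[y][x]
    corners -- (top_left, top_right, bottom_right, bottom_left), each an (x, y) pair
    """
    tl, tr, br, bl = corners
    out = []
    for j in range(out_h):
        # v runs from 0 at the top edge to 1 at the bottom edge
        v = j / (out_h - 1) if out_h > 1 else 0.0
        row = []
        for i in range(out_w):
            # u runs from 0 at the left edge to 1 at the right edge
            u = i / (out_w - 1) if out_w > 1 else 0.0
            # Walk along the top and bottom edges of the quadrilateral...
            top_x = tl[0] + u * (tr[0] - tl[0])
            top_y = tl[1] + u * (tr[1] - tl[1])
            bot_x = bl[0] + u * (br[0] - bl[0])
            bot_y = bl[1] + u * (br[1] - bl[1])
            # ...then blend between them to find the source position.
            x = top_x + v * (bot_x - top_x)
            y = top_y + v * (bot_y - top_y)
            # Nearest-pixel lookup; a real tool would interpolate colours too.
            row.append(pixels[round(y)][round(x)])
        out.append(row)
    return out


# Tiny made-up "photo": each pixel just records its own (x, y) position.
img = [[(x, y) for x in range(4)] for y in range(4)]

# Page corners as they might be marked on a slightly skewed photo.
flat = quad_to_rect(img, ((1, 0), (3, 0), (3, 3), (0, 3)), 2, 2)
print(flat)
```

On a real photo the corners would come from wherever you drag the handles, and the output size would match the page proportions.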

For archive, you'll take all those images (single pages, not double-page spreads), put them in order into a single PDF, then upload the PDF. The archive uploader says that you could also put all your images into a zip file and upload that instead, but that didn't work for me. Maybe it wasn't named right, I don't know. But the PDF worked very well for me and wasn't too hard.

I've never submitted a book to Gutenberg. I think they may choose not to take one on, if it's something extremely niche or obscure (since it takes a lot of time and manpower for them to create an ebook), but I really don't know. Archive, on the other hand, will take whatever you have, and then it's all an automated process, as far as I know.

Sorry if that was all more info than you needed. I've only submitted books to archive twice now, and only one of them was something I actually scanned (or rather, photographed) myself. The other one came from a library that had it on microfiche, and I was able to save the pages from the fiche reader onto my USB. But anyway, let me know if there is something I might be able to help you with, and I'll do what I can!

LikeManyWaters
Posts: 526
Joined: January 15th, 2018, 2:50 pm
Location: AZ

Post by LikeManyWaters » May 9th, 2019, 9:52 am

I meant to say THANKS before now, oops! :D

My husband says he will help me set up something like this DIY book scanner. Thought it might be nice to share... so if you have some scrap plexiglass... it doesn’t open the book all the way, so less spine stress.

[Image: DIY book scanner]
