Backing Up My Kindle Ebooks (2018) (sonyaellenmann.com)
214 points by exolymph on July 2, 2019 | 158 comments



Calibre is great, if you configure it properly.

By default, converting books to another format will change the layout! Destroying the original formatting by default is unacceptable behaviour to be honest, so user beware.

For Kindle, I suggest using the KindleUnpack plugin instead, which can losslessly extract the EPUB file embedded in the AZW/KF8 format file for use with EPUB-based readers, without having to worry about Calibre's conversion messing up the formatting.


I had no idea that an EPUB file was embedded in the weird Amazon formats.

I mostly agree that you want to avoid "transcoding" an EPUB. But the formatting on some official EPUBs is so bad that I like to use readers that can override the built-in formatting. I like to choose the font, spacing, hyphenation, and justification myself. (Full justification and hyphenation are my bugbears--I think they are never appropriate on smaller screens, as they lead to large rivers and too many words broken up, but are fine on large ones.)


There isn't an EPUB "embedded" in Kindle formats. The Kindle format is based on the pre-Kindle Mobipocket format (that's why unDRMed Kindle books often have an extension of .mobi). EPUB didn't even exist when this format was created. Making an EPUB (or any other format) from a Kindle book involves converting, which is always going to introduce changes in formatting. People can argue that one program or another does a better job at doing it, but there's no way around it.


That's partly correct. Quote from the KindleUnpack Github[1]:

- MobiPocket and early Kindle version 7 or less ebooks are unpacked to the original html 3.2 and images folder that can then be edited and reprocessed by MobiPocketCreator.

- Kindle Print Replica ebook are unpacked to the original PDF and any associated images.

- Kindle KF8 only ebooks (.azw3) are unpacked into an epub-like structure that may or may not be a fully valid epub depending on if a fully valid epub was originally provided to kindlegen as input. NOTE: The generated epub should be validated using an epub validator and should changes be needed, it should load properly into Sigil and Calibre either of which can be used to edit the result to create a fully valid epub.

- Newer Kindle ebooks which have both KF8 and older versions inside are unpacked into two different parts: the first being the older MobiPocket format ebook parts and the second being an epub-like structure that can be edited using Sigil

[1]: https://github.com/kevinhendricks/KindleUnpack


There's another layer to the format onion, even. The .mobi file, without DRM, is a PalmOS resource collection file https://en.wikipedia.org/wiki/PRC_(Palm_OS) since mobipocket


(I just looked and saw that I didn't finish that sentence: since mobipocket was primarily a PalmOS application at its peak.)
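To make the "format onion" point concrete, here is a minimal Python sketch that peeks at the outer Palm Database container of a DRM-free .mobi file. It assumes the standard 78-byte Palm Database header layout (32-byte name, 4-byte type at offset 60, 4-byte creator at offset 64, record count at offset 76) and the usual `BOOK`/`MOBI` type and creator codes. The function names are made up for illustration and it does not parse the book itself.

```python
import struct

PDB_HEADER_SIZE = 78  # fixed-size Palm Database header at the start of the file


def read_pdb_header(data: bytes) -> dict:
    """Parse the Palm Database header fields that identify the container."""
    if len(data) < PDB_HEADER_SIZE:
        raise ValueError("too short to contain a Palm Database header")
    # The database name is a NUL-padded 32-byte field.
    name = data[0:32].split(b"\x00", 1)[0].decode("latin-1")
    type_code = data[60:64].decode("latin-1")
    creator = data[64:68].decode("latin-1")
    # The record count is a big-endian unsigned 16-bit integer.
    (num_records,) = struct.unpack(">H", data[76:78])
    return {
        "name": name,
        "type": type_code,
        "creator": creator,
        "num_records": num_records,
    }


def looks_like_mobi(data: bytes) -> bool:
    """Return True if the header carries the BOOK/MOBI type and creator codes."""
    header = read_pdb_header(data)
    return header["type"] == "BOOK" and header["creator"] == "MOBI"
```

To try it, read the first 78 bytes of a file opened in binary mode and pass them to `looks_like_mobi`.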


That's basically the case. The Amazon format is basically a proprietary wrapper around epub. You make a Kindle file by starting with an ePub file and running it through the Kindlegen software provided by Amazon. So Calibre's converters are basically reverse-engineering that process. This means that if an ePub version of the book is being sold by Apple or Kobo or some other vendor, that's the file you should start with, though their encryption may be better than Amazon's.


You could unpack the ePubs, edit the CSS and repack it; and then they would work with all readers.
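A rough sketch of that unpack/repack round trip in Python, assuming a DRM-free EPUB. The one detail that matters is that an EPUB is a ZIP archive whose `mimetype` entry has to come first and be stored uncompressed. The function names are hypothetical, and you would edit the CSS files in the unpacked directory between the two calls.

```python
import zipfile
from pathlib import Path


def unpack_epub(epub_path, dest_dir):
    """Extract every file in the EPUB (a ZIP archive) into dest_dir."""
    with zipfile.ZipFile(epub_path) as zf:
        zf.extractall(dest_dir)


def repack_epub(src_dir, epub_path):
    """Zip src_dir back into an EPUB, writing 'mimetype' first and uncompressed.

    epub_path should live outside src_dir, otherwise the archive being
    written would be picked up by the directory walk.
    """
    src = Path(src_dir)
    with zipfile.ZipFile(epub_path, "w") as zf:
        # The mimetype entry must be the first file and must not be compressed.
        zf.write(src / "mimetype", "mimetype", compress_type=zipfile.ZIP_STORED)
        for path in sorted(src.rglob("*")):
            rel = path.relative_to(src).as_posix()
            if path.is_file() and rel != "mimetype":
                zf.write(path, rel, compress_type=zipfile.ZIP_DEFLATED)
```

It is still worth running the result through an EPUB validator afterwards, since this only handles the container and not the contents.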


Here is another approach to extract your books from their horrible proprietary and closed Kindle Cloud Reader. Fortunately someone reverse engineered the SQL based format [1].

[1] https://gist.github.com/yangchenyun/a1c123935d82f5e25d57


How do I start with Calibre and Kindle? I've tried several times, but I cannot understand the UI.

How do I list books on my Kindle? How do I list books in my account? It's all very byzantine.


To put EPUB (or AZW3/MOBI) files on your Kindle:

- Open Calibre

- 'Add' your EPUB file(s) (top left button)

- Connect your Kindle via USB

- Select the book(s) you want to send

- Click 'Send to device' (it's one of the buttons at the top)

- If you get asked to convert your book, choose AZW3 (it's newer than MOBI)

- Done

----------

To view the books on your Kindle:

- Connect your Kindle via USB

- Click 'Device' (at the top)

Note 1: If you want to be able to put your books on a different Kindle, you need the DeDRM plugin [1]

Note 2: If you want to be able to convert your books to EPUB to use with a non-Kindle e-reader, you need the KFX Input plugin [2]

Note 3: KFX is not a great input format. If you don't have moral reservations, it's better to 'obtain' the books you paid for from the internet in EPUB format.

----------

To view the books on your Amazon account:

This is not possible using only Calibre. You need to install Kindle for PC/Mac and get the files from its storage directory. The notes above apply, i.e. it's probably better to just obtain EPUBs from the internet.

----------

In my opinion, it's not really worth it to back up books from your Kindle, you're better off obtaining the EPUBs off the internet. If you want to put those EPUBs on your Kindle, that's where Calibre shines.

This is also great if you don't want to support Amazon and their lock-in scheme. Just buy your books in EPUB from any online retailer and use Calibre to put it on your Kindle. Note that some publishers apply Adobe DRM to EPUBs (which you can break using DeDRM [1]), but there are also many EPUBs using only a watermark (aka 'social DRM'), which is not DRM.

[1] DeDRM: https://apprenticealf.wordpress.com/2012/09/10/drm-removal-t...

[2] KFX Input: https://www.mobileread.com/forums/showthread.php?t=291290


> In my opinion, it's not really worth it to back up books from your Kindle

Use case: I have bought books from other vendors, converted them to epub format, and copied them via Calibre onto my Kindle reader. I made lots of annotations in those books on my Kindle reader. Does it not make sense to back those books up?


Instead of using the Calibre GUI, which is nightmare fuel, I recommend Calibre's command line conversion tool `ebook-convert`: https://manual.calibre-ebook.com/generated/en/ebook-convert....

It's kind of like pandoc, except that pandoc's ebook support is mostly limited to EPUB.
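For anyone who wants to script it, here is a small Python sketch around `ebook-convert`. It relies only on the basic `ebook-convert input_file output_file` form, where the output format is picked from the output file's extension. The helper names are made up, and it assumes calibre's command line tools are on your PATH.

```python
import shutil
import subprocess
from pathlib import Path


def build_convert_command(src, target_ext="epub"):
    """Build the argument list for: ebook-convert <input> <output>."""
    src = Path(src)
    # ebook-convert infers the output format from the output extension.
    dst = src.with_suffix("." + target_ext.lstrip("."))
    return ["ebook-convert", str(src), str(dst)]


def convert(src, target_ext="epub"):
    """Run the conversion and return the path of the converted file."""
    if shutil.which("ebook-convert") is None:
        raise RuntimeError("ebook-convert not found on PATH; install calibre first")
    cmd = build_convert_command(src, target_ext)
    subprocess.run(cmd, check=True)
    return Path(cmd[2])
```

Looping `convert` over a folder of files gives you a batch converter without ever opening the GUI.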


I'm pretty surprised. I've used Calibre's inbuilt EPUB -> AZW3/MOBI converter for hundreds of books to left-align them for my Kindle. The only books where the formatting was messed up were the ones in which the EPUB wasn't formatted properly in the first place.

I'm also interested to know more about why Calibre destructively converts EPUBs instead of making a copy to work on instead. Do you have any sources for how this works?


I've heard people complain about Calibre's EPUB conversion utility, but I've never seen anyone do a deep-dive to explain exactly why it's bad. Is it comparable to the HTML-soup that Microsoft Word spits out when you export a Word doc to HTML?

FWIW, I still heed those warnings, and so I still hold on to the de-DRM'd AZW file even after I convert it to EPUB.


For me, the most obvious mangling from the conversion process is inconsistent paragraphing: sometimes a single paragraph in the original splits into two at an arbitrary place after conversion, and vice versa, which seriously disrupts readability.


What format are you starting with? That doesn't happen if you start from a modern, flowed (kf8) format with proper markup. Calibre tends to preserve all markup. It might add some trivial css fixups in some cases, but usually that's only when tags it thinks are important have no associated css already.

There are two reasons I can think of for why you'd get broken paragraph separation. First, the markup in the kindle version (if it's old style mobi, or really badly generated kf8) could be broken, so when calibre tries to split into multiple files to prevent single (x)html files from being too big, it can't tell where paragraphs are so it might split in the middle of a paragraph.

Second, you might be converting a pdf or azw4 (essentially a pdf) to epub, in which case you'll get paragraph breaks across pages. You shouldn't convert those formats to epub without extensive editing.


It was a long time ago; possibly I was converting from azw4 files.


I don't know why it's bad, but I know the end result is poorly laid out. It's always super obvious I'm reading something passed through calibre conversion, because accent marks will be missing, intra chapter breaks will be missing, punctuation is messed up, etc.


Er, pretty much by definition, when it does a good job you wouldn't know it had been passed through calibre conversion. So how do you count up those "successes" and measure them against the "failures" to determine whether, on a probabilistic kind of basis, it does a good or bad job?


Good point. In my case I've only done about 10 books and I can remember each one: 100% converted with obvious problems.


Converting ebooks doesn't destroy formatting unless you convert to a formatting-limited format like mobi or rtf as an intermediate step.

kf8 and epub are roughly equivalent.


The default settings to remove/add spaces between paragraphs amounts to destroying the original formatting.

KF8 is a proprietary wrapper around a regular EPUB file.


I don't think that's the default. I just reverted to default settings when trying to convert a mobi to epub, and the options to remove/insert paragraphs, linearize tables and several other obvious unwanted modifications are all disabled.


If true, that's excellent news! I've carried around my custom Calibre settings for several years now, so I don't really know what is the default anymore. Back when I started using Calibre, the default settings were not ideal, and most Calibre converted files I've seen in the wild have used the old defaults.


mobi<->epub conversion is rarely problematic in practice. However mobi is not a wrapper around epub as some are claiming. They merely have low impedance mismatch.


Any tips for what settings to change to prevent the destructive behavior you mentioned?


The most destructive behavior is the default Look & Feel settings. If you uncheck "Remove spacing between paragraphs" and the "Insert blank line between paragraphs" settings, Calibre will leave most formatting as-is.

These settings can be useful in cleaning up badly formatted files, but should not be used carte blanche on all input files, especially those that don't have problems. You'll lose niceties like the first paragraph in a chapter not having text-indent, etc.

Generally it's better to use the "Edit book" feature on EPUB files and manually fix up the CSS if the styling is not to your liking. Converting with Calibre also hard-codes @page margins which overrides most e-Ink reader software's margin settings; so if I do end up using Calibre to convert to EPUB, I always have to edit it afterwards to remove these hard-coded margins in the CSS.
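The margin cleanup can be scripted instead of done by hand. Below is a minimal Python sketch that strips simple `@page { ... }` rules from a stylesheet. It assumes the rules are flat, with no nested margin boxes inside them, which is an assumption about typical converted files rather than something guaranteed. The function name is hypothetical.

```python
import re

# Matches a flat @page rule, with an optional selector such as ":first".
# Rules containing nested blocks are deliberately left untouched.
_PAGE_RULE = re.compile(r"@page\b[^{}]*\{[^{}]*\}\s*", re.IGNORECASE)


def strip_page_rules(css: str) -> str:
    """Return the stylesheet text with simple @page rules removed."""
    return _PAGE_RULE.sub("", css)
```

You would run this over each .css file in the unpacked book, then zip it back up.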


Sometimes it's just better to pirate the books even if you've paid for them. The formatting and error correction is usually better and they stick to epub and mobi standards.

As an aside: I was recently helping out a friend who owned a Kindle and was shocked to learn it can't read mobi or epub. Why would anyone buy such a crap e-reader?


It reads mobi, just not ePub.

And I use it because it’s got the best hardware experience of the bunch, IMO.


Do some? I was under the impression you couldn't load .mobi files on to it and they had to be converted to amazon's proprietary format.


I’m not familiar with any models not loading mobi. The only caveat is that there is a sub-format of mobi that’s fixed rather than reflowable (like, a PDF vs. a TXT), and that’s never been supported. But I’ve only come across those once in a blue moon in the wild.

ePubs seem more popular than mobi though, so I think that’s where most of the “converting for kindle” happens.


.mobi predates Kindle, but it was the first format on the original Kindles. Amazon bought the company which made it. Amazon no longer uses the original mobi format, but keeps it for backwards compatibility. Most non-Amazon sources for Kindle books still use mobi.


Kindles are awesome e-readers. They can read .mobi files, the battery time is in weeks and the e-ink display is very easy on the eyes.


Yes but they don't support epub, which is frustrating (at least mine doesn't). Easy problem to fix but it seems the majority of ebooks online follow this format.


Installing KOreader after a jailbreak lets you read whatever you want. (And connect to OPDS servers)


What you described is equally true of Kobo, Nook, and pretty much every e-ink based reader. And they all support more formats than the Kindle does.


> Why would anyone buy such a crap e-reader?

The point, and imho the only point, of Kindle is the tight integration with the Amazon store. It is very much a dedicated single-purpose device for reading Amazon content. I have one and it's great; I browse the Amazon store on my computer, select a book, and with one click buy the one I want and have it automatically delivered to my Kindle. No need to think about files and formats or copying and converting stuff around.


I’ve had both a Kindle and Kobo. The latter seemed like the better device because it felt less-tethered to bloatware I don't care about that the Kobo has (Pocket, gamification, etc.) and has decent GoodReads integration.

Kindle does support mobi, but you have to convert epub files (this can't be done through your Kindle email address' convert ability for some reason).


> Kindle does support mobi, but you have to convert epub files (this can't be done through your Kindle email address' convert ability for some reason).

I think the reason is they don't want to make it too easy to buy ebooks elsewhere and read them on the Kindle. Having to use something like Calibre to convert epub to mobi before sending it to a Kindle is just enough hassle and requires enough comfort with technology to make most stick with the easier option of buying ebooks from Amazon.


Err, I might not be reading this right, but don’t you mean the former?


AZW3 is basically EPUB, you just need to run your EPUB files through Calibre. (Not that this excuses the lack of EPUB support.)


My kindle can read .mobi files fine.


While I try to buy non-DRM books when I can, I'm still willing to buy DRM'd books as long as the DRM is pretty breakable (and I back them up immediately most of the time). If the books on Amazon, Google, B&N, etc. were to suddenly have unbroken DRM, I'd stop buying DRM'd ebooks until that issue was "fixed". Every ebook DRM format (except maybe Apple) is already trivially removable, so I wish they'd drop the farce and go DRM free already.


> Every ebook DRM format (except maybe Apple) is already trivially removable

AFAIK, Amazon's latest DRM on KFX files hasn't been broken yet. That came out in 2015. Some people get around it by getting the book in the older format, but then you lose all the advanced typography stuff that KFX supports.


Removing DRM from KFX works since March [1].

[1] https://apprenticealf.wordpress.com/2019/03/30/dedrm-tools-6...


I think Amazon patched their PC client and e-reader firmware to fix the hole. I don't think the DeDRM works on a KFX you download today.


Breakable DRM is there for legal reasons. IANAL but you can make copies of copyrighted materials for private use, but you are not allowed to circumvent copy protection mechanisms.


I believe it is legal to break DRM for purposes of backup / personal use on unsupported devices / etc in the United States, but I'm also not a lawyer.


It is not legal. Making backup copies for yourself works as a fair use defense of copyright infringement but DRM breaking is a separate matter; that's why that part of the DMCA is reviled, it prevents people from doing what copyright otherwise permits them to do.

There is an exemption for breaking DRM on ebooks to enable their use with assistive technologies used by people with disabilities. There are some audiovisual exemptions for educational use that encompass "multimedia ebooks" but only for specific purposes.

https://www.federalregister.gov/documents/2018/10/26/2018-23...


that's the sort of elitist behaviour that plagues digital world.

dvd locked by region? don't worry, more expensive models can be bypassed by those two button presses, only the poor is really impacted.

tracking users and OS security bugs, don't worry, the latest $600+ devices will get a patch. only poor, used device buyers and developing countries models will be affected by not receiving a patch.

the modern world now is divided into the elite and the not-economically-viable-to-port-security-patches. And every time we, the elite, dismiss something as a nuisance because we are the elite, we are making things irreparably worse.

my suggestion: do not give in to drm because you can work around it. instead pay the extra cents and get a used hardcopy. the convenience of an ebook is lost if you have to work around drm anyway. and that way their drm sales drop to zero and they have to rethink this whole fallacy.

The number of people accepting DRM content is used as a sales pitch by Pearson printing to force universities to use DRM text books that can't be resold! again, screw the poor, right?

PS: sorry for the rant. DRM makes me salty.


I've backed up my eBooks from time to time & cracked some books over the years that I had purchased & wanted to put on another device but a lot of the time I don't worry about it.

I got my first kindle in 2010 and bounced over to Barnes & Noble for a while with a Nook but then came back to the Amazon ecosystem. I've bought hundreds of books over the years.

But just like paper books there are a lot of books that I read once and never look at again. They aren't priceless treasured objects for me.

The thing with Amazon.. people complain a lot, but their ecosystem just works the best in the eBook space. I actually find it a lot less annoying than some of the shenanigans with other digital stores like the iTunes store over the years. I haven't tried Kobo but B&N was always a hot mess compared to Amazon. Tons of annoyances in the store & device software that Amazon had right from day 1. And you were paying more for a worse experience.


> their ecosystem just works the best in the eBook space

I just want ebooks. I don't want an 'ecosystem in the eBook space'. If I had a similar paper book ecosystem 'in the paper book space' I'd have some giant foreign corporation insist that before I was permitted to read I must buy a bookshelf from them. Then all their books would have to be on that shelf. Then if there's a book they don't sell, or for which I choose another retailer, I'd need to buy a different shelf. Then every time I want to read a damned book I have to first know which corporation's mandated shelf it's on.

The whole DRM-ebook business stinks. I do buy the odd one from Amazon if I can't get it anywhere else, but the DRM goes immediately on download.


Fair, but just like I've never needed my photo backups, when I do need them (if I ever do), I'll be very, very, very glad I put in the ~3 hours of work or so a month it takes to ensure my photos are all backed up.


Doesn't Amazon let you re-download your books on your kindle?

That's a little different than photos in that once the photo is gone, you don't have the opportunity to capture it again.


Some books with Amazon's DRM have a limit on the number of devices it can be used with. Sometimes the limit is very low, such as only 1 or 2 devices. This makes it difficult to re-download some books.

I've run into this, especially with science or reference books. And it was impossible to fix on my own because the original download was to an old Kindle that's now sitting in a box somewhere in the garage, or to the Kindle app on an old phone that's no longer in my possession.

I suppose that could be fixed with a call to Amazon's customer service who could probably de-register those old devices in their system. But who wants to deal with that? Every book I've ever bought from Amazon has been cracked and backed up.

I had been willing to put up with the DRM. Didn't want to go to the hassle of re-downloading and processing hundreds of books. But when I ran into the "This book has reached its license limit and cannot be downloaded", that was the last straw.


Assuming Amazon doesn't just decide to delete it [1], sure.

[1] https://www.nytimes.com/2009/07/18/technology/companies/18am...


Seems that those books were added to the store by a company that didn't have the rights. I'm not sure that's a widespread problem.


I'm not sure the distinction is relevant. The fact is, Amazon can and will remove your books at any time and there is nothing you can do about it.


Right, but in setting a pattern of behavior that they will remove books whenever they want, Amazon only seems to do it when there are bad actors involved.


My concern with this thought is that eventually any company will shut down or stop supporting whatever service I’m using.


How do you back up your photos?


By copying them to harddisks, and burning the important ones to blurays.


> But just like paper books there are a lot of books that I read once and never look at again. They aren't priceless treasured objects for me.

Traditionally your books could have been donated, or given to friends and relatives. DRM prevents those things. It really sounds like you might be better off using a library to borrow ebooks than to buy them. You're already basically just paying for a rental.


I have a Kobo and haven't had any complaints (though the books are often not as cheap as on Amazon).

P.s. I wasn't sure if you were saying this, but I don't think Kobo has anything to do with B&N.


I was just mentioning that I haven't really given Kobo a shot... so I could be in the dark and Kobo is better.

But I hear enough through the woodwork that I have been under the impression using Kobo would be trading off annoyances with some things better & some worse. Mostly what I hear is Kobo often has some hardware advantages but has some different software annoyances.

What format the book uses is a very minor concern to most of us who just read the books. Epub vs Mobi vs KF8 or whatever has no impact on me 99% of the time. The only time it'd really matter is hopping around between platforms and having to convert stuff.

If Kobo started having a huge advantage I'd be more likely to buy a Kobo and keep my Amazon library with Amazon and start buying Kobo books there. I wouldn't go on a big project to convert all my Amazon bought books to a Kobo compatible format. Cause once done I wouldn't actually need/want to re-read all those books on the Kobo.


To me, Kobo has a major advantage with the “warm toned” backlight on the Clara HD. I switched from kindle just for that.

On kobo, the dedrm process involves installing the kobo desktop software (windows/Mac only), which syncs the books to your desktop. From there you can snatch the ebook from the file manager and put it in calibre. There’s an “obok” dedrm plug-in.

If you switch from kindle to anything else, you will lose the ability to download your kindle purchases. You need an active kindle account/device to download ebooks from the kindle library (this wasn’t always the case).


Obok can grab the books straight from your Kobo using the Calibre plugin. No need for the desktop software.


Some of the DRM failures are real: kindle DRM updates and some forms don't strip in calibre IIRC. But that aside, this is all pretty much what I do.

I feel uncomfortable about the legalities but I do not like the social contract being written in this way, against any consumer interest and without discussion.

Buying books is important. Authors deserve their pay. DRM is like book cancer.


Don't forget to back up your marks and highlights also. I made that mistake once, and then lost my Kindle. Lost years' worth of highlights and notes, so now I have a bash script set up to automatically sync them to a Git repo when my Kindle is plugged into my laptop.
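A rough Python sketch of the same idea, not the bash script mentioned above. It assumes the Kindle mounts as a USB drive and keeps its highlights in `documents/My Clippings.txt`, and that the target directory is already a Git repository. Both function names are made up for illustration, and detecting that the Kindle was plugged in is left to whatever your OS provides.

```python
import filecmp
import shutil
import subprocess
from pathlib import Path

# Assumed location of the highlights file, relative to the Kindle's mount point.
CLIPPINGS_RELATIVE = Path("documents") / "My Clippings.txt"


def copy_clippings(kindle_root, repo_dir) -> bool:
    """Copy the clippings file into repo_dir; return True if anything changed."""
    src = Path(kindle_root) / CLIPPINGS_RELATIVE
    dst = Path(repo_dir) / "My Clippings.txt"
    if not src.is_file():
        raise FileNotFoundError(f"no clippings file at {src}")
    # Skip the copy when the backup is already byte-for-byte identical.
    if dst.exists() and filecmp.cmp(src, dst, shallow=False):
        return False
    shutil.copy2(src, dst)
    return True


def commit_clippings(repo_dir, message="Update Kindle clippings"):
    """Stage and commit the clippings file in an existing Git repository."""
    subprocess.run(["git", "-C", str(repo_dir), "add", "My Clippings.txt"], check=True)
    subprocess.run(["git", "-C", str(repo_dir), "commit", "-m", message], check=True)
```

Call `commit_clippings` only when `copy_clippings` returns True, so you don't trigger a failing empty commit.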


Your bookmarks and highlights are synced to the cloud. You should be able to read them online on: http://read.amazon.com/notebook

From the kindle you can also export all the highlights of a book to your email. It's a lot better to read it on your email. The kindle device reader note reading experience is subpar.


Yes, but only for the books you purchase through Amazon. If you import DRM free books (or presumably re-import DRM free versions of Amazon purchases) or personal docs, the notes on those won't be saved.


And that’s a big use case. I use Kindle for most of my reading and import many other documents (papers, DRM free books I bought and such).


Tangential, but is there a way to scrape this page or an API to access it? I'd like to be able to automatically pull my highlights and haven't found an easy way.


Look into readwise. There's no API access to highlights, so they scrape it from an open tab.


This sounds like something that many could benefit from!

Would you mind sharing your script?


Sure! I barely know what I'm doing, but this worked for me: https://github.com/wneuheisel/Kindle-Notes-Backup

Suggestions welcome!


Thank you so much for sharing! You’re even using notify-send for notifications^^

I’ll test it as soon as I can and will get back to you with feedback if I make changes. From quickly glancing over the code, I do like it!

This closes my last Kindle lock-in. I’m very grateful _/\_


Huh. Are there scripts you recommend for those of us without physical Kindles who just use our phones/tablets?


If you add your Goodreads account to the Kindle, it will store them there. :)


Does that work for imported documents also, or only Amazon-purchased books?


By coincidence I was doing this last night because I got a new eReader (non-Kindle). I found the most reliable method was to go to the "Manage Content and Devices" page on my Amazon account and manually download each book. This method downloads a single file for each book to your machine, and it's in a format Calibre and DeDRM can work with. It took a while even though I only had ~80 books, but it was worth the effort.


I went to that page but there is no way to download the books. What's the name of the link/button to use?


That page should have a list of books you own. On the left side of each row for each book, there should be a light gray button that looks like: [...] It's under the "Actions" column, second to the left. Click that and there should be a popup that contains a link "Download & transfer via USB"

I believe this link will only be available if your account has a real kindle (e.g. eink or fire) registered to it. It encrypts the downloaded file for whichever registered kindle you select from a list of all your registered kindles and apps, with the apps grayed out so you can only select actual kindles.


The easiest ebook format to strip DRM from is the Adobe DRM that various stores use. I only buy ebooks from the Google Play book store now, since DRM removal is so straightforward. (Same deDRM calibre plugin, but you just get a single encrypted ePub you can download, and adding it to calibre gives you a DRM-free ePub, which is a more industry-standard format. Converting to or from KFX or AZW3 can be error-prone.)


I mentioned this in the post :)


What are folks using for Audible? Open Audible? It wasn't pulling my library correctly last time I tried.


1. Download your own AAX files (using the Audible download manager, or search around for ways to do it without the download manager).

2. Find out your 'activation keys' using ffprobe and the inAudible-NG project; see the instructions here: https://github.com/inAudible-NG/tables/blob/master/README.md. This needs to be done only once for your Audible account.

3. Use ffmpeg with -activation_bytes to convert AAX to MP3 or other formats, either directly or with a nice script like this one that'll divide the book into chapters: https://github.com/KrumpetPirate/AAXtoMP3

OpenAudible looks like a nice tool that'll automate all of these steps for you.


Why is transcoding necessary even if the AAC option is used? (I was under the impression that audible used aac under the hood.) Is this avoidable?


I'm not sure. Possibly if you try ffmpeg's -c:a copy and such, it'll simply extract the AAC. Let me know if it works for you. I've always wanted mp3 in the past so I didn't mind the transcoding.


I recently used AAXtoMP3 to back up my Audible library. It worked quite well.


There are a variety of DRM Free audiobook providers:

https://www.downpour.com https://libro.fm

Most titles on audiobooksnow.com are DRM free as well, but not quite all, so make sure there's no "DRM Protected" symbol.

I bounce between these providers (subscribing and cancelling to their various plans as needed) based on book availability and price. I believe I pay slightly less overall than if I used Audible for everything, even if acquiring books takes more work.

In a pinch, you can also remove Audible DRM with various tools (see sibling comment), but we should support proper DRM Free providers wherever possible!


This Q&A is relevant: https://askubuntu.com/questions/16918/how-to-listen-to-audib...

The second answer (which is the top voted one) is my answer. It's basically using ffmpeg with -codec copy.
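For reference, a small Python sketch that builds the ffmpeg command line for both cases discussed in this subthread: copying the audio stream as-is, or transcoding to MP3. It only assembles the argument list and does not run anything. The function name is made up, the activation bytes shown are a placeholder rather than a real key, and the MP3 path assumes your ffmpeg build includes libmp3lame.

```python
import re


def build_aax_command(aax_path, out_path, activation_bytes, transcode=False):
    """Build an ffmpeg argument list for decrypting an Audible AAX file."""
    # Activation bytes are passed as 8 hex digits.
    if not re.fullmatch(r"[0-9a-fA-F]{8}", activation_bytes):
        raise ValueError("activation bytes should be 8 hex digits")
    # -activation_bytes is an input option, so it goes before -i.
    # -vn drops the embedded cover art, which ffmpeg sees as a video stream.
    cmd = ["ffmpeg", "-activation_bytes", activation_bytes, "-i", aax_path, "-vn"]
    if transcode:
        cmd += ["-c:a", "libmp3lame"]  # re-encode the audio to MP3
    else:
        cmd += ["-c:a", "copy"]  # keep the original audio stream untouched
    cmd.append(out_path)
    return cmd
```

For example, `build_aax_command("book.aax", "book.m4b", "deadbeef")` gives the no-transcode variant; hand the list to `subprocess.run` to actually execute it.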


I finally figured out how to get this to work. For Mac OS, you have to install an old version of the Kindle App. Version 1.23.1 appears to be the last version that downloads AZW3 formatted ebooks that can then be converted to EPUB easily with the DeDRM plugin. All later versions appear to be broken. Also, the Kindle app will silently update to the latest version unless you turn off the auto-update option. I converted all of my paid books to epub and am not buying any more through Amazon. This post inspired me to do this before it is too late and the old version is no longer compatible with future OS versions.


In April, Microsoft announced customers who bought books through its ebook store would lose access starting in July.

https://www.bbc.com/news/technology-47810367


There are other reasons for doing this. Even if the Kindle store functions forever and Amazon never deletes books remotely, there is still a possibility that your account could be breached, for example by social engineering. And if a malicious person can access your account, they can delete everything there, forever. If someone steals your Steam account, you can most likely restore it completely. If your Amazon account is stolen, goodbye everything. And support won't restore or refund anything. (Support itself is good.)


It's a neat exercise but one I gave up on years ago. Now I avoid the hassle by buying a book and then pirating a copy for my archive.


this only works with purchased books. if you sent an epub to your kindle, it will not show up in the devices section of the amazon website.


But then you must already have the file as an epub... so what are you trying to accomplish?


I have this great idea for a business: "ewhiteout" for ebooks.

Basically you can selectively "patch" ebooks to censor profanity, sex scenes, violence, (and more) depending on your tastes and on what you like (and don't like) to consume.

However, this is extremely unpopular with authors/publishers for the same reason that censoring movies is extremely unpopular with Hollywood (hence why companies like VidAngel are constantly mired in legal issues despite widespread popularity with consumers).

Honestly, I don't understand why since at the end of the day everyone gets paid, and indeed get paid more than they normally would have (I guess it's more of a "you are altering my art so people can consume it in ways I didn't intend" attitude). But in my mind it's no different from using whiteout on a physical book or picking out the mushrooms (or whatever you don't like) in some chef's dish.

Basically, I've already built the tech to do this for myself privately (and have patch files for ASoIaF and a dozen other epubs), but I will never share it (I've even considered FOSS) except with close family and friends even though I know it would be wildly popular in places like Utah because I fear legal pursuit from authors/publishers (and since you must break DRM in order for the tech to work, I don't see a way out).


> Honestly, I don't understand why since at the end of the day everyone gets paid, and indeed get paid more than they normally would have (I guess it's more of a "you are altering my art so people can consume it in ways I didn't intend" attitude).

The problem with "you are altering my art so people can consume it in ways I didn't intend" can be a purely economic one, as it's misrepresenting the "brand" that is the author. If you remove the "vulgar" things, then you are changing the product in a non-controlled way. It's why some directors or writers remove their names after producers or other external controllers get a hold of a work. It no longer represents their creative abilities. They could put out two versions, but I think the opinion of most authors is that if the sex scene could be removed, it didn't need to be there, and it wouldn't have been there in the first place.

It's removing an aspect of the brand.


The way I see it: it's not me or my company altering the brand, it's the consumers themselves altering the brand so that they can enjoy it more fully.

How is a consumer altering a movie experience or book experience any different from a consumer altering a:

- music experience (remix, etc.)

- food experience (customizing a chef's dish)

- video game experience (with game genie, mods, etc.)

- picture experience (with photoshop, memes, etc.)

?

People create derivative works all the time of all media types to suit their tastes, dreams, and whims - I don't see why that should be illegal as long as they aren't trying to resell it or create a competing brand.


Why not just consume media you don't find offensive in the first place? Would you feel differently if I wanted to _insert_ sex scenes in to movies that didn't have them?


> Why not just consume media you don't find offensive in the first place?

FOMO, mainly? You feel left out when all of your coworkers are talking about GoT every day... if only there were a way to consume it in a way that met your standards and you could participate in the discussions too...

> Would you feel differently if I wanted to _insert_ sex scenes in to movies that didn't have them?

Not at all! If that's what you want to do, you should be able to do it on your own copy! Or if you wanted to replace the main character with Thomas the Tank Engine, whatever floats your boat, that's great.


"hence why companies like VidAngel are constantly mired in legal issues despite widespread popularity with consumers"

Widespread popularity with an incredibly small number of consumers who are generally very religiously oriented. And as seems to be common in the highly-religious groups, VidAngel commits copyright and trademark violations with the justification of "it's for families!" Just like a thousand religious summer camps who make "Reese's Jesus" shirts and shirts with "Jesus Christ" written like the Coca Cola logo because "it's for Jesus!"

The vast majority of people don't feel the need to censor the content they watch because they either understand what's in the content, or simply avoid tainted content altogether. If you're censoring things like ASoIaF, then maybe ASoIaF isn't for you, considering how important incest is to the plot.


> The vast majority of people don't feel the need to censor the content they watch because they either understand what's in the content, or simply avoid tainted content altogether

Right, but said people number in the millions, and would consume said content if only there were a way to selectively censor what they find objectionable.

> If you're censoring things like ASoIaF, then maybe ASoIaF isn't for you, considering how important incest is to the plot.

Speaking from experience, me and my family/friends agree that ASoIaF is a fantastic series, even when completely censored to remove all instances of explicit violence, sex, and profanity.

The only thing you really lose is explicit detail when it comes to specific sex scenes or violent acts - and that constitutes less than 1% of the total prose. Everything else (intrigue, politics, etc. - the 99%) is still there.


>Right, but said people number in the millions, and would consume said content if only there were a way to selectively censor what they find objectionable.

Perhaps I'm not understanding, but if you're whiting it out, you're reading it anyway, right? So, what's the benefit?

If you get to a sex scene, can you just skip it or something?


Have you ever used VidAngel? Basically it gives you a tree of content you can selectively filter before you even watch the show. You can say "filter all sex", or you can drill down into the tree and only filter "naked women" or "naked men" or even drill down and filter a specific scene if you know about it.

When you hit the timestamp of a filter, it just skips over the scene.

Books would be the same, but with the added advantage that you can easily alter the text to make it feel more natural. So instead of an explicit sex scene, you can end the chapter with a new transition where the characters enter the bedroom and close the door and it is simply implied what they did in between chapters.


> I know it would be wildly popular in places like Utah because I fear legal pursuit from authors/publishers (and since you must break DRM in order for the tech to work, I don't see a way out).

FWIW, congress wrote an exemption to the Copyright Act when a company wanted to do this for movies. https://en.m.wikipedia.org/wiki/Family_Entertainment_and_Cop...

So if your product was actually popular enough in Utah that might happen again. Of course that was 13 years ago.


I can understand this perspective but even voluntary censorship seems so contrary to some of the aspects of literature that make it compelling, chief among those its capacity to help expose yourself to different modes of thought.

Ironically GRRM specifically mentioned that he doesn't write "comfort fiction".


Think of voluntary censorship in terms of other things like voluntarily abstaining from alcohol, drugs, or even meat.

I avoid things that I believe are bad for my body, and I avoid things that I believe are bad for my mind. You might argue that it's impossible to consume something "bad for your mind", but I disagree. I think there are certain forms of entertainment (written, visual, and interactive) that absolutely can desensitize you, distract you, addict you, influence you (negatively), etc. - I try to avoid those effects where possible, and self-censorship is just one tool to combat them.


For what it's worth I have really appreciated this thread. I 100% agree with the comments that you have made and try and do the same. Unfortunately I doubt I will ever make the time to create a project like yours and just ignore certain media instead.


Thanks! If it makes you feel better, you can already use Calibre to do basic ebook editing (bulk find/replace profanity, etc.) - your ebook just needs to be DRM-free.

All my project allows me to do is export a set of changes on a book to a JSON-based metadata format, so the changes can be shared. I'm afraid to monetize it because I don't think I could deliver a good UX without breaking DRM (which would get me into legal hot water), but perhaps making it into a context-free Calibre plugin might be safe...
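To make the "export a set of changes" idea concrete, here is a minimal sketch of how such an export could work. It is not the commenter's actual project or format: the function name `export_changes` and the record fields `start`, `length`, and `replacement` are hypothetical, and the book is assumed to be already available as a list of paragraph strings. The records contain only positions and the new text, not the original prose.

```python
import difflib
import json


def export_changes(original, edited):
    """Return change records that turn `original` into `edited`.

    Both arguments are lists of paragraph strings. The record layout
    (start / length / replacement) is a made-up format for illustration.
    """
    matcher = difflib.SequenceMatcher(a=original, b=edited, autojunk=False)
    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue  # unchanged paragraphs are not recorded
        changes.append({
            "start": i1,                   # first affected paragraph (0-based)
            "length": i2 - i1,             # number of original paragraphs affected
            "replacement": edited[j1:j2],  # empty list means plain removal
        })
    return changes


original = [
    "The army marched at dawn.",
    "A long and gory description of the battle.",
    "By nightfall the city had fallen.",
]
edited = [
    "The army marched at dawn.",
    "The battle was brief.",
    "By nightfall the city had fallen.",
]

patch = export_changes(original, edited)
print(json.dumps(patch, indent=2))
```

Anyone holding their own copy of the same edition could then apply the JSON to it; the paragraph indices only line up if both copies are split into paragraphs the same way.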


Alright, thanks! One more thing to add to the future tech list.


I don't really see religious/conservative populations going out of their way to abridge a book that includes scenes that they need "whited out". There are plenty of other books to read, and they'd probably prefer not to support works they disapprove of, anyway. That's even assuming that the books make sense with the sex and violence torn out; if you were to read The Girl with the Dragon Tattoo with everything a puritan found objectionable torn out of it, there wouldn't be a very sensible story left.


> if you were to read The Girl with the Dragon Tattoo with everything a puritan found objectionable torn out of it, there wouldn't be a very sensible story left.

I think you misunderstand what religious people really want. Most people with "puritanical" objections to media content are just wanting it to be less explicit (i.e. left to the imagination) - not completely whitewashed from the story. After all, most religious texts themselves are rife with sex and murder; the acts themselves are usually not explicitly described in great detail, however.

"Lysanderoth's sword tore into Archibald's neck, spraying blood and chunks of windpipe across the battlefield and leaving him loudly gurgling as he slumped to the ground"

vs.

"Lysanderoth slew Archibald with the sword"

You can easily retain the story while eliminating the explicit aspect of the prose.

For example, I edited the entirety of the "A Song of Ice and Fire" series, and you may be interested to learn that eliminating every instance of explicit violence, sex, and profanity altered slightly less than 1% of the total word count of each book.


I completely understand why an author would reject anybody attempting to change the prose as you have suggested here.

Whiting out in a book, or in your own ebook copy, is one thing. Sharing, then, those edits would be like releasing a fan-cut of a movie. Doing so wholesale (the entire movie) isn't ok - sharing timestamp swaps that you can plug into VLC is a different story.


> sharing timestamp swaps that you can plug into VLC is a different story.

That is precisely what I am suggesting sharing - page numbers, paragraph numbers, length, and other metadata needed to filter the book (essentially "timestamps" for books). This metadata would tell the software where to excise sections from the ebook. People would need to own their own copy of the book, and the ewhiteout software just alters it according to the metadata input.
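As a rough illustration of "timestamps for books", the sketch below applies that kind of metadata to a reader's own copy. Everything here is an assumption made for the example: the book is a list of paragraph strings, each filter record has hypothetical `start`, `length`, and `replacement` fields, and filters do not overlap.

```python
def apply_filters(paragraphs, filters):
    """Return a filtered copy of `paragraphs`.

    Each filter is a dict with `start` (0-based paragraph index),
    `length` (number of paragraphs to drop) and `replacement`
    (a possibly empty list of paragraphs to put in their place).
    Filters are assumed not to overlap.
    """
    result = list(paragraphs)
    # Work from the end of the book so earlier indices stay valid.
    for f in sorted(filters, key=lambda f: f["start"], reverse=True):
        start = f["start"]
        end = start + f["length"]
        result[start:end] = f.get("replacement", [])
    return result


book = [
    "They reached the inn.",
    "Explicit scene, part one.",
    "Explicit scene, part two.",
    "Morning came.",
]
filters = [
    {"start": 1, "length": 2, "replacement": ["They closed the door."]},
]

filtered = apply_filters(book, filters)
print(filtered)
```

The metadata carries no text from the original book, only positions and the substitute passage, which is the property the comment above relies on.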


Interesting, I'm so far outside the target market for this I was very skeptical initially. But as you point out, I'm sure there are many communities that would make heavy use of this.


I knew someone that used the vidangel service, but I don't really get it. If you remove the cursing/sex/violence from much of the works that use them, does it not remove context that is necessary? You mention ASoIaF, which is littered with all three right? How can you consume a "patched" version and still retain the themes and messages of these works?

If you don't want to be exposed to these things, why not just only consume content that does not have them in the first place?


> How can you consume a "patched" version and still retain the themes and messages of these works?

From context! You don't need to explicitly see two characters having sex or read an erotic paragraph about them having sex to figure out they have an incestuous relationship.

And most, if not all religious folks just want the sex and violence to be less explicit (left to the imagination) not completely whitewashed from the story. Otherwise you wouldn't even be able to read the Bible/Quran/Book of Mormon/etc. without encountering sex and violence.


> Honestly, I don't understand why

I can't believe you don't see the issue?


No, please inform me (after reading my other comments).

It seems like "the issue" boils down to "it offends the author"


DRM isn't in and of itself the issue; copyright is. I have the copyright to my novel Kismet, for instance, and my publisher has the publishing rights in both print and ebook. You don't get to produce a "derivative work" from Kismet without my explicit approval and, because the contract with my publisher grants them exclusive rights to the novel in print and ebook form, a negotiated exception.

I suspect your argument is that you're only distributing the "patch files," and so that's not intrinsically a copyright violation. (This is distinct from VidAngel, which admitted that they did, in fact, have to have copyright-violating master copies of movies they were "patching" in order to stream the bowdlerized versions.) And...maybe? Yes, in that case, DRM would become more of an issue. But I'm not convinced that a publisher with sufficiently deep pockets who wanted to press the case couldn't successfully argue that the combination of your patch file plus the original work creates an unauthorized "derivative work."

Well, let me rephrase that: that combination absolutely creates an unauthorized derivative work. It's not clear to me that it's illegal, but the ethics strike me as dubious. For instance, using my own book again as an example: the book wouldn't be materially affected if the main character's penchant for swearing when she was stressed got toned down. But it would be materially affected if someone came through and edited out the parts that made it clear that she was pansexual, or that changed the non-binary character to no longer be so, or that changed a gay character to not mention his orientation at all, or that quieted the vocal politics of either the protagonist's socialist sister or her libertarian friend. And that's the real issue you'd run into: it's not just the potential for "you took out the R-rated parts, you prude," it's the potential for "you've made my text mean something it wasn't intended to."

If you made a firm commitment to work with authors and publishers and only release approved patch files, that would probably make this possible (again, setting aside the DRM for a moment). But without it...well, with respect, I'd definitely rather not see that.


> This is distinct from VidAngel, which admitted that they did, in fact, have to have copyright-violating master copies of movies they were "patching" in order to stream the bowdlerized versions.

Yeah but they purchased one DVD per streamer, so each streamer was "virtually" modifying their copy. Otherwise the FMA is incompatible with modern video technology such as streaming. The real reason VidAngel got in trouble was because studios do not want users to have the ability to remotely buy/sell/stream DVDs - even if they technically owned them at the time of streaming. To do so would circumvent their lucrative streaming licensing.

> You don't get to produce a "derivative work" from Kismet without my explicit approval

Agreed, but that depends on the definition of a "derivative work".

> Well, let me rephrase that: that combination absolutely creates an unauthorized derivative work

Not necessarily. Does using a game genie on a Super Nintendo game create a "derivative work" if you use it to completely modify the game at run time? The courts ruled that it didn't since it didn't create a new copy[1]. Hence if I could modify your book at run time such that I didn't create a new copy, it could be considered the same. This is sometimes known as the kaleidoscope effect (i.e. looking at a proprietary painting through the lens of a kaleidoscope doesn't create a new painting and thus doesn't create a derivative work).

Or, even more generally, what's to stop me from building two physically overlaid screens - one that displays your original work and one that overlays alterations? That doesn't create a derivative work anymore than drawing on my television screen with a sharpie while a movie is playing creates a derivative work.

Again, I understand your desire for users to consume your work as you intend. But extend that desire too far and you turn into a strong DRM proponent. I'm a DRM opponent, and I respectfully assert my right to modify my copy of your work as I please - and I invite you to do the same for my works - be they software, novels, or whatever.

[1] https://en.wikipedia.org/wiki/Game_Genie#Legal_issues


I find that a number of Kindle books I've purchased at the beginning say:

"The author and publisher have provided this e-book to you without Digital Rights Management software (DRM) applied so that you can enjoy reading it on your personal devices"...

So I assume either they don't have kindle DRM, or it is ethical (possibly legally allowed) to remove the kindle DRM.


Publishers (including self-publishers) can choose whether to apply DRM to books on Amazon as part of the uploading process, and not all publishers do so. Tor and Baen, for instance, don't use DRM on any of their titles.


I did a review of the varying e-readers 9 or 10 years ago and decided on the Nook based on standards support. I still use a Nook and never had any problems like this.


How legal is stripping the DRM like this? Can one get in trouble for this?


In the US, it is probably[^1] illegal under the DMCA; however, exemptions to that law are determined every three years by the Librarian of Congress.

https://www.law.cornell.edu/uscode/text/17/1201

https://www.eff.org/deeplinks/2015/10/victory-users-libraria...

> The exemptions we requested—ripping DVDs and Blurays for making fair use remixes and analysis; preserving video games and running multiplayer servers after publishers have abandoned them; jailbreaking cell phones, tablets, and other portable computing devices to run third party software; and security research and modification and repairs on cars—have each been accepted, subject to some important caveats.

[^1]: I am not a lawyer.


> Use of Kindle Content. [...]. Kindle Content is licensed, not sold, to you by the Content Provider. The Content Provider may include additional terms for use within its Kindle Content. Those terms will also apply, but this Agreement will govern in the event of a conflict. Some Kindle Content, such as interactive or highly formatted content, may not be available to you on all Kindle Applications.

> [...]

> Termination. Your rights under this Agreement will automatically terminate if you fail to comply with any term of this Agreement. [...].

https://www.amazon.com/gp/help/customer/display.html/ref=hp_...

So you may get in trouble if you're caught doing stupid things, yes. Will they bother though? Unlikely.


physical books are so much better for this reason and generally cheaper


They are not “generally cheaper” (ebooks are usually about 30-50% cheaper), and much less convenient if you travel a lot.


also you have to buy a device to read them on


Once.


typically on amazon you can find cheaper physical books via alternative sellers like bookdepository


Not if you are not living in the US. For a European who wants to read English books, Kindle Store is a blessing.


I'm in the UK and would rather buy a book for 1p and pay £2 shipping than pay £11 for an ebook that I won't even own, merely license.



I periodically go through and back up my Kindle to Calibre.

I have a fairly enormous ebook library, and I don't care to have it exploded due to, e.g., some error in the kindle software api.


Anyone know of a good way to browse & put pdfs into the kindle? The Amazon PC application purports to do as much but is really bad at it.


Easiest way I know of is to email it to the kindle, https://www.amazon.com/gp/sendtokindle/email
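If you want to script the email route, a sketch like the following builds the message with Python's standard library. The addresses are placeholders, the helper name `build_kindle_email` is made up for this example, and the actual Send-to-Kindle address and any sender restrictions are described on the Amazon page linked above, not here.

```python
from email.message import EmailMessage


def build_kindle_email(sender, kindle_address, filename, pdf_bytes):
    """Build an email with one PDF attached (hypothetical helper)."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = kindle_address
    msg["Subject"] = filename
    msg.set_content("Document attached.")
    msg.add_attachment(
        pdf_bytes, maintype="application", subtype="pdf", filename=filename
    )
    return msg


# Placeholder addresses and placeholder file contents.
msg = build_kindle_email(
    "me@example.com", "my-kindle@example.com", "paper.pdf", b"%PDF-1.4 placeholder"
)
attachment = next(msg.iter_attachments())
print(attachment.get_filename(), attachment.get_content_type())

# Sending is left out on purpose; with your own mail provider it would be
# something like smtplib.SMTP(host, port) followed by send_message(msg).
```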

Haven't done it often, but it seems to work fine.


You can use this app, too

https://www.willus.com/k2pdfopt/

to pre-process them in various ways so they are actually readable on a Kindle. (Crop whitespace, flip to landscape etc)


My old kindle mounts as a disk when connecting it to my pc using usb, so I drag pdfs into the directory all the other books are in


The ones that fail use the new encryption.

To remove that you need another add on and the serial from the kindle as key

Don’t have the name of it with me now sorry


If I remember correctly the process is described in the README of the DeDRM plugin that can be downloaded from GitHub: https://github.com/apprenticeharper/DeDRM_tools/blob/master/... (and I am talking about the README file inside the zip you download there).


I've downloaded all my books but I'm holding off cracking the DRM until I need to.


> I'm holding off cracking the DRM until I need to

Last time I checked, there was no crack for Amazon's latest version of their DRM (KF8 files). A workaround exists where you get Amazon to wrap the content in the old DRM, but then you lose all the typography improvements in the KF8 file.


This doesn't work anymore.

You need to download an old version of the Kindle app, where that is still possible.


Or download the files from your "Manage Your Content and Devices" page using the "download and transfer using USB" link which gives you the older AZW3 format.


anyone know how to do this for audible books?


2018


Why is this article so high up? It's a blog post by an Amazon associate who fails at following basic instructions (dedrm tells you how to remove the new DRM and how to do it without a physical device)

"hacker news"


It's probably not high up because it's a great blog post, but it sparked a discussion and with the news from a few days ago that some big provider's ebooks will stop working because of DRM issues it's a topic on people's minds.


Consider that it's useful for y'all to hear from power users who aren't programmers. Also I wrote this last year, so it doesn't have whatever the newest advances are. The point of documenting my experience was both to share knowledge and demonstrate what a pain in the ass this is.


Because of the recent Microsoft ebook store shutdown and people are concerned about losing other ebooks.


Or instead of messing with proprietary programs, one could set up an rsync job to pull the books from the actual device. DeDRM is still required, btw. But do humans really require a guide on how to back up stuff from a frickin' mass storage device? I hate this like the 1000 different "10 hammock mistakes" videos on YouTube, all of them saying the same things. Ah, she is a member of the "Amazon Associates" program, so this article is just BS for more clicks.
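For anyone without rsync handy, a one-way copy of new or changed files is easy to approximate. This is only a sketch under assumptions: the device is mounted as a normal disk, the suffix list is a guess at what is worth backing up, and the demo uses a temporary directory in place of a real Kindle. It copies the files as they are and does nothing about DRM.

```python
import shutil
import tempfile
from pathlib import Path

# Assumed set of file types worth backing up; adjust to taste.
EBOOK_SUFFIXES = {".azw", ".azw3", ".mobi", ".pdf"}


def backup_books(device_dir, backup_dir):
    """Copy ebook files that are new or changed; return the copied names."""
    device_dir, backup_dir = Path(device_dir), Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for src in sorted(device_dir.rglob("*")):
        if not src.is_file() or src.suffix.lower() not in EBOOK_SUFFIXES:
            continue
        dest = backup_dir / src.relative_to(device_dir)
        if dest.exists():
            s, d = src.stat(), dest.stat()
            # Skip files that look unchanged (same size, backup not older).
            if d.st_size == s.st_size and d.st_mtime >= s.st_mtime:
                continue
        dest.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dest)  # copy2 also preserves the modification time
        copied.append(str(src.relative_to(device_dir)))
    return copied


# Demo with a fake "device" in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    device = Path(tmp) / "kindle" / "documents"
    device.mkdir(parents=True)
    (device / "book.azw3").write_bytes(b"fake book")
    (device / "notes.txt").write_text("not an ebook")
    backup = Path(tmp) / "backup"
    first_run = backup_books(device, backup)
    second_run = backup_books(device, backup)
    print(first_run, second_run)
```

Run from a scheduled job, the second and later passes only copy what changed since the previous backup.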


She didn't have a physical Kindle device available.



