I'd be wary of applying such a harsh threshold to the output as the author has done here. While it captures the page well enough for typical text, I have encountered many pages from Google Books where the threshold has mutilated illustrations or small footnotes to the point of making them unreadable. And if the Google Books scan is the only one available, you're completely out of luck.
Isn't the thresholding just for identifying landmarks to be used for choosing dewarping parameters? Presumably, once found, those parameters could be applied to the original image.
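To make that idea concrete, here is a minimal sketch of "threshold for detection only, then transform the original". It is not the author's pipeline: a crop box stands in for the dewarping parameters, and the names `threshold`, `estimate_box` and `apply_box` are made up for illustration. The point is only that the binary mask is thrown away after the parameters are estimated, so gray levels in the output are untouched.

```python
def threshold(image, cutoff):
    """Binary mask: True where a pixel is darker than cutoff (treated as ink)."""
    return [[pixel < cutoff for pixel in row] for row in image]


def estimate_box(mask):
    """Bounding box (top, left, bottom, right) of all ink pixels, or None.

    Hypothetical stand-in for "choose dewarping parameters from landmarks".
    """
    rows = [r for r, row in enumerate(mask) if any(row)]
    if not rows:
        return None
    cols = [c for c in range(len(mask[0])) if any(row[c] for row in mask)]
    return rows[0], cols[0], rows[-1] + 1, cols[-1] + 1


def apply_box(image, box):
    """Apply the estimated parameters to the ORIGINAL image, not the mask."""
    top, left, bottom, right = box
    return [row[left:right] for row in image[top:bottom]]


# Tiny grayscale page: 240 is paper, 30 is ink, 90 and 120 are mid-gray
# (think of a shaded illustration that a hard threshold would flatten).
page = [
    [240, 240, 240, 240, 240],
    [240,  30, 120, 240, 240],
    [240, 240,  90, 240, 240],
    [240, 240, 240, 240, 240],
]
mask = threshold(page, cutoff=200)   # used only to locate content
box = estimate_box(mask)
result = apply_box(page, box)        # mid-gray values survive
```

A real dewarper would estimate a warp instead of a box and resample the original through it, but the split between "detect on the mask" and "transform the source pixels" would look the same.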
This is very nice. Having a low-dimension page warping model to optimize seems to be what makes this work.
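As a toy illustration of why a low-dimensional model helps (this is not the model from the article, just a sketch under simple assumptions): suppose the curl of one text line is described by a quadratic, so there are only three numbers to find no matter how many landmark points were detected. The helper names `solve_linear`, `fit_baseline` and `baseline_height` are hypothetical.

```python
def solve_linear(a, b):
    """Solve a small dense system a*x = b by Gaussian elimination
    with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        known = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - known) / m[r][r]
    return x


def fit_baseline(points, degree=2):
    """Least-squares polynomial y = c0 + c1*x + ... through (x, y) landmarks,
    via the normal equations. Returns [c0, c1, ...]."""
    k = degree + 1
    ata = [[sum(x ** (i + j) for x, _ in points) for j in range(k)]
           for i in range(k)]
    aty = [sum(y * x ** i for x, y in points) for i in range(k)]
    return solve_linear(ata, aty)


def baseline_height(coeffs, x):
    """Evaluate the fitted curve at x."""
    return sum(c * x ** i for i, c in enumerate(coeffs))


# Landmarks sampled from a gently curled line, x in normalized page units.
landmarks = [(x, 3.0 + 0.5 * x - 0.25 * x * x) for x in (-2, -1, 0, 1, 2)]
coeffs = fit_baseline(landmarks)
```

With so few parameters, a handful of noisy landmarks is already enough to pin the curve down, and the optimizer has very little room to overfit. The flip side is that distortions the model cannot express simply stay in the result.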
This is a good YC-sized problem - weeks to market, a few hundred thousand dollars to launch. Apple has a phone app for this which requires too much manual adjustment, and Microsoft has Office Lens / Microsoft Lens, which has comments such as "The edges just eventually go crazy and look horrendous". So there's a market for one that Just Works. Exit by selling to the usual suspects.
I remember being fairly happy with the Google Drive app, but I don't recall details of the old functionality either. Today, it is limited to using a quadrilateral envelope to frame the page, which is inadequate for anything other than flat single pages. Its shadow removal filter is pretty good though.
Product management felt it was not worth the tech risk (seems too complicated & mathy), and they would get better user metrics by instead building a more sophisticated model for timing notifications by trawling users' social media activity. The decision makers decided to be rigorously data driven in their effort to reduce churn.
After John Warnock stepped down as Adobe CEO, he ramped up his involvement at Octavo, a company dedicated to preserving rare historical books. A challenge they faced was de-curling scanned pages that could not be flattened.
I ran into a different issue while trying to make an app to scan color-coded notes in college. The colors skewed from the top to the bottom of the page, making it hard to reliably tell blue pens from green ones. At some point I should look at it again.
Assuming the white background skews in the same way, a good trick is to make a copy of the image, blur it VERY widely, and divide the original image by the blurred version. This essentially removes any low-frequency color/brightness shifts. I'll often use it to get rid of shadows etc. on photographed paper, but it should work for a color gradient all the same.
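The divide-by-blur trick can be sketched in a few lines of plain Python. This is a rough illustration, not production code: it uses a box blur rather than a Gaussian, works on a single channel stored as a list of rows, and the names `box_blur` and `flatten_illumination` are made up here. For a color photo you would presumably run it once per channel.

```python
def box_blur_1d(values, radius):
    """Mean over a window of up to 2*radius+1 samples.
    The window is truncated at the borders rather than padded."""
    n = len(values)
    out = []
    for i in range(n):
        lo = max(0, i - radius)
        hi = min(n, i + radius + 1)
        out.append(sum(values[lo:hi]) / (hi - lo))
    return out


def box_blur(image, radius):
    """Separable box blur: blur every row, then every column."""
    rows = [box_blur_1d(row, radius) for row in image]
    cols = [box_blur_1d(list(col), radius) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]  # transpose back


def flatten_illumination(image, radius, eps=1e-6):
    """Divide each pixel by its wide local average.
    Slow brightness changes cancel out; paper ends up near 1.0
    and ink ends up well below 1.0."""
    blurred = box_blur(image, radius)
    return [
        [pixel / max(local, eps) for pixel, local in zip(prow, brow)]
        for prow, brow in zip(image, blurred)
    ]


# A blank page that gets brighter from top to bottom (100 up to 290).
shaded_page = [[100.0 + 10 * r] * 5 for r in range(20)]
flat = flatten_illumination(shaded_page, radius=2)
```

Away from the borders the gradient divides out completely, so the blank page comes back as a constant. The radius needs to be much larger than the pen strokes, otherwise the ink itself leaks into the local average and gets washed out.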
Looks adequate. Seems, though, that the warp model is a little too global, in the sense that some of the more involved distortions of the paper are not captured by the model and this can be seen in the final result as a residual distortion.
This is really interesting to read imho, thanks for sharing, as I guess I missed it in 2016. It's so nice to get the full story: "I have this problem, applied smart techniques and got a nice working solution." Fun and interesting read :-). Don't think I'll ever need anything like this, but it's a great example of tackling a problem with some good methods, and cutting some corners where the output results and expectations allow it. Well written and explained.