
You could isolate the text by compositing neighboring frames and keeping the pixels that don't differ. The background is moving; the text is not.

This would also let you extract the subs without having to OCR them into characters. You could just erase all static artifacts (including the subs, but also things like watermarks).
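
A minimal sketch of the "pixels that don't differ" idea, assuming the neighboring frames are already decoded into equally sized grayscale uint8 NumPy arrays. The name `static_mask` and the tolerance `tol` are made up for illustration; real footage would need a tolerance tuned to its compression noise.

```python
import numpy as np

def static_mask(frames, tol=8):
    """Return a boolean mask that is True where a pixel barely changes.

    frames: sequence of equally sized grayscale uint8 arrays taken from
            neighboring frames (hypothetical input, decoding not shown).
    tol:    largest brightness spread still counted as "static" (assumed).
    """
    stack = np.stack(frames)
    # Per-pixel spread between the brightest and darkest value seen.
    spread = stack.max(axis=0) - stack.min(axis=0)
    return spread <= tol
```

Everything the mask marks as True is a candidate static artifact: subtitles, but also watermarks or anything else that never moves within the window.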




An approach I really want to try is taking a stream of the video without subs (can easily be found online) and subtracting the two. You'd have to deal with differences in resolution and compression between the two, and also handle cases where the background is either white or black, but in theory it should work very well. I haven't had time to dig into this.
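
A rough sketch of the subtraction, assuming both streams have already been scaled to the same resolution and aligned frame for frame (which is the hard part and is not shown). `subtitle_mask` and `thresh` are hypothetical names; the threshold is only there to absorb compression differences between the two encodes.

```python
import numpy as np

def subtitle_mask(subbed, clean, thresh=40):
    """Mark pixels where the subbed frame differs strongly from the clean one.

    subbed, clean: aligned grayscale uint8 frames of identical shape (assumed).
    thresh:        minimum absolute difference counted as subtitle (assumed).
    """
    # Widen to a signed type so the subtraction cannot wrap around.
    diff = np.abs(subbed.astype(np.int16) - clean.astype(np.int16))
    return diff > thresh
```

This also shows the white/black background problem: where the clean frame is already close to the subtitle color, the difference stays under the threshold and the text is missed.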


Seems like you could get gstreamer and some subtractive elements working pretty quick...


Legitimate question: why would you want the first video (hardcoded subs) if you have a second stream without them? Better resolution, maybe?


In order to have access to vocabulary words. From the article:

> I wanted to get a transcript of the episode’s dialog so I could study the unfamiliar vocabulary. Unfortunately, the video files I have only have hard subtitles


Makes sense, thanks. I went straight to the technical details and missed that part.


This has to be worth a try.


This only works if the camera isn't fixed, though. In the frame from the post it might erase the dashboard, the car roof, and so on.


Just define a small area of the screen to run on then. Subtitles are typically within a very small portion of the screen
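
Something like this, as a sketch: restrict the processing to a horizontal band of the frame. The function name `crop_band` and the default fractions are assumptions (bottom 20% here); the right band depends on where a given video puts its subtitles.

```python
import numpy as np

def crop_band(frame, top=0.8, bottom=1.0):
    """Return the rows of `frame` between two fractions of its height.

    top, bottom: band limits as fractions of the frame height (assumed
                 defaults select the bottom fifth of the picture).
    """
    h = frame.shape[0]
    return frame[round(h * top):round(h * bottom)]
```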


Nope, every channel of CCTV seems to have its own subtitle convention.


Which is irrelevant to stripping subs from one movie at a time.



