Show HN: CTRL-F for YouTube Videos (github.com/evan-wildenhain)
137 points by ewild 9 months ago | 40 comments
This is a small project I made years ago and updated to Whisper last year. I still use it from time to time and thought it might be useful to others, or that putting the idea out there might let someone better than me make a better implementation!



Ctrl-F can already search the transcript on YouTube. I use it all the time. I guess this could be useful for videos YouTube doesn't have captions for.


I'm not able to do this, can you explain?


1. In a desktop web browser, visit a YouTube video with captions, which is almost all of them

2. Click the video description to expand it

3. Scroll down and click the tiny "Show Transcript" button near the bottom (whoever decided to bury it down here was very misguided)

4. Ctrl-F and search any word. Occurrences in the transcript will be highlighted and you can press enter to scroll the transcript to the next one. Click the transcript to seek the video.

I see that this extension shows occurrences on the seek bar which is cool. There is also a slight problem with regular ctrl-F: if you search for a multiple word phrase you might not find it if the phrase happened to be split between two chunks of the transcript. So that could be better in this extension. And of course not every YouTube video has captions, but most do these days.
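
A minimal sketch of how a search could avoid the split-chunk problem described above: flatten the chunks into one word list, then match the phrase against consecutive words. The chunk shape (`start` in seconds plus `text`) and the name `findPhrase` are assumptions for illustration, not how YouTube or the extension actually stores transcripts, and punctuation is not stripped.

```javascript
// Hypothetical transcript shape: [{ start: <seconds>, text: "..." }, ...]
function findPhrase(chunks, phrase) {
  // Lowercase and split on whitespace; punctuation is left untouched.
  const normalize = (s) => s.toLowerCase().split(/\s+/).filter(Boolean);

  // Flatten all chunks into one word list, remembering each word's chunk start.
  const words = [];
  for (const chunk of chunks) {
    for (const word of normalize(chunk.text)) {
      words.push({ word, start: chunk.start });
    }
  }

  const target = normalize(phrase);
  const hits = [];
  if (target.length === 0) return hits;

  // Slide a window over the word list; a match may span a chunk boundary.
  for (let i = 0; i + target.length <= words.length; i++) {
    if (target.every((t, j) => words[i + j].word === t)) {
      hits.push(words[i].start); // start time of the chunk holding the first word
    }
  }
  return hits;
}
```

Each hit is the start time of the chunk containing the first word of the phrase, so a phrase that begins in one chunk and ends in the next is still found.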


> visit a YouTube video with captions, which is almost all of them

Depending on what you're watching, you might never come across a video with good subtitles but rather YouTube's auto-generated subtitles.

Whisper can do a better job in a lot of cases, but not all... I wonder if they've had multiple generations of auto-captioning and not gone back and redone the ones that were done earlier.

This extension is really interesting to me because in the past I've tried (and failed) to make a similar one that adds a new .vtt to the list of available subtitles for the video. I sometimes struggle with auditory processing, especially in a noisy environment, and following along with subtitles helps me out immensely, so it's frustrating when the auto-generated subtitles are poor quality. I've bookmarked the extension to see if I can fork it for that purpose in the future.
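
For the .vtt idea above, a rough sketch of turning transcription segments into WebVTT text. The segment shape (`start`, `end` in seconds, plus `text`) is an assumption, as are the names `toTimestamp` and `toWebVTT`. Attaching the result to YouTube's player is not covered here, and cue text containing `-->` or blank lines would need escaping.

```javascript
// Format seconds as a WebVTT timestamp: hh:mm:ss.ttt
function toTimestamp(seconds) {
  const ms = Math.round(seconds * 1000);
  const h = Math.floor(ms / 3600000);
  const m = Math.floor((ms % 3600000) / 60000);
  const s = Math.floor((ms % 60000) / 1000);
  const frac = ms % 1000;
  const pad = (n, width) => String(n).padStart(width, "0");
  return `${pad(h, 2)}:${pad(m, 2)}:${pad(s, 2)}.${pad(frac, 3)}`;
}

// Hypothetical segment shape: { start: <seconds>, end: <seconds>, text: "..." }
function toWebVTT(segments) {
  const cues = segments.map(
    (seg) => `${toTimestamp(seg.start)} --> ${toTimestamp(seg.end)}\n${seg.text.trim()}`
  );
  // A WebVTT file starts with the "WEBVTT" line; cues are separated by blank lines.
  return ["WEBVTT", ...cues].join("\n\n") + "\n";
}
```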


> Depending on what you're watching, you might never come across a video with good subtitles but rather YouTube's auto-generated subtitles.

Even with very little correspondence to the actual dialogue, if you already know what you're looking for, you can probably find it pretty easily in the auto-generated subtitles.

ctrl+F won't work in that case, but reading will.


if you have any questions feel free to ask!


You are correct. YouTube didn't have this when I originally made it in 2019 with DeepSpeech. Now they do, but I've always preferred having the matches on the timebar, so you can just click and go right to them. tbh I should just make a simple addon that takes the YouTube timestamps and slaps them onto the timebar. Also, split chunks would be no problem here, since the transcript is actually stored in a JSON file, so any consecutive words will always be matchable as a phrase. ofc the downside is that you need to run the model lol
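
A sketch of the "slap timestamps onto the timebar" idea: convert match times into percentage offsets along the bar, then drop a marker element at each offset. `markerPositions`, `renderMarkers`, and the `container` element are hypothetical names; this is not the extension's actual code, and it assumes `container` is a positioned element (e.g. `position: relative`) overlaying the seek bar.

```javascript
// Convert match times (seconds) into left offsets (percent of the bar width).
function markerPositions(hitTimes, duration) {
  if (!(duration > 0)) return []; // unknown or zero duration: nothing to place
  return hitTimes
    .filter((t) => t >= 0 && t <= duration)
    .map((t) => (t / duration) * 100);
}

// Browser-only: add one thin marker per position inside `container`.
function renderMarkers(container, positions) {
  for (const pct of positions) {
    const marker = document.createElement("div");
    marker.style.cssText =
      `position:absolute;left:${pct}%;width:2px;height:100%;background:yellow;`;
    container.appendChild(marker);
  }
}
```

In a page, `duration` would typically come from the video element's `duration` property once metadata has loaded.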


I'd use an extension that made the transcript show by default on every video and added a transcript search bar in the page. That would be great.


I guess I might as well do it so I don't need to run a model every time myself too lol. I'll have it done in a day or two


If that also saved a copy of the transcript, with metadata added (title, channel, url, similar vids) - as a text file on my local machine - I would actually use this as well.

Wait, is this using a cloud service in some way or is it all local / totally private? That would be a deal breaker or maker.

Oh, might as well copy a screenshot of the thumbnail and save it too.


I built an extension that injected a search bar into the transcript card. Worked by filtering the YouTube transcripts themselves, and manipulating their display attribute.
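
Roughly what filtering by the display attribute could look like, as a sketch rather than the actual extension code. `filterTranscriptRows` is a hypothetical name, and the caller is assumed to supply the transcript row elements, since YouTube's markup isn't described in this thread.

```javascript
// Hide rows whose text does not contain the query; returns how many stay visible.
// `rows` is any iterable of elements exposing `textContent` and `style`.
function filterTranscriptRows(rows, query) {
  const needle = query.trim().toLowerCase();
  let visible = 0;
  for (const row of rows) {
    const match = needle === "" || row.textContent.toLowerCase().includes(needle);
    // An empty string restores the row's default display value.
    row.style.display = match ? "" : "none";
    if (match) visible++;
  }
  return visible;
}
```

Wiring this to a search box would be a matter of calling it from the input's `input` event with the current value.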

Didn't release it to the store because YouTube released a search feature and it looked exactly like mine.


Would you prefer the timestamp hidden by default, since it takes up a bigass portion of the screen, or an option to hide it in the extension settings?


I think the timestamp is OK, my biggest complaint is the huge amount of whitespace between the rows and the small size of the box. If I designed YouTube I would put the transcript on the left side above the video description, with a button that expands it to full height so there's no separate scrollbar for the transcript anymore, it's just all directly in the page.

BTW when I went to look at a video just now, YouTube actually served me a "Search in Video" box at the top of the transcript. So I guess the feature exists, they just haven't rolled it out to everyone yet.


damn, I see this after I'm 90% done and just have to make a fancy button lol


bookmarklet to "Show transcript"

    javascript:document.querySelector('button[aria-label="Show transcript"]').click()
<https://getbookmarklets.com/scripts/data%3Atext%2Fjavascript...>


As a userscript https://github.com/madacol/web-automation/blob/master/usersc...

    // ==UserScript==
    // @name        Always show transcript
    // @match       https://www.youtube.com/watch*
    // @grant       none
    // @version     1.0
    // @author      madacol
    // @description show transcript on all youtube videos
    // @run-at      document-idle
    // ==/UserScript==
    (async ()=>{
        try {
            (await getElementNotYetRendered(()=>document.querySelector('button[aria-label="Show transcript"]'))).click()
        } catch (error) {
            // No transcript button appeared in time (e.g. the video has no captions)
            console.error(error)
        }

        // Polls elementGetter every `delay` ms until it returns an Element or a
        // non-empty NodeList; rejects once roughly `timeout` ms have elapsed.
        function getElementNotYetRendered(elementGetter, delay = 200, timeout = 10000) {
            let retries = Math.ceil(timeout / delay);
            return new Promise((resolve, reject) => {
                (function resolveIfElementFound() {
                    setTimeout(() => {
                        const element = elementGetter()
                        if (element?.toString().includes("Element")) return resolve(element)
                        if (element?.toString().includes("NodeList") && element.length > 0) return resolve(element)

                        // Reject so the promise settles instead of hanging forever
                        if (retries-- <= 0) return reject(new Error(`Max retries reached: element was not found
                        element: "${element}"
                        elementGetter: "${elementGetter}"
                        `));
                        resolveIfElementFound()
                    }, delay);
                })()
            })
        }
    })();


Vastly simplified userscript to only show transcript when pressing Ctrl+f

https://github.com/madacol/web-automation/blob/master/usersc...

    // ==UserScript==
    // @name        Show transcript on Ctrl+f
    // @match       https://www.youtube.com/watch*
    // @grant       none
    // @version     1.0
    // @author      madacol
    // ==/UserScript==
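    document.addEventListener('keydown', event => {
        // Open the transcript panel (if the button exists) so the browser's find can search it
        if (event.ctrlKey && event.key.toLowerCase() === 'f')
            document.querySelector('button[aria-label="Show transcript"]')?.click()
    })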


You can find a button for the transcript in the description (or the three dot menu near the dislike button if it's still serving you the older interface). You have to open the transcript first, then Ctrl+f


Yeah that’s exactly what I thought just after finding out it uses whisper for transcribing. Why not use it when it’s already transcribed?


If we had an extension to skip all the filler garbage in YT videos, I would be ecstatic. Maybe that's doable now? YT captions -> identify fluff timestamps via a browser LLM -> insert segments onto the video timeline, which automatically skip, a la SponsorBlock.

We could slash through Youtubers repeating themselves, making hack jokes, narrating their video title & outline, vapid explanations of common knowledge, etc. Any of which can be customized to your taste via a system prompt!

This kinda semantic filter would actually be an immensely powerful UI tool for all webpages and media, now that I think about it...
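
The skipping half of that pipeline is simple to sketch, assuming something else (the LLM step, not shown) has already produced a list of filler segments. `skipTarget` and `attachSkipper` are hypothetical names and the segment shape is an assumption; this is not SponsorBlock's implementation.

```javascript
// Hypothetical segment shape: { start: <seconds>, end: <seconds> }
// Returns the time to jump to if currentTime is inside a segment, else null.
function skipTarget(currentTime, segments) {
  for (const seg of segments) {
    if (currentTime >= seg.start && currentTime < seg.end) return seg.end;
  }
  return null;
}

// Jump past a filler segment whenever playback enters one.
// `video` is expected to behave like an HTMLVideoElement.
function attachSkipper(video, segments) {
  video.addEventListener("timeupdate", () => {
    const target = skipTarget(video.currentTime, segments);
    if (target !== null) video.currentTime = target;
  });
}
```

Because `timeupdate` only fires a few times per second, a short stretch at the start of each segment may still play before the jump happens.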


just use sponsorblock today, works fine on all my devices. https://github.com/ajayyy/SponsorBlock

from mobile phone to tv to pc.


Do check the settings too. SponsorBlock is best known for skipping sponsored segments, but it also has markers for things like intros, previews, self-promotion, and filler jokes/skits, which aren't skipped by default but can be if you want them to be.


The category for "non-music section" for music videos is great. Would probably make the extension worth it, if that was all it did.


Also look into DeArrow. Replaces clickbait titles and thumbnails.


Developed https://www.videototextai.com/ exactly for this reason, as it was nearly impossible to search videos otherwise. You can also copy the transcript into an LLM and ask questions about the video content that way.


I've been using https://www.appblit.com/scribe to get transcripts into a more readable/ctrl+f-able format


Yeah, I remember YouTube's full transcript feature coming out a year or so after I made the first version of this in 2019. I still prefer the timebar highlighting, but that's just a preference thing


Could you explain what's the purpose of the model.pth? I'm trying to get it to work on my Apple Silicon Mac.


The model.pth is a custom LSTM for detecting phonetic similarity. As long as you're running it from the pythons folder (I didn't manage file locations very well), it should work.



If you're offloading transcription to OpenAI, why have a local gfx card?


When all you have is a GPU, every problem can be solved with a custom AI model.


That's cool, but there are also Firefox extensions that do something similar. There's one for searching comments, and one for searching captions.

https://addons.mozilla.org/en-US/firefox/addon/youtube-capti...

https://addons.mozilla.org/en-US/firefox/addon/ycs/


Ahh, I never really looked, because I built my original one in 2019 off of DeepSpeech haha, and just updated it for fun mostly. I know YouTube captions themselves are good, but one limitation of that approach is that not all videos have captions. Since mine actually downloads the audio and transcribes it, it would still have results for those older videos that never got captions


Ctrl-F across all of youtube: https://www.askyoutube.ai


https://www.youglish.com is also a sort of search engine for YouTube captions, though mostly aimed at short phrases.


I searched for my YouTube username and then for the exact title of one video I posted and it didn't find either one.... instead it said the title of my video was not true because it didn't interpret it correctly (but it didn't link to the video).


I'll look into it, could you send the request to askutubeai@gmail.com?


not sure what else you need.... don't you keep logs?


Your request wasn't logged. This can happen sometimes if the request wasn't completed.



