jakecyr's comments

I wrote a simple library to reduce latency in voice generations from LLM chat completion streams.

This lets you generate voice output from streamed text using local LLMs (such as those served by Ollama) and local TTS clients (such as Apple Say), as well as external clients such as Google Text-to-Speech, at speeds comparable to proprietary assistants such as OpenAI's.

As soon as the end of a sentence is detected, the library runs TTS on that sentence and plays it aloud while the rest of the completion is still being generated in the background.
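
To make the idea concrete, here is a minimal sketch of the sentence-by-sentence buffering, not the library's actual code. The function name `sentences_from_stream` and the rule that a sentence ends at `.`, `!` or `?` followed by whitespace are assumptions for illustration only.

```python
import re
from typing import Iterable, Iterator

# Assumption: a sentence ends at '.', '!' or '?' followed by whitespace.
_SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def sentences_from_stream(chunks: Iterable[str]) -> Iterator[str]:
    """Yield each complete sentence as soon as the stream has produced it."""
    buffer = ""
    for chunk in chunks:
        buffer += chunk
        parts = _SENTENCE_END.split(buffer)
        # Every part except the last is a finished sentence.
        for sentence in parts[:-1]:
            if sentence:
                yield sentence
        # The last part may still be mid-sentence, so keep buffering it.
        buffer = parts[-1]
    # Flush whatever is left once the stream ends.
    if buffer.strip():
        yield buffer.strip()
```

Each yielded sentence would then be handed to a TTS client (ideally on a separate thread) while the loop keeps consuming the completion stream.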


Developed an open source voice assistant that integrates OpenAI's Whisper, Chat Completion and Voice Generation APIs to provide an assistant experience.

Some potential extensions could include integrating into custom hardware or adding function calling to expand the default capabilities.


The first (that I found) tokenizer for open source LLMs. It retrieves config files from HuggingFace and can encode / decode text and tokens.

Was created in an hour and can definitely use some work. Would love contributions and feedback!


Simple experiment for question answering on YouTube videos using embeddings and the top n YouTube search result transcripts.

Takes a question and, optionally, a YouTube search query (otherwise an LLM auto-generates one), compiles the transcript of each video result, builds an embedding index from those transcripts, and then answers the question using the most relevant embeddings.

Returns both a string response and a list of sources that were used for the answer.
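
The retrieval step can be sketched as below. This is a toy illustration, not the project's code: `embed` uses plain word counts as a stand-in for a real embedding model, and the names `embed`, `cosine` and `top_sources` are hypothetical.

```python
import math
import re
from collections import Counter
from typing import Dict, List, Tuple


def embed(text: str) -> Counter:
    # Stand-in "embedding": lowercase word counts. A real setup would
    # call an embedding model here instead.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_sources(question: str, transcripts: Dict[str, str], n: int = 2) -> List[Tuple[str, float]]:
    """Rank transcripts by similarity to the question and keep the top n."""
    q = embed(question)
    scored = [(video_id, cosine(q, embed(text))) for video_id, text in transcripts.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:n]
```

The top-ranked transcripts would be passed to the LLM as context, and their video IDs returned alongside the answer as the list of sources.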


Introducing our Video Summarization API, a game-changing tool that leverages advanced language models to summarize any YouTube video, no matter the length. Similar in technology to OpenAI's ChatGPT, our API distills key points and themes from videos, offering a quick way to grasp content without watching it in its entirety. Ideal for content creators, researchers, and anyone who wants to consume video content more efficiently.

Can be easily integrated into Apple Shortcuts to summarize YouTube videos on the go (will publish an example soon).

Currently in beta. Feel free to leave comments and feedback so I can improve the API.


I wrote a few helper functions that let you define your functions using Python objects and generate the JSON schema dict with a method call.
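
As a rough sketch of the idea (the class names `Parameter` and `FunctionSpec` and the method `to_json_schema` are hypothetical, not the helpers' real API), the output layout below assumes the general shape commonly used for LLM function-calling definitions:

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Parameter:
    name: str
    type: str  # a JSON Schema type name such as "string" or "number"
    description: str = ""
    required: bool = True


@dataclass
class FunctionSpec:
    name: str
    description: str
    parameters: List[Parameter] = field(default_factory=list)

    def to_json_schema(self) -> Dict[str, Any]:
        """Build the schema dict from the Python objects."""
        return {
            "name": self.name,
            "description": self.description,
            "parameters": {
                "type": "object",
                "properties": {
                    p.name: {"type": p.type, "description": p.description}
                    for p in self.parameters
                },
                "required": [p.name for p in self.parameters if p.required],
            },
        }
```

For example, `FunctionSpec("get_weather", "Look up the weather", [Parameter("city", "string", "City name")]).to_json_schema()` returns a plain dict that can be serialized with `json.dumps`.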


An open source GitHub project that lets you converse with GPT chat models using your computer or phone microphone and speaker.


Uses OpenAI GPT models to generate a design diagram, as an image or PDF, from a description of a software system or other entity diagram.


A simple CLI to convert a JSON or JavaScript object example to a TypeScript interface with inferred types.
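
The type inference can be sketched in a few lines. This is an illustrative Python version under simplifying assumptions, not the CLI's implementation: the names `ts_type` and `to_interface` are hypothetical, only JSON input is handled, and keys are assumed to be valid TypeScript identifiers.

```python
import json
from typing import Any


def ts_type(value: Any) -> str:
    """Map a parsed JSON value to a TypeScript type expression."""
    if isinstance(value, bool):  # check bool before int: bool is an int subclass
        return "boolean"
    if isinstance(value, (int, float)):
        return "number"
    if isinstance(value, str):
        return "string"
    if value is None:
        return "null"
    if isinstance(value, list):
        if not value:
            return "unknown[]"
        kinds = sorted({ts_type(v) for v in value})
        inner = kinds[0] if len(kinds) == 1 else "(" + " | ".join(kinds) + ")"
        return inner + "[]"
    if isinstance(value, dict):
        fields = "; ".join(f"{k}: {ts_type(v)}" for k, v in value.items())
        return "{ " + fields + " }" if fields else "{}"
    return "unknown"


def to_interface(name: str, example_json: str) -> str:
    """Render a TypeScript interface from a JSON object example."""
    obj = json.loads(example_json)
    if not isinstance(obj, dict):
        raise ValueError("the example must be a JSON object")
    lines = [f"interface {name} {{"]
    lines += [f"  {key}: {ts_type(value)};" for key, value in obj.items()]
    lines.append("}")
    return "\n".join(lines)
```

Calling `to_interface("User", '{"id": 1, "name": "Ada"}')` produces an interface with `id: number` and `name: string` fields.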


OpenAI's GPT-3 seems to be better than Google and smart home assistants, so I wanted to make my own by wrapping the GPT-3 API in voice recognition and text-to-speech.

I wrote a short script that recognizes speech from a computer microphone, sends the text to OpenAI's GPT-3, and speaks the response through your speaker.

