DarthNebo · 2024-02-12T16:24:41.000000Z

HiddenBar

Erratic6576 · 2024-02-13T11:20:57.000000Z

Yessir, Thank you very much

DarthNebo · 2024-02-10T16:22:46.000000Z

This is old news

DarthNebo · 2023-10-24T06:45:16.000000Z

They did use the Canny ControlNet Pipeline

ComputerGuru · 2023-10-24T18:04:29.000000Z

I think the article was updated with that in response to my comment:

> P.S. As pointed out by a fellow HackerNews reader, we clearly forgot to include our code snippet for ControlNet in the article.

No other code snippet besides the one added in response uses Canny, at least so far as I can see.

DarthNebo · 2023-10-28T18:22:51.000000Z

Oh I see, my bad

DarthNebo · 2023-10-07T06:35:29.000000Z

I'm hitting 3.9 tok/s with a context of 300 tokens on Android (Snapdragon 778G) via UserLAnd, and this is with an older, unoptimized build of llama.cpp.

DarthNebo · 2023-10-07T04:25:25.000000Z

Feels like there should be two branches of system design: unit-profitable architectures for paying users, and VC-backed architectures that support non-paying users.

DarthNebo · on Aug 23, 2023

For long-running stuff, https://developer.apple.com/tutorials/app-dev-training/trans... should be straightforward to translate as well, using ported on-device BERT models.

DarthNebo · on Aug 9, 2023

JM2C

load_in_4bit=True will let you run Llama2-7B variants at 6.3GB VRAM.
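For anyone who wants to try this, here is a minimal sketch, not a tested recipe. It assumes the flag refers to load_in_4bit in Hugging Face transformers (with bitsandbytes and accelerate installed, plus a CUDA GPU). The model ID is just an example I picked (it is gated on the Hub), and the helper names are made up for illustration. The small estimate function only covers the weights: 7B parameters at 4 bits is roughly 3.5GB, so the rest of the reported 6.3GB is presumably runtime overhead such as activations, the KV cache and layers kept at higher precision.

```python
def estimate_weight_gb(n_params: float, bits_per_param: int) -> float:
    """Rough size of the model weights alone, in GB (1 GB = 1e9 bytes)."""
    return n_params * bits_per_param / 8 / 1e9


def load_llama2_4bit(model_id: str = "meta-llama/Llama-2-7b-hf"):
    """Load a causal LM with 4-bit quantized weights.

    Hypothetical helper: the model ID is an assumption, swap in your own variant.
    """
    # Imported lazily so the estimate above runs without these packages installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

    # Newer transformers releases prefer passing the flag through a config object.
    quant_config = BitsAndBytesConfig(load_in_4bit=True)
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        quantization_config=quant_config,
        device_map="auto",  # let accelerate place layers on the available GPU
    )
    return tokenizer, model


if __name__ == "__main__":
    # Weights only; actual VRAM use will be higher than these numbers.
    print(f"fp16 weights:  ~{estimate_weight_gb(7e9, 16):.1f} GB")
    print(f"4-bit weights: ~{estimate_weight_gb(7e9, 4):.1f} GB")
```

Calling load_llama2_4bit() downloads the checkpoint, so only do that on a machine with the GPU and the Hub access to match.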

DarthNebo · on July 31, 2023

I'm building a tool for transcriptions where your brand, product or any other technical jargon or heck even your own name does not need a fine-tuned model all the time. Both as a free native Mac app & a SaaS tool for those who need to process in bulk. Hit me up at nebo@minusgreed.com to check it out, will launch as FortuneSpeech.com once out of beta.

Thanks to this post, I kind of have another idea for a 'corrections to transcript' feature that Llama2-7B even on CPU can help with.

arjvik · on July 31, 2023

See sebastiennight's parent comment above - I think they meant to reply to you.

Curious to see your response.

sebastiennight · on July 31, 2023

Actually I wasn't, but if this can be useful to GP I shared my actual prompt in a follow-up comment.

I would be interested to see if Llama2 can perform a similar task.

DarthNebo · on July 30, 2023

It's like a repeat of C++ or Python. If you can find folks & tooling for what you want to accomplish, then go for it. I don't think founders who actually have paying customers concern themselves with what works best per cent of compute spend, but rather with what allows them to improve the product, even if to an outsider it seems they could reduce costs by choosing X over Y.

DarthNebo · on July 20, 2023

I would suggest getting your feet wet with the HuggingFace Spaces free/Pro plan, & then their APIs once you get the hang of setting things up there. After that you can start setting up LangChain pipelines or direct vector DB queries to work out which columns or SQL queries to formulate (for the latter).

As for the former (the classifier), you can try zero-shot classification between n categories + others, using models like Flan-T5/T5/Flan-UL2/DistilBART (~7B-40B param LLMs can also do this but would be overkill).
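A rough sketch of the "n categories + others" idea, reading "others" as a catch-all bucket for inputs that match nothing well. The scorer is pluggable so the fallback logic can be tried without a model. The Hugging Face wiring below assumes the transformers zero-shot-classification pipeline with facebook/bart-large-mnli, which is my pick for illustration and not one of the models named above. The function names and the 0.5 threshold are likewise assumptions to tune.

```python
from typing import Callable, Dict, List

Scorer = Callable[[str, List[str]], Dict[str, float]]


def classify_with_other(
    text: str,
    labels: List[str],
    score_fn: Scorer,
    threshold: float = 0.5,
    other_label: str = "other",
) -> str:
    """Return the best-scoring label, or other_label if nothing clears the threshold."""
    scores = score_fn(text, labels)
    best = max(labels, key=lambda label: scores[label])
    return best if scores[best] >= threshold else other_label


def hf_zero_shot_scorer(model_name: str = "facebook/bart-large-mnli") -> Scorer:
    """Build a scorer backed by a Hugging Face zero-shot pipeline (downloads the model)."""
    # Imported lazily so classify_with_other works without transformers installed.
    from transformers import pipeline

    clf = pipeline("zero-shot-classification", model=model_name)

    def score(text: str, labels: List[str]) -> Dict[str, float]:
        # multi_label=True scores each label independently, so a fixed
        # threshold does not shrink as more categories are added.
        result = clf(text, candidate_labels=labels, multi_label=True)
        return dict(zip(result["labels"], result["scores"]))

    return score
```

Usage would look like classify_with_other("My invoice is wrong", ["billing", "shipping"], hf_zero_shot_scorer()). Build the scorer once and reuse it, since constructing the pipeline is the slow part.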