Whiteshadow12 · 2024-08-14T12:12:41.000000Z

OpenRouter gets somewhat close to that: https://openrouter.ai/models/meta-llama/llama-3.1-405b-instr...

amrrs · 2024-08-14T12:30:08.000000Z

Amazing, just noticed the mention of precision

Whiteshadow12 · 2024-08-14T11:58:40.000000Z

Nice, I built something similar: https://huggingface.co/spaces/Whiteshadow12/llm-pricing-calc...

I like your charting. Many have taken on this task and then lost interest.

Other similar tools for inspiration: https://llmprices.dev/ https://www.llmpricing.app/

What no one is doing is focusing on GPUs: what is the cost per second of running L3-8B on an A100 or H100?
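To make the GPU question concrete, here is a rough sketch of the arithmetic, assuming you know a GPU's hourly rental price and a measured throughput for the model. The function names and the example figures ($3.60 per hour, 1,000 tokens per second) are made up for illustration and are not real quotes or benchmarks.

```python
def cost_per_second(hourly_price_usd: float) -> float:
    """Convert an hourly GPU rental price into a per-second price."""
    return hourly_price_usd / 3600


def cost_per_million_tokens(hourly_price_usd: float, tokens_per_second: float) -> float:
    """Estimate the GPU cost of generating one million tokens.

    Assumes the GPU is fully utilized at the given throughput,
    which real deployments rarely achieve.
    """
    seconds_needed = 1_000_000 / tokens_per_second
    return cost_per_second(hourly_price_usd) * seconds_needed


if __name__ == "__main__":
    # Hypothetical inputs: replace with a real price and a measured throughput.
    hourly_price = 3.60
    throughput = 1_000.0
    print(f"per second: ${cost_per_second(hourly_price):.6f}")
    print(f"per 1M tokens: ${cost_per_million_tokens(hourly_price, throughput):.2f}")
```

This is what makes GPU-second pricing comparable with per-token API pricing: once throughput is measured, the two can be converted into each other.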

maeil · 2024-08-14T12:32:52.000000Z

Going to add my personal favorite: https://artificialanalysis.ai/models/llama-3-instruct-70b/pr...

What sets them apart is that they have speed and latency as well.

mcbetz · 2024-08-14T14:25:08.000000Z

Thanks for bringing llmprices.dev to my attention. I also have a comparison page for models hosted on OpenRouter (https://minthemiddle.github.io/openrouter-model-comparison/). It compares models via regex, so "claude-3-haiku(?!:beta)|flash" will show you haiku, but not haiku-beta, vs. flash.
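For anyone unfamiliar with the `(?!...)` part, it is a negative lookahead. Here is a small sketch of how that pattern filters a list of model IDs. It uses Python's `re` module rather than whatever the page itself runs, and the model IDs and the `filter_models` helper are illustrative assumptions, not taken from the OpenRouter API.

```python
import re

# The pattern from the comment: haiku not followed by ":beta", or anything with "flash".
PATTERN = r"claude-3-haiku(?!:beta)|flash"

# Illustrative model IDs (assumed for this example).
MODEL_IDS = [
    "anthropic/claude-3-haiku",
    "anthropic/claude-3-haiku:beta",
    "google/gemini-flash-1.5",
    "openai/gpt-4o",
]


def filter_models(pattern: str, model_ids: list[str]) -> list[str]:
    """Return the model IDs in which the regex matches anywhere."""
    compiled = re.compile(pattern)
    return [model_id for model_id in model_ids if compiled.search(model_id)]


if __name__ == "__main__":
    for model_id in filter_models(PATTERN, MODEL_IDS):
        print(model_id)
```

The lookahead rejects a match only when ":beta" follows immediately, so the plain haiku entry and the flash entry are kept while the beta variant and unrelated models are dropped.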

I wish OpenRouter would also expose the number of output tokens via its API, as this is also an important criterion.

alextttty · 2024-08-14T12:14:57.000000Z

Yeah, we want to do exactly this: benchmark and add more data from different GPUs and cloud providers. We would appreciate your help a lot! There are many inference engines that can be tested and updated to find the best inference methods.

Whiteshadow12 · 2024-08-14T12:32:54.000000Z

Good luck, companies would love that. Don't get depressed. Unlike with my tool, I think you should charge; that might keep you motivated to keep doing the work.

It's a lot of work. Your target users are companies that use Runpod and AWS/GCP/Azure, not Fireworks and Together. They are in the game of selling tokens; you are selling the cost of running seconds on GPUs.

agcat · 2024-08-14T19:32:15.000000Z

This is true, especially if you are deploying custom or fine-tuned models. In fact, at my company I also ran benchmark tests covering cold starts, performance consistency, scalability, and cost-effectiveness for models like Llama 2 7B and Stable Diffusion across different providers: https://www.inferless.com/learn/the-state-of-serverless-gpus... It can save months of evaluation time. Do give it a read.

P.S: I am from Inferless.

alextttty · 2024-08-14T12:43:28.000000Z

Thank you!

Whiteshadow12 · 2024-04-09T09:48:34.000000Z

This is really pretty.

Whiteshadow12 · 2024-04-05T09:48:23.000000Z

Perfect Streisand effect.

Whiteshadow12 · 2024-04-05T09:15:17.000000Z

Sorry everyone. This was me.

notoverthere · 2024-04-05T09:41:14.000000Z

What did you do?

jjgreen · 2024-04-05T09:55:40.000000Z

Bounder

Whiteshadow12 · 2024-03-14T08:22:59.000000Z

The beauty of HN is interactions like this.

prakhartiwari0 · 2024-03-15T04:13:36.000000Z

Saw this for the first time, it's awesome, I love this place. I started coming here recently as a curious teenager :)

Whiteshadow12 · 2024-01-10T14:56:08.000000Z

Yes, this was the longest period in the last 6 months.

nerdypirate · 2024-01-10T14:57:31.000000Z

Had to visit Twitter to be sure I wasn't the only one.

I hope dang will take some time to explain why it was down.

We might learn a few things :)

Whiteshadow12 · 2024-01-10T15:02:34.000000Z

It was painful. Usually, HN is where one goes to find out if a service is down. I also used Twitter to double-check it wasn't just me.

mynameisnoone · 2024-01-10T20:04:48.000000Z

https://downforeveryoneorjustme.com

Whiteshadow12 · 2024-01-09T13:52:30.000000Z

You are most likely not Sam Altman.

Whiteshadow12 · 2023-12-07T14:03:13.000000Z

I doubt we would have clicked on it if they called it a Modern Alternative to Yahoo! Directory. You are correct though.

Can you allow people to add sites they have found? For example, adding something like Hydra or Citus would require you to know about them.

This is perfect though, because I'm always finding fascinating tools online that I either favorite on HN or star on GitHub, but not all the time. There's also a Rust serverless company that I can't recall the name of, which fits your use case nicely.

Whiteshadow12 · 2023-11-07T12:39:07.000000Z

It's talked about a bit on Twitter with the hashtag #MyElixirStatus. Discord makes use of it as well. Same with Supabase.

The problem is that the languages one hears about fall into the OOP category. Functional languages tend not to be as fashionable, and when you do get into one, you become almost a zealot.

It's hard to fight the imprinting a language has with its first fans. Elixir, because of its Erlang origins, tends to have the identity of telephony-based use cases. It's similar to how Rust gets its reputation as a specialist language (CLI, kernel, etc.) even though you could use it for web apps as well.

The last reason is, it's really hard to fight the gravity pool of NextJS and React.

Personally, my circle, company, and country are solidly on the side of C#/Java/PHP/Python and React/Angular. The average developer who fits into that grouping might not have even heard of Svelte or SolidJS. When they do hear about them, they immediately make some snide comment about JS fatigue, even though that hasn't been a thing for a while.

freedomben · 2023-11-07T15:23:47.000000Z

> it's really hard to fight the gravity pool of NextJS and React.

Indeed. I love Elixir and I even struggle with choice when I don't need much or any server-side functionality.

There was a Jekyll-style project for Elixir some years ago but it fell into bitrot and deprecation. I'm still hoping somebody resurrects it or builds an equivalent, because using eex to build a static site is a really great experience.

Whiteshadow12 · 2023-11-08T02:22:09.000000Z

Have you seen Griffin? Your post prompted me to scan for an active project (many graveyards).

https://github.com/elixir-griffin/griffin