keriati1's comments

What model size is used here? How much memory does the GPU have?


He is using the 3B one, since it's the default when downloading it from Ollama: https://ollama.com/library/llama3.2


We run coding assistance models locally on MacBook Pros, so here is my experience: on the hardware side I recommend an Apple M1 / M2 / M3 with at least 400 GB/s memory bandwidth. For local coding assistance this is perfect for 7B or 33B models.

We also run a Mac Studio with a bigger model (70B), an M2 Ultra with 192 GB RAM, as a chat server. It's pretty fast. Here we use Open WebUI as the interface.

Software-wise, Ollama is OK, as most IDE plugins can work with it now. I personally don't like the Go code they have. Also, some key features I would need are missing and just never get done, even though multiple people submitted PRs for some of them.

LM Studio is better overall, both as a server and as a chat interface.

I can also recommend the CodeGPT plugin for JetBrains products and the Continue plugin for VSCode.

As a chat server UI, as I mentioned, Open WebUI works great; I also use it with Together AI as a backend.
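
For context, those IDE plugins basically just talk to Ollama's local HTTP API. A rough sketch of that call (the model name and prompt here are placeholders, not our actual setup):

    # Minimal sketch of what the IDE plugins do under the hood: POST to the
    # local Ollama HTTP API. Model name and prompt are just placeholders.
    import requests

    resp = requests.post(
        "http://localhost:11434/api/generate",   # Ollama's default port
        json={
            "model": "deepseek-coder:6.7b",       # whatever model you pulled
            "prompt": "Write a Python function that parses an ISO 8601 date.",
            "stream": False,                      # one JSON blob instead of a stream
        },
        timeout=120,
    )
    print(resp.json()["response"])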


An M2 Ultra with 192 GB isn't cheap. Did you have it lying around for whatever reason, or do you have some very solid business case for running the model locally/on-prem like that?

Or maybe I'm just working in cash-poor environments...

Edit: also, can you do training / fine-tuning on an M2 like that?


We already had some around as build agents. We don't plan to do any fine-tuning or training, so we did not explore this at all. However, I don't think it would be a viable option.


Can the Continue plugin handle multiple files in a directory of code?


I think it is even easier right now for companies to self-host an inference server with basic RAG support:

- get a Mac Mini or Mac Studio
- just run ollama serve
- run ollama web-ui in Docker
- add some coding assistant model from OllamaHub with the web UI
- upload your documents in the web UI

No code needed; you have your self-hosted LLM with basic RAG giving you answers with your documents in context. For us the DeepSeek Coder 33B model is fast enough on a Mac Studio with 64 GB RAM and can give pretty good suggestions based on our internal coding documentation.
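
If you want to sanity-check the server from a script afterwards, a small sketch against Ollama's /api/tags endpoint (it lists the locally pulled models) is enough:

    # Quick sanity check (a sketch, not part of the setup above): ask the
    # local Ollama server which models it has pulled.
    import requests

    tags = requests.get("http://localhost:11434/api/tags", timeout=10).json()
    for model in tags.get("models", []):
        print(model["name"])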


We actually already run an in-house Ollama server prototype for coding assistance with DeepSeek Coder, and it is pretty good. Now, if we could get a model for this that is on ChatGPT-4 level, I would be super happy.


Did you finetune a model?


No, we went with a RAG pipeline approach, as we assume things change too fast.


Thanks! Any details on how you chunk and find the relevant code?

Or how you deal with context length? I.e. do you send anything other than the current file? How is the prompt constructed?


For projects where the estimated rewrite duration exceeds three months, we have employed an iterative approach to refactoring for several years. This methodology has yielded pretty good results.

We also utilize a series of Bash scripts designed to monitor the refactoring process. These scripts collect data regarding the utilization of both the old and new "state" within the codebase. The collected data is then dumped into Grafana, providing us with a clear overview of our progress.
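
The scripts themselves are nothing fancy; the idea is roughly this (a sketch in Python rather than our actual Bash scripts, and the identifiers are made up):

    # Count occurrences of the old and new API in the tree and print numbers
    # that a cron job could push to Grafana's data source of choice.
    # "LegacyStateStore" / "StateStoreV2" are hypothetical names.
    import pathlib

    OLD_PATTERN = "LegacyStateStore"   # hypothetical "old state" identifier
    NEW_PATTERN = "StateStoreV2"       # hypothetical "new state" identifier

    old_count = new_count = 0
    for path in pathlib.Path("src").rglob("*.py"):
        text = path.read_text(errors="ignore")
        old_count += text.count(OLD_PATTERN)
        new_count += text.count(NEW_PATTERN)

    total = old_count + new_count
    progress = 100.0 * new_count / total if total else 100.0
    print(f"refactor.old_usages {old_count}")
    print(f"refactor.new_usages {new_count}")
    print(f"refactor.progress_percent {progress:.1f}")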


An example: “Are We ESMified Yet?” is a Mozilla dashboard tracking an incremental Firefox code migration (1.5 years and counting) from a Mozilla-specific "JSM" JavaScript module system to the standard ECMAScript Module "ESM" system. Current ESMification: 96.69%.

“Are We X Yet?” is a Mozilla meme for dashboards like this.

https://spidermonkey.dev/areweesmifiedyet/


I saw the phrase “are we X yet” used in the Rust community (is Rust ready for games or whatever) but never realised the phrase’s origin with Mozilla. Thank you for the little piece of history.


AFAIK, https://arewefastyet.com/ (AWFY) was the first, registered in 2010.

“Are We Meta Yet?” http://www.arewemetayet.com/ is an incomplete and outdated list of some of these dashboards. Some domains expired and are now squatted.


If you have scripts that can count uses of the deprecated code, you can use them to detect regressions and generate build warnings if someone adds new code that uses it. Periodically you can decrease the script's max-use counter, ratcheting down until you hit zero uses.
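
A sketch of what such a ratchet check could look like (made-up identifier and threshold, not anyone's actual CI script):

    # Fail the build if the count of deprecated uses exceeds the current
    # allowance, then periodically lower MAX_ALLOWED until it reaches zero.
    import pathlib
    import sys

    MAX_ALLOWED = 120          # checked into the repo; decrease it over time
    DEPRECATED = "LegacyStateStore"   # hypothetical deprecated identifier

    count = sum(
        p.read_text(errors="ignore").count(DEPRECATED)
        for p in pathlib.Path("src").rglob("*.py")
    )

    if count > MAX_ALLOWED:
        print(f"{count} uses of {DEPRECATED}, limit is {MAX_ALLOWED} - please use the new API")
        sys.exit(1)            # CI turns this into a build failure or warning
    print(f"{count}/{MAX_ALLOWED} deprecated uses; the ratchet can move down to {count}")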


Oh, the idea of tracking the state of the refactoring process with small scripts is very cool. Obvious in retrospect, too. These scripts would be useful even if they're only like 90% correct.


I find it awesome. Maybe it is targeted at my age group. Sadly I have an iPhone 13 and won't upgrade in the next 2-3 years. Otherwise I would order it right now.


+1 for the bucket queue. I learned about that trick a few weeks ago and in my use cases it cut the time to run A* by around 60-70%.
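
For anyone who hasn't seen the trick: the idea is to replace the binary heap with an array of buckets indexed by integer priority, which works when the f-values are small non-negative integers, as in grid A* with small integer costs. A rough sketch (mine, not code from this thread):

    # Bucket queue sketch. pop() is amortized O(1) because the cursor only
    # scans forward, plus occasional resets when a smaller priority is pushed.
    class BucketQueue:
        def __init__(self, max_priority):
            self.buckets = [[] for _ in range(max_priority + 1)]
            self.cursor = 0        # lowest bucket that might be non-empty
            self.size = 0

        def push(self, priority, item):
            self.buckets[priority].append(item)
            self.cursor = min(self.cursor, priority)
            self.size += 1

        def pop(self):             # returns (priority, item) with minimal priority
            if self.size == 0:
                raise IndexError("pop from empty bucket queue")
            while not self.buckets[self.cursor]:
                self.cursor += 1
            self.size -= 1
            return self.cursor, self.buckets[self.cursor].pop()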




Do you work for ChatGPT or something? I've seen a lot of really spammy people use phrasing similar to the beginning of your post lately. I'm not flagging you or downvoting you, but you made me raise my eyebrows; I can't tell if posts like these are authentic interaction.

(I don't mind if someone's a sincere fan of a service; there are places I've plugged, for lack of better phrasing, when I've found value in them. But I'm trying to cut back on uncompensated endorsements since folks abused the spirit of trying to respect good work.)


I just played around with it and wanted to see what an article about this event would look like.

But maybe it was a bad idea to post it, if we want to stay on topic.


No, it's fine; I was just being hypervigilant about spam. I apologize.


I also got a few messages from friends about this watch, but I don't think it will replace the "serious" diving computers for now; the missing dive pod support is already killing it.

Testing it next to a Shearwater (or Ratio, Suunto EON) would be interesting to see the differences in the algorithm. We will probably see a bunch of YouTube reviews doing this comparison.

With what I know about the watch so far, I would really not recommend using it as your only diving computer.

Also the "big buttons with gloves in mind" are kinda funny if I think about diving dry gloves ;)


Existing transmitters don't use Bluetooth but some custom radio signal at, for example, 123 kHz; I don't think the watch can support it.

