modeless's comments

Initial vibes are not living up to the hype. It fails my pet prompt, and the Cursor devs say they still prefer Sonnet[1]. I'm sure it will have its uses but it is not going to dominate.

[1] https://x.com/cursor_ai/status/1885415392677675337


R1 or R1-Distill? They are not the same thing. I think DeepSeek made a mistake releasing them at the same time and calling them all R1.

Full R1 solves this prompt easily for me.



Huh, that one got it wrong for me too. I don't have patience to try it 10 times each to see if it was a coincidence, but it is absolutely true that not all implementations of LLMs produce the same outputs. It is in fact common for subtle bugs to happen that cause the outputs to be worse but not catastrophically bad, and therefore go unnoticed. So I wouldn't trust any implementation but the original for benchmarking or even general use unless I tested it extensively.
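
A minimal sketch of the "try it 10 times each" check described above. It assumes you can wrap each implementation as a function that takes a prompt and returns the final answer as a string. `tally_answers`, `fake_generate`, and the canned answers are hypothetical stand-ins for illustration, not any real model API.

```python
from collections import Counter
from itertools import cycle
from typing import Callable


def tally_answers(generate: Callable[[str], str], prompt: str, runs: int = 10) -> Counter:
    """Call `generate` on the same prompt `runs` times and count each distinct answer."""
    return Counter(generate(prompt).strip() for _ in range(runs))


# Hypothetical offline backend so the sketch runs without a model:
# it just cycles through canned answers (these names are made up).
_canned = cycle(["Alice", "Alice", "Cathy", "Alice", "Alice"])


def fake_generate(prompt: str) -> str:
    return next(_canned)


counts = tally_answers(fake_generate, "pet prompt goes here", runs=10)
print(counts.most_common())
```

Running the same tally against two implementations of the same model, with the same sampling settings, gives a rough idea of whether a single wrong answer was a coincidence or a systematic difference.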

Same. With the recommended settings, it got it right. I regenerated a bunch of times, and it did suggest Cathy once or twice.

R1 70b also got it right just as many times for me.


This is far too soon after R1 to be a reaction. They were training this model before R1. If they stopped censoring the reasoning steps or (Yud forbid) open sourced it, that would be competition really working. But they won't.

Apple did send traffic to Bing in the past. It wasn't all of their iOS search traffic, but some.

At different points in time Spotlight search and Siri have used Bing for internet search. It's not totally clear what the latest version of iOS uses, but it wouldn't surprise me if Bing was still used.

GDB is awful and a replacement is sorely needed. It's buggy and slow and feature poor and has a terrible UI and a terrible API which makes all the frontends terrible in turn. I haven't looked at its code but I'm betting it's terrible too.

I haven't tried UScope yet (I shall), but I don't agree with you about GDB. I don't find it especially buggy unless doing niche things like non-stop debugging -- I guess you may well have a different experience though.

I think the UI is much maligned unfairly. It has a few quirks, but ever used git? For the most part it's fairly consistent and well thought through.

By terrible API you mean the Python? I admit it can be a bit verbose, but gives me what I find I need.

What features do you most miss? My experience is that most people use like 1% of the features, so not sure adding more will make a big difference to most people.


It's been a while since I bothered to try to use it because my experience has been so bad. So I don't remember all my specific complaints about bugs and features. I do remember multi process debugging was a big hole last time I looked. In contrast, I was able to get multi process debugging working really well in Visual Studio.

By terrible API I mean GDB/MI that frontends use. I'm sure people will come try to defend it but the proof is in the pudding and I don't think it's a coincidence that every GDB frontend sucks.


I'll +1 GDB/MI being utter garbage. Bespoke format (ordered multimap as the only keyed data structure, why), weird quoting requirements (or sometimes requirement of lack thereof) on input, extremely incomplete, and in some cases nearly unusable even for what it does support. Feels more like carelessly shoehorning some of the existing gdb commands with a different syntax (but sometimes not different) than an actual API.
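
To make the "ordered multimap" complaint concrete, here is a minimal sketch of a parser for the results part of a GDB/MI result record. It is an illustration under simplifying assumptions, not a complete MI implementation: it ignores numeric tokens and async/stream records, and it does not decode C-string escapes. `MIParser` and `parse_mi_record` are hypothetical names. The point it shows is that keys such as `frame` can repeat, so results have to be kept as an ordered list of pairs instead of a plain dict.

```python
from typing import Any, List, Tuple

Pairs = List[Tuple[str, Any]]  # ordered, duplicate keys allowed


class MIParser:
    """Tiny recursive-descent parser for 'key=value,key=value' MI results."""

    def __init__(self, text: str) -> None:
        self.text = text
        self.pos = 0

    def peek(self) -> str:
        return self.text[self.pos] if self.pos < len(self.text) else ""

    def expect(self, ch: str) -> None:
        if self.peek() != ch:
            raise ValueError(f"expected {ch!r} at offset {self.pos}")
        self.pos += 1

    def parse_results(self, closer: str = "") -> Pairs:
        # Keep results as a list of pairs: a dict would drop repeated keys.
        pairs: Pairs = []
        while self.peek() != closer:
            pairs.append(self.parse_result())
            if self.peek() == ",":
                self.pos += 1
        return pairs

    def parse_result(self) -> Tuple[str, Any]:
        start = self.pos
        while self.peek() not in ("=", ""):
            self.pos += 1
        key = self.text[start:self.pos]
        self.expect("=")
        return key, self.parse_value()

    def parse_value(self) -> Any:
        ch = self.peek()
        if ch == '"':
            return self.parse_cstring()
        if ch == "{":  # tuple: a braced run of results
            self.pos += 1
            pairs = self.parse_results("}")
            self.expect("}")
            return pairs
        if ch == "[":  # list: bare values or key=value results
            return self.parse_list()
        raise ValueError(f"unexpected {ch!r} at offset {self.pos}")

    def parse_list(self) -> list:
        self.expect("[")
        items = []
        while self.peek() != "]":
            if self.peek() in ('"', "{", "["):
                items.append(self.parse_value())
            else:
                items.append(self.parse_result())
            if self.peek() == ",":
                self.pos += 1
        self.expect("]")
        return items

    def parse_cstring(self) -> str:
        self.expect('"')
        out = []
        while self.peek() != '"':
            if self.peek() == "":
                raise ValueError("unterminated string")
            if self.peek() == "\\":
                self.pos += 1  # simplification: keep the escaped char verbatim
            out.append(self.peek())
            self.pos += 1
        self.pos += 1  # closing quote
        return "".join(out)


def parse_mi_record(line: str) -> Tuple[str, Pairs]:
    """Split '^done,a="1",...' into its result class and ordered pairs."""
    cls, _, rest = line.lstrip("^").partition(",")
    return cls, MIParser(rest).parse_results()


record = '^done,stack=[frame={level="0",func="main"},frame={level="1",func="start"}]'
cls, results = parse_mi_record(record)
frames = [value for key, value in results[0][1] if key == "frame"]
print(cls, frames)
```

Looking up `frame` means filtering the pair list, as in the last lines, because the same key appears once per stack frame.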

If it's been a long time I recommend taking another look. TBF you can tell it hasn't had the millions of dollars of investment that the Microsoft debuggers have, but still it's come a long way over the last 5-10 years.

e.g. it now has decent multi-process support.

I agree MI is kinda horrid, but no need for it these days, you can do everything via the Python API, and the modern equivalent is Debug Adapter Protocol which GDB claims to support now (although I haven't tried).

There are a million frontends, including both Visual Studio (via ssh) and VSCode, if you like those.

The perfect developer tool does not exist, but I believe that if you're debugging native code on Linux more than a few times per year then you should really know how to drive GDB. It won't always be the best tool for the job, but it often will be.


> What features do you most miss?

One time I wanted to write generic printers, e.g. a printer for any type that supports C++ iterators. But gdb can't call C++ functions from the Python API (except through weird hacks like evaluating `print c.begin()` and capturing its output). That said, this wouldn't be very useful, because most of the types we use change very rarely, so writing printers for them is only a matter of time.

Another feature is breakpoints that sleep for the next N seconds. We have breakpoints that can skip the next N hits, but a time-based equivalent would be useful to me for debugging mouse events in GUI apps, etc.

Also, even the newest gdb still has problems with tab-tab completion (and even Ctrl-C doesn't return control immediately).

Also, lately I often hit the problem "cannot insert breakpoint 0". This is probably a bug, because the answers on Stack Overflow aren't relevant for me.


> One time I wanted to write generic printers, e.g. a printer for any type that supports C++ iterators.

How would that work for types where the required functions are not instantiated, or not instantiated as a standalone function? Most iterators' operator++ are inlined and probably never instantiated as a distinct function.


> How would that work for types where the required functions are not instantiated

Obviously, it will not. But why not try?)

> Most iterators' operator++ are inlined

Sure, it's sad.

But I still think that such a feature (calling C++ functions from the Python API) can be useful.


> except through weird hacks like evaluating `print c.begin()` and capturing its output

Why do you consider that a weird hack instead of a legitimate programming technique?


If this is not a weird hack, why does gdb provide an API for getting C++ variable values instead of using this "legitimate programming technique"?

> a terrible API which makes all the frontends terrible in turn

I don't know the details. But nowadays gdb supports DAP, as any other debugger: https://www.sourceware.org/gdb/current/onlinedocs/gdb.html/D...

Are you talking about this, or the old https://www.sourceware.org/gdb/current/onlinedocs/gdb.html/G... ?


Thanks, I hadn't seen that, seems like it was just released a year ago. Seems like a step forward. I was talking about GDB/MI, which was the main (only?) option for decades and could still be considered the "native" frontend API of GDB.

I guess this would require operator overloading. Is there a proposal for that? A JavaScript version of numpy/pytorch would also need overloading (plus array slicing).

Groundbreaking research reveals people surveyed report that they like it when you give them things for free.

I love libraries. But this "research" is silly.


This kind of thing is a perfect candidate for GPU acceleration. It would easily run 10x faster even on a low end graphics card.

Battery replacement is difficult but not impossible; what borders on impossible is restoring the waterproof seal. Once you've replaced the battery you basically have to keep it away from water. I wasn't careful with mine and lost two to water damage after replacing the battery, despite re-sealing with permanent adhesive.

Good to know! IIRC the Round didn't have as high a water resistance rating to begin with, and I typically wear a leather band, so this might not be a deal-breaker for me.

The PTRs weren't diving watches for sure, but the original waterproofing was easily good enough to withstand submersion, as long as the battery hadn't started swelling yet.

Ah that is excellent to know, I appreciate it!

I am convinced that there must be Pebble fans on the Android team that keep a Pebble in CI and ensure it keeps working with each new release. Otherwise its continued extended working lifespan is inexplicable given the amount of churn in Android in general and the Bluetooth stack in particular.

Not as crazy of a theory as you might imagine. Years after death and acquisition after acquisition, a mystery Googler recompiled the long-discontinued Pebble app with the original signing keys so that it would work with new 64-bit app requirements.

https://arstechnica.com/gadgets/2022/10/pebble-a-2013-smartw...


Actually, kind of the contrary: BT is extremely backwards compatible. It adds and removes features so slowly that it's boring. That's why I left my BT expertise behind at the start of my career and moved to app development (it was a mistake in hindsight, but that's another story).

Sure, it's backwards compatible in theory. In practice I haven't had any device that kept working reliably with zero issues through every Android update and every phone upgrade. Including very important ones like Tesla's phone key. Even Pebble wasn't flawless at the very beginning, but it got good fast and hasn't stopped working since the company went under.

My speculation is that they used very low level software design to achieve such reliability. This could be harder to maintain but who knows...
