HanClinto's comments | Hacker News

> Excellent reporting by Wired. I've been impressed by the depth of their coverage in the past year.

Depth of reporting is not accuracy, and there may be a bit of illusion here.

For instance, the bit you cite (their reporting on Schedule A) is a great example of lazy reporting for a cheap "gotcha" to amplify ragebait with misinformation -- the Wired reporters either couldn't be bothered to look up the various forms of Schedule A hiring (in this case, Schedule A (r) [0] vs. Schedule A (u) [1]), or they are intentionally mis-reporting in order to add more outrage. Only section U deals with disabilities -- section R deals with temporary hiring.

I think you and I are walking away with very different impressions of the quality of Wired's "depth of coverage".

[0]: https://www.ecfr.gov/current/title-5/part-213/subject-group-...

[1]: https://www.ecfr.gov/current/title-5/part-213/subject-group-...


Is the idea that software experience has been holding the FAA back?

The vaguely applicable part of Schedule A (r) is this:

> professional/industry exchange programs that provide for a cross-fertilization between the agency and the private sector to foster mutual understanding, an exchange of ideas, or to bring experienced practitioners to the agency;

Amazing that rocket application guys are the most valuable experts when dealing with air traffic control.


I subscribe to an unlimited family plan. When considering how much cleaner my web experience is, it's a no-brainer. Default search engine on all our phones and devices.

They're my portal to the web. It feels less like an optional web service (like a streaming service) and more like I'm paying for them to be my ISP.


Beyond simply watermarking LLM output, it seems like this could be a neat way to package logprobs data.

Basically, include probability information about every token generated to give a bit of transparency to the generation process. It's part of the OpenAI API spec, and many other engines (such as llama.cpp) support providing this information. Normally it's attached as a separate field, but there are neat ways to visualize it (such as mikupad [0]).

Probably a bad idea, but this still tickles my brain.
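To make the idea concrete, here's a rough sketch (not from the article) of pulling per-token logprobs from an OpenAI-compatible endpoint such as a local llama.cpp server -- the base_url and model name are placeholders:

    from openai import OpenAI

    # Placeholder endpoint/model -- e.g. a local llama.cpp server running
    # with its OpenAI-compatible API enabled.
    client = OpenAI(base_url="http://localhost:8080/v1", api_key="none")

    resp = client.chat.completions.create(
        model="local-model",  # placeholder model name
        messages=[{"role": "user", "content": "Name a primary color."}],
        logprobs=True,        # ask for per-token probability info
        max_tokens=16,
    )

    # Each generated token comes back with its log-probability attached.
    for tok in resp.choices[0].logprobs.content:
        print(f"{tok.token!r}: logprob={tok.logprob:.3f}")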

[0]: https://github.com/lmg-anon/mikupad


I built one of these several years ago for MtG cards. Trained a neural network as a binary classifier on images from a cheap $20 USB microscope, looking at examples of the backs of real cards vs. fake cards.

https://youtu.be/6_kKR7YgPF4

Sadly I never got around to shipping it, even though it worked really well. Ported it to the web, but never figured out the billing issue, and so it died during the delivery phase. From time to time, I still wonder if I should resurrect this project, because I think it could help a lot of people.
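For anyone curious what that kind of setup looks like, here's a rough sketch of a similar binary classifier in PyTorch/torchvision -- not the actual code I used; the folder layout (data/fake, data/real) and hyperparameters are just illustrative:

    import torch
    from torch import nn
    from torch.utils.data import DataLoader
    from torchvision import datasets, models, transforms

    # Microscope captures sorted into data/fake/ and data/real/ (illustrative paths).
    tfm = transforms.Compose([
        transforms.Resize((224, 224)),
        transforms.ToTensor(),
    ])
    ds = datasets.ImageFolder("data", transform=tfm)  # two classes: fake, real
    loader = DataLoader(ds, batch_size=32, shuffle=True)

    # Fine-tune a small pretrained backbone down to a two-way real/fake head.
    model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
    model.fc = nn.Linear(model.fc.in_features, 2)
    opt = torch.optim.Adam(model.parameters(), lr=1e-4)
    loss_fn = nn.CrossEntropyLoss()

    model.train()
    for epoch in range(5):
        for images, labels in loader:
            opt.zero_grad()
            loss = loss_fn(model(images), labels)
            loss.backward()
            opt.step()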


Keep in mind, two of the functions were translated, and the third was created from scratch. Quoting from the FAQ on the Gist [1]:

Q: "It only does conversion ARM NEON --> WASM SIMD, or it can invent new WASM SIMD code from scratch?"

A: "It can do both. For qX_0 I asked it to convert, and for qX_K I asked it to invent new code."

[1]: https://gist.github.com/ngxson/307140d24d80748bd683b396ba13b...


The public dataset is available on HF here: https://huggingface.co/datasets/cais/hle

As the main website notes: "The dataset consists of 3,000 challenging questions across over a hundred subjects. We publicly release these questions, while maintaining a private test set of held out questions to assess model overfitting."
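If you just want to poke at the public questions, the usual datasets one-liner should work (the dataset id comes from the link above; you may need to be logged in to Hugging Face and to have accepted the dataset's terms):

    from datasets import load_dataset

    hle = load_dataset("cais/hle")  # dataset id from the link above
    print(hle)                      # shows the available splits and columns
    # e.g. hle["test"][0] to peek at a single record (split name may differ)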


This sounds very plausible. Then if they click on their link or manually type in the website corresponding to the e-mail address, it goes to your (very official) site.

Of all the answers presented so far, this one feels the most plausible to me.


Maybe it's still worth separating the tasks: use a traditional text detection model to find bounding boxes, then crop the images. In a second stage, send those cropped samples to the higher-powered LLMs to do the actual text extraction, and don't worry about them for bounding boxes at all.

There are some VLMs that seem to be specifically trained to do bounding box detection (Moondream comes to mind as one that advertises this?), but in general I wouldn't be surprised if none of them work as well as traditional methods.
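As a sketch of what that split might look like -- detect_text_boxes() is a hypothetical stand-in for whatever traditional detector you prefer (EAST, CRAFT, DBNet, etc.), and the model name is just a placeholder:

    import base64, io
    from PIL import Image
    from openai import OpenAI

    client = OpenAI()

    def detect_text_boxes(image):
        """Hypothetical: run a traditional text detector and return
        a list of (left, top, right, bottom) boxes."""
        raise NotImplementedError

    def transcribe_crop(crop):
        # Send just the cropped region to a vision-capable model for transcription.
        buf = io.BytesIO()
        crop.save(buf, format="PNG")
        b64 = base64.b64encode(buf.getvalue()).decode()
        resp = client.chat.completions.create(
            model="gpt-4o-mini",  # placeholder vision-capable model
            messages=[{
                "role": "user",
                "content": [
                    {"type": "text", "text": "Transcribe the text in this image."},
                    {"type": "image_url",
                     "image_url": {"url": f"data:image/png;base64,{b64}"}},
                ],
            }],
        )
        return resp.choices[0].message.content

    image = Image.open("page.png")  # placeholder input image
    for box in detect_text_boxes(image):
        print(box, transcribe_crop(image.crop(box)))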


We've run a couple of experiments and found that our open vision language model, Moondream, works better than YOLOv11 in general cases. If accuracy matters most, it's worth trying our vision language model. If you need real-time results, you can train YOLO models using data from our model. We have a Space for video redaction (which is just object detection) on our Hugging Face, and we also have an online playground to try it out.


THIS is how it's done. This is a masterclass in how to effectively use LLMs for such things: providing proper context (Goldilocks -- not too much, not too little), asking targeted questions, and then continuing with an intelligent dialogue with the LLM.

I've had several similar examples where LLMs have significantly augmented my workflow, but none so clean and encapsulated (and beautifully explained) as this example.

Thank you, Salvatore!


This is really brilliant stuff! Somehow I didn't realize that logprobs were being returned as part of the OAI API responses, and I really like this application of it.

Any interest in seeing this sort of thing being added to llama.cpp?


Looking at llama.cpp, it already supports the logprobs field in its OAI API emulation, so it shouldn't be too difficult to use this library with it.

It feels like this would be useful enough to build around -- I especially like the idea of asking the API to return the top K results for each field and denoting their likelihood -- almost like a dropdown box with percentages attached for each possible result.
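Something like this would get you most of the way to that dropdown view -- a rough sketch against a local llama.cpp server via the openai client, with placeholder endpoint and model names:

    import math
    from openai import OpenAI

    client = OpenAI(base_url="http://localhost:8080/v1", api_key="none")

    resp = client.chat.completions.create(
        model="local-model",  # placeholder model name
        messages=[{"role": "user", "content": "The capital of France is"}],
        logprobs=True,
        top_logprobs=5,       # up to 5 alternatives per generated token
        max_tokens=8,
    )

    # exp(logprob) recovers each alternative's probability; scale to a percentage.
    for tok in resp.choices[0].logprobs.content:
        alts = ", ".join(f"{a.token!r}: {math.exp(a.logprob) * 100:.1f}%"
                         for a in tok.top_logprobs)
        print(f"{tok.token!r}  <-  [{alts}]")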


I believe mikupad[0] supports showing logprobs from a llama.cpp backend.

[0]: https://github.com/lmg-anon/mikupad


Thank you for this link -- I had not seen this before. That is an absolutely gorgeous and intuitive interface!

