
This is just an ad for the Swiss chocolate industry. The only people quoted are being funded directly by chocolate manufacturers.


Of course it is. Nobody else is going to fund such research.

That's true in pretty much every industry. That's why papers have "conflicts" sections: so you know who's funding the work and can take that into account. That doesn't make it wrong; you just need to put it in context.


A great deal of agricultural research is funded/carried out by governments who don’t have a vested interest in a particular outcome beyond general improvement.

A secure global food supply benefits the general public, not just agribusinesses. In essence, everyone has a vested interest in farming.


no conflict = no interest


https://github.com/openai/sparse_autoencoder

They actually open-sourced it, for GPT-2, which is an open model.
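
(For context on what was released: a sparse autoencoder learns to reconstruct a model's internal activations through a wide, mostly-zero hidden layer, so individual features become more interpretable. A minimal generic sketch below; this is the plain L1-penalty variant with illustrative dimensions, not OpenAI's actual implementation.)

    import torch
    import torch.nn as nn

    class SparseAutoencoder(nn.Module):
        # Reconstructs activations through a wide hidden layer;
        # the L1 penalty keeps most hidden features at zero.
        def __init__(self, d_model=768, d_hidden=32768, l1_coef=1e-3):
            super().__init__()
            self.encoder = nn.Linear(d_model, d_hidden)
            self.decoder = nn.Linear(d_hidden, d_model)
            self.l1_coef = l1_coef

        def forward(self, x):
            features = torch.relu(self.encoder(x))  # non-negative, sparse
            return self.decoder(features), features

        def loss(self, x):
            recon, features = self(x)
            mse = torch.mean((recon - x) ** 2)
            return mse + self.l1_coef * features.abs().mean()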


Thanks, I must have read through the document too hastily.


They didn't train the vision encoder either; it's unchanged SigLIP by Google.


“We finetuned billions of dollars of research by Google and Meta.”


- Llava is not the SOTA open VLM; InternVL-1.5 is: https://huggingface.co/spaces/opencompass/open_vlm_leaderboa...

You need to compare the evals against strong open VLMs, including this one and CogVLM.

- This is not the "first-ever multimodal model built on top of Llama3"; there's already a Llava on Llama3-8b: https://huggingface.co/lmms-lab


Like InternVL, the lack of llama.cpp support severely limits its applications. Something close to GPT-4V performance that runs locally on any machine (no GPU needed) would be huge for the accessibility community.


Very curious how it performs on OCR tasks compared to InternVL. To be competitive at reading text you need tiling support, and InternVL does tiles exceptionally well.
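
(Rough sketch of what tiling buys you: the vision encoder has a fixed input resolution, so instead of downscaling the whole page, you crop it into encoder-sized tiles and encode each at full resolution. The 448px tile size below is just an illustrative value, not any particular model's config.)

    from PIL import Image

    def tile_image(img, tile=448, overlap=32):
        # Crop a high-res page into encoder-sized tiles so small
        # glyphs aren't destroyed by downscaling the whole page.
        w, h = img.size
        step = tile - overlap
        crops = []
        for top in range(0, max(h - overlap, 1), step):
            for left in range(0, max(w - overlap, 1), step):
                box = (left, top, min(left + tile, w), min(top + tile, h))
                crops.append(img.crop(box))
        # Real pipelines pad/resize edge tiles back to the encoder size.
        return crops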


I think CogVLM2 is even better than InternVL at OCR (my use case is extracting information from an invoice).
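
(For the curious, this is roughly the shape of that workflow, as a hedged sketch: run_vlm is a hypothetical stand-in for whatever inference call you use, and the field list is made up. The real trick is asking for JSON only and parsing defensively.)

    import json

    PROMPT = ("Extract these fields from the invoice and reply with "
              "JSON only: vendor, invoice_number, date, total.")

    def extract_invoice_fields(image_path):
        # run_vlm is a hypothetical stand-in for your inference call
        # (CogVLM2, InternVL, ...); it returns the model's text reply.
        reply = run_vlm(image_path, PROMPT)
        # Models often wrap the JSON in prose; grab the outermost braces.
        start, end = reply.find("{"), reply.rfind("}")
        return json.loads(reply[start:end + 1])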


After some superficial testing with bad-quality scans you can find on Kaggle, I cannot confirm that. CogVLM2 refuses to handle scans that InternVL-1.5 can still comprehend.


Thank you for the link! Our initial testing suggests MiniCPM outperforms InternVL for GUI understanding: https://github.com/OpenAdaptAI/OpenAdapt/issues/637#issuecom...

(InternVL appears to hallucinate more.)


I’m going to be saying "First Ever AI something" for the next 15 years for clout and capital. Not going to be listening to anybody's complicated ten-step funnel if they're not doing the obvious.

