Hacker News new | past | comments | ask | show | jobs | submit | KoolKat23's comments login

Some of the bigger models require certification due to EU laws and are delayed as a result, but I can't see Apple Intelligence falling in this category.


Existing OCR is extremely limited and requires custom narrow development.

I think xlsx files are a proprietary Microsoft format.


Google has the tech (some of it's gathering dust, but they have it). They can use the gameplay tech developed for stadia when a user experiences lag and combine it with their LLM.


And today they added a new AI abuse clause to their t&C's lol.


You're expectations could just be increasing as you start taking it for granted and are using other models.


An important note. If you're hearing your voice in your head doing this, that's subvocalisation and it's basically just saying it out loud, the instruction is still sent to your vocal chords

It's the equivalent of <thinking> tags for LLM output.


> What this tells us for AI is that we need something else besides LLMs.

Basically we need Multimodal LLM's (terrible naming as it's not an LLM then but still).


I don't know what we need. Nor does anybody else, yet. But we know what it has to do. Basically what a small mammal or a corvid does.

There's been progress. Look at this 2020 work on neural net controlled drone acrobatics.[1] That's going in the right direction.

[1] https://rpg.ifi.uzh.ch/docs/RSS20_Kaufmann.pdf


I think you may underestimate what these models do.

Proper multimodal models natively consider whatever input you give them, store the useful information in an abstracted form (i.e not just text), building it's world model, and then output in whatever format you want it to. It's no different to a mammals, just the inputs are perhaps different. Instead of relying on senses, they rely on text, video, images and sound.

In theory you could connect it to a robot and it could gather real world data much like a human, but would potentially be limited to the number of sensors/nerves it has. (on the plus side it has access to all recorded data and much faster read/write than a human).


You could say language is just the "communication module" but there has got to be another whole underlying interface where non-verbal thoughts are modulated/demodulated to conform to the language expected to be used when communication may or may not be on the agenda.


Well said! This is a great restatement of the core setup of the Chomskian “Generative Grammar” school, and I think it’s an undeniably productive one. I haven’t read this researchers full paper, but I would be sad (tho not shocked…) if it didn’t cite Chomsky up front. Beyond your specific point re:interfaces—which I recommend the OG Syntactic Structures for more commentary on—he’s been saying what she’s saying here for about half a century. He’s too humble/empirical to ever say it without qualifiers, but IMO the truth is clear when viewed holistically: language is a byproduct of hierarchical thought, not the progenitor.

This (awesome!) researcher would likely disagree with what I’ve just said based on this early reference:

  In the early 2000s I really was drawn to the hypothesis that maybe humans have some special machinery that is especially well suited for computing hierarchical structures.
…with the implication that they’re not, actually. But I think that’s an absurd overcorrection for anthropological bias — humans are uniquely capable of a whole host of tasks, and the gradation is clearly a qualitative one. No ape has ever asked a question, just like no plant has ever conceptualized a goal, and no rock has ever computed indirect reactions to stimuli.


I think one big problem is that people understand LLMs as text-generation models, when really they're just sequence prediction models, which is a highly versatile, but data-hungry, architecture for encoding relationships and knowledge. LLMs are tuned for text input and output, but they just work on numbers and the general transformer architecture is highly generalizable.


Chomsky is shockingly unhumble. I admire him but he's a jerk who treats people who disagree with him with contempt. It's fun to read him doing this but it's uncollegiate (to say the least).

Also, calling "generative grammar" productive seems wrong to me. It's been around for half a century -- what tools has it produced? At some point theory needs to come into contact with empirical reality. As far as I know, generative grammar has just never gotten to this point.


Well, it’s the basis of programming languages. That seems pretty helpful :) Otherwise it’s hard to measure what exactly “real world utility” looks like. What have the other branches of linguistics brought us? What has any human science brought us, really? Even the most empirical one, behavioral psychology, seems hard to correlate with concrete benefits. I guess the best case would be “helps us analyze psychiatric drug efficacy”?

Generally, I absolutely agree that he is not humble in the sense of expressing doubt about his strongly held beliefs. He’s been saying pretty much the same things for decades, and does not give much room for disagreement (and ofc this is all ratcheted up in intensity in his political stances). I’m using humble in a slightly different way, tho: he insists on qualifying basically all of his statements about archaeological anthropology with “we don’t have proof yet” and “this seems likely”, because of his fundamental belief that we’re in a “pre-Galilean” (read: shitty) era of cognitive science.

In other words: he’s absolutely arrogant about his core structural findings and the utility of his program, but he’s humble about the final application of those findings to humanity.


It's a fair point that Chomsky's ideas about grammars are used in parsing programming languages. But linguistics is supposed to deal with natural languages -- what has Chomskyan linguistics accomplished there?

Contrast to the statistical approach. It's easy to point to something like Google translate. If Chomsky's approach gave us a tool like that, I'd have no complaint. But my sense is that it just hasn't panned out.


Who has he mistreated?


Nobody, people are just crying because Chomsky calls them out, rationally, on their intellectual and/or political bullshit, and this behavior is known as projection.


In these discussions, I always knee-jerk into thinking "why don't they just look inward on their own minds". But the truth is, most people don't have much to gaze upon internally... they're the meat equivalent of an LLM that can sort of sound like it makes sense. These are the people always bragging about how they have an "internal monologue" and that those that don't are aliens or psychotics or something.

The only reason humans have that "communication model" is because that's how you model other humans you speak to. It's a faculty for rehearsing what you're going to say to other people, and how they'll respond to it. If you have any profound thoughts at all, you find that your spoken language is deficient to even transcribe your thoughts, some "mental tokens" have no short phrases that even describe them.

The only real thoughts you have are non-verbal. You can see this sometimes in stupid schoolchildren who have learned all the correct words to regurgitate, but those never really clicked for them. The mildly clever teachers always assume that if they thoroughly practice the terminology, it will eventually be linked with the concepts themselves and they'll have fully learned it. What's really happening is that there's not enough mental machinery underneath for those words to ever be anything to link up with.


This view represents one possible subjective experience of the world. But there are many different possible ways a human brain can learn to experience the world.

I am a sensoral thinker, I often think and internally express myself in purely images or sounds. There are, however, some kinds of thoughts I've learned I can only fully engage with if I speak to myself out loud or at least inside of my head.

The most appropriate mode of thought depends upon the task at hand. People don't typically brag about having internal monologues. They're just sharing their own subjective internal experience, which is no less valid than a chiefly nonverbal one.


[flagged]


What? I used the term "sensoral" after thinking about what I wanted to communicate. I have no idea if that is a pop psychology term, I didn't google it. I was attempting to communicate that I often think in visual, aural, tactile or olfactory modes, not just visually or via inner monologue, especially when recalling memories.

You're just projecting at this point and stalking previous comments to start arguments. That is exceedingly immature and absolutely against Hacker News guidelines. You need to reevaluate your behavior. Please refrain from continuing to start arguments on previous posts.


As far as I understand it, it's just output and speaking is just enclosed in tags, that the body can act on, much like inline code output from an LLM.

e.g. the neural electrochemical output has a specific sequence that triggers the production of a certain hormone in your pituitary gland for e.g. and the hormone travels to the relevant body function activating/stopping it.


That's just reducing the value of a life to a number. It can be gamed to a situation where it's just more profitable to mow down people.

What's an acceptable number/financial cost is also just an indirect approximated way of implementing a more direct/scientific regulation. Not everything needs to be reduced to money.


There is no way to game it successfully; if your insurance costs are much higher than your competitors you will lose in the long run. That doesn’t mean there can’t be other penalties when there is gross negligence.


Who said management and shareholders are in it for the long run. Plenty of examples where businesses are purely run in the short term. Bonuses and stock pumps.


Consider applying for YC's W25 batch! Applications are open till Nov 12.

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: