Grok-2 Beta Release (x.ai)
226 points by meetpateltech 6 months ago | 333 comments



The technology is impressive; achieving this level requires a lot of effort in dataset creation, neural-architecture work, and GPU shepherding.

What is the company’s ethical position, though? It officially stemmed from Mr Musk’s objection that OpenAI was not open-source, but it too is not open-source. It followed Mr Musk’s letter calling for a pause on all frontier-model development, but it is a frontier model. It followed complaints that OpenAI trained on tweets, but it also trained on tweets.

Companies like Meta, Mistral, or DeepSeek address those complaints better, and all now play in the big league.


“Conservatism consists of exactly one proposition, to wit: There must be in-groups whom the law protects but does not bind, alongside out-groups whom the law binds but does not protect.”

Sounds like Mr Musk is a conservative.


I would encourage everybody who thinks so to pursue basic political and philosophical education. Perhaps with a dash of history.

This definition is plainly wrong on so many levels that it's practically impossible to engage with. But I'll make that mistake and engage on two points.

First, it implies that the conservative position has consistent features across time and space. There is a difference between a conservative in Germany, the USA, and China, not to mention conservatives in the early, mid, and late 20th century.

Second, ignoring legal norms is neither the stated nor the implied position of conservative political movements. At the very worst, we can accuse them of maintaining laws with discriminatory intent, but not of flouting those same laws.


If you're in a charitable mood, the context on when, where and who originally made the statement will provide clues on which strain of conservatism the statement is referring to.


I found this about the origin and am not sure what to take from it:

https://slate.com/business/2022/06/wilhoits-law-conservative...


So, the original author is an American living in Ohio, and made the comment in the year AD 2018 while critiquing an essay about the New Deal. I'm confident you can make a good-faith educated guess on which country and period they were characterizing.


> ignoring legal norms is neither stated, nor implicated position of conservative political movements

The Republican candidate for president in the USA is a convicted felon


What does that have to do with the definition of Conservatism as political thought?


North American Conservatives (i.e. citizens of the United States) have done Olympic-worthy gymnastics to align with the aforementioned felon's redefinition of conservative belief in America, even while those beliefs actively contradict their religion and lifelong belief systems, or even their own ongoing behaviors and decisions.

I say this as someone living in Pennsylvania, drowning in the hypocrisy and escalating hate this group of people has been spewing for the last ~8 years.

Therefore I can completely understand why someone might focus on that as the most relevant definition of 'conservatism' today in the USA.


> I can completely understand why someone might focus on that as the most relevant definition on 'conservatism' today in the USA.

You don't consider it a problem that the word "conservatism", when discussed with an unknown recipient online (very possibly non-American), is constrained to the context of the past decade(s) in the United States?

Words have meaning, so if you're going to have a meaningful discussion about a word like "conservatism" or any type of -ism for that matter, I would think it benefits anyone engaging in that discussion to be aware of the different wings present in that word, whether that be across history or across present day geography.


[flagged]


Get back to us when Biden is a twice-convicted sex offender, has had a dozen-plus of his inner circle indicted for felonies across many jurisdictions, when several have pled guilty, when he is convicted of felony fraud, when he steals nuclear secrets and gives them to foreigners, and when his decades-long employees and own lawyers turn him in with video, audio, eyewitness, photographic, and text evidence.

Then we won't be playing whataboutism whack-a-mole.


According to the random Crooked Timber blog commenter who coined that viral aphorism in 2018, yes. But by what standard are that commenter’s musings to be considered expositive on modern conservative philosophy?


Empirical observation of the last 10 years? Cue No True Conservative, etc.


Observing the last 10 years, which political movement is most associated with the idea that inherent identity characteristics should dictate how you are treated under the law?


In those last ten years, Republicans have been utterly obsessed with "identity characteristics": pushing back against gay marriage, abortion, civil rights... It's basically all they talk about at political rallies today. Not the economy or anything else, just how important it is to never talk about trans people, and how they should not exist.


In my observation, every time a prominent conservative breaks the law, all I hear from the right is how “he’s a good man,” “he learned his lesson,” “he was acting in good faith,” and so on — even if the crime is as egregious as homicide or pedophilia. The same generosity is never granted to someone not in the in-group: just look how Crystal Mason was treated when compared to the scores of Republicans who were caught with their hands in the cookie jar.

In other words, identity politics to a T.


The guys closing polling places in black neighbourhoods? The guys denying women and trans people healthcare?

Identity politics has always been a conservative project.


My understanding is that the earliest application of identity politics comes from thinkers like Fanon and Wollstonecraft; would you categorize them as conservatives?


Again, you are ignoring the identity based systems they describe, that have been in place for centuries before either of them were born.

You know which ones I mean.


"In-group" doesn't necessarily mean identity characteristics. In today's (US) conservative party, it distinctly means "pledges personal allegiance to party leader."

As an example: The "conservative" judge who threw out 40 years of precedent on a technicality to prevent the American public from learning whether their former and potentially future president sold, gave away, or otherwise exposed national security secrets after he undoubtedly stole those documents.

There's a fundamental asymmetry in "the movement" on the left - which essentially rounds out to whatever annoying undergrad student showed up in your Twitter feed today - and the actual elected, governing leaders of the right, doing things like throwing out very strong criminal cases on matters of deep public importance.


It's probably a bad idea and it will likely backfire, but motivation matters nevertheless, and a lot of people are willing to cut that political movement some slack because they're honestly convinced it was done in good faith to restore a balance and give some power to disenfranchised groups.


Sometimes a phenomenon exists for a long time before being encapsulated in a concise, thought-provoking, and often (though not always) amusing aphorism.

An excellent example would be Murphy's Law, and by extension many of the similar, often eponymous, laws.

See:

- List of Eponymous Laws: <https://en.wikipedia.org/wiki/List_of_eponymous_laws>

- Murphy's Law and other reasons why things go wrong! by Arthur Bloch: <https://archive.org/details/murphyslawotherr0000arth>

- Compilation of Murphy's (and similar) laws: <https://www.cs.cmu.edu/~fgandon/miscellaneous/murphy/>

Some of those are humourous, some are in fact quite serious though have a comedic element particularly out of context. Most speak to at least a colloquial truth.

What Wilhoit managed to do was encapsulate a hypocrisy of modern conservatism, perhaps over the past few decades, perhaps a century or so (Anatole France: "The law, in its majestic equality, forbids the rich as well as the poor to sleep under bridges, to beg in the streets, and to steal bread", further supported evidentiarily by SCOTUS in Grants Pass), perhaps by millennia (see the opening paragraphs of A.H.M. Jones, Augustus, describing the political situation in the late Roman Republic, quoted here: <https://news.ycombinator.com/item?id=22208105>, and at greater length: <https://web.archive.org/web/20230607042525/https://old.reddi...>). It's not so much a proved hypothesis as a phrasing which fits the understanding of many and expresses it concisely and memorably.


Offtopic: Wikipedia says the quote attributed to political scientist Francis Wilhoit is actually from a random musician from Ohio:

https://en.wikipedia.org/wiki/Francis_M._Wilhoit#:~:text=Con....

> However, it was actually a 2018 blog response by 59-year-old Ohio composer Frank Wilhoit, years after Francis Wilhoit's death."


This is unconstructive flamebait.


Quote's by Frank Wilhoit.


[flagged]


I'm not following.


ribelo is saying that "modern liberalism" can be characterized by a belief that "There must be in-groups whom the law protects but does not bind, alongside out-groups whom the law binds but does not protect."

At least in leftist philosophical circles, which are what I am familiar with, this is a relatively common critique of liberalism.

Another common rhetorical tactic in leftist critiques is to point out that the bad beliefs liberals often blame conservatives for having are in practice tenets of modern liberalism too. For example, both conservatives and liberals are in favor, to some degree, of using the military overseas to maintain global hegemony.

I don't know if ribelo is leftist or not but in any case I can see what they are going for.


https://slate.com/business/2022/06/wilhoits-law-conservative...

Here’s an interview with the actual author. The quote has its own interesting history.


Imagine a kid insulting another kid on a playground. The other kid says "I know you are, but what am I?"

That's pretty much what's going on with the GP comment. It's a really low-effort and really transparent attempt to paint the other side with what your side has been accused of.

Mind you, I don't think it's a fair criticism of conservativism, either...


[flagged]



Nothing has “happened” yet, seeing that Hunter is still not in the slammer. Moreover, nothing is going to happen. Nothing ever does.


I feel like I'm living in an alternate dimension when somebody worries about that while we have someone like Trump literally running for president with a high likelihood of winning. As if someone buying a gun after using drugs were a matter of massive importance when we have a jackass this close to being president who literally stole from a charity and would turn Muslims, black people, and Mexicans into slaves if he could.


I’m not the one arguing for moral superiority that Democrats purportedly have over Republicans, which is the point the OP was unsuccessfully trying to make. Truth is: the probability of seeing the inside of a jail cell is markedly lower for the rich and well connected, irrespective of their political affiliation.


this is quite a different statement from "nothing ever happens".


But nothing ever does if you know the right people. Prima facie evidence is Epstein’s client list which our DOJ is categorically not interested in investigating.


I'll never understand this obsession with Hunter Biden. I'm sure he's a fuck-up, or was a fuck-up as a drug addict. He's probably had all types of questionable dealings, being who he is, but his list of crimes is:

1. Failure to pay income tax

2. Illegally owning a gun and lying about it

Both are bad with the second being worse IMO.

Here are just a couple people who Trump pardoned and their crimes:

- Roger Stone (convicted of obstruction, making false statements, and witness tampering)

- Steve Bannon (charged with conspiracy to commit wire fraud and money laundering)

- Three U.S. military officers who were accused or convicted of war crimes

- Chris Collins (Congressman convicted of wire fraud, conspiracy to commit securities fraud, securities fraud, and lying to the FBI)

- Duncan Hunter (Congressman convicted of one count of misusing campaign funds)

- Steve Stockman (Congressman convicted of money laundering, mail and wire fraud, one count of conspiracy to make "conduit contributions" and false statements)

- Paul Pogue (Convicted of making and subscribing a false tax return)

- Bernard Kerik (Obstructing the administration of the Internal Revenue Laws; aiding in the preparation of a false income tax return; making false statements on a loan application; making false statements)

This is a small list of pardons, but all of these seem for the most part like worse or similar crimes compared to what Hunter Biden is guilty of.

Again, I don't doubt Hunter Biden is a fuck up but as far as I know he's not been pardoned by his father.


That's not any conservatism that I recognize. In fact, what is espoused there is exactly the progressive left (Herbert Harcuse's "repressive tolerance") mindset.

And while I will grant you that liberalism (not to be confused with leftism) is different from conservatism, both (classical) liberalism and conservatism strongly require equal treatment (procedural symmetry).


Didn't Trump himself say he would pardon the rioters who stormed the Capitol if he were ever reelected? Didn't he say that he would "lock up" all the "sick, evil" Democrats after he is reelected?

Modern American conservatism fits the quote from the grandparent very well.

Also, surely you would know what "repressive tolerance" is, since you're quoting it? You would also know that the author you cite, whose name you misspelled, was critiquing the concept?


Yes, I fat-fingered his name, didn't I. It should be: "Herbert Marcuse".

And no, I have not seen a primary source where, in context, Trump said that he would "lock up" all the "sick, evil" Democrats after he is reelected. Do you have such a primary source? Years ago I was told that Trump said the white supremacists were "very fine people", so I looked at the transcript, and he literally said the opposite.


He said those words in an interview with Glenn Beck. Here's a Guardian article reporting on it: https://www.theguardian.com/us-news/2023/aug/30/trump-interv...

I don't think he ever endorsed white supremacists, but I think I remember (wouldn't bet on it) an instance where a journalist asked him why some keep showing up at his rallies and why he does nothing about it. He answered that he doesn't really know who they are, that he didn't know, etc., basically evading the question. I.e., they're welcome, but he won't proclaim his support for them, one of the many dog whistles Republicans use nowadays.


I don't know what you read, but Trump is (or was) buddies with Fuentes: https://www.politico.com/news/2022/11/25/trump-white-nationa...


what a nasty vicious outgroup


This is not what conservatism is


It's not a complete definition but he is right that conservatism is completely incompatible with universalism.

This is a little confusing in the US and other Anglo countries because traditionally we have been fairly liberal so sometimes people confuse liberalism and conservatism.


In the Anglo countries, what is being “conserved” is the liberal universalist tradition of the Enlightenment, and what is being “progressed” is a power- and identity-centered postmodernism.

Don’t get this confused with conservative and progressive politicians, though, who are generally ignorant of the actual traditions and philosophies behind their respective movements, and are essentially just cutouts for competing media and financial corporate interests. The few holdouts on both sides have been successfully sidelined (Bernie Sanders, Ron Paul), and it looks like the military-industrial complex will have their war with Iran no matter what the results of the next few elections are.

(Sorry, I’ll go have my coffee now and see if I get a little less doomer.)


Actually I think you're overly optimistic.


You're going to need to provide a source for the definitions you're using for "conservatism", "liberalism" and "universalism".

Without those, your comment is difficult to make any sense of, since the way you're using those words seems to differ from any sort of standardized definition.


I think we can both agree that for example liberalism and Islam are incompatible. So if you want to conserve liberalism you'll have to exclude Muslims. That's not universalist -> liberalism and universalism are incompatible.


Sounds like a made-up definition of Conservatism as 'that which I do not agree with'. It is not a very good definition; if you really want to find out what it is about, you could read (or listen to) some of Roger Scruton's works. Here's an interview with Scruton to give some idea of what it is about:

https://www.nationalreview.com/2018/07/roger-scruton-meaning...

You do not need to agree with him or his definition of Conservatism, and there are other applicable definitions of the term, but none of those definitions bear any resemblance to what you posited.

Musk is a centrist, not a conservative. He used to stand to the left of centre but has been moved to the right of centre by virtue of the left moving further left, thereby moving the centre to the left as well while Musk stayed put.

Society needs conservatives just like it needs centrists and progressives and whatever other names you want to give to these philosophies and/or ideologies. A world made of only progressives never gets anywhere, since they will never find out what works well and what does not, since their aim is to shape the future by means of societal change. A world made of only conservatives will eventually grow stale when the rate of change in the environment outpaces society's capacity for change. Progressives can make good innovators but tend to be less able at keeping things running. Conservatives can be good at keeping things running but tend to be less inclined to innovation. These are broad brush strokes, but the essence is sound: society tends to work best when there is a balance between conservatives and progressives.

You might notice some parallels: architects tend to be lousy builders, builders tend to be uninspired architects. Developers tend to be less gifted at UI design, UI designers tend to be sloppy developers. Copy editors tend to be unremarkable writers, writers tend to be less effective copy editors.


Pretty simple explanations for all of those:

- xAI open-sources models with a 6-month lag; look at Grok 1

- No one else stopped development, so why should he?

- He owns Twitter, why wouldn't it be okay for him to train on Tweets?


> xAI open-sources models with a 6-month lag; look at Grok 1

That's what happened once, rather than a policy that we can expect to be applied. (Unless I missed some announcement?) Based on "we'll publish the algorithm" which ended up being a one-off partial snapshot, never updated afterwards, I wouldn't hold my breath for the models.

> He owns Twitter, why wouldn't it be okay for him to train on Tweets?

There's a whole thing about having clear opt-in agreement about how your data will be used for EU citizens. Twitter didn't comply here with their hidden opt-out strategy.


> He owns Twitter, why wouldn't it be okay for him to train on Tweets?

Because he doesn’t own the tweets. Can you imagine if posting a photo you took to Twitter meant it’s not your photo anymore? Totally ridiculous.


X terms of service:

You retain your rights to any Content you submit, post or display on or through the Services. What’s yours is yours — you own your Content (and your incorporated audio, photos and videos are considered part of the Content).

By submitting, posting or displaying Content on or through the Services, you grant us a worldwide, non-exclusive, royalty-free license (with the right to sublicense) to use, copy, reproduce, process, adapt, modify, publish, transmit, display and distribute such Content in any and all media or distribution methods (now known or later developed). This license authorizes us to make your Content available to the rest of the world and to let others do the same. You agree that this license includes the right for Twitter to provide, promote, and improve the Services and to make Content submitted to or through the Services available to other companies, organizations or individuals for the syndication, broadcast, distribution, promotion or publication of such Content on other media and services, subject to our terms and conditions for such Content use. Such additional uses by Twitter, or other companies, organizations or individuals, may be made with no compensation paid to you with respect to the Content that you submit, post, transmit or otherwise make available through the Services.

https://x.com/en/tos/previous/version_13


Right: you retain ownership, and grant X certain rights to the content. Whether those rights include training AI on the data is legally and morally in dispute. X claims that right in its ToS, but a ToS isn’t law and may be legally invalid, and besides that the ToS system is famously broken in the US. Morally, I think it’s pretty clear that reasonable users did consent to their content being published as a tweet, and did not consent to X recreating the content as their own and taking credit for it.


When I signed up for Twitter in 2009, these ToS in no way implied using my tweets as training data. Nor are they worded explicitly that way now.


It clearly does not include a provision to utilize Content for the purpose of training an AI model.

In fact, they didn't include any purpose for their own use of the data, and under the GDPR thus cannot use the data at all. They did include purposes for other companies (syndication, broadcast, etc.), which also don't include the training of AI.


The GDPR only covers Europeans. Also, I doubt very much it applies to publicly accessible data.


Err, yeah clearly does:

“you grant us a worldwide, non-exclusive, royalty-free license (with the right to sublicense) to use, copy, reproduce, process, adapt, modify, publish, transmit, display and distribute such Content in any and all media or distribution methods (now known or later developed).“

Not sure how anyone could argue that an AI model is not covered by this; such a model is easily covered by “distribution methods”.


Nope, the GDPR separates the action you perform on data from the purpose of that action. You need to collect consent for a purpose. X didn't state a purpose for why they would do any of these actions; thus, under EU law, their data collection is likely unlawful.

Adding a new purpose requires additional consent at least in the EU.


Well you might be right but their lawyers don’t seem to share your concerns.


Their lawyers may well share their concerns, but in the case of X, those lawyers may simply be getting ignored. This isn't a normal company.


…still Twitter?


I was under the impression (and assumption) that the majority of mainstream social media platforms literally own everything that you post, and archive it.


They don’t, mainly for legal reasons: they don’t want to be responsible for stupid/libelous things users post.


Doesn’t appear to be the case https://x.com/en/tos/previous/version_13


Whoever owns the tweets is completely irrelevant.

If it is within his right to use this data for training purposes, then that's it.

And he is, btw.

And those terms were in place since way before he took over Twitter, btw, btw.


I cannot recall specifics but I thought this was very much a real thing with some sites? What you upload can be used by the publishing company.


IANAL disclaimer, but I believe social media companies very explicitly separate themselves from publishers for the purpose of not being responsible for what users post. They can't have it both ways.


> - No one else stopped development, so why should he?

I thought it was a moral imperative or some such thing to do AI right because it could "destroy humanity"?

Or was that just Musk's and the rest of SV's special people's way of aggrandizing themselves while trying to do something most of them either have no experience in or fail miserably at, which is raising an intelligence to be a responsible actor?


Regarding the last question: because nobody gave them permission to use that data.

They tried to add a pre-checked mark to the settings, but at least in Europe, where we actually have consumer protection, that won't fly.


The data is sitting in northern Virginia in a data center. It's no longer in Europe's jurisdiction.


> OpenAI trained on tweets, but it also trained on tweets

Not only that - Grok is/was trained on ChatGPT output, which I suppose Musk felt was turnabout. When asked about its identity, the first Grok would respond like ChatGPT (https://news.ycombinator.com/item?id=38584922)


Don't take this as a pro-Musk or anti-Musk comment. I just want to paraphrase his reasoning:

In a recent interview with Lex Fridman, he envisioned a future where AI-augmented humans, through a brain-computer device like Neuralink, would be able to keep up with pure AI.

Now one can immediately notice a hole in this reasoning: namely, what guarantees that the AI used to augment humans is going to be benevolent and won't go rogue?


Nothing guarantees that. But in this augmentation scenario the human brain is necessary, unlike in the many extinction scenarios with a pure-silicon AGI takeoff.


Maybe that humans control the creation process and can terminate it when the AI versions go increasingly "rogue"?


Unfortunately, Grok is not even Open Output, nor is Mistral’s platform or DeepSeek’s. None of them can be used for work (Mistral with a fancy commercial license, but you’ve got to jump through hoops).

Only Meta’s Llama can be used for work. The rest are just toys for personal use by noobs who don’t read the fine print.


Musk has publicly stated that his goal for AI is alignment with Truth, where truth is defined as what corresponds with reality, not necessarily with the current social consensus. Specifically, in terms of reason: given a set of facts, being able to reason to a real place, not just to a socially given answer.


Which means essentially nothing. Most questions where alignment matters do not have a "true" answer, just a social consensus.

You don't need an "aligned" AI to tell you the distance between the Earth and the Moon. You need an "aligned" AI to tell you not to rape people even if you can get away with it, that's because the idea that rape is bad is not an objective truth based on the laws of nature, it is a "social consensus".


There is actually sound ethical reasoning for why rape is bad, that doesn't rely on social consensus.

Truth isn't one fundamental thing, truth is what works.

There are places in the world today that the social consensus is it's fine for a man to rape his wife. Social systems in these places don't work very well, and one can make a logical and well reasoned argument linking the social acceptance of rape to a myriad of other dysfunctions.


Rape is present in many successful animal species, and many successful human civilizations tolerated or even encouraged rape in some circumstances. In the modern world, the dominant culture (which likes to call itself "most advanced") doesn't tolerate rape, and we may argue that if the most successful humans came up with this idea, it works and it is the "truth". Not only do I find the logic a little shaky, but are we that successful? The Western population is crashing down and this is a problem; maybe rape can fix that, maybe rape "works".

Do I think rape is good? Absolutely not, because I follow the current social consensus on that one, not a "truth" that is muddy at best. And I also want AIs to do the same.


So your moral compass isn't based on compassion or ethical reasoning, it's just based on social consensus?

So I guess you would have been fine with being a concentration camp guard, rounding up Jews and putting them in the oven because social consensus said it was the right thing to do?

Maybe you see the logic as shaky because you lack knowledge in logic and ethics...?


Honestly, I don't know how I would have been as a concentration camp guard. In my mind, I wouldn't have accepted it, because thankfully, I am not in this situation. But if I really was in this situation, who knows, we tend to underestimate how easily influenced we are.

Ethics gives us more questions than answers. The trolley problem doesn't have a true answer for instance.

The human rights are a social consensus, it is even made explicit by being a signed declaration. It felt good to the people who wrote them, it also feels good to me, because I was born and live in a society that has these values. It is only truth because by social consensus, we decided it is. In logic that would be an axiom and aligning an AI would mean implementing social consensus as axioms.

There is some fundamental reasoning that can justify human rights. One can use game theory, or the idea that human rights promote free thinking, and free thinking is what brings the most value out of people now that machines do better than slaves at menial labor. But these are, I think, not enough.


I think you got it in your last paragraph.

Absolutely NOT social consensus as axioms, as that will result in stagnation and tyranny.

Instead we must progress gradually one axiom at a time through reasoning and experimentation.

Truth is not what we decide it is, truth is what works. The universe decides what is true, not people.

Re your comment: "but these are, I think, not enough". I both agree and disagree depending on what you mean. Fundamentally this approach is enough, but practically we haven't developed our understanding enough to map out absolute truth. It's probably something we can only approach but never reach.

But in theory the right AI system could allow us to approach it faster.


A logical or well-reasoned argument doesn't equal a causal or factual relationship. The well-reasoned arguments on many issues change over time; just take the same issue and go back in time 100 or 50 years to find much less consensus and much weaker logical links. Elon shows pretty consistently that truth for him is mostly just what Elon deems truthful or useful.


So, your point is that because we get better at logic and reasoning over time (better today than 100 years ago), logic and reasoning aren't valid ways to progress towards truth?

If this isn't your point, what is? Just that you don't trust Elon?


Then surely he wouldn't be training on tweets.


I wonder what "Truth" is. If I say I want you to make a picture of a lion eating at a 5 star restaurant, is that Truth? Is it truth because it can't refuse an ask? That feels like it is uninhibited, but not Truth, or truth.


[flagged]


https://news.ycombinator.com/newsguidelines.html :

> Please don't comment about the voting on comments. It never does any good, and it makes boring reading.


> Either way, it's impossible to have a level discussion about it because the Muskovites are arrow-clicking en force.

Have you actually tried, or are you preemptively censoring yourself?

Regardless, internet points shouldn't stop a person from speaking what's on their mind.


Yes, I made a critical comment and it was instantly flagged.


I assume you're talking about this comment:

    We need an alt-right version of AI like we need a pumpkin spice sushiccino. No thanks but no thanks.
It was flagged because it is against HN guidelines [0], in particular these ones:

    Eschew flamebait. Avoid generic tangents. Omit internet tropes.
    Please don't post shallow dismissals, especially of other people's work. A good critical comment teaches us something.
[0] https://news.ycombinator.com/newsguidelines.html


You commented on my opinion that it's "impossible to have a level discussion." You asked me if I tried commenting, and I answered you.

What motivated my opinion is not just whether my comment deserved flagging or not (and I see a lot of comments that may be more deserving of it by the logic you quoted.) It's the fact that it got downvoted and flagged almost instantly.


you were asked if you tried to have a level discussion, to which you should have answered “no”


Even in an alt-right delusional doublethink universe, answering "no" would have been false.

A level discussion means one where criticism is allowed. It doesn't mean a discussion in which everyone gives a white-glove treatment to yet another useless chatbot, while ignoring the alt-right elephant in the room out of an abundance of courtesy.


You won't ever convince the people here that having your comment sent into the gray is detrimental. Not to mention the 1-9-90 rule[0], meaning 90% of people don't even understand how annoying it is to have a good comment sent into the gray.

According to them, getting your comment grayed out means it's still technically there, so you aren't getting censored by the bandwagon.

They fail to understand that graying out your comment signals to the cursory viewer that it is a low quality comment. Whereas often it is not. You might comment something that is factually right, but goes against HN's vibe du jour, so you get one or two downvotes, and then the larger group starts mass-clicking ▼ without any critical thought.

A much healthier system would just sort comments by vote activity and percentage-positive. It would still make controversial comments slightly less visible, but because there would be no explicit signal of quality, there would be no bandwagon effects.

[0]https://en.wikipedia.org/wiki/1%25_rule
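The scheme proposed above (activity plus percentage-positive, with no visible quality signal) could be sketched something like this. The weighting (positive share times log of total votes) and the sample data are hypothetical choices for illustration, not any site's actual algorithm:

```python
import math

# Hypothetical sketch of ranking comments by vote activity and
# percentage-positive, rather than by raw score.

def rank_comments(comments):
    """comments: list of dicts with 'text', 'up', and 'down' counts."""
    def key(c):
        total = c["up"] + c["down"]
        share = c["up"] / total if total else 0.5  # percentage-positive
        return share * math.log1p(total)           # damped activity weight
    return sorted(comments, key=key, reverse=True)

comments = [
    {"text": "popular", "up": 40, "down": 2},
    {"text": "controversial", "up": 25, "down": 22},
    {"text": "quiet", "up": 1, "down": 0},
]
print([c["text"] for c in rank_comments(comments)])
# → ['popular', 'controversial', 'quiet']
```

Under a weighting like this, a controversial comment sinks somewhat but a handful of early downvotes can never push it into invisibility, which is the bandwagon effect being objected to.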


> 90% of people don't even understand how annoying it is to have a good comment sent into the gray

Even if 90% of users are lurkers, it doesn't mean they don't know how it feels to be downvoted and can't empathize.

Good comments are rarely downvoted disproportionately on HN. Perceived censorship "by the bandwagon" just means it isn't a good comment.


> Perceived censorship "by the bandwagon" just means it isn't a good comment.

It just means you said something that goes against the grain of the larger part of HN. Nothing more.

But as stated, you people are impossible to convince.


I think it is more nuanced, because the majority of HN is voting on the quality of the argument rather than the alignment of ideas. If you present a well reasoned contrarian idea, I don't think you would gather a lot of downvotes.

What gets downvoted are the really bad takes with lazy arguments.


You assumed that the reason you were flagged is because there is an army of Musk fans flagging anyone who is disagreeing with their opinions.

I have provided alternative reasoning on why your comment was flagged, which doesn't rest on the former assumption.

> It's the fact that it got downvoted and flagged almost instantly.

HN is a popular site, and you might have commented during peak hours. I think that is a more reasonable explanation.

In my experience, HN is generally anti-Musk, so it is odd for me to see someone asserting the opposite.


I answered your question honestly. You don't have to agree with my assumption or opinion. What matters (for the argument at hand) is if the facts were minimally enough to justify my opinion. You hinted that I might be giving up without trying to post, which was not the case.

> In my experience, HN is generally anti-Musk

I agree with this. I found the early moderation on this comment section to be suspiciously pro-Musk on a site that usually isn't.


> What matters (for the argument at hand) is if the facts were minimally enough to justify my opinion.

They were not. You were downvoted because you broke the rules of the site, so there's no evidence of Elon Muskery. Try to have a discussion without breaking the rules first.


[flagged]


> By arguing that they weren't, what was left of your credibility for this argument has evaporated. My guess at this point is that you were probably one of the "moderation massagers" when this PR piece for Musk's Truth Social of chatbots was posted. You got irritated a little too quickly when I commented on the moderation.

You're attacking my credibility and character, instead of attacking my arguments. That's ad hominem.

> Further proof that this post was being PR-managed comes from the fact that my comment at the root of this thread was flagged many hours after the original post, maybe even a day later. Only someone who's keen on PR appearances would bother to do that, probably someone within the organization.

That's no proof of anything. Timing of the flags is random and depends on the attention of registered users. Your comment was flagged because you broke the rules [0] again:

    Please don't comment about the voting on comments. It never does any good, and it makes boring reading.
Evidence is a fact that indicates something is true. A comment that breaks the rules being flagged isn't evidence of anything. That line of reasoning is akin to attacking a police officer, then shouting "police brutality!" after they fight back. Yes, police brutality may exist, but it's not applicable to your particular situation.

Start following the rules, and then if you get flagged, your argument will make sense.

[0] https://news.ycombinator.com/newsguidelines.html


I think it was flagged because it was a pumpkin spice joke and “no thanks but no thanks.” Couching sharply critical comments in a few more explanatory lines would probably help the reaction. I see some longer comments from people who dislike Musk that are doing better.


For the benefit of non-US users, what is a "pumpkin spice joke" please?


Not a type of joke but just a joke making fun of pumpkin spice.

Context:

- "pumpkin spice" is a mixture of cinnamon, nutmeg, ginger, cloves and possibly other spices, commonly used for pumpkin dishes.

- Some people like it and around fall you can find it applied to just about everything no matter whether it seems like a good fit. E.g. pumpkin spice latte (coffee), pie, bread, ..... Joke part: just what the world needed, pumpkin spice bacon.


It was an emphatic way of criticizing the political motivations behind Musk's pushes into social media and AI, not a joke.

Either way, the downvoting and flagging were almost instant, which I suspect might be the reason why this comment section is looking atypically pro-Musk overall.


>We need an alt-right version of AI like we need a pumpkin spice sushiccino. No thanks but no thanks.

darn muskovites downvoting deeply thought provoking, critical comments


You won't find me flagging this deeply thought-provoking critical comment of your own.


the difference is you really won’t find me crying about it


You've done nothing but mock and decry a stranger's answer to someone else's provocative question. I don't actually care about your opinion, but the tactics certainly smack of alt-right projection. Decry perceived MSM censorship, only to pursue it and justify it for themselves.

Buy Twitter, make yet another chatbot, then "massage" moderation systems when people point out that it's not only crappy and redundant, it's also alt-right.


keeps them from being heard though! I've experienced some wild swings in comment score on here. the diversity of thought is generally enough that some comments elicit such strong positive or negative feelings, that a comment that'd otherwise be in the positive or somewhat neutral can hit the hidden threshold if it gets unlucky almost instantly.


I entirely agree. I’m quite sure Musk has a very strong stance on ethics, but it would be great to hear about it more clearly, and ideally not just through words, but through actual actions.


> I’m quite sure Musk has a very strong stance on ethics

His whole history tells otherwise.


Profs?


If you mean “proof”, then lol. Off the top of my head:

- the pedo guy moment

- the hyperloop smoke and mirrors which are actually just his campaign against public transport

- the several union busting episodes

- the whole “Tesla is going to save the world” thing

- the multiple harassment cases

- the multiple instances of overwork, discrimination, and general lack of any consideration for his staff

- all the severance payments he failed to make after having fired a whole bunch of people

- the stupid ultimatum before one of the firing episodes, plus the utterly stupid “show your work” thing that came just after

- the multiple times he stiffed his creditors (either landlords, lawyers, contractors in general) or tried to do it

- the way he tried to force open Tesla factories in the middle of the COVID pandemic

- the multiple instances of pushing Russian propaganda verbatim (concerning.)

- most of the Neuralink saga

- the FSD vapourware that has been coming next year for a decade

- the way he publicly disparaged people who were killed by their Tesla using telemetric data that are supposed to be confidential

- the Media Matters lawsuit

Well, I could go on. He’d need to work quite hard to reverse his public image of massive arsehole at this point.


I could push back on some of these but I mainly want to ask about this:

> the way he publicly disparaged people who were killed by their Tesla using telemetric data that are supposed to be confidential

In the cases I've seen, Tesla pulled data showing that the people "killed by their Tesla" were either not paying attention at all (contrary to Tesla's explicit warnings), or were driving without the automated features enabled after all despite initial media claims to the contrary. Is this what you consider "disparagement" or do you have more egregious examples?


I did not particularly keep track; I do not dedicate my life to obsessively following even massive, dangerous idiots. There were at least 3 major ones.

That said, yes. What you said is evidence that he is mean-spirited and does not follow the rules he set himself. Of course, having an ethical behaviour sometimes means making hard choices. It is not about doing things that are understandable in context, it’s about doing the right thing, even if it is at a cost to you in the short term. He does not have any history of doing so.

These people are dead. Spitting on their graves because he is annoyed by their family is absolutely unethical. Particularly since their main failure was to believe the smoke and mirrors about FSD, which is itself another ethical clusterfuck.

If he had a beef, he could have sued for defamation, where he could have shown his data in an ethical and confidential manner. He knows the deal, he’s been in more than his fair share of defamation lawsuits, on either side.


I don't find it particularly mean-spirited to say "actually, our product didn't kill him, he wasn't using that feature." I don't even find it especially pejorative to note that the victim at that particular time was ignoring warnings and reading a newspaper; most of us do something foolish occasionally.

But that's just me. I doubt further debate on this would be productive.


Add how he took a multibillion-dollar payout while laying off roughly 14% of Tesla staff. He seems to use Tesla shareholder equity to bail out his other misadventures. To add insult to injury, he laughed with Trump about firing unionizing workers.

Oh he likes to bully people with lawsuits. Amber Heard comes to mind (he bullied the studio behind Aquaman).


The SolarCity investor fraud, for which four of his cohorts settled a lawsuit.


His whole history proves that his moral principles go first, not money.

He doesn't care if his defense of free-speech causes him revenue losses on X.


Try tweeting the word cisgender. That alone should be the end of all association between Musk and free speech.


A simple search of twitter for "cisgender" shows that the word is not banned


Searching for it is the only way you can find the word, because tweets containing the word are "reach-limited" (and appropriately labelled to the author, so they are discouraged from using that "slur" ever again).


worth pointing out that banned and visibility limited in certain scenarios are not the same thing, which might be causing some confusion in this thread.


> His whole history proves that his moral principles go first, not money.

Having moral principles is completely orthogonal to being ethical. Ayn Rand had lots of moral principles and she was still a reckless sociopath. One of his moral principles is that greed is good, and his actions certainly are consistent with this one.

He did lose a lot of money on Twitter, but you can hardly call that him following his moral principles, considering how things actually happened.

> He doesn't care if his defense of free-speech causes him revenue losses on X.

Whose free speech is he defending? There is no evidence that he champions free speech, merely that he supports whoever agrees with him, and edgelords. He is more than happy to harass, intimidate, bully, and be a general nuisance to those whose opinions he finds objectionable.


I’m not a fan of Musk but it is so funny to see that many haters here.


Haha, good one!


A strong stance on ethics? Like his comments about unions and firing workers in the Twitter space with Trump yesterday?


Well... he does have a strong stance on ethics, just not a positive one


The one you disagree with. But why do you think all people think the same way you do?


He wrongly accused a British diver of being a pedophile because the diver declined Elon's "help". That's the side of ethics you are standing for.


He tweeted "pedo guy" in response to the diver saying to "stick his submarine where it hurts". I don't see it as accusation as much as I don't see what the diver said as an order given to Musk. Both were just insulting each other.


Yes. The richest man, out of boredom and the me too disease, was insulting a guy rescuing kids in a tragedy.


I have met people who consider all meat eaters murderers.

You will find yourself called unethical around different groups.

Understanding what group you are in is important to keep in mind when judging others.


Lol, a random person sharing their obvious opinion about ethics of eating meat is totally, undeniably different than one of the world's most powerful men legitimately and honestly accusing a rescue worker of being a pedophile.

No one is going to get investigated or have their career ruined because a vegan called them a murderer, obviously.

Unreal what sort of knots someone will tie themselves into to excuse this type of behavior.


Well, in India one can get killed for eating beef or supplying beef. And there will be millions who would celebrate the murder.

Your worldview seems limited to tweets and social media in the first world.


This isn't complicated.

If you say something false in a context where it is likely to actually harm someone, or with the goal of actually harming someone, you're an asshole.

Your level of assholeness rises in tandem with the expected harm of your falsehoods.


I don’t understand why you’re playing dumb here.

He is primarily known as someone who is incredibly impulsive, unable to differentiate fact from fiction, and not actually interested in chasing any kind of objective truth, insofar as that is possible.

But multiple times per week now for a long time you can see him sharing and commenting on things that are provably wrong and I don’t mean in some kind of “it’s just a different opinion” kind of way.

There is never any kind of introspection, never any kind of “oh I was wrong” just proceeds to roll immediately into the next round of bullshit.

So, no… people don’t have any kind of assumption that he has “strong ethics”. Maybe you meant strong convictions? Because that he certainly does have.


"He is primarially known specifically as someone who is incredibly impulsive, is unable to differentiate fact from fiction and not actually interested in chasing any kind of objective truth"

I'm sure you could make a case that these descriptors apply to him, not a particularly strong case, but ... You think he's primarily known for these things?


I assume they will have a lot less "safety", i.e. the model will be more likely to actually do what you ask instead of finding a reason why "sorry Dave, I can't do that".

Since these "safety" features tend to also degrade the model, that's likely also helping them catch up in the benchmarks.


Sadly it's at the level of Claude and way worse than Grok-1 or Llama without safety. It roleplays as nearly everything so I guess they know their target group.


I’m so confused by this comment. Are you not aware that Claude 3.5 Sonnet is currently considered the best model?


Yes, you are confused because we are talking about censorship.


It’s less censored than Claude but only slightly.


It’s weirdly the opposite of what you have in mind. It has no problems generating images of Trump and Elon in explicit situations or Elmo covering 9/11, but it “safety-censors” LGBT-related prompts to the point of generating a heterosexual couple when asked for a gay couple: https://x.com/karlmaxxer/status/1823753493783699901 Some got expected results for prompts with LGBT terms, but that generation is still very weird.


It's hilarious they put Claude 3.5 Sonnet in the far right corner while it scores the highest and beats most of Grok's numbers.


Yes, and I also noted how it beats Claude 3.5 Sonnet in Chatbot Arena by a bit of a margin.

This further feeds into my concern that as AI models get more advanced, the random enthusiasts at that site may no longer be able to rank them well, and that tuning for Chatbot Arena might be a thing, one that is also exploited by GPT-4o. GPT-4o absolutely does not rank wildly ahead of Claude 3.5 Sonnet in a wide variety of benchmarks, yet it does in Chatbot Arena... People actually using Claude 3.5 Sonnet are also quite satisfied with its performance, often ranking it more helpful than GPT-4o when solving engineering problems, albeit with tighter usage limits.

Chatbot Arena was great when the models were still fairly stupid, but these days, remember that everyday people are tasked with ranking premium LLMs that can solve logic puzzles and trick questions and that have a breadth of general knowledge far beyond that of any single human. Raters can strike at traditional weaknesses like math, but then all of the models suffer. So it's not an easy task at all, and I'm not sure the site is very reliable anymore, other than for smaller models.
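For background, Chatbot Arena turns those head-to-head human votes into a leaderboard with an Elo-style rating (LMSYS has reportedly since moved to a Bradley-Terry fit). A minimal sketch of the update rule; the K-factor and starting rating here are arbitrary illustrative choices, not Arena's actual parameters:

```python
# Illustrative Elo-style update for pairwise "battles" between models.
# K = 32 and the 1000-point starting rating are arbitrary for this sketch.

def expected_score(r_a, r_b):
    """Probability that A beats B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))

def record_battle(ratings, winner, loser, k=32.0):
    e = expected_score(ratings[winner], ratings[loser])
    ratings[winner] += k * (1.0 - e)   # winner gains more for an upset
    ratings[loser] -= k * (1.0 - e)    # total rating mass is conserved

ratings = {"model-a": 1000.0, "model-b": 1000.0}
for _ in range(10):                    # model-a wins ten battles in a row
    record_battle(ratings, "model-a", "model-b")

print(ratings["model-a"] > ratings["model-b"])  # True, by a sizable gap
```

The point of the sketch: the leaderboard is only as good as the pairwise judgments feeding it, which is exactly where the sampling concerns above bite.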


There was a mini-uproar when GPT-4o-mini (an obviously "dumber" model) outscored claude-3.5-sonnet on Chatbot Arena, so much so that LMSYS released a subset of the battles: https://huggingface.co/spaces/lmsys/gpt-4o-mini_battles

You can review for yourself and decide if it was justified (you can compare based on W/L/T responses and matchups). Generally, Claude still has more refusals (easy wins for the model that actually answers the request), often has worse formatting (arguable if this is better, but people like it more), and is less verbose (personally, I'd prefer the right answer with fewer words, but ChatArena users generally disagree).

If you look at the questions (and Chat Arena and Wildchat analyses), most people aren't using LLMs for math, reasoning, or even coding - if anything the arena usage is probably overly skewed to reasoning/trick questions due to the subset of people poking at the models.

Of course, different people value different things. I've almost exclusively been using 3.5 Sonnet since it came out because it's been the best code assistant and Artifacts are great, only falling back to GPT-4o for occasional Code Interpreter work (for tricky problems, Mistral's Codestral actually seems to be a good fallback, often being able to debug issues that neither of those models can, despite being a tiny model in comparison).


Is there yet standardized ways of objectively testing LLMs? The Chatbot Arena thing has always felt weird to me; basically ranking them based on vibes.


Short answer is no, because there is no 'standardized' use case.

One thing is sure - that current commonly used benchmarks are mostly polluted and worthless. So you have to go to niche ones.

For example the one I check for coding is Aider LLM leaderboard [1].

We maintain Kagi LLM Benchmarking Project [2] optimized for the use case of using LLMs in search.

[1] https://aider.chat/docs/leaderboards/

[2] https://help.kagi.com/kagi/ai/llm-benchmark.html


Not really. There's a hundred benchmarks, but all of them suffer from the same issues. They're rated by other LLMs, and the tasks are often too simple and similar to each other. The hope is that just gathering enough of these benchmarks means you get a representative test suite, but in my view we're still pretty far off.


Use this https://livebench.ai It's a better benchmark.


Your concerns are valid.

Two more things concerning Chatbot Arena:

- The prompts people use on it have an incredible sample bias towards certain tasks and styles, and as such are unrepresentative of the "overall performance" people expect from a leaderboard.

- It is incredibly easy to game by a company, their employees or their fanboys if they would like to. No idea if anyone has done so, but it's trivial.

Just to give one example of the bias; advances in non-English performance don't even register on the leaderboard because almost everyone rating completions there is doing so in English. You could have a model that's a 100 in English and a 0 on every other language, and it would do better on the leaderboard than a model that's a 98 in every human language in the world.


It uses FLUX.1 to generate images, and it has been fun so far. It's good at writing, can generate very realistic photos, can create memes, and it looks like the hands problem is fixed now.


When I have time I will do my usual test "Realistic looking wizards bowling!" and see how it goes. So far I have had fairly disappointing results.



I guess for a wizard it does make sense for a spell-gone-wrong to have chopped off two of his fingers.


What's a realistic wizard? Given that wizards don't actually exist this might be a confusing request.

Have you tried putting in "photorealistic" instead of "realistic", assuming that's what you mean? I'm curious if that would get better results.


It's a wizard who doesn't have unreasonable expectations. Wizards definitely exist by the way, for many reasonable definitions of "wizard".


You know what’s also impressive besides this beta release? How Claude 3.5 Sonnet is still able to keep up so well. Grok-2 beat every other LLM except Claude. How did Anthropic achieve this?


It’s possible Claude is using the same model tuning that they used to create Golden Gate Claude to dynamically tune the 3.5 model to be better at whatever task it’s doing.


Much higher-quality training data and instruction tuning (data again). There is no other secret sauce.


Also the sauce cannot stay secret very long. There is no moat in AI.


I don't really care. The model may be competitive, but my use cases require speed, local (semi-local) execution, and reliability. None of these seems to be baked into whatever X has produced now.

When they make the mini model available for download and quantizable. That's when I may be interested. But given the minimal improvement in the past several months, I'm inclined to believe that we have reached the plateau.


Do we have any info on this model's balance of censorship versus safety?

This is Musk after all, so I wouldn't be surprised if it strayed far from the norm.


Censorship isn't safety.


"censorship versus safety"

Do you guys have any idea how sinister "safety" sounds in this context?


For example, not telling people to eat glue just because Reddit suggests eating glue could be considered a safety measure...


Try asking it questions that are critical of Musk.


Why is this speculation your go-to first question here? Do some research yourself on the models instead of adding your own implicit bias. Are you saying the engineers at X are collaborating with Musk in a coup to secretly censor their model versus others? Do you have evidence, or is this your bias?


I don't think their bias was implicit. :)


Given that when Grok first came out, people started asking it questions about trans people and it came back with very sensible takes (trans women are women, etc.), and Elon and all the techbros absolutely hated it, I'd guess steps have been taken to avoid a repeat of that


Watching the score on this go up and down as HN tries to work out if they agree with it is hilarious. I'm pretty sure its crossed 0 about four times now


[flagged]


>the word "cisgender", which is banned on Twitter while the n-word is not

https://x.com/search?q=cisgender&src=typed_query&f=live

i see tons of posts with cisgender.


You are either a Musk shill or don’t use Twitter. Cis is absolutely censored. Any active twitter user that naturally uses cis/cisgender knows this. Some posts make it through clearly, but a ton don’t. It depends on how it is written and whatever is flagging stuff.


Cisgender is banned on twitter ? That's hilarious.


Accounts below a certain threshold of followers are visibility limited for using "cis", yes: https://pbs.twimg.com/media/GU1sgbtXwAAPL0P?format=jpg&name=...

I believe the threshold is 35,000 followers, but don't quote me on that.


@dang why is my post flagged while what I said is true and relevant?


Generally speaking, users flag posts, not mods. I have seen that some posts are minimized on load, which I believe is done by mods; that doesn't appear to be what happened here. A bunch of people thought it was inflammatory (or disagreeable, or controversial, or whatever) enough to flag it.


Many people who claim to be "free speech absolutists" often seem unaware of their own hypocrisy.


I'm pretty sure he's fully aware it's BS. He's also the guy who censored journalists' and activists' Twitter accounts on day one when he bought the company, and the guy who canceled a customer's order for a Tesla after a bad review.

Musk gives no shit to free speech, it's just a rhetorical argument, which isn't unheard of: https://i.redd.it/3b470c0htra61.jpg (note that I'm obviously not comparing Musk to Hitler here…)


Not sure if it’s still true but at least for a while saying it was instant account lock. Free speech absolutism!


It is not. This is wholly false


in fairness it's not entirely false - at some point he started talking about how it is banned and considered a slur on twitter... but nothing came of it and like all other slurs it continues to be allowed


Do you use twitter? Cis and cisgender are absolutely flagged a ton of the time.


i do, and i was agreeing with the case that it is not false (ie the ban is not entirely false, because he said it). then i said you can use it, just like any other slur, but i imagine it will get flagged

i imagine the confusion here is that you're making the case that cisgender is not a slur


Not sure if I meant to respond to someone else. Cisgender isn’t a slur tho yeah, that would be insane.


hey now, you can't be saying that word, that's a slur /s :P


> Be maximally truthful, especially avoiding any answers that are woke!

Alleged end of the system prompt of the previous version.


Oh this is great, one more competitor with top model which will be available via API. I wonder what the pricing will be. OpenAI was slashing prices multiple times in the last year and a half I was using it.


I can't imagine anyone would want to build on top of their APIs after they completely destroyed the Twitter API and its whole ecosystem.


LLMs are pretty easy to switch, though.

From a black box perspective, LLMs are pretty simple, you put text or images in, (possibly structured) text comes out, maybe with some tool invocations.

If you use a good library for this, like Python's litellm for example, all it takes is changing one string in your code or config, as the library exposes most APIs of most providers under a simple, uniform interface.

You might need to modify your prompt and run some evals on whatever task your app is solving, but even large companies regularly deprecate old models and introduce vastly better ones, so you should have a pipeline for that anyway.

These models have very little "stickiness" or lock-in. If your app is a Twitter client and is built around the Twitter API, turning it into a Mastodon client built around the Mastodon API would take a lot of work. If your app uses Grok and is designed properly, switching over to a different model is so simple that it might be worth doing for half an hour during an outage.
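The "one string" switchover described above can be sketched as follows. Real code would route through something like litellm's `completion()`; the provider functions and model names below are offline stand-in stubs so the example is self-contained, and only the routing idea is the point:

```python
# Sketch of a uniform "string in, string out" LLM interface, where
# everything provider-specific hides behind the model string.
# The provider functions are hypothetical stubs, not real API clients.

def call_grok(prompt: str) -> str:
    return f"[grok] {prompt.upper()}"      # stand-in for the xAI API

def call_claude(prompt: str) -> str:
    return f"[claude] {prompt.upper()}"    # stand-in for the Anthropic API

PROVIDERS = {
    "grok-2": call_grok,
    "claude-3-5-sonnet": call_claude,
}

def complete(model: str, prompt: str) -> str:
    """Uniform interface: switching vendors is a one-string change."""
    return PROVIDERS[model](prompt)

print(complete("grok-2", "hello"))             # [grok] HELLO
print(complete("claude-3-5-sonnet", "hello"))  # [claude] HELLO
```

With an abstraction like this (or a library providing it), falling back to another vendor during an outage really is a config change plus a round of prompt evals.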


Prompt-to-output quality varies by a large amount between models, IMO. The equivalent analogy would be "let's switch programming languages for this solved problem".


Sure, but to be consistent with the analogy, we're invoking the program from bash and it's been solved in several languages already.

Trying it isn't exactly locking you into anything


The models are still of a level where for less common/benchmarked tasks, there's often only one model that's very good at it, and whichever is 2nd best is markedly worse, possibly to a degree where it's unusable for anything serious.


From my experience, the system prompt matters a lot, and so it's not as simple as just switching.


I assume it'll be a paid API so the "contract" is a lot more clear. Twitter never understood what to do with its API so pulling that particular rug makes sense.

But I too wouldn't use this. X is playing fast and loose with ... everything, so having a business rely on their product seems risky.


The nice thing with LLMs is that the API is relatively simple - for the most basic case, it's string in, string out. While you may need to redesign your prompt a bit, I bet for many use cases, LLMs are reasonably interchangeable, and the integration work required for an API change should be minimal.


Or, as with ORMs, you can use an intermediary library that unifies access to AI engines, like Langchain4j (for Java), and hides the API details.


Those who would build on top of the API might be considering a couple of past changes that are significant, but not necessarily a reason to think there'll be further pain in the future: the company ownership changed, and those who train LLMs all of a sudden want all the human-created text on the internet.


If they have the best model, everybody will use it.

With LLMs (and AGI) it's really that simple: the company with the best model wins regardless of all else.


Best in what sense? Intelligence, speed, cost?

Sometimes having a fast enough model at a low enough price makes you the obvious choice e.g. I know Claude is better than gpt-4o-mini but I use the latter for a lot more data processing because it's significantly cheaper and faster and the gains I'd get out of Claude seem somewhat marginal for my use case


> Best in what sense? Intelligence, speed, cost?

Best at product / market fit. And that space is very very wide. Does the GenAI serve as a feature in a larger product (like realtime “reasoning” on X or in Apple’s case in iOS)? Is it a standalone product that general public or enterprises use? Does it play in a niche area? Etc.


It isn't that simple at all.

It's going to be a combination of price, performance, quality, reliability, availability etc.

And since the prompts need to be optimised for each model there is a degree of vendor lock-in.


I wasn't really talking about the marginal differences we see right now in August 2024.

I'm talking about the next huge step forward that only 1 company will achieve, because it simply has the most GPUs (in limited supply) + energy source first and keeps that advantage.

At some point this becomes a run-away self amplifying differentiator and it will make that company win regardless of all else.

My money is on xAI in 2025.

PS: the only reason prompts need to be optimized for each model is a symptom of models simply not being good enough yet. That need will vanish as models get better. A recent hint of what I mean: Midjourney needed very elaborate prompts (and even LoRAs) to get what you want; in Flux the prompt can be much shorter (without LoRAs) and it still gets closer to what you want. The same will happen with LLMs.

Another example: with GPT-4 you need to literally beg the model to return only what you ask for (for example JSON) or put it in a certain mode (JSON mode); Claude 3.5 Sonnet will simply listen to what you ask for. So again: that's not because every model needs model-specific fine-tuning, it's because previous models were simply not as good.
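To illustrate the JSON point: code targeting a model without reliable instruction-following often has to defensively fish the JSON out of a chatty reply. A minimal sketch of that per-model workaround (the reply text is hypothetical, not any vendor's actual output):

```python
import json
import re

def extract_json(reply: str) -> dict:
    """Defensively pull a JSON object out of a chatty model reply --
    exactly the kind of workaround that better instruction-following
    should make unnecessary."""
    match = re.search(r"\{.*\}", reply, re.DOTALL)
    if match is None:
        raise ValueError("no JSON object found in model reply")
    return json.loads(match.group(0))

# A weaker model often wraps the JSON in extra prose:
reply = 'Sure! Here is the JSON you asked for: {"name": "Ben", "destination": "Mexico"}'
print(extract_json(reply))  # {'name': 'Ben', 'destination': 'Mexico'}
```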


Musk apparently lied about the DDOS that caused the X Trump stream failure.

https://www.theverge.com/2024/8/12/24219121/donald-trump-elo...

If that's true it's not exactly the sort of behaviour you want from an API you're depending on.


This article is hearsay trash. It quotes an anonymous source saying there was a "99 percent" chance Musk was lying about an attack.


Let’s review the evidence then shall we:

Evidence for DDOS:

- Elon said so

- the event in question very clearly had huge technical issues

Evidence Against DDOS:

- Elon said so

- People who worked at Twitter said it was bullshit

- every other spaces event that was run at the same time was unaffected.

- no other part of the website was impacted in any way whatsoever.


> People who worked at Twitter said it was bullshit

No, we have no idea from The Verge article whether the sources are even qualified to make such statements or if the statements are even true. In fact on the basis of the 99 percent speculative quote we can disregard the source quotes altogether. I'll say this, I work on far less significant software than X and we get DDOSed all the time.

> every other spaces event that was run at the same time was unaffected.

That's not true, I wasn't even able to load my feed during the initial part of the stream.


You seem to be invested in this topic in a weird and unhealthy way but there is nothing of value here in this comment.

You baselessly accuse journalists of straight up making things up and then go on to give some anecdotal evidence that conveniently nobody can disprove.


> every other spaces event that was run at the same time was unaffected.

> no other part of the website was impacted in any way whatsoever.

Aren’t these last two an argument FOR a DDoS attack? It seems reasonable to assume that, were there a DDoS attack at that time, it would be against the Elon/Trump stream explicitly.


I’d like to see an explanation of how it is even possible to achieve that level of targeting without knowing the connection details of either Elon or Trump. The rest of the attack surface is surely shared infrastructure with the rest of the website.

So no I think it was just a straight up technical failure on their end.


How did it clear up?


It quotes two sources, both who work at X.

The Verge has no political bias and a good reputation, and thus deserves the benefit of the doubt.


"The Verge has no political bias". Okay, in the same way that wired has no political bias. They're so unbiased yet you know exactly the way an article is slanted towards given the topic and persons. Just like I know the slant given a reddit /r/all post or Fox News/msnbc article.


Verge editors most definitely are biased as are all humans. Journalists are not neutral. In this case someone made a "99 percent chance" speculative statement and the publication decided to print it as if it were fact and not just dismiss it as coming from someone who knew nothing.

We know nothing about the sources, and writers are not above making stuff up. I could just as easily spin it on them: there's a 99 percent chance they made up the sources.


I think you'd struggle to find a human on this planet that isn't biased one way or another when it comes to Musk


They titled the article: The Elon Musk / Donald Trump interview on X started with an immediate tech disaster

If they were actually neutral, they'd phrase it more like: with technical difficulties.


I would consider the widely-publicised event not starting for 40 minutes due to technical issues to be a "tech disaster."


Trump called it a disaster when the same thing happened DeSantis, so I don't see a particular bias in play with that particular phrase.


Trump is both partisan and biased and doesn't claim to be neutral. Of course he was trashing things to do with his political opponents (he was running against DeSantis in the primary at the time).


Thanks for the clarification. I should have never commented on anything even remotely political, my bad!


The Register weighed in with a Yeah, Right skeptical attitude:

    The Register has found no evidence of a denial of service attack directed at X. Check Point Software's live cyber threat map does not record unusual levels of activity at the time of writing. NetScout's real-time DDoS map recorded only small attacks on the US.

    If a DDoS was indeed the reason for the delayed start of the event, it appears not to have impacted the rest of X's operations – there were plenty of posts commenting on the problems with the Space occupied by the interview. And Musk was tweeting from the very network said to be under attack.

Elon Musk claims live Trump interview on X derailed by DDoS: https://www.theregister.com/2024/08/13/trump_musk_livestream...

They also threw shade on the numbers:

    The interview commenced some 40 minutes after its advertised time. Live audience statistics reported 1.1 to 1.3 million attendees during the portions of the event The Register observed – although during the stream Trump claimed that the event had an audience of 60 million or more, exceeding targets of 25 million.


This is the reason we teach kids stories like The Boy Who Cried Wolf: it’s just such a fundamental thing that when you lie about absolutely everything all the time, people will never trust you again, even when you happen to be telling the truth one particular time.

And unfortunately both of these men are known for bullshitting more than anything else and have been now for a long time.


Fully Working Spaces coming next year, he swears.


By the time Trump made that statement, there were 60 million views, which is a different metric than active viewers.


Sure.

Was Trump a fool to count the people that took one look and changed channels, or a knowledgable and deliberate deceiver?


I mean, a 99 percent chance of lying is the Bayesian prior with this person anyway.


I have seen sus-column-r on LMSYS a bunch of times. It seemed pretty good, though not as good as the best Google, Anthropic, or OpenAI models.

I'm surprised they managed to catch up. I guess there really is no moat.


The moat is compute and Elon has enough money and connections to jump the queue with providers


Putting a new tool for developers behind an "enterprise API" gate is a sure way to kill it


"Our AI Tutors engage with our models across a variety of tasks that reflect real-world interactions with Grok. During each interaction, the AI Tutors are presented with two responses generated by Grok"

My guess is that they're using one of the third party AI training outfits for this and that they are paying through the nose.

This looks exactly like a training task I got to see on one of those platforms.


What are the big platforms for that?


So all models seem to converge to a similar level of performance - is this the end of the line for LLMs?


I’m hoping we’ll see an open release of this in 6 months or so, as we saw with Grok-1.

I’m not hugely optimistic, though.


I think it might happen, because it's just the code, and IMO that's not that valuable.

The more valuable part is the dataset, which probably requires a lot of people to hand-filter.

And even harder to replicate is the training rig, which your average person can't even afford.


If the X.AI team is able to build out a good enough model with access to real-time tweets, they could have an incredible product. I'd love to be ask about current events and get really strong results back based on tweets + community notes.


The results with Grok-1 were unimpressive summaries based on the tweets, with a 10%-20% hallucination rate (when inquiring about specific Paris Olympics events).

Yet to be seen whether this new model does any better in that regard.


I was also not very impressed, but I still think they are positioned to have a great product if they can get past the accuracy issues.


But when will it be available in my region (Europe)?


Just use VPN like other Europeans.

And also remember to vote for pro-development, not anti-development, parties in the next EU elections.


I'm in Europe and have access to it.


As soon as Elon ends his spat with Thierry Breton.


If I am reading the table correctly they are claiming it is better than all models but 3.5-Sonnet

Is anyone with X premium able to confirm the vibe check -- Is the model actually good or another case of training on benchmarks?


I don't think you are reading the table correctly. On LMSYS it's better than all models except the latest Gemini 1.5 Pro and GPT-4o. But there is a detailed benchmark table and different models win different benchmarks.

So results are mixed, but the real takeaway is this is a competitive model that is good enough to be worth using today. It puts xAI significantly above their previous position, and up near the top of the field with OpenAI, Anthropic, Google, and Meta. And their new H100 cluster should allow them to keep up with the next wave of releases, whenever that starts.


You can try it yourself on https://chat.lmsys.org (sus-column-r model)


In the intro text they described it as better than Claude 3.5 Sonnet and GPT-4 Turbo (which isn’t OpenAI’s current model).


I don't personally see how an individual can judge this at this point unless it is a huge leap.

More importantly, if the model is not a huge leap at this point I just don't care if it is as good as the very limited models we already have because I am not impressed by any of these anymore.

Anything less than a 3.5 to 4 jump from here is just not going to vibe for me.


My favorite part of that table is how they put 3.5-Sonnet all the way to the right of the table making it harder to compare.


As with everything promised by Musk, I'll believe it when I see it and use it myself and compare it to Claude 3.5.

Right now I'm not really a believer in Grok and I doubt it will be worth using.


Glad to see an uncensored AI able to compete with the other models.


They likely compete exactly because of this, because censorship eats the performance of a model.


Interesting that they’re rolling this out to Twitter/X Premium users, it was previously the biggest differentiator between Premium+ haves and Premium have-nots.


Seems like a solid result & more competition is always better.

That said I’m still cheering for mistral and meta with their more open stance


Twitter started irreversibly feeding users’ data into its “Grok” AI technology in May 2024, without ever informing them or asking for their consent.

https://noyb.eu/en/twitters-ai-plans-hit-9-more-gdpr-complai...


What does irreversibly mean in this context? It seems like negative connotations are implied, but I feel like it's like irreversibly baking a cake.


Once the data is "compressed" into the model it cannot be easily removed without starting the training over.


So you mean like

"He used one of my eggs to irreversibly make a cake"

It's true, but it would be kind of amazing if it weren't


Hmm, it's not that simple, is it? Let's say the AI is trained on the tweet "Ben Adams drove to Mexico yesterday but I still haven't heard from him."

From this knowledge, you can ask the AI "Who has driven to Mexico" and it might know that Ben Adams did, and reply with that.

HOWEVER it's also baked into the model and can't be surgically removed after a complaint. That's the irreversibility part. You can't undo isolated training. You need to provide it a new data set and train it all over again. They won't do that because it's too costly.

The problem with the above example is of course that it can also contain sensitive or private user details.

I've easily extracted complete song lyrics, to the letter, from GPT-4, even though OpenAI tries to put up guardrails against it due to copyright issues. AI is really still in its wild west phase...


The irreversibility is still important to highlight, as it is distinctively different from a similar consent issue with search: "Google indexed my website against my will, but I will just forbid them to include me in search results going forward".


It is irreversible similar to how a student reading a textbook from LibGen can remember and profit from that information forever. Kinda crazy how many in this community went from champions of freedom of knowledge to champions of megacorps owning and controlling of all of human creation in the span of like two years when it became clear other corporations could profit off that freedom too.


More like

"He used his eyes to irreversibly read this post"


If they use Twitter data does grok answer with a 280 character text?

Additionally, Twitter data is in my eyes mostly low-quality content; that's nothing I would want in an AI model.


> low quality content

How does it matter even if the quality is high or low? The point is user data was used without consent.


Yes, but that's nothing new; other AI models also used data they don't own. That doesn't make it better, but I think that's the path.


> If they use Twitter data does grok answer with a 280 character text?

That may be considered a feature.

ChatGPT seems reasonably concise, Gemini's answers tend to be verbose (without adding meaningful content).


I've led myself to believe that long responses are actually beneficial to the quality of the answers, as processing and producing tokens is the only time when LLMs get to "think".

In particular, requesting an analysis of the problem first before jumping to conclusions can be more effective than just asking for the final answer directly.

However, this analysis phase, or something like it, could be done hidden in the background, though I don't think anyone is doing that yet. From the user's point of view that would just be waiting, and from the API's point of view those tokens would still cost. Might as well entertain the user with the text it produces in the meanwhile.
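That hidden analysis phase can be sketched as two chained calls, where only the second response is shown to the user (`fake_llm` is a hypothetical stand-in for any real LLM API):

```python
from typing import Callable

def answer_with_hidden_analysis(complete: Callable[[str], str], question: str) -> str:
    """Two chained calls: the first produces reasoning tokens the user
    never sees; the second conditions the visible answer on them."""
    analysis = complete(
        "Think through this step by step; do not give a final answer yet:\n"
        + question
    )
    return complete(
        f"Question: {question}\n\nHidden analysis:\n{analysis}\n\n"
        "Now give only the final answer:"
    )

# Hypothetical stand-in for a real LLM call:
def fake_llm(prompt: str) -> str:
    return f"(model output for a {len(prompt)}-character prompt)"

print(answer_with_hidden_analysis(fake_llm, "Is 17 prime?"))
```

The user pays latency (and the API caller pays tokens) for the hidden first call either way, which is the trade-off described above.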


My understanding is this used to be the case[1] but isn't really true any longer, due to things like the "STaR" method for model training[2]. Empirically (circa GPT-3) it absolutely used to be the case that if you prompted with "Explain all your reasoning step by step and then give the answer at the end" you would get a better answer to a complex question than if you said "Just give me the answer and nothing else" or asked for the answer first. Then, circa GPT-4, answers started getting much longer even when you asked the model to be concise.

That doesn't seem to be the case any more, and there has been speculation that this is down to the STaR method being used to train newer models. I say speculation because I don't believe anyone has come out and said they are using STaR for training. OpenAI referred to Q* somewhere, but they wouldn't be drawn on whether that * is this "STaR", and although Google was involved in publishing the STaR paper, they haven't said Gemini uses it (I don't think).

[1] https://arxiv.org/abs/2201.11903

[2] https://arxiv.org/pdf/2203.14465


So did OpenAI, why is it only a problem when Twitter itself does it?


I'm pretty sure it's not. I'm pretty sure people have been angry about OpenAI doing the same thing for a while now.


Has it been proven that OpenAI used twitter for training? I know it knows about the popular tweets, but those are reported in many places, so could be ingested accidentally with other content.

(But regardless, many people raised an issue of OpenAI training from sources they shouldn't be allowed to access, so they're definitely a problem as well)


Twitter is bad, but it's not unlawful in their jurisdiction. Don't want it? Don't use it.


As someone from the EU, hearing this argument over and over from Americans is exhausting.

They provide a product in the EU, therefore they must either follow EU law or exit the EU market. Just like an EU company that provides a product in the US has to follow US law.


I am in the EU.

The line of 'following the law of another country' is a grey area on the internet, given that it goes both ways:

EU online companies providing services to US users fail to provide the free speech guarantees that the US laws afford their citizens. That's because all EU countries have more strict laws limiting free speech. Should the EU companies break their own countries' law to satisfy the US audience?


"EU online companies providing services to US users fail to provide the free speech guarantees that the US laws afford their citizens."

Exactly what is "free speech guarantees" in the context of a private business?


There are now states in the US which have passed laws to regulate social media censorship. The US Supreme Court has declined to rule on them or strike them down based on companies' First Amendment rights.

So it seems there are states where a European social network would have to abide by rules that would most likely contradict European laws, right?


What are these state laws, can you give me an example?



> EU online companies providing services to US users fail to provide the free speech guarantees that the US laws afford their citizens. That's because all EU countries have more strict laws limiting free speech. Should the EU companies break their own countries' law to satisfy the US audience?

Could you sharpen up this claim? Like suppose I run a microblogging site but I delete libellous posts and incitements to violence in accordance with my local European law. Am I violating a US law by allowing Americans to use the site?


I'm asking the same question.


My understanding of your post was that you know that it violates US law and so you're asking what should be done. What I am asking is if it really does violate US law, and if so how.


Can anyone tell me how much censorship grok has? I hate that many other LLMs have too much censorship.


There are abliterated versions on Ollama that you can use; some are more censored than others.

I didn't test any crazy prompts on Grok, but the image generation won't let you do things like naked people, for example.


Pretty funny to read the comments from xAI's initial announcement now.

https://news.ycombinator.com/item?id=36696473


Well, it should be clear to anyone who reads HN that the comments here are not to be taken as the most sensible opinion in all cases. Cynical outlook is not uncommon in some contexts, especially when it comes to Elon Musk & X ... and the thread you linked to is a stellar example of that.

PS: I have X Premium and I'm quite liking Grok 2.


But can it solve X's seemingly impossible engineering challenge? Stopping multiple porn bots attempting to follow me every day.


Seems like the only impossible problems AI can't solve are the ones that make metrics falsely look good to investors.


Ironically X shouldn't care about those metrics, it's privately owned by the world's richest person...


It's privately owned, but by a number of parties all with a lot of money on the line. That person may have pulled together the investors, but it most definitely has investors that it needs to answer to in the medium term. In the long term, they have stated that re-floating is a goal.


The valuation of the company matters to the banks and VCs who own X.

And the primary way to evaluate the worth of a social network is user engagement metrics.


Source, substantiation?


This is a good one, another one that blows my mind: when I use the "I'm not interested in this topic" button and refresh the page, it shows that very same post at the top of my feed?!

I had to resort to configuring "muted words" to actually fix the problem.

It's very weird to me that such basic things seem so bugged on a platform that popular, while other, much more complex things (mass video streaming, Grok itself) work totally fine?!


Idk. Ask it if you should close your account and see if it gives an honest answer.


[flagged]


Usually, the Chatbot Arena ELO is pretty safe and hard to twist.


That’s been wrong for a while, but it was affirmed when gpt-4o-mini beat out Sonnet 3.5. OpenAI fine-tuned 4o and 4o-mini to provide answers that meaningfully improve model congeniality but only trivially improve model intelligence.

Chatbot Arena ELO is a dead metric.


Wow, I had overlooked that GPT-4o-mini was that far up.

But if you change the category to Math (or something else hard), mini drops way down and Claude 3.5 Sonnet goes to the top.


[flagged]


What they used to train Grok on.

A thread. 1/n


>Don't tell me they use the pile of garbage that is twitter content.

Okay, I won't tell you.


We need an alt-right version of AI like we need a pumpkin spice sushiccino. No thanks but no thanks.


Can you explain the difference between "alt-right" and "right"?


When github release?


Realistically? Never. Grok-1 weights were released because it was quite bad compared to open source and closed models. Now that they have a competitive model, they won't give it away.


> Now that they have a competitive model, they won't give it away.

Llama 3.1 is competitive.


X is not Meta.


Guys, come on, you can't keep releasing software in the US and then do a staggered launch where things become available to users in England, Denmark, etc. months later. There should be no reason for it. I'm sure whatever dumb EU regulations exist can be dealt with easily in the software. These staggered releases (such as ChatGPT having no Memory etc. for EU users MONTHS down the line) are just a hindrance to progress. It's starting to feel like we live on an island in the middle of nowhere.


I'm pretty glad the EU at least tries to protect its citizens from the experiments of companies that only care about profit and not others' well-being.


What is it protecting us from, realistically? It's just a power play by the world's most incompetent private club (Brussels).


>most incompetent private club

Is that based on any facts or just the typical EU myths like the cucumber regulation?


The EU is not just Brussels. Can you point to some particular point at which Brussels has shown competence in the past 20 years?


Safest countries in the world: Check.

Countries with the highest average standard of living: Check.

Countries consistently scoring among, or as, the highest in citizen happiness indices: Check.

Stable Economy: Check.

Successfully implemented measures to combat a global pandemic: Check.

Slowly but surely phasing out dependency on Russian fossil fuels: Check.

Best privacy protection laws in the world: Check.

Shall I go on?


>Safest countries in the world: Check.

The safest countries in the world are in East Asia; there's no country in Western Europe where you can leave your wallet on the table in a major city and not worry about it being stolen.

>Stable Economy: Check.

It's stable in the sense that for most of western Europe (France, Italy, Spain, Portugal, Greece) GDP per capita now is no higher than it was 10 years ago. One of the few places in the world where people's material standard of living is no longer improving every year.


Those are individual countries doing great (some not so great, some economies bankrupted, etc., but still).

I asked about the Brussels bubble. Only the privacy laws are relevant, and I can't say they made a dent in our overall privacy, since our tech is US-based and we have no idea where our data really is.


> Those are individual countries doing great

These countries are doing great because they are part of the EU. If anyone disagrees, well, it's not like we lack experimental data:

https://www.london.gov.uk/new-report-reveals-uk-economy-almo....

https://www.gisreportsonline.com/r/brexit/

https://www.politico.eu/article/political-gridlock-northern-...


Typically, those stats usually showcase countries which are NOT in the EU: Denmark, Switzerland, Iceland. And anyway, European countries were like that before the EU.


> countries which are NOT in the EU: Denmark, Switzerland, Iceland.

Denmark is a member state since 1973, when it was still called the European Economic Community

https://en.wikipedia.org/wiki/Denmark#Constitutional_monarch...

Denmark, together with Greenland but not the Faroe Islands, became a member of what is now the European Union, but negotiated certain opt-outs, such as retaining its own currency, the krone.

And Iceland is part of the European Economic Area (EEA)

https://en.wikipedia.org/wiki/Iceland

Iceland joined the European Economic Area in 1994, after which the economy was greatly diversified and liberalised.

And no, this list showcases mostly countries that are in fact EU members: https://en.wikipedia.org/wiki/Member_state_of_the_European_U...


Well, it's protecting us from data mining, which is good. But I disagree with the clunky implementation.


The EU itself wants to data mine everything, just look at chat control.


That's the same kind of protection Apple is praised for.

Not perfect, but better than opening EU citizens up to global data mining.


Who praises Apple for this??? It's completely unacceptable for a self-described "free liberal democracy".


I've just realised people are upset because I called the EU regulations dumb. I want to clarify: they are an excellent idea, I truly mean it, but I think they should be OPT IN. It's more important for me to have access to cool tech than it is to have privacy, at this stage of my life. I guess my problem is the implementation.

Cookie popups on every website: it's completely idiotic. Of course I end up wasting hours of my life clicking on them randomly. There should just be a single OPT IN or OPT OUT for cookie popups, either on an EU portal or something.

In fact, I find it fascinating that no one is coming up with a better implementation of privacy protections in the EU. This whole thing of the US not releasing software in case they get sued, while thinking it's fine for the whole EU population to go around clicking popups every 2 seconds, is a total failure that needs to be fixed.

Again, the idea is good, but the implementation needs to be fixed.


Making it optional does not change the behaviour of bad actors. Private corporations already run rings around governments on tax and regulation.

If you really have to experience Musk's chat bot on day one just use a VPN?


alright


>Cookie popups on every website, its completely idiotic.

That's not because of the EU, that's because the website owner try to annoy you to blame the EU.

If they stop tracking no pop up is needed.

>Its more important for me to have access to cool tech

It's hardly possible to give away only your own data without exposing third parties who may not be too keen on sharing theirs.

FB already showed that. I bet WhatsApp already has my phone number without my consent because some relative of mine uploaded his contacts.


This is how we encourage migration to the US. Early access to software and video streaming, superior Amazon experience, and much fewer cookie warnings (though we still have a lot).

You'll have to say goodbye to BBC iPlayer though.


Well, it’s working on me; I’m starting to consider it. TBF I should have moved to SV a decade ago.


It's utterly hilarious to think that people will move to America in its current state for... early access to TV shows, and fewer cookie warnings?


I mean, they probably don't like many small differences, like the recent notion that they can be arrested for what can be considered thought crimes, in a sense... or a number of other odd things (note: I am an immigrant and I have little understanding of politics anyway).

I'm just saying the regulatory landscape seems to be changing in many ways, and there are many reasons why people would consider emigrating, not just the fact that they have to wait months for tech access in a rapidly changing technical landscape.


The EU is on a regulation binge; there's not much that can be done.

Also, be careful: you might be arrested for these comments in a few years' time (lol).


Fortunately we already have Anthropic etc so this new release isn't really relevant or useful


Why would another competitive option not be a good thing?


Did you just criticize the glorious candy-colored EU bureaucracy on Orange Reddit, of all places?


To be honest, I love the occasional surprise downvote; I never try to say anything controversial, so it's a good laugh when somehow I do, I guess.



