ersiees · 2024-07-16T12:07:50

I think I was not very specific, but I think there are a lot of videos on YouTube that do not make any money for their producers. In the past YouTube did not show ads on these videos, but now it shows ads even if the producers of the content don't receive any money. I watch mostly lectures and niche videos on YouTube, and I am pretty sure that for most of these videos the only entity getting money out of it is YouTube.

ersiees · 2024-07-11T13:02:49

Check out Lektor.lol, an open source wrapper around ChatGPT I created just for that :)

ersiees · 2024-01-31T12:40:32

more of a rural myth, I guess ;D

ersiees · on July 24, 2023

This trick “they found” is part of the standard torch implementation of multi-head attention, where it is called `add_zero_attn`. They add a zero to the logits, resulting in a one in the denominator, as e^0 = 1. https://pytorch.org/docs/stable/generated/torch.nn.Multihead...
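
A minimal sketch of why appending a zero logit is the same as adding 1 to the softmax denominator. This is plain Python rather than torch, and the helper names `softmax` and `softmax_plus_one` are made up for illustration; it only shows the arithmetic, not how torch implements the option internally.

```python
import math

def softmax(logits):
    # Standard softmax, shifted by the max for numerical stability.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def softmax_plus_one(logits):
    # exp(x_i) / (1 + sum_j exp(x_j)), using the same max-shift trick.
    # The "1" is exp(0), so it becomes exp(-m) after shifting.
    m = max(max(logits), 0.0)
    exps = [math.exp(x - m) for x in logits]
    total = math.exp(-m) + sum(exps)
    return [e / total for e in exps]

logits = [-2.0, 0.5, 1.5]  # arbitrary example values

# Append a zero logit, take the softmax, then drop the weight of the extra slot.
with_zero = softmax(logits + [0.0])[:-1]
plus_one = softmax_plus_one(logits)

print(with_zero)
print(plus_one)
```

Both lists should agree up to floating-point error, and the remaining weights sum to less than 1 because the extra zero slot absorbs the rest.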

lovelearning · on July 25, 2023

I find its documentation quite poor though: "If specified, adds a new batch of zeros to the key and value sequences at dim=1."

Doesn't describe the implications even briefly. If they add just your second sentence to that description, it'll immediately become so much more useful.

civilized · on July 24, 2023

It's an option which is set to false by default. Does that mean people have tried it and it's not usually helpful...?

blackkettle · on July 25, 2023

It probably means they have tried it for _some_ purpose, but not necessarily the one described in OP's post here. The claim is that this is specifically useful for quantization. It seems reasonable to assume that this would have initially been tried and potentially discarded for having little or no impact on general accuracy. But that's a different issue. I suppose we'll hear something definitive in a month or so.

mlyle · on July 24, 2023

quickthrower2 · on July 24, 2023

Can you elaborate? (It wouldn't be the first time there was an extraneous feature that no one has ever used in some code!)

thomasahle · on July 25, 2023

If you take the inner product between a lot of more or less random vectors (the key and query vectors in attention) most values are going to be close to 0. This means they contribute by e^0 to the denominator. Now, if you have a context length of say 2000, your denominator is already ~ 2000. Increasing it to 2001 doesn't really make a difference.

Adding 1 to the denominator can be useful if you have softmax with just a few options. Not in self-attention where you have thousands.
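
To put rough numbers on this, here is a small sketch under the simplifying assumption that every logit is exactly 0 (so each contributes e^0 = 1). The function name `weight_shrink` is hypothetical; it returns the relative amount by which every attention weight shrinks once 1 is added to the denominator.

```python
import math

def weight_shrink(logits):
    # Each weight goes from exp(x_i) / D to exp(x_i) / (D + 1),
    # so the relative shrink is 1 - D / (D + 1), independent of i.
    denom = sum(math.exp(x) for x in logits)
    return 1.0 - denom / (denom + 1.0)

long_context = weight_shrink([0.0] * 2000)  # denominator 2000 -> 2001
few_options = weight_shrink([0.0] * 4)      # denominator 4 -> 5

print(f"2000 keys: weights shrink by {long_context:.4%}")
print(f"   4 keys: weights shrink by {few_options:.4%}")
```

With 2000 keys the change is a fraction of a percent, while with 4 options it is a fifth of each weight, which matches the intuition above.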

quickthrower2 · on July 25, 2023

That simple comment is a strong counterpoint to the entire blog post?

Except with the +1 denominator, it might be that the model trains all of the inputs to become very negative so softmax chucks out close to zeros, whereas it wouldn't bother before because making one prob bigger makes another smaller.

thomasahle · on July 25, 2023

> it might be that the model trains all of the inputs to become very negative

It still can't do this because of L2 regularization / weight decay. If two vectors are norm 1, their inner product is at least -1, so with 2000 vectors that's still 2000 * e^(-1) =~ 735.

Not saying it's theoretically impossible that it could happen. But you would have to try _really_ hard to make it happen.
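
The bound can be checked with a few lines of arithmetic. This sketch assumes, as in the comment, unit-norm vectors (so every logit is at least -1) and ignores any extra scaling of the logits; the variable names are only for illustration.

```python
import math

n = 2000          # context length used in the example above
min_logit = -1.0  # smallest inner product of two unit-norm vectors

# Smallest possible softmax denominator when every logit sits at the minimum.
denom_floor = n * math.exp(min_logit)

# Largest relative change in any weight from adding 1 to that denominator.
max_shrink = 1.0 / (denom_floor + 1.0)

print(f"denominator floor: {denom_floor:.1f}")
print(f"max relative shrink: {max_shrink:.4%}")
```

So even in this worst case the extra 1 moves each weight by well under one percent.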

redox99 · on July 25, 2023

I guess you could add a sort of gating operation with a learnable parameter that sends the value to -inf if it doesn't reach the threshold.

Of course it might have some other serious repercussions.

Q6T46nT668w6i3m · on July 24, 2023

It’s useful but it’s less used than dummy tokens.

thomasahle · on July 25, 2023

Are dummy tokens just tokens that don't have an associated input/output token? Like, a way to give more computational power to the model without splitting the text into more actual tokens?

dijksterhuis · on July 25, 2023

TL;DR sort of yes. But they're also useful for reasons not related to computational "power".

An example here with an actual algorithm, although it's been a couple of years so my explanation might be a bit wrong in places. and/or i might have gotten the completely wrong end of the stick with the current thread.

--

The CTC (Connectionist Temporal Classification [0]) algorithm maps a sequence x with length X -> sequence y with length Y.

i.e. in speech to text we might have some audio features that correspond to the following class predictions (post softmax classification)

    x -> hellllloooooooooo wwwooorrrllld

we want to get this as the output

    y -> hello world

we have the alphabet as classes we try to predict for each sequence item in x.

we could just remove all the duplicates in the first long sequence, but we would end up with `helo world` ... we need to preserve one of the early `l` characters in `hello` somehow

CTC uses a blank (aka dummy) token to handle potentially deliberately repeated items in sequence x.

By adding the blank token to the classes predictions, we can get the model to predict something like this (post softmax classification)

    y* -> hel~l~~oooo~~~~~~ w~~o~~r~~l~~d

The CTC decoder (non-ML decoding algo) heuristically removes repeated tokens. Turning the above into ...

    y -> hello world

... the duplicate `o` and `~` characters are removed.
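
A minimal sketch of that collapse step, assuming the simplest (greedy, best-path) form of decoding: merge runs of identical symbols, then drop the blanks. The function names and the use of `~` as the blank symbol just follow the example above and are not taken from any particular library.

```python
from itertools import groupby

BLANK = "~"  # blank/dummy symbol, as written in the example above

def naive_dedupe(frames):
    # Merge every run of identical symbols into one symbol.
    return "".join(symbol for symbol, _ in groupby(frames))

def ctc_collapse(frames, blank=BLANK):
    # Step 1: merge runs of identical symbols (blanks included).
    merged = (symbol for symbol, _ in groupby(frames))
    # Step 2: drop the blanks. A blank between two identical letters
    # is what keeps them from being merged in step 1.
    return "".join(symbol for symbol in merged if symbol != blank)

print(naive_dedupe("hellllloooooooooo wwwooorrrllld"))
print(ctc_collapse("hel~l~~oooo~~~~~~ w~~o~~r~~l~~d"))
```

The first call loses the double `l`, while the second keeps it because of the `~` between the two `l` runs.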

It was a decent enough algorithm for speech-to-text prior to attention/transformers etc.

However, it makes CTC vulnerable to well designed adversarial example attacks because there is a massive bias within models to predict the blank token -- meaning it's very easy to modify input sequence x to switch the output sequence y to include blank tokens for nefarious purposes (the subject of my unfinished phd).

[0]: www.cs.toronto.edu/~graves/preprint.pdf

thomasahle · on July 25, 2023

> By adding the blank token to the classes predictions, we can get the model to predict something like this (post softmax classification)

> y* -> hel~l~~oooo~~~~~~ w~~o~~r~~l~~d

This is a great solution. Though that's a dummy token in the output rather than the input. I guess you could do something inverse to do text to speech, but it might be hard to say where to insert the dummy tokens in that case.

janalsncm · on July 25, 2023

Nice catch! Hopefully OP will see this.

fstokesman · on July 25, 2023

https://en.wikipedia.org/wiki/Multiple_discovery

ersiees · on April 8, 2023

Very interesting that someone finally tries out muP in the real world. Do I understand the usage correctly:

muP is only used to get around choosing an lr for each size? Here I wonder how it compares to standard heuristics like the one in the OG scaling laws paper by OAI and tricks like rewinding a few steps after a loss explosion.

For some reason muP was not trusted with the largest trainings? Why is that?

ersiees · on March 19, 2023

Why do we call people expats and not rich migrants or something like that?

amrocha · on March 19, 2023

Because US and EU citizens think that "immigrant" is a dirty word for people from poor countries.

To all the other people talking about intent to settle or not: read the post. The author lived in China for 20 years and built a family there. They're not an "expat" under any technical definition of the word.

lmm · on March 19, 2023

> To all the other people talking about intent to settle or not: read the post. The author lived in China for 20 years and built a family there. They're not an "expat" under any technical definition of the word.

And yet he's leaving. He calls the US "home". So evidently he was an expat.

amrocha · on March 20, 2023

He repeatedly talks about abandoning his plans to build a life in China. Changing plans when the government decides you're the enemy doesn't invalidate decisions taken before that.

lmm · on March 20, 2023

> He repeatedly talks about abandoning his plans to build a life in China.

No he doesn't. He says it once, in the tagline, and gives no details.

def_true_false · on March 20, 2023

EU citizens mostly don't think in English at all... In most languages you would call someone who moves abroad an emigrant or something similar. From the point of view of the new country, they are obviously an immigrant. The word for people from poor countries is refugee or "economic migrant".

amrocha · on March 20, 2023

That's true in their native language, but it's common for high status Europeans to use the term expat when speaking English as well. And in my experience, most European immigrants speak English to a conversational level.

It's likely not an intentional choice, but a learned one, since other westerners around them use that term.

refurb · on March 20, 2023

This is silly. I migrated to the US from a wealthy country and I am an immigrant. I've also lived temporarily in other countries, and therefore was an expat.

It's not hard to understand.

amrocha · on March 20, 2023

Well yes, going by the dictionary definitions that's true, and I commend you for adopting the immigrant term. I'm an immigrant too, and proud of it.

At the same time, it's also clear that the status of the country determines whether people call themselves immigrants or expats. In the US people are immigrants, in Japan they're expats, for example. US citizens in particular very rarely refer to themselves as immigrants in my experience.

jimbob45 · on March 19, 2023

Expats expect to come home at some point in their lives. Emigrants (the word I think you were looking for) leave for good.

https://english.stackexchange.com/questions/97835/difference...

carlmr · on March 19, 2023

>Emigrants (the word I think you were looking for)

Emigrants leave and immigrants come. It's two sides of the same coin. You can't emigrate without immigrating. Except if you manage to become stateless and remain on international territory.

Migrants is just the term without the direction. Which is unnecessary if you're not talking about the country they leave or move to.

hahaxdxd123 · on March 19, 2023

If he had a family there, it seems to me like he was a migrant until the political circumstances forced him out.

908B64B197 · on March 19, 2023

There's a legal distinction. At least in America.

An immigrant has an intent to settle, an expat doesn't. In America, one could theoretically lose work authorization should he refer to or present himself as an immigrant.

adastra22 · on March 19, 2023

Uh, no. The only legal definition of expat in US law is someone who gives up their citizenship or green card.

908B64B197 · on March 19, 2023

Technically yes. The legal term for someone on temporary work authorization without an intent to immigrate would be something like "non-immigrant alien resident". The common term would be... expat.

adastra22 · on March 20, 2023

The common term, yes. But the legal definition of expat is:

2) Expatriate The term “expatriate” means— (A) any United States citizen who relinquishes his citizenship, and (B) any long-term resident of the United States who ceases to be a lawful permanent resident of the United States (within the meaning of section 7701(b)(6) ).

cutemonster · on March 19, 2023

That's what I've read, too, i.e. intent to settle, or not.

Maybe wanting to move forever (i.e. migrant) is more common, if one is from a poorer country, which could explain why some others here thought that "immigrant" implied "poor and less educated".

hayst4ck · on March 19, 2023

While there are very jaded views about cultural superiority and other things like that, I think the truth is probably less cynical.

I can move to Japan and marry a Japanese wife, work for a Japanese company, pay Japanese taxes, and speak fluent Japanese, but I will never be Japanese because my skin is white.

People can move to America, and as long as you speak English without an accent, there is an assumption that you are American.

If the place you move will assimilate you, then I would call that migrancy. If the place you move will never accept you, then I would call that ex-patriotism.

People who move to civil societies migrate, people who move to ethnic societies become ex-patriots.

I strongly recommend reading this: https://www.amacad.org/publication/what-does-it-mean-be-amer...

It really puts into perspective conservatism and liberalism by showing their contextual effects on immigration.

nitwit005 · on March 19, 2023

How accepted you are has nothing to do with you being an immigrant or not. You immigrated. It's the act of living permanently in a foreign country.

hayst4ck · on March 20, 2023

We're talking about a colloquial understanding and use of those words, not a technical definition.

andrekandre · on March 19, 2023

  > I will never be Japanese because my skin is white.

for some reason i find this statement kind of strange... are you saying "japanese" to mean "fit in as japanese" or "japanese nationality"?

  > and as long as you speak English without an accent, there is an assumption that you are American.

n=1 but this has not been my experience....

  > and speak fluent Japanese

without accent?

adastra22 · on March 19, 2023

Not the person you are replying to, but I have experience here. It’s both. For most Japanese, nationality is intrinsically tied to ethnicity, and cannot be changed.

hayst4ck · on March 20, 2023

> n=1 but this has not been my experience....

I don't doubt your experience. I was projecting myself onto others, probably too much...

America is a big place. If you don't have that experience in big west coast cities, I would be a bit surprised. Likewise, I would expect well educated folks to also assume that lack of accent means American, not in an intentional way, but an automatic one.

There are large swathes of the US that I doubt would see anyone who isn't a white evangelical christian as American.

My intention was not to be black and white or absolutist, though I see how what I said reads that way.

> without accent?

Yes

> are you saying "Japanese" to mean "fit in as japanese" or "japanese nationality"?

In a way, I meant both. I meant for a Japanese person to apply the word "Japanese" to me casually. To see me as part of "us" when a Japanese person says "us" to mean Japanese.

I think if you read the article I linked, what I am trying to express will be clear. It is an extremely meaty read, but I feel like it is somewhat like taking the red pill when it comes to understanding politics and the political forces that govern us. American education has poked at the ideas in that article without providing the philosophical basis. I've heard "diversity" so many times, but never a real, non hand-wavy, explanation of why diversity is important or why we apply energy to it.

justusw · on March 20, 2023

I would argue that the path to Japanese citizenship is easier than being seen as Japanese. One is a pre-defined legal process, the other one depends on the complexities of what is cultural identity.

rendang · on March 20, 2023

The word is spelled expatriate, it isn't a derivative of "patriot" btw

hayst4ck · on March 20, 2023

Thank you for the correction. I think I typed "expatriot" and used the closest suggestion from the right click menu without thinking about it too much. Embarrassing, but such is life.

kwere · on March 20, 2023

funnily, foreigners can never acquire Chinese citizenship unless they have some Chinese ancestry and/or enough connections in high places. Legal naturalizations number a few hundred a year, out of a foreign population of 850 thousand

dariosalvi78 · on March 19, 2023

that's what migrants who are not poor like to call themselves

nonethewiser · on March 19, 2023

Well, many aren't rich for starters

mr90210 · on March 19, 2023

- Basketball players are athletes
- Not all athletes are basketball players

favaq · on March 19, 2023

[flagged]

varjag · on March 19, 2023

You sure your ESL class is a bigger contribution to a host economy than a skilled welder?

favaq · on March 19, 2023

[flagged]

varjag · on March 20, 2023

Why, you can still be useful but only marginally so.

For instance, the demand for ESL classes is largely driven by the school curriculum, which in China is set by the government rather than organic market forces. Welding, on the other hand, is necessary for the physical production of certain goods. One can argue about second-order effects from learning a language, but from a purely libertarian economic POV a welder creates more added value.

CydeWeys · on March 20, 2023

All those immigrant tech workers in the US are assuredly adding more to the economy than they are taking. (And yes, they are immigrants on the path to US citizenship, not expats.) Your argument makes no sense.

ClumsyPilot · on March 20, 2023

All the Corporate lobbyists must be immigrants then!

varjag · on March 19, 2023

Expat == anglo work migrant

ersiees · on Nov 22, 2022

What about insider trading? Isn’t there a big risk that people at the institutions you use as sources place bets at the last minute before they announce the critical information? Or maybe even change what they publish (maybe pretending it was an honest mistake, which is corrected the next day) to win?

tmansour · on Nov 22, 2022

Good question.

Among a number of other safeguards that I mentioned in a reply above, we often use well established and reputable data sources that either 1) already have restrictions on their employees trading on the event or 2) we enter into data licensing agreements with and require those restrictions to be put in place.

We also run KYC and pass all participants through a Politically Exposed Persons (PEPs) list, which allows us to flag people who are potentially close to many of our data sources (BLS, NASA, MTA, etc.).

Our surveillance systems also do a great job of flagging weird activity (more in the post above) and anyone who we find to have done something wrong can be fined all the way to criminally prosecuted by the CFTC.

In short, a lot of similar safeguards to what you have against insider trading in stocks.

ersiees · on Nov 22, 2022

Can I also trade “Will Chelsea win against Barcelona next weekend?”

tmansour · on Nov 23, 2022

We don't do sports

ersiees · on Nov 16, 2022

I can’t stand this twitter behaviour: something new comes out, they try to find examples where it fails and say it is generally bad because of these failures. Have some respect for the work of others.

sdflhasjd · on Nov 16, 2022

I think this is just an example of Twitter's general tendency towards extremes: it's either absolutely terrible, or it's literally the most amazing thing ever.

Unfortunately, it's easier to break things than to provide more constructive critique.

mattnewport · on Nov 16, 2022

I couldn't disagree more. The thing being criticized is so much worse than the criticism here, I actually think the criticism is rather too measured.

julienreszka · on Nov 16, 2022

I agree. This kind of behavior is of poor taste.

ersiees · on Nov 16, 2022