I work with an extremely effective machine learning engineer, and the biggest thing I've learned is how far you can get with vibes, even in a more traditional ML situation.
He invests most of his time visualizing the inputs and outputs of his systems very carefully, instead of focusing super heavily on metrics; this is a lot more effective than I thought it would be, particularly in the early stages of a project.
Measuring things correctly is hard. Spending time measuring things before you know what should even be measured, or how it should be measured, is almost surely wasted effort early in a project.
That's just the default. You can set max_seq_len to 8k. From the readme [0]:
> All models support sequence length up to 8192 tokens, but we pre-allocate the cache according to max_seq_len and max_batch_size values. So set those according to your hardware.
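For reference, the repo's example scripts pass these when building the model; a minimal sketch (the checkpoint and tokenizer paths are placeholders for wherever you put the weights, and the repo's examples launch this via torchrun):

```python
from llama import Llama

# Pre-allocate a bigger KV cache; paths are placeholders.
generator = Llama.build(
    ckpt_dir="Meta-Llama-3-8B/",
    tokenizer_path="Meta-Llama-3-8B/tokenizer.model",
    max_seq_len=8192,   # up to 8192 per the readme
    max_batch_size=4,   # cache is allocated for seq_len * batch
)
```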
Tooling around embeddings has improved. Creating and fine-tuning custom embeddings for your tabular data should be easier and more powerful these days.
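For instance, a minimal PyTorch sketch of learned embeddings for the categorical columns of a table, concatenated with the numeric columns (the column counts and cardinalities here are made up):

```python
import torch
import torch.nn as nn

class TabularEmbedder(nn.Module):
    def __init__(self, cardinalities, dim=8, num_numeric=3):
        super().__init__()
        # one learned embedding table per categorical column
        self.embeds = nn.ModuleList(
            nn.Embedding(card, dim) for card in cardinalities
        )
        self.head = nn.Linear(dim * len(cardinalities) + num_numeric, 1)

    def forward(self, cat_idx, numeric):
        # cat_idx: (batch, n_cat) integer codes; numeric: (batch, n_num)
        parts = [emb(cat_idx[:, i]) for i, emb in enumerate(self.embeds)]
        return self.head(torch.cat(parts + [numeric], dim=1))

model = TabularEmbedder(cardinalities=[10, 50])
scores = model(torch.randint(0, 10, (4, 2)), torch.randn(4, 3))
```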
Not only is it not constant time, it's not even polynomial - it's psuedo-polynomial. Also it'll fail on negative numbers, right? You'll need something like `10000 * log(time - min(times) + 1)` to be linear in the bits used to represent the inputs.
> In computational complexity theory, a numeric algorithm runs in pseudo-polynomial time if its running time is a polynomial in the numeric value of the input (the largest integer present in the input)—but not necessarily in the length of the input (the number of bits required to represent it), which is the case for polynomial time algorithms.
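A sketch of that fix in Python (the scale constant and the log shift are just the idea above; note the log squeezes nearby values together, so this is even less reliable than plain sleep sort):

```python
import math
import threading
import time

def sleep_sort(xs, scale=0.01):
    """Each value gets a thread that sleeps proportionally to it, then
    appends itself: runtime grows with the values themselves, which is
    exactly the pseudo-polynomial behavior."""
    lo = min(xs)  # shift so negative numbers work
    out, lock = [], threading.Lock()

    def worker(x):
        # log keeps the sleep roughly linear in the value's bit length
        time.sleep(scale * math.log(x - lo + 1))
        with lock:
            out.append(x)

    threads = [threading.Thread(target=worker, args=(x,)) for x in xs]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return out

print(sleep_sort([3, -1, 4, 1, 5]))  # [-1, 1, 3, 4, 5] (usually)
```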
> Not only is it not constant time, it's not even polynomial
You understand that this is part of the joke, right?
If we really want to get down to the details and kill the joke, then you don't actually need to wait real time. Computational complexity is concerned with steps in a computational model, not how much time passes on a clock. Sleep sort uses OS scheduler properties and in a virtual time environment, time advances to the next scheduled event. That's what brings you back to actual polynomial complexity, if you assume this kind of thing as your computational model.
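A sketch of what that looks like: sleep sort under a toy discrete-event simulator, where "sleeping" just schedules a wakeup and virtual time jumps straight to the next event. The event queue is a heap, so it degenerates into heapsort, O(n log n) real steps:

```python
import heapq

def virtual_sleep_sort(xs):
    # each "thread" sleeps until virtual time x, i.e. schedules an event
    events = [(x, x) for x in xs]
    heapq.heapify(events)
    out = []
    while events:
        _, x = heapq.heappop(events)  # advance the clock to next wakeup
        out.append(x)
    return out

print(virtual_sleep_sort([3, 1, 4, 1, 5]))  # [1, 1, 3, 4, 5]
```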
> - it's psuedo-polynomial.
If you lecture people then please at least get your spelling right.
I still don't get it. Sleep sort still needs O(n) Haskell-steps, while sorting in bash, for example, is just 2 bash-steps: calling `sort` and receiving the output.
I fail to see the joke, really. I only see false and nonsensical statements, which could still be funny or interesting, but I don't see how.
The "constant time" is wall clock time, where it will be at least the biggest value times the constant microseconds plus like 5% for overhead like the garbage collector.
Complexity analysis deals with asymptotic cases where N gets arbitrarily large. Sleepsort would take the same wall time to sort [1,2] and [1,2,1,2] - but it would presumably take somewhat longer to sort an array of 1e10 ones and twos, because it's not a constant time algorithm.
On the joke part, sleepsort is intrinsically an extremely funny concept and I think everyone here gets that. But "constant time" has a rigorous/pedantic definition, which sleepsort doesn't meet, so I think for some readers calling it that kills the joke (in the same sort of way that it would kill the joke if TFA's code snippets used lots of invalid syntax).
I'd never seen sleepsort before, so I thought it was funny. ;-)
I like the idea of (ab)using the scheduler to sort numbers.
Now I'm inspired to make my own "sorting" routine, maybe touching filenames and then listing them with `ls -v`, or rendering triangles and then reading them back in z-order, or stuffing frames into a video and then playing it back.
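The first one might look roughly like this sketch (assuming GNU `ls`, whose `-v` flag does a natural numeric sort of names; note duplicates collapse, since filenames are unique):

```python
import os
import subprocess
import tempfile

def ls_sort(xs):
    # "sort" by creating empty files named after the values and letting
    # GNU ls -v list them back in natural numeric order
    with tempfile.TemporaryDirectory() as d:
        for x in xs:
            open(os.path.join(d, str(x)), "w").close()
        listing = subprocess.run(["ls", "-v", d],
                                 capture_output=True, text=True)
        return [int(name) for name in listing.stdout.split()]

print(ls_sort([30, 4, 100, 25]))  # [4, 25, 30, 100]
```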
Surely any pseudo-polynomial problem can be done in P, it's just a matter of encoding? As in, if you give me a box that computes whatever in pseudo-polynomial time, then I can give you back a box that takes a single input consisting of 1s to the length of each value, delimited by 0s, turns those back into integers (linear), uses your box, and returns its output - but now it's polynomial in the length of my input?
(I said integers, but I don't think that's significant, just a matter of encoding scheme - we could use a single 0 for a decimal point and two for delimiting inputs, say.)
But anyway isn't the joke that sleep-time doesn't count, because the computer could be doing something else? It's actually quite compelling in a 'this is stupid, I love it' sort of way.
You can indeed make the distinction between polynomial and pseudo-polynomial stop existing by enforcing that the inputs are in unary. But you haven't made anything faster, you've just made almost all of them worse (every input is now exponentially longer).
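Concretely, the unary re-encoding is trivial; the "speedup" comes entirely from the input length now growing with the values themselves (a sketch):

```python
def to_unary(nums):
    # n becomes n ones; values are delimited by a single 0
    return "0".join("1" * n for n in nums)

def from_unary(s):
    return [len(run) for run in s.split("0")]

xs = [3, 0, 5]
encoded = to_unary(xs)  # '1110011111'
assert from_unary(encoded) == xs
# len(encoded) grows with the *values*, so an algorithm polynomial in
# max(xs) is now "polynomial" in len(encoded)
```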
Some ways to make recipients feel more comfortable:
- You can suggest some other contribution. "Would you mind bringing snacks? / Would you mind handling music on the drive? / Would you mind giving X a ride?"
- You can allow them to reciprocate in less expensive situations, like taking the check when you are at a cheaper place.
True in this situation, but note that intermediate activations and gradients do take memory, and in other contexts that's the limiting factor. For example, purely convolutional image networks generally take fixed-size image inputs, and require cropping or downsampling or sliding windows to reach those sizes - despite the convolution parameters taking constant memory for any input image size.
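A quick PyTorch illustration: the conv layer's parameter memory is fixed, but its activation memory grows with the spatial size of the input (sizes here are arbitrary):

```python
import torch
import torch.nn as nn

conv = nn.Conv2d(3, 64, kernel_size=3, padding=1)
n_params = sum(p.numel() for p in conv.parameters())
print(f"parameters: {n_params}")  # constant regardless of input size

for side in (64, 256, 1024):
    x = torch.randn(1, 3, side, side)
    y = conv(x)
    # activation elements scale with H * W, and gradients scale the same way
    print(f"{side}x{side} input -> activation elements: {y.numel()}")
```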