More

zenogais · on Nov 24, 2021

Like the author said this is completely unoptimized. The natural next step in optimization might be to profile and then SIMD optimize the slow bits in compression and decompression. This would likely produce a significant speedup and may even bridge the gap with lz4.

phire · on Nov 25, 2021

The algorithm is extremely resistant to SIMD optimizations.

Every pixel uses a different encoding, 95% of the encodings rely on the value of the previous pixel, or the accumulated state of all previously processed pixels. The number of bytes per pixel and pixels per byte swing wildly.

SIMD optimization would basically require redesigning it from scratch.

sitkack · on Nov 25, 2021

SIMD only gets you up to the width that your hardware platform supports and every SIMD program has to be rewritten for the new width.

Two other immediate avenues are multithreading, which think could be quite effective for this algorithm or GLSL, of that I have no opinion.

zenogais · on Nov 13, 2021

What you're looking for exists today and is called a stable-coin (for example, USDC).

zenogais · on April 29, 2021

Agree with this and I think it's the crucial point. Where I've had the most success with UML-like tools and diagrams is early on in a project since its much faster to rip up and rebuild a few diagrams. The diagrams also help with getting consensus among stakeholders that this direction makes sense and doesn't have any glaring holes.

Later on you can drop the diagrams entirely and perhaps keep them as some onboarding documentation if they're still useful.

zenogais · on Oct 20, 2020

But it links to the Coinbase blog post in the "Inspirations" section on the front page?

aravindet · on Oct 20, 2020

Ah yes, I missed that. Thanks.

zenogais · on Oct 6, 2019

Cars aren't killing people though, it's the drivers of those cars. Such an odd headline.

Better title: "Why are drivers killing more pedestrians?"

licebmi__at__ · on Oct 6, 2019

>Better title: "Why are drivers killing more pedestrians?"

I disagree. That title lacks information and might lead to some people to think drivers are beating the pedestrians to death or maybe they're more prone to gun violence against pedestrians.

Maybe, a better title could be: "Why are drivers who are currently driving a car, pedestrians currently walking on the street, by running them over with said car"

zenogais · on Sept 9, 2019

Type generic expressions [1] already exist in C11. You can do this without a special compiler front-end if you need it, you would just need to define each variant by hand. These are supported in GCC and Clang with the -std=c11 flag.

[1]: http://www.robertgamble.net/2012/01/c11-generic-selections.h...

AndrewBissell · on Sept 9, 2019

I developed these ideas a bit further with the use of XX macros, which allow you to make a set of functions generic across some list of types. There can be real performance wins to this over void*: https://abissell.com/2014/01/16/c11s-_generic-keyword-macro-...

flukus · on Sept 10, 2019

That looks handy. I got bored one day and built a fluent interface: http://flukus.github.io/fluent-interfaces-for-c.html

okl · on Sept 9, 2019

_Generic has been created for a specific use (tgmath.h) and is not very powerful.

zenogais · on Aug 21, 2019

I wanted to like this article because I'm been thinking about this a lot in the context of game development, noticed a few things. One thing I'll say from briefly playing with this - the code leaves lots out a looks ostensibly simpler than it really is. Would very much appreciate tips / pointers on this or a more fleshed out and working implementation of the code.

For the bulk data with holes code:

First, there's an initialization step that has to happen the first time you allocate your bulk_data_t. Namely, you need to iterate through every item in the list and set its next_free item to the item following it, looping the last item back around to zero. You also need to do this for all the items between the new size and old size every time you resize your item list.

Second, safe iteration over all of the bulk data doesn't seem possible without adding some sort of flag to indicate whether or not an item is free.

Am I missing something here?

gpderetta · on Aug 21, 2019

No need to preinitialize the list. When allocating an element you have two cases: a) the free list is non empty: you pop an element by making the head point to its successor; b) the free list is empty: you increase the size of the vector by one.

To deallocate an element you simply set the elemnt next to the whatever is the current head, then make head point to this element.

You start with an empty vector.

An element is either allocated or in a free list, that's why the next pointer can be kept in an union.

dhruvrrp · on Aug 21, 2019

I would say it depends on how one might want to handle it, like when you create an item_t you set next_free = -1 as a flag to indicate that it is not free. And have bulk_data_t's 0th position's next_free be 0.

zenogais · on Aug 21, 2019

Update: I was indeed missing something.

I think I've figured out roughly what the author intended, code below [0].

First, it looks like he's relying implicitly on data stored in std::vector. Namely vectors have both a capacity and a size. The capacity is total number of allocated elements. The size is the total number of elements stored actually stored.

Second, vector::resize won't reallocate until it runs out of capacity, but it will give you access to extra elements if you need them. So this is used to lazily re allocate while bumping up the size of the vector.

Both of these effectively make it "do the right thing" by leaning on the vector storing both size and capacity.

If you hand manage those values yourself you can get a pretty compact C implementation without a lot of code.

One last thing: Using a union here for the item_t is pretty much guaranteed to get you a segfault. The whole thing should really be a struct. This also allows for setting next to sentinel value if necessary.

[0]: C code for bulk_data_t example: https://pastebin.com/Tfcdt39h

dhruvrrp · on Aug 21, 2019

The author's pseudo-code implies that bd->items is full already, since it is probably initialized to the initial number of items that are added. This is probably for memory optimization for games. Also explains why resize increases the size by 1, just enough memory to add the new item.

This way you don't need to worry about keeping track of size and capacity either.

The reason I suggested -1 is because when we iterate through bd->items, we need a way to know if it's a valid value or just "holed".

zenogais · on Aug 21, 2019

> This way you don't need to worry about keeping track of size and capacity either.

The only reason you don't have to worry about this is because std::vector handles it for you, at least in the code examples provided by the author. If you choose to go with a pure C implementation (which is what I'm trying out) then you will have to keep track of these.

> The reason I suggested -1 is because when we iterate through bd->items, we need a way to know if it's a valid value or just "holed".

Yep, I was able to get an example using -1 as a sentinel working and passing fuzz testing.

gpderetta · on Aug 21, 2019

A union is perfectly fine, the next element is only needed when the element is in the free list in which case item_t won't be accessed.

zenogais · on Aug 21, 2019

It should work fine in the examples, but it won't work if you also attempt to use the -1 in the next slot to tell you which items are non-free.

gpderetta · on Aug 22, 2019

If you want to iterate through your objects without some external mean to track the live objects (like an embedded next pointer in the object itself), then the swap and pop idiom is a better solution (iterating via indexing is going to be significantly faster than following the next pointers).

zenogais · on Jan 23, 2019

I think the strong counter-argument to this is in [1]. Quite simply we need to make sure we don't get fooled by randomness and see bias where there is none, thereby making good algorithms worse in an attempt to fix imaginary bias.

[1]: https://jacobitemag.com/2017/08/29/a-i-bias-doesnt-mean-what...

viraptor · on Jan 23, 2019

I don't think it's that good counterargument. The explained problem was treated as simple mapping - this value causes that result. It spends a bit talking about what statistics mean, but unfortunately doesn't discuss the idea of confounding variables, which influence both sides of the equation.

zenogais · on Dec 30, 2018

Initial thought.

This reads largely like a recapitulation of Boehm's 2003 book "Balance Agility and Discipline" [1]. Have you read this and did this provide any inspiration for the article?

[1]: https://www.amazon.com/Balancing-Agility-Discipline-Guide-Pe...

bobm_kite9 · on Dec 30, 2018

I haven’t read it but I’ll put it on my reading list, thanks.

zenogais · on July 24, 2018

^ This is pretty much standard at any startup in my experience.