Handles are the better pointers (2018) (floooh.github.io)
223 points by ibobev on June 21, 2023 | 78 comments



Reminds me of a startup I worked for in the 1990s. The C code base was organized into "units" (modules). Each unit allocated/destroyed its own data structures, but used the "ref" unit to return a reference (really, a handle) instead of a pointer. Each module used a "deref" function to convert the handle to a typed pointer for internal use.

ref used many of the tricks described in the above post, including an incrementing counter to catch stale handles.

All pointers were converted to handles in debug and testing builds, but in release builds ref simply returned the pointer as a handle, to avoid the performance penalty. As machines grew faster, there was talk of never turning off ref.

Side win: There was a unit called "xref" (external ref) which converted pointers to stable handles in all builds (debug and release). It was the same code, but not compiled out. External refs were used as network handles in the server's bespoke RPC protocol.
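A rough sketch of the general shape of such a ref/deref pair (names and the release-build trick are invented for illustration, not the actual code):

  #include <stdint.h>

  typedef uintptr_t Ref;          /* slot index in the low bits, counter in the high bits */

  #ifndef NDEBUG
  Ref   ref_make(void *p);        /* debug/test builds: store p in a table, return slot + counter */
  void *ref_deref(Ref r);         /* debug/test builds: validate the counter, then return the pointer */
  #else
  /* release builds: the "handle" is just the pointer value, so deref costs nothing */
  #define ref_make(p)   ((Ref)(p))
  #define ref_deref(r)  ((void *)(r))
  #endif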


The old Mac used Handle so that it could move memory around as pressure mounted.

In “some” ways, it’s a bit like a smart pointer as it’s a hook to allow the underlying system to “do things” in a hidden way.
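For anyone who never used it, the classic Toolbox pattern looked roughly like this (from memory, so treat as illustrative):

  Handle h = NewHandle(1024);   // a Handle is a pointer to a "master pointer" owned by the system
  HLock(h);                     // pin the block while we hold a raw pointer into it
  char *p = *h;                 // double dereference: handle -> master pointer -> data
  // ... use p ...
  HUnlock(h);                   // the Memory Manager may move the block again after this
  DisposeHandle(h);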


If anyone's curious, here's an old article about working with handles and the Macintosh memory manager: http://preserve.mactech.com/articles/develop/issue_02/Mem_Mg...

(The article's examples are in Pascal, the original language of choice for the Mac.)

---

Update: wow, Apple still has bits of the original Inside Macintosh books available online. Here’s a section on the memory manager, replete with discussions of the “A5 world” and handle methods (MoveHHI, etc.) in Pascal: https://developer.apple.com/library/archive/documentation/ma...


Oh man, that brings back memories! Inside Macintosh. Pascal. …

The article got virtual memory a bit wrong too. It got much better over the years, and using relocatable handles fell by the wayside in favor of plain pointers.


I was using MPW C, dropping to 68k assembly for performance pinch points, and malloc() was much faster than NewHandle() - faster than NewPtr() too.

I guess there was some simple pre-allocated larger block that got parts handed out on demand. Not unlike a modern implementation.

I still miss MPW Shell.


Microsoft's Win16 memory allocator APIs (GlobalAlloc and LocalAlloc) also returned handles so the OS could move memory blocks to new addresses behind the scenes. Application code would need to call GlobalLock/Unlock APIs to acquire a temporary pointer to the memory block. The APIs still exist in Win32 and Win64 for backwards compatibility, but now they're thin wrappers around a more standard memory allocator.

https://learn.microsoft.com/en-us/windows/win32/memory/compa...
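The classic usage pattern, roughly (it still compiles against modern Windows SDKs):

  #include <windows.h>

  void demo(void) {
    // GMEM_MOVEABLE: the block may be relocated, so we get a handle, not an address
    HGLOBAL hMem = GlobalAlloc(GMEM_MOVEABLE, 1024);
    if (!hMem) return;
    void *p = GlobalLock(hMem);   // pin the block and get a temporary pointer
    if (p) {
      // ... use p ...
      GlobalUnlock(hMem);         // the block may move again after this
    }
    GlobalFree(hMem);
  }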


Historically, I think there was an overlap of APIs between early Macintosh and early versions of Windows because Microsoft was porting their software.

Microsoft released the first version of Excel for the Macintosh on September 30, 1985, and the first Windows version was 2.05 (to synchronize with the Macintosh version 2.2) on November 19, 1987. https://en.wikipedia.org/wiki/Microsoft_Excel#Early_history


I kinda miss the fun days of using addresses to physical memory. (Maybe I'm wrong but I've always assumed that explicit use of handles went out of fashion because virtual address lookup is, in effect, a handle.)


This was used to avoid memory fragmentation: https://en.wikipedia.org/wiki/Classic_Mac_OS_memory_manageme...


A really powerful design pattern is combining the use of handles and closures in a memory allocator. Here's a simplified Rust example:

  let mut myalloc = MyAllocator::<Foo>::new();
  let myfoohandle = myalloc.allocate();
  let myfoohandle2 = myalloc.allocate();
  myalloc.expose::<&mut Foo, &Foo>(myfoohandle, myfoohandle2, |myfoo, myfoo2| {
    myfoo.do_mutating_foo_method();
    println!("{:?}", myfoo2);
  });
  myalloc.compact(); // Rearrange the allocated memory for myfoo and myfoo2
  myalloc.expose::<&mut Foo>(myfoohandle, |myfoo| {
    // Still valid!
    myfoo.do_mutating_foo_method();
  });
Internally, the allocator would use a map of handles to pointers to keep track of things.

Because the closures strongly limit the scope of the referenced memory, you can relocate the memory of actively used objects without fear of dangling references. This allows the allocator to perform memory compactions whenever the user wants (e.g. idle periods).


This is just manually implemented garbage collection, really.


Not really. Compaction is a feature of many garbage collectors, but the allocator I described doesn't impose any particular way of deciding how and when to deallocate the objects. You could do so explicitly with a MyAllocator::deallocate() method, or use runtime reference counting, or (if you're willing to constrain handle lifetimes to the allocator lifetime) you could delete the allocator entry when the handle goes out of scope.


I kind of agree. A while ago I was thinking about how to implement something like Roslyn's red-green trees in Rust. The solution I came up with, similar in principle to this handle-and-allocator approach, did work, but needed the occasional cleanup to get rid of zombie objects. At that point I realised that all I'd done was reinvent a poor man's garbage collector.


Garbage collection is not necessarily memory compacting.


The article basically describes the Entity-Component-System architecture and it makes sense when your app is essentially a stateful simulation with many independent subsystems (rendering, physics, etc.) managing lots of similar objects in realtime (i.e. games). I thought about how it could be used in other contexts (for example, webdev) and failed to find uses for it outside of gamedev/simulation software. It feels like outside of gamedev, with this architecture, a lot of developer energy/focus will be, by design, spent on premature optimizations and infrastructure-related boilerplate (setting up object pools etc.) instead of focusing on describing business logic in a straightforward, readable way. Are there success stories of using this architecture (ECS) outside of gamedev and outside of C++?


It's mostly a linguistic distinction: is your indirection a memory address, or is it relative to a data structure? You gain more than you lose in most instances by switching models away from the machine encoding - ever since CPUs became pipelined, using pointers directly has been less important than contriving the data into a carefully packed and aligned array, because when you optimize to the worst case you mostly care about the large sequential iterations. And once you have the array, the handles naturally follow.

The reason why it wouldn't come up in webdev is because you have a database to do container indirection, indexing, etc., and that is an even more powerful abstraction. The state that could make use of handles is presentational and mostly not long-lived, but could definitely appear on frontends if you're pushing at the boundaries of what could be rendered and need to drop down to a lower level method of reasoning about graphics. Many have noted a crossover between optimizing frontends and optimizing game code.


It’s more that the Entity in an ECS is a special case of a handle that references into multiple other containers. Handles in general are used all over the place outside of that context.


ECS are a special case of what the article is about, not the other way around.

For instance, I independently came to the same conclusion as the author while developing a library implementing polytomic tree structures.

It's much easier to hand black-box NodeIDs to the user (technically u64, but don't tell them ;) and act on them with dedicated functions from the lib than to clench your teeth waiting for the inevitable segfault when they play with pointers.

And it's also much easier to handle for the user: you have handles, functions creating them, functions using them, and the library can gracefully fail if you keep using them the wrong way.


You are conflating object pools and ECS architecture. Yes, they work very well together, but neither is required for the other.

The ECS architecture is about which parts of your application are concerned with what, and it's an absolute godsend for building complex and flexible business logic. I would be loath to ever use anything else now that I have tasted this golden fruit.

Object pooling is about memory management; in an OO language this is about reducing pressure on the garbage collector, limiting allocation and GC overhead, improving memory access patterns, etc. -- all the stuff he talks about in this article. I almost never use object pools unless I'm running a huge calculation or simulation server-side.

Games use both, because they're basically big simulations.


I write business logic and I use a fair amount of object pooling. It's not quite the same as in game dev or e.g. a socket pool or worker pool where you're constantly swapping tons of things in and out, but it can still be helpful to speed up the user experience, manage display resources and decrease database load.

One example would be endless-scrolling calendar situations, or data tables with thousands of rows, or anything where I'm using broader pagination in the database calls than I want to in the display chain; maybe I can call up 300 upcoming reservations every time the calendar moves, erase the DOM and redraw 300 nodes, but I'd rather call up 3,000 all at once and use a reusable pool of 300 DOM nodes to display them.

Sure, it's not glamorous...


This is interesting. Do you have any links on use of ECS for application logic?


Really? The main point (use an index into an array instead of a pointer into a blob of memory) gets you 99% of the benefit, and it isn’t any more difficult than using pointers. I do this all the time.


I think this pattern is also useful for creating tightly performing middleware. However it needs language, compiler and library support to really take off wrt large scale adoption.

I.e., when someone gives a `Resource` annotation to a struct, the compiler also generates a `ResourceHandle` and a `ResourceManager` that do all this work for you. All the internal SoA (structure-of-arrays) work is handled for you as long as you implement the `ResourceFinalizer` interface (for non-trivial resources).


This is not an entity component system. ECS defines how you need to structure your code: you have basic entities that you can assign components to, and systems handle the logic for all entities that have some subset of components. Entities that do not have the required set of components are ignored. ECS can usually benefit from these types of allocators, since they provide performance benefits, but they aren't required.
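A deliberately tiny sketch of that shape (not the article's design; the container choice here is purely illustrative, real ECS implementations use packed arrays):

  #include <cstdint>
  #include <unordered_map>

  using Entity = uint32_t;                 // an entity is just an ID

  struct Position { float x, y; };         // components are plain data
  struct Velocity { float dx, dy; };

  std::unordered_map<Entity, Position> positions;
  std::unordered_map<Entity, Velocity> velocities;

  // A "system" processes only the entities that have the components it needs.
  void movement_system(float dt) {
    for (auto& [e, vel] : velocities) {
      auto it = positions.find(e);
      if (it == positions.end()) continue;   // no Position component: entity is ignored
      it->second.x += vel.dx * dt;
      it->second.y += vel.dy * dt;
    }
  }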


I use something like that for controlling a cluster of vending machines.


If you're using a relational database for your web app, then you're doing something very similar to ECS, but more powerful.


> - items are guaranteed to be packed tightly in memory, general allocators sometimes need to keep some housekeeping data next to the actual item memory

> - it’s easier to keep ‘hot items’ in continuous memory ranges, so that the CPU can make better use of its data caches

These are huge advantages. Memory managers tend to keep pools for specific allocation size ranges, e.g. a pool for < 24 bytes, a pool for <= 64, etc., up to, say, a megabyte, after which allocations might be delegated directly to the OS (such as VirtualAlloc on Windows). This is hand-wavy, I'm speaking broadly here :) This keeps objects of similar or the same sizes together, but it does _not_ keep objects of the same type together, because a memory manager is not type-aware.

Whereas this system keeps allocations of the same type contiguous.

You can do this in C++ by overriding operator new, and it's possible in many other languages too. I've optimised code by several percent by taking over the allocator for specific key types and writing a simple, probably unoptimised light memory manager: effectively a wrapper around multiple arrays of the object's size, which keeps memory for that type pooled together and therefore close in memory. I can go into more detail if anyone's interested!
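Something along these lines (the pool class itself is assumed; only the hook points are shown):

  #include <cstddef>

  template <typename T>
  struct TypePool {                 // assumed: a fixed-size-block pool, one instance per type
    void *alloc();                  // hands out a T-sized block from the pool's arrays
    void  release(void *p);         // returns the block to the pool
    static TypePool &instance();
  };

  struct Particle {
    float x, y, dx, dy;

    // All Particles now come from the same pool, so instances of the same type
    // stay close together in memory (assumes no larger derived classes).
    static void *operator new(std::size_t)  { return TypePool<Particle>::instance().alloc(); }
    static void  operator delete(void *p)   { TypePool<Particle>::instance().release(p); }
  };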


In Go you can do this with a channel of unused structs: if pulling from the channel hits the default branch of a select, make a new one; likewise, if adding back to the channel hits default, free it; otherwise just stick them in the channel and pull them off. It eases GC pressure in hot paths, though it does put a little pressure on the scheduler.


Why not just do this with a regular stack and avoid having to deal with the scheduler at all? Try pop from the stack and if the stack is empty, do a new allocation.


Because I am using them in different go routines and I find Go channels are pretty good at handling the synchronization as well as I can.


Java started out with handles. It seemed to be useful for getting the compacting collector working right. Later on, around Java 5, those went away, improving branch prediction. Then sometime around Java 9 they came back with a twist. As part of concurrent GC work, they needed to be able to move an object while the app was still running. An object may have a handle living at the old location, forwarding access to the new one while the pointers are being updated.

That was about when I stopped paying attention to Java. I know there have been two major new collectors since so I don’t know if this is still true. I was also never clear how they clone the object atomically, since you would have to block all updates in order to move it. I think write barriers are involved for more recent GC’s but I’m fuzzy on whether it goes back that far or they used a different trick for the handles.


WGPU, the cross-platform library for Rust, is in the process of going in the other direction. They had index-based handles, and are moving to reference counts.[1] The index tables required global locks, and this was killing multi-thread performance.

[1] https://github.com/gfx-rs/wgpu/pull/3626


Maybe that’s a case of the wrong design? The index tables shouldn’t need global locks. It gets a little hairy if you need to be able to reallocate or move them (i.e. grow them), but that happens at most a small number of times, and there are ways of only taking the lock when that’s happening.

I’ve implemented this pattern without locks or CAS in C++, and it works just fine.

I’m currently using this pattern in rust (although with fixed size) and it works really well. The best part is it bypasses the borrow checker since an index isn’t a reference. So no compile time lifetimes to worry about. It’s awesome for linked lists, which are otherwise painful in rust. Also it can sometimes allow a linked list with array like cache performance, since the underlying layout is an array.


"it bypasses the borrow checker since an index isn’t a reference"

That's a bug, not a feature. The two times I've had to go looking for a bug in the lower levels of Rend3/WGPU (which are 3D graphics libraries), they've involved some index table being corrupted. Those are the only times I've needed a debugger.


Lifetimes are managed at runtime, somewhat like Arc, just without the ref counts. If implemented well you get an assertion if you misuse a reference (eg use after free.)

I’m fine with the trade off, but yes, it can be a source of difficult bugs.


We use a similar mechanism, but it is thread-local. Handles get passed to other threads, but the other threads cannot directly access them; instead, they send requests associated with the handle to the owning thread over an MPSC queue. Our construct is not a general-purpose replacement for pointers, obviously.


Sounds a bit like what seastar.io is supposed to enable. On their website they say they focus on IO-bound applications. I wonder how well that model fits parallelizing computationally intensive work.


I have no familiarity with Seastar, but a quick look at the marketing: shared nothing, message passing -- that sounds about right.


I assume there was an underlying assumption of single-threadedness in this post.

You could presumably use the same tricks the allocator is using to handle more concurrent threads, although I suspect that's not worth the time required.


Three things:

a) Pointers are handles, to parts of virtual memory pages. When your process maps memory, those addresses are really just ... numbers ... referring to pages relative to your process only. Things like userfaultfd or SIGSEGV signal handlers can blur the line between self-managed handles and pointers even further by allowing user-space resolution of page faults. Worth thinking about.

b) If performance is a concern, working through a centralized handle/object/page table is actually far worse than you'd think at first -- even with an O(1) datastructure for the mapping. Especially when you consider concurrent access. Toss a lock in there, get serious contention. Throw in an atomic? Cause a heap of L1 cache evictions. Handles can mess up branch prediction, they can mess up cache performance generally, and...

c) They can also confuse analysis tools like valgrind, debuggers, etc. Now static analysis tools, your graphical debugger, runtime checks, etc. don't have the same insight. Consider carefully.

All this to say, it's a useful pattern, but a blanket statement like "Handles are the better pointers" is a bit crude.

I prefer that we just make pointers better; either through "smart pointers" or some sort of pointer swizzling, or by improving language semantics & compiler intelligence.


> pointers are handles

Pointers are quite limited handles - they would all be handles into the same array with base pointer 0, but that removes many of the useful benefits you get from having different arrays with different base pointers.


My hunch is that (b) is often mitigated by the fact that you're touching many items anyway, such that you don't necessarily even care that the lookup is O(1): you are iterating over the items. (And if you aren't touching many of the items on the regular, then you probably won't see a benefit to this approach?)


> move all memory management into centralized systems (like rendering, physics, animation, …), with the systems being the sole owner of their memory allocations

> group items of the same type into arrays, and treat the array base pointer as system-private

> when creating an item, only return an ‘index-handle’ to the outside world, not a pointer to the item

> in the index-handles, only use as many bits as needed for the array index, and use the remaining bits

> for additional memory safety checks only convert a handle to a pointer when absolutely needed, and don’t store the pointer anywhere

There's two separate ideas here, in my mind, and while they play nicely together, they're worth keeping separate. The first one-and-a-half points ("move all memory management" and "group items") are the key to achieving the performance improvements described and desired in the post, and are achievable while still using traditional pointer management through the use of e.g. arena allocators.

The remainder ("treat the array base pointer" on) is about providing a level of indirection that is /enabled/ by the first part, with potential advantages in safety. This indirection also enables a relocation feature -- but that's sort of a third point, independent from everything else.

There's also a head nod to using extra bits in the handle indexes to support even more memory safety features, e.g. handle provenance... but on modern 64-bit architectures, there's quite enough space in a pointer to do that, so I don't think this particular sub-feature argues for indexes.

I guess what I'm saying is that while I strongly agree with this post, and have used these two patterns many times, in my mind they /are/ two separate patterns -- and I've used arena allocation without index handles at least as many times, when that trade-off makes more sense.


Totally agree these are separate ideas. We use the system-private part but not the index-only part in a handle system in the product I work on.


I love this pattern! I make use of it all the time in Rust with the slotmap library. Improves upon arrays by versioning the indexes so values can be deleted without messing up existing indexes and the space can be reused without returning incorrect values for old indexes.


I use a system much like this for entity management for games, but with an additional property that I don't see outlined here:

when an entity is "despawned" (destroyed), it is not "freed" immediately. instead, its id ("generation counter" in the article) is set to 0 (indicating it is invalid, as the first valid id is 1), and it's added to a list of entities that were despawned this frame. my get-pointer-from-handle function returns both a pointer, and a "gone" bool, which is true if the entity the handle points to has an id of 0 (indicating it used to exist but has since been despawned), or if the id in the handle and the id in the pointed-to entity don't match (indicating that the entity the handle pointed at was despawned, and something else was spawned in its place in memory). then, at the end of each frame, the system goes through the list of despawning entities, and it's there that the memory is reclaimed to be reused by newly-spawned entities.

in this system, it's up to the user of the get-pointer-from-handle function to check "if gone", and handle things accordingly. it's a bit cumbersome to have to do this check everywhere that you want to get a pointer to an entity, but with some discipline, you'll never encounter "use-after-free" situations, or game logic errors caused by assuming something that existed last frame is still there when it might be gone now for any number of reasons—because you're explicitly writing what fallback behavior should occur in such a situation.
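A minimal sketch of that lookup (names made up): the handle stores the id it expects, the slot stores the id it currently has, and the caller is forced to deal with the "gone" case.

  #include <cstdint>

  struct Entity { uint32_t id; /* ...gameplay fields... */ };   // id == 0 means "despawned"
  struct EntityHandle { uint32_t slot; uint32_t id; };

  struct Lookup { Entity *ptr; bool gone; };

  Lookup get_entity(Entity *entities, EntityHandle h) {
    Entity *e = &entities[h.slot];
    bool gone = (e->id == 0) || (e->id != h.id);   // despawned, or slot reused by something else
    return { e, gone };
  }

  // usage: auto [enemy, gone] = get_entity(entities, target);
  //        if (gone) { /* fallback behavior */ } else { /* use enemy */ }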


Yes, that's basically an entity component system. It still runs into the ‘fake memory leaks’ problem the author describes for obvious reasons, i.e. you still have to deal with "components" attached to some handle somewhere (and deallocate them).


(2018)

Previous discussions:

2018: https://news.ycombinator.com/item?id=17332638 (80 comments)

2021: https://news.ycombinator.com/item?id=26676625 (88 comments)


Counterpoint: I read many of these "handles" articles several years ago and tried them in my code, and it was something of a mistake. The key problem is that they punt on MEMORY SAFETY.

An arena assumes trivial ownership -- every object is owned by the arena.

But real world programs don't all have trivial ownership, e.g. a shell or a build system.

Probably the best use case for it is a high performance game, where many objects are scoped to either (1) level load time or (2) drawing a single frame. You basically have a few discrete kinds of lifetimes.

But most software is not like that. If you over-apply this pattern, you will have memory safety bugs.


Disagree: you can easily add validation to handle access with generations. The classic example in games is reusing pooled game elements whose lifetimes are very dependent on gameplay, for example reusing enemies from a pool, where their lifetime depends on when they get spawned and destroyed, which can be chaotic. Here handles with generations preserve memory safety by preventing access to stale data.


You'd be surprised how far you can get with this. Handles are basically how things like the Erlang VM (and, IIRC, JavaScript VMs) work.


As someone who did way too much with handles in early Mac and Windows programming, I'll say they're definitely not 'better', but for some (mostly memory constrained) environments, they have some advantages. You get to compress memory, sure, but now you've got extra locking and stale pointer complexity to deal with.

If you _need_ the relocatability and don't have memory mapping, then maybe they're for you, but otherwise, there are usually better options.


The UEFI spec leverages handles throughout its APIs. The implementations from the sample in 1998 to today’s EDKII at tianocore.org use the address of the interface, or protocol, as the handle value. Easy for a single address space environment like UEFI boot firmware.


Since he mentions C++, I would add you don't have to give up on RAII to adopt this approach. You obviously can't use the standard smart pointers, but developing similar smart handles isn't that much extra effort.


The C++ standard library smart pointers have hooks for this: unique_ptr takes an optional deleter as a template parameter, and that deleter can provide a pointer typedef that doesn't have to be a raw pointer. It seems like the standard C++ smart pointer types could still be used with handles; all you need is the right typedef.
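Something like this (the handle type and destroy function are hypothetical): unique_ptr will store the deleter's pointer type instead of a raw pointer, as long as that type is null-constructible and comparable.

  #include <cstdint>
  #include <memory>

  struct TextureHandle {                                  // hypothetical handle type
    uint32_t value = 0;
    TextureHandle() = default;
    TextureHandle(std::nullptr_t) {}                      // needed for unique_ptr's null checks
    explicit operator bool() const { return value != 0; }
    friend bool operator==(TextureHandle a, TextureHandle b) { return a.value == b.value; }
    friend bool operator!=(TextureHandle a, TextureHandle b) { return !(a == b); }
  };

  void destroy_texture(TextureHandle);                    // assumed to exist in the handle system

  struct TextureDeleter {
    using pointer = TextureHandle;                        // unique_ptr stores this, not a T*
    void operator()(TextureHandle h) const { if (h) destroy_texture(h); }
  };

  using UniqueTexture = std::unique_ptr<void, TextureDeleter>;  // RAII "smart handle"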


> Once the generation counter would ‘overflow’, disable that array slot, so that no new handles are returned for this slot.

This is an interesting idea. When a slot becomes burned this way, you still have lots of other slots in the array. The total number of objects you can ever allocate is the number of slots in the array times the number of generations, which could be tuned such that it won't exhaust for hundreds of years.

You only have to care about total exhaustion: no free slot remains in the array; all are either in use or burned by overflow. In that case, burned slots can be returned to service, and we hope for the best.

If the total exhaustion takes centuries, the safety degradation from reusing burned slots (exposure to undetected use-after-free) is only academic.
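A rough worked example (my numbers, not the article's):

  64-bit handle, split as 32 index bits + 32 generation bits:
    2^32 slots * 2^32 generations = 2^64 ~= 1.8e19 total allocations
    at 1,000,000 allocations per second:
    1.8e19 / 1e6 / 3.15e7 seconds per year ~= 585,000 years to total exhaustion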


For my senior level software development class, one of our tasks was to write a simple database system that was most definitely declared to NOT be a subset of the SQL language, because subsets of SQL were expressly forbidden. Ours was called the Network Query Language.

Most of the other teams were taking the commands as they were entered and saving them to a file. Then you could replay the file when you restarted, and continue processing new commands.

I had the bright idea to write a memory management routine using handles, so that we could do a binary save of memory state, and then a binary read of memory state on restart. And it worked.

We were only one of two teams to finish the project that year, and both of those teams got A's.

The other team who finished was made up exclusively of people who were Student Assistants in the Engineering Computer Network system downstairs, and so they had SysAdmin privileges on the Encore MultiMax where we were all writing our code. And some of them had some real world systems programming experience.

But ours was the only team who had a memory manager and did binary saves and reads of the entire database state. The other teams would take a long time to replay all the commands on restart, and that delay just kept getting longer and longer as more commands were entered.

That still stands out in my mind as one of my better ideas I've had with regards to computer programming, and one of my proudest experiences.

Sadly, none of the teams got to do a readout at the end where they discussed their experiences and the things they would do differently.


Handles are great, and I agree that their use can help solve or mitigate many memory management issues.

The thing is that Handles themselves can be implemented in very different fashions.

I extensively use handles in my own framework, and I plan to describe my implementation at some point, but so far I have seen 3-4 different systems based on different ideas around handles.

We might need a more precise naming to differentiate the flavors.


Yeah I agree, seems like most of the comments here are talking about slightly different things, with different tradeoffs

Including memory safety


It's interesting to see the convergence between ECS (primarily game engine) and RDBMS.

"AoS" ~ tables

"systems" ~ queries

"handles" ~ primary keys

Makes me wonder if you could architect a "RDBMS" that is fast enough to be used as the core of a game engine but also robust enough for enterprise applications. Or are there too many tradeoffs to be made there.


IBM was, for a while, in the business of selling "gameframes" -- mainframe systems that performed updates fast enough to run a MMO game. They used Cell processors to provide the extra needed compute, but it was the mainframe I/O throughput that allowed them to serve a large number of players. It was backed by IBM DB2 for a database, but it's entirely likely the game updates were performed in memory and only periodically committed to the database. Since, as you say, ECS entity components closely resemble RDBMS tables, this could be accomplished quickly and easily with low "impedance mismatch" compared to an OO game system.

https://en.wikipedia.org/wiki/Gameframe

Back on Slashdot there was a guy called Tablizer who advocated "table-oriented programming" instead of OOP. The rise of ECS means that when it comes to gamedev, TOP is winning. Tablizer might be surprised and delighted by that.


Essentially, a game engine is just these things:

* A tuned database for mostly-static, repurposable data

* I/O functions

* Compilation methods for assets at various key points (initial load, levels of detail, rendering algorithm)

* Constraint solvers tuned for various subsystems (physics, pathfinding, planning).

A lot of what drives up the database complexity for a game is the desire to have large quantities of homogeneous things in some contexts (particles, lights, etc.) and fewer but carefully indexed things in others (mapping players to their connections in an online game). If you approach it relationally you kill your latency right away through all the indirection, and your scaling constraint is the worst case - if the scene is running maxed-out with data it should do so with minimal latency. So game engines tend to end up with relatively flat, easily iterated indexes, a bit of duplication, and homogenization through componentized architecture. And that can be pushed really far with static optimization to describe every entity in terms of carefully packed data structures, but you have to have a story for making the editing environment usable as well, which has led to runtime shenanigans involving "patching" existing structures with new fields, and misguided attempts to globally optimize compilation processes with every edit[0].

Godot's architecture is a good example of how you can relax the performance a little bit and get a lot of usability back: it has a scene hierarchy, and an address system which lets you describe relationships in hierarchical terms. The hierarchy flattens out to its components for iteration purposes when subsystems are running, but to actually describe what you're doing with the scene, having a tree is a godsend, and accommodates just about everything short of the outer-join type cases.

[0] https://www.youtube.com/watch?v=7KXVox0-7lU


Memory handles and their cousins, function suites. The last time I used both of these must have been when writing an After Effects plugin.

The “suite” is a bit like a handle but for code: a scoped reference to a group of functions. It’s a useful concept in an API where the host application may not provide all possible functionality everywhere, or may provide multiple versions. Before, say, drawing on-screen controls, you ask the host for “OSC Suite v1.0” and it returns a pointer to a struct containing function pointers for anything you can do in that context with that API version.


> Before, say, drawing on-screen controls, you ask the host for “OSC Suite v1.0”

Or for IWebBrowser2, or for EFI_GRAPHICS_OUTPUT_PROTOCOL, or for xdg-shell-v7, or for EGL_ANDROID_presentation_time. It’s not really an uncommon pattern, is my point. It can be awkward to program against, but part of that is just programming against multiple potential versions of a thing in general.

I can’t see the connection with handles, though. Suites/interfaces/protocol/etc. are an ABI stability tool, whereas handles are at most better for testing in that respect compared to plain opaque pointers.


In my mind, the similarity is that both handles and function suites allow the host runtime to change things around behind your back because you're not holding direct pointers, but instead the access is always within a scope.

With a memory handle there's usually an explicit closing call like unlock:

  uint8_t *theData = LockHandle(h);
  ...
  UnlockHandle(h); theData = NULL;
With a function suite (the way I've seen them used, anyway!), the access might be scoped to within a callback, i.e. you are not allowed to retain the function pointers beyond that:

  doStuffCb(mgr) {
    SomeUsefulSuite *suite = mgr->getSuite(USEFUL_V1);
    if (suite && suite->doTheUsefulThing) suite->doTheUsefulThing();
  }


Ah. Yes, that’s quite a bit more specific than what I imagined from your initial description.

Doesn’t that mean that you have to unhook all your references from the world at the end of the callback and find everything anew at the start of a new one? (For the most part, of course, that would mean that both the runtime and the plugin would end up using as few references as possible.) What could a runtime want to change in the meantime that pays for that suffering?

I can only think of keeping callbacks active across restarts and upgrades, and even then allowing for parts to disappear seems excessive.


I like this, but I also think some of the benefits can be had independently by combining several complementary techniques (in C++).

The allocation can be handled by custom allocators for each object type. You can do only this and you avoid the double de-reference of handles.

If you are fine with double dereferences and also want to implement extra checks and use fewer bits for the pointer, you could overload `operator->()`, e.g.:

  struct CompressedPointer {
    uint16_t obj_id;
    Object *operator->() { return Object::get_by_id(obj_id); }
  };


We use something like this in a newish C++ project I work on. Our handle system just registers arbitrary pointers (we sometimes follow the convention of grouping items of the same type into a single allocation, and sometimes don't, and the handle-arbitrary pointer system tolerates both). It basically makes it explicit who owns a pointer and who doesn't; and non-owners need to tolerate that pointer having been freed by the time they access it.


I've seen an argument that collections of objects should be allocated field-wise. So an array of points would actually be two arrays, one for x-es, one for y-s addressed by index of the object in the collection.

I wonder how programming would look if that was the default mode of allocation in C++. Pointers to objects wouldn't make much sense then.
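A small sketch of that field-wise layout, for concreteness:

  #include <cstdint>
  #include <vector>

  // "Structure of arrays": one array per field, objects referred to by index.
  struct Points {
    std::vector<float> xs;
    std::vector<float> ys;

    uint32_t add(float x, float y) {
      xs.push_back(x);
      ys.push_back(y);
      return static_cast<uint32_t>(xs.size() - 1);   // the "handle" is just the index
    }
  };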


conflates a few of the sub-topics and gets some of it wrong, but a good read nonetheless.

My first experience with handles is from early classic Mac OS, "System N" days. Most everything in the system API was ref'd via handles, not direct pointers, and you were encouraged to do the same for your own apps. Memory being as tight as it was back then, I imagine the main benefit was being able to repack memory, i.e. nothing to do with performance, which is mostly the topic of TFA.

I guess ObjC is unrelated to the macOS of olde, but does it also encourage handles somehow? I understand that one reason apple silicon is performant is that it has specific fast handling of typical ObjC references? Like the CPU can decode a handle in one operation rather than 2 that would normally be required. I can't find a google reference to substantiate this, but my search terms are likely lacking since I don't know the terminology.


> I guess ObjC is unrelated to the macOS of olde, but does it also encourage handles somehow?

No, in some ways it allows memory optimization (1. since there is no inlining, clients don't know the internals of data structures, so they're free to change behind your back 2. tagged pointer objects) and in some ways it disallows it (since all objects are heap allocations i.e. there are no value types, object trees can't be "flattened" the way C++ sometimes can).

> I understand that one reason apple silicon is performant is that it has specific fast handling of typical ObjC references?

It's just ARMv8. There are some features like TBI, but the security improvements are more interesting than the performance improvements.


I too have come to this realization.

It does away with problems associated with child-parent references.

Also, you might be able to use a bitset to represent a set of handles as opposed to a set or intrusive booleans.

It also plays nicely with GPUs too.

I don't know why this isn't the default, given that this is how Handles work in Windows.


Just wondering, for those who have implemented something like this - do you still stuff these arrays with unique_ptr's instead of raw pointers, at least to make it easier to manage / reset / assert them within the "system" that owns them?


Another advantage of handles is that they can often be made smaller than pointers, especially on 64-bit code.

In some cases, it can significantly lower memory consumption and improve performance through more efficient cache use.
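For example, on a typical 64-bit target:

  #include <cstdint>

  struct NodePtr { NodePtr *left; NodePtr *right; int value; };   // 8 + 8 + 4 (+ padding) = 24 bytes
  struct NodeIdx { uint32_t left; uint32_t right; int value; };   // 4 + 4 + 4 = 12 bytes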


Pointers are already handles if you are using virtual memory


this should be titled "handles are better implemented this way, rather than as smart-pointers, which themselves aren't really pointers"

and blaming cache misses from fragmentation on pointers is whipping your old tired workhorse.


Handle described here sounds like the entity ID in an ECS setup.


my handles are offsets into a mremap-able region of memory.



