Yeah the in depth look helps before forming conclusions like "you have to render everything every frame." Key thing is immediate mode can often mean just the sense of the API, not necessarily the draw scheduling.
re: widget-local state -- React is one of the most popular models for that. Basically if widgets are identified over time by the sequence of ids (includes array indices like React keys) on the path to them from the root, state is coherent across that identity, and mount / unmount events exist at the boundary of the identity. Callstack / push-pop based APIs like ImGUI maintain this sequence either impicitly or explicitly in the stack. Then there is some API to read / write to the state store (like React hooks or ImGUI generic state storage) with optional update triggering in async APIs like React's.
re: widget-local state -- React is one of the most popular models for that. Basically if widgets are identified over time by the sequence of ids (includes array indices like React keys) on the path to them from the root, state is coherent across that identity, and mount / unmount events exist at the boundary of the identity. Callstack / push-pop based APIs like ImGUI maintain this sequence either impicitly or explicitly in the stack. Then there is some API to read / write to the state store (like React hooks or ImGUI generic state storage) with optional update triggering in async APIs like React's.