It seem like that would be fairly simple if you are dynamically creating kernels.
-- Edit: Which implies you're doing a pretty "bare metal" interface to the GPU which still lets you do that and there's interesting story there, if I'd not totally off-base.
I try to avoid going bare metal as much as I can because of the difficulty involved with working with GPUs at that layer. I try to focus on higher level concerns, and use the excellent low level work of people such as the ArrayFire developers to handle the low level stuff. I still have to understand it and work with it at some level sometimes, but for the most part I stick to my core abstractions.
Basically, group nodes and map/operate on them.
It seem like that would be fairly simple if you are dynamically creating kernels.
-- Edit: Which implies you're doing a pretty "bare metal" interface to the GPU which still lets you do that and there's interesting story there, if I'd not totally off-base.