That's an interesting idea, and I can see there being some overlap, but the Key operator is actually not my invention, but is a member of a class of operators sometimes called classifiers. You can think about the GROUP BY operator in SQL. Or Henglein's work on discriminators. Key is an operator that is relatively new in Dyalog that came from J and some other APLs, I think. I have to go back to my paper to remember.
It seem like that would be fairly simple if you are dynamically creating kernels.
-- Edit: Which implies you're doing a pretty "bare metal" interface to the GPU which still lets you do that and there's interesting story there, if I'd not totally off-base.
I try to avoid going bare metal as much as I can because of the difficulty involved with working with GPUs at that layer. I try to focus on higher level concerns, and use the excellent low level work of people such as the ArrayFire developers to handle the low level stuff. I still have to understand it and work with it at some level sometimes, but for the most part I stick to my core abstractions.
http://dl.acm.org/citation.cfm?id=2935331