Hi Aaron, I saw the link earlier but did not get the time to check it out earlier. Someone from arrayfire pointed out that you were using arrayfire and now I recognized your name / id from the arrayfire forums :)
Anyway, I am giving it a listen now.
If I had to ask a question, what is something that frustrated you about arrayfire and what do you think needs fixing ?
ArrayFire was designed primarily as an user library for scientists, it seems, and so there is a clear bias there. My coopting it for use as an IR language for a compiler front-end is perhaps outside of the original scope of ArrayFire. The good thing is that ArrayFire manages to capture a few key ideas that are often absent, but critical to APL, so that was good.
On the other hand, it is non-trivial to work with more advanced concepts and workflows/algorithms in a way that allows me enough control over my data to get the performance I want when dealing with nesting or other nuanced execution patterns. As a trivial example, it would be not having more control over how Scan works. ArrayFire is unfortunately not unique in the very limited amount of support for scan operations that will be executed in parallel. I have worked around this, but it's not ideal.
Additionally, the rank limitation of 4 in ArrayFire is infuriating. Given that we're working with C++, it seems like there should be no reason that I should be able to have arbitrarily many dimensions.
The design of arrayfire also makes certain code compositions difficult (this goes back to the issue with nested execution) meaning that I'm spending a lot of effort on working around this limitation in the compiler, having to add new compiler passes and optimizations to massage the output code in such a way as to avoid the syntactic and semantic limitations of the arrayfire primitives. I'm not sure ArrayFire can do better in this regard, and this is the sort of optimization that a compiler front-end could make plenty of use of, but it would have been a nice touch to have some built-in support for mixed CPU/GPU statement overlap to avoid me needing to separate out certain codes and possibly introduce various stages to make sure that all the computation can be properly fused and executed. This limitation is especially felt with GFOR.
So, if I had my way, ArrayFire would be less user friendly to the scientists (that's why Co-dfns is here! ;-) ) and more friendly as an IR for implementing APL runtimes. :-)
> On the other hand, it is non-trivial to work with more advanced concepts and workflows/algorithms in a way that allows me enough control over my data to get the performance I want when dealing with nesting or other nuanced execution patterns. As a trivial example, it would be not having more control over how Scan works. ArrayFire is unfortunately not unique in the very limited amount of support for scan operations that will be executed in parallel. I have worked around this, but it's not ideal.
Can you give me an example ? If you think this is not the forum to discuss this, you can always email me (check my profile) or ping me on gitter (https://gitter.im/arrayfire/arrayfire.
> Additionally, the rank limitation of 4 in ArrayFire is infuriating. Given that we're working with C++, it seems like there should be no reason that I should be able to have arbitrarily many dimensions.
I want to solve this as well, but it requires a lot of tedious work so it keeps getting postponed. But rest assured this will be fixed!
> So, if I had my way, ArrayFire would be less user friendly to the scientists (that's why Co-dfns is here! ;-) ) and more friendly as an IR for implementing APL runtimes. :-)
ArrayFire development has been request driven. So the features requested / used a lot got implemented first. But I don't see why the features you requested can not be implemented in arrayfire!
If I were to expand that idea a bit more and remark on the biggest class of bug that causes segfaults for me in ArrayFire, it would be the requirement for explicit data type casting using as() to avoid problems. I wouldn't want this to go away at the cost of performance, but....
Hmm this is the down side of using a dynamic typed array class. While the functions to handle the type checking of the host pointers can be done in C++, it would not be so easy to implement it in the C layer.
And since it looks like you are using the C++ api anyway, can you open an issue on github so this will be looked at?
Anyway, I am giving it a listen now.
If I had to ask a question, what is something that frustrated you about arrayfire and what do you think needs fixing ?