Well, one of the authors is from NVIDIA, and the paper already mentions CUDA, and it's only to argue that CUDA is a good point of reference but doesn't do what the authors need it to do (shader programming). SYCL doesn't add anything to that conversation other than being yet another example.
Hi, lead author here. As oddity mentions, this paper references CUDA and C++ AMP (which both support C++ lambdas as well) as examples of unified programming models for GPU compute as contrasts to the non-unified programming models available for real-time graphics. SYCL is another interesting example in the GPU compute space, but as far as I am aware, it (like the others) does not include any new features that would provide adequate support for the crucial shader specialization optimization.
Adding C++ lambda support to shader programming would certainly be beneficial, but I don't think it, alone, would provide the modularity, composition, and GPU code specialization features necessary for an effective shader programming model.
That said, I am not an expert in SYCL, so if you have some ideas about how SYCL could meet our design goals, I would appreciate further feedback!
I think you should have cited SYCL anyway. You even cited less relevant things, like rust-gpu. While SYCL may not have all the features, it's a precursor in that it enables programming the GPU and CPU from a single source.
This does unfortunately feel like the case. We've gone all-in on SYCL, mainly because of a pre-existing Intel relationship, but it never really seems to have gotten mindshare and e.g. never seems to have been discussed here https://hn.algolia.com/?query=sycl
Hi, lead author here. The Rust GPU project is certainly some great work! We cite it in Section 7 of our paper.
The Rust GPU project is working to satisfy a necessary condition of unified shader programming in Rust: the ability to compile Rust code to GPU-executable kernels. However, in its current state, it does not provide interop between the host (CPU) code and GPU code in a unified way. Programmers still need to create separate host and GPU representations of parameters (and keep these two definitions in sync) and use API calls to communicate between the two halves of the code.
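To make the "two definitions kept in sync" problem concrete, here is a minimal sketch in C++ with a GLSL-ish shader string (the `Material` block and field names are made up for illustration, not taken from any real project):

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// GPU-side definition, living in shader source (GLSL-style, std140 layout).
const std::string kShaderParams = R"(
layout(std140) uniform Material {
    vec4  base_color;   // 16 bytes
    float roughness;    // 4 bytes
};
)";

// Host-side mirror of the same parameters. Nothing ties the two
// definitions together: rename, reorder, or retype a field in one
// and the other silently goes stale.
struct MaterialParams {
    float base_color[4];  // must match `vec4 base_color`
    float roughness;      // must match `float roughness`
};

// The best the host can do is spot-check the layout by hand.
static_assert(offsetof(MaterialParams, roughness) == 16,
              "host struct must match the shader's std140 layout");
```

A unified model would derive both sides from one definition instead of relying on manual checks like the `static_assert` above.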
Similarly, Metal's GPU shader code is written in (a subset of) C++, but Metal does not meet our definition of a unified shader programming environment for the same reason.
Our work investigates the next step in this process: once we can translate host language X to GPU-compatible code, what else do we need to accomplish to unify the host and GPU halves of shader code into the same programming environment?
Hi, lead author here. The Circle compiler is certainly some great work! We cite it in Section 7 of our paper.
The ability to compile C++ code to GPU-executable kernels for use in shader programming is a massive undertaking, and we're glad that the Circle compiler is working to accomplish this task! Instead, our work focuses on the next layer: once we can generate both host- and GPU-compatible shader code from C++, what else is required to achieve useful unified shader programming?
Our work identifies shader specialization as a major missing piece when attempting to unify host and GPU shader code into C++. For reasons we discuss in the paper, current methods available in C++ to potentially support GPU code specialization are insufficient, and Circle does not add any language design provisions to allow dynamic logic in host code to influence compile-time specialization and selection of GPU code, which is central to supporting unified shader specialization. Our work aims to provide this support by co-opting C++ attributes and virtual functions.
Circle also adds other language features on top of C++, including some interesting new metaprogramming features. It may be possible to build a unified shader programming system using these new features, but since these new features are not part of C++, they would not allow us to meet our design goals. The end of Section 7 in our paper includes an expanded discussion of this topic.
I'm happy to answer any questions and have further discussion here!
I think shader specialisation is handled pretty well in circle. Since you can essentially run arbitrary C++ code at compile time, selection and specialisation of a shader can even depend on hardware specific benchmarks. There is an extensive repo with examples here: https://github.com/seanbaxter/shaders. One example decodes a sprite sheet stored as a png at compile time and creates a specialised compute shader for it. You can also easily implement a control UI based on reflection of uniform shader parameters.
Anyways I understand that under different constraints (adherence to C++ standard, can't easily implement a whole C++ frontend and interpreter) this becomes much harder.
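For contrast, here is roughly what specialization looks like in standard C++ without Circle's compile-time interpreter (a toy sketch, not real shader code; `shade` and the variant names are invented): every specialized variant must be instantiated ahead of time, and a runtime value can only pick between the variants the host code already enumerated.

```cpp
#include <cassert>

// Compile-time specialization in standard C++: each variant is a
// distinct template instantiation, fully resolved at compile time.
template <bool UseNormalMap>
int shade(int base) {
    if constexpr (UseNormalMap)
        return base + 10;  // stand-in for the normal-mapping path
    else
        return base;       // stand-in for the plain path
}

// The catch: a *runtime* value can only select a variant by
// dispatching over every instantiation we enumerated up front, which
// grows combinatorially with the number of specialization knobs.
int shade_dynamic(bool use_normal_map, int base) {
    return use_normal_map ? shade<true>(base) : shade<false>(base);
}
```

Circle sidesteps this by running arbitrary C++ at compile time; the paper's constraint of staying within standard C++ is what makes the problem hard.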
Not exactly related, but is there a modern, widely used object-oriented C++ wrapper for OpenGL? Why does OpenGL persist with a C-style API instead of a more expressive one?
There's no wrapper because a modern high-level rendering engine doesn't expose raw graphics APIs to the user; it keeps them behind higher-level structures like draw calls and render graphs. At that layer you don't need a C++ API; in fact you want something that lets you manage the OpenGL context state as directly and efficiently as possible. And a C API is fine for that.
Basically, the amount of code interfacing with OpenGL should be fairly minimal.
There are higher level graphics libraries on top of OpenGL, Vulkan, Metal, WebGL & DirectX, for example bgfx or sokol, but IIRC they still go C style for the same reason. Then there are language-specific libraries that will be idiomatic, like wgpu for Rust. Finally, you reach entire very opinionated game engines at various levels (Unity, UE, Godot, etc).
While not based on C, DirectX actually has a C API, even for D3D12. For example, to call device->CreateRootSignature(...) from C you call ID3D12Device_CreateRootSignature(device, ...).
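That works because COM-style objects are just a pointer to a table of function pointers, so the C macros simply index the vtable. A toy mock of the pattern (the `Device`/`CreateThing` names are made up; real D3D12 interfaces follow the same shape with many more methods):

```cpp
#include <cassert>

struct Device;

// The "interface" is nothing but a struct of function pointers.
struct DeviceVtbl {
    int (*CreateThing)(Device* self, int size);
};

// An object is a pointer to its vtable (plus private state in practice).
struct Device {
    const DeviceVtbl* vtbl;
};

// What a C convenience macro like ID3D12Device_CreateRootSignature
// expands to, roughly: dereference the vtable and pass `self` explicitly.
#define Device_CreateThing(self, size) \
    ((self)->vtbl->CreateThing((self), (size)))

static int create_thing_impl(Device* /*self*/, int size) {
    return size * 2;  // stand-in for real resource creation
}

static const DeviceVtbl kDeviceVtbl = { &create_thing_impl };
```

C++ callers get `device->CreateThing(...)` syntax for free because the compiler generates the same vtable indirection.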
Before or after dealing with the dynamic loading of all required extensions, validation layers, the memory allocator (for which AMD had to create a more user-friendly wrapper), and the initialization boilerplate?
I disagree; type safety is something that is missing from almost every C API I've used; you end up with opaque handles that are just integers that can be easily misused. Consider the glBufferSubData function [0]: 2 of its 4 parameters are really just integers that the programmer needs to know about and that the compiler won't check, and the last two are tightly coupled and not type safe; there's nothing stopping you from passing an array of doubles (or whatever pointer you have lying around).
OpenGL also gets away with this because you're almost always working with either floats or integers (for, say, index buffers), but once you start dealing with containers of containers, a C API ends up really messy, with everything being an "object" and having to call function_str(my_obj_handle) with the correct handle - see Sentry's API [1] for an example of what this looks like.
A C++ API for OpenGL would allow for type-safe flags, type- and bounds-checked buffers, and RAII handles, with all the good (and, granted, bad) that comes with them.
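A hypothetical wrapper along those lines might look like this (this is a sketch against an in-memory stand-in for GPU storage, not a real GL binding; `Buffer` and `BufferTarget` are invented names): the target is a real enum, the element type is enforced by the template, and uploads are bounds-checked.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Type-safe flag: the compiler rejects a raw integer here,
// unlike a GLenum parameter.
enum class BufferTarget { Array, ElementArray };

template <typename T>
class Buffer {
public:
    Buffer(BufferTarget target, std::size_t count)
        : target_(target), storage_(count) {}

    // Only accepts T*, and refuses out-of-range writes, unlike
    // glBufferSubData's void* plus unchecked byte offsets.
    bool sub_data(std::size_t offset, const T* data, std::size_t count) {
        if (offset + count > storage_.size()) return false;
        for (std::size_t i = 0; i < count; ++i)
            storage_[offset + i] = data[i];
        return true;
    }

    BufferTarget target() const { return target_; }

private:
    BufferTarget target_;
    std::vector<T> storage_;  // stand-in for GPU memory
};
```

Passing an array of doubles to a `Buffer<float>` is now a compile error instead of silent corruption.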
It seems like you don’t consider requirements like ABI at all. There’s a reason for opaque handles (typeless pointers or integer handles).
When managing resources on separate memory, RAII is also rarely something you’d want. There are different memory management strategies and tight control over memory is very important.
Even if you implement your library in C++ you will need to expose a C API to be compatible with other languages. Not everyone uses C++.
> When managing resources on separate memory, RAII is also rarely something you’d want.
It's the opposite.
When managing resources in system memory, RAII is rarely something I’d want because malloc() is slow.
When managing resources on a separate device, RAII is invaluable. These external resources often have APIs designed around bind/use/unbind, map/update/unmap, and the like. Forget that one close/unbind/unmap call and you violate your part of the contract, i.e. anything may happen, including crashes and memory leaks.
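A scope guard makes that contract unbreakable. Here is a minimal sketch against a fake buffer (the `FakeBuffer`/`MapGuard` names are invented; the pattern applies equally to real map/unmap or bind/unbind APIs):

```cpp
#include <cassert>

// Toy resource with a map/unmap contract, standing in for a GPU
// buffer API. The invariant: every map() must be paired with unmap().
struct FakeBuffer {
    bool mapped = false;
    int value = 0;
    int* map()   { mapped = true; return &value; }
    void unmap() { mapped = false; }
};

// RAII guard: unmap runs in the destructor on every exit path,
// including early returns and exceptions, so the pairing can't
// be forgotten.
class MapGuard {
public:
    explicit MapGuard(FakeBuffer& b) : buf_(b), ptr_(b.map()) {}
    ~MapGuard() { buf_.unmap(); }
    MapGuard(const MapGuard&) = delete;
    MapGuard& operator=(const MapGuard&) = delete;
    int* get() const { return ptr_; }
private:
    FakeBuffer& buf_;
    int* ptr_;
};
```

The guard is non-copyable precisely because the unmap must happen exactly once.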
DirectX is only barely C++. It's structs and virtual interfaces. They actually provide a full C interface. The API itself is exceptionally C-like IMHO. It does not, for example, return unique_ptr or vectors. There's no std::string or std::function. In fact there's no std:: anything.
Don't get me wrong I like namespaces, constructors, and methods over C-style "namespace", no constructors, and an ocean of loose functions. If people want to create and use C++ wrappers around C APIs that's great. But OP's question was "Why does OpenGL persist with a C style API instead of a more expressive one?".
Writing a C API and wrapping it with C++ is very different from writing an "expressive" C++ interface imho.
A C++ wrapper doesn't need std::anything to be C++.
Barely C++ is still an improvement over bare bones C.
LibGNM(X) and GX2 are also not std::whatever_else, but again they build on not being a bare-bones C API stuck in the days of IrisGL.
Finally, Metal is a mix of Objective-C and C++, both definitely an improvement.
Then there is the whole issue that they are proper frameworks, not the "here is a specification, now go hunting for how to load fonts, models, materials" approach that Khronos takes.
Agreed 100%. I quite like C APIs. They're simple, elegant, and can be wrapped with RAII or your preferred style.
A C++ API imposes its style on you. Usually badly. Never mind the ABI issues.
Most interfaces could be improved by re-writing them in C. I can’t say the same for the inverse.
That said, OpenGL is a terrible API that is nearly impossible to use correctly. But neither C nor C++ have much impact on its horrifically stateful architecture.
C++ APIs are made far better by templates (and soon concepts), smart pointers, objects, better type checking, etc, not just RAII. C-style APIs really can't compete, IMO.
A C API can be called by every language under the sun. C is the lingua franca of computer science.
A C++ API can't even be reliably called by other C++ programs. If I produce a shared library with a C API it can be used broadly and easily. If I produce a shared library with a C++ API it can only be used by C++ programs compiled with precisely the same settings.
Providing a C++ API really just means "compile this entire library from source". I do indeed compile a lot of great third-party C++ code from source. That's not always possible or even desirable.
If I need to use that code from a different language - for example C# for Unity or matlab for data science - then that C++ API is worthless and needs to be wrapped with a C API.
I've recently been integrating some C++ libraries into C++ based Unreal Engine. It's easier to integrate a C API with Unreal C++ than it is to integrate random C++ project with C++ based Unreal.
The fact that C++ doesn't have a standard build system doesn't help. Integrating a C++ library to compile from source requires adapting random build system A with your build system B. There's a reason C++ projects love single-header (or single header + single cpp) C++ libraries with no additional dependencies. Because they don't require jumping through a million hoops.
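The usual way out is to keep the implementation in C++ but export a C ABI over opaque handles. A minimal sketch (the `engine_*` names and `Engine` class are illustrative, not from any real library):

```cpp
#include <cassert>

// Private C++ implementation: never visible across the library boundary.
class Engine {
public:
    int frame = 0;
    void tick() { ++frame; }
};

extern "C" {
    // Opaque handle: callers in C, C#, Python, etc. only ever see void*.
    typedef void* engine_t;

    engine_t engine_create()          { return new Engine(); }
    void     engine_tick(engine_t e)  { static_cast<Engine*>(e)->tick(); }
    int      engine_frame(engine_t e) { return static_cast<Engine*>(e)->frame; }
    void     engine_destroy(engine_t e) { delete static_cast<Engine*>(e); }
}
```

The unmangled `engine_*` symbols are what C#'s P/Invoke or MATLAB's loadlibrary can bind against, regardless of which compiler built the library.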
> A C API can be called by every language under the sun.
And is usually only used as a last resort. I've lost count of the number of libraries reimplemented in different languages just to avoid using a C-style API at all.
> C is the lingua franca of computer science.
Is it? If you were to randomly draw a developer from the body of working programmers, how confident would you really be that they could competently write a good portable C program, or even write one at all? I certainly wouldn't put any money on it.
C++ is the preferred language of web browsers, interpreters, games/game engines, GUIs, heterogeneous programming frameworks, and every relevant C compiler.
And to reiterate, a C ABI is inherently more fragile than a C++ ABI simply due to the lack of name mangling, among other things. The fact that platforms choose to never touch their libc doesn't change that. But if you have to maintain a C library you will feel this.
> Is it? If you were to randomly draw a developer from the body of working programmers, how confident would you really be that they could competently write a good portable C program, or even write one at all? I certainly wouldn't put any money on it.
Perhaps what they meant is more like "C is the lingua franca of linking". CS people or developers may not be great at writing C code, but almost everything can execute it. C++ of course, but also Java and Python. I think even Haskell can call C code, but I'm not certain of that.
As a C programmer I find it really telling when an API is full of smart pointers. It means the heap is completely unmanaged and I have no control over memory allocation. That makes most C++ libraries completely unusable in embedded, high-performance, and high-reliability environments (none of which can afford arbitrary heap allocations of arbitrary sizes throughout their run time).
This is why I really, really like unique_ptr. You either explicitly take responsibility (and it implicitly gets handled) or you explicitly hand it off.
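The handoff in question can be sketched like this (toy `Resource`/`consume` names; the `live` counter just makes ownership visible): passing a unique_ptr by value forces the caller to write `std::move`, and afterwards the callee unambiguously owns the object.

```cpp
#include <cassert>
#include <memory>
#include <utility>

static int live = 0;

struct Resource {
    Resource()  { ++live; }
    ~Resource() { --live; }
};

// Taking unique_ptr by value is an explicit ownership handoff: the
// caller must std::move, and the resource is destroyed when `r`
// goes out of scope here (unless passed on again).
void consume(std::unique_ptr<Resource> r) {
    assert(r != nullptr);  // we now own it
}
```

A custom deleter can also route the destruction through a pool or arena, which partially answers the embedded-allocation objection above.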
All platform APIs are in C, not C++. The main reason IMO is the lack of standardization of the C++ ABI. This is not an obvious problem if you only ever use a single compiler, but once all the language sugar is removed, the main difference between C and C++ is that C++ binaries on a single platform, compiled with two different compilers, are only accidentally compatible beyond their C-like aspects.
On some platforms you can make even single-compiler builds in C++ incompatible if you use the STL - e.g. a std::vector may be laid out quite differently in debug and release builds, which naturally leads to faulty memory access.
There's https://openframeworks.cc/. It's basically the C++ equivalent to https://processing.org/. Unfortunately, it's not really a library but rather a complete app framework, but it shows what a C++ wrapper API could look like. It definitely makes OpenGL programming easier and more fun.
Ogre is the renderer for a number of robotics projects, like rviz (robot visualization) and Gazebo (robot simulation). I don't know that either of those necessarily showcase it in the best possible light, but more just to say that it has a rich history serving in a variety of cross-platform use cases.
I kinda understand what OP wants to say here: this work is IMO pretty valuable from an engineering perspective or as a generic project. But academia is a special (if not slightly weird) place -- an academic paper usually needs to demonstrate strong novelty (or extreme novelty if your reviewers are picky) in its idea. Seen from that angle, the paper doesn't differ significantly from previous unified shader programming ideas. In short, "creating something high quality using existing ideas" usually makes it pretty hard to get a solid accept at tier 1 PL conferences. Don't get me wrong, this very point is also my biggest disappointment with academia: I absolutely appreciate its breakthroughs in new knowledge, but the fact that it underestimates engineering work that truly helps our daily life really pisses me off.
Hi, lead author here. Thanks for your kind words on the value of our work!
This work is definitely more on the practical or engineering side of things, and we believe it has significant research value as well. Our work echoes motivations from prior works, but the choice to target C++ and its limitations represents a fundamental difference compared to prior works that instead use uncommon/new languages with expanded feature sets. We believe that the synthesis of ideas from prior work, combined with our implementation strategy of co-opting existing language features, allows us to create a cohesive, C++-based system that is a novel contribution to the field.
Nevertheless, these types of "systems" papers are sometimes challenging to review, especially if reviewers are unfamiliar with systems research. The novelty comes from the act of building the system---it lets us see what "just works," see what doesn't (e.g., shader specialization in this case), and figure out solutions to the challenges that arise along the way.
I think Kayvon Fatahalian of Stanford University does a fantastic job of illustrating the value of systems papers here: https://graphics.stanford.edu/~kayvonf/notes/systemspaper/ and the structure of our paper is certainly influenced by this philosophy.
When I was working on a game engine, I came across a few papers trying to do the same thing. I think it's more than worthy of academic study. It's no different than any other compiler research.
The problem isn't just compiling C++ to GLSL. The problem is taking C++ and allowing you to write your shader as part of your program. Right now you essentially have two options: you can write one shader that does everything and use the actual data passed to it to conditionally run what you want, or you can build a shader composition system that takes small chunks of code and pieces them together based on what data you want to pass to it. That's overly simplified, but a kind of high-level view. There are a few talks from Bungie detailing the work they put into their shader system so that technical artists could have a smooth workflow.
Ultimately, your engine can't handle custom shaders without the engine user also writing code for the CPU side, unless you have a AAA-sized engine team that develops a compiler to achieve that.
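The two options can be sketched as source-string manipulation, which is roughly how many engines implement them in practice (the shader snippets are GLSL-ish strings and the `use_fog` feature flag is made up for illustration):

```cpp
#include <cassert>
#include <string>

// Option 1: one uber-shader that branches on runtime data. Every
// feature's code is always present; the GPU pays for the branching.
const std::string kUberShader =
    "color = shade();\n"
    "if (params.use_fog) { color = apply_fog(color); }\n";

// Option 2: stitch a specialized variant from fragments, with the
// host choosing at build time which pieces to include.
std::string specialize(bool use_fog) {
    std::string src;
    if (use_fog) src += "#define USE_FOG 1\n";
    src += "color = shade();\n";
    if (use_fog) src += "color = apply_fog(color);\n";
    return src;  // compiled into its own pipeline variant
}
```

The composition approach produces lean specialized shaders, but the host-side selection logic and the shader fragments live in separate worlds, which is exactly the gap the paper's unified approach targets.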
Hi, lead author here. You're definitely getting at the core of the issue. An effective shader programming model needs to support modularity and composition of shader code, while also supporting GPU code specialization and interop between host and GPU code. Large game engines provide layered tools that aim to provide these features, and they are very useful especially for users who aren't experts in graphics programming (e.g., artists and technical artists). I'll use this opportunity to shamelessly plug our prior work, which explores these aspects of shader systems in more detail: https://dl.acm.org/doi/10.1145/3355089.3356554 (preprint here: https://escholarship.org/uc/item/2f8448n2 )
Even with these layered tools, the host-GPU interop is still something that shader programming systems have to contend with directly, in contrast to "unified" systems like CUDA and C++ AMP in the GPU compute world. As you point out, translating C++ to GPU-compatible code is just one step in the process. Our work looks at the next step: once we can write both host and GPU shader code in C++, what's missing? What we discovered is that by merging the host and GPU halves together in C++, we inadvertently broke the critical shader specialization optimization. So our work presents a method to support shader specialization in a first class way, while also meeting our other design goals.