This is similar to the work done in winapi [1], com-impl [2], and my own tangential work on porting VST3 [3] to Rust [4].
I'm really glad MS is doing this. What needs to be a bit clearer to me is how they maintain ABI compatibility under the hood of MSVC for COM interfaces (which uses the vtable layout of an inherited class) and how that's compatible with MinGW/GCC stacks on Windows, mostly what can break it. I got stuck porting VST3 with multiple inheritance, and it was a headache trying to reverse engineer the appropriate struct layouts for COM implementations.
COM is an ABI standard. The structs are defined in terms of C with Winapi (stdcall) calling convention. Very little needs reverse engineering - it's all pIntf->vTable->func(pIntf, ...). You can explain it on a whiteboard in a couple of minutes.
The way MSVC does it may need reverse engineering (it may be patented btw). I could explain how Delphi implements COM interfaces, but any specific implementation is actually more complicated than the ABI, because they're trying to add implementation ergonomics on top of the basic calling convention.
Is the ABI documented anywhere? Every time I google around for it, I just get information like "COM is ABI stable and language agnostic" but not what the ABI is. I've successfully implemented single COM interfaces and get the basics; my trouble was in implementing many interfaces on the same object, where I ran into copious segfaults when testing the Rust implementation through a reference app written in C++.
Then if you have access to public libraries, maybe one of them has some issues of Microsoft Systems Journal (later MSDN Magazine), which had plenty of low-level COM articles.
COM is from the days where good documentation was to be found in books, not on the Interwebs.
+1 for Essential COM by Don Box, it still survives my bookshelf purges..."just-in-case". IUnknown and IDispatch are burned permanently in my memory from a period of my life building a COM/CORBA bridge.
I'm gonna piggyback on your comment to give another shout out to Essential COM. I haven't touched COM in ages but that book is so good I still pick it up every once in a while -- and it was a lifesaver back when I did touch COM on a daily basis.
Wow, that's a throwback to the late 90s, early 2000s. I'm still slightly scarred from working with ATL and COM. A lot of people around here would probably be surprised to hear that there were Python -> COM bindings back in the day, and that they were even used to ship server software once upon a time. Anyway, I remember that Don Box book well.
True... I've looked at Delphi, but at this point I doubt the license fee is worth it compared to say C#. I've never known anyone who actually used Eiffel though.
What's to document? The stdcall / WINAPI calling convention is of course OS and architecture dependent, but on 32-bit x86 it can be summarized as: arguments on the stack, pushed right-to-left, callee pops.
The rest is just that interfaces are doubly indirected to get to a vtable (an array or struct of function pointers), the method is chosen by ordinal in the vtable (first three are always the IUnknown methods), and the interface is passed as the first argument.
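To make the double indirection concrete, here is a minimal Rust sketch of that layout, not a definitive implementation. `Counter`, `new_counter`, `call_add_ref` and `call_release` are hypothetical names for illustration, and the toy QueryInterface ignores the IID. `extern "system"` resolves to stdcall on 32-bit x86 Windows and to the platform's C convention elsewhere, so the sketch builds on any target.

```rust
use std::ffi::c_void;

/// The three slots every COM vtable starts with, in this order.
#[repr(C)]
pub struct IUnknownVtbl {
    pub query_interface:
        unsafe extern "system" fn(*mut IUnknown, *const c_void, *mut *mut c_void) -> i32,
    pub add_ref: unsafe extern "system" fn(*mut IUnknown) -> u32,
    pub release: unsafe extern "system" fn(*mut IUnknown) -> u32,
}

/// What an interface pointer points at: one pointer to the vtable.
#[repr(C)]
pub struct IUnknown {
    pub vtbl: *const IUnknownVtbl,
}

/// Hypothetical object: the vtable pointer comes first, private state follows.
#[repr(C)]
pub struct Counter {
    iface: IUnknown,
    refs: u32,
}

unsafe extern "system" fn counter_qi(
    this: *mut IUnknown,
    _riid: *const c_void, // toy version: the IID is ignored
    out: *mut *mut c_void,
) -> i32 {
    unsafe {
        counter_add_ref(this);
        *out = this.cast();
    }
    0 // S_OK
}

unsafe extern "system" fn counter_add_ref(this: *mut IUnknown) -> u32 {
    let obj = this.cast::<Counter>(); // valid because `iface` is the first field
    unsafe {
        (*obj).refs += 1;
        (*obj).refs
    }
}

unsafe extern "system" fn counter_release(this: *mut IUnknown) -> u32 {
    let obj = this.cast::<Counter>();
    unsafe {
        (*obj).refs -= 1;
        let left = (*obj).refs;
        if left == 0 {
            drop(Box::from_raw(obj)); // last reference gone: free the object
        }
        left
    }
}

static COUNTER_VTBL: IUnknownVtbl = IUnknownVtbl {
    query_interface: counter_qi,
    add_ref: counter_add_ref,
    release: counter_release,
};

/// Allocates a Counter holding one reference and returns its interface pointer.
pub fn new_counter() -> *mut IUnknown {
    let obj = Box::new(Counter { iface: IUnknown { vtbl: &COUNTER_VTBL }, refs: 1 });
    Box::into_raw(obj).cast()
}

/// `pIntf->vtbl->AddRef(pIntf)`: two loads, then a call with `this` first.
/// The caller must pass a live interface pointer.
pub fn call_add_ref(p: *mut IUnknown) -> u32 {
    unsafe { ((*(*p).vtbl).add_ref)(p) }
}

/// Same pattern for the third slot.
pub fn call_release(p: *mut IUnknown) -> u32 {
    unsafe { ((*(*p).vtbl).release)(p) }
}
```

A caller only ever sees the `*mut IUnknown`; everything after the vtable pointer is the implementer's private business, which is why any layout works as long as the first word is right.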
How you construct those vtables and how you select one to return in QueryInterface, and how you implement interfaces (i.e. traits) so you can convert them into a vtable is where all the work is. You can do anything you like that works as long as it's called according to the COM conventions.
Delphi works by implementing each method in the vtable with a stub which subtracts the offset of the vtable pointer in the instance data from the passed-in interface pointer, and then jumps to the implementation method on the instance now that the pointer has been adjusted. Instances look like this:
[class pointer, the native vtable] <- normal instance pointers
[vtable interface 1] <- COM interface pointers
...
[vtable interface n] <- COM interface pointers
instance field 1
...
instance field n
So you can see that in order to convert a COM interface pointer into an instance pointer that the methods expect, the COM interface pointer needs to be adjusted depending on the offset of the vtable in the instance.
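The pointer adjustment can be sketched in Rust roughly like this. It is an illustration of the idea under my own assumptions, not Delphi's actual generated code: `Widget`, its fields, the one-method `Vtbl` and the helper functions are all hypothetical, and the IUnknown slots are left out to keep it short.

```rust
use std::mem::offset_of;
use std::ptr::addr_of_mut;

/// One-method interface; the IUnknown slots are omitted for brevity.
#[repr(C)]
pub struct Vtbl {
    pub get: unsafe extern "system" fn(this: *mut *const Vtbl) -> i32,
}

/// Instance layout mirroring the diagram above.
#[repr(C)]
pub struct Widget {
    class_ptr: usize,     // stand-in for the native class pointer
    iface_a: *const Vtbl, // COM pointers for interface A point at this field
    iface_b: *const Vtbl, // COM pointers for interface B point at this field
    width: i32,
    height: i32,
}

// The "real" methods expect an instance pointer, not an interface pointer.
unsafe fn widget_width(obj: *const Widget) -> i32 {
    unsafe { (*obj).width }
}
unsafe fn widget_height(obj: *const Widget) -> i32 {
    unsafe { (*obj).height }
}

/// Stub for A: subtract A's field offset from `this`, then forward.
unsafe extern "system" fn a_get_stub(this: *mut *const Vtbl) -> i32 {
    unsafe {
        let obj = this.cast::<u8>().sub(offset_of!(Widget, iface_a)).cast::<Widget>();
        widget_width(obj)
    }
}

/// Stub for B: same idea, different offset.
unsafe extern "system" fn b_get_stub(this: *mut *const Vtbl) -> i32 {
    unsafe {
        let obj = this.cast::<u8>().sub(offset_of!(Widget, iface_b)).cast::<Widget>();
        widget_height(obj)
    }
}

static VTBL_A: Vtbl = Vtbl { get: a_get_stub };
static VTBL_B: Vtbl = Vtbl { get: b_get_stub };

pub fn new_widget(width: i32, height: i32) -> *mut Widget {
    Box::into_raw(Box::new(Widget {
        class_ptr: 0,
        iface_a: &VTBL_A,
        iface_b: &VTBL_B,
        width,
        height,
    }))
}

pub fn free_widget(w: *mut Widget) {
    unsafe { drop(Box::from_raw(w)) }
}

/// Interface pointers are addresses of the vtable-pointer fields.
pub fn iface_a(w: *mut Widget) -> *mut *const Vtbl {
    unsafe { addr_of_mut!((*w).iface_a) }
}
pub fn iface_b(w: *mut Widget) -> *mut *const Vtbl {
    unsafe { addr_of_mut!((*w).iface_b) }
}

/// COM-style call: load the vtable, pick the slot, pass `this` first.
pub fn call_get(p: *mut *const Vtbl) -> i32 {
    unsafe { ((**p).get)(p) }
}
```

The two interface pointers differ by one pointer width, yet both stubs land on the same `Widget`, which is the whole trick.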
In a language with multiple inheritance like C++, the compiler vendor targeting Windows may choose a layout which is suitable for COM's calling convention (there's more than one way to do MI; fat pointers, for example, are another approach, one that wouldn't be COM-compatible). If the vendor does that, then they make implementation of COM interfaces much easier. And if they don't, well, life isn't going to be easy. Technically you could do a bunch of stuff with reflection and code generation, but that's harder and harder these days with security restrictions around code writing code. You could write macros which create statically initialized structures of the right shape, and fields of the right type, to emulate the same effect as Delphi's scheme as I sketched above, or use some other method which would work with COM's calling convention.
Realize that COM is supposed to be used with code generators. You can write it in raw C even, but it quickly becomes intractable. Perhaps I did not read completely but this is something I see missing from the post -- the annotations fit what the pre-pass C++ code looks like, but don't mention that layer?
COM layout is followed by most mainstream languages that compile for Windows, namely the major C++ compilers, .NET, Delphi, Eiffel, and Ada, so it is not MSVC++ keeping ABI compatibility under the hood on its own.
My issue isn't ABI stability but the ABI itself w.r.t. vtable layout. Best I can tell it should be similar to Itanium's spec? [1]. It's been months since I did this, but iirc my problems stemmed from having multiple interfaces on top of the same implementation and the ordering/layout of those interfaces, through the IUnknown interface, which is supposed to handle that.
COM is agnostic as to how you do multiple inheritance - it doesn't have the concept. It specifies the QueryInterface protocol, but you don't need to return the same instance for the result of the QI call, just one that uses the same lifetime refcount.
Tear-off interfaces and delegated implementations are things in this world.
MSVC uses a completely different ABI from Itanium, and you shouldn't rely on the Itanium ABI to inform you what it might look like.
vtable layout in the most basic situations is going to be accidentally portable because those situations boil down to "it's a struct of function pointers," and there's only so many ways you can order the fields of a struct. But even here, MSVC uses a quite different ABI: the order of the vtable entries can change if you overload a virtual method with a non-virtual method.
AFAIK, non-virtual methods never affect the vtable layout. But when you overload a virtual method with another virtual method, the ordering in the vtable is unspecified!
Also, a public COM interface mustn't have a virtual destructor, because some compilers (e.g. recent GCC) emit more than one vtable entry for it, so the slots after it no longer line up across compilers. Implementation classes might define a virtual destructor, though.
About multiple interfaces, all of them need their first 3 vtable entries pointing to the 3 IUnknown methods. Also, don't forget that when a client calls QueryInterface on any interface of the same object with the IID_IUnknown argument, you must return the same IUnknown pointer. Some parts of COM use that pointer as the object's identity.
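A rough Rust sketch of those two rules, under stated assumptions: `IFoo` and `IBar` are hypothetical interfaces with made-up IIDs (only `IID_IUNKNOWN` is the real value), both vtables carry just the IUnknown slots, and `new_obj`, `query` and `release_ptr` are illustrative helpers. The IBar slots are stubs that adjust `this` back to the start of the object, so QueryInterface for IID_IUnknown hands out the same pointer no matter which interface it was called through.

```rust
use std::ffi::c_void;
use std::mem::offset_of;
use std::ptr::{addr_of_mut, null_mut};

#[repr(C)]
#[derive(Clone, Copy, PartialEq, Eq)]
pub struct Guid { pub data1: u32, pub data2: u16, pub data3: u16, pub data4: [u8; 8] }

pub const IID_IUNKNOWN: Guid =
    Guid { data1: 0, data2: 0, data3: 0, data4: [0xC0, 0, 0, 0, 0, 0, 0, 0x46] };
// Made-up IIDs for the two hypothetical interfaces.
pub const IID_IFOO: Guid = Guid { data1: 1, data2: 0, data3: 0, data4: [0; 8] };
pub const IID_IBAR: Guid = Guid { data1: 2, data2: 0, data3: 0, data4: [0; 8] };

pub const S_OK: i32 = 0;
pub const E_NOINTERFACE: i32 = 0x8000_4002_u32 as i32;

#[repr(C)]
pub struct Vtbl {
    pub query_interface:
        unsafe extern "system" fn(*mut c_void, *const Guid, *mut *mut c_void) -> i32,
    pub add_ref: unsafe extern "system" fn(*mut c_void) -> u32,
    pub release: unsafe extern "system" fn(*mut c_void) -> u32,
    // IFoo- or IBar-specific methods would follow here.
}

#[repr(C)]
pub struct Obj {
    foo: *const Vtbl, // IFoo slot at offset 0; doubles as the canonical IUnknown
    bar: *const Vtbl, // IBar slot
    refs: u32,
}

// Implementations written against the IFoo slot, which sits at offset 0.
unsafe extern "system" fn qi(this: *mut c_void, riid: *const Guid, out: *mut *mut c_void) -> i32 {
    let obj = this.cast::<Obj>();
    unsafe {
        let iid = *riid;
        let found: *mut c_void = if iid == IID_IUNKNOWN || iid == IID_IFOO {
            addr_of_mut!((*obj).foo).cast() // always the same pointer for IUnknown
        } else if iid == IID_IBAR {
            addr_of_mut!((*obj).bar).cast()
        } else {
            *out = null_mut();
            return E_NOINTERFACE;
        };
        (*obj).refs += 1; // a successful QI hands out a new reference
        *out = found;
    }
    S_OK
}

unsafe extern "system" fn add_ref(this: *mut c_void) -> u32 {
    let obj = this.cast::<Obj>();
    unsafe { (*obj).refs += 1; (*obj).refs }
}

unsafe extern "system" fn release(this: *mut c_void) -> u32 {
    let obj = this.cast::<Obj>();
    unsafe {
        (*obj).refs -= 1;
        let left = (*obj).refs;
        if left == 0 { drop(Box::from_raw(obj)); }
        left
    }
}

// IBar stubs: step back from the `bar` field to the start of the object.
unsafe fn bar_to_foo(this: *mut c_void) -> *mut c_void {
    unsafe { this.cast::<u8>().sub(offset_of!(Obj, bar)).cast() }
}
unsafe extern "system" fn bar_qi(this: *mut c_void, riid: *const Guid, out: *mut *mut c_void) -> i32 {
    unsafe { qi(bar_to_foo(this), riid, out) }
}
unsafe extern "system" fn bar_add_ref(this: *mut c_void) -> u32 {
    unsafe { add_ref(bar_to_foo(this)) }
}
unsafe extern "system" fn bar_release(this: *mut c_void) -> u32 {
    unsafe { release(bar_to_foo(this)) }
}

static FOO_VTBL: Vtbl = Vtbl { query_interface: qi, add_ref, release };
static BAR_VTBL: Vtbl =
    Vtbl { query_interface: bar_qi, add_ref: bar_add_ref, release: bar_release };

/// Returns an IFoo pointer holding one reference.
pub fn new_obj() -> *mut c_void {
    Box::into_raw(Box::new(Obj { foo: &FOO_VTBL, bar: &BAR_VTBL, refs: 1 })).cast()
}

/// Calls QueryInterface through whatever interface pointer `p` is (must be live).
pub fn query(p: *mut c_void, iid: &Guid) -> (i32, *mut c_void) {
    let mut out: *mut c_void = null_mut();
    let hr = unsafe { ((**p.cast::<*const Vtbl>()).query_interface)(p, iid, &mut out) };
    (hr, out)
}

/// Calls Release through whatever interface pointer `p` is (must be live).
pub fn release_ptr(p: *mut c_void) -> u32 {
    unsafe { ((**p.cast::<*const Vtbl>()).release)(p) }
}
```

Sharing one refcount across both slots while keeping the IUnknown answer stable is the part that is easy to get wrong when hand-rolling multiple interfaces; mismatched offsets in the stubs tend to show up as exactly the kind of segfaults described earlier in the thread.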
> COM has been superseded by WinRT which builds on COM to provide even more guarantees about the binary interface.
Yet the WinRT page says:
> The Windows Runtime is based on Component Object Model (COM) APIs, and it's designed to be accessed through language projections. A projection hides the COM details, and provides a more natural programming experience for a given language.
Which implies that WinRT doesn't supersede (i.e. replace) COM but uses COM itself (though the "language projection" part makes me think that the way it is used is so convoluted that you need language extensions to make it tolerable :-P).
Basically there is a new base interface, IInspectable instead of IUnknown, and many more datatypes are allowed to cross the wire between processes, making it easier to also use generics, lists and arrays across COM libraries.
Additionally, .NET metadata libraries are used instead of the old COM type libraries.
If you prefer, it is the original design of .NET before the CLR came into play, and related to the Longhorn architecture, just using COM instead of .NET. A path started in Vista.
Now for this Rust approach to win the hearts of Windows devs, it needs to be as productive as C++/CX or C++/WinRT VS tooling, in all COM/UWP usage scenarios.
> Now for this Rust approach to win the hearts of Windows devs, it needs to be as productive as C++/CX or C++/WinRT VS tooling, in all COM/UWP usage scenarios.
Possibly related, I just spotted this [1] on the personal blog of Kenny Kerr, creator of C++/WinRT:
> On a personal note, I’m spending a lot of my time working with the Rust language and can’t wait to share more about that eventually.
Disclosure: I work on Windows at Microsoft, but not in this area. Posting on my own behalf though.
One thing I'm not sure about with WinRT and cannot figure out from the docs: AFAIK it is possible to "bit bang" (or "byte bang" if you will) a COM object by arranging everything in memory using plain old C structs (or the equivalent for other languages) from compilers that weren't COM aware. Is it possible to do the same with WinRT or does it require the language to be WinRT aware? E.g. could I get something like TCC and write some code that uses a WinRT object even though TCC has no idea about WinRT?
> it is possible to "bit bang"... a COM object by arranging everything in memory using plain old C structs ... from compilers that weren't COM aware. Is it possible to do the same with WinRT or does it require the language to be WinRT aware?
Yes, actually, you can do that. I had such code working. Basically declared the IInspectable VTable as a C struct and filled it with my own implementation. Then wrote an "activator" for the relevant classes the same way. [What COM used to call factories.]
Did the same thing with a few VTables for MS-authored WinRT classes calling into them in the other direction.
All this was tedious, I had complicated reasons for wanting to do this and I eventually switched to WRL when those reasons disappeared. Writing COM/WinRT from C++ with something like WRL is much nicer.
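For reference, the hand-declared IInspectable vtable looks roughly like this when written out in Rust rather than C. This is a declaration-only sketch: the type and field names (`HString`, `HResult`, `IInspectableVtbl`, `INSPECTABLE_SLOTS`) are my own, `HSTRING` is treated as an opaque pointer-sized handle, and the trust level is passed as a plain `i32`.

```rust
use std::ffi::c_void;
use std::mem::size_of;

#[repr(C)]
pub struct Guid { pub data1: u32, pub data2: u16, pub data3: u16, pub data4: [u8; 8] }

pub type HResult = i32;
pub type HString = *mut c_void; // opaque handle; never dereferenced here

#[repr(C)]
pub struct IInspectableVtbl {
    // Slots 0-2: IUnknown, exactly as in classic COM.
    pub query_interface:
        unsafe extern "system" fn(*mut c_void, *const Guid, *mut *mut c_void) -> HResult,
    pub add_ref: unsafe extern "system" fn(*mut c_void) -> u32,
    pub release: unsafe extern "system" fn(*mut c_void) -> u32,
    // Slots 3-5: what IInspectable adds on top.
    pub get_iids: unsafe extern "system" fn(*mut c_void, *mut u32, *mut *mut Guid) -> HResult,
    pub get_runtime_class_name: unsafe extern "system" fn(*mut c_void, *mut HString) -> HResult,
    pub get_trust_level: unsafe extern "system" fn(*mut c_void, *mut i32) -> HResult,
}

/// Number of pointer-sized slots before any class-specific methods start.
pub const INSPECTABLE_SLOTS: usize = size_of::<IInspectableVtbl>() / size_of::<usize>();
```

Any WinRT interface then starts with these six slots and appends its own methods, so a compiler that knows nothing about WinRT can still call or implement one as long as it can lay out a struct of function pointers.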
The documentation is a bit lacking, as it appeared with WinRT, then we had UAP, and finally UWP. While it all boils down to the same thing, those marketing changes did cause the documentation to keep changing.
Alongside C++/CX, there was the low-level "Windows Runtime C++ Template Library (WRL)", which was basically an evolution of ATL, with MIDL 2.0.
COM interop used to be a big buzzword say, a decade ago, but I haven't really heard or seen much usage of it since.
Most of my experience in using it has been horribly abusing it to drive Office applications and Notes to do some awful things because at the time there weren't any libraries to manipulate certain file types. I have done some truly heinous things with the Lync/Skype COM-based client SDK, back in the times when the Office 365 version of that product was viable, and there were no other API options available.
> I haven't really heard or seen much usage of it since.
I think COM is incomprehensible to most developers today--it certainly is to me, and I maintained a .NET application with COM components for several years! COM is so arcane and verbose.
Why invest hundreds of hours learning something with no clear benefit or future? The documentation introduces COM as:
> COM is a platform-independent, distributed, object-oriented system for creating binary software components that can interact. COM is the foundation technology for Microsoft's OLE (compound documents) and ActiveX (Internet-enabled components) technologies.
My point is that for the vast majority of developers this does not matter one bit. The Windows API is not relevant to the work that most developers do today. In the cases where a developer actually needs to interact with Windows there are easier technologies available.
Marshaling COM objects for use among different threads was fun. Remember the function 'CoMarshalInterThreadInterfaceInStream'? :-)
I feel Windows' C API (i.e., the Win32 API) was adopted at a larger scale than COM. It's much easier to use the Win32 C API than COM objects. You'd have to use COM if you wanted to program the shell or DirectX, but for almost everything else you could use the C API.
UWP is ambiguously defined and the acronym's usage is on the way out. An evolution of UAP and Windows Store apps, it boiled down to a whitelist of APIs allowed on multiple devices (the U in Universal) and encompassed Win32, WinRT and COM APIs. UWP itself isn't an entity with APIs, is not a technology, and the original concept is dead thanks to new platform leadership at Microsoft. (Still in play on HoloLens, Surface Hub, and other sandboxed devices for now...)
That said, "UWP" is confusingly also used to refer to Windows apps that tap into WinRT APIs, participate in the modern app lifecycle, are modeled and have identity, or generally tap into newer technologies such as AppContainer. You'll see this, for example, perpetuated in Visual Studio in its project templating. It's a BIG OL' MESS right now; I'm working with/pushing Microsoft to get this communicated clearly and mopped up but old habits die hard.
These docs are outdated and incorrect, sadly. For example, nearly everything under the UWP heading can be done via traditional Windows apps. Current leadership seems to get it: "write an app, use any APIs you need" (rough paraphrasing). It just hasn't fully set in internally and externally, so these warts exist.
On a technical level, there are still some apps that can get CoreWindows, respond to PLM events, receive callbacks from contracts involving CoreWindow UI like Share target, as well as go into certain UI modes involving shell participation like modern fullscreen and CompactOverlay, and some that can't (without stapling them together with one that can via Desktop Bridge) ... aren't there? Maybe that will change in the future but it hasn't yet.
And then there are some OS editions where some apps can't run or can only run in a container environment.
Yeah we're on a bit of a purge right now to remove a lot of MS Office automation. We're just reading data from files - nothing fancy - but it's all over the place and apparently technically outside of licensed usage so compliance is cracking down on us.
COM is an interesting concept. If it were never invented I have a feeling native apps would be spinning up a simple REST API server instead, since that's what we all do now apparently.
It's mostly a reflection of the fact that you don't have to implement COM interfaces to use DirectX—you only have to call them. Also, the constructors are wrapped in functions like D3D11CreateDevice() so that you don't have to use raw functions like CoCreateInstance().
If you're using an API like DirectWrite (at least older versions) that requires you to implement interfaces to do basic things like load fonts from memory, then the COM underpinnings of the APIs become very apparent.
I'm interested in more information regarding the Safe Systems Programming Languages team's work. Did they assess languages other than Rust?
I'm scanning through their blog entries, without having read every single one in depth so please pardon me if I miss something, and I'm not seeing anything mentioning formal methods or other systems programming languages.
It's a very interesting development for sure!
I have played with COM in Go to a lesser extent. It’s not too bad actually. My intent was to write Go wrappers around all of the TSF COM interfaces, but I burnt out after getting basic IME working - TSF is enormous!
Ah I see.. sadly, I never really figured out a great way to support full COM interop in Go. It would be ideal if you could implement an interface and register it as a COM object, but I hadn’t hashed out exactly how.
[1] https://github.com/retep998/winapi-rs/blob/0.3/src/macros.rs
[2] https://github.com/Connicpu/com-impl
[3] https://github.com/steinbergmedia/vst3sdk
[4] https://github.com/m-hilgendorf/cli-host (sorry for the messy code, it was a weekend of hacking away trying to host a VST3 in pure Rust)