Hacker News
Comtypes: How Dropbox learned to stop worrying and love the COM (dropbox.com)
91 points by frsandstone on Oct 4, 2012 | 34 comments



COM is a necessary evil when integrating with Windows, Office, Visual Studio, or any other big/old Microsoft product. That's just how it works.

The .NET framework actually does a pretty good job of hiding the complexity from you, but having done some serious integration with VS, let me say that's an abstraction leak that I wouldn't wish upon my worst enemy.


I never thought the .NET Framework did enough complexity hiding to justify a 60MB installed size, especially if it was just providing a bit of plumbing to get you back down to the COM level.

Most COM interfaces support late binding, so you can easily call them from a language like JavaScript, which is already installed in Windows as part of the platform. Ten years before node.js existed, you could run this JavaScript on Windows 2000 to read and print a file:

    var ForReading = 1; // IOMode constant; not predefined in a plain WSH .js script
    var fso = new ActiveXObject("Scripting.FileSystemObject");
    if ( fso.FileExists("myfile.txt") ) {
       var ts = fso.OpenTextFile("myfile.txt", ForReading);
       WScript.Echo(ts.ReadAll());
       WScript.Quit(0);
    } else {
       WScript.Echo("File Not Found");
       WScript.Quit(1);
    }
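
For anyone who hasn't met late binding before, here is a rough Python sketch of the idea: the caller resolves a member by name at run time and then calls it by a numeric ID, which is roughly the job IDispatch does with GetIDsOfNames and Invoke. `LateBoundObject`, `get_id_of_name`, `invoke`, and `FakeFileSystem` are names I made up for illustration; none of this is a real COM or comtypes API.

```python
class LateBoundObject:
    """Toy stand-in for an IDispatch-style object (hypothetical, not real COM)."""

    def __init__(self, target):
        self._target = target
        # Give every public callable a numeric ID, like a small DISPID table.
        names = sorted(n for n in dir(target)
                       if not n.startswith("_") and callable(getattr(target, n)))
        self._ids = {name: i + 1 for i, name in enumerate(names)}
        self._names = {i: name for name, i in self._ids.items()}

    def get_id_of_name(self, name):
        # The lookup happens at run time, so a bad name is a run-time error.
        try:
            return self._ids[name]
        except KeyError:
            raise AttributeError("unknown member: %s" % name)

    def invoke(self, member_id, *args):
        # Call by numeric ID, with no compile-time knowledge of the interface.
        return getattr(self._target, self._names[member_id])(*args)


class FakeFileSystem:
    """Hypothetical in-memory object playing the scriptable component."""

    def __init__(self, files):
        self._files = dict(files)

    def FileExists(self, path):
        return path in self._files

    def ReadAll(self, path):
        return self._files[path]


fso = LateBoundObject(FakeFileSystem({"myfile.txt": "hello"}))
exists_id = fso.get_id_of_name("FileExists")
read_id = fso.get_id_of_name("ReadAll")
if fso.invoke(exists_id, "myfile.txt"):
    contents = fso.invoke(read_id, "myfile.txt")
else:
    contents = None
print(contents)
```

The point is only that the script host needs no headers or type information up front, which is why a scripting language can drive these objects at all.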


Well, that's a facetious argument if I've ever seen one :). .NET does a lot more than just interface with COM or the Windows API. VB (pre-.NET days) had much better integration with OLE/COM and the runtimes were like 1 MB in size at worst.

I've always thought of COM as one of those almost brilliant pieces of technology that somehow got completely ignored. Of course, it's still commonplace in the Windows world, but the concept facilitates reusability so much, it's a shame it wasn't adopted by other platforms.


I think Mozilla used something similar in Firefox: https://developer.mozilla.org/en-US/docs/XPCOM

Other than that I don't know any other companies who did something like COM.


The .NET framework is more than just a bit of COM plumbing.


Where have you found the abstraction leaky? I've generally found it works just fine; the only leaky part of it I've found is in some of the Microsoft-provided .NET wrappers, which I've seen leak memory. (Directly using the COM objects works around that 100% of the time for me.)

Edit: I'm not really being clear above. When I'm saying I've found the leaky abstraction to be around memory management, I'm not conflating "leaky abstractions" with "memory leaks," except insofar as the big point where I find problems with COM is around the need to be much more explicit with resource management. That the leaky abstraction happens to be around the...er, leaks, is coincidental.


Leaky abstractions have nothing to do with leaking memory. A leaky abstraction is a layer that attempts to reduce complexity and fails at its job by not hiding enough of the ugly parts.

It's a Spolsky-ism: http://www.joelonsoftware.com/articles/LeakyAbstractions.htm...

Edit: Egg meet face.


I actually work at Fog Creek for Joel; I just happened to pick an absolutely horrid way to phrase what I was trying to convey. (I edited the original post to clarify.)

The memory management is the part of COM that leaks through the abstraction layer in my experience, but it has usually leaked through in MS's attempt to actually properly jacket the things in full-blown .NET objects. I've otherwise not had issues.

I have a hunch I'd also complain about the remote proxying if I used DCOM from .NET, but I haven't.


I think the parent means that it's a leaky abstraction because normally the GC takes care of deallocating memory of CLR objects but it can't help you with native code.

Also, parent works for Fog Creek.


The average consumer of Microsoft code instantiates a few objects and calls a few methods. That, for the most part, works extremely well. Memory management, as you mention, is a big pain point because you immediately are faced with reference counting. However, by itself, that's not a big issue.

Things go crazy once you decide that you want to give your COM object to somebody else in a framework-style callback pattern. For example, if you want to extend Visual Studio, you need to implement a bunch of interfaces, manage memory yourself, deal with object registration and GUIDs, worry about invalid interfaces and versioning, etc. And that's just all the standard COM stuff, that you're faced with 100% of the time.

But there are compounded complexities. COM uses HRESULT return values, .NET uses exceptions. Most public interfaces quickly start looking like dynamically typed code: full of references to IUnknown and QueryInterface calls. You need to understand all the threading apartment model stuff, but the .NET libraries bundle up some best practices there that often don't play nice if you need to slightly bend the rules.
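
To illustrate the HRESULT versus exception mismatch, here is a minimal Python sketch of the kind of translation layer a wrapper has to provide. The three HRESULT values are the standard ones; `ComError`, `check`, `failed`, and `query_interface` are hypothetical helpers, and the interface names are plain strings standing in for real GUIDs.

```python
S_OK = 0x00000000
E_NOINTERFACE = 0x80004002
E_FAIL = 0x80004005


class ComError(Exception):
    """Hypothetical exception type that carries the original HRESULT."""

    def __init__(self, hresult):
        self.hresult = hresult & 0xFFFFFFFF  # normalize signed 32-bit values
        super().__init__("HRESULT 0x%08X" % self.hresult)


def failed(hresult):
    # The severity bit (bit 31) is set for failure codes.
    return bool(hresult & 0x80000000)


def check(hresult):
    """Raise for a failing HRESULT; pass success codes through unchanged."""
    if failed(hresult):
        raise ComError(hresult)
    return hresult


def query_interface(supported, iid):
    # Hypothetical stand-in for a QueryInterface call returning an HRESULT.
    return S_OK if iid in supported else E_NOINTERFACE


supported = {"IUnknown", "IDeviceItem"}
check(query_interface(supported, "IDeviceItem"))  # succeeds silently
try:
    check(query_interface(supported, "IVsDebugger"))
    error_text = None
except ComError as exc:
    error_text = str(exc)
print(error_text)
```

Every call site either wraps the return value like this or checks it by hand, which is where the two error models start to grind against each other.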

Finding documentation is impossible because the web is dominated by low quality questions with low quality answers using the same terminology. Using the Microsoft personas, the Morts and Elvises drown out the Einsteins in the Google results. However, even having the VS source (which I did), it's far too big to grep and far too many levels of QueryInterface indirection to grok.

Need some enum values? You'll need to redefine them yourself. Go find the C headers and copy out the values.

Need to deploy a new version? Good luck understanding the rules of AssemblyVersion, AssemblyFileVersion, the class loader, Locales, etc. Some random error happening somewhere in the class loader? You need to install some little utility program that some guy in DevDiv made that makes a secret registry key set up some secret logging to some secret IPC channel. Argh.

I was tasked with wiring up deploy/debug for WinPhone 7. I needed to implement a few dozen methods from the IVsDebugger interface, do all my own threading and networking, and then spend a few weeks debugging synchronization issues with the UI. Meanwhile, I had the exact same task for Familiar Linux on a Compaq iPaq several years prior to that. Despite Eclipse's bloat rivaling Visual Studio's, all I needed to do was change a config file from `$PROG $ARGS` to `ssh $HOST gdb $PROG $ARGS`.

EDIT: Just wanted to add that I found this post difficult to write. I, for the life of me, couldn't recall the interface name IUnknown. In the 3 years I've been gone, I've forgotten more about COM than most .NET programmers will ever know. I'm quite happy about that...both for me escaping, and for those .NET programmers lucky enough to remain ignorant of these things. In general, all of DevDiv does a great job keeping this complexity away from people writing line of business applications.


Okay; I get why we're having different experiences.

When I'm writing managed code that needs to work heavily with COM, I write it in C++ using ATL, and expose that a level higher for calling from .NET. This avoids 90% of the issues you're hitting, while making it trivial to use COM from .NET (or, for that matter, any other non-C++ language).

The good news is that WinRT does legitimately solve most issues managed languages have. To me, far more importantly than WinRT's ability to export full-blown classes, is its ability to export much, much richer metadata, providing a straightforward way to work with WinRT objects from any language that can process the new metadata format. No more secret enums in header files, no more HRESULT insanity (HRESULTs are transparently mapped to exceptions). Just clean, cross-language calling. We've finally realized the original promise of COM.

Of course, that goal was met by "cheating." In my opinion, Microsoft achieved their goal merely by pushing the abstraction layer further into C++. In practice, I think that's an overall win--it certainly makes most of your points about using COM from managed languages moot--but it also means that your "native" libraries are going to be significantly more WinRT-specific. Or, if you prefer: instead of writing an ATL jacket to make the COM interface clean for .NET, I'll instead write a WinRT jacket to make a C++11 library clean for .NET.

The more things change...


Semi related: Not sure how many people have played with PowerShell/COM, but it's fun to toy with. Haven't used it for anything too useful (yet) but for example, to delete all the comments in a Word document in a couple lines:

    $a = New-Object -com Word.Application
    $a.visible = $false
    $a.Documents.Open("{absolute path}").DeleteAllComments()

Better examples: http://www.simple-talk.com/dotnet/.net-tools/com-automation-...


And this and VBA are why Office is so irreplaceable. I wrote a report generation tool in 1999 with Word 97. It picked a report definition off a file share, assembled documents from fragments on disk (up to 50 pages a pop with tables, graphs, etc.) and emailed or printed them. It produced 2000 documents an hour on a single Pentium Pro 200 with 128 MB of RAM on NT. It took 2 days to write and was used until 2009 by 9000 users.

This was just Word, COM, and nothing else. Even the application host was an instance of Word.

I doubt it could have been done with anything else then or now.


Provocative title for a pretty generic article.

1. Summary of COM, a widely used technology

2. Summary of comtypes, a python package for interacting with COM

3. One example of a gotcha they ran into with comtypes and arrays of COM objects.

Conclusion: meh.


How is the title provocative?


The title, "How Dropbox learned to stop worrying and love the COM" is a reference to the film "Dr. Strangelove or: How I Learned to Stop Worrying and Love the Bomb". The object taking the place of the atomic bomb in the variant title is COM. To me this implies that the article will describe why people worry about COM and why they're wrong. I did not find it to contain either.

Admittedly I have never seen the movie, which wikipedia describes as a "dark comedy satirizing the nuclear scare". I suspect the article doesn't follow along those lines either.


> I have never seen the movie

You should fix that.

In addition, the bomb in question was the result of an ill-conceived Doomsday device to prevent the use of bombs.

In other words, this title suggests that as part of a policy of mutually assured destruction, Microsoft created a device to unleash COM if it ever sensed it was being attacked. They just forgot to tell anyone about the device (why didn't you tell the world!?!) and now we are all living with the fallout.


EJ...is that you? What a coincidence. What's up, old office-mate?


Indeed. It's a small world. :)


Am I reading this correctly? They are using comtypes as an exploratory tool, but the photo upload feature was written in something else...

(I haven't disassembled the dropbox dlls, but there aren't any obvious python signatures in the install directory)

If the above is wrong, I would love to know what they are using!


No, Dropbox is using Python and comtypes in its actual product.


Any idea what they are using to package the Python into the exe?

I've tried py2exe and all of the similar packagers that I could find, and every one of them requires brittle incantations to get everything packaged (I'm using a lot of graphics and number-crunching code beyond standard python, but all from the package index). I guess on the scale of Dropbox this is probably not a big issue. On my scale (1), it's one of the reasons I've started using C# (an unexpectedly awesome experience so far, btw). I need the fastest possible path to one-click exe files that non-technical (bio/chem) researchers and RAs can run. I've belatedly realized that the overhead of doing this in python is really too high.

[edit: found this SO answer: http://stackoverflow.com/questions/2678180/how-does-dropbox-... but it doesn't give specifics. In particular, AFAIK, none of the packagers can eat libpython.dll into a larger exe file]

[edit: interesting - http://blog.codepainters.com/2012/09/17/python-care-and-feed... ]


SharpDevelop and IronPython will give you one-click packaging to exe. IronPython has a built-in compiler, so really you can use it with any IDE.

IronPython is actually very cool if you want/have to live in .Net Land: it's mature, simple, and supports Python 2.7.


... but an EXE that has a dependency on the .NET runtime


Like every other application on Windows.


Some people just really, really liked Windows 2000 and they aren't giving up on it.


It is straightforward to tell py2exe to put the dll in the exe file:

http://www.py2exe.org/index.cgi/SingleFileExecutable#line-30...

But that doesn't say anything about other dependencies.


I use py2exe to package a complex application using a number of libraries into a single exe that contains everything except the graphics resources it uses (those are in a folder in the same dir). I'd be happy to help you set it up if you need help. Poke me on IRC, I'm Kliment on freenode. I have a bat file that runs "python setup.py py2exe" and out comes an executable. I rerun that bat file when I change stuff.


PyInstaller can bundle everything in the exe itself, not sure about py2exe.


> So, the following code

    for i in range(100):
        print idevice_item_array[0]

> actually crashes Python because it creates and destroys a hundred IDeviceItem objects, resulting in a hundred calls to Release on the real COM object. After the first Release call, the COM object is considered deleted.

This doesn't sound like a "gotcha". It sounds like a huge bug that comtypes needs to address.
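
To make the failure mode concrete, here is a toy simulation. It assumes, based on my reading of the quote, that each temporary wrapper calls Release when it is discarded without a matching AddRef when it is created. `FakeComObject`, `Wrapper`, and `touch_item` are invented for illustration and are not comtypes classes; cleanup is an explicit `close()` rather than garbage collection so the behavior is deterministic, and the balancing AddRef is just one plausible fix.

```python
class FakeComObject:
    """Toy reference-counted object; not a real COM object."""

    def __init__(self):
        self.refcount = 1   # the array holds the only real reference
        self.alive = True

    def AddRef(self):
        self.refcount += 1
        return self.refcount

    def Release(self):
        if not self.alive:
            # A real COM object would already be freed memory at this point.
            raise RuntimeError("Release called on a destroyed object")
        self.refcount -= 1
        if self.refcount == 0:
            self.alive = False
        return self.refcount


class Wrapper:
    """Hypothetical wrapper that calls Release when it is closed."""

    def __init__(self, obj, add_ref):
        self._obj = obj
        if add_ref:
            obj.AddRef()    # balances the Release done in close()

    def close(self):
        self._obj.Release()


def touch_item(obj, times, add_ref):
    # Create and discard one wrapper per access, as the quoted loop does.
    for _ in range(times):
        Wrapper(obj, add_ref).close()


buggy = FakeComObject()
try:
    touch_item(buggy, 100, add_ref=False)
    outcome = "survived"
except RuntimeError:
    outcome = "crashed"

fixed = FakeComObject()
touch_item(fixed, 100, add_ref=True)
print(outcome, fixed.refcount, fixed.alive)
```

In the unbalanced case the object dies on the first access and the second access touches a dead object; with a matching AddRef the count returns to 1 after every access.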


If you read Don Box's book on COM he takes you through each part of it explaining in detail why it is the way it is. Every point is justifiable and the book is convincing.

However, it clearly didn't work as a whole, since Microsoft felt the need to replace it with .NET for application-level programmers.


"COM is Love" - Don Box [1]

[1] http://en.wikipedia.org/wiki/Don_Box


What do you expect, he wrote the book on it.


And I wrote the foreword to the book on it :-)



