You mean, the original System 1.0 of the 1984 era? Did not support threads, no.
There was a thing called "Desk Accessories", sort of single-window mini apps (think Calculator), that were implemented as a special kind of device driver. But IIRC even these were serviced by periodic calls to SystemTask(), which would make sure every open desk accessory got some time.
To get around this, we hacked the OS in crazy ways. For example, all of the system APIs were in a big address table. So if you had loaded some code and wanted to get a little slice of execution time to yourself, patching your function's address into the table in place of SystemTask() was a common technique (not forgetting to call the original SystemTask() address before finishing, of course)! Since everyone did this (including Apple), you can imagine how long sequences of hacks patched on top of each other would lead to a really unstable system. Hence the classic Mac OS reputation.
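For anyone who hasn't seen the patch-and-chain trick, here is a toy model of it in Python. It is only a sketch of the pattern, not real Toolbox or 68k code: `trap_table`, `install_patch`, `keyboard_patch` and `clock_patch` are made-up names, and a dict stands in for the address table.

```python
# Toy model of the trap-patching trick; not real Toolbox code.
# trap_table, install_patch, keyboard_patch and clock_patch are hypothetical names.

log = []

def system_task():
    """Stand-in for the original SystemTask() routine."""
    log.append("SystemTask")

# The "big address table": trap name -> routine currently installed.
trap_table = {"SystemTask": system_task}

def install_patch(trap_name, make_patch):
    """Swap a new routine into the table, handing it the old address."""
    original = trap_table[trap_name]
    trap_table[trap_name] = make_patch(original)

def keyboard_patch(original):
    def patched():
        log.append("poll keyboard")   # our borrowed slice of time
        original()                    # chain to whoever was installed before us
    return patched

def clock_patch(original):
    def patched():
        log.append("update clock")
        original()
    return patched

install_patch("SystemTask", keyboard_patch)
install_patch("SystemTask", clock_patch)   # a second patch stacks on the first

# An application calling SystemTask() now runs the whole chain.
trap_table["SystemTask"]()
print(log)
```

The last patch installed runs first and each one hands off to the one below it. If any patch in the chain skips the call to `original()`, everything underneath it silently stops getting time, which is one way those stacks of patches went wrong.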
I once wrote a driver for a custom-built hardware keyboard which plugged in via the serial port. The keyboard driver was basically a Control Panel (CDEV) with an INIT resource that patched something (I'm guessing SystemTask() but my memory is hazy...not sure why I didn't just make it a straight-up device driver) to check for bytes from the keyboard, which it would post into the event queue, and they'd show up in the running application.
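Roughly, the polling side looked like the sketch below. This is a Python toy model under stated assumptions, not the original driver: `serial_buffer` and `event_queue` are stand-ins for the serial input buffer and the system event queue, and `poll_keyboard` and `get_next_event` are hypothetical names.

```python
from collections import deque

# Toy model only. serial_buffer and event_queue stand in for the serial
# driver's input buffer and the system event queue; all names are hypothetical.
serial_buffer = deque()
event_queue = deque()

KEY_DOWN = "keyDown"

def poll_keyboard():
    """Drain whatever bytes have arrived and post one event per byte."""
    while serial_buffer:
        byte = serial_buffer.popleft()
        event_queue.append((KEY_DOWN, byte))

def get_next_event():
    """What the running application would call from its event loop."""
    return event_queue.popleft() if event_queue else None

# The keyboard hardware sends two bytes between polls.
serial_buffer.extend(b"hi")
poll_keyboard()              # would run inside the patched routine
first = get_next_event()
second = get_next_event()
```

The application never knows the events came from a serial device; it just sees key events arrive in its normal event loop.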
Another common technique for getting time slices was to use the interrupt manager, although that was dicey because calling into system APIs at interrupt time wasn't safe.
I, too, once wrote a driver for a hardware keyboard which plugged in via the serial port, and I did it the same way you describe. My recollection is that the keyboard/mouse driver system at the time assumed it would only be dealing with ADB devices. There was no abstraction for "a keyboard" or an input device in general; there was just a system for interacting with ADB specifically and turning its input into events.
Many of the OS calls had asynchronous versions (PBHOpenAsync, etc.) which added the request to a queue for later processing and called your completion routine when done. That was documented in Inside Macintosh Volume 1 (1985), predating MultiFinder.
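The shape of that queue-plus-completion model, as a Python toy sketch. It is not the real File Manager interface: `open_async`, `service_one_request` and `on_open_done` are made-up names, and the result code of 0 is just a pretend success.

```python
from collections import deque

# Toy model of queued async I/O with completion routines; not the real
# File Manager. open_async, service_one_request, on_open_done are hypothetical.
request_queue = deque()
results = []

def open_async(filename, completion):
    """Queue the request and return immediately, as an async call would."""
    request_queue.append((filename, completion))

def service_one_request():
    """Later on, the system processes one queued request."""
    if not request_queue:
        return False
    filename, completion = request_queue.popleft()
    result_code = 0                      # pretend the open succeeded
    completion(filename, result_code)    # call back into the caller's routine
    return True

def on_open_done(filename, result_code):
    results.append((filename, result_code))

open_async("Notes", on_open_done)
open_async("Budget", on_open_done)
pending_before = len(request_queue)   # caller keeps running while both are queued

while service_one_request():
    pass
```

The caller gets control back right after queuing, and the completion routine fires later in queue order. On the real system that callback could run at interrupt time, which ties back to the point above about system calls not being safe there.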
I could be wrong, but OS 9's Multiprocessing framework actually relies on multiple physical CPUs, from what I can tell. There's no way to use it or test it in an emulator (I've tried in the past).