I'll take your word for it; we are moving into architecture that sits below my experience now. But we are also below the level of abstraction relevant to the original question, which concerned the design of an API used for communicating with hardware outside the CPU.
Any system that shares a clock is synchronous. Two common ways of communicating with hardware are SPI, which has a shared clock line and is hence synchronous (https://upload.wikimedia.org/wikipedia/commons/thumb/f/fc/SP...), and UART, which has no shared clock (indeed, it can be implemented with a single data wire) and is hence asynchronous. Internally, the UART receiver's own clock needs to run fast enough to correctly sample the data being transmitted to it. (Ed: or I guess you might be able to do something clever with an async circuit fed directly by the signaling line.)
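To make the SPI/UART distinction concrete, here is a rough simulation, not real driver code. It assumes an 8N1 UART frame, a receiver that oversamples at 16x (a common choice, but an assumption here), and SPI mode 0 with MSB first. All function names are made up for illustration.

```python
OVERSAMPLE = 16  # assumed: receiver samples the line 16 times per bit period


def uart_line(byte):
    """Line level over time for one 8N1 frame, as seen by the oversampling receiver."""
    bits = [0] + [(byte >> i) & 1 for i in range(8)] + [1]  # start, data LSB first, stop
    samples = [1] * 5  # idle-high line before the frame (arbitrary length)
    for b in bits:
        samples += [b] * OVERSAMPLE
    return samples


def uart_receive(samples):
    """No clock wire: find the start-bit edge, then sample mid-bit using the local clock."""
    start = samples.index(0)  # falling edge marks the start bit
    value = 0
    for i in range(8):
        # Skip the start bit, then land in the middle of data bit i.
        idx = start + OVERSAMPLE * (i + 1) + OVERSAMPLE // 2
        value |= samples[idx] << i
    return value


def spi_lines(byte):
    """Clock and data lines for one byte, MSB first, data stable around each rising edge."""
    clock, data = [], []
    for i in range(7, -1, -1):
        bit = (byte >> i) & 1
        clock += [0, 1]
        data += [bit, bit]
    return clock, data


def spi_receive(clock, data):
    """Shared clock: the receiver simply samples data on each rising clock edge."""
    value, prev = 0, 0
    for clk, bit in zip(clock, data):
        if prev == 0 and clk == 1:
            value = (value << 1) | bit
        prev = clk
    return value
```

Note that `spi_receive` never has to guess at timing, since the clock line tells it when to look, while `uart_receive` only works because its local sample rate is a known multiple of the bit rate.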
Communicating with hardware outside the CPU isn't inherently synchronous or asynchronous; it's a design decision that depends on what you're trying to accomplish. Languages that don't support both methods are missing half of the design space.
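As a sketch of what "supporting both" could look like at the API level, here is a hypothetical device wrapper in Python that offers a blocking read and an `asyncio`-based one. `FakeDevice`, its latency, and the returned value are all invented for illustration; a real peripheral would sit behind an actual driver.

```python
import asyncio
import time


class FakeDevice:
    """Hypothetical peripheral that takes `latency` seconds to answer a read."""

    def __init__(self, latency=0.01):
        self.latency = latency

    def read_blocking(self):
        time.sleep(self.latency)  # the caller is stalled until the data arrives
        return 0x42

    async def read_async(self):
        await asyncio.sleep(self.latency)  # the caller yields; other tasks may run
        return 0x42


async def poll_many(devices):
    # The reads overlap, so the total wait is roughly one latency, not the sum.
    return await asyncio.gather(*(d.read_async() for d in devices))
```

Which of the two is the better fit depends on the caller: a simple sequential script may prefer `read_blocking`, while something juggling many devices would likely want `poll_many`.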