The clock is not very accurate and devices have to interpolate it, which they all do slightly differently, especially if the tempo changes. That doesn't happen much in pop, but it happens regularly in media composition.
If you're lucky the result will be in-time-ish, but it's never going to be sample accurate.
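To put a rough number on how much interpolating is involved: MIDI 1.0 timing clock runs at 24 pulses per quarter note, so at a steady tempo you can work out how many audio samples sit between two pulses. A minimal sketch (the function name and the 48 kHz rate are just my choices for illustration):

```python
MIDI_CLOCK_PPQN = 24  # MIDI 1.0 timing clock: 24 pulses per quarter note


def samples_per_clock_tick(bpm: float, sample_rate: float) -> float:
    """Audio samples between two consecutive MIDI clock pulses at a steady tempo."""
    seconds_per_quarter = 60.0 / bpm
    return sample_rate * seconds_per_quarter / MIDI_CLOCK_PPQN


if __name__ == "__main__":
    # Print the gap a receiver has to fill in by itself at a few tempos.
    for bpm in (90, 120, 140):
        print(bpm, "BPM:", samples_per_clock_tick(bpm, 48_000), "samples per tick")
```

At 120 BPM and 48 kHz that is 1000 samples per tick, and everything in between is the receiver's guess. This assumes a constant tempo; during a tempo change the receiver only learns the new rate after the pulses have already moved.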
This actually matters for some applications, including sound design. If you trigger two samples with a variable offset you can get phase cancellation and other easily audible effects.
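The phase cancellation part is easy to check for yourself. A small sketch, using a plain sine as a stand-in for the two samples (real samples are more complex, so treat this as the simplest possible case; `summed_peak` is a made-up helper name):

```python
import math


def summed_peak(freq_hz: float, offset_samples: int, sample_rate: int = 48_000) -> float:
    """Peak level of a unit sine summed with a copy of itself delayed by offset_samples."""
    peak = 0.0
    for n in range(sample_rate):  # scan one second of audio
        a = math.sin(2 * math.pi * freq_hz * n / sample_rate)
        b = math.sin(2 * math.pi * freq_hz * (n - offset_samples) / sample_rate)
        peak = max(peak, abs(a + b))
    return peak


if __name__ == "__main__":
    # 0 samples: fully in phase. 24 samples: half a period of 1 kHz at 48 kHz.
    for offset in (0, 12, 24):
        print(offset, "samples offset, peak:", round(summed_peak(1000.0, offset), 6))
```

With a 1 kHz tone at 48 kHz, an offset of 24 samples (half a millisecond) is half a period, so the two copies cancel almost completely, while an offset of zero doubles the level. If the offset wanders between triggers, the result wanders between those extremes.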
DAWs are sample accurate now, but 5-pin DIN MIDI 1.0 really isn't.
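For a sense of scale on the DIN side, here is a back-of-the-envelope sketch of how long a message occupies the wire. It uses the MIDI 1.0 serial rate of 31,250 baud with 10 bits per byte (start bit, 8 data bits, stop bit), ignores running status, and the helper names are hypothetical:

```python
MIDI_BAUD = 31_250   # bits per second on a 5-pin DIN MIDI 1.0 link
BITS_PER_BYTE = 10   # start bit + 8 data bits + stop bit


def wire_time_seconds(num_bytes: int) -> float:
    """Time a message of num_bytes spends on the wire, back to back."""
    return num_bytes * BITS_PER_BYTE / MIDI_BAUD


def wire_time_samples(num_bytes: int, sample_rate: float = 48_000.0) -> float:
    """The same duration expressed in audio samples."""
    return wire_time_seconds(num_bytes) * sample_rate


if __name__ == "__main__":
    # A clock pulse is 1 byte; a note-on is 3 bytes (status, note, velocity).
    for name, size in (("clock", 1), ("note-on", 3)):
        print(name, wire_time_seconds(size) * 1e6, "us,", wire_time_samples(size), "samples")
```

A three-byte note-on takes about 0.96 ms just to transmit, which is roughly 46 samples at 48 kHz, and anything queued behind another message waits longer still.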