It was at one time standard to store each bitplane separately. If you had a 4-color 320x200 image, for example, you'd have one page of VRAM that stored a 320x200 1-bit image, holding bit 0 of every pixel, and another, laid out exactly the same way, holding bit 1 of every pixel. And so on up to as many bitplanes as required. (That was one usual arrangement, but there are other options - e.g., interleaved bitplanes and/or no video RAM as such.)
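To make that layout concrete, here is a rough sketch of how a tiny 4-color image splits into two 1-bit planes. The function name `split_into_bitplanes` and the list-of-rows representation are just my own choices for illustration; real hardware packed the bits into bytes or words.

```python
def split_into_bitplanes(pixels, depth):
    """Split an image (rows of ints in 0 .. 2**depth - 1) into `depth` planes.

    Plane b is a 1-bit image of the same size holding bit b of every pixel.
    """
    return [
        [[(p >> b) & 1 for p in row] for row in pixels]
        for b in range(depth)
    ]


# A 4x2 image using all four colors (2 bits per pixel, so 2 planes).
image = [
    [0, 1, 2, 3],
    [3, 2, 1, 0],
]
planes = split_into_bitplanes(image, depth=2)

for b, plane in enumerate(planes):
    print("plane", b, plane)
```

Each plane on its own is a perfectly ordinary 2-color image; the color of a pixel only appears when you read the same position in every plane.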
Separate bitplanes were very annoying in many respects, and the demise of the approach was probably regretted by only a few. But storing each bitplane separately does have one major advantage: you can just write all your algorithms to operate on 1-bit images, and they automatically run at any bit depth. Just run the routine once for each bitplane you're interested in processing.
(Storing the planes separately may also make the hardware simpler to implement - one plausible reason it was so common. Not my field of expertise though...)
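As a sketch of the "write it once for 1-bit images" idea: below, a scroll routine that only knows about 1-bit images is run once per plane, and the result is a correctly scrolled 4-color image. All the names here (`scroll_left_1bit`, `apply_per_plane`, `merge_bitplanes`) are hypothetical, and the list-of-rows planes are an assumption made for readability, not how any particular machine stored them.

```python
def scroll_left_1bit(plane, n):
    """Scroll a 1-bit image left by n pixels, filling the right edge with 0."""
    return [row[n:] + [0] * min(n, len(row)) for row in plane]


def apply_per_plane(planes, routine, *args):
    """Run a 1-bit routine once for each bitplane."""
    return [routine(plane, *args) for plane in planes]


def merge_bitplanes(planes):
    """Recombine planes into pixel values (only needed here to check the result)."""
    height, width = len(planes[0]), len(planes[0][0])
    return [
        [sum(planes[b][y][x] << b for b in range(len(planes))) for x in range(width)]
        for y in range(height)
    ]


# Planes 0 and 1 of the one-row 4-color image [0, 1, 2, 3].
planes = [
    [[0, 1, 0, 1]],  # bit 0 of every pixel
    [[0, 0, 1, 1]],  # bit 1 of every pixel
]
scrolled = apply_per_plane(planes, scroll_left_1bit, 1)
result = merge_bitplanes(scrolled)
print(result)  # [[1, 2, 3, 0]]
```

The same `scroll_left_1bit` would work unchanged with 1, 4 or 8 planes. That only holds for operations that treat each bit independently (copying, shifting, masking); anything that does arithmetic on the pixel values has to look at all the planes at once.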
...but only a 2-color image.