The 8088 and 8086 actually have the same number of pins and almost identical pinouts, despite the difference in data bus width. That works because data and address are multiplexed onto the same pins, a result of Intel's obsession with keeping the pin count low (each pin adds quite a bit to the packaging cost).
On the 8086, the bottom 16 address pins (of 20) are multiplexed with data; on the 8088, only the bottom 8 are.
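To make the multiplexing concrete, here is a rough Python model of one write cycle. It is only a sketch: `bus_cycle` and `muxed_bits` are names I made up, the timing is reduced to "address first, data second", and it ignores that A16-A19 share pins with status signals on both chips.

```python
ADDRESS_MASK = 0xFFFFF  # 20 address bits


def bus_cycle(address, data, muxed_bits):
    """Simplified write cycle: return (address seen by memory, data seen by memory).

    muxed_bits is 16 for an 8086-style bus and 8 for an 8088-style bus.
    """
    mask = (1 << muxed_bits) - 1

    # Early in the cycle (ALE high) the shared pins carry the low address bits,
    # and an external latch captures them.
    shared_pins = address & mask
    latched_low = shared_pins

    # Later in the cycle (ALE low) the same pins carry data;
    # the latch keeps holding the low address bits.
    shared_pins = data & mask

    # The remaining address bits are treated here as simply available.
    upper = address & ~mask & ADDRESS_MASK
    return (upper | latched_low, shared_pins)


print(bus_cycle(0x12345, 0xBEEF, muxed_bits=16))  # 8086-style: 16 shared pins
print(bus_cycle(0x12345, 0xAB, muxed_bits=8))     # 8088-style: 8 shared pins
```

Either way the low address bits have to be latched externally; the only difference is how many pins are shared.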
So you had to demultiplex the bus no matter which chip you chose. I think the cost savings mostly come from being able to configure systems with only 8 DRAM chips per bank (instead of 16 DRAM chips per bank with the 8086). A 16 bit bus also requires that your ROM chips come in pairs, and it probably increases the motherboard routing complexity, plus a small bit of extra decoding logic.
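The chip counts above fall out of simple arithmetic if you assume 1-bit-wide DRAM chips (my assumption, and parity chips are ignored). A quick sketch, with `chips_per_bank` being a hypothetical helper:

```python
def chips_per_bank(bus_width_bits, chip_width_bits=1):
    """Chips needed to fill one bank: one chip per slice of the data bus."""
    if bus_width_bits % chip_width_bits != 0:
        raise ValueError("bus width must be a multiple of chip width")
    return bus_width_bits // chip_width_bits


print(chips_per_bank(8))   # 8088-style 8 bit bus
print(chips_per_bank(16))  # 8086-style 16 bit bus
```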
A 16 bit bus would also have required IBM to skip straight over the 8 bit ISA standard with its smaller sockets, and would have forced up the complexity of all PC expansion cards (which would also require double the ROM chips, double the RAM chips and extra logic).
Edit: And now that I think about it, the fact that the upper bits of the address bus aren't multiplexed with data might actually allow you to simplify the DRAM row/column addressing logic... But only if you put the column into the upper bits of the address.
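To illustrate the split I mean, here is a sketch that takes the row from the low (multiplexed) address bits and the column from the bits above them. The 8 bit row and column widths are an assumption on my part, and `split_row_col` is a made-up helper, not how any particular board actually wired it.

```python
def split_row_col(address, bits=8):
    """Split an address into (row, column) for a DRAM with multiplexed address pins."""
    mask = (1 << bits) - 1
    row = address & mask            # low bits: come from the multiplexed, latched pins
    col = (address >> bits) & mask  # next bits up: come from pins not shared with data
    return row, col


row, col = split_row_col(0x1234)
print(hex(row), hex(col))
```

Whether this actually saves logic depends on the timing of the row and column strobes, which this sketch doesn't model.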