No you should spin on a read. Once you see the value you want you then try the CMPXCHG. If that succeeds you exit. If it fails you go back to spinning on the read.
Read and load mean the same thing. (I think GP just missed the end of your comment.)
You care about exchange vs read/load because of cache line ownership. Every time you try to do the exchange, the attempting CPU must take exclusive ownership of the cacheline (stealing it from the lock owner). To unlock, the lock owner must take it back.
If the attempting CPU instead only reads, the line ownership stays with the lock holder and unlock is cheaper. In general you want cache line ownership to change hands as few times as possible.