That's a case where mmap isn't actually all that much faster than read, and because of the inter-processor interrupts (TLB shootdowns) needed to keep the memory mappings in sync across cores, it may end up much slower. You're grabbing large chunks and flushing the TLB a whole lot.
If you are seeking randomly and doing small reads, then mmap will help quite a bit: the data will be faulted in, and accessing it a second, third, or hundredth time will not cost much.
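To make the random, small-read case concrete, here is a minimal sketch in Python that fetches the same records once with `os.pread` and once through a read-only mapping. The record size and the helper names (`make_test_file`, `read_records_pread`, `read_records_mmap`) are made up for illustration, and `os.pread` assumes a Unix-like system. It only shows the two access patterns; it doesn't prove anything about speed on your machine.

```python
import mmap
import os
import tempfile

RECORD_SIZE = 16  # hypothetical size of one "small read", in bytes


def make_test_file(num_records):
    """Write num_records fixed-size records to a temp file; return its path."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        for i in range(num_records):
            # Each record stores its own index so reads are easy to verify.
            f.write(i.to_bytes(RECORD_SIZE, "little"))
    return path


def read_records_pread(path, indices):
    """Fetch records with one pread() system call per record."""
    fd = os.open(path, os.O_RDONLY)
    try:
        return [os.pread(fd, RECORD_SIZE, i * RECORD_SIZE) for i in indices]
    finally:
        os.close(fd)


def read_records_mmap(path, indices):
    """Fetch records by slicing a read-only mapping of the whole file.

    The first touch of a page may fault it in; later touches of the same
    page are plain memory accesses with no system call.
    """
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            return [m[i * RECORD_SIZE:(i + 1) * RECORD_SIZE] for i in indices]
```

If you want numbers, wrap both functions in `timeit` with a shuffled index list that revisits the same records many times. How big the gap is (and which way it goes) depends on your kernel, the page cache state, and the access pattern, so measure on the real workload.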