Using memory mapped files as an option
The memmap functionality seems too slow. Many systems have plenty of memory available and so simply reading the full array into memory before permuting the coordinates should speed things up significantly. It may be worth adding a flag to use the memmap method in the event that someone is on a low-memory machine, but the default should be to not use memmap. Evidence from python indicates that non-memmap functionality is faster.