It really sucks that the Gilectomy went nowhere. It looks like the only viable way to remove the GIL may be to break compatibility, and it's too soon after Py2 -> Py3 for that.
According to Raymond Hettinger, it was easy to get rid of the GIL and replace it with a bunch of smaller locks that allowed true concurrency - but the single-thread performance was dramatically worse due to all the extra locking and unlocking, and that wasn't something they were willing to sacrifice.
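To get a rough feel for that cost, here is a minimal sketch in plain Python. It is not how the interpreter's internals work; it only compares a loop with and without an uncontended `threading.Lock` acquire/release per iteration. The function names are made up for illustration, and the measured ratio will vary by machine and Python version.

```python
import threading
import timeit


def count_plain(n):
    """Increment a counter n times with no locking."""
    total = 0
    for _ in range(n):
        total += 1
    return total


def count_locked(n, lock):
    """Same loop, but take and release a lock around every increment."""
    total = 0
    for _ in range(n):
        with lock:
            total += 1
    return total


if __name__ == "__main__":
    n = 1_000_000
    lock = threading.Lock()
    # Single-threaded on purpose: the lock is never contended,
    # so any slowdown is pure acquire/release overhead.
    plain = timeit.timeit(lambda: count_plain(n), number=5)
    locked = timeit.timeit(lambda: count_locked(n, lock), number=5)
    print(f"plain:  {plain:.3f}s")
    print(f"locked: {locked:.3f}s")
```

Both functions return the same result; only the time differs. A real interpreter would be paying this kind of cost on far more operations than one increment per loop iteration.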
Are you sure it was Raymond? At least Larry Hastings (the person who attempted the Gilectomy) said something similar, though he was actually OK with slower single-threaded code as long as multithreaded performance was better. Unfortunately, even that was worse. The main problem is Python's garbage collection through reference counting. Changing the counter safely requires locks or atomic operations, and those force the cache line holding the counter to bounce between cores on every update.
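To see how often the counter gets touched, here is a small CPython-specific sketch using `sys.getrefcount`. The helper name `refcount_delta` is made up for illustration. It only shows that storing references to an object writes to that object's counter; it does not measure any locking or cache effects.

```python
import sys


def refcount_delta(n_refs):
    """Return how much an object's refcount grows when n_refs
    extra references to it are stored in a list (CPython only)."""
    obj = object()
    before = sys.getrefcount(obj)
    holders = [obj] * n_refs  # each list slot holds one reference to obj
    after = sys.getrefcount(obj)
    return after - before


if __name__ == "__main__":
    print(refcount_delta(3))
```

Every one of those increments is a write to memory shared by all threads, which is why making them thread-safe without a GIL gets expensive.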
There was an attempt (I think in PyPy) to use Software Transactional Memory (STM), which would solve this problem, but apparently it is still difficult to do, and it looks like it did not succeed.