In my experience, if you track velocity long enough and 'categorise' the values by things like team size, technology, etc., you get figures that are useful for predicting how much work you should commit to in an iteration.
Yes, things change: productivity, morale, team members join and leave, some teams are shit at estimating, etc. - but at least in my experience, if you take an average you do arrive at a useful figure.
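To make the averaging idea concrete, here is a rough sketch of what I mean. The records, the category keys (team size and technology) and the function name `average_velocity` are all made up for illustration; it assumes you already log completed story points per iteration somewhere.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical history: (team size, technology, story points completed)
history = [
    (5, "java", 30),
    (5, "java", 34),
    (5, "java", 26),
    (3, "python", 18),
    (3, "python", 22),
]

def average_velocity(records):
    """Group past iterations by (team size, technology) and average them."""
    groups = defaultdict(list)
    for team_size, technology, points in records:
        groups[(team_size, technology)].append(points)
    return {key: mean(values) for key, values in groups.items()}

forecast = average_velocity(history)
# Rough commitment guide for a 5-person Java team
print(forecast[(5, "java")])
```

The more iterations you have in each category, the less any single bad estimate or odd sprint moves the figure.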
In my experience, if you have some sort of measure that looks a bit like productivity, senior management will latch onto it and treat it as a proxy for productivity.
That inevitably means that developers have an incentive to inflate their estimates, which means story point inflation.