I tend to think of feature selection as methods like LASSO that induce conceptual sparsity in the feature space, and use the label "dimensionality reduction" for methods that reduce covariate dimensionality without inducing conceptual sparsity. PCAing 100 features down to 3 principal components may or may not lend itself to a simpler interpretation; often it just substitutes the problem of labeling loaded principal components for the original problem.
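To make the contrast concrete, here's a minimal sketch (assuming scikit-learn and synthetic data; the feature counts and `alpha` are illustrative, not from the discussion): LASSO drives most coefficients to exactly zero, so you end up reasoning about a handful of named features, while each PCA component loads on every original feature.

```python
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 100))  # 100 features
# outcome driven by only the first 5 features, plus a little noise
y = X[:, :5] @ rng.normal(size=5) + rng.normal(scale=0.1, size=200)

# LASSO: most coefficients shrunk exactly to zero -> conceptual sparsity
lasso = Lasso(alpha=0.1).fit(X, y)
print("nonzero LASSO coefficients:", np.count_nonzero(lasso.coef_))

# PCA: 3 components, but every component loads on all 100 features
pca = PCA(n_components=3).fit(X)
print("nonzero loadings per component:",
      [np.count_nonzero(c) for c in pca.components_])
```

The LASSO model stays interpretable in terms of the original features; the PCA output is lower-dimensional but each component still has to be labeled from 100 dense loadings.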
That's fair, and helpful to hear! I had dimensionality reduction in the back of my mind, and now that you mention it, your point about conceptual sparsity definitely seems important here.
Not disagreeing with you, just spitballing.