I tend to think of feature selection as methods like LASSO that induce conceptual sparsity in the feature space, and I use the label "dimensionality reduction" for methods that reduce covariant dimensionality without inducing conceptual sparsity. PCAing 100 features down to 3 principal components may or may not lend itself to a simpler interpretation, but often it just substitutes the problem of labeling the loaded principal components for the original problem.
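To make the distinction concrete, here is a rough sketch on synthetic data. Everything in it is an assumption made for illustration: the data, the penalty `alpha=0.5`, and the helper names `lasso_ista` and `pca_loadings`. The LASSO fit is a plain proximal-gradient (ISTA) loop rather than any particular library's solver, and the PCA is done via an SVD of the centered data.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 200, 10
X = rng.normal(size=(n, p))
# Assumed toy setup: only features 0 and 3 actually drive the target.
y = 3.0 * X[:, 0] - 2.0 * X[:, 3] + 0.1 * rng.normal(size=n)


def lasso_ista(X, y, alpha, n_iter=2000):
    """Minimise (1/2n)*||y - Xw||^2 + alpha*||w||_1 by proximal gradient."""
    n = X.shape[0]
    w = np.zeros(X.shape[1])
    # Step size = 1 / Lipschitz constant of the smooth part's gradient.
    step = n / np.linalg.norm(X, 2) ** 2
    for _ in range(n_iter):
        grad = X.T @ (X @ w - y) / n
        z = w - step * grad
        # Soft-thresholding sets small coefficients to exactly zero.
        w = np.sign(z) * np.maximum(np.abs(z) - step * alpha, 0.0)
    return w


def pca_loadings(X, k):
    """Return the top-k principal axes (rows) of the centered data."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Vt[:k]


w = lasso_ista(X, y, alpha=0.5)
loadings = pca_loadings(X, 3)

selected = np.flatnonzero(w)  # indices of features LASSO kept
print("LASSO keeps features:", selected)
print("Nonzero loadings per component:", np.count_nonzero(loadings, axis=1))
```

The LASSO coefficients come out sparse in the original features, so the result can be read as "these named features matter". Each principal component, by contrast, generally loads on every original feature, so you still have to decide what each component means.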
That's fair and was helpful to hear! I had dimensionality reduction in the back of my mind, and now that you mention it, your point about conceptual sparsity definitely seems important here.
The title seems to have the form "<Specific method> vs <Broader category the method fits in>".