You seem to know more about this than me, but it seems to me that the first law does more than just induce a metric; I've always thought of it as positing inertia as an axiom.
There's also more than one way to think about complexity. Newtonian mechanics in practice requires introducing forces everywhere, especially for more complex systems, to the point that it can feel a bit ad hoc. Lagrangian mechanics very often requires fewer such introductions and often results in descriptions with fewer equations and fewer terms. If you can explain the same phenomenon with fewer 'entities', then to me it feels very much like Occam's razor would favor that explanation.
Indeed, inertia. A theory of motion consists of describing the properties of inertia.
In terms of Newtonian mechanics the members of the equivalence class of inertial coordinate systems are related by Galilean transformations.
In terms of relativistic mechanics the members of the equivalence class of inertial coordinate systems are related by Lorentz transformations.
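For concreteness, here are the standard textbook forms of the two transformations, restricted to one spatial dimension with relative velocity v along x. The restriction and the symbols are my choice for illustration.

```latex
% Galilean transformation (Newtonian mechanics)
x' = x - v t, \qquad t' = t

% Lorentz transformation (relativistic mechanics)
x' = \gamma \,(x - v t), \qquad
t' = \gamma \left( t - \frac{v x}{c^{2}} \right), \qquad
\gamma = \frac{1}{\sqrt{1 - v^{2}/c^{2}}}
```

For v much smaller than c, the factor gamma approaches 1 and the Lorentz form reduces to the Galilean one.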
Newton's first law and Newton's third law can be grouped together in a single principle: the Principle of uniformity of Inertia. Inertia is uniform everywhere, in every direction.
That is why I argue that for Newtonian mechanics two principles are sufficient.
The Newtonian formulation is in terms of F=ma; the Lagrangian formulation is in terms of interconversion between potential energy and kinetic energy.
The work-energy theorem expresses the transformation between F=ma and potential/kinetic energy.
The work-energy theorem: here is a link to an answer of mine on physics.stackexchange where I derive it:
https://physics.stackexchange.com/a/788108/17198
The work-energy theorem is the most important theorem of classical mechanics.
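For anyone who doesn't follow the link, here is a minimal one-dimensional sketch of the standard derivation. It assumes constant mass m, and the notation is chosen for illustration, so it may differ in detail from the linked answer.

```latex
% Integrate F = m a over the path and use ds = v\,dt
\int_{s_0}^{s} F \, ds
  = \int_{s_0}^{s} m a \, ds
  = \int_{t_0}^{t} m \frac{dv}{dt} \, v \, dt
  = \int_{v_0}^{v} m v \, dv
  = \tfrac{1}{2} m v^{2} - \tfrac{1}{2} m v_0^{2}

% If the force derives from a potential, F = -\,dV/ds, the left side is -\Delta V:
-\Delta V = \Delta \left( \tfrac{1}{2} m v^{2} \right)
```

The left-hand side is the work done, and the right-hand side is the change in kinetic energy.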
About the type of situation where the Energy formulation of mechanics is more suitable:
When there are multiple degrees of freedom, the force and the acceleration of F=ma are vectorial. So F=ma has the property that there are vector quantities on both sides of the equation.
When expressing in terms of energy:
As we know: the value of kinetic energy is a single value; there is no directional information. In the process of squaring the velocity vector, the directional information is discarded.
The reason we can afford to lose the directional information of the velocity vector: the description of the potential energy still carries the necessary directional information.
When there are, say, two degrees of freedom, the function that describes the potential must be given as a function of two (generalized) coordinates.
This comprehensive function for the potential energy allows us to recover the force vector. To recover the force vector we evaluate the gradient of the potential energy function.
The function that describes the potential is not itself a vector quantity, but it does carry all of the directional information that allows us to recover the force vector.
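As a small illustration of recovering the force vector from a scalar potential, here is a sketch in Python. The potential (an anisotropic harmonic well with made-up stiffnesses kx and ky) and the function names are hypothetical, chosen only for the example, and the gradient is approximated with central differences.

```python
from typing import Callable, List, Sequence


def potential(q: Sequence[float]) -> float:
    """Hypothetical 2D potential: V = 0.5 * (kx * x^2 + ky * y^2)."""
    kx, ky = 1.0, 4.0  # illustrative stiffness values, not from the discussion
    x, y = q
    return 0.5 * (kx * x**2 + ky * y**2)


def force_from_potential(
    V: Callable[[Sequence[float]], float],
    q: Sequence[float],
    h: float = 1e-6,
) -> List[float]:
    """Approximate F = -grad V at q using central differences."""
    force = []
    for i in range(len(q)):
        q_plus = list(q)
        q_minus = list(q)
        q_plus[i] += h
        q_minus[i] -= h
        # Negative partial derivative of V with respect to coordinate i
        force.append(-(V(q_plus) - V(q_minus)) / (2.0 * h))
    return force


if __name__ == "__main__":
    print(force_from_potential(potential, [1.0, 0.5]))
```

At the point (1.0, 0.5) this should give approximately (-1.0, -2.0), which matches the analytic result F = (-kx x, -ky y). The potential returns a single number, yet both components of the force come out of it.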
I will argue that the power of the Lagrangian formulation of mechanics is as follows:
When the motion is expressed in terms of interconversion of potential energy and kinetic energy, there is directional information on only one side of the equation: the side with the potential energy function.
When using F=ma with multiple degrees of freedom there is a redundancy: directional information is expressed on both sides of the equation.
Anyway, describing mechanics in terms of force/acceleration and describing it in terms of potential/kinetic energy are closely related. The work-energy theorem expresses the transformation between the two. While the mathematical form is different, the physics content is the same.
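To make the "same physics content" point concrete, here is a sketch for the simplest case: a single particle of mass m, one generalized coordinate q, and a potential V(q) that does not depend on velocity. Those restrictions are assumptions for the illustration.

```latex
% Lagrangian: kinetic minus potential energy
L(q, \dot{q}) = \tfrac{1}{2} m \dot{q}^{2} - V(q)

% Euler-Lagrange equation
\frac{d}{dt} \frac{\partial L}{\partial \dot{q}} - \frac{\partial L}{\partial q} = 0
\quad \Longrightarrow \quad
m \ddot{q} = -\frac{\partial V}{\partial q} = F
```

The Euler-Lagrange equation reproduces F=ma, with the force appearing as the negative gradient of the potential.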
Nicely said, but I think then we are in agreement that Newtonian mechanics has a bit of redundancy that can be removed by switching to a Lagrangian framework, no? I think that's a situation where Occam's razor can be applied very cleanly: we can make the exact same predictions with a sparser model.
Now the other poster has argued that science consists of finding minimum-complexity explanations of natural phenomena, and I just argued that the 'minimal complexity' part should be left out. Science is all about making good predictions (and explanations); Occam's razor is more like a guiding principle to help find them (a bit akin to shrinkage in ML) rather than a strict criterion that should be part of the definition. And my example to illustrate this was Newtonian mechanics, which in a complexity/Occam's sense should be superseded by Lagrangian mechanics, yet that's not how anyone views this in practice. People view Lagrangian mechanics as a useful calculation tool to make equivalent predictions, but nobody thinks of it as nullifying Newtonian mechanics, even though it should be preferred from Occam's perspective. Or, as you said, the physics content is the same, but the complexity of the description is not, so complexity does not factor into whether it's physics.