So it’s pretending to be an average human, and the average human gets it wrong a lot? I wonder if you can improve coding results by telling it to act as an expert programmer?
Imagine you are navigating a tiny robot on Mars, or rather across a giant land region with hills and valleys.
Telling it "split it evenly" and giving it numbers is something like saying "find me the hill next to this valley". That will lead you to the valley on the left, then to the tiny hill right behind it. In this example, that hill is where all the weights about splitting stuff evenly hang around.
Telling it "split evenly and make sure the math is correct" is like saying "find me the hill next to this valley, but make sure it's the tallest". It will lead you to the same valley, but then to a different hill: adding "tallest" steers you to the intersection of the weights about splitting stuff evenly and the weights about correct math.
There is other stuff you can engineer too. Appending something like "Write the result step by step and check for correctness." navigates it to the same hill, but then on towards the "all math tests are here" hill, where the step-by-step calculation weights live, and it finds its new home there. That gives you a more detailed output and more chances to be right, since each prediction is easier: it basically "splits" the task into smaller, easier-to-predict chunks, the way you split ugly code into small testable functions to understand it.
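To make the analogy concrete, here's what that "small, checkable chunks" style looks like when a human does it: a sketch of the split-evenly task broken into one tiny step plus an explicit correctness check (the function name and cent-based representation are my own choices, not from the thread):

```python
def split_evenly(total_cents: int, people: int) -> list[int]:
    """Split an amount into shares that differ by at most one cent."""
    # step 1: everyone gets the base share; `remainder` cents are left over
    base, remainder = divmod(total_cents, people)
    # step 2: hand the leftover cents to the first `remainder` people
    shares = [base + 1] * remainder + [base] * (people - remainder)
    # "make sure the math is correct": the shares must add back up
    assert sum(shares) == total_cents
    return shares

print(split_evenly(1000, 3))  # → [334, 333, 333]
```

Each line is a small, easy-to-verify prediction, which is exactly what the step-by-step prompt nudges the model towards.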
It's not "pretending". As the comment said, the dataset has something like "paths", and that prompt narrows them down to the ones that (hopefully) relate to "getting the math right".