On the surface, it seems obvious that training an LLM with “high quality” data will lead to better performance than feeding ...