I've been subjecting AI models to a set of real-world programming tests for over two years. This time, we look solely at the ...