Or, to be more precise, the future of work for a lot of people doing "software development" depends on the need to test code.
AI is great. You can use it to write code without having to know how to program.
Except, the real value of a good software developer is knowing what code to write.
Yes, you may eventually be able to get AI to generate code that initially appears to produce what you want, but how can you determine whether it's good, efficient, or correctly covers all the scenarios you need?
It's not just a case of having code that seems to work; you also (potentially among other things) need to know that the code is correct.
AI tools can be useful to software developers, generating code more quickly than the developer could have written it themselves. This is true for almost all developers. Even me.
Today, I needed to write a function to see if two numeric ranges overlapped.
It's quite a trivial thing. There are lots of examples of this type of method in existence. I could easily write such a function, but I decided to let Copilot create it for me.
The code it produced looked fine:
public bool Intersects(int start, int length)
{
    return (start >= this.Start && start < this.Start + this.Length) ||
           (start + length >= this.Start && start + length < this.Start + this.Length);
}
All it needed was some boundary checks to return false if either of the inputs was negative.
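In code terms, the guard I added looked something like this (a sketch on top of the generated method, which lives on a type with Start and Length properties):

// Reject negative inputs before doing any range arithmetic.
if (start < 0 || length < 0)
{
    return false;
}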
That should be good enough, I thought.
But, how can I be sure?
Does it matter if it's not?
I know this is just the type of code where it's easy to make a mistake, where mistakes are hard to spot, and where any mistake is likely to produce annoying bugs.
So, I wanted to make sure the function does all that I need. That means writing some coded tests.
I also know that Copilot can generate tests for a piece of code.
So, I decided on a challenge: could I write a better or more complete set of tests than Copilot could?
I went first and I came up with 15 test cases.
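To give a flavour of the kind of cases I mean, here's a sketch using xUnit (illustrative, not my exact tests; the NumericRange type, its constructor, and the values are made up for this example):

using Xunit;

public class NumericRangeTests
{
    // An instance covering positions 10 through 20 (Start = 10, Length = 10).
    private readonly NumericRange sut = new NumericRange(10, 10);

    [Theory]
    [InlineData(0, 5, false)]   // entirely before the range
    [InlineData(0, 10, true)]   // ends exactly where the range starts
    [InlineData(12, 4, true)]   // entirely inside the range
    [InlineData(15, 10, true)]  // starts inside, ends beyond
    [InlineData(25, 5, false)]  // entirely after the range
    [InlineData(10, 0, true)]   // zero length, at the start of the range
    [InlineData(20, 0, true)]   // zero length, at the end of the range
    [InlineData(-1, 5, false)]  // negative input
    public void Intersects_ReturnsExpected(int start, int length, bool expected)
    {
        Assert.Equal(expected, sut.Intersects(start, length));
    }
}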
One of them failed.
But, it was easy to spot the issue with the code, make the change, and rerun the tests to see them all pass.
Then I let Copilot have a go. It came up with 5 test cases.
They all passed, first with the unmodified function. Then, with the modified function, they all still passed.
Being able to change the logic within the function and not see any change in test results is a clear sign that there is insufficient test coverage.
Not that this is about the number of tests; it's about the quality of the tests.
When it comes to testing, quality is significantly more important than quantity.
The question should not be "Are there tests?" or "How many tests are there?"
The question you need to ask is "Do these tests give confidence that everything works correctly?"
How do you ensure that AI-generated code is 100% what you want and intend? You have to be incredibly precise and consider all the scenarios.
Those scenarios make good test cases.
If you're working with code, whether it was written by a person or by AI, you need to test it thoroughly. And the challenging part of thorough testing, the part you can't (yet?) totally outsource to AI, is determining all the scenarios (test cases) you need to consider and account for.
What was the case that failed and how did the code need to be changed?
I needed the code to work with an input length of zero. If this was at the end of the existing range, I needed to consider it as intersecting, but the initially generated code considered it not to be.
The fix was to change the final "less than" to be a "less than or equals" and then it did what I needed.
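Putting it together (with the boundary checks from earlier), the function ended up like this:

public bool Intersects(int start, int length)
{
    // Reject negative inputs outright.
    if (start < 0 || length < 0)
    {
        return false;
    }

    // The final comparison is now <=, so a zero-length input
    // sitting exactly at the end of the range counts as intersecting.
    return (start >= this.Start && start < this.Start + this.Length) ||
           (start + length >= this.Start && start + length <= this.Start + this.Length);
}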
Having now written this up, I also realize there's a potential for overflow if the internal sums exceed int.MaxValue, but as in reality this code will only ever be used with maximum input values of a few thousand, this shouldn't be an issue...for now! ;)
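If that assumption ever stops holding, one option would be to do the internal sums as 64-bit values, something like this sketch:

// Widen to long so this.Start + this.Length can't overflow an int.
long rangeEnd = (long)this.Start + this.Length;
long inputEnd = (long)start + length;

return (start >= this.Start && start < rangeEnd) ||
       (inputEnd >= this.Start && inputEnd <= rangeEnd);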
This wasn't the end of the story: https://www.mrlacey.com/2024/11/even-more-testing-nonsense-with.html