[Before and after screenshots]
I created a series of tests (120 in total): only 24 passed, and the full run took 3 hours and 46 minutes.
No, these aren't simple unit tests. Something would be very wrong if they were.
These are complex integration, end-to-end system tests.
These tests relate to Windows Template Studio and verifying that apps generated with different frameworks (CodeBehind, MVVMBasic, MVVMLight, Caliburn.Micro, and Prism) all produce the same output.
I already have separate groups of tests which check that equivalent apps created in C# and VB.Net produce the same result.
I also have tests which compare functionality and output at a passive, static level but I suspected there was potential value in running the generated apps and checking them too.
When creating templates, the MVVM Basic version is usually created first, so this became the reference version.
Here's what each test does:
For each navigation type and page that can be added to the app:
- Generate an app with the reference framework, specified nav type, and just the specified page.
- Generate apps with each of the other frameworks using the same nav type and page.
- Disable .NET Native compilation for the projects. (So they build much faster--the tests would take far longer without this. There's a sketch of how this step could be automated after this list.)
- Build the release version of all the apps (using the created certificate.)
- Install the temporary certificate created with the app and used to sign it.
- Install the generated apps (Must be a signed, Release version for this to be possible. --This is the equivalent of what VS does when deploying/running a version for testing.)
- For each of the other frameworks:
- Create a new test project.
- Update the project to have references to the reference app and the comparison app.
- Run the tests in the test project (driving the apps as sketched after this list). They:
- Launch the ref app.
- Maximize the app once opened.
- Take a screenshot.
- Restore the app size.
- Close the app.
- Launch the comparison app.
- Maximize the app once opened.
- Take a screenshot.
- Restore the app size.
- Close the app.
- Compare the screenshots (allowing for areas that will be different--like the app name in the title bar--and running on different size screens and system scaling. See the comparison sketch after this list.)
- If all screenshot images are identical:
- Uninstall the apps.
- Remove the certificates.
- Delete the apps and test projects.
- If the screenshot images are not identical:
- Create an image highlighting the differences.
- Leave all artifacts created during testing to help investigate why the test failed.
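Here's a minimal sketch of how the .NET Native step could be automated, assuming the generated UWP projects use the usual `UseDotNetNativeToolchain` MSBuild property; the method name and project path parameter are illustrative rather than the actual test code:

```csharp
using System.Xml.Linq;

// Sketch: switch off the .NET Native toolchain in a generated UWP project
// so that Release builds complete much faster.
// Assumes the project uses the standard UseDotNetNativeToolchain property.
static void DisableDotNetNative(string csprojPath)
{
    var doc = XDocument.Load(csprojPath);
    var ns = doc.Root.Name.Namespace;

    // Set every occurrence of the property (whichever configurations define it) to false.
    foreach (var prop in doc.Descendants(ns + "UseDotNetNativeToolchain"))
    {
        prop.Value = "false";
    }

    doc.Save(csprojPath);
}
```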
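The launch / maximize / screenshot / restore / close sequence could be driven with WinAppDriver via the Appium.WebDriver package. This is a hedged sketch rather than the real test code: the WinAppDriver address, the app identifier, the screenshot path, and the `SaveAsFile` overload (Appium.WebDriver 4.x / Selenium 3.x) are all assumptions.

```csharp
using System;
using OpenQA.Selenium;
using OpenQA.Selenium.Appium;
using OpenQA.Selenium.Appium.Windows;

// Sketch of the per-app steps, assuming a WinAppDriver instance is running
// locally and the installed app is identified by its Application User Model ID.
// The appId value (e.g. "App.UnderTest_abc123!App") and paths are placeholders.
static void CaptureMaximizedScreenshot(string appId, string screenshotPath)
{
    var options = new AppiumOptions();
    options.AddAdditionalCapability("app", appId);

    using (var session = new WindowsDriver<WindowsElement>(
        new Uri("http://127.0.0.1:4723"), options))          // launch the app
    {
        var originalSize = session.Manage().Window.Size;      // remember the size so it can be restored

        session.Manage().Window.Maximize();                   // maximize the app once opened
        session.GetScreenshot()
               .SaveAsFile(screenshotPath, ScreenshotImageFormat.Png); // take a screenshot

        session.Manage().Window.Size = originalSize;          // restore the app size
        session.Close();                                      // close the app
    }
}
```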
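And the comparison step might look something like the sketch below: a pixel-by-pixel check that skips rectangles expected to differ (such as the title bar showing the app name) and writes out an image highlighting any remaining differences. For simplicity it assumes both screenshots are the same size, which the real comparison can't, given different screens and system scaling.

```csharp
using System.Collections.Generic;
using System.Drawing;

// Sketch: compare two screenshots, ignoring regions expected to differ,
// and save a diff image (differing pixels marked in red) if they don't match.
static bool ImagesMatch(string referencePath, string comparisonPath,
                        IList<Rectangle> ignoreAreas, string diffPath)
{
    using (var reference = new Bitmap(referencePath))
    using (var comparison = new Bitmap(comparisonPath))
    using (var diff = new Bitmap(reference.Width, reference.Height))
    {
        var identical = true;

        for (var x = 0; x < reference.Width; x++)
        {
            for (var y = 0; y < reference.Height; y++)
            {
                var ignored = false;
                foreach (var area in ignoreAreas)
                {
                    if (area.Contains(x, y)) { ignored = true; break; }
                }

                var refPixel = reference.GetPixel(x, y);

                if (!ignored && refPixel.ToArgb() != comparison.GetPixel(x, y).ToArgb())
                {
                    identical = false;
                    diff.SetPixel(x, y, Color.Red);   // highlight the difference
                }
                else
                {
                    diff.SetPixel(x, y, refPixel);
                }
            }
        }

        if (!identical)
        {
            diff.Save(diffPath);   // kept to help investigate why the test failed
        }

        return identical;
    }
}
```

(Per-pixel `GetPixel` calls are slow; a real implementation would more likely use `Bitmap.LockBits`, but this keeps the idea clear.)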
See, not a simple test.
Originally I had separate tests for each comparison but changed to the above approach to reduce the number of times the reference app needed to be created. The detail in the failure message meant that having separate tests didn't actually help with debugging. Also, being able to run individual tests and customize which frameworks are compared meant I could avoid unnecessary work when verifying specific fixes.
The good news about all this work is that it did find some issues with the generated projects and fixes have been made.
I've known test suites that take more than 5 days to run in their entirety, so I have strategies for managing long-running test suites. Running all the tests in this solution takes over 18 hours, but the long-running ones are only run manually before a release, and many are normally run in parallel.
No, we don't run all the tests, all the time.
A notable extra lesson from these new tests was that page layouts behave differently in an app that is opened and then maximized than in one that opens already maximized. Seriously. It's potentially concerning, but I have more important things to focus on right now.