ITK Release 4/Performance Experiments/Reducing CTest Output

Latest revision as of 22:19, 4 October 2011

Testing is critical to high-quality software, and ITK developers are expected to produce unit tests for each class. However, some tests produce large amounts of output which may be useful to the developer but place a burden on the CDash database. Furthermore, some hypothesize that test output size may affect the performance of CDash.

This experiment looks at the size of output produced by ITKv4 and looks for ways to reduce a test's output.

Approach

This experiment uses the DMAIC methodology of the Six Sigma management process to "Define", "Measure", "Analyze", "Improve" and "Control" test output in ITKv4. The basic methodology (from Wikipedia) consists of the following five steps:

  • Define process goals that are consistent with customer demands and ITKv4's strategy.
  • Measure key aspects of the current process and collect relevant data.
  • Analyze the data to verify cause-and-effect relationships. Determine what the relationships are, and attempt to ensure that all factors have been considered.
  • Improve or optimize the process.
  • Control to ensure that any deviations from target are corrected before they result in defects. Set up pilot runs to establish software quality, move on to production, set up control mechanisms and continuously monitor the process.

Define

Reduce the total test output of ITKv4 without affecting code coverage or value of the tests.

Measure

As of October 1, 2011, there were 2202 ITKv4 tests producing 10.6 MB of test output for a single platform. Two tests produced 18% of the output, and 65 of the 2202 tests produced 60% of the output. This data was gathered from a CDash file provided by Dave Cole of Kitware.
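
Concentration figures like those above (a handful of tests producing most of the output) can be computed from a list of per-test output sizes. The following is a minimal sketch, assuming the sizes have already been extracted from the CDash data into a vector; the function name `topShare` is hypothetical, and nothing here reflects how the original measurement was actually made.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <numeric>
#include <vector>

// Fraction (0.0 to 1.0) of the total output produced by the `n` largest tests.
// `sizes` holds one output size, in characters, per test (hypothetical input).
double topShare(std::vector<std::size_t> sizes, std::size_t n)
{
  n = std::min(n, sizes.size());
  const auto mid = sizes.begin() + static_cast<std::ptrdiff_t>(n);
  // Place the n largest sizes, in descending order, at the front.
  std::partial_sort(sizes.begin(), mid, sizes.end(), std::greater<std::size_t>());
  const double total = std::accumulate(sizes.begin(), sizes.end(), 0.0);
  if (total == 0.0)
  {
    return 0.0; // no tests, or no output at all
  }
  const double top = std::accumulate(sizes.begin(), mid, 0.0);
  assert(top <= total);
  return top / total;
}
```

With the real data, `topShare(sizes, 2)` and `topShare(sizes, 65)` would correspond to the 18% and 60% figures quoted above.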

Analyze

The top ten test output producers are:

  1. itkSampleToHistogramFilterTest4 reports failures regarding expected frequencies. Even though its checks fail, the test does not report failure. It turns out that the test itself is flawed.
  2. itkSystemInformationTest echoes the output of several CMake files produced during the build process, e.g. CMakeCache.txt.
  3. vnl_test_alignment is a third-party test producing over 400,000 characters of output.
  4. itkNumericTraitsTest provides information about numeric limits and capabilities for a given platform.
  5. itkImageRegionExclusionIteratorWithIndexTest provides useful information to the developer, but not necessarily for the test.
  6. itkSampleToHistogramFilterTest5 provides useful information to the developer, but not necessarily for the test.
  7. itkImageRegistrationMethodTest_13 produces intermediate results that are useless to the test.
  8. itkSliceIteratorTest provides useful information to the developer, but not necessarily for the test.
  9. itkCheckerBoardImageFilterTest produces useless output.
  10. itkTriangleMeshToBinaryImageFilterTest2 produces too much output.

Analysis of these ten tests reveals the following categories of tests:

  • Tests that produce valuable output, even if they pass.
    • itkSystemInformationTest
    • itkNumericTraitsTest
  • Tests that produce reasonable output < 1k characters
    • Over 1100 of the 2202 tests produce < 1k characters
  • Tests that produce reasonable, but not necessarily valuable output > 1k and < 5k characters
    • There are 690 tests that produce between 1k and 5k characters
  • Tests that produce reasonable, but not necessarily valuable output > 5k characters
    • itkImageRegionExclusionIteratorWithIndexTest
    • itkImageRegistrationMethodTest_13
    • itkSliceIteratorTest
    • itkSampleToHistogramFilterTest5
  • Tests that produce erroneous output
    • itkSampleToHistogramFilterTest4
  • Tests that produce useless output
    • itkCheckerBoardImageFilterTest
    • itkTriangleMeshToBinaryImageFilterTest2
  • Tests provided by third-party software that produce unreasonable output
    • vnl_test_alignment
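
The size-based categories above can be expressed as a small classification function. This is only a sketch: it assumes that "1k" means 1000 characters, and it assigns the boundary values (exactly 1000 and exactly 5000 characters) to the larger category, which the list above leaves unspecified. The names `OutputBucket` and `classifyOutputSize` are hypothetical.

```cpp
#include <cstddef>

// Size categories from the analysis (thresholds assume 1k = 1000 characters).
enum class OutputBucket
{
  Under1k,   // < 1k characters
  From1kTo5k, // between 1k and 5k characters
  Over5k     // > 5k characters
};

// Classify a test by the number of characters it writes.
// Assumption: boundary values fall into the larger bucket.
constexpr OutputBucket classifyOutputSize(std::size_t characters)
{
  return characters < 1000 ? OutputBucket::Under1k
       : characters < 5000 ? OutputBucket::From1kTo5k
                           : OutputBucket::Over5k;
}
```

Size alone does not separate valuable output from useless output; that distinction in the list above came from reading the tests.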

Improve

A manual analysis of the tests producing the most output resulted in a number of Gerrit patches, each of which required manual editing of the tests. The discussion on the mailing lists motivated some developers to review their tests, and patches were also submitted to correct test errors. New options were added to the ITK test driver to permit full output (http://review.source.kitware.com/#change,3043) and redirected output (http://review.source.kitware.com/#change,3048).

CMake and CTest provide capabilities that facilitate improvement:

  • Limit the size of test output reported to CDash
The variable CTEST_CUSTOM_MAXIMUM_PASSED_TEST_OUTPUT_SIZE limits the size of the output to the given value. The default is 1000 characters. This variable, if present, is specified in the CMake/CTestCustom.cmake.in file.
  • Override the test output size limit
If a test outputs the string CTEST_FULL_OUTPUT, CTest will override the limit.
The ITKv4 test driver flag --full-output permits a test to override the CTEST_CUSTOM_MAXIMUM_PASSED_TEST_OUTPUT_SIZE limit. This is a convenient way to override the output limits without changing the test.
  • Redirect the output of a test to a file
The ITKv4 test driver flag --redirect-output FILENAME redirects a test's output to a file, usually in ${ITK_BINARY_DIR}/Testing/Temporary.
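
For a test that overrides the limit from within its own code, the only requirement stated above is that the string CTEST_FULL_OUTPUT appears in its output. The following is a minimal sketch of that idea; the helper names `fullOutputMarker` and `buildVerboseLog` are hypothetical and are not part of ITK or CTest, and writing the marker on the first line is a choice made here, not a documented requirement.

```cpp
#include <cassert>
#include <string>

// The marker string that CTest looks for in a test's output.
inline std::string fullOutputMarker()
{
  return "CTEST_FULL_OUTPUT";
}

// Hypothetical helper: builds the log of a verbose test, writing the marker
// first so that it is easy to see that the full output was requested.
std::string buildVerboseLog(int diagnosticLines)
{
  assert(diagnosticLines >= 0);
  std::string log = fullOutputMarker() + "\n";
  for (int i = 0; i < diagnosticLines; ++i)
  {
    log += "diagnostic line " + std::to_string(i) + "\n";
  }
  return log;
}
```

A test would write the result to standard output, for example `std::cout << buildVerboseLog(3);` (with `<iostream>` included). Where the test source should not change, the --full-output driver flag described above serves the same purpose.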

Control

The only automated mechanism to control test output is CTEST_CUSTOM_MAXIMUM_PASSED_TEST_OUTPUT_SIZE (default 1000), specified in CMake/CTestCustom.cmake.in.

Improved documentation for adding a test alerts developers to keep their test output to a minimum.

Gerrit reviewers are encouraged to look at the output of tests and to suggest that the submitter use an appropriate mechanism to limit the test output.