Skip to content

[clang] Clang unit test refactoring appears to be creating/exposing testing gaps and/or test counting madness and/or test invocation confusion #140799

Open
@tahonermann

Description

@tahonermann

Grab a fresh cup of coffee, this might take a while. You might want to set aside a cup of something stronger for when you reach the end.

This all started with me investigating a downstream performance issue concerning the ASTMatchersTests unit test taking 30 minutes to run on Windows (serially, with -j1) compared to 45 seconds on Linux (serially). That is why this report focuses on the ASTMatchers tests. I assume the concerns presented here are applicable to other tests that have been folded into the new AllClangUnitTests executable, but I haven't verified that. For the record, the performance issue hasn't been found, but upstream appears to be unaffected.

Recent changes made via commit 5ffd9bd (which was merged April 1st, reverted later that day via commit 03a791f, then reapplied on April 2nd via commit e3c0565) and commit db2315a (which was merged April 29th) have merged most Clang unit tests into a single AllClangUnitTests executable with the intent to reduce the number of executables produced by the build. Ok, good, mostly, though I lost a lot of time debugging a stale ASTMatchersTests executable left in my work space before I discovered these changes. Related PRs include #133545 and #134196.

I'm now finding it challenging to convince myself that all of the tests that were previously being run via the ASTMatchersTests executable are still being run via the AllClangUnitTests executable.

My Linux and Windows build environments aren't configured the same. This is at least partially why the test counts shown below don't correlate well between them. So, ignore those; the interesting data concerns the test count comparisons for the same platform. For reference, my builds are configured as follows:

  • Linux:
    • Built with GCC 11.2.1.
    • Built with the following commands:
      • cmake -G "Unix Makefiles" -DCMAKE_BUILD_TYPE=Release -DLLVM_ENABLE_ASSERTIONS=On -DLLVM_ENABLE_PROJECTS=clang -DLLVM_ENABLE_RUNTIMES="compiler-rt;libunwind;libcxx;libcxxabi;openmp" -DLLVM_BUILD_TESTS=ON -DLLVM_LIT_ARGS='-sv -o lit-results.json'
      • make -j 80 && make install
      • make -j 80 && make check-clang
  • Windows:
    • Built within a MSVS 2022 x64 developer command prompt.
    • Built with the following commands:
      • cmake -G "Visual Studio 16 2019" -A x64 -Thost=x64 -DLLVM_ENABLE_ASSERTIONS=On -DLLVM_ENABLE_PROJECTS=clang -DLLVM_BUILD_TESTS=ON -DLLVM_LIT_ARGS='-sv -o lit-results.json'
      • cmake --build . --config Release --target install
      • cmake --build . --config Release --target check-clang

Running the Clang unit tests via a command like the following results in lit reporting that all tests passed. In each case, the current working directory is the build directory.

  • Linux:
    $ ./bin/llvm-lit -vv -j 1 tools/clang/test/Unit
    -- Testing: 372 tests, 96 workers --
    PASS: Clang-Unit :: ./AllClangUnitTests/36/96 (1 of 372)
    ...
    PASS: Clang-Unit :: ./AllClangUnitTests/94/96 (97 of 372)
    PASS: Clang-Unit :: Format/./FormatTests/75/151 (98 of 372)
    ...
    PASS: Clang-Unit :: Basic/./BasicTests/22/95 (372 of 372)
    
    Testing Time: 94.48s
        
    Total Discovered Tests: 25808
      Skipped:     4 (0.02%)
      Passed : 25804 (99.98%)
    
  • Windows:
    $ python ./Release/bin/llvm-lit.py -vv -j 1 tools/clang/test/Unit
    -- Testing: 156 tests, 24 workers --
    PASS: Clang-Unit :: Release/AllClangUnitTests.exe/45/48 (1 of 156)
    ...
    PASS: Clang-Unit :: Release/AllClangUnitTests.exe/24/48 (48 of 156)
    PASS: Clang-Unit :: Format/Release/FormatTests.exe/21/38 (49 of 156)
    ...
    PASS: Clang-Unit :: Basic/Release/BasicTests.exe/7/43 (156 of 156)
    
    Testing Time: 215.54s
    
    Total Discovered Tests: 25706
      Skipped:     4 (0.02%)
      Passed : 25702 (99.98%)
    

Personally, I find it rather confusing that lit says it is running 372 (156) tests and then reports that 25804 (25702) passed. I acknowledge the use of "Discovered" as a presumed disambiguator that these tests are not those tests, but it still seems odd to me. But I digress.

None of the lit output indicates which of those "PASS" lines, if any, corresponds to the tests formerly run by ASTMatchersTests. Fortunately, my build configuration passes -sv -o lit-results.json to lit and I can search.

  • Linux:
    $ grep ASTMatchersTests ./tools/clang/test/lit-results.json | wc -l
    12558
    
  • Windows:
    $ grep ASTMatchersTests ./tools/clang/test/lit-results.json | wc -l
    12558
    

Ok, so lots of different numbers above, but the numbers obtained from the search for "ASTMatchersTests" from the lit log match! They are also pretty close to numbers I had noted when running the stale ASTMatchersTests executables (12914 discovered tests in 140 tests for Linux, 12911 discovered tests in 65 tests for Windows; these numbers are from a downstream build with other changes and might include additional tests).

Issue 1: Lit says all is good but AllClangUnitTests aborts with a failed assertion.

Per the detail above, lit claims that all tests passed. However, if I run AllClangUnitTests by itself, the results are ... different; the program aborts with a failed assertion (on both Linux and Windows).

  • Linux:
    $ ./tools/clang/unittests/AllClangUnitTests
    [==========] Running 24477 tests from 539 test suites.
    ...
    [----------] 5 tests from ASTMatchersTest
    [ RUN      ] ASTMatchersTest.NamesMember_CXXDependentScopeMemberExpr
    [       OK ] ASTMatchersTest.NamesMember_CXXDependentScopeMemberExpr (10 ms)
    [ RUN      ] ASTMatchersTest.ArgumentCountIs_CXXUnresolvedConstructExpr
    [       OK ] ASTMatchersTest.ArgumentCountIs_CXXUnresolvedConstructExpr (2 ms)
    [ RUN      ] ASTMatchersTest.HasArgument_CXXUnresolvedConstructExpr
    [       OK ] ASTMatchersTest.HasArgument_CXXUnresolvedConstructExpr (2 ms)
    [ RUN      ] ASTMatchersTest.DecompositionDecl
    [       OK ] ASTMatchersTest.DecompositionDecl (6 ms)
    [ RUN      ] ASTMatchersTest.Finder_DynamicOnlyAcceptsSomeMatchers
    [       OK ] ASTMatchersTest.Finder_DynamicOnlyAcceptsSomeMatchers (0 ms)
    [----------] 5 tests from ASTMatchersTest (21 ms total)
    ...
    [ RUN      ] CodeGenTest.CodeGenFromIRMemBuffer
    warning: overriding the module target triple with i386-unknown-linux-gnu
    AllClangUnitTests: /.../llvm/lib/CodeGen/CodeGenTargetMachineImpl.cpp:48: void llvm::CodeGenTargetMachineImpl::initAsmInfo(): Assertion `MRI && "Unable to create reg info"' failed.
    
  • Windows:
    $ ./tools/clang/unittests/Release/AllClangUnitTests.exe
    [==========] Running 24393 tests from 520 test suites.
    ...
    [----------] 5 tests from ASTMatchersTest
    [ RUN      ] ASTMatchersTest.NamesMember_CXXDependentScopeMemberExpr
    [       OK ] ASTMatchersTest.NamesMember_CXXDependentScopeMemberExpr (22 ms)
    [ RUN      ] ASTMatchersTest.ArgumentCountIs_CXXUnresolvedConstructExpr
    [       OK ] ASTMatchersTest.ArgumentCountIs_CXXUnresolvedConstructExpr (4 ms)
    [ RUN      ] ASTMatchersTest.HasArgument_CXXUnresolvedConstructExpr
    [       OK ] ASTMatchersTest.HasArgument_CXXUnresolvedConstructExpr (5 ms)
    [ RUN      ] ASTMatchersTest.DecompositionDecl
    [       OK ] ASTMatchersTest.DecompositionDecl (13 ms)
    [ RUN      ] ASTMatchersTest.Finder_DynamicOnlyAcceptsSomeMatchers
    [       OK ] ASTMatchersTest.Finder_DynamicOnlyAcceptsSomeMatchers (0 ms)
    [----------] 5 tests from ASTMatchersTest (46 ms total)
    ...
    [ RUN      ] CodeGenTest.CodeGenFromIRMemBuffer
    warning: overriding the module target triple with i386-unknown-linux-gnu
    Assertion failed: MRI && "Unable to create reg info", file D:\iusers\thonerma\llvm-project\llvm\lib\CodeGen\CodeGenTargetMachineImpl.cpp, line 48
    Exception Code: 0x80000003
    

There appear to be at least three issues here:

  1. The assertion failure.
  2. That the assertion failure doesn't cause lit to report a failure.
  3. The output suggests that only 5 tests from ASTMatchersTests were run. Is that because the program aborted before completing? Or is this yet another way in which "tests" are differently counted?

Issue 2: How do I run just the ASTMatchersTests now?

If there is a way to use lit to run just the ASTMatchersTests portions of the Clang-Unit set of tests, I'd love to know how to do that. Previously, a command like the following sufficed to run all the relevant tests via lit:

$ python bin/llvm-lit.py -vv tools/clang/test/Unit/ASTMatchers 

I am aware that the Google Test framework results in an executable that supports lot of options for controlling how the tests are run. For example, I can do this:

$ ./tools/clang/unittests/AllClangUnitTests --gtest_filter='ASTMatchersTests*'
...
[==========] 12558 tests from 1 test suite ran. (30913 ms total)
[  PASSED  ] 12558 tests.

Hey, look! That number matches what I got from the lit log! Yay!

But, as seen above, running tests via lit and via the standalone unit test executable doesn't seem to produce the same behavior. It is therefore important to be able to do both.

Issue 3: How it the typical developer expected to investigate a failed lit test?

Let's say that lit did fail one of the tests (as it seems it should have above). The lit output is not as helpful as it could be:

********************
FAIL: Clang-Unit :: ./AllClangUnitTests/16/96 (1288 of 21587)
******************** TEST 'Clang-Unit :: ./AllClangUnitTests/16/96' FAILED ********************
...
Failed Tests (26):
  Clang-Unit :: ./AllClangUnitTests/ASTMatchersTests/ASTMatchersTest/IsExpandedFromMacro_MatchesObjectMacro/C11
  ...

The astute developer will eventually find that the key they need to actually investigate/debug the problem is hidden in the output

Script:
--
/.../tools/clang/unittests/./AllClangUnitTests --gtest_filter=ASTMatchersTests/ASTMatchersTest.IsExpandedFromMacro_MatchesObjectMacro/CXX11
--

Room for improvement:

  • Better terminology to distinguish testsuites, tests, grouped tests, discovered/unit tests, etc...
  • What do those "16/96" numbers above mean? Are they useful? Can they be replaced with something more useful?
  • Better guidance to help a developer:
    • Correlate a failed test message

      FAIL: Clang-Unit :: ./AllClangUnitTests/16/96 (1288 of 21587)

      to the failed test:

      Clang-Unit :: ./AllClangUnitTests/ASTMatchersTests/ASTMatchersTest/IsExpandedFromMacro_MatchesObjectMacro/C11

      to the command needed to run just that test:

      /.../tools/clang/unittests/./AllClangUnitTests --gtest_filter=ASTMatchersTests/ASTMatchersTest.IsExpandedFromMacro_MatchesObjectMacro/CXX11

      to the command needed to run just that test via lit:

      <I wish I knew>

Summary

I know the above is discussing issues that predate the recent refactoring, but I think these issues have been made worse by it. I spent a lot of time trying to wrap my head around what all was going on in my environments over the last week.

Metadata

Metadata

Assignees

No one assigned

    Labels

    clangClang issues not falling into any other categorytest-suite

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions