Skip to content

Endless loop in buildSymbolTable on 32-bit ARM / gcc #37

@mhx

Description

@mhx

While extending the CI test matrix for my project using fsst and adding a 32-bit ARM build with gcc in addition to the clang build, I found that my unit tests were running forever.

After a bit of debugging, I noticed that this loop:

      // add candidate symbols based on counted frequency
      for (u32 pos1=0; pos1<FSST_CODE_BASE+(size_t) st->nSymbols; pos1++) { 
         u32 cnt1 = counters.count1GetNext(pos1); // may advance pos1!!
 

was running forever with pos1 stuck at 97. Inserting some good old print statements into count1GetNext()...

   u32 count1GetNext(u32 &pos1) { // note: we will advance pos1 to the next nonzero counter in register range
      // read 16-bits single symbol counter, split into two 8-bits numbers (count1Low, count1High), while skipping over zeros
      u64 high = fsst_unaligned_load(&count1High[pos1]); // note: this reads 8 subsequent counters [pos1..pos1+7]
      std::cerr << "high = " << high << " pos1 = " << pos1 << std::endl;
      u32 zero = high?(__builtin_ctzl(high)>>3):7UL; // number of zero bytes
      std::cerr << "zero = " << zero << std::endl;
      high = (high >> (zero << 3)) & 255; // advance to nonzero counter
      std::cerr << "high = " << high << std::endl;
      if (((pos1 += zero) >= FSST_CODE_MAX) || !high) // SKIP! advance pos2
         return 0; // all zero

...shows that when high exceeds the range of a 32-bit integer, zero suddenly ends up as -1, basically cancelling out with the pos1++ from the outer loop:

high = 0 pos1 = 89
zero = 7
high = 0
high = 4294967296 pos1 = 97
zero = 4294967295
high = 0
high = 4294967296 pos1 = 97
zero = 4294967295
high = 0

Whether or not there's a compiler bug lurking somewhere in there (somehow the clang build doesn't run into this issue), the obvious problem is the use of __builtin_ctzl for a 64-bit value on a 32-bit architecture where long is just 32 bits wide.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions