@dwbutler

This fixes non-breaking spaces being discarded, which caused adjacent words to be joined together. In Ruby, `\s` and `\S` recognize only ASCII whitespace (space, tab, newline, carriage return, form feed, vertical tab), so they do not treat the non-breaking space (character 160) as whitespace. HTML often uses non-breaking spaces. A more general approach is to use `[[:space:]]` instead. According to the `Regexp` documentation, `[[:space:]]` matches "Whitespace character ([:blank:], newline, carriage return, etc.)"

Fixes #68
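
For anyone who wants to check the behaviour described here, the snippet below is a minimal sketch of the difference between the two patterns. It is not code from the gem: the `NBSP` constant, the sample string, and the variable names are made up for illustration, and it assumes a UTF-8 string on Ruby 2.4 or later (for `String#match?`).

```ruby
# A non-breaking space (U+00A0, decimal 160), as produced by &nbsp; in HTML.
# NBSP and the sample text are illustrative, not taken from truncate_html.
NBSP = "\u00A0"
text = "hello#{NBSP}world"

# \s is ASCII-only in Ruby, so it does not see the non-breaking space.
ascii_match = text.match?(/\s/)

# The POSIX bracket expression is Unicode-aware for UTF-8 strings.
posix_match = text.match?(/[[:space:]]/)

# Splitting on each pattern shows the practical effect on word boundaries.
words_with_s     = text.split(/\s+/)
words_with_posix = text.split(/[[:space:]]+/)

puts "\\s matches NBSP:          #{ascii_match}"
puts "[[:space:]] matches NBSP: #{posix_match}"
puts "split on \\s+:             #{words_with_s.length} token(s)"
puts "split on [[:space:]]+:    #{words_with_posix.length} token(s)"
```

With `\s+` the string stays a single token, which is how two words separated only by a non-breaking space end up treated as one; with `[[:space:]]+` they split into two words.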

dwbutler added 2 commits May 25, 2016 15:31
sonjapeterson added a commit to BookBub/truncate_html that referenced this pull request Nov 30, 2016
This is based on a currently unmerged PR to the main repo:
hgmnz#69

It fixes an issue where truncate_html would remove non-breaking spaces,
resulting in words being joined together incorrectly. Now they are
treated like other whitespace.
@elspeth-rabid

Bump! I need this. It is such a useful gem, but I am having issues with `&nbsp;` characters being stripped.

Development

Successfully merging this pull request may close these issues.

Spaces disappearing