8 changes: 3 additions & 5 deletions packages/cli/src/ui/utils/textUtils.ts
@@ -82,13 +82,14 @@ export function cpSlice(str: string, start: number, end?: number): string {
* Characters stripped:
* - ANSI escape sequences (via strip-ansi)
* - VT control sequences (via Node.js util.stripVTControlCharacters)
- * - C0 control chars (0x00-0x1F) except CR/LF which are handled elsewhere
+ * - C0 control chars (0x00-0x1F) except CR/LF/TAB which are handled elsewhere
* - C1 control chars (0x80-0x9F) that can cause display issues
*
* Characters preserved:
* - All printable Unicode including emojis
* - DEL (0x7F) - handled functionally by applyOperations, not a display issue
* - CR/LF (0x0D/0x0A) - needed for line breaks
+ * - TAB (0x09) - needed for structured text
*/
export function stripUnsafeCharacters(str: string): string {
const strippedAnsi = stripAnsi(str);
@@ -99,11 +100,8 @@ export function stripUnsafeCharacters(str: string): string {
const code = char.codePointAt(0);
if (code === undefined) return false;

- // Preserve CR/LF for line handling
- if (code === 0x0a || code === 0x0d) return true;
+ if (code === 0x0a || code === 0x0d || code === 0x09) return true;
Contributor
high

While this change is functionally correct, it removes comments that explained how C0 control characters are handled. That makes it harder for future developers to understand why some characters are preserved and others are stripped. Please restore updated versions of these comments. For instance, a comment explaining why CR, LF, and TAB are preserved would help here. The comment explaining the removal of the other C0 characters, which previously sat before the next if statement, was also valuable and should be restored.

Suggested change
- if (code === 0x0a || code === 0x0d || code === 0x09) return true;
+ // Preserve CR, LF, and TAB for line breaks and structured text.
+ if (code === 0x0a || code === 0x0d || code === 0x09) return true;

Collaborator
+1, @naaa760 would you be able to add this comment in? Also, it looks like the comment on the following lines about bell characters etc. was removed; could you add that back in as well?


- // Remove C0 control chars (except CR/LF) that can break display
- // Examples: BELL(0x07) makes noise, BS(0x08) moves cursor, VT(0x0B), FF(0x0C)
if (code >= 0x00 && code <= 0x1f) return false;

// Remove C1 control chars (0x80-0x9f) - legacy 8-bit control codes
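
Both review comments ask that the filter keep its explanatory comments now that TAB is preserved. Below is a minimal sketch of how the code-point filter might read with updated comments restored. Several things here are assumptions rather than the PR's actual code: `filterControlChars` is a hypothetical name, the ANSI and VT stripping steps are left out so the block stays self-contained, `Array.from` stands in for whatever code-point helper the real function uses, and the C1 range check is inferred from the doc comment because the diff is cut off before that line.

```typescript
// Hypothetical helper: reproduces only the per-code-point filter from the diff.
// The strip-ansi and util.stripVTControlCharacters passes are intentionally omitted.
function filterControlChars(str: string): string {
  // Array.from iterates by code point, so surrogate pairs (e.g. emoji) stay intact.
  return Array.from(str)
    .filter((char) => {
      const code = char.codePointAt(0);
      if (code === undefined) return false;

      // Preserve CR, LF, and TAB for line breaks and structured text.
      if (code === 0x0a || code === 0x0d || code === 0x09) return true;

      // Remove the remaining C0 control chars that can break display.
      // Examples: BELL (0x07) makes noise, BS (0x08) moves the cursor, VT (0x0B), FF (0x0C).
      if (code >= 0x00 && code <= 0x1f) return false;

      // Remove C1 control chars (0x80-0x9F), legacy 8-bit control codes.
      // Assumed condition: the diff ends before the actual check.
      if (code >= 0x80 && code <= 0x9f) return false;

      // Everything else is kept, including DEL (0x7F) and printable Unicode.
      return true;
    })
    .join('');
}
```

With this sketch, `filterControlChars('a\tb\x07c')` returns `'a\tbc'`: the tab survives and the bell character is dropped.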