[Breaking change]: In .NET 8, StreamReader emits the Unicode replacement character; .NET 7 did not #38262

Open
@joelverhagen

Description

When a StreamReader constructed with the default encoding (UTF-8) encounters a UTF-8 character that is cut off partway through its byte sequence (one particular kind of invalid UTF-8 byte sequence), the handling differs between .NET 7 and .NET 8. I wasn't able to find docs mentioning this change.

Repro code:

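using System.Runtime.InteropServices;
using System.Text;
using System.Text.Json;

// U+00B7 (middle dot) encodes to two UTF-8 bytes: 0xC2 0xB7.
var str = "  \u00B7  ";
var bytes = Encoding.UTF8.GetBytes(str);
Console.WriteLine("Framework: " + RuntimeInformation.FrameworkDescription);

// Read every prefix of the byte array; the 3-byte prefix ends in the middle of U+00B7.
for (var i = 1; i <= bytes.Length; i++)
{
    var range = bytes[0..i];
    var readByStreamReader = new StreamReader(new MemoryStream(range)).ReadToEnd();
    // Serialize as JSON so non-ASCII characters are printed as \uXXXX escapes.
    Console.WriteLine(JsonSerializer.Serialize(readByStreamReader));
}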

Output in .NET 7 (no replacement character emitted):

Framework: .NET 7.0.14
" "
"  "
"  "
"  \u00B7"
"  \u00B7 "
"  \u00B7  "

Output in .NET 8 (replacement character emitted):

Framework: .NET 8.0.0
" "
"  "
"  \uFFFD"
"  \u00B7"
"  \u00B7 "
"  \u00B7  "

Version

.NET 8 GA

Previous behavior

In .NET 7, StreamReader emitted nothing for the incomplete byte sequence. I noticed the change on .NET 8 GA. I did not test .NET 8 previews.

New behavior

StreamReader now emits a \uFFFD character (the Unicode replacement character) for the incomplete byte sequence. Previously nothing was emitted.
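
For anyone who needs to detect this case in their own code, here is a minimal sketch. It does not go through StreamReader at all: it decodes the same truncated bytes from the repro with Encoding.GetString, once with the lenient Encoding.UTF8 (which substitutes \uFFFD) and once with a UTF8Encoding constructed with throwOnInvalidBytes: true (which throws instead). The variable names are made up for illustration. Passing the strict encoding to the StreamReader constructor may be worth exploring as well, but whether it throws for a truncated trailing sequence on each runtime version is not verified here.

```csharp
using System;
using System.Text;

// The first three bytes of "  \u00B7  " from the repro: two spaces, then 0xC2,
// the lead byte of U+00B7 with its continuation byte (0xB7) missing.
byte[] truncated = { 0x20, 0x20, 0xC2 };

// Lenient decoding: Encoding.UTF8 substitutes U+FFFD for the incomplete sequence.
string lenient = Encoding.UTF8.GetString(truncated);
bool hasReplacement = lenient.Contains('\uFFFD');
Console.WriteLine($"Contains U+FFFD: {hasReplacement}");

// Strict decoding: with throwOnInvalidBytes: true the decoder raises
// DecoderFallbackException rather than substituting a character.
var strict = new UTF8Encoding(encoderShouldEmitUTF8Identifier: false, throwOnInvalidBytes: true);
bool threw = false;
try
{
    strict.GetString(truncated);
}
catch (DecoderFallbackException)
{
    threw = true;
}
Console.WriteLine($"Strict decoder threw: {threw}");
```

Checking for \uFFFD after the fact cannot distinguish a substituted character from a genuine U+FFFD in the input, so the strict encoding is the more reliable option when invalid input must be rejected.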

Type of breaking change

  • Binary incompatible: Existing binaries may encounter a breaking change in behavior, such as failure to load or execute, and if so, require recompilation.
  • Source incompatible: When recompiled using the new SDK or component or to target the new runtime, existing source code may require source changes to compile successfully.
  • Behavioral change: Existing binaries may behave differently at run time.

Reason for change

I think the product team can provide details.

Recommended action

Document the change.

Feature area

Globalization

Affected APIs

System.IO.StreamReader
