Change Encoding.Default to UTF-8 *without BOM* for Unix interoperability · Issue #7779 · dotnet/runtime · GitHub
Change Encoding.Default to UTF-8 *without BOM* for Unix interoperability #7779
@mklement0

Description

This is a follow-up from here.

  • Encoding.Default in the full framework on Windows reports the legacy "ANSI" code page (encoding).

  • .NET Core doesn't support these legacy code pages by default, so its Encoding.Default must return an encoding that makes sense in a cross-platform world.

  • Currently, Encoding.UTF8 is returned, which is a UTF-8 encoding with BOM.

    • The reason for returning Encoding.UTF8 as Encoding.Default appears to be historical: the source-code comment mentions Silverlight(!).
  • UTF-8 with BOM is problematic on all Unix platforms, whose utilities do not expect a BOM and instead treat it as data, leading to unexpected results.

  • Having Encoding.Default return a UTF-8 encoding with BOM is pointless and confusing, because it represents neither any platform's nor the framework's own true default, and it may cause disruption in practice.

    • Note that the true default of both .NET Framework and .NET Core is BOM-less UTF-8, because that's what the types in the System.IO namespace use when no encoding is specified.
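
To make the "BOM treated as data" point concrete, here is a minimal sketch at the byte level. It uses Python's standard codecs purely as a stand-in for illustration, not the .NET API; the sample text and the `looks_like_script` helper are hypothetical and only mimic the kind of byte-oriented check a Unix tool might perform.

```python
import codecs

# Hypothetical sample payload: a tiny shell script.
text = "#!/bin/sh\necho hello\n"

with_bom = text.encode("utf-8-sig")  # UTF-8 with a leading BOM
without_bom = text.encode("utf-8")   # BOM-less UTF-8

# The BOM is the three bytes EF BB BF placed in front of the payload;
# the rest of the bytes are identical.
assert with_bom[:3] == codecs.BOM_UTF8
assert with_bom[3:] == without_bom


def looks_like_script(data: bytes) -> bool:
    """Hypothetical byte-oriented check: the data must start with '#!'."""
    return data.startswith(b"#!")


print(looks_like_script(without_bom))  # True
print(looks_like_script(with_bom))     # False: the BOM precedes '#!'
```

A consumer that does not strip the BOM sees three extra bytes ahead of the real content, which is why a BOM-emitting default is troublesome for Unix interoperability.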

Metadata

Labels

area-System.Globalization, enhancement (Product code improvement that does NOT require public API changes/additions)
