TextEncodingConverter.GetCodePage(ReadOnlySpan<Byte>, Int32) Method

Examines a read-only byte span that represents the contents of a text file to determine whether it starts with a Byte Order Mark (BOM), and returns an appropriate code page. (The fallback value is 65001 for UTF-8.)

Definition

Namespace: FolkerKinzel.Strings
Assembly: FolkerKinzel.Strings (in FolkerKinzel.Strings.dll) Version: 9.4.0+10a7d4d71aa960998e32ac0ac6c4fcbe4164c917
C#
public static int GetCodePage(
	ReadOnlySpan<byte> data,
	out int bomLength
)

Parameters

data  ReadOnlySpan<Byte>
The span to examine. It should be at least 4 bytes long.
bomLength  Int32
When the method returns, it contains the length of the BOM found or zero if no BOM was found. The parameter is passed uninitialized.

Return Value

Int32
An appropriate code page for data, or the code page for UTF-8 (65001) if the code page could not be determined from data.
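
A minimal usage sketch, assuming the package is referenced and the input begins with a UTF-8 BOM. The class name GetCodePageUsage and the method Decode are hypothetical helpers introduced only for illustration; the only library call used is the GetCodePage signature shown above.

```csharp
using System;
using System.Text;
using FolkerKinzel.Strings;

// Hypothetical helper, not part of the library.
public static class GetCodePageUsage
{
    // Decodes a byte buffer with the code page reported by GetCodePage
    // and skips the BOM bytes, if any were found.
    public static string Decode(ReadOnlySpan<byte> data)
    {
        int codePage = TextEncodingConverter.GetCodePage(data, out int bomLength);

        // Assumption: the runtime can supply an Encoding for this code page.
        // On .NET Core and later, GB18030 typically needs the
        // System.Text.Encoding.CodePages provider to be registered,
        // and UTF-7 is disabled by default on .NET 5 and later.
        Encoding encoding = Encoding.GetEncoding(codePage);

        return encoding.GetString(data.Slice(bomLength));
    }
}
```

For example, passing the bytes EF BB BF 48 69 should yield the string "Hi", because the UTF-8 BOM is three bytes long and the remaining two bytes are ASCII.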

Remarks

The method recognizes the byte order marks for the following character sets:
  • UTF-8
  • UTF-16LE
  • UTF-16BE
  • UTF-32LE
  • UTF-32BE
  • UTF-7
  • GB18030
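
To illustrate what recognizing a byte order mark involves, the following sketch compares the start of the data against a table of well-known BOM byte sequences. It is not the library's implementation: the class BomSketch and the method DetectCodePage are hypothetical, and UTF-7 is left out because its mark has several variants.

```csharp
using System;

// Hypothetical illustration, not the library's code.
public static class BomSketch
{
    // (BOM bytes, code page) pairs. Four-byte marks come first so that the
    // UTF-32LE mark (FF FE 00 00) is not mistaken for UTF-16LE (FF FE).
    private static readonly (byte[] Bom, int CodePage)[] Marks =
    {
        (new byte[] { 0xFF, 0xFE, 0x00, 0x00 }, 12000), // UTF-32LE
        (new byte[] { 0x00, 0x00, 0xFE, 0xFF }, 12001), // UTF-32BE
        (new byte[] { 0x84, 0x31, 0x95, 0x33 }, 54936), // GB18030
        (new byte[] { 0xEF, 0xBB, 0xBF }, 65001),       // UTF-8
        (new byte[] { 0xFF, 0xFE }, 1200),              // UTF-16LE
        (new byte[] { 0xFE, 0xFF }, 1201),              // UTF-16BE
    };

    // Returns the code page of the first matching mark, or 65001 (UTF-8)
    // with bomLength = 0 if no mark matches.
    public static int DetectCodePage(ReadOnlySpan<byte> data, out int bomLength)
    {
        foreach ((byte[] bom, int codePage) in Marks)
        {
            if (data.StartsWith(new ReadOnlySpan<byte>(bom)))
            {
                bomLength = bom.Length;
                return codePage;
            }
        }

        bomLength = 0;
        return 65001;
    }
}
```

The ordering of the table matters: checking the two-byte UTF-16LE mark before the four-byte UTF-32LE mark would misreport UTF-32LE data as UTF-16LE.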

The method can also recognize UTF-16LE, UTF-16BE, UTF-32LE and UTF-32BE from the data itself when no Byte Order Mark is present.
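
How the library performs this detection is not stated here. One common heuristic, sketched below purely as an assumption, looks at where zero bytes fall within the first four bytes. It only works if the text starts with characters below U+0100 (for example ASCII). The names BomlessSketch and GuessCodePage are hypothetical.

```csharp
using System;

// Hypothetical illustration of a zero-byte pattern heuristic.
// It is an assumption, not the library's documented algorithm.
public static class BomlessSketch
{
    // Guesses a code page from the first four bytes, or returns
    // 65001 (UTF-8) if no pattern matches or the data is too short.
    public static int GuessCodePage(ReadOnlySpan<byte> data)
    {
        if (data.Length < 4)
        {
            return 65001;
        }

        bool z0 = data[0] == 0;
        bool z1 = data[1] == 0;
        bool z2 = data[2] == 0;
        bool z3 = data[3] == 0;

        if (!z0 && z1 && z2 && z3) return 12000;  // XX 00 00 00: UTF-32LE
        if (z0 && z1 && z2 && !z3) return 12001;  // 00 00 00 XX: UTF-32BE
        if (!z0 && z1 && !z2 && z3) return 1200;  // XX 00 XX 00: UTF-16LE
        if (z0 && !z1 && z2 && !z3) return 1201;  // 00 XX 00 XX: UTF-16BE

        return 65001;
    }
}
```

A heuristic of this kind needs four bytes of input to tell the UTF-16 and UTF-32 patterns apart, and it can misjudge text whose first characters lie outside the assumed range.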

See Also