Write text files without Byte Order Mark (BOM)?
I am trying to create a text file with UTF-8 encoding and no BOM using VB.Net. Can anybody help me with how to do this?
I can write a file with UTF-8 encoding, but how do I remove the Byte Order Mark from it?
edit1: I have tried code like this:
Dim utf8 As New UTF8Encoding()
Dim utf8EmitBOM As New UTF8Encoding(True)
Dim strW As New StreamWriter("c:\temp\bom\1.html", True, utf8EmitBOM)
strW.Write(utf8EmitBOM.GetPreamble())
strW.WriteLine("hi there")
strW.Close()
Dim strw2 As New StreamWriter("c:\temp\bom\2.html", True, utf8)
strw2.Write(utf8.GetPreamble())
strw2.WriteLine("hi there")
strw2.Close()
1.html gets created with UTF-8 encoding only, and 2.html gets written in ANSI encoding format.
Simplified approach: http://whatilearnttuday.blogspot.com/2011/10/write-text-files-without-byte-order.html
In order to omit the byte order mark (BOM), your stream must use an instance of UTF8Encoding other than System.Text.Encoding.UTF8 (which is configured to generate a BOM). There are two easy ways to do this:
1. Explicitly specify a suitable encoding:
Call the UTF8Encoding constructor with False for the encoderShouldEmitUTF8Identifier parameter.
Pass the UTF8Encoding instance to the stream constructor.
' VB.NET:
Dim utf8WithoutBom As New System.Text.UTF8Encoding(False)
Using sink As New StreamWriter("Foobar.txt", False, utf8WithoutBom)
    sink.WriteLine("...")
End Using
// C#:
var utf8WithoutBom = new System.Text.UTF8Encoding(false);
using (var sink = new StreamWriter("Foobar.txt", false, utf8WithoutBom))
{
    sink.WriteLine("...");
}
2. Use the default encoding:
If you do not supply an Encoding to the StreamWriter constructor at all, StreamWriter will by default use UTF-8 encoding without a BOM, so the following should work just as well:
' VB.NET:
Using sink As New StreamWriter("Foobar.txt")
    sink.WriteLine("...")
End Using
// C#:
using (var sink = new StreamWriter("Foobar.txt"))
{
    sink.WriteLine("...");
}
Finally, note that omitting the BOM is only permissible for UTF-8, not for UTF-16.
Try this:
Encoding outputEnc = new UTF8Encoding(false); // create encoding with no BOM
TextWriter file = new StreamWriter(filePath, false, outputEnc); // open file with encoding
// write data here
file.Close(); // save and close it
Just use the WriteAllText method from System.IO.File.
Please check the sample in the File.WriteAllText documentation, which states:
This method uses UTF-8 encoding without a Byte-Order Mark (BOM), so using the GetPreamble method will return an empty byte array. If it is necessary to include a UTF-8 identifier, such as a byte order mark, at the beginning of a file, use the WriteAllText(String, String, Encoding) method overload with UTF8 encoding.
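As a rough illustration of the two overloads mentioned in that quote, the C# sketch below writes the same text with and without an explicit encoding and prints the resulting file sizes. The class name `WriteAllTextDemo`, the helper `WriteAndMeasure`, and the temp-file paths are made up for this example and are not part of the original answer.

```csharp
using System;
using System.IO;
using System.Text;

public static class WriteAllTextDemo
{
    // Hypothetical helper: writes text to a scratch file and returns its size in bytes.
    // Passing null uses the two-argument overload (UTF-8 without a BOM).
    public static long WriteAndMeasure(string text, Encoding encoding)
    {
        string path = Path.GetTempFileName();
        try
        {
            if (encoding == null)
            {
                File.WriteAllText(path, text);
            }
            else
            {
                File.WriteAllText(path, text, encoding);
            }
            return new FileInfo(path).Length;
        }
        finally
        {
            File.Delete(path);
        }
    }

    public static void Main()
    {
        // Two ASCII characters, no BOM: 2 bytes.
        Console.WriteLine(WriteAndMeasure("hi", null));
        // Encoding.UTF8 emits the 3-byte BOM first: 5 bytes.
        Console.WriteLine(WriteAndMeasure("hi", Encoding.UTF8));
    }
}
```

The three-byte difference between the two sizes is the BOM itself (EF BB BF).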
Interesting note with respect to this: strangely, the static "CreateText()" method of the System.IO.File class creates UTF-8 files without BOM.
In general this is a source of bugs, but in your case it could have been the simplest workaround :)
If you do not specify an Encoding when creating a new StreamWriter, the default Encoding object used is UTF-8 No BOM, which is created via new UTF8Encoding(false, true).
So to create a text file without the BOM, use one of the constructors that do not require you to provide an encoding:
new StreamWriter(Stream)
new StreamWriter(String)
new StreamWriter(String, Boolean)
I think Roman Nikitin is right. The meaning of the constructor argument is flipped. False means no BOM and true means with BOM.
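To make the meaning of that constructor argument concrete, here is a small C# sketch that prints the preamble (BOM) length for a few UTF8Encoding instances. The class name `PreambleDemo` and the helper `BomLength` are made up for this example.

```csharp
using System;
using System.Text;

public static class PreambleDemo
{
    // Number of BOM bytes the encoding would emit at the start of a stream.
    public static int BomLength(Encoding encoding)
    {
        return encoding.GetPreamble().Length;
    }

    public static void Main()
    {
        Console.WriteLine(BomLength(new UTF8Encoding(false))); // 0: no BOM
        Console.WriteLine(BomLength(new UTF8Encoding(true)));  // 3: EF BB BF
        Console.WriteLine(BomLength(new UTF8Encoding()));      // 0: parameterless constructor emits no BOM
        Console.WriteLine(BomLength(Encoding.UTF8));           // 3: the shared instance emits a BOM
    }
}
```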
You get an ANSI encoding because a file without a BOM that does not contain non-ANSI characters is exactly the same as an ANSI file. Try some special characters in your "hi there" string and you'll see the ANSI encoding change to without-BOM.
XML Encoding UTF-8 without BOM
We need to submit XML data to the EPA and their application that takes our input requires UTF-8 without BOM. Oh yes, plain UTF-8 should be acceptable for everyone, but not for the EPA. The answer to doing this is in the above comments. Thank you Roman Nikitin.
Here is a C# snippet of the code for the XML encoding:
Encoding utf8noBOM = new UTF8Encoding(false);
XmlWriterSettings settings = new XmlWriterSettings();
settings.Encoding = utf8noBOM;
…
using (XmlWriter xw = XmlWriter.Create(filePath, settings))
{
    xDoc.WriteTo(xw);
    xw.Flush();
}
Checking whether this actually removes the three leading bytes from the output file can be misleading. For example, if you use Notepad++ (www.notepad-plus-plus.org), it will report “Encode in ANSI”. I guess most text editors rely on the BOM to tell whether a file is UTF-8. The way to see this clearly is with a binary tool like WinHex (www.winhex.com). Since I was looking for a before-and-after difference, I used the Microsoft WinDiff application.
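If a hex editor is not at hand, the same check can be approximated in code by reading the first three bytes of the file and comparing them with EF BB BF. This is only a minimal sketch; the class name `BomCheck`, the method `StartsWithUtf8Bom`, and the scratch files are assumptions made for illustration.

```csharp
using System;
using System.IO;
using System.Text;

public static class BomCheck
{
    // Returns true if the file begins with the UTF-8 BOM bytes EF BB BF.
    public static bool StartsWithUtf8Bom(string path)
    {
        byte[] head = new byte[3];
        int total = 0;
        using (FileStream fs = File.OpenRead(path))
        {
            // Read may return fewer bytes than requested, so loop until 3 bytes or end of file.
            while (total < head.Length)
            {
                int n = fs.Read(head, total, head.Length - total);
                if (n == 0) break;
                total += n;
            }
        }
        return total == 3 && head[0] == 0xEF && head[1] == 0xBB && head[2] == 0xBF;
    }

    public static void Main()
    {
        // Hypothetical scratch files for a before-and-after comparison.
        string withBom = Path.GetTempFileName();
        string withoutBom = Path.GetTempFileName();
        File.WriteAllText(withBom, "hi there", new UTF8Encoding(true));
        File.WriteAllText(withoutBom, "hi there", new UTF8Encoding(false));

        Console.WriteLine(StartsWithUtf8Bom(withBom));    // True
        Console.WriteLine(StartsWithUtf8Bom(withoutBom)); // False

        File.Delete(withBom);
        File.Delete(withoutBom);
    }
}
```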
It might be that your input text contains a byte order mark. In that case, you should remove it before writing.
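One way this can happen is when the text was decoded from raw bytes, since `Encoding.UTF8.GetString` keeps a leading BOM as the character U+FEFF. The C# sketch below shows one possible way to strip it before writing; the class name `BomStripDemo` and the helper `StripLeadingBom` are hypothetical names, not something from the original answer.

```csharp
using System;
using System.IO;
using System.Text;

public static class BomStripDemo
{
    // Removes a single leading U+FEFF character, if present.
    public static string StripLeadingBom(string text)
    {
        if (!string.IsNullOrEmpty(text) && text[0] == '\uFEFF')
        {
            return text.Substring(1);
        }
        return text;
    }

    public static void Main()
    {
        // Raw UTF-8 bytes with a BOM in front of "hi".
        byte[] raw = { 0xEF, 0xBB, 0xBF, (byte)'h', (byte)'i' };
        string decoded = Encoding.UTF8.GetString(raw);
        string cleaned = StripLeadingBom(decoded);

        Console.WriteLine(decoded.Length); // 3: U+FEFF, 'h', 'i'
        Console.WriteLine(cleaned.Length); // 2

        // Writing the cleaned text with a BOM-less encoding yields exactly 2 bytes.
        string path = Path.GetTempFileName();
        File.WriteAllText(path, cleaned, new UTF8Encoding(false));
        Console.WriteLine(new FileInfo(path).Length); // 2
        File.Delete(path);
    }
}
```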
Dim sWriter As IO.StreamWriter = New IO.StreamWriter(shareworklist & "\" & getfilename() & ".txt", False, Encoding.Default)
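Gives you the results you want (I think). Note that on .NET Framework, Encoding.Default is the system's ANSI code page rather than UTF-8, so the output only matches BOM-less UTF-8 for plain ASCII text.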
Reference URL: https://stackoverflow.com/questions/2437666/write-text-files-without-byte-order-mark-bom