Tuesday, April 20, 2010

BOM Remover for passthrough receive pipeline

If a text file has a UTF-8 signature (BOM) and we want to remove it without using a disassembler, I used the following code:
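byte[] buffer = new byte[streamLength];

// Read the entire stream into the buffer.
// Note: Read may return fewer bytes than requested; check its return value for non-memory streams.
originalStrm.Read(buffer, 0, Convert.ToInt32(streamLength));

byte[] preamble = Encoding.UTF8.GetPreamble();

// The UTF-8 preamble is the three bytes 239 187 191 (0xEF 0xBB 0xBF).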

if (buffer.Length >= 3 && preamble[0] == buffer[0] && preamble[1] == buffer[1] && preamble[2] == buffer[2])
{
    System.Diagnostics.Trace.WriteLine("File has a UTF-8 signature; removing it.");

    // Copy everything after the 3-byte BOM into a new, shorter buffer.
    byte[] bufferN = new byte[buffer.Length - 3];
    Buffer.BlockCopy(buffer, 3, bufferN, 0, bufferN.Length);
    buffer = bufferN;
}
else
    System.Diagnostics.Trace.WriteLine("File does not have a UTF-8 signature.");
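
For reference, the same idea can be packaged as a small standalone helper. This is only a sketch under a few assumptions: the class name BomRemover and the method StripUtf8Bom are made up for illustration, the whole message is assumed to fit in memory, and Stream.CopyTo requires .NET 4.0 or later. It uses CopyTo instead of a single Read call so that streams which return data in chunks are still read completely.

```csharp
using System;
using System.IO;
using System.Text;

public static class BomRemover
{
    // Hypothetical helper (not part of any pipeline API): returns the bytes
    // of 'input' with a leading UTF-8 BOM removed, if one is present.
    public static byte[] StripUtf8Bom(Stream input)
    {
        using (var ms = new MemoryStream())
        {
            // CopyTo loops internally until the source stream is exhausted.
            input.CopyTo(ms);
            byte[] data = ms.ToArray();

            // For Encoding.UTF8 the preamble is EF BB BF (239 187 191).
            byte[] preamble = Encoding.UTF8.GetPreamble();

            // The data has a BOM only if it is long enough and every
            // preamble byte matches.
            bool hasBom = data.Length >= preamble.Length;
            for (int i = 0; hasBom && i < preamble.Length; i++)
                hasBom = data[i] == preamble[i];

            if (!hasBom)
                return data;

            // Copy everything after the BOM into a new, shorter array.
            byte[] result = new byte[data.Length - preamble.Length];
            Buffer.BlockCopy(data, preamble.Length, result, 0, result.Length);
            return result;
        }
    }
}
```

In a pipeline component the returned bytes would presumably be wrapped in a new MemoryStream and handed back as the message body; that wiring depends on the component and is not shown here.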

2 comments:

Alfredo said...

I never knew that the BOM is part of the file and that we need to remove it for UTF-8 data. After reading this article, I now know how to remove the BOM from passthrough data / files.

PeterChu2940 said...

You have made a mistake in the BlockCopy call: the source is the same as the target.
