Encoding.UTF8.GetChars Exception

leforban · November 19, 2015, 10:23am

Hi Everyone,

I am receiving ASCII characters as byte arrays over the CAN bus. When trying to convert a byte array into a string, I get an exception while converting the following byte: E9. This corresponds to “é”.

Does anyone know how to properly handle such a case?

Reinhard_Ostermeier · November 19, 2015, 10:42am

As @ andre.m stated, ASCII encoding is not supported by NETMF.
String has a constructor that takes an array of chars (char[]), but a char is Unicode as well.

So the quickest solution would be to cast each byte to a char:


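// myByteArray is the byte[] received from the bus
var sb = new StringBuilder(myByteArray.Length);
foreach (var b in myByteArray)
{
  // The cast maps byte values 0x00-0xFF to U+0000-U+00FF (Latin-1), so 0xE9 becomes 'é'
  sb.Append((char)b);
}
var myStr = sb.ToString();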

Performance is another matter, but it would work for sure.
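
If the StringBuilder overhead matters, the same cast can be done into a preallocated char[] that is handed to the String constructor in one go. This is only a sketch, not a benchmarked solution: the `Latin1` class and `GetString` method names are made up for illustration, and it assumes the incoming bytes are ISO-8859-1 (Latin-1), where each byte value equals its Unicode code point.

```csharp
using System;

// Hypothetical helper (name chosen for illustration only).
public static class Latin1
{
    // Decodes bytes as ISO-8859-1: byte value N becomes the char U+00NN.
    public static string GetString(byte[] bytes)
    {
        var chars = new char[bytes.Length];
        for (int i = 0; i < bytes.Length; i++)
        {
            chars[i] = (char)bytes[i];
        }
        // One allocation for the final string, no intermediate appends.
        return new string(chars);
    }
}
```

Usage would be something like `var text = Latin1.GetString(myByteArray);`. Note that this does not match Windows-1252 for bytes 0x80-0x9F, which would need a small lookup table.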

leforban · November 19, 2015, 11:13am

Supporting extended ASCII is a must-have, especially when you have to deal with devices that do not speak English…

As a quick fix, I am using Reinhard's solution. However, I am also concerned about runtime efficiency, and I am interested in developing and benchmarking a solution that inherits from the Encoding class. I will try to code one soon.

Thanks guys for your help!

Bec_a_Fuel · November 19, 2015, 12:00pm

Use this:

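Boolean _completed;
Int32 _bytesUsed, _charsUsed;
// YOURBYTEARRAY is the byte[] you received
var param = YOURBYTEARRAY;
// UTF-8 never yields more chars than input bytes, so this buffer is large enough
var _chars = new Char[param.Length];
Encoding.UTF8.GetDecoder().Convert(param, 0, param.Length, _chars, 0, param.Length, false, out _bytesUsed, out _charsUsed, out _completed);
// Build the string from the chars actually produced and strip trailing line breaks
var YOURRESULTINGSTRING = new String(_chars, 0, _charsUsed).Trim('\r', '\n');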

You will not get the exception anymore. But extended chars greater than 127 will not be in the resulting string, of course.

leforban · November 20, 2015, 5:04am

@ andre.m - I am not familiar with the encoding standards for special characters (extended ASCII, UTF-8, and Unicode). The thing is that French has tons of àéèù…, and the Encoding.UTF8 class throws exceptions for such characters.

leforban · November 20, 2015, 5:05am

@ Bec a Fuel -

I need to represent the chars greater than 127…

Bec_a_Fuel · November 20, 2015, 5:27am

Just FYI, if you did not notice, I am French.

That said, about your problem, are you sure that you are indeed receiving UTF-8 data and not extended ASCII?

For character codes greater than 127, UTF-8 needs at least two bytes to encode them. Are you receiving “0xC3 0xA9” for the char ‘é’, for example? If you receive only a single byte (0x82 in code pages 437/850, or 0xE9 in Latin-1), then what you are receiving is not UTF-8 encoded but extended-ASCII encoded.
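
To make that check concrete, here is a rough sketch that tests whether a buffer is structurally valid UTF-8. The `Utf8Check` class and `LooksLikeUtf8` method are hypothetical names, not part of NETMF. It only looks at lead and continuation byte patterns (it does not reject every overlong or surrogate encoding), so treat it as a heuristic: a lone 0xE9 fails because it is a three-byte lead with no continuation bytes after it, whereas 0xC3 0xA9 passes.

```csharp
using System;

// Hypothetical helper (name chosen for illustration only).
public static class Utf8Check
{
    // Returns true when every sequence in 'data' has a valid UTF-8 lead byte
    // followed by the right number of continuation bytes (10xxxxxx).
    public static bool LooksLikeUtf8(byte[] data)
    {
        int i = 0;
        while (i < data.Length)
        {
            byte b = data[i];
            int extra; // number of continuation bytes expected after the lead byte

            if (b < 0x80) extra = 0;                      // plain ASCII
            else if (b >= 0xC2 && b <= 0xDF) extra = 1;   // two-byte sequence
            else if (b >= 0xE0 && b <= 0xEF) extra = 2;   // three-byte sequence
            else if (b >= 0xF0 && b <= 0xF4) extra = 3;   // four-byte sequence
            else return false;                            // stray continuation or invalid lead

            // The sequence must not run past the end of the buffer.
            if (i + extra >= data.Length) return false;

            for (int k = 1; k <= extra; k++)
            {
                if ((data[i + k] & 0xC0) != 0x80) return false;
            }
            i += extra + 1;
        }
        return true;
    }
}
```

If the check fails, the data is most likely in a single-byte code page and can be decoded byte by byte instead. A Latin-1 buffer can occasionally pass by coincidence, so knowing what the sending device actually emits is still the more reliable answer.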

leforban · November 20, 2015, 5:52am

I should have read your profile before Ricard Inside…

I am receiving E9 for ‘é’. It sounds to me like this is not UTF-8 but extended ASCII, which explains why I can’t use the Encoding.UTF8.GetChars() method.