A Performance topic

I am chiming in late here, so I hope I am not repeating something that has already been said. Since .NETMF has no JIT, the optimizations achieved on the desktop are not seen in .NETMF. On the desktop, however, there is one significant optimization worth knowing about: ‘bounds check elimination’.

Let’s take the following example


void SomeFunc(int[] numbers)
{
  int count = numbers.Length;
  for (int i = 0; i < count; i++)
  {
    numbers[i] = i;
  }
}

vs.


void SomeFunc(int[] numbers)
{
  for (int i = 0; i < numbers.Length; i++)
  {
    numbers[i] = i;
  }
}

In the second example, the JIT compiler can statically determine that ‘i’ is constrained to the bounds of the array and will therefore eliminate the bounds check that would otherwise be performed every time ‘numbers[i] = i;’ executes. For high-performance, processor-intensive applications this can be a significant optimization, and it is lost if the developer manually hoists ‘numbers.Length’ into a local as in the first example.

As with any optimization we need to profile before optimizing, because what applies to one platform does not apply to another and as I am learning, full .NET vs .NETMF is no exception.

This is the most sensible statement of the whole topic. Optimizing performance can only be done after measuring where the performance issues are. Measuring is everything. Assuming something will be faster is a no-go. Only by measuring do you know where the performance problem resides, and only by measuring can you determine whether it has been solved.
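
For what it’s worth, a measurement doesn’t have to be fancy. Below is a minimal desktop C# timing sketch for the two loops above; the class and method names and the iteration counts are made up for illustration, and the results will of course differ per platform and runtime.

using System;
using System.Diagnostics;

class LoopBenchmark
{
  static void HoistedLength(int[] numbers)
  {
    int count = numbers.Length; // length hoisted into a local, as in the first example
    for (int i = 0; i < count; i++)
    {
      numbers[i] = i;
    }
  }

  static void DirectLength(int[] numbers)
  {
    for (int i = 0; i < numbers.Length; i++) // the JIT can elide the bounds check here
    {
      numbers[i] = i;
    }
  }

  static void Main()
  {
    int[] data = new int[10000];
    const int iterations = 100000;

    // Warm up both methods so JIT compilation is not part of the measurement.
    HoistedLength(data);
    DirectLength(data);

    Stopwatch sw = Stopwatch.StartNew();
    for (int n = 0; n < iterations; n++) HoistedLength(data);
    sw.Stop();
    Console.WriteLine("Hoisted length: " + sw.ElapsedMilliseconds + " ms");

    sw.Reset();
    sw.Start();
    for (int n = 0; n < iterations; n++) DirectLength(data);
    sw.Stop();
    Console.WriteLine("Direct length:  " + sw.ElapsedMilliseconds + " ms");
  }
}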

You’re right, the rules we propose to lay down have to be backed up by snippets of code that demonstrate an actual speed-up.

Keep in mind that a lot of these rules go out the window for a couple of reasons: it’s a RISC CPU and it’s .NETMF. I would bet that some of the optimizations done in .NET code for the Xbox (which is also RISC) would fit better.

What I did to get better perf on the Xbox was to avoid overly complicated math (matrix multiplication and the like) and unwind some of that logic, since we are dealing with RISC CPUs.
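
To illustrate what I mean by unwinding, here is a rough sketch (the method names are made up, and this is not code from any real project): a generic nested-loop matrix multiply versus the same math for a fixed 2x2 case with the loops written out by hand. Whether the unrolled form actually wins on a particular RISC/.NETMF target is, again, something to measure rather than assume.

// Generic n x n multiply: matrices stored row-major in flat arrays.
static void MultiplyGeneric(float[] a, float[] b, float[] result, int n)
{
  for (int row = 0; row < n; row++)
  {
    for (int col = 0; col < n; col++)
    {
      float sum = 0;
      for (int k = 0; k < n; k++)
      {
        sum += a[row * n + k] * b[k * n + col];
      }
      result[row * n + col] = sum;
    }
  }
}

// The same math for a fixed 2x2 case, unwound into plain scalar arithmetic:
// no inner loops, no index computation.
static void Multiply2x2Unrolled(float[] a, float[] b, float[] r)
{
  r[0] = a[0] * b[0] + a[1] * b[2];
  r[1] = a[0] * b[1] + a[1] * b[3];
  r[2] = a[2] * b[0] + a[3] * b[2];
  r[3] = a[2] * b[1] + a[3] * b[3];
}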

Not JITting doesn’t help us either… I did take a peek at the .NET TinyCLR code and they did try to do it at some point. The people who disabled it are not stupid, so I’ll stick with RLP where needed and C# elsewhere (and follow RISC rules).

Prejitting would probably be a hell of a lot simpler to implement, and could avoid some of the pitfalls of too much memory/CPU being used trying to manage it in real time. We used to do that in the “old” days to avoid the JIT hiccups for certain apps. You lose MSIL portability when you do that, but that doesn’t matter that much for a microcontroller.

The NETMF team abandoned AOT (Ahead Of Time) native compilation because the MSIL was more compact, and therefore, more code would fit on the device in a given amount of flash. This may or may not be an issue today, with our micros that have 1MB or more of flash. They abandoned JIT because, with such low RAM available, they had to keep throwing out JITted code to reclaim memory, and thus had to re-JIT the same code over and over. It ended up being slower.

On a device with external RAM and flash, I would guess that either of these options would be perfectly viable, but neither is implemented in the current NETMF.

You can read the post here: .NET Micro Framework - FAQ

Would have been nice if they left those options in there and let the developer decide the deployment model. Seems pretty hardcore to implement that now. Prejitting could be “easier”, just dump out the interpreted code. I think :slight_smile: who is up for writing a micro ngen? :slight_smile:

As an example, I am looking for the most efficient way to perform byte-to-string conversion of NMEA data.

After receiving the array of bytes coming from the serial port, a heavy task is to convert that byte array into a string. I saw some functions using UTF8Encoding, but this fails since NMEA data is not UTF-8, it is ASCII.
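
(For concreteness, the kind of UTF8Encoding-based function I mean looks roughly like the sketch below. This is only an illustration, not my exact code, and it assumes Encoding.UTF8.GetChars(byte[]) is available on the NETMF build in question.)

// Simplified illustration: the UTF8Encoding-based conversion that commonly
// gets posted for NETMF serial data.
static string BytesToStringUtf8(byte[] buffer)
{
  // GetChars decodes the whole buffer as UTF-8; for clean 7-bit ASCII the
  // result is the same characters, but the decoder still does extra work
  // and can stumble on noise bytes >= 0x80.
  char[] chars = System.Text.Encoding.UTF8.GetChars(buffer);
  return new string(chars);
}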

Actually I have tried several ways of coding it, but I think there is still room for improvement. What do you think of investing time in finding the most efficient function for doing this? If you agree, I already have several versions that I can share with you as a basis for discussion.

If I remember right, this has been discussed before. Try searching the forum. There is also this parser; I don’t know how good it is, but here it is anyway:

http://www.tinyclr.com/forum/topic?id=6007

Hello Architect

As I said, this topic is oriented toward performance. My NMEA parser works fine in terms of behaviour, but unfortunately it is really slow. Most of us, when performance becomes critical, are in this situation: the code works, but too slowly. That is why I think we should concentrate on the slow tasks. I already had a look at the HZ.NMEA parser and it does not answer the question, which is: how to convert, in the most efficient way (as fast as possible), an array of bytes coming from a serial port into a string.

The HZ.NMEA parser uses the serialport.readline…

RLP will be your friend in this scenario. Any bit banging or heavy iterative logic is about 100 times faster in RLP (my tests).

Even in .NET on a full-blown Intel stack this is not going to be fast if you need several Mbit/s over TCP/IP or similar (converting bytes into UTF-8 and then comparing).

One other option is not to convert it to strings, but keep the bytes and just compare bytes if you’re looking for something in particular in a byte array.

You could even do this without using a lot of memory where you just stream byte by byte and compare them.

What I was trying to say is:

It’s a lot faster to do this:
-Convert the string into bytes and find that in the byte array, instead of converting the whole byte array into UTF8 and looking for a string.

Or even faster:
-Convert string into bytes, stream the bytes from input and compare without storing them in an array

Even faster (see the sketch below):
-Look for bytes in a byte stream (where you hardcode what you are looking for into smaller byte arrays)
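
Here is a rough sketch of that last idea, assuming an NMEA-style case where the sentence header (e.g. “$GPRMC”) is known up front; the pattern bytes and helper name below are made up for illustration.

// Scan a received byte buffer for a hardcoded pattern (here the ASCII bytes
// of "$GPRMC") without ever building a string.
static readonly byte[] GprmcPattern =
  new byte[] { (byte)'$', (byte)'G', (byte)'P', (byte)'R', (byte)'M', (byte)'C' };

// Returns the index of the first occurrence of pattern in buffer[0..count), or -1.
static int IndexOfPattern(byte[] buffer, int count, byte[] pattern)
{
  int last = count - pattern.Length;
  for (int i = 0; i <= last; i++)
  {
    int j = 0;
    while (j < pattern.Length && buffer[i + j] == pattern[j])
    {
      j++;
    }
    if (j == pattern.Length)
    {
      return i;
    }
  }
  return -1;
}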

@ leforban, I don’t have access to my source code at the minute, but I think my code for turning serial data bytes into a string went along these lines: first fetch into a byte[] buffer, then turn that into a char[] array with the System.Text UTF8 encoder, then create a new string using the char[] array as the parameter to the constructor. If it’s straight ASCII data though, could the byte[] array be recast as char[] to skip the encoder?
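
A byte[] can’t be cast to a char[] directly in C#, but if the data really is 7-bit ASCII you can copy each byte into a char yourself and skip the encoder entirely. Something along these lines (just a sketch, the method name is made up); whether it actually beats the encoder path on a given board is, as said above, something to measure.

// For pure 7-bit ASCII each byte widens 1:1 to the corresponding char,
// so no UTF-8 decoding is needed.
static string AsciiBytesToString(byte[] buffer, int offset, int count)
{
  char[] chars = new char[count];
  for (int i = 0; i < count; i++)
  {
    chars[i] = (char)buffer[offset + i];
  }
  return new string(chars);
}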

That’s REALLY pessimistic. I can parse 46MB of HL7 v2 messages (33 thousand of them) into individual values, which involves a LOT of string comparisons, replacements, and calls to String.Split(), including decoding escaped values, in 6.4 seconds on this cheapo laptop I’m using. That’s 57.5 mbits/second, and what I’m doing makes parsing NMEA strings look like “Hello, World”.

If I parallelize the piece that reads the messages off of disk, then I can parse them in 3.5 seconds. That’s 105 mbits/second. The desktop framework is REALLY fast.
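
To give a feel for the kind of splitting involved (a minimal illustration only, nothing like the real parser): within an HL7 v2 segment, fields are separated by ‘|’ and components within a field by ‘^’. The sample segment below is invented.

// Invented sample segment; field 3 carries a patient identifier.
string segment = "PID|1||12345^^^Hospital^MR||Doe^John";

string[] fields = segment.Split('|');       // fields[3] == "12345^^^Hospital^MR"
string[] components = fields[3].Split('^'); // components[0] == "12345"

Console.WriteLine(components[0]);           // prints "12345"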

If you would like to know what HL7v2 looks like, see here: Health Level 7 - Wikipedia

@ godefroi - just looked at HL7v2 - eek…

It’s what I do, all day every day :slight_smile:

And I thought I had it bad with JDF and XPIF…

The world is slowly moving to HL7 v3, which is XML-based, but it’s so massive, so complex, and so impenetrable that, if anything, it’s worse…

I would typically use BizTalk or Mirth for parsing HL7, but that’s a different topic.
If you’re talking about reading raw bytes from a socket, converting to UTF-8 and then doing comparisons, things change.

AFAIK Windows has kernel support for XML, so it shouldn’t be that bad. It’s funny that just as healthcare moves to XML, the rest of the world has moved to JSON.