Hi everyone. I am having some trouble with my application. I am using the current version of Hashtable.cs available on CodePlex, but I experienced the same problem with the one provided in the System.Collections namespace.
I have stored data of several types (float, int, short, ushort…) in a Hashtable called HTV.
Then I need to compute subtractions and additions on items of the HTV and store the results in other items of the HTV.
Each object stored in the HTV is defined with its own type.
As an example, a bool item of the HTV is declared as:
public class val_HtV_bool : val_HTV
{
    public new bool value; // hides the inherited object-typed field
}
where val_HTV is defined as:
public class val_HTV
{
    public object value;
}
Unfortunately I observe very poor performance (3.7 ms per subtraction). I have tried all day to improve this, without any success. (The HTV has about 100 buckets.) Do you have any advice? I believed that encapsulating the value in a class would avoid boxing and unboxing, but maybe I am wrong.
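One thing worth checking with the two classes above is which `value` field actually gets used. The sketch below is written for desktop .NET (on NETMF, `Debug.Print` would stand in for `Console.WriteLine`) and assumes items come back from the Hashtable typed as the base class. It shows that `public new bool value` hides the base field rather than replacing it, so any access through a `val_HTV` reference still goes to the `object` field and boxes.

```csharp
using System;

public class val_HTV
{
    public object value;
}

public class val_HtV_bool : val_HTV
{
    public new bool value; // a second, separate field that hides the base one
}

public static class HidingDemo
{
    public static void Main()
    {
        val_HtV_bool item = new val_HtV_bool();
        item.value = true;          // writes the bool field, no boxing

        val_HTV asBase = item;      // same object, seen through the base type
        Console.WriteLine(asBase.value == null); // True: the object field was never set

        asBase.value = 42;          // boxes the int into the base field
        Console.WriteLine(item.value);           // True: the bool field is untouched
    }
}
```

So boxing is only avoided when the code casts to the exact derived type before touching `value`, and that cast is itself a run-time type check.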
May I ask, why are you keeping different numeric types in the same hashtable? Would it not be easier to convert all of them to floats or doubles and then do the math on simple, known value-typed variables?
I admit to not being too familiar with the micro CLR, but the following should be faster than what you have. It avoids boxing and run-time type checks (the ‘is’ operator), and is a struct rather than a class, meaning that it doesn’t live on the heap, but rather inside objects as any value type would; this means that the garbage collector is not doing a bunch of busy work wrangling all those tiny objects. On the downside, it does use more code and data space than using a boxed object would.
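The code that accompanied this reply does not appear in the text above. The following is only a guess at the general shape of such a struct, not the poster's actual code: a type tag plus one field per storage class, with arithmetic dispatched on the tag. The names `Num`, `NumKind`, `FromInt`, `FromDouble`, `AsDouble` and `Subtract` are all hypothetical, and only two of the fourteen types are covered.

```csharp
using System;

// Hypothetical tag identifying which field of Num is in use.
public enum NumKind : byte { Int64, Double }

// A value type holding either an integer or a floating-point number.
// Both fields are always present, which is the extra data space mentioned above.
public struct Num
{
    public NumKind Kind;
    public long I;
    public double D;

    public static Num FromInt(long v)
    {
        Num n = new Num();
        n.Kind = NumKind.Int64;
        n.I = v;
        return n;
    }

    public static Num FromDouble(double v)
    {
        Num n = new Num();
        n.Kind = NumKind.Double;
        n.D = v;
        return n;
    }

    public double AsDouble()
    {
        return Kind == NumKind.Double ? D : (double)I;
    }

    // Integer math when both operands are integers, otherwise floating point.
    public static Num Subtract(Num a, Num b)
    {
        if (a.Kind == NumKind.Int64 && b.Kind == NumKind.Int64)
            return FromInt(a.I - b.I);
        return FromDouble(a.AsDouble() - b.AsDouble());
    }
}

public static class NumDemo
{
    public static void Main()
    {
        Num[] table = new Num[2];          // a typed array keeps the structs unboxed
        table[0] = Num.FromInt(7);
        table[1] = Num.FromDouble(1.5);
        Console.WriteLine(Num.Subtract(table[0], table[1]).AsDouble()); // 5.5
    }
}
```

Note that a struct like this only stays unboxed in a typed array or field. Storing it in a Hashtable converts it to `object`, which boxes it again.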
I implemented your solution this morning but did not observe any improvement in speed, and the data struct is huge (I have 14 types of variables to handle, from short to ulong, by way of string, sbyte…).
I still think that I am not accessing the items of the hashtable by reference but rather by value (involving a copy…); this is probably the cause of the bad performance.
How to avoid boxing and unboxing using collections?
HTV[key1] = HTV[key2];
Takes 2 ms!!! My idea was just to have a single table to manage all the data of my application. This is why a Hashtable of variables was chosen, but I now think that this design choice may not be the correct one.
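On the by-reference versus by-value question, a small desktop .NET sketch (the `Cell` class and the keys are hypothetical) suggests that a statement like `HTV[key1] = HTV[key2];` copies only a reference. The Hashtable stores `object` references, both for class instances and for boxed value types, so the assignment does not copy the payload.

```csharp
using System;
using System.Collections;

// Hypothetical reference-type wrapper, standing in for the val_HTV classes.
public class Cell
{
    public int Value;
}

public static class ReferenceDemo
{
    public static void Main()
    {
        Hashtable htv = new Hashtable();

        htv["a"] = new Cell { Value = 1 };
        htv["b"] = htv["a"];                 // copies the reference, not the Cell
        ((Cell)htv["b"]).Value = 5;
        Console.WriteLine(((Cell)htv["a"]).Value); // 5: both keys see the same object

        htv["x"] = 7;                        // the int is boxed once, here
        htv["y"] = htv["x"];                 // copies the reference to that box
        Console.WriteLine(object.ReferenceEquals(htv["x"], htv["y"])); // True
    }
}
```

If that holds on NETMF as well, the cost of the statement would come from the two key lookups (hashing and comparing the keys) rather than from copying the value.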
Simple answer? Don’t box. Convert all your values to floats/doubles/longs and just roll with one value-type. Every time you rely on the system to have smarts for you, you will pay dearly in performance. This is especially true for NETMF. I’ve done a bit of speed testing with FEZ Spider and posted a post-mortem on my blog:
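As a rough illustration of the "one value type" suggestion, the sketch below replaces the Hashtable with a plain `double[]` indexed by integer constants. The slot names (`Speed`, `Offset`, `Result`) and the `Vars` class are hypothetical, and the approach assumes every variable in the application can be represented as a double, which would not cover the string entries mentioned earlier.

```csharp
using System;

public static class Vars
{
    // Hypothetical slot indices replacing the Hashtable keys.
    public const int Speed = 0;
    public const int Offset = 1;
    public const int Result = 2;
    public const int Count = 3;

    // One typed array: no boxing, no hashing, no casts.
    public static readonly double[] Values = new double[Count];

    // Subtracts slot b from slot a and stores the result in slot dest.
    public static void Subtract(int dest, int a, int b)
    {
        Values[dest] = Values[a] - Values[b];
    }
}

public static class VarsDemo
{
    public static void Main()
    {
        Vars.Values[Vars.Speed] = 12.5;
        Vars.Values[Vars.Offset] = 2.5;
        Vars.Subtract(Vars.Result, Vars.Speed, Vars.Offset);
        Console.WriteLine(Vars.Values[Vars.Result]); // 10
    }
}
```

The trade-off is that the set of variables is fixed at compile time, whereas a Hashtable allows keys to be added at run time.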
The solution, at the end of the day, was RLP. If you need to move and process any “significant” amount of data in a “reasonable” amount of time, you should switch to RLP. Unfortunately, RLP is C/C++ code, so you lose all the niceness of C#.
Interesting. That means that the poor perf is not caused by boxing or the hashtable, otherwise getting rid of one or the other should have made your code faster. I’m also surprised that simply inserting into an array takes 2ms. I mean, this is a non-JIT’ed CLR on a slow and relatively primitive CPU (I don’t think any CPU currently used for netmf does superscalar or out-of-order execution), but that’s just a bit much. Are you completely sure you’re measuring what you think you’re measuring?
Wow. I knew that netmf was not going to be a speed demon, but these numbers are just not very encouraging. 7.5ms to insert 32 elements into a fixed-size array… I think my Commodore 64 running BASIC ran faster than that.
What board are you using? Hopefully it’s one of the slower ones.
This does not make any sense. Are you sure the unit is correct, that you are measuring milliseconds (1/1000 of a second) and not microseconds (1/1000000 of a second)?
Saving int32 data into a static array takes roughly 0.01 ms, or 10 microseconds, on the Hydra board I just received. See my results in this thread: http://www.tinyclr.com/forum/21/6546/#/2/
Hopefully I was not too wrong.
Even if I was correct, it is already a bit shocking to see a 200MHz CPU taking 2000 cycles (10 microseconds at 200 cycles per microsecond) to access one address in RAM, even knowing all the address indirections involved in the .NET runtime. Maybe the bottleneck is somewhere else? RAM is too slow, or the memory bus is too slow, or the cache is too small?
I question MoonDragon’s numbers. In my testing of an empty loop, I got 31.8 microseconds per empty loop iteration (compared to his 54us), using my FEZ Panda II.
On my Cerb40, I get 7.4 microseconds. Interesting that the increase in MHz alone doesn’t explain the speed increase. Something else must have gotten faster in NETMF 4.2.
Also, for those that are curious, the numbers are extremely consistent (they don’t change from run to run, which makes sense), and it doesn’t matter whether one uses DateTime.Now.Ticks or Microsoft.SPOT.Hardware.Utility.GetMachineTime().Ticks for measurement. The numbers come out the same.
Here’s the code I used to test:
// do some math: work out how many ticks make up one microsecond
var iterations = 10000;
var one_ms_span = new TimeSpan(0, 0, 0, 0, 1);
var ticks_per_us = (double)one_ms_span.Ticks / 1000d;
var start = Microsoft.SPOT.Hardware.Utility.GetMachineTime();
for( var x = 0; x < iterations; x++ )
{
    // empty loop
}
var end = Microsoft.SPOT.Hardware.Utility.GetMachineTime();
// more math
var total_ticks = end.Ticks - start.Ticks;
var ticks_per_iter = (double)total_ticks / (double)iterations; // get single iteration value in ticks
var us_per_iter = ticks_per_iter / ticks_per_us; // get single iteration value in microseconds
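Since several of the figures in this thread include the cost of the loop itself, it may help to time an empty loop as a baseline and subtract it. The sketch below is a desktop .NET variation on the code above, not a NETMF program: it uses `DateTime.UtcNow.Ticks` so that it runs anywhere, and the `LoopTiming` class and its method are hypothetical names. On a desktop JIT the empty loop may be optimized away and the clock resolution is coarse, so the printed numbers are only meaningful on an interpreted runtime such as NETMF, where `GetMachineTime()` could be swapped in.

```csharp
using System;

public static class LoopTiming
{
    // Converts the tick count of a whole run into microseconds per iteration.
    public static double MicrosecondsPerIteration(long totalTicks, int iterations)
    {
        double ticksPerUs = TimeSpan.TicksPerMillisecond / 1000.0; // 10 ticks per microsecond
        return (double)totalTicks / iterations / ticksPerUs;
    }

    public static void Main()
    {
        const int iterations = 10000;
        int[] data = new int[32];

        long t0 = DateTime.UtcNow.Ticks;
        for (int x = 0; x < iterations; x++)
        {
            // baseline: loop overhead only
        }
        long t1 = DateTime.UtcNow.Ticks;
        for (int x = 0; x < iterations; x++)
        {
            data[x & 31] = x; // the operation under test: one array store
        }
        long t2 = DateTime.UtcNow.Ticks;

        double emptyUs = MicrosecondsPerIteration(t1 - t0, iterations);
        double storeUs = MicrosecondsPerIteration(t2 - t1, iterations);
        Console.WriteLine("empty loop: " + emptyUs + " us/iter");
        Console.WriteLine("array store, net of loop overhead: " + (storeUs - emptyUs) + " us/iter");
    }
}
```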