G120 vs pic32

First question would be ; Do I really need to compute an hash table ?

Answering the question depends on what type of datas and how you need to access it…That’s why I tlaked about “your plumb” in a previous post. If you can provide more details on that key point, I will help with pleasure !

When the project was starting:
Data could have been float, ushort, string, boolean, byte… To know what kind of data I had in the hash table of variable (HTV) I had an other hash table called hash table of type (HTT) where type was 1 for boolean, 2 for ushort and so on…

Finally, to avoid too many accesses on HTT, I am only storing float and string in the HTV and I check the type only in specific case. I should also mention that this data structure is not known a priori, data are declared in a xml file that the user provide using rules. Let say that he want to use the gps time. He will write in an xml file that the variable “gps time” will be stored in the HTV as a string with the key being a ushort equal to 200. He also defines in the xml file the rule to retrieve the “gps time” as the first word in the ZDA sentence on the third com port…

In fact as you can see we are developping a kind of interpretor. Our application interprets a user application defined as a xml file…

Being a c/c++ software developper, the hash table has been chosen due to its compactness and also due to its fast access time. Sadly the second criteria is not true in .netmf.

@ leforban - It is not effectively !

And moreover, if you store your datas in string for float and so on, that mean that you have to play with type conversions in all senses, that also costs resource and time !

What about having a tinyclass that contains something like this :


class MyItem
{
Type ItemType {get;set;}
object ItemValue {get;set;}
}

so that you have the type of the item in a strong typed manner, and the item value itself as an abstract object so that you can do something like that :


switch(ItemType.ToString())
{
 case "System.Int16":
     int _myvalue = (int)ItemValue;
     break;
  case "System.String":
    string _myvalue = (string)ItemValue;
    break;
  ...
}

Actually my experiments shows a ratio of 10 in access runtime between a float[200] and a hashtable of 200 elements…

Actually all is float (unless less than ten variables that are string) boolean false is stored as (float)0 , true as (float)1 and so on. As you said casting involves overhead but it is nothing compared to boxing unboxing of hashtable…

Let me think about it…

I don’t see the point with your tinyclass. Each time I will need to record a new value or to retrieve a data I will need to switch case on a string. For sure it avoid to check the HTT but as I said, I check it only in specific case (less than 10 time over 200).

For my particular case, I think that the problematic is the extremely bad performance of HT on netmf while this is something really common in software engineering…

@ leforban - OK, let’s go into…

Do you only use Hashtable because you want to have a sort of keyed collection ? What is your key composed of, and can’t you find another way to access datas ?

How do you Get() and Set() in this table ?..

Yes you’re right, I have chosen the hash table because it is a keyed collection. The key is a ushort defined in the xml file by the user. After defining the variable and its associated rule in the xml file, user defines conditions based on values of these variables in a second xml file these conditions allows the board to switch on relays, activate a buzzer…

we also discussed about hash table performance at:
http://www.tinyclr.com/forum/topic?id=6552

The other way to access data could be to have a large (>200) array of float.

I set value in the HT by using HT.Add((ushort) key, (float) value) for initialization then I use HT[(ushort) key]=(float)value
To get value I am just doing temp=(float)HT[(ushort) key].

@ leforban - Is your key a sequence ? I mean an increment such as a PK in SQL ?

I might be not clear.

The key is a unique ushort defined by the user. As an exemple let’s imagine that the user want to get the values of analog inputs. He will write in the xml file something like this:

<Variable>
    <Channel>41</Channel>
    <Type>5</Type>
    <ShrtName>AI1</ShrtName>
    <Enable>1</Enable>
    <Source>15</Source>
    <Parameters>
      <ID>1</ID>
      <value>1</value>
    </Parameters>
    <User>0</User>
    <Parameters>
      <ID>2</ID>
      <value>64</value>
    </Parameters>
    <Parameters>
      <ID>3</ID>
      <value>36.630000000000003</value>
    </Parameters>
  </Variable>
  <Variable>
    <Channel>42</Channel>
    <Type>5</Type>
    <ShrtName>AI2</ShrtName>
    <Enable>1</Enable>
    <Source>15</Source>
    <Parameters>
      <ID>1</ID>
      <value>1</value>
    </Parameters>
    <User>0</User>
    <Parameters>
      <ID>2</ID>
      <value>10</value>
    </Parameters>
    <Parameters>
      <ID>3</ID>
      <value>10.3</value>
    </Parameters>
  </Variable>

the channel is the key, the type 5 defines the variable as a float type, the source 15 defines that the value comes from an analog input , the shortname identifies the input, the parameters defines the voltage conversion.

@ leforban - OK Nice !

Than when you want to Get or Set with HT[key], do you use a foreach, a for or stuf like that ? between both performance are slightly different !

Lets say that I want to update values of the variables corresponding to the analog inputs:
I am just doing the following for each input:

HTV[ana_input.channel]=(float)ana_input.Read()*ratio;

@ leforban -

This was the problem for me when yu say, a ushort the user will choose…Using an auto generated Increment would be nicer as you will be able to use a simple sorted list an could provide index search without hash computing nor sequential search…

Also…Is 200 the maximum amount of elements you will playing with ?

The amount of element can not be known a priori since it depends on the number of variables declared in the xml file. I think a reasonnable limit could be 300 or 400

The user can choose the value of the key to organize its data as he want. As an example NMEA related data from 500 to 1000, IMU related data from 1001 to 1050…

@ leforban -

I thnik that mixing a key with a user element is already a problem, which makes you have the only solution to play with hash. If you want the user to play with user ID, than add another UniqueID that is generated by the system between 1 and 400 (your max proposal).

Than :

Add a class that define the xml params elements
A pre-defined array of 400 null objects (wont take too much memory),

when you insert an element, you do the insert at the rank that represents the UniqueID betweeen 1 and 400 (leaving some null holes where there is no ID)
When you get an object, you simply address the element index with in this case is facing the UniqueID…

This will be a huge work of refactoring!!! I have a tons of class that use it…

The extreme overhead of boxing and unboxing make the Hashtable and ArrayList mostly useless when performance is an issue. Unfortunately, this is true on the full desktop .NET as well as NETMF. This is the one big reason why generics were such a nice addition (and why “generics” as implemented by Java were not).

It’s up to you…

You could always implement your own hashtable class that uses strongly typed values instead of object… that would eliminate the boxing/unboxing overhead.