What are GC messages telling us?

sgtyar95 · August 2, 2022, 2:28pm

So I have a “memory leak” (not strictly accurate, yes, but it’s similar in practice, so I don’t know what to call it - memory bloat, maybe). I let it run overnight and then parsed the garbage collection messages into an Excel spreadsheet. Over the course of 12 hours, both my String and Value type memory usage increased at exactly the same rate: there was a difference of 14KB between them when I powered on the machine yesterday and 14KB when I came in this morning, even though both values had increased dramatically.
before
Type 11 (STRING ): 358240 bytes
Type 14 (VALUETYPE ): 341424 bytes

after
Type 11 (STRING ): 1063120 bytes
Type 14 (VALUETYPE ): 1048800 bytes
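
The growth per type can be read off with a few lines of desktop C#. This is only a minimal sketch, assuming the dump lines keep the format shown above; `GcLogDelta` and its methods are hypothetical names, not part of TinyCLR.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class GcLogDelta
{
    // Matches lines such as "Type 11 (STRING ): 358240 bytes".
    private static readonly Regex Line =
        new Regex(@"Type\s+\d+\s+\((\w+)\s*\):\s+(\d+)\s+bytes");

    // Returns a map from type name (e.g. "STRING") to its size in bytes.
    public static Dictionary<string, long> Parse(string dump)
    {
        var sizes = new Dictionary<string, long>();
        foreach (Match m in Line.Matches(dump))
        {
            sizes[m.Groups[1].Value] = long.Parse(m.Groups[2].Value);
        }
        return sizes;
    }

    public static void Main()
    {
        var before = Parse("Type 11 (STRING ): 358240 bytes\nType 14 (VALUETYPE ): 341424 bytes");
        var after = Parse("Type 11 (STRING ): 1063120 bytes\nType 14 (VALUETYPE ): 1048800 bytes");

        foreach (var name in before.Keys)
        {
            long growth = after[name] - before[name];
            Console.WriteLine(name + " grew by " + growth + " bytes");
        }
    }
}
```

With the figures above, STRING grows by 704,880 bytes and VALUETYPE by 707,376 bytes, so the two track each other to within about 2.5 KB over the run.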

They appear to be increasing roughly linearly, and my array types aren’t increasing in size so it’s not an arraylist or hashset that is growing in size without my knowledge. I also force GC when memory hits a certain point because I’ve had issues with Out of Memory exceptions in the past crashing my program, and I force it to wait for finalizers. This also has cropped up as I have been moving to the UI library, so I may be doing something wrong with the UI and breaking it.

As far as I understand it, a string is a value type, but it is more than the valuetype designation - wouldn’t valuetype include things like ints, doubles, structs, etc. and should be larger? They’re clearly related, otherwise they wouldn’t increase at an identical rate, but I’m just trying to figure out what’s going on under the hood so I know where to start looking.

Ultimately, my question is this - what is VALUETYPE? The individual value types are broken out into their own categories, so if VALUETYPE is something specific I can narrow it down.

mcalsyn · August 2, 2022, 8:48pm

What you are seeing in the GC message is heap size - so anything you created with ‘new’ or defined as a field within something you created with ‘new’.

Value types (mostly scalars like int, float, double, and structs) are allocated on the stack if they are declared within a method, but they are allocated on the heap if they are fields in a class or are part of a struct that has been created with ‘new’. The VALUETYPE class is the set of fixed-sized scalars (and I think arrays, too) that were allocated on the heap. The TinyCLR heap tries to keep predictable size things (scalars, arrays, structs, etc) in separate zones on the heap to cause less memory churn and save some cpu cycles.

Strings are actually never value types and never stored on the stack (though references to strings can be on the stack). Strings that are local to a method are stored as a reference to an allocation on the heap. So, string foo = "bar"; is actually shorthand for a ‘new’ operation that the compiler generates for you.
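
A quick way to confirm which kind a type is: ask the type itself. This is a small sketch written for desktop .NET (on a device you would likely swap `Console.WriteLine` for `Debug.WriteLine`); `Point2D` is a hypothetical struct used only for illustration.

```csharp
using System;

public struct Point2D   // hypothetical struct, for illustration only
{
    public int X;
    public int Y;
}

public static class TypeKinds
{
    public static void Main()
    {
        Console.WriteLine(typeof(string).IsValueType);   // False: string is a reference type
        Console.WriteLine(typeof(int).IsValueType);      // True
        Console.WriteLine(typeof(Point2D).IsValueType);  // True
    }
}
```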

As you probably already know, things are garbage collected off the heap when nothing is referencing them anymore. So, if you are doing forced GCs and still seeing monotonically increasing memory usage, then you do have a mem leak and somewhere there is a reference (or chain of references) that still points to the leaked objects.

On full .NET, there are tools (for instance, by JetBrains) that can tell you who is pointing at what, but we don’t have that in the TinyCLR runtime. Instead, you can try putting a breakpoint on class finalizers and seeing if the expected finalizers are getting called during a GC. It’s hard to leak strings, but very easy to leak classes that reference strings, so start by chasing down leaked classes.
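
If breakpoints are awkward, counters in the constructor and finalizer give the same information. The following is a rough sketch, not a TinyCLR-specific recipe: `TrackedScreen` and `LeakCheck` are hypothetical names, and it assumes `GC.Collect`, `GC.WaitForPendingFinalizers`, `Interlocked` and `Debug.WriteLine` are available on your target as they are on desktop .NET.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

public class TrackedScreen   // hypothetical class standing in for one of your own
{
    public static int Created;
    public static int Finalized;

    public TrackedScreen() { Interlocked.Increment(ref Created); }

    // Runs once the GC has found this instance unreachable.
    ~TrackedScreen() { Interlocked.Increment(ref Finalized); }
}

public static class LeakCheck
{
    // Force a collection, let finalizers run, then log both counters.
    public static void Report()
    {
        GC.Collect();
        GC.WaitForPendingFinalizers();
        Debug.WriteLine("created=" + TrackedScreen.Created +
                        " finalized=" + TrackedScreen.Finalized);
    }
}
```

If the created count keeps climbing while the finalized count stays flat across forced collections, something is still holding references to those instances.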

Failing to call IDisposable.Dispose can be one source of leaks. If you new up an IDisposable, you must call Dispose() or you will leak something.
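
One way to make the Dispose() call hard to forget is a `using` block. A minimal sketch, with `ScratchBuffer` as a hypothetical disposable type rather than anything from the TinyCLR libraries:

```csharp
using System;

public sealed class ScratchBuffer : IDisposable   // hypothetical disposable resource
{
    public bool IsDisposed { get; private set; }

    public void Dispose()
    {
        // Release native handles, unhook events, etc. here.
        IsDisposed = true;
    }
}

public static class DisposeDemo
{
    public static ScratchBuffer UseOnce()
    {
        var buffer = new ScratchBuffer();
        using (buffer)
        {
            // Work with the buffer; Dispose() runs when the block exits,
            // even if an exception is thrown inside it.
        }
        return buffer; // returned only so a caller can inspect IsDisposed
    }
}
```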

My most common mistake is not unsubscribing from events with the “-=” operator. Delegate references will also keep an object alive after all other refs have been released. So if you have a button and you added a .Click handler, be sure to unsubscribe.
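
Here is roughly what that looks like. `FakeButton` and `SettingsScreen` are hypothetical stand-ins (the real UI library's button and event types will differ), but the subscribe/unsubscribe pairing is the same idea:

```csharp
using System;

public class FakeButton   // hypothetical stand-in for a UI button
{
    public event EventHandler Click;

    public void RaiseClick() { Click?.Invoke(this, EventArgs.Empty); }

    public bool HasSubscribers { get { return Click != null; } }
}

public class SettingsScreen : IDisposable
{
    private readonly FakeButton button;
    public int Clicks;

    public SettingsScreen(FakeButton button)
    {
        this.button = button;
        // The button's delegate list now references this screen.
        this.button.Click += OnClick;
    }

    private void OnClick(object sender, EventArgs e) { Clicks++; }

    public void Dispose()
    {
        // Without this line the button keeps the screen alive
        // for as long as the button itself is reachable.
        this.button.Click -= OnClick;
    }
}
```

The leak only matters when the button outlives the screen; if both become unreachable together, the GC can collect them as a pair.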

Lambdas (anonymous methods) that reference an instance member of a class will also hold that instance in memory. The lambda together with the outside variables it references (the things it “captures”) is called a closure. Instead of using a class member in your lambda, copy the member’s value into a local variable and then reference the local variable within the lambda. This is a subtle one that can cause big leaks.

Static objects will hold the entire connected tree of references in memory as well. But that’s kind of the idea of having a static. Just be careful what you reference from a static class.

Hope this helps out!

sgtyar95 · August 2, 2022, 9:08pm

Thanks for your detailed response! This at least helps me understand where to begin. I do have a few questions though -

If I use a string literal within a lambda (for instance, a delegate in the timed scheduler I was working on), will the string be reallocated every time the lambda is called? Will the lambda itself ever be disposed of? I have quite a few of these tasks that are wrapped in Task wrapper objects - I was assuming that once the task was run and dereferenced, the lambda and all enclosed variables would be disposed of as well. Is this accurate or does the lambda stick around?

I am pretty religious about doing this, but there are a lot of buttons that sit around on the heap at runtime. From what I can tell, these shouldn’t ever get cleaned up, as I have a bunch of Panels with a ton of on-screen UI elements operating as logical screen objects that I retain references to. Could this cause issues if any delegates are only assigned once and their referencing object is never deleted?

mcalsyn · August 2, 2022, 9:23pm

String literals in a lambda won’t get reallocated, but references to string fields from within lambdas can cause leakage by keeping the whole containing class around. I tried to cobble up an example:

public class SomeClass
{
    private int luckyNumber = 7;

    public void ConfigureSomething()
    {
        // This will create a persistent reference to 'this'
        //   from within the StaticThing class. The lambda has
        //   to 'capture' the 'this' pointer and consequently SomeClass
        //   won't get garbage collected as long as StaticThing
        //   exists. This is also true if 'StaticThing' is not
        //   a static class. As long as it exists, so does SomeClass.
        StaticThing.CallMeBack((context) => {
            context.Frob(this.luckyNumber);
        });
    }

    public void ConfigureSomething_better()
    {
        // In this case, the lambda captures the stack
        //   variable called 'myLuckyNumber'. The 'this'
        //   pointer is not captured, so SomeClass can
        //   get garbage collected.
        var myLuckyNumber = this.luckyNumber;
        StaticThing.CallMeBack((context) => {
            context.Frob(myLuckyNumber);
        });
    }
}
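
The example leaves StaticThing and the context type undefined. To compile it, something like the following would do. This is purely an assumed shape: `FrobContext`, `FrobCallback`, `Forget`, `Pending` and `RunAll` are hypothetical names, and the real class may store its callbacks differently.

```csharp
using System.Collections;

public class FrobContext   // hypothetical context type passed to each callback
{
    public int LastValue;

    public void Frob(int value) { LastValue = value; }
}

public delegate void FrobCallback(FrobContext context);

public static class StaticThing
{
    // Every delegate stored here stays reachable, along with whatever it captured.
    private static readonly ArrayList callbacks = new ArrayList();

    public static void CallMeBack(FrobCallback callback) { callbacks.Add(callback); }

    // Dropping the delegate releases whatever it captured (assuming no other references).
    public static void Forget(FrobCallback callback) { callbacks.Remove(callback); }

    public static int Pending { get { return callbacks.Count; } }

    public static void RunAll(FrobContext context)
    {
        foreach (FrobCallback callback in callbacks) { callback(context); }
    }
}
```

The point of the sketch is the `callbacks` list: as long as a delegate sits in it, the object captured by that delegate cannot be collected.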

Allocating delegates just once should not cause monotonically increasing memory usage. Maybe one big hit, but it should not change after that.

I would recommend sprinkling around some breakpoints on constructors and finalizers just to make sure you are seeing the allocation patterns that you expect.