The Black Box of .NET

May 24, 2012

References to missing Types or Methods in referenced DLL

Ever wonder what happens if you have a binary reference to an external .dll and decide not to recompile the application or library that references/depends on it? You can get some strange errors depending on the changes that have been made. Ever experienced a BadImageFormatException, ExecutionEngineException, TypeLoadException, or MissingMethodException?

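Firstly, the manifest of the dependent app/library is not updated to point to the new version. For strong-named assemblies this can cause mismatched assembly version errors, which typically surface as a FileLoadException ("The located assembly's manifest definition does not match the assembly reference"); a BadImageFormatException usually points to a bitness mismatch or an invalid image instead.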

Here are the results from some tests with the removal of types and/or methods on a referenced assembly (hereinafter referred to as ‘Bad Assembly’). All tests were done with x86 Console App/Library in separate solutions with a “static hardcoded” path reference to Bad Assembly (x64 shouldn’t matter):
  1. Results were always the same for Release/Debug builds.
  2. The bad assembly was always successfully loaded. ‘fuslogvw’ (.NET Assembly Load Viewer) confirmed this.
  3. Setting the reference to Bad Assembly as “Specific Version” (using v1.0.0.0) and changing the version on Bad Assembly to v1.1.0.0 had no effect. However, I didn’t try defining Bad Assembly in an assemblyBinding section of the app.config. It is possible that would have given a different result.
  4. References to a missing Type OR calls to a missing Method from Bad Assembly in "static void Main()" resulted in a "System.ExecutionEngineException" (fatal error as shown below). This exception cannot be caught by any means: Assembly events, AppDomain events, try/catch block in "static void Main()". I confirmed this thru WinDbg. This is because it is the first method that the EE (CLR Execution Engine) tells the JIT to compile. Since JIT happens on a method-by-method basis and "static void Main()" is the entry point for the app, there is no place “upstream” where an exception can be caught. The error in the Event Viewer is completely cryptic and provides no indication what went wrong.
  5. If the reference to a missing Type OR calls to a missing Method from Bad Assembly occurred “downstream” of "static void Main()" AND there WAS NOT exception handling upstream, OR there WAS exception handling upstream but the exception was rethrown so that it was never caught again, then results were same as #4 (i.e. unhandled exception).
  6. If the reference to a missing Type OR calls to a missing Method from Bad Assembly occurred “downstream” of "static void Main()" AND there WAS exception handling upstream, then the exception was caught as either a "System.TypeLoadException" or "System.MissingMethodException" respectively.  The exceptions were thrown from the JIT as the Type or Method was accessed.
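Here is a minimal sketch of the "downstream" case from #6, assuming you isolate every call into Bad Assembly in its own non-inlined method so the JIT only touches the missing Type/Method once you are already inside a try/catch. The names BadAssemblyGuard, TryRun and CallIntoBadAssembly are hypothetical, and since the real Bad Assembly isn't available here, the stand-in method simply throws the exception the JIT would raise:

```csharp
using System;
using System.Runtime.CompilerServices;

public static class BadAssemblyGuard
{
    // Runs 'action' and reports a Type or Method that turned out to be missing.
    // Returns null on success, otherwise a short description of the failure.
    public static string TryRun(Action action)
    {
        try
        {
            action();
            return null;
        }
        catch (TypeLoadException ex)
        {
            return "Missing type: " + ex.Message;
        }
        catch (MissingMethodException ex)
        {
            return "Missing method: " + ex.Message;
        }
    }

    // Hypothetical stand-in for a method that uses Bad Assembly.
    // NoInlining keeps the JIT from folding this body into its caller, so the
    // missing member is only resolved when this method itself is compiled.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static void CallIntoBadAssembly()
    {
        // In a real app this would be something like: new BadAssembly.Widget().DoWork();
        // Here we simulate what the JIT would throw for a removed method.
        throw new MissingMethodException("BadAssembly.Widget", "DoWork");
    }
}
```

Calling BadAssemblyGuard.TryRun(BadAssemblyGuard.CallIntoBadAssembly) from "static void Main()" keeps Main itself free of any reference to Bad Assembly, which is what lets the exception be caught at all.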


HTH


May 18, 2012

Why you should use ReadOnlyCollection<T>

Many people believe that you should be using List<T> to return collections from a public API. However, there are several reasons not to do so; instead return a ReadOnlyCollection<T>. There are several benefits of using a ReadOnlyCollection<T>:
  1. The *collection itself* cannot be modified - i.e. you cannot add, remove, or replace any item in the collection. So, it keeps the collection itself intact. Note that a ReadOnlyCollection<T> does not mean that the elements within the collection are "readonly"; they can still be modified. To protect the items in it, you’d probably have to have private “setters” on any properties for the reference object in the collection – which is generally recommended. However, items that are declared as [DataMember] must have both public getters and setters. So if you are in that case, you can’t do the private setter, but you can still use the ReadOnlyCollection<T>. Using this as a Design paradigm can prevent subtle bugs from popping up. One case that comes to mind is the case of a HashSet<T>. It is recommended to avoid having mutable properties on class or struct types that are going to be used in any collection that utilizes the type's hash code to refer to an object of that type - e.g. HashSet<T>.
  2. Performance: The List<T> class is essentially a wrapper around a strongly-typed array. Since arrays are fixed-size (they cannot be re-sized), adding an item to a List<T> whose backing array is already full forces the list to allocate a new, larger array and copy every existing element into it. Note that in the case of value types, a copy of each value is made whereas, in the case of reference types, only the reference to the object is copied; the objects themselves (and anything they reference) are not duplicated. The copy is cheap per element for reference types, but it is still O(n) work and the old array becomes garbage, so repeated growth of a large list can be bad for memory and performance. A List<T> created with the parameterless constructor starts with a capacity of 0, which jumps to 4 on the first Add, unless a capacity is specified thru the overloaded constructor - see the List<T>(Int32) constructor documentation. This is why it is recommended to create/initialize a List<T> instance with the overloaded constructor which takes an int denoting the initial capacity of the List<T> when possible. This can often be done since you are typically iterating on a "known" size value at runtime. For example, a "Repository or DataService" method that reads from an IDataReader object often knows a row count up front (a page size or the result of a COUNT query). Note that IDataReader.RecordsAffected is not a row count for queries; it is -1 for SELECT statements. If you are going to be putting an item in the List<T> each time the IDataReader is read: e.g. while(reader.Read()) you can easily create the list like so:
if (reader != null)
{
    // initialize the list with a pre-defined capacity;
    // expectedCount is a row count known ahead of time (e.g. a page size),
    // not reader.RecordsAffected, which is -1 for SELECT statements
    List<Foo> someList = new List<Foo>(expectedCount);
    while (reader.Read())
    {
         someList.Add(...);
         ...
    }
}
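
To see what pre-sizing buys you, here is a small sketch that counts how often a list's backing array gets replaced while items are added. CapacityDemo and CountReallocations are made-up names for illustration, and the sketch assumes that a change in the Capacity property means a reallocation happened:

```csharp
using System;
using System.Collections.Generic;

public static class CapacityDemo
{
    // Adds 'count' integers to 'list' and returns how many times the list's
    // Capacity changed, i.e. how often the backing array was reallocated.
    public static int CountReallocations(List<int> list, int count)
    {
        int reallocations = 0;
        int lastCapacity = list.Capacity;
        for (int i = 0; i < count; i++)
        {
            list.Add(i);
            if (list.Capacity != lastCapacity)
            {
                reallocations++;
                lastCapacity = list.Capacity;
            }
        }
        return reallocations;
    }
}
```

CountReallocations(new List<int>(1000), 1000) should report no reallocations at all, while the same call with new List<int>() has to grow the array several times along the way.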

Just as a side note, every time that a List<T> needs to grow in size (its new size would exceed its current Capacity), the capacity is *doubled*. So, if you happen to add just one more item to a List<T> that wasn’t constructed with a pre-determined size, and the List<T> expands to accommodate the new item, there will be a whole lot of unused space at the end of the List<T> even though there aren’t any items in it – bad for memory and performance.
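
Coming back to the main recommendation, here is a rough sketch of an API that keeps a List<T> internally and only hands out a ReadOnlyCollection<T>. Customer, CustomerDirectory and ReadOnlyDemo are hypothetical names invented for this example; List<T>.AsReadOnly() and the NotSupportedException thrown by the read-only wrapper are standard framework behavior:

```csharp
using System;
using System.Collections.Generic;
using System.Collections.ObjectModel;

public sealed class Customer
{
    public Customer(string name) { Name = name; }

    // Private setter: callers cannot modify the item either.
    public string Name { get; private set; }
}

public sealed class CustomerDirectory
{
    private readonly List<Customer> _customers = new List<Customer>();

    public void Register(string name)
    {
        _customers.Add(new Customer(name));
    }

    // AsReadOnly() wraps the existing list; it does not copy the items.
    public ReadOnlyCollection<Customer> Customers
    {
        get { return _customers.AsReadOnly(); }
    }
}

public static class ReadOnlyDemo
{
    // Returns true when an attempt to add through the IList<T> view is rejected.
    public static bool AddIsRejected(ReadOnlyCollection<Customer> customers)
    {
        try
        {
            ((IList<Customer>)customers).Add(new Customer("intruder"));
            return false;
        }
        catch (NotSupportedException)
        {
            return true;
        }
    }
}
```

Because the wrapper is a live view over the private list, items registered later still show up in a previously returned collection, but consumers have no way to add or remove anything themselves.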

May 17, 2012

Don't implement GetHashCode() on mutable fields/properties

You shouldn't ever implement GetHashCode on mutable properties (properties that could be changed by someone) - i.e. non-private setters. I've seen this done in several places and it results in bugs that are very difficult to find.

Here's why - imagine this scenario:
  1. You put an instance of your object in a collection which uses GetHashCode() "under the covers" or directly (Hashtable).
  2. Then someone changes the value of the field/property that you've used in your GetHashCode() implementation.
Guess what...your object is permanently lost in the collection since the collection uses GetHashCode() to find it! You've effectively changed the hashcode value from what was originally placed in the collection. Probably not what you wanted.
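
The scenario above can be reproduced in a few lines. This is just a sketch; MutableKey and LostInHashSet are hypothetical names, and the hash code is deliberately based on a property with a public setter:

```csharp
using System;
using System.Collections.Generic;

public sealed class MutableKey
{
    // Public setter on a property that feeds GetHashCode(): the anti-pattern.
    public int Id { get; set; }

    public override bool Equals(object obj)
    {
        var other = obj as MutableKey;
        return other != null && other.Id == Id;
    }

    public override int GetHashCode()
    {
        return Id.GetHashCode();
    }
}

public static class LostInHashSet
{
    // Adds a key to a HashSet<T>, mutates it, and reports whether the
    // set can still find the very same instance afterwards.
    public static bool IsStillFoundAfterMutation()
    {
        var key = new MutableKey { Id = 1 };
        var set = new HashSet<MutableKey> { key };

        key.Id = 2; // the hash code now differs from the one the set stored

        return set.Contains(key);
    }
}
```

IsStillFoundAfterMutation() returns false even though the set still holds the object (its Count is still 1): the lookup uses the new hash code while the set stored the old one.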

April 17, 2012

How to determine which garbage collector is running

You can determine which version of GC you're running via 2 methods:
  1. reading the System.Runtime.GCSettings.IsServerGC property
  2. attaching to the process using WinDbg and checking how many GC threads you have using the command "!sos.threads" without the quotes and (according to the below criteria)...
If you are running a Console app, WinForm app or a Windows Service, you will get the Workstation GC. Just because you are running on a Server OS doesn't mean that you will get the Server version of GC.
  • If your app is non-hosted on a multi-proc machine, you will get the Workstation GC - Concurrent by default.
  • If your app is hosted on a multi-proc machine, you will get the ServerGC by default.
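
For the first method, a minimal sketch might look like this. GcInfo and Describe are names I made up; GCSettings.IsServerGC, GCSettings.LatencyMode and Environment.ProcessorCount are the actual framework members:

```csharp
using System;
using System.Runtime;

public static class GcInfo
{
    // Builds a one-line summary of the GC configuration of the current process.
    public static string Describe()
    {
        string mode = GCSettings.IsServerGC ? "Server" : "Workstation";

        // LatencyMode is Batch when concurrent GC is disabled and
        // Interactive when it is enabled.
        return mode + " GC, latency mode: " + GCSettings.LatencyMode
            + ", logical processors: " + Environment.ProcessorCount;
    }
}
```

Logging GcInfo.Describe() once at startup is an easy way to confirm which GC you actually got rather than the one you asked for.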
The following apply to any given .NET Managed Process:

Workstation GC

  • Uni-processor machine
  • Always suspends threads
  • 1 Ephemeral GC Heap (SOH), 1 LOH GC Heap
  • Runs on thread that triggered GC
  • Thread priority is the same as the thread that triggered GC

Workstation GC - Concurrent

  • Only runs concurrent in Gen2/LOH (full collection)
  • Mutually exclusive with Server Mode
  • Slightly larger working set
  • GC Thread expires if not in use after a while
  • 1 Ephemeral GC Heap (SOH), 1 LOH GC Heap
  • Has a dedicated GC Thread
  • Thread priority is Normal

Server GC

  • Larger segment sizes
  • Faster than Workstation GC
  • Always suspends threads
  • 1 Ephemeral GC Heap (SOH) for each logical processor (this includes hyperthreaded), 1 LOH GC Heap for each logical processor (this includes hyperthreaded)
  • Has dedicated GC Threads
  • Thread priority is THREAD_PRIORITY_HIGHEST
There is only 1 Finalizer thread per managed process regardless of GC Mode. Even during a concurrent GC, managed threads are suspended (blocked) twice to do some phases of the GC.

A seldom known fact is that even if you try to set the Server mode of GC, you might not be running in Server GC; the GC ultimately determines which mode will be optimal for your app and WILL override your settings if it determines your ServerGC setting will negatively impact your application. Also, any hosted CLR app will have any manual GC settings overridden.

In CLR 4.0, things change just a little bit

  • Concurrent GC is now Background GC
  • Background GC only applies to Workstation GC
  • Old (Concurrent GC):
    • During a full GC, allocations were allowed up to the end of the ephemeral segment
    • Otherwise, suspends all other threads
  • New (Background GC):
    • Allows for ephemeral GC’s simultaneously with Background GC if necessary
    • Performance is much faster
  • Server GC always blocks threads for collection of any generation

In CLR 4.5, things change just a little bit...again

  • Background Server GC
    • Server GC no longer blocks. Instead, it uses dedicated background GC threads that can run concurrently with user code - see MSDN: Background Server GC
Thus, in .NET 4.5+, all applications now have background GC available to them, regardless of which GC they use.

.NET 4.7.1 GC Improvements

.NET Framework 4.7.1 brings in changes in Garbage Collection (GC) to improve the allocation performance, especially for Large Object Heap (LOH) allocations. This is due to an architectural change to split the heap’s allocation lock into 2, for Small Object Heap (SOH) and LOH. Applications that make a lot of LOH allocations should see a reduction in allocation lock contention, and see better performance. These improvements allow LOH allocations while Background GC (BGC) is sweeping SOH. Usually the LOH allocator waits for the whole duration of the BGC sweep process before it can satisfy requests to allocate memory. This can hinder performance. You can observe this problem in PerfView’s GCStats where there is an ‘LOH allocation pause (due to background GC) > 200 msec Events’ table. The pause reason is ‘Waiting for BGC to thread free lists’. This feature should help mitigate this problem.

March 13, 2012

Don't use 'using()' with a WCF proxy


If you're trying to be a conscientious developer and making sure that you cleanup your resources - great! You are writing 'using()' blocks around all of your disposable items - great...except when the disposable item is a WCF Client/Proxy! The using() statement and the try/finally effectively have the same IL:

    // The IL for this block is effectively the same as
    // the IL for the second block below
    using (var win = new Form())
    {
    }

   
    // This is the second block
    Form f = null;
    try
    {
        f = new Form();
    }
    finally
    {
        if (f != null)
        {
            f.Dispose();
        }
    }

Here's the IL for the 'using()' block above compiled in Release mode:

     IL_0000:  newobj     instance void [System.Windows.Forms]System.Windows.Forms.Form::.ctor()
     IL_0005:  stloc.0
     .try
     {
         IL_0006:  leave.s    IL_0012
     }  // end .try
     finally
     {
         IL_0008:  ldloc.0
         IL_0009:  brfalse.s  IL_0011
         IL_000b:  ldloc.0
         IL_000c:  callvirt   instance void [mscorlib]System.IDisposable::Dispose()
         IL_0011:  endfinally
     }  // end handler


Here's the IL for the second block (try/finally) compiled in Release mode:

     IL_0012:  ldnull
     IL_0013:  stloc.1
     .try
     {
         IL_0014:  newobj     instance void [System.Windows.Forms]System.Windows.Forms.Form::.ctor()
         IL_0019:  stloc.1
         IL_001a:  leave.s    IL_0026
     }  // end .try
     finally
     {
         IL_001c:  ldloc.1
         IL_001d:  brfalse.s  IL_0025
         IL_001f:  ldloc.1
         IL_0020:  callvirt   instance void [System]System.ComponentModel.Component::Dispose()
         IL_0025:  endfinally
     }  // end handler

As you can see, the IL is nearly identical.

Well this is all fine and good but let's get back to the issue with WCF. The problem is that the implicit Dispose() at the end of a using block calls Close() on the WCF client/proxy, and Close() throws if the channel is in the Faulted state (or if closing times out). When that happens the channel is never closed cleanly, and the exception from Dispose() can mask the original exception thrown inside the block. Now, in general, any exception that occurs during disposal of an object is indeed undesirable. But, in the case of WCF, multiple channels remaining open could easily cause your entire service to fall on its face - not to mention what might eventually happen to your web server.

Here is an alternative solution that can be used:

    WCFProxy variableName = null;
    try
    {
        variableName = new WCFProxy();

        // TODO code here

        variableName.Close();
    }
    // if you need to catch other exceptions, do so here...
    catch (Exception)
    {
        if (variableName != null)
        {
            variableName.Abort();
        }
        throw;
    }
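
If you find yourself repeating that pattern, it can be wrapped in a helper. The following is only a sketch: IClosableChannel is a hypothetical stand-in for the Close()/Abort() members of System.ServiceModel.ICommunicationObject so that the example compiles without a WCF reference, and ProxyRunner and RecordingProxy are made-up names (the latter is just a fake proxy for trying the helper out):

```csharp
using System;

// Hypothetical stand-in for the Close/Abort members of
// System.ServiceModel.ICommunicationObject (assumption, not the WCF type).
public interface IClosableChannel
{
    void Close();
    void Abort();
}

public static class ProxyRunner
{
    // Runs 'work' against the proxy, then closes it; aborts it instead if
    // either the work or the Close() call throws, and rethrows the exception.
    public static TResult Use<TProxy, TResult>(TProxy proxy, Func<TProxy, TResult> work)
        where TProxy : class, IClosableChannel
    {
        if (proxy == null) throw new ArgumentNullException("proxy");
        if (work == null) throw new ArgumentNullException("work");

        try
        {
            TResult result = work(proxy);
            proxy.Close();
            return result;
        }
        catch (Exception)
        {
            proxy.Abort();
            throw;
        }
    }
}

// Fake proxy that only records which cleanup method was called.
public sealed class RecordingProxy : IClosableChannel
{
    public bool Closed { get; private set; }
    public bool Aborted { get; private set; }

    public void Close() { Closed = true; }
    public void Abort() { Aborted = true; }
}
```

With a real WCF client you would constrain TProxy to ICommunicationObject instead of the stand-in interface; the Close-on-success/Abort-on-failure flow stays the same.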

MSDN does have a brief on this issue which you can read here - http://msdn.microsoft.com/en-us/library/aa355056.aspx


January 26, 2012

Why you should comment your code...

Here are the most important reasons for including comments/documentation when writing code:
  1. I absolutely believe in documenting code using both XML doc comments on methods as well as inline comments when/where necessary. This facilitates documentation files being generated automatically and can be used to create .chm files. For all of us who are lazy or those that think it takes too much time, use GhostDoc. Thus, laziness is no excuse.
  2. What if you have to maintain code that was written without comments? Do you want to waste your time digging thru code? What if it's more than just a method or class? What if it's a library or framework? I personally have better things to do with my time.
  3. What about someone new to your team? What if you're the new guy? Is it easier to learn it with or without documentation/comments?
  4. If you are writing a framework, library or API that will be used by other teams/API consumers, how are they supposed to know what it does without documentation? Do you expect them to dig thru your code to figure it out? What if they don't have access to the code? Would you want to have to do this?
  5. What if you find some code without comments and the code looks wrong or inefficient? It's certainly possible that it was written that way for a reason. Only comments would help.
  6. In addition to adding comments, code itself should be self-documenting: use descriptive class, member, variable, parameter and method names.
When you write code, remember that you're not always going to be the one modifying, supporting, or consuming it...
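
As a small illustration of points 1 and 6, here is what XML doc comments plus one inline comment might look like on a method. RetryPolicy and GetDelay are invented purely for this example:

```csharp
using System;

public static class RetryPolicy
{
    /// <summary>
    /// Computes how long to wait before the next retry attempt.
    /// </summary>
    /// <param name="attempt">Zero-based number of the attempt that just failed.</param>
    /// <returns>A delay that doubles with each attempt, starting at 100 ms.</returns>
    /// <exception cref="ArgumentOutOfRangeException">
    /// Thrown when <paramref name="attempt"/> is negative.
    /// </exception>
    public static TimeSpan GetDelay(int attempt)
    {
        if (attempt < 0) throw new ArgumentOutOfRangeException("attempt");

        // Cap the exponent so the delay never exceeds 6.4 seconds; the inline
        // comment explains *why* the cap exists, which the code alone cannot.
        int exponent = Math.Min(attempt, 6);
        return TimeSpan.FromMilliseconds(100.0 * (1 << exponent));
    }
}
```

The XML comments are what tooling picks up to generate documentation files, while the inline comment records the reasoning a future maintainer would otherwise have to guess at.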

Peter Ritchie has a good post about what comments are NOT for - a pretty good read: http://msmvps.com/blogs/peterritchie/archive/2012/01/30/what-code-comments-are-not-for.aspx

