# Internet Wrongs, Scalability sins + other dev links

Stop Doing Internet Wrong <Scott Hanselman> – Agreed; I don’t want to install your app, please give me the web version 🙂
5 More Things Deadly to Scalability <Sean Hull>
Microsoft Office arrives for iOS – looks to be only the US store at this stage
jQuery.each vs for loop – Don’t you just love micro-optimization comparisons
3 Tips for New Scrum Masters <Mike Cohn>

# GetHashCode – what, how, why

If you don’t already know, GetHashCode is used by Hashtable/Dictionary to obtain the hash of a key for lookups and comparisons. Writing a good hash algorithm is something of a black art, because ideally you’re trying to generate a distinct number for every value stored in a given hash table.

When items are added to the hash table, they are assigned to a “bucket”. The hash code modulo the number of buckets gives the bucket to assign the new item to. When different keys resolve to the same bucket, this is known as a collision. The goal of a good hash algorithm is to distribute items evenly across the buckets (with one value per bucket a lookup is O(1); with every value in a single bucket it degrades to O(n)).
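To make the bucket idea concrete, here is a minimal sketch. It assumes a simplified table that picks a bucket with a plain modulo; the real Dictionary implementation differs in its details, and the names BucketDemo and BucketIndex are hypothetical.

```csharp
using System;
using System.Collections.Generic;

public static class BucketDemo
{
    // Hypothetical helper: map a hash code to a bucket index.
    // The sign bit is cleared first because GetHashCode may return a negative value.
    public static int BucketIndex(int hashCode, int bucketCount)
    {
        return (hashCode & 0x7FFFFFFF) % bucketCount;
    }

    public static void Main()
    {
        const int bucketCount = 7;
        var buckets = new List<int>[bucketCount];
        for (int i = 0; i < bucketCount; i++)
            buckets[i] = new List<int>();

        // For int, GetHashCode returns the value itself, so the placement is easy to follow.
        foreach (int key in new[] { 3, 10, 17, 4, 5 })
            buckets[BucketIndex(key.GetHashCode(), bucketCount)].Add(key);

        for (int i = 0; i < bucketCount; i++)
            Console.WriteLine("bucket {0}: {1}", i, string.Join(", ", buckets[i]));
    }
}
```

Here 3, 10 and 17 all land in bucket 3, so a lookup for 17 has to walk past two other entries, while 4 and 5 each get a bucket to themselves.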

The following, from Jon Skeet’s answer on Stack Overflow, is a generic example of a GetHashCode override that can be applied in many cases.

```
public override int GetHashCode()
{
    unchecked // Overflow is fine, just wrap
    {
        int hash = 17;
        // Suitable nullity checks etc, of course 🙂
        hash = hash * 23 + field1.GetHashCode();
        hash = hash * 23 + field2.GetHashCode();
        hash = hash * 23 + field3.GetHashCode();
        return hash;
    }
}
```

31 is another prime number that is commonly used in the algorithm above. So why prime numbers?

Primes are unique numbers. They are unique in that, the product of a prime with any other number has the best chance of being unique (not as unique as the prime itself of-course) due to the fact that a prime is used to compose it

<Why do hash functions use prime numbers – Computing life>

In cases where a more complex algorithm is required, you can have a look at Google’s sparsehash or Eternally Confuzzled – The Art of Hashing.

# Comparing collections for value equality

First off, the object that will be used in your collection should implement IEquatable<T>, which has a single method:

1) Equals – for the comparison. Collections will use this function for determining value equality

```
public bool Equals(BasicItem other)
{
    if (ReferenceEquals(null, other)) return false;
    if (ReferenceEquals(this, other)) return true;
    return other.XCo == XCo && other.YCo == YCo && Equals(other.Description, Description);
}
```

From MSDN we also have the following:

If you implement IEquatable<T>, you should also override the base class implementations of Object.Equals(Object) and GetHashCode so that their behavior is consistent with that of the IEquatable<T>.Equals method. If you do override Object.Equals(Object), your overridden implementation is also called in calls to the static Equals(System.Object, System.Object) method on your class. This ensures that all invocations of the Equals method return consistent results.

```
public override int GetHashCode()
{
    unchecked
    {
        int result = XCo;
        result = (result*397) ^ YCo;
        result = (result*397) ^ (Description != null ? Description.GetHashCode() : 0);
        return result;
    }
}
```
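Putting the pieces together, the following is a minimal sketch of how the whole class might look, including the Object.Equals(Object) override that the MSDN note asks for. The property types, the constructor and the sealed modifier are assumptions made for illustration; only the Equals and GetHashCode bodies come from the snippets above.

```csharp
using System;
using System.Collections.Generic;

public sealed class BasicItem : IEquatable<BasicItem>
{
    // Assumed shape: two int coordinates and a nullable description.
    public int XCo { get; private set; }
    public int YCo { get; private set; }
    public string Description { get; private set; }

    public BasicItem(int xCo, int yCo, string description)
    {
        XCo = xCo;
        YCo = yCo;
        Description = description;
    }

    public bool Equals(BasicItem other)
    {
        if (ReferenceEquals(null, other)) return false;
        if (ReferenceEquals(this, other)) return true;
        return other.XCo == XCo && other.YCo == YCo && Equals(other.Description, Description);
    }

    // Route the untyped overload through the typed one so both always agree.
    public override bool Equals(object obj)
    {
        return Equals(obj as BasicItem);
    }

    public override int GetHashCode()
    {
        unchecked
        {
            int result = XCo;
            result = (result * 397) ^ YCo;
            result = (result * 397) ^ (Description != null ? Description.GetHashCode() : 0);
            return result;
        }
    }
}

public static class BasicItemDemo
{
    public static void Main()
    {
        var a = new BasicItem(1, 2, "corner");
        var b = new BasicItem(1, 2, "corner");
        var set = new HashSet<BasicItem> { a };

        Console.WriteLine(a.Equals(b));           // True: same values
        Console.WriteLine(set.Contains(b));       // True: equal hash code, then Equals
        Console.WriteLine(ReferenceEquals(a, b)); // False: two separate instances
    }
}
```

Sealing the class sidesteps the base/derived comparison question; if the type has to be inheritable, a GetType check in Equals is one way to keep the comparison symmetric.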

The guideline for the ==/!= operators is to overload them for value types and immutable reference types. The usage of these operators can be confusing, so it’s best to be as explicit as possible and use object.ReferenceEquals when you want a reference comparison, rather than relying on the == operator. Be sure to take into account that a base class instance and a derived class instance can compare as equal, so depending on your requirements you might need to check the types of both using GetType. But I digress, so back to comparing the contents of a collection.

Once you have these foundations in the item’s class you can start performing a comparison of the contents. Here’s one version courtesy of Stack Overflow:

```
using System.Collections.Generic;
using System.Linq;

public class CollectionComparer<T> : IEqualityComparer<IEnumerable<T>>
{
    public bool Equals(IEnumerable<T> first, IEnumerable<T> second)
    {
        // One null and one non-null sequence are never equal
        if ((first == null) != (second == null))
            return false;

        if (!object.ReferenceEquals(first, second) && (first != null))
        {
            if (first.Count() != second.Count())
                return false;

            if ((first.Count() != 0) && HaveMismatchedElement(first, second))
                return false;
        }

        return true;
    }

    private static bool HaveMismatchedElement(IEnumerable<T> first,
        IEnumerable<T> second)
    {
        int firstCount;
        int secondCount;

        // The out parameters receive the number of null elements in each sequence
        var firstElementCounts = GetElementCounts(first, out firstCount);
        var secondElementCounts = GetElementCounts(second, out secondCount);

        if (firstCount != secondCount)
            return true;

        // Every distinct element must occur the same number of times in both
        foreach (var kvp in firstElementCounts)
        {
            firstCount = kvp.Value;
            secondElementCounts.TryGetValue(kvp.Key, out secondCount);

            if (firstCount != secondCount)
                return true;
        }

        return false;
    }

    private static Dictionary<T, int> GetElementCounts(IEnumerable<T> enumerable,
        out int nullCount)
    {
        var dictionary = new Dictionary<T, int>();
        nullCount = 0;

        foreach (T element in enumerable)
        {
            if (element == null)
            {
                // null cannot be a dictionary key, so count it separately
                nullCount++;
            }
            else
            {
                int num;
                dictionary.TryGetValue(element, out num);
                num++;
                dictionary[element] = num;
            }
        }

        return dictionary;
    }

    public int GetHashCode(IEnumerable<T> enumerable)
    {
        unchecked // Overflow is fine, just wrap
        {
            int hash = 17;

            // Sort so the hash ignores element order; this requires T to be comparable
            foreach (T val in enumerable.OrderBy(x => x))
                hash = hash * 23 + (val == null ? 0 : val.GetHashCode());

            return hash;
        }
    }
}
```

Simply call Equals on the comparer. It sure beats iterating over multiple lists 😉

# Managed leak from event subscription? – Use Rx and WeakReference

The problem is well known in .NET development (have a look at Dustin Campbell’s blog on the problem): a form subscribes to some event, the form is closed, but it never gets garbage collected because the publisher still holds a strong reference to it through the event’s delegate. The obvious solution is to explicitly unsubscribe from the event in Dispose. Another, more automated approach is to use the Reactive Extensions (Rx), as shown on Samuel Jack’s blog.

If you just want to jump straight into debugging an example, you can simply create a new console project, add Rx via NuGet and replace the main block with the code from Samuel’s article. The key things to take away are (and you can break to watch it happen in real time):

1) The target is held as a WeakReference – if it’s still alive when the event occurs it’s passed to the subscriber, if not the subscription is removed.

2) The handler passed in to SubscribeWeakly must call instance methods via the reference to the target, rather than using the implicit this, i.e. target.HandleEvent(item) not HandleEvent(item) – otherwise the handler captures the class instance and you’ll have a strong reference to the target.

3) Wrap your events into an IObservable using Observable.FromEventPattern. More details on bridging .NET events to observable sequences can be found here.
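The first two points can be sketched without pulling in Rx at all, using only the IObservable/IObserver interfaces from the base class library. This is not Samuel’s code: SimpleSubject, WeakSubscription, Listener and this SubscribeWeakly signature are hypothetical stand-ins written for illustration, and Observable.FromEventPattern is left out because it needs the Rx package.

```csharp
using System;
using System.Collections.Generic;

// Tiny stand-in for an Rx subject so the sketch compiles without the Rx package.
public sealed class SimpleSubject<T> : IObservable<T>
{
    private readonly List<IObserver<T>> observers = new List<IObserver<T>>();

    public int ObserverCount { get { return observers.Count; } }

    public IDisposable Subscribe(IObserver<T> observer)
    {
        observers.Add(observer);
        return new Subscription(() => observers.Remove(observer));
    }

    public void Publish(T value)
    {
        // Iterate over a copy: an observer may unsubscribe while being notified.
        foreach (var observer in observers.ToArray())
            observer.OnNext(value);
    }

    private sealed class Subscription : IDisposable
    {
        private readonly Action onDispose;
        public Subscription(Action onDispose) { this.onDispose = onDispose; }
        public void Dispose() { onDispose(); }
    }
}

public static class WeakSubscription
{
    public static IDisposable SubscribeWeakly<T, TTarget>(
        this IObservable<T> source, TTarget target, Action<TTarget, T> onNext)
        where TTarget : class
    {
        var observer = new WeakObserver<T, TTarget>(target, onNext);
        observer.Subscription = source.Subscribe(observer);
        return observer.Subscription;
    }

    private sealed class WeakObserver<T, TTarget> : IObserver<T> where TTarget : class
    {
        private readonly WeakReference weakTarget; // does not keep the target alive
        private readonly Action<TTarget, T> onNext;
        public IDisposable Subscription;

        public WeakObserver(TTarget target, Action<TTarget, T> onNext)
        {
            weakTarget = new WeakReference(target);
            this.onNext = onNext;
        }

        public void OnNext(T value)
        {
            var target = weakTarget.Target as TTarget;
            if (target != null)
                onNext(target, value);   // target still alive: deliver the item
            else
                Subscription.Dispose();  // target collected: remove the subscription
        }

        public void OnError(Exception error) { }
        public void OnCompleted() { }
    }
}

public sealed class Listener
{
    public int Received;
    public void HandleEvent(int item) { Received += item; }
}

public static class WeakDemo
{
    public static void Main()
    {
        var subject = new SimpleSubject<int>();
        var listener = new Listener();

        // Go through the 'target' parameter; capturing 'listener' or 'this' in the
        // lambda would reintroduce the strong reference.
        subject.SubscribeWeakly(listener, (target, item) => target.HandleEvent(item));

        subject.Publish(5);
        Console.WriteLine(listener.Received); // prints 5
    }
}
```

Because the lambda only touches its own parameters, the delegate holds no reference to the listener; the only path from the subject to the listener is the WeakReference.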

# Why consider using GC.Collect?

I can hear the cringes from my .NET colleagues already. Are you serious? Do you think you’re smarter than the GC? The reason you avoid doing so is that it affects how much space is allocated to each generation in the heap. You’re essentially reducing the GC’s ability to determine when and how often managed memory is cleaned up.

Under normal operations the GC makes predictions on the size of each generation based on past allocations. You probably know where I’m going with this: if you suddenly attempt to dispose of a whole load of objects that are sitting in generation 2 and allocate a whole load of unmanaged resources (such as font and bitmap handles), you might find handles excessively spiking.

Check the comments here as well if you need more convincing: <When to call GC.Collect – Rico Mariani>

“To optimize performance, GDI+ retains some resources throughout the lifetime of the invoking process. These resources are not freed by the call to dispose. The only supported way to free these resources is to end the process”.

<When Moses was in GDI+ land, “Let my font go”  – Michael Kaplan>

GC.Collect should be used as a last resort, but you shouldn’t take it out of your option list.
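For that last-resort case, a minimal sketch of a forced collection might look like the following. The ForcedCollection class and CollectNow helper are hypothetical names, and the byte arrays merely stand in for whatever large batch of objects your application has just released.

```csharp
using System;

public static class ForcedCollection
{
    // Hypothetical helper: force a full, blocking collection and report
    // roughly how many managed bytes were reclaimed.
    public static long CollectNow()
    {
        long before = GC.GetTotalMemory(false);

        GC.Collect();                  // collect all generations
        GC.WaitForPendingFinalizers(); // let finalizers release unmanaged handles
        GC.Collect();                  // reclaim objects that were only waiting on finalization

        return before - GC.GetTotalMemory(false);
    }

    public static void Main()
    {
        // Stand-in for a large batch of objects that is no longer needed.
        var buffers = new byte[100][];
        for (int i = 0; i < buffers.Length; i++)
            buffers[i] = new byte[1024 * 1024];

        buffers = null; // drop the only reference
        Console.WriteLine("Reclaimed roughly {0:N0} bytes", CollectNow());
    }
}
```

The second GC.Collect matters when the released objects have finalizers: the first pass only queues them for finalization, and their memory can be reclaimed once the finalizers have run.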