.NET knows a big list of collection-like types such as IEnumerable, IQueryable, IList, ICollection, Array, ISet, ImmutableArray, ReadOnlyCollection, IReadOnlyList, and many more.
This blog post gives you an overview of the most common collection types in .NET and when to use which.
Interfaces vs Implementation
In this article, as well as in the real world, you often see both a concrete implementation and its interface type (List<T> vs IList<T>). I will discuss this in more detail later, but a general word first. Interfaces are contracts and implementations are details. So as a rule of thumb: if you have a public-facing API (normally types or methods that are either public or protected), the general advice is to use the interface type rather than the implementation. Only use concrete implementations if you really have to. List<T> is not the same as IList<T>. At first sight they seem identical (mainly because List<T> implements IList<T>), but there are differences. Don't constrict your users artificially.
Interfaces define just an operational contract, which in the majority of cases is good enough. They abstract away certain details for you (which is a very good thing!).
IEnumerable
IEnumerable is an interface that defines a method for retrieving elements from a collection one at a time. It is often used as the return type of methods that return a sequence of elements. This interface allows a collection to be used with the foreach loop and other methods that expect a sequence of elements. It is also the base type for almost all LINQ operations (Count(), Where(), Select(), and friends).
The nature of IEnumerable is to tell the user that we can enumerate an object. It does that by moving to the next item one by one until we are at the end of our enumeration. An IEnumerable can be lazily evaluated, and sequences produced by iterator methods (yield return) or LINQ operators usually are.
Lazy evaluation is a technique for delaying the computation of a value until it is actually needed. In the context of IEnumerable, it means that the elements of the sequence are not computed until they are actually accessed by the code that is using the sequence. This can be useful for optimizing performance, particularly when working with large sequences, because it allows you to avoid computing elements of the sequence that you never end up using.
Here is a small example where we generate all prime numbers up to int.MaxValue. But if the user only wants 100 of them instead of all, we don't continue after the 100th prime number.
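var first100Primes = GetPrimes().Take(100);

IEnumerable<int> GetPrimes()
{
    int i = 2;
    while (i < int.MaxValue)
    {
        if (IsPrime(i))
        {
            yield return i;
        }
        i++;
    }
}

// Simple trial division, only meant for illustration
bool IsPrime(int n)
{
    for (int d = 2; (long)d * d <= n; d++)
    {
        if (n % d == 0) return false;
    }
    return n >= 2;
}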
When to use?
I will keep it very abstract and generic at the beginning and will explain it later on. I want to quote Vladimir Khorikov here:
Return the most specific type, accept the most generic type
If you just want to enumerate things in a foreach loop and you might even abort early, IEnumerable is your candidate of choice. We can also put it the other way around: If you don't need any of the other choices I will present later on, use IEnumerable.
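To make the quote a bit more concrete, here is a minimal sketch. KeepEven is a hypothetical helper, not something from a library: it accepts the most generic type that is sufficient (IEnumerable<int>) and hands back a concrete List<int>, so the caller gets Count and an indexer for free.

```csharp
using System;
using System.Collections.Generic;

// Accepts anything enumerable, returns a specific type.
// KeepEven is a made-up name used only for this illustration.
static List<int> KeepEven(IEnumerable<int> numbers)
{
    var result = new List<int>();
    foreach (var number in numbers)
    {
        if (number % 2 == 0)
        {
            result.Add(number);
        }
    }
    return result;
}

// Callers can pass an array, a set, a list, or a lazy sequence.
List<int> fromArray = KeepEven(new[] { 1, 2, 3, 4 });
List<int> fromSet = KeepEven(new HashSet<int> { 5, 6 });

Console.WriteLine(fromArray.Count); // 2
Console.WriteLine(fromSet[0]);      // 6
```

Whether the return type of a public API should be the concrete List<int> or an interface is exactly the trade-off discussed in the introduction.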
IQueryable
I will keep this section short, as I already covered it in greater detail in "IEnumerable vs IQueryable - What's the difference". IQueryable extends IEnumerable and is often used with objects or collections that are not held in memory. The most prominent example is Entity Framework, where the DbContext (or rather the DbSet) offers you an IQueryable object to gather your data from the underlying storage provider. The same exists for other query providers such as LINQ to SQL.
As the name suggests, IQueryable offers methods for expressing queries against a collection of elements. It behaves similarly to IEnumerable in the sense that it is also lazily evaluated. If you want to know the exact differences, please have a look at the linked article above.
When to use?
You want to use the IQueryable interface if you work with a data source that lives outside your process memory, for example a database. In contrast, you would use IEnumerable if your data source is in memory (RAM).
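The following sketch illustrates how an IQueryable describes a query instead of running it right away. It uses AsQueryable() on an in-memory array purely as a stand-in; a real provider such as Entity Framework would translate the expression tree into SQL rather than executing it in memory.

```csharp
using System;
using System.Linq;

int[] numbers = { 1, 2, 3, 4, 5 };

// AsQueryable wraps the array so that we can inspect the query.
// This is only a stand-in for a real provider like a DbSet.
IQueryable<int> query = numbers.AsQueryable().Where(n => n > 3);

// The query is stored as an expression tree (here a method call node).
Console.WriteLine(query.Expression.NodeType); // Call

// Nothing has been executed so far; ToList() runs the query.
var result = query.ToList();
Console.WriteLine(result.Count); // 2
```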
ICollection
Whereas IEnumerable and IQueryable can be lazily evaluated, ICollection is not. So now we are in the realm where your collection is already materialized in one way or another. ICollection adds a few functionalities to the enumeration: Add, Remove, and Clear (as well as some others, such as Count and Contains). The core idea is now that we can mutate the collection we have. That was not possible with the other two interfaces I showed earlier. ICollection also inherits from IEnumerable, so you can see that they build on top of each other. Whenever you have an ICollection, you can also use it as an IEnumerable. That is why LINQ works on basically all collections: almost all of them implement IEnumerable.
When to use?
Use ICollection when you need an already materialized object (for example as a result of a LINQ query) or you want to mutate the collection itself, that is, you want to add or remove certain entries.
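As a rough sketch of what "mutating through the interface" looks like: RemoveNegatives below is a hypothetical helper that only relies on ICollection<int>, so it works for a List as well as for a HashSet.

```csharp
using System;
using System.Collections.Generic;

// Works with any mutable collection: List<T>, HashSet<T>, LinkedList<T>, ...
// RemoveNegatives is a made-up name used only for this illustration.
static void RemoveNegatives(ICollection<int> values)
{
    // Copy first, because a collection must not change while it is enumerated.
    foreach (var value in new List<int>(values))
    {
        if (value < 0)
        {
            values.Remove(value);
        }
    }
}

ICollection<int> list = new List<int> { -1, 2, -3, 4 };
ICollection<int> set = new HashSet<int> { -5, 6 };

RemoveNegatives(list);
RemoveNegatives(set);

Console.WriteLine(list.Count); // 2
Console.WriteLine(set.Count);  // 1
```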
IList
IList inherits from ICollection, so whatever you can do with ICollection you can do with IList as well. So what is the difference then? What more does IList bring to the table? The simple answer is: it has an indexer. So we have some kind of well-defined and fixed order in our enumeration.
IList<int> list = new List<int> { 1, 2, 3, 4 };
Console.WriteLine(list[1]); // Prints 2
When to use?
So the use case here is clear. If you want to get an element via its index as well as add or remove items, IList might be your candidate.
IReadOnlyCollection and IReadOnlyList
This brings me to a special place: the read-only collection types. They are not really different from their siblings. As the name suggests, you can only read items, but you are not allowed to add or remove something from the collection. Doesn't IReadOnlyCollection sound like your regular array? Well, partially, but there are differences. First, arrays are very special (more on that later). But you can model things with IReadOnlyCollection or IReadOnlyList that you can't with an array. I discussed this in "ReadOnlyCollection is not an immutable collection". A ReadOnlyCollection is like a VIEW in SQL. If the original collection updates, so will your read-only one:
var numbers = new List<int> { 1, 2 };
var readOnlyNumbersViaExtension = numbers.AsReadOnly();
var readOnlyNumbers = new ReadOnlyCollection<int>(numbers);
numbers.Add(3);
// All of them will print 3
Console.WriteLine($"List count: {numbers.Count}");
Console.WriteLine($"ReadOnlyCollection via extension count: {readOnlyNumbersViaExtension.Count}");
Console.WriteLine($"ReadOnlyCollection via new count: {readOnlyNumbers.Count}");
When to use?
Almost every time you have a materialized collection whose state you don't want to mutate (Add, Remove, Clear), you want to consider these types. IReadOnlyCollection itself only adds Count on top of IEnumerable; helpers such as Contains are available through LINQ. If you need the indexer, take IReadOnlyList instead of IReadOnlyCollection.
ISet
A set is a collection of unique elements. ISet defines handy methods to interact with such objects. If you have an object of type ISet, you know that there are no duplicates inside (at least the implementation of that interface should guarantee that). The interface provides methods that are commonly known from set theory, like creating a union or an intersection of sets.
// Create two sets
HashSet<int> set1 = new HashSet<int> { 1, 2, 3 };
HashSet<int> set2 = new HashSet<int> { 2, 3, 4 };
// Get the union of the two sets
HashSet<int> union = new HashSet<int>(set1);
union.UnionWith(set2);
// The union set now contains 1, 2, 3, and 4
When to use?
You might use ISet in your code when you want to store a collection of items and ensure that there are no duplicates. Set operations like the one shown above (UnionWith) can outperform "regular" collections, as sets are specialized in those operations. On the negative side, creating a set is typically more expensive than creating your regular list.
IImmutableList and its implementations
Now we are in the realm of immutable objects. I will greatly simplify here and I will also put all the interfaces and implementations into one big bucket. Why? Because it is unlikely that you will use them very often and if so, you should invest time into the specifics of each collection depending on your concrete use case.
We saw earlier that read-only collections are just a wrapper around a collection that changes when the underlying collection changes. But sometimes you don't want this at all. For example, you enumerate through your IReadOnlyCollection while the underlying collection changes. Well, you get greeted by an InvalidOperationException. This might happen if you do multithreading.
An immutable collection, in contrast, is a new and completely disconnected collection that is based on the origin at the exact point it was created, and it will never update again. That is perfect for thread safety. There are still ways to add or remove items from an immutable collection, but this results in a new object rather than changing the original one, hence the name immutable.
The good thing is that those collections are specialized. For example, if you add an item to an ImmutableList, it doesn't allocate a completely new copy but rather shares part of the "old" one. That is possible because we know that the internal structure of the original immutable list can't change anymore.
When to use?
Use those collections and interfaces when you want true immutability. This is often handy when dealing with multiple threads at a time.
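A small sketch of the difference to the read-only wrappers shown earlier, using ImmutableList from System.Collections.Immutable: the snapshot stays disconnected from its source, and "adding" an item produces a new list.

```csharp
using System;
using System.Collections.Generic;
using System.Collections.Immutable;

var source = new List<int> { 1, 2, 3 };

// Snapshot: disconnected from 'source' from this point on.
ImmutableList<int> snapshot = source.ToImmutableList();
source.Add(4);

// "Adding" returns a new list and leaves the snapshot untouched.
ImmutableList<int> extended = snapshot.Add(99);

Console.WriteLine(source.Count);   // 4
Console.WriteLine(snapshot.Count); // 3
Console.WriteLine(extended.Count); // 4
```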
FrozenSet
A special form of that immutable structure is coming with .NET 8: the FrozenSet and the FrozenDictionary. I'll quote one member of the .NET team (@geeknoid), who commented on my post ("Frozen collections in .NET 8") regarding said topic:
ImmutableSet and ImmutableDictionary are designed to be immutable and to make it easy to create slightly modified copies of a given instance at a reasonable cost. So imagine you create an immutable set with a million entries in it. You cannot mutate this set since it is immutable. Now, you need a new set with 1 extra entry. You could recreate the set from the ground up, which would be very expensive. Instead, you can start from your original immutable set and apply a delta to create a new distinct set instance which under the covers shares most of the state from the original set. You use less memory this way, and it takes much less time to create the new instance.
The problem is that in order to allow this mode of operation, ImmutableDictionary and ImmutableSet are complex implementations which introduce substantial compromises in overall read performance as a trade-off for this ability to make cheap delta clones.
FrozenSet and FrozenDictionary do not provide the delta clone ability, they are optimized strictly for fast read performance. You pay more for creation, you pay more for making a clone with modifications, as a trade-off for getting faster steady state read performance.
When to use?
This behavior suggests that you use such a structure when you create a collection at the beginning of your application's lifetime and never update its contents. As those structures are optimized for lookups, they come in handy if you have to do that often. It is a trade-off between a bit more startup time and faster run time in your methods.
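A minimal sketch of that "build once, read often" pattern, assuming .NET 8 or later. The variable names and contents are made up for the example.

```csharp
using System;
using System.Collections.Frozen;
using System.Collections.Generic;

// Built once, for example at application startup (requires .NET 8 or later).
FrozenSet<string> allowedExtensions =
    new[] { ".png", ".jpg", ".gif" }.ToFrozenSet(StringComparer.OrdinalIgnoreCase);

FrozenDictionary<string, int> statusCodes = new Dictionary<string, int>
{
    ["OK"] = 200,
    ["NotFound"] = 404,
}.ToFrozenDictionary();

// From here on the collections are only read.
Console.WriteLine(allowedExtensions.Contains(".PNG")); // True
Console.WriteLine(statusCodes["NotFound"]);            // 404
```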
Array
I talked earlier about arrays, and they seem to overlap with IReadOnlyList, as both have a fixed number of elements and an indexer. Keep in mind, though, that an array is not read-only: you can't add or remove elements, but you can overwrite existing ones. From my personal experience, if in doubt between those two, take IReadOnlyList and friends. The API is friendlier in my opinion. An array is a fixed-size, contiguous block of memory. So it guarantees O(1) access time, but so does List<T> (now that we are in the implementation realm, that is a fair comparison).
When to use?
There are 3 primary use-cases in my opinion.
Performance. If you are on a very low level where every allocation and nanosecond counts, you might consider an array. They are faster for sure, but with the trade-off of being less usable than the other types.
Multidimensional arrays. If you need a multidimensional array, you often don't have any other choice. You could work with List<List<T>>, but that seems rather ugly.
int[][] jaggedArray =
{
    new int[] { 1, 2, 3, 4 },
    new int[] { 5, 6, 7 },
    new int[] { 8 },
    new int[] { 9 }
};
int[,] multiDimArray =
{
    { 1, 2, 3, 4 },
    { 5, 6, 7, 0 },
    { 8, 0, 0, 0 },
    { 9, 0, 0, 0 }
};
params keyword. If you want to have a variable number of arguments like string.Format, you have to use an array (at least before C# 13, which introduced params collections).
Show(2);
Show(2, 3);
Show(new[] { 1, 2, 3 });

void Show(params int[] val)
{
    for (int i = 0; i < val.Length; i++)
    {
        Console.WriteLine(val[i]);
    }
}
List
List is your allrounder when it comes down to API friendliness and ease of use. It has a lot of useful operations that don't exist on IList, for example BinarySearch or Sort. Furthermore, List gives you a guarantee that the indexer call (myList[0]) is O(1). That guarantee does not exist on IList. We will see later where this comes into play. The reason it can guarantee that is that the underlying storage is a normal one-dimensional array.
When to use?
For internal use cases (private, internal modifier) this is your 80% case. List has a nice balance between performance and usability. It is a safe bet as long as none of the other types I showcase here fit better.
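To illustrate the operations that only exist on the concrete type, here is a short sketch with Sort and BinarySearch. The commented-out lines show what would not compile against the interface.

```csharp
using System;
using System.Collections.Generic;

var numbers = new List<int> { 42, 7, 19, 3 };

// Sort and BinarySearch are defined on List<T>, not on IList<T>.
numbers.Sort();                       // 3, 7, 19, 42
int index = numbers.BinarySearch(19); // requires a sorted list

Console.WriteLine(index); // 2

// The same call does not compile against the interface:
// IList<int> asInterface = numbers;
// asInterface.Sort(); // compile error: IList<int> has no Sort method
```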
Collection
This one is a bit special. Collection is rarely used directly as a return value or as an argument. To understand why this type exists, let's have a look at List once more. If you want to derive from List and try to extend Add or Remove, you are out of luck. Those methods are not virtual, and therefore you can't easily extend the behavior. Sure, you could use the new keyword, but that falls short in circumstances like these:
public class MyList<T> : List<T>
{
    // Hides List<T>.Add instead of overriding it
    public new void Add(T item) { /* custom logic */ base.Add(item); }
}

List<int> list = new MyList<int>();
list.Add(1); // This will call List<T>.Add and not MyList<T>.Add
And exactly here Collection comes into play. It has virtual methods (InsertItem, SetItem, RemoveItem, and ClearItems) that you can override easily.
When to use?
If you want to have an already good enough implementation of a collection that you want to build on top of.
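A minimal sketch of such an extension: NonEmptyStringCollection is a hypothetical type that rejects empty entries by overriding the protected InsertItem method of Collection<T>. Because Add is routed through InsertItem, the check also runs when the object is used via its base type.

```csharp
using System;
using System.Collections.ObjectModel;

var names = new NonEmptyStringCollection();
names.Add("Alice"); // Add calls InsertItem internally

Collection<string> asBase = names;
try
{
    asBase.Add(""); // still runs the override, unlike the 'new' approach
}
catch (ArgumentException)
{
    Console.WriteLine("Rejected empty entry");
}

Console.WriteLine(names.Count); // 1

// Hypothetical collection for this illustration: rejects null or empty strings.
public class NonEmptyStringCollection : Collection<string>
{
    protected override void InsertItem(int index, string item)
    {
        if (string.IsNullOrEmpty(item))
        {
            throw new ArgumentException("Item must not be empty.", nameof(item));
        }
        base.InsertItem(index, item);
    }
}
```

A complete implementation would probably override SetItem as well, so that assignments via the indexer go through the same check.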
LinkedList
Even though the name suggests that it implements IList, it does not. The LinkedList does not offer you an indexer like List does. The reason is that in a LinkedList accessing a member by index isn't O(1). You have to go from node to node until you have found your index. So if you see yourself using the indexer often, LinkedList is not a viable candidate. The strengths of LinkedList are manifold:
- As it is not a contiguous block of memory, removing an item is fairly cheap (you don't have to move the elements after the position where you deleted something). The same applies to insertions at an arbitrary position.
- If a List is very big, its backing array (85,000 bytes or more) goes onto the large object heap. A LinkedList wouldn't, because it consists of many small nodes instead of one large array.
- They can help with fragmentation.
When to use?
See the advantages above. If they outweigh what your everyday List offers, then LinkedList can be a candidate. As you see, it is a bit special.
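The cheap insert and remove operations are easiest to see when you hold on to the nodes. A short sketch:

```csharp
using System;
using System.Collections.Generic;

var list = new LinkedList<string>();
LinkedListNode<string> first = list.AddLast("first");
LinkedListNode<string> third = list.AddLast("third");

// Inserting next to a known node is O(1); no elements are shifted.
list.AddAfter(first, "second");

// Removing via a node reference is O(1) as well.
list.Remove(third);

Console.WriteLine(string.Join(", ", list)); // first, second
```

Note that finding a node in the first place is still a linear search, so the advantage mostly shows when you already have the node at hand.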
ObservableCollection
ObservableCollections have one extra purpose: you can observe them, as the name suggests. They offer events that get triggered when someone adds or removes items or completely clears the collection.
var numbers = new ObservableCollection<int>() { 1, 2, 3, 4, 5 };
numbers.CollectionChanged += (sender, e) => Console.WriteLine("Something happened");
numbers.Add(2); // Will trigger the CollectionChanged event
When to use?
The use case is pretty clear I guess. You want to observe and react to changes in the collection. If you ever used WPF you know what I am talking about.
HashSet
Since HashSet only has unique elements and is backed by a hash table, its internal structure is optimized for fast lookups. You can use foreach over a HashSet, but it rarely makes sense to rely on it: a set is defined by having no order, and using foreach imposes order to some extent.
When to use?
As said above: if you have an internal API that greatly benefits from that data structure, you can use it. To some extent, the same applies to ISet.
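A tiny sketch of the uniqueness guarantee and the lookup: Add reports via its return value whether the element was actually new.

```csharp
using System;
using System.Collections.Generic;

var seen = new HashSet<string>();

// Add returns false when the element is already present.
bool addedFirst = seen.Add("apple"); // true
bool addedAgain = seen.Add("apple"); // false

Console.WriteLine(seen.Count);             // 1
Console.WriteLine(seen.Contains("apple")); // True
```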
SortedSet
A SortedSet is also an ISet, with the difference of having a specified order. It kind of behaves like a List without duplicates and without the ability to access a random index via mySet[1]. But it provides members like Min and Max that run in O(log n).
When to use?
Never ever came across that in my career 😄.
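In case you do come across it, here is roughly what working with a SortedSet looks like. The values are arbitrary.

```csharp
using System;
using System.Collections.Generic;

// The duplicate 7 is silently ignored; elements are kept in sorted order.
var scores = new SortedSet<int> { 42, 7, 19, 7 };

Console.WriteLine(string.Join(", ", scores)); // 7, 19, 42
Console.WriteLine(scores.Min); // 7
Console.WriteLine(scores.Max); // 42

// GetViewBetween returns the elements inside an inclusive range.
SortedSet<int> middle = scores.GetViewBetween(10, 40);
Console.WriteLine(string.Join(", ", middle)); // 19
```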
ConcurrentBag
The ConcurrentBag, as well as the other concurrent types, is used in multi-threaded scenarios where multiple threads might read from or write to the collection. A bag is a collection that can have duplicated items (in contrast to a set, for example), but the order is not defined. You might wonder why there is no ConcurrentList. The answer is simple: a list has an order. What is the order if two or more threads simultaneously add an item to that list? Therefore we have a bag.
When to use?
You are in a scenario where multiple threads can read or write to the collection and you need a thread-safe way to handle that.
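A minimal sketch of such a scenario: Parallel.For lets several threads add items at the same time, and the bag takes care of the synchronization.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

var bag = new ConcurrentBag<int>();

// Many threads add items at the same time without any explicit locking.
Parallel.For(0, 1000, i => bag.Add(i));

Console.WriteLine(bag.Count); // 1000

// TryTake removes some item; which one is not specified.
bool taken = bag.TryTake(out int item);
Console.WriteLine(bag.Count); // 999
```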
Stack
A stack gives you an easy way of modeling "Last in First out" behavior. So the last element you put into the Stack is the first one coming out.
var stack = new Stack<int>();
stack.Push(2);
stack.Push(3);
Console.WriteLine(stack.Pop()); // Prints 3
When to use?
Every time you need this "Last in First out" behavior.
Queue
Like a real queue in the real world, the first thing put into the queue is also the first thing coming out again.
var queue = new Queue<int>();
queue.Enqueue(2);
queue.Enqueue(3);
Console.WriteLine(queue.Dequeue()); // Prints 2
When to use?
Every time you need this "First in First out" behavior.
Span and ReadOnlySpan
Now we are in a very special place. Span and ReadOnlySpan are not really collections (like an array). They just represent a contiguous block of memory, regardless of whether that memory lives on the managed heap, on the stack, or in unmanaged memory. They offer functions so that you can enumerate through them. I have an article that goes deeper into that topic: "Create a low allocation and faster StringBuilder - Span in Action".
Here is also a big difference between, let's say, List and IList. I said earlier that List stores its items in a normal one-dimensional array, which is a contiguous block of memory, so you can easily create an object that spans that memory block. .NET has a helper method for that:
CollectionsMarshal.AsSpan(myList);
CollectionsMarshal.AsSpan() only takes a List as its argument and not an IList.
When to use?
Every time you are in a high-performance scenario (paired with the least amount of allocations possible), Span and ReadOnlySpan are your friends.
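A short sketch of what "a view over memory" means in practice. Slicing creates no copy, so writes through the span show up in the original array or list. The sample values are made up.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.InteropServices;

int[] numbers = { 1, 2, 3, 4, 5 };

// A span is a view over existing memory; slicing does not copy anything.
Span<int> middle = numbers.AsSpan(1, 3); // 2, 3, 4
middle[0] = 20;
Console.WriteLine(numbers[1]); // 20

// The same works for the array behind a List<T>.
// Don't add or remove items while the span is in use.
var list = new List<int> { 1, 2, 3 };
Span<int> listSpan = CollectionsMarshal.AsSpan(list);
listSpan[2] = 30;
Console.WriteLine(list[2]); // 30

// ReadOnlySpan<char> lets you look at a part of a string without allocating a substring.
ReadOnlySpan<char> yearPart = "2024-01-15".AsSpan(0, 4);
int year = int.Parse(yearPart);
Console.WriteLine(year); // 2024
```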
Conclusion
There are lots and lots of interfaces and implementations in .NET. My general advice is to take the appropriate interface for public-facing APIs for sure. For internal and private APIs it is a bit up to you. In the end, you want to make your intent clear. Try to use interfaces as much as possible and only rely on concrete implementations if you have to.