Mastering Strongly-Typed Collections in .NET

Why Generic Collections Matter

Generic collections give you type safety without sacrificing performance. Before generics existed, developers used ArrayList and Hashtable, which stored everything as objects and required constant casting. You'd spend hours tracking down bugs caused by putting the wrong type in a collection.
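The difference is easy to see side by side. This minimal sketch contrasts the old ArrayList pattern with its generic replacement (the commented-out lines show the failure modes):

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Pre-generics: everything is stored as object
var legacy = new ArrayList();
legacy.Add(42);             // value type gets boxed into an object
legacy.Add("oops");         // nothing stops the wrong type going in

int first = (int)legacy[0]; // cast required on every read
// int second = (int)legacy[1]; // compiles fine, throws InvalidCastException at runtime

// Generics: the compiler enforces the element type
var numbers = new List<int>();
numbers.Add(42);            // stored unboxed, no cast needed to read it back
// numbers.Add("oops");     // compile-time error: cannot convert string to int

Console.WriteLine(first + numbers[0]); // 84
```

The wrong-type bug that once surfaced as a runtime InvalidCastException now fails at compile time, and the value type is never boxed.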

Modern .NET provides strongly-typed collections through generics. List<T>, Dictionary<TKey, TValue>, and HashSet<T> form the foundation of most applications. These collections catch type errors at compile time and eliminate boxing overhead for value types.

Understanding when to use each collection type directly impacts your application's performance and maintainability. You'll learn the characteristics of each collection, how to choose the right one, and how to optimize operations for production workloads.

Working with List<T>

List<T> is the most commonly used collection in .NET. It provides a dynamically-sized array that maintains insertion order and allows duplicate elements. You get constant-time access by index and amortized constant-time additions at the end of the list.

The internal array grows automatically when you add elements beyond its capacity. By default, it doubles in size when full. If you know the approximate size upfront, setting the initial capacity prevents unnecessary reallocations.
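You can watch the growth happen through the Capacity property. This small sketch prints each reallocation (the exact growth sequence is a runtime implementation detail, so treat the printed numbers as illustrative):

```csharp
using System;
using System.Collections.Generic;

var numbers = new List<int>();
int lastCapacity = -1;

for (int i = 0; i < 100; i++)
{
    if (numbers.Capacity != lastCapacity)
    {
        // Prints whenever the capacity changed since the last check,
        // i.e. whenever the internal array was reallocated
        Console.WriteLine($"Count: {numbers.Count,3}  Capacity: {numbers.Capacity}");
        lastCapacity = numbers.Capacity;
    }
    numbers.Add(i);
}

// Pre-sizing avoids all of those reallocations
var presized = new List<int>(capacity: 100);
Console.WriteLine($"Pre-sized capacity: {presized.Capacity}"); // 100
```

Each reallocation copies every existing element to a new array, which is why pre-sizing matters for large lists built in a loop.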

Here's how you work with List<T> for typical scenarios like adding items, searching, and removing elements:

Program.cs - List<T> operations
using System;
using System.Collections.Generic;
using System.Linq;

var products = new List<Product>(capacity: 100);

// Adding items
products.Add(new Product { Id = 1, Name = "Laptop", Price = 999.99m });
products.Add(new Product { Id = 2, Name = "Mouse", Price = 29.99m });
products.Add(new Product { Id = 3, Name = "Keyboard", Price = 79.99m });

// Adding multiple items at once
products.AddRange(new[]
{
    new Product { Id = 4, Name = "Monitor", Price = 299.99m },
    new Product { Id = 5, Name = "Webcam", Price = 89.99m }
});

// Accessing by index (O(1) operation)
var firstProduct = products[0];
Console.WriteLine($"First: {firstProduct.Name}");

// Searching (O(n) operation)
var laptop = products.Find(p => p.Name == "Laptop");
var expensiveProducts = products.FindAll(p => p.Price > 100);

// Checking existence
bool hasMonitor = products.Any(p => p.Name == "Monitor");
bool allAffordable = products.All(p => p.Price < 1000);

// Removing items
products.Remove(laptop); // Removes first occurrence
products.RemoveAll(p => p.Price < 50); // Removes all matching

// Sorting
products.Sort((a, b) => a.Price.CompareTo(b.Price));

Console.WriteLine($"Total products: {products.Count}");

class Product
{
    public int Id { get; set; }
    public string Name { get; set; }
    public decimal Price { get; set; }
}

List<T> excels when you need to maintain insertion order, access items by position, or iterate through all elements. The Find and FindAll methods provide convenient searching, though they perform linear scans. For large collections with frequent lookups, consider Dictionary<TKey, TValue> instead.

Leveraging Dictionary<TKey, TValue> for Fast Lookups

Dictionary<TKey, TValue> is your go-to collection when you need fast lookups by key. It uses hash tables internally, providing O(1) average-case performance for additions, lookups, and deletions. This makes it dramatically faster than List<T> for finding specific items.

The key must be unique and should have a good GetHashCode implementation. Value types and strings work great as keys. For custom types, you'll need to override GetHashCode and Equals properly to ensure correct behavior.
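For a custom key type, the simplest route in modern C# is a record, which generates value-based Equals and GetHashCode for you. The sketch below uses a hypothetical CacheKey record as a composite dictionary key:

```csharp
using System;
using System.Collections.Generic;

var cache = new Dictionary<CacheKey, string>();

cache[new CacheKey("us-east", 42)] = "Alice";

// A structurally equal key finds the same entry, even though
// it's a different object instance than the one used to add it
bool found = cache.TryGetValue(new CacheKey("us-east", 42), out var name);
Console.WriteLine($"{found}: {name}"); // True: Alice

// Records generate value-based Equals and GetHashCode automatically,
// which is exactly the contract Dictionary needs from a key type
record CacheKey(string Region, int UserId);
```

With a plain class instead of a record, the second lookup would miss, because the default reference-based Equals and GetHashCode treat each instance as a distinct key.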

Dictionary is perfect for caching, indexing objects by ID, or maintaining lookup tables. Here's how you use it effectively:

Program.cs - Dictionary<TKey, TValue> operations
using System;
using System.Collections.Generic;

// Creating with initial capacity
var userCache = new Dictionary<int, User>(capacity: 1000);

// Adding items
userCache.Add(1, new User { Id = 1, Name = "Alice", Email = "alice@example.com" });
userCache.Add(2, new User { Id = 2, Name = "Bob", Email = "bob@example.com" });

// Using collection initializer
var statusCodes = new Dictionary<int, string>
{
    { 200, "OK" },
    { 404, "Not Found" },
    { 500, "Internal Server Error" }
};

// Safe addition (doesn't throw if key exists)
userCache.TryAdd(1, new User { Id = 1, Name = "Alice2" }); // Returns false

// Fast lookup (O(1) average case)
if (userCache.TryGetValue(1, out var user))
{
    Console.WriteLine($"Found: {user.Name}");
}

// Direct access (throws KeyNotFoundException if missing)
var bob = userCache[2];

// Updating values
userCache[1] = new User { Id = 1, Name = "Alice Updated", Email = "alice@example.com" };

// Checking for keys
if (userCache.ContainsKey(3))
{
    Console.WriteLine("User 3 exists");
}

// Iterating
foreach (var kvp in userCache)
{
    Console.WriteLine($"ID: {kvp.Key}, Name: {kvp.Value.Name}");
}

// Getting all keys or values
var allIds = userCache.Keys;
var allUsers = userCache.Values;

// Removing items
userCache.Remove(2);

class User
{
    public int Id { get; set; }
    public string Name { get; set; }
    public string Email { get; set; }
}

Always use TryGetValue instead of checking ContainsKey and then reading the indexer. TryGetValue performs a single hash lookup instead of two, halving the cost of that operation. The indexer also throws KeyNotFoundException for missing keys, so TryGetValue provides a safer access pattern as well.

Using HashSet<T> for Unique Collections

HashSet<T> stores unique elements with no defined order. It's perfect when you need to eliminate duplicates, perform set operations, or check membership quickly. Like Dictionary, it uses hash tables for O(1) lookups.

HashSet really shines for deduplication scenarios and when you need mathematical set operations like union, intersection, and difference. If you're converting a List<T> to eliminate duplicates, HashSet does this efficiently.

Here's how HashSet solves real-world problems with set operations and fast membership testing:

Program.cs - HashSet<T> operations
using System;
using System.Collections.Generic;
using System.Linq;

// Creating and adding unique items
var visitedPages = new HashSet<string>();
visitedPages.Add("/home");
visitedPages.Add("/products");
visitedPages.Add("/home"); // Duplicate, not added
visitedPages.Add("/about");

Console.WriteLine($"Unique pages visited: {visitedPages.Count}"); // 3

// Fast membership testing (O(1))
bool visitedHome = visitedPages.Contains("/home");

// Set operations
var categoryA = new HashSet<int> { 1, 2, 3, 4, 5 };
var categoryB = new HashSet<int> { 4, 5, 6, 7, 8 };

// Union: all items from both sets
var union = new HashSet<int>(categoryA);
union.UnionWith(categoryB);
Console.WriteLine($"Union: {string.Join(", ", union)}"); // 1, 2, 3, 4, 5, 6, 7, 8

// Intersection: items in both sets
var intersection = new HashSet<int>(categoryA);
intersection.IntersectWith(categoryB);
Console.WriteLine($"Intersection: {string.Join(", ", intersection)}"); // 4, 5

// Difference: items in first set but not second
var difference = new HashSet<int>(categoryA);
difference.ExceptWith(categoryB);
Console.WriteLine($"Difference: {string.Join(", ", difference)}"); // 1, 2, 3

// Symmetric difference: items in either set but not both
var symmetricDiff = new HashSet<int>(categoryA);
symmetricDiff.SymmetricExceptWith(categoryB);
Console.WriteLine($"Symmetric: {string.Join(", ", symmetricDiff)}"); // 1, 2, 3, 6, 7, 8

// Checking relationships
bool isSubset = categoryA.IsSubsetOf(categoryB);
bool isSuperset = categoryA.IsSupersetOf(new[] { 1, 2 });
bool overlaps = categoryA.Overlaps(categoryB);

// Removing duplicates from List
var numbersWithDupes = new List<int> { 1, 2, 2, 3, 3, 3, 4, 5, 5 };
var uniqueNumbers = new HashSet<int>(numbersWithDupes);
Console.WriteLine($"Unique count: {uniqueNumbers.Count}"); // 5

HashSet's set operations perform efficiently even on large collections. When you need to find common elements between two collections or eliminate duplicates, HashSet typically outperforms LINQ-based approaches. The Contains method on HashSet is significantly faster than List.Contains for large datasets.

Combining LINQ with Collections

LINQ provides a powerful query syntax for working with collections. It makes complex data transformations readable and maintainable. Most LINQ operations use deferred execution, meaning they don't run until you iterate the results.

LINQ works with any IEnumerable<T>, so all standard collections support it. You can filter, project, group, join, and aggregate data with concise, declarative code. While LINQ adds some overhead compared to loops, the readability benefits usually justify the cost.

Here are practical LINQ patterns you'll use frequently with collections:

Program.cs - LINQ with collections
using System;
using System.Collections.Generic;
using System.Linq;

var orders = new List<Order>
{
    new Order { Id = 1, CustomerId = 100, Amount = 250.00m, Date = new DateTime(2025, 10, 1) },
    new Order { Id = 2, CustomerId = 101, Amount = 175.50m, Date = new DateTime(2025, 10, 5) },
    new Order { Id = 3, CustomerId = 100, Amount = 300.00m, Date = new DateTime(2025, 10, 10) },
    new Order { Id = 4, CustomerId = 102, Amount = 425.75m, Date = new DateTime(2025, 10, 15) },
    new Order { Id = 5, CustomerId = 101, Amount = 199.99m, Date = new DateTime(2025, 10, 20) }
};

// Filtering
var largeOrders = orders.Where(o => o.Amount > 200).ToList();

// Projection (transforming data)
var orderSummaries = orders.Select(o => new
{
    o.Id,
    o.CustomerId,
    o.Amount
}).ToList();

// Ordering
var ordersByAmount = orders.OrderByDescending(o => o.Amount).ToList();
var ordersByDateAndAmount = orders
    .OrderBy(o => o.Date)
    .ThenByDescending(o => o.Amount)
    .ToList();

// Grouping
var ordersByCustomer = orders
    .GroupBy(o => o.CustomerId)
    .Select(g => new
    {
        CustomerId = g.Key,
        OrderCount = g.Count(),
        TotalAmount = g.Sum(o => o.Amount),
        AverageAmount = g.Average(o => o.Amount)
    })
    .ToList();

// Aggregation
var totalRevenue = orders.Sum(o => o.Amount);
var averageOrder = orders.Average(o => o.Amount);
var largestOrder = orders.Max(o => o.Amount);
var orderCount = orders.Count();

// Finding specific items
var firstLargeOrder = orders.FirstOrDefault(o => o.Amount > 400);
var hasSmallOrders = orders.Any(o => o.Amount < 100);
var allPaid = orders.All(o => o.Amount > 0);

// Taking subsets
var recentOrders = orders.OrderByDescending(o => o.Date).Take(3).ToList();
var skipFirst = orders.Skip(2).Take(2).ToList();

// Distinct values
var uniqueCustomerIds = orders.Select(o => o.CustomerId).Distinct().ToList();

// Joining collections (example with dictionary)
var customers = new Dictionary<int, string>
{
    { 100, "Alice" },
    { 101, "Bob" },
    { 102, "Charlie" }
};

var ordersWithNames = orders
    .Join(customers,
        order => order.CustomerId,
        customer => customer.Key,
        (order, customer) => new
        {
            order.Id,
            CustomerName = customer.Value,
            order.Amount
        })
    .ToList();

foreach (var order in ordersWithNames)
{
    Console.WriteLine($"Order {order.Id}: {order.CustomerName} - ${order.Amount}");
}

class Order
{
    public int Id { get; set; }
    public int CustomerId { get; set; }
    public decimal Amount { get; set; }
    public DateTime Date { get; set; }
}

Remember that LINQ uses deferred execution for most operations. The query doesn't run until you call ToList, ToArray, Count, or iterate with foreach. This can improve performance by chaining operations without creating intermediate collections. However, if you iterate the same query multiple times, call ToList once to cache the results.
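The behavior is observable. In this sketch, a counter inside the predicate shows that the filter runs only when the query is iterated, and runs again on every subsequent iteration until the results are cached:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var numbers = new List<int> { 1, 2, 3, 4, 5 };
int evaluations = 0;

// Nothing executes here: Where only builds an iterator
var query = numbers.Where(n =>
{
    evaluations++;
    return n > 2;
});

Console.WriteLine(evaluations); // 0 - the predicate has not run yet

var firstPass = query.ToList();  // predicate runs 5 times
var secondPass = query.ToList(); // predicate runs 5 more times
Console.WriteLine(evaluations); // 10 - once per element, per iteration

// Calling ToList once caches the results for reuse
var cached = numbers.Where(n => n > 2).ToList();
Console.WriteLine(string.Join(", ", cached)); // 3, 4, 5
```

If the source were a database query or an expensive computation instead of an in-memory list, that second pass would repeat all of the work, which is exactly the pitfall caching with ToList avoids.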

Try It Yourself

This complete example demonstrates all three collection types working together in a product inventory system. You'll see how to choose the right collection for each scenario and use LINQ for reporting.

Program.cs - Complete inventory example
using System;
using System.Collections.Generic;
using System.Linq;

var inventory = new InventorySystem();

// Add products
inventory.AddProduct(new Product { Id = 1, Name = "Laptop", Price = 999.99m, Category = "Electronics" });
inventory.AddProduct(new Product { Id = 2, Name = "Mouse", Price = 29.99m, Category = "Electronics" });
inventory.AddProduct(new Product { Id = 3, Name = "Desk", Price = 299.99m, Category = "Furniture" });
inventory.AddProduct(new Product { Id = 4, Name = "Chair", Price = 199.99m, Category = "Furniture" });
inventory.AddProduct(new Product { Id = 5, Name = "Monitor", Price = 349.99m, Category = "Electronics" });

// Track views
inventory.TrackView(1);
inventory.TrackView(1);
inventory.TrackView(2);
inventory.TrackView(1);

// Generate report
inventory.GenerateReport();

class InventorySystem
{
    // List for ordered collection of all products
    private List<Product> _products = new List<Product>();

    // Dictionary for fast lookup by ID
    private Dictionary<int, Product> _productIndex = new Dictionary<int, Product>();

    // HashSet for tracking unique viewed products
    private HashSet<int> _viewedProductIds = new HashSet<int>();

    public void AddProduct(Product product)
    {
        _products.Add(product);
        _productIndex.Add(product.Id, product);
    }

    public Product? GetProductById(int id)
    {
        return _productIndex.TryGetValue(id, out var product) ? product : null;
    }

    public void TrackView(int productId)
    {
        _viewedProductIds.Add(productId);
    }

    public void GenerateReport()
    {
        Console.WriteLine("=== Inventory Report ===\n");

        Console.WriteLine($"Total Products: {_products.Count}");
        Console.WriteLine($"Unique Products Viewed: {_viewedProductIds.Count}\n");

        // Group by category using LINQ
        var byCategory = _products
            .GroupBy(p => p.Category)
            .Select(g => new
            {
                Category = g.Key,
                Count = g.Count(),
                TotalValue = g.Sum(p => p.Price),
                AvgPrice = g.Average(p => p.Price)
            });

        Console.WriteLine("By Category:");
        foreach (var cat in byCategory)
        {
            Console.WriteLine($"  {cat.Category}: {cat.Count} items, " +
                            $"Total: ${cat.TotalValue:F2}, Avg: ${cat.AvgPrice:F2}");
        }

        // Most expensive items
        Console.WriteLine("\nTop 3 Most Expensive:");
        var topExpensive = _products.OrderByDescending(p => p.Price).Take(3);
        foreach (var product in topExpensive)
        {
            Console.WriteLine($"  {product.Name}: ${product.Price}");
        }

        // Viewed products
        Console.WriteLine("\nViewed Products:");
        foreach (var id in _viewedProductIds)
        {
            var product = GetProductById(id);
            if (product is not null)
                Console.WriteLine($"  {product.Name} (${product.Price})");
        }
    }
}

class Product
{
    public int Id { get; set; }
    public string Name { get; set; } = "";
    public decimal Price { get; set; }
    public string Category { get; set; } = "";
}

InventoryDemo.csproj
<Project Sdk="Microsoft.NET.Sdk">
  <PropertyGroup>
    <OutputType>Exe</OutputType>
    <TargetFramework>net8.0</TargetFramework>
    <Nullable>enable</Nullable>
  </PropertyGroup>
</Project>

Running the example:

  1. Create a new folder and save both files
  2. Open a terminal in that folder
  3. Run dotnet run to see the output
  4. Try adding more products and tracking different views
  5. Experiment with different LINQ queries in the report

Expected output:

Console Output
=== Inventory Report ===

Total Products: 5
Unique Products Viewed: 2

By Category:
  Electronics: 3 items, Total: $1379.97, Avg: $459.99
  Furniture: 2 items, Total: $499.98, Avg: $249.99

Top 3 Most Expensive:
  Laptop: $999.99
  Monitor: $349.99
  Desk: $299.99

Viewed Products:
  Laptop ($999.99)
  Mouse ($29.99)

Performance & Scalability Notes

Understanding collection performance characteristics helps you choose the right tool for each job. Each collection type has different trade-offs in memory usage, speed, and functionality.

List<T> provides fast indexed access and sequential iteration but slow searches. Dictionary<TKey, TValue> trades memory for speed with its hash table. HashSet<T> excels at membership tests and set operations but doesn't maintain order.

For critical performance paths, measure with tools like BenchmarkDotNet. Here's a benchmark comparing lookup performance across collection types:

CollectionBenchmarks.cs - Performance comparison
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;
using System.Collections.Generic;
using System.Linq;

BenchmarkRunner.Run<CollectionBenchmarks>();

[MemoryDiagnoser]
public class CollectionBenchmarks
{
    private List<int> _list = null!;        // initialized in Setup
    private Dictionary<int, int> _dictionary = null!;
    private HashSet<int> _hashSet = null!;
    private const int Size = 10000;

    [GlobalSetup]
    public void Setup()
    {
        _list = Enumerable.Range(0, Size).ToList();
        _dictionary = Enumerable.Range(0, Size).ToDictionary(x => x, x => x);
        _hashSet = Enumerable.Range(0, Size).ToHashSet();
    }

    [Benchmark]
    public bool List_Contains()
    {
        // O(n) linear search
        return _list.Contains(Size - 1);
    }

    [Benchmark]
    public bool Dictionary_ContainsKey()
    {
        // O(1) hash lookup
        return _dictionary.ContainsKey(Size - 1);
    }

    [Benchmark]
    public bool HashSet_Contains()
    {
        // O(1) hash lookup
        return _hashSet.Contains(Size - 1);
    }

    [Benchmark]
    public int List_Find()
    {
        return _list.Find(x => x == Size - 1);
    }

    [Benchmark]
    public int Dictionary_Lookup()
    {
        return _dictionary.TryGetValue(Size - 1, out var value) ? value : -1;
    }

    [Benchmark]
    public List<int> List_Where()
    {
        return _list.Where(x => x > 5000).ToList();
    }

    [Benchmark]
    public List<int> List_ForLoop()
    {
        var result = new List<int>();
        for (int i = 0; i < _list.Count; i++)
        {
            if (_list[i] > 5000)
                result.Add(_list[i]);
        }
        return result;
    }
}

Typical results show Dictionary and HashSet lookups are hundreds of times faster than List for large collections. For filtering operations, traditional loops can be slightly faster than LINQ, though LINQ remains more readable. Choose based on your performance requirements and team preferences.

Key performance tips: Set initial capacity when you know the size. Use TryGetValue instead of ContainsKey plus indexer. Call ToList on LINQ queries you'll iterate multiple times. Consider custom equality comparers for complex types. Profile before optimizing.
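The comparer tip deserves a concrete example. Passing an IEqualityComparer<T> at construction changes how keys are hashed and compared, as in this case-insensitive lookup sketch:

```csharp
using System;
using System.Collections.Generic;

// StringComparer.OrdinalIgnoreCase hashes and compares keys without
// regard to case, so "Laptop" and "LAPTOP" resolve to the same entry
var stock = new Dictionary<string, int>(StringComparer.OrdinalIgnoreCase)
{
    { "Laptop", 12 },
    { "Mouse", 40 }
};

Console.WriteLine(stock["LAPTOP"]); // 12
Console.WriteLine(stock.ContainsKey("mouse")); // True

// The same comparer works for HashSet<T>
var tags = new HashSet<string>(StringComparer.OrdinalIgnoreCase) { "CSharp" };
Console.WriteLine(tags.Add("csharp")); // False - treated as a duplicate
```

The comparer must be supplied at construction; both the hash bucket and the equality check use it, so swapping it on a populated collection is not possible.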

Frequently Asked Questions (FAQ)

When should I use HashSet<T> instead of List<T>?

Use HashSet<T> when you need fast lookups, want to eliminate duplicates automatically, or need set operations like unions and intersections. HashSet provides O(1) lookups versus List's O(n) search. If you need to maintain order or access items by index, stick with List<T>.

What's the performance difference between Dictionary and List for lookups?

Dictionary<TKey, TValue> provides O(1) average-case lookups using hash-based indexing, while List<T> requires O(n) linear search with Find or Contains. For datasets with frequent lookups, Dictionary can be hundreds of times faster. However, Dictionary uses more memory due to hash table overhead.

How does LINQ affect collection performance?

LINQ operations use deferred execution, meaning they don't execute until you iterate the results. Each LINQ method creates an iterator, which adds some overhead. For hot paths, consider using for loops instead. However, LINQ's readability often outweighs the minor performance cost in most applications.

Should I use IEnumerable<T> or List<T> for method return types?

Return IEnumerable<T> when you want to hide implementation details and support deferred execution. Return List<T> when the caller needs to modify the collection or access items by index. For read-only collections, consider returning IReadOnlyList<T> to signal immutability while preserving indexing capabilities.
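These choices can be sketched as three signatures on a hypothetical repository class (the class and its data are illustrative, not from the article's examples):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var repo = new ProductRepository();
Console.WriteLine(repo.GetSnapshot().Count); // 2

class ProductRepository
{
    private readonly List<string> _names = new List<string> { "Laptop", "Mouse" };

    // IEnumerable<T>: hides the backing store and supports deferred execution
    public IEnumerable<string> GetNamesStartingWith(string prefix) =>
        _names.Where(n => n.StartsWith(prefix, StringComparison.Ordinal));

    // IReadOnlyList<T>: callers get Count and indexing, but no mutation methods
    public IReadOnlyList<string> GetSnapshot() => _names;

    // List<T>: callers may add, remove, and sort - return a copy so
    // they can't corrupt the repository's internal state
    public List<string> GetEditableCopy() => new List<string>(_names);
}
```

Note that returning the internal list as IReadOnlyList<T> only signals intent; a determined caller can cast it back, so return a copy when you need a hard guarantee.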
