C# – Iterating a class representing a collection: IEnumerable vs custom method

c

I often find myself needing to implement a class which is an enumeration/collection of something. Consider for this thread the contrived example of IniFileContent which is an enumeration/collection of lines.

The reason this class must exist in my codebase is that I want to avoid business logic being spread all over the place (=encapsulating the where) and I want to do it in the most object-oriented possible way.

Usually I would implement it as below:

public sealed class IniFileContent : IEnumerable<string>
{
    private readonly string _filepath;
    public IniFileContent(string filepath) => _filepath = filepath;
    public IEnumerator<string> GetEnumerator()
    {
        return File.ReadLines(_filepath)
                   .Where(l => !l.StartsWith(";"))
                   .GetEnumerator();
    }
    public IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
}

I choose to implement IEnumerable<string> because it makes its usage convenient:

foreach(var line in new IniFileContent(...))
{
    //...
}

However I'm wondering if doing so "shadows" the class intent ? When one looks at IniFileContent's interface one only see Enumerator<string> GetEnumerator(). I think it makes non-obvious which service the class is actually providing.

Consider then this second implementation:

public sealed class IniFileContent2
{
    private readonly string _filepath;
    public IniFileContent2(string filepath) => _filepath = filepath;
    public IEnumerable<string> Lines()
    {
        return File.ReadLines(_filepath)
                   .Where(l => !l.StartsWith(";"));
    }
}

Which is used less conveniently (by the way, seeing new X().Y() feels like something went wrong with the class design):

foreach(var line in new IniFileContent2(...).Lines())
{
    //...
}

But with a clear interface IEnumerable<string> Lines() making obvious what this class can actually do.

Which implementation would you foster and why ? Implied, is it a good practice to implement IEnumerable to represent an enumeration of something ?

I'm not looking for answers on how to:

  • unit test this code
  • make a static function instead of a class
  • make this code more prone to future business logic evolution
  • optimize performance

Appendix

Here is the kind of real code living in my codebase which implements IEnumerable

public class DueInvoices : IEnumerable<DueInvoice>
{
    private readonly IEnumerable<InvoiceDto> _invoices;
    private readonly IEnumerable<ReminderLevel> _reminderLevels;
    public DueInvoices(IEnumerable<InvoiceDto> invoices, IEnumerable<ReminderLevel> reminderLevels)
    {
        _invoices = invoices;
        _reminderLevels = reminderLevels;
    }
    public IEnumerator<DueInvoice> GetEnumerator() => _invoices.Where(invoice => invoice.DueDate < DateTime.Today && !invoice.Paid)
                                                               .Select(invoice => new DueInvoice(invoice, _reminderLevels))
                                                               .GetEnumerator();
    IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
}

Best Answer

I'm reviewing your approach before suggesting a wholly different approach. I'm preferring the different approach but it seems important to explain why your approach has flaws.


I choose to implement IEnumerable<string> because it makes its usage convenient

Convenience should not outweigh correctness.

I wonder if your MyFile class will contain more logic than this; because that will affect the correctness of this answer. I'm especially interested in:

.Where(l => ...) //some business logic for filtering

because if this is complex or dynamic enough, you're hiding that logic in a class whose name does not reveal that it filters it content before serving it to the consumer.
Part of me hopes/assumes that this filter logic is intended to be hardcoded (e.g. a simple filter that ignores commented lines, e.g. like how .ini files consider lines starting with # to be a comment) and not a file-specific ruleset.


public class MyFile : IEnumerable<string>

There is something really grating about having a singular (File) represent a plural (IEnumerable). A file is a singular entity. It consist of more than just its content. It also contains metadata (filename, extension, creation date, modify date, ...).

A human is more than the sum of its children. A car is more than the sum of its parts. A painting is more than a collection of paints and canvas. And a file is more than a collection of lines.


If I assume that your MyFile class will never contain more logic than just this enumeration of lines (and the Where only applies a simple static hardcoded filter), then what you've got here is a confusing usage of the "file" name and its intended designation. This can easily be fixed by renaming the class as FileContent. It retains your intended syntax:

foreach(var line in new FileContent(@"C:\Folder\File.txt"))

It also makes more sense from a semantical point of view. The content of a file can be broken into separate lines. This still assumes that the file's content is text and not binary, but that's fair enough.


If, however, your MyFile class will contain more logic, the situation changes. There are a few ways this could happen:

  • You start using this class to represent the file's metadata, not just its contents.

When you start doing this, then the file represents the file in the directory, which is more than just its content.
The correct approach here is then what you've done in MyFile2.

  • The Where() filter starts having complicated filter logic that is not hardcoded, e.g. when different files start being filtered differently.

When you start doing this, files start having identities of their own, as they have their own custom filter. This means that your class has more become a FileType than a FileContent. The two behaviors need to be either separated, or combined using composition (which favors your MyFile2 approach), or preferably both (separate classes for the FileType and FileContent behavior, and then having both composed into the MyFile class).


A wholly different suggestion.

As it stands, both your MyFile and MyFile2 exist purely to give you a wrapper around your .Where(l => ...) filter. Secondly, you're effectively creating a class to wrap a static method (File.ReadLines()), which is not a great approach.

As an aside, I don't get why you've chose to make your class sealed. If anything, I'd expect that inheritance would be its biggest feature: different derived classes with different filtering logic (assuming that it's more complex than a simple value change, because inheritance shouldn't be used just to change a single value)

I would rewrite your whole class as:

foreach(var line in File.ReadLines(...).Where(l => ...))

The only downside to this simplified approach is that you'd have to repeat the Where() filter every time you wish to access the file's contents. I agree that that is not desirable.

However, it seems overkill that when you want to create a reusable Where(l => ...) statement, that you then also force that class to implement File.ReadLines(...). You're bundling more than you really need.

Instead of trying to wrap the static method in a custom class, I think it's much more fitting if you wrap it in a static method of its own:

public static IEnumerable<string> GetFilteredFileContent(string filePath)
{
    return File.ReadLines(filePath).Where(l => ...);
}

Assuming you have different filters, you could pass the appropriate filter as a a parameter. I'll show you an example that can handle multiple filters, which should be able to handle everything you need it to do while maximizing reusability:

public static class MyFile
{
    public static Func<string, bool> IgnoreComments = 
                  (l => !l.StartsWith("#"));

    public static Func<string, bool> OnlyTakeComments = 
                  (l => l.StartsWith("#"));

    public static Func<string, bool> IgnoreLinesWithTheLetterE = 
                  (l => !l.ToLower().contains("e"));

    public static Func<string, bool> OnlyTakeLinesWithTheLetterE = 
                  (l => l.ToLower().contains("e"));

    public static IEnumerable<string> ReadLines(string filePath, params Func<string, bool>[] filters)
    {
        var lines = File.ReadLines(filePath).Where(l => ...);

        foreach(var filter in filters)
            lines = lines.Where(filter);

        return lines;
    }
}

And its usage:

MyFile.ReadLines("path", MyFile.IgnoreComments, MyFile.OnlyTakeLinesWithTheLetterE);

This is just a run of the mill example that's meant to prove the point that static methods make more sense than creating a class here.

Don't get caught up on the specifics of implementing the filters. You can implement them however you want (I just personally like parametrizing Func<> because of its inherent extensibility and adaptability to refactoring). But since you didn't actually an example of the filters you intend to use, I made a few assumption to show you a workable example.


seeing new X().Y() feels like something went wrong with the class design)

In your approach, you could've made it new X().Y which is less grating.

However, I think that your dislike of new X().Y() proves the point that you feel like a class is not warranted here, but a method is; which can only be represented without a class by being static.

Related Topic