Wednesday, October 04, 2017

Comparing a variable with multiple options in C#

In a code review recently there was some debate about how to compare a variable with multiple values.
I'm writing this to put all my thoughts on the subject into one place and in a coherent (hopefully) manner.

As an example, consider this `if` statement.

if (new[] { MyEnum.Value1, MyEnum.Value2 }.Contains(someVariable))

It's the equivalent of "if (X or Y) = Z".

Also, consider this variation

if (someVariable == MyEnum.Value1 || someVariable == MyEnum.Value2)

The repetition of the variable typically makes it longer than the version using the array and reads, IMO, less naturally. It's the equivalent of "if (X = Y) or (X = Z)".

People tend to prefer the second option.

The argument against the first option basically comes down to:
I know an array of 2 doesn't add too much memory but it's just good practice.
I'm not convinced about the "good practice" argument.

  • Yes, an array is created, but it's tiny and there's nothing keeping it around and so can be collected as soon as is necessary.
  • It's not "common practice" I'll admit, but I've never seen code guidelines that explicitly say anything about how to do multiple comparisons. Is the first option considered wrong as it's not seen as frequently as the second?
  • It feels like a premature optimization. The argument of don't create anything that uses memory unless you need to puts preserving memory consumption above all else. I don't think that's right or something that anyone making the above argument sticks too rigidly.
If someone was really concerned about performance, I'd expect them to point out that the array method can cause boxing and this is likely to have more impact on performance than array creation. For a one-off check though the impact is still so small as to not be noticeable. This makes me think the above argument comes from a convention and perception that "creating arrays is bad" and should be avoided.

I also prefer the first option as it makes it easier to add another option. The adding of an item to a list is simpler than another comparison. As the variable needs to be compared with more options lines get very long if using direct comparisons.  Both options get longer but this gets much longer faster.
This isn't great.

if (someVariable == MyEnum.Value1 || someVariable == MyEnum.Value2 || someVariable == MyEnum.Value3 || someVariable == MyEnum.Value4 || someVariable == MyEnum.Value5)

I prefer a solution that uses an array rather than multiple explicit comparisons because I follow this maxim in my code:

Readability of code is more important than performance unless performance is an issue.

IMO, reading multiple comparison statements is harder.

In my own code, I don't actually do the above. I wrap it in a call to an extension method so it's even shorter and, IMO, reads even more naturally,

if (someVariable.IsOneOf(MyEnum.Value1MyEnum.Value2))

This reads like "if X = (Y or Z)" which is closest to spoken English.
It's also easy to add options to.

The extension method, if you're interested, is really simple.

public static bool IsOneOf(this object item, params object[] options)
    return options.Contains(item);

Some potential questions.

What if calling this type of code enough times that there is a performance impact?
- Then do what's necessary to improve performance. That's what the clause in my maxim is for.

What about not using an if statement?
- It might be more appropriate to use a `select` statement. However, if the `if` statement doesn't have an `else` clause then this leads to a `select` statement with only a single match which feels odd to me. (As above though, this may be just because I'm not used to seeing code like that.)

What about starting with direct comparisons if there are only two options and then switch to the "IsOneOf" methods when multiple comparisons are made?
- You could go that way but I'm not a fan as it becomes hard to enforce a consistent style across a codebase. It also means that the change to add functionality unnecessarily requires changing the structure of code which can make it harder to identify the meaningful differences in the change.


follow up post on optimizations for the above.

1 comment:

  1. Your IsOneOf method should be made generic to avoid boxing.

    I have another suggestion, based on tuples:

    Pros: it doesn't cause any allocation; the syntax is better than with an array (at least for the Contains method)
    Cons: it doesn't accept an arbitrary number of values, and the double parentheses for IsOneOf are annoying