OfType(x) vs Where(_ => _ is x) vs Where with enum
I'm a bit confused by the results here; maybe someone could give me some insight?
Basically, I am trying to compare the performance of:
OfType<X>()
Where(_ => _ is X).Select(_ => (X)_)
Where(_ => _.Type == type).Select(_ => (X)_)
Here are the classes:
    public enum AnimalTypes { Alligator, Bear, Cat, Dog, Elephant }

    public interface IAnimal
    {
        AnimalTypes Type { get; }
    }

    public class Bear : IAnimal
    {
        public AnimalTypes Type => AnimalTypes.Bear;
    }

    public class Cat : IAnimal
    {
        public AnimalTypes Type => AnimalTypes.Cat;
    }
Edit: this code was fixed based on the comments. Sorry for the error!
And here is the testing method:
    void Main()
    {
        List<IAnimal> animals = new List<IAnimal>();
        for (int i = 0; i < 100000; i++)
        {
            animals.Add(new Bear());
            animals.Add(new Cat());
        }

        // tests
        IEnumerable<Cat> test1 = animals.OfType<Cat>();
        IEnumerable<Cat> test2 = animals.Where(_ => _ is Cat).Select(_ => (Cat)_);
        IEnumerable<Cat> test3 = animals.Where(_ => _.Type == AnimalTypes.Cat).Select(_ => (Cat)_);

        Stopwatch sw = new Stopwatch();

        // OfType
        sw.Start();
        test1.ToArray();
        sw.Stop();
        Console.WriteLine($"OfType = {sw.ElapsedTicks} ticks");
        sw.Reset();

        // Where (is) + Select
        sw.Start();
        test2.ToArray();
        sw.Stop();
        Console.WriteLine($"Where (is) + Select = {sw.ElapsedTicks} ticks");
        sw.Reset();

        // Where (enum) + Select
        sw.Start();
        test3.ToArray();
        sw.Stop();
        Console.WriteLine($"Where (type) + Select = {sw.ElapsedTicks} ticks");
        sw.Reset();
    }
Oddly, the last test always gets the best result...
And note that type == type is an exact type comparison, while is/OfType support subclasses (for example, a Siamese is a Cat, so OfType<Cat>() and is Cat will both return Siamese cats)
– xanatos
Jul 1 at 12:04
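To make the subclass point in the comment above concrete, here is a minimal sketch. The Siamese class, the extra AnimalTypes.Siamese value, and the virtual Type property are assumptions added for illustration; they are not part of the question's code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Siamese is a hypothetical extra enum value, added only for this sketch.
public enum AnimalTypes { Bear, Cat, Siamese }

public interface IAnimal { AnimalTypes Type { get; } }

public class Bear : IAnimal { public AnimalTypes Type => AnimalTypes.Bear; }

public class Cat : IAnimal
{
    // Assumption: made virtual so a subclass can report its own enum value.
    public virtual AnimalTypes Type => AnimalTypes.Cat;
}

// Hypothetical subclass of Cat.
public class Siamese : Cat { public override AnimalTypes Type => AnimalTypes.Siamese; }

public static class SubclassDemo
{
    public static List<IAnimal> Animals() =>
        new List<IAnimal> { new Bear(), new Cat(), new Siamese() };

    // OfType and "is" both follow the type hierarchy, so the Siamese counts as a Cat.
    public static int ViaOfType() => Animals().OfType<Cat>().Count();
    public static int ViaIs() => Animals().Count(a => a is Cat);

    // The enum comparison and the exact runtime type comparison only match plain cats.
    public static int ViaEnum() => Animals().Count(a => a.Type == AnimalTypes.Cat);
    public static int ViaExactType() => Animals().Count(a => a.GetType() == typeof(Cat));

    public static void Main()
    {
        Console.WriteLine($"OfType<Cat>:           {ViaOfType()}");    // 2
        Console.WriteLine($"is Cat:                {ViaIs()}");        // 2
        Console.WriteLine($"Type == Cat (enum):    {ViaEnum()}");      // 1
        Console.WriteLine($"GetType() == Cat:      {ViaExactType()}"); // 1
    }
}
```

So the three queries in the question are only interchangeable as long as no class derives from Cat.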
This code is actually measuring jitting overhead, first one is always expensive. Always repeat a test at least 10 times, you'll see this overhead disappear. And get a feel for the numbers, 2 ticks is far too fast. Otherwise easy to see by doubling the collection size. Google "linq deferred execution" to learn more. Once you fix it, beware the noisy results from very fast code like this, you ought to see that the Type comparison is fastest. It should be, no need to check the type hierarchy like is and OfType needs to do.
– Hans Passant
Jul 1 at 12:09
Thanks everyone, I updated my code, but still getting unusual results.
– Svek
Jul 1 at 12:26
1 Answer
Your testing code has three big problems:
Look at something like this instead:
    var animals = new List<IAnimal>();
    for (int i = 0; i < 1000000; i++)
    {
        animals.Add(new Bear());
        animals.Add(new Cat());
    }

    // remove overhead of the first query
    int catsCount = animals.Where(x => x == x).Count();

    var whereIsTicks = new List<long>();
    var whereTypeTicks = new List<long>();
    var ofTypeTicks = new List<long>();

    var sw = Stopwatch.StartNew();

    // a performance test with a single pass doesn't make a lot of sense
    for (int i = 0; i < 100; i++)
    {
        sw.Restart();

        // Where (is) + Select
        catsCount = animals.Where(_ => _ is Cat).Select(_ => (Cat)_).Count();
        whereIsTicks.Add(sw.ElapsedTicks);

        // Where (enum) + Select
        sw.Restart();
        catsCount = animals.Where(_ => _.Type == AnimalTypes.Cat).Select(_ => (Cat)_).Count();
        whereTypeTicks.Add(sw.ElapsedTicks);

        // OfType
        sw.Restart();
        catsCount = animals.OfType<Cat>().Count();
        ofTypeTicks.Add(sw.ElapsedTicks);
    }

    sw.Stop();

    // get the average run time for each test in an easy-to-print format
    var results = new List<Tuple<string, double>>
    {
        Tuple.Create("Where (is) + Select", whereIsTicks.Average()),
        Tuple.Create("Where (type) + Select", whereTypeTicks.Average()),
        Tuple.Create("OfType", ofTypeTicks.Average()),
    };

    // print results ordered by time taken
    foreach (var result in results.OrderBy(x => x.Item2))
    {
        Console.WriteLine($"{result.Item1} => {result.Item2}");
    }
Running this multiple times, Where (is) can be a little faster or slower than Where (type); however, OfType is always the slowest by a good margin:
i < 10:
Where (type) + Select => 111428.9
Where (is) + Select => 132695.8
OfType => 158220.7
i < 100:
Where (is) + Select => 110541.8
Where (type) + Select => 119822.74
OfType => 150087.22
i < 1000:
Where (type) + Select => 113196.381
Where (is) + Select => 115656.695
OfType => 160461.465
The reason why OfType will always be slower is pretty obvious when you look at the source code for the OfType method:
    static IEnumerable<TResult> OfTypeIterator<TResult>(IEnumerable source)
    {
        foreach (object obj in source)
        {
            if (obj is TResult)
            {
                yield return (TResult)obj;
            }
        }
    }
As you can see, the source items are type-checked with is and then cast back to TResult. The difference would be bigger for value types due to boxing.
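For comparison, here is a sketch of a strongly typed filter that keeps the generic IEnumerable<TSource> instead of going through the non-generic IEnumerable and object. OfTypeGeneric is a hypothetical helper written for this answer, not part of System.Linq, and whether it is actually faster than OfType on a given runtime would have to be measured with a loop like the one above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public interface IAnimal { }
public class Bear : IAnimal { }
public class Cat : IAnimal { }

public static class TypedFilterExtensions
{
    // Hypothetical helper (not a framework method): filters and casts in one step
    // using a type pattern, while enumerating the source as IEnumerable<TSource>.
    public static IEnumerable<TResult> OfTypeGeneric<TSource, TResult>(
        this IEnumerable<TSource> source)
        where TResult : TSource
    {
        foreach (TSource item in source)
        {
            if (item is TResult match)
            {
                yield return match;
            }
        }
    }
}

public static class TypedFilterDemo
{
    public static List<IAnimal> Animals() =>
        new List<IAnimal> { new Bear(), new Cat(), new Cat() };

    public static void Main()
    {
        var animals = Animals();

        // Both type arguments must be given explicitly at the call site.
        int viaHelper = animals.OfTypeGeneric<IAnimal, Cat>().Count();
        int viaOfType = animals.OfType<Cat>().Count();

        Console.WriteLine($"OfTypeGeneric: {viaHelper}, OfType: {viaOfType}");
    }
}
```

Like OfType, the helper skips null elements and includes subclasses of TResult, so the two calls return the same items.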
Thank you. So the "winner" is Where (is). That was unexpected.
– Svek
Jul 1 at 12:49
@Svek Not always; as I mentioned, sometimes Where (type) was faster. Do notice, though, that there's a huge difference in the results, as the Enum wouldn't allow you to check for children while is would, as was mentioned in a comment on the question
– Camilo Terevinto
Jul 1 at 12:51
@Svek You are welcome. See my update for an explanation
– Camilo Terevinto
Jul 1 at 12:58
@CamiloTerevinto When you make tests, you don't want the GC to run... So you should try to minimize memory allocations if you can. Instead of .ToArray(); (that will create two collections, one internally plus the array), use .Count(), that won't allocate anything.
– xanatos
Jul 1 at 15:25
"As you can see, the source items are boxed" - no, there's no boxing here at all. The elements are references already. Boxing would only be involved if there were value types involved.
– Daisy Shipton
Jul 1 at 15:37
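The boxing distinction in the comment above can be observed directly. The sketch below (BoxingDemo is a name made up for illustration) reads the first element of a list twice through the non-generic IEnumerable, which is the interface OfType enumerates. With int elements each read produces a fresh boxed object; with reference-type elements the same object comes back both times.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

public static class BoxingDemo
{
    // Returns the first element as seen through the non-generic IEnumerable.
    private static object FirstAsObject(IEnumerable source)
    {
        foreach (object item in source)
        {
            return item;
        }
        throw new InvalidOperationException("Sequence is empty.");
    }

    // True when two separate enumerations hand back the very same object.
    public static bool SameObjectTwice(IEnumerable source) =>
        ReferenceEquals(FirstAsObject(source), FirstAsObject(source));

    public static void Main()
    {
        var numbers = new List<int> { 42 };      // value type elements
        var names = new List<string> { "cat" };  // reference type elements

        Console.WriteLine(SameObjectTwice(numbers)); // False: each read boxes a new copy
        Console.WriteLine(SameObjectTwice(names));   // True: the stored reference is returned
    }
}
```

In the question's benchmark all elements are class instances, so no boxing takes place there.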
Emh... you are only creating the queries, not actually executing them. Add a .ToArray() to the end of each query
– Camilo Terevinto
Jul 1 at 11:58
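A small sketch of what "only creating the queries" means in practice: the predicate passed to Where does not run until something enumerates the query. DeferredDemo and the call counter are illustrative names, not part of the question's code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredDemo
{
    // Returns how often the predicate had run before and after enumeration.
    public static (int Before, int After) Run()
    {
        var numbers = new List<int> { 1, 2, 3, 4 };
        int predicateCalls = 0;

        // Building the query only records what to do; nothing is filtered yet.
        IEnumerable<int> evens = numbers.Where(n =>
        {
            predicateCalls++;
            return n % 2 == 0;
        });
        int before = predicateCalls;

        // ToArray enumerates the query, which is when the predicate actually runs.
        int[] materialized = evens.ToArray();
        int after = predicateCalls;

        return (before, after);
    }

    public static void Main()
    {
        var (before, after) = Run();
        Console.WriteLine($"Predicate calls before ToArray: {before}"); // 0
        Console.WriteLine($"Predicate calls after ToArray:  {after}");  // 4
    }
}
```

Timing only the line that builds the query therefore measures almost nothing; the Stopwatch has to wrap the call that enumerates it.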