Entity Framework has two methods for converting a query to a list or an array: ToListAsync
and ToArrayAsync
. Of course there is also the sync version of these methods: ToList
and ToArray
. And sometimes, you should use the latter one!
The bug
In one of my projects, I have the following code:
public static async Task<IPagedList<T>> ToPagedListAsync<T>(this IQueryable<T> source, int page, int pageSize, CancellationToken token = default)
{
var count = await source.CountAsync(token);
if (count > 0)
{
var items = await source
.Skip((page - 1) * pageSize)
.Take(pageSize)
.ToListAsync(token);
return new PagedList<T>(items, count, page, pageSize);
}
return PagedList<T>.Empty;
}
That looks fairly trivial - and of course we want ToListAsync
here, or do we? Well: Not necessarily! To give a bigger picture, this call is used in that very blog post you are reading right now. And the datamodel of a blog post has the following column: Content NVARCHAR(MAX) NOT NULL
. NVARCHAR(MAX)
(as well as VARBINARY(MAX)
and friends) cause a known issue with Entity Framework since almost 10 years: Performance degradation. The async
version spawns a whole lot of tasks, see the stackoverflow article.
If you want to follow this in more depth: Reading large data (binary, text) asynchronously is extremely slow .
The fix
The fix is fairly simple: Just use the synchronous version:
public static async Task<IPagedList<T>> ToPagedListAsync<T>(this IQueryable<T> source, int page, int pageSize, CancellationToken token = default)
{
var count = await source.CountAsync(token);
if (count > 0)
{
// I tried ToListAsync and it performed just poorly!
// Mainly because we have a VARCHAR(max) column
// See here: https://stackoverflow.com/questions/28543293/entity-framework-async-operation-takes-ten-times-as-long-to-complete/28619983
var items = source
.Skip((page - 1) * pageSize)
.Take(pageSize)
.ToList();
return new PagedList<T>(items, count, page, pageSize);
}
return PagedList<T>.Empty;
}
And yes, adding a comment here is absolutely necessary and okay! Especially because CountAsync
is still async and the whole method is marked as async
.
What is the takeaway?
Well, check if you are handling large binary data via ToListAsync
in your selector? If so, maybe you are suffering from the issue!