Tackling poor performance by replacing QueryConvertForGrid with SQL Window functions and Common Table Expressions
I was digging through some code recently and discovered that it was using the
QueryConvertForGrid function. It looked something like this (yes it was written back in the day when cfgrid and Hungarian notation was all the rage!):
<cffunction name="getRecords" output="false" access="remote"> <cfargument name="page" default="1" type="numeric"> <cfset local.pageSize = 10> <cfquery name="local.qData" datasource="whatever"> select id, name from myTable where active = 1 order by name </cfquery> <cfreturn QueryConvertForGrid(local.qData, arguments.page, local.pageSize)> </cffunction>
It’s old code, but looks innocent enough, and it was probably reasonably performant when the code was written and the database only had a hundred rows in it. Of course overtime the database table has grown to thousands of rows and so what is actually happening is the query is retrieving thousands of rows from the database and then
QueryConvertForGrid is slicing that recordset and returning a mere ten. The database is returning much more data than we need and of course this is quite slow and will get slower as the database table grows.
The obvious thing to do is to add a limit (pagesize) and offset (startrow) to the query so it only gets the 10 rows it needs to return. Unfortunately we also need to know the total number of rows so that the UI knows if there is a next page etc. In the past I might have written two queries, one to do a count and one to get the data to return. That approach quickly becomes hard to maintain as you need to ensure that the where clause in the two SQL statements are the same. If there is one constant when it comes to software, it is that things will change!
Fortunately modern SQL engines give us the power of Window Functions and Common Table Expressions. I’m still pretty new to them myself so I’m not going to try and explain how they work but I thought if I posted some examples then it may prompt people to read up on them and try them out – they can be very useful!
Considering the problem above, I tackled it using a SQL windows function.
select id, name, count(*) over () AS totalrows from myTable where active = 1 order by name limit 0, 10
I’m not going to go into the syntax of the SQL as I think you are better off reading the docs for whichever database engine you are using (and I’m certainly not an SQL guru), but needless to say this returned the 10 rows and the totalrows column has the count of the total number of rows which was exactly what I needed.
I had gained a significant performance boost and meant I could remove the
QueryConvertForGrid. However, a far smarter person than me suggested on the CFML Slack channel that this was likely to be working the database engine hard and suggested I try using Common Table Expressions instead.
WITH allRows AS ( select id, name from myTable where active = 1 order by name ), totalCount AS ( select count(*) as totalrows from allRows ) select id, name, totalrows from allRows cross join totalCount limit 0, 10
Again I’m not going to go into the syntax, but this gives the same recordset as the windows function version. Both are significantly faster than retrieving all the database rows and then slicing the recordset.
I wouldn’t want to recommend one technique over the other as it will likely vary depending on data and which database engine you are using but hopefully this has made you want to go off and read up on how brushing up on your SQL skills may be able to improve your applications.