Cursor Based Pagination
Understanding cursor based pagination and queries
This is not a code based article, I’m trying to explain the concept of cursor based pagination, I urge you to read as this is not centered to any language in particular. Cursor based pagination is solution to the all the cons of Offset based pagination, and it’s the least used because most times developers come across OBP first and stick with to the end. CBP is not difficult to understand, it is like every logic, you play with it till you stop seeing bugs.
To understand CBP you have to know the idea behind the name. In offset based pagination you send the limit and the number of items to skip to the server but this does not scale when the dataset increases. This is because for every number of data skipped, the request still goes through those data before returning the limit, so in essence, if you skipped 1000 items to return 100, the request goes on for 1100 times, this increases the database work load and slows down performance, in essence, O(n) time complexity, more data more time.
While for cursor based queries, you send the cursor ( could be the ‘id’, or any column in the database that is unique), the limit (number of data to return from the dataset). This is easier on the server because it knows the exact point to continue fetching the data, which means cursor based pagination has O(1) time complexity, the amount of time it used in getting the first query is the same time it will take to get subsequent queries. It gets tricky because you begin to ask how do you know the cursor to send, so the query continues from where it stopped, or how to know when to stop the client (front-end) from requesting data, “the has next page” of OFP, but that is why I’m writing this to help you fully understand Cursor based pagination.
Pros of Cursor Based Pagination
- You reduce the server load and query time.
- You get the exact data you wanted because your request is precise.
- It is very easy to implement continuous scroll in the UI.
- Data is not repeated on the UI, when new entries are added.
Cons of Cursor based Pagination
- Has no previous page.
- Can not skip to a particular data.
Like other solution cursor based pagination has its down sides, this is not a bad thing because it was introduced as a solution for specific problems of the OFB, so it is not suitable for all scenarios.
When to implement cursor based pagination
- When you need to return a huge dataset in a particular order.
- If you are fetching data for a real-time application.
Here I’m assuming you know a bit of one backend technology, what we are going to look at is more of the logic and not the technology.
The General idea of pagination is that you don’t want to over work the server and the UI, that is having to deal with fetching hundreds of dataset at once and rendering them all at once, when the user will probably not go through all the results. This is similar to how front-end developers deal with images on the UI, using lazy-loading so any part of the UI that the user does get to, the images won’t load.
So to achieve this you need to create a Route if you are using REST or Resolver for GRAPHQL. The route receives a limit and cursor variable in the body, params or query strings for REST. Graphql uses args parameter so it is included in the query parameter.
How to get the cursor
The cursor is the most vital part of this type of pagination and it is always the last item in the result, so if you had requested for 50 items, the fiftieth item in the returned dataset houses the column you use as the cursor, this should be serially distinctive or you can easily index a column of your choice (be careful with indexing as it has its cons), so the cursor could be the id or the createdAt column. I normally use the createdAt column it is naturally distinct. So you will have to choose the order in which you want the server to return data Ascending or descending, this means sorting the column you intend to use as your cursor, then you find the last item of the result then get your cursor column. The order of the result is really important as the subsequent requests will have to do with fetching data that are greater or less than the cursor.
Assuming the cursor is to be a serialized ID with 1 as the starting point and the result of 50 is returned in ascending order, the first request would have a result of 1 to 50 where the cursor becomes 50. The next request would be to get items from a table where id is greater than 50, in ascending order, the result would be from ‘ID’ 51 to 100.
Now the next cursor is 100, that will be sent to the front-end, so the next request knows where to start from. And the reason you provide the ability for the frontend to request a limit is to help the User Experience (UX), because the user might want less or more data. This is just the primary step and would introduce a lot of bug in your code because the FE might still be requesting for more when the dataset has been exhausted.
The Logic of “has more pages”
To know if there are more data in the database, (If the database has more pages) is quite tricky, to do this you have to increase the limit received from the frontend by one and you have to set a maximum limit as well because you want to give limited control to the frontend. If the requested limit to get is 50 items at a time, the query should add 1 to the initial limit so the result is 51, the reason for this is to know if there are more items in the database.
To verify the result, check if the result is less than or equal to the limit requested by the frontend which is 50, if this is true this means there are no more data to fetch, no more pages.
The new problem now is the cursor, remember the cursor is suppose to be last item in the result, if you use the last item as cursor, there will be repetition of data in the UI, (back to square one).
How to handle repetition of items in the UI, Cursor based pagination
To handle this bug, we have to pop or remove the last item in the result so it will be equal to the initial limit of 50, now the last item is the right cursor for the next query. What if the result is not greater than 50, there will be no need to pop the last item, else an item will be missing in the UI.
So, in essence, you should always check if the result is greater than the initial limit. I hope this is enough to help you understand the logic behind cursor based pagination.