Implementing Diverse Pagination Strategies in DRF and FastAPI
Emily Parker
Product Engineer · Leapcell

Introduction: Navigating Large Datasets with Efficient Pagination
In contemporary web development, handling vast amounts of data is a common challenge. When exposing a collection of resources through an API, returning the entire dataset in a single response is often impractical, if not impossible. Such an approach can lead to slow response times, excessive memory consumption on both server and client, and a poor user experience. Pagination emerges as the essential solution, allowing clients to retrieve data in manageable chunks. While the concept of breaking down data into pages seems straightforward, different pagination strategies offer distinct advantages and disadvantages, catering to various use cases. This article will explore two prominent pagination techniques – Limit/Offset and Cursor-based pagination – and demonstrate their implementation within two popular Python web frameworks: Django Rest Framework (DRF) and FastAPI. Understanding these methods is crucial for building scalable and robust APIs that can effectively serve large datasets.
Core Pagination Concepts: A Primer
Before diving into the implementation details, let's clarify the fundamental concepts underpinning pagination strategies.
- Pagination: The process of dividing a large dataset into smaller, discrete pages or chunks, served sequentially to the client. This improves performance and manages resource usage.
- Page: A subset of the total data, typically defined by a size (number of items per page) and an identifier (page number, offset, or cursor).
- Limit: Refers to the maximum number of items to return in a single response (i.e., the page size).
- Offset: Indicates the number of items to skip from the beginning of the dataset before starting to return results.
- Cursor: An opaque string or value that points to a specific item in the dataset. It's used as a bookmark to retrieve the "next" or "previous" set of items relative to that point, without relying on an absolute position like an offset.
- Stable Pagination: A pagination strategy is considered stable if adding or removing items from the dataset while a client is paginating does not cause items to be skipped or duplicated across pages.
Limit/Offset Pagination: Simplicity and Its Pitfalls
Limit/Offset is arguably the most common and intuitive pagination strategy. It operates by specifying two parameters: limit (how many items to return) and offset (how many items to skip from the beginning).
How it works:
Clients request data by providing a limit and an offset. The server then fetches limit items starting from the offset-th record. For instance, to get the second page with 10 items per page, a client would request limit=10&offset=10.
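The mapping from these query parameters to a SQL query can be sketched with an in-memory SQLite table (a standalone illustration, independent of either framework):

```python
import sqlite3

# Minimal sketch: how limit=10&offset=10 maps to a SQL query.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO products (name) VALUES (?)",
    [(f"product-{i}",) for i in range(1, 26)],  # 25 sample rows
)

limit, offset = 10, 10  # second page, 10 items per page
rows = conn.execute(
    "SELECT id, name FROM products ORDER BY id LIMIT ? OFFSET ?",
    (limit, offset),
).fetchall()
print([r[0] for r in rows])  # ids 11 through 20
```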
Advantages:
- Simplicity: Easy to understand and implement for both server and client.
- Direct Access: Clients can easily jump to any specific page by calculating the offset (offset = (page_number - 1) * limit).
Disadvantages:
- Performance Degradation with Large Offsets: As the offset increases, the database may still need to scan through all of the skipped records, leading to performance bottlenecks, especially on large tables without proper indexing.
- Instability (Skipped/Duplicated Items): If items are added to or deleted from the dataset before the current offset while a client is paginating, the results become inconsistent: an item might appear on two pages or be skipped entirely. Consider a list of products: if a new product is added to the beginning of the list while a user is on page 5, subsequent pages may contain items already seen or skip new ones.
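The instability problem is easy to reproduce with a newest-first listing: a row inserted ahead of the client's current offset shifts every later page, so an item repeats. A standalone SQLite sketch:

```python
import sqlite3

# Demonstration of limit/offset instability: a row inserted "before" the
# current offset causes an item to repeat on the next page.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO products (name) VALUES (?)",
                 [(f"p{i}",) for i in range(1, 11)])

def page(offset, limit=3):
    # Newest-first listing, as in a typical feed
    return [r[0] for r in conn.execute(
        "SELECT id FROM products ORDER BY id DESC LIMIT ? OFFSET ?",
        (limit, offset))]

first = page(0)                                             # [10, 9, 8]
conn.execute("INSERT INTO products (name) VALUES ('new')")  # id 11 arrives
second = page(3)                                            # [8, 7, 6]
print(first, second)  # id 8 appears on both pages
```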
Implementing Limit/Offset in DRF
DRF provides a built-in LimitOffsetPagination class, making implementation straightforward.
# project/settings.py
REST_FRAMEWORK = {
    'DEFAULT_PAGINATION_CLASS': 'rest_framework.pagination.LimitOffsetPagination',
    'PAGE_SIZE': 10  # Default page size
}

# app/views.py
from rest_framework import generics
from .models import Product
from .serializers import ProductSerializer

class ProductListView(generics.ListAPIView):
    queryset = Product.objects.all().order_by('id')  # Always order for consistent pagination
    serializer_class = ProductSerializer
    # pagination_class = LimitOffsetPagination  # Can also be set per-view
Clients would then make requests like /products/?limit=5&offset=10. They can omit limit to use the default PAGE_SIZE.
Implementing Limit/Offset in FastAPI
FastAPI, being a more minimalist framework, requires a bit more manual setup, leveraging Pydantic and dependencies.
# main.py
from typing import List, Optional
from fastapi import FastAPI, Depends, Query
from pydantic import BaseModel
from sqlalchemy import create_engine, Column, Integer, String
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import sessionmaker, Session

# Database setup (simplified for example)
DATABASE_URL = "sqlite:///./test.db"
engine = create_engine(DATABASE_URL)
SessionLocal = sessionmaker(autocommit=False, autoflush=False, bind=engine)
Base = declarative_base()

class ProductModel(Base):
    __tablename__ = "products"
    id = Column(Integer, primary_key=True, index=True)
    name = Column(String, index=True)
    description = Column(String)

Base.metadata.create_all(bind=engine)

class ProductCreate(BaseModel):
    name: str
    description: str

class Product(ProductCreate):
    id: int

    class Config:
        orm_mode = True

app = FastAPI()

# Dependency to get DB session
def get_db():
    db = SessionLocal()
    try:
        yield db
    finally:
        db.close()

# Limit/Offset pagination dependency
class LimitOffsetParams:
    def __init__(
        self,
        limit: int = Query(10, ge=1, le=100),
        offset: int = Query(0, ge=0),
    ):
        self.limit = limit
        self.offset = offset

@app.post("/products/", response_model=Product)
def create_product(product: ProductCreate, db: Session = Depends(get_db)):
    db_product = ProductModel(**product.dict())
    db.add(db_product)
    db.commit()
    db.refresh(db_product)
    return db_product

@app.get("/products/", response_model=List[Product])
def get_products(
    pagination: LimitOffsetParams = Depends(),
    db: Session = Depends(get_db),
):
    products = (
        db.query(ProductModel)
        .offset(pagination.offset)
        .limit(pagination.limit)
        .all()
    )
    return products
In this FastAPI example, LimitOffsetParams serves as a dependency to inject the limit and offset parameters directly into the route function. The SQL query then uses .offset() and .limit() to retrieve the data.
Cursor-based Pagination: Ensuring Stability and Performance
Cursor-based pagination (also known as keyset pagination) addresses the stability and performance issues of Limit/Offset, particularly with large datasets. Instead of using a numeric offset, it uses a pointer (cursor) to the "last item seen" to fetch the next set of results.
How it works:
The client receives a cursor value (often an encoded identifier like an ID or a timestamp) along with the paginated data. To get the next page, the client sends this cursor back to the server, which then fetches items after that cursor value. This relies heavily on consistently sorted data. For example, to get items after ID X, the query would be WHERE id > X ORDER BY id LIMIT N.
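The `WHERE id > X ORDER BY id LIMIT N` pattern can be sketched with a standalone SQLite example, using the last id of each page as the cursor:

```python
import sqlite3

# Sketch of a keyset query: fetch the page after the last-seen id.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO products (name) VALUES (?)",
                 [(f"p{i}",) for i in range(1, 11)])

def page_after(last_id, limit=3):
    return [r[0] for r in conn.execute(
        "SELECT id FROM products WHERE id > ? ORDER BY id LIMIT ?",
        (last_id, limit))]

first = page_after(0)        # [1, 2, 3]
cursor = first[-1]           # remember the last item seen
print(page_after(cursor))    # [4, 5, 6]
```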
Advantages:
- Stability: Items added or removed while paginating do not affect which items are included in subsequent pages, as long as the sorting order remains consistent. This prevents skipping or duplicating records.
- Performance: Databases can efficiently use indexes on the sorted column (e.g., id or timestamp) to quickly locate the starting point, avoiding the slow scan associated with large offsets. This scales much better for very large datasets.
- Scalability: Better suited for infinitely scrolling feeds or timelines where users typically only move forward or backward one page at a time.
Disadvantages:
- No Direct Page Access: Clients cannot "jump" to an arbitrary page (e.g., page 5) as there's no numerical page concept. They can only move relative to the current cursor.
- Requires Stable Sort Key: Relies on having a unique, immutable, and sequentially sortable column (like a primary key or a timestamp) to serve as the cursor.
- Backward Pagination Complexity: Implementing backward pagination (e.g., "previous page") can be more complex, requiring additional logic to reverse the sorting and filter conditions.
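The backward case can be sketched by flipping both the comparison and the sort order, then reversing the fetched rows back into display order (a standalone SQLite illustration):

```python
import sqlite3

# Backward keyset pagination: flip the comparison and the sort order,
# then reverse the fetched rows back to ascending display order.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (id INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO products (id) VALUES (?)",
                 [(i,) for i in range(1, 11)])

def page_before(first_id, limit=3):
    rows = [r[0] for r in conn.execute(
        "SELECT id FROM products WHERE id < ? ORDER BY id DESC LIMIT ?",
        (first_id, limit))]
    return list(reversed(rows))  # restore ascending display order

print(page_before(7))  # [4, 5, 6]
```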
Implementing Cursor-based Pagination in DRF
DRF offers CursorPagination which smartly handles the encoding/decoding of cursor values.
# project/settings.py
# If you want to use it as the default:
# REST_FRAMEWORK = {
#     'DEFAULT_PAGINATION_CLASS': 'rest_framework.pagination.CursorPagination',
#     'PAGE_SIZE': 10,
# }

# app/views.py
from rest_framework import generics
from rest_framework.pagination import CursorPagination
from .models import Product
from .serializers import ProductSerializer

# Custom CursorPagination for specific ordering
class ProductCursorPagination(CursorPagination):
    page_size = 10
    ordering = 'created_at'  # Or 'id', 'name', etc. Must be consistently sorted
    # cursor_query_param = 'cursor'  # Default, can be changed
    # page_size_query_param = 'page_size'  # None by default; set to let clients choose a page size

class ProductListView(generics.ListAPIView):
    queryset = Product.objects.all().order_by('created_at', 'id')  # Crucial for stability
    serializer_class = ProductSerializer
    pagination_class = ProductCursorPagination
The ordering attribute in ProductCursorPagination is critical. It defines the column(s) used for the cursor and the required sort order. It's often good practice to include a secondary unique field like id in the ordering to handle cases where the primary sort field (e.g., created_at) might not be unique.
Requests would look like /products/?cursor=AbcD... for the next page, where AbcD... is the opaque cursor string provided in the previous response.
Implementing Cursor-based Pagination in FastAPI
Implementing cursor-based pagination in FastAPI involves a custom dependency and careful handling of the query logic.
# main.py (building on the previous FastAPI example)
import base64
from datetime import datetime
from typing import List, Optional, Tuple

from fastapi import FastAPI, Depends, Query, HTTPException
from pydantic import BaseModel
from sqlalchemy import create_engine, Column, Integer, String, DateTime
from sqlalchemy.sql import func
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import sessionmaker, Session

# (Database setup, get_db, and ProductCreate are the same as before)

class ProductModel(Base):
    __tablename__ = "products"
    id = Column(Integer, primary_key=True, index=True)
    name = Column(String, index=True)
    description = Column(String)
    created_at = Column(DateTime, default=func.now())  # Added for cursor pagination

Base.metadata.create_all(bind=engine)

class Product(BaseModel):
    id: int
    name: str
    description: str
    created_at: datetime  # Include created_at in the response

    class Config:
        orm_mode = True

# Response envelope: the page of items plus the cursor for the next page
class CursorPage(BaseModel):
    products: List[Product]
    next_cursor: Optional[str] = None

app = FastAPI()

class CursorParams:
    def __init__(
        self,
        limit: int = Query(10, ge=1, le=100),
        after_cursor: Optional[str] = Query(None, description="Cursor for the next page"),
    ):
        self.limit = limit
        self.after_cursor = after_cursor

def decode_cursor(encoded_cursor: str) -> Tuple[datetime, int]:
    try:
        decoded_string = base64.b64decode(encoded_cursor).decode('utf-8')
        # ISO timestamps contain colons, so split from the right
        timestamp_str, item_id_str = decoded_string.rsplit(":", 1)
        return datetime.fromisoformat(timestamp_str), int(item_id_str)
    except (ValueError, TypeError) as e:
        raise HTTPException(status_code=400, detail=f"Invalid cursor format: {e}")

def encode_cursor(created_at: datetime, item_id: int) -> str:
    cursor_string = f"{created_at.isoformat()}:{item_id}"
    return base64.b64encode(cursor_string.encode('utf-8')).decode('utf-8')

@app.post("/products/", response_model=Product)
def create_product(product: ProductCreate, db: Session = Depends(get_db)):
    db_product = ProductModel(**product.dict())
    db.add(db_product)
    db.commit()
    db.refresh(db_product)
    return db_product

@app.get("/products_cursor/", response_model=CursorPage)
def get_products_cursor(
    pagination: CursorParams = Depends(),
    db: Session = Depends(get_db),
):
    query = db.query(ProductModel)
    if pagination.after_cursor:
        last_created_at, last_id = decode_cursor(pagination.after_cursor)
        # Handle ties in created_at: if created_at is the same, use id to break ties
        query = query.filter(
            (ProductModel.created_at > last_created_at)
            | ((ProductModel.created_at == last_created_at) & (ProductModel.id > last_id))
        )
    # Fetch one extra row to determine whether there is a next page
    products = (
        query.order_by(ProductModel.created_at, ProductModel.id)
        .limit(pagination.limit + 1)
        .all()
    )
    has_next_page = len(products) > pagination.limit
    if has_next_page:
        products_to_return = products[:pagination.limit]
        last_product = products_to_return[-1]
        next_cursor = encode_cursor(last_product.created_at, last_product.id)
    else:
        products_to_return = products
        next_cursor = None
    # Return the data together with the next_cursor
    return {"products": products_to_return, "next_cursor": next_cursor}
In this FastAPI example, CursorParams injects the limit and after_cursor into the route. The decode_cursor and encode_cursor functions manage the opaque cursor value. The database query filters for items "after" the decoded cursor values, ordered by created_at and id so that pagination remains consistent and stable even when created_at values are identical. Fetching limit + 1 items makes it easy to determine whether a next_cursor should be provided.
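The cursor encoding scheme can be exercised on its own. One subtlety worth noting: ISO 8601 timestamps contain colons, so the decoder must split from the right rather than on the first colon. A standalone round-trip sketch:

```python
import base64
from datetime import datetime

# Standalone sketch of the cursor round trip: the "opaque" cursor is just
# "isoformat:id", base64-encoded. ISO timestamps contain colons, so the
# decoder splits from the right.
def encode_cursor(created_at: datetime, item_id: int) -> str:
    raw = f"{created_at.isoformat()}:{item_id}"
    return base64.b64encode(raw.encode("utf-8")).decode("utf-8")

def decode_cursor(cursor: str):
    raw = base64.b64decode(cursor).decode("utf-8")
    timestamp_str, _, item_id_str = raw.rpartition(":")
    return datetime.fromisoformat(timestamp_str), int(item_id_str)

ts = datetime(2024, 1, 1, 12, 0, 0)
cursor = encode_cursor(ts, 42)
print(cursor)
print(decode_cursor(cursor))  # round-trips back to the original pair
```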
Choosing the Right Strategy
The choice between Limit/Offset and Cursor-based pagination depends heavily on your application's requirements:
- Use Limit/Offset when:
- Dataset size is relatively small to medium.
- Clients need to jump to arbitrary pages (e.g., displaying "Page 1 of 10").
- Data updates are infrequent or consistency across pagination isn't a critical concern.
- Simplicity of implementation is prioritized.
- Use Cursor-based pagination when:
- Working with very large, frequently updated, or fast-growing datasets.
- Stability and consistent results across pagination are crucial (e.g., social media feeds, event logs).
- Performance at scale is a primary concern.
- Clients primarily navigate forward or backward one step at a time (e.g., "load more" functionality).
Conclusion: Tailoring Pagination to Your API's Needs
Effective pagination is a cornerstone of well-designed APIs dealing with significant data volumes. Limit/Offset pagination offers simplicity and direct page access but can suffer from performance and stability issues at scale. Cursor-based pagination, while slightly more complex to implement, provides superior performance and stability for large, dynamic datasets by relying on a consistent sort order and a "last seen" pointer. By carefully evaluating the characteristics of your data and the navigation patterns of your clients, you can select the most appropriate pagination strategy, guaranteeing a performant and reliable API experience. The key lies in understanding the trade-offs and aligning the chosen method with your specific application's demands for efficiency and data integrity.

