Does Using Slots Actually Boost Pydantic and ORM Performance? A Benchmark Study

Introduction

Optimizing memory usage and execution speed is a perennial pursuit in Python, especially in data-intensive applications. Two common sources of large object counts are Pydantic, widely used for data validation and parsing, and Object-Relational Mappers (ORMs), which abstract database interactions. __slots__ is a frequently cited optimization technique for exactly this situation. But does applying __slots__ to Pydantic models and ORM objects genuinely deliver the promised memory and performance benefits? Answering that requires understanding Python's internal mechanisms and measuring rather than assuming. In this article, we'll go through the specifics, run a benchmark, and summarize what it shows.

Unpacking Slots, Pydantic, and ORMs

Before diving into the benchmarks, let's clarify the key concepts at play.

What are `__slots__`?

In Python, instances of a class typically store their attributes in a dictionary called __dict__. This dictionary provides immense flexibility, allowing attributes to be added or removed dynamically at runtime. However, this flexibility comes at a cost: each instance carries the overhead of this dictionary, consuming more memory.

The __slots__ attribute, when defined in a class, tells Python not to create an instance __dict__ for objects of that class. Instead, it pre-allocates a fixed amount of space for a predefined set of attributes. This trade-off—losing dynamic attribute assignment in exchange for reduced memory footprint and potentially faster attribute access—is the core promise of __slots__.

import sys

# Example of a class without __slots__
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

p = Point(1, 2)
print(p.__dict__)  # {'x': 1, 'y': 2}
# sys.getsizeof is shallow, so the instance __dict__ is counted separately
print(f"Memory size without slots: {sys.getsizeof(p) + sys.getsizeof(p.__dict__)} bytes")


# Example of a class with __slots__
class SlottedPoint:
    __slots__ = ('x', 'y')

    def __init__(self, x, y):
        self.x = x
        self.y = y

sp = SlottedPoint(1, 2)
print(hasattr(sp, '__dict__'))  # False
print(f"Memory size with slots: {sys.getsizeof(sp)} bytes")
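
The trade-off described above can be seen directly. The following is a minimal sketch that repeats the SlottedPoint class so it runs on its own; the attribute name z is only an illustration of an undeclared attribute.

```python
class SlottedPoint:
    __slots__ = ("x", "y")

    def __init__(self, x, y):
        self.x = x
        self.y = y


sp = SlottedPoint(1, 2)
sp.x = 10  # attributes declared in __slots__ stay writable

try:
    sp.z = 3  # "z" is not declared in __slots__, so there is nowhere to store it
except AttributeError as exc:
    rejected = True
    print(f"rejected: {exc}")
else:
    rejected = False

print(hasattr(sp, "__dict__"))
```

Note that the slotted instance is still mutable: only the set of attribute names is fixed, not their values.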

Pydantic Models

Pydantic is a data validation and parsing library that uses Python type hints to define data schemas. It's known for its validation, serialization, and deserialization capabilities. Pydantic models are regular Python classes that store their validated field values in the instance __dict__, which lets them integrate with other Python features and behave like a flexible data structure.

from pydantic import BaseModel

class User(BaseModel):
    id: int
    name: str
    email: str

user = User(id=1, name="Alice", email="alice@example.com")
# print(user.__dict__)

ORM Objects

Object-Relational Mappers (ORMs) provide an object-oriented way to interact with databases. Libraries like SQLAlchemy, Django ORM, and Peewee map database tables to Python classes and rows to Python objects. These ORM objects often carry a significant amount of metadata, including their attributes, relationships, and database session information, typically stored in their instance __dict__ or a similar internal structure.

The Hypothesis: How `__slots__` Might Help

The theory is that by applying __slots__ to Pydantic models or ORM objects, we could:

Reduce Memory Consumption: Each instance would require less memory because the __dict__ overhead is eliminated. This is particularly relevant when dealing with millions of objects.
Improve Attribute Access Speed: Accessing attributes directly from fixed slots might be faster than dictionary lookups.

The Benchmarking Setup

To test our hypothesis, we will benchmark:

Memory Usage: How much memory is consumed by a large number of instances.
Object Instantiation Time: How long it takes to create a large number of objects.
Attribute Access Time: How long it takes to read attributes from objects.

We will compare:

Standard Pydantic Models vs. Pydantic Models with __slots__
Standard ORM-like Objects vs. ORM-like Objects with __slots__ (Simulating ORM objects since a full ORM setup adds too many variables not directly related to __slots__ itself.)

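We'll use the sys.getsizeof function for a rough estimate of object size and the timeit module for performance measurements. Keep in mind that sys.getsizeof is shallow: it reports the size of the instance itself and does not include the instance __dict__ or the attribute values the instance refers to, so the __dict__ has to be counted separately for a fair comparison.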
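
As a cross-check on sys.getsizeof, allocations can also be measured with the standard library's tracemalloc module. The sketch below is one possible way to do that, not part of the benchmark itself; the class names PlainRecord and SlottedRecord and the helper traced_bytes_per_instance are hypothetical, and the figures include the list and the integer values, which are the same for both variants.

```python
import tracemalloc


class PlainRecord:
    def __init__(self, n):
        self.a = n
        self.b = n
        self.c = n


class SlottedRecord:
    __slots__ = ("a", "b", "c")

    def __init__(self, n):
        self.a = n
        self.b = n
        self.c = n


def traced_bytes_per_instance(cls, count=20_000):
    """Average bytes allocated per instance, as seen by tracemalloc."""
    tracemalloc.start()
    before, _ = tracemalloc.get_traced_memory()
    instances = [cls(i) for i in range(count)]  # kept alive until measured
    after, _ = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return (after - before) / len(instances)


plain_bytes = traced_bytes_per_instance(PlainRecord)
slotted_bytes = traced_bytes_per_instance(SlottedRecord)
print(f"plain:   {plain_bytes:.1f} bytes/instance")
print(f"slotted: {slotted_bytes:.1f} bytes/instance")
```

Because tracemalloc tracks what the interpreter actually allocates, it captures per-instance dictionary storage without having to inspect __dict__ by hand.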

import sys
import timeit
from pydantic import BaseModel
from pydantic.dataclasses import dataclass as pydantic_dataclass

# 1. Pydantic model without slots (field values live in the instance __dict__)
class UserNoSlots(BaseModel):
    id: int
    name: str
    email: str

# 2. Pydantic dataclass with slots (requires Python 3.10+)
# A BaseModel cannot be slotted: BaseModel lists '__dict__' in its own
# __slots__, and ConfigDict has no `slots` option, so
# `model_config = ConfigDict(slots=True)` would not change the memory layout.
# Pydantic dataclasses accept slots=True, so that variant is used here.
@pydantic_dataclass(slots=True)
class UserWithSlots:
    id: int
    name: str
    email: str



# 3. Simple ORM-like object without slots
class ProductNoSlots:
    def __init__(self, item_id: int, name: str, price: float):
        self.item_id = item_id
        self.name = name
        self.price = price

# 4. Simple ORM-like object with slots
class ProductWithSlots:
    __slots__ = ('item_id', 'name', 'price')

    def __init__(self, item_id: int, name: str, price: float):
        self.item_id = item_id
        self.name = name
        self.price = price


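# Helper for memory comparison (approximate; attribute values are not counted)
def get_total_memory_usage(objects):
    total = 0
    for obj in objects:
        total += sys.getsizeof(obj)
        # sys.getsizeof is shallow, so add the instance __dict__ when one exists
        instance_dict = getattr(obj, "__dict__", None)
        if instance_dict is not None:
            total += sys.getsizeof(instance_dict)
    return total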

NUM_OBJECTS = 100000

print("--- Pydantic Benchmarks ---")

# Instantiation and Memory - Pydantic
print("\n[Pydantic Instantiation and Memory]")
setup_pydantic_noslots = f"""
from __main__ import UserNoSlots
objects = [UserNoSlots(id=i, name=f"User {{i}}", email=f"user{{i}}@example.com") for i in range({NUM_OBJECTS})]
"""
time_noslots = timeit.timeit(setup_pydantic_noslots, number=1)
print(f"UserNoSlots instantiation ({NUM_OBJECTS} objects): {time_noslots:.4f} seconds")

setup_pydantic_withslots = f"""
from __main__ import UserWithSlots
objects = [UserWithSlots(id=i, name=f"User {{i}}", email=f"user{{i}}@example.com") for i in range({NUM_OBJECTS})]
"""
time_withslots = timeit.timeit(setup_pydantic_withslots, number=1)
print(f"UserWithSlots instantiation ({NUM_OBJECTS} objects): {time_withslots:.4f} seconds")

# Memory (requires creating objects outside timeit to measure)
user_noslots_list = [UserNoSlots(id=i, name=f"User {i}", email=f"user{i}@example.com") for i in range(NUM_OBJECTS)]
user_withslots_list = [UserWithSlots(id=i, name=f"User {i}", email=f"user{i}@example.com") for i in range(NUM_OBJECTS)]

print(f"UserNoSlots total memory ({NUM_OBJECTS} objects): {get_total_memory_usage(user_noslots_list) / (1024*1024):.2f} MB")
print(f"UserWithSlots total memory ({NUM_OBJECTS} objects): {get_total_memory_usage(user_withslots_list) / (1024*1024):.2f} MB")

# Attribute Access - Pydantic
print("\n[Pydantic Attribute Access]")
access_pydantic_noslots = f"""
for user in user_noslots_list:
    _ = user.id
    _ = user.name
    _ = user.email
"""
time_access_noslots = timeit.timeit(access_pydantic_noslots, globals=globals(), number=10)
print(f"UserNoSlots attribute access ({NUM_OBJECTS*3*10} accesses): {time_access_noslots:.4f} seconds")

access_pydantic_withslots = f"""
for user in user_withslots_list:
    _ = user.id
    _ = user.name
    _ = user.email
"""
time_access_withslots = timeit.timeit(access_pydantic_withslots, globals=globals(), number=10)
print(f"UserWithSlots attribute access ({NUM_OBJECTS*3*10} accesses): {time_access_withslots:.4f} seconds")


print("\n--- ORM-like Object Benchmarks ---")

# Instantiation and Memory - ORM-like
print("\n[ORM-like Instantiation and Memory]")
setup_orm_noslots = f"""
from __main__ import ProductNoSlots
objects = [ProductNoSlots(item_id=i, name=f"Product {{i}}", price=float(i)/100) for i in range({NUM_OBJECTS})]
"""
time_orm_noslots = timeit.timeit(setup_orm_noslots, number=1)
print(f"ProductNoSlots instantiation ({NUM_OBJECTS} objects): {time_orm_noslots:.4f} seconds")

setup_orm_withslots = f"""
from __main__ import ProductWithSlots
objects = [ProductWithSlots(item_id=i, name=f"Product {{i}}", price=float(i)/100) for i in range({NUM_OBJECTS})]
"""
time_orm_withslots = timeit.timeit(setup_orm_withslots, number=1)
print(f"ProductWithSlots instantiation ({NUM_OBJECTS} objects): {time_orm_withslots:.4f} seconds")

# Memory (requires creating objects outside timeit to measure)
product_noslots_list = [ProductNoSlots(item_id=i, name=f"Product {i}", price=float(i)/100) for i in range(NUM_OBJECTS)]
product_withslots_list = [ProductWithSlots(item_id=i, name=f"Product {i}", price=float(i)/100) for i in range(NUM_OBJECTS)]

print(f"ProductNoSlots total memory ({NUM_OBJECTS} objects): {get_total_memory_usage(product_noslots_list) / (1024*1024):.2f} MB")
print(f"ProductWithSlots total memory ({NUM_OBJECTS} objects): {get_total_memory_usage(product_withslots_list) / (1024*1024):.2f} MB")

# Attribute Access - ORM-like
print("\n[ORM-like Attribute Access]")
access_orm_noslots = f"""
for product in product_noslots_list:
    _ = product.item_id
    _ = product.name
    _ = product.price
"""
time_access_orm_noslots = timeit.timeit(access_orm_noslots, globals=globals(), number=10)
print(f"ProductNoSlots attribute access ({NUM_OBJECTS*3*10} accesses): {time_access_orm_noslots:.4f} seconds")

access_orm_withslots = f"""
for product in product_withslots_list:
    _ = product.item_id
    _ = product.name
    _ = product.price
"""
time_access_orm_withslots = timeit.timeit(access_orm_withslots, globals=globals(), number=10)
print(f"ProductWithSlots attribute access ({NUM_OBJECTS*3*10} accesses): {time_access_orm_withslots:.4f} seconds")
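
Single timeit runs like the ones above can be noisy. A common way to steady them is to repeat each measurement and keep the minimum. The sketch below assumes the same two ORM-like classes, redefined here so it runs on its own; the helper best_time and the sample values are hypothetical.

```python
import timeit


class ProductNoSlots:
    def __init__(self, item_id, name, price):
        self.item_id = item_id
        self.name = name
        self.price = price


class ProductWithSlots:
    __slots__ = ("item_id", "name", "price")

    def __init__(self, item_id, name, price):
        self.item_id = item_id
        self.name = name
        self.price = price


def best_time(cls, repeat=5, number=20_000):
    """Best of `repeat` runs, each creating `number` instances of `cls`."""
    timer = timeit.Timer(lambda: cls(1, "Widget", 9.99))
    return min(timer.repeat(repeat=repeat, number=number))


results = {cls.__name__: best_time(cls) for cls in (ProductNoSlots, ProductWithSlots)}
for name, seconds in results.items():
    print(f"{name}: {seconds:.4f} seconds")
```

Taking the minimum rather than the mean filters out runs that were slowed down by unrelated system activity.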

Analysis of Benchmark Results

(Note: The exact numbers will vary based on hardware and Python version, but the trends should be consistent.)

Pydantic Models:

Memory Usage: A Pydantic v2 BaseModel always keeps an instance __dict__. BaseModel lists __dict__ in its own __slots__ (next to __pydantic_fields_set__, __pydantic_extra__, and __pydantic_private__) and stores field values there, and ConfigDict has no slots option, so passing slots=True through model_config does not change the memory layout. Slots are available through Pydantic dataclasses instead: pydantic.dataclasses.dataclass(slots=True), on Python 3.10+, builds a slotted class, which can reduce per-object overhead for simple schemas.
Instantiation Time: Slotted and unslotted variants usually instantiate in about the same time, because Pydantic's validation and parsing logic dominates the cost of creating an instance. The __slots__ layout is set up once, at class creation, so it adds no per-instance cost.
Attribute Access Time: Attribute access is typically about the same, with at most a negligible edge for the slotted variant.

The key takeaway for Pydantic is that __slots__ is not something you bolt onto a BaseModel subclass. If per-object memory matters, a slotted Pydantic dataclass is the supported route, and it is worth measuring on your own schema before committing to it.

ORM-like Objects:

Memory Usage: For simple, plain Python objects (which our ORM-like objects simulate), __slots__ provides significant memory savings. The __dict__ overhead for each instance is completely removed, directly translating to a smaller memory footprint, especially when NUM_OBJECTS is large.
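Instantiation Time: Creating objects with __slots__ can be slightly faster because the interpreter doesn't need to set up per-instance dictionary storage. The gap depends on the Python version and is often small.
Attribute Access Time: Accessing attributes on slotted objects can be slightly faster. Instead of a dictionary lookup, Python reads the value from a fixed position in the instance through a descriptor on the class. Recent CPython versions also optimize dictionary-backed attribute access heavily, so the difference may be within measurement noise.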
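
To make the attribute access mechanism concrete, the sketch below inspects what __slots__ puts on the class. It repeats ProductWithSlots so it runs on its own, and the type name it prints is specific to CPython.

```python
class ProductWithSlots:
    __slots__ = ("item_id", "name", "price")

    def __init__(self, item_id, name, price):
        self.item_id = item_id
        self.name = name
        self.price = price


# Each slot name becomes a descriptor stored on the class itself.
descriptor = ProductWithSlots.__dict__["name"]
descriptor_kind = type(descriptor).__name__
print(descriptor_kind)

product = ProductWithSlots(1, "Widget", 9.99)
# product.name goes through that descriptor, which reads from the instance.
via_descriptor = descriptor.__get__(product, ProductWithSlots)
print(via_descriptor == product.name)
```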

Important Considerations:

No Dynamic Attributes: With __slots__, you cannot add attributes that are not declared in __slots__ to an instance after creation. Declared attributes remain writable, so this is not immutability, but the restriction is a core trade-off that might not be suitable for all use cases, especially with ORMs that sometimes add proxy attributes or lazy-loaded relationships.
Inheritance: __slots__ only removes the instance __dict__ if every class in the inheritance chain defines __slots__. A subclass that omits __slots__ gets a __dict__ again, so each subclass must declare its own __slots__ (an empty tuple is enough when it adds no attributes). Multiple inheritance from several bases with non-empty __slots__ raises a TypeError.
Pydantic's Internal Workings: Pydantic models are more complex than simple Python objects. They carry internal state such as __pydantic_fields_set__, __pydantic_extra__, and __pydantic_private__, and BaseModel relies on the instance __dict__ to hold field values. That is why slots are offered on Pydantic dataclasses (dataclass(slots=True)) rather than as a BaseModel setting, and why declaring __slots__ by hand on a model does not remove its __dict__.
ORM Complexity: Real-world ORM objects (like SQLAlchemy models) are highly dynamic and manage their state in intricate ways, often using descriptor protocols, proxy objects, and lazy loading. Directly applying __slots__ to an ORM class might break its internal mechanisms or lead to unexpected behavior. ORM designers rarely expose __slots__ as a configurable option for this very reason. The __slots__ benefit might be completely negated or even detrimental due to the ORM's need for dynamic attribute management.
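
The inheritance point above is easy to get wrong, so here is a minimal sketch of it. The class names Base, LeakyChild, and TightChild are hypothetical and exist only for this illustration.

```python
class Base:
    __slots__ = ("x",)


class LeakyChild(Base):
    pass  # no __slots__ here, so instances get a __dict__ again


class TightChild(Base):
    __slots__ = ("y",)  # declares only the attributes it adds


leaky = LeakyChild()
leaky.anything = 1  # accepted: stored in the regained __dict__

tight = TightChild()
tight.x, tight.y = 1, 2

try:
    tight.z = 3  # neither Base nor TightChild declares "z"
except AttributeError:
    tight_rejects_new = True
else:
    tight_rejects_new = False

print(hasattr(leaky, "__dict__"), hasattr(tight, "__dict__"), tight_rejects_new)
```

A subclass that adds no attributes of its own can use an empty tuple for __slots__ to keep the memory layout of its parent.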

Conclusion

For plain Python objects or custom data structures, employing __slots__ can yield substantial memory savings and modest improvements in attribute access and instantiation, making it a valuable optimization for large collections of simple objects with a fixed set of attributes. For Pydantic, a BaseModel keeps its instance __dict__ and ConfigDict has no slots option, so the practical route to slots is a Pydantic dataclass declared with slots=True, which can reduce memory without significant performance regressions in memory-constrained applications. Applying __slots__ directly to ORM objects is generally not recommended because of the complex internal state management and dynamic nature of ORMs, where the benefits are unlikely to outweigh the potential for breakage.
