The Hidden Challenges of Eloquent at Scale

 What They Don't Tell You About Scaling Eloquent for Millions of Rows

The Hidden Challenges of Eloquent at Scale

Laravel developers love Eloquent. And for good reason. As it makes database interactions expressive, readable, and incredibly productive. You can build an entire business application without writing a single raw SQL query.

The problem?

Most Eloquent examples are demonstrated against tables containing a few hundred or a few thousand records.

Everything looks fast. Everything feels elegant. Then one day your application reaches:

  • 5 million users
  • 20 million orders
  • 100 million activity logs

Suddenly endpoints that once responded in 50ms start taking multiple seconds. Database CPU usage spikes. Queue workers consume gigabytes of memory. A simple dashboard query brings production to its knees.

I've seen this happen repeatedly.

The uncomfortable truth is: Eloquent scales remarkably well... until it doesn't.

The goal isn't to avoid Eloquent. The goal is to understand where its abstractions starts becoming expensive and knowing when to switch strategies.

🔥The Real-World Bottleneck

Let's start with a common example. Imagine an e-commerce platform. You need all users who have placed an order in the last 30 days.

Most Laravel developers write:

$users = User::whereHas('orders', function ($query) {
    $query->where(
        'created_at',
        '>=',
        now()->subDays(30)
    );
})->get();

Looks clean. Looks Laravel-ish. Works perfectly.

Until:

users:      5 million rows
orders:    50 million rows

Now this simple query becomes one of the most expensive operations in your application.
The framework isn't the problem, the generated SQL is.

📊 Understanding What Eloquent Actually Generates

That query roughly becomes:

SELECT *
FROM users
WHERE EXISTS (
    SELECT 1
    FROM orders
    WHERE orders.user_id = users.id
    AND created_at >= ?
)


As the data scales, this forces the database optimizer to work much harder.

In production, I've seen nested EXISTS queries become the primary source of slow queries once datasets exceed tens of millions of rows.

A common belief here is that assuming Eloquent abstractions are always free. But actually they're not.

Every relationship abstraction eventually translates into SQL. The database executes SQL—not Eloquent.

🎯Indexes Matter More Than Eloquent

Many teams spend weeks optimizing PHP code while ignoring the database. Meanwhile, a missing index is causing a 3-second query.

Consider:

Order::where('status', 'completed')
    ->where('created_at', '>=', now()->subMonth())
    ->get();

Without indexes it does a FULL TABLE SCAN.
With proper indexing it does INDEX SEEK.

Huge difference.

Bad Index Strategy:

INDEX(status)
INDEX(created_at)

Looks reasonable. Often isn't. Better Composite Index:

INDEX(status, created_at)

Because your query filters both columns together. The database can efficiently locate matching rows without scanning millions of records. In production, always index according to actual query patterns, not individual columns.

⚠️ Why whereHas Can Become Dangerous

Developers often chain multiple relationships as below

User::whereHas('orders', function ($query) {
    $query->where('status', 'completed');
})
->whereHas('subscriptions')
->whereHas('payments')
->get();

This reads beautifully but the generated SQL often does not as much as optimized. Multiple nested existence checks can become extremely expensive.

Alternative Approach

Sometimes a direct join performs significantly better:

User::query()
    ->join(
        'orders',
        'orders.user_id',
        '=',
        'users.id'
    )
    ->where('orders.status', 'completed')
    ->select('users.*')
    ->distinct()
    ->get();

Now this might not seem much elegant but its much faster now. This is one of those situations where understanding SQL matters more than framework knowledge.

🧠 The Hidden Cost of Eloquent Models

Developers often forget that the following simple piece of code:

$users = User::all();

doesn't only just fetch rows.

It creates:

  • Model objects
  • Attribute arrays
  • Relationship containers
  • Event capabilities
  • Casting infrastructure
For 10 users? the query time is negligible.
For 500,000 users? it becomes very expensive.

Query Builder vs Eloquent vs Raw SQL

One of the biggest mistakes I see, is treating Eloquent as the solution for every database interaction/operations but actually it isn't. Each tool has a place. 

Use Eloquent When:

  • You need relationships
  • You need model events
  • You need mutators and casts
  • You need rich domain behavior
  • Standard CRUD operations
$orders = Order::with('items')
    ->latest()
    ->paginate();


Use Query Builder When:

  • Reading large datasets
  • Reports and exports
  • Background jobs
  • Aggregations
$users = DB::table('users')
    ->select('id', 'email')
    ->get();

When we avoid model hydration completely. Memory usage drops significantly.

Use Raw SQL When:

  • Complex analytics
  • Window functions
  • Advanced aggregations
  • Highly optimized reporting
$results = DB::select(
    '
    SELECT
        customer_id,
        SUM(total) as revenue
    FROM orders
    GROUP BY customer_id
    ORDER BY revenue DESC
    LIMIT 100
    '
);

Senior engineers aren't measured by how little SQL they write. They're measured by how effectively they solve performance problems.

🚨The "Load Everything Into Memory" Disaster

I've seen this in production countless times:

User::all()->each(function ($user) {

    $this->sendPromotion($user);

});

Looks harmless. Until:

users table = 8 million rows

Then the worker dies because of memory exhaustion.

The Correct Approach: chunkById()

User::query()
    ->chunkById(
        1000,
        function ($users) {

            foreach ($users as $user) {

                $this->sendPromotion($user);

            }
        }
    );


Benefits:

✅ Low memory usage
✅ Stable execution
✅ Works on huge tables
✅ Handles millions of records

Why chunk() Can Still Bite You

Many developers use:

User::chunk(1000, function ($users) {
   // Logic
});

The issue here is if records are inserted or deleted while processing, results may shift between chunks.
You can skip records or You can process duplicates.

Instead:

User::chunkById(1000, function ($users) {
  // Lgic
});

This uses the primary key as a stable cursor and becomes much safer.

Lazy Collections: The Forgotten Performance Weapon

Laravel's Lazy Collections are rarely being used by developer while its very useful while working with heavy datasets.

Instead of:

$users = User::all();
foreach ($users as $user) {
   // Logic to process records
}


Use:

User::lazy()->each(function ($user) {
    // Logic to process records
});

Records are streamed from the database. Memory remains almost constant regardless of dataset size and for background processing jobs, this can be a game changer.

⚡N+1 Queries Become Catastrophic at Scale

The classic example:

$users = User::all();
foreach ($users as $user) {
    echo $user->orders->count();
}

With: 1000 users, you may execute: 1001 queries

With: 100,000 users, things become much worse.

Better:

$users = User::with('orders')->get();

Even Better:

If you only need counts:

$users = User::withCount('orders')->get();

Now the database does the aggregation. Much more efficient.

🏗️ Database Views: The Performance Tool Laravel Developers Forget Exists

One thing I rarely see discussed in Laravel performance conversations is Database Views. As applications keeps on getting complex, reporting queries often become monsters.

SELECT
    users.id,
    users.name,
    SUM(orders.total) as revenue,
    COUNT(orders.id) as orders_count,
    MAX(payments.created_at) as last_payment
FROM users
JOIN orders ON orders.user_id = users.id
JOIN payments ON payments.order_id = orders.id
GROUP BY users.id;

The query works. The problem is that it eventually gets duplicated across:
  • Admin dashboards
  • Revenue reports
  • Scheduled exports
  • Analytics endpoints
  • Internal APIs
And soon, your application contains multiple versions of the same query.

Using Database Views

Instead:

CREATE VIEW customer_summary AS
SELECT
    users.id,
    users.name,
    SUM(orders.total) as revenue,
    COUNT(orders.id) as orders_count
FROM users
JOIN orders ON orders.user_id = users.id
GROUP BY users.id;


Laravel can treat it like a normal table:

$customers = DB::table('customer_summary')
    ->where('revenue', '>', 10000)
    ->get();


The Important Limitaion

A common misconception is:

Database Views automatically make queries faster. But not necessarily. Standard Views are simply stored SQL queries. The database still executes the underlying query. The real benefit is maintainability and consistency. For actual performance improvements, many teams create precomputed summary tables or CQRS read models.

🚀 Sometimes Denormalization Beats Perfect Database Design

Most developers are taught to normalize everything. That's generally good advice until performance becomes a business problem. The joins becomes costlier.

Consider an ecommerce application.

A dashboard query may need:

  • orders
  • customers
  • addresses
  • countries
  • states
  • currencies
  • products
  • categories

Generating a report could require 7-8 joins ...
for every request. At a few thousand rows there is no issue. At hundreds of millions of rows? it becomes a different story.

Strategic Denormalization

Sometimes storing duplicated data intentionally can be the correct engineering decision.
Instead of repeatedly joining:

$order->customer->country->name

you might store:

customer_country_name

directly on the order.

Example:

Order::create([
    'customer_id' => $customer->id,
    'customer_country_name' => $customer->country->name,
]);

Now reporting queries become dramatically simpler.

Real Production Example. I've seen reporting endpoints drop from:

2.5 seconds to 150 milliseconds

simply by denormalizing frequently queried attributes.

Common candidates include:

  • Customer name
  • Country name
  • Product category name
  • Aggregated counters
  • Revenue totals
  • Lifetime purchase counts

The Trade-Off

Denormalization always introduces complexity.

You gain:

✅ Faster reads
✅ Fewer joins
✅ Simpler reporting queries
✅ Better scalability

But you lose:

❌ Single source of truth
❌ Simpler updates
❌ Perfect consistency

This is why denormalization should be treated as a performance optimization, not a default design choice.

🏗️ When Even Query Builder Isn't Enough: CQRS

Eventually some applications hit another wall.

Example:

  • orders: 100 million rows
  • customers: 20 million rows
  • payments: 500 million rows

Building dashboards from transactional tables becomes increasingly expensive. This is where CQRS starts becoming attractive.

Traditional Approach:

Dashboard Query
      ↓
Operational Tables

Every dashboard request hits production tables and it becomes very expensive.

CQRS Approach:

Write Model
      ↓
Events
      ↓
Read Model
      ↓
Dashboard Queries

Pre-computed reporting tables dramatically reduce query complexity.

Instead of:

JOIN 8 tables
GROUP BY multiple columns


your query:

SELECT *
FROM customer_revenue_summary

which is effectively instant.


⚙️ Edge Case #1: Exporting Millions of Records

The naive approach:

$users = User::all();

It will try to generate CSV and may crash

Instead:

User::lazy()
    ->each(function ($user) use ($file) {
        fputcsv(
            $file,
            [
                $user->id,
                $user->email
            ]
        );
    });

Memory remains stable. Exports become practical.

⚙️ Edge Case #2: Background Jobs Processing Huge Tables

Imagine:

ProcessInactiveUsersJob

scanning millions of users.

Bad:

User::all();


Good:

User::where(
        'last_login_at',
        '<',
        now()->subYear()
    )
    ->chunkById(
        1000,
        function ($users) {
            foreach ($users as $user) {
                // process
            }
        }
    );

The job remains reliable regardless of table growth.

🧠 The Most Important Scaling Lesson

Most Laravel performance discussions focus entirely on Eloquent. That's often the wrong layer. When an application reaches tens or hundreds of millions of rows, performance is usually decided by following factors more than the ORM:
  • Database design
  • Index strategy
  • Query architecture
  • Read/write separation
  • Reporting models
  • Denormalization decisions
Before replacing Eloquent, ask yourself first :
  • Are the indexes correct?
  • Are we overusing whereHas() or whereExist()?
  • Are we loading unnecessary models?
  • Should this query use Query Builder instead?
  • Should this become a View?
  • Should this data be denormalized?
  • Is this actually a CQRS read-model problem?
The biggest scaling wins rarely come from changing frameworks. They come from understanding how the database behaves under real-world load.

⚖️ The Engineering Trade-Offs

Approach Best For Pros Cons
Eloquent Business logic & CRUD Fast development, expressive syntax Hydration overhead on large datasets
Query Builder Large reads, reports, exports Faster execution, lower memory usage Less domain abstraction
Raw SQL Complex analytics and aggregations Maximum control and performance Harder to maintain and test
Database Views Reusable reporting queries Centralized query logic, cleaner code Doesn't automatically improve performance
Denormalized Tables High-volume reporting and dashboards Fewer joins, faster reads Data duplication and consistency challenges
CQRS Massive-scale systems Extremely fast read performance Higher architectural complexity

🏁 The Engineering Reality

The biggest mistake teams make is believing there is a single "correct" database access strategy.
There isn't.

Use the highest abstraction that still meets your performance requirements.

For most business operations:

✅ Eloquent - For large reports and analytics:
✅ Query Builder - For highly optimized aggregations:
✅ Raw SQL - For reusable reporting logic:
✅ Database Views - For read-heavy systems:
✅ Denormalized Tables - For extreme scale and complex reporting:
✅ CQRS

The goal is not to eliminate Eloquent.
The goal is knowing exactly when its convenience becomes a bottleneck.
Because at a few thousand rows, Eloquent feels magical.

At a few hundred million rows, the database reminds you who's really in charge. 🚀


Previous Post Next Post

Contact Form