MongoDB Aggregation Pipeline
The aggregation pipeline is MongoDB's framework for transforming and analyzing data on the server, without pulling documents into application code first.
What is Aggregation?
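An aggregation pipeline is a sequence of stages. Each stage receives the documents produced by the previous stage and passes its own output to the next: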
db.orders.aggregate([
  { $match: { status: 'completed' } },
  { $group: { _id: '$customer', total: { $sum: '$amount' } } },
  { $sort: { total: -1 } }
]);
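To make the stage-by-stage flow concrete, here is a rough imitation of the three stages above in plain JavaScript. It is only a sketch: the `sampleOrders` array is hypothetical sample data, and the code runs in memory rather than against a real collection.

```javascript
// Hypothetical sample data standing in for the orders collection.
const sampleOrders = [
  { customer: 'ann', status: 'completed', amount: 40 },
  { customer: 'bob', status: 'completed', amount: 25 },
  { customer: 'ann', status: 'pending', amount: 99 },
  { customer: 'bob', status: 'completed', amount: 30 },
];

// $match: keep only completed orders.
const matched = sampleOrders.filter((o) => o.status === 'completed');

// $group: one output document per customer, summing amount.
const totals = new Map();
for (const o of matched) {
  totals.set(o.customer, (totals.get(o.customer) ?? 0) + o.amount);
}
const grouped = [...totals].map(([_id, total]) => ({ _id, total }));

// $sort: highest total first.
const result = grouped.sort((a, b) => b.total - a.total);

console.log(result);
```

Each intermediate variable corresponds to what one stage hands to the next, which is the mental model to keep when reading longer pipelines.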
Essential Stages
$match - Filter Documents
{ $match: { status: 'active', age: { $gte: 18 } } }
$project - Shape Output
{
  $project: {
    name: 1,
    fullName: { $concat: ['$firstName', ' ', '$lastName'] },
    yearJoined: { $year: '$createdAt' }
  }
}
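As an illustration of what this stage emits for one document, the following plain JavaScript function mimics it. The function name `projectUser` and the sample document are hypothetical; the sketch assumes `createdAt` is a Date and relies on two documented behaviors: `_id` is included unless explicitly excluded, and `$concat` yields null when any argument is null or missing.

```javascript
// Hypothetical stand-in for the $project stage above, applied to one document.
function projectUser(doc) {
  const hasBothNames = doc.firstName != null && doc.lastName != null;
  return {
    _id: doc._id, // kept by default unless the stage sets _id: 0
    name: doc.name,
    // $concat returns null if any part is null or missing.
    fullName: hasBothNames ? doc.firstName + ' ' + doc.lastName : null,
    // $year reads the date in UTC unless a timezone is given.
    yearJoined: doc.createdAt.getUTCFullYear(),
  };
}

const projected = projectUser({
  _id: 1,
  name: 'ann',
  firstName: 'Ann',
  lastName: 'Lee',
  email: 'ann@example.com', // not listed in the projection, so it is dropped
  createdAt: new Date('2021-03-04T00:00:00Z'),
});

console.log(projected);
```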
$group - Aggregate Data
{
  $group: {
    _id: '$category',
    count: { $sum: 1 },
    avgPrice: { $avg: '$price' },
    maxPrice: { $max: '$price' },
    products: { $push: '$name' }
  }
}
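The accumulators above can be pictured as running totals kept per `_id` value. The sketch below imitates them in plain JavaScript on hypothetical data; `groupByCategory` is an illustrative name, not a MongoDB API, and it assumes every document has a numeric `price`.

```javascript
// Hypothetical in-memory imitation of the $group stage above.
function groupByCategory(products) {
  const groups = new Map();
  for (const p of products) {
    const g = groups.get(p.category) ??
      { _id: p.category, count: 0, sum: 0, maxPrice: -Infinity, products: [] };
    g.count += 1;                               // { $sum: 1 }
    g.sum += p.price;                           // running total for $avg
    g.maxPrice = Math.max(g.maxPrice, p.price); // { $max: '$price' }
    g.products.push(p.name);                    // { $push: '$name' }
    groups.set(p.category, g);
  }
  // Turn the running sum into the average and drop the helper field.
  return [...groups.values()].map(({ sum, ...g }) => ({ ...g, avgPrice: sum / g.count }));
}

const byCategory = groupByCategory([
  { name: 'Novel', category: 'books', price: 10 },
  { name: 'Atlas', category: 'books', price: 20 },
  { name: 'Robot', category: 'toys', price: 5 },
]);

console.log(byCategory);
```

Note that `$group` itself does not guarantee any output order, so add a `$sort` stage after it when order matters.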
$sort and $limit
{ $sort: { createdAt: -1 } },
{ $limit: 10 }
$lookup - Join Collections
{
  $lookup: {
    from: 'authors',
    localField: 'authorId',
    foreignField: '_id',
    as: 'author'
  }
}
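A detail that is easy to miss: `$lookup` behaves like a left outer join and always stores the matches in an array, even when there is exactly one match or none. The following plain JavaScript sketch imitates that behavior; `lookupAuthors` and the sample arrays are hypothetical.

```javascript
// Hypothetical in-memory imitation of the $lookup stage above.
function lookupAuthors(books, authors) {
  return books.map((book) => ({
    ...book,
    // 'as' field: every author whose _id equals this book's authorId.
    author: authors.filter((a) => a._id === book.authorId),
  }));
}

const joined = lookupAuthors(
  [
    { _id: 1, title: 'First', authorId: 10 },
    { _id: 2, title: 'Orphan', authorId: 99 }, // no such author
  ],
  [{ _id: 10, name: 'Ann' }]
);

console.log(JSON.stringify(joined, null, 2));
```

The second book is kept with an empty `author` array. In a real pipeline, a following `$unwind` on that field would drop such documents unless `preserveNullAndEmptyArrays` is set.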
Real-World Examples
Sales Report by Month
db.sales.aggregate([
  {
    $group: {
      _id: {
        year: { $year: '$date' },
        month: { $month: '$date' }
      },
      totalSales: { $sum: '$amount' },
      orderCount: { $sum: 1 }
    }
  },
  { $sort: { '_id.year': -1, '_id.month': -1 } }
]);
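In practice a report like this usually covers a bounded period. One possible approach is to build the pipeline in a function that prepends a `$match` on the date range, so that grouping only sees the relevant documents. The function name `monthlySalesPipeline` and its parameters are assumptions made for this sketch.

```javascript
// Hypothetical helper: builds the monthly report pipeline for [from, to).
function monthlySalesPipeline(from, to) {
  return [
    // Filter first so $group processes only sales in the requested range.
    { $match: { date: { $gte: from, $lt: to } } },
    {
      $group: {
        _id: { year: { $year: '$date' }, month: { $month: '$date' } },
        totalSales: { $sum: '$amount' },
        orderCount: { $sum: 1 },
      },
    },
    { $sort: { '_id.year': -1, '_id.month': -1 } },
  ];
}

const reportPipeline = monthlySalesPipeline(
  new Date('2024-01-01T00:00:00Z'),
  new Date('2025-01-01T00:00:00Z')
);

// With a live connection this would be passed to db.sales.aggregate(reportPipeline).
console.log(JSON.stringify(reportPipeline, null, 2));
```

Because a pipeline is an ordinary array of objects, it can be assembled, inspected, and unit-tested without a database.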
Top Products with Details
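db.orders.aggregate([
  // One output document per entry in each order's items array
  { $unwind: '$items' },
  {
    $group: {
      _id: '$items.productId',
      totalSold: { $sum: '$items.quantity' },
      revenue: { $sum: { $multiply: ['$items.price', '$items.quantity'] } }
    }
  },
  // Attach the matching product document (stored as an array)
  {
    $lookup: {
      from: 'products',
      localField: '_id',
      foreignField: '_id',
      as: 'product'
    }
  },
  // Flatten the one-element array; products with no match are dropped
  { $unwind: '$product' },
  { $sort: { totalSold: -1 } },
  { $limit: 10 }
]);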
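The `$unwind` and `$group` steps of this pipeline carry most of the logic, so it may help to see them imitated in plain JavaScript. This is a sketch on hypothetical data; `unwindItems` and `totalsByProduct` are illustrative names, and the `$lookup` step is left out.

```javascript
// $unwind: one copy of the order per item; orders without items are dropped.
function unwindItems(orders) {
  return orders.flatMap((order) =>
    (order.items ?? []).map((item) => ({ ...order, items: item })));
}

// $group + $sort: totals per productId, best seller first.
function totalsByProduct(unwound) {
  const byId = new Map();
  for (const { items } of unwound) {
    const t = byId.get(items.productId) ??
      { _id: items.productId, totalSold: 0, revenue: 0 };
    t.totalSold += items.quantity;
    t.revenue += items.price * items.quantity; // $multiply inside $sum
    byId.set(items.productId, t);
  }
  return [...byId.values()].sort((a, b) => b.totalSold - a.totalSold);
}

// Hypothetical sample orders.
const itemOrders = [
  { _id: 1, items: [{ productId: 'p1', price: 10, quantity: 2 },
                    { productId: 'p2', price: 5, quantity: 1 }] },
  { _id: 2, items: [{ productId: 'p1', price: 10, quantity: 3 }] },
  { _id: 3, items: [] },
];

const unwound = unwindItems(itemOrders);
const topProducts = totalsByProduct(unwound);
console.log(topProducts);
```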
Performance Tips
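- Place $match early - Filtering first means every later stage processes fewer documents
- Use indexes - $match and $sort can use an index only when they run at the start of the pipeline, before stages such as $project, $unwind, or $group
- Limit fields with $project - Dropping fields you do not need reduces the memory later stages require
- Use allowDiskUse - Lets memory-heavy stages such as $sort and $group write temporary files to disk instead of failing at the 100 MB per-stage memory limit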
db.collection.aggregate(pipeline, { allowDiskUse: true });
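To illustrate the first tip, the sketch below compares two pipelines that produce the same result for a single category, one filtering after grouping and one filtering before. The helper `startsWithMatch` is a hypothetical lint-style check written for this example, not part of MongoDB.

```javascript
// Hypothetical check: does the pipeline begin with a $match stage?
function startsWithMatch(pipeline) {
  return pipeline.length > 0 && Object.keys(pipeline[0])[0] === '$match';
}

// Groups every category, then discards all but one.
const filterLate = [
  { $group: { _id: '$category', count: { $sum: 1 } } },
  { $match: { _id: 'books' } },
];

// Filters first, so $group only sees 'books' documents and
// the $match can use an index on category if one exists.
const filterEarly = [
  { $match: { category: 'books' } },
  { $group: { _id: '$category', count: { $sum: 1 } } },
];

// Options object as passed in the second argument to aggregate().
const options = { allowDiskUse: true };

console.log(startsWithMatch(filterLate));
console.log(startsWithMatch(filterEarly));
```

Moving a `$match` is only safe when it filters on fields that exist before the stages it jumps over, as `category` does here.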
Conclusion
Once you are comfortable with $match, $project, $group, $sort, $limit, and $lookup, you can express most reporting and analysis queries directly in MongoDB.