MongoDB {aggregation $match} vs {find} speed

aggregation-frameworkmongodb

I have a mongoDB collection with millions of rows and I'm trying to optimize my queries. I'm currently using the aggregation framework to retrieve data and group them as I want. My typical aggregation query is something like : $match > $group > $ group > $project

However, I noticed that the last parts only take a few ms, the beginning is the slowest.

I tried to perform a query with only the $match filter, and then to perform the same query with collection.find. The aggregation query takes ~80ms while the find query takes 0 or 1ms.

I have indexes on pretty much each field so I guess this isn't the problem. Any idea on what could go wrong ? Or is it just a "normal" drawback of the aggregation framework ?

I could use find queries instead of aggregation queries, however I would have to perform a lot of processing after the request and this process can be done quickly with $group etc. so I would rather keep the aggregation framework.

Thanks,

EDIT :

Here is my criteria :

{
    "action" : "click",
    "timestamp" : {
            "$gt" : ISODate("2015-01-01T00:00:00Z"),
            "$lt" : ISODate("2015-02-011T00:00:00Z")
    },
    "itemId" : "5"
}

Best Answer

The main purpose of the aggregation framework is to ease the query of a big number of entries and generate a low number of results that hold value to you.

As you have said, you can also use multiple find queries, but remember that you can not create new fields with find queries. On the other hand, the $group stage allows you to define your new fields.

If you would like to achieve the functionality of the aggregation framework, you would most likely have to run an initial find (or chain several ones), pull that information and further manipulate it with a programming language.

The aggregation pipeline might seem to take longer, but at least you know you only have to take into account the performance of one system - MongoDB engine.

Whereas, when it comes to manipulating the data returned from a find query, you would most likely have to further manipulate the data with a programming language, thus increasing the complexity depending on the intricacies of the programming language of choice.

Related Solutions

Sql – How to query MongoDB with “like”

That would have to be:

db.users.find({"name": /.*m.*/})

Or, similar:

db.users.find({"name": /m/})

You're looking for something that contains "m" somewhere (SQL's '%' operator is equivalent to regular expressions' '.*'), not something that has "m" anchored to the beginning of the string.

Note: MongoDB uses regular expressions which are more powerful than "LIKE" in SQL. With regular expressions you can create any pattern that you imagine.

For more information on regular expressions, refer to Regular expressions (MDN).

Mongodb – Update MongoDB field using value of another field

The best way to do this is in version 4.2+ which allows using of aggregation pipeline in the update document and the updateOne, updateMany or update collection methods. Note that the latter has been deprecated in most if not all languages drivers.

###MongoDB 4.2+

Version 4.2 also introduced the $set pipeline stage operator which is an alias for $addFields. I will use $set here as it maps with what we are trying to achieve.

db.collection.<update method>(
    {},
    [
        {"$set": {"name": { "$concat": ["$firstName", " ", "$lastName"]}}}
    ]
)

Note that square brackets in the second argument to the method which define an aggregation pipeline instead of a plain update document. Using a plain document will not work correctly.

###MongoDB 3.4+

In 3.4+ you can use $addFields and the $out aggregation pipeline operators.

db.collection.aggregate(
    [
        { "$addFields": { 
            "name": { "$concat": [ "$firstName", " ", "$lastName" ] } 
        }},
        { "$out": "collection" }
    ]
)

Note that this does not update your collection but instead replace the existing collection or create a new one. Also for update operations that require "type casting" you will need client side processing, and depending on the operation, you may need to use the find() method instead of the .aggreate() method.

##MongoDB 3.2 and 3.0 The way we do this is by $projecting our documents and use the $concat string aggregation operator to return the concatenated string. we From there, you then iterate the cursor and use the $set update operator to add the new field to your documents using bulk operations for maximum efficiency.

###Aggregation query:

var cursor = db.collection.aggregate([ 
    { "$project":  { 
        "name": { "$concat": [ "$firstName", " ", "$lastName" ] } 
    }}
])

###MongoDB 3.2 or newer

from this, you need to use the bulkWrite method.

var requests = [];
cursor.forEach(document => { 
    requests.push( { 
        'updateOne': {
            'filter': { '_id': document._id },
            'update': { '$set': { 'name': document.name } }
        }
    });
    if (requests.length === 500) {
        //Execute per 500 operations and re-init
        db.collection.bulkWrite(requests);
        requests = [];
    }
});

if(requests.length > 0) {
     db.collection.bulkWrite(requests);
}

###MongoDB 2.6 and 3.0

From this version you need to use the now deprecated Bulk API and its associated methods.

var bulk = db.collection.initializeUnorderedBulkOp();
var count = 0;

cursor.snapshot().forEach(function(document) { 
    bulk.find({ '_id': document._id }).updateOne( {
        '$set': { 'name': document.name }
    });
    count++;
    if(count%500 === 0) {
        // Excecute per 500 operations and re-init
        bulk.execute();
        bulk = db.collection.initializeUnorderedBulkOp();
    }
})

// clean up queues
if(count > 0) {
    bulk.execute();
}

###MongoDB 2.4

cursor["result"].forEach(function(document) {
    db.collection.update(
        { "_id": document._id }, 
        { "$set": { "name": document.name } }
    );
})

Best Answer

Related Solutions

Sql – How to query MongoDB with “like”

Mongodb – Update MongoDB field using value of another field

Related Topic