Solved thread

This post is marked as solved. If you think the information contained on this thread must be part of the official documentation, please contribute submitting a pull request to its repository.

The best way to query two collections in mongodb

Hi,

I have two collections. The structure is as below

 Products
_id
name
description
companyId

Companies
_id
companyname

I am using manual references to link each product document to company.

How do i do a query such that when i find all products, i can get the info from company as well. In relational dataase, we do this via a join.

Here is a working version of what i come out with, however i think it can be improved. In my code, i am calling Companies::findbyid in every loop.

$products = Products::find(array(
        "limit" => 5,
        "sort"  => array("_id" => -1)
    ));
    foreach($promotions as $key => $value){
        $company = Companies::findById($value->companyId);
        $product[$key]->name = $value->name;
        $product[$key]->description = $value->description;
        $product[$key]->companyName = $company->name;
    }
    $this->view->products = $product;

Advice appreciate greatly!!



19.6k
Accepted
answer
edited Jul '15

I found this video quite helpfull.

I would consider embedding Companies in Products or the other way round. Reselling or producing.

There is a section on manual references here which is supported by the php driver.

Hope this provides usefull.



10.0k

Thanks @baychae! I am actually re-thinking about embeding the documents. Reason being the info (which is embed) may be changed frequently. If i were to embed it, maintaining it may be an issue. what do you think?

Any others have any insights?

I found this video quite helpfull.

I would consider embedding Companies in Products or the other way round. Reselling or producing.

There is a section on manual references here which is supported by the php driver.

Hope this provides usefull.



19.6k
edited Jul '15

@Derek

I think I am asking myself does the performance gain on reading aggregated data (potentially vast amounts) outweigh the performance hit on the inserts which I am assuming leans on your indexing strategy. What would be the impact of flattening this out? What scope of queries do I need?

I know in my case data will be read consideraby more than it is inserted.

Do the hours put in on maintenance in this area incur greater savings and performance increases in other area's.