One of the most common mistakes newcomers to NoSQL databases make is to implement every design as if they were still working with MySQL or another relational database. Such designs work, but they don't take advantage of the flexibility and other features that NoSQL databases offer. If someone gave you a new computer for your birthday, I'd certainly hope you wouldn't use it only as a calculator. That said, modeling a schema in a nonrelational database the way you would in a relational one is not always less efficient, as the examples below show.

Constrain Relationships and Avoid Bi-Directional Relationships, If Possible

Let's take a pretty simple system, a blog for example, consisting of the following documents: User, Post, and Comment.

Post.php:

<?php
// Post.php
use Doctrine\ODM\MongoDB\Mapping\Annotations as ODM;

/**
 * @ODM\Document
 */
class Post
{

    /**
     * @ODM\ReferenceOne(targetDocument="User", simple=true, inversedBy="posts")
     */
    protected $user;
    
    /**
     * @ODM\ReferenceMany(targetDocument="Comment", mappedBy="post")
     */
    protected $comments;

    // ...
}

User.php:

<?php
// User.php
use Doctrine\ODM\MongoDB\Mapping\Annotations as ODM;

/**
 * @ODM\Document
 */
class User
{
    
    /**
     * @ODM\ReferenceMany(targetDocument="Post", mappedBy="user")
     */
    protected $posts;
    
    /**
     * @ODM\ReferenceMany(targetDocument="Comment", mappedBy="user")
     */
    protected $comments;

    // ...

}

Comment.php:

<?php
// Comment.php
use Doctrine\ODM\MongoDB\Mapping\Annotations as ODM;

/**
 * @ODM\Document
 */
class Comment
{
    
    /**
     * @ODM\ReferenceOne(targetDocument="Post", simple=true, inversedBy="comments")
     */
    protected $post;
    
    /**
     * @ODM\ReferenceOne(targetDocument="User", simple=true, inversedBy="comments")
     */
    protected $user;

    // ...

}

This design implements bi-directional relationships between User and Post, User and Comment, and Post and Comment. This seems convenient, since you can easily fetch related documents with $user->getComments() or, the other way around, $comment->getUser(). However, this design may lead to extra queries for data that may never be used. For example, think about the relationship behind $user->getComments(). Is it really necessary for your application to fetch a specific user's comments from the User document? How often will you really need all of a user's comments? In most cases not very often, if at all. The same goes for $user->getPosts(). On the other hand, it can make sense for Post to have a bi-directional relationship with Comment, since blogs usually show a post together with all of its comments on the same page. So here is the revised version of the design: remove the bi-directional relationships between User and Comment and between User and Post, but keep the bi-directional relationship between Post and Comment.

Post.php:

<?php
// Post.php
use Doctrine\ODM\MongoDB\Mapping\Annotations as ODM;

/**
 * @ODM\Document
 */
class Post
{

    /**
     * @ODM\ReferenceOne(targetDocument="User", simple=true)
     */
    protected $user;
    
    /**
     * @ODM\ReferenceMany(targetDocument="Comment", mappedBy="post")
     */
    protected $comments;

    // ...
}

User.php:

<?php
// User.php
use Doctrine\ODM\MongoDB\Mapping\Annotations as ODM;

/**
 * @ODM\Document
 */
class User
{

    // ...

}

Comment.php:

<?php
// Comment.php
use Doctrine\ODM\MongoDB\Mapping\Annotations as ODM;

/**
 * @ODM\Document
 */
class Comment
{
    
    /**
     * @ODM\ReferenceOne(targetDocument="Post", simple=true, inversedBy="comments")
     */
    protected $post;

    // ...

}

This design is a lot cleaner and may work for smaller blogs, but let's consider something bigger. Say this blog gets millions of visits per day, which usually results in far more than a couple of user-generated comments. Looking back at the design, would it still make sense to load all the comments from Post if there are a couple of thousand comments per post? Definitely not. It makes much more sense to remove the bi-directional relationship between Post and Comment, since loading every comment on a post is overkill when we only need about 10 at a time. So let's make the final revision to our database schema.

Post.php:

<?php
// Post.php
use Doctrine\ODM\MongoDB\Mapping\Annotations as ODM;

/**
 * @ODM\Document
 */
class Post
{

    /**
     * @ODM\ReferenceOne(targetDocument="User", simple=true)
     */
    protected $user;

    // ...
}

User.php:

<?php
// User.php
use Doctrine\ODM\MongoDB\Mapping\Annotations as ODM;

/**
 * @ODM\Document
 */
class User
{

    // ...

}

Comment.php:

<?php
// Comment.php
use Doctrine\ODM\MongoDB\Mapping\Annotations as ODM;

/**
 * @ODM\Document
 */
class Comment
{
    
    /**
     * @ODM\ReferenceOne(targetDocument="Post", simple=true)
     */
    protected $post;

    // ...

}

We have now eliminated all bi-directional relationships while keeping all the necessary functions for our application. This means weaker coupling between documents and a better structured design.
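With the comments collection no longer mapped on Post, a page of comments is fetched from the Comment side instead: filter on the post reference, sort, then skip and limit. The sketch below imitates that query against a plain PHP array so it runs without MongoDB. The commentsPage function, the createdAt field, and the sample ids are all made up for illustration; in a real repository the same three steps would be expressed through the query builder.

```php
<?php
// Hypothetical in-memory stand-in for the Comment collection. Each row mirrors
// a stored comment that holds a simple reference to its post.
function commentsPage(array $comments, string $postId, int $page, int $perPage = 10): array
{
    // Equivalent of filtering on the "post" reference field.
    $matching = array_filter($comments, fn(array $c): bool => $c['post'] === $postId);

    // Newest first, as a blog page would usually show them.
    usort($matching, fn(array $a, array $b): int => $b['createdAt'] <=> $a['createdAt']);

    // Equivalent of skipping ($page - 1) * $perPage rows and limiting to $perPage.
    return array_slice($matching, ($page - 1) * $perPage, $perPage);
}

// 25 comments on post p1 plus one comment on another post (invented data).
$comments = [];
for ($i = 1; $i <= 25; $i++) {
    $comments[] = ['_id' => "c$i", 'post' => 'p1', 'createdAt' => $i];
}
$comments[] = ['_id' => 'other', 'post' => 'p2', 'createdAt' => 99];

$firstPage = commentsPage($comments, 'p1', 1);
$lastPage  = commentsPage($comments, 'p1', 3);

echo count($firstPage), ' ', $firstPage[0]['_id'], ' ', count($lastPage), PHP_EOL; // prints "10 c25 5"
```

Only the requested page is ever built, no matter how many comments the post has, which is the point of dropping the Post-to-Comment collection.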

On a side note, the simple=true option is used on the owning side of a relationship. If you specify it, the document stores only the MongoId of the referenced document. By default, a reference contains not only $id but also $db and $ref. Storing only the MongoId saves space and is better for performance and indexing. Also, if you decide to store references using the default options, your queries will look a bit different.

Querying inside Post document's repository using default options:

$this->createQueryBuilder()->field('user.$id')->equals(new \MongoId($someId));

Querying inside Post document's repository using simple=true:

$this->createQueryBuilder()->field('user')->equals(new \MongoId($someId));

Wrapping $someId in a MongoId is necessary when you're querying inside the repository. The one exception is the find method, which accepts the raw id. Ex: $user = $dm->getRepository('MeltAwesomeBundle:User')->find($someId);
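To make the difference between the two storage formats concrete, here is a rough sketch of what a stored Post might look like in each case, written as plain PHP arrays. The ids, the collection name User, and the database name blog are invented, and authorFieldPath is a hypothetical helper that only exists to show why the queried field changes.

```php
<?php
// Default options: the reference is stored as a sub-document with $ref, $id and $db.
$postDefault = [
    '_id'  => 'p1',
    'user' => ['$ref' => 'User', '$id' => 'u1', '$db' => 'blog'],
];

// simple=true: only the id of the referenced user is stored.
$postSimple = [
    '_id'  => 'p1',
    'user' => 'u1',
];

// Returns the field path a query has to match on to find posts by author.
function authorFieldPath(array $post): string
{
    return is_array($post['user']) ? 'user.$id' : 'user';
}

echo authorFieldPath($postDefault), PHP_EOL; // prints "user.$id"
echo authorFieldPath($postSimple), PHP_EOL;  // prints "user"
```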

Embedding vs Referencing Documents

In MongoDB, instead of referencing documents as in the examples above, you can embed an entire document inside another document. The embedded document then becomes part of the parent document. So if Comment is embedded inside Post, calling $post->getComments() will not issue a separate query to resolve references, since the comments are stored inside the Post document itself. This means faster reads; in exchange, you sacrifice speed for complex searches and updates. The main disadvantage of embedding documents is not so much data duplication as data inconsistency. If Comment is embedded in both User and Post, updating a comment means changing more than one document, so the update will not always be atomic and puts more work on Doctrine than a separate collection for comments would.

Another good example of this situation is a feed system in a social network. Suppose every User has a Feed embedded in their document instead of a reference, and for some reason we need to update a specific feed that is embedded in hundreds of different User documents. That would certainly take a lot more work than keeping references to a single document in a separate Feed collection and changing that one Feed document.
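The cost difference can be sketched by counting how many documents each layout has to touch. This is a toy in-memory model, not Doctrine code: the arrays stand in for collections, and the function names, the title field, and the figure of 300 users are assumptions chosen for illustration.

```php
<?php
// Embedded layout: every user carries its own copy of the feed.
$usersEmbedded = [];
for ($i = 1; $i <= 300; $i++) {
    $usersEmbedded["u$i"] = ['feed' => ['_id' => 'f1', 'title' => 'Old title']];
}

// Referenced layout: the feed lives once in its own collection,
// and users would store only its id.
$feeds = ['f1' => ['title' => 'Old title']];

// Rewrites every embedded copy and returns how many documents were changed.
function renameEmbeddedFeed(array &$users, string $feedId, string $title): int
{
    $writes = 0;
    foreach ($users as &$user) {
        if ($user['feed']['_id'] === $feedId) {
            $user['feed']['title'] = $title;
            $writes++;
        }
    }
    unset($user); // break the reference left over from the loop
    return $writes;
}

// Changes the single shared feed document.
function renameReferencedFeed(array &$feeds, string $feedId, string $title): int
{
    $feeds[$feedId]['title'] = $title;
    return 1;
}

$embeddedWrites   = renameEmbeddedFeed($usersEmbedded, 'f1', 'New title');
$referencedWrites = renameReferencedFeed($feeds, 'f1', 'New title');

echo $embeddedWrites, ' ', $referencedWrites, PHP_EOL; // prints "300 1"
```

If the embedded update is interrupted partway through, some users see the new title and others the old one, which is the inconsistency described above.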

Conclusion

There are no set rules on when to use bi-directional or uni-directional relationships, or when to embed or reference documents. It all depends on your application. When in doubt about a relationship, ask yourself, "Is it essential to create this relationship?" If the answer is no, then you're probably better off using uni-directional references. When in doubt about embedding versus referencing, ask yourself, "Do I need to update or run complex queries within the embedded documents? Do I care about atomicity in my application?" If the answer is no, then you may go ahead and embed your documents. For more information on performance and best practices, refer to the Doctrine Project's ODM documentation.