MongoDB – Thinking in Documents

General Technology 2017-04-10

As we all know, document databases are very different from the relational databases (RDBMS) most of us are used to.

When designing a relational database, one goes through the process of data normalization. Normalization arranges attributes into relations based on the dependencies between them and ensures that those dependencies are properly enforced by database integrity constraints. It is accomplished by applying formal rules, through a process of either synthesis or decomposition.

  • Synthesis creates a normalized database design based on a known set of dependencies.
  • Decomposition takes an existing (insufficiently normalized) database design and improves it based on the known set of dependencies.

The result is a tidy set of tables, interlinked through foreign keys, from which redundant data is banned unless it is needed for some edge case (at least, this is how it should be :)).

In MongoDB (or any other document database) these rules need not be followed strictly, because even very complex data can be “packed” into a single document. A document can itself contain a set of sub-documents, and it all works seamlessly. What takes many interrelated tables to express in a relational database can simply be one type of document in a document database.

MongoDB supports two main ways of representing relationships between documents: referencing documents (a bit like in a relational database) and embedding documents.

Referencing Documents

Referencing documents in MongoDB is very similar to data normalization in an RDBMS, where tables are linked by foreign keys. In this sense MongoDB is no different.

What is very different is that the database does not enforce this relationship in any way; the relationships are handled entirely by the application code.
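
As a minimal sketch of referencing, the snippet below uses plain Python dictionaries in place of MongoDB collections. The `users` and `addresses` names, their fields, and the `resolve_addresses` helper are all hypothetical, chosen only for illustration. It shows that a reference is just a stored `_id` and that the application has to follow it by hand.

```python
# Two "collections" modelled as plain lists of dicts (hypothetical data).
addresses = [
    {"_id": 101, "street": "1 Main St", "city": "Springfield"},
    {"_id": 102, "street": "9 Elm Ave", "city": "Shelbyville"},
]
users = [
    # address_ids holds references; nothing checks that the targets exist.
    {"_id": 1, "name": "Alice", "address_ids": [101, 102]},
    {"_id": 2, "name": "Bob", "address_ids": [999]},  # dangling reference
]


def resolve_addresses(user, address_collection):
    """Follow the references by hand, as application code must."""
    by_id = {doc["_id"]: doc for doc in address_collection}
    # Missing targets are skipped here; no integrity constraint would catch them.
    return [by_id[ref] for ref in user["address_ids"] if ref in by_id]


alice_addresses = resolve_addresses(users[0], addresses)
bob_addresses = resolve_addresses(users[1], addresses)
print([a["city"] for a in alice_addresses])
print(bob_addresses)
```

Bob's dangling reference resolves to nothing without any error being raised, which is exactly the kind of case the application has to decide how to handle.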

Embedding Documents

When embedding documents, we are de facto combining all of the parts into one bigger unit: the document itself.

The same example as used above can be represented in a different way.

We can see that the document now contains all the data aggregated.
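
A minimal sketch of what an embedded document might look like, again as a plain Python dictionary. The user, the field names and the addresses are hypothetical data chosen for illustration.

```python
# One self-contained document (hypothetical data): the addresses live inside the user.
user = {
    "_id": 1,
    "name": "Alice",
    "addresses": [
        {"street": "1 Main St", "city": "Springfield"},
        {"street": "9 Elm Ave", "city": "Shelbyville"},
    ],
}

# Reading the related data needs no second lookup and no reference to follow.
cities = [address["city"] for address in user["addresses"]]
print(cities)
```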

Document Design Strategy

As we have seen, there are two main ways of representing relationships. The choice between them needs to be weighed carefully, as each can have side effects. As a rule of thumb, the following can be recommended:

Embed as much as possible

A document database should eliminate quite a lot of joins, so the best option we have is to put as much as possible into a single document. The great advantage is that saving and retrieving a single document is atomic and very fast (see below, Consistency). There is no need to normalize the data. Therefore, “embed” as much as possible, especially data that is not used by other documents.

Normalize Data

Normalize data that can be referred to from multiple places into its own collection. In other words, create reusable collections (e.g. country, user, etc.). This is definitely a more efficient way to handle values that would otherwise be duplicated, since they are maintained in only one place.

Document size

Keep in mind that the maximum document size in MongoDB is 16MB. The limit exists mainly to ensure that a single document cannot use an excessive amount of RAM or bandwidth. 16MB is quite a large quantity of data (just think how much data is usually displayed on a single web page). In most cases this limit is not a problem; however, it’s good to keep it in mind and avoid premature optimizations.
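
A rough way to keep the limit in mind while designing a schema is to estimate how large a document could grow. The sketch below is only an approximation: it measures the UTF-8 length of the JSON encoding, which is not the same as the BSON size MongoDB actually enforces. The helper names and the sample documents are hypothetical.

```python
import json

MAX_DOCUMENT_BYTES = 16 * 1024 * 1024  # the 16MB limit mentioned above


def approximate_size(document):
    """Rough estimate: UTF-8 length of the JSON encoding (not the BSON size)."""
    return len(json.dumps(document).encode("utf-8"))


def fits_in_one_document(document, limit=MAX_DOCUMENT_BYTES):
    """Return True if the estimated size stays within the limit."""
    return approximate_size(document) <= limit


small = {"title": "post", "comments": ["nice"] * 10}
huge = {"title": "post", "blob": "x" * (17 * 1024 * 1024)}
print(fits_in_one_document(small), fits_in_one_document(huge))
```

An embedded array that can grow without bound (comments, log entries and so on) is the usual way a document drifts toward the limit, so that is the kind of field worth estimating.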

Complex data structures and queries

MongoDB can store deeply nested data structures (up to 100 levels of nesting), but cannot search them efficiently. If your data forms a tree, forest or graph, you effectively need to store each node and its edges in a separate document.
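
One common way to store a tree as separate documents is to let each node name its parent. The sketch below models this with plain Python dictionaries; the category tree and the helper functions are hypothetical, and each call to `children_of` stands in for one flat query on the `parent` field.

```python
# Each node is its own document and names its parent (hypothetical category tree).
nodes = [
    {"_id": "root", "parent": None},
    {"_id": "books", "parent": "root"},
    {"_id": "music", "parent": "root"},
    {"_id": "novels", "parent": "books"},
    {"_id": "poetry", "parent": "books"},
]


def children_of(node_id, collection):
    """Stand-in for one flat query on the parent field."""
    return [doc["_id"] for doc in collection if doc["parent"] == node_id]


def descendants_of(node_id, collection):
    """Walk the subtree breadth-first, one flat lookup per visited node."""
    found, queue = [], [node_id]
    while queue:
        current = queue.pop(0)
        for child in children_of(current, collection):
            found.append(child)
            queue.append(child)
    return found


print(children_of("root", nodes))
print(descendants_of("root", nodes))
```

The traversal logic lives in the application, but every individual lookup is a simple query on a top-level field rather than a search through nested structures.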


Consistency

MongoDB makes a trade-off between efficiency and consistency. The rule is that changes to a single document are atomic, while updates to multiple documents should never be assumed to be atomic unless they are wrapped in an explicit multi-document transaction (available since MongoDB 4.0). When designing the schema, consider how to keep your data consistent! Generally, the more you keep in a single document the better, as noted in the first point of this list.
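
To illustrate why embedding helps here, the sketch below changes several fields of one document in a single update. `$push` and `$inc` are MongoDB update operators, and `update_one(filter, update)` is the PyMongo collection method. Everything else is an assumption made for illustration: the order schema, the helper names, and the `RecordingCollection` stand-in that lets the snippet run without a database.

```python
def build_order_update(item, price):
    """Build one update document that changes several fields together.

    The field names (items, total, item_count) are hypothetical.
    """
    return {
        "$push": {"items": {"name": item, "price": price}},
        "$inc": {"total": price, "item_count": 1},
    }


def add_item(orders, order_id, item, price):
    """Apply the change with a single update_one call on a single document.

    `orders` is assumed to be a PyMongo-style collection.
    """
    return orders.update_one({"_id": order_id}, build_order_update(item, price))


class RecordingCollection:
    """Stand-in that only records calls, so the sketch runs without a database."""

    def __init__(self):
        self.calls = []

    def update_one(self, filter, update):
        self.calls.append((filter, update))


orders = RecordingCollection()
add_item(orders, 42, "keyboard", 30)
print(orders.calls)
```

Because the line items are embedded in the order, the new item and the updated total are written in one single-document operation. Had the items lived in their own collection, the same change would take two writes, and the application would have to cope with one succeeding and the other failing.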
