MongoDB Best Practices - Schema Design, Indexes

The document outlines best practices for optimizing MongoDB, emphasizing the importance of schema design, data embedding, and indexing for performance. It highlights that MongoDB's schema should be tailored to application needs rather than traditional relational database structures. Additionally, it discusses server sizing, replication, and sharding as strategies for managing performance and redundancy in larger databases.

Uploaded by

aryamoneycontrol
MongoDB Best Practices: Schema Design, Indexes & More

By Dawid Ziolkowski | August 2, 2022 | Updated On: September 15, 2023 | Data Management

MongoDB is really popular these days. Developers often use it instead of
MySQL, but these two platforms aren’t in direct competition. While MySQL is a
relational database, MongoDB is a NoSQL document-oriented database, so the two
work quite differently. And for that reason, optimizing MongoDB is not the same as
optimizing a traditional relational database, although some best practices are similar.
Read on to learn what to do and what to avoid when using MongoDB.

Understand Schema Differences Between Relational and Document-based Databases

Let's start with the most important difference: the schema. Designing your database
schema is a crucial task, and while making changes is possible and common, it can be
expensive from an engineering perspective. When a database schema needs changes,
your deployment process becomes much more complicated, so good design is critical.
How do you design a good MongoDB database schema? Rule number one: don't
design it as you would with relational databases. It sounds logical to split your schema
into small table-like pieces, right?

In the case of MongoDB, no. For relational databases, you usually construct a schema
based on the data. You need to figure out how to split the data your application will use
into tables so it’s logically organized and not duplicated. But when it comes to a
MongoDB schema, you should look not at the data itself, but at the application.
Specifically, how your application will use the data, what kind of queries it will likely
execute, and so on. This means that two different applications using the exact same
data might have very different schema designs in MongoDB, whereas for relational
databases the schema would probably be the same or very similar across applications.

Another thing you need to know is that MongoDB enforces almost no rules on
how you should structure the data, because MongoDB operates on JSON-like
documents. This gives you the ability to embed data into arrays and objects within one
document. If you want to learn more about modelling data, take a look at this free
course from MongoDB.

Embed Your Data Instead of Relying on Joins


One of the best practices when using MongoDB is to embed your data within one
document instead of performing lookups or creating in-application joins. It may be a bit
counterintuitive, but MongoDB performs better when you stuff all the data you need into
one document. For example, instead of putting user details in one document and user
order history in another, chuck them into the same one. Reading documents is
extremely fast in MongoDB. Performing lookups or joins within the application is slower
in most cases.

Keep in mind that this is only a general rule and you should always start by
understanding your application query pattern. Including the data in the document is
preferred over lookup operations. But, of course, there is no point in dumping all
possible data in one document.
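
As a rough illustration of the two layouts, the sketch below models them with plain Python dictionaries rather than a live database. The field names (`name`, `email`, `orders`, `user_id`) and the sample values are invented for this example; the point is only that the embedded layout needs one read where the referenced layout needs two reads plus a join in application code.

```python
# Layout 1: referenced (relational-style). Users and orders live in
# separate collections, so the application has to join them itself.
users = {1: {"_id": 1, "name": "Ada", "email": "ada@example.com"}}
orders = [
    {"_id": 100, "user_id": 1, "item": "keyboard", "total": 49.0},
    {"_id": 101, "user_id": 1, "item": "monitor", "total": 199.0},
    {"_id": 102, "user_id": 2, "item": "mouse", "total": 19.0},
]


def load_user_with_orders_referenced(user_id):
    """Two lookups plus an in-application join."""
    user = dict(users[user_id])  # first read: the user document
    # second read: every order that points back at this user
    user["orders"] = [
        {"item": o["item"], "total": o["total"]}
        for o in orders
        if o["user_id"] == user_id
    ]
    return user


# Layout 2: embedded. The order history is an array inside the user
# document, so one read returns everything the application needs.
users_embedded = {
    1: {
        "_id": 1,
        "name": "Ada",
        "email": "ada@example.com",
        "orders": [
            {"item": "keyboard", "total": 49.0},
            {"item": "monitor", "total": 199.0},
        ],
    }
}


def load_user_with_orders_embedded(user_id):
    """A single lookup; no join needed."""
    return users_embedded[user_id]
```

Both functions return the same document for user 1. Which layout is better still depends on the query pattern: if the application rarely needs orders alongside the user, embedding them buys little.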

Use Indexes For Frequent Operations


Let's talk about indexes. This next MongoDB best practice is similar to what you'd do
with relational databases. In the previous best practice we mentioned that MongoDB
prefers to embed data (instead of splitting it into smaller logical pieces). Therefore it's
normal for MongoDB documents to become quite big. This will naturally impact
performance, but indexes can offset much of that cost. Indexes in MongoDB work pretty
much the same way as with relational databases. These special data structures store a
small subset of each document's fields, kept in sorted order, to speed up the matching
of data for frequently used queries.

For example, imagine that you have your user's data together with their order history in
a single document and you want to find all users who ordered something in the last
month. Normally (without indexes) MongoDB would have to scan the whole user
collection, going through the user documents one by one and checking the last order
date for each user. It's not horrible; that's how the database performs a lot of operations.
But if you frequently ask the database for this kind of matching, then indexing will help
you a lot. Coming back to our example: with an index, MongoDB stores a separate,
small structure containing the indexed values (for example user id, email address, or
last order date) along with pointers to the matching documents.
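
To make the difference between a full scan and an index lookup concrete, here is a small Python sketch. It is a simplification, not how MongoDB is implemented: MongoDB's indexes are B-tree structures, while this example stands in a sorted list and binary search. The documents and the `last_order_date` field are hypothetical.

```python
import bisect
from datetime import date

# Hypothetical user documents; only the fields needed here are shown.
users = [
    {"_id": 1, "email": "a@example.com", "last_order_date": date(2023, 8, 30)},
    {"_id": 2, "email": "b@example.com", "last_order_date": date(2023, 6, 2)},
    {"_id": 3, "email": "c@example.com", "last_order_date": date(2023, 9, 10)},
    {"_id": 4, "email": "d@example.com", "last_order_date": date(2022, 12, 24)},
]


def find_recent_by_scan(docs, since):
    """Collection scan: inspect every document, one by one."""
    return sorted(d["_id"] for d in docs if d["last_order_date"] >= since)


def build_index(docs, field):
    """Build a sorted list of (field value, _id) pairs: the 'index'."""
    return sorted((d[field], d["_id"]) for d in docs)


def find_recent_by_index(index, since):
    """Index lookup: binary-search to the first match, then read onward."""
    # (since,) sorts before every (since, _id) pair, so bisect_left lands
    # on the first entry whose date is >= since.
    start = bisect.bisect_left(index, (since,))
    return sorted(doc_id for _, doc_id in index[start:])


index = build_index(users, "last_order_date")
cutoff = date(2023, 8, 15)
```

Both `find_recent_by_scan(users, cutoff)` and `find_recent_by_index(index, cutoff)` return the same user ids, but the indexed version only touches the entries at or after the cutoff instead of every document. The trade-off is that the index has to be stored and kept up to date on every write.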

Properly Size Your Servers


It may sound obvious, but server RAM sizing in MongoDB is crucial. There are two
things to keep in mind: first, more memory won't automatically increase the performance
of your database. It's not just a matter of getting the server with the most RAM you can
afford. Second, MongoDB performs best when its working set (the indexes and data
your application accesses frequently) fits in the server's RAM.

Sizing your MongoDB machine is not dependent on the size of the database itself. It
doesn't matter if you have 100MB or 2TB of data in your MongoDB instance. What
matters is the size of indices and frequently accessed data. To size your MongoDB
instance, you need to perform some tests to find out how much data your application
normally uses. Then, make sure to use a server with slightly more memory than that. If
your working set won't fit in RAM, MongoDB will read the data from disk. And even if
you use very fast SSDs, the operation will be much slower than reading from
RAM.

So, how do you know if your MongoDB working set fits in your RAM? The simplest way
is to execute MongoDB's serverStatus command. From there, take a look at the "pages
read into cache" and "unmodified pages evicted" metrics in the wiredTiger.cache
section. If you see high and steadily growing numbers in these two, it most likely means
that your working set does not fit in your RAM.
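
A minimal sketch of reading those metrics from Python follows. It assumes the WiredTiger storage engine and works on a plain dictionary shaped like the relevant part of the serverStatus output; the sample numbers are made up, and the helper name `cache_pressure` is hypothetical.

```python
def cache_pressure(server_status):
    """Pull the two cache metrics out of a serverStatus-style document."""
    cache = server_status["wiredTiger"]["cache"]
    return {
        "pages_read_into_cache": cache["pages read into cache"],
        "unmodified_pages_evicted": cache["unmodified pages evicted"],
        # Fraction of the configured cache currently in use.
        "cache_fill_ratio": cache["bytes currently in the cache"]
        / cache["maximum bytes configured"],
    }


# Hand-written sample with made-up numbers. With PyMongo, the real document
# could be fetched with something like:
#     MongoClient(...).admin.command("serverStatus")
sample_status = {
    "wiredTiger": {
        "cache": {
            "bytes currently in the cache": 3_000_000_000,
            "maximum bytes configured": 4_000_000_000,
            "pages read into cache": 1_250_000,
            "unmodified pages evicted": 1_100_000,
        }
    }
}

report = cache_pressure(sample_status)
```

Both page counters are cumulative since the server started, so a single reading says little on its own. Taking two readings some minutes apart and comparing the difference gives a better picture of how often data is being pulled from disk.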

Use Replication or Sharding
As with relational databases, another MongoDB best practice is to
use replication and/or sharding when your database becomes slow. MongoDB
implements replication through replica sets, which work similarly to other database
systems that use primary and secondary nodes. You can instruct your application to run
some queries on secondary servers (typically by setting a read preference in the
driver), relieving some pressure on your primary server. Keep in mind that reads from
secondaries can return slightly stale data.
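
One common way to send reads to secondaries is through the connection string. The sketch below builds such a string in Python; `replicaSet` and `readPreference` are standard MongoDB connection string options, while the host names, the replica set name `rs0`, and the helper `replica_set_uri` are assumptions made for this example.

```python
from urllib.parse import urlencode


def replica_set_uri(hosts, replica_set, read_preference="secondaryPreferred"):
    """Build a MongoDB connection string that prefers reading from secondaries."""
    options = urlencode(
        {"replicaSet": replica_set, "readPreference": read_preference}
    )
    return f"mongodb://{','.join(hosts)}/?{options}"


# Hypothetical host names and replica set name.
uri = replica_set_uri(
    ["db1.example.com:27017", "db2.example.com:27017", "db3.example.com:27017"],
    "rs0",
)
```

With `secondaryPreferred`, the driver reads from a secondary when one is available and falls back to the primary otherwise. Writes always go to the primary regardless of this setting.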

What's good about MongoDB replication is that it also serves as a great redundancy
mechanism. Since secondaries continuously replicate the primary's data, electing one
of them to be the new primary in case your original primary server fails is simple, and
the replica set does it automatically. In most cases you won't have to deal with
inconsistencies or complicated election processes yourself. Therefore, replicating your
MongoDB is good not only for better performance, but for redundancy.

Replication helps the most for small and medium databases, so once your dataset gets
really big, consider sharding. While replication just copies all the data across multiple
servers, sharding actually splits the data into smaller pieces and distributes them across
servers. This brings great performance improvement for large data sets and allows you
to horizontally scale both reads and writes. You can read more about how it works here.
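
The idea of splitting data by a shard key can be sketched in a few lines of Python. This is a deliberate simplification: MongoDB's hashed sharding assigns ranges of hashed values to chunks and rebalances those chunks across shards, whereas this example just takes a hash modulo the shard count. The documents, the `user_id` shard key, and both helper functions are hypothetical.

```python
import hashlib


def shard_for(shard_key_value, shard_count):
    """Map a shard key value to a shard number with a stable hash."""
    digest = hashlib.md5(str(shard_key_value).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


def distribute(docs, shard_key, shard_count):
    """Split documents into per-shard lists based on their shard key."""
    shards = [[] for _ in range(shard_count)]
    for doc in docs:
        shards[shard_for(doc[shard_key], shard_count)].append(doc)
    return shards


# Hypothetical documents keyed by user_id.
docs = [{"user_id": i, "name": f"user{i}"} for i in range(1000)]
shards = distribute(docs, "user_id", 3)
```

Every document lands on exactly one shard, and a query that includes the shard key can be routed to that single shard. Queries that do not include the shard key have to ask every shard, which is why the choice of shard key should follow the application's query pattern.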

Summary
As you can see, MongoDB best practices are a mix of typical database best practices
and some specific to MongoDB. The nice thing about MongoDB is that you don’t need
to start worrying about performance until you have a relatively big database - it’s fast
and optimized by design. This doesn't mean you should ignore best practices when
working with smaller databases. Some of the best practices we mentioned aren’t just for
boosting performance, but can ensure good database design. They should always be
top of mind no matter the size of the database.

If you want to learn more about the differences between SQL and NoSQL databases,
take a look at our blog post here.