MongoDB sharded cluster by physical location

Jose Figueredo
Nov 18, 2022 · 8 min read


To comply with GDPR, I needed to build a proof of concept of a sharded cluster with infrastructure in different world regions.

Actors

Router Servers

  • Provide the interface between client applications and the sharded cluster.
  • Route queries and write operations to the shards.
  • Behave identically to any other MongoDB instance from the perspective of the application.

Config Servers

  • Stores the metadata for a sharded cluster. The metadata reflects state and organization for all data and components within the sharded cluster. The metadata includes the list of chunks on every shard and the ranges that define the chunks. The mongos instances cache this data and use it to route read and write operations to the correct shards. mongos updates the cache when there are metadata changes for the cluster, such as Chunk Splits or adding a shard. Shards also read chunk metadata from the config servers.
  • Stores Authentication configuration information such as Role-Based Access Control or internal authentication settings for the cluster.
  • MongoDB also uses the config servers to manage distributed locks.
  • Each sharded cluster must have its own config servers. Do not use the same config servers for different sharded clusters.

Shard

  • Contains a subset of sharded data for a sharded cluster. Together, the cluster’s shards hold the entire data set for the cluster.
  • As of MongoDB 3.6, shards must be deployed as a replica set to provide redundancy and high availability.
  • Users, clients, or applications should only directly connect to a shard to perform local administrative and maintenance operations.
  • Performing queries on a single shard only returns a subset of data. Connect to the mongos to perform cluster level operations, including read or write operations.
  • Each database in a sharded cluster has a primary shard that holds all the un-sharded collections for that database. The primary shard has no relation to the primary in a replica set. The mongos selects the primary shard when creating a new database by picking the shard in the cluster that has the least amount of data, using the totalSize field returned by the listDatabases command as part of the selection criteria (see the sketch after this list).
  • Contains a replica set which is a group of mongod processes that maintain the same data set. Replica sets provide redundancy and high availability, and are the basis for all production deployments.
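
As a quick illustration of the primary shard concept, here is a minimal mongosh sketch (an addition for this post, not part of the PoC scripts) that lists each database and its primary shard; it assumes a connection to a mongos:

// Read the cluster metadata that mongos caches from the config servers:
// one document per database, with its primary shard in the "primary" field
db.getSiblingDB("config").databases.find().forEach(d => print(d._id, "->", d.primary))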

The PoC

To keep it minimal we will deploy one router, one config server, and two shards of 3 instances each, arranged as Primary-Secondary-Secondary replica sets. One shard represents infrastructure created in the EU, the other infrastructure created in America. The EU shard will store the documents whose country is ES (Spain), and the America shard will store the documents for the countries AR, MX, PE, and UY (Argentina, Mexico, Peru, and Uruguay).

Note: ideally we would also have redundancy in the router and the config server, but for simplicity we will skip it.

This will be the architecture diagram if it were to live in AWS:

Architecture diagram

To secure communications inside the cluster we will create a key file with these shell commands. The key file enables internal keyfile authentication between cluster members, which in turn enforces access control.

> openssl rand -base64 756 > ./custom-build/auth/mongodb-keyfile
> chmod 400 ./custom-build/auth/mongodb-keyfile

We will use a Dockerfile to build a custom MongoDB image with the key file created above baked in.

FROM mongo:6.0.2
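# Bake the key file into the image; 999:999 is the mongodb user and group
# in the official mongo image, and mongod requires the key file to be read-only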
COPY /auth/mongodb-keyfile /data
RUN chmod 400 /data/mongodb-keyfile
RUN chown 999:999 /data/mongodb-keyfile

We’ll define our cluster in a docker-compose file and the relevant services will be the following.

The router service

Router Server
router01:
  build:
    context: mongodb-build
  image: wnc-mongo:6.0.2
  container_name: router-01
  command: mongos --port 27017 --configdb rs-config-server/configsvr01:27017 --bind_ip_all --keyFile /data/mongodb-keyfile
  ports:
    - 27117:27017
  volumes:
    - ./scripts:/scripts
    - mongodb_cluster_router01_db:/data/db
    - mongodb_cluster_router01_config:/data/configdb

Executes a mongos daemon, pointing it at its config server replica set and the key file.

The configuration service

Configuration Server
configsvr01:
  build:
    context: mongodb-build
  image: wnc-mongo:6.0.2
  container_name: mongo-config-01
  command: mongod --port 27017 --configsvr --replSet rs-config-server --keyFile /data/mongodb-keyfile
  volumes:
    - ./scripts:/scripts
    - mongodb_cluster_configsvr01_db:/data/db
    - mongodb_cluster_configsvr01_config:/data/configdb
  ports:
    - 27119:27017
  links:
    - rs-shard-gdpr-a
    - rs-shard-rotw-a

Executes a mongod daemon in config server mode (--configsvr) as the sole member of the rs-config-server replica set, again with the key file.

The service that represents the shard for GDPR

GDPR Shard
rs-shard-gdpr-a:
  build:
    context: mongodb-build
  image: wnc-mongo:6.0.2
  container_name: shard-01-node-a
  command: mongod --port 27017 --shardsvr --replSet rs-shard-gdpr --keyFile /data/mongodb-keyfile
  volumes:
    - ./scripts:/scripts
    - mongodb_cluster_shard_gdpr_a_db:/data/db
    - mongodb_cluster_shard_gdpr_a_config:/data/configdb
  ports:
    - 27122:27017
  links:
    - rs-shard-gdpr-b
    - rs-shard-gdpr-c
rs-shard-gdpr-b:
  build:
    context: mongodb-build
  image: wnc-mongo:6.0.2
  container_name: shard-01-node-b
  command: mongod --port 27017 --shardsvr --replSet rs-shard-gdpr --keyFile /data/mongodb-keyfile
  volumes:
    - ./scripts:/scripts
    - mongodb_cluster_shard_gdpr_b_db:/data/db
    - mongodb_cluster_shard_gdpr_b_config:/data/configdb
  ports:
    - 27123:27017
rs-shard-gdpr-c:
  build:
    context: mongodb-build
  image: wnc-mongo:6.0.2
  container_name: shard-01-node-c
  command: mongod --port 27017 --shardsvr --replSet rs-shard-gdpr --keyFile /data/mongodb-keyfile
  volumes:
    - ./scripts:/scripts
    - mongodb_cluster_shard_gdpr_c_db:/data/db
    - mongodb_cluster_shard_gdpr_c_config:/data/configdb
  ports:
    - 27124:27017

Executes a mongod daemon in each container, in shard server mode (--shardsvr), as a member of the rs-shard-gdpr replica set, with the key file.

The service that represents the shard for RotW

Rest of the world shard
rs-shard-rotw-a:
  build:
    context: mongodb-build
  image: wnc-mongo:6.0.2
  container_name: shard-02-node-a
  command: mongod --port 27017 --shardsvr --replSet rs-shard-rotw --keyFile /data/mongodb-keyfile
  volumes:
    - ./scripts:/scripts
    - mongodb_cluster_shard_rotw_a_db:/data/db
    - mongodb_cluster_shard_rotw_a_config:/data/configdb
  ports:
    - 27125:27017
  links:
    - rs-shard-rotw-b
    - rs-shard-rotw-c
rs-shard-rotw-b:
  build:
    context: mongodb-build
  image: wnc-mongo:6.0.2
  container_name: shard-02-node-b
  command: mongod --port 27017 --shardsvr --replSet rs-shard-rotw --keyFile /data/mongodb-keyfile
  volumes:
    - ./scripts:/scripts
    - mongodb_cluster_shard_rotw_b_db:/data/db
    - mongodb_cluster_shard_rotw_b_config:/data/configdb
  ports:
    - 27126:27017
rs-shard-rotw-c:
  build:
    context: mongodb-build
  image: wnc-mongo:6.0.2
  container_name: shard-02-node-c
  command: mongod --port 27017 --shardsvr --replSet rs-shard-rotw --keyFile /data/mongodb-keyfile
  volumes:
    - ./scripts:/scripts
    - mongodb_cluster_shard_rotw_c_db:/data/db
    - mongodb_cluster_shard_rotw_c_config:/data/configdb
  ports:
    - 27127:27017

Executes a mongod daemon in each container, in shard server mode (--shardsvr), as a member of the rs-shard-rotw replica set, with the key file.

To create the infrastructure, run the following and wait until it finishes.

> docker-compose up -d

Configuring our cluster

To initialize the config server, run:

> docker-compose exec configsvr01 sh -c "mongosh < /scripts/init-config-server.js"

The script initiates the replica set and tells the instance to behave as a config server.

rs.initiate({
  _id: "rs-config-server",
  configsvr: true,
  version: 1,
  members: [
    { _id: 0, host: "configsvr01:27017" }
  ]
})
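
Optionally, confirm the replica set came up; a quick check I add here (not part of the original scripts, and depending on when users are created you may need to authenticate first):

rs.status().members.forEach(m => print(m.name, m.stateStr))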

To configure the shard for GDPR run:

> docker-compose exec rs-shard-gdpr-a sh -c "mongosh < /scripts/init-shard-gdpr.js"

The script initiates a replica set with 3 nodes and sets each member's priority; node a, with the highest priority, is the preferred primary.

rs.initiate(
  {
    _id: "rs-shard-gdpr",
    version: 1,
    members: [
      { _id: 0, host: "rs-shard-gdpr-a:27017", priority: 1 },
      { _id: 1, host: "rs-shard-gdpr-b:27017", priority: 0.5 },
      { _id: 2, host: "rs-shard-gdpr-c:27017", priority: 0.5 }
    ]
  }
)
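
To double-check that the priorities took effect, an optional snippet (not in the original scripts) from the same session:

rs.conf().members.forEach(m => print(m.host, "priority =", m.priority))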

To configure the shard for RotW run:

> docker-compose exec rs-shard-rotw-a sh -c "mongosh < /scripts/init-shard-rotw.js"

The script initiates a replica set with 3 nodes, this time leaving every member at the default priority.

rs.initiate(
  {
    _id: "rs-shard-rotw",
    version: 1,
    members: [
      { _id: 0, host: "rs-shard-rotw-a:27017" },
      { _id: 1, host: "rs-shard-rotw-b:27017" },
      { _id: 2, host: "rs-shard-rotw-c:27017" }
    ]
  }
)

Wait approximately 30 seconds for the replica sets to elect their primaries, then initialize the router:

> docker-compose exec router01 sh -c "mongosh < /scripts/init-router.js"

The script registers the shards with the cluster. Passing a single replica-set member is enough for mongos to discover the remaining members, but the script lists every member explicitly.

sh.addShard("rs-shard-gdpr/rs-shard-gdpr-a:27017")
sh.addShard("rs-shard-gdpr/rs-shard-gdpr-b:27017")
sh.addShard("rs-shard-gdpr/rs-shard-gdpr-c:27017")
sh.addShard("rs-shard-rotw/rs-shard-rotw-a:27017")
sh.addShard("rs-shard-rotw/rs-shard-rotw-b:27017")
sh.addShard("rs-shard-rotw/rs-shard-rotw-c:27017")
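
To verify that both shards registered, an optional check from a mongosh session on the router, using the standard listShards admin command:

db.adminCommand({ listShards: 1 }).shards.forEach(s => print(s._id, "->", s.host))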

Let’s create an admin user in the config server and the two shards.

> docker-compose exec configsvr01 bash "/scripts/auth.sh"
> docker-compose exec rs-shard-gdpr-a bash "/scripts/auth.sh"
> docker-compose exec rs-shard-rotw-a bash "/scripts/auth.sh"
#!/bin/bash

mongosh <<EOF
use admin;
db.createUser({user: "jose.figueredo", pwd: "jfwnc", roles:[{role: "root", db: "admin"}]});
exit;
EOF
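
The script can create the user without credentials thanks to the localhost exception, which permits creating the first user while access control is enabled but no users exist yet. As an optional sanity check (my addition), authenticate from a mongosh session inside the container:

db.getSiblingDB("admin").auth("jose.figueredo", "jfwnc")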

Post-initializing the router

docker-compose exec router01 mongosh --port 27017 -u "jose.figueredo" --authenticationDatabase admin
mongodb> load('/scripts/post-init-router.js')

The script will:

  • disable balancing for the collection
  • add the shards to their respective zones
  • update the zone key ranges
  • enable sharding for the database and shard the collection
  • re-enable balancing

Zone key ranges include the lower bound and exclude the upper bound, so a range from { country: "AR" } to { country: "AR_" } captures the documents whose country is the two-letter code "AR".

sh.disableBalancing("WncDB.Client")

printjson("Add Zones to Router")

sh.addShardToZone("rs-shard-gdpr", "GDPR")
sh.addShardToZone("rs-shard-rotw", "ROTW")

printjson("Update Zone Ranges on Router")

sh.updateZoneKeyRange("WncDB.Client", { country: "AR" }, { country: "AR_" }, "ROTW")
sh.updateZoneKeyRange("WncDB.Client", { country: "ES" }, { country: "ES_" }, "GDPR")
sh.updateZoneKeyRange("WncDB.Client", { country: "MX" }, { country: "MX_" }, "ROTW")
sh.updateZoneKeyRange("WncDB.Client", { country: "PE" }, { country: "PE_" }, "ROTW")
sh.updateZoneKeyRange("WncDB.Client", { country: "UY" }, { country: "UY_" }, "ROTW")

printjson("Enable Sharding")

sh.enableSharding("WncDB")

printjson("Shard Collection")

sh.shardCollection("WncDB.Client", { country: 1 })

sh.enableBalancing("WncDB.Client")
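
If you want to see the zone ranges the script defined, an optional query against the cluster metadata (run through the router; the config database stores one document per zone range in its tags collection):

db.getSiblingDB("config").tags.find().forEach(printjson)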

Testing

This Python script connects to our cluster through the router and inserts 10 documents, each with a country chosen at random from a list.

> python3 inserter-auth.py
import random
from pymongo import MongoClient

# The router is exposed on localhost:27117 by our docker-compose file
DB_CONNECTION_URL = 'mongodb://jose.figueredo:jfwnc@127.0.0.1:27117/?authMechanism=DEFAULT'

try:
    conn = MongoClient(DB_CONNECTION_URL)
    print("Connected")
except Exception as e:
    print("Could not connect to MongoDB", str(e))
    exit(1)

# database
db = conn.WncDB

# collection
collection = db.Client

# insert 10 documents with a random country and name
for i in range(10):
    emp_rec = {
        "country": random.choice(["AR", "ES", "MX", "PE", "UY"]),
        "name": str(random.randint(1, 1000)),
    }
    rec_id = collection.insert_one(emp_rec)
    print("Data inserted with record id", rec_id.inserted_id)

# print the data inserted
for record in collection.find():
    print(record)

Check which documents were saved to the GDPR shard:

docker-compose exec rs-shard-gdpr-a mongosh --port 27017 -u "jose.figueredo" --authenticationDatabase admin
mongodb> load('/scripts/query-shard-gdpr.js')
db = db.getSiblingDB("WncDB")
db.Client.find().forEach(printjson)

Check which documents were saved to the RotW shard:

docker-compose exec rs-shard-rotw-a mongosh --port 27017 -u "jose.figueredo" --authenticationDatabase admin
mongodb> load('/scripts/query-shard-rotw.js')
db = db.getSiblingDB("WncDB")
db.Client.find().forEach(printjson)
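
Alternatively, a mongosh session on the router can summarize how the collection is spread across the shards with the standard getShardDistribution() helper:

db = db.getSiblingDB("WncDB")
db.Client.getShardDistribution()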

Clean up

This command will shut down the servers and erase the volumes created by our docker-compose definition:

> docker-compose down -v

The GitHub repo

Clone this repo to test the PoC:

Final words

If you have corrections or want to improve the post, let me know in the comments.
