Welcome

Passionately curious about Data, Databases and Systems Complexity. Data is ubiquitous, the database universe is dichotomous (structured and unstructured), expanding and complex. Find my Database Research at SQLToolkit.co.uk . Microsoft Data Platform MVP

"The important thing is not to stop questioning. Curiosity has its own reason for existing" Einstein



Saturday, 8 November 2014

MongoDB Days London 2014








I attended the MongoDB event in London on 6 November. This was the first NoSQL event I have attended.

MongoDB (from "humongous") is an open-source, general-purpose document database designed to be agile and scalable. The schema is dynamic, so the data model can evolve as the application evolves. There are three core design principles behind MongoDB:

  • Increasing development productivity
  • Ensuring it is easy to maintain
  • Horizontal scalability
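
To make the dynamic schema concrete, here is a minimal sketch in plain Python, with dictionaries standing in for documents and a list standing in for a collection. No MongoDB server is involved, and the field names and values are made up for illustration.

```python
# Plain dicts stand in for MongoDB documents; a list stands in for a collection.
# All names and values are illustrative, not taken from the event.
customers = []

# Version 1 of the application stores only a name and an email.
customers.append({"_id": 1, "name": "Ada", "email": "ada@example.com"})

# A later version adds a nested address and a list of tags.
# Nothing has to be migrated: the older document simply lacks these fields.
customers.append({
    "_id": 2,
    "name": "Grace",
    "email": "grace@example.com",
    "address": {"city": "London", "postcode": "N1"},
    "tags": ["newsletter"],
})


def city_of(doc):
    """Read an optional nested field, tolerating documents that predate it."""
    return doc.get("address", {}).get("city")


cities = [city_of(doc) for doc in customers]
print(cities)
```

The trade-off is that the application code, rather than the database, has to cope with documents of different shapes, which is what `city_of` does here.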

New features in version 2.8 include document-level locking and pluggable storage engines. The WiredTiger storage engine, which uses non-locking algorithms and accesses data at RAM speed, is available in MongoDB.

There are two base architecture models:

  • Replica Sets (for high availability and disaster recovery)
  • Sharding (for scaling out when the volume of persisted data is too large for a single host machine)
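
As a rough illustration of the sharding idea, the sketch below is a toy model of range-based routing in plain Python. It is not MongoDB's implementation: the shard names, split points and the `surname` shard key are all hypothetical, and real clusters split and migrate chunks automatically.

```python
import bisect

# Toy model of range-based sharding: each chunk covers a range of shard key
# values (lower bound inclusive, upper bound exclusive) and lives on one shard.
# Shard names and split points are hypothetical.
SPLIT_POINTS = ["h", "p"]                       # chunk boundaries on the shard key
CHUNK_OWNERS = ["shard0", "shard1", "shard2"]   # one owner per chunk


def route(shard_key):
    """Return the shard holding the chunk whose range contains shard_key."""
    # bisect_right counts the split points that are <= the key, which is
    # the index of the chunk the key falls into.
    return CHUNK_OWNERS[bisect.bisect_right(SPLIT_POINTS, shard_key)]


def distribute(docs, key_field):
    """Group document ids by the shard a router would send them to."""
    placement = {owner: [] for owner in CHUNK_OWNERS}
    for doc in docs:
        placement[route(doc[key_field])].append(doc["_id"])
    return placement


docs = [
    {"_id": 1, "surname": "adams"},
    {"_id": 2, "surname": "hopper"},
    {"_id": 3, "surname": "turing"},
]
placement = distribute(docs, "surname")
print(placement)
```

Each host only has to store the chunks it owns, which is how the total data volume can exceed what any single machine could hold.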

The MongoDB Management Service (MMS) is a hosted service that provides monitoring, backup and automated deployment of MongoDB instances. This tool will soon be available on premises as well. Currently, scripts for automating management can be deployed using tools such as Chef and Puppet, but these can be difficult to maintain. The new automation component of MMS makes deployment and elastic scaling easy to manage.

Backups can also be taken with the mongodump utility. However, if you need to restore the data, the indexes have to be rebuilt as part of the restore.
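
A dump and restore could be scripted along the following lines. This is only a sketch: the database name, backup path, host and port are placeholders, and the code just builds the command lines rather than running them. As far as I understand it, mongorestore recreates the indexes from the definitions saved in the dump, which is where the rebuild cost comes from.

```python
import shlex


def dump_command(db_name, out_dir, host="localhost", port=27017):
    """Build the argument list for a mongodump of a single database."""
    return ["mongodump", "--host", host, "--port", str(port),
            "--db", db_name, "--out", out_dir]


def restore_command(dump_dir, host="localhost", port=27017):
    """Build the argument list for restoring a dump directory with mongorestore."""
    return ["mongorestore", "--host", host, "--port", str(port), dump_dir]


# Placeholder database name and backup location.
dump = dump_command("sales", "/backups/2014-11-08")
restore = restore_command("/backups/2014-11-08")

print(shlex.join(dump))
print(shlex.join(restore))
```

On a machine with the MongoDB tools installed, either list could be passed to `subprocess.run(..., check=True)` to actually execute it.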

Security in MongoDB includes database roles and can use certificates and encryption. MongoDB initially comes with no permissions set, so anyone who connects can do everything. It is important to set permissions following the principle of least privilege. $redact is a new aggregation framework operator that restricts which data in the database can be viewed.
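
To give a feel for $redact, the sketch below builds an aggregation pipeline as plain Python data. It assumes every document and sub-document carries a `tags` array listing the access labels allowed to read it. That field name and the labels used are assumptions for this example, not something shown at the event.

```python
def redact_pipeline(user_access):
    """Build a pipeline that hides sub-documents the user may not see.

    Assumes each document level has a `tags` array of access labels;
    the field name is an assumption made for this sketch.
    """
    # True when the user's labels overlap with the labels on this level.
    visible = {
        "$gt": [
            {"$size": {"$setIntersection": ["$tags", user_access]}},
            0,
        ]
    }
    return [
        {
            "$redact": {
                "$cond": {
                    "if": visible,
                    "then": "$$DESCEND",  # keep this level, then check nested documents
                    "else": "$$PRUNE",    # drop this level and everything below it
                }
            }
        }
    ]


# Hypothetical access labels for the current user.
pipeline = redact_pipeline(["finance", "general"])
print(pipeline)
```

With a driver such as PyMongo the resulting list would be passed to a collection's `aggregate` method, so the filtering happens inside the database rather than in application code.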

MongoDB can be used for analytics and has a business data connector to Hadoop.

Tools for Troubleshooting
% mongostat - Provides a quick overview of the status of a currently running mongod or mongos instance.
% mongotop - Shows the amount of time a MongoDB instance spends reading and writing data.
db.currentOp() - Returns information on in-progress operations for the database instance.
db.serverStatus() - Provides an overview of the database process's state.
rs.status() - Reflects the current status of the replica set.
sh.status() - Reports on the sharding configuration and the existing chunks in a sharded cluster.
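
The output of these helpers is an ordinary document, so it is easy to post-process. As a small sketch, the function below picks out the primary and any unhealthy members from a replica set status document. The sample is hand-written with made-up host names and contains only the `set`, `name`, `health` and `stateStr` fields. A real status document has many more.

```python
def summarize_replica_set(status):
    """Summarize a replica set status document of the kind rs.status() returns."""
    members = status.get("members", [])
    primary = next(
        (m["name"] for m in members if m.get("stateStr") == "PRIMARY"), None
    )
    # health is 1 for a reachable member and 0 otherwise.
    unhealthy = [m["name"] for m in members if m.get("health") != 1]
    return {"set": status.get("set"), "primary": primary, "unhealthy": unhealthy}


# Hand-written sample with hypothetical host names, not real output.
sample = {
    "set": "rs0",
    "members": [
        {"name": "db1.example.net:27017", "health": 1, "stateStr": "PRIMARY"},
        {"name": "db2.example.net:27017", "health": 1, "stateStr": "SECONDARY"},
        {"name": "db3.example.net:27017", "health": 0,
         "stateStr": "(not reachable/healthy)"},
    ],
}
summary = summarize_replica_set(sample)
print(summary)
```

From a driver, the same document can be fetched by running the `replSetGetStatus` command against the admin database.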

The log explained  


The mtools scripts help visualise the MongoDB log files. The commands used from this tool in the troubleshooting session were:
mloginfo
mlogfilter
mplotqueries

Definitions
Terms mentioned during the day and their definitions

Oplog - Stores an ordered history of logical writes to a MongoDB database.
Config servers - Special mongod instances that store the metadata for a sharded cluster.
mongod - The MongoDB database server.
mongos - The routing and load balancing process that acts as an interface between an application and a MongoDB sharded cluster.
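
To show what "an ordered history of logical writes" means in practice, here is a simplified replay of oplog-style entries onto an in-memory collection. The entries borrow the `op`, `ns`, `o` and `o2` field names from the real oplog, but everything else is a simplification: timestamps are plain integers, updates replace the whole document, and the namespace and values are invented.

```python
def apply_oplog(collection, oplog):
    """Replay simplified oplog entries, in order, onto an in-memory collection.

    `collection` maps _id -> document. Only inserts ("i"), replacement-style
    updates ("u") and deletes ("d") are modelled.
    """
    for entry in oplog:
        op = entry["op"]
        if op == "i":
            collection[entry["o"]["_id"]] = dict(entry["o"])
        elif op == "u":
            # "o2" identifies the target document, "o" holds the new content.
            target = entry["o2"]["_id"]
            collection[target] = {"_id": target, **entry["o"]}
        elif op == "d":
            collection.pop(entry["o"]["_id"], None)
    return collection


# Invented entries for a hypothetical shop.items collection.
oplog = [
    {"ts": 1, "op": "i", "ns": "shop.items", "o": {"_id": 1, "qty": 5}},
    {"ts": 2, "op": "i", "ns": "shop.items", "o": {"_id": 2, "qty": 9}},
    {"ts": 3, "op": "u", "ns": "shop.items", "o2": {"_id": 1}, "o": {"qty": 4}},
    {"ts": 4, "op": "d", "ns": "shop.items", "o": {"_id": 2}},
]
secondary = apply_oplog({}, oplog)
print(secondary)
```

Replaying the same entries a second time leaves the collection unchanged, which mirrors the way a replica set secondary can safely reapply operations when it catches up with the primary.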

The event provided a useful introduction to MongoDB.
