MySQL Replication & Cluster Evaluation

Posted on Updated on

I have recently written a short evaluation of MySQL Replication and Cluster. This was done mainly as a study to evaluate which setup of MySQL Replication or Cluster is better suited to my needs for certain projects.

MySQL Replication

  • Scales the database beyond the single instance capacity constraints
  • Load balance read/write queries
  • Highly Available setup
  • Increase read performance and decrease database load

MySQL replication setups usually consists of a master and slave server. The master server handles reads and writes while slave servers handle only reads. In a situation where SELECT queries dominate over INSERT/UPDATE/DELETE it is ideal to have 1 master handle only writes and many slaves handle all the reads. It is common practice to see a setup of 1 master and 30 slaves.

Replication Topologies

Master to Slave Master to Many Slaves

Master to Many Slaves of Slaves Master to Master

Multi-Master Ring

Master to Slave

  • Most popular and easy to administer
  • Only 2 servers

Master to Many Slaves

  • Also popular but has extra administration costs
  • Can use a mix of asynchronous and semi-synchronous slaves

Master to Many Slaves of Slaves

  • Slaves have their own slave servers
  • Salve acts like a master but all writers are done to the original master
  • Extra administration costs
  • More complex setup

Master to Master

  • Both servers act as master and slave to each other
  • Write performance is increased because its balanced on 2 servers
  • No support for write collisions and detection
  • Application has to handle write collisions
  • Setup is complex

Multi Master Ring

  • Multiple masters connected in a ring replicating to each other
  • Like master to master but with more than 2 servers
  • No support for write collisions and detection
  • Application has to handle write collisions
  • Administration cost is high
  • Setup is complex

Types of Replication

  • Asynchronous
    • Data is replicated from master to salve with a delay
    • Setup is fast because master sends transactions to slaves and returns immediately
    • A slave’s data may get updated much after the master
    • A slave’s data may be from a few hours to a few days behind in relation to the master
    • Data consistency not 100% guaranteed
  • Semi-Synchronous
    • Data is replicated from master to slaves with a commit timeout parameter
    • Master waits for at least 1 semi-synchronous slave to commit before returning
    • Setup has performance penalty because master has to wait for slave to commit
    • At least 1 semi-synchronous slave’s data will be consistent with master at all times
    • Data consistency is 100% guaranteed
    • Any semi-synchronous slave is an ideal candidate for promotion to master if master fails
  • Synchronous
    • Data is replicated to all slaves in real time
    • Data on all slaves and masters are consistent at all times
    • MySQL alone does not support this feature
      • Use disk replication software like DRBD
      • Use MySQL Cluster 7.2

Semi-Synchronous replication setup is the preferred replication type to use for load balanced servers with high-read performance, 100% data consistency and high-availability.

Preferred Semi-Synchronous Replication Setup

Use Cases

MySQL Replication is best used for scaling-out your database architecture. This is not the same as scaling-up. Scaling-out is to handle rapidly growing database workload in order to increase performance. Scaling-up is to increase your resources such as RAM, CPU and disk space to satisfy the increased capacity.

  • Scale-Out = Horizontal Scaling
  • Scale-Up = Vertical Scaling

High availability is provided when using a replication setup. A slave can be promoted to a master through the intervention of a database administrator. A master server may go offline due to an error, crash, disaster or just plain old maintenance.

Slave servers are very useful due to their ability to offload reads from the master. They can also be used to in groups to increase read performance from a very demanding client base. They’re also useful for performing backups and analytics.

  • Backups (such as mysqldump) can be performed on a slave without affecting load on other slaves or the master
  • Backups can be done on asynchronous slaves since backups are usually 1 day old
  • Heavy analytic queries that may slow down the master can be done on slaves
  • Heavy analytic queries of historical data can be done on asynchronous slaves

Replication with MySQL Cluster 7.2

Replication can be used in conjunction with MySQL Cluster 7.2. Asynchronus replication is often used to replicate a cluster to another geographic location. A cluster may have its data replicated to a slave server so data can be backed up or use heavy analytic queries that may take a long time like historical reports or data mining.

MySQL Cluster

  • Real-time, write scalable, ACID-Compliant transactional database
  • Distributed multi-master architecture with no single point of failure
  • Highly available with 99.999% availability
  • Synchronous replication of data with real-time performance
  • Load balanced and automatic data partitioning (sharding)
  • Automatic fail over
  • Automatic resynchronization of nodes after failure for self-healing

MySQL Cluster 7.2 uses the NDB storage engine, it is different than the InnoDB and MyISAM engines. All data (that can be accessed via an index) is stored in memory. Non-indexed data may be stored on disk. For this reason clusters are used primarily for primary key based queries and statements. The NDB API is provided to bypass the SQL layer and access data faster and with better real-time behavior and provide more throughput.

  • NDB has no foreign key support
  • Table integrity must be maintained by application side or use of triggers via MySQL server access to the cluster
  • NDB has no support for full text search. You have to replicate to a slave server (using MyISAM or InnoDB) and perform full text search on those tables
  • Cluster my perform slower than the standard single instance of InnoDB
  • Fetching data incurs a network overhead
  • MySQL Cluster is optimized to perform many small transactions on a massive scale
  • Optimizing application for these type of transactions can increase speed
  • Complex join queries are optimized to be about 70x times faster than regular storage engine
MySQL Cluster with 4 Data Nodes

Hardware Requirements

The cluster is designed to store all data in memory. Only non-indexed data can be stored on disk. A formula is provided to calculate how much memory is needed in total and per-node. MySQL Cluster requires a full time database administrator.

Formula: Data Size * Replicas * 1.25 = Total Database Memory Needed

Total RAM needed: 20GB * 2 * 1.25 = 50GB

We have 4 nodes so …

(20GB * 2 * 1.25) / 4 = 12.25 GB RAM per node

Meeting Data Node hardware requirements

  • 32x x86_64 bit CPU cores
  • Large CPU cache assist
  • 64bit hosts with enough ram to fit all in-memory data (use formula)
  • Linux, Solaris or Windows
  • 2 network cards bonded for redundancy
  • 2 power supply units for redundancy
  • 2 Data nodes to make up 1 node group. Each node group should use a different power and cooling infrastructure to avoid single points of failure
  • Configure system to reduce disk swapping
  • Install 7x times the amount of data memory (from formula) required for each node
  • Space is needed for 2 stored local check points, redo log and 3 backups
  • Extra space is needed for table-spaces if using disk-based data and allow for extra space to perform backups on those tables

MySQL Cluster needs a minimum of 6 physical servers to be configured:

  • 2x Data Nodes
  • 2x SQL/NoSQL Application Nodes
  • 2x Management Nodes

MySQL Cluster Manager

MySQL Cluster Manager is a non-free software designed to manage MySQL Cluster for administrators. It simplifies the creation and management of the cluster by automating common management tasks. For example; A management operation that requires a rolling restart of MySQL cluster database needed 46 manual commands that consume 2.5 hours of an administrator’s time. With the Manager the same task can be performed with a single command. Cluster Manager also supports monitoring.

Guide to Scaling Web Databases with MySQL Cluster
MySQL Cluster Evaluation Guide
Scaling Web Services with MySQL Cluster: An Alternative to the MySQL Memory Storage Engine

Leave a Comment

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s