Keeping integrity between two separate data stores during backups (MySQL and MongoDB)


I have an application designed so that the relational data sits in MySQL, where it fits naturally. Other data has an evolving schema and no relational structure, so I figured the natural way to store that data was as MongoDB documents. The issue here is that one of the documents references a MySQL primary id. So far this has worked without issues. My concern is that when production traffic comes in and we start working with backups, there might be an inconsistency: if a document changes during a backup, it might not point to the correct id in the MySQL database. The only way to guarantee any degree of consistency would be to shut down the application and take the backups, which doesn't make sense.
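For concreteness, the document side of that design might look something like the sketch below. The collection layout and the field names (`mysql_user_id`, `preferences`) are my own illustration, not from the question:

```shell
# Illustrative only: the shape of a MongoDB document that carries a MySQL
# primary key. "mysql_user_id" and "preferences" are assumed names.
DOC=$(cat <<'EOF'
{
  "mysql_user_id": 42,
  "preferences": { "theme": "dark", "notifications": true }
}
EOF
)
echo "$DOC"
```

The integrity question is exactly about that `mysql_user_id` field: it must still resolve to a real MySQL row in whatever pair of backups you restore together.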

There must be other people who have deployed a similar strategy. What is the best way to ensure data integrity between the two data stores, particularly during backups?

MySQL perspective

All of your MySQL data would have to use InnoDB. You can then make a snapshot of the MySQL data as follows:

mysqldump_options="--single-transaction --routines --triggers"
mysqldump -u... -p... ${mysqldump_options} --all-databases > mysqldata.sql

This creates a clean point-in-time snapshot of the MySQL data in a single transaction.

For instance, if you start the mysqldump at midnight, the data in the mysqldump output is as of midnight. Data can still be added to MySQL while the dump runs (provided all the data uses the InnoDB storage engine), and MongoDB can reference new data added to MySQL after midnight, even while the backup is in progress.

If you have MyISAM tables, you need to convert them to InnoDB. Let's cut to the chase. Here is how to generate a script that converts all user-defined MyISAM tables to InnoDB:

myisam_to_innodb_conversion_script=/root/convertmyisamtoinnodb.sql
echo "SET SQL_LOG_BIN = 0;" > ${myisam_to_innodb_conversion_script}
mysql -u... -p... -AN -e"SELECT CONCAT('ALTER TABLE ',table_schema,'.',table_name,' ENGINE=InnoDB;') InnoDBConversionSQL FROM information_schema.tables WHERE engine='MyISAM' AND table_schema NOT IN ('information_schema','mysql','performance_schema') ORDER BY (data_length+index_length)" >> ${myisam_to_innodb_conversion_script}

Just run the script when you are ready to convert all user-defined MyISAM tables. Any system-related MyISAM tables are ignored by the query, and they should not be touched anyway.
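As a sanity check after running it, you can confirm that no user-defined MyISAM tables remain. This is a sketch; replace the `-u... -p...` placeholders with real credentials:

```shell
# List any remaining user-defined MyISAM tables; an empty result means
# the conversion is complete. Credentials below are placeholders.
check_query="SELECT table_schema, table_name FROM information_schema.tables WHERE engine='MyISAM' AND table_schema NOT IN ('information_schema','mysql','performance_schema');"
mysql -u... -p... -AN -e "${check_query}"
```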

MongoDB perspective

I cannot speak much to MongoDB, as I know very little about it. Yet, on the MongoDB side of things, if you set up a replica set for the MongoDB data, you can use mongodump against a replica. Since mongodump is not point-in-time, you would have to disconnect the replica (to stop changes coming over) and then perform the mongodump on the replica. Afterwards, re-establish the replica with its master. Check with your developers or with 10gen whether mongodump can be used against a disconnected replica set.
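One common way to freeze a replica for the dump is `db.fsyncLock()`/`db.fsyncUnlock()` in the mongo shell. Here is a sketch under assumed values: the secondary is on port 27018 and the output path is mine; verify the lock behavior against your MongoDB version:

```shell
# Freeze writes on the secondary, dump it, then unfreeze.
# Port and output path are assumptions for this sketch.
port=27018
out="/backups/mongo-$(date +%Y%m%d)"

mongo --port "${port}" --eval 'db.fsyncLock()'     # flush to disk and block writes
mongodump --port "${port}" --out "${out}"          # dump all databases
mongo --port "${port}" --eval 'db.fsyncUnlock()'   # resume writes/replication
```

The lock makes the mongodump effectively point-in-time for that member, at the cost of the secondary lagging until it is unlocked and catches up.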

Common objectives

If point-in-time really matters to you, please make sure the OS clocks on all hosts have the same synchronized time and timezone. If you have to perform such a synchronization, you must restart mysqld and mongod. Then, your crontab jobs for mysqldump and mongodump will go off at the same time. Personally, I would delay the mongodump by 30 seconds to assure that the ids from MySQL you want posted in MongoDB are all accounted for.

If you have mysqld and mongod running on the same server, you do not need MongoDB replication. Just start the mysqldump at 00:00:00 (midnight) and the mongodump at 00:00:30 (30 seconds after midnight).
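The crontab entries for that single-server layout could look like this. The script paths are placeholders; since cron has no sub-minute resolution, the 30-second offset is done with a leading `sleep`:

```shell
# m h dom mon dow  command
0 0 * * * /usr/local/bin/mysql_backup.sh
0 0 * * * sleep 30 && /usr/local/bin/mongo_backup.sh
```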

