I often need to split large MySQL dumps into smaller files so I can do selective imports, for example from live to dev where you might not want all the data. Each time I seem to rescript some solution to the problem, so here’s my current one: a simple Ruby script. You give it the path to a mysqldump and it outputs a series of echo and sed commands to do the work.
UPDATE: Please do not use this code, it’s too slow and inefficient; new code can be found here.
Just pipe its output to a file and run that via the shell when you’re ready to do the splitting. At the end you’ll have a file per table in your cwd.
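For example, assuming the script is saved as split-mysql-dump.rb (the name used in the sample output below), the workflow looks roughly like this; split.sh is just an arbitrary name for the generated file:

$ ./split-mysql-dump.rb exim.sql > split.sh
$ sh split.sh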
#!/usr/bin/ruby

prevtable = ""
prevline = 0

if ARGV.length == 1
  dumpfile = ARGV.shift
else
  puts("Please specify a dumpfile to process")
  exit 1
end

if File.exist?(dumpfile)
  %x[grep -n "Table structure for table" #{dumpfile}].each do |line|
    if line =~ /(\d+):-- Table structure for table .(.+)./
      curline = $1.to_i
      table = $2

      unless prevtable == ""
        puts("echo \"\`date\`: Processing #{prevtable} - lines #{prevline - 1} to #{curline - 2}\"")
        puts("sed -n '#{prevline - 1},#{curline - 2}p;#{curline - 2}q' #{dumpfile} > #{prevtable}.sql")
        puts
      end

      prevline = curline
      prevtable = table
    end
  end
else
  puts("Can't find dumpfile #{dumpfile}")
  exit 1
end
It’s pretty fast: the heavy lifting is all done with grep and sed, Ruby is just there to drive those commands and parse a few lines of output.
Running it produces something like this:
$ split-mysql-dump.rb exim.sql
echo "`date`: Processing domain_sender_whitelist - lines 32 to 47"
sed -n '32,47p;47q' exim.sql > domain_sender_whitelist.sql

echo "`date`: Processing domain_valid_users - lines 48 to 64"
sed -n '48,64p;64q' exim.sql > domain_valid_users.sql
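The 'N,Mp;Mq' expression tells sed to print lines N to M and then quit at line M, so it never reads past the end of the table it’s extracting. Each sed invocation still has to read the dump from the beginning up to its quit line though, which is presumably part of why this approach gets slow on big dumps with many tables. A single pass in Ruby avoids that; here’s a rough sketch of the idea (not the replacement code linked in the update above):

#!/usr/bin/ruby
# Rough single-pass sketch: read the dump once and switch output files
# whenever a new "Table structure for table" marker appears.

if ARGV.length == 1
  dumpfile = ARGV.shift
else
  puts("Please specify a dumpfile to process")
  exit 1
end

unless File.exist?(dumpfile)
  puts("Can't find dumpfile #{dumpfile}")
  exit 1
end

out = nil

File.open(dumpfile) do |f|
  f.each_line do |line|
    if line =~ /-- Table structure for table .(.+)./
      out.close if out
      out = File.open("#{$1}.sql", "w")
    end

    # Lines before the first table marker are skipped
    out.write(line) if out
  end
end

out.close if out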