diff --git a/doc/capacity_plan.rdoc b/doc/capacity_plan.rdoc index 8ecefa3..84196c8 100644 --- a/doc/capacity_plan.rdoc +++ b/doc/capacity_plan.rdoc @@ -25,7 +25,7 @@ First you want to run the following create table statment on a database you woul Next you want to fill out the \Jetpants configuration file (either /etc/jetpants.yaml or ~/.jetpants.yaml). For example you configuration might look like this: # ... rest of Jetpants config here - + plugins: capacity_plan: critical_mount: 0.85 @@ -54,7 +54,7 @@ Next you want to is create a cron to capture the historical data 0 * * * * /your_bin_path/jetpants capacity_snapshot 2>&1 > /dev/null Then you want create a cron that will email you the report everyday (if you want that) - + 0 10 * * * /your_bin_path/jetpants capacity_plan --email=your_email@example.com 2>&1 > /dev/null If you want the hardware stats part of the email you have to create a function in Jetpants.topology.machine_status_counts that returns a hash that will be used to output the email @@ -67,11 +67,10 @@ Also you should have the pony gem installed == USAGE: -If you want to run the capacity plan you can do - +If you want to run the capacity plan you can do + jetpants capacity_plan To capture a one off snapshot of your data usage - - jetpants capacity_snapshot + jetpants capacity_snapshot diff --git a/doc/faq.rdoc b/doc/faq.rdoc index c3b70cd..4d50632 100644 --- a/doc/faq.rdoc +++ b/doc/faq.rdoc @@ -40,14 +40,14 @@ A sharding key is a core foreign key column that is present in most of your larg For example, on a blogging site the sharding key might be blog_id. Most tables that contain a blog_id column can be sharded, which will mean that all data related to a particular blog (posts, comments on those posts, authors, etc) is found on the same shard. By organizing data this way, you can continue to use relational operations such as JOIN when querying data that lives on the same shard. -Regardless of sharding key, some tables will not be shardable. This includes any "global" table that doesn't contain your sharding key column, as well as any tables that have global lookup patterns. For this reason you might not be able to shard the core table which has your sharding_key as its primary key! +Regardless of sharding key, some tables will not be shardable. This includes any "global" table that doesn't contain your sharding key column, as well as any tables that have global lookup patterns. For this reason you might not be able to shard the core table which has your sharding_key as its primary key! In other words: if your sharding key is user_id, you might not actually be able to shard your users table because you need to do global lookups (ie, by email address) on this table. Denormalization is a common work-around; you could split your users table into a "global lookup" portion in a global pool and an "extended data" portion that lives on shards. == What is range-based sharding? Why use it, and what are the alternatives? -Range-based sharding groups data based on ranges of your sharding key. For example, with a sharding key of user_id, all sharded data for users 1-1000 may be on the first shard, users 1001-3000 on the second shard, and users 3001-infinity on the third and final shard. +Range-based sharding groups data based on ranges of your sharding key. For example, with a sharding key of user_id, all sharded data for users 1-1000 may be on the first shard, users 1001-3000 on the second shard, and users 3001-infinity on the third and final shard. 
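As the next paragraph notes, the routing code for this scheme can be trivially small. A minimal Ruby sketch, purely for illustration (the SHARD_RANGES table and the hostnames are invented for the example; in practice the ranges would come from the config your asset tracker writes out for the application):

  # Hypothetical routing table mirroring the example above
  SHARD_RANGES = [
    { min: 1,    max: 1000,            host: 'db-shard-1' },
    { min: 1001, max: 3000,            host: 'db-shard-2' },
    { min: 3001, max: Float::INFINITY, host: 'db-shard-3' },
  ]

  # Pick the shard whose [min, max] range covers the sharding key
  def shard_for(user_id)
    range = SHARD_RANGES.find { |r| user_id >= r[:min] && user_id <= r[:max] }
    raise "no shard covers user_id #{user_id}" unless range
    range[:host]
  end

  shard_for(2500)   # => 'db-shard-2'

Splitting a shard under this scheme amounts to narrowing one entry's range and adding a new entry for the rows that moved.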
The main benefit of range-based sharding is simplicity. You can express the shard ranges in a language-neutral format like YAML or JSON, and the code to route queries to the correct DB can be implemented in a trivially small amount of code. There's no need for a lookup service, so we avoid a single point of failure. It's also easy for a human to look at the ranges and figure out which DB to query when debugging a problem by hand. @@ -60,9 +60,9 @@ The main downside to the range-based approach is lack of even distribution of "h * Modulus or hash: Apply a function to your sharding key to determine which shard the data lives on. This approach helps to distribute data very evenly. Many sites find that their latest users behave differently than their oldest users, so grouping users together by ranges of ID (essentially ranges of account creation date) can be problematic. Using a modulus or hash avoids this problem. - + The main issue with this approach is how to rebalance shards that are too large. A simple modulus can't do this unless you want to simultaneously split all of your shards in half, which leads to painful exponential growth. A hash function can be more versatile but can still lead to great complexity. Worse yet, there's no way to rebalance _quickly_ because data is not stored on disk in sorted order based on the hash function. - + * Lookup table: Use a separate service or data store which takes a sharding key value as an input and returns the appropriate shard as an output. This scheme allows you to very specifically allocate particular data to shards, and works well for sites that have a lot of "hot" data from celebrity users. However, the lookup service is essentially a single point of failure, which counteracts many of the attractive features of sharded architectures. Rebalancing can also be slow and tricky, since you need a notion of "locking" a sharding key value while its rows are being migrated. diff --git a/doc/jetpants_collins.rdoc b/doc/jetpants_collins.rdoc index 8bc51d9..d21e3bb 100644 --- a/doc/jetpants_collins.rdoc +++ b/doc/jetpants_collins.rdoc @@ -18,13 +18,13 @@ remote_lookup:: Supply "remoteLookup" parameter for \Collins requests, to search To enable this plugin, add it to your \Jetpants configuration file (either /etc/jetpants.yaml or ~/.jetpants.yaml). For example, in a single-datacenter environment, you configuration might look like this: # ... rest of Jetpants config here - + plugins: jetpants_collins: user: jetpants password: xxx url: http://collins.yourdomain.com:8080 - + # ... other plugins configured here == ASSUMPTIONS AND REQUIREMENTS: @@ -53,17 +53,17 @@ Adding functional partitions (global / unsharded pools): # Create the pool object, specifying pool name and IP of current master p = Pool.new('my-pool-name', '10.42.3.4') - + # Tell Jetpants about IPs of any existing active slaves (read slaves), if any. # For example, say this pool has 2 active slaves and 2 standby slaves. \Jetpants # can automatically figure out which slaves exist, but won't automatically know # which ones are active for reads, so you need to tell it. p.has_active_slave('10.42.3.30') p.has_active_slave('10.42.3.32') - + # Sync the information to Collins p.sync_configuration - + Repeat this process for each functional partition, if you have more than one. 
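After syncing, you can sanity-check what \Jetpants now knows about the pool from an interactive \Jetpants session. A brief sketch, reusing the example pool name from above:

  # Re-read the pool via the asset tracker and print its layout;
  # passing true to #summary also includes replication coordinates and slave lag.
  pool = Jetpants.topology.pool('my-pool-name')
  puts pool.summary(true)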
Adding shard pools: diff --git a/doc/online_schema_change.rdoc b/doc/online_schema_change.rdoc index 2deb68c..52d25e9 100644 --- a/doc/online_schema_change.rdoc +++ b/doc/online_schema_change.rdoc @@ -11,7 +11,7 @@ This plugin has no extra options, just add the name to your plugins section and To enable this plugin, add it to your \Jetpants configuration file (either /etc/jetpants.yaml or ~/.jetpants.yaml). For example you configuration might look like this: # ... rest of Jetpants config here - + plugins: online_schema_change: # ... other plugins configured here @@ -25,7 +25,7 @@ Also you should be using \Collins and the jetpants_collins plugin == EXAMPLES: dry run of an alter on a single pool - jetpants alter_table --database=allmydata --table=somedata --pool=users --dry-run --alter='ADD COLUMN c1 INT' + jetpants alter_table --database=allmydata --table=somedata --pool=users --dry-run --alter='ADD COLUMN c1 INT' alter a single pool jetpants alter_table --database=allmydata --table=somedata --pool=users --alter='ADD COLUMN c1 INT' @@ -42,4 +42,3 @@ the alter table does not drop the old table automatically, so to remove the tabl to drop the tables on all your shards jetpants alter_table_drop --database=allmydata --table=somedata --all_shards - diff --git a/doc/plugins.rdoc b/doc/plugins.rdoc index 7ef3b91..39842b9 100644 --- a/doc/plugins.rdoc +++ b/doc/plugins.rdoc @@ -41,8 +41,8 @@ If you're writing your own asset-tracker plugin, you will need to override the f * Jetpants::Topology#count_spares * Returns a count of spare database nodes * Jetpants::Pool#sync_configuration - * Updates the asset tracker with the current status of a pool. - * This should update the asset tracker's internal knowledge of the database topology immediately, but not necessarily cause the application's config file to be regenerated immediately. + * Updates the asset tracker with the current status of a pool. + * This should update the asset tracker's internal knowledge of the database topology immediately, but not necessarily cause the application's config file to be regenerated immediately. You may also want to override or implement these, though it's not strictly mandatory: diff --git a/doc/requirements.rdoc b/doc/requirements.rdoc index 28c7e67..8bf2355 100644 --- a/doc/requirements.rdoc +++ b/doc/requirements.rdoc @@ -1,6 +1,6 @@ = Jetpants Requirements and Assumptions -The base classes of \Jetpants currently make a number of assumptions about your environment and database topology. +The base classes of \Jetpants currently make a number of assumptions about your environment and database topology. Plugins may freely override these assumptions, and upstream patches are very welcome to incorporate support for alternative configurations. We're especially interested in plugins or pull requests that add support for: Postgres and other relational databases; Redis and other non-relational data stores; non-Redhat Linux distributions or *BSD operating systems; master-master topologies; multi-instance-per-host setups; etc. We have attempted to design \Jetpants in a way that is sufficiently flexible to eventually support a wide range of environments. diff --git a/doc/upgrade_helper.rdoc b/doc/upgrade_helper.rdoc index e6d7bac..388a27f 100644 --- a/doc/upgrade_helper.rdoc +++ b/doc/upgrade_helper.rdoc @@ -17,14 +17,14 @@ new_version:: major.minor string of MySQL version being upgraded to, for Example usage: # ... 
rest of Jetpants config here - + plugins: jetpants_collins: # config for jetpants_collins here - + upgrade_helper: new_version: "5.5" - + # ... other plugins configured here @@ -65,4 +65,4 @@ For subsequent shard upgrades, you may optionally use this simplified process. 3. Use "jetpants shard_upgrade --writes" to regenerate your application configuration in a way that moves read AND write queries to the upgraded mirror shard's master. 4. Use "jetpants shard_upgrade --cleanup" to eject all non-upgraded nodes from the pool entirely. This will tear down replication between the version of the shard and the old version. -Using a custom Ruby script, this process can be automated to perform each step on several shards at once. \ No newline at end of file +Using a custom Ruby script, this process can be automated to perform each step on several shards at once. diff --git a/lib/jetpants/callback.rb b/lib/jetpants/callback.rb index 0179950..d0b847f 100644 --- a/lib/jetpants/callback.rb +++ b/lib/jetpants/callback.rb @@ -2,16 +2,16 @@ module Jetpants # Exception class used to halt further processing in callback chain. See # description in CallbackHandler. class CallbackAbortError < StandardError; end - + # If you include CallbackHandler as a mix-in, it grants the base class support # for Jetpants callbacks, as defined here: # - # If you invoke a method "foo", Jetpants will first + # If you invoke a method "foo", Jetpants will first # automatically call any "before_foo" methods that exist in the class or its # superclasses. You can even define multiple methods named before_foo (in the # same class!) and they will each be called. In other words, Jetpants # callbacks "stack" instead of overriding each other. - # + # # After calling any/all before_foo methods, the foo method is called, followed # by all after_foo methods in the same manner. # @@ -37,7 +37,7 @@ def method_added(name) # Intercept before_* and after_* methods and create corresponding Callback objects if name.to_s.start_with? 'before_', 'after_' Callback.new self, name.to_s.split('_', 2)[1].to_sym, name.to_s.split('_', 2)[0].to_sym, @callback_priority - + # Intercept redefinitions of methods we've already wrapped, so we can # wrap them again elsif Callback.wrapped? self, name @@ -45,31 +45,31 @@ def method_added(name) end end end - + # Default priority for callbacks is 100 @callback_priority = 100 end end end - + # Generic representation of a before-method or after-method callback. # Used internally by CallbackHandler; you won't need to interact with Callback directly. 
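  # To make the CallbackHandler behavior described above concrete, here is a
  # minimal usage sketch (the Widget class is hypothetical, purely for
  # illustration; it is not part of Jetpants):
  #
  #   class Widget
  #     include CallbackHandler
  #
  #     def deploy
  #       puts 'deploying'
  #     end
  #
  #     # Runs automatically before any call to #deploy
  #     def before_deploy
  #       puts 'about to deploy'
  #     end
  #
  #     # Runs automatically after any call to #deploy
  #     def after_deploy
  #       puts 'deployed'
  #     end
  #   end
  #
  #   Widget.new.deploy
  #   # prints: about to deploy / deploying / deployed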
class Callback @@all_callbacks = {} # hash of class obj -> method_name symbol -> type string -> array of callbacks @@currently_wrapping = {} # hash of class obj -> method_name symbol -> bool - + attr_reader :for_class # class object attr_reader :method_name # symbol containing method name (the one being callback-wrapped) attr_reader :type # :before or :after attr_reader :priority # high numbers get triggered first attr_reader :my_alias # method name alias OF THE CALLBACK - + def initialize(for_class, method_name, type=:after, priority=100) @for_class = for_class @method_name = method_name @type = type @priority = priority - + @@all_callbacks[for_class] ||= {} @@all_callbacks[for_class][method_name] ||= {} already_wrapped = Callback.wrapped?(for_class, method_name) @@ -82,16 +82,16 @@ def initialize(for_class, method_name, type=:after, priority=100) alias_method new_name, old_name end Callback.wrap_method(for_class, method_name) unless already_wrapped - + @@all_callbacks[for_class][method_name][type] << self end - + def self.wrap_method(for_class, method_name) @@currently_wrapping[for_class] ||= {} @@currently_wrapping[for_class][method_name] ||= false return if @@currently_wrapping[for_class][method_name] # prevent infinite recursion from the alias_method call @@currently_wrapping[for_class][method_name] = true - + for_class.class_eval do alias_method "#{method_name}_without_callbacks".to_sym, method_name define_method method_name do |*args| @@ -110,10 +110,10 @@ def self.wrap_method(for_class, method_name) result end end - + @@currently_wrapping[for_class][method_name] = false end - + def self.trigger(for_object, method_name, type, *args) my_callbacks = [] for_object.class.ancestors.each do |for_class| @@ -124,7 +124,7 @@ def self.trigger(for_object, method_name, type, *args) my_callbacks.sort_by! {|c| -1 * c.priority} my_callbacks.each {|c| for_object.send(c.my_alias, *args)} end - + def self.wrapped?(for_class, method_name) return false unless @@all_callbacks[for_class] && @@all_callbacks[for_class][method_name] @@all_callbacks[for_class][method_name].count > 0 diff --git a/lib/jetpants/db/import_export.rb b/lib/jetpants/db/import_export.rb index 14bccd8..81aecd6 100644 --- a/lib/jetpants/db/import_export.rb +++ b/lib/jetpants/db/import_export.rb @@ -356,7 +356,7 @@ def rebuild!(tables=false, min_id=false, max_id=false) export_schemata tables export_data tables, min_id, max_id - + # We need to be paranoid and confirm nothing else has restarted mysql (re-enabling binary logging) # out-of-band. Besides the obvious slowness of importing things while binlogging, this is outright # dangerous if GTID is in-use. So we check before every method or statement that does writes @@ -365,7 +365,7 @@ def rebuild!(tables=false, min_id=false, max_id=false) import_schemata! if respond_to? :alter_schemata raise "Binary logging has somehow been re-enabled. Must abort for safety!" if binary_log_enabled? - alter_schemata + alter_schemata # re-retrieve table metadata in the case that we alter the tables pool.probe_tables tables = pool.tables.select{|t| pool.tables.map(&:name).include?(t.name)} @@ -430,7 +430,7 @@ def clone_to!(*targets) }.reject { |s| Jetpants.mysql_clone_ignore.include? s } - + # If using GTID, we need to remember the source's gtid_executed from the point-in-time of the copy. # We also need to ensure that the targets match the same gtid-related variables as the source. 
# Ordinarily this should be managed by my.cnf, but while a fleet-wide GTID rollout is still underway, @@ -467,7 +467,7 @@ def clone_to!(*targets) t.start_query_killer t.enable_monitoring end - + # If the source is using GTID, we need to set the targets' gtid_purged to equal the # source's gtid_executed. This is needed because we do not copy binlogs, which are # the source of truth for gtid_purged and gtid_executed. (Note, setting gtid_purged diff --git a/lib/jetpants/db/privileges.rb b/lib/jetpants/db/privileges.rb index c6a4b90..3398083 100644 --- a/lib/jetpants/db/privileges.rb +++ b/lib/jetpants/db/privileges.rb @@ -1,5 +1,5 @@ module Jetpants - + #-- # User / Grant manipulation methods ########################################## # @@ -16,7 +16,7 @@ module Jetpants # Overall, best practice in MySQL is only manage grants locally on each node, # never via replication. #++ - + class DB # Create a MySQL user. If you omit parameters, the defaults from Jetpants' # configuration will be used instead. Does not automatically grant any @@ -36,7 +36,7 @@ def create_user(username=false, password=false) output "Created user '#{username}'@'#{ip}' (only on this node -- not binlogged)" end end - + # Drops a user. SEE NOTE ABOVE RE: ALWAYS SKIPS BINLOG def drop_user(username=false) username ||= app_credentials[:user] @@ -51,7 +51,7 @@ def drop_user(username=false) output "Dropped user '#{username}'@'#{ip}' (only on this node -- not binlogged)" end end - + # Grants privileges to the given username for the specified database. # Pass in privileges as additional params, each as strings. # You may omit parameters to use the defaults in the Jetpants config file. @@ -59,7 +59,7 @@ def drop_user(username=false) def grant_privileges(username=false, database=false, *privileges) grant_or_revoke_privileges('GRANT', username, database, privileges) end - + # Revokes privileges from the given username for the specified database. # Pass in privileges as additional params, each as strings. # You may omit parameters to use the defaults in the Jetpants config file. @@ -67,7 +67,7 @@ def grant_privileges(username=false, database=false, *privileges) def revoke_privileges(username=false, database=false, *privileges) grant_or_revoke_privileges('REVOKE', username, database, privileges) end - + # Helper method that can do grants or revokes. # SEE NOTE ABOVE RE: ALWAYS SKIPS BINLOG def grant_or_revoke_privileges(statement, username, database, privileges) @@ -77,7 +77,7 @@ def grant_or_revoke_privileges(statement, username, database, privileges) privileges = Jetpants.mysql_grant_privs if privileges.empty? privileges = privileges.join(',') commands = ['SET SESSION sql_log_bin = 0'] - + Jetpants.mysql_grant_ips.each do |ip| commands << "#{statement} #{privileges} ON #{database}.* #{preposition} '#{username}'@'#{ip}'" end @@ -90,8 +90,8 @@ def grant_or_revoke_privileges(statement, username, database, privileges) output "#{verb} privileges #{preposition.downcase} '#{username}'@'#{ip}' #{target_db}: #{privileges.downcase} (only on this node -- not binlogged)" end end - - # Disables access to a DB by the application user, and sets the DB to + + # Disables access to a DB by the application user, and sets the DB to # read-only. Useful when decommissioning instances from a shard that's # been split, or a former slave that's been permanently removed from the pool def revoke_all_access! @@ -99,7 +99,7 @@ def revoke_all_access! enable_read_only! 
drop_user(user_name) # never written to binlog, so no risk of it replicating end - + # Enables global read-only mode on the database. def enable_read_only! if read_only? @@ -111,7 +111,7 @@ def enable_read_only! read_only? end end - + # Disables global read-only mode on the database. def disable_read_only! if read_only? @@ -123,7 +123,7 @@ def disable_read_only! true end end - + # Generate and return a random string consisting of uppercase # letters, lowercase letters, and digits. def self.random_password(length=50) @@ -147,6 +147,6 @@ def override_mysql_grant_ips(ips) end Jetpants.mysql_grant_ips = ip_holder end - + end -end \ No newline at end of file +end diff --git a/lib/jetpants/db/schema.rb b/lib/jetpants/db/schema.rb index 43fe2f0..0f9856c 100644 --- a/lib/jetpants/db/schema.rb +++ b/lib/jetpants/db/schema.rb @@ -21,7 +21,7 @@ def detect_table_schema(table_name) 'create_table' => create_statement, 'indexes' => connection.indexes(table_name), 'pool' => pool, - 'columns' => connection.schema(table_name).map{|schema| schema[0]} + 'columns' => connection.schema(table_name).map{|schema| schema[0]} } if pool.is_a? Shard diff --git a/plugins/capacity_plan/capacity_plan.rb b/plugins/capacity_plan/capacity_plan.rb index 8b5a1cf..ab88af7 100644 --- a/plugins/capacity_plan/capacity_plan.rb +++ b/plugins/capacity_plan/capacity_plan.rb @@ -15,7 +15,7 @@ def initialize @@db.connect(user: Jetpants.plugins['capacity_plan']['user'], schema: Jetpants.plugins['capacity_plan']['schema'], pass: Jetpants.plugins['capacity_plan']['pass']) end - ## grab snapshot of data and store it in mysql + ## grab snapshot of data and store it in mysql def snapshot storage_sizes = {} timestamp = Time.now.to_i @@ -95,9 +95,9 @@ def plan(email=false) end critical = mount_stats_storage[name]['total'].to_f * Jetpants.plugins['capacity_plan']['critical_mount'] if (per_day(bytes_to_gb(growth_rate))) <= 0 || ((critical - mount_stats_storage[name]['used'].to_f)/ per_day(growth_rate)) > 999 - output += "%30s %20.2f %10.2f %10s\n" % [name, bytes_to_gb(mount_stats_storage[name]['used'].to_f), (per_day(bytes_to_gb(growth_rate+0))), 'N/A'] + output += "%30s %20.2f %10.2f %10s\n" % [name, bytes_to_gb(mount_stats_storage[name]['used'].to_f), (per_day(bytes_to_gb(growth_rate+0))), 'N/A'] else - output += "%30s %20.2f %10.2f %10.2f\n" % [name, bytes_to_gb(mount_stats_storage[name]['used'].to_f), (per_day(bytes_to_gb(growth_rate+0))),((critical - mount_stats_storage[name]['used'].to_f)/ per_day(growth_rate))] + output += "%30s %20.2f %10.2f %10.2f\n" % [name, bytes_to_gb(mount_stats_storage[name]['used'].to_f), (per_day(bytes_to_gb(growth_rate+0))),((critical - mount_stats_storage[name]['used'].to_f)/ per_day(growth_rate))] end end @@ -261,7 +261,7 @@ def segmentify(hash, timeperiod) new_hash[name][before_timestamp.to_s+"-"+last_timestamp.to_s] = (keeper[0]['used'].to_f - last_value['used'].to_f )/(before_timestamp.to_f - last_timestamp.to_f) end end - + new_hash end @@ -269,7 +269,7 @@ def segmentify(hash, timeperiod) # you need to have a method in Jetpants.topology.machine_status_counts to get # your machine types and states def get_hardware_stats - + #see if function exists return '' unless Jetpants.topology.respond_to? 
:machine_status_counts @@ -325,7 +325,7 @@ def outliers from_per = {} now_block = get_history_block(name, start_time, start_time + block_sizes) - unless now_block.count == 0 + unless now_block.count == 0 now_per = (now_block.first[1]['used'].to_f - now_block.values.last['used'].to_f)/(now_block.first[0].to_f - now_block.keys.last.to_f) @@ -373,7 +373,7 @@ def outliers output_buffer = '' counter = 0 counter_time = 0 - end + end output @@ -413,15 +413,15 @@ def snapshot_autoinc(timestamp) pools_list = Jetpants.topology.pools.reject { |p| ignore_list.include? p } end query = %Q| - SELECT * + SELECT * FROM INFORMATION_SCHEMA.COLUMNS - WHERE TABLE_SCHEMA NOT IN - ('mysql', 'information_schema', 'performance_schema', 'test') AND + WHERE TABLE_SCHEMA NOT IN + ('mysql', 'information_schema', 'performance_schema', 'test') AND LOCATE('auto_increment', EXTRA) > 0 | pools_list.each do |p| slave = p.standby_slaves.first - if !slave.nil? + if !slave.nil? slave.query_return_array(query).each do |row| table_name = row[:TABLE_NAME] schema_name = row[:TABLE_SCHEMA] @@ -446,14 +446,14 @@ def snapshot_autoinc(timestamp) def get_autoinc_history(date) auto_inc_history = {} query = %Q| - select + select from_unixtime(timestamp, '%Y-%m-%d'), pool, table_name, column_name, column_type, max_val, data_type_max, - round((max_val / data_type_max), 2) as ratio - from auto_inc_checker - where from_unixtime(timestamp, '%Y-%m-%d') = '#{date}' - group by pool, table_name - order by ratio desc + round((max_val / data_type_max), 2) as ratio + from auto_inc_checker + where from_unixtime(timestamp, '%Y-%m-%d') = '#{date}' + group by pool, table_name + order by ratio desc limit 5 | diff --git a/plugins/jetpants_collins/asset.rb b/plugins/jetpants_collins/asset.rb index 4229b98..26bb950 100644 --- a/plugins/jetpants_collins/asset.rb +++ b/plugins/jetpants_collins/asset.rb @@ -2,14 +2,14 @@ module Collins class Asset - + # Convert a Collins::Asset to a Jetpants::DB. Requires asset TYPE to be SERVER_NODE. def to_db raise "Can only call to_db on SERVER_NODE assets, but #{self} has type #{type}" unless type.upcase == 'SERVER_NODE' backend_ip_address.to_db end - - + + # Convert a Collins::Asset to a Jetpants::Host. Requires asset TYPE to be SERVER_NODE. def to_host raise "Can only call to_host on SERVER_NODE assets, but #{self} has type #{type}" unless type.upcase == 'SERVER_NODE' @@ -31,7 +31,7 @@ def to_pool raise "Can only call to_pool on CONFIGURATION assets, but #{self} has type #{type}" unless type.upcase == 'CONFIGURATION' raise "Unknown primary role #{primary_role} for configuration asset #{self}" unless ['MYSQL_POOL', 'MYSQL_SHARD'].include?(primary_role.upcase) raise "No pool attribute set on asset #{self}" unless pool && pool.length > 0 - + # if this node is iniitalizing we know there will be no server assets # associated with it if !shard_state.nil? and shard_state.upcase == "INITIALIZING" @@ -47,11 +47,11 @@ def to_pool master_assets = results if results.count > 0 end puts "WARNING: multiple masters found for pool #{pool}; using first match" if master_assets.count > 1 - + if master_assets.count == 0 puts "WARNING: no masters found for pool #{pool}; ignoring pool entirely" result = nil - + elsif primary_role.upcase == 'MYSQL_POOL' result = Jetpants::Pool.new(pool.downcase, master_assets.first.to_db) if aliases @@ -65,26 +65,26 @@ def to_pool weight = asset.slave_weight && asset.slave_weight.to_i > 0 ? 
asset.slave_weight.to_i : 100 result.has_active_slave(asset.to_db, weight) end - + elsif primary_role.upcase == 'MYSQL_SHARD' - result = Jetpants::Shard.new(shard_min_id.to_i, - shard_max_id == 'INFINITY' ? 'INFINITY' : shard_max_id.to_i, - master_assets.first.to_db, + result = Jetpants::Shard.new(shard_min_id.to_i, + shard_max_id == 'INFINITY' ? 'INFINITY' : shard_max_id.to_i, + master_assets.first.to_db, shard_state.downcase.to_sym, shard_pool) - + # We'll need to set up the parent/child relationship if a shard split is in progress, # BUT we need to wait to do that later since the shards may have been returned by # Collins out-of-order, so the parent shard object might not exist yet. # For now we just remember the NAME of the parent shard. result.has_parent = shard_parent if shard_parent - + else raise "Unknown configuration asset primary role #{primary_role} for asset #{self}" end - + result end - + end end diff --git a/plugins/jetpants_collins/db.rb b/plugins/jetpants_collins/db.rb index db13cd8..553d4eb 100644 --- a/plugins/jetpants_collins/db.rb +++ b/plugins/jetpants_collins/db.rb @@ -2,11 +2,11 @@ module Jetpants class DB - + ##### JETCOLLINS MIX-IN #################################################### - + include Plugin::JetCollins - + collins_attr_accessor :slave_weight, :nodeclass, :nobackup # Because we only support 1 mysql instance per machine for now, we can just @@ -14,15 +14,15 @@ class DB def collins_asset @host.collins_asset end - - + + ##### METHOD OVERRIDES ##################################################### - + # Add an actual collins check to confirm a machine is a standby def is_standby? !(running?) || (is_slave? && !taking_connections? && collins_secondary_role == 'standby_slave') end - + # Treat any node outside of current data center as being for backups. # This prevents inadvertent cross-data-center master promotion. def for_backups? @@ -55,7 +55,7 @@ def clone_settings_to!(*targets) end ##### CALLBACKS ############################################################ - + # Determine master from Collins if machine is unreachable or MySQL isn't running. def after_probe_master unless @running @@ -66,7 +66,7 @@ def after_probe_master @master = pool.master if pool end end - + # We completely ignore cross-data-center master unless inter_dc_mode is enabled. # This may change in a future Jetpants release, once we support tiered replication more cleanly. if @master && @master.in_remote_datacenter? && !Jetpants::Plugin::JetCollins.inter_dc_mode? @@ -76,20 +76,20 @@ def after_probe_master in_remote_datacenter? # just calling to cache for current node, before we probe its slaves, so that its slaves don't need to query Collins end end - + # Determine slaves from Collins if machine is unreachable or MySQL isn't running def after_probe_slaves # If this machine has a master AND has slaves of its own AND is in another data center, # ignore its slaves entirely unless inter_dc_mode is enabled. # This may change in a future Jetpants release, once we support tiered replication more cleanly. @slaves = [] if @running && @master && @slaves.count > 0 && in_remote_datacenter? && !Jetpants::Plugin::JetCollins.inter_dc_mode? - + unless @running p = Jetpants.topology.pool(self) @slaves = (p ? 
p.slaves_according_to_collins : []) end end - + # After changing the status of a node, clear its list of spare-node-related # validation errors, so that we will re-probe when necessary def after_collins_status=(value) diff --git a/plugins/jetpants_collins/host.rb b/plugins/jetpants_collins/host.rb index a188ba8..be88b2b 100644 --- a/plugins/jetpants_collins/host.rb +++ b/plugins/jetpants_collins/host.rb @@ -2,11 +2,11 @@ module Jetpants class Host - + ##### JETCOLLINS MIX-IN #################################################### - + include Plugin::JetCollins - + def collins_asset # try IP first; failing that, try hostname selector = {ip_address: ip, details: true} @@ -18,7 +18,7 @@ def collins_asset selector[:remoteLookup] = true if Jetpants.plugins['jetpants_collins']['remote_lookup'] assets = Plugin::JetCollins.find selector, true end - + raise "Multiple assets found for #{self}" if assets.count > 1 if ! assets || assets.count == 0 output "WARNING: no Collins assets found for this host" @@ -27,7 +27,7 @@ def collins_asset assets.first end end - + # Returns which datacenter this host is in. Only a getter, intentionally no setter. def collins_location return @collins_location if @collins_location @@ -36,6 +36,6 @@ def collins_location @collins_location.upcase! @collins_location end - + end end diff --git a/plugins/jetpants_collins/pool.rb b/plugins/jetpants_collins/pool.rb index bac977f..a6edc47 100644 --- a/plugins/jetpants_collins/pool.rb +++ b/plugins/jetpants_collins/pool.rb @@ -2,23 +2,23 @@ module Jetpants class Pool - + ##### JETCOLLINS MIX-IN #################################################### - + include Plugin::JetCollins - + # Used at startup time, to keep track of parent/child shard relationships attr_accessor :has_parent - + # Collins accessors for configuration asset metadata collins_attr_accessor :slave_pool_name, :aliases, :master_read_weight, :config_sort_order - + # Returns a Collins::Asset for this pool. Can optionally create one if not found. def collins_asset(create_if_missing=false) selector = { operation: 'and', details: true, - type: 'CONFIGURATION', + type: 'CONFIGURATION', primary_role: 'MYSQL_POOL', pool: "^#{@name.upcase}$", status: 'Allocated', @@ -26,13 +26,13 @@ def collins_asset(create_if_missing=false) selector[:remoteLookup] = true if Jetpants.plugins['jetpants_collins']['remote_lookup'] results = Plugin::JetCollins.find selector, !create_if_missing - + # If we got back multiple results, try ignoring the remote datacenter ones if results.count > 1 filtered_results = results.select {|a| a.location.nil? || a.location.upcase == Plugin::JetCollins.datacenter} results = filtered_results if filtered_results.count > 0 end - + if results.count > 1 raise "Multiple configuration assets found for pool #{name}" elsif results.count == 0 && create_if_missing @@ -45,8 +45,8 @@ def collins_asset(create_if_missing=false) collins_set asset: asset, status: 'Allocated' end - collins_set asset: asset, - primary_role: 'MYSQL_POOL', + collins_set asset: asset, + primary_role: 'MYSQL_POOL', pool: @name.upcase Plugin::JetCollins.get new_tag elsif results.count == 0 && !create_if_missing @@ -55,10 +55,10 @@ def collins_asset(create_if_missing=false) results.first end end - - + + ##### METHOD OVERRIDES ##################################################### - + # Examines the current state of the pool (as known to Jetpants) and updates # Collins to reflect this, in terms of the pool's configuration asset as # well as the individual hosts. 
@@ -74,7 +74,7 @@ def sync_configuration db.collins_pool = @name end @master.collins_secondary_role = 'MASTER' - slaves(:active).each do |db| + slaves(:active).each do |db| db.collins_secondary_role = 'ACTIVE_SLAVE' weight = @active_slave_weights[db] db.collins_slave_weight = (weight == 100 ? '' : weight) @@ -86,7 +86,7 @@ def sync_configuration @claimed_nodes = [] true end - + # Return the count of Allocated:RUNNING slaves def running_slaves(secondary_role=false) slaves.select { |slave| @@ -106,10 +106,10 @@ def active_slaves @active_slave_weights.keys end end - - + + ##### CALLBACKS ############################################################ - + # Pushes slave removal to Collins. (Normally this type of logic is handled by # Pool#sync_configuration, but that won't handle this case, since # sync_configuration only updates hosts still in the pool.) @@ -118,12 +118,12 @@ def after_remove_slave!(slave_db) current_status = (slave_db.collins_status || '').downcase slave_db.collins_status = 'Unallocated' unless current_status == 'maintenance' end - + # If the demoted master was offline, record some info in Collins, otherwise # there will be 2 masters listed def after_master_promotion!(promoted, enslave_old_master=true) Jetpants.topology.clear_asset_cache - + # Find the master asset(s) for this pool, filtering down to only current datacenter assets = Jetpants.topology.server_node_assets(@name, :master) assets.reject! {|a| a.location && a.location.upcase != Plugin::JetCollins.datacenter} @@ -138,7 +138,7 @@ def after_master_promotion!(promoted, enslave_old_master=true) end end end - + # Clean up any slaves that are no longer slaving (again only looking at current datacenter) assets = Jetpants.topology.server_node_assets(@name, :slave) assets.reject! {|a| a.location && a.location.upcase != Plugin::JetCollins.datacenter} @@ -151,10 +151,10 @@ def after_master_promotion!(promoted, enslave_old_master=true) end end end - - + + ##### NEW METHODS ########################################################## - + # Returns the pool's creation time (as a unix timestamp) according to Collins. # (note: may be off by a few hours until https://github.com/tumblr/collins/issues/80 # is resolved) @@ -163,7 +163,7 @@ def after_master_promotion!(promoted, enslave_old_master=true) def collins_creation_timestamp collins_asset.created.to_time.to_i end - + # Called from DB#after_probe_master and DB#after_probe_slave for machines # that are unreachable via SSH, or reachable but MySQL isn't running. 
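    # For example, DB#after_probe_slaves (in plugins/jetpants_collins/db.rb)
    # falls back to this when the node isn't running:
    #   p = Jetpants.topology.pool(self)
    #   @slaves = (p ? p.slaves_according_to_collins : [])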
def slaves_according_to_collins diff --git a/plugins/jetpants_collins/shard.rb b/plugins/jetpants_collins/shard.rb index ea3a8e2..cb9d547 100644 --- a/plugins/jetpants_collins/shard.rb +++ b/plugins/jetpants_collins/shard.rb @@ -2,34 +2,34 @@ module Jetpants class Shard < Pool - + ##### JETCOLLINS MIX-IN #################################################### - + include Plugin::JetCollins - + collins_attr_accessor :shard_min_id, :shard_max_id, :shard_state, :shard_parent, :shard_pool - + # Returns a Collins::Asset for this pool def collins_asset(create_if_missing=false) selector = { operation: 'and', details: true, - type: 'CONFIGURATION', + type: 'CONFIGURATION', primary_role: '^MYSQL_SHARD$', shard_min_id: "^#{@min_id}$", shard_max_id: "^#{@max_id}$", shard_pool: "^#{@shard_pool.name}$" } selector[:remoteLookup] = true if Jetpants.plugins['jetpants_collins']['remote_lookup'] - + results = Plugin::JetCollins.find selector, !create_if_missing - + # If we got back multiple results, try ignoring the remote datacenter ones if results.count > 1 filtered_results = results.select {|a| a.location.nil? || a.location.upcase == Plugin::JetCollins.datacenter} results = filtered_results if filtered_results.count > 0 end - + if results.count > 1 raise "Multiple configuration assets found for pool #{name}" elsif results.count == 0 && create_if_missing @@ -42,8 +42,8 @@ def collins_asset(create_if_missing=false) collins_set asset: asset, status: 'Allocated' end - collins_set asset: asset, - primary_role: 'MYSQL_SHARD', + collins_set asset: asset, + primary_role: 'MYSQL_SHARD', pool: @name.upcase, shard_min_id: @min_id, shard_max_id: @max_id, @@ -55,10 +55,10 @@ def collins_asset(create_if_missing=false) results.first end end - - + + ##### METHOD OVERRIDES ##################################################### - + # Examines the current state of the pool (as known to Jetpants) and updates # Collins to reflect this, in terms of the pool's configuration asset as # well as the individual hosts. @@ -93,23 +93,23 @@ def sync_configuration standby_slaves.each {|db| db.collins_secondary_role = 'STANDBY_SLAVE'} backup_slaves.each {|db| db.collins_secondary_role = 'BACKUP_SLAVE'} end - + # handle lockless master migration situations if @state == :child && master.master && !@parent to_be_ejected_master = master.master to_be_ejected_master.collins_secondary_role = :standby_slave # not accurate, but no better option for now end - + true end - - + + ##### CALLBACKS ############################################################ - + # After altering the state of a shard, sync the change to Collins immediately. def after_state=(value) sync_configuration end - + end end diff --git a/plugins/jetpants_collins/topology.rb b/plugins/jetpants_collins/topology.rb index 081494b..88ef80f 100644 --- a/plugins/jetpants_collins/topology.rb +++ b/plugins/jetpants_collins/topology.rb @@ -7,7 +7,7 @@ module Jetpants class Topology ##### METHODS THAT OTHER PLUGINS CAN OVERRIDE ############################## - + # IMPORTANT NOTE # This plugin does NOT implement write_config, since this format of # your app configuration file entirely depends on your web framework! @@ -16,14 +16,14 @@ class Topology # approach is to add serialization methods to Pool and Shard, and call it # on each @pool, writing out to a file or pinging a config service, depending # on whatever your application uses. - - + + # Handles extra options for querying spare nodes. 
Takes a Collins selector # hash and an options hash, and returns a potentially-modified Collins # selector hash. # The default implementation here implements no special logic. Custom plugins # (loaded AFTER jetpants_collins is loaded) can override this method to - # manipulate the selector; see commented-out example below. + # manipulate the selector; see commented-out example below. def process_spare_selector_options(selector, options) # If you wanted to support an option of :role, and map this to the Collins # SECONDARY_ROLE attribute, you could implement this via: @@ -31,11 +31,11 @@ def process_spare_selector_options(selector, options) # This could be useful if, for example, you use a different hardware spec # for masters vs slaves. (doing so isn't really recommended, which is why # we omit this logic by default.) - + # return the selector selector end - + ##### METHOD OVERRIDES ##################################################### def load_shard_pools @@ -119,7 +119,7 @@ def claim_spares(count, options={}) end if(compare_pool && claimed_dbs.select{|db| db.proximity_score(compare_pool) > 0}.count > 0) - compare_pool.output "Unable to claim #{count} nodes with an ideal proximity score!" + compare_pool.output "Unable to claim #{count} nodes with an ideal proximity score!" end claimed_dbs @@ -160,7 +160,7 @@ def db_location_report(shards_only = nil) # SECONDARY_ROLE values in Collins. def server_node_assets(pool_name=false, *roles) roles = normalize_roles(roles) if roles.count > 0 - + # Check for previously-cached result. (Only usable if a pool_name supplied.) if pool_name && @pool_role_assets[pool_name] if roles.count > 0 && roles.all? {|r| @pool_role_assets[pool_name].has_key? r} @@ -169,7 +169,7 @@ def server_node_assets(pool_name=false, *roles) return @pool_role_assets[pool_name].values.flatten end end - + per_page = Jetpants.plugins['jetpants_collins']['selector_page_size'] || 50 selector = { operation: 'and', @@ -185,7 +185,7 @@ def server_node_assets(pool_name=false, *roles) values = roles.map {|r| "secondary_role = ^#{r}$"} selector[:query] += ' AND (' + values.join(' OR ') + ')' end - + assets = [] done = false page = 0 @@ -196,7 +196,7 @@ def server_node_assets(pool_name=false, *roles) # find() apparently alters the selector object now, so we dup it # also force JetCollins to retry requests to the Collins server results = Plugin::JetCollins.find selector.dup, true, page == 0 - done = (results.count < per_page) || (results.count == 0 && page > 0) + done = (results.count < per_page) || (results.count == 0 && page > 0) page += 1 assets.concat(results.select {|a| a.pool}) # filter out any spare nodes, which will have no pool set end @@ -211,23 +211,23 @@ def server_node_assets(pool_name=false, *roles) @pool_role_assets[p] ||= {} roles.each {|r| @pool_role_assets[p][r] = []} end - + # Filter assets.select! 
{|a| a.pool && a.secondary_role && %w(allocated maintenance).include?(a.status.downcase)} - + # Cache assets.each {|a| @pool_role_assets[a.pool.downcase][a.secondary_role.downcase.to_sym] << a} - + # Return assets end - - + + # Returns an array of configuration assets with the supplied primary role(s) def configuration_assets(*primary_roles) raise "Must supply at least one primary_role" if primary_roles.count < 1 per_page = Jetpants.plugins['jetpants_collins']['selector_page_size'] || 50 - + selector = { operation: 'and', details: true, @@ -241,7 +241,7 @@ def configuration_assets(*primary_roles) values = primary_roles.map {|r| "primary_role = ^#{r}$"} selector[:query] += ' AND (' + values.join(' OR ') + ')' end - + selector[:remoteLookup] = true if Jetpants.plugins['jetpants_collins']['remote_lookup'] done = false @@ -256,19 +256,19 @@ def configuration_assets(*primary_roles) page += 1 done = (page_of_results.count < per_page) || (page_of_results.count == 0 && page > 0) end - + # If remote lookup is enabled, remove the remote copy of any pool that exists # in both local and remote datacenters. if Jetpants.plugins['jetpants_collins']['remote_lookup'] dc_pool_map = {Plugin::JetCollins.datacenter => {}} - + assets.each do |a| location = a.location || Plugin::JetCollins.datacenter pool = a.pool ? a.pool.downcase : a.tag[6..-1].downcase # if no pool, strip 'mysql-' off front and use that dc_pool_map[location] ||= {} dc_pool_map[location][pool] = a end - + # Grab everything from current DC first (higher priority over other # datacenters), then grab everything from remote DCs. final_assets = dc_pool_map[Plugin::JetCollins.datacenter].values @@ -283,8 +283,8 @@ def configuration_assets(*primary_roles) assets end - - + + def clear_asset_cache(pool_name=false) if pool_name @pool_role_assets.delete pool_name @@ -292,10 +292,10 @@ def clear_asset_cache(pool_name=false) @pool_role_assets = {} end end - - + + private - + # Helper method to query Collins for spare DBs. def query_spare_assets(count, options={}) per_page = Jetpants.plugins['jetpants_collins']['selector_page_size'] || 50 @@ -327,9 +327,9 @@ def query_spare_assets(count, options={}) done = (page_of_results.count < per_page) || (page_of_results.count == 0 && page > 0) page += 1 end - + keep_assets = [] - + nodes.map(&:to_db).concurrent_each {|db| db.probe rescue nil} nodes.concurrent_each do |node| db = node.to_db diff --git a/plugins/merge_helper/lib/aggregator.rb b/plugins/merge_helper/lib/aggregator.rb index 893f2df..1f5561f 100644 --- a/plugins/merge_helper/lib/aggregator.rb +++ b/plugins/merge_helper/lib/aggregator.rb @@ -1,5 +1,5 @@ module Jetpants - + class Aggregator < DB include CallbackHandler @@ -49,7 +49,7 @@ def probe_aggregate_nodes @replication_states[aggregate_node] = :paused end end - end + end end def aggregating_for?(node) diff --git a/plugins/merge_helper/lib/commandsuite.rb b/plugins/merge_helper/lib/commandsuite.rb index fe095db..c9b388b 100644 --- a/plugins/merge_helper/lib/commandsuite.rb +++ b/plugins/merge_helper/lib/commandsuite.rb @@ -269,7 +269,7 @@ def merge_shards_cleanup aggregator_instance.pause_all_replication aggregator_instance.remove_all_nodes! combined_shard.master.disable_replication! - + shards_to_merge.each do |shard| shard.master.enable_read_only! 
end diff --git a/plugins/merge_helper/lib/shard.rb b/plugins/merge_helper/lib/shard.rb index e335eb1..ce3009c 100644 --- a/plugins/merge_helper/lib/shard.rb +++ b/plugins/merge_helper/lib/shard.rb @@ -218,7 +218,7 @@ def self.set_up_aggregate_node(shards_to_merge, aggregate_node, new_shard_master # in a non-replicating state for as little time as possible data_nodes.concurrent_map { |db| # load data and inject export counts from earlier for validation - slaves_to_replicate.map { |slave| + slaves_to_replicate.map { |slave| db.inject_counts export_counts[slave] db.import_data tables, slave.pool.min_id, slave.pool.max_id } @@ -233,8 +233,8 @@ def self.set_up_aggregate_node(shards_to_merge, aggregate_node, new_shard_master slaves_to_replicate.each do |slave| aggregate_node.add_node_to_aggregate slave, slave_coords[slave] end - - + + # Set up replication from aggregator to new_master. # We intentionally pass no options to change_master_to, since it's smart enough # to do the right thing (in this case: use aggregator's current coordinates) @@ -242,7 +242,7 @@ def self.set_up_aggregate_node(shards_to_merge, aggregate_node, new_shard_master end def combined_shard - Jetpants.shards(shard_pool.name).select { |shard| ( + Jetpants.shards(shard_pool.name).select { |shard| ( shard.min_id.to_i <= @min_id.to_i \ && shard.max_id.to_i >= @max_id.to_i \ && shard.max_id != 'INFINITY' \ diff --git a/plugins/simple_tracker/lib/commandsuite.rb b/plugins/simple_tracker/lib/commandsuite.rb index 33c1139..9ffbf8e 100644 --- a/plugins/simple_tracker/lib/commandsuite.rb +++ b/plugins/simple_tracker/lib/commandsuite.rb @@ -17,7 +17,7 @@ def add_pool Jetpants.topology.write_config puts 'Be sure to manually register any active read slaves using "jetpants activate_slave"' if p.slaves.count > 0 end - + desc 'add_shard', 'inform the asset tracker about a shard that was not previously tracked' method_option :min_id, :desc => 'Minimum ID of shard to track' method_option :max_id, :desc => 'Maximum ID of shard to track' @@ -32,7 +32,7 @@ def add_shard s.sync_configuration Jetpants.topology.write_config end - + desc 'add_spare', 'inform the asset tracker about a spare node that was not previously tracked' method_option :node, :desc => 'Clean-state node to register as spare -- should be previously untracked' def add_spare diff --git a/plugins/simple_tracker/lib/pool.rb b/plugins/simple_tracker/lib/pool.rb index 5667492..eb44684 100644 --- a/plugins/simple_tracker/lib/pool.rb +++ b/plugins/simple_tracker/lib/pool.rb @@ -1,19 +1,19 @@ module Jetpants class Pool - + ##### METHOD OVERRIDES ##################################################### - + # This actually re-writes ALL the tracker json. With a more dynamic - # asset tracker (something backed by a database, for example) this + # asset tracker (something backed by a database, for example) this # wouldn't be necessary - instead Pool#sync_configuration could just # update the info for the current pool (self) only. def sync_configuration Jetpants.topology.update_tracker_data end - + # If the pool's master hasn't been probed yet, return active_slaves list - # based strictly on what we found in the asset tracker. This is a major - # speed-up at start-up time, especially for tasks that need to iterate + # based strictly on what we found in the asset tracker. This is a major + # speed-up at start-up time, especially for tasks that need to iterate # over all pools' active slaves only, such as Topology#write_config. 
alias :active_slaves_from_probe :active_slaves def active_slaves @@ -23,10 +23,10 @@ def active_slaves @active_slave_weights.keys end end - - + + ##### NEW CLASS-LEVEL METHODS ############################################## - + # Converts a hash (from asset tracker json file) into a Pool. def self.from_hash(h) return nil unless h['master'] @@ -40,12 +40,12 @@ def self.from_hash(h) end p end - - + + ##### NEW METHODS ########################################################## - + # Converts a Pool to a hash, for use in either the internal asset tracker - # json (for_app_config=false) or for use in the application config file yaml + # json (for_app_config=false) or for use in the application config file yaml # (for_app_config=true) def to_hash(for_app_config=false) if for_app_config @@ -55,7 +55,7 @@ def to_hash(for_app_config=false) standby_slaves.map {|db| {'host' => db.to_s, 'role' => 'STANDBY_SLAVE'}} + backup_slaves.map {|db| {'host' => db.to_s, 'role' => 'BACKUP_SLAVE'}} end - + { 'name' => name, 'aliases' => aliases, @@ -65,6 +65,6 @@ def to_hash(for_app_config=false) 'slaves' => slave_data } end - + end end diff --git a/plugins/simple_tracker/lib/shard.rb b/plugins/simple_tracker/lib/shard.rb index d632f91..e743916 100644 --- a/plugins/simple_tracker/lib/shard.rb +++ b/plugins/simple_tracker/lib/shard.rb @@ -49,7 +49,7 @@ def self.assign_relationships(h, all_shards) ##### NEW METHODS ########################################################## # Converts a Shard to a hash, for use in either the internal asset tracker - # json (for_app_config=false) or for use in the application config file yaml + # json (for_app_config=false) or for use in the application config file yaml # (for_app_config=true) def to_hash(for_app_config=false) diff --git a/plugins/simple_tracker/lib/shardpool.rb b/plugins/simple_tracker/lib/shardpool.rb index f46dcc1..72bd67e 100644 --- a/plugins/simple_tracker/lib/shardpool.rb +++ b/plugins/simple_tracker/lib/shardpool.rb @@ -18,12 +18,12 @@ def self.from_hash(h) ##### NEW METHODS ########################################################## # Converts a Shard to a hash, for use in either the internal asset tracker - # json (for_app_config=false) or for use in the application config file yaml + # json (for_app_config=false) or for use in the application config file yaml # (for_app_config=true) def to_hash(for_app_config = true) { shard_pool: @name - } + } end end end diff --git a/plugins/simple_tracker/lib/topology.rb b/plugins/simple_tracker/lib/topology.rb index be89b91..04e1816 100644 --- a/plugins/simple_tracker/lib/topology.rb +++ b/plugins/simple_tracker/lib/topology.rb @@ -1,14 +1,14 @@ module Jetpants class Topology - + def self.tracker @tracker ||= Jetpants::Plugin::SimpleTracker.new @tracker end - + ##### METHOD OVERRIDES ##################################################### - + # Populates @pools by reading asset tracker data def load_pools @@ -71,10 +71,10 @@ def claim_spares(count, options={}) pool.claimed_nodes << db unless pool.claimed_nodes.include? db end end - + dbs end - + def count_spares(options={}) self.class.tracker.spares.count end @@ -83,9 +83,9 @@ def spares(options={}) self.class.tracker.spares.map(&:to_db) end - + ##### NEW METHODS ########################################################## - + # Called by Pool#sync_configuration to update our asset tracker json. # This actually re-writes all the json. 
With a more dynamic asset tracker # (something backed by a database, for example) this wouldn't be necessary - diff --git a/plugins/simple_tracker/simple_tracker.rb b/plugins/simple_tracker/simple_tracker.rb index e37da0a..7ad375f 100644 --- a/plugins/simple_tracker/simple_tracker.rb +++ b/plugins/simple_tracker/simple_tracker.rb @@ -7,28 +7,28 @@ module Jetpants module Plugin - + # The SimpleTracker class just handles the manipulations of the asset JSON file and the application # YAML file. The Jetpants::Topology class is monkeypatched to maintain a single SimpleTracker object, # which it uses to interact with these files. class SimpleTracker # Array of hashes, each containing info from Pool#to_hash attr_accessor :global_pools - + # Array of hashes, each containing info from Shard#to_hash attr_accessor :shards # Array of hashes, each containing info from ShardPool#to_hash attr_accessor :shard_pools - + # Clean state DB nodes that are ready for use. Array of any of the following: # * hashes each containing key 'node'. could expand to include 'role' or other metadata as well, # but currently not supported. # * objects responding to to_db, such as String or Jetpants::DB attr_accessor :spares - + attr_reader :app_config_file_path - + def initialize @tracker_data_file_path = Jetpants.plugins['simple_tracker']['tracker_data_file_path'] || '/etc/jetpants_tracker.json' @app_config_file_path = Jetpants.plugins['simple_tracker']['app_config_file_path'] || '/var/lib/mysite/config/databases.yaml' @@ -38,7 +38,7 @@ def initialize @spares = data['spares'] @shard_pools = data['shard_pools'] end - + def save File.open(@tracker_data_file_path, 'w') do |f| data = {'pools' => @global_pools, 'shards' => @shards, 'spares' => @spares, 'shard_pools' => @shard_pools} @@ -59,17 +59,17 @@ def determine_pool_and_role(ip, port=3306) raise "Unable to find #{ip} among tracked assets" end - + def determine_slaves(ip, port=3306) ip += ":#{port}" - + (@global_pools + @shards).each do |h| next unless h['master'] == ip return h['slaves'].map {|s| s['host'].to_db} end [] # return empty array if not a master end - + end end end diff --git a/plugins/upgrade_helper/commandsuite.rb b/plugins/upgrade_helper/commandsuite.rb index e84c303..11b284c 100644 --- a/plugins/upgrade_helper/commandsuite.rb +++ b/plugins/upgrade_helper/commandsuite.rb @@ -14,7 +14,7 @@ def upgrade_clone_slave source = ask_node('Please enter IP of node to clone from: ', options[:source]) source.master.probe if source.master # fail early if there are any replication issues in this pool describe source - + puts "You may clone to particular IP address(es), or can type \"spare\" to claim a node from the spare pool." target = options[:target] || ask('Please enter comma-separated list of targets (IPs or "spare") to clone to: ') target = 'spare' if target.strip == '' || target.split(',').length == 0 @@ -35,73 +35,73 @@ def upgrade_clone_slave error "target (#{ip}) does not appear to be an IP." end end - + source.start_mysql if ! source.running? error "source (#{source}) is not a standby slave or a backup slave" unless (source.is_standby? || source.for_backups?) - + targets.each do |t| error "target #{t} already has a master; please clear out node (including in asset tracker) before proceeding" if t.master end - + # Claim all the targets from the pool targets.each(&:claim!) 
# Disable fast shutdown on the source source.mysql_root_cmd 'SET GLOBAL innodb_fast_shutdown = 0' - + # Flag the nodes as needing upgrade, which will get triggered when # enslave_siblings restarts them targets.each {|t| t.needs_upgrade = true} - + # Remove ib_lru_dump if present on targets targets.concurrent_each {|t| t.ssh_cmd "rm -rf #{t.mysql_directory}/ib_lru_dump"} - + source.enslave_siblings!(targets) targets.concurrent_each {|t| t.resume_replication; t.catch_up_to_master} source.pool.sync_configuration - + puts "Clone-and-upgrade complete." Jetpants.topology.write_config end - - + + desc 'upgrade_promotion', 'demote and destroy a master running an older version of MySQL' method_option :demote, :desc => 'node to demote' def upgrade_promotion demoted = ask_node 'Please enter the IP address of the node to demote:', options[:demote] demoted.probe - + # This task should not be used for emergency promotions (master failures) # since the regular "jetpants promotion" logic is actually fine in that case. error "Unable to connect to node #{demoted} to demote" unless demoted.running? - + # Before running this task, the pool should already have an extra standby slave, # since we're going to be removing the master from the pool. standby_slaves_needed = demoted.pool(true).slaves_layout[:standby_slave] + 1 error "Only run this task on a pool with 3 standby slaves!" unless demoted.pool(true).standby_slaves.size >= standby_slaves_needed - + # Verify that all nodes except the master are running the same version, and # are higher version than the master unless demoted.slaves.all? {|s| s.version_cmp(demoted.slaves.first) == 0 && s.version_cmp(demoted) > 0} error "This task can only be used when all slaves are running the same version of MySQL," error "and the master's version is older than that of all the slaves." end - + puts inform "Summary of affected pool" inform "Binary log positions and slave lag shown below are just a snapshot taken at the current time." if demoted.running? puts demoted.pool(true).summary(true) puts - + promoted = ask_node 'Please enter the IP address of a standby slave to promote: ' - + error "Node to promote #{promoted} is not a standby slave of node to demote #{demoted}" unless promoted.master == demoted && promoted.role == :standby_slave error "The chosen node cannot be promoted. Please choose another." unless promoted.promotable_to_master?(false) - + inform "Going to DEMOTE AND DESTROY existing master #{demoted} and PROMOTE new master #{promoted}." error "Aborting." unless agree "Proceed? [yes/no]: " - + # Perform the promotion, but without making the old master become a slave of the new master # We then rely on the built-in call to Pool#sync_configuration or Pool#after_master_promotion! # to remove the old master from the pool in the same way it would handle a failed master (which @@ -114,8 +114,8 @@ def self.after_upgrade_promotion 'Deploy the configuration to all machines.', ) end - - + + desc 'shard_upgrade', 'upgrade a shard via four-step lockless process' method_option :min_id, :desc => 'Minimum ID of shard to upgrade' method_option :max_id, :desc => 'Maximum ID of shard to upgrade' @@ -149,12 +149,12 @@ def shard_upgrade 'Wait for writes to stop on the old parent master.', 'Proceed to next step: jetpants shard_upgrade --cleanup', ) - + elsif options[:cleanup] raise 'The --reads, --writes, and --cleanup options are mutually exclusive' if options[:reads] || options[:writes] s = ask_shard_being_upgraded(:cleanup, shard_pool) s.cleanup! 
-
+
    else
      self.class.reminders(
        'This process may take an hour or two. You probably want to run this from a screen session.',
@@ -171,8 +171,8 @@ def shard_upgrade
      )
    end
  end
-
-
+
+
  desc 'checksum_pool', 'Run pt-table-checksum on a pool to verify data consistency after an upgrade of one slave'
  method_option :pool, :desc => 'name of pool'
  method_option :no_check_plan, :desc => 'sets --nocheck_plan option in pt-table-checksum', :type => :boolean
@@ -195,8 +195,8 @@ def checksum_pool
    pool.checksum_tables checksum_options
  end
-
-
+
+
  desc 'check_pool_queries', 'Runs pt-upgrade on a pool to verify query performance and results between different MySQL versions'
  method_option :pool, :desc => 'name of pool'
  method_option :dumptime, :desc => 'number of seconds of tcpdump data to consider'
@@ -210,11 +210,11 @@ def check_pool_queries
    machines ||= []
    gather_machine = options[:gather_machine].to_db if options[:gather_machine]
    gather_machine ||= nil
-
+
    pool = Jetpants.topology.pool(pool_name) or raise "Pool #{pool_name} does not exist"
    pool.collect_and_compare_queries!(dump_time, *machines)
  end
-
+
  no_tasks do
    def ask_shard_being_upgraded(stage = :prep, shard_pool = nil)
      shards_being_upgraded = Jetpants.shards(shard_pool).select {|s| [:child, :needs_cleanup].include?(s.state) && !s.parent && s.master.master}
@@ -239,6 +239,6 @@ def ask_shard_being_upgraded(stage = :prep, shard_pool = nil)
      s
    end
  end
-
+
 end
end
diff --git a/plugins/upgrade_helper/host.rb b/plugins/upgrade_helper/host.rb
index 6c2cae0..546abba 100644
--- a/plugins/upgrade_helper/host.rb
+++ b/plugins/upgrade_helper/host.rb
@@ -1,8 +1,8 @@
 module Jetpants
  class Host
-
+
    ##### NEW METHODS ##########################################################
-
+
    # Converts tcpdump output into slowlog format using pt-query-digest. Requires that
    # pt-query-digest is installed and in root's path. Returns the full path to the
    # slowlog. Does not delete or remove the tcpdump output file.
@@ -24,7 +24,7 @@ def dumpfile_to_slowlog(tcpdump_output_file_path, delete_tcpdumpfile=true)
      slowlog_file_path = filter_slowlog(slowlog_file_path)
      slowlog_file_path
    end
-
+
    # Perform any slowlog filtering eg. removing SELECT UNIX_TIMESTAMP() like queries
    def filter_slowlog(slowlog_file_path)
      slowlog_file_path
diff --git a/plugins/upgrade_helper/pool.rb b/plugins/upgrade_helper/pool.rb
index a3fc6d2..6e5d6de 100644
--- a/plugins/upgrade_helper/pool.rb
+++ b/plugins/upgrade_helper/pool.rb
@@ -3,7 +3,7 @@ module Jetpants
  class Pool
    collins_attr_accessor :checksum_running
-
+
    # Runs pt-table-checksum on the pool.
    # Returns true if no problems found, false otherwise.
    # If problems were found, the 'checksums' table will be
@@ -12,7 +12,7 @@ def checksum_tables options={}
      schema = master.app_schema
      success = false
      output_lines = []
-
+
      # check if already running, or a previous run died
      previous_run = collins_checksum_running
      previous_run = nil if previous_run == ''
@@ -24,13 +24,13 @@ def checksum_tables options={}
        raise "Checksum already in progress from #{previous_host}, pid=#{previous_pid}" if still_running
        output "Previous failed run detected, will use --resume parameter"
      end
-
+
      # Determine what to pass to --max-load
      master.output "Polling for normal max threads_running, please wait"
      max_threads_running = master.max_threads_running
      limit_threads_running = [(max_threads_running * 1.2).ceil, 20].max
      master.output "Found max threads_running=#{max_threads_running}, will use limit of #{limit_threads_running}"
-
+
      # Operate with a temporary user that has elevated permissions
      master.with_pt_checksum_user do |username, password|
        # Build command line
@@ -53,19 +53,19 @@ def checksum_tables options={}
        command_line << '--resume' if previous_run
        command_line = command_line.join ' '
-
+
        # Spawn the process
        Open3.popen3(command_line) do |stdin, stdout, stderr, wait_thread|
          exit_code = nil
          pid = wait_thread.pid
          puts "Running pt-table-checksum targetting #{master}, pid on Jetpants host is #{pid}"
-
+
          self.collins_checksum_running = {
            'from_host' => Host.local.ip,
            'from_pid' => pid,
            'timestamp' => Time.now.to_i,
          }.to_json
-
+
          # Display STDERR output in real-time, via a separate thread
          Thread.new do
            begin
@@ -74,7 +74,7 @@ def checksum_tables options={}
              nil
            end
          end
-
+
          # Capture STDOUT and buffer it; since this is the main thread, also
          # watch out for broken pipe or ctrl-c
          begin
@@ -84,7 +84,7 @@ def checksum_tables options={}
            puts "Caught exception #{ex.message}"
            exit_code = 130 # by unix convention, return 128 + SIGINT
          end
-
+
          # Dump out stdout: first anything we buffered on our end, plus anything
          # that Perl or the OS had buffered on its end
          puts
@@ -93,10 +93,10 @@ def checksum_tables options={}
            stdout.each {|line| puts line} rescue nil
          end
          puts
-
+
          puts "Checksum completed with exit code #{exit_code}.\n"
          success = (exit_code == 0)
-
+
          # Run again with --replicate-check-only to display ALL diffs, including ones from
          # prior runs of the tool.
          puts 'Verifying all results via --replicate-check-only...'
@@ -109,7 +109,7 @@ def checksum_tables options={}
            puts output
            success = false
          end
-
+
          # Drop the checksums table, but only if there were no diffs
          if success
            output "Dropping table #{schema}.checksums..."
@@ -127,7 +127,7 @@ def checksum_tables options={}
      end # with_pt_checksum_user
      success
    end
-
+
    def display_buffer_pool_hit_rate(nodes=nil)
      return if nodes.nil?
      nodes.each { |node|
@@ -155,7 +155,7 @@ def compare_queries(slowlog_path, silent_run_first, *compare_nodes)
      pt_upgrade_version = `pt-upgrade --version`.to_s.split(' ').last.chomp rescue '0.0.0'
      raise "pt-upgrade executable is not available on the host" unless $?.exitstatus == 1
-
+
      # We need to create a temporary SUPER user on the nodes to compare
      # Also attempt to silence warning 1592 about unsafe-for-replication statements if
      # using Percona Server 5.5.10+ which supports this.
@@ -171,7 +171,7 @@ def compare_queries(slowlog_path, silent_run_first, *compare_nodes)
        node.grant_privileges username, 'percona_schema', 'ALL PRIVILEGES'
        node.mysql_root_cmd "SET SESSION sql_log_bin = 0; USE percona_schema;CREATE TABLE IF NOT EXISTS pt_upgrade ( id INT NOT NULL PRIMARY KEY );"
        end
-
+
        # We only want to try this if (a) the node supports log_warnings_suppress,
        # and (b) the node isn't already suppressing warning 1592
        if node.global_variables[:log_warnings_suppress] == ''
@@ -179,10 +179,10 @@ def compare_queries(slowlog_path, silent_run_first, *compare_nodes)
          remove_suppress_1592 << node
        end
      end
-
+
      node_text = compare_nodes.map {|s| s.to_s + ' (v' + s.normalized_version(3) + ')'}.join ' vs '
      dsn_text = compare_nodes.map {|n| "h=#{n.ip},P=#{n.port},u=#{username},p=#{password},D=#{n.app_schema}"}.join ' '
-
+
      display_buffer_pool_hit_rate(compare_nodes)
      # Do silent run if requested (to populate buffer pools)
@@ -198,7 +198,7 @@ def compare_queries(slowlog_path, silent_run_first, *compare_nodes)
        puts
        puts
      end
-
+
      display_buffer_pool_hit_rate(compare_nodes)
      # Run pt-upgrade for real. Note that we only compare query times and results, NOT warnings,
@@ -215,7 +215,7 @@ def compare_queries(slowlog_path, silent_run_first, *compare_nodes)
      display_buffer_pool_hit_rate(compare_nodes)
      output "pt-upgrade completed with exit code #{exit_code}"
-
+
      # Drop the SUPER user and re-enable logging of warning 1592
      compare_nodes.each do |node|
        node.mysql_root_cmd "SET SESSION sql_log_bin = 0; DROP DATABASE IF EXISTS percona_schema;" if pt_upgrade_version.to_f >= 2.2
@@ -224,15 +224,15 @@ def compare_queries(slowlog_path, silent_run_first, *compare_nodes)
        end
      remove_suppress_1592.each {|node| node.mysql_root_cmd "SET GLOBAL log_warnings_suppress = ''"}
    end
-
-
+
+
    # Collects query slowlog on the master (and one active slave, if there are any)
    # using tcpdump, copies over to the host Jetpants is running on, converts to a
    # slowlog, and then uses Pool#compare_queries to run pt-upgrade.
    #
    # The supplied *compare_nodes should be standby slaves, and you may omit them
    # to automatically select two standby slaves (of different versions, if available)
-    # 
+    #
    # When comparing exactly two nodes, we stop replication on the nodes temporarily
    # to ensure a consistent dataset for comparing query results. Otherwise, async
    # replication can naturally result in false-positives.
@@ -244,7 +244,7 @@ def collect_and_compare_queries!(tcpdump_time=30, *compare_nodes)
      master.fast_copy_chain(Jetpants.export_location, local, files: master_dump_filename, overwrite: true)
      master.ssh_cmd "rm #{Jetpants.export_location}/#{master_dump_filename}"
      master_slowlog_path = local.dumpfile_to_slowlog("#{Jetpants.export_location}/#{master_dump_filename}")
-
+
      # If we also have an active slave running, grab sampled slowlog from there too
      active_slowlog_path = nil
      if active_slaves.size > 0
@@ -254,7 +254,7 @@ def collect_and_compare_queries!(tcpdump_time=30, *compare_nodes)
        active_slave.ssh_cmd "rm #{Jetpants.export_location}/#{active_dump_filename}"
        active_slowlog_path = local.dumpfile_to_slowlog("#{Jetpants.export_location}/#{active_dump_filename}")
      end
-
+
      # Gather our comparison nodes
      if compare_nodes.size == 0
        higher_ver_standby = standby_slaves.select {|s| s.version_cmp(master) > 0}.first
@@ -265,7 +265,7 @@ def collect_and_compare_queries!(tcpdump_time=30, *compare_nodes)
          compare_nodes = standby_slaves[0, 2]
        end
      end
-
+
      # Disable monitoring on our comparison nodes, and then stop replication
      # at the same position. We only proceed with this if we're comparing
      # exactly two nodes; this may be improved in a future release.
@@ -276,21 +276,21 @@ def collect_and_compare_queries!(tcpdump_time=30, *compare_nodes)
        }
        compare_nodes.first.pause_replication_with(compare_nodes.last)
      end
-
+
      # Run pt-upgrade using the master dumpfile
      puts
      output "COMPARISON VIA QUERY LOG FROM MASTER"
      compare_queries(master_slowlog_path, true, *compare_nodes)
-
+
      if active_slowlog_path
        puts
        output "COMPARISON VIA QUERY LOG FROM ACTIVE SLAVE"
        compare_queries(active_slowlog_path, true, *compare_nodes)
      end
-
+
      # If we previously paused replication and disabled monitoring, un-do this
      if compare_nodes.size == 2
-        compare_nodes.concurrent_each do |n| 
+        compare_nodes.concurrent_each do |n|
          n.resume_replication
          n.catch_up_to_master
          n.start_query_killer
@@ -298,6 +298,6 @@ def collect_and_compare_queries!(tcpdump_time=30, *compare_nodes)
        end
      end
    end
-
+
  end
end
diff --git a/plugins/upgrade_helper/shard.rb b/plugins/upgrade_helper/shard.rb
index a79327b..5fc5ed6 100644
--- a/plugins/upgrade_helper/shard.rb
+++ b/plugins/upgrade_helper/shard.rb
@@ -32,21 +32,21 @@ def branched_upgrade_prep(upgrade_shard_master_ip)
        next if needed == 0
        targets.concat Jetpants.topology.claim_spares(needed, role: "#{role}_slave".to_sym, like: like_node, version: Plugin::UpgradeHelper.new_version)
      end
-
+
      # Disable fast shutdown on the source
      source.mysql_root_cmd 'SET GLOBAL innodb_fast_shutdown = 0'
-
+
      # Flag the nodes as needing upgrade, which will get triggered when
      # enslave_siblings restarts them
      targets.each {|t| t.needs_upgrade = true}
-
+
      # Remove ib_lru_dump if present on targets
      targets.concurrent_each {|t| t.ssh_cmd "rm -rf #{t.mysql_directory}/ib_lru_dump"}
-
+
      source.enslave_siblings!(targets)
      targets.concurrent_each {|t| t.resume_replication; t.catch_up_to_master}
      source.pool.sync_configuration
-
+
      # Make the 1st new slave be the "future master" which the other new
      # slaves will replicate from
      future_master = targets.shift
@@ -59,7 +59,7 @@ def branched_upgrade_prep(upgrade_shard_master_ip)
        future_master.resume_replication
        future_master.catch_up_to_master
      end
-
+
    # Hack the pool configuration to send reads to the new master, but still send
    # writes to the old master (they'll replicate over)
    def branched_upgrade_move_reads
@@ -73,7 +73,7 @@ def branched_upgrade_move_reads
      @state = :child
      sync_configuration
    end
-
+
    # Move writes over to the new master
    def branched_upgrade_move_writes
      raise "Shard #{self} in wrong state to perform this action! expected :child, found #{@state}" unless @state == :child
@@ -81,6 +81,6 @@ def branched_upgrade_move_writes
      @state = :needs_cleanup
      sync_configuration
    end
-
+
  end
end
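
For readers following along with the upgrade_helper changes above: the four-step lockless shard upgrade that commandsuite.rb wires up is driven entirely from the jetpants CLI. The walkthrough below is an illustrative sketch only; the subcommands and flags are taken from the desc/method_option declarations in this patch, while the pool name is a hypothetical placeholder.

  # 1. Build replacement slaves on the new MySQL version and designate a "future master"
  jetpants shard_upgrade
  # 2. Send reads to the upgraded copy; writes still go to the old master and replicate over
  jetpants shard_upgrade --reads
  # 3. Move writes (and the master role) to the upgraded copy
  jetpants shard_upgrade --writes
  # 4. Once writes have stopped on the old parent master, retire the old-version nodes
  jetpants shard_upgrade --cleanup

  # Optional post-upgrade verification, also defined in commandsuite.rb above
  # ("my-upgraded-pool" is a made-up example pool name)
  jetpants checksum_pool --pool=my-upgraded-pool
  jetpants check_pool_queries --pool=my-upgraded-pool --dumptime=30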