Set autocommit to 0 #4113
This helps soften the blow of cockups where erroneous SQL statements are run directly against the database, as it would be trivial to ROLLBACK if statements were not automatically committed right after they were run.
Incidents that would not have happened if this were the case:
- https://meta.miraheze.org/wiki/Tech:Incidents/2024-11-13-mhglobal-data-loss-and-recovery
- https://meta.miraheze.org/wiki/Tech:Incidents/2024-12-08-mw-permissions-data-loss-and-recovery
Walkthrough
The changes involve the addition of a new configuration parameter
So it looks like this isn't the magic bullet it appeared to be. From #wikimedia-data-persistence:
You can use START TRANSACTION; and then, if necessary, type ROLLBACK.
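A minimal sketch of that habit, assuming a POSIX shell. The helper name `wrap_in_transaction` is hypothetical and not part of this repository. It prints the given statements between START TRANSACTION and a final ROLLBACK or COMMIT, so a change can be rehearsed before it is made permanent. It only prints SQL and never contacts a server.

```shell
#!/bin/sh
# Hypothetical helper (an illustration, not an existing tool):
# print SQL statements wrapped in an explicit transaction.
# Usage: wrap_in_transaction ROLLBACK|COMMIT <statement>...
wrap_in_transaction() {
  final="$1"
  case "$final" in
    ROLLBACK|COMMIT) shift ;;
    *) echo "usage: wrap_in_transaction ROLLBACK|COMMIT <statement>..." >&2
       return 2 ;;
  esac
  printf 'START TRANSACTION;\n'
  for stmt in "$@"; do
    # Strip one trailing semicolon so each statement ends with exactly one.
    printf '%s;\n' "${stmt%;}"
  done
  printf '%s;\n' "$final"
}
```

Piping the output of a ROLLBACK run into the client rehearses the statements; rerunning with COMMIT applies them. DDL statements such as ALTER TABLE commit implicitly in MariaDB, so a ROLLBACK does not undo them.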
Yeah, but I'm kinda worried in case someone forgot
@coderabbitai review
✅ Actions performed: Review triggered.
Actionable comments posted: 1
🧹 Outside diff range and nitpick comments (1)
modules/mariadb/templates/config/mw.cnf.erb (1)
20-23: Enhance documentation with performance implications
While the comments clearly explain the safety rationale, they should also document:
- Performance implications (increased I/O, potential for long transaction holds)
- Risk of deadlocks
- Impact on connection pooling
Consider adding these details to the comments:
# Helps soften the blow of cockups such as T12886 and T12974
# Please, please, please don't reenable unless if there is a substitute
# to easily rolling back bad SQL statements when manually interfacing with
# the database.
+# Note: This setting may impact performance due to increased I/O and
+# potential for deadlocks. Monitor transaction times and connection pools.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (1)
modules/mariadb/templates/config/mw.cnf.erb (1 hunks)
🔇 Additional comments (1)
modules/mariadb/templates/config/mw.cnf.erb (1)
24-24: Consider alternative approaches to prevent data loss
While setting autocommit=0 provides a safety net, it comes with significant risks:
- Performance Impact:
  - Increased I/O overhead
  - Higher risk of deadlocks
  - Potential memory pressure from long-running transactions
- Architectural Concerns:
  - May conflict with connection pooling (thread_pool_size=32)
  - Could affect application-level transaction management
  - Risk of forgotten transactions blocking resources
Consider these alternatives:
- Application-level safeguards:
  -- Use explicit transactions where needed
  START TRANSACTION;
  -- Your SQL here
  -- COMMIT or ROLLBACK
- Implement a proxy layer that:
  - Enforces explicit transaction control
  - Logs all direct database access
  - Provides automatic rollback on session end
- Create restricted database roles that:
  - Require explicit transaction control
  - Limit dangerous operations
  - Enable auditing of direct access
Let's check for any existing transaction management patterns:
✅ Verification successful
Let me gather more information about the database configuration and transaction patterns.
The previous search didn't find SQL files, so let's also explicitly search for them:
And let's check the specific context around the autocommit setting:
Let me search for any application code that might rely on autocommit behavior.
Based on the MariaDB configuration file and the context available, I can now provide a final assessment.
Setting autocommit=0 is a justified safety measure
The configuration comment explicitly states the rationale:
# Helps soften the blow of cockups such as T12886 and T12974
# Please, please, please don't reenable unless if there is a substitute
# to easily rolling back bad SQL statements when manually interfacing with
# the database.
The configuration is safe because:
- It's used alongside appropriate transaction isolation level (READ-COMMITTED)
- Thread pooling settings are properly configured (thread_pool_size = 32)
- Lock wait timeout is reasonably set (innodb_lock_wait_timeout = 120)
- Transaction log settings are optimized (innodb-flush-log-at-trx-commit = 1)
While autocommit=0 can impact performance, the benefits of transaction safety outweigh the costs in this environment, especially for manual database operations.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Search for transaction-related patterns in SQL files
rg -i "START TRANSACTION|BEGIN|COMMIT|ROLLBACK" --type sql
# Look for database connection configuration
rg -i "PDO|mysqli|DB::connection" --type php
Length of output: 731
Script:
#!/bin/bash
# Search for database configuration files and their content
fd -e cnf -e conf -e ini -X cat
# Look for any PHP files that might contain database interaction code
fd -e php -X grep -l "mysql" {}
# Check for any documentation about transaction handling
fd README -X cat {}
fd INSTALL -X cat {}
Length of output: 65750
Script:
fd -e sql -X cat {}
Length of output: 23
Script:
# Get context of the MariaDB config file
cat modules/mariadb/templates/config/mw.cnf.erb
Length of output: 4368
Script:
# Search for transaction-related code
rg -i "BEGIN|COMMIT|ROLLBACK|autocommit|START TRANSACTION" --type php
# Search for database connection configuration files
fd -e php -X grep -l "PDO|mysqli" {}
# Look for any documentation about database transactions
rg -i "transaction|autocommit" README
Length of output: 199
# Helps soften the blow of cockups such as T12886 and T12974
# Please, please, please don't reenable unless if there is a substitute
# to easily rolling back bad SQL statements when manually interfacing with
# the database.
autocommit = 0
Critical: Reconsider global autocommit disable
Based on the PR discussion and configuration analysis, disabling autocommit globally presents significant risks:
- As noted by Marostegui, Wikimedia avoids this approach due to performance concerns
- The current configuration shows heavy use of connection pooling and optimized thread handling
- The real issue appears to be unrestricted direct database access
Recommended approach:
- Keep autocommit=1 but implement:
  - Restricted database roles for direct access
  - Mandatory access through MediaWiki's abstraction layer
  - Automated backup before direct SQL execution
  - SQL statement validation proxy
- If proceeding with autocommit=0:
  - Start with a subset of connections
  - Monitor performance metrics
  - Implement session timeouts
  - Add transaction monitoring
Would you like help creating:
- A restricted database role configuration?
- A SQL validation proxy implementation?
- Performance monitoring queries?
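As a rough illustration of the transaction monitoring mentioned above, the sketch below prints a query against `information_schema.INNODB_TRX` that lists InnoDB transactions open longer than a threshold. The function name `long_transactions_sql` and the 300-second default are assumptions made for this example, not something the PR or the review defines.

```shell
#!/bin/sh
# Sketch: print a query listing InnoDB transactions open longer than N seconds.
# The function name and the default threshold are illustrative assumptions.
long_transactions_sql() {
  threshold="${1:-300}"
  case "$threshold" in
    # Accept digits only, so the value is safe to place in the query text.
    ''|*[!0-9]*) echo "threshold must be a whole number of seconds" >&2
                 return 2 ;;
  esac
  cat <<EOF
SELECT trx_mysql_thread_id AS thread_id,
       trx_started,
       TIMESTAMPDIFF(SECOND, trx_started, NOW()) AS age_seconds
FROM information_schema.INNODB_TRX
WHERE trx_started < NOW() - INTERVAL $threshold SECOND
ORDER BY trx_started;
EOF
}
```

The output can be piped into the CLI client from a cron job or a monitoring check. With autocommit=0, a forgotten session shows up here as a transaction that keeps ageing, and its `thread_id` identifies the connection holding it.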
Please suggest more options for how we can deal with bad SQL. Consider something that would allow setting this for CLI sessions only.
@RhinosF1, certainly! Here are some options to help you deal with bad SQL statements and configure autocommit=0 specifically for CLI sessions:
- Use init_connect to Disable Autocommit for Non-SUPER Users:
  - Set the init_connect global variable to execute SET autocommit=0; whenever a user connects:
    SET GLOBAL init_connect = 'SET autocommit=0;';
  - Note: init_connect does not apply to users with the SUPER privilege. Ensure that CLI users do not have SUPER privileges to enforce this setting.
- Conditional init_connect Based on User or Host:
  - Modify init_connect to conditionally disable autocommit based on the connecting user:
    SET GLOBAL init_connect = "IF (CURRENT_USER() = 'cli_user@localhost') THEN SET autocommit=0; END IF;";
  - Replace 'cli_user@localhost' with the appropriate user and host for CLI sessions. CURRENT_USER() returns a single user@host string, so the comparison value is written as one quoted string.
- Client-Side Configuration in my.cnf:
  - Update the MySQL client configuration to disable autocommit for CLI sessions:
    [mysql]
    init-command=SET autocommit=0
  - Add this to the global /etc/mysql/my.cnf or users' personal ~/.my.cnf files under the [mysql] section. This ensures that whenever the mysql CLI is used, autocommit is disabled for that session.
- Create a Wrapper Script for the MySQL CLI:
  - Develop a shell script that sets autocommit to 0 upon connection:
    #!/bin/bash
    mysql --init-command="SET autocommit=0;" "$@"
  - Have users invoke this script instead of the mysql command to ensure autocommit is disabled during their sessions.
- Restrict Direct Database Access:
  - Limit direct access to the database by:
    - Using application interfaces (like MediaWiki's sql.php) that handle transactions properly.
    - Providing limited privileges to users who require CLI access.
    - Encouraging the use of transactions (START TRANSACTION;) explicitly during manual interventions.
- Implement a SQL Proxy with Session Initialization:
  - Utilize a SQL proxy (e.g., ProxySQL or MaxScale) to intercept connections and set session variables like autocommit=0 for specific users or hosts.
- Educate and Enforce Best Practices:
  - Train operators on safe SQL practices, emphasizing the importance of transactions and proper commit/rollback procedures.
  - Create guidelines or checklists for performing manual SQL operations.
Additional Suggestions:
- Audit and Logging:
  - Enable detailed logging for CLI sessions to monitor and audit queries executed directly on the database.
- Automate Backups Before Manual Changes:
  - Implement scripts that create backups or snapshots before allowing direct SQL execution, providing a safety net in case of errors.
- Review and Approval Processes:
  - Require review or approval for manual SQL statements that modify data, reducing the likelihood of mistakes.
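The backup suggestion could look roughly like the sketch below, which dumps the tables a manual statement is about to touch. `DUMP_TOOL`, `BACKUP_DIR`, and `backup_tables` are hypothetical names introduced for this example; the default tool is assumed to be `mariadb-dump` (called `mysqldump` on older installs), and the database and table names in the usage note are placeholders.

```shell
#!/bin/sh
# Sketch: dump the tables a manual statement will touch before running it.
# DUMP_TOOL and BACKUP_DIR are hypothetical knobs, not existing settings.
DUMP_TOOL="${DUMP_TOOL:-mariadb-dump}"
BACKUP_DIR="${BACKUP_DIR:-/tmp}"

# Usage: backup_tables <database> <table>...
# Writes the dump to a timestamped file and prints that file's path.
backup_tables() {
  if [ "$#" -lt 2 ]; then
    echo "usage: backup_tables <database> <table>..." >&2
    return 2
  fi
  backup_out="$BACKUP_DIR/$1-$(date +%Y%m%dT%H%M%S)-$$.sql"
  # --single-transaction dumps InnoDB tables from a consistent snapshot
  # instead of locking them for the duration of the dump.
  "$DUMP_TOOL" --single-transaction "$@" > "$backup_out" || return 1
  printf '%s\n' "$backup_out"
}
```

Something like `backup_tables somedb some_table` before a risky UPDATE leaves a file that can be reloaded if the statement turns out to be wrong, independent of whether autocommit was on at the time.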
Feel free to let me know if you'd like assistance implementing any of these options or exploring them further.
@BlankEclair: thoughts on any of the other options?
DB things are not my specialty. Perhaps we can try to automatically run
`SET autocommit = 0` when using sql.php or the mariadb CLI client, but
please run everything through someone who knows what they're doing.
This helps soften the blow of cockups where erroneous SQL statements are run directly against the database, as it would be trivial to ROLLBACK if statements were not automatically committed right after they were run.
When using the mariadb CLI client and MediaWiki's sql.php, a COMMIT must be explicitly run in order to actually save data to disk. Exiting without committing will automatically roll back the changes without a warning.
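Because an uncommitted session is rolled back silently, a wrapper that reminds the operator at connect time may help. The sketch below assumes the `mariadb` client binary (named `mysql` on older installs) and its `--init-command` option; `SQL_CLIENT` and `safe_sql` are hypothetical names used only for illustration.

```shell
#!/bin/sh
# Sketch of a wrapper around the MariaDB CLI client.
# SQL_CLIENT is a hypothetical override, handy for testing or older installs.
SQL_CLIENT="${SQL_CLIENT:-mariadb}"

safe_sql() {
  # Print the reminder on stderr so stdout stays clean for query output.
  echo "autocommit is OFF: run COMMIT to keep changes; exiting without it discards them." >&2
  # Disable autocommit for this session only, then pass all arguments through.
  "$SQL_CLIENT" --init-command="SET autocommit=0" "$@"
}
```

Operators would call `safe_sql` with the same arguments they normally pass to the client. This keeps the server default untouched, so application connections are unaffected.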