
Troubleshooting MySQL: A Complete Guide to Resolving Production Database Problems


A production database that goes down can mean losing thousands of dollars per minute. After years of handling these incidents, I have developed a systematic approach to troubleshooting MySQL. This article is a complete playbook for emergency situations.

Troubleshooting Methodology

1. Systematic Approach

  1. Identify: What is the problem?
  2. Isolate: Where is the problem?
  3. Investigate: Why did it happen?
  4. Implement: What is the fix?
  5. Prevent: How do we keep it from happening again?

2. Data Gathering Checklist

# Capture the current state before making any changes
mysql -u root -p -e "SHOW GLOBAL STATUS;" > /tmp/status_before.txt
mysql -u root -p -e "SHOW FULL PROCESSLIST;" > /tmp/processlist.txt
# \G keeps the InnoDB report on separate lines instead of one escaped line
mysql -u root -p -e "SHOW ENGINE INNODB STATUS\G" > /tmp/innodb_status.txt
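
A "before" snapshot is most useful when it is compared with a second snapshot taken a little later. The sketch below assumes both files hold the tab-separated output that the mysql client writes when `SHOW GLOBAL STATUS` is redirected to a file. `status_delta` is a hypothetical helper name, not a MySQL tool.

```shell
# status_delta BEFORE_FILE AFTER_FILE
# Print every numeric status counter whose value changed, with its delta.
status_delta() {
    awk -F'\t' '
        NR == FNR { before[$1] = $2; next }    # first file: remember each value
        ($1 in before) && $2 ~ /^[0-9]+$/ && before[$1] ~ /^[0-9]+$/ && $2 != before[$1] {
            printf "%s\t%d\n", $1, $2 - before[$1]
        }
    ' "$1" "$2"
}
```

For example, `status_delta /tmp/status_before.txt /tmp/status_after.txt` (the second file being a later capture made the same way) lists only the counters that moved, which makes a sudden jump in something like `Threads_connected` easier to spot.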

Common Errors and Solutions

1. “Can’t connect to local MySQL server through socket”

Cause: MySQL is not running, or the socket file is missing.

Solution:

# Check the service status
sudo systemctl status mysql

# If it is stopped, start it
sudo systemctl start mysql

# Check the error log
sudo tail -f /var/log/mysql/error.log

# If the socket file is missing, connect over TCP/IP while you check the configuration
mysql -u root -p -h 127.0.0.1

# Fix the socket path in my.cnf (server and client must point to the same file)
[mysqld]
socket = /var/run/mysqld/mysqld.sock

[client]
socket = /var/run/mysqld/mysqld.sock
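
A frequent cause of this error is that the `[mysqld]` and `[client]` sections name different socket paths. The sketch below reads one key from one section of a config file so the two values can be compared. `cnf_value` is a hypothetical helper; it handles only plain `key = value` lines and does not follow `!include` directives.

```shell
# cnf_value FILE SECTION KEY
# Print the last value assigned to KEY inside [SECTION] of FILE.
cnf_value() {
    awk -v section="[$2]" -v key="$3" '
        /^[ \t]*\[/ { current = $0; gsub(/[ \t]/, "", current); next }
        current == section {
            line = $0
            sub(/#.*/, "", line)                 # drop trailing comments
            n = index(line, "=")
            if (n == 0) next                     # not a key = value line
            k = substr(line, 1, n - 1)
            v = substr(line, n + 1)
            gsub(/^[ \t]+|[ \t]+$/, "", k)       # trim whitespace
            gsub(/^[ \t]+|[ \t]+$/, "", v)
            if (k == key) value = v
        }
        END { if (value != "") print value }
    ' "$1"
}
```

Comparing `cnf_value /etc/mysql/my.cnf mysqld socket` with `cnf_value /etc/mysql/my.cnf client socket` shows quickly whether the client is looking in the wrong place.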

2. “Too many connections”

See the article: How to Fix the Too Many Connections Error

Quick Fix:

-- Generate one KILL statement per idle connection.
-- KILL accepts a single thread ID, so a comma-separated list of IDs is a syntax error.
SELECT CONCAT('KILL ', ID, ';') AS kill_stmt
FROM INFORMATION_SCHEMA.PROCESSLIST
WHERE COMMAND = 'Sleep' AND TIME > 300;

-- Review the generated statements, then run them.
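
The same filter can be applied to a processlist that was saved to a file, such as the `/tmp/processlist.txt` capture from the checklist. This is a minimal sketch that assumes the standard tab-separated `SHOW PROCESSLIST` layout (Id first, Command fifth, Time sixth, with a header row). `idle_kill_statements` is a hypothetical name.

```shell
# idle_kill_statements MIN_IDLE_SECONDS
# Read tab-separated processlist rows on stdin and print one KILL
# statement for each connection sleeping longer than the threshold.
idle_kill_statements() {
    awk -F'\t' -v min="$1" '
        NR > 1 && $5 == "Sleep" && $6 + 0 > min + 0 { print "KILL " $1 ";" }
    '
}
```

Running `idle_kill_statements 300 < /tmp/processlist.txt` only prints the statements, so they can be reviewed before being piped into the mysql client.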

3. “Table is full”

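Cause: The disk is full, or a table size limit has been reached.

Solution:

# Check disk space
df -h

# Check the size of the database directory
sudo du -sh /var/lib/mysql/database_name/

# If the disk is full, clean up old log files
# (never delete binary logs by hand; use PURGE BINARY LOGS instead)
sudo find /var/lib/mysql -name "*.log" -mtime +7 -delete

# Or expand the disk (cloud)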
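
Reading `df -h` by eye works during an incident, but the same check can be scripted so it runs before the disk fills up. The sketch below assumes the POSIX `df -P` output format (one header line, then one data line whose fifth column is the usage percentage). The helper names and the threshold are illustrative choices.

```shell
# used_percent: read `df -P <path>` output on stdin and print the
# usage percentage as a bare number.
used_percent() {
    awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# disk_state PERCENT THRESHOLD: print "critical" at or above the threshold.
disk_state() {
    if [ "$1" -ge "$2" ]; then
        echo "critical"
    else
        echo "ok"
    fi
}
```

A typical call would be `disk_state "$(df -P /var/lib/mysql | used_percent)" 90`.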

4. “Lock wait timeout exceeded”

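Cause: A long-running transaction is holding locks.

Solution:

-- Identify the blocking transaction
-- (MySQL 5.7 and earlier; MySQL 8.0 removed information_schema.innodb_lock_waits, so query sys.innodb_lock_waits there)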
SELECT 
    r.trx_id waiting_trx_id,
    r.trx_mysql_thread_id waiting_thread,
    r.trx_query waiting_query,
    b.trx_id blocking_trx_id,
    b.trx_mysql_thread_id blocking_thread,
    b.trx_query blocking_query
FROM information_schema.innodb_lock_waits w
INNER JOIN information_schema.innodb_trx b ON b.trx_id = w.blocking_trx_id
INNER JOIN information_schema.innodb_trx r ON r.trx_id = w.requesting_trx_id;

-- Kill the blocking thread
KILL <blocking_thread_id>;

5. “InnoDB: Database page corruption”

Cause: Hardware failure, a power outage, or disk corruption.

Solution:

# 1. Back up immediately
mysqldump -u root -p --all-databases > /backup/corruption_backup.sql

# 2. Stop MySQL and check the table for corruption
#    (innochecksum cannot read a file the server still has open)
sudo systemctl stop mysql
cd /var/lib/mysql
sudo innochecksum database_name/table_name.ibd

# 3. Enable recovery mode in my.cnf
[mysqld]
innodb_force_recovery = 1  # Try 1-6; higher values are more aggressive

# 4. Start MySQL and dump the data
sudo systemctl start mysql
mysqldump -u root -p database_name > /backup/recovery_dump.sql

# 5. Remove innodb_force_recovery from my.cnf and restart normally
#    (InnoDB rejects INSERT, UPDATE, and DELETE while it is set above 0)
sudo systemctl restart mysql

# 6. Recreate the database and restore
mysql -u root -p -e "DROP DATABASE database_name; CREATE DATABASE database_name;"
mysql -u root -p database_name < /backup/recovery_dump.sql
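
The advice to try levels 1 through 6 can be written as a loop that stops at the lowest level that works, since higher levels risk more damage. This is a sketch under stated assumptions: `first_working_level` and `start_with_level` are hypothetical helpers, the config path and service name are the ones used in this article, and my.cnf is assumed to already contain an `innodb_force_recovery` line.

```shell
# first_working_level CMD...
# Run "CMD <level>" for levels 1..6 and print the first level at which
# CMD succeeds. Returns non-zero if no level works.
first_working_level() {
    level=1
    while [ "$level" -le 6 ]; do
        if "$@" "$level"; then
            echo "$level"
            return 0
        fi
        level=$((level + 1))
    done
    return 1
}

# Hypothetical starter: rewrite the setting, restart MySQL, and check
# that the server answers. Not called here; shown for illustration only.
start_with_level() {
    sudo sed -i "s/^innodb_force_recovery.*/innodb_force_recovery = $1/" /etc/mysql/my.cnf &&
        sudo systemctl restart mysql &&
        mysqladmin ping > /dev/null
}
```

Calling `first_working_level start_with_level` would then leave the server running at the least aggressive level that starts, ready for the dump step.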

Performance Issues

1. Queries Suddenly Slow

Diagnosis:

-- Check the slow query log table (populated only when log_output includes TABLE)
SELECT * FROM mysql.slow_log 
WHERE start_time > DATE_SUB(NOW(), INTERVAL 1 HOUR)
ORDER BY query_time DESC;

-- Check whether the execution plan has changed
EXPLAIN FORMAT=JSON SELECT ... FROM slow_query;

-- Check the table statistics and refresh them
SHOW TABLE STATUS LIKE 'problematic_table';
ANALYZE TABLE problematic_table;

Common Causes:
- Statistics outdated: ANALYZE TABLE
- Index corruption: OPTIMIZE TABLE or recreate the index
- Data distribution changed: Reconsider the index strategy
- Server restarted: The InnoDB buffer pool is cold, so reads hit disk until it warms up

2. High CPU Usage

Diagnosis:

-- Statements with the highest total latency (a useful proxy for the top CPU consumers)
SELECT 
    DIGEST_TEXT,
    COUNT_STAR,
    AVG_TIMER_WAIT/1000000000 as avg_ms,
    SUM_CREATED_TMP_DISK_TABLES as disk_tmp_tables
FROM performance_schema.events_statements_summary_by_digest
ORDER BY SUM_TIMER_WAIT DESC
LIMIT 10;

Solution:

-- Enable profiling (deprecated but still available; Performance Schema is its replacement)
SET profiling = 1;

-- Run the query
SELECT ... FROM ...;

-- Analyze
SHOW PROFILES;
SHOW PROFILE FOR QUERY 1;

3. Memory Exhaustion

Diagnosis:

# Check memory usage
top -p "$(pgrep -d, -x mysqld)"

# Or
ps aux | grep mysqld | grep -v grep

Memory configuration:

# my.cnf
[mysqld]
# 70-80% of RAM for InnoDB (on a dedicated database server)
innodb_buffer_pool_size = 4G

# Connection memory: each connection uses roughly 256KB - 1MB
max_connections = 100

# Query cache (if enabled; it was removed in MySQL 8.0, so omit these lines there)
query_cache_size = 256M
query_cache_type = 1
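
The settings above imply a rough worst-case memory footprint: the buffer pool plus the per-connection memory multiplied by `max_connections`. The sketch below does that arithmetic with the article's own numbers. The helper names are hypothetical, the 80% ceiling is an assumption chosen for illustration, and the estimate ignores other buffers and the operating system's needs.

```shell
# estimate_mysql_memory_mb BUFFER_POOL_MB MAX_CONNECTIONS PER_CONNECTION_KB
# Print the buffer pool plus worst-case connection memory, in MB.
estimate_mysql_memory_mb() {
    echo $(( $1 + ($2 * $3) / 1024 ))
}

# fits_in_ram ESTIMATE_MB TOTAL_RAM_MB
# Succeed if the estimate stays within 80% of total RAM.
fits_in_ram() {
    [ "$1" -le $(( $2 * 80 / 100 )) ]
}
```

With a 4G buffer pool and 100 connections at 1MB each, `estimate_mysql_memory_mb 4096 100 1024` gives 4196 MB, which fits on an 8GB server but not on one with 5GB.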

4. Disk I/O Bottleneck

Diagnosis:

# Monitor disk I/O
iostat -x 1 10

# Or
sudo iotop -o -b -d 1

Solution:

-- Find the tables with the highest I/O latency
SELECT 
    OBJECT_SCHEMA,
    OBJECT_NAME,
    SUM_TIMER_WAIT/1000000000 as latency_ms
FROM performance_schema.table_io_waits_summary_by_table
ORDER BY SUM_TIMER_WAIT DESC;

-- Check temporary tables created on disk
SHOW GLOBAL STATUS LIKE 'Created_tmp%';
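
The `Created_tmp%` counters are easier to judge as a ratio: the share of temporary tables that spilled to disk. The sketch below assumes the tab-separated output the mysql client produces when that `SHOW GLOBAL STATUS` statement is run with `-e` and piped. `tmp_disk_ratio` is a hypothetical helper name.

```shell
# tmp_disk_ratio: read tab-separated status rows on stdin and print the
# percentage of temporary tables that were created on disk.
tmp_disk_ratio() {
    awk -F'\t' '
        $1 == "Created_tmp_disk_tables" { disk = $2 }
        $1 == "Created_tmp_tables"      { total = $2 }
        END {
            if (total > 0) printf "%.1f\n", 100 * disk / total
            else print "n/a"
        }
    '
}
```

A high percentage suggests queries whose intermediate results do not fit in memory, which adds to the disk I/O load being investigated.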

Replication Issues

1. High Replication Lag

Diagnosis:

-- On the slave
SHOW SLAVE STATUS\G
-- Watch: Seconds_Behind_Master

-- Identify the cause
SELECT * FROM performance_schema.replication_applier_status;
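
For monitoring, the lag value can be pulled out of the `SHOW SLAVE STATUS\G` output and classified. This is a minimal sketch that assumes the vertical `Name: value` layout the mysql client prints for `\G`. The helper names and the threshold are illustrative.

```shell
# replication_lag: read SHOW SLAVE STATUS\G output on stdin and print
# the Seconds_Behind_Master value (a number, or NULL).
replication_lag() {
    awk -F': *' '$1 ~ /^ *Seconds_Behind_Master$/ { print $2 }'
}

# lag_status LAG THRESHOLD_SECONDS
# NULL means the SQL thread is not running, so there is no lag figure.
lag_status() {
    case "$1" in
        ''|NULL)  echo "stopped" ;;
        *[!0-9]*) echo "unknown" ;;
        *)
            if [ "$1" -gt "$2" ]; then
                echo "lagging"
            else
                echo "ok"
            fi
            ;;
    esac
}
```

A call such as `lag_status "$(mysql -e 'SHOW SLAVE STATUS\G' | replication_lag)" 30` would report "lagging" once the slave falls more than 30 seconds behind.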

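Solution:

-- Enable parallel replication (MySQL)
STOP SLAVE;
SET GLOBAL slave_parallel_type = 'LOGICAL_CLOCK';
SET GLOBAL slave_parallel_workers = 4;
START SLAVE;

-- MariaDB uses different variables:
-- SET GLOBAL slave_parallel_threads = 4;
-- SET GLOBAL slave_parallel_mode = 'optimistic';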

2. Replication Error

Diagnosis:

SHOW SLAVE STATUS\G
-- Watch: Last_SQL_Error, Last_IO_Error

Solutions by error type:

-- Error: Duplicate entry
-- Skipping an event can leave the slave out of sync with the master, so verify the data afterwards
STOP SLAVE;
SET GLOBAL SQL_SLAVE_SKIP_COUNTER = 1;
START SLAVE;

-- Error: Table doesn't exist
-- Recreate the table from the master

-- Error: Foreign key constraint
-- Check data consistency

3. Master-Slave Data Inconsistency

Check:

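# Percona Toolkit (--ask-pass prompts for the password)
pt-table-checksum --user=root --ask-pass --databases=mydb

# Or check manually
# --skip-comments drops the dump timestamp, which would otherwise make every checksum differ
mysqldump --skip-comments --order-by-primary mydb table_name | md5sum

# Run this on both the master and the slave, then compare the results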

Crash Recovery

1. MySQL Won’t Start

Steps:

# 1. Check the error log
sudo tail -n 100 /var/log/mysql/error.log

# 2. Check disk space
df -h

# 3. Check permissions
ls -la /var/lib/mysql/
sudo chown -R mysql:mysql /var/lib/mysql

# 4. Recovery mode
sudo systemctl stop mysql
sudo nano /etc/mysql/my.cnf

[mysqld]
innodb_force_recovery = 1  # Start at 1 and raise it up to 6 if needed

sudo systemctl start mysql

# Dump all data if recovery succeeds
mysqldump -u root -p --all-databases > /backup/emergency_dump.sql

# Restore to a new instance

2. Corrupted Table (MyISAM)

-- Repair a MyISAM table
REPAIR TABLE corrupted_table;

-- Or from the command line (stop MySQL first so the table files are not in use)
myisamchk --recover /var/lib/mysql/db_name/table_name

3. Corrupted Table (InnoDB)

-- 1. Back up the data with innodb_force_recovery enabled
-- 2. Drop the table
DROP TABLE corrupted_table;

-- 3. Recreate it from a backup or from the schema
CREATE TABLE corrupted_table (...);

-- 4. Restore the data

Deadlock Resolution

1. Identify Deadlock

-- Check the most recent deadlock
SHOW ENGINE INNODB STATUS\G
-- Look for the "LATEST DETECTED DEADLOCK" section

2. Analyze the Deadlock Log

LATEST DETECTED DEADLOCK
*** (1) TRANSACTION:
TRANSACTION 12345, ACTIVE 11 sec starting index read
mysql tables in use 1, locked 1
LOCK WAIT 3 lock struct(s), heap size 1136, 2 row lock(s)
MySQL thread id 1234, OS thread handle 12345678, query id 123 localhost user
SELECT * FROM orders WHERE id = 100 FOR UPDATE
*** (1) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 55 page no 3 n bits 72 index PRIMARY of table `db`.`orders`
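
The full InnoDB status report is long, so it helps to cut out just the deadlock section and list the threads involved. The sketch below assumes the report was saved as multi-line text (as printed with `\G`) and that the deadlock section ends where the `TRANSACTIONS` section begins. Both helper names are hypothetical.

```shell
# deadlock_section: read an InnoDB status report on stdin and print only
# the body of the LATEST DETECTED DEADLOCK section, without the rulers.
deadlock_section() {
    awk '
        $0 == "TRANSACTIONS"             { inside = 0 }
        inside && $0 !~ /^-+$/           { print }
        $0 == "LATEST DETECTED DEADLOCK" { inside = 1 }
    '
}

# deadlock_thread_ids: print the MySQL thread ID of each transaction
# mentioned in the deadlock section.
deadlock_thread_ids() {
    awk '/^MySQL thread id / { id = $4; sub(/,/, "", id); print id }'
}
```

For the sample above, `deadlock_section < /tmp/innodb_status.txt | deadlock_thread_ids` would print 1234, the thread that was waiting for the record lock on `orders`.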

3. Prevention

-- Access tables in a consistent order
-- Use ORDER BY so that rows are locked in a consistent order
-- Keep transactions short
-- Use indexes to reduce lock contention

Emergency Playbook

1. Complete Crash Recovery

#!/bin/bash
# Emergency recovery script
# Stop at the first failed step so the data directory is never wiped after a failed dump
set -euo pipefail

# 1. Stop MySQL
sudo systemctl stop mysql

# 2. Back up the data files (even if they are corrupt)
sudo cp -a /var/lib/mysql "/var/lib/mysql_backup_$(date +%s)"

# 3. Enable recovery mode
sudo tee -a /etc/mysql/my.cnf <<EOF
[mysqld]
innodb_force_recovery = 4
innodb_purge_threads = 0
EOF

# 4. Start MySQL
sudo systemctl start mysql

# 5. Dump the data
DUMP_FILE="/backup/emergency_$(date +%Y%m%d%H%M%S).sql"
mysqldump -u root -p --all-databases > "$DUMP_FILE"

# 6. Stop MySQL and reset the config
sudo systemctl stop mysql
sudo sed -i '/innodb_force_recovery/d' /etc/mysql/my.cnf
sudo sed -i '/innodb_purge_threads/d' /etc/mysql/my.cnf

# 7. Clear the data directory
sudo rm -rf /var/lib/mysql/*

# 8. Reinitialize
sudo mysqld --initialize-insecure --user=mysql

# 9. Start MySQL
sudo systemctl start mysql

# 10. Restore the data
mysql -u root -p < "$DUMP_FILE"

2. Data Corruption Prevention

-- Enable strict checksum verification
SET GLOBAL innodb_checksum_algorithm = 'strict_crc32';

-- Run regular checks
CHECK TABLE important_table;

Conclusion

Effective troubleshooting requires:
1. A systematic approach: Don't panic; follow the methodology
2. Proper logging: Always check the logs first
3. Backup before fixing: Capture the state before making changes
4. Testing in staging: Validate the fix before applying it in production
5. Documentation: Record everything for future reference

With this playbook, you are prepared to handle almost any database crisis.

Written by

Hendra Wijaya
