
Troubleshooting MySQL: A Complete Guide to Fixing Production Database Problems


A production database that goes down can cost thousands of dollars per minute. After years of handling these crises, I have developed a systematic approach to troubleshooting MySQL. This article is a complete playbook for emergency situations.

Troubleshooting Methodology

1. Systematic Approach

  1. Identify: What is the problem?
  2. Isolate: Where is the problem?
  3. Investigate: Why did it happen?
  4. Implement: What is the fix?
  5. Prevent: How do we keep it from happening again?

2. Data Gathering Checklist

# Save the current state before making any changes
mysql -u root -p -e "SHOW GLOBAL STATUS;" > /tmp/status_before.txt
mysql -u root -p -e "SHOW PROCESSLIST;" > /tmp/processlist.txt
mysql -u root -p -e "SHOW ENGINE INNODB STATUS;" > /tmp/innodb_status.txt
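The checklist can be wrapped in a small helper so that each snapshot lands in its own timestamped directory and earlier captures are never overwritten. This is a minimal sketch, not a definitive tool: the function name `snapshot_mysql_state` is hypothetical, and it assumes the `mysql` client can log in without prompting (for example through `~/.my.cnf`).

```shell
# Hypothetical helper: capture server state into a timestamped directory.
# Assumption: the mysql client authenticates without a prompt (e.g. ~/.my.cnf).
snapshot_mysql_state() {
    local base_dir="${1:-/tmp}"
    local out_dir
    out_dir="${base_dir}/mysql_snapshot_$(date +%Y%m%d_%H%M%S)"
    mkdir -p "$out_dir" || return 1

    # The same three statements as the checklist, one output file each.
    mysql -e "SHOW GLOBAL STATUS;"        > "$out_dir/status_before.txt" || return 1
    mysql -e "SHOW PROCESSLIST;"          > "$out_dir/processlist.txt"   || return 1
    mysql -e "SHOW ENGINE INNODB STATUS;" > "$out_dir/innodb_status.txt" || return 1

    # Print the directory so the caller can record where the snapshot went.
    echo "$out_dir"
}
```

Calling `snapshot_mysql_state /tmp` before and after a change gives two directories that can be compared with `diff -r`.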

Common Errors and Solutions

1. “Can’t connect to local MySQL server through socket”

Cause: MySQL is not running, or the socket file does not exist.

Solution:

# Check the service status
sudo systemctl status mysql

# If it is stopped, start it
sudo systemctl start mysql

# Check the error log
tail -f /var/log/mysql/error.log

# If the socket file is missing, connect over TCP/IP while you check the configuration
mysql -u root -p -h 127.0.0.1

# Fix the socket path in my.cnf
[mysqld]
socket = /var/run/mysqld/mysqld.sock

[client]
socket = /var/run/mysqld/mysqld.sock

2. “Too many connections”

See the article: How to Fix the Too Many Connections Error

Quick Fix:

-- Kill idle connections
-- KILL accepts a single connection ID, so generate one statement per idle connection
SELECT CONCAT('KILL ', ID, ';') AS kill_stmt
FROM INFORMATION_SCHEMA.PROCESSLIST
WHERE COMMAND = 'Sleep' AND TIME > 300;

-- Review the generated statements, then run them
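Because `KILL` accepts a single connection ID, a list of IDs has to be expanded into one statement each. The sketch below does that expansion in the shell; the function name `build_kill_statements` is hypothetical, and it only formats text, so nothing is killed until its output is reviewed and fed to the `mysql` client.

```shell
# Hypothetical helper: read connection IDs from stdin (separated by spaces,
# tabs, commas, or newlines) and print one KILL statement per ID.
build_kill_statements() {
    local id
    # tr puts each token on its own line; non-numeric tokens are skipped.
    tr -s ' \t,' '\n' | while read -r id; do
        case "$id" in
            ''|*[!0-9]*) continue ;;
        esac
        printf 'KILL %s;\n' "$id"
    done
}
```

One possible use, again assuming passwordless client login: `mysql -N -e "SELECT ID FROM INFORMATION_SCHEMA.PROCESSLIST WHERE COMMAND = 'Sleep' AND TIME > 300" | build_kill_statements`, then pipe the reviewed output into `mysql`.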

3. “Table is full”

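Cause: The disk is full, or the table has reached a size limit.

Solution:

# Check disk space
df -h

# Check the size of the database directory
du -sh /var/lib/mysql/database_name/

# If the disk is full, list old log files first and review the list
sudo find /var/lib/mysql -name "*.log" -mtime +7

# Delete them only once the list looks safe
sudo find /var/lib/mysql -name "*.log" -mtime +7 -delete

# Or expand the disk (cloud)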
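The disk check is easy to automate so that a nearly full volume is flagged before MySQL starts reporting "Table is full". This is a rough sketch: the function name `full_filesystems` and the 90 percent default are assumptions, and it expects the POSIX `df -P` column layout with mount points that contain no spaces.

```shell
# Hypothetical helper: read `df -P` output on stdin and print the mount points
# whose usage is at or above a threshold (default: 90 percent).
full_filesystems() {
    local threshold="${1:-90}"
    awk -v limit="$threshold" '
        NR > 1 {
            use = $5
            sub(/%/, "", use)    # "93%" -> "93"
            if (use + 0 >= limit + 0) print $6, use "%"
        }'
}
```

Usage: `df -P | full_filesystems 85` prints nothing when every filesystem is below 85 percent.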

4. “Lock wait timeout exceeded”

Cause: A long-running transaction is holding locks.

Solution:

-- Identify the blocking transaction
-- (information_schema.innodb_lock_waits exists up to MySQL 5.7; MySQL 8.0 replaced it
--  with performance_schema.data_lock_waits and the sys.innodb_lock_waits view)
SELECT 
    r.trx_id waiting_trx_id,
    r.trx_mysql_thread_id waiting_thread,
    r.trx_query waiting_query,
    b.trx_id blocking_trx_id,
    b.trx_mysql_thread_id blocking_thread,
    b.trx_query blocking_query
FROM information_schema.innodb_lock_waits w
INNER JOIN information_schema.innodb_trx b ON b.trx_id = w.blocking_trx_id
INNER JOIN information_schema.innodb_trx r ON r.trx_id = w.requesting_trx_id;

-- Kill the blocking thread
KILL <blocking_thread_id>;

5. “InnoDB: Database page corruption”

Cause: Hardware failure, power outage, or disk corruption.

Solution:

# 1. Back up immediately
mysqldump -u root -p --all-databases > /backup/corruption_backup.sql

# 2. Check the table for corruption
cd /var/lib/mysql
sudo innochecksum database_name/table_name.ibd

# 3. Restart in recovery mode
sudo systemctl stop mysql

# Edit my.cnf
[mysqld]
innodb_force_recovery = 1  # Try 1 through 6; higher values are more aggressive

# 4. Restart and dump the data
sudo systemctl start mysql
mysqldump -u root -p database_name > /backup/recovery_dump.sql

# 5. Drop the damaged database while still in recovery mode
mysql -u root -p -e "DROP DATABASE database_name;"

# 6. Remove innodb_force_recovery from my.cnf and restart normally
sudo systemctl restart mysql

# 7. Recreate the database and restore
# (InnoDB rejects INSERT, UPDATE, and DELETE while innodb_force_recovery is above 0,
#  so the restore has to run after the setting is removed)
mysql -u root -p -e "CREATE DATABASE database_name;"
mysql -u root -p database_name < /backup/recovery_dump.sql
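Editing my.cnf by hand during an incident makes it easy to leave the recovery setting behind. One alternative is to keep it in a dedicated drop-in file, so leaving recovery mode means deleting that file. The sketch below is an assumption-laden illustration: the function name `write_force_recovery_conf` is hypothetical, and it assumes the server reads a drop-in directory such as `/etc/mysql/conf.d/` through an `!includedir` directive, which depends on the distribution.

```shell
# Hypothetical helper: write innodb_force_recovery to a dedicated config file.
# Assumption: the target file lives in a directory that my.cnf includes.
write_force_recovery_conf() {
    local conf_file="$1" level="$2"
    # Only levels 1 to 6 are meaningful; refuse anything else.
    case "$level" in
        [1-6]) ;;
        *) echo "level must be 1-6, got: $level" >&2; return 1 ;;
    esac
    printf '[mysqld]\ninnodb_force_recovery = %s\n' "$level" > "$conf_file"
}
```

For example, `sudo bash -c '. ./recovery.sh; write_force_recovery_conf /etc/mysql/conf.d/zz-recovery.cnf 1'` (both file names are placeholders), followed by a restart; raise the level one step at a time only if the server still fails to start.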

Performance Issues

1. Queries Suddenly Slow

Diagnosis:

-- Check the slow query log (the table is only populated when log_output includes TABLE)
SELECT * FROM mysql.slow_log 
WHERE start_time > DATE_SUB(NOW(), INTERVAL 1 HOUR)
ORDER BY query_time DESC;

-- Check whether the execution plan has changed
EXPLAIN FORMAT=JSON SELECT ... FROM slow_query;

-- Check the table statistics, then refresh them
SHOW TABLE STATUS LIKE 'problematic_table';
ANALYZE TABLE problematic_table;

Common Causes:
– Outdated statistics: run ANALYZE TABLE
– Index corruption: run OPTIMIZE TABLE or recreate the index
– Data distribution changed: reconsider the index strategy
– Server restarted: the InnoDB buffer pool is cold until the working set is read back in

2. High CPU Usage

Diagnosis:

-- Statements with the highest total execution time (a proxy for the top CPU consumers)
SELECT 
    DIGEST_TEXT,
    COUNT_STAR,
    AVG_TIMER_WAIT/1000000000 as avg_ms,
    SUM_CREATED_TMP_DISK_TABLES as disk_tmp_tables
FROM performance_schema.events_statements_summary_by_digest
ORDER BY SUM_TIMER_WAIT DESC
LIMIT 10;

Solution:

-- Enable profiling
SET profiling = 1;

-- Run the query
SELECT ... FROM ...;

-- Analyze
SHOW PROFILES;
SHOW PROFILE FOR QUERY 1;

3. Memory Exhaustion

Diagnosis:

# Check memory usage (-x matches mysqld exactly, -d, joins multiple PIDs for top)
top -p "$(pgrep -d, -x mysqld)"

# Or
ps aux | grep mysqld | grep -v grep

Memory configuration:

# my.cnf
[mysqld]
# 70-80% of RAM for InnoDB (on a dedicated database server)
innodb_buffer_pool_size = 4G

# Connection memory
max_connections = 100
# Each connection uses roughly 256KB - 1MB

# Query cache (only if enabled; the query cache was removed in MySQL 8.0)
query_cache_size = 256M
query_cache_type = 1
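The settings above add up to a rough worst-case footprint: buffer pool, plus per-connection memory times max_connections, plus the query cache. The sketch below does that arithmetic; the function name `estimate_mysql_memory_mb` is hypothetical, and the estimate deliberately ignores other buffers, so treat the result as a lower bound for the worst case.

```shell
# Hypothetical helper: rough worst-case memory estimate in MB.
# Arguments: buffer pool (MB), max_connections, per-connection memory (KB),
#            query cache (MB, optional, default 0).
estimate_mysql_memory_mb() {
    local buffer_pool_mb="$1" max_conn="$2" per_conn_kb="$3"
    local query_cache_mb="${4:-0}"
    # Integer arithmetic: connection memory is rounded down to whole MB.
    echo $(( buffer_pool_mb + max_conn * per_conn_kb / 1024 + query_cache_mb ))
}
```

With the example configuration (4G buffer pool, 100 connections at 1MB each, 256M query cache), `estimate_mysql_memory_mb 4096 100 1024 256` prints 4452, so the host needs comfortably more than 4.4GB of RAM available to MySQL.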

4. Disk I/O Bottleneck

Diagnosis:

# Monitor disk I/O
iostat -x 1 10

# Or
iotop -o -b -d 1

Solution:

-- Find the tables with the highest I/O latency
SELECT 
    OBJECT_SCHEMA,
    OBJECT_NAME,
    SUM_TIMER_WAIT/1000000000 as latency_ms
FROM performance_schema.table_io_waits_summary_by_table
ORDER BY SUM_TIMER_WAIT DESC;

-- Check how many temporary tables are created on disk
SHOW GLOBAL STATUS LIKE 'Created_tmp%';
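The raw `Created_tmp%` counters are easier to judge as a ratio: the share of temporary tables that spilled to disk. The sketch below computes it from the two-column output of the `mysql` client; the function name `tmp_disk_table_pct` is hypothetical, and what counts as "too high" is workload-dependent.

```shell
# Hypothetical helper: read "name value" status lines on stdin and print the
# percentage of temporary tables that were created on disk.
tmp_disk_table_pct() {
    awk '
        $1 == "Created_tmp_disk_tables" { disk = $2 }
        $1 == "Created_tmp_tables"      { total = $2 }
        END {
            if (total > 0) printf "%.1f\n", 100 * disk / total
            else print "n/a"
        }'
}
```

Usage, assuming passwordless client login: `mysql -N -e "SHOW GLOBAL STATUS LIKE 'Created_tmp%'" | tmp_disk_table_pct`.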

Replication Issues

1. High Replication Lag

Diagnosis:

-- On the slave
SHOW SLAVE STATUS\G
-- Look at: Seconds_Behind_Master

-- Identify the cause
SELECT * FROM performance_schema.replication_applier_status;

Solution:

-- Enable parallel replication
-- (slave_parallel_threads and slave_parallel_mode are MariaDB variables;
--  on MySQL use slave_parallel_workers and slave_parallel_type instead)
STOP SLAVE;
SET GLOBAL slave_parallel_threads = 4;
SET GLOBAL slave_parallel_mode = 'optimistic';
START SLAVE;

2. Replication Error

Diagnosis:

SHOW SLAVE STATUS\G
-- Look at: Last_SQL_Error, Last_IO_Error

Solutions by error:

-- Error: Duplicate entry
-- Skips one event; verify the data afterwards, because the skipped change is not applied
STOP SLAVE;
SET GLOBAL SQL_SLAVE_SKIP_COUNTER = 1;
START SLAVE;

-- Error: Table doesn't exist
-- Recreate the table from the master

-- Error: Foreign key constraint
-- Check data consistency

3. Master-Slave Data Inconsistency

Check:

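# Percona Toolkit (--ask-pass prompts for the password)
pt-table-checksum --user=root --ask-pass --databases=mydb

# Or check manually: dump the rows in primary-key order and hash them
# (--no-data would only compare the schema; --skip-comments drops the dump date)
mysqldump --skip-comments --order-by-primary mydb table_name | md5sum

# Run this on the master and on the slave, then compare the hashes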
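Comparing hashes by eye stops scaling after a few tables. The sketch below compares two files of "table checksum" pairs, one collected on the master and one on the slave. The function name `diff_table_checksums` and the file layout are assumptions; the pairs could come from the md5sum approach or from the `CHECKSUM TABLE` statement, and the first file is assumed to be non-empty.

```shell
# Hypothetical helper: compare two files of "table checksum" lines and report
# tables that differ or exist on only one side.
diff_table_checksums() {
    local master_file="$1" slave_file="$2"
    # First file: remember the master checksums. Second file: compare.
    awk '
        NR == FNR { master[$1] = $2; next }
        {
            seen[$1] = 1
            if (!($1 in master))       print $1, "missing on master"
            else if (master[$1] != $2) print $1, "differs"
        }
        END {
            for (t in master) if (!(t in seen)) print t, "missing on slave"
        }' "$master_file" "$slave_file"
}
```

An empty result means every table listed on both sides has the same checksum. The comparison is only meaningful if both files were captured while the slave was caught up.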

Crash Recovery

1. MySQL Won’t Start

Steps:

# 1. Check the error log
sudo tail -n 100 /var/log/mysql/error.log

# 2. Check disk space
df -h

# 3. Check permissions
ls -la /var/lib/mysql/
sudo chown -R mysql:mysql /var/lib/mysql

# 4. Recovery mode
sudo systemctl stop mysql
sudo nano /etc/mysql/my.cnf

[mysqld]
innodb_force_recovery = 1  # Start at 1 and go up to 6 only if needed
sudo systemctl start mysql

# Dump all data if recovery succeeds
mysqldump -u root -p --all-databases > /backup/emergency_dump.sql

# Restore to a new instance

2. Corrupted Table (MyISAM)

-- Repair the MyISAM table
REPAIR TABLE corrupted_table;

-- Or from the command line
myisamchk --recover /var/lib/mysql/db_name/table_name

3. Corrupted Table (InnoDB)

-- 1. Back up with force recovery
-- 2. Drop the table
DROP TABLE corrupted_table;

-- 3. Recreate it from a backup or from the schema
CREATE TABLE corrupted_table (...);

-- 4. Restore the data

Deadlock Resolution

1. Identify Deadlock

-- Check the most recent deadlock
SHOW ENGINE INNODB STATUS\G
-- Look for the "LATEST DETECTED DEADLOCK" section

2. Analyze the Deadlock Log

LATEST DETECTED DEADLOCK
*** (1) TRANSACTION:
TRANSACTION 12345, ACTIVE 11 sec starting index read
mysql tables in use 1, locked 1
LOCK WAIT 3 lock struct(s), heap size 1136, 2 row lock(s)
MySQL thread id 1234, OS thread handle 12345678, query id 123 localhost user
SELECT * FROM orders WHERE id = 100 FOR UPDATE
*** (1) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 55 page no 3 n bits 72 index PRIMARY of table `db`.`orders`
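The deadlock section is buried in a long status report, so it helps to cut it out before reading it. The sketch below prints everything from the "LATEST DETECTED DEADLOCK" heading up to the next section. The function name `extract_latest_deadlock` is hypothetical, and it assumes the next section heading is "TRANSACTIONS", which matches the usual report layout but is worth verifying on your server version.

```shell
# Hypothetical helper: read SHOW ENGINE INNODB STATUS output on stdin and
# print only the LATEST DETECTED DEADLOCK section, without separator lines.
extract_latest_deadlock() {
    awk '
        /^LATEST DETECTED DEADLOCK$/   { in_section = 1 }
        in_section && /^TRANSACTIONS$/ { exit }
        in_section && !/^-+$/          { print }
    '
}
```

Usage, assuming passwordless client login: `mysql -e "SHOW ENGINE INNODB STATUS\G" | extract_latest_deadlock`. No output means the server has not recorded a deadlock since it started.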

3. Prevention

-- Access tables in a consistent order
-- Use ORDER BY so rows are locked in a predictable order
-- Keep transactions short
-- Use indexes to reduce lock contention
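Consistent lock ordering can be enforced where the statement is built: if every transaction locks its rows in ascending key order, two transactions cannot each hold a row the other is waiting for. The sketch below builds such a statement. The function name `build_ordered_lock_query` is hypothetical, the `id` column is taken from the `orders` example in the deadlock log, and the IDs are assumed to be trusted numeric values.

```shell
# Hypothetical helper: build a SELECT ... FOR UPDATE that locks the given rows
# in ascending id order. Arguments: table name, then one or more numeric ids.
build_ordered_lock_query() {
    local table="$1"
    shift
    [ "$#" -gt 0 ] || return 1
    local ids
    # Sort numerically, drop duplicates, and join with commas.
    ids=$(printf '%s\n' "$@" | sort -n -u | paste -s -d, -)
    printf 'SELECT id FROM %s WHERE id IN (%s) ORDER BY id FOR UPDATE;\n' "$table" "$ids"
}
```

For example, `build_ordered_lock_query orders 300 100 200` prints `SELECT id FROM orders WHERE id IN (100,200,300) ORDER BY id FOR UPDATE;`.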

Emergency Playbook

1. Complete Crash Recovery

#!/bin/bash
# Emergency recovery script

# 1. Stop MySQL
sudo systemctl stop mysql

# 2. Back up the data files (even if they are corrupt)
sudo cp -r /var/lib/mysql "/var/lib/mysql_backup_$(date +%s)"

# 3. Enable recovery mode
sudo tee -a /etc/mysql/my.cnf <<EOF
[mysqld]
innodb_force_recovery = 4
innodb_purge_threads = 0
EOF

# 4. Start MySQL
sudo systemctl start mysql

# 5. Dump the data (keep the file name so the restore uses exactly this dump)
DUMP_FILE="/backup/emergency_$(date +%Y%m%d%H%M%S).sql"
mysqldump -u root -p --all-databases > "$DUMP_FILE"

# 6. Stop MySQL and reset the config
sudo systemctl stop mysql
sudo sed -i '/innodb_force_recovery/d' /etc/mysql/my.cnf
sudo sed -i '/innodb_purge_threads/d' /etc/mysql/my.cnf

# 7. Clear the data directory (only after confirming the dump is complete)
sudo rm -rf /var/lib/mysql/*

# 8. Reinitialize
sudo mysqld --initialize-insecure --user=mysql

# 9. Start MySQL
sudo systemctl start mysql

# 10. Restore the data
mysql -u root -p < "$DUMP_FILE"
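Clearing the data directory is irreversible, so it is worth confirming that the dump actually finished before that step. A successful mysqldump run ends with a "-- Dump completed" comment unless comments were disabled. The sketch below checks for it; the function name `dump_is_complete` is hypothetical, and passing the check does not prove that every table was readable, only that mysqldump reached the end.

```shell
# Hypothetical helper: succeed only if the dump file is non-empty and ends
# with the "-- Dump completed" comment that mysqldump writes on success.
dump_is_complete() {
    local dump_file="$1"
    [ -s "$dump_file" ] || return 1
    # Look at the last few lines in case of trailing blank lines.
    tail -n 5 "$dump_file" | grep -q '^-- Dump completed'
}
```

In the script, a guard such as `dump_is_complete /backup/emergency_dump.sql || exit 1` (the path is a placeholder) placed before the `rm -rf` step stops the run when the dump was cut short.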

2. Data Corruption Prevention

-- Enable checksums
SET GLOBAL innodb_checksum_algorithm = 'strict_crc32';

-- Regular checks
CHECK TABLE important_table;

Conclusion

Effective troubleshooting requires:
1. Systematic approach: Don't panic; follow the methodology
2. Proper logging: Always check the logs first
3. Backup before fix: Save the state before making changes
4. Test in staging: Validate the fix before production
5. Document: Record everything for future reference

With this playbook, you are ready to face almost any database crisis.

Written by

Hendra Wijaya
