Spider adalah storage engine di MariaDB yang memungkinkan transparent sharding dan distributed database architecture. Setelah mengimplementasikan sharding untuk high-scale applications, saya akan berbagi setup dan best practices.
Apa itu Spider Storage Engine?
Konsep Sharding dengan Spider
Spider memungkinkan:
– Transparent sharding: Data terdistribusi tapi diakses sebagai satu table
– Distributed queries: Query berjalan di multiple nodes
– Automatic routing: Spider mengarahkan queries ke shards yang tepat
Arsitektur
Application → Spider Node → Shard 1
→ Shard 2
→ Shard 3
Setup Spider
1. Install Spider
# Spider biasanya sudah included dalam MariaDB
# Cek apakah sudah tersedia
mysql -e "INSTALL SONAME 'ha_spider';"
Atau enable di my.cnf
[mysqld]
plugin-load-add=ha_spider
2. Configure Spider Node
-- Spider node (frontend)
CREATE SERVER shard1
FOREIGN DATA WRAPPER mysql
OPTIONS (
HOST '192.168.1.101',
DATABASE 'shard_db',
USER 'spider_user',
PASSWORD 'password',
PORT 3306
);
CREATE SERVER shard2
FOREIGN DATA WRAPPER mysql
OPTIONS (
HOST '192.168.1.102',
DATABASE 'shard_db',
USER 'spider_user',
PASSWORD 'password',
PORT 3306
);
3. Create Sharded Table
-- Create Spider table dengan sharding CREATE TABLE users ( id INT, username VARCHAR(50), email VARCHAR(100), created_at TIMESTAMP,-- Shard key (partitioning) PARTITION BY LIST COLUMNS(id) ( PARTITION p1 VALUES IN (1, 2, 3) COMMENT 'shard1', PARTITION p2 VALUES IN (4, 5, 6) COMMENT 'shard2' )) ENGINE=SPIDER
COMMENT 'wrapper "mysql", table "users"';-- Atau dengan HASH sharding
CREATE TABLE orders (
id INT,
user_id INT,
total DECIMAL(10,2),PARTITION BY HASH(user_id) PARTITIONS 4) ENGINE=SPIDER;
Configuration Options
1. Spider System Variables
# my.cnf [mysqld] # Spider settings spider_internal_sql_log_off = 1 spider_direct_sql = 1 spider_support_xa = 1Connection settings
spider_connect_timeout = 10 spider_net_read_timeout = 30 spider_net_write_timeout = 30
2. Table Options
CREATE TABLE distributed_table (...) ENGINE=SPIDER COMMENT ' wrapper "mysql", srv "shard1,shard2,shard3", table "table_name", mbk "1" ';Use Cases
1. Large Scale Data
-- Sharding untuk big data CREATE TABLE logs ( log_id BIGINT, log_time DATETIME, level VARCHAR(10), message TEXT ) ENGINE=SPIDER PARTITION BY RANGE (YEAR(log_time)) ( PARTITION p2022 VALUES LESS THAN (2023) COMMENT 'srv "shard1"', PARTITION p2023 VALUES LESS THAN (2024) COMMENT 'srv "shard2"', PARTITION p2024 VALUES LESS THAN (2025) COMMENT 'srv "shard3"' );2. Geographic Distribution
-- Sharding berdasarkan region CREATE TABLE customers ( customer_id INT, region VARCHAR(20), name VARCHAR(100) ) ENGINE=SPIDER PARTITION BY LIST COLUMNS(region) ( PARTITION asia VALUES IN ('SG', 'JP', 'ID') COMMENT 'srv "asia_shard"', PARTITION europe VALUES IN ('UK', 'DE', 'FR') COMMENT 'srv "eu_shard"', PARTITION americas VALUES IN ('US', 'CA', 'BR') COMMENT 'srv "us_shard"' );Query Distribution
1. Automatic Query Routing
-- Spider otomatis route query ke shard yang tepat SELECT * FROM users WHERE id = 5; -- Akan di-route ke shard2 (berdasarkan partitioning)SELECT * FROM orders WHERE user_id = 123; -- Akan di-route berdasarkan hash(user_id)
2. Aggregations
-- Aggregations across shards SELECT region, COUNT(*) as customer_count FROM customers GROUP BY region;-- Spider akan query semua shards dan aggregate results
Monitoring Spider
1. Spider Status
-- Check Spider servers SELECT * FROM mysql.servers;-- Check Spider table status SHOW TABLE STATUS WHERE Engine = 'SPIDER';
2. Performance Monitoring
-- Spider statistics SHOW STATUS LIKE 'Spider%';-- Check slow queries SELECT * FROM mysql.slow_log WHERE db LIKE '%spider%' ORDER BY query_time DESC;
Best Practices
- Choose shard key wisely: Harus sering digunakan dalam WHERE
- Consistent sharding: Gunakan same shard key untuk related tables
- Monitor connections: Spider menggunakan banyak connections
- Test failover: Spider bisa handle shard failures
- Balance load: Monitor shard utilization
Kesimpulan
Spider memungkinkan:
– Transparent sharding tanpa application changes
– Horizontal scalability
– Geographic distribution
– High availability dengan failoverUntuk aplikasi dengan massive data dan high traffic, Spider adalah solusi sharding yang powerful.
Ditulis oleh
Hendra Wijaya