Lewati ke konten
Kembali ke Blog

MariaDB Spider Storage Engine: Sharding untuk Scalable Database

Β· Β· 4 menit baca

Spider adalah storage engine di MariaDB yang memungkinkan transparent sharding dan distributed database architecture. Setelah mengimplementasikan sharding untuk high-scale applications, saya akan berbagi setup dan best practices.

Apa itu Spider Storage Engine?

Konsep Sharding dengan Spider

Spider memungkinkan:
Transparent sharding: Data terdistribusi tapi diakses sebagai satu table
Distributed queries: Query berjalan di multiple nodes
Automatic routing: Spider mengarahkan queries ke shards yang tepat

Arsitektur

Application β†’ Spider Node β†’ Shard 1
                         β†’ Shard 2
                         β†’ Shard 3

Setup Spider

1. Install Spider

# Spider biasanya sudah included dalam MariaDB
# Cek apakah sudah tersedia
mysql -e "INSTALL SONAME 'ha_spider';"

Atau enable di my.cnf

[mysqld] plugin-load-add=ha_spider

2. Configure Spider Node

-- Spider node (frontend)
CREATE SERVER shard1
FOREIGN DATA WRAPPER mysql
OPTIONS (
    HOST '192.168.1.101',
    DATABASE 'shard_db',
    USER 'spider_user',
    PASSWORD 'password',
    PORT 3306
);

CREATE SERVER shard2 FOREIGN DATA WRAPPER mysql OPTIONS ( HOST '192.168.1.102', DATABASE 'shard_db', USER 'spider_user', PASSWORD 'password', PORT 3306 );

3. Create Sharded Table

-- Create Spider table dengan sharding
CREATE TABLE users (
    id INT,
    username VARCHAR(50),
    email VARCHAR(100),
    created_at TIMESTAMP,
-- Shard key (partitioning)
PARTITION BY LIST COLUMNS(id) (
    PARTITION p1 VALUES IN (1, 2, 3) COMMENT 'shard1',
    PARTITION p2 VALUES IN (4, 5, 6) COMMENT 'shard2'
)

) ENGINE=SPIDER
COMMENT 'wrapper "mysql", table "users"';

-- Atau dengan HASH sharding
CREATE TABLE orders (
id INT,
user_id INT,
total DECIMAL(10,2),

PARTITION BY HASH(user_id) PARTITIONS 4

) ENGINE=SPIDER;

Configuration Options

1. Spider System Variables

# my.cnf
[mysqld]
# Spider settings
spider_internal_sql_log_off = 1
spider_direct_sql = 1
spider_support_xa = 1

Connection settings

spider_connect_timeout = 10 spider_net_read_timeout = 30 spider_net_write_timeout = 30

2. Table Options

CREATE TABLE distributed_table (...) ENGINE=SPIDER
COMMENT '
    wrapper "mysql",
    srv "shard1,shard2,shard3",
    table "table_name",
    mbk "1"
';

Use Cases

1. Large Scale Data

-- Sharding untuk big data
CREATE TABLE logs (
    log_id BIGINT,
    log_time DATETIME,
    level VARCHAR(10),
    message TEXT
) ENGINE=SPIDER
PARTITION BY RANGE (YEAR(log_time)) (
    PARTITION p2022 VALUES LESS THAN (2023) COMMENT 'srv "shard1"',
    PARTITION p2023 VALUES LESS THAN (2024) COMMENT 'srv "shard2"',
    PARTITION p2024 VALUES LESS THAN (2025) COMMENT 'srv "shard3"'
);

2. Geographic Distribution

-- Sharding berdasarkan region
CREATE TABLE customers (
    customer_id INT,
    region VARCHAR(20),
    name VARCHAR(100)
) ENGINE=SPIDER
PARTITION BY LIST COLUMNS(region) (
    PARTITION asia VALUES IN ('SG', 'JP', 'ID') COMMENT 'srv "asia_shard"',
    PARTITION europe VALUES IN ('UK', 'DE', 'FR') COMMENT 'srv "eu_shard"',
    PARTITION americas VALUES IN ('US', 'CA', 'BR') COMMENT 'srv "us_shard"'
);

Query Distribution

1. Automatic Query Routing

-- Spider otomatis route query ke shard yang tepat
SELECT * FROM users WHERE id = 5;
-- Akan di-route ke shard2 (berdasarkan partitioning)

SELECT * FROM orders WHERE user_id = 123; -- Akan di-route berdasarkan hash(user_id)

2. Aggregations

-- Aggregations across shards
SELECT region, COUNT(*) as customer_count
FROM customers
GROUP BY region;

-- Spider akan query semua shards dan aggregate results

Monitoring Spider

1. Spider Status

-- Check Spider servers
SELECT * FROM mysql.servers;

-- Check Spider table status SHOW TABLE STATUS WHERE Engine = 'SPIDER';

2. Performance Monitoring

-- Spider statistics
SHOW STATUS LIKE 'Spider%';

-- Check slow queries SELECT * FROM mysql.slow_log WHERE db LIKE '%spider%' ORDER BY query_time DESC;

Best Practices

  1. Choose shard key wisely: Harus sering digunakan dalam WHERE
  2. Consistent sharding: Gunakan same shard key untuk related tables
  3. Monitor connections: Spider menggunakan banyak connections
  4. Test failover: Spider bisa handle shard failures
  5. Balance load: Monitor shard utilization

Kesimpulan

Spider memungkinkan:
– Transparent sharding tanpa application changes
– Horizontal scalability
– Geographic distribution
– High availability dengan failover

Untuk aplikasi dengan massive data dan high traffic, Spider adalah solusi sharding yang powerful.

Ditulis oleh

Hendra Wijaya

Hanya hamba Allah Ta'ala yang berusaha berbuat baik..

Tinggalkan Komentar

Email tidak akan ditampilkan.