How Does ShardingSphere-Proxy Implement Custom Sharding Algorithms for Large Numbers of Tables? #34467

cwhfox · 2025-01-24T08:11:25Z

Question

The current business is as follows:

Each project has nearly 100 tables, and the table names of the projects are: t_{project code}_name, t_{project code}_card, t_{project code}_order, etc.
Each project's table is in the same database, and all tables contain the field {project code}: code
There are nearly 30,000 projects running now, and the projects are randomly distributed in 60 databases, and can only be read from the specified file or query from project management database
The {project code} is randomly distributed from 1 to 999999999

THE QUESTION IS:
WHEN I CONFIGURE actualDataNodes AS ds_${0..60}.t_${1..999999999}_order, I CAN NOT START SHARDINGSPHERE AND IT THROW EXCAPTION OutOfMemory
Seems that when it started, all tables were loaded into memory using groovy inline syntax. HOW SHOULD I CONFIGURE IT?

database-sharding.yaml:

databaseName: sharding_db

dataSources:
  ds_0:
    url: jdbc:mysql://127.0.0.1:3306/test1?useUnicode=true&characterEncoding=utf-8&useSSL=false&serverTimezone=GMT%2B8&zeroDateTimeBehavior=CONVERT_TO_NULL
    username: 
    password: 
  ds_1:
    url: jdbc:mysql://127.0.0.1:3306/test2?useUnicode=true&characterEncoding=utf-8&useSSL=false&serverTimezone=GMT%2B8&zeroDateTimeBehavior=CONVERT_TO_NULL
    username: 
    password: 
# HERE WILL CONFIGURE 60 DATABASES...
  ds_60:
    url: jdbc:mysql://127.0.0.1:3306/test3?useUnicode=true&characterEncoding=utf-8&useSSL=false&serverTimezone=GMT%2B8&zeroDateTimeBehavior=CONVERT_TO_NULL
    username: 
    password: 


rules:
- !SHARDING
  tables:
    t_order:
      actualDataNodes: ds_${0..60}.t_${1..999999999}_order
    t_card:
      actualDataNodes: ds_${0..60}.t_${1..999999999}_card
# HERE WILL CONFIGURE 100 TABLES...
    t_name:
      actualDataNodes: ds_${0..60}.t_${1..999999999}_name

# DEFAULT SHARDING STRATEGY
  defaultShardingColumn: code
  defaultDatabaseStrategy:
    standard:
      shardingColumn: code
      shardingAlgorithmName: database-algorithm
  defaultTableStrategy:
    standard:
      shardingColumn: code
      shardingAlgorithmName: table-algorithm

# CUSTOM SHARDING ALGORITHMS
  shardingAlgorithms:
    database-algorithm:
      type: CLASS_BASED
      props:
        strategy: STANDARD
        algorithmClassName: com.example.DatabaseShardingAlgorithm
        config-path: /opt/shardingsphere/conf/code-mapping.properties
    table-algorithm:
      type: CLASS_BASED
      props:
        strategy: STANDARD
        algorithmClassName: com.example.TableShardingAlgorithm

com.example.DatabaseShardingAlgorithm:

package com.example;

import lombok.extern.slf4j.Slf4j;
import org.apache.shardingsphere.sharding.api.sharding.standard.StandardShardingAlgorithm;
import org.apache.shardingsphere.sharding.api.sharding.standard.PreciseShardingValue;
import org.apache.shardingsphere.sharding.api.sharding.standard.RangeShardingValue;

import java.io.FileInputStream;
import java.io.IOException;
import java.util.*;

@Slf4j
public class DatabaseShardingAlgorithm implements StandardShardingAlgorithm<String> {
    private final Map<String, String> mapping = new HashMap<>();

    @Override
    public void init(Properties properties) {
        log.info("DatabaseShardingAlgorithm init, properties:{}", properties);
        String configPath = properties.getProperty("config-path");
        try (FileInputStream fis = new FileInputStream(configPath)) {
            Properties mappingProps = new Properties();
            mappingProps.load(fis);
            mappingProps.forEach((key, value) -> mapping.put(key.toString(), value.toString()));
        } catch (IOException e) {
            log.error("Failed to load code mapping config", e);
            throw new RuntimeException("Failed to load code mapping config", e);
        }
        log.info("DatabaseShardingAlgorithm init, mapping:{}", mapping);
    }

    @Override
    public String doSharding(Collection<String> availableTargetNames, PreciseShardingValue<String> shardingValue) {
        log.info("DatabaseShardingAlgorithm code sharding, names:{}, value:{}", availableTargetNames, shardingValue);
        String code = shardingValue.getValue();
        String dbNumber = mapping.get(code);
        if (dbNumber == null) {
            throw new IllegalArgumentException("Cannot find mapping for code: " + code);
        }

        String dsName = "ds_" + dbNumber;
        log.info("DatabaseShardingAlgorithm code sharding, names:{}, value:{}, database:{}", availableTargetNames, shardingValue, dsName);
        return dsName;
    }

    @Override
    public Collection<String> doSharding(Collection<String> availableTargetNames, RangeShardingValue<String> shardingValue) {
        log.error("Range sharding is not supported for code, names:{}, value:{}", availableTargetNames, shardingValue);
        throw new UnsupportedOperationException("Range sharding is not supported for code");
    }

    @Override
    public String getType() {
        return "CODE_DATABASE";
    }
}

com.example.TableShardingAlgorithm:

package com.example;

import lombok.extern.slf4j.Slf4j;
import org.apache.shardingsphere.sharding.api.sharding.standard.StandardShardingAlgorithm;
import org.apache.shardingsphere.sharding.api.sharding.standard.PreciseShardingValue;
import org.apache.shardingsphere.sharding.api.sharding.standard.RangeShardingValue;

import java.util.*;

@Slf4j
public class TableShardingAlgorithm implements StandardShardingAlgorithm<String> {

    @Override
    public void init(Properties properties) {
        log.info("TableShardingAlgorithm init:{}", properties);
    }

    @Override
    public String doSharding(Collection<String> availableTargetNames, PreciseShardingValue<String> shardingValue) {
        log.info("TableShardingAlgorithm code sharding, names:{}, value:{}", availableTargetNames, shardingValue);
        String code = shardingValue.getValue();

        String tableName = shardingValue.getLogicTableName();
        String table = tableName.replaceFirst("_", ("_" + code + "_"));
        log.info("TableShardingAlgorithm code sharding, names:{}, value:{}, table:{}", availableTargetNames, shardingValue, table);
        return table;
    }

    @Override
    public Collection<String> doSharding(Collection<String> availableTargetNames, RangeShardingValue<String> shardingValue) {
        log.error("Range sharding is not supported for code, names:{}, value:{}", availableTargetNames, shardingValue);
        throw new UnsupportedOperationException("Range sharding is not supported for code");
    }

    @Override
    public String getType() {
        return "CODE_TABLE";
    }
}

The text was updated successfully, but these errors were encountered:

cwhfox · 2025-01-24T08:23:35Z

and one more question: how cloud i reload sharding mapping?

terrymanu · 2025-01-25T05:45:17Z

How could there be so many tables with such a large amount of data?
Is this a realistic problem based on real-world scenarios?

cwhfox · 2025-01-26T03:00:05Z

How could there be so many tables with such a large amount of data? Is this a realistic problem based on real-world scenarios?

Thanks for reply @terrymanu
Yep, of course it is based on existing business, our business use project-based data sharding and has utilized an unmaintained open-source database middleware with custom development. This business has been stable for 5 years.
Due to high maintenance difficulties with the existing middleware, I want to use shardingsphere-proxy as the new database proxy, but I've encountered development challenges.
If it's possible to implement an enumeration sharding algorithm for any range, it would likely resolve this issue.
The enumeration sharding algorithm includes using hints for DDL sharding and DML sharding.

terrymanu · 2025-01-26T06:34:17Z

Open source does not provide services and guarantees for commercial companies.
For commercial use, it is recommended that you contact the commercial company SphereEX(https://www.sphere-ex.com).

cwhfox · 2025-01-26T07:00:32Z

Open source does not provide services and guarantees for commercial companies. For commercial use, it is recommended that you contact the commercial company SphereEX(https://www.sphere-ex.com).

I don't think there's any problem with commercial companies using open-source code as system components. On the contrary, if the open-source community is active enough, it can actually help ensure the stability of the relevant features.
Of course, this is unrelated to the issue at hand.
Returning to the issue itself, how can the custom sharding algorithm provided by ss implement enumeration-based sharding?

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

How Does ShardingSphere-Proxy Implement Custom Sharding Algorithms for Large Numbers of Tables? #34467

How Does ShardingSphere-Proxy Implement Custom Sharding Algorithms for Large Numbers of Tables? #34467

cwhfox commented Jan 24, 2025

cwhfox commented Jan 24, 2025

terrymanu commented Jan 25, 2025

cwhfox commented Jan 26, 2025

terrymanu commented Jan 26, 2025 •

edited

Loading

cwhfox commented Jan 26, 2025

How Does ShardingSphere-Proxy Implement Custom Sharding Algorithms for Large Numbers of Tables? #34467

How Does ShardingSphere-Proxy Implement Custom Sharding Algorithms for Large Numbers of Tables? #34467

Comments

cwhfox commented Jan 24, 2025

Question

cwhfox commented Jan 24, 2025

terrymanu commented Jan 25, 2025

cwhfox commented Jan 26, 2025

terrymanu commented Jan 26, 2025 • edited Loading

cwhfox commented Jan 26, 2025

terrymanu commented Jan 26, 2025 •

edited

Loading