Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

How Does ShardingSphere-Proxy Implement Custom Sharding Algorithms for Large Numbers of Tables? #34467

Open
cwhfox opened this issue Jan 24, 2025 · 5 comments

Comments

@cwhfox
Copy link

cwhfox commented Jan 24, 2025

Question

The current business is as follows:

  1. Each project has nearly 100 tables, and the table names of the projects are: t_{project code}_name, t_{project code}_card, t_{project code}_order, etc.
    Each project's table is in the same database, and all tables contain the field {project code}: code
  2. There are nearly 30,000 projects running now, and the projects are randomly distributed in 60 databases, and can only be read from the specified file or query from project management database
  3. The {project code} is randomly distributed from 1 to 999999999

THE QUESTION IS:
WHEN I CONFIGURE actualDataNodes AS ds_${0..60}.t_${1..999999999}_order, I CAN NOT START SHARDINGSPHERE AND IT THROW EXCAPTION OutOfMemory
Seems that when it started, all tables were loaded into memory using groovy inline syntax. HOW SHOULD I CONFIGURE IT?

database-sharding.yaml:

databaseName: sharding_db

dataSources:
  ds_0:
    url: jdbc:mysql://127.0.0.1:3306/test1?useUnicode=true&characterEncoding=utf-8&useSSL=false&serverTimezone=GMT%2B8&zeroDateTimeBehavior=CONVERT_TO_NULL
    username: 
    password: 
  ds_1:
    url: jdbc:mysql://127.0.0.1:3306/test2?useUnicode=true&characterEncoding=utf-8&useSSL=false&serverTimezone=GMT%2B8&zeroDateTimeBehavior=CONVERT_TO_NULL
    username: 
    password: 
# HERE WILL CONFIGURE 60 DATABASES...
  ds_60:
    url: jdbc:mysql://127.0.0.1:3306/test3?useUnicode=true&characterEncoding=utf-8&useSSL=false&serverTimezone=GMT%2B8&zeroDateTimeBehavior=CONVERT_TO_NULL
    username: 
    password: 


rules:
- !SHARDING
  tables:
    t_order:
      actualDataNodes: ds_${0..60}.t_${1..999999999}_order
    t_card:
      actualDataNodes: ds_${0..60}.t_${1..999999999}_card
# HERE WILL CONFIGURE 100 TABLES...
    t_name:
      actualDataNodes: ds_${0..60}.t_${1..999999999}_name

# DEFAULT SHARDING STRATEGY
  defaultShardingColumn: code
  defaultDatabaseStrategy:
    standard:
      shardingColumn: code
      shardingAlgorithmName: database-algorithm
  defaultTableStrategy:
    standard:
      shardingColumn: code
      shardingAlgorithmName: table-algorithm

# CUSTOM SHARDING ALGORITHMS
  shardingAlgorithms:
    database-algorithm:
      type: CLASS_BASED
      props:
        strategy: STANDARD
        algorithmClassName: com.example.DatabaseShardingAlgorithm
        config-path: /opt/shardingsphere/conf/code-mapping.properties
    table-algorithm:
      type: CLASS_BASED
      props:
        strategy: STANDARD
        algorithmClassName: com.example.TableShardingAlgorithm

com.example.DatabaseShardingAlgorithm:

package com.example;

import lombok.extern.slf4j.Slf4j;
import org.apache.shardingsphere.sharding.api.sharding.standard.StandardShardingAlgorithm;
import org.apache.shardingsphere.sharding.api.sharding.standard.PreciseShardingValue;
import org.apache.shardingsphere.sharding.api.sharding.standard.RangeShardingValue;

import java.io.FileInputStream;
import java.io.IOException;
import java.util.*;

@Slf4j
public class DatabaseShardingAlgorithm implements StandardShardingAlgorithm<String> {
    private final Map<String, String> mapping = new HashMap<>();

    @Override
    public void init(Properties properties) {
        log.info("DatabaseShardingAlgorithm init, properties:{}", properties);
        String configPath = properties.getProperty("config-path");
        try (FileInputStream fis = new FileInputStream(configPath)) {
            Properties mappingProps = new Properties();
            mappingProps.load(fis);
            mappingProps.forEach((key, value) -> mapping.put(key.toString(), value.toString()));
        } catch (IOException e) {
            log.error("Failed to load code mapping config", e);
            throw new RuntimeException("Failed to load code mapping config", e);
        }
        log.info("DatabaseShardingAlgorithm init, mapping:{}", mapping);
    }

    @Override
    public String doSharding(Collection<String> availableTargetNames, PreciseShardingValue<String> shardingValue) {
        log.info("DatabaseShardingAlgorithm code sharding, names:{}, value:{}", availableTargetNames, shardingValue);
        String code = shardingValue.getValue();
        String dbNumber = mapping.get(code);
        if (dbNumber == null) {
            throw new IllegalArgumentException("Cannot find mapping for code: " + code);
        }

        String dsName = "ds_" + dbNumber;
        log.info("DatabaseShardingAlgorithm code sharding, names:{}, value:{}, database:{}", availableTargetNames, shardingValue, dsName);
        return dsName;
    }

    @Override
    public Collection<String> doSharding(Collection<String> availableTargetNames, RangeShardingValue<String> shardingValue) {
        log.error("Range sharding is not supported for code, names:{}, value:{}", availableTargetNames, shardingValue);
        throw new UnsupportedOperationException("Range sharding is not supported for code");
    }

    @Override
    public String getType() {
        return "CODE_DATABASE";
    }
} 

com.example.TableShardingAlgorithm:

package com.example;

import lombok.extern.slf4j.Slf4j;
import org.apache.shardingsphere.sharding.api.sharding.standard.StandardShardingAlgorithm;
import org.apache.shardingsphere.sharding.api.sharding.standard.PreciseShardingValue;
import org.apache.shardingsphere.sharding.api.sharding.standard.RangeShardingValue;

import java.util.*;

@Slf4j
public class TableShardingAlgorithm implements StandardShardingAlgorithm<String> {

    @Override
    public void init(Properties properties) {
        log.info("TableShardingAlgorithm init:{}", properties);
    }

    @Override
    public String doSharding(Collection<String> availableTargetNames, PreciseShardingValue<String> shardingValue) {
        log.info("TableShardingAlgorithm code sharding, names:{}, value:{}", availableTargetNames, shardingValue);
        String code = shardingValue.getValue();

        String tableName = shardingValue.getLogicTableName();
        String table = tableName.replaceFirst("_", ("_" + code + "_"));
        log.info("TableShardingAlgorithm code sharding, names:{}, value:{}, table:{}", availableTargetNames, shardingValue, table);
        return table;
    }

    @Override
    public Collection<String> doSharding(Collection<String> availableTargetNames, RangeShardingValue<String> shardingValue) {
        log.error("Range sharding is not supported for code, names:{}, value:{}", availableTargetNames, shardingValue);
        throw new UnsupportedOperationException("Range sharding is not supported for code");
    }

    @Override
    public String getType() {
        return "CODE_TABLE";
    }
} 

@cwhfox
Copy link
Author

cwhfox commented Jan 24, 2025

and one more question: how cloud i reload sharding mapping?

@terrymanu
Copy link
Member

How could there be so many tables with such a large amount of data?
Is this a realistic problem based on real-world scenarios?

@cwhfox
Copy link
Author

cwhfox commented Jan 26, 2025

How could there be so many tables with such a large amount of data? Is this a realistic problem based on real-world scenarios?

Thanks for reply @terrymanu
Yep, of course it is based on existing business, our business use project-based data sharding and has utilized an unmaintained open-source database middleware with custom development. This business has been stable for 5 years.
Due to high maintenance difficulties with the existing middleware, I want to use shardingsphere-proxy as the new database proxy, but I've encountered development challenges.
If it's possible to implement an enumeration sharding algorithm for any range, it would likely resolve this issue.
The enumeration sharding algorithm includes using hints for DDL sharding and DML sharding.

@terrymanu
Copy link
Member

terrymanu commented Jan 26, 2025

Open source does not provide services and guarantees for commercial companies.
For commercial use, it is recommended that you contact the commercial company SphereEX(https://www.sphere-ex.com).

@cwhfox
Copy link
Author

cwhfox commented Jan 26, 2025

Open source does not provide services and guarantees for commercial companies. For commercial use, it is recommended that you contact the commercial company SphereEX(https://www.sphere-ex.com).

I don't think there's any problem with commercial companies using open-source code as system components. On the contrary, if the open-source community is active enough, it can actually help ensure the stability of the relevant features.
Of course, this is unrelated to the issue at hand.
Returning to the issue itself, how can the custom sharding algorithm provided by ss implement enumeration-based sharding?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants