Contributor

@kasakrisz kasakrisz commented Oct 22, 2025

What changes were proposed in this pull request?

Support deleting table and partition column statistics when a column name contains special characters.

Why are the changes needed?

Special characters such as ' are allowed in column names, but this character also has a meaning in SQL.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

mvn test -Dtest=TestObjectStore#testLockDbTableThrowsExceptionWhenTableIsNotAllowedToLock,TestObjectStore#testTableStatisticsOps,TestObjectStore#testDeleteTableColumnStatisticsWhenEngineHasSpecialCharacter,TestObjectStore#testPartitionStatisticsOps,TestObjectStore#testDeletePartitionColumnStatisticsWhenEngineHasSpecialCharacter -pl standalone-metastore/metastore-server

Member

@zabetak zabetak left a comment

Changes LGTM, just optional nits.


public void lockDbTable(String tableName) throws MetaException {
if (!ALLOWED_TABLES_TO_LOCK.contains(tableName)) {
throw new MetaException("Error while locking table " + tableName);

Member

Let's add a test case for the exception if it's easy/possible.

Contributor

I'd like a Java comment here because I can't work out why we should throw an exception.

Contributor Author

Added test case and comment
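
For readers following the thread, here is a minimal sketch of an allow-list guard with the kind of explanatory comment requested above. It assumes the table name ends up inside SQL text, where a `?` placeholder cannot be used; that rationale, the class name `LockAllowListSketch`, the table name `EXAMPLE_SEQUENCE`, the lock statement, and the use of `IllegalArgumentException` in place of `MetaException` are all assumptions for illustration, not the code from this PR.

```java
import java.util.Set;

final class LockAllowListSketch {
    // Hypothetical allow-list; the real contents of ALLOWED_TABLES_TO_LOCK
    // are not shown in this thread.
    static final Set<String> ALLOWED_TABLES_TO_LOCK = Set.of("EXAMPLE_SEQUENCE");

    // Builds an illustrative lock statement (the real statement may differ).
    static String buildLockSql(String tableName) {
        // Assumed rationale: identifiers cannot be bound as '?' parameters,
        // so the name is spliced into the SQL text. Only names from a fixed
        // allow-list are accepted, so arbitrary input never reaches the query.
        if (!ALLOWED_TABLES_TO_LOCK.contains(tableName)) {
            // Stand-in for MetaException, to keep the sketch self-contained.
            throw new IllegalArgumentException("Error while locking table " + tableName);
        }
        return "select * from \"" + tableName + "\" for update";
    }

    public static void main(String[] args) {
        System.out.println(buildLockSql("EXAMPLE_SEQUENCE"));
        try {
            buildLockSql("X\" for update; drop table \"Y");
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

A test for the exception path then only needs to pass a name outside the allow-list and assert that the call is rejected.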


try (QueryWrapper queryParams = new QueryWrapper(pm.newQuery("javax.jdo.query.SQL", deleteSql))) {
executeWithArray(queryParams.getInnerQuery(), params.toArray(), deleteSql);
} catch (MetaException e) {

Member

Catching a MetaException and rethrowing it as a MetaException seems redundant and will make the stack trace harder to follow. Since the method already throws MetaException, can we simply remove the catch block?

Contributor Author

removed the catch
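
The effect on the stack trace can be sketched as follows. This assumes the removed catch block built a new exception from the caught one's message, which the thread does not show; `MetaExceptionStandIn`, `QueryStandIn`, and the method names are hypothetical stand-ins, not Hive classes.

```java
final class RethrowSketch {
    // Hypothetical stand-in for MetaException.
    static class MetaExceptionStandIn extends Exception {
        MetaExceptionStandIn(String message) {
            super(message);
        }
    }

    // Hypothetical stand-in for the closeable query wrapper.
    static class QueryStandIn implements AutoCloseable {
        @Override
        public void close() {
            // Nothing to release in this sketch.
        }
    }

    // Simulates a failing query execution.
    static void execute(String sql) throws MetaExceptionStandIn {
        throw new MetaExceptionStandIn("failed: " + sql);
    }

    // Redundant variant: the new exception's trace starts here, not in execute().
    static void deleteWithCatch(String sql) throws MetaExceptionStandIn {
        try (QueryStandIn query = new QueryStandIn()) {
            execute(sql);
        } catch (MetaExceptionStandIn e) {
            throw new MetaExceptionStandIn(e.getMessage());
        }
    }

    // Simplified variant: the original exception propagates untouched,
    // and the resource is still closed by try-with-resources.
    static void deleteWithoutCatch(String sql) throws MetaExceptionStandIn {
        try (QueryStandIn query = new QueryStandIn()) {
            execute(sql);
        }
    }

    // Returns the method name at the top of the thrown exception's stack trace.
    static String topFrame(boolean withCatch) {
        try {
            if (withCatch) {
                deleteWithCatch("delete ...");
            } else {
                deleteWithoutCatch("delete ...");
            }
            return "no exception";
        } catch (MetaExceptionStandIn e) {
            return e.getStackTrace()[0].getMethodName();
        }
    }

    public static void main(String[] args) {
        System.out.println(topFrame(true));  // deleteWithCatch
        System.out.println(topFrame(false)); // execute
    }
}
```

Without the catch block, the trace still points at the call that actually failed.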

Contributor

@okumin okumin left a comment

Looks good to me! Thank you

  if (engine != null) {
-   deleteSql += " and \"ENGINE\" = '" + engine + "'";
+   deleteSql += " and \"ENGINE\" = ?";
+   params.add(engine);

Contributor

Can we add a test case that passes a harmful value?

Contributor Author

Added a test case for an engine with the special character ' in its name.
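
To illustrate why a value containing ' matters, here is a minimal sketch that compares the concatenated and the parameterized way of adding the engine filter. It only builds the SQL text and the parameter list and does not talk to a database; the class name `EngineFilterSketch`, the helper names, the table name `EXAMPLE_STATS`, and the sample value `hi've` are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class EngineFilterSketch {
    // Concatenation: the raw value becomes part of the SQL text.
    static String concatenated(String baseSql, String engine) {
        return baseSql + " and \"ENGINE\" = '" + engine + "'";
    }

    // Placeholder: the SQL text stays fixed and the value travels separately.
    static String parameterized(String baseSql, String engine, List<Object> params) {
        params.add(engine);
        return baseSql + " and \"ENGINE\" = ?";
    }

    // Counts single quotes; an odd count means an unterminated string literal.
    static long countQuotes(String sql) {
        return sql.chars().filter(c -> c == '\'').count();
    }

    public static void main(String[] args) {
        String base = "delete from \"EXAMPLE_STATS\" where \"TBL_ID\" = ?";
        String engine = "hi've";

        // The quote inside the value closes the literal early.
        System.out.println(concatenated(base, engine));

        List<Object> params = new ArrayList<>();
        params.add(42L); // value for the existing "TBL_ID" placeholder
        System.out.println(parameterized(base, engine, params));
        System.out.println(params);
    }
}
```

With the placeholder, the statement text is the same for every engine value, so the quote is treated as data rather than as SQL syntax.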

List<Long> partitionIds = getPartitionIdsViaSqlFilter(catName, dbName, tblName, sqlFilter,
input, Collections.emptyList(), -1);
if (!partitionIds.isEmpty()) {
String deleteSql = "delete from " + PART_COL_STATS + " where \"PART_ID\" in ( " + getIdListForIn(partitionIds) + ")";

Contributor

This is purely a question; I'm not requesting any change in this pull request. Could we replace getIdListForIn with placeholders in this class at some point? In my opinion, it is better to avoid string concatenation altogether. Otherwise, new issues might easily slip through code review.

Member

The partitionIds are returned as Long values, and getIdListForIn(partitionIds) essentially calls Long.toString(), which should output only digits, so no special characters can be introduced here. So I think getIdListForIn(partitionIds) is safe.

Contributor

I agree getIdListForIn((Collection<Long>) partitionIds) is safe, and as a short-term remediation we don't need to update it in this patch. As a long-term remediation, I'd say we should not make a habit of concatenating strings, since a single misuse combined with a careless review can introduce a new vulnerability.
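
The long-term idea discussed above could look roughly like the sketch below, which generates one `?` per ID and passes the IDs as parameters instead of rendering them into the SQL text. The class name `InListPlaceholderSketch` and both helper names are hypothetical and are not part of the class under review; the table name is still concatenated here because it is assumed to be a constant.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class InListPlaceholderSketch {
    // Returns "?,?,...,?" with exactly `count` placeholders.
    static String placeholders(int count) {
        if (count <= 0) {
            throw new IllegalArgumentException("IN list must not be empty");
        }
        return String.join(",", Collections.nCopies(count, "?"));
    }

    // Builds the delete statement and appends the IDs to the parameter list.
    static String buildDelete(String table, List<Long> partitionIds, List<Object> params) {
        String sql = "delete from " + table
            + " where \"PART_ID\" in (" + placeholders(partitionIds.size()) + ")";
        params.addAll(partitionIds);
        return sql;
    }

    public static void main(String[] args) {
        List<Object> params = new ArrayList<>();
        String sql = buildDelete("\"PART_COL_STATS\"", List.of(1L, 2L, 3L), params);
        System.out.println(sql);
        System.out.println(params);
    }
}
```

One trade-off to keep in mind: many databases cap the number of bind parameters or IN-list entries per statement, so a placeholder-based version would likely need to batch large ID lists.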


Member

@simhadri-g simhadri-g left a comment

LGTM +1

sonarqubecloud bot commented Nov 3, 2025

@kasakrisz kasakrisz merged commit c18d0df into apache:master Nov 3, 2025
4 checks passed
Member

deniskuzZ commented Nov 5, 2025

Is there a JIRA ID? If so, could you please link this PR manually?
