compare result based on semantics #93

ZENOTME · 2022-10-20T05:37:25Z

Background

I'm trying to introduce INTERVAL type in Postgres-extend engine and find a problem caused by the way we test result.

In sqllogictest::AsyncDB, our run interface requires to return a string to compare with the expect result.
async fn run(&mut self, sql: &str) -> Result<String, Self::Error>

But there are some case in which the String is different but the semantics is the same.
Such asinterval '30 days'and interval ' 720:00:00', there are different string but the same semantics.

For some type(such as interval), there are different string format to express a same thing in same semantics. So do we need to add a way to compare the 'semantics' rather than compare the 'string'. I try to think a way to fix it but I don't wether it's worth or necessary. Because this case only exist in 'Interval', 'timstamptz' and other time-related type. So we can also declare the format must be equal to result from psql.

Maybe we can fix it by...

make the result can be multi-format. Such as:

async fn run(&mut self, sql: &str) -> Result<Ans, Self::Error>
enum Ans{
  str(String),
  Interval(...)
}

The text was updated successfully, but these errors were encountered:

skyzh · 2022-10-20T05:40:02Z

Why not format the timestamp types into strings?

skyzh · 2022-10-20T05:40:55Z

I'd suggest implementing a custom result type and a custom comparator in sqllogictest. It's impossible to cover all data types in this framework.

ZENOTME · 2022-10-20T05:45:56Z

Why not format the timestamp types into strings?

I'm not sure what you means. A timestamp type can be format into different string with the same semantics.

I'd suggest implementing a custom result type and a custom comparator in sqllogictest. It's impossible to cover all data types in this framework.

Sounds like a good way to solve this. I can try it.

skyzh · 2022-10-20T05:48:56Z

I'm not sure what you means. A timestamp type can be format into different string with the same semantics.

You can format it to a fixed format (e.g., represented by hours, or represented by days, not sure how we can do this) before comparing.

ZENOTME · 2022-10-20T05:51:54Z

I'm not sure what you means. A timestamp type can be format into different string with the same semantics.

You can format it to a fixed format (e.g., represented by hours, or represented by days, not sure how we can do this) before comparing.

So does it also mean '*.slt' file should use the fixed format?

skyzh · 2022-10-20T05:52:54Z

Yes. All should use days / hours, probably, if we only want to work on strings instead of interpreting the actual values.

ZENOTME · 2022-10-20T05:54:42Z

Yes. All should use days / hours, probably, if we only want to work on strings instead of interpreting the actual values.

If we can unify the '*.slt' file, this way seems to be a most simple way to solve this.

I will:

unify the '*.slt' file, it can let us introduce new type(interval,timestamptz) quickly.
try to implement custom result type later, it seems a good way for us to support new type in the future.

xxchan · 2022-10-22T16:09:07Z

To summarize, when a type has more than one representations for the same value, an engine can convert it to a canonial form. The limitation is that the test cases should also be written in canonical form, which will also make some engines imcompatible, e.g., postgres and postgres-extended. A better way is to let the engine provide a custom comparator, but cmp(&str,&str) is hard to do. It can be better to let the engine provide a result type and the comparator can be cmp(&ResT, &str). It's slightly better but still need to do some parsing. 🤔

The behavior can be tweaked by using different engines. e.g., if some users do want to test the results AS-IS, they can just use a non-canonical engine.

ZENOTME · 2022-10-23T06:38:22Z

It's slightly better but still need to do some parsing. The behavior can be tweaked by using different engines. e.g., if some users do want to test the results AS-IS, they can just use a non-canonical engine.

After introducing the custom comparator, we don't need the canonical form. We can directly make postgres-extended to be a non-canonical engine.

xxchan · 2022-10-23T13:15:19Z

After introducing the custom comparator, pg-extended is still kind of canonical.

Assume A1 has canonical form A.

// before
expected actual pg    pg-extended
A1       A1     pass  fail
A        A1     fail  pass

// after
expected actual pg    pg-extended
A1       A1     pass  pass (change here)
A        A1     fail  pass

ZENOTME · 2022-10-23T13:21:11Z

Why not change all
A A1
to
A1 A1
for the type have introduced the custom comparator.

ZENOTME · 2022-10-28T05:55:00Z

pub trait CustomResult:ToString {
    fn cmp(&self,str:&str)->bool;
}

impl CustomResult for String {
    fn cmp(&self,str:&str)->bool{
        self == str
    }
}
pub type BoxResult = Box<dyn CustomResult>;

I try to create a CustomResult, but I find a this will caused we can't use unstable_sort() in runner. The reason is we can't compare the trait object.
So I thinking about two solution:

reimplement a runner for CustomResult that can't support raw_sort, user should responsible to use 'order by'.
I think the function of raw sort is to compare the result in ignoring order. We can just compare one row of object to all row of string. Such as:

output:
  [
     1: [object1 , object2 , object3]
  ]
expect:
  [
     1: [string1, string2, string3]
     2: [string4,string5, string6]
  ]
for output_row : output{
  for expect_row : expect{
     if compare(output_row,expect_row).is_success {
       ...
       break;
     }
    ...
  }
}

ZENOTME · 2022-11-08T11:13:15Z

For sqllogictest, I think the binary format isn't appropriate for us. So I try to modify the rust-postgres to request text format result in extended query protocol and use it in extended-engine locally. It can run all the e2e test in risingwave. I try to push this modification to rust-postgres but there are no reaction now. So I think whether we can maintain a downstream version of rust-postgres first.

melgenek · 2023-01-28T23:31:22Z

Hi there!
I wanted to share one more use case for the problem that you're solving here.
Datafusion uses sqllogictest-rs to run some .slt tests to check compatibility between Datafusion and Postgres.

For now, we went the "canonical form" way. The string representation is what is compared, and both Datafusion and Postgres have to produce the same string representations.
In order to achieve the same text representation, we defined a set of typed value to string conversions for:

booleans
strings ((empty) and values)
decimals. Postgres and Datafusion sometimes use different underlying number types, which causes precision errors. So the conversions produce "canonical" rounded values.

The drawback of this approach is that I had to rewrite the postgres-extended to use these "canonical" conversions.
This approach also wouldn't work if the Postgres implementation is not based on types.

The current cross-engine cross-type logic is basically

typed result 1 -> canonical string
                                      -> compare strings
typed result 2 -> canonical string

Regarding the @ZENOTME's idea of custom comparators: the problem with them is that each engine implementation would need to know how to parse an opaque "expected result" string in order to compare it. It's a quite fragile approach because an engine cannot predict what users write in the outputs.

Additionally, the errors are now quite clear, because there are +/- showing the diffs between the expected and actual results. One would need to define both a comparator and a string representation in order to show errors.

The comparator logic also arguably seems more complicated to implement and maintain:

expected string -> parse to typed value -> compare to typed result 1
expected string -> parse to typed value -> compare to typed result 2

Defining "canonical" type representations seems like a big task. But, in general, the "canonical value" approach seems to be somewhat easier both to understand and implement for multiple engines, and multiple representations per type.

ZENOTME mentioned this issue Oct 22, 2022

support more type in postgres-extended #94

Merged

melgenek mentioned this issue Jan 6, 2023

Replace python based integration test with sqllogictest apache/datafusion#4462

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

compare result based on semantics #93

compare result based on semantics #93

ZENOTME commented Oct 20, 2022

skyzh commented Oct 20, 2022

skyzh commented Oct 20, 2022

ZENOTME commented Oct 20, 2022

skyzh commented Oct 20, 2022

ZENOTME commented Oct 20, 2022 •

edited

Loading

skyzh commented Oct 20, 2022

ZENOTME commented Oct 20, 2022 •

edited

Loading

xxchan commented Oct 22, 2022 •

edited

Loading

ZENOTME commented Oct 23, 2022

xxchan commented Oct 23, 2022 •

edited

Loading

ZENOTME commented Oct 23, 2022

ZENOTME commented Oct 28, 2022

ZENOTME commented Nov 8, 2022 •

edited

Loading

melgenek commented Jan 28, 2023

compare result based on semantics #93

compare result based on semantics #93

Comments

ZENOTME commented Oct 20, 2022

Background

Maybe we can fix it by...

skyzh commented Oct 20, 2022

skyzh commented Oct 20, 2022

ZENOTME commented Oct 20, 2022

skyzh commented Oct 20, 2022

ZENOTME commented Oct 20, 2022 • edited Loading

skyzh commented Oct 20, 2022

ZENOTME commented Oct 20, 2022 • edited Loading

xxchan commented Oct 22, 2022 • edited Loading

ZENOTME commented Oct 23, 2022

xxchan commented Oct 23, 2022 • edited Loading

ZENOTME commented Oct 23, 2022

ZENOTME commented Oct 28, 2022

ZENOTME commented Nov 8, 2022 • edited Loading

melgenek commented Jan 28, 2023

ZENOTME commented Oct 20, 2022 •

edited

Loading

ZENOTME commented Oct 20, 2022 •

edited

Loading

xxchan commented Oct 22, 2022 •

edited

Loading

xxchan commented Oct 23, 2022 •

edited

Loading

ZENOTME commented Nov 8, 2022 •

edited

Loading