- 
                Notifications
    You must be signed in to change notification settings 
- Fork 3
feat(joins): added dataset and columns registry #127
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
ceec035    to
    37d25ff      
    Compare
  
    | sourceType: ColumnType, | ||
| targetType: ColumnType | ||
| ): number { | ||
| return this['getTypeCompatibilityScore'](sourceType, targetType); | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
any reason for the 'getTypeCompatibilityScore' syntax?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes as it a private method I am not able to access it directly via this
This is only done to test the funtion
| if (compatibility.totalScore === 0) { | ||
| throw new Error('Columns are not compatible for joining'); | ||
| } | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
are we computing the join path here? Should not just join based on whatever the user has selected?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure if we actually require this funtion because as soon as the user selects some column form the findCompatibleColumns function the consumer would have the join path
{
  sourceDatasetId: string;
  sourceColumnName: string;
  destinationDatasetId: string;
  destinationColumnName: string;
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have removed the same will add in future
| }); | ||
| } | ||
|  | ||
| public doesJoinPathExist({ | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should this return the joinPath if it exists, otherwise undefined? Would be more reusable that way
| typeScore, | ||
| nameScore, | ||
| schemaScore, | ||
| totalScore: typeScore + nameScore + schemaScore, | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
generally if the values can be derived from the return type, then there is not need to precompute it. totalScore is already represented by the returned values of typeScore, nameScore, schemaScore,
        
          
                package.json
              
                Outdated
          
        
      | "duckdb": "^0.10.2", | ||
| "express": "^4.19.2", | ||
| "fake-indexeddb": "^5.0.1", | ||
| "fast-deep-equal": "3.1.3", | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
are we using this anywhere?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes I am using this to check if the columns schema is same for any two columns
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we. use lodash instead?
| Can we also update the meerkat-core and meerkat-browser version | 
| @@ -0,0 +1,126 @@ | |||
| import * as deepEqual from 'fast-deep-equal'; | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is is possible to use lodash.deepEqual?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we already have the package, i dont want to add another one for just fast equals, we can change the package if it become a bottleneck in the future, for now lodash should be fine
| export interface Column<T = object> { | ||
| name: string; | ||
| dataType: ColumnType; | ||
| schema?: T; | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should schema have a defined type?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Schema for us to would be DevRevSchema but for other consumer it might not be the same thats why we have defined it as configurable. Like they might follow their own filter format
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO defined type would be better here. Do we have functional support for handling generics?
|  | ||
| export interface Column<T = object> { | ||
| name: string; | ||
| dataType: ColumnType; | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
column shuold be just DimensionType right?
448649b    to
    09ed48c      
    Compare
  
    1f3740b    to
    206ffbb      
    Compare
  
    
This DatasetRegistry and ColumnCompatibilityAnalyzer functions should help us to power joins and show column suggestions based on schema