-
Notifications
You must be signed in to change notification settings - Fork 135
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Crawler transform #797
Crawler transform #797
Conversation
…dURLs Signed-off-by: Maroun Touma <[email protected]>
Signed-off-by: Maroun Touma <[email protected]>
Signed-off-by: Maroun Touma <[email protected]>
Signed-off-by: Maroun Touma <[email protected]>
Signed-off-by: Maroun Touma <[email protected]>
Signed-off-by: Maroun Touma <[email protected]>
Signed-off-by: Maroun Touma <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@touma-I Thank you very much for making this change! This simple implementation looks good to me. I added several comments, but most of them are nitpicking.
Signed-off-by: Maroun Touma <[email protected]>
Signed-off-by: Maroun Touma <[email protected]>
Signed-off-by: Maroun Touma <[email protected]>
Signed-off-by: Maroun Touma <[email protected]>
Signed-off-by: Maroun Touma <[email protected]>
Signed-off-by: Maroun Touma <[email protected]>
Signed-off-by: Maroun Touma <[email protected]>
Signed-off-by: Maroun Touma <[email protected]>
Signed-off-by: Maroun Touma <[email protected]>
Signed-off-by: Maroun Touma <[email protected]>
Signed-off-by: Maroun Touma <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good. A few typos here and there
Signed-off-by: Maroun Touma <[email protected]>
Signed-off-by: Maroun Touma <[email protected]>
Signed-off-by: Maroun Touma <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I missed a couple of things that I see now.
Signed-off-by: Maroun Touma <[email protected]>
Signed-off-by: Maroun Touma <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
README file looks good to me.
Signed-off-by: Maroun Touma <[email protected]>
Will create a new PR to address additional comments related to processing parquet tables
Why are these changes needed?
Implement crawler transforms using the dpi-connector API. This is based on the work done by the data sift but also had to add CLI in order to integrate with python runtime. This implementation uses the new layout for the transform using module name dpk_web2parquet
Related issue number (if any).
#751