[Improvement] Add methods for Spark Reader and improve the performance #86

Closed
lixueclaire opened this issue Feb 2, 2023 · 0 comments · Fixed by #87
Labels: Component:Documentation, enhancement, improvement

lixueclaire commented Feb 2, 2023

Is your feature request related to a problem? Please describe.

  1. Support reading multiple property groups in VertexReader and EdgeReader.
  2. Optimize the Spark Reader for reading multiple property groups simultaneously, and preserve the order of rows in the resulting DataFrame.
  3. When reading multiple chunks simultaneously, add indices by default.
  4. Update the examples and related documentation.
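
To illustrate item 3, here is a minimal sketch in plain Python (not the Spark Reader itself) of what adding indices by default could look like: rows from several chunks are read in chunk order and each row is tagged with a running index, so the original order can be recovered later. The function name `read_chunks_with_index` and the column name `_index` are hypothetical and do not come from GraphAr.

```python
from itertools import count


def read_chunks_with_index(chunks, index_name="_index"):
    """Flatten chunks in order and tag every row with a running index.

    `chunks` is a list of chunks; each chunk is a list of row dicts.
    `index_name` is a hypothetical column name used only for illustration.
    """
    counter = count()  # yields 0, 1, 2, ... across all chunks
    rows = []
    for chunk in chunks:
        for row in chunk:
            rows.append({index_name: next(counter), **row})
    return rows


# Two chunks of one property group: the first holds two rows, the second one.
chunks = [
    [{"name": "a"}, {"name": "b"}],
    [{"name": "c"}],
]
rows = read_chunks_with_index(chunks)
```

The index continues across chunk boundaries, so the row read from the second chunk gets index 2 rather than restarting at 0.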

Describe the solution you'd like

  1. Add methods to VertexReader and EdgeReader that accept a list of property groups and read the related chunks.
  2. Currently, multiple property groups are combined by adding indices and joining the resulting DataFrames. We would like to concatenate the DataFrames row by row instead, without repartitioning or shuffling.
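
The difference between the two approaches in item 2 can be sketched with plain Python lists standing in for partitioned DataFrames. This is an illustrative model under stated assumptions, not the Spark Reader implementation: `concat_by_join`, `concat_by_position`, and the `_index` column are hypothetical names, and both inputs are assumed to be split into the same partitions in the same row order.

```python
def concat_by_join(left, right, key="_index"):
    """Index-and-join style: match rows of two flat tables on an index column."""
    lookup = {row[key]: row for row in right}
    return [{**row, **lookup[row[key]]} for row in left]


def concat_by_position(left_parts, right_parts):
    """Row-by-row style: zip aligned partitions without moving any rows."""
    if len(left_parts) != len(right_parts):
        raise ValueError("partition counts differ")
    merged = []
    for left_rows, right_rows in zip(left_parts, right_parts):
        if len(left_rows) != len(right_rows):
            raise ValueError("partition sizes differ")
        # Rows are paired purely by position inside the partition.
        merged.append([{**l, **r} for l, r in zip(left_rows, right_rows)])
    return merged


# Two property groups of the same three vertices, partitioned identically.
group_a = [[{"id": 0}, {"id": 1}], [{"id": 2}]]
group_b = [[{"age": 30}, {"age": 25}], [{"age": 41}]]

# Row-by-row concatenation keeps the partition layout and the row order.
merged = concat_by_position(group_a, group_b)

# Index-and-join version of the same merge, on flattened tables.
flat_a = [{"_index": i, **r} for i, r in enumerate(r for p in group_a for r in p)]
flat_b = [{"_index": i, **r} for i, r in enumerate(r for p in group_b for r in p)]
joined = concat_by_join(flat_a, flat_b)
```

Both paths produce the same combined rows, but the positional version never looks rows up by key, which is the property that avoids a shuffle. It only works when the partitions line up exactly; Spark's `RDD.zip` has the same precondition (equal partition counts and equal element counts per partition).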


lixueclaire added the improvement label on Feb 2, 2023
lixueclaire moved this to In Progress in alibaba/GraphAr on Feb 3, 2023
lixueclaire changed the title from "[Improvement] Improve the performance of Spark Reader" to "[Improvement] Add methods for Spark Reader and improve the performance" on Feb 3, 2023
lixueclaire self-assigned this on Feb 3, 2023
lixueclaire added the Component:Documentation and enhancement labels on Feb 3, 2023
lixueclaire moved this from In Progress to In Review in alibaba/GraphAr on Feb 7, 2023
lixueclaire moved this from In Review to In Progress in alibaba/GraphAr on Feb 7, 2023
lixueclaire moved this from In Progress to In Review in alibaba/GraphAr on Feb 7, 2023
acezen closed this as completed in #87 on Feb 7, 2023
github-project-automation bot moved this from In Review to Done in alibaba/GraphAr on Feb 7, 2023