weird source code structure (maybe bug?) #593

Closed
yecol opened this issue Aug 14, 2024 · 3 comments
Labels
bug Something isn't working

Comments

@yecol
Contributor

yecol commented Aug 14, 2024

Describe the bug, including details regarding any error messages, version, and platform.

Screenshot 2024-08-14 at 4 29 59 PM

I just checked out the latest code and found something odd.
As the figure above shows, some code seems to live under the directories datasources-34 and datasources-35.

  • If it is library code, should it be in maven-projects/spark/src/main?
  • Or, if it is test-specific code, should it be in maven-projects/spark/src/test?

Component(s)

Spark

@yecol yecol added the bug Something isn't working label Aug 14, 2024
@SemyonSinchenko
Member

SemyonSinchenko commented Aug 14, 2024

@yecol Our goal is to support multiple versions of Apache Spark. Unfortunately, Spark's DataSource API is a Developer API, and it changes dramatically from one Spark version to another. Sometimes the changes are so large that reflection is not enough.

We decided to split the datasource implementation into separate Maven submodules, one per supported Spark version.

We have the following Maven profiles:

    <profiles>
        <profile>
            <id>datasources-32</id>
            <properties>
                <sbt.project.name>graphar</sbt.project.name>
                <spark.version>3.2.4</spark.version>
            </properties>
            <modules>
                <module>graphar</module>
                <module>datasources-32</module>
            </modules>
        </profile>
        <profile>
            <id>datasources-33</id>
            <properties>
                <sbt.project.name>graphar</sbt.project.name>
                <spark.version>3.3.4</spark.version>
            </properties>
            <modules>
                <module>graphar</module>
                <module>datasources-33</module>
            </modules>
        </profile>
        <profile>
            <id>datasources-34</id>
            <properties>
                <sbt.project.name>graphar</sbt.project.name>
                <spark.version>3.4.3</spark.version>
            </properties>
            <modules>
                <module>graphar</module>
                <module>datasources-34</module>
            </modules>
        </profile>
        <profile>
            <id>datasources-35</id>
            <properties>
                <sbt.project.name>graphar</sbt.project.name>
                <spark.version>3.5.1</spark.version>
            </properties>
            <modules>
                <module>graphar</module>
                <module>datasources-35</module>
            </modules>
            <activation>
                <activeByDefault>true</activeByDefault>
            </activation>
        </profile>
    </profiles>
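
To make the mapping in these profiles explicit, here is a minimal Python sketch that reads a trimmed copy of the block above and lists, for each profile, the Spark version, the modules it builds, and whether it is active by default. The function name `summarize_profiles` is hypothetical, and the embedded XML omits the Maven namespace that a real pom.xml declares, so the lookups would need adjusting for the actual file.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the <profiles> block quoted above. Assumption: no XML
# namespace, unlike a real pom.xml, so plain tag names can be used.
PROFILES_XML = """
<profiles>
  <profile>
    <id>datasources-32</id>
    <properties><spark.version>3.2.4</spark.version></properties>
    <modules><module>graphar</module><module>datasources-32</module></modules>
  </profile>
  <profile>
    <id>datasources-33</id>
    <properties><spark.version>3.3.4</spark.version></properties>
    <modules><module>graphar</module><module>datasources-33</module></modules>
  </profile>
  <profile>
    <id>datasources-34</id>
    <properties><spark.version>3.4.3</spark.version></properties>
    <modules><module>graphar</module><module>datasources-34</module></modules>
  </profile>
  <profile>
    <id>datasources-35</id>
    <properties><spark.version>3.5.1</spark.version></properties>
    <modules><module>graphar</module><module>datasources-35</module></modules>
    <activation><activeByDefault>true</activeByDefault></activation>
  </profile>
</profiles>
"""


def summarize_profiles(xml_text):
    """Return {profile id: (spark version, module names, is_default)}."""
    root = ET.fromstring(xml_text)
    summary = {}
    for profile in root.findall("profile"):
        profile_id = profile.findtext("id")
        # Collect the <properties> children by tag name.
        props = {child.tag: child.text for child in profile.find("properties")}
        modules = [m.text for m in profile.findall("modules/module")]
        # findtext returns None when the profile has no <activation> element.
        is_default = profile.findtext("activation/activeByDefault") == "true"
        summary[profile_id] = (props.get("spark.version"), modules, is_default)
    return summary


if __name__ == "__main__":
    for name, (spark, modules, is_default) in summarize_profiles(PROFILES_XML).items():
        marker = " (default)" if is_default else ""
        print(f"{name}: Spark {spark}, modules={modules}{marker}")
```

Every profile builds the shared `graphar` module plus exactly one version-specific `datasources-*` module, which is why those directories sit next to the main code rather than under src/main or src/test.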

Each of these subfolders is actually a Maven subproject.

With this approach we can build GraphAr Spark against different versions of Spark.

At the moment, our CI uses this approach to run the tests for every supported Maven profile.
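
As a rough illustration of what such a per-profile test run could look like, the Python sketch below only builds the Maven command lines and does not execute them. The helper names `maven_test_command` and `ci_matrix` are hypothetical, and the exact goals and flags used by the project's CI are an assumption; only `-P` (activate a profile), `--batch-mode`, and the `test` phase are standard Maven.

```python
# Profile ids taken from the <profiles> block in this thread.
PROFILES = ["datasources-32", "datasources-33", "datasources-34", "datasources-35"]


def maven_test_command(profile):
    """Build the argument list for testing one profile (assumed CI shape)."""
    # -P activates the named profile; "test" is the standard lifecycle phase.
    return ["mvn", "--batch-mode", "-P", profile, "test"]


def ci_matrix(profiles=PROFILES):
    """Return one Maven command per supported profile."""
    return [maven_test_command(p) for p in profiles]


if __name__ == "__main__":
    # Print the commands instead of running them, so the sketch has no
    # dependency on a local Maven installation.
    for cmd in ci_matrix():
        print(" ".join(cmd))
```

Note that Maven deactivates an `activeByDefault` profile when another profile from the same POM is selected with `-P`, so `-P datasources-34` should build against Spark 3.4.3 without also pulling in datasources-35.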

@SemyonSinchenko
Member

An alternative way to provide this support is to use tags or branches, but I prefer Maven sub-projects. At any given time about 4-5 versions of Spark are maintained, so I don't think the amount of duplicated code will grow indefinitely: Spark 3.2 will reach EoL soon, for example, so we can drop it, and so on.

@yecol
Contributor Author

yecol commented Aug 14, 2024

I see. That makes sense!
I wasn't aware that the datasource API diverges across Spark versions. Thanks for your kind and detailed response!

@yecol yecol closed this as completed Aug 15, 2024