Conversation

@nilesh-c
Member

This was done to convert all remaining File-specific code to use Path. That is necessary because we have to support Hadoop file systems (HDFS etc.) equally well - already discussed.

As for this particular change, I personally feel this is getting pretty ugly because the client code has to juggle between `DistConfig` and `Config` when dealing with `mappingsDir`, `ontologyFile` and `dumpDir`. But this is supposed to be handled behind the scenes by the config object.

I propose to reject this PR and combine Config and DistConfig into one single DistConfig. I'll make a new PR. Here's what DistConfig will look like:

/**
 * Class for distributed configuration
 */
class DistConfig(configProps: Properties, config: Config)
{
    // add all Spark- and Hadoop-specific stuff here

    // dumpDir, mappingsDir, ontologyFile => do the fallback-to-config logic
    // here and delegate to config accordingly

    // for the rest of the config.* methods, simply delegate; act as a wrapper
}

Dump dirs and dump files will be read from DistConfig if defined; otherwise, the corresponding properties from Config will be used.
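The fallback described above might be sketched roughly like this. This is only an illustration of the delegation idea, not actual DBpedia code: `Path` here is a stand-in case class for Hadoop's `org.apache.hadoop.fs.Path` so the snippet is self-contained, and the `Config` fields are hypothetical.

```scala
// Stand-in for org.apache.hadoop.fs.Path, so this sketch compiles on its own.
case class Path(uri: String)

// Minimal stand-in for the existing File-based Config (field names hypothetical).
class Config(val dumpDir: String, val ontologyFile: String)

// Proposed wrapper: distributed settings win when defined, else fall back to Config.
class DistConfig(distDumpDir: Option[Path], val config: Config)
{
    // Fallback logic: use the distributed path if set, else derive one from Config.
    def dumpDir: Path = distDumpDir.getOrElse(Path(config.dumpDir))

    // Methods with no distributed counterpart simply delegate to Config.
    def ontologyFile: String = config.ontologyFile
}
```

With this shape, client code only ever talks to `DistConfig`; whether a value came from the distributed properties or from the plain `Config` is invisible to the caller.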
@nilesh-c
Member Author

@jimkont @jcsahnwaldt please discuss how you think we should proceed. @jcsahnwaldt had mentioned DistConfig extending Config, using generics, etc. But I can't see how we can use inheritance here, because dumpDir, mappingsDir etc. are no longer Files - they are Option[Path]s in DistConfig now. :(

I suggested the delegation/wrapper approach, but I'm open to ideas.
