[SPARK-20196][PYTHON][SQL] update doc for catalog functions for all languages, add pyspark refreshByPath API#17512
felixcheung wants to merge 4 commits into apache:master
Conversation
Test build #75466 has finished for PR 17512 at commit
Will update after #17518, plus changes to the R doc too.
updated @gatorsmile
Test build #75515 has finished for PR 17512 at commit
- #' Create a SparkDataFrame from a SparkSQL Table
+ #' Create a SparkDataFrame from a SparkSQL table or temporary view
table or view

Actually, this includes both temporary and persistent views.
  #'
- #' Returns the specified Table as a SparkDataFrame. The Table must have already been registered
- #' in the SparkSession.
+ #' Returns the specified table or temporary view as a SparkDataFrame. The temporary view must have
- """Recover all the partitions of the given table and update the catalog."""
+ """Recovers all the partitions of the given table and update the catalog.
  Only works with a partitioned table, and not a temporary view.
a temporary view -> a view

Neither temporary nor persistent views are supported; we will detect this and raise an exception.
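The detection the reviewer describes can be sketched in plain Python (this is an illustrative sketch, not Spark's implementation; the catalog structure and exception name are invented stand-ins):

```python
class AnalysisException(Exception):
    """Stand-in for Spark's AnalysisException."""


def recover_partitions(catalog, table_name):
    # Hypothetical catalog: maps table name -> {"type": "TABLE" | "VIEW", ...}
    entry = catalog[table_name]
    # Neither temporary nor persistent views are supported: detect and raise.
    if entry["type"] == "VIEW":
        raise AnalysisException(
            "recoverPartitions is not allowed on a view: %s" % table_name)
    # For a real partitioned table, Spark would scan the table's directory
    # and register any partitions found there with the catalog.
    entry["partitions_recovered"] = True
    return entry
```

The point is simply that the view check happens up front, before any directory scanning, so both kinds of views fail fast with a clear error.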
  /**
   * Recovers all the partitions in the directory of a table and update the catalog.
   * Only works with a partitioned table, and not a temporary view.
  /**
   * Refreshes the cache entry and the associated metadata for all Dataset (if any), that contain
-  * the given data source path.
+  * the given data source path. Path matching is by prefix, i.e. "/" would invalidate
invalidate -> invalidate and refresh

We also do the re-cache, but the new version is cached lazily.
For some reason, CatalogImpl.scala is very different from Catalog.scala here. Let me know if you want me to change them; for now I've updated the first sentence.
Yes, I found this sentence was copied from Catalog.scala. Maybe we can update both to:

Path matching is by prefix, i.e. "/" would invalidate all the cached entries and make the new versions cached lazily.
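The prefix-matching semantics being discussed can be illustrated with a small pure-Python sketch (this is not Spark's code; the cache is modeled as a plain list of paths for illustration):

```python
def invalidate_by_path(cached_paths, path):
    """Return the cached entries whose data source path starts with the
    given prefix; these are the entries refreshByPath would invalidate
    (and lazily re-cache) in Spark."""
    # Matching is by prefix, so the path "/" matches every cached entry.
    return [p for p in cached_paths if p.startswith(path)]


cached = ["/data/events", "/data/users", "/tmp/scratch"]
invalidate_by_path(cached, "/data")  # -> ['/data/events', '/data/users']
invalidate_by_path(cached, "/")      # -> all three entries
```

This is why "/" in the proposed doc wording invalidates all cached entries: every path has "/" as a prefix.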
  /**
   * Recovers all the partitions in the directory of a table and update the catalog.
   * Only works with a partitioned table, and not a temporary view.
not a temporary view. -> not a view.
Test build #75534 has started for PR 17512 at commit
Jenkins, retest this please
Test build #75536 has finished for PR 17512 at commit
  #'
- #' Returns the specified Table as a SparkDataFrame. The Table must have already been registered
- #' in the SparkSession.
+ #' Returns the specified table or view as a SparkDataFrame. The table or view must already exists or
fixed, thanks for catching this!
LGTM except minor comments.
Test build #75555 has finished for PR 17512 at commit
merged to master, thanks!
What changes were proposed in this pull request?
Update the docs to remove "external" for createTable, and add a refreshByPath API to PySpark.
How was this patch tested?
Manual.