-
Notifications
You must be signed in to change notification settings - Fork 5.5k
Adding presto-pinot with pushdown working #13504
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
Merged
Changes from all commits
Commits
File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,100 @@ | ||
| =============== | ||
| Pinot Connector | ||
| =============== | ||
|
|
||
| The Pinot connector allows querying and creating tables in an external Pinot | ||
| database. This can be used to query pinot data or join pinot data with | ||
| something else. | ||
|
|
||
| Configuration | ||
| ------------- | ||
|
|
||
| To configure the Pinot connector, create a catalog properties file | ||
| in ``etc/catalog`` named, for example, ``pinot.properties``, to | ||
| mount the Pinot connector as the ``pinot`` catalog. | ||
| Create the file with the following contents, replacing the | ||
| connection properties as appropriate for your setup: | ||
|
|
||
| .. code-block:: none | ||
|
|
||
| connector.name=pinot | ||
| pinot.controller-urls=controller_host1:9000,controller_host2:9000 | ||
|
|
||
| Where the ``pinot.controller-urls`` property allows you to specify a | ||
| comma separated list of the pinot controller host/port pairs. | ||
|
|
||
| Multiple Pinot Servers | ||
| ^^^^^^^^^^^^^^^^^^^^^^ | ||
|
|
||
| You can have as many catalogs as you need, so if you have additional | ||
| Pinot clusters, simply add another properties file to ``etc/catalog`` | ||
| with a different name (making sure it ends in ``.properties``). For | ||
| example, if you name the property file ``sales.properties``, Presto | ||
| will create a catalog named ``sales`` using the configured connector. | ||
|
|
||
| Querying Pinot | ||
| -------------- | ||
|
|
||
| The Pinot catalog exposes all pinot tables inside a flat schema. The | ||
| schema name is immaterial when querying but running ``SHOW SCHEMAS``, | ||
| will show just one schema entry of ``default``. | ||
|
|
||
| The name of the pinot catalog is the catalog file you created above | ||
| without the ``.properties`` extension. | ||
|
|
||
| For example, if you created a | ||
| file called ``mypinotcluster.properties``, you can see all the tables | ||
| in it using the command:: | ||
|
|
||
| SHOW TABLES from mypinotcluster.default | ||
|
|
||
| OR:: | ||
|
|
||
| SHOW TABLES from mypinotcluster.foo | ||
|
|
||
| Both of these commands will list all the tables in your pinot cluster. | ||
| This is because Pinot does not have a notion of schemas. | ||
|
|
||
| Consider you have a table called ``clicks`` in the ``mypinotcluster``. | ||
| You can see a list of the columns in the ``clicks`` table using either | ||
| of the following:: | ||
|
|
||
| DESCRIBE mypinotcluster.dontcare.clicks; | ||
| SHOW COLUMNS FROM mypinotcluster.dontcare.clicks; | ||
|
|
||
| Finally, you can access the ``clicks`` table:: | ||
|
|
||
| SELECT count(*) FROM mypinotcluster.default.clicks; | ||
|
|
||
|
|
||
| How the Pinot connector works | ||
| ----------------------------- | ||
|
|
||
| The connector tries to push the maximal subquery inferred from the | ||
| presto query into pinot. It can push down everything Pinot supports | ||
| including aggregations, group by, all UDFs etc. It generates the | ||
| correct Pinot PQL keeping Pinot's quirks in mind. | ||
|
|
||
| By default, it sends aggregation and limit queries to the Pinot broker | ||
| and does a parallel scan for non-aggregation/non-limit queries. The | ||
| pinot broker queries create a single split that lets the Pinot broker | ||
| do the scatter gather. Whereas, in the parallel scan mode, there is | ||
| one split created for one-or-more Pinot segments and the Pinot servers | ||
| are directly contacted by the Presto servers (ie., the Pinot broker is | ||
| not involved in the parallel scan mode) | ||
|
|
||
| There are a few configurations that control this behavior: | ||
|
|
||
| * ``pinot.prefer-broker-queries``: This config is true by default. | ||
| Setting it to false will also create parallel plans for | ||
| aggregation and limit queries. | ||
| * ``pinot.forbid-segment-queries``: This config is false by default. | ||
| Setting it to true will forbid parallel querying and force all | ||
| querying to happen via the broker. | ||
| * ``pinot.non-aggregate-limit-for-broker-queries``: To prevent | ||
| overwhelming the broker, the connector only allows querying the | ||
| pinot broker for ``short`` queries. We define a ``short`` query to | ||
| be either an aggregation (or group-by) query or a query with a limit | ||
| less than the value configured for | ||
| ``pinot.non-aggregate-limit-for-broker-queries``. The default value | ||
| for this limit is 25K rows. |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -23,7 +23,13 @@ public class HivePlanOptimizerProvider | |
| implements ConnectorPlanOptimizerProvider | ||
| { | ||
| @Override | ||
| public Set<ConnectorPlanOptimizer> getConnectorPlanOptimizers() | ||
| public Set<ConnectorPlanOptimizer> getLogicalPlanOptimizers() | ||
|
||
| { | ||
| return ImmutableSet.of(); | ||
| } | ||
|
|
||
| @Override | ||
| public Set<ConnectorPlanOptimizer> getPhysicalPlanOptimizers() | ||
| { | ||
| return ImmutableSet.of(); | ||
| } | ||
|
|
||
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -15,6 +15,7 @@ | |
|
|
||
| import com.facebook.presto.spi.ConnectorId; | ||
| import com.facebook.presto.spi.ConnectorPlanOptimizer; | ||
| import com.facebook.presto.spi.PrestoException; | ||
| import com.facebook.presto.spi.connector.ConnectorPlanOptimizerProvider; | ||
| import com.google.common.collect.ImmutableMap; | ||
|
|
||
|
|
@@ -24,6 +25,7 @@ | |
| import java.util.Set; | ||
| import java.util.concurrent.ConcurrentHashMap; | ||
|
|
||
| import static com.facebook.presto.spi.StandardErrorCode.GENERIC_INTERNAL_ERROR; | ||
| import static com.google.common.base.Preconditions.checkArgument; | ||
| import static com.google.common.collect.Maps.transformValues; | ||
| import static java.util.Objects.requireNonNull; | ||
|
|
@@ -43,8 +45,20 @@ public void addPlanOptimizerProvider(ConnectorId connectorId, ConnectorPlanOptim | |
| "ConnectorPlanOptimizerProvider for connector '%s' is already registered", connectorId); | ||
| } | ||
|
|
||
| public Map<ConnectorId, Set<ConnectorPlanOptimizer>> getOptimizers() | ||
| public Map<ConnectorId, Set<ConnectorPlanOptimizer>> getOptimizers(PlanPhase phase) | ||
| { | ||
| return ImmutableMap.copyOf(transformValues(planOptimizerProviders, ConnectorPlanOptimizerProvider::getConnectorPlanOptimizers)); | ||
| switch (phase) { | ||
|
||
| case LOGICAL: | ||
| return ImmutableMap.copyOf(transformValues(planOptimizerProviders, ConnectorPlanOptimizerProvider::getLogicalPlanOptimizers)); | ||
| case PHYSICAL: | ||
| return ImmutableMap.copyOf(transformValues(planOptimizerProviders, ConnectorPlanOptimizerProvider::getPhysicalPlanOptimizers)); | ||
| default: | ||
| throw new PrestoException(GENERIC_INTERNAL_ERROR, "Unknown plan phase " + phase); | ||
| } | ||
| } | ||
|
|
||
| public enum PlanPhase | ||
| { | ||
| LOGICAL, PHYSICAL | ||
| } | ||
| } | ||
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another way is to add a
getStage()method inConnectorPlanOptimizerbut current design is fine given we won't (probably shouldn't) have more than two stages.