feat(firestore): [PQ] add pipeline queries#12217
bhshkh merged 3 commits into googleapis:feature/fs-pipeline-queries
	}
	return m
}
is PipelineResult.Get() coming later?
PipelineResult is similar to DocumentSnapshot. So, there won't be a Get.
The DocumentRef field inside PipelineResult and DocumentSnapshot already has a Get.
The other methods that I would add are DataAt and DataAtPath.
google-cloud-go/firestore/document.go
Lines 120 to 151 in 2671c4f
		Name: s.name(),
		Args: []*pb.Value{arg},
	}, nil
}
Is there any way to simplify how stages are represented, to make it easier to scale?
Like maybe the only actual struct is a base like this:
type PipelineStage struct {
	name    string         // stage name
	values  []any          // positional arguments
	options map[string]any // named options
}
And each sub-type just builds one of those?
I'm not super familiar with Go, so let me know if the answer is "no". I just want to make sure scalability is being considered, because we're going to have a lot more of these.
In Go, the approach we're currently using (an interface like pipelineStage with specific structs like limitStage, collectionStage implementing it) is generally the more idiomatic and preferred way to handle this kind of polymorphism while maintaining strong type safety.
While a generic PipelineStage struct could make the Pipeline.stages slice look uniform, it would shift complexity and reduce type safety:
- Type Safety: We'd lose compile-time checks on the arguments specific to each stage. For example, limitStage clearly takes an int. With a generic struct, we'd use interface{} for arguments and rely on runtime type assertions within the logic that processes these stages (like converting to protobuf), which is more error-prone.
- Clarity: Specific structs make it very clear what parameters each stage type accepts.
- Encapsulation: Each stage struct can manage its own logic for converting to its protobuf representation in its toProto() method. With a generic struct, this logic would likely end up in a large, centralized switch statement.
The interface approach scales well because adding a new stage type involves defining its specific struct, implementing the interface, and adding a corresponding builder method to Pipeline. The core pipeline processing logic that iterates over []pipelineStage and calls toProto() doesn't need to be modified for each new stage type.
This pattern is common in Go for these reasons.
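The interface-based design described above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: `stageProto` is a hypothetical stand-in for the generated protobuf stage message, arguments are plain strings rather than `*pb.Value`, and `stagesToProto` is an assumed name for the loop that walks the stages.

```go
package main

import "fmt"

// stageProto is a hypothetical stand-in for the generated protobuf stage message.
type stageProto struct {
	Name string
	Args []string
}

// pipelineStage is the interface that every concrete stage implements.
type pipelineStage interface {
	name() string
	toProto() (*stageProto, error)
}

// collectionStage carries exactly the argument it needs: a collection path.
type collectionStage struct{ path string }

func (s *collectionStage) name() string { return "collection" }

func (s *collectionStage) toProto() (*stageProto, error) {
	return &stageProto{Name: s.name(), Args: []string{s.path}}, nil
}

// limitStage takes an int, so passing a non-integer limit fails at compile time.
type limitStage struct{ limit int }

func (s *limitStage) name() string { return "limit" }

func (s *limitStage) toProto() (*stageProto, error) {
	if s.limit < 0 {
		return nil, fmt.Errorf("limit must be non-negative, got %d", s.limit)
	}
	return &stageProto{Name: s.name(), Args: []string{fmt.Sprint(s.limit)}}, nil
}

// stagesToProto is the core loop. It stays the same when a new stage type is added.
func stagesToProto(stages []pipelineStage) ([]*stageProto, error) {
	protos := make([]*stageProto, 0, len(stages))
	for _, s := range stages {
		p, err := s.toProto()
		if err != nil {
			return nil, err
		}
		protos = append(protos, p)
	}
	return protos, nil
}

func main() {
	protos, err := stagesToProto([]pipelineStage{
		&collectionStage{path: "events"},
		&limitStage{limit: 10},
	})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, p := range protos {
		fmt.Println(p.Name, p.Args)
	}
}
```

Adding a stage in this sketch means adding one struct with its two methods; `stagesToProto` needs no new case.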
// is valid.
func (p *PipelineResult) Exists() bool {
	return p.proto != nil
}
Are you sure exists is the right word for a PipelineResult that doesn't have a document? I don't remember seeing that in the other languages
I think the document will be empty if it's an aggregation, but the aggregation result still exists.
func (p *Pipeline) append(s pipelineStage) *Pipeline {
	if p.err != nil {
		return p
	}
	newP := &Pipeline{
		c:      p.c,
		stages: make([]pipelineStage, len(p.stages)+1),
	}
	copy(newP.stages, p.stages)
	newP.stages[len(p.stages)] = s
	return newP
}
Why does this function essentially create a deep copy of the existing pipeline then extend the new one? Is this a better approach than just returning the existing object after appending the new pipelineStage to the existing one?
This is a deliberate choice to make the Pipeline builder immutable. It's a common pattern for fluent, chainable APIs.
- Branching: a shared base can be extended in different directions, e.g.
base := client.Pipeline().Collection("events")
pA := base.Where(Field("type").Eq("A"))
pB := base.Where(Field("type").Eq("B"))
// 'base' is still just Collection("events")
// pA and pB are distinct and don't interfere.
- Thread Safety (for reading): Immutable objects are inherently safe to share across goroutines for reading without requiring locks, as their state never changes once created.
The minor performance overhead of copying is usually negligible compared to the benefits in robustness.
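The copy-on-append behaviour can be tried out with a stripped-down sketch. The assumptions here are that stages are plain strings instead of `pipelineStage` values, that the client and error fields are omitted, and that `Where` takes a string description rather than an expression.

```go
package main

import "fmt"

// Pipeline is a simplified stand-in: stages are plain strings in this sketch.
type Pipeline struct {
	stages []string
}

// append returns a new Pipeline with one more stage and leaves the receiver untouched.
func (p *Pipeline) append(stage string) *Pipeline {
	stages := make([]string, len(p.stages)+1)
	copy(stages, p.stages)
	stages[len(p.stages)] = stage
	return &Pipeline{stages: stages}
}

// Where is a hypothetical builder method that records a filter description.
func (p *Pipeline) Where(cond string) *Pipeline {
	return p.append("where(" + cond + ")")
}

func main() {
	base := (&Pipeline{}).append("collection(events)")
	pA := base.Where("type == A")
	pB := base.Where("type == B")

	// base keeps its single stage; pA and pB each hold their own copy.
	fmt.Println(base.stages)
	fmt.Println(pA.stages)
	fmt.Println(pB.stages)
}
```

Allocating a fresh slice on every call matters in Go: the built-in `append` may reuse the backing array when spare capacity exists, so two branches built from the same base could otherwise overwrite each other's last stage.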
if !p.Exists() {
	return status.Errorf(codes.NotFound, "document does not exist")
}
return setFromProtoValue(v, &pb.Value{ValueType: &pb.Value_MapValue{MapValue: &pb.MapValue{Fields: p.proto.Fields}}}, p.c)
Where is setFromProtoValue defined? Is this just a generic serialization function?
Yes. It's here:
google-cloud-go/firestore/from_value.go
Lines 27 to 33 in 70c11ad
It is also used for query results.
// PipelineSource is a factory for creating Pipeline instances.
// It is obtained by calling [Client.Pipeline()].
type PipelineSource struct {
If PipelineSource is going to be a factory class, would it be better to rename this to PipelineFactory, or have the Pipeline class in Client be renamed to something reflecting that you're getting a factory class instead of just Client.Pipeline?
This is to have uniformity across clients. Java, Node and Python use the same terminology.
"Source" is domain-appropriate. While it is a factory, naming it PipelineSource effectively describes its role. Changing to PipelineFactory might make it sound more like a GoF pattern and less like a domain concept.
Changing Client.Pipeline() to something like Client.GetPipelineFactory() or Client.NewPipelineBuilder() is more verbose.
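To make the naming discussion concrete, here is a rough sketch of how a source type might hand out pipelines. `Client.Pipeline()` and `Collection` appear in this conversation, and a collection group stage is mentioned in a later commit message; the method name `CollectionGroup`, the string-based stages and the empty `Client` struct are assumptions made for illustration.

```go
package main

import "fmt"

// Client is a hypothetical stand-in for the Firestore client.
type Client struct{}

// PipelineSource is a factory for creating Pipeline instances.
type PipelineSource struct{ c *Client }

// Pipeline is simplified here: each stage is just a descriptive string.
type Pipeline struct {
	c      *Client
	stages []string
}

// Pipeline returns the factory; it builds no stages by itself.
func (c *Client) Pipeline() *PipelineSource { return &PipelineSource{c: c} }

// Collection starts a pipeline that reads from a single collection.
func (ps *PipelineSource) Collection(path string) *Pipeline {
	return &Pipeline{c: ps.c, stages: []string{"collection(" + path + ")"}}
}

// CollectionGroup starts a pipeline over every collection with the given ID.
func (ps *PipelineSource) CollectionGroup(id string) *Pipeline {
	return &Pipeline{c: ps.c, stages: []string{"collection_group(" + id + ")"}}
}

func main() {
	client := &Client{}
	events := client.Pipeline().Collection("events")
	allEvents := client.Pipeline().CollectionGroup("events")
	fmt.Println(events.stages, allEvents.stages)
}
```

Read aloud, `client.Pipeline().Collection("events")` names where the data comes from, which is the domain meaning of "source" argued for above.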
Merged commit 687d67d into googleapis:feature/fs-pipeline-queries
* feat(firestore): add pipeline queries
* add comments
* remove exists
) b/364927702
1. Fixes test
```go
--- FAIL: TestPipelineResult_NoResults (0.00s)
panic: runtime error: invalid memory address or nil pointer dereference [recovered, repanicked]
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x10d40b6]

goroutine 4633 [running]:
testing.tRunner.func1.2({0x13c5240, 0x1ec2150})
	/usr/local/go/src/testing/testing.go:1872 +0x419
testing.tRunner.func1()
	/usr/local/go/src/testing/testing.go:1875 +0x683
panic({0x13c5240?, 0x1ec2150?})
	/usr/local/go/src/runtime/panic.go:783 +0x132
cloud.google.com/go/firestore.(*PipelineResult).Data(0xc00046f380)
	/tmpfs/src/google-cloud-go/firestore/pipeline_result.go:95 +0x96
cloud.google.com/go/firestore.TestPipelineResult_NoResults(0xc000455500)
	/tmpfs/src/google-cloud-go/firestore/pipeline_result_test.go:359 +0x312
testing.tRunner(0xc000455500, 0x157dca0)
	/usr/local/go/src/testing/testing.go:1934 +0x21d
created by testing.(*T).Run in goroutine 1
	/usr/local/go/src/testing/testing.go:1997 +0x9d3
FAIL cloud.google.com/go/firestore 2.337s
```
```go
=== RUN TestPipelineResultIterator_GetAll
pipeline_result_test.go:249: second result id: got 1, want: 2
--- FAIL: TestPipelineResultIterator_GetAll (0.00s)
```
2. Add enterprise database env variable

Previous pull requests:
- #12217
- #12425
- #12538
- #13147
…3194) b/364927702
1. Add all the remaining private preview aggregate functions. Merging this PR completes the implementation of all the **type: "Function" subType: "Accumulators (Aggregation)"** private preview features. See the "Firestore Features (Pipeline)" sheet in [go/firestore-query-tracker](http://go/firestore-query-tracker) for the list of features. Java reference:
   - https://github.com/googleapis/java-firestore/blob/ccaf9d4fac5bd87a4da3d37493ca66fdc7681bc3/google-cloud-firestore/src/main/java/com/google/cloud/firestore/pipeline/expressions/AggregateFunction.java#L43-L71
   - https://github.com/googleapis/java-firestore/blob/ccaf9d4fac5bd87a4da3d37493ca66fdc7681bc3/google-cloud-firestore/src/main/java/com/google/cloud/firestore/pipeline/expressions/AggregateFunction.java#L93-L111
2. Add all the remaining private preview timestamp functions. Merging this PR completes the implementation of all the **type: "Function" subType: "Date / Timestamp"** private preview features (except the timestamp_trunc function, which is not yet implemented in any of the SDKs; it requires additional approvals from the Firestore team and will be added in a separate PR). See the "Firestore Features (Pipeline)" sheet in [go/firestore-query-tracker](http://go/firestore-query-tracker) for the list of functions. Java reference:
   - https://github.com/googleapis/java-firestore/blob/ccaf9d4fac5bd87a4da3d37493ca66fdc7681bc3/google-cloud-firestore/src/main/java/com/google/cloud/firestore/pipeline/expressions/Expression.java#L2262-L2517
3. Add integration tests for all functions.
4. Remove the Rand function since it is not targeted for private preview.
5. Rename numericExprOrField to numericExprOrFieldPath since field is a separate type/expression. https://github.com/googleapis/google-cloud-go/blob/a3ee1f19068c6d3fb77ad797e29884a90d6402a2/firestore/pipeline_field.go#L21-L41

Previous pull requests
- #12217
- #12425
- #12538
- #13147
- #13199
- #13218
…13218) b/364927702
1. Add all the remaining private preview stages. Merging this PR completes the implementation of all the type "Stage" subType "General" private preview features (except the literals stage, which is not yet implemented in Java and Node; it requires additional approvals from the Firestore team and will be added in a separate PR). See the "Firestore Features (Pipeline)" sheet in [go/firestore-query-tracker](http://go/firestore-query-tracker).
2. Add integration and unit tests.
3. Refactor existing stages code to remove duplicated code and rearrange it in alphabetical order.
4. Modify the behaviour of Data and DataTo to match the existing implementation in document.go https://github.com/googleapis/google-cloud-go/blob/9a4cb31f4d34948404d91123fb560a43aeebe83e/firestore/document.go#L64-L127

Previous pull requests
- #12217
- #12425
- #12538
- #13147
- #13199
)
1. Add all the private preview 'array' functions. Merging this PR completes the implementation of all the **type: "Function" subType: "Array"** private preview features (except 'maximum' and 'minimum', which are not yet implemented in any of the SDKs; they require additional approvals from the Firestore team and will be added in a separate PR). See the "Firestore Features (Pipeline)" sheet in [go/firestore-query-tracker](http://go/firestore-query-tracker) for the list of features. Java reference:
   - https://github.com/googleapis/java-firestore/blob/wuandy/JavaPplPP/google-cloud-firestore/src/main/java/com/google/cloud/firestore/pipeline/expressions/Expression.java
2. Add all the private preview 'string' functions (except 'string_split', which is not yet implemented in any of the SDKs; it requires additional approvals from the Firestore team and will be added in a separate PR).
3. Add all the private preview 'vector' functions.
4. Add the remaining types to ConstantOf to match Java's implementation. Java reference:
   - https://github.com/googleapis/java-firestore/blob/ccaf9d4fac5bd87a4da3d37493ca66fdc7681bc3/google-cloud-firestore/src/main/java/com/google/cloud/firestore/pipeline/expressions/Expression.java#L70-L211

Previous pull requests
- #12217
- #12425
- #12538
- #13147
- #13199
- #13218
- #13194
b/364927702
- toExprOrField was renamed to asFieldExpr in https://github.com/googleapis/google-cloud-go/pull/13194/files#diff-4a55211f7d38a1f0599e2f4cc92795073f138b2c56b846c933bda19e26bc3a7a . A few call sites missed the rename while merge conflicts were being resolved, which caused build failures. This PR fixes those failures.
- The function signature of Data was changed in #13218. It no longer returns err as a second return value. This PR updates the callers accordingly.
- Remove duplicate asInt64Expr and asStringExpr.
- Move pipeline tests to their own file.

Previous pull requests
- #12217
- #12425
- #12538
- #13147
- #13199
- #13218
- #13194
- #13245
b/364927702
- Combine FieldOf and FieldOfPath to avoid the verbose name FieldOfPath

Previous pull requests
- #12217
- #12425
- #12538
- #13147
- #13199
- #13218
- #13194
- #13245
- #13270

---------

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
1. Move PipelineStages integration tests.
2. Remove IsNaN, IsNotNaN, IsNull, IsNotNull, Equivalent as they are no longer supported by the backend.
3. Remove examples as commented here #13245 (comment)

Previous pull requests
- #12217
- #12425
- #12538
- #13147
- #13199
- #13218
- #13194
- #13245
- #13270
- #13271
Add raw stage similar to Java https://github.com/googleapis/java-firestore/blob/742fab6583c9a6f9c47cf0496124c3c9b05fe0ee/google-cloud-firestore/src/main/java/com/google/cloud/firestore/Pipeline.java#L997-L1021

The raw stage is an escape hatch that lets customers use new stages supported by the backend without having to update their SDK to a version that adds the stage.

Previous pull requests
- #12217
- #12425
- #12538
- #13147
- #13199
- #13218
- #13194
- #13245
- #13270
- #13271
- #13279
Add consistency selector similar to existing requests

Existing code for reference:

**client.go :**
- https://github.com/googleapis/google-cloud-go/blob/66cc9bb6e158416897af1d1dc4b9001118db3373/firestore/client.go#L405-L412
- https://github.com/googleapis/google-cloud-go/blob/66cc9bb6e158416897af1d1dc4b9001118db3373/firestore/client.go#L308-L317
- https://github.com/googleapis/google-cloud-go/blob/66cc9bb6e158416897af1d1dc4b9001118db3373/firestore/client.go#L490-L504

**query.go :**
- https://github.com/googleapis/google-cloud-go/blob/66cc9bb6e158416897af1d1dc4b9001118db3373/firestore/query.go#L1406-L1412
- https://github.com/googleapis/google-cloud-go/blob/66cc9bb6e158416897af1d1dc4b9001118db3373/firestore/query.go#L1551-L1561

**list_documents.go :**
- https://github.com/googleapis/google-cloud-go/blob/66cc9bb6e158416897af1d1dc4b9001118db3373/firestore/list_documents.go#L48-L55

Previous pull requests
- #12217
- #12425
- #12538
- #13147
- #13199
- #13218
- #13194
- #13245
- #13270
- #13271
- #13279
- #13280
- #13281
#13283) Changes in this PR:
1. firestore_client.go : Updated generated client as per googleapis/gapic-generator-go#1661 . Removed retries from tests since the headers have now been fixed.
2. Remove Equivalent since it was removed from the backend.
3. Add/update comments.
4. Add timestamp truncate (pending from #13194) and string split (pending from #13245) functions.
5. Add all the private preview general, key, logical (except iferror), type and object functions. See the "Firestore Features (Pipeline)" sheet in [go/firestore-query-tracker](http://go/firestore-query-tracker) for the list of functions. Java reference:
   - https://github.com/googleapis/java-firestore/blob/ccaf9d4fac5bd87a4da3d37493ca66fdc7681bc3/google-cloud-firestore/src/main/java/com/google/cloud/firestore/pipeline/expressions/Expression.java

Previous pull requests
- #12217
- #12425
- #12538
- #13147
- #13199
- #13218
- #13194
- #13245
- #13270
- #13271
- #13279
- #13280
- #13281
- #13282
- googleapis/gapic-generator-go#1661
Add collection and collectiongroup stage options similar to Java
- https://github.com/googleapis/java-firestore/blob/742fab6583c9a6f9c47cf0496124c3c9b05fe0ee/google-cloud-firestore/src/main/java/com/google/cloud/firestore/pipeline/stages/CollectionGroupOptions.java#L34-L36
- https://github.com/googleapis/java-firestore/blob/742fab6583c9a6f9c47cf0496124c3c9b05fe0ee/google-cloud-firestore/src/main/java/com/google/cloud/firestore/pipeline/stages/CollectionOptions.java#L34-L36
- https://github.com/googleapis/java-firestore/blob/742fab6583c9a6f9c47cf0496124c3c9b05fe0ee/google-cloud-firestore/src/main/java/com/google/cloud/firestore/pipeline/stages/CollectionHints.java#L34-L40

Previous pull requests
- #12217
- #12425
- #12538
- #13147
- #13199
- #13218
- #13194
- #13245
- #13270
- #13271
- #13279
- #13280
- #13282
- googleapis/gapic-generator-go#1661
Add ExecuteOptions similar to the query run options introduced in #10164. The GetRawData and GetText implementations are similar to Java
- https://github.com/googleapis/java-firestore/blob/742fab6583c9a6f9c47cf0496124c3c9b05fe0ee/google-cloud-firestore/src/main/java/com/google/cloud/firestore/pipeline/stages/PipelineExecuteOptions.java
- https://github.com/googleapis/java-firestore/blob/742fab6583c9a6f9c47cf0496124c3c9b05fe0ee/google-cloud-firestore/src/main/java/com/google/cloud/firestore/pipeline/stages/ExplainOptions.java
- https://github.com/googleapis/java-firestore/blob/742fab6583c9a6f9c47cf0496124c3c9b05fe0ee/google-cloud-firestore/src/main/java/com/google/cloud/firestore/ExplainStats.java#L42-L71
- https://github.com/googleapis/java-firestore/blob/742fab6583c9a6f9c47cf0496124c3c9b05fe0ee/google-cloud-firestore/src/test/java/com/google/cloud/firestore/it/ITPipelineTest.java#L2474-L2498

Previous pull requests
- #12217
- #12425
- #12538
- #13147
- #13199
- #13218
- #13194
- #13245
- #13270
- #13271
- #13279
- #13280
- #13281
- #13282
- googleapis/gapic-generator-go#1661
- #13283
Add CreateFrom() and Pipeline() similar to Java. https://github.com/googleapis/java-firestore/blob/742fab6583c9a6f9c47cf0496124c3c9b05fe0ee/google-cloud-firestore/src/main/java/com/google/cloud/firestore/PipelineSource.java#L140-L164

Previous pull requests
- #12217
- #12425
- #12538
- #13147
- #13199
- #13218
- #13194
- #13245
- #13270
- #13271
- #13279
- #13280
- #13281
- #13282
- googleapis/gapic-generator-go#1661
- #13283
- #13274
- #13338
b/364927702
What are pipeline queries?
A Pipeline Query is constructed by defining a series of stages that are executed in order.
Stages: A pipeline may consist of one or more stages. Logically, these represent the series of steps (or stages) taken to execute the query. Note: In practice, stages may be executed out of order to improve performance. However, this does not modify the intent or correctness of the query.
Expressions: Stages will often accept an expression, allowing you to express more complex queries. An expression may be simple and consist of a single function, like eq("a", 1). You can also build more complex expressions by nesting them, like and(eq("a", 1), eq("b", 2)).
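The nesting of expressions can be pictured as a small tree. The sketch below is illustrative only: the `Expr` interface, the `field`, `constant` and `function` node types, and the `Eq` and `And` helpers are hypothetical names chosen for this example and do not reflect the SDK's actual expression API.

```go
package main

import (
	"fmt"
	"strings"
)

// Expr is a hypothetical expression node that can render itself as text.
type Expr interface {
	String() string
}

// field refers to a document field by path.
type field struct{ path string }

func (f field) String() string { return fmt.Sprintf("%q", f.path) }

// constant wraps a literal value.
type constant struct{ v any }

func (c constant) String() string { return fmt.Sprintf("%v", c.v) }

// function is a named call whose arguments are themselves expressions,
// which is what makes nesting possible.
type function struct {
	name string
	args []Expr
}

func (f function) String() string {
	parts := make([]string, len(f.args))
	for i, a := range f.args {
		parts[i] = a.String()
	}
	return f.name + "(" + strings.Join(parts, ", ") + ")"
}

// Eq builds eq(field, value).
func Eq(path string, v any) Expr {
	return function{name: "eq", args: []Expr{field{path: path}, constant{v: v}}}
}

// And combines any number of conditions into one expression.
func And(conds ...Expr) Expr {
	return function{name: "and", args: conds}
}

func main() {
	simple := Eq("a", 1)
	nested := And(Eq("a", 1), Eq("b", 2))
	fmt.Println(simple) // eq("a", 1)
	fmt.Println(nested) // and(eq("a", 1), eq("b", 2))
}
```

Because `And` accepts and returns `Expr`, its result can be passed to another `And`, so conditions nest to any depth.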
More info about pipeline queries can be found here: https://cloud.google.com/firestore/docs/pipeline/overview
Changes in this PR
This is the initial PR for pipeline queries. It puts the basic components in place before more stages and functions are added.
This PR targets the feature branch, not main. Once the protos are public, I will create a new PR to merge the feature branch into main.
The new structs introduced in this PR are in line with the Java and Node implementations.
The PipelineResult.Data() and PipelineResult.DataTo implementations are similar to the existing DocumentSnapshot.Data and DataTo.
streamPipelineResultIterator is similar to queryDocumentIterator.