feat(firestore): [PQ] add pipeline queries #12217

Merged
bhshkh merged 3 commits into googleapis:feature/fs-pipeline-queries from bhshkh:feature/fs-pipeline-queries
May 21, 2025
@bhshkh bhshkh commented May 7, 2025

b/364927702

What are pipeline queries?

A Pipeline Query is constructed by defining a series of stages that are executed in order.

Stages: A pipeline may consist of one or more stages. Logically, these represent the series of steps (or stages) taken to execute the query. Note: In practice, stages may be executed out of order to improve performance. However, this does not modify the intent or correctness of the query.

Expressions: Stages will often accept an expression, allowing you to express more complex queries. An expression may be simple and consist of a single function, like eq("a", 1). You can also build more complex expressions by nesting them, like and(eq("a", 1), eq("b", 2)).
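
As a rough illustration of how nested expressions compose, here is a minimal Go sketch. The names Expr, constant, function, Eq and And are hypothetical stand-ins invented for this example and are not the SDK's types; the field name is also treated as a plain constant for simplicity.

```go
package main

import (
	"fmt"
	"strings"
)

// Expr is a hypothetical expression node; it is not the SDK's type.
type Expr interface {
	String() string
}

// constant wraps a literal value such as 1 or "a".
type constant struct{ v any }

func (c constant) String() string { return fmt.Sprintf("%#v", c.v) }

// function is a named call whose arguments are themselves expressions,
// which is what makes nesting possible.
type function struct {
	name string
	args []Expr
}

func (f function) String() string {
	parts := make([]string, len(f.args))
	for i, a := range f.args {
		parts[i] = a.String()
	}
	return f.name + "(" + strings.Join(parts, ", ") + ")"
}

// Eq mirrors eq("a", 1) from the description (illustrative only).
func Eq(field string, value any) Expr {
	return function{name: "eq", args: []Expr{constant{field}, constant{value}}}
}

// And mirrors and(...) and accepts any number of nested expressions.
func And(conds ...Expr) Expr {
	return function{name: "and", args: conds}
}

func main() {
	// A nested expression: and(eq("a", 1), eq("b", 2)).
	fmt.Println(And(Eq("a", 1), Eq("b", 2)))
}
```

Because a function's arguments are expressions too, a single-function expression and a deeply nested one are built the same way.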

More info about pipeline queries can be found here: https://cloud.google.com/firestore/docs/pipeline/overview

Changes in this PR

This is the initial PR for pipeline queries. It puts the basic components in place before more stages and functions are added.
This PR targets the feature branch, not main. Once the protos are public, I will create a new PR to merge the feature branch into main.

  • The new structs introduced in this PR are in line with the Java and Node implementations

  • PipelineResult.Data() and PipelineResult.DataTo implementations are similar to existing DocumentSnapshot.Data and DataTo

  • streamPipelineResultIterator is similar to queryDocumentIterator

@product-auto-label product-auto-label Bot added the api: firestore Issues related to the Firestore API. label May 7, 2025
@bhshkh bhshkh marked this pull request as ready for review May 7, 2025 21:52
@bhshkh bhshkh requested review from a team May 7, 2025 21:52
}
return m
}

Contributor

Is PipelineResult.Get() coming later?

Contributor Author

PipelineResult is similar to DocumentSnapshot, so there won't be a Get.

The DocumentRef field inside PipelineResult and DocumentSnapshot already has a Get.

The other methods that I would add are DataAt and DataAtPath:

// DataAt returns the data value denoted by path.
//
// The path argument can be a single field or a dot-separated sequence of
// fields, and must not contain any of the runes "˜*/[]". Use DataAtPath instead for
// such a path.
//
// See DocumentSnapshot.DataTo for how Firestore values are converted to Go values.
//
// If the document does not exist, DataAt returns a NotFound error.
func (d *DocumentSnapshot) DataAt(path string) (interface{}, error) {
	if !d.Exists() {
		return nil, status.Errorf(codes.NotFound, "document %s does not exist", d.Ref.Path)
	}
	fp, err := parseDotSeparatedString(path)
	if err != nil {
		return nil, err
	}
	return d.DataAtPath(fp)
}

// DataAtPath returns the data value denoted by the FieldPath fp.
// If the document does not exist, DataAtPath returns a NotFound error.
func (d *DocumentSnapshot) DataAtPath(fp FieldPath) (interface{}, error) {
	if !d.Exists() {
		return nil, status.Errorf(codes.NotFound, "document %s does not exist", d.Ref.Path)
	}
	v, err := valueAtPath(fp, d.proto.Fields)
	if err != nil {
		return nil, err
	}
	return createFromProtoValue(v, d.c)
}
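
To make the dot-separated lookup concrete, here is a simplified, self-contained sketch over plain Go maps. The helper valueAt is hypothetical and written only for this example; it assumes nested map[string]any data rather than Firestore protos, and it does not reproduce parseDotSeparatedString or valueAtPath.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// valueAt is a hypothetical helper that walks nested maps using a
// dot-separated path such as "a.b". It only illustrates the idea
// behind DataAt and is not the SDK's implementation.
func valueAt(data map[string]any, path string) (any, error) {
	if path == "" {
		return nil, errors.New("empty path")
	}
	var cur any = data
	for _, key := range strings.Split(path, ".") {
		// Every step except the last must land on a nested map.
		m, ok := cur.(map[string]any)
		if !ok {
			return nil, fmt.Errorf("cannot descend into %q: parent is not a map", key)
		}
		v, ok := m[key]
		if !ok {
			return nil, fmt.Errorf("field %q not found", key)
		}
		cur = v
	}
	return cur, nil
}

func main() {
	doc := map[string]any{"a": map[string]any{"b": int64(7)}}
	v, err := valueAt(doc, "a.b")
	fmt.Println(v, err)
}
```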

Name: s.name(),
Args: []*pb.Value{arg},
}, nil
}
Contributor

Is there any way to simplify how stages are represented, to make it easier to scale?

Like maybe the only actual struct is a base like this:

type PipelineStage struct {
  string
  values
  options
}

And each sub-type just builds one of those?

I'm not super familiar with Go, so let me know if the answer is "no". Just want to make sure scalability is being considered, because we're going to have a lot more of these.

Contributor Author

In Go, the approach we're currently using (an interface like pipelineStage with specific structs like limitStage, collectionStage implementing it) is generally the more idiomatic and preferred way to handle this kind of polymorphism while maintaining strong type safety.

While a generic PipelineStage struct could make the Pipeline.stages slice look uniform, it would shift complexity and reduce type safety:

  • Type Safety: We'd lose compile-time checks on the arguments specific to each stage. For example, limitStage clearly takes an int. With a generic struct, we'd use interface{} for arguments and rely on runtime type assertions within the logic that processes these stages (like converting to protobuf), which is more error-prone.
  • Clarity: Specific structs make it very clear what parameters each stage type accepts.
  • Encapsulation: Each stage struct can manage its own logic for converting to its protobuf representation in its toProto() method. With a generic struct, this logic would likely end up in a large, centralized switch statement.

The interface approach scales well because adding a new stage type involves defining its specific struct, implementing the interface, and adding a corresponding builder method to Pipeline. The core pipeline processing logic that iterates over []pipelineStage and calls toProto() doesn't need to be modified for each new stage type.

This pattern is common in Go for these reasons.
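
The trade-off described above can be sketched in a few lines. This is a minimal illustration, not the code in this PR: stageProto stands in for the real protobuf message, and the bodies of collectionStage and limitStage are assumptions made up for the example.

```go
package main

import "fmt"

// stageProto is a hypothetical stand-in for the protobuf stage message.
type stageProto struct {
	Name string
	Args []any
}

// pipelineStage is the interface every stage implements.
type pipelineStage interface {
	name() string
	toProto() (*stageProto, error)
}

// collectionStage holds exactly the argument a collection stage needs.
type collectionStage struct{ path string }

func (s collectionStage) name() string { return "collection" }

func (s collectionStage) toProto() (*stageProto, error) {
	if s.path == "" {
		return nil, fmt.Errorf("collection path must not be empty")
	}
	return &stageProto{Name: s.name(), Args: []any{s.path}}, nil
}

// limitStage takes an int, which the compiler checks at every call site.
type limitStage struct{ limit int }

func (s limitStage) name() string { return "limit" }

func (s limitStage) toProto() (*stageProto, error) {
	return &stageProto{Name: s.name(), Args: []any{int64(s.limit)}}, nil
}

// stagesToProto does not change when a new stage type is added.
func stagesToProto(stages []pipelineStage) ([]*stageProto, error) {
	out := make([]*stageProto, 0, len(stages))
	for _, s := range stages {
		p, err := s.toProto()
		if err != nil {
			return nil, err
		}
		out = append(out, p)
	}
	return out, nil
}

func main() {
	protos, err := stagesToProto([]pipelineStage{collectionStage{"events"}, limitStage{10}})
	if err != nil {
		panic(err)
	}
	for _, p := range protos {
		fmt.Println(p.Name, p.Args)
	}
}
```

Adding a new stage in this sketch means adding one struct with two methods; the loop in stagesToProto stays as it is.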

Comment thread firestore/pipeline_result.go Outdated
// is valid.
func (p *PipelineResult) Exists() bool {
	return p.proto != nil
}
Contributor

Are you sure exists is the right word for a PipelineResult that doesn't have a document? I don't remember seeing that in the other languages

I think the document will be empty if it's an aggregation, but the aggregation result still exists.

Contributor Author

Removed

Comment thread firestore/pipeline.go
Comment on lines +84 to +95
func (p *Pipeline) append(s pipelineStage) *Pipeline {
	if p.err != nil {
		return p
	}
	newP := &Pipeline{
		c:      p.c,
		stages: make([]pipelineStage, len(p.stages)+1),
	}
	copy(newP.stages, p.stages)
	newP.stages[len(p.stages)] = s
	return newP
}
Contributor

Why does this function essentially create a deep copy of the existing pipeline then extend the new one? Is this a better approach than just returning the existing object after appending the new pipelineStage to the existing one?

Contributor Author

@bhshkh bhshkh May 14, 2025

This is a deliberate choice to make the Pipeline builder immutable. It's a common pattern for fluent, chainable APIs.

  1. Branching: a base pipeline can be extended in several directions, e.g.
base := client.Pipeline().Collection("events")
pA := base.Where(Field("type").Eq("A"))
pB := base.Where(Field("type").Eq("B"))
// 'base' is still just Collection("events")
// pA and pB are distinct and don't interfere.
  2. Thread safety (for reading):
    Immutable objects are inherently safe to share across goroutines for reading purposes without requiring locks, as their state never changes once created.

The minor performance overhead of copying is usually negligible compared to the benefits in robustness.
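
A stripped-down sketch of the copy-on-append builder shows the branching behaviour in isolation. The Pipeline type here is a simplified stand-in with string stages, and Where takes a plain string; both are assumptions made for the example, not the types in this PR.

```go
package main

import "fmt"

// Pipeline is a simplified stand-in whose stages are plain strings.
type Pipeline struct {
	stages []string
}

// append returns a new Pipeline and leaves the receiver untouched.
func (p *Pipeline) append(s string) *Pipeline {
	stages := make([]string, len(p.stages)+1)
	copy(stages, p.stages)
	stages[len(p.stages)] = s
	return &Pipeline{stages: stages}
}

// Where is an illustrative builder method that adds one stage.
func (p *Pipeline) Where(cond string) *Pipeline {
	return p.append("where(" + cond + ")")
}

func main() {
	base := (&Pipeline{}).append("collection(events)")
	pA := base.Where("type == A")
	pB := base.Where("type == B")

	// base still has one stage; pA and pB each have their own second stage.
	fmt.Println(len(base.stages), pA.stages, pB.stages)
}
```

If append instead grew a shared slice in place with Go's built-in append, two branches taken from the same base could end up writing into the same backing array whenever it had spare capacity. Copying into a fresh slice avoids that.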

if !p.Exists() {
	return status.Errorf(codes.NotFound, "document does not exist")
}
return setFromProtoValue(v, &pb.Value{ValueType: &pb.Value_MapValue{MapValue: &pb.MapValue{Fields: p.proto.Fields}}}, p.c)
Contributor

Where is setFromProtoValue defined? Is this just a generic serialization function?

Contributor Author

Yes. It's here:

func setFromProtoValue(dest interface{}, vprotoSrc *pb.Value, c *Client) error {
	destV := reflect.ValueOf(dest)
	if destV.Kind() != reflect.Ptr || destV.IsNil() {
		return errors.New("firestore: nil or not a pointer")
	}
	return setReflectFromProtoValue(destV.Elem(), vprotoSrc, c)
}

It is also used for query results.
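
The pointer check at the top of setFromProtoValue can be tried in isolation. The sketch below uses a hypothetical, much-reduced helper named setInt64 that only handles int64 destinations; it is meant to show the same guard, not the real conversion logic.

```go
package main

import (
	"errors"
	"fmt"
	"reflect"
)

// setInt64 is a hypothetical, reduced analogue of setFromProtoValue:
// it writes src into dest, which must be a non-nil pointer to an int64.
func setInt64(dest any, src int64) error {
	destV := reflect.ValueOf(dest)
	// Same guard as above: reject nil and non-pointer destinations,
	// because there would be nothing addressable to write into.
	if destV.Kind() != reflect.Ptr || destV.IsNil() {
		return errors.New("nil or not a pointer")
	}
	elem := destV.Elem()
	if elem.Kind() != reflect.Int64 {
		return fmt.Errorf("cannot set type %s to an int64", elem.Type())
	}
	elem.SetInt(src)
	return nil
}

func main() {
	var n int64
	err := setInt64(&n, 42)
	fmt.Println(n, err)

	// Passing the value instead of its address is rejected.
	err = setInt64(n, 42)
	fmt.Println(err)
}
```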


// PipelineSource is a factory for creating Pipeline instances.
// It is obtained by calling [Client.Pipeline()].
type PipelineSource struct {
Contributor

If PipelineSource is going to be a factory class, would it be better to rename this to PipelineFactory, or have the Pipeline class in Client be renamed to something reflecting that you're getting a factory class instead of just Client.Pipeline?

Contributor Author

This is to have uniformity across clients. Java, Node and Python use the same terminology.

"Source" is domain-appropriate. While it is a factory, naming it PipelineSource effectively describes its role. Changing to PipelineFactory might make it sound more like a GoF pattern and less like a domain concept.

Changing Client.Pipeline() to something like Client.GetPipelineFactory() or Client.NewPipelineBuilder() is more verbose.

Contributor

@daniel-sanche daniel-sanche left a comment

LGTM

@bhshkh bhshkh merged commit 687d67d into googleapis:feature/fs-pipeline-queries May 21, 2025
171 of 179 checks passed
bhshkh added a commit to bhshkh/google-cloud-go that referenced this pull request Jun 2, 2025
* feat(firestore): add pipeline queries

* add comments

* remove exists
@wu-hui wu-hui left a comment

LGTM, FWIW.

bhshkh added a commit that referenced this pull request Jun 24, 2025
* feat(firestore): add pipeline queries

* add comments

* remove exists
bhshkh added a commit that referenced this pull request Jul 1, 2025
* feat(firestore): add pipeline queries

* add comments

* remove exists
bhshkh added a commit that referenced this pull request Oct 21, 2025
…tions (#13147)

b/364927702

Add the rest of the arithmetic and comparison functions from
[go/firestore-query-tracker](http://go/firestore-query-tracker)


Previous pull requests:
- #12217
- #12425
- #12538
bhshkh added a commit that referenced this pull request Oct 23, 2025
)

b/364927702

1. Fixes test
```go
--- FAIL: TestPipelineResult_NoResults (0.00s)
panic: runtime error: invalid memory address or nil pointer dereference [recovered, repanicked]
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x10d40b6]

goroutine 4633 [running]:
testing.tRunner.func1.2({0x13c5240, 0x1ec2150})
	/usr/local/go/src/testing/testing.go:1872 +0x419
testing.tRunner.func1()
	/usr/local/go/src/testing/testing.go:1875 +0x683
panic({0x13c5240?, 0x1ec2150?})
	/usr/local/go/src/runtime/panic.go:783 +0x132
cloud.google.com/go/firestore.(*PipelineResult).Data(0xc00046f380)
	/tmpfs/src/google-cloud-go/firestore/pipeline_result.go:95 +0x96
cloud.google.com/go/firestore.TestPipelineResult_NoResults(0xc000455500)
	/tmpfs/src/google-cloud-go/firestore/pipeline_result_test.go:359 +0x312
testing.tRunner(0xc000455500, 0x157dca0)
	/usr/local/go/src/testing/testing.go:1934 +0x21d
created by testing.(*T).Run in goroutine 1
	/usr/local/go/src/testing/testing.go:1997 +0x9d3
FAIL	cloud.google.com/go/firestore	2.337s
```

```go
=== RUN   TestPipelineResultIterator_GetAll
    pipeline_result_test.go:249: second result id: got 1, want: 2
--- FAIL: TestPipelineResultIterator_GetAll (0.00s)
```
2. Add enterprise database env variable


Previous pull requests:

- #12217
- #12425
- #12538
- #13147
@bhshkh bhshkh changed the title feat(firestore): add pipeline queries feat(firestore): [PQ] add pipeline queries Oct 27, 2025
bhshkh added a commit that referenced this pull request Oct 30, 2025
…3194)

b/364927702

1. add all the remaining private preview aggregate functions. Merging
this PR completes the implementation of all the **type: "Function"
subType : "Accumulators (Aggregation)"** private preview features.
See "Firestore Features (Pipeline)" sheet in
[go/firestore-query-tracker](http://go/firestore-query-tracker) for the
list of features.

    Java reference:
-
https://github.com/googleapis/java-firestore/blob/ccaf9d4fac5bd87a4da3d37493ca66fdc7681bc3/google-cloud-firestore/src/main/java/com/google/cloud/firestore/pipeline/expressions/AggregateFunction.java#L43-L71
-
https://github.com/googleapis/java-firestore/blob/ccaf9d4fac5bd87a4da3d37493ca66fdc7681bc3/google-cloud-firestore/src/main/java/com/google/cloud/firestore/pipeline/expressions/AggregateFunction.java#L93-L111

2. add all the remaining private preview timestamp functions. Merging
this PR completes the implementation of all the **type: "Function"
subType: "Date / Timestamp"** private preview features (except the
timestamp_trunc function, which is not yet implemented in any of the
SDKs. It requires additional approvals from the Firestore team and will be
added in a separate PR).
See "Firestore Features (Pipeline)" sheet in
[go/firestore-query-tracker](http://go/firestore-query-tracker) for the
list of functions.

    Java reference:
-
https://github.com/googleapis/java-firestore/blob/ccaf9d4fac5bd87a4da3d37493ca66fdc7681bc3/google-cloud-firestore/src/main/java/com/google/cloud/firestore/pipeline/expressions/Expression.java#L2262-L2517

3. Add integration tests for all functions.
4. Remove Rand function since it is not targeted for private preview.
5. Renamed numericExprOrField to numericExprOrFieldPath since field is a
separate type/expression.

https://github.com/googleapis/google-cloud-go/blob/a3ee1f19068c6d3fb77ad797e29884a90d6402a2/firestore/pipeline_field.go#L21-L41



Previous pull requests

- #12217
- #12425
- #12538
- #13147
- #13199
- #13218
bhshkh added a commit that referenced this pull request Oct 30, 2025
…13218)

b/364927702

1. add all the remaining private preview stages. Merging this PR
completes the implementation of all the type "Stage" subType "General"
private preview features (except the literals stage, which is not yet
implemented in Java and Node. It requires additional approvals from the
Firestore team and will be added in a separate PR).
See "Firestore Features (Pipeline)" sheet in
[go/firestore-query-tracker](http://go/firestore-query-tracker)
2. Add integration and unit tests.
3. Refactor existing stages code to remove duplicated code and rearrange
in alphabetical order.
4. Modify behaviour of Data and DataTo to match existing implementation
in document.go

https://github.com/googleapis/google-cloud-go/blob/9a4cb31f4d34948404d91123fb560a43aeebe83e/firestore/document.go#L64-L127



Previous pull requests

- #12217
- #12425
- #12538
- #13147
- #13199
bhshkh added a commit that referenced this pull request Oct 30, 2025
)

1. add all the private preview 'array' functions. 
Merging this PR completes the implementation of all the **type:
"Function" subType : "Array"** private preview features (except
'maximum' and 'minimum', which are not yet implemented in any of the
SDKs. They require additional approvals from the Firestore team and will be
added in a separate PR). See "Firestore Features (Pipeline)" sheet in
[go/firestore-query-tracker](http://go/firestore-query-tracker) for the
list of features.
    Java reference: 
-
https://github.com/googleapis/java-firestore/blob/wuandy/JavaPplPP/google-cloud-firestore/src/main/java/com/google/cloud/firestore/pipeline/expressions/Expression.java
2. add all the private preview 'string' functions (except
'string_split', which is not yet implemented in any of the SDKs. It requires
additional approvals from the Firestore team and will be added in a separate
PR).

3. add all the private preview 'vector' functions. 
4. add remaining types to ConstantOf to match Java's implementation.
    Java reference: 
-
https://github.com/googleapis/java-firestore/blob/ccaf9d4fac5bd87a4da3d37493ca66fdc7681bc3/google-cloud-firestore/src/main/java/com/google/cloud/firestore/pipeline/expressions/Expression.java#L70-L211

Previous pull requests

- #12217
- #12425
- #12538
- #13147
- #13199
- #13218
- #13194
bhshkh added a commit that referenced this pull request Oct 30, 2025
b/364927702

- toExprOrField was renamed to asFieldExpr in
https://github.com/googleapis/google-cloud-go/pull/13194/files#diff-4a55211f7d38a1f0599e2f4cc92795073f138b2c56b846c933bda19e26bc3a7a
. There were a few call sites where the rename was missed while resolving
merge conflicts, which caused build failures. Fixing those failures in
this PR.
- the function signature of Data was changed in
#13218. It no longer
returns err as second argument. Fixing this in this PR.
- remove duplicate asInt64Expr and asStringExpr
- Move pipeline tests to their own file


Previous pull requests

- #12217
- #12425
- #12538
- #13147
- #13199
- #13218
- #13194
- #13245
bhshkh added a commit that referenced this pull request Oct 30, 2025
b/364927702

- Combine FieldOf and FieldOfPath to avoid verbose name FieldOfPath


Previous pull requests

- #12217
- #12425
- #12538
- #13147
- #13199
- #13218
- #13194
- #13245
- #13270

---------

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
bhshkh added a commit that referenced this pull request Oct 31, 2025
1. Move PipelineStages integration tests.
2. Remove IsNaN, IsNotNaN, IsNull, IsNotNull, Equivalent as they are no
longer supported by backend
3. Remove examples as commented here

#13245 (comment)

Previous pull requests

- #12217
- #12425
- #12538
- #13147
- #13199
- #13218
- #13194
- #13245
- #13270
- #13271
bhshkh added a commit that referenced this pull request Nov 3, 2025
Add raw stage similar to Java

https://github.com/googleapis/java-firestore/blob/742fab6583c9a6f9c47cf0496124c3c9b05fe0ee/google-cloud-firestore/src/main/java/com/google/cloud/firestore/Pipeline.java#L997-L1021


The raw stage is an escape hatch to allow customers to consume new
stages supported by the backend without having to update their SDK to a
version that adds the stage.
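
As a loose illustration of the escape-hatch idea, a raw stage can be pictured as a stage whose name and arguments are supplied by the caller rather than fixed by a dedicated struct. Everything in this sketch is hypothetical: rawStage, stageProto and the stage name "some_future_stage" are invented for the example and do not reflect the signature added in this commit.

```go
package main

import "fmt"

// stageProto is a hypothetical stand-in for the protobuf stage message.
type stageProto struct {
	Name string
	Args []any
}

// rawStage carries a caller-supplied stage name and untyped arguments,
// so it can describe stages this version of the code knows nothing about.
type rawStage struct {
	stageName string
	args      []any
}

// toProto forwards the name and arguments as given; validation of the
// arguments is left to the backend in this sketch.
func (s rawStage) toProto() (*stageProto, error) {
	if s.stageName == "" {
		return nil, fmt.Errorf("raw stage name must not be empty")
	}
	return &stageProto{Name: s.stageName, Args: s.args}, nil
}

func main() {
	s := rawStage{stageName: "some_future_stage", args: []any{"field", int64(5)}}
	p, err := s.toProto()
	if err != nil {
		panic(err)
	}
	fmt.Println(p.Name, p.Args)
}
```

The cost of this flexibility is that argument mistakes surface only at request time, which is why it sits alongside typed stages rather than replacing them.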



Previous pull requests

- #12217
- #12425
- #12538
- #13147
- #13199
- #13218
- #13194
- #13245
- #13270
- #13271
- #13279
bhshkh added a commit that referenced this pull request Nov 3, 2025
1. Move PipelineStages integration tests.
2. Remove IsNaN, IsNotNaN, IsNull, IsNotNull, Equivalent as they are no
longer supported by backend
3. Remove examples as commented here

#13245 (comment)

Previous pull requests

- #12217
- #12425
- #12538
- #13147
- #13199
- #13218
- #13194
- #13245
- #13270
- #13271
bhshkh added a commit that referenced this pull request Nov 10, 2025
#13283)

Changes in this PR:

1. firestore_client.go : Updated generated client as per
googleapis/gapic-generator-go#1661 . Removed
retries from tests since the headers have now been fixed.
2. Remove Equivalent since it was removed from backend.
3. Add/update comments
4. Add timestamp truncate (pending from
#13194) and string
split (pending from
#13245) functions.
5. add all the private preview general, key, logical (except iferror),
type and object functions.
See "Firestore Features (Pipeline)" sheet in
[go/firestore-query-tracker](http://go/firestore-query-tracker) for the
list of functions.
    Java reference:
-
https://github.com/googleapis/java-firestore/blob/ccaf9d4fac5bd87a4da3d37493ca66fdc7681bc3/google-cloud-firestore/src/main/java/com/google/cloud/firestore/pipeline/expressions/Expression.java



Previous pull requests

- #12217
- #12425
- #12538
- #13147
- #13199
- #13218
- #13194
- #13245
- #13270
- #13271
- #13279
- #13280
- #13281
- #13282
- googleapis/gapic-generator-go#1661
Labels

api: firestore Issues related to the Firestore API.

4 participants