Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

V2 json schema es docs tests #1392

Merged
merged 30 commits into from
Sep 26, 2018
Merged

Conversation

roncohen
Copy link
Contributor

@roncohen roncohen commented Sep 18, 2018

I'm putting this up for early feedback

There are still a number of things to do:

  • streamline when the tests use regex, tests.Group or just prefix match. Here i changed isBlacklistedKey to only do prefix matching for tests.Group, but really, we should probably just always use regex. outside the scope of this change
  • DataValidation for stream for error, transaction, span and error
  • metadata tests
  • lots of cleanups

@roncohen roncohen requested a review from simitt September 18, 2018 12:56
@roncohen roncohen force-pushed the v2-json-schema-es-docs-tests branch from bf4fe3d to 8604731 Compare September 19, 2018 08:36
Copy link
Contributor

@simitt simitt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the nature of the payloads changed quite a lot, there are a lot of exceptions for the tests now, which makes them a bit confusing. As we discussed offline some time ago, in the long run it would be nice to simplify the payload<>events and payload<>json schema tests and have a more generalised mutation framework for the json schema rules. But that would be too much of a change for this.

I think the introduction of the test_processors is a nice way to abstract their different behaviour.

@@ -65,6 +66,10 @@
type: long
description: The total amount of dropped spans for this transaction.

- name: started
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are you indexing this? I thought this field will only be used for UI purposes to indicate whether or not spans are still missing.

@@ -104,6 +105,7 @@ func keywordExceptionKeys(s *tests.Set) *tests.Set {

func templateToSchemaMapping(mapping map[string]string) map[string]string {
return map[string]string{
"context.agent.": "agent.",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

context doesn't have an agent property, as agent is part of context.service which is covered here.

"context.service.runtime.name",
"context.service.agent.name",
"context.service.agent.version",
"context.service.runtime.version",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about using a tests.Group for those?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks!. I updated KeywordLimitation to support tests.Groups now.


tests.Group("context.custom"),
tests.Group("context.request.env"),
tests.Group("context.request.cookies"),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aren't most of those excluded anyways in the test definition

notInFields := Union(payloadAttrsNotInFields, NewSet(
?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

indeed, thanks!

func errorKeywordExceptionKeys(s *tests.Set) *tests.Set {
return tests.Union(s, tests.NewSet(
"processor.event", "processor.name", "listening", "error.grouping_key",
"error.id", "transaction.id", "context.tags", "error.parent_id", "error.trace_id",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

id, transaction.id, parent_id and trace_id should not be exceptions any more, as the pattern has been removed, so a maxLength check is necessary now.

return nil, err
}

err = p.Processor.Validate(pl)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not reuse p.Decode and p.Validate here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what's the advantage? For Decode, we need the returned metadata and transformables. p.Decode only returns an error

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why can't you return that from decode? I just tried to avoid having the Processor.Validate and Processor.Decode duplicated. But it is not a big deal.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what you mean. Do you want me to call p.Decode instead of p.Processor.Decode and p.Validate instead of p.Processor.Validate?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, use p.Decode instead of p.Processor.Decode, same for validation - as you are duplicating the logic from the processor here (which I thought you wanted to avoid with this implementation). But as I said, it is a minor detail, so if you don't agree, it's fine with me to leave as is.

return err
}
}
return nil
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Decode and Validate is the same here, so you can alias one.

@roncohen roncohen force-pushed the v2-json-schema-es-docs-tests branch 3 times, most recently from 7a0cf9f to 6e36a34 Compare September 20, 2018 11:36
@roncohen
Copy link
Contributor Author

@simitt apart from more metadata tests, this is ready for another round of feedback

Copy link
Contributor

@simitt simitt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM from a high level perspective, will do a more thorough review once it is finished. Really looking forward to have those tests in.

Copy link
Contributor

@simitt simitt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally lgtm.

@@ -45,7 +45,7 @@ var (
processorEntry = common.MapStr{"name": processorName, "event": docType}
)

var cachedModelSchema = validation.CreateSchema(schema.ModelSchema, processorName)
var cachedModelSchema = validation.CreateSchema(schema.ModelSchema, "metricset")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest to change from metrics to metricset everywhere, as it is confusing otherwise. Looks like we overlooked this in #1359.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see what you mean. I wrote a comment on the PR regarding this, but it wasn't very clear: #1359 (comment)

What i meant there is that in v1, it's called metrics. There are some places in the code that is only used by v1 which needs to still be named "metrics", because otherwise the v1 API changes. Additionally, there are several places that is shared between v1 and v2 and where we have to choose between making it right for v1 and making it right for v2. Then there are the places that only affect v2 and those were the ones i chose to change in that PR. I'll go ahead and change it to metricset everywhere that doesn't affect v1 directly to minimize confusion.

// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
// KIND, either express or implied. See the License for the
// specific language governing permissions and limitations
// under the License.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please rename the file to v1_test_processor


type V2TestProcessor struct {
stream.StreamProcessor
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest to rename the file from common to v2_test_processor.

@@ -82,21 +87,28 @@ var (
// specified in the schema.
// - schemaAttrsNotInPayload: attributes that are reflected in the json schema but are
// not part of the payload.
func (ps *ProcessorSetup) PayloadAttrsMatchJsonSchema(t *testing.T, payloadAttrsNotInSchema, schemaAttrsNotInPayload *Set) {
require.NotNil(t, ps.Schema)
func (ps *ProcessorSetup) PayloadAttrsMatchJsonSchema(t *testing.T, payloadAttrsNotInSchema, schemaAttrsNotInPayload *Set, schemaPrefix string) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer to add schemaPrefix to the ProcessorSetup, it is not changing between tests for the same setup.

FullPayloadPath: "../testdata/sourcemap/payload.json",
TemplatePaths: []string{"../../../model/sourcemap/_meta/fields.yml"},
Schema: schema.PayloadSchema,
}
)

func TestPayloadAttrsMatchFields(t *testing.T) {
procSetup.PayloadAttrsMatchFields(t, tests.NewSet("sourcemap"), tests.NewSet())
procSetup.PayloadAttrsMatchFields(t, tests.NewSet("sourcemap.sourcemap"), tests.NewSet())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good find!

@@ -139,7 +142,7 @@
"hex_id": "abcde56a89012345",
"id": 3156431159584433000,
"name": "get /api/types",
"parent": 3146835049025875000,
"parent": 3156441702022342700,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What was the motivation to change the testfile so all spans have the same parent_id? I think the tests are more exhaustive with different ids.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks like a rebase mistake, thanks.

func errorPayloadAttrsNotInFields(s *tests.Set) *tests.Set {
return tests.Union(s, tests.NewSet(
"error.exception.attributes",
"error.exception.attributes.foo",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not also using tests.Group("error.exception.attributes") here?

// specific language governing permissions and limitations
// under the License.

package package_tests
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a attrs_common field in the processor/error/package_tests which was meant to build the base for v1, v2, dt tests. If you move this to the tests folder you could reuse it, instead of defining the same attrs again.

Otherwise I don't see the point in defining the methods allowing to pass in another set that is merged into the defined set. E.g.

func errorPayloadAttrsNotInFields(s *tests.Set) *tests.Set {
	return tests.Union(s, tests.NewSet(

Copy link
Contributor Author

@roncohen roncohen Sep 25, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we discussed offline how to achieve this. The test helpers are mostly the same except for the first part of them. In v1 they are "errors", while in v2 it's "error". Parameterizing the helpers will get too complicated. In summary, we didn't find a good way to reuse the helpers at this time. I'll remove the parameter that allows to pass in a set. We'll maintain separate copies of the helpers for v2 and do a refactor of the tests once v1 has been fully dropped.


"error.trace_id": tests.Condition{Existence: obj{"error.parent_id": "abc123", "error.transaction_id": "abc123"}},
"error.transaction_id": tests.Condition{Existence: obj{"error.parent_id": "abc123", "error.trace_id": "abc123"}},
"error.parent_id": tests.Condition{Existence: obj{"error.transaction_id": "abc123", "error.trace_id": "abc123"}},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about the exception.message and exception.type conditional requirements?

@roncohen roncohen force-pushed the v2-json-schema-es-docs-tests branch from f83920f to be5ba78 Compare September 25, 2018 15:32
{"transaction": { "id": "cdef4340a8e0df19", "trace_id": "0acd456789abcdef0123456789abcdef", "type": "request", "duration": 13.980558, "timestamp": "2018-07-30T18:53:42.281Z", "sampled": null, "span_count": { "dropped": null, "started": 8 }, "context": { "request": { "socket": { "remote_address": null, "encrypted": null }, "method": "POST", "headers": { "user-agent": null, "content-type": null, "cookie": null }, "url": {} }, "response": { "headers": { "content-type": null } }, "tags": { "organization_uuid": "9f0e9d64-c185-4d21-a6f4-4673ed561ec8" }, "custom": { "my_key": 1, "some_other_value": "foo bar", "and_objects": { "foo": [ "bar", "baz" ] } } } }}
{"metadata": {"service": {"name": "1234_service-12a3","version": "5.1.3","environment": "staging","language": {"name": "ecmascript","version": "8"},"runtime": {"name": "node","version": "8.0.0"},"framework": {"name": "Express","version": "1.2.3"},"agent": {"name": "elastic-node","version": "3.14.0"}},"process": {"pid": 1234,"ppid": 6789,"title": "node","argv": ["node","server.js"]},"system": {"hostname": "prod1.example.com","architecture": "x64","platform": "darwin"}}}
{ "transaction": { "id": "945254c567a5417e", "trace_id": "0123456789abcdef0123456789abcdef", "parent_id": "abcdefabcdef01234567", "type": "request", "duration": 32.592981, "span_count": { "started": 43 }} }
{"transaction": {"id": "945254c5-67a5-417e-8a4e-aa29efcbfb79", "trace_id": "0acd456789abcdef0123456789abcdef", "name": "GET /api/types","type": "request","duration": 32.592981,"result": "success","timestamp": "2017-05-30T18:53:27.154Z", "sampled": true, "span_count": {"started": 17},"context": {"request": {"socket": {"remote_address": "12.53.12.1","encrypted": true},"http_version": "1.1","method": "POST","url": {"protocol": "https:","full": "https://www.example.com/p/a/t/h?query=string#hash","hostname": "www.example.com","port": "8080","pathname": "/p/a/t/h","search": "?query=string","hash": "#hash","raw": "/p/a/t/h?query=string#hash"},"headers": {"user-agent": "Mozilla Chrome Edge","content-type": "text/html","cookie": "c1=v1; c2=v2","some-other-header": "foo","array": ["foo","bar","baz"]},"cookies": {"c1": "v1","c2": "v2"},"env": {"SERVER_SOFTWARE": "nginx","GATEWAY_INTERFACE": "CGI/1.1"},"body": {"str": "hello world","additional": { "foo": {},"bar": 123,"req": "additional information"}}},"response": {"status_code": 200,"headers": {"content-type": "application/json"},"headers_sent": true,"finished": true}, "user": {"id": "99","username": "foo","email": "[email protected]"},"tags": {"organization_uuid": "9f0e9d64-c185-4d21-a6f4-4673ed561ec8"},"custom": {"my_key": 1,"some_other_value": "foo bar","and_objects": {"foo": ["bar","baz"]},"(": "not a valid regex and that is fine"}}}}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer to not use "id": "945254c5-67a5-417e-8a4e-aa29efcbfb79" as ID here. It is valid now that we removed the pattern validation on the Intake API, but it is not what we expect to receive. Testdata are often used to showcase what is sent by the agents.

return tests.NewSet(
"listening", "view errors", "error id icon",
"context.user.user-agent", "context.user.ip", "context.system.ip",
"error.parent_id", "error.trace_id",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

error.parent_id and error.trace_id should not be listed here.

}

func getMetadataEventAttrs(t *testing.T, prefix string) *tests.Set {
payloadStream, err := loader.LoadDataAsStream("../testdata/intake-v2/spans.ndjson")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest to modify the only-metadata.ndjson to contain all possible metadata and use this file. It is a bit confusing to use a span.ndjson file here.


func transactionContext() *tests.Set {
return tests.NewSet(
tests.Group("context.request.url"),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can remove tests.Group("context.request.url"), as it is covered by tests.Group("context.request")

"span.context.request.headers.array",
"span.stacktrace.vars.key",
"span.context.tags.tag1",
tests.Group("metadata"),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tests.Group("metadata") can be removed

"span.stacktrace.filename",
"span.stacktrace.lineno",
"span.context.request.method",
"span.context.request.url",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove span.context.request

func transactionPayloadAttrsNotInFields() *tests.Set {
return tests.NewSet(
tests.Group("transaction.marks."),
tests.Group("context.db"),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please remove tests.Group("context.db"), it only concerns spans.

tests/fields.go Outdated
require.NoError(t, err)

// all, err := fetchFlattenedFieldNames(ps.TemplatePaths, addAllFields)
// require.NoError(t, err)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

leftover

"error.log": tests.Condition{Absence: []string{"error.exception"}},

"error.message": tests.Condition{Absence: []string{"error.type"}},
"error.type": tests.Condition{Absence: []string{"error.message"}},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be error.exception.message and error.exception.type. I assume the test doesn't fail as the specified attribute doesn't exist, so it never actually tests it.

"error.exception.message",
"error.exception",
"error.log",
"error.log.type",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no attribute error.log.type, this should probably be error.exception.type.

Copy link
Contributor

@simitt simitt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for adding this - great to have the full testing in v2!

@roncohen roncohen merged commit 0b7d3b4 into elastic:v2 Sep 26, 2018
@roncohen
Copy link
Contributor Author

thanks for taking the time and having patience in reviewing this @simitt !

@roncohen roncohen deleted the v2-json-schema-es-docs-tests branch September 26, 2018 15:01
roncohen added a commit to roncohen/apm-server that referenced this pull request Oct 7, 2018
roncohen added a commit to roncohen/apm-server that referenced this pull request Oct 15, 2018
roncohen added a commit to roncohen/apm-server that referenced this pull request Oct 15, 2018
roncohen added a commit to roncohen/apm-server that referenced this pull request Oct 15, 2018
roncohen added a commit to roncohen/apm-server that referenced this pull request Oct 15, 2018
roncohen added a commit to roncohen/apm-server that referenced this pull request Oct 15, 2018
roncohen added a commit to roncohen/apm-server that referenced this pull request Oct 16, 2018
roncohen added a commit that referenced this pull request Oct 16, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants