
Investigate inline data conversion #16

Open
JBAhire opened this issue Oct 22, 2020 · 0 comments
Labels
enhancement New feature or request

Comments


JBAhire commented Oct 22, 2020

Here, the open question is:

  • Can we support inline format conversion directly in our span-normalizer (relying on a Java library for the data conversion)?
  • Something like the hypertrace-ingestion image having two roles:
    • receiving spans
    • processing spans

The expected outcome: reduced operational complexity, fewer third-party dependencies, dropping the hypertrace-collector container, and easier debugging.

Ref: https://github.com/ExpediaGroup/pitchfork

The rationale behind this:

  • How many different formats will we have?
  • Data format conversion: Zipkin, Jaeger, OpenTelemetry, OpenCensus
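
To make the question concrete, here is a minimal sketch of what inline conversion could look like. All class, method, and field names are hypothetical, not the actual span-normalizer API; a real implementation would rely on each format's Java library for decoding.

```java
import java.util.Map;

/**
 * Hypothetical sketch only: names are illustrative, not the real
 * span-normalizer API. The idea is that the normalizer detects the
 * incoming wire format and converts it inline to the internal
 * representation, instead of running a separate converter service
 * (e.g. a collector container) in front of the pipeline.
 */
public class InlineSpanConverter {

  /** Minimal stand-in for the internal RawSpan type. */
  public static class NormalizedSpan {
    public final String traceId;
    public final String sourceFormat;

    public NormalizedSpan(String traceId, String sourceFormat) {
      this.traceId = traceId;
      this.sourceFormat = sourceFormat;
    }
  }

  /**
   * Dispatch to a per-format converter. A real implementation would use
   * each format's own Java decoder library rather than a generic
   * key/value payload as done here for illustration.
   */
  public static NormalizedSpan convert(String format, Map<String, String> payload) {
    switch (format) {
      case "zipkin":
      case "jaeger":
      case "otel":
      case "opencensus":
        return new NormalizedSpan(payload.getOrDefault("traceId", ""), format);
      default:
        throw new IllegalArgumentException("Unsupported span format: " + format);
    }
  }
}
```
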
kotharironak added a commit that referenced this issue Nov 9, 2020
* first commit

* fix: use empty array for image pull secrets

* First commit for the Span normalizer.

* Run helm dependency before packaging the helm chart.

Also did minor cleanups in gradle file.

* Trying to turn off parallelism for gw publish to see if that fixes
some weird issue with publishing the jars.

* first commit

* inserting service name into first class field

* list of values for query params

* first commit

* Empty commit to re-trigger build

* Don't explicitly set a partitioner in which case it falls back to Kafka producer's default strategy of round robin distribution to the target topic partitions

* Don't explicitly set a partitioner in which case it falls back to Kafka producer's default strategy of round robin distribution to the target topic partitions

* Added test for serviceName verification

* Added empty service_name check

* Added test for null servicename

* removing unused dependency

* Updates to community license

* Custom method to parse query param values

* renamed variable

* renaming variables

* fixed typo

* modified the parsing and added tests

* chore: update to Apache 2.0 License

* chore: update to Apache 2.0 License

* chore: update to Community License

* Remove accidentally added log file

* fix: missing useJunitPlatform() in build.gradle.kts.

* using the first class service name field

* renaming variables

* Addressed code reviews

* code reviews

* fix: update dependencies to remove kafka topic creation

* Fix formatting of build file for consistency

* Upgrading the entity-service client to latest version.

* Handling invalid query param inputs

* templatize flink parallelism and kafka producer configurations

* make flink parallelism optional

* update view creator job configuration

* fix: only set parallelism if it's defined in the config

* Just assert > 0

* add weight in Kube job to maintain exec order

* Added necessary dependency

* upgraded dependency

* upgraded dependency

* Upgrading dependency

* suppressing the snyk vulnerability

* ignoring snyk

* Added another missing enricher dependency

* chore: upgrade docker plugin

* chore: upgrade docker plugin

* chore: upgrade docker plugin

* chore: upgrade docker plugin

* fix: snyk build failure

* ci: update docker plugin version

* ci: update docker plugin version

* ci: update docker plugin version

* ci: update docker plugin version

* ci: push out snyk ignore expirations

* Handle the HTTPS protocol also while resolving HTTP backends.

This fixes a bug where an exit call with HTTPS protocol wasn't creating a backend.
Note: Backend type for both HTTP and HTTPS protocols is still HTTP.

* Adding HTTPS BackendType so that we can distinguish
HTTP vs HTTPS backends.

* expose schema registry and broker property as env

* provides a way to override the mongo_host via an environment variable

* prefixing env vars with underlying component like kafka, zk, etc

* Addressed Ravi's comments of prefixing with underlying component

* Going back to ENV base approach for default tenant id

* made consistent kafka broker env vars

* exposing common host related config as envs

* provides a way to override common properties via envs

* exposing source and sink topic too as envs

* avoids name clashes between kube envs vs app envs

* chore: upgrade docker plugin

* chore: upgrade docker plugin

* chore: upgrade docker plugin

* chore: upgrade docker plugin

* bumping up version of view framework

* Using the latest metrics library and reporting per-tenant span normalization time.

* Adding unit tests and using the latest platform-metrics library.

* Code cleanup.

* Version upgrade to fix snyk errors.

* Upgrading to the newer version of metrics library to fix the service start issue.

* Locks Java to 11

* Locks Java to 11 (#11)

* Locks Java to 11 (#11)

* Locks Java to 11 (#16)

* Moves to new docker plugin which corrects port defaults and health check (#17)

was
```
hypertrace-trace-enricher      java -cp /app/resources:/a ...   Up                      8080/tcp
```

now
```
hypertrace-trace-enricher      java -cp /app/resources:/a ...   Up (healthy)            8099/tcp
```

* Moves to new docker plugin which corrects port defaults and health check (#15)

After
```
span-normalizer                java -cp /app/resources:/a ...   Up (healthy)   8099/tcp
```

* Moves to new docker plugin which corrects port defaults and health check (#12)

after:
```
raw-spans-grouper              java -cp /app/resources:/a ...   Up (healthy)   8099/tcp
```

* updates README (#14)

* Update README.md

* Update README.md (#19)

Update README.md

* Add creation timestamp to traces based on sampling percent (#13)

* e2e latency first draft

* remove unused imports

* Sampling percent

* get percent from conf as double

* fix test

* update data model versions

* rename variable

* better test coverage

* better name for sampling conf property

* assert in test

* Enrichment arrival lag metric (#18)

* lag metric

* use new data model

* remove redundant changes

* init timer in static block

* in line init of enrichmentArrivalTimer

* view gen arrival lag metric (#14)

* view gen arrival lag metric

* view gen as single word

* remove redundant imports

* Moves to new docker plugin which corrects port defaults and health check (#12)

after:
```
all-views-creator              java -cp /app/resources:/a ...   Exit 0
all-views-generator            java -cp /app/resources:/a ...   Up (healthy)   8099/tcp
```

* Adds one-line description of images produced by this build (#13)

* Report span grouper lag and record trace creation time (#15)

* Report trace creation time and span arrival lag

* rename timer

* update dependency versions

* update dependency versions

* Update data model and declare ENRICHMENT_ARRIVAL_TIME (#20)

* update data model

* self declare timer

* update dependencies (#15)

* update view gen framework version (#16)

* update view-generator-framework version (#17)

* chore: switch from master to main

* chore: switch from master to main

* chore: switch from master to main

* chore: switch from master to main

* Disable slf4j metrics reporter (#16)

* fix: add pod annotations to helm values (#17)

* fix: add pod annotations to helm values (#16)

* adds arch diagram, list of enrichers and explains use-case with example (#22)

* added description of trace-enrichers and build steps

* updated formatting

* Addresses Avinash's suggestions

Co-authored-by: Avinash <[email protected]>

* Addresses Avinash's suggestions

Co-authored-by: Avinash <[email protected]>

* fixes snyk issue

* fixes snyk issue

* fixes snyk issue

* addressed Shubham's comment

* reverts snyk fix

* reverts snyk fix

* fixes snyk issue

* fix: snyk

Co-authored-by: Avinash <[email protected]>
Co-authored-by: SJ <[email protected]>

* adds ingestion arch and explain working of span-normalizer (#20)

* added build steps and image source

* adds working of span-normalizer

* rearranging sections

* updates links

* adds ingestion arch and explain working of raw-spans-grouper (#18)

* added build steps and image source

* adds working of raw-spans-grouper

* updates to newer clients which cache miss hit logs (#25)

* fix: log a deserializable raw span (#21)

* fix: log a deserializable raw span

* Update grpc netty dependency to fix vulnerability detected by snyk

* Add unit test for the avro to json converter method

* check if debug is enabled before invoking method

* fix: stop using StructuredTraceBuilder in ApiTraceGraph (#24)

* fix: stop using StructuredTraceBuilder in ApiTraceGraph

* update data-model dependency version in other projects

* fix: update trace enricher api to patch the empty event id in event ref list issue (#19)

* fix: update trace enricher api to patch the empty event id in event ref list issue

* update snyk deadline

* adds testing section and rearranges sections as per suggestions (#20)

* added testing section

* updated README

* adds testing section and rearranges sections as per suggestions (#27)

* adds testing in README

* updated readme and added COC and Contribution guidelines

* removed contributing guidelines and coc for now

* test

* adds description, ingestion architecture and testing steps in README (#21)

* Update README.md

* updates one line description

* Update README.md

Co-authored-by: Avinash <[email protected]>

* Update README.md

Co-authored-by: Avinash <[email protected]>

* Update README.md

Co-authored-by: Avinash <[email protected]>

* Addresses Avinash's suggestions

Co-authored-by: Avinash <[email protected]>

Co-authored-by: Avinash <[email protected]>

* adds testing section and rearranges sections as per suggestions (#23)

* adds testing

* renames license file

* updated sections

* updated testing sections

* updated gradle plugin

* removed extra whitespace

* Host header enrichment should look at x-forwarded headers when host header is localhost. (#26)

Host header enrichment should look at the x-forwarded-host header when the host header is localhost. This helps identify the correct host in the case of sidecar deployments.
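
A rough sketch of the fallback described in this commit (the method name and the localhost checks are illustrative, not the enricher's actual code):

```java
/**
 * Illustrative sketch of the localhost fallback: when the Host header
 * points at the local sidecar, prefer the x-forwarded-host header so
 * the real upstream host gets recorded on the span.
 */
public class HostResolver {

  public static String resolveHost(String hostHeader, String xForwardedHost) {
    boolean isLocal = hostHeader == null
        || hostHeader.startsWith("localhost")
        || hostHeader.startsWith("127.0.0.1");
    if (isLocal && xForwardedHost != null && !xForwardedHost.isEmpty()) {
      return xForwardedHost;  // sidecar case: trust the forwarded host
    }
    return hostHeader;        // normal case: Host header is already correct
  }
}
```
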

* Migrate to kafka streams (#18)

* Migrate span-normalizer to kafka streams

* use trace_id as key when sending RawSpans to the output topic. The raw-spans-grouper can then do a groupByKey on the trace_id
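
The keying idea in this commit can be sketched with plain collections (a HashMap stands in for the Kafka Streams grouped stream; names are illustrative):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/**
 * Illustrative stdlib sketch: once spans are produced with trace_id as
 * the message key, assembling a trace is just a group-by on that key.
 * Kafka Streams' groupByKey does this per-partition without reshuffling.
 */
public class TraceGrouper {

  /** Each span is a {traceId, spanId} pair for simplicity. */
  public static Map<String, List<String>> groupByTraceId(List<String[]> spans) {
    Map<String, List<String>> traces = new HashMap<>();
    for (String[] span : spans) { // span[0] = traceId, span[1] = spanId
      traces.computeIfAbsent(span[0], k -> new ArrayList<>()).add(span[1]);
    }
    return traces;
  }
}
```
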

* Check if KStream for an input topic already exists and re-use it.

* Extract common streams config

* add producer.max.request.size property

* merged topic creation changes as part of this PR

* Schema registry configurable and decoupled

* 1. bring in changes from job-config branch
2. clean up flink dependencies
3. add a simple integration test

* revert config changes done for local testing

* remove dev docker registry used for testing

* Update span-normalizer/build.gradle.kts

Co-authored-by: Ronak <[email protected]>
Co-authored-by: Laxman Ch <[email protected]>
Co-authored-by: kotharironak <[email protected]>

* Kafka streams config fix (#25)

* fix: update data model version (#23)

* fix: update data-model version (#32)

* fix(UserAgentSpanEnricher): use first class user agent attributes (#29)

* fix(UserAgentSpanEnricher): use first class user agent attributes

* chore: prefer user agent from request header

* test(UserAgentSpanEnricherTest): add more tests

* fix: update data model and ht trace enricher api versions (#22)

* update pinot broker and server default tags (#23)

* fix: stop logging full span in api trace graph (#33)

* fix: remove one more noisy log (#34)

* Fixed tests to use AvroSerDe without schema registry (#26)

* Migrate ht-trace-enricher to kafka streams (#23)

* Migrate ht-trace-enricher to kafka streams

* Check if KStream for an input topic already exists and re-use it.

* SerDe configurable and schema registry decoupled from code

* SerDe configurable and schema registry decoupled from code

* incorporate the latest kafka-stream-framework related changes (#31)

* incorporate the latest kafka-stream-framework related changes

* fixed review comments

* Helm changes for dev verification

Co-authored-by: Laxman Ch <[email protected]>

* chore: cleanup by reverting helm test changes

* chore: updates to latest framework lib, and clean up

* Update helm/values.yaml

* uses AvroSerDe instead of schema-registry-based SerDe in test case

* addressed review comments

Co-authored-by: Laxman Ch <[email protected]>
Co-authored-by: kotharironak <[email protected]>
Co-authored-by: Ronak <[email protected]>

* Raw spans grouper using kafka streams (#17)

* Raw spans grouper using kafka streams

* use groupByKey for grouping spans into traces

* remove unused kv mapper

* Check if KStream for an input topic already exists and re-use it.

* 1. Remove RawSpansHolder and instead use RawSpans from data-model/RawSpan.avdl
2. Fix imagePullSecrets variable in helm/templates/deployment.yaml

* rename prefix RawSpansHolder to RawSpans

* merged topics creation changes, and modify override methods based on new changes

* add statefulset helm template

* fix jmx-exporter config

* SerDe configurable and schema registry decoupled from code

* merged the changes of PR#22 of providing jobConfig

* updates the subject.name strategy for schema registry

* Helm changes for dev verification

* Helm changes for dev verification

* 1. add a test to understand how time based windowing works
2. bump grace period to 10s (still see expiry - need to debug)
3. increased replicas to 2 and stream.threads to 4 each

* wip

* Use processing time based session windowing for generating traces
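
The emit condition behind processing-time session windowing can be sketched as follows (the timeout constant is illustrative, not the configured default):

```java
/**
 * Illustrative sketch of processing-time session windowing: a trace is
 * emitted once no new span for its trace_id has arrived within the
 * grouping window timeout, measured in processing (wall-clock) time.
 */
public class SessionWindow {

  // Illustrative value; the real job reads groupingWindowTimeoutMs from config.
  static final long GROUPING_WINDOW_TIMEOUT_MS = 30_000;

  /** True when the gap since the last span exceeds the timeout. */
  public static boolean shouldEmitTrace(long lastSpanArrivalMs, long nowMs) {
    return nowMs - lastSpanArrivalMs >= GROUPING_WINDOW_TIMEOUT_MS;
  }
}
```
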

* remove unused property

* Remove changes used for local testing

* remove service account config

* fix comment

* rename variable

* Fix tests

* set replication.factor=3 for internal topics

* remove overridden method and instead use the one from base class

* rename sessionTimeoutMs to groupingWindowTimeoutMs

* make method defaultValueSerde private and remove unused constants

* address review comments

* add an additional test case

Co-authored-by: Ronak <[email protected]>
Co-authored-by: Laxman Ch <[email protected]>

* make num.stream.threads, replication.factor configurable

* Use kafka-streams-framework (#20)

* Use kafka-streams-framework +
Check if KStream for an input topic already exists and re-use it.

* Updates to config

* updates the configs for pre_create.topics option

* updates to latest version of view-gen framework

* Update gradle.properties

* cleaned up the PRs with final changes

* Update gradle.properties

* reverted back to master copy as there was no change

Co-authored-by: Ronak <[email protected]>
Co-authored-by: kotharironak <[email protected]>

* temporarily disable restorePunctuators until we identify why there are duplicate spans (#24)

* chore: update configs to provide new required settings (#24)

* chore: update configs to provide new required settings

* chore: remove bloom filters and update libs

* updating view-generator-framework to latest version (#25)

* include tenant_id in key when producing to output (#27)

* include customer_id in key when producing to output

* revert change

* address review comments

* Use TraceIdentity as the key when sending RawSpan messages to the topic

* rename customer_id to tenant_id

* make field mandatory

* Use TraceIdentity as key for state stores (#26)

* Fix logs and remove update of traceEmitTriggerStore inside punctuator as it is not needed

* handle the case when a RawSpans object is empty. This should never happen but adding code to guard against it.

* fix unused imports

* 1. Use TraceIdentity as the key for state stores
2. Report spans_per_trace distribution metric

* rename customer_id to tenant_id

* Update raw-spans-grouper/build.gradle.kts

Co-authored-by: ravisingal <[email protected]>

Co-authored-by: ravisingal <[email protected]>

* templatize kafka streams configurations for large messages (#28)

* Add RpcFieldsGenerator for parsing rpc.* tags (#28)

* Add RpcFieldsGenerator for parsing rpc.* tags

rpc.* tags have been defined for OTel. Added a protocol generator
for such tags. The custom rpc.* tags defined get translated to the
relevant rpc protocol such as grpc, java_rmi etc. Since we support
grpc only for now, added support for translating the tags to the grpc
event field

* Fix protos

* Added new constants as enum instead of proto.

* Updated the test dependency

* Updated data models to latest (#29)

* Use named fields for enriching grpc info (#36)

* Use named fields for enriching grpc info

Updated data models to have Rpc event field with system
value set to grpc in case the request is grpc. Added the check
rpc.system = grpc in SpanTypeAttributeEnricher.
Updated UserAgentSpanEnricher and ApiBoundryTypeAttributeEnricher
to use rpc.request.metadata. prefixed tags for identifying relevant
info. Also updated ApiStatusEnricher to use new named fields for
grpc status.

* Use named fields for identifying the user agent for grpc

* Fix formatting

* Addressed review comments

* Refactored enrichHostHeader method

* Added user agent check in UserAgentSpanEnricher

* Add test to increase test coverage of SpanTypeAttributeEnricher

* Bugfix in JDBC backend resolver to parse URLs without failure. (#37)

Bugfix in JDBC backend resolver to parse URLs without failure.

Ideally, we should parse the URL separately based on the database
driver type so that we can understand more properties of the datasource.
That can come next.

* Update data models to latest (#26)

* chore: update service and view framework dependencies (#27)

* chore: update dependencies to fix snyk issue (#38)

* Span counter metric (#29)

* Add span counter

* Fix snyk error

* Remove total spans metrics

* Recv/Sent prefix doesn't guarantee protocol GRPC (#39)

The assumption has been made that if the eventname has
Recv/Sent as prefix, then the protocol is GRPC which
doesn't hold.
We receive OperationName as Recv./cart/checkout in case
of http request.

* fix: convert list type attributes as well (#40)

* fix: convert list type attributes as well

* More unit tests

* Nothing short of 100% diff coverage

* Add prometheus jmx exporter for exporting kafka-streams metrics (#30)

* add pod annotations to deployment (#30)

* add pod annotations to deployment

* fix snyk failure

* parameterizing stream configs - numStandbyReplicas and replication factor (#31)

* Publish span-normalizer-constants (#31)

We should use span normalizer constants wherever needed.

* Upgrading grpc netty version to fix snyk build failure (#32)

* Upgrading grpc netty version to fix snyk build failure

* Upgrade dependencies for data-model and netty.

* Update guava for protobuf

* Update guava dependencies

* ci: add dockerhub auth (#43)

* ci: add dockerhub auth

* fix snyk issue

* revert snyk changes

* ci: add dockerhub auth (#29)

* ci: add dockerhub auth

* fix snyk issue

* revert snyk changes

* ci: add dockerhub auth (#32)

* ci: add dockerhub auth

* fix snyk issue

* revert snyk changes

* ci: add dockerhub auth (#34)

* Handle relative URLs to extract out path and query string (#33)

* chore: update vulnerable dependencies and gradle (#44)

* feat: add trace reader (#21)

* test: add trace client

* refactor: update trace client based on current projection

* refactor: rename to trace reader

* test: add trace reader tests

* chore: move gradle change to separate pr

* chore: update dep, add publish

* Upgrade data-model to fix snyk-build (#34)

* chore: adds PR template (#30)

* updating version of grpc to address no-class-found issue (#46)

* Add field generator for 'http.host' tag (#39)

* Add rpc.body.decode_raw tag for identifying decoded rpc bodies (#40)

* Removed submodule span-normalizer per #8

* Removed submodule raw-spans-grouper per #8

* Removed submodule hypertrace-trace-enricher per #8

* Removed submodule hypertrace-view-generator per #8

* merged subtree branch here

* consolidates ingester pipeline as macrorepo

* removes redundant publish

* Update .circleci/helm.sh

Co-authored-by: José Carlos Chávez <[email protected]>

* fixed the groups for each project and deps via project deps

* fixing helm.sh not found using absolute path

* tried with just working directory url

* trying with bin/sh and working_directory construct

* trying with bash script

* fixing alpine linux sh issue

* removing un-supported -o option

* fixed the bad substitution issue in helm.sh script

* updates snyk expiry dates

* adds constraints on guava library suggested by snyk

* update workflow to trigger release based on release tag

* fixed circleci config issue

Co-authored-by: Tim Mwangi <[email protected]>
Co-authored-by: Buchi Reddy B <[email protected]>
Co-authored-by: Buchi Reddy Busi Reddy <[email protected]>
Co-authored-by: Tim Mwangi <[email protected]>
Co-authored-by: anujgoyal <[email protected]>
Co-authored-by: anujgoyal1 <[email protected]>
Co-authored-by: surajpuvvada <[email protected]>
Co-authored-by: Ronak <[email protected]>
Co-authored-by: SJ <[email protected]>
Co-authored-by: Anuraag Agrawal <[email protected]>
Co-authored-by: Ravi Singal <[email protected]>
Co-authored-by: rleiwang-traceableai <[email protected]>
Co-authored-by: Avinash <[email protected]>
Co-authored-by: Laxman Ch <[email protected]>
Co-authored-by: Aaron Steinfeld <[email protected]>
Co-authored-by: Aaron Steinfeld <[email protected]>
Co-authored-by: kotharironak <[email protected]>
Co-authored-by: Jayesh Bapu Ahire <[email protected]>
Co-authored-by: Samarth Gupta <[email protected]>
Co-authored-by: ravisingal <[email protected]>
Co-authored-by: SJ <[email protected]>
Co-authored-by: surajpuvvada <[email protected]>
Co-authored-by: Laxman Ch <[email protected]>
Co-authored-by: mohit-a21 <[email protected]>
Co-authored-by: GurtejSohi <[email protected]>
Co-authored-by: José Carlos Chávez <[email protected]>
jcchavezs pushed a commit that referenced this issue Nov 10, 2020
jcchavezs pushed a commit that referenced this issue Nov 10, 2020
codefromthecrypt pushed a commit that referenced this issue Nov 23, 2020
@kotharironak kotharironak added the enhancement New feature or request label Mar 22, 2021