Promotes "GCP Billing" dashboard to GA, with some modifications #955

Merged
merged 6 commits into from
Oct 21, 2022

nkinkade
Contributor

@nkinkade nkinkade commented Oct 14, 2022

https://grafana.mlab-sandbox.measurementlab.net/d/a5mC51ZMk/gcp-billing

This PR takes the "GCP Billing" dashboard that @stephen-soltesz created and promotes it, with the following changes:

  • moves it from the "soltesz" folder to the "General" folder
  • cleans it up (to my personal tastes)
    • resizes numerous panels
    • removes legend from some panels
    • changes hover tip to show "All" and sort desc
    • sets data type to "Currency: US Dollars"
    • removes "stacked" graph display from most panels
    • renames panels slightly
    • removes any comments from BigQuery queries
    • reduces default range from 14d to 7d
    • various other tweaks I can't think of now
  • adds a new panel for estimated GCE egress traffic costs per day

@stephen-soltesz
Contributor

Let's talk through this today.

@stephen-soltesz
Contributor

(or when you're back)

This dashboard, as it is, was created by @stephen-soltesz. It was
located in Stephen's personal folder in Grafana in mlab-sandbox. This
commit is using his work as the basis for an official dashboard to
monitor GCP billing, where additional panels and metrics may be added in
the future.
Additionally:

* resizes numerous panels
* removes legend from some panels
* removes "stacked" graph display from most panels
* renames panels slightly
* removes any comments from BigQuery queries
* various other tweaks I can't think of now
Queries the exported GCP billing data in BigQuery for any SKU like
"Network Internet Egress%". This will be grossly higher than the
Internet egress costs for just the virtual GCE platform nodes, since it
includes everything originating from GCE, not just platform nodes. But
it should still provide a good idea of GCE Internet egress costs, to
which virtual platform nodes will surely be a major contributor.
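
A minimal sketch of the kind of query this commit describes, assuming the `mlab-oti.billing.unified` table and Grafana's `$__timeFilter` macro that appear in the query quoted later in this conversation. The actual panel query is not shown here, so the filters and the `gce_egress` alias are illustrative rather than what was merged:

```sql
-- Hypothetical sketch: total GCE Internet egress cost per day.
-- The table name, project filter, and alias are assumptions taken
-- from the reviewer's query in this PR, not from the dashboard itself.
SELECT
  TIMESTAMP_TRUNC(usage_start_time, DAY) AS date,
  SUM(cost) AS gce_egress
FROM
  `mlab-oti.billing.unified`
WHERE
  $__timeFilter(usage_start_time)          -- Grafana dashboard time range
  AND service.description = 'Compute Engine'
  AND sku.description LIKE 'Network Internet Egress%'
  AND project.id = 'mlab-oti'
GROUP BY
  date
ORDER BY
  date
```

Because nothing in the filter distinguishes platform nodes from other GCE resources, the daily sum covers all of them, which is why the figure overstates platform-only egress.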
Contributor Author

@nkinkade nkinkade left a comment

@stephen-soltesz: Per our discussion the other day, I've restored the "stacked" graphs for costs per project and per service in the "All Projects" section. Additionally, I have added a new BigQuery query to the GCE egress panel, and have added a rather lengthy description to the panel. PTAL?

Reviewable status: 0 of 1 approvals obtained (waiting on @stephen-soltesz)

Contributor

@stephen-soltesz stephen-soltesz left a comment

The "cost per service" panel does not use "$" units (you said you wanted to update that).

GCP egress costs are ~2x our estimate from Prometheus metrics?

While there are many " to " variations, there may also be surprises in there... Maybe include this in the panel as a "hidden" query (so it can be enabled manually)?

SELECT
  TIMESTAMP_TRUNC(usage_start_time, DAY) AS date,
  sku.description as metric,
  SUM(cost) as gce_egress
FROM
  `mlab-oti.billing.unified`
WHERE
  $__timeFilter(usage_start_time)
  AND service.description = 'Compute Engine'
  AND sku.description LIKE 'Network Internet Egress%'
  AND cost > 0
  AND project.id = 'mlab-oti'
GROUP BY
  date, sku.description
ORDER BY
  date, gce_egress

With this feedback, :lgtm:

Reviewable status: :shipit: complete! 1 of 1 approvals obtained

It lists only the egress SKUs where the cost per day exceeds $10, which
is an arbitrary value, but weeds out quite a lot of SKUs where the cost
is mere pennies per day, and makes the chart more readable.

Also, sets the cost per service per hour data type to currency->USD.
Contributor Author

@nkinkade nkinkade left a comment

I changed the data type for the "cost per service" panel, thanks for catching that oversight.

gce_egress is much higher than the sum of octets_out, I surmise, because the query sums all GCE Internet egress for the mlab-oti project, which includes GCE VMs and services that aren't part of the k8s platform, whereas octets_out specifically targets egress traffic from only k8s platform VMs. There are fairly detailed descriptions in the panel description field for each plot. Hover over the lowercase "i" in the top left of the panel and see whether the descriptions I added there shed any meaningful light on the discrepancies you describe.

I also renamed gce_egress to gce_egress_sum, and octets_out to octets_out_sum to better characterize what they represent.

I added your query to the panel, but set AND cost > $10 to filter out all the SKUs where the hourly cost is mere pennies and not of much interest to us, I think. And the query is not hidden, but visible by default, since I think it's useful data.
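
For reference, a sketch of how the reviewer's query might look with the threshold applied. This is an assumption based on the description above, not the query as merged; note that SQL takes a bare number (`cost > 10`), not `$10`:

```sql
-- Hypothetical sketch: the reviewer's per-SKU query with the
-- "cost > 0" condition replaced by the $10 threshold described above.
SELECT
  TIMESTAMP_TRUNC(usage_start_time, DAY) AS date,
  sku.description AS metric,
  SUM(cost) AS gce_egress
FROM
  `mlab-oti.billing.unified`
WHERE
  $__timeFilter(usage_start_time)
  AND service.description = 'Compute Engine'
  AND sku.description LIKE 'Network Internet Egress%'
  AND cost > 10                            -- applied to each row before summing
  AND project.id = 'mlab-oti'
GROUP BY
  date, sku.description
ORDER BY
  date, gce_egress
```

Because the condition sits in the `WHERE` clause, it filters individual billing rows before they are summed. If the intent were to keep only SKUs whose daily total exceeds the threshold, one alternative would be to drop that `WHERE` condition and add `HAVING SUM(cost) > 10` after the `GROUP BY`.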

Reviewable status: :shipit: complete! 1 of 1 approvals obtained

@nkinkade nkinkade merged commit 5ef041c into main Oct 21, 2022
@nkinkade nkinkade deleted the sandbox-kinkade branch October 21, 2022 17:00
@stephen-soltesz
Contributor

stephen-soltesz commented Oct 21, 2022

Ah, yes, of course. We have monitoring, we have the k8s management traffic, and that explains why there are "Americas to <everywhere>" entries.
