Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[proposal] Series struct like Ecto.Schema #30

Open
gullitmiranda opened this issue Dec 9, 2016 · 5 comments
Open

[proposal] Series struct like Ecto.Schema #30

gullitmiranda opened this issue Dec 9, 2016 · 5 comments

Comments

@gullitmiranda
Copy link

gullitmiranda commented Dec 9, 2016

why not use a struct like Ecto.Schema for the Series?

ex:

defmodule MySeries
  use Instream.Schema

  series do
    database    "my_database_optional"
    measurement "cpu_load"

    tag :host, default: "www"
    tag :core

    field :value, default: 100
    field :value_desc
  end
end

# or measurement name with argument of `series/2`
defmodule MySeries
  use Instream.Schema

  series "cpu_load" do
  end
end

serie = %MySeries{
  value: 100,
  value_desc: nil,
  host: "www",
  core: nil,
  timestamp: nil
}

serie.__meta__(:fields)
%MySeries.Fields{ value: 100, value_desc: nil }

serie.__meta__(:tags)
%MySeries.Tags{ host: "www", core: nil }
@mneudert
Copy link
Owner

I don't know if I understand you correctly, so I might tell you things that might not answer your question at all. Please tell me if that happens.

The definition is called series because it hints at the client's intention to operate on a series. That having the fixed set of measurement, tags and fields lead to that structure. The measurement is defined using a separate macro to not have the naming mixed up between those two. After all a "series cpu_load" is only a series if you know the tags associated (even if none), otherwise it would be called "measurement cpu_load". (Hope I don't mess up the official definitions here...)

When writing points it is necessary to know whether a value is a tag or a field. Otherwise the measurement and series name would be the same and no field would be used internally for indexing and structuring the data. Might be no fun when querying.

That is also the way the data is set up when writing:

fields = %MySeries.Fields{ value: 100, value_desc: nil }
tags   = %MySeries.Tags{ host: "www", core: nil }
point  = %MySeries{ fields: fields, tags: tags }

A thing however might be an automatic separation of values into their respective fields an tags:

# somewhat abbreviated
MySeries.from_map(%{ host: "www", value: 100 })

The result of that would be identical to the above one, just with said automatic detection what goes where. And that looks like some really usable feature while still maintaining the current structure and internal workings.

If you meant something different please say so. After all even a rename of Instream.Series to Instream.Schema should be possible while maintaining backwards compatibility to provide time to adjust. As long as there is a valid reasoning there should be one or the other way to follow it.

@gullitmiranda
Copy link
Author

as per your last example, I believe that you have understood very well my suggestion. After all, the idea is that the internal operation is basically the same (separation through tags and fields). What changes would the final structure, that is, the __struct__.

internally I've already done this implementation in a project, if you want I can make a PR with it and already take this opportunity to add tests. Even I like it to be made into a new module (Instream.Schema).

@mneudert
Copy link
Owner

I whipped up a somewhat prototypical implementation for this hydration operations in a separate branch (full diff). Having the ability to construct any struct from a plain map also made it easy to be able to turn individual query results back into them. Something I had on my to do list for quite some time...

Would definitely like to see your approach before going any further with this 👍

@gullitmiranda
Copy link
Author

gullitmiranda commented Dec 20, 2016

I have created the PR without tests for now, so you can take a look.
The module is very similar to that Series, I just think more simple to handle.

I believe that if we add the Schema, the methods below and a few more adjustments for the feature will be good.

my connection with override and helpers:

defmodule App.InfluxDB do
  use Instream.Connection, otp_app: :status_api

  # Query

  def all(queryable) do
    queryable
    |> query()
    |> from_result
  end

  # Helpers

  defp from_result(%{ results: results }) do
    results |> Enum.map(&(from_result(&1)))
  end

  defp from_result(%{series: series} = response) do
    series = series |> Enum.map(&(from_result(&1)))
    %{ response | series: series}
  end

  defp from_result(%{ columns: columns, values: values }) do
    keys = columns |> Enum.map(&(String.to_atom(&1)))

    values
    |> Enum.map(fn(data) ->
      Enum.zip(keys, data)
      |> Enum.into(%{})
    end)
  end

  defp from_result(response), do: response


  # Queries override

  def insert(payload, opts \\ [])

  def insert(%{ __struct__: module } = payload, opts) do
    fields = module.__meta__(:fields)
    tags   = module.__meta__(:tags)

    %{
      points: [
        %{
          measurement: module.__meta__(:measurement),
          fields:      Map.take(payload, fields),
          tags:        Map.take(payload, tags),
          timestamp:   Map.get(payload, :timestamp)
        }
      ]
    }
    |> write(opts)
  end

  def insert(payload, opts) do
    payload
    |> write(opts)
  end
end

@mneudert
Copy link
Owner

From what I see the basic intent is to flatten the current series struct (remove the nesting for fields/tags). Your helper methods are mostly looking like a rewording to other things (write becomes insert, query() |> from_result() becomes all()).

Please correct me if I am misunderstanding something.

One thing to take into account with a flattened map is the :timestamp key. InfluxDB itself uses :time for both timestamps and RFC formatted dates. To not conflict with a potentially named :timestamp field or tag the current key would also need to be renamed to simply :time. Not that much of a problem, but something to take care of. After all the type of time used can be determined by the size of the integer or matching a binary.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants