[processor/geoip] Add GeoIP providers configuration and maxmind factory (open-telemetry#33268)

**Description:** Define and parse a configuration for the geo IP
providers section. In addition, the Maxmind GeoIPProvider implementation
is included in the providers factories.

Example configuration:

```yaml
processors:
  geoip:
    providers:
      maxmind:
        database_path: "/tmp"
```

Implementation details:
- Custom Unmarshal implementation for the processor's component. This is
needed to dynamically load each provider's configuration:
https://github.com/open-telemetry/opentelemetry-collector-contrib/pull/33268/files#diff-aed2c6fd774ef54a3039647190c67e28bd0fc67e008fdd5630b6201c550bd00aR46.
The approach is very similar to how the [hostmetrics receiver
loads](https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/receiver/hostmetricsreceiver/config.go#L44)
its scraper configuration.
- A new factory for the providers is included in order to retrieve their
default configuration and get the actual provider:
https://github.com/open-telemetry/opentelemetry-collector-contrib/pull/33268/files#diff-2fbb171efac07bbf07c1bcb67ae981eb481e56491add51b6a137fd2c17eec9dcR27
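
The two points above can be illustrated with a minimal, standard-library-only sketch. It is not the code in this PR: `ProviderConfig`, `ProviderFactory`, `maxmindConfig`, `maxmindFactory`, and `loadProviders` are hypothetical stand-ins, and `encoding/json` is used in place of `confmap` purely so the example runs on its own. The idea it shows is the same one, though: look up a factory by the provider's key, ask it for a default configuration, then decode that provider's section into it.

```go
package main

import (
	"bytes"
	"encoding/json"
	"errors"
	"fmt"
)

// ProviderConfig is a hypothetical stand-in for the provider config interface.
type ProviderConfig interface {
	Validate() error
}

// ProviderFactory is a hypothetical stand-in for the provider factory interface.
type ProviderFactory interface {
	CreateDefaultConfig() ProviderConfig
}

// maxmindConfig is an assumed shape for the maxmind settings; the validation
// message below is illustrative only.
type maxmindConfig struct {
	DatabasePath string `json:"database_path"`
}

func (c *maxmindConfig) Validate() error {
	if c.DatabasePath == "" {
		return errors.New("database_path must not be empty")
	}
	return nil
}

type maxmindFactory struct{}

func (maxmindFactory) CreateDefaultConfig() ProviderConfig { return &maxmindConfig{} }

// providerFactories maps a configuration key to the factory that owns it.
var providerFactories = map[string]ProviderFactory{
	"maxmind": maxmindFactory{},
}

// loadProviders decodes each raw provider section into the default config
// created by the factory registered under the same key.
func loadProviders(raw map[string]map[string]any) (map[string]ProviderConfig, error) {
	out := make(map[string]ProviderConfig, len(raw))
	for key, section := range raw {
		factory, ok := providerFactories[key]
		if !ok {
			return nil, fmt.Errorf("invalid provider key: %s", key)
		}
		cfg := factory.CreateDefaultConfig()

		// JSON round-trip as a stand-in for confmap's Unmarshal.
		encoded, err := json.Marshal(section)
		if err != nil {
			return nil, err
		}
		dec := json.NewDecoder(bytes.NewReader(encoded))
		dec.DisallowUnknownFields() // reject settings the provider does not define
		if err := dec.Decode(cfg); err != nil {
			return nil, fmt.Errorf("error reading settings for provider type %q: %w", key, err)
		}
		out[key] = cfg
	}
	return out, nil
}

func main() {
	providers, err := loadProviders(map[string]map[string]any{
		"maxmind": {"database_path": "/tmp/mygeodb"},
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(providers["maxmind"].(*maxmindConfig).DatabasePath)
}
```

Keeping the key-to-factory lookup in one map means a new provider only has to register a factory; the loading loop itself does not change.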

**Link to tracking Issue:**
open-telemetry#33269

**Testing:** 

- Unit tests for configuration unmarshaling and the factories
- A base mock structure is used through all the test files:
https://github.com/open-telemetry/opentelemetry-collector-contrib/pull/33268/files#diff-28f4a173f1f4b5ccd3cf4c9f7f7b6bf864ef1567a28291322d7e94a9f63243aeR26
- Integration test to verify the behaviour of the processor + maxmind
provider
- The generation of the Maxmind databases has been moved to a [public
internal
package](https://github.com/open-telemetry/opentelemetry-collector-contrib/pull/33268/files#diff-83fc0ce7aa1f0495b4f4e5d5aabc2918162fec31ad323cc417b3f8c8eb5a00bcR14),
since it is now used by both the unit and integration tests. Should we upload the
assembled database files instead?

**Documentation:** TODO

---------

Co-authored-by: Andrzej Stencel <[email protected]>
rogercoll and andrzej-stencel authored Jul 8, 2024
1 parent 85f9421 commit 3e5c046
Showing 19 changed files with 861 additions and 56 deletions.
27 changes: 27 additions & 0 deletions .chloggen/geoipprocessor_provider.yaml
@@ -0,0 +1,27 @@
# Use this changelog template to create an entry for release notes.

# One of 'breaking', 'deprecation', 'new_component', 'enhancement', 'bug_fix'
change_type: 'enhancement'

# The name of the component, or a single word describing the area of concern, (e.g. filelogreceiver)
component: geoipprocessor

# A brief description of the change. Surround your text with quotes ("") if it needs to start with a backtick (`).
note: Add providers configuration and maxmind provider factory

# Mandatory: One or more tracking issues related to the change. You can use the PR number here if no issue exists.
issues: [33269]

# (Optional) One or more lines of additional information to render under the primary note.
# These lines will be padded with 2 spaces and then inserted directly into the document.
# Use pipe (|) for multiline entries.
subtext:

# If your change doesn't affect end users or the exported elements of any package,
# you should instead start your pull request title with [chore] or use the "Skip Changelog" label.
# Optional: The change log or logs in which this entry should be included.
# e.g. '[user]' or '[user, api]'
# Include 'user' if the change is relevant to end users.
# Include 'api' if there is a change to a library API.
# Default: '[user]'
change_logs: []
38 changes: 36 additions & 2 deletions processor/geoipprocessor/README.md
@@ -11,8 +11,42 @@
[development]: https://github.com/open-telemetry/opentelemetry-collector#development
<!-- end autogenerated section -->

**This processor is currently under development and is presently a NOP (No Operation) processor. Further features and functionalities will be added in upcoming versions.**

## Description

The geoIP processor `geoipprocessor` enhances resource attributes by appending information about the geographical location of an IP address. To add geographical information, the IP address must be included in the resource attributes using the [`source.address` semantic conventions key attribute](https://github.com/open-telemetry/semantic-conventions/blob/v1.26.0/docs/general/attributes.md#source).

### Geographical location metadata

The following [resource attributes](./internal/convention/attributes.go) will be added if the corresponding information is found:

```
* geo.city_name
* geo.postal_code
* geo.country_name
* geo.country_iso_code
* geo.continent_name
* geo.continent_code
* geo.region_name
* geo.region_iso_code
* geo.timezone
* geo.location.lat
* geo.location.lon
```

## Configuration

The following settings must be configured:

- `providers`: A map containing geographical location information providers. These providers are used to search for the geographical location attributes associated with an IP. Supported providers:
  - [maxmind](./internal/provider/maxmindprovider/README.md)

## Examples

```yaml
processors:
  # processor name: geoip
  geoip:
    providers:
      maxmind:
        database_path: /tmp/mygeodb
```
78 changes: 77 additions & 1 deletion processor/geoipprocessor/config.go
@@ -3,9 +3,85 @@

package geoipprocessor // import "github.com/open-telemetry/opentelemetry-collector-contrib/processor/geoipprocessor"

import (
"errors"
"fmt"

"go.opentelemetry.io/collector/component"
"go.opentelemetry.io/collector/confmap"

"github.com/open-telemetry/opentelemetry-collector-contrib/processor/geoipprocessor/internal/provider"
)

const (
providersKey = "providers"
)

// Config holds the configuration for the GeoIP processor.
type Config struct {
// Providers specifies the sources to extract geographical information about a given IP.
Providers map[string]provider.Config `mapstructure:"-"`
}

var (
_ component.Config = (*Config)(nil)
_ confmap.Unmarshaler = (*Config)(nil)
)

func (cfg *Config) Validate() error {
if len(cfg.Providers) == 0 {
return errors.New("must specify at least one geo IP data provider when using the geoip processor")
}

// validate each provider's configuration
for providerID, providerConfig := range cfg.Providers {
if err := providerConfig.Validate(); err != nil {
return fmt.Errorf("error validating provider %s: %w", providerID, err)
}
}
return nil
}

// Unmarshal a confmap.Conf into the config struct.
func (cfg *Config) Unmarshal(componentParser *confmap.Conf) error {
if componentParser == nil {
return nil
}

// load the non-dynamic config normally
err := componentParser.Unmarshal(cfg, confmap.WithIgnoreUnused())
if err != nil {
return err
}

// dynamically load the individual providers configs based on the key name
cfg.Providers = map[string]provider.Config{}

// retrieve `providers` configuration section
providersSection, err := componentParser.Sub(providersKey)
if err != nil {
return err
}

// loop through all defined providers and load their configuration
for key := range providersSection.ToStringMap() {
factory, ok := getProviderFactory(key)
if !ok {
return fmt.Errorf("invalid provider key: %s", key)
}

providerCfg := factory.CreateDefaultConfig()
providerSection, err := providersSection.Sub(key)
if err != nil {
return err
}
err = providerSection.Unmarshal(providerCfg)
if err != nil {
return fmt.Errorf("error reading settings for provider type %q: %w", key, err)
}

cfg.Providers[key] = providerCfg
}

return nil
}
103 changes: 96 additions & 7 deletions processor/geoipprocessor/config_test.go
@@ -4,28 +4,45 @@
package geoipprocessor

import (
"errors"
"path/filepath"
"testing"

"github.com/stretchr/testify/assert"
"github.com/stretchr/testify/require"
"go.opentelemetry.io/collector/component"
"go.opentelemetry.io/collector/confmap/confmaptest"
"go.opentelemetry.io/collector/otelcol/otelcoltest"

"github.com/open-telemetry/opentelemetry-collector-contrib/processor/geoipprocessor/internal/metadata"
"github.com/open-telemetry/opentelemetry-collector-contrib/processor/geoipprocessor/internal/provider"
maxmind "github.com/open-telemetry/opentelemetry-collector-contrib/processor/geoipprocessor/internal/provider/maxmindprovider"
)

func TestLoadConfig(t *testing.T) {
t.Parallel()

tests := []struct {
id component.ID
expected component.Config
validateErrorMessage string
unmarshalErrorMessage string
}{
{
id: component.NewID(metadata.Type),
validateErrorMessage: "must specify at least one geo IP data provider when using the geoip processor",
},
{
id: component.NewIDWithName(metadata.Type, "maxmind"),
expected: &Config{
Providers: map[string]provider.Config{
"maxmind": &maxmind.Config{DatabasePath: "/tmp/db"},
},
},
},
{
id: component.NewIDWithName(metadata.Type, "invalid_providers_config"),
unmarshalErrorMessage: "unexpected sub-config value kind for key:providers value:this should be a map kind:string)",
},
}

@@ -39,10 +56,15 @@ func TestLoadConfig(t *testing.T) {

sub, err := cm.Sub(tt.id.String())
require.NoError(t, err)

if tt.unmarshalErrorMessage != "" {
assert.EqualError(t, sub.Unmarshal(cfg), tt.unmarshalErrorMessage)
return
}
require.NoError(t, sub.Unmarshal(cfg))

if tt.validateErrorMessage != "" {
assert.EqualError(t, component.ValidateConfig(cfg), tt.validateErrorMessage)
return
}

@@ -51,3 +73,70 @@ func TestLoadConfig(t *testing.T) {
})
}
}

func TestLoadConfig_InvalidProviderKey(t *testing.T) {
factories, err := otelcoltest.NopFactories()
require.NoError(t, err)

factory := NewFactory()
factories.Processors[metadata.Type] = factory
_, err = otelcoltest.LoadConfigAndValidate(filepath.Join("testdata", "config-invalidProviderKey.yaml"), factories)

require.Contains(t, err.Error(), "error reading configuration for \"geoip\": invalid provider key: invalidProviderKey")
}

func TestLoadConfig_ValidProviderKey(t *testing.T) {
type dbMockConfig struct {
Database string `mapstructure:"database"`
providerConfigMock
}
baseMockFactory.CreateDefaultConfigF = func() provider.Config {
return &dbMockConfig{providerConfigMock: providerConfigMock{func() error { return nil }}}
}
providerFactories["mock"] = &baseMockFactory

factories, err := otelcoltest.NopFactories()
require.NoError(t, err)

factory := NewFactory()
factories.Processors[metadata.Type] = factory
collectorConfig, err := otelcoltest.LoadConfigAndValidate(filepath.Join("testdata", "config-mockProvider.yaml"), factories)

require.NoError(t, err)
actualDbMockConfig := collectorConfig.Processors[component.NewID(metadata.Type)].(*Config).Providers["mock"].(*dbMockConfig)
require.Equal(t, "/tmp/geodata.csv", actualDbMockConfig.Database)

// assert that provider unmarshaling fails once the database field is removed from the configuration struct
baseMockFactory.CreateDefaultConfigF = func() provider.Config {
return &providerConfigMock{func() error { return nil }}
}
providerFactories["mock"] = &baseMockFactory

factories.Processors[metadata.Type] = factory
_, err = otelcoltest.LoadConfigAndValidate(filepath.Join("testdata", "config-mockProvider.yaml"), factories)

require.ErrorContains(t, err, "has invalid keys: database")
}

func TestLoadConfig_ProviderValidateError(t *testing.T) {
baseMockFactory.CreateDefaultConfigF = func() provider.Config {
sampleConfig := struct {
Database string `mapstructure:"database"`
providerConfigMock
}{
"",
providerConfigMock{func() error { return errors.New("error validating mocked config") }},
}
return &sampleConfig
}
providerFactories["mock"] = &baseMockFactory

factories, err := otelcoltest.NopFactories()
require.NoError(t, err)

factory := NewFactory()
factories.Processors[metadata.Type] = factory
_, err = otelcoltest.LoadConfigAndValidate(filepath.Join("testdata", "config-mockProvider.yaml"), factories)

require.Contains(t, err.Error(), "error validating provider mock")
}