Facing a config error saying "Unknown filter plugin 'mecab'" when I try to create a MeCab plugin for Fluentd and execute it. #2610

Closed
yazaki-tatsuya opened this issue Sep 10, 2019 · 3 comments

yazaki-tatsuya commented Sep 10, 2019

1. What I want to do

I want to collect tweets with Fluentd and analyse them with MeCab inside Fluentd. I have done the following steps.

(1) Install rbenv => (2) Install Ruby => (3) Install Fluentd => (4) Install MeCab

2. Problems I am facing & error messages

2019-09-09 16:26:26 +0900 [error]: config error file="./fluent.twitter.mecab.conf" error_class=Fluent::ConfigError error="Unknown filter plugin 'mecab'. Run 'gem search -rd fluent-plugin' to find plugins"

3. Source Code

This is the source code of the config file (fluent.twitter.mecab.conf):

<source>
        type twitter
        consumer_key [MY_KEY]
        consumer_secret [MY_KEY_SECRET]
        access_token [MY_TOKEN]
        access_token_secret [MY_TOKEN_SECRET]
        tag twitter
        timeline sampling
        lang ja
        output_format nest
</source>

<filter twitter>
        @type mecab
        key text
</filter>

<match twitter>
        @type stdout
</match>

<match twitter.mecab>
        type stdout
</match>

This is the source code of the .rb file, which I created in [./fluentd/fluent/plugin/]:

module Fluent
class MeCabFileter < Filter
    Plugin.register_filter('mecab',self)
    config_param : key, : string
    config_param : tag, : string, default: "mecab"

    def initialize
        super
        require 'natto'
    end

    def configure(config)
        super
        @mecab = Natto::MeCab.new
    end

    def start
        super
    end

    def shutdown
        super
    end

    def filter(tag, time, record)
    end

    def filter_stream(tag, es)
        result_es = MultiEventStream.new
        es.each do |time, record|
        begin
            position = 0
            @mecab.parse(pre_process(record[@key])) do |mecab|

                length = mecab.surface.length
                next if length == 0

                new_record = record.clone
                new_record["mecab"] = {    "word" => mecab.surface,
                                        "length" => length,
                                        "pos" => mecab.feature.split(/\,/),
                                        "position" => position}
                result_es.add(time, new_record)

                position += length
            end
            rescue => e
                router.emit_error_event(tag, time, record, e)
            enend
            return result_es
        end
        def pre_process(text)
            # delete URL
            return text.gsub(/https?\:\/\/([\w\-]+\.)+[\w-]+(\/[\w-]+)*\/?/,'').gsub(/RT\s*:\*/,'').gsub(/@[\w]+\s*/,'')
        end
    end
end
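
To make the intent of filter_stream and pre_process easier to follow, here is a minimal sketch of the same per-token expansion in plain Ruby, runnable without Fluentd or MeCab. The Token struct, the expand method, and the sample features are assumptions made for illustration; in the plugin above the nodes come from Natto::MeCab#parse. Note that the second pattern in pre_process, as written, requires a literal "*" after the colon, which may not be what was intended.

```ruby
# Stand-in for one MeCab node (hypothetical; the plugin gets real nodes from natto).
Token = Struct.new(:surface, :feature)

# Same cleaning chain as pre_process in the post: URLs, "RT:*", and @mentions.
def pre_process(text)
  text.gsub(/https?\:\/\/([\w\-]+\.)+[\w-]+(\/[\w-]+)*\/?/, '')
      .gsub(/RT\s*:\*/, '')
      .gsub(/@[\w]+\s*/, '')
end

# Turn one record into one record per non-empty token,
# tracking each token's character offset within the tokenized text.
def expand(record, tokens)
  position = 0
  results = []
  tokens.each do |token|
    length = token.surface.length
    next if length == 0 # e.g. the end-of-sentence node has an empty surface

    new_record = record.clone
    new_record["mecab"] = {
      "word"     => token.surface,
      "length"   => length,
      "pos"      => token.feature.split(","),
      "position" => position
    }
    results << new_record
    position += length
  end
  results
end

# Hypothetical sample data, not real MeCab output.
tokens = [
  Token.new("今日", "名詞,副詞可能"),
  Token.new("", "BOS/EOS"),
  Token.new("は", "助詞,係助詞")
]
expand({ "text" => "今日は" }, tokens).each { |r| p r["mecab"] }
p pre_process("@user hello https://example.com/a-b world")
```

Each input record becomes several output records, one per token, which is why the plugin overrides filter_stream rather than returning a single record from filter.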

4. Things I tried

The error message suggested running "gem search -rd fluent-plugin", so I searched for plugins and installed the following one, but it didn't solve the problem.

gem search -rld fluent-plugin
sudo gem install fluent-plugin-mecab
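
One thing that may be worth checking here is which Ruby the gem was installed for: with rbenv in use, a "sudo gem install" can end up targeting a different Ruby than the one that runs fluentd. Nothing in this thread confirms that this happened, so treat it as an assumption. The sketch below lists the fluent-plugin-* gems visible to whichever Ruby executes it; the method name visible_fluent_plugins is made up for illustration.

```ruby
require "rubygems"

# Names and versions of installed gems whose name starts with "fluent-plugin-",
# as seen by the Ruby interpreter that runs this script.
def visible_fluent_plugins
  Gem::Specification
    .select { |spec| spec.name.start_with?("fluent-plugin-") }
    .map { |spec| "#{spec.name} #{spec.version}" }
    .sort
end

puts "ruby:    #{RbConfig.ruby}" # path of the running interpreter
puts "gem dir: #{Gem.dir}"       # where this interpreter installs gems
puts visible_fluent_plugins
```

Running it with the same ruby that starts fluentd shows whether an installed plugin gem is actually visible to that interpreter.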

5. Environment Information

OS : Linux CentOS 7
rbenv 1.1.2-2-g4e92322
ruby 2.3.1p112 (2016-04-26 revision 54768) [x86_64-linux]
fluentd 1.7.0
MeCab 0.996

ganmacs commented Sep 10, 2019

Could you please post this kind of question to the mailing list next time?
I'm closing this issue, but if you have more questions, you are welcome to ask them on the mailing list.

https://github.com/fluent/fluentd/blob/07d9eecf0ce1413ab7873da84462b6f8fe9fae04/CONTRIBUTING.md#got-a-question-or-problem

  1. Could you paste the full log (including the command you executed)?
  2. Your Ruby script has a syntax error. Please check it yourself.
  3. Could you tell me the name of the file that defines MeCabFileter?
  4. A filter class needs to be defined under Fluent::Plugin, so the following is correct.
...
module Fluent::Plugin
  class MecabFileter < Filter
...
  5. I checked in my local environment and it worked well (the error at the end of the log is not related to this issue; please ignore it).
$ cat plugins/filter_mecab.rb
module Fluent::Plugin
  class MecabFileter < Filter
    Fluent::Plugin.register_filter('mecab', self)

    config_param :key, :string
    config_param :tag, :string, default: "mecab"

    def initialize
      super
      # require 'natto'
    end

    def configure(config)
      super
      # @mecab = Natto::MeCab.new
    end

    def start
      super
    end

    def shutdown
      super
    end

    def filter(tag, time, record)
    end

    def filter_stream(tag, es)
      result_es = MultiEventStream.new
      es.each do |time, record|
        begin
          position = 0
          @mecab.parse(pre_process(record[@key])) do |mecab|

            length = mecab.surface.length
            next if length == 0

            new_record = record.clone
            new_record["mecab"] = {    "word" => mecab.surface,
                                       "length" => length,
                                       "pos" => mecab.feature.split(/\,/),
                                       "position" => position}
            result_es.add(time, new_record)

            position += length
          end
        rescue => e
          router.emit_error_event(tag, time, record, e)
          enend
          return result_es
        end
      end
    end

    def pre_process(text)
      # delete URL
      return text.gsub(/https?\:\/\/([\w\-]+\.)+[\w-]+(\/[\w-]+)*\/?/,'').gsub(/RT\s*:\*/,'').gsub(/@[\w]+\s*/,'')
    end
  end
end
$ fluentd -p plugins -c example/mecab.conf      
2019-09-10 16:56:28 +0900 [info]: parsing config file is succeeded path="example/mecab.conf"
2019-09-10 16:56:28 +0900 [warn]: both of Plugin @id and path for <storage> are not specified. Using on-memory store.
2019-09-10 16:56:28 +0900 [warn]: both of Plugin @id and path for <storage> are not specified. Using on-memory store.
2019-09-10 16:56:28 +0900 [info]: using configuration file: <ROOT>
  <source dummy>
    @type dummy
    tag "twitter"
  </source>
  <filter twitter>
    @type mecab
    key "text"
  </filter>
  <match twitter.mecab>
    @type stdout
  </match>
</ROOT>
2019-09-10 16:56:28 +0900 [info]: starting fluentd-1.7.0 pid=75064 ruby="2.6.3"
2019-09-10 16:56:28 +0900 [info]: spawn command to main:  cmdline=["ruby", "-Eascii-8bit:ascii-8bit", "~/.rbenv/versions/2.6.3/bin/fluentd", "-p", "plugins", "-c", "example/mecab.conf", "--under-supervisor"]
2019-09-10 16:56:29 +0900 [info]: gem 'fluent-plugin-flowcounter-simple' version '0.0.4'
2019-09-10 16:56:29 +0900 [info]: gem 'fluent-plugin-kafka' version '0.11.0'
2019-09-10 16:56:29 +0900 [info]: gem 'fluent-plugin-kafka' version '0.10.0'
2019-09-10 16:56:29 +0900 [info]: gem 'fluent-plugin-kafka' version '0.9.3'
2019-09-10 16:56:29 +0900 [info]: gem 'fluent-plugin-multiprocess' version '0.2.2'
2019-09-10 16:56:29 +0900 [info]: gem 'fluent-plugin-prometheus' version '1.5.0'
2019-09-10 16:56:29 +0900 [info]: gem 'fluent-plugin-prometheus' version '1.4.0'
2019-09-10 16:56:29 +0900 [info]: gem 'fluent-plugin-prometheus' version '0.5.0'
2019-09-10 16:56:29 +0900 [info]: gem 'fluent-plugin-record-modifier' version '2.0.1'
2019-09-10 16:56:29 +0900 [info]: gem 'fluent-plugin-record-modifier' version '1.0.0'
2019-09-10 16:56:29 +0900 [info]: gem 'fluentd' version '1.7.0'
2019-09-10 16:56:29 +0900 [info]: gem 'fluentd' version '1.6.3'
2019-09-10 16:56:29 +0900 [info]: gem 'fluentd' version '1.6.2'
2019-09-10 16:56:29 +0900 [info]: gem 'fluentd' version '1.6.0'
2019-09-10 16:56:29 +0900 [info]: gem 'fluentd' version '1.4.2'
2019-09-10 16:56:29 +0900 [info]: gem 'fluentd' version '0.12.43'
2019-09-10 16:56:29 +0900 [info]: adding filter pattern="twitter" type="mecab"
2019-09-10 16:56:29 +0900 [info]: adding match pattern="twitter.mecab" type="stdout"
2019-09-10 16:56:29 +0900 [info]: adding source type="dummy"
2019-09-10 16:56:29 +0900 [warn]: #0 both of Plugin @id and path for <storage> are not specified. Using on-memory store.
2019-09-10 16:56:29 +0900 [warn]: #0 both of Plugin @id and path for <storage> are not specified. Using on-memory store.
2019-09-10 16:56:29 +0900 [info]: #0 starting fluentd worker pid=75077 ppid=75064 worker=0
2019-09-10 16:56:29 +0900 [info]: #0 fluentd worker is now running worker=0
2019-09-10 16:56:30 +0900 [warn]: #0 emit transaction failed: error_class=NameError error="uninitialized constant Fluent::Plugin::MecabFileter::MultiEventStream\nDid you mean?  Fluent::MultiEventStream" location="/path/to/fluent/fluentd/plugins/filter_mecab.rb:30:in `filter_stream'" tag="twitter"
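
The NameError in the last log line looks like ordinary Ruby constant lookup rather than anything specific to plugin registration: inside a class opened under the compact form "module Fluent::Plugin", only Fluent::Plugin is in lexical scope, so a bare MultiEventStream is not searched for in Fluent. The sketch below reproduces this with stand-in names (Demo, CompactFilter, and NestedFilter are hypothetical and play the roles of Fluent and the filter class). In the plugin itself, the log's own hint of writing Fluent::MultiEventStream would be the corresponding change.

```ruby
# Stand-in namespace; "Demo" plays the role of "Fluent" here.
module Demo
  class MultiEventStream; end
  module Plugin; end
end

# Compact form: only Demo::Plugin is in the lexical scope of the class body.
module Demo::Plugin
  class CompactFilter
    def unqualified
      MultiEventStream.new       # raises NameError: Demo is not searched
    end

    def qualified
      Demo::MultiEventStream.new # fully qualified name always resolves
    end
  end
end

# Nested form: Demo is also in lexical scope, so the short name resolves.
module Demo
  module Plugin
    class NestedFilter
      def unqualified
        MultiEventStream.new
      end
    end
  end
end

begin
  Demo::Plugin::CompactFilter.new.unqualified
rescue NameError => e
  puts "compact form: #{e.class}"
end
puts "qualified:    #{Demo::Plugin::CompactFilter.new.qualified.class}"
puts "nested form:  #{Demo::Plugin::NestedFilter.new.unqualified.class}"
```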

@ganmacs ganmacs closed this as completed Sep 10, 2019
yazaki-tatsuya (Author) commented

Dear ganmacs,
Thanks for your reply. Sorry, I am new to Git and Ruby. Next time I will use the mailing list. As you pointed out in #2, I found the syntax error and fixed it. Regarding #3 and #4, I couldn't understand the point, but if I face any further problems, I will ask on the mailing list.
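
For readers with the same question about #3: as far as I understand, Fluentd locates a local plugin by file name, so "@type mecab" inside a filter section expects a file called filter_mecab.rb in a directory passed with -p, which matches the plugins/filter_mecab.rb used in the reply above. The helpers below are a hypothetical sketch of that naming rule and are not part of Fluentd.

```ruby
# Hypothetical helper (not a Fluentd API): file-name prefix per plugin kind.
PLUGIN_FILE_PREFIX = { input: "in", output: "out", filter: "filter" }.freeze

# File name expected for a plugin of the given kind and @type.
def expected_plugin_file(kind, type)
  "#{PLUGIN_FILE_PREFIX.fetch(kind)}_#{type}.rb"
end

# True when plugin_dir contains the file that a "-p plugin_dir" run would look for.
def plugin_file_present?(plugin_dir, kind, type)
  File.file?(File.join(plugin_dir, expected_plugin_file(kind, type)))
end

puts expected_plugin_file(:filter, "mecab")
puts plugin_file_present?("plugins", :filter, "mecab")
```

If the second line prints false for the directory given to -p, an "Unknown filter plugin" error is the likely outcome, regardless of what the class inside the file is called.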
