
Setup Scumblr 1.0

Scott Behrens edited this page Oct 17, 2016 · 1 revision

Setup

This section walks through a basic setup for Scumblr on a base Ubuntu 14.04 system. This guide assumes you have an Ubuntu system set up and ready to go.

Install Prerequisites

From the command line:

sudo apt-get update
sudo apt-get -y install git libxslt-dev libxml2-dev build-essential bison openssl zlib1g libxslt1.1 libssl-dev libxslt1-dev libxml2 libffi-dev libxslt-dev libpq-dev autoconf libc6-dev libreadline6-dev zlib1g-dev libtool libsqlite3-dev libcurl3 libmagickcore-dev ruby-build libmagickwand-dev imagemagick bundler

Install Rbenv/Ruby

From the command line:

cd ~
git clone https://github.com/sstephenson/rbenv.git .rbenv
echo 'export PATH="$HOME/.rbenv/bin:$PATH"' >> ~/.bashrc
echo 'eval "$(rbenv init -)"' >> ~/.bashrc
exec $SHELL

git clone https://github.com/sstephenson/ruby-build.git ~/.rbenv/plugins/ruby-build
echo 'export PATH="$HOME/.rbenv/shims:$HOME/.rbenv/plugins/ruby-build/bin:$PATH"' >> ~/.bashrc
exec $SHELL

rbenv install 2.0.0-p481
rbenv global 2.0.0-p481
ruby -v

Install Ruby on Rails

From the command line:

gem install bundler --no-ri --no-rdoc
rbenv rehash
gem install rails -v 4.0.9 

Install Application Dependencies

sudo apt-get install redis-server
gem install sidekiq
rbenv rehash

Setup Application

From the command line:

git clone https://github.com/Netflix/Scumblr.git
cd Scumblr
bundle install
rake db:create
rake db:schema:load

Create an Admin User

From the command line from the Scumblr root folder:

bundle exec rails c

In the console:

user = User.new
user.email = "<Valid email address>"
user.password = "<Password>"
user.password_confirmation = "<Password>"
user.admin = true
user.save

Run Scumblr

From the command line from the Scumblr root folder:

redis-server &
bundle exec sidekiq -l log/sidekiq.log &
bundle exec rails s &

Now connect to your server on port 3000 (http://localhost:3000 in a browser if running on your local machine).

Additional Configuration

This section will discuss additional items that should be configured before using Scumblr in production.

Scumblr Configuration

Scumblr integrates with other services and APIs in order to find results and generate screenshots. Locations and API keys should be placed in config/initializers/scumblr.rb. A sample of this file is located at config/initializers/scumblr.rb.sample. In this file you can set:

  • The URL where Sketchy can be accessed (if using Sketchy to generate screenshots)
  • Keys, Secrets, and IDs for API authentication/authorization (Google, Apple Store, eBay, Twitter, etc.)

Examples for each configuration option for built-in search providers are located in the config/initializers/scumblr.rb.sample file. Simply rename this file to scumblr.rb and add the appropriate keys/values.

Secrets

First you should generate secrets for the Rails Application and Devise:

rake secret

Run this command twice. Put one secret on line 7 of config/initializers/secret_token.rb. Put the other on line 7 of config/initializers/devise.rb.
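As a rough illustration, the same kind of secret can be produced from plain Ruby. This sketch assumes `rake secret` is equivalent to `SecureRandom.hex(64)`; the helper name `generate_secrets` is hypothetical and not part of Scumblr.

```ruby
require "securerandom"

# Hypothetical helper: returns `count` independent 128-character hex secrets,
# similar to what running `rake secret` `count` times would print.
def generate_secrets(count = 2)
  Array.new(count) { SecureRandom.hex(64) }
end

rails_secret, devise_secret = generate_secrets
puts "secret_token.rb: #{rails_secret}"
puts "devise.rb:       #{devise_secret}"
```

Each secret should be used in only one of the two files, and neither should be committed to a public repository.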

Routing for Email Notifications and Sketchy

If you plan to use email notifications or the Sketchy integration, you should ensure the default URL options are set correctly. This can be done in the config/environments/* files. Placeholders are located at the end of the production.rb and test.rb files. Example:

Rails.application.routes.default_url_options[:host] = "scumblr.com"
Rails.application.routes.default_url_options[:protocol] = "https"

The :host option can also use an IP address and/or include the port if non-standard (e.g. "192.168.10.101:3000"). In addition, you may need to configure your Rails environment to allow the application to send email. See instructions here: http://guides.rubyonrails.org/action_mailer_basics.html#action-mailer-configuration
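To make the effect of these two options concrete, the sketch below composes the base URL that generated links would start with. `base_url` is a hypothetical helper written in plain Ruby for illustration; it is not part of Rails or Scumblr, and the `http` default is an assumption.

```ruby
require "uri"

# Hypothetical helper: builds the base URL implied by a set of URL options.
# :host may be a name or an IP address, and may carry a port ("host:port").
def base_url(options)
  protocol = options.fetch(:protocol, "http") # assumed default
  host = options.fetch(:host)                 # raises KeyError if missing
  URI.parse("#{protocol}://#{host}").to_s
end

puts base_url(host: "scumblr.com", protocol: "https")
puts base_url(host: "192.168.10.101:3000")
```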

Automatic Syncing

In order to allow Scumblr to automatically run searches and send email notifications, you may want to set up cron jobs using the appropriate rake tasks:

rake sync_all

This task will run all the searches and import any new results. It will also generate screenshots of each result, if the integration with Sketchy is configured.

There are also two rake tasks available to run searches and send notifications independently.

rake perform_searches # run all searches
rake send_email_updates # send notifications

Note: In the sync_all task there is a 1 hour delay to ensure all searches have completed before sending out updates. If these tasks are run sequentially, the notifications may miss some new results (until the next run) if the searches haven't completed.
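As a sketch of the cron setup described in this section, the snippet below only formats a crontab entry as a string. The install path `/opt/Scumblr`, the schedule, the log file, and the helper name `cron_line` are all assumptions. The environment that cron needs in order to find rbenv's Ruby is not handled here.

```ruby
# Hypothetical helper: formats one crontab line that runs a rake task
# from the application directory and appends its output to a log file.
def cron_line(schedule, task, app_dir: "/opt/Scumblr", env: "production")
  "#{schedule} cd #{app_dir} && RAILS_ENV=#{env} bundle exec rake #{task} >> log/cron.log 2>&1"
end

# Run the combined task every six hours (schedule chosen for illustration).
puts cron_line("0 */6 * * *", "sync_all")
```

Scheduling the single sync_all task, rather than the two independent tasks back to back, keeps the built-in delay between searching and notifying.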

Other Setup

There are other items that should be considered before deploying Scumblr in a "production" environment. These include:

  • Choosing an appropriate web server to front the Rails application (We use Unicorn/Nginx)
  • Adjusting redis configuration to meet your needs
  • Reviewing and adjusting the Devise configuration if needed
  • Standard hardening for the Ubuntu host
  • Setting up statuses for tracking the processing of Results

You may also want to review the app/models/ability.rb file. This file specifies the authorization roles in place. At Netflix, we use a simple admin/normal user scheme. This may not be appropriate for all use cases.

Using Scumblr

This section discusses basic use of Scumblr. We will assume that you've gotten Scumblr up and running, including Redis and Sidekiq, and have left the default search providers in place.

Searches

Searches are tasks that run in order to look for results to import into Scumblr. Searches rely on a Search Provider--a plugin module that knows how to take a set of options and find and return results. Scumblr includes a number of search providers by default. These include:

  • Google
  • YouTube
  • Facebook
  • Apple AppStore
  • Google Play Store
  • eBay
  • Twitter

For now we'll stick with using the built-in search providers, but adding a new search provider is relatively straightforward and will be discussed later. We will assume you have generated the appropriate API keys and added them to the configuration file as discussed in "Scumblr Configuration" above.

Creating a New Search

In order to create a new search, you will need to be logged in with an account with admin rights. If you'd like normal users to have the ability to create a new search, you can make the appropriate modifications to the /app/models/ability.rb file.

From any page in Scumblr:

  1. On the top menu click "Searches"
  2. Click "New Search"
  3. Give your search an identifiable name
  4. Add a query string. We're going to use "netflix scumblr"
  5. Select a Search Provider. We're going to use Google
  6. If additional options are available, they will appear inline. We'll leave the additional options blank for now.
  7. Add any tags you would like to be automatically applied to these results. We'll add one tag, "Scumblr"
  8. If you'd like, add a verbose description to the search
  9. When done click "Create Search"

Running a Search

In order to get results, the search needs to be run. This can be done in a number of ways:

  • An individual search can be run through the web interface
  • All searches can be run at the same time through the web interface
  • All searches can be run at the same time using a rake task

We'll discuss each of these methods in this section.

Running an Individual Search

  1. From anywhere in Scumblr, click "Searches" on the top menu
  2. Click on the Name of the Search you'd like to run
  3. Click "Run Now"

Your search should run and, once complete, you should see results on the results page (if any were found!).

Running All Searches

  1. From anywhere in Scumblr, click "Searches" on the top menu
  2. Click "Run All Searches"

All the searches you have configured should run. Once complete you should see identified results on the results page.

Running All Searches From a Rake Task

  1. From the command line at the Scumblr root path, run:

     rake sync_all
    

This will run all the searches you have configured. Once complete you should see results on the results page.

Results

Results are the core model in Scumblr. A result represents a URL that has been entered manually or imported using a Search Provider. This section will discuss how to view, inspect, and act on results.

Result List

The result list is the main page for the application. This page shows a summary of all the results that have been identified and also allows searching/filtering, sorting, viewing basic details, and taking basic actions on the results.

The Result List page is the first page you'll arrive at after logging in. It can also be reached by clicking "Results" on the top menu.

Navigating the Result List

There are two main sections on the Result List page: the results list, and the filter/action panel.

Result List

The results list is the main part of this screen and consists of the table on the left side of the page. In this table, each row represents a result. From here you can view the title of the result, the status (if one is given/available), the domain, when the result was first identified, and when it was last seen in a search.

There is also a link that will take you to the URL represented by the result. Note: This opens a new tab and takes you outside of Scumblr. Always be careful when visiting random sites on the Internet.

On the right side of the result list is a "Show" button. Clicking this button will take you to the detail page for this result. This page is discussed in the "Result Details" section below.

If a result has screenshots attached (either manually added or synced with Sketchy), a small monitor icon will appear to the left of the result's title. Hovering over this icon will show a thumbnail image of the first screenshot. Clicking the icon will allow seeing a larger gallery view of all the screenshots attached to the result.

Filter Panel

On the right side of the results list page is the filter/action panel. This section allows performing granular filtering for specific results.

From here it is possible to search based on:

  • URL
  • Title
  • Tags
  • Assignee
  • Status
  • Search
  • Workflow Flag
  • Workflow Stage

It is also possible to indicate whether "Closed" results should be included in the list. A closed result is a result whose status has been marked as a closed status. (More about Statuses in the relevant section below.) In order to perform a search, fill out the fields you're interested in and click "Search". Multiple filter attributes can be used and will be treated as "and" conditions. For example, if you search for "facebook" in the URL field and "Investigating" in the Status field, you will get all the results with facebook in the URL that are also currently in the "Investigating" status.

If multiple entries are searched in the multi-search boxes (Tags, Assignee, Status, Search, Workflow Flag, and Workflow Stage), these will be treated as an "or" condition. For example if you search for "John" and "Cindy" in the Assignee field (assuming these were users of the system), you will get a list of all results assigned to either John OR Cindy.
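The "and" across fields and "or" within a multi-search field can be sketched in plain Ruby as follows. This is an illustration of the semantics only, not Scumblr's implementation; the `Result` struct, its field names, and `filter_results` are assumptions.

```ruby
# Minimal stand-in for a result row; the field names are assumptions.
Result = Struct.new(:url, :status, :assignee, keyword_init: true)

# Returns the results matching every filter field (AND across fields).
# A field given as an Array matches if any entry matches (OR within a field).
# The :url filter is a case-insensitive substring match; others are exact.
def filter_results(results, filters)
  results.select do |result|
    filters.all? do |field, wanted|
      value = result[field].to_s
      if field == :url
        value.downcase.include?(wanted.to_s.downcase)
      else
        Array(wanted).include?(value)
      end
    end
  end
end

results = [
  Result.new(url: "https://facebook.com/a", status: "Investigating", assignee: "John"),
  Result.new(url: "https://facebook.com/b", status: "New", assignee: "Cindy"),
  Result.new(url: "https://example.com/c", status: "Investigating", assignee: "Ann"),
]

p filter_results(results, url: "facebook", status: "Investigating").map(&:url)
p filter_results(results, assignee: ["John", "Cindy"]).map(&:url)
```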

Important: Filters will persist between requests. In other words if you navigate away from the results list (into a result's detail page for example), when you return to the results list your filter will remain intact. If you want to remove your filter and see all results click "Clear Search". When results are filtered this will be indicated in the result count displayed at the top of the result list table. For example, the result count may indicate "Displaying 1 result (1000 results filtered)". This would mean that 1 result meets your search criteria while another 1000 have been filtered from view.

Action Panel

The action panel allows performing certain actions on one or more results. The action panel appears at the right side of the screen, but is not visible until one or more results are selected with the checkboxes on the left side of the result. Actions that can be taken with the action panel include:

  • Changing the Status
  • Adding Tags
  • Setting the Assignee
  • Generating Screenshot (if Sketchy is enabled)

To use the action panel, first select one or more results. You can also use the checkbox in the header of the results list. This will select all results on the current page of results. If you'd like you can select all results that meet the current filter (on all pages). This is done by selecting the checkbox in the header of the results list and then clicking "Select all n results that match this filter."

Once the appropriate results are selected, simply select the options on the action panel that you'd like to change. You can perform multiple changes at one time (changing status, adding tags, setting assignee, generating screenshots). You can also add multiple tags at once by adding multiple tags to the tag field.

To generate screenshots, use the right side of the update button (area with the arrow) and select either "Update and Generate Screenshot" or "Update and Force Generate Screenshot". "Force Generate" will add a new screenshot for all selected results, even if a screenshot already exists. "Generate" will only add a new screenshot to selected results without an existing attachment.
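The difference between the two options can be sketched as a selection rule. The struct, its `screenshots` field, and `screenshot_targets` are hypothetical names used only to illustrate the behavior described above.

```ruby
# Minimal stand-in for a result; `screenshots` is an assumed field name.
ShotResult = Struct.new(:title, :screenshots)

# Returns the selected results that should receive a new screenshot.
# force: false -> only results with no existing attachment ("Generate")
# force: true  -> every selected result ("Force Generate")
def screenshot_targets(selected, force: false)
  force ? selected : selected.select { |r| r.screenshots.empty? }
end

selected = [
  ShotResult.new("already captured", ["shot-1.png"]),
  ShotResult.new("never captured", []),
]

p screenshot_targets(selected).map(&:title)
p screenshot_targets(selected, force: true).map(&:title)
```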

Creating Results

Results can be manually created by clicking the "New Result" button on the bottom of the Result List page. To create a result you'll need to provide a name/title for the result, as well as a URL.

Result Details

The result details page contains additional information and actions that can be taken on individual results. From here one can:

  • View the details of the results
  • View/manually add screenshots
  • Change the status
  • Add comments
  • Change the assignee
  • Subscribe to updates
  • Add/Remove tags
  • Add/Update Workflows

This section walks through using the result details page.

Status

At the top of the result view is the status bar. This indicates the current status of the result. A new status can be selected by clicking on the desired status. This is meant for a simple, high-level status that can be used to filter/search for results and indicate where in the process the result currently resides.

Details

Below the status bar are additional details about the results. These include the title, when the result was first identified, and a link to the original result page.

Searches

The searches section lists all the searches which have found the result page. If more than one search identifies the same URL, they will all be listed here. The information listed here includes the name of the search, the providers, the query used, and when the search first identified the result.
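One way to picture this is a grouping step: hits from different searches that share a URL collapse into one result, which keeps a list of the searches that found it. The sketch below assumes a simple `Hit` record and a `searches_by_url` helper, both hypothetical; Scumblr's actual data model may differ.

```ruby
# Hypothetical record of one search hit; field names are assumptions.
Hit = Struct.new(:url, :search_name, :found_at)

# Groups hits by URL so each URL becomes one result that lists every
# search that identified it, along with when that search first saw it.
def searches_by_url(hits)
  hits.group_by(&:url).transform_values do |url_hits|
    url_hits.group_by(&:search_name).map do |name, same_search|
      { search: name, first_found: same_search.map(&:found_at).min }
    end
  end
end

hits = [
  Hit.new("http://example.com/page", "brand search", Time.utc(2016, 10, 1)),
  Hit.new("http://example.com/page", "leak search", Time.utc(2016, 10, 3)),
  Hit.new("http://example.com/page", "brand search", Time.utc(2016, 10, 5)),
]

searches_by_url(hits).each do |url, searches|
  puts url
  searches.each { |s| puts "  #{s[:search]} (first found #{s[:first_found].strftime('%Y-%m-%d')})" }
end
```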

Attachments

The attachments section shows any files attached to the result and allows triggering screenshot generation (through Sketchy if available) or manually uploading a file. New screenshots can be uploaded by rolling over the large "+" placeholder and clicking "Upload". A screenshot can be requested from Sketchy by rolling over the large "+" placeholder and clicking "Generate".

Rolling over an existing screenshot will provide options to "View" the result full size or "Delete" the screenshot.

Comments

At the bottom of the result page is the comments section. Comments can be added using the comment form at the top, or existing comments can be replied to by hitting the "Reply" button under the comment. The small -/+ to the left of the commenter's name allows collapsing/expanding the comment thread.

Assignee

On the right side of the page, the result's assignee can be viewed/set. To change the assignee click the pencil icon and select a new user.

Note: a user must exist on the system to be used as an assignee.

Subscriptions

If Scumblr is configured to send email messages, it is possible to subscribe to a result to receive an email when updates occur. To do this, click the "Subscribe" link. Once subscribed you can unsubscribe by clicking "Unsubscribe".

The number to the right of the "Subscribe/Unsubscribe" link indicates how many users are currently subscribed. Clicking this number will show a list of subscribed users and allow adding a new subscriber (including someone besides yourself) or removing existing subscribers.

Tags

If the result has had tags applied, they will appear in the tags section. Tags can be removed by clicking the "X" inside the tag. New tags can be added by clicking the "+" and typing/selecting the tag you'd like to apply.

Workflow Flags

The workflow flags section of the page shows any workflows that have been added to the result. Results can be flagged for multiple workflows ("Investigate" and "Takedown" for example), however they can only utilize a single copy of an individual workflow (You cannot apply two "Investigate" workflows, for example). All workflows that have been applied to the current result will be listed and the current stage will be shown in the drop down.
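The "one copy per workflow" rule behaves like a map keyed by workflow name. The class below is a hypothetical sketch of that rule in plain Ruby; the class name, method names, and stage names are assumptions for illustration.

```ruby
# Hypothetical holder for the workflows applied to one result.
class ResultWorkflows
  def initialize
    @stages = {} # workflow name => current stage
  end

  # Adds a workflow; returns false if that workflow is already applied.
  def add(name, initial_stage)
    return false if @stages.key?(name)
    @stages[name] = initial_stage
    true
  end

  # Moves an applied workflow to a new stage.
  # Raises KeyError if the workflow was never added.
  def change_stage(name, stage)
    @stages.fetch(name)
    @stages[name] = stage
  end

  def to_h
    @stages.dup
  end
end

workflows = ResultWorkflows.new
workflows.add("Investigate", "Open")
workflows.add("Takedown", "Open")
puts workflows.add("Investigate", "Open") # a second copy is rejected
workflows.change_stage("Investigate", "Complete")
p workflows.to_h
```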

To add a new workflow flag to the result, click the "+", choose a workflow flag, and click submit. If any options are required to add the workflow, a form will appear for you to fill out. Click "Add workflow" when complete.

To change the stage of an existing workflow, click the associated dropdown and select the new stage. If any options are required to add the workflow, a form will appear for you to fill out. Click "Add workflow" when complete.

Saved Filters

Saved filters allow saving a set of criteria used to filter results so it can easily be accessed and shared. Additionally, saved filters can have a list of subscribers, who can receive email updates when new matching results are identified.

Saved Filters are created from the results list page. To create a saved filter, perform a search as normal. You can click "Search" to preview the results if you'd like, but this is not necessary. When you're ready to save your filter, click the save button. This will allow you to fill in additional options for the saved filter, including a name which will be used to refer to it, a list of subscribers to notify when new matching results are identified, and whether you want to share the filter with other users of the application. When you're happy with your saved filter, click "Create Saved Filter".

Once a saved filter has been saved, it can be accessed from the Saved Filters menu at the top of the page. Clicking on the filter name will take you to the current results for that saved filter.

To modify an existing filter, click "Manage" in the Saved Filters menu. Saved Filters can be modified by clicking the "Edit" button or deleted by clicking "Delete".

Public filters created by other users can be added to your Saved Filters menu. To do this, first click "Manage" under the "Saved Filters" menu at the top of the page. At the bottom of that page is a list of public filters, if any exist. Clicking "Add" next to any of these filters will add it to your list (under Saved Filters, Public Filters).

Email Notification

Notifications of new results can be sent out using Saved Filters. To set up email notifications, create a saved filter with the types of results you want to be notified of (for example, results with the status "New" or results tagged with the "Important" tag). In the Saved Filter setup page, add the email addresses to be notified in the subscriber field.

These email addresses should now receive an email when new results match the saved filter.
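Conceptually, each notification run pairs a saved filter's subscribers with the new results that match it. The sketch below illustrates that pairing in plain Ruby; `SavedFilter`, its `matcher` lambda, and `pending_notifications` are hypothetical and do not reflect how Scumblr stores filters or sends mail.

```ruby
# Hypothetical saved filter: a name, subscriber emails, and a matching rule.
SavedFilter = Struct.new(:name, :subscribers, :matcher)

# For each saved filter, pairs every subscriber with the new results that
# match it. Filters with no matching new results produce no notification.
def pending_notifications(saved_filters, new_results)
  saved_filters.flat_map do |filter|
    matches = new_results.select { |r| filter.matcher.call(r) }
    next [] if matches.empty?
    filter.subscribers.map { |email| { to: email, filter: filter.name, results: matches } }
  end
end

important = SavedFilter.new(
  "Important results",
  ["analyst@example.com"],
  ->(result) { result[:tags].include?("Important") }
)
new_results = [
  { title: "Leaked credentials", tags: ["Important"] },
  { title: "Unrelated page", tags: [] },
]

pending_notifications([important], new_results).each do |note|
  puts "#{note[:to]}: #{note[:results].map { |r| r[:title] }.join(', ')}"
end
```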

Note: In order for Scumblr to be able to send notifications, you may need to perform some additional configuration for the application. See the "Routing for Email Notifications and Sketchy" section for more information.

Statuses

Statuses allow a flexible way of tracking the high-level state of a given result. Statuses can be created/edited from the "Admin>Statuses" menu at the top of the page. Creating/editing statuses requires admin privileges on the system.

Statuses have three fields:

  • Name
  • Closed
  • Invalid

The Closed flag indicates that results in this status should be considered closed. Results that have been moved into a closed status will be excluded from the results list by default.

The Invalid flag indicates the result was invalid--meaning it's not something you were looking for from the given search. For example, you may have a "False Positive" status. The invalid flag allows tracking which searches are producing high numbers of results that you're not actually interested in.
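The effect of the two flags can be sketched as follows. The structs and helper names are stand-ins chosen for illustration (only the Name, Closed, and Invalid fields come from the text above), so treat this as a model of the behavior rather than Scumblr's code.

```ruby
# Stand-ins for Scumblr's models; only the fields named in the text are modeled.
Status = Struct.new(:name, :closed, :invalid)
TrackedResult = Struct.new(:title, :search, :status)

# Default list view: hide results whose status is flagged closed.
def visible_results(results, include_closed: false)
  include_closed ? results : results.reject { |r| r.status.closed }
end

# Fraction of each search's results that ended up in an invalid status.
def invalid_rate_by_search(results)
  results.group_by(&:search).transform_values do |rs|
    rs.count { |r| r.status.invalid }.fdiv(rs.size)
  end
end

new_status = Status.new("New", false, false)
false_positive = Status.new("False Positive", true, true)

results = [
  TrackedResult.new("real hit", "brand search", new_status),
  TrackedResult.new("noise", "brand search", false_positive),
]

p visible_results(results).map(&:title)
p invalid_rate_by_search(results)
```

A search with a consistently high invalid rate is a candidate for a narrower query.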

Dashboard

Scumblr ships with a simple dashboard which is available from the top menu. This page shows the following information:

  • Time-series chart showing number of results identified per page
  • A breakdown of results by status
  • A breakdown of the number of results assigned to a workflow
  • For each search
    • The total number of results found
    • The number of results found in the last 24 hours
    • The number of results found in the last 7 days
    • A 30 day trend line
    • The number of results assigned to each workflow and/or to no workflow
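
The per-search counts listed above amount to bucketing results by when they were found. A minimal sketch, assuming each result is reduced to a hypothetical `[search_name, found_at]` pair and using a helper name (`search_counts`) that is not part of Scumblr:

```ruby
# Counts, per search, the results found in total, in the last 24 hours,
# and in the last 7 days. `found` is a list of [search_name, found_at] pairs.
def search_counts(found, now: Time.now)
  day = 24 * 60 * 60 # seconds in a day
  found.group_by(&:first).transform_values do |pairs|
    times = pairs.map(&:last)
    {
      total: times.size,
      last_24h: times.count { |t| now - t <= day },
      last_7d: times.count { |t| now - t <= 7 * day },
    }
  end
end

now = Time.utc(2016, 10, 17, 12, 0, 0)
found = [
  ["brand search", now - 3600],
  ["brand search", now - 3 * 86_400],
  ["brand search", now - 10 * 86_400],
]
p search_counts(found, now: now)
```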