Page Crawler

A simple project that reads web pages and extracts their content.

Dependencies

Ruby 2.3.1
PostgreSQL and Nokogiri dependencies
Redis

For Development Mode

In a terminal:

# Copy the .env.example file to .env
cp .env.example .env

# Open the .env file and adjust it, if necessary

# Create and migrate the database, then start Sidekiq
bundle exec rake db:create
bundle exec rake db:migrate
bundle exec sidekiq

Open another terminal and run rails server.

For Production Mode

Create a new database. In each terminal you use, export all of the following environment variables:

export PAGE_CRAWLER_DB_PORT=<postgres port>
export PAGE_CRAWLER_DB_USER=<postgres user>
export PAGE_CRAWLER_DB_PASSWORD=<postgres user password>
export PAGE_CRAWLER_DB_NAME=<postgres database>
export SECRET_KEY_BASE=<secret key base>
export REDIS_URL=<redis url like: redis://localhost:6379/0>
export RAILS_ENV=production

In one terminal, run bundle exec rake db:migrate and then rails server. In a second terminal, export the same variables and run bundle exec sidekiq.
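Because the variables have to be exported in every terminal, it is easy to miss one. The sketch below is not part of the project; it is a small, hypothetical helper (the names REQUIRED_VARS and missing_vars are made up for illustration) that reports which of the variables listed above are unset or blank.

```ruby
# Hypothetical helper, not shipped with the project.
# The variable names are taken from the export list above.
REQUIRED_VARS = %w[
  PAGE_CRAWLER_DB_PORT
  PAGE_CRAWLER_DB_USER
  PAGE_CRAWLER_DB_PASSWORD
  PAGE_CRAWLER_DB_NAME
  SECRET_KEY_BASE
  REDIS_URL
  RAILS_ENV
].freeze

# Returns the names of required variables that are unset or blank in env.
# env defaults to the process environment; a plain Hash also works.
def missing_vars(env = ENV)
  REQUIRED_VARS.select { |name| env[name].to_s.strip.empty? }
end
```

For example, calling missing_vars in a terminal where only RAILS_ENV was forgotten would return a one-element array containing that name, which can be printed before starting the server or Sidekiq.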

Production in Docker

You need Docker and docker-compose installed. Then run in a terminal:

docker-compose up -d

Once the containers are up, the application is available at http://localhost:3000/v1/pages

API Endpoint

The API exposes the following endpoints:

Verb	Endpoint	Description
GET	/v1/pages	List all previously stored URLs and their content
POST	/v1/pages/enqueue	Enqueue a new URL to be processed and have its page content extracted
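As a rough illustration of how the two endpoints might be called from Ruby, here is a minimal sketch using only the standard library. The base URL comes from the Docker section. The request body for the enqueue endpoint is not documented here, so the JSON shape with a single url key is an assumption, and the helper names (list_pages_request, enqueue_request, send_request) are hypothetical.

```ruby
require 'json'
require 'net/http'
require 'uri'

# Base URL from the Docker section; adjust it for other setups.
BASE_URL = 'http://localhost:3000'.freeze

# Builds the GET request that lists the stored pages.
def list_pages_request(base_url = BASE_URL)
  Net::HTTP::Get.new(URI.join(base_url, '/v1/pages'))
end

# Builds the POST request that enqueues a URL for processing.
# ASSUMPTION: the JSON body {"url": "..."} is a guess; check the
# controller in app/ for the parameter name the API really expects.
def enqueue_request(page_url, base_url = BASE_URL)
  request = Net::HTTP::Post.new(URI.join(base_url, '/v1/pages/enqueue'))
  request['Content-Type'] = 'application/json'
  request.body = JSON.generate(url: page_url)
  request
end

# Sends a request built above and returns the Net::HTTPResponse.
def send_request(request)
  uri = request.uri
  Net::HTTP.start(uri.host, uri.port) { |http| http.request(request) }
end
```

With the server running, puts send_request(list_pages_request).body would print the raw response of the list endpoint. Building the request separately from sending it keeps the sketch easy to inspect without a running server.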
