Use the following steps if you want to host the site locally instead of using the live version.
- In your terminal: clone the repo, cd into the root directory, and run bundle install to install the dependent libraries.
- Run rake db:migrate to set up the database.
- Start the server with the following command: rails s
- The site should now be served at http://localhost:3000
- Sign up/Sign in
- Enter a ticker symbol into the search bar.
- You will be redirected to that ETF's show page if the ticker symbol can be found on the site. If it cannot be found, the search form will re-render with a flash notice.
- Optionally download the data in CSV format.
- The search page is unique to each user. It displays their 10 most recent searches by ticker symbol and date, each hyperlinked to that ETF's show page. (A sketch of the search flow appears after this list.)
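For illustration, the search flow described above might look something like the sketch below. The controller, model, and association names (SearchesController, current_user.searches, and so on) are assumptions for the sake of the example, not the app's actual code:

```ruby
# app/controllers/searches_controller.rb -- hypothetical names throughout
class SearchesController < ApplicationController
  def create
    ticker = params[:ticker].to_s.strip.upcase
    etf = Etf.find_by(ticker: ticker)
    if etf
      # Record the search so the user's page can list their 10 most recent
      current_user.searches.create(etf: etf)
      redirect_to etf_path(etf)
    else
      # Ticker not found: re-render the form with a flash notice
      flash.now[:notice] = "Ticker symbol could not be found."
      render :new
    end
  end
end
```

Under the same assumptions, the recent-searches list would come from something like current_user.searches.order(created_at: :desc).limit(10).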
Here is a shortened example CSV file output:
name,amount
Apple Inc.,60640164
Microsoft Corporation,89330570
sector,weight(%)
Information Technology,22.51
Financials,14.09
country,weight(%)
United Kingdom,45.6%
China,25.45%
I designed the output this way to make it easier for a user to parse it into a workbook or with a scripting language. The logic lives in the to_csv instance method of the Etf class. The ETF show action also had to be programmed to allow the .csv file format. Essentially, the download happens by clicking a button that redirects you to /etfs/:id.csv, where the file automatically downloads.
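As a sketch of how these two pieces might fit together (the association names holdings, sectors, and countries, and the exact controller code, are assumptions rather than the real implementation):

```ruby
require 'csv'

class Etf < ApplicationRecord
  # Emits the stacked sections shown in the sample above:
  # holdings, then sector weights, then country weights.
  # Association names are assumptions for illustration.
  def to_csv
    CSV.generate do |csv|
      csv << %w[name amount]
      holdings.each { |h| csv << [h.name, h.amount] }
      csv << ['sector', 'weight(%)']
      sectors.each { |s| csv << [s.name, s.weight] }
      csv << ['country', 'weight(%)']
      countries.each { |c| csv << [c.name, c.weight] }
    end
  end
end

# app/controllers/etfs_controller.rb -- a sketch of the show action
def show
  @etf = Etf.find(params[:id])
  respond_to do |format|
    format.html
    format.csv { send_data @etf.to_csv, filename: "#{@etf.ticker}.csv" }
  end
end
```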
The bulk of the Etf creation happens in class methods of the Scraper class. This is my first time doing web scraping in Ruby, so I would appreciate feedback on better ways to organize the scraping-related code. It is completely dependent on https://us.spdrs.com/en maintaining the same site organization, layout, and CSS selectors; the ETFs will most likely be created with erroneous information if that site changes. The scraper relies heavily on the Ruby Nokogiri library to create an object representing the page, then searches for the specific CSS selectors where the data resides. This generally returns an array of Nokogiri objects representing the relevant fragments of the DOM for Top 10 Holdings, Sector Allocation, and Country Weight. The Scraper class then iterates over the array of fragments to pull the relevant information and create a corresponding database record. This all lives in the scraper.rb file.
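For context, the page object passed into the scraper methods is built with Nokogiri roughly like this (the URL pattern below is a guess, not the site's actual routing):

```ruby
require 'nokogiri'
require 'open-uri'

# Fetch the fund page and parse it into a searchable document.
# The URL pattern is an assumption for illustration only.
url = "https://us.spdrs.com/en/etf/#{ticker_symbol}"
page = Nokogiri::HTML(URI.open(url))
```

That page object is what etf_from_scrape, shown below, consumes.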
```ruby
def self.etf_from_scrape(ticker_symbol, page)
  # Pull the fund's name and objective out of the parsed page
  fund_name = page.css('h1').text.strip
  fund_objective = page.css('.overview.tab_section').css('.col2s.leftm').css('p').first.text
  etf = Etf.create(ticker: ticker_symbol, name: fund_name, objective: fund_objective)
  # Create the associated holdings, sector, and country records
  self.etf_holdings_from_scrape(page, etf.id)
  etf
end
```
For example, this method takes in the user-supplied ticker symbol and a page created by the Nokogiri library. It then parses the page by CSS selectors for the attributes needed to create an ETF record. Next, it calls another method to create records for the ETF's top holdings, sector weights, and country weights by passing along the same page and the ETF's ID (necessary for relating those records to the ETF through the etf_id foreign key). Lastly, it returns the Etf instance itself.
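To make that iteration step concrete, the companion method might look roughly like the sketch below. The CSS selector, table layout, and Holding model are all assumptions, not the actual scraper code:

```ruby
def self.etf_holdings_from_scrape(page, etf_id)
  # Selector and column layout are assumptions for illustration only
  page.css('.fund-top-holdings table tr').each do |row|
    cells = row.css('td')
    next if cells.empty? # skip header rows
    Holding.create(
      etf_id: etf_id, # foreign key relating the holding to its ETF
      name: cells[0].text.strip,
      amount: cells[1].text.strip.delete(',').to_i
    )
  end
end
```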
- Fork it!
- Create your feature branch:
git checkout -b my-new-feature
- Commit your changes:
git commit -am 'Add some feature'
- Push to the branch:
git push origin my-new-feature
- Submit a pull request :D
I love feedback and am always looking to learn and improve. Feel free to send me an email: [email protected]