Merge branch 'master' into issue_389
Mifrill committed Feb 8, 2021
2 parents f56f0c4 + 5339d3a commit d212720
Showing 45 changed files with 525 additions and 181 deletions.
15 changes: 8 additions & 7 deletions .travis.yml
@@ -1,20 +1,21 @@
language: ruby
rvm:
- 2.2
- 2.3
- 2.4
- 2.5
- 2.6
- 2.7
- ruby-head
- jruby-9.1.6.0
env:
- LONG_RUN=true
matrix:
include:
- rvm: 2.0
gemfile: Gemfile_ruby2
- rvm: 2.1
gemfile: Gemfile_ruby2
- rvm: 2.6
env: RUBYOPT=--jit LONG_RUN=true
- rvm: ruby-head
env: RUBYOPT=--jit LONG_RUN=true
allow_failures:
- rvm: ruby-head
- rvm: ruby-head
env: RUBYOPT=--jit LONG_RUN=true
- rvm: jruby-9.1.6.0
bundler_args: --without local_development
45 changes: 44 additions & 1 deletion CHANGELOG.md
@@ -1,8 +1,51 @@
## Unreleased
## Unreleased

## [2.8.3] 2020-02-03
### Changed/Added
- Updated rubyzip version. Now minimal version is 1.3.0 [515](https://github.com/roo-rb/roo/pull/515) - [CVE-2019-16892](https://github.com/rubyzip/rubyzip/pull/403)

## [2.8.2] 2019-02-01
### Changed/Added
- Support range cell for Excelx's links [490](https://github.com/roo-rb/roo/pull/490)
- Skip `extract_hyperlinks` if not required [488](https://github.com/roo-rb/roo/pull/488)

### Fixed
- Fixed error for invalid link [492](https://github.com/roo-rb/roo/pull/492)

## [2.8.1] 2019-01-21
### Fixed
- Fixed error if excelx's cell have empty children [487](https://github.com/roo-rb/roo/pull/487)

## [2.8.0] 2019-01-18
### Fixed
- Fixed inconsistent column length for CSV [375](https://github.com/roo-rb/roo/pull/375)
- Fixed formatted_value with `%` for Excelx [416](https://github.com/roo-rb/roo/pull/416)
- Improved Memory consumption and performance [434](https://github.com/roo-rb/roo/pull/434) [449](https://github.com/roo-rb/roo/pull/449) [454](https://github.com/roo-rb/roo/pull/454) [456](https://github.com/roo-rb/roo/pull/456) [458](https://github.com/roo-rb/roo/pull/458) [462](https://github.com/roo-rb/roo/pull/462) [466](https://github.com/roo-rb/roo/pull/466)
- Accept both Transitional and Strict Type for Excelx's worksheets [441](https://github.com/roo-rb/roo/pull/441)
- Fixed ruby warnings [442](https://github.com/roo-rb/roo/pull/442) [476](https://github.com/roo-rb/roo/pull/476)
- Restore support for URL as file identifier for CSV [462](https://github.com/roo-rb/roo/pull/462)
- Fixed missing location for Excelx's links [482](https://github.com/roo-rb/roo/pull/482)

### Changed / Added
- Drop support for ruby 2.2.x and lower
- Updated rubyzip version for fixing security issue. Now minimal version is 1.2.1
- Roo::Excelx::Coordinate now inherits Array [458](https://github.com/roo-rb/roo/pull/458)
- Improved Roo::HeaderRowNotFoundError exception's message [461](https://github.com/roo-rb/roo/pull/461)
- Added `empty_cell` option which by default disable allocation for Roo::Excelx::Cell::Empty [464](https://github.com/roo-rb/roo/pull/464)
- Added support for variable number of decimals for Excelx's formatted_value [387](https://github.com/roo-rb/roo/pull/387)
- Added `disable_html_injection` option to disable html injection for shared string in `Roo::Excelx` [392](https://github.com/roo-rb/roo/pull/392)
- Added image extraction for Excelx [414](https://github.com/roo-rb/roo/pull/414) [397](https://github.com/roo-rb/roo/pull/397)
- Added support for `1e6` as scientific notation for Excelx [433](https://github.com/roo-rb/roo/pull/433)
- Added support for Integer as 0 based index for Excelx's `sheet_for` [455](https://github.com/roo-rb/roo/pull/455)
- Extended `no_hyperlinks` option for non streaming Excelx methods [459](https://github.com/roo-rb/roo/pull/459)
- Added `empty_cell` option to disable Roo::Excelx::Cell::Empty allocation for Excelx [464](https://github.com/roo-rb/roo/pull/464)
- Added support for Integer with leading zero for Roo:Excelx [479](https://github.com/roo-rb/roo/pull/479)
- Refactored Excelx code [453](https://github.com/roo-rb/roo/pull/453) [477](https://github.com/roo-rb/roo/pull/477) [483](https://github.com/roo-rb/roo/pull/483) [484](https://github.com/roo-rb/roo/pull/484)

### Deprecations
- Roo::Excelx::Sheet#present_cells is deprecated [454](https://github.com/roo-rb/roo/pull/454)
- Roo::Utils.split_coordinate is deprecated [458](https://github.com/roo-rb/roo/pull/458)
- Roo::Excelx::Cell::Base#link is deprecated [457](https://github.com/roo-rb/roo/pull/457)

## [2.7.1] 2017-01-03
### Fixed
30 changes: 0 additions & 30 deletions Gemfile_ruby2

This file was deleted.

21 changes: 15 additions & 6 deletions README.md
@@ -18,7 +18,7 @@ Install as a gem
Or add it to your Gemfile

```ruby
gem "roo", "~> 2.7.0"
gem "roo", "~> 2.8.0"
```
## Usage

@@ -261,7 +261,7 @@ ods.formula('A', 2)
csv = Roo::CSV.new("mycsv.csv")
```

Because Roo uses the [standard CSV library](), you can use options available to that library to parse csv files. You can pass options using the ``csv_options`` key.
Because Roo uses the standard CSV library, you can use options available to that library to parse csv files. You can pass options using the ``csv_options`` key.

For instance, you can load tab-delimited files (``.tsv``), and you can use a particular encoding when opening the file.

@@ -274,6 +274,18 @@ csv = Roo::CSV.new("mytsv.tsv", csv_options: {col_sep: "\t"})
csv = Roo::CSV.new("mycsv.csv", csv_options: {encoding: Encoding::ISO_8859_1})
```

You can also open csv files through the Roo::Spreadsheet class (useful if you accept both CSV and Excel types from a user file upload, for example).

```ruby
# Load a spreadsheet from a file path
# Roo figures out the right parser based on file extension
spreadsheet = Roo::Spreadsheet.open(csv_or_xlsx_file)

# Load a csv and auto-strip the BOM (byte order mark)
# csv files saved from MS Excel typically have the BOM marker at the beginning of the file
spreadsheet = Roo::Spreadsheet.open("mycsv.csv", { csv_options: { encoding: 'bom|utf-8' } })
```
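
As a rough illustration of what the `bom|utf-8` encoding does, here is a minimal sketch that uses Ruby's standard library directly rather than Roo. The helper names `write_bom_csv` and `read_csv_without_bom` are hypothetical and exist only for this example.

```ruby
require "csv"
require "tempfile"

# Hypothetical helper: write a small CSV that starts with the UTF-8 BOM bytes,
# similar to what MS Excel tends to produce.
def write_bom_csv
  file = Tempfile.new(["bom_sample", ".csv"])
  file.binmode
  file.write("\xEF\xBB\xBFname,age\nAda,36\n".b)
  file.close
  file
end

# Hypothetical helper: open the file with the "bom|utf-8" encoding so Ruby
# skips a leading BOM before the CSV parser sees the first header.
def read_csv_without_bom(path)
  File.open(path, "r:bom|utf-8") { |io| CSV.new(io).read }
end

sample = write_bom_csv
rows = read_csv_without_bom(sample.path)
puts rows.first.first.inspect
```

Without the BOM-aware encoding, the first header would carry an invisible `U+FEFF` prefix, which typically breaks header lookups by name.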

## Upgrading from Roo 1.13.x
If you use ``.xls`` or Google spreadsheets, you will need to install ``roo-xls`` or ``roo-google`` to continue using that functionality.

@@ -283,7 +295,7 @@ Roo's public methods have stayed relatively consistent between 1.13.x and 2.0.0,

## Contributing
### Features
1. Fork it ( https://github.com/[my-github-username]/roo/fork )
1. Fork it ( https://github.com/roo-rb/roo/fork )
2. Install it (`bundle install --with local_development`)
3. Create your feature branch (`git checkout -b my-new-feature`)
4. Commit your changes (`git commit -am 'My new feature'`)
@@ -300,9 +312,6 @@ You can run the tests/examples with Rspec like reporters by running
Roo also has a few tests that take a long time (5+ seconds). To run these, use
`LONG_RUN=true bundle exec rake`

When testing using Ruby 2.0 or 2.1, use this command:
`BUNDLE_GEMFILE=Gemfile_ruby2 bundle exec rake`

### Issues

If you find an issue, please create a gist and refer to it in an issue ([sample gist](https://gist.github.com/stevendaniels/98a05849036e99bb8b3c)). Here are some instructions for creating such a gist.
29 changes: 16 additions & 13 deletions lib/roo/base.rb
@@ -288,12 +288,12 @@ def each(options = {})
clean_sheet_if_need(options)
search_or_set_header(options)
headers = @headers ||
Hash[(first_column..last_column).map do |col|
[cell(@header_line, col), col]
end]
(first_column..last_column).each_with_object({}) do |col, hash|
hash[cell(@header_line, col)] = col
end

@header_line.upto(last_row) do |line|
yield(Hash[headers.map { |k, v| [k, cell(line, v)] }])
yield(headers.each_with_object({}) { |(k, v), hash| hash[k] = cell(line, v) })
end
end
end
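
Note on the `Hash[... map ...]` to `each_with_object` rewrite in this file: the two forms should build the same hash, but the second skips the intermediate array of pairs. A minimal sketch of the equivalence, using hypothetical method names and a plain array in place of Roo's `cell(@header_line, col)` lookups:

```ruby
# Old style: build an array of [key, value] pairs, then convert it to a Hash.
def header_index_with_hash_brackets(header_cells)
  Hash[header_cells.each_with_index.map { |name, i| [name, i + 1] }]
end

# New style: fill the hash directly while iterating (no intermediate array).
def header_index_with_each_with_object(header_cells)
  header_cells.each_with_index.each_with_object({}) do |(name, i), hash|
    hash[name] = i + 1
  end
end

cells = %w[id name price]
p header_index_with_hash_brackets(cells)
p header_index_with_each_with_object(cells)
```

Column numbers are 1-based here only to mirror spreadsheet columns; that detail is an assumption of the sketch.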
@@ -424,9 +424,9 @@ def find_by_row(row_index)

def find_by_conditions(options)
rows = first_row.upto(last_row)
header_for = Hash[1.upto(last_column).map do |col|
[col, cell(@header_line, col)]
end]
header_for = 1.upto(last_column).each_with_object({}) do |col, hash|
hash[col] = cell(@header_line, col)
end

# are all conditions met?
conditions = options[:conditions]
@@ -441,9 +441,9 @@ def find_by_conditions(options)
rows.map { |i| row(i) }
else
rows.map do |i|
Hash[1.upto(row(i).size).map do |j|
[header_for.fetch(j), cell(i, j)]
end]
1.upto(row(i).size).each_with_object({}) do |j, hash|
hash[header_for.fetch(j)] = cell(i, j)
end
end
end
end
@@ -497,8 +497,11 @@ def sanitize_value(v)
def set_headers(hash = {})
# try to find header row with all values or give an error
# then create new hash by indexing strings and keeping integers for header array
@headers = row_with(hash.values, true)
@headers = Hash[hash.keys.zip(@headers.map { |x| header_index(x) })]
header_row = row_with(hash.values, true)
@headers = {}
hash.each_with_index do |(key, _), index|
@headers[key] = header_index(header_row[index])
end
end

def header_index(query)
@@ -541,7 +544,7 @@ def download_uri(uri, tmpdir)
tempfilename = File.join(tmpdir, find_basename(uri))
begin
File.open(tempfilename, "wb") do |file|
open(uri, "User-Agent" => "Ruby/#{RUBY_VERSION}") do |net|
URI.open(uri, "User-Agent" => "Ruby/#{RUBY_VERSION}") do |net|
file.write(net.read)
end
end
14 changes: 10 additions & 4 deletions lib/roo/csv.rb
@@ -90,17 +90,23 @@ def read_cells(sheet = default_sheet)
def each_row(options, &block)
if uri?(filename)
each_row_using_tempdir(options, &block)
elsif is_stream?(filename_or_stream)
::CSV.new(filename_or_stream, options).each(&block)
else
::CSV.foreach(filename, options, &block)
csv_foreach(filename_or_stream, options, &block)
end
end

def each_row_using_tempdir(options, &block)
::Dir.mktmpdir(Roo::TEMP_PREFIX, ENV["ROO_TMP"]) do |tmpdir|
tmp_filename = download_uri(filename, tmpdir)
::CSV.foreach(tmp_filename, options, &block)
csv_foreach(tmp_filename, options, &block)
end
end

def csv_foreach(path_or_io, options, &block)
if is_stream?(path_or_io)
::CSV.new(path_or_io, **options).each(&block)
else
::CSV.foreach(path_or_io, **options, &block)
end
end
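
Note on `csv_foreach`: the `**options` splat is presumably there because newer Rubies separate positional hashes from keyword arguments, and the standard `CSV` API takes its options as keywords. A standalone sketch of the same dispatch follows. The names `stream?` and `each_csv_row` are hypothetical, and treating anything that responds to `seek` as a stream is an assumption, not necessarily how Roo's `is_stream?` works.

```ruby
require "csv"
require "stringio"

# Assumption for this sketch: anything seekable is treated as an IO stream.
def stream?(source)
  source.respond_to?(:seek)
end

# Dispatch to CSV.new for IO-like objects and CSV.foreach for file paths,
# forwarding the options hash as keyword arguments in both cases.
def each_csv_row(path_or_io, options = {}, &block)
  if stream?(path_or_io)
    CSV.new(path_or_io, **options).each(&block)
  else
    CSV.foreach(path_or_io, **options, &block)
  end
end

each_csv_row(StringIO.new("a;b\n1;2\n"), { col_sep: ";" }) { |row| p row }
```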

35 changes: 17 additions & 18 deletions lib/roo/excelx.rb
@@ -40,8 +40,9 @@ def initialize(filename_or_stream, options = {})
sheet_options = {}
sheet_options[:expand_merged_ranges] = (options[:expand_merged_ranges] || false)
sheet_options[:no_hyperlinks] = (options[:no_hyperlinks] || false)
sheet_options[:empty_cell] = (options[:empty_cell] || false)
shared_options = {}

shared_options[:disable_html_wrapper] = (options[:disable_html_wrapper] || false)
unless is_stream?(filename_or_stream)
file_type_check(filename_or_stream, %w[.xlsx .xlsm], 'an Excel 2007', file_warning, packed)
@@ -59,16 +60,17 @@ def initialize(filename_or_stream, options = {})
@filename = local_filename(filename_or_stream, @tmpdir, packed)
process_zipfile(@filename || filename_or_stream)

@sheet_names = workbook.sheets.map do |sheet|
unless options[:only_visible_sheets] && sheet['state'] == 'hidden'
sheet['name']
end
end.compact
@sheet_names = []
@sheets = []
@sheets_by_name = Hash[@sheet_names.map.with_index do |sheet_name, n|
@sheets[n] = Sheet.new(sheet_name, @shared, n, sheet_options)
[sheet_name, @sheets[n]]
end]
@sheets_by_name = {}

workbook.sheets.each_with_index do |sheet, index|
next if options[:only_visible_sheets] && sheet['state'] == 'hidden'

sheet_name = sheet['name']
@sheet_names << sheet_name
@sheets_by_name[sheet_name] = @sheets[index] = Sheet.new(sheet_name, @shared, index, sheet_options)
end

if cell_max
cell_count = ::Roo::Utils.num_cells_in_range(sheet_for(options.delete(:sheet)).dimensions)
@@ -333,7 +335,7 @@ def extract_worksheet_ids(entries, path)

wb.extract(path)
workbook_doc = Roo::Utils.load_xml(path).remove_namespaces!
workbook_doc.xpath('//sheet').map { |s| s.attributes['id'].value }
workbook_doc.xpath('//sheet').map { |s| s['id'] }
end

# Internal
@@ -359,14 +361,11 @@ def extract_worksheet_rels(entries, path)
rels_doc = Roo::Utils.load_xml(path).remove_namespaces!

relationships = rels_doc.xpath('//Relationship').select do |relationship|
worksheet_types.include? relationship.attributes['Type'].value
worksheet_types.include? relationship['Type']
end

relationships.inject({}) do |hash, relationship|
attributes = relationship.attributes
id = attributes['Id']
hash[id.value] = attributes['Target'].value
hash
relationships.each_with_object({}) do |relationship, hash|
hash[relationship['Id']] = relationship['Target']
end
end

@@ -463,7 +462,7 @@ def process_zipfile_entries(entries)
end

def safe_send(object, method, *args)
object.send(method, *args) if object && object.respond_to?(method)
object.send(method, *args) if object&.respond_to?(method)
end
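
Note on the `&.` change: the safe navigation operator (Ruby 2.3+) returns `nil` when the receiver is `nil`, so it should behave like the earlier `object && object.respond_to?(method)` guard. A small standalone copy of the helper with a few calls, for illustration only:

```ruby
# Calls the method only when the object is non-nil and responds to it;
# otherwise the trailing `if` makes the whole expression evaluate to nil.
def safe_send(object, method, *args)
  object.send(method, *args) if object&.respond_to?(method)
end

p safe_send("abc", :upcase)
p safe_send("abc", :no_such_method)
p safe_send(nil, :upcase)
```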

def worksheet_types
3 changes: 1 addition & 2 deletions lib/roo/excelx/cell.rb
@@ -40,8 +40,7 @@ def type
end

def self.create_cell(type, *values)
type_class = cell_class(type)
type_class && type_class.new(*values)
cell_class(type)&.new(*values)
end

def self.cell_class(type)
2 changes: 1 addition & 1 deletion lib/roo/excelx/cell/base.rb
@@ -58,7 +58,7 @@ def type
end

def formula?
!!@formula
!!(defined?(@formula) && @formula)
end

def link?
5 changes: 3 additions & 2 deletions lib/roo/excelx/cell/number.rb
@@ -24,7 +24,7 @@ def create_numeric(number)
when /\.0/
Float(number)
else
(number.include?('.') || (/\A[-+]?\d+E[-+]?\d+\z/i =~ number)) ? Float(number) : Integer(number)
(number.include?('.') || (/\A[-+]?\d+E[-+]?\d+\z/i =~ number)) ? Float(number) : Integer(number, 10)
end
end
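
Note on `Integer(number, 10)`: without an explicit base, `Kernel#Integer` honours radix prefixes, so a string with a leading zero is read as octal. A minimal sketch of the difference, with the hypothetical helper name `parse_integer_cell`:

```ruby
# Parse a cell's text as a base-10 integer, so leading zeros are harmless.
def parse_integer_cell(text)
  Integer(text, 10)
end

p Integer("010")            # leading zero is treated as an octal prefix
p parse_integer_cell("010") # explicit base 10
p parse_integer_cell("08")  # not valid octal, but fine in base 10
```

`Integer("08")` without the base raises `ArgumentError`, which is likely the kind of failure the changelog entry about integers with a leading zero refers to.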

@@ -48,7 +48,7 @@ def generate_formatter(format)
when /^(0+)$/ then "%0#{$1.size}d"
when /^0\.(0+)$/ then "%.#{$1.size}f"
when '#,##0' then number_format('%.0f')
when '#,##0.00' then number_format('%.2f')
when /^#,##0.(0+)$/ then number_format("%.#{$1.size}f")
when '0%'
proc do |number|
Kernel.format('%d%%', number.to_f * 100)
@@ -64,6 +64,7 @@ def generate_formatter(format)
when '#,##0.00;[Red](#,##0.00)' then number_format('%.2f', '[Red](%.2f)')
# FIXME: not quite sure what the format should look like in this case.
when '##0.0E+0' then '%.1E'
when "_-* #,##0.00\\ _€_-;\\-* #,##0.00\\ _€_-;_-* \"-\"??\\ _€_-;_-@_-" then number_format('%.2f', '-%.2f')
when '@' then proc { |number| number }
else
raise "Unknown format: #{format.inspect}"
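
Note on the variable-decimals branch: the regex captures the run of zeros after the decimal point and uses its length as the precision. A minimal sketch of that idea, with a hypothetical `decimal_formatter` helper. Unlike the diff, it escapes the dot in the pattern, and it leaves out the thousands separator that Roo's `number_format` adds.

```ruby
# Returns a Kernel.format string whose precision matches the number of
# zeros after the decimal point in an Excel-style "#,##0.000" format,
# or nil when the format has no fractional part of that shape.
def decimal_formatter(format)
  case format
  when /\A#,##0\.(0+)\z/ then "%.#{$1.size}f"
  end
end

puts Kernel.format(decimal_formatter("#,##0.000"), 1234.5)
```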
2 changes: 1 addition & 1 deletion lib/roo/excelx/cell/time.rb
@@ -13,7 +13,7 @@ def initialize(value, formula, excelx_type, style, link, base_date, coordinate)
super
@format = excelx_type.last
@datetime = create_datetime(base_date, value)
@value = link ? Roo::Link.new(link, value) : (value.to_f * 86_400).to_i
@value = link ? Roo::Link.new(link, value) : (value.to_f * 86_400).round.to_i
end
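
Note on `.round.to_i`: a time is stored as a fraction of a day, so multiplying by 86,400 can land just below a whole second, and truncation then loses that second. A small sketch with hypothetical helper names and a made-up sample value; `Float#round` already returns an Integer, so the extra `to_i` in the diff looks redundant but harmless.

```ruby
SECONDS_PER_DAY = 86_400

# Old behaviour: drop the fractional part.
def seconds_truncated(value)
  (value.to_f * SECONDS_PER_DAY).to_i
end

# New behaviour: round to the nearest whole second.
def seconds_rounded(value)
  (value.to_f * SECONDS_PER_DAY).round
end

sample = "0.541666662" # roughly 13:00:00, stored with limited precision
p seconds_truncated(sample)
p seconds_rounded(sample)
```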

def formatted_value