Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Return primary keys when using blocks or values #43

Open
wants to merge 1 commit into
base: master
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
25 changes: 19 additions & 6 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -160,6 +160,7 @@ Default value for this option is false.
destination_columns = [:title, :author]

# Update duplicate rows (MySQL)

Book.bulk_insert(*destination_columns, update_duplicates: true) do |worker|
worker.add(...)
worker.add(...)
Expand All @@ -175,20 +176,32 @@ end

### Return Primary Keys (PostgreSQL, PostGIS)

If you want the worker to store primary keys of inserted records, then you can
use the _return_primary_keys_ option. The worker will store a `result_sets`
array of `ActiveRecord::Result` objects. Each `ActiveRecord::Result` object
will contain the primary keys of a batch of inserted records.
If you want the worker to store primary keys of inserted records, then you
can use the _return_primary_keys_ option. The `bulk_insert` block will the
list of primary keys values inserted.

```ruby
worker = Book.bulk_insert(*destination_columns, return_primary_keys: true) do
|worker|
inserted_ids = Book.bulk_insert(return_primary_keys: true) do |worker|
worker.add(...)
worker.add(...)
# ...
end
```

When working with the worker instance directly, the primary keys of the
inserted records will be stored in the `results_sets` attribute as an array
of `ActiveRecord::Result` objects. Each `ActiveRecord::Result` object will
contain the primary keys of a batch of inserted records.

The `inserted_ids` method will return these ids as a flat list.

```ruby
worker = Book.bulk_insert(*destination_columns, return_primary_keys: true)
worker.add(...)
worker.save!

worker.result_sets
worker.inserted_ids
```

## Ruby and Rails Versions Supported
Expand Down
6 changes: 3 additions & 3 deletions lib/bulk_insert.rb
Original file line number Diff line number Diff line change
Expand Up @@ -8,18 +8,18 @@ def bulk_insert(*columns, values: nil, set_size:500, ignore: false, update_dupli
columns = default_bulk_columns if columns.empty?
worker = BulkInsert::Worker.new(connection, table_name, primary_key, columns, set_size, ignore, update_duplicates, return_primary_keys)

if values.present?
if !values.nil?
transaction do
worker.add_all(values)
worker.save!
end
nil
return_primary_keys ? worker.inserted_ids : nil
elsif block_given?
transaction do
yield worker
worker.save!
end
nil
return_primary_keys ? worker.inserted_ids : nil
else
worker
end
Expand Down
4 changes: 4 additions & 0 deletions lib/bulk_insert/worker.rb
Original file line number Diff line number Diff line change
Expand Up @@ -36,6 +36,10 @@ def initialize(connection, table_name, primary_key, column_names, set_size=500,
@set = []
end

def inserted_ids
@return_primary_keys ? @result_sets.map(&:rows).flatten : nil
end
Comment on lines +39 to +41
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This option on non postgresql adapters may lead to some errors like:

Old rails versions

NoMethodError: undefined method `rows' for nil:NilClass
    /home/travis/build/jamis/bulk_insert/lib/bulk_insert/worker.rb:40:in `map'
    /home/travis/build/jamis/bulk_insert/lib/bulk_insert/worker.rb:40:in `inserted_ids'
    /home/travis/build/jamis/bulk_insert/test/bulk_insert/worker_test.rb:163:in `block in <class:BulkInsertWorkerTest>'

Or it may return empty arrays in SQLite and Mysql (only if rails >=6).

In order to avoid confusion and future bugs, I would suggest to fail as soon as possible in case of misconfiguration.

E.g.

@return_primary_keys = return_primary_keys

# Add at line 23
validate_return_primary_keys!

# ---
def validate_return_primary_keys!
  if @return_primary_keys && !@statement_adapter.is_a?(StatementAdapters::PostgreSQLAdapter)
    raise ArgumentError, "Unsupported option 'return_primary_keys' for adapter #{@adapter_name}"
  end
end


def pending?
@set.any?
end
Expand Down
3 changes: 3 additions & 0 deletions test/bulk_insert/worker_test.rb
Original file line number Diff line number Diff line change
Expand Up @@ -132,6 +132,7 @@ class BulkInsertWorkerTest < ActiveSupport::TestCase
@insert.save!

assert_equal 0, @insert.result_sets.count
assert_nil @insert.inserted_ids
end


Expand Down Expand Up @@ -159,11 +160,13 @@ class BulkInsertWorkerTest < ActiveSupport::TestCase
worker.add greeting: "second"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On this test I would follow something similar to:

# return_primary_keys is not supported for mysql and rails < 5
# this test ensures that the case is covered in the CI and handled as expected
if ActiveRecord::VERSION::STRING < "5.0.0" && worker.adapter_name =~ /^mysql/i
error = assert_raise(ArgumentError) { worker.save! }
assert_equal error.message, "BulkInsert does not support @return_primary_keys for mysql and rails < 5"
else
worker.save!
assert_equal 1, worker.result_sets.count
end
end

if Testing.connection.adapter_name =~ /\APost(?:greSQL|GIS)/i
      worker = BulkInsert::Worker.new(
      Testing.connection,
      Testing.table_name,
      'id',
      %w(greeting age happy created_at updated_at color),
      500,
      false,
      false,
      true
    )
    # put the rest of the test here
else
  assert_raise(ArgumentError) {     worker = BulkInsert::Worker.new(Testing.connection, Testing.table_name, 'id', %w(greeting age happy created_at updated_at color), 500, false, false, true) }

you can eventually also capture the error message as per the example linked and perform a partial matching on the error message

worker.save!
assert_equal 1, worker.result_sets.count
Copy link
Collaborator

@mberlanda mberlanda Mar 29, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here the result_sets for non postgresql adapters would look like

  • mysql rails < 6 [ nil ]
  • sqlite and mysql rails > 6 [#<ActiveRecord::Result:0x0000563f19834d08 @columns=[], @rows=[], @hash_rows=nil, @column_types={}>]

assert_equal [], worker.inserted_ids
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the expected result here should be

assert_equal [ Testing.last(1).id ], worker.inserted_ids


worker.add greeting: "third"
worker.add greeting: "fourth"
worker.save!
assert_equal 2, worker.result_sets.count
assert_equal [], worker.inserted_ids
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the expected result here should be

assert_equal Testing.last(2).map(&:id).sort, worker.inserted_ids.sort

end

test "initialized with empty result sets array" do
Expand Down
15 changes: 14 additions & 1 deletion test/bulk_insert_test.rb
Original file line number Diff line number Diff line change
Expand Up @@ -44,13 +44,26 @@ class BulkInsertTest < ActiveSupport::TestCase

test "bulk_insert with array should save the array immediately" do
assert_difference "Testing.count", 2 do
Testing.bulk_insert values: [
result = Testing.bulk_insert values: [
[ "Hello", 15, true ],
{ greeting: "Hey", age: 20, happy: false }
]
assert_nil result
end
end

test "bulk_insert with an empty value should not return worker" do
assert_nil Testing.bulk_insert values: []
end

test "bulk_insert with option to return primary keys and values should return the new ids" do
output = Testing.bulk_insert(
values: [["Hello", 15, true]],
return_primary_keys: true
)
Comment on lines +60 to +63
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same as per the previous comment on skip based on the adapter.

assert_equal output, []
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the expected result here should be

assert_equal [Testing.last(1).id], output

end

test "default_bulk_columns should return all columns without id" do
default_columns = %w(greeting age happy created_at updated_at color)

Expand Down