Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

whitelisted fields are not persisted in the destination database #17

Closed
jabley opened this issue Nov 6, 2012 · 9 comments · Fixed by #19
Closed

whitelisted fields are not persisted in the destination database #17

jabley opened this issue Nov 6, 2012 · 9 comments · Fixed by #19
Assignees
Labels
Milestone

Comments

@jabley
Copy link
Contributor

jabley commented Nov 6, 2012

I'm trying to use this gem, but having problems copying the data over.

I have this but the resulting table only contains the id column; the created_at and updated_at fields are populated but with the current time rather than the values in the source database and all of the other fields are empty / nil.

When I run the task under the debugger, I see this

/var/govuk/data-anonymization/lib/strategy/whitelist.rb:23
dest_record.save!
(rdb:1) p dest_record_map
{"legacy_id"=>"1", "group_description"=>"Trading", "category_description"=>"Loss of Market", "code"=>"A.10.10", "code_description"=>"Competition"}
(rdb:1) p dest_record
#<#<Class:0x000000071bd1c0> id: 1, legacy_id: nil, group_description: nil, category_description: nil, code: nil, code_description: nil, ar_timestamp: nil, ar_insert_timestamp: nil, created_at: nil, updated_at: nil>

To me, this looks like I've set up the anonymization correctly (for now, I just want to copy the data and once I'm happy that works, I'll add the correct anonymization strategies and other tables in) but the data isn't being persisted for some reason.

Can you help me understand what I'm doing wrong?

ruby 1.9.3p194 (2012-04-20 revision 35410) [x86_64-linux] running under Ubuntu 10.04

@jabley
Copy link
Contributor Author

jabley commented Nov 9, 2012

It looks to be a rails mass assignment thing. I'm using rails 3.2.8.

(rdb:1) p dest_record
#<#<Class:0x00000005dfc870> id: 1, legacy_id: nil, group_description: nil, category_description: nil, code: nil, code_description: nil, ar_timestamp: nil, ar_insert_timestamp: nil, created_at: nil, updated_at: nil>
(rdb:1) p dest_record_map
{"legacy_id"=>"1", "group_description"=>"Trading", "category_description"=>"Loss of Market", "code"=>"A.10.10", "code_description"=>"Competition"}
(rdb:1) eval dest_record.attributes= dest_record_map
{"legacy_id"=>"1", "group_description"=>"Trading", "category_description"=>"Loss of Market", "code"=>"A.10.10", "code_description"=>"Competition"}
(rdb:1) p dest_record
#<#<Class:0x00000005dfc870> id: 1, legacy_id: nil, group_description: nil, category_description: nil, code: nil, code_description: nil, ar_timestamp: nil, ar_insert_timestamp: nil, created_at: nil, updated_at: nil>
(rdb:1) eval dest_record.assign_attributes(dest_record_map, without_protection: true)
nil
(rdb:1) p dest_record
#<#<Class:0x00000005dfc870> id: 1, legacy_id: "1", group_description: "Trading", category_description: "Loss of Market", code: "A.10.10", code_description: "Competition", ar_timestamp: nil, ar_insert_timestamp: nil, created_at: nil, updated_at: nil>

The mass assignment sanitizer is correct, but perhaps isn't quite having the intended effect.

(rdb:1) p dest_record.class._mass_assignment_sanitizer
=> #<DataAnon::Utils::MassAssignmentIgnoreSanitizer:0x0000000682dce0>

I can potentially work around it by changing the assignment to happen not by passing it to the initializer, but instead by doing dest_record.assign_attributes(dest_record_map, without_protection: true) but that feels pretty nasty. I'll have a poke around for cleaner solutions.

@sunitparekh
Copy link
Owner

thank u figuring out the problem. Give me a day and I will get something
working in for you. I am out from the no internet zone and can work freely
to help you

regards,
Sunit.

On Fri, Nov 9, 2012 at 6:24 PM, James Abley [email protected]:

It looks to be a rails mass assignment thing. I'm using rails 3.2.8.

(rdb:1) p dest_record
#<#Class:0x00000005dfc870 id: 1, legacy_id: nil, group_description: nil, category_description: nil, code: nil, code_description: nil, ar_timestamp: nil, ar_insert_timestamp: nil, created_at: nil, updated_at: nil>
(rdb:1) p dest_record_map
{"legacy_id"=>"1", "group_description"=>"Trading", "category_description"=>"Loss of Market", "code"=>"A.10.10", "code_description"=>"Competition"}
(rdb:1) eval dest_record.attributes= dest_record_map
{"legacy_id"=>"1", "group_description"=>"Trading", "category_description"=>"Loss of Market", "code"=>"A.10.10", "code_description"=>"Competition"}
(rdb:1) p dest_record
#<#Class:0x00000005dfc870 id: 1, legacy_id: nil, group_description: nil, category_description: nil, code: nil, code_description: nil, ar_timestamp: nil, ar_insert_timestamp: nil, created_at: nil, updated_at: nil>
(rdb:1) eval dest_record.assign_attributes(dest_record_map, without_protection: true)
nil
(rdb:1) p dest_record
#<#Class:0x00000005dfc870 id: 1, legacy_id: "1", group_description: "Trading", category_description: "Loss of Market", code: "A.10.10", code_description: "Competition", ar_timestamp: nil, ar_insert_timestamp: nil, created_at: nil, updated_at: nil>

The mass assignment sanitizer is correct, but perhaps isn't quite having
the intended effect.

(rdb:1) p dest_record.class._mass_assignment_sanitizer
=> #DataAnon::Utils::MassAssignmentIgnoreSanitizer:0x0000000682dce0

I can potentially work around it by changing the assignment to happen not
by passing it to the initializer, but instead by doing dest_record.assign_attributes(dest_record_map,
without_protection: true) but that feels pretty nasty. I'll have a poke
around for cleaner solutions.


Reply to this email directly or view it on GitHubhttps://github.com//issues/17#issuecomment-10237246.

thanks & regards,
Sunit
[email protected]

@sunitparekh
Copy link
Owner

to replicate the error you are getting I added timestamp to my tests. However, things are working fine as expected for me. Checkout the following test that I updated for testing your scenario.

https://github.com/sunitparekh/data-anonymization/blob/master/spec/acceptance/rdbms_whitelist_spec.rb#L51

Let me know if the scenario is something different. I am online on Skype for most of the day. My skype id 'sunitparekh'. Will be happy to help you today. I have time.

@jabley
Copy link
Contributor Author

jabley commented Nov 11, 2012

Sorry, been doing family stuff most of the day. Thanks for the offer.

I guess I have 2 issues here.

  1. timestamp fields aren't being populated from the source database.
  2. The fields that are being read from the source database aren't being persisted.

The debugger output above shows that fields are being read from the source database, but the timestamp fields aren't.

The mass assignment problem I described and potential fix would cure my problem with fields not being persisted, but I'm not clear why the timestamp fields don't appear to be read from the source database.

I'll take a look into that tomorrow.

@sunitparekh
Copy link
Owner

Would it be possible to create one sample test send me across, so I can put
that as my test and reproduce the issue. That will help a lot to fix the
problem.

regards,
Sunit.

On Sun, Nov 11, 2012 at 11:43 PM, James Abley [email protected]:

Sorry, been doing family stuff most of the day. Thanks for the offer.

I guess I have 2 issues here.

  1. timestamp fields aren't being populated from the source database.
  2. The fields that are being read from the source database aren't
    being persisted.

The debugger output above shows that fields are being read from the source
database, but the timestamp fields aren't.

The mass assignment problem I described and potential fix would cure my
problem with fields not being persisted, but I'm not clear why the
timestamp fields don't appear to be read from the source database.

I'll take a look into that tomorrow.


Reply to this email directly or view it on GitHubhttps://github.com//issues/17#issuecomment-10273447.

thanks & regards,
Sunit
[email protected]

@jabley
Copy link
Contributor Author

jabley commented Jan 28, 2013

I have a test case, after a fashion.

https://github.com/alphagov/EFG/commits/data-extract-failure contains a copy of a test from the data-anonymization gem. It passes (in isolation, there are some other effects going on that cause existing specs to fail) but if you check out that repo, use the data-extract-failure branch:

bundle exec rake spec SPEC=spec/lib/extractor_spec.rb

can run the test in isolation, and the modification to CustomerSample.insert_record is needed to avoid a test failure due to what looks like mass assignment protection. If you remove that change, it will fail.

Hope this helps.

UPDATE you'll need to remove the pending from the spec as well; it's a WIP.

The failure I see when running that spec:

1) Extractor should anonymize customer table record 
     Failure/Error: new_rec.address.should == 'F 501 Shanti Nagar'
       expected: "F 501 Shanti Nagar"
            got: nil (using ==)
     # ./spec/lib/extractor_spec.rb:48:in `block (2 levels) in <top (required)>'

@jabley
Copy link
Contributor Author

jabley commented Jan 28, 2013

I think it's down to our use of config.active_record.whitelist_attributes = true

I'll see if I can create a better test case within the data-anonymization project which demonstrates this.

@jabley jabley mentioned this issue Jan 29, 2013
@sunitparekh
Copy link
Owner

Done. I have accepted your pull request with one minor modification.

in place of doing it as assign_attributes call, I passed the config value without_protection: true in constructer. Your test is passing after that change, so I hope that's fine.

Thank you using library. Let me know if I can help with any other issues.

@sunitparekh
Copy link
Owner

Released v0.5.3
Happy data anonymization :-)

@ghost ghost assigned sunitparekh Jan 29, 2013
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants