Skip to content

Conversation

@rileyseaburg
Copy link
Contributor

@rileyseaburg rileyseaburg commented Dec 11, 2025

Summary

This release fixes a critical bug affecting OMB users since November 19, 2025.

Changes

  • Fix nil organization handling in User#cx_collections - prevents NoMethodError when user's organization is nil
  • Improve eager loading in CxCollection#to_csv for better performance
  • Add database indexes on cx_collections, cx_collection_details, and cx_responses tables

Issue

Users were seeing a 500 error ('We're sorry, but something went wrong') when accessing /admin/cx_collections/export_csv

Root Cause

The User#cx_collections method was calling user_org.id without checking if user_org (the user's organization) was nil first.

Testing

  • Verified on staging (touchpoints-staging.app.cloud.gov)
  • All CI tests pass (except pre-existing unrelated Rack::Attack test)

Related PR

- Add restore/save cache for Cargo target and registry directories
- Use Cargo.lock checksum for cache key to ensure proper invalidation
- Verify Rust library linkage with ldd before deploy
- Fail fast if library has unresolved dependencies
- Should significantly speed up Rust builds on subsequent runs
- Handle nil organization in User#cx_collections to prevent NoMethodError
- Add eager loading for service_provider.organization and cx_collection_details
- Add missing database indexes on cx_collections, cx_collection_details, cx_responses

Fixes the 'We're sorry, but something went wrong' error at:
/admin/cx_collections/export_csv
Copilot AI review requested due to automatic review settings December 11, 2025 17:54
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes a critical production bug that has been affecting OMB users since November 19, 2025, where accessing the CX Collections CSV export resulted in a 500 error. The root cause was a NoMethodError when attempting to call methods on a nil organization in the User#cx_collections method.

Key Changes:

  • Added nil safety check in User#cx_collections to prevent NoMethodError when user has no organization
  • Optimized CxCollection.to_csv with improved eager loading to prevent N+1 queries and enhance performance
  • Added database indexes on foreign keys for cx_collections, cx_collection_details, and cx_responses tables

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
app/models/user.rb Adds early return with CxCollection.none when user's organization is nil, fixing the production crash
app/models/cx_collection.rb Improves eager loading in CSV export by including all necessary associations and removes unnecessary variable assignment
db/migrate/20251210192727_add_indexes_to_cx_collections.rb Adds indexes on foreign key columns to improve query performance for CX Collections
db/schema.rb Updates schema version and reflects new indexes from migration
.circleci/config.yml Adds Rust build caching and linkage verification to speed up CI builds

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines 20 to 27
def cx_collections
user_org = organization
user_parent_org = user_org&.parent
return CxCollection.none if user_org.nil?

user_parent_org = user_org.parent

CxCollection.where(cx_collections: { organization_id: [user_org.id, user_parent_org&.id].compact })
end
Copy link

Copilot AI Dec 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The fix for the nil organization handling looks correct and addresses the production bug. However, this critical method now lacks test coverage. Consider adding tests to verify:

  1. The method returns an empty collection when the user has no organization
  2. The method correctly queries collections when the user has an organization
  3. The method correctly includes parent organization collections when applicable

The user_spec.rb file has comprehensive test coverage for other User methods, so adding tests here would maintain consistency.

Copilot uses AI. Check for mistakes.
@rileyseaburg rileyseaburg merged commit 91a8a15 into production Dec 11, 2025
14 of 15 checks passed
rileyseaburg added a commit that referenced this pull request Dec 22, 2025
* Production release: Fix CX Collections export CSV error (#1911)

* Add Cargo caching and library linkage verification to CircleCI

- Add restore/save cache for Cargo target and registry directories
- Use Cargo.lock checksum for cache key to ensure proper invalidation
- Verify Rust library linkage with ldd before deploy
- Fail fast if library has unresolved dependencies
- Should significantly speed up Rust builds on subsequent runs

* Fix cx_collections export_csv 500 error

- Handle nil organization in User#cx_collections to prevent NoMethodError
- Add eager loading for service_provider.organization and cx_collection_details
- Add missing database indexes on cx_collections, cx_collection_details, cx_responses

Fixes the 'We're sorry, but something went wrong' error at:
/admin/cx_collections/export_csv

* Update schema.rb with new indexes for CircleCI

* Update schema version to include new migration

* Add tests for User#cx_collections method

* fix: use git URL for rust buildpack in production manifest

* fix: correct redis service name in production manifest

* Release: WidgetRenderer load fix (#1914)

* Add Cargo caching and library linkage verification to CircleCI

- Add restore/save cache for Cargo target and registry directories
- Use Cargo.lock checksum for cache key to ensure proper invalidation
- Verify Rust library linkage with ldd before deploy
- Fail fast if library has unresolved dependencies
- Should significantly speed up Rust builds on subsequent runs

* Fix cx_collections export_csv 500 error

- Handle nil organization in User#cx_collections to prevent NoMethodError
- Add eager loading for service_provider.organization and cx_collection_details
- Add missing database indexes on cx_collections, cx_collection_details, cx_responses

Fixes the 'We're sorry, but something went wrong' error at:
/admin/cx_collections/export_csv

* Update schema.rb with new indexes for CircleCI

* Update schema version to include new migration

* Add tests for User#cx_collections method

* Fix widget renderer load when native lib missing (#1913)

* Fix User#cx_collections specs for Service owner (#1915) (#1917)

* Release: set prod disk quota to 2G (#1919)

* Add Cargo caching and library linkage verification to CircleCI

- Add restore/save cache for Cargo target and registry directories
- Use Cargo.lock checksum for cache key to ensure proper invalidation
- Verify Rust library linkage with ldd before deploy
- Fail fast if library has unresolved dependencies
- Should significantly speed up Rust builds on subsequent runs

* Fix cx_collections export_csv 500 error

- Handle nil organization in User#cx_collections to prevent NoMethodError
- Add eager loading for service_provider.organization and cx_collection_details
- Add missing database indexes on cx_collections, cx_collection_details, cx_responses

Fixes the 'We're sorry, but something went wrong' error at:
/admin/cx_collections/export_csv

* Update schema.rb with new indexes for CircleCI

* Update schema version to include new migration

* Add tests for User#cx_collections method

* Fix widget renderer load when native lib missing (#1913)

* Fix User#cx_collections specs for Service owner (#1915)

* Fix production manifest for rust buildpack (#1918)

* fix: remove empty secret keys from manifest to prevent wiping env vars on deploy

Empty values in the manifest were overwriting secrets set via cf set-env.
Secrets should only be managed via cf set-env, not in the manifest.

* fix: Use fba-usa-modal class for USWDS Modal compatibility

* fix: Add CSS support to Rust widget renderer to fix modal positioning

* fix: Use parent_id instead of parent in Organization factory

* Increase Cloud Foundry start timeout to 180s and fix Sidekiq health check type

* Fix Sidekiq crash and optimize Rust build script

- Added rust-buildpack to Sidekiq deployment
- Updated build_widget_renderer.sh to handle workspace paths and avoid rebuilds
- Added rust-buildpack to touchpoints-demo manifest

* build(cf): simplify widget_renderer build script and ignore rules

Refactors the `.profile.d/build_widget_renderer.sh` script to remove complex logic for locating Rust dependencies, building from source, and checking library linkage. The script now focuses solely on setting `LD_LIBRARY_PATH` and ensuring the prebuilt `libwidget_renderer.so` is in the correct location.

Additionally, updates `.cfignore` to exclude the `target/` directory and specific release artifacts, while ensuring the root `libwidget_renderer.so` is preserved. This suggests a shift towards deploying a precompiled binary rather than building the Rust extension during the Cloud Foundry deployment process.

* Increase deployment timeout and add widget_renderer fallback to resolve startup crashes

* fix(widget_renderer): restore LoadError and adjust app timeout

- Re-enable raising `LoadError` in `widget_renderer.rb` when the native library is missing, removing the previous fallback log message. This ensures the application fails fast if the required extension is absent.
- Reduce application timeout from 600 to 180 seconds in `touchpoints.yml` to better align with platform constraints or performance expectations.
- Remove the hardcoded `WEB_CONCURRENCY: 1` environment variable from `touchpoints.yml`, allowing the buildpack or platform defaults to manage concurrency.

* ci: increase deployment wait time and enable static files in dev

- Increase the `max_wait` time in `.circleci/deploy.sh` from 600 to 800 seconds to prevent timeouts during longer deployment processes.
- Update `config/environments/development.rb` to conditionally enable the public file server based on the `RAILS_SERVE_STATIC_FILES` environment variable, aligning development behavior with other environments when needed.

* ci(deploy): increase timeouts and enable static file serving

- Increase Cloud Foundry push timeout from 180s to 600s in deployment scripts and `touchpoints.yml` to prevent timeouts during startup.
- Enable `RAILS_SERVE_STATIC_FILES` in production configuration and manifest to allow the application to serve precompiled assets directly.
- Update `deploy.sh` to accept a manifest path argument and refactor retry logic for better error handling.

* Decouple db:migrate from deploy: migrations must be run separately to avoid CF 180s timeout

* Add automated pre-deploy migrations via cf run-task to avoid 180s timeout during app start

* Set health check type to process for sidekiq worker before rolling deploy

* Fix sidekiq worker timeout: explicitly set to 180s before rolling deploy

* Fix flaky logo upload test: add wait time to prevent Selenium stale element race condition

* Fix cf set-health-check: use --invocation-timeout instead of --timeout

* Fix Rack::Attack test: create actual form fixture to avoid 404 responses

* Scale sidekiq worker to 1 instance during rolling deploy to avoid org memory quota exceeded

* Stop sidekiq worker before push to free memory for staging (avoids org quota exceeded)

* Fix cf run-task syntax: add --command flag for migrations

* Skip WidgetRenderer load during migrations - library not built in task droplet

* Fix Rack::Attack test: add valid submission params to avoid 400 errors

* Temporarily disable pre-deploy migrations to unblock deployment

* Fix widget_renderer initializer - use simpler skip detection logic

* Fix deployment: restore db:migrate in manifest and enable migrations in deploy script

* Revert migrations to start command - Rust library not available in cf tasks

* Build Rust library at runtime in .profile.d script

* Revert to working widget_renderer script that copies prebuilt library

* Keep prebuilt Rust library during deployment to ensure correct linking with CF Ruby installation

* Fix deploy-sidekiq.sh: remove explicit buildpack flags to avoid re-installing Rust

- Remove -b flags from cf push in deploy-sidekiq.sh
- Let CF auto-detect buildpacks from app metadata like deploy.sh does
- Prevents unnecessary reinstallation of Rust during staging
- Matches web deployment behavior for consistency

* Fix touchpoints.yml: comment out buildpacks to prevent Rust reinstallation

- Comment out explicit buildpacks in manifest
- Let CF auto-detect buildpacks from app metadata
- Prevents re-running supply phase (Rust installation) during deployment
- Rust library already built in CircleCI via bundle install (gem extensions)
- Matches deploy-sidekiq.sh approach for consistency

* Fix flaky timing test in submission_digest mailer spec

Use a single time_threshold let variable to ensure consistent timestamp
comparison instead of calling days_ago.days.ago multiple times which can
result in 1-second differences in CI environments.

* Fix custom-button-modal USWDS initialization

- Add try/catch error handling
- Add conditional checks for fbaUswds methods
- Initialize custom button element for custom-button-modal
- Prevents modal from appearing visible at page bottom

* Bump Cargo version to force Rust rebuild

* Bump widget_renderer version to force Cargo rebuild

* Force cargo clean before build to ensure recompilation

* Bump widget_renderer gem version to 0.1.2 to force rebuild

* Force Rust rebuild with BUILD_ID and version bump

* Bump widget_renderer to 0.1.2 to force CF to rebuild native extension

* Update Gemfile.lock for widget_renderer 0.1.2

* Add Rust library verification before CF push

* Prioritize workspace-level Rust library and bump to 0.1.3

* Invalidate CircleCI cargo cache to force fresh Rust build

* Fix Rust widget renderer modal button initialization

The Rust widget renderer was missing the USWDS Modal initialization
for the #fba-button element used in 'modal' delivery method forms.
This caused the toast/feedback button to not open the modal when clicked,
instead rendering the form inline at the bottom of the page.

Added initialization for #fba-button to match the ERB template behavior
in _fba.js.erb (lines 858-875).

Fixes Zendesk ticket #37620

* Fix breaking changes from PR review

1. Fix modal_class prefix logic: Now respects load_css setting
   - When load_css=true: uses 'fba-usa-modal' prefix
   - When load_css=false: uses 'usa-modal' (no prefix)
   - Matches ERB template behavior in _fba.js.erb line 110

2. Fix CSS backtick escaping: Added escape for backticks in CSS
   - Prevents JavaScript syntax errors when CSS contains backticks
   - CSS is inserted into JS template literals using backticks

3. Remove expired certificate file: tmp_expired_login_gov_cert.pem
   - Certificate expired Aug 2023
   - Added *.pem to .gitignore to prevent future accidental commits
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants