Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixes #4897: Follow up alpha MR6 fixes #4896

Merged
merged 60 commits into from
Mar 10, 2023
Merged

Conversation

BenHenning
Copy link
Member

@BenHenning BenHenning commented Mar 10, 2023

Explanation

Fixes #4897

Overview of changes

This PR follows up from the changes introduced in #4846 by adding new functionality to help improve event reliability investigations, as well as telemetry diagnostics for research project facilitators that need to ensure events have been successfully uploaded for processing. To help with this, the following features have been introduced:

  • A new "force upload" button that immediately uploads any pending cached events.
  • Event counts to show both the number of events waiting to be uploaded & the events that are successfully uploaded.
  • An improvement on the 'share' button to include a Base64 deflated binary representation of the entire event log store on the device.

Leading up to, and during, development, a number of theoretical issues were encountered that were also fixed in case they are playing a role in some of the ongoing concerns around Firebase event delivery reliability:

  • The sync status reported on the learner analytics screen can be wrong in some cases, leading to the assumption that events have been uploaded when they may not have been.
  • Events can, in some cases, be lost from previous app instances if they haven't been uploaded yet and, in other cases, duplicated. The former is much more likely to happen in practice, particularly with the connectivity patterns being used in the research project.

Technical specifics

  • FakeSyncStatusManager was renamed to TestSyncStatusManager since it's actually augmenting the production implementation of SyncStatusManager, not replacing it.
  • Events are now being cached indefinitely if the learner study feature is enabled. This allows for some robustness improvements (see below), and the ability to share the events themselves if the team suspects events are not arriving to Firebase.
  • The issue of sync status being incorrect came from an unreliable means of reporting sync status during different event scenarios. Specifically, if the LogUploadWorker doesn't kick off and there's network connectivity, a new event being logged could trigger the sync status to become "data uploaded" even though there are still pending events. The new solution is to leverage the now-tracked uploaded events to determine whether there are still events yet waiting to upload. In this way, the sync status should always be the most up-to-date information with one caveat: network connectivity changes won't reflect until a new log is requested (this is a limitation in how the app tracks network connectivity, and it didn't seem very important to fix it).
  • The issue of possible event duplication comes from PersistentCacheStore using mergeFrom with the current cached state and the on-disk proto being loaded. I originally thought that all pathways to loading the proto from disk could only execute once, but there's at least one pathway where it's possible to chain multiple calls into the cache store such that the load happens twice (and both results end up being merged together, hence the event duplication). This has been fixed by never reloading the cache store if it's already been loaded which should always be correct.
  • The issue if possible events being lost comes from AnalyticsController previously not priming the event log cache. This means that if a new event is logged before the cache is read (e.g. early events like "open profile screen" can race against LogUploadWorker kicking off on app startup) then the cache will be completely reset and, once written, will overwrite any events currently cached on disk. This has been fixed by ensuring that the event log cache has been primed before any interactions with it (read or write) occur.

Note that all of the above has been thoroughly tested through automated testing to ensure that these issues have been properly fixed, and stay that way in the long-term.

Test notes & exemptions

  • In order to simplify testing SyncStatusManagerImpl and TestSyncStatusManager, they ought to both pass a common test suite specific to SyncStatusManager itself. This has been done by introducing a SyncStatusManagerTestBase class that's inherited by both implementation test suites so that they inherit common API tests. This pattern is used in one other location in the codebase currently:
  • CircularProgressIndicatorAdaptersTestActivity has been exempted for an a11y label for the same reason as all other test activities: it technically doesn't need one.
  • TestSyncStatusManagerTest has been allow-listed to use parameterized tests since it's aggressively verifying many different force/override sync status cases. Making sure that the test utility works correctly is extremely important to avoid false failures/passes of dependent tests (which are, in turn, guarding against regressions for the core issues covered by this PR, among other event system behaviors).
  • SyncStatusManagerTestBase has been exempted from requiring KDocs since it's technically a test suite, so KDocs aren't required.
  • A number of classes have been exempted from having new test suites as per existing conventions and requirements: ControlButtonsViewModel (renamed from ShareIdsViewModel), CircularProgressIndicatorAdaptersTestActivity (simple test activities don't require their own tests), CircularProgressIndicatorAdaptersTestViewModel (view models aren't tested directly), SyncStatusManagerTestBase (the test base is more or less a test suite itself, so it doesn't make sense for it to have tests as well).
  • TestSyncStatusManagerTest has been disabled in Gradle since it depends on SyncStatusManagerTestBase which is part of a different module. To make this work, the test base would have to be moved to src/main of the testing module which isn't a very clean approach. The current setup works fine with Bazel, and longer term this will get cleaner by moving testing utilities like TestSyncStatusManager to be near their corresponding production implementations (we just can't do this today because of Gradle limitations).
  • Due to the asynchronous and data race nature of some of the issues discovered during test writing, changed & added tests were sometimes found to be flaky (and, in some cases, very rarely, e.g. 1/50). To ensure that all of the new & changed tests are stable, I made sure to run all 8 affected test suites 128 times to ensure stability. Here are the results of that run (buildbuddy link):
    128xruns_followup_mr6_fixes

Essential Checklist

  • The PR title and explanation each start with "Fix #bugnum: " (If this PR fixes part of an issue, prefix the title with "Fix part of #bugnum: ...".)
  • Any changes to scripts/assets files have their rationale included in the PR explanation.
  • The PR follows the style guide.
  • The PR does not contain any unnecessary code changes from Android Studio (reference).
  • The PR is made from a branch that's not called "develop" and is up-to-date with "develop".
  • The PR is assigned to the appropriate reviewers (reference).

For UI-specific PRs only

Following is a video demonstrating the new functionality introduced in this PR:

oppia_follow_up_mr6_fixes_ui_vid.mp4

Note that affected UI tests that could run on Espresso were not verified in this PR similar for the reasons listed in #4846 (specifically, the ongoing issue of Espresso tests being time consuming to investigate, often broken, and this PR being high priority).

Localization & accessibility changes aren't relevant for this change, and any tablet-specific issues aren't important since the affected screen is only expected to be accessed on handset devices.

This fixes a bunch of Swahili string bugs by stripping out unnecessary
prefixed whitespace.

This also introduces a fast language switch bar between English and
Swahili. It's not currently tested or gated behind a platform flag, but
both of these will be changed if the team decides to proceed with this
solution.
This adds support for opening concept cards from hints, solutions, and
other concept cards.
Conflicts:
	app/src/main/java/org/oppia/android/app/player/state/StateFragmentPresenter.kt
	app/src/main/res/layout/state_fragment.xml
	app/src/main/res/values-sw/strings.xml
	app/src/main/res/values/untranslated_strings.xml
Fixes:
- Removed extra spacing around a few Swahili strings (all should now be
fixed).
- Replaced 'reveal' English wording with 'show'.
- Added a new event for logging voiceover pausing.
- Updated the play/pause voiceover events to include language codes.

Documentation and/or tests may not yet be completed.
Conflicts:
	app/src/main/java/org/oppia/android/app/hintsandsolution/HintsAndSolutionDialogFragmentPresenter.kt
	app/src/main/java/org/oppia/android/app/player/exploration/ExplorationActivity.kt
	app/src/main/java/org/oppia/android/app/topic/conceptcard/ConceptCardFragmentPresenter.kt
	domain/src/main/assets/test_exp_id_2.json
	domain/src/main/assets/test_exp_id_2.textproto
With the learner study parameter enabled, admins can now force-complete
lessons on a per-profile basis for all profiles in the app. This
leverages the existing developer-only options menu so this isn't
something we would want to incorporate in the long-term without more
refinement (none of the UI strings are even translated).
This also adds Swahili translations for test topic 0. Some style changes
including support for a secondary button type. Updated the in-lesson
language switcher to use this style & to be gated behind the learner
study platform parameter.

And, fixes hint & solution header expanding to be the whole header and
not the whole card.
This also includes a rebuild of much of the hint/solution data flow
since the previous version had a lot of unnecessary complexity. It also
redefines the internal representation for solution answers (which makes
them type-safe during data retrieval time).
(The intent of this issue is being tracked separately and will be
addressed on this branch in a follow-up commit).
Conflicts:
	app/src/main/res/drawable/secondary_button_background.xml
	app/src/main/res/layout/state_fragment.xml
	app/src/main/res/values/component_colors.xml
	app/src/main/res/values/styles.xml
This includes:
- Post-merge fixes
- A change to make the switch lang button container transparent so that
the continue & submit buttons are still clickable through it
- A fix to work around a databinding bug that can sometimes cause a
crash that would be 100% reproducible for a specific build of the app
(since it's a bytecode-level issue)
Also, clean up component_colors.xml (noticed during PR self-review) and
miscellaneous other code health adjustments.
Also, fix a few issues that the new tests discovered.
- Fixes a test broken in Gradle.
- Repurposes a now unused color for the new secondary button style.
- Fixes landscape support in profile_edit_fragment for the new modify
progress flow.
This includes:
- Fixing image sizing issues
- Fixing bullet/numbered list formatting issues
- Moving the in-lesson Swahili language switcher to also be behind an
admin-configurable per-profile setting
- Logging a new event for when the in-lesson language switch button is
clicked
- Logging a new event for when the user exits a revision card
- Updating all events to include app/content/audio language settings for
the user
- Updating the in-lesson language switch button to only show if the
content of a card is available in Swahili
- Updating the switch-to-Swahili button's text to be in Swahili when the
rest of the strings are in English
- Fixing an issue with session ID to ensure it gets reset properly upon
profile login
- Fixing an issue where switching between tabs in the topic screen
wasn't actually logging events for the switches
- Introducing a new share button in the learner analytics screen to make
it easier to record per-device IDs

Miscellaneous other refactors and/or fixes were also done, as needed,
and per feedback.
Also, bump version codes to prepare for another Kenya-specific alpha
re-release.
@BenHenning BenHenning changed the title Follow up alpha MR6 fixes [Blocked: #4846] Fixes #4897: Follow up alpha MR6 fixes Mar 10, 2023
@BenHenning BenHenning marked this pull request as ready for review March 10, 2023 10:18
@BenHenning BenHenning requested a review from rt4914 as a code owner March 10, 2023 10:18
@BenHenning
Copy link
Member Author

BenHenning commented Mar 10, 2023

@seanlip PTAL at the PR as a whole since you have more context on the chain of changes leading up to this, and the nature of the problems being fixed.

@rt4914 OR @MohitGupta121 PTAL at the UI changes (I'll follow up via chat to see which one of you might have availability).

@adhiamboperes PTAL specifically at the analytics related changes, but feel free to look beyond that if you have time.

The only CI failure at the moment is our standard out-of-memory Gradle flake which I'll restart after I wake up next. I'll also address any open comments, but please err on the side of approving the PR if you only have small concerns (since I'd really like to merge this in order to simplify the release this is tied to since it needs to be cherry-picked).

Please let me know if you have any specific concerns.

Edit: Also, I have self-reviewed the entire PR.

Copy link
Member

@MohitGupta121 MohitGupta121 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@BenHenning LGTM, I test on my emulator and from the video mention in PR looks perfect from UI side. Also pass review on the .xml files code. (But ignore the dark mode side, that I'll do in my revisit of all xml files for darkmode)
Thanks.

@oppiabot
Copy link

oppiabot bot commented Mar 10, 2023

Unassigning @MohitGupta121 since they have already approved the PR.

Copy link
Member

@seanlip seanlip left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@BenHenning This looks pretty good to me, thanks. Left some small notes but nothing blocking!

@seanlip seanlip assigned BenHenning and unassigned seanlip Mar 10, 2023
Copy link
Collaborator

@adhiamboperes adhiamboperes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have mainly focused on the analytics bit, it LGTM.

@adhiamboperes adhiamboperes removed their assignment Mar 10, 2023
@BenHenning
Copy link
Member Author

Thanks @MohitGupta121 @adhiamboperes and @seanlip!

@seanlip I resolved your conversation threads since you approved, but feel free to go back over them. If you have any follow-up concerns or thoughts, I'll address them in a follow-up PR. Note that I didn't make changes per #4896 (comment) since it wasn't clear to me on what we should actually do here, so that one might be worth looking at in particular.

@rt4914 I think you might not have time to review this on short notice, so I'll mark this as a PR to follow-up review on just to make sure everything looks good for codeowners from your perspective.

@BenHenning
Copy link
Member Author

Force merging since @rt4914 won't be able to review this before we need the changes in as it's time-sensitive. Will ask for a post-merge approval if it's needed.

@BenHenning BenHenning merged commit 1addc8f into develop Mar 10, 2023
@BenHenning BenHenning deleted the follow-up-alpha-mr6-fixes branch March 10, 2023 22:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Follow-up issues & fixes for Kenya user study research project
5 participants