This repository has been archived by the owner on Oct 22, 2020. It is now read-only.
Problem / Motivation
The data hub ideally provides a centralized, integrated, and cleansed data repository for the various departmental/team/siloed data. The dataset/data entities managed in the repo are either data currently being used for known analytics requirements or data which we assess to be potentially useful for some future known/unknown analytics requirement.
WHO wants this? Data Hub Team
WHEN do they want it? ASAP. The data hub should have implemented a system using xPub as its data source by the time the eLife system finally migrates. In addition, the editorial team has provided us with a list of data requirements, which require data that is probably present only in xPub.
WHAT do they want? Access From xPub / Libero Reviewer To Data Hub
WHY do they want this?
Based on the current eJP system, some of the entities we manage in the repo include:
manuscript
manuscript versions
manuscript reviews
generic persons (authors, reviewers, editors etc)
person roles
Specific person types
reviewer information
early career researchers information
editors
Section of the data schema for the manuscript data is shown below.
Data Access Requirements
Data
ability to access the data in the Libero Reviewer database to satisfy our current dataset requirements (as shown above)
ability to quickly/easily/flexibly add datasets for new entity types, e.g. financial data, because the finance department requires some analytics
ability to do both full and incremental data loads (ability to sort and filter by time of last update/insert/delete)
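The full versus incremental load requirement above could be sketched roughly as follows. This is only an illustration under stated assumptions: the `manuscript` table, its `updated_at` column, and the use of an in-memory SQLite database are all hypothetical stand-ins, not the actual Libero Reviewer schema or database engine.

```python
import sqlite3


def full_load(conn):
    """Return every row, sorted by last-modified time (hypothetical schema)."""
    return conn.execute(
        "SELECT id, title, updated_at FROM manuscript ORDER BY updated_at, id"
    ).fetchall()


def incremental_load(conn, watermark):
    """Return only rows modified after the watermark, plus the new watermark."""
    rows = conn.execute(
        "SELECT id, title, updated_at FROM manuscript "
        "WHERE updated_at > ? ORDER BY updated_at, id",
        (watermark,),
    ).fetchall()
    # Keep the old watermark when nothing changed since the last load.
    new_watermark = rows[-1][2] if rows else watermark
    return rows, new_watermark


def demo():
    # In-memory database with made-up rows, standing in for the real source.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE manuscript (id INTEGER PRIMARY KEY, title TEXT, updated_at TEXT)"
    )
    conn.executemany(
        "INSERT INTO manuscript VALUES (?, ?, ?)",
        [(1, "A", "2020-01-01T00:00:00"), (2, "B", "2020-01-02T00:00:00")],
    )
    everything = full_load(conn)
    watermark = everything[-1][2]

    # Simulate one update and one insert after the initial full load.
    conn.execute(
        "UPDATE manuscript SET title = ?, updated_at = ? WHERE id = ?",
        ("A2", "2020-01-03T00:00:00", 1),
    )
    conn.execute(
        "INSERT INTO manuscript VALUES (?, ?, ?)", (3, "C", "2020-01-04T00:00:00")
    )
    changed, watermark = incremental_load(conn, watermark)
    return everything, changed, watermark
```

Note that a timestamp filter like this only picks up inserts and updates. Hard-deleted rows would not appear in the query, so detecting deletes would need something extra on the source side, such as a deletion marker or a change log.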
Data Load Frequency
Presently daily for most datasets, due to the limitations imposed by eJP. However, I imagine having some sub-daily data loading (perhaps hourly?).
Possible nice-to-have: message streaming using WebSockets or a pub-sub system to listen for new/deleted/updated entities in the database. THIS IS NOT A REQUIREMENT.
Proposed solution
Possible solutions
Provide/Use direct access to the database
Provide some web API for accessing the entities - REST / GraphQL
Use a messaging system to listen for new/deleted/updated entities
BTW, these days I am leaning towards direct access to the database, as it will give the data hub team the flexibility and ease of defining data sources independently of the xPub team. However, we would be tightly coupled to the database/data schema definition used by Libero Reviewer.
/cc @de-code @hdrury1