General comment: let's do "README" (docs) driven development here.
- [show] Local functionality for Frictionless datasets with CSV #528
- Move new work (portal-experiment) into portal.js and refactor datopian/portal.js.bak#59
- [show] Uber Epic covering all functionality (see below)
- [show] README only + data datasets (don’t have to be frictionless)
- (?) Graphs directly in README with, say, visdown …
- [show] SQL interface to the data (alasql or sql.js … https://github.com/agershun/alasql/wiki/Performance-Tests)
- file/resource subpages ... (for datasets with lots of resources)
- Docs 80% analysed #
- Create portal components and library i.e. have a Table, Graph, Dataset component
- publish to @datopian/portal
- Examples
- Catalog functionality 20% analysed
- Elegant
- Description (README/Description)
- Data preview and exploration (for tabular data)
- Basic: some sample data shown
- Data exploration v1: filterable
- Data exploration v2: can do SQL etc ...
- Graphs / visualization
- Validation: this row does not match schema in column X
- Summarization e.g. this column has this range of values, this average value, this number of nulls
- Frictionless
- Plain README (with frontmatter)
- README (no frontmatter) and LICENSE file (?)
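
The validation and summarization items above can be sketched in plain JavaScript. This is a minimal illustration, not Portal.js code: `summarizeColumn`, `validateRows`, the return shapes, and the tiny `{ name, type }` schema (loosely modelled on Table Schema fields) are all hypothetical, and rows are assumed to be objects of string values as a CSV parser would typically produce.

```javascript
// Hypothetical helpers for illustration only; not part of Portal.js.

// Treat null, undefined and empty strings as missing values.
const isMissing = (v) => v === null || v === undefined || v === "";

// Summarize one column: range of values, average value, number of nulls.
function summarizeColumn(rows, column) {
  const values = rows.map((row) => row[column]);
  const nulls = values.filter(isMissing).length;
  const numbers = values
    .filter((v) => !isMissing(v))
    .map(Number)
    .filter((n) => !Number.isNaN(n));
  if (numbers.length === 0) {
    return { column, nulls, min: null, max: null, mean: null };
  }
  const sum = numbers.reduce((a, b) => a + b, 0);
  return {
    column,
    nulls,
    min: numbers.reduce((a, b) => Math.min(a, b)),
    max: numbers.reduce((a, b) => Math.max(a, b)),
    mean: sum / numbers.length,
  };
}

// Validate rows against a minimal schema: [{ name, type }, ...].
// Only "integer", "number" and "string" are handled in this sketch.
function validateRows(rows, fields) {
  const checks = {
    integer: (v) => Number.isInteger(Number(v)),
    number: (v) => !Number.isNaN(Number(v)),
    string: (v) => typeof v === "string",
  };
  const errors = [];
  rows.forEach((row, index) => {
    for (const { name, type } of fields) {
      const value = row[name];
      if (isMissing(value)) continue; // missing values are reported by the summary instead
      const check = checks[type];
      if (check && !check(value)) {
        errors.push({
          row: index + 1, // 1-based row number for display
          column: name,
          message: `Row ${index + 1} does not match schema in column "${name}" (expected ${type})`,
        });
      }
    }
  });
  return errors;
}
```

Calling `summarizeColumn(rows, "price")` returns one summary object per column, and `validateRows(rows, fields)` returns a list of per-cell errors that a table component could highlight.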
Data has roughly two relevant dimensions:
- Format
  - CSV
  - xlsx
  - JSON
  - ...
- Size
  - Small: < 5 MB (can just load inline ...)
  - Medium: < 100 MB
  - Large: < 5 GB
  - xlarge: > 5 GB
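
As a rough sketch, the two dimensions above could be derived from a resource descriptor like this. The function names are hypothetical; the thresholds follow the size buckets listed above, with the assumptions that sizes are binary (1 MB = 1024 × 1024 bytes) and that exactly 5 GB counts as xlarge. `path`, `format` and `bytes` are assumed to be present on the resource in the way Frictionless Data Resources usually carry them.

```javascript
// Hypothetical helpers for illustration only; not part of Portal.js.
const MB = 1024 * 1024; // assumption: binary megabytes
const GB = 1024 * MB;

// Map a byte count onto the size buckets from the notes above.
function classifySize(bytes) {
  if (typeof bytes !== "number" || Number.isNaN(bytes)) return "unknown";
  if (bytes < 5 * MB) return "small"; // can just be loaded inline
  if (bytes < 100 * MB) return "medium";
  if (bytes < 5 * GB) return "large";
  return "xlarge";
}

// Prefer an explicit `format`, otherwise fall back to the file extension.
function detectFormat(resource) {
  if (resource.format) return String(resource.format).toLowerCase();
  const name = String(resource.path || "").split("/").pop();
  const dot = name.lastIndexOf(".");
  return dot > 0 ? name.slice(dot + 1).toLowerCase() : "unknown";
}

// Combine both dimensions for a single resource.
function describeResource(resource) {
  return {
    format: detectFormat(resource),
    size: classifySize(resource.bytes),
  };
}
```

A preview component could then branch on the result, for example loading `small` CSV files inline and deferring anything `large` or bigger.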
TODO: How does show/build work with remote files, e.g. a resource with `path: abc.csv` and `remote_storage_url: s3://.../.../.../`?
Options:
- We clone the data into path locally ...
- Possible problem if data is big ...
- Load data directly from remote_storage_url (as long as it supports CORS)
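
One way to make the options above concrete is a small decision helper. This is only a sketch under stated assumptions, not a decision that has been taken: `chooseLoadStrategy`, the `corsOk` flag, the `maxCloneBytes` default and the strategy names are all hypothetical. It also assumes that only `http(s)` URLs can be loaded directly in the browser, so an `s3://` URL always falls through to the clone path.

```javascript
// Hypothetical helper for illustration only; not part of Portal.js.
function chooseLoadStrategy(
  resource,
  { corsOk = false, maxCloneBytes = 100 * 1024 * 1024 } = {}
) {
  const remote = resource.remote_storage_url;

  // No remote storage: the data already lives at `path`.
  if (!remote) {
    return { strategy: "local", source: resource.path };
  }

  // Option: load directly from the remote URL (needs http(s) and CORS).
  const isHttp = /^https?:\/\//.test(remote);
  if (isHttp && corsOk) {
    return { strategy: "remote", source: remote };
  }

  // Option: clone the data into `path` locally, but only if it is known
  // to be small enough (the "possible problem if data is big" case).
  if (typeof resource.bytes === "number" && resource.bytes <= maxCloneBytes) {
    return { strategy: "clone", source: remote, target: resource.path };
  }

  // Too big or unknown size: left open, as in the TODO above.
  return { strategy: "undecided", source: remote };
}
```

The `undecided` branch is deliberate: it marks exactly the case the TODO leaves open, rather than guessing at an answer.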
Portal.js is a React- and NextJS-based framework for building dataset and resource pages and catalogs. It consists of:
- React components for data portal functionality, e.g. data tables, graphs, dataset pages, etc.
- Tooling to load data (based on Frictionless)
- Template sites you can reuse using create-next-app
- Single dataset micro-site
- GitHub-backed catalog
- CKAN-backed catalog
- ...
- Local development environment
- Deployment integration with DataHub.io
In summary, technically Portal.js is: NextJS + data-specific React components + data loading glue (mostly using frictionless-js).