feat: add compact mode, CSL types, and type checking (#124)
- standardize publication types by adopting universal CSL publication types
- add compact mode (`--compact`) to generate minimal output (strips comments, line breaks, and empty keys)
- make input (bibtex) and output (publication folder) positional args rather than options (`--...`) as they are always required
- add Python static type checking with pyright (to run: `make type`)
- improve documentation (more docstrings and add easier installation method with pipx to Readme)
gcushen authored Oct 2, 2023
1 parent 00d38b2 commit 9b7164a
Showing 13 changed files with 269 additions and 157 deletions.
5 changes: 4 additions & 1 deletion Makefile
@@ -1,4 +1,4 @@
.PHONY: black lint test publish
.PHONY: black lint test type publish

format:
	poetry run isort --profile black .
@@ -10,5 +10,8 @@ lint:
test:
	poetry run pytest

type:
	poetry run pyright

publish:
	poetry publish --build --dry-run
62 changes: 31 additions & 31 deletions README.md
@@ -18,40 +18,37 @@
**Community**

- 📚 [View the **documentation**](https://wowchemy.com/docs/content/publications/#import-from-bibtex) and usage guide below
- 💬 [Chat with the **Wowchemy community**](https://discord.gg/z8wNYzb) or [**Hugo community**](https://discourse.gohugo.io)
- 💬 [Chat with the **community**](https://discord.gg/z8wNYzb)
- 🐦 Twitter: [@wowchemy](https://twitter.com/wowchemy) [@GeorgeCushen](https://twitter.com/GeorgeCushen) [#MadeWithAcademic](https://twitter.com/search?q=(%23MadeWithWowchemy%20OR%20%23MadeWithAcademic)&src=typed_query)

**❤️ Support this open-source software**

To help us develop this converter tool and the associated Wowchemy software sustainably under the MIT license, we ask all individuals and businesses that use it to help support its ongoing maintenance and development via sponsorship and contributing.
To help us develop this converter tool and the associated Wowchemy open source software sustainably under the MIT license, we ask all individuals and businesses that use it to help support its ongoing maintenance and development via sponsorship and contributing.

Support development of the Academic CLI:
Support this open science movement:

- ⭐️ [**Star** this project on GitHub](https://github.com/wowchemy/bibtex-to-markdown)
- ❤️ [Become a **GitHub Sponsor** and **unlock perks**](https://github.com/sponsors/gcushen)
- ☕️ [**Donate a coffee**](https://github.com/sponsors/gcushen)
- 👩‍💻 [**Contribute**](#contribute)

## Prerequisites
## Installation

1. Install [Python 3.11+](https://realpython.com/installing-python/) if it’s not already installed
Open your **Terminal** or **Command Prompt** app and enter one of the installation commands below.

### For Building a Website with Hugo (Optional)
### With Pipx

1. Create a [Hugo](https://gohugo.io) website such as by using the [Hugo Academic Starter](https://github.com/wowchemy/starter-hugo-academic) template for the [Wowchemy](https://wowchemy.com) website builder
1. [Download your site from GitHub, installing Hugo and its dependencies](https://wowchemy.com/docs/getting-started/install-hugo-extended/)
1. [Version control](https://guides.github.com/introduction/git-handbook/#version-control) your website
   - Ideally, version control your site with [Git](http://rogerdudler.github.io/git-guide/) so that you can review the proposed changes and accept or reject them without risking breaking your site
   - Otherwise, if not using Git, **backup your site folder** prior to running this tool
For the **easiest** installation, install with [Pipx](https://pypa.github.io/pipx/):

## Installation
    pipx install academic

Open your Terminal or Command Prompt app and install the Academic CLI tool:
Pipx will **automatically install the required Python version for you** in a dedicated environment.

    pip3 install -U academic

Or, help test the latest development version:
### With Pip

    pip3 install -U git+https://github.com/wowchemy/hugo-academic-cli.git
To install using Python's Pip tool, ensure you have [Python 3.11+](https://realpython.com/installing-python/) installed and then run:

    pip3 install -U academic

## Usage

@@ -61,28 +58,24 @@ Use the `cd` command to navigate to the folder containing your Bibtex file:

    cd <MY_BIBTEX_FOLDER>

**Import publications:**

Say we downloaded our publications from our reference manager, such as Zotero, to a file named `my_publications.bib` within the website folder. We can import them into the default `content/publication/` folder with:
### Import publications

    academic import --bibtex my_publications.bib
Say we downloaded our publications to a file named `my_publications.bib` within the website folder. Let's import them into the `content/publication/` folder:

**Import publications to a specific folder (e.g. `content/zh/publication`):**

If our site has multiple languages, we may want to output the publications to a specific folder with:

    academic import --bibtex my_publications.bib --publication-dir content/zh/publication/
    academic import my_publications.bib content/publication/ --compact

Optional arguments:

* `--publication-dir PUBLICATION_DIR` Folder to import publications to (defaults to `content/publication`)
* `--compact` Generate minimal markdown without comments or empty keys
* `--overwrite` Overwrite any existing publications in the output folder
* `--normalize` Normalize tags by converting them to lowercase and capitalizing the first letter (e.g. "sciEnCE" -> "Science")
* `--featured` Flag these publications as *featured* (to appear in *Featured Publications* widget)
* `--featured` Flag these publications as *featured* (to appear in your website's *Featured Publications* section)
* `--verbose` or `-v` Show verbose messages
* `--help` Help
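
As an illustration only (this is not taken from the tool's source), the tag normalization described for `--normalize` could be sketched as below. The function name `normalize_tag` is hypothetical, and the real implementation may handle edge cases differently.

```python
def normalize_tag(tag: str) -> str:
    """Lowercase the tag, then capitalize its first letter."""
    # `str.capitalize()` upper-cases the first character and lower-cases the rest;
    # the explicit `lower()` simply mirrors the wording of the option's description.
    return tag.lower().capitalize()


print(normalize_tag("sciEnCE"))  # prints "Science"
```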

After importing publications, [a full text PDF and image can be associated with each item and further details added via extra parameters](https://wowchemy.com/docs/content/publications/).
### Import full text and cover image

After importing publications, [a full text and image can be associated with each item and further details added via extra parameters](https://wowchemy.com/docs/content/publications/).

## Contribute

@@ -97,13 +90,20 @@ For local development, clone this repository and use Poetry to install and run t
    git clone https://github.com/wowchemy/bibtex-to-markdown.git
    cd bibtex-to-markdown
    poetry install
    poetry run academic import --bibtex=tests/data/article.bib --publication-dir=debug --overwrite
    poetry run academic import tests/data/article.bib output/ --overwrite --compact

Preparing a contribution:
When preparing a contribution, run the following checks and ensure that they all pass:

- Lint: `make lint`
- Format: `make format`
- Test: `make test`
- Type check: `make type`
### Help beta test the dev version

You can help test the latest development version by installing the latest `main` branch from GitHub:

    pip3 install -U git+https://github.com/wowchemy/bibtex-to-markdown.git

## License

35 changes: 14 additions & 21 deletions academic/cli.py
@@ -3,7 +3,6 @@
import argparse
import importlib.metadata
import logging
import os
import sys
from argparse import RawTextHelpFormatter

@@ -18,12 +17,9 @@
log = logging.getLogger(__name__)


class AcademicError(Exception):
    pass


def main():
    parse_args(sys.argv[1:])  # Strip command name, leave just args.
    # Strip command name (currently `academic`) and feed arguments to the parser
    parse_args(sys.argv[1:])


def parse_args(args):
@@ -38,17 +34,12 @@ def parse_args(args):
    subparsers = parser.add_subparsers(help="Sub-commands", dest="command")

    # Sub-parser for import command.
    parser_a = subparsers.add_parser("import", help="Import data into Academic")
    parser_a.add_argument("--bibtex", required=False, type=str, help="File path to your BibTeX file")
    parser_a.add_argument(
        "--publication-dir",
        required=False,
        type=str,
        default=os.path.join("content", "publication"),
        help="Path to import publications to (default `content/publication`)",
    )
    parser_a = subparsers.add_parser("import", help="Import content into your website or book")
    parser_a.add_argument("input", type=str, help="File path to your BibTeX file")
    parser_a.add_argument("output", type=str, help="Path to import publications to (e.g. `content/publication/`)")
    parser_a.add_argument("--featured", action="store_true", help="Flag publications as featured")
    parser_a.add_argument("--overwrite", action="store_true", help="Overwrite existing publications")
    parser_a.add_argument("--compact", action="store_true", help="Generate minimal markdown")
    parser_a.add_argument(
        "--normalize",
        action="store_true",
@@ -71,17 +62,19 @@ def parse_args(args):
        parser.exit()
    else:
        # The command has been recognised, proceed to parse it.
        if known_args.command and known_args.verbose:
            # Set logging level to debug if verbose mode activated.
            logging.getLogger().setLevel(logging.DEBUG)
        elif known_args.command and known_args.bibtex:
        if known_args.command:
            if known_args.verbose:
                # Set logging level to debug if verbose mode activated.
                logging.getLogger().setLevel(logging.DEBUG)

            # Run command to import bibtex.
            import_bibtex(
                known_args.bibtex,
                pub_dir=known_args.publication_dir,
                known_args.input,
                pub_dir=known_args.output,
                featured=known_args.featured,
                overwrite=known_args.overwrite,
                normalize=known_args.normalize,
                compact=known_args.compact,
                dry_run=known_args.dry_run,
            )
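
To make the effect of switching from `--bibtex` / `--publication-dir` options to positional `input` / `output` arguments concrete, here is a minimal, self-contained sketch. It is an assumption-laden illustration rather than the project's `parse_args`: the helper name `build_parser` is hypothetical, and only a subset of the flags is reproduced.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a small parser that mirrors the positional layout introduced in this commit."""
    parser = argparse.ArgumentParser(prog="academic")
    subparsers = parser.add_subparsers(dest="command")
    importer = subparsers.add_parser("import")
    # Positional arguments are required by default, so no `required=` keyword is needed.
    importer.add_argument("input", type=str)
    importer.add_argument("output", type=str)
    # `store_true` flags default to False when they are omitted.
    importer.add_argument("--compact", action="store_true")
    importer.add_argument("--overwrite", action="store_true")
    return parser


args = build_parser().parse_args(["import", "my_publications.bib", "content/publication/", "--compact"])
print(args.input, args.output, args.compact, args.overwrite)
# prints: my_publications.bib content/publication/ True False
```

With this layout, omitting either positional argument makes `argparse` print a usage error and exit, which is presumably why the explicit check on `known_args.bibtex` could be dropped.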

94 changes: 82 additions & 12 deletions academic/generate_markdown.py
@@ -1,52 +1,122 @@
from pathlib import Path

from ruamel.yaml import YAML
import ruamel.yaml

yaml = YAML()

class GenerateMarkdown:
    """
    Load a Markdown file, enable its YAML front matter to be edited (currently, directly via `self.yaml[...]`), and then save it.
    """

class EditableFM:
    def __init__(self, base_path: Path, delim: str = "---", dry_run: bool = False):
    def __init__(self, base_path: Path, delim: str = "---", dry_run: bool = False, compact: bool = False):
        """
        Initialise the class.
        Args:
            base_path: the folder to save the Markdown file to
            delim: the front matter delimiter, i.e. `---` for YAML front matter
            dry_run: whether to actually save the output to file
            compact: whether to strip comments, line breaks, and empty keys from the generated Markdown
        """
        self.base_path = base_path
        if delim != "---":
            raise NotImplementedError("Currently, YAML is the only supported front-matter format.")
        self.delim = delim
        self.fm = []
        self.yaml = {}
        self.content = []
        self.path = ""
        self.dry_run = dry_run
        self.compact = compact
        # We use Ruamel's default round-trip loading to preserve key order and comments, rather than `YAML(typ='safe')`
        self.yaml_parser = ruamel.yaml.YAML()

    def load(self, file: Path):
        self.fm = []
        """
        Load the Markdown file to edit.
        Args:
            file: the Markdown filename to load. By default, it will be a copy of the Markdown template file saved to the output folder.
        Returns: n/a - directly saves output to `self.yaml`
        """
        front_matter_text = []
        self.yaml = {}
        self.content = []
        self.path = self.base_path / file
        if self.dry_run and not self.path.exists():
            self.fm = dict()
            self.yaml = dict()
            return

        with self.path.open("r", encoding="utf-8") as f:
            lines = f.readlines()

        # Detect both the YAML front matter and the Markdown content in the template
        delims_seen = 0
        for line in lines:
            if line.startswith(self.delim):
                delims_seen += 1
            else:
                if delims_seen < 2:
                    self.fm.append(line)
                else:
                    front_matter_text.append(line)
                # In Compact mode, we don't add any placeholder content to the page
                elif not self.compact:
                    # Append any Markdown content from the template body (after the YAML front matter)
                    self.content.append(line)

        # Parse YAML, trying to preserve comments and whitespace
        self.fm = yaml.load("".join(self.fm))
        # Parse YAML, trying to preserve key order, comments, and whitespace
        self.yaml = self.yaml_parser.load("".join(front_matter_text))

    def recursive_delete_comment_attribs(self, d):
        """
        Delete comments from the YAML template for Compact mode
        Args:
            d: the named attribute to delete from the YAML dict
        """
        if isinstance(d, dict):
            for k, v in d.items():
                self.recursive_delete_comment_attribs(k)
                self.recursive_delete_comment_attribs(v)
        elif isinstance(d, list):
            for elem in d:
                self.recursive_delete_comment_attribs(elem)
        try:
            # literal scalarstring might have comment associated with them
            attr = "comment" if isinstance(d, ruamel.yaml.scalarstring.ScalarString) else ruamel.yaml.comments.Comment.attrib  # type: ignore
            delattr(d, attr)
        except AttributeError:
            pass

    def dump(self):
        """
        Save the generated markdown to file.
        """
        assert self.path, "You need to `.load()` first."
        if self.dry_run:
            return

        with open(self.path, "w", encoding="utf-8") as f:
            f.write("{}\n".format(self.delim))
            yaml.dump(self.fm, f)
            if self.compact:
                # For compact output, strip comments, new lines, and empty keys
                # Strip `image` key in Compact mode as it cannot currently be set via Bibtex, it's just set in template.
                # Note: a better implementation may be just to start with a different template for Compact mode,
                # rather than remove items from the detailed template.
                self.recursive_delete_comment_attribs(self.yaml)
                elems_to_delete = []
                for elem in self.yaml:
                    if (
                        self.yaml[elem] is None
                        or self.yaml[elem] == ""
                        or self.yaml[elem] == []
                        or (elem == "featured" and self.yaml[elem] is False)
                        or (elem == "image")
                    ):
                        elems_to_delete.append(elem)
                for elem in elems_to_delete:
                    del self.yaml[elem]
                del elems_to_delete
            self.yaml_parser.dump(self.yaml, f)
            f.write("{}\n".format(self.delim))
            f.writelines(self.content)
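
The two core steps of the new class, splitting a template into front matter and body by counting delimiter lines, and dropping empty keys in compact mode, can be tried in isolation with the sketch below. It is a simplified illustration under stated assumptions: it works on plain lists and dicts instead of ruamel.yaml's round-trip objects, it does not strip comments, and the function names `split_front_matter` and `strip_empty_keys` are hypothetical.

```python
from typing import Any


def split_front_matter(lines: list[str], delim: str = "---") -> tuple[list[str], list[str]]:
    """Separate front matter lines from body lines by counting delimiter lines."""
    front_matter: list[str] = []
    body: list[str] = []
    delims_seen = 0
    for line in lines:
        if line.startswith(delim):
            delims_seen += 1
        elif delims_seen < 2:
            # Everything before the second delimiter is treated as front matter.
            front_matter.append(line)
        else:
            body.append(line)
    return front_matter, body


def strip_empty_keys(data: dict[str, Any]) -> dict[str, Any]:
    """Return a copy without the keys that the compact branch of `dump()` removes."""
    return {
        key: value
        for key, value in data.items()
        if not (
            value is None
            or value == ""
            or value == []
            or (key == "featured" and value is False)
            or key == "image"
        )
    }


front_matter, body = split_front_matter(["---\n", "title: A\n", "---\n", "Body\n"])
print(front_matter, body)
print(strip_empty_keys({"title": "A", "abstract": "", "tags": [], "featured": False, "image": {}}))
```

Building a filtered copy avoids mutating the mapping while iterating over it, which is the same problem the diff solves by collecting keys in `elems_to_delete` first.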