This repository has been archived by the owner on Sep 4, 2023. It is now read-only.

simsapa/simsapa-dictionary


Simsapa Dictionary Tool

⚠️ DEPRECATED ⚠️

This project is no longer maintained and has been superseded by the more recent Simsapa Dhamma Reader.



What is this for?

This tool generates EPUB, MOBI, Stardict (.zip) and Babylon (.gls) dictionary files.

I hope it is useful for:

  • looking up Pali words more easily on different devices
  • using full-text search (for example, looking for English words in the Pali definitions)
  • in an offline context

The main results are the downloadable files:

Download Pali - English dictionaries: See the Releases page.

You will need a dictionary application for your desktop or mobile device. Then download the dictionary in a format that the application can open.

For example, the StarDict format is widely supported. Search for "dictionary app stardict format" for Windows / Mac / Android or similar to find an app that works for you.

Download one of the formats and open it with the dictionary app.

Desktop setup with GoldenDict

Specific steps to use this with GoldenDict.

Install GoldenDict

Note that the goldendict.org website only has an old version (1.0.1) for Windows.

Download the more recent v1.5 version:

On Linux distributions, you can also install the goldendict package from your package manager.

Download the dictionary

Download one of the StarDict .zip files from the Releases page.

Extract the .zip to a folder. It will contain four files, such as:

combined-dictionary-stardict/
  combined-dictionary.dict.dz
  combined-dictionary.idx
  combined-dictionary.ifo
  combined-dictionary.syn
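
Before adding the folder to a dictionary app, it can help to confirm that the extraction produced all four files. The following is a minimal sketch; the function name `check_stardict_dir` is hypothetical and not part of this project.

```shell
# check_stardict_dir DIR NAME  (hypothetical helper, not shipped with the tool)
# Prints the name of each expected StarDict file missing from DIR.
# Returns 0 when all four files are present, 1 otherwise.
check_stardict_dir() {
    local dir="$1"
    local name="$2"
    local missing=0
    local ext
    for ext in dict.dz idx ifo syn; do
        if [ ! -f "$dir/$name.$ext" ]; then
            echo "missing: $name.$ext"
            missing=1
        fi
    done
    return "$missing"
}
```

For the folder shown above, the call would be `check_stardict_dir ./combined-dictionary-stardict combined-dictionary`.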

Add to GoldenDict

  • Open GoldenDict.
  • Select the Edit > Dictionaries menu. It usually opens with the Sources > Files tab open.
  • Click the Add... button, and select the folder where you extracted the .zip.
  • Click OK. The menu will close.
  • Use the top input field to search for words.
  • Use the Search > Full-text Search menu to search in the word definition texts (such as looking for English to Pali).

Add more dictionaries in other languages if you wish. Search for, say, "portuguese stardict dictionary".

Creating StarDict format

The StarDict format is created in two steps:

  • generate an .xml with simsapa_dictionary
  • use stardict-text2bin to generate the StarDict files (.idx, .dict.dz, .syn, .ifo)

Install

This only works on Linux systems. Install the stardict-tools package which contains the above binary.

sudo apt-get install stardict-tools

The package doesn't install the binary to /usr/local/bin, so you will have to specify the full path when using it.

On Ubuntu, the path is /usr/lib/stardict-tools/stardict-text2bin.
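
Because the location can differ between distributions, a script may want to look the binary up instead of hard-coding the path. This is a small sketch that checks PATH first and then falls back to the Ubuntu location given above; the function name `find_text2bin` is made up for illustration.

```shell
# find_text2bin  (hypothetical helper)
# Prints the path to stardict-text2bin, or returns 1 if it cannot be found.
find_text2bin() {
    if command -v stardict-text2bin >/dev/null 2>&1; then
        # Found on PATH.
        command -v stardict-text2bin
    elif [ -x /usr/lib/stardict-tools/stardict-text2bin ]; then
        # Ubuntu location mentioned in this README.
        echo /usr/lib/stardict-tools/stardict-text2bin
    else
        echo "stardict-text2bin not found; is stardict-tools installed?" >&2
        return 1
    fi
}
```

A script could then use `TEXT2BIN="$(find_text2bin)" || exit 1` and call `"$TEXT2BIN"` afterwards.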

Create

For example, say you have a dictionary file in MS Excel Spreadsheet, dictionary.xlsx.

It must contain a Word entries sheet and a Metadata sheet (see the sample [ncped with space.xlsx](./tests/data/data with space/ncped with space.xlsx)).

First run simsapa_dictionary_linux to generate the .xml (for more cli options, see ./src/cli.yml):

./simsapa_dictionary_linux xlsx_to_stardict_xml \
    --source_path "./dictionary.xlsx" \
    --output_path "./dictionary.xml"

Then run stardict-text2bin to generate the StarDict files:

/usr/lib/stardict-tools/stardict-text2bin dictionary.xml dictionary.ifo

This creates four files, dictionary{.idx, .dict.dz, .syn, .ifo}.
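
The two commands above can be chained in a small function. This is only a sketch under stated assumptions: `build_stardict`, `SIMSAPA_BIN` and `TEXT2BIN` are names invented here, and the default paths are the ones used in this section.

```shell
# build_stardict XLSX_FILE  (hypothetical helper)
# Runs the two steps above for the given .xlsx file.
# SIMSAPA_BIN and TEXT2BIN are assumed override variables for this sketch only.
build_stardict() {
    local xlsx="$1"
    local name="${xlsx%.xlsx}"   # strip the .xlsx extension
    local simsapa="${SIMSAPA_BIN:-./simsapa_dictionary_linux}"
    local text2bin="${TEXT2BIN:-/usr/lib/stardict-tools/stardict-text2bin}"

    # Step 1: generate the StarDict .xml from the spreadsheet.
    "$simsapa" xlsx_to_stardict_xml \
        --source_path "$xlsx" \
        --output_path "$name.xml" || return 1

    # Step 2: generate the StarDict files from the .xml.
    "$text2bin" "$name.xml" "$name.ifo"
}
```

Calling `build_stardict ./dictionary.xlsx` runs both steps and stops early if the first one fails.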

You may wish to ZIP them if you are going to distribute them.

Create it using a shell script

Copy your dictionary.xlsx to a folder.

Open ./assets/xlsx_to_stardict.sh, right-click the [Raw] button, select Save as..., and save it to the folder.

Copy simsapa_dictionary_linux there as well.

Remember to set execution rights for xlsx_to_stardict.sh and simsapa_dictionary_linux, either with chmod +x in the terminal, or the Right-click > Permissions menu in the file manager.

dictionary/
  dictionary.xlsx
  simsapa_dictionary_linux
  xlsx_to_stardict.sh

Open this folder in a terminal and run:

./xlsx_to_stardict.sh dictionary.xlsx

The script combines the above steps and creates dictionary-stardict.zip.

Logging

To see progress log messages, add a .env file with RUST_LOG=info in the folder:

echo "RUST_LOG=info" > .env

This causes the tool to print messages such as:

Running simsapa_dictionary_linux ... [2019-12-11T13:55:30Z INFO  simsapa_dictionary] 🚀 Launched
[2019-12-11T13:55:30Z INFO  simsapa_dictionary::app] process_first_arg()
[2019-12-11T13:55:30Z INFO  simsapa_dictionary::app] process_cli_args()
[2019-12-11T13:55:30Z INFO  simsapa_dictionary] Subcommand given: XlsxToStardict
[2019-12-11T13:55:30Z INFO  simsapa_dictionary::app] === Begin processing XLSX "ncped.xlsx" ===
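
Note that `echo "RUST_LOG=info" > .env` overwrites an existing .env file. If the folder already has one, a variant that only appends the setting when it is absent may be safer. This is a sketch; the function name `enable_info_log` is hypothetical, and it assumes the existing file ends with a newline.

```shell
# enable_info_log [ENV_FILE]  (hypothetical helper)
# Appends RUST_LOG=info to ENV_FILE (default: .env) unless a RUST_LOG line
# is already present. An existing RUST_LOG line is left unchanged.
enable_info_log() {
    local env_file="${1:-.env}"
    if ! grep -q '^RUST_LOG=' "$env_file" 2>/dev/null; then
        echo "RUST_LOG=info" >> "$env_file"
    fi
}
```

Running `enable_info_log` twice in the same folder leaves a single RUST_LOG line in .env.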

Converting to other dictionary formats

The pyglossary tool can convert to a wide range of dictionary formats.

You can use the StarDict files as input format.
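
As a rough sketch, a conversion could be wrapped like this. It assumes pyglossary is installed and on PATH, and that it infers the input and output formats from the file extensions; check `pyglossary --help` for the options of your version. The function name `convert_stardict` is made up for illustration.

```shell
# convert_stardict IFO_FILE OUTPUT_FILE  (hypothetical helper)
# Passes a StarDict .ifo file and the desired output file to pyglossary.
# Assumption: pyglossary picks the formats from the file extensions.
convert_stardict() {
    if ! command -v pyglossary >/dev/null 2>&1; then
        echo "pyglossary not found on PATH" >&2
        return 1
    fi
    pyglossary "$1" "$2"
}
```

For example, `convert_stardict dictionary.ifo dictionary.txt` would ask pyglossary for a plain text export, if that output format is supported by your installation.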

simsapa_dictionary tool

The binary executables (simsapa_dictionary.exe, _linux, _osx) are command line applications.

If you simply double-click one of them, it will appear to do nothing. If you run it in a terminal, it will display some usage notes.

It is a conversion utility, which can be used in small shell scripts to create or update dictionary files.

Dictionary texts

The dictionary source texts are in the simsapa-dictionary-data repo.

You can download the source text, edit and generate updated EPUB and MOBI files using this tool.

To generate MOBI files, also download Kindlegen from Amazon (free download).

Sources

Applications

GoldenDict (Win, Mac OSX, Linux desktop)


Use the *-stardict.zip files, extract them and add the folder to the dictionary list in GoldenDict.

Version 1.5 includes Search menu > Full text search, useful for English to Pali searches.

For Windows and OSX, download v1.5 from the Early Access Builds.

Read more on the wiki pages.

On Linux, install goldendict from your package manager.

Kindle Paperwhite


Copy one of the *.mobi files to your Kindle. It will appear in the Dictionaries category.

Epub readers

The *.epub files can be used with ebook readers which read the Epub format.

Android

Search for applications which can open or import StarDict format dictionaries.

You might have to copy-paste the link of a *-stardict.zip file from the Releases page, or download it and extract it to a folder where the dictionary application can find it.

Such apps include:

Online Pali dictionaries

Example dictionary file

An example of the dictionary content is shown below. It starts with metadata describing the dictionary, followed by the word entries. Each word entry starts with a TOML formatted block, followed by the definition text in Markdown syntax.

Use a text editor such as Notepad++ and copy the example to a file, for example ncped-example.md.

The file extension must be .md.

Arrange the files in a folder:

dictionary/
  kindlegen.exe
  ncped-example.md
  simsapa_dictionary.exe

On Windows, drag-and-drop ncped-example.md onto simsapa_dictionary.exe.

On Linux and Mac, open a terminal in the folder and run ./simsapa_dictionary ./ncped-example.md.

The default action is to generate a MOBI if kindlegen.exe is also present in the folder, otherwise to generate an EPUB.

More options are available, see them with simsapa_dictionary.exe --help. An overview is included below.

ncped-example.md
--- DICTIONARY METADATA ---

``` toml
title = "New Concise Pali - English Dictionary (NCPED)"
description = "Pali - English"
creator = "Simsapa Dhamma Reader"
source = "https://simsapa.github.io"
cover_path = "default_cover.jpg"
book_id = "NcpedDictionarySimsapa"
created_date_human = ""
created_date_opf = ""
```

--- DICTIONARY WORD ENTRIES ---

``` toml
dict_label = "NCPED"
word = "ababa"
summary = "the name of a hell, or place in Avīci, where one s"
grammar = ""
inflections = []
```

ababa

masculine the name of a hell, or place in Avīci, where one suffers for an *ababa* of years.

``` toml
dict_label = "NCPED"
word = "abbhantara"
summary = "interior, internal; being within, included in, amo"
grammar = ""
inflections = []
```

abbhantara

mfn. & neuter

1. (mfn.) interior, internal; being within, included in, among; belonging to one's house, personal, intimate.
2. (n.)
   1. intermediate space, interval; the inside, interior.
   2. a measure of length (= 28 hatthas).

``` toml
dict_label = "NCPED"
word = "ajjhokāse"
summary = "in the open air, in the open."
grammar = ""
inflections = []
```

ajjhokāse

ind. in the open air, in the open.
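
Since every entry carries a `word = "..."` line in its TOML block, the headwords of such a file can be listed with standard tools. This is a minimal sketch that relies only on the layout shown in the example above; the function name `list_words` is hypothetical.

```shell
# list_words FILE  (hypothetical helper)
# Prints the value of every line of the form: word = "..."
list_words() {
    sed -n 's/^word = "\(.*\)"$/\1/p' "$1"
}
```

Running `list_words ncped-example.md | wc -l` gives a quick count of the entries, which is useful as a sanity check after editing the file by hand.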

CLI Options

Use the help command to discover the command line options, or see src/cli.yml.

./simsapa_dictionary help

Feedback, corrections, bug reports

Both the tool and the dictionary content have some rough edges.

The dictionary entries can be edited using the files at simsapa-dictionary-data, and the dictionary formats re-generated with this tool.

Dictionary corrections or bug reports about the tool are welcome. Open an Issue here or see my email in the Cargo.toml.