Commit 6a28cce

Fix whitespace issues

* Remove whitespace (blanks, tabs, cr) at line endings

Signed-off-by: Stefan Weil <[email protected]>
1 parent 3af2773 commit 6a28cce

45 files changed, with 239 additions and 239 deletions
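
The commit message describes removing blanks, tabs, and carriage returns at line endings. The commit does not say which tool was used, so the following is only a minimal sketch of how such a cleanup could be scripted. The function name `strip_trailing_whitespace` is hypothetical, and the sketch assumes the text uses `\n` (or `\r\n`) line endings.

```python
def strip_trailing_whitespace(text: str) -> str:
    """Remove blanks, tabs, and carriage returns at the end of every line.

    Hypothetical helper for illustration; not the tool used for this commit.
    Note that stripping a trailing CR also turns CRLF line endings into LF.
    """
    # Split on LF only, so a CR left over from a CRLF pair stays attached
    # to its line and is removed by rstrip() together with blanks and tabs.
    lines = text.split("\n")
    return "\n".join(line.rstrip(" \t\r") for line in lines)


sample = "Fix me \t\r\nnext line\t\n"
print(repr(strip_trailing_whitespace(sample)))  # 'Fix me\nnext line\n'
```

Leading indentation and the presence or absence of a final newline are left untouched, which matches the equal addition and deletion counts above: every changed line is replaced by exactly one cleaned line.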

.github/ISSUE_TEMPLATE.md

+1 -1
@@ -6,7 +6,7 @@ Note that it will be much easier for us to fix the issue if a test case that
 reproduces the problem is provided. Ideally this test case should not have any
 external dependencies. Provide a copy of the image or link to files for the test case.

-Please delete this text and fill in the template below.
+Please delete this text and fill in the template below.

 ------------------------

CONTRIBUTING.md

+8 -8
@@ -9,9 +9,9 @@ If you think you found a bug in Tesseract, please create an issue.
 Use the [users mailing-list](https://groups.google.com/d/forum/tesseract-ocr) instead of creating an Issue if ...
 * You have problems using Tesseract and need some help.
 * You have problems installing the software.
-* You are not satisfied with the accuracy of the OCR, and want to ask how you can improve it. Note: You should first read the [ImproveQuality](https://github.com/tesseract-ocr/tesseract/wiki/ImproveQuality) wiki page.
+* You are not satisfied with the accuracy of the OCR, and want to ask how you can improve it. Note: You should first read the [ImproveQuality](https://github.com/tesseract-ocr/tesseract/wiki/ImproveQuality) wiki page.
 * You are trying to train Tesseract and you have a problem and/or want to ask a question about the training process. Note: You should first read the **official** guides [[1]](https://github.com/tesseract-ocr/tesseract/wiki) or [[2]](https://github.com/tesseract-ocr/tesseract/wiki/TrainingTesseract) found in the project wiki.
-* You have a general question.
+* You have a general question.

 An issue should only be reported if the platform you are using is one of these:
 * Linux (but not a version that is more than 4 years old)
@@ -22,7 +22,7 @@ For older versions or other operating systems, use the Tesseract forum.

 When creating an issue, please report your operating system, including its specific version: "Ubuntu 16.04", "Windows 10", "Mac OS X 10.11" etc.

-Search through open and closed issues to see if similar issue has been reported already (and sometimes also has been solved).
+Search through open and closed issues to see if similar issue has been reported already (and sometimes also has been solved).

 Similarly, before you post your question in the forum, search through past threads to see if similar question has been asked already.

@@ -32,10 +32,10 @@ Only report an issue in the latest official release. Optionally, try to check if

 Make sure you are able to replicate the problem with Tesseract command line program. For external programs that use Tesseract (including wrappers and your own program, if you are developer), report the issue to the developers of that software if it's possible. You can also try to find help in the Tesseract forum.

-Each version of Tesseract has its own language data you need to obtain. You **must** obtain and install trained data for English (eng) and osd. Verify that Tesseract knows about these two files (and other trained data you installed) with this command:
+Each version of Tesseract has its own language data you need to obtain. You **must** obtain and install trained data for English (eng) and osd. Verify that Tesseract knows about these two files (and other trained data you installed) with this command:
 `tesseract --list-langs`.

-Post example files to demonstrate the problem.
+Post example files to demonstrate the problem.
 BUT don't post files with private info (about yourself or others).

 When attaching a file to the issue report / forum ...
@@ -46,23 +46,23 @@ Do not attach programs or libraries to your issues/posts.

 For large files or for programs, add a link to a location where they can be downloaded (your site, Git repo, Google Drive, Dropbox etc.)

-Attaching a multi-page TIFF image is useful only if you have problem with multi-page functionality, otherwise attach only one or a few single page images.
+Attaching a multi-page TIFF image is useful only if you have problem with multi-page functionality, otherwise attach only one or a few single page images.

 Copy the error message from the console instead of sending a screenshot of it.

 Use the toolbar above the comment edit area to format your comment.

 Add three backticks before and after a code sample or output of a command to format it (The `Insert code` button can help you doing it).

-If your comment includes a code sample or output of a command that exceeds ~25 lines, post it as attached text file (`filename.txt`).
+If your comment includes a code sample or output of a command that exceeds ~25 lines, post it as attached text file (`filename.txt`).

 Use `Preview` before you send your issue. Read it again before sending.

 Note that most of the people that respond to issues and answer questions are either other 'regular' users or **volunteers** developers. Please be nice to them :-)

 The [tesseract developers](http://groups.google.com/group/tesseract-dev/) forum should be used to discuss Tesseract development: bug fixes, enhancements, add-ons for Tesseract.

-Sometimes you will not get a respond to your issue or question. We apologize in advance! Please don't take it personally. There can be many reasons for this, including: time limits, no one knows the answer (at least not the ones that are available at that time) or just that
+Sometimes you will not get a respond to your issue or question. We apologize in advance! Please don't take it personally. There can be many reasons for this, including: time limits, no one knows the answer (at least not the ones that are available at that time) or just that
 your question has been asked (and has been answered) many times before...

 ## For Developers: Creating a Pull Request

ChangeLog

+6 -6
@@ -1,7 +1,7 @@
 2017-03-24 - V4.00.00-alpha
 * Added new neural network system based on LSTMs, with major accuracy gains.
 * Improvements to PDF rendering.
-* Fixes to trainingdata rendering.
+* Fixes to trainingdata rendering.
 * Added LSTM models+lang models to 101 languages. (tessdata repository)
 * Improved multi-page TIFF handling.
 * Fixed damage to binary images when processing PDFs.
@@ -40,7 +40,7 @@
 * Fixed some openCL issues.
 * Added option to build Tesseract with CMake build system.
 * Implemented CPPAN support for easy Windows building.
-
+
 2016-02-17 - V3.04.01
 * Added OSD renderer for psm 0. Works for single page and multi-page images.
 * Improve tesstrain.sh script.
@@ -84,7 +84,7 @@
 text and truetype fonts.
 * Added support for PDF output with searchable text.
 * Removed entire IMAGE class and all code in image directory.
-* Tesseract executable: support for output to stdout; limited support for one
+* Tesseract executable: support for output to stdout; limited support for one
 page images from stdin (especially on Windows)
 * Added Renderer to API to allow document-level processing and output
 of document formats, like hOCR, PDF.
@@ -169,12 +169,12 @@
 * Added TessdataManager to combine data files into a single file.
 * Some dead code deleted.
 * VC++6 no longer supported. It can't cope with the use of templates.
-* Many more languages added.
+* Many more languages added.
 * Doxygenation of most of the function header comments.
 * Added man pages.
 * Added bash completion script (issue 247: thanks to neskiem)
 * Fix integer overview in thresholding (issue 366: thanks to Cyanide.Drake)
-* Add Danish Fraktur support (issues 300, 360: thanks to
+* Add Danish Fraktur support (issues 300, 360: thanks to

 * Fix file pointer leak (issue 359, thanks to yukihiro.nakadaira)
 * Fix an error using user-words (Issue 345: thanks to max.markin)
@@ -183,7 +183,7 @@
 * Fix an automake error (Issue 318, thanks to ichanjz)
 * Fix a Win32 crash on fileFormatIsTiff() (Issues 304, 316, 317, 330, 347,
 349, 352: thanks to nguyenq87, max.markin, zdenop)
-* Fixed a number of errors in newer (stricter) versions of VC++ (Issues
+* Fixed a number of errors in newer (stricter) versions of VC++ (Issues
 301, among others)

 2009-06-30 - V2.04

INSTALL.GIT.md

+3 -3
@@ -26,14 +26,14 @@ So, the steps for making Tesseract are:
 $ make training
 $ sudo make training-install

-You need to install at least English language and OSD traineddata files to
-`TESSDATA_PREFIX` directory.
+You need to install at least English language and OSD traineddata files to
+`TESSDATA_PREFIX` directory.

 You can retrieve single file with tools like [wget](https://www.gnu.org/software/wget/), [curl](https://curl.haxx.se/), [GithubDownloader](https://github.com/intezer/GithubDownloader) or browser.

 All language data files can be retrieved from git repository (useful only for packagers!).
 (Repository is huge - more that 1.2 GB. You do NOT need to download traineddata files for
-all languages).
+all languages).

 $ git clone https://github.com/tesseract-ocr/tessdata.git tesseract-ocr.tessdata

appveyor.yml

+4 -4
@@ -5,13 +5,13 @@ environment:
 - APPVEYOR_BUILD_WORKER_IMAGE: Visual Studio 2017
 vs_ver: 15 2017
 vs_platform: " Win64"
-
+
 configuration:
 - Release
-
+
 cache:
 - c:/Users/appveyor/.cppan/storage
-
+
 # for curl
 install:
 - set PATH=C:\Program Files\Git\mingw64\bin;%PATH%
@@ -25,7 +25,7 @@ before_build:
 - ps: 'Add-Content $env:USERPROFILE\.cppan\cppan.yml "`n`nbuild_warning_level: 0`n"'
 - ps: 'Add-Content $env:USERPROFILE\.cppan\cppan.yml "`n`nbuild_system_verbose: false`n"'
 - ps: 'Add-Content $env:USERPROFILE\.cppan\cppan.yml "`n`nvar_check_jobs: 1`n"'
-
+
 build_script:
 - mkdir build
 - mkdir build\bin

autogen.sh

+6 -6
@@ -46,10 +46,10 @@ if [ "$1" = "clean" ]; then
 find . -iname "Makefile.in" -type f -exec rm '{}' +
 fi

-# Prevent any errors that might result from failing to properly invoke
-# `libtoolize` or `glibtoolize,` whichever is present on your system,
-# from occurring by testing for its existence and capturing the absolute path to
-# its location for caching purposes prior to using it later on in 'Step 2:'
+# Prevent any errors that might result from failing to properly invoke
+# `libtoolize` or `glibtoolize,` whichever is present on your system,
+# from occurring by testing for its existence and capturing the absolute path to
+# its location for caching purposes prior to using it later on in 'Step 2:'
 if command -v libtoolize >/dev/null 2>&1; then
 LIBTOOLIZE="$(command -v libtoolize)"
 elif command -v glibtoolize >/dev/null 2>&1; then
@@ -67,13 +67,13 @@ fi
 bail_out()
 {
 echo
-echo " Something went wrong, bailing out!"
+echo " Something went wrong, bailing out!"
 echo
 exit 1
 }

 # --- Step 1: Generate aclocal.m4 from:
-# . acinclude.m4
+# . acinclude.m4
 # . config/*.m4 (these files are referenced in acinclude.m4)

 mkdir -p config

contrib/genlangdata.pl

+2 -2
@@ -8,7 +8,7 @@

 =pod

-=head1 NAME
+=head1 NAME

 genwordlists.pl - generate word lists for Tesseract

@@ -33,7 +33,7 @@ =head1 DESCRIPTION
 pfx=$(echo $i|tr '/' '_'); cat $i | \
 perl genwordlists.pl -d OUTDIR -p $pfx; done

-This will create a set of output files to match each of the files
+This will create a set of output files to match each of the files
 WikiExtractor created.

 To combine these files:

contrib/tesseract.completion

+4 -4
@@ -1,6 +1,6 @@
 #-*- mode: shell-script;-*-
 #
-# bash completion support for tesseract
+# bash completion support for tesseract
 #
 # Copyright (C) 2009 Neskie A. Manuel <[email protected]>
 # Distributed under the Apache License, Version 2.0.
@@ -20,19 +20,19 @@ _tesseract()
 COMPREPLY=()
 cur="$2"
 prev="$3"
-
+
 case "$prev" in
 tesseract)
 COMPREPLY=($(compgen -f -X "!*.+(tif)" -- "$cur") )
 ;;
 *.tif)
-COMPREPLY=($(compgen -W "$(basename $prev .tif)" ) )
+COMPREPLY=($(compgen -W "$(basename $prev .tif)" ) )
 ;;
 -l)
 _tesseract_languages
 ;;
 *)
-COMPREPLY=($(compgen -W "-l" ) )
+COMPREPLY=($(compgen -W "-l" ) )
 ;;
 esac
 }

doc/Makefile.am

+1 -1
@@ -17,7 +17,7 @@ man_MANS = \
 text2image.1 \
 unicharambigs.5 \
 unicharset_extractor.1 \
-wordlist2dawg.1
+wordlist2dawg.1

 if !DISABLED_LEGACY_ENGINE
 man_MANS += \

doc/classifier_tester.1.asc

+7 -7
@@ -11,9 +11,9 @@ SYNOPSIS

 DESCRIPTION
 -----------
-classifier_tester(1) runs Tesseract in a special mode.
-It takes a list of .tr files and tests a character classifier
-on data as formatted for training,
+classifier_tester(1) runs Tesseract in a special mode.
+It takes a list of .tr files and tests a character classifier
+on data as formatted for training,
 but it doesn't have to be the same as the training data.

 IN/OUT ARGUMENTS
@@ -25,11 +25,11 @@ OPTIONS
 -------
 -l 'lang'::
 (Input) three character language code; default value 'eng'.
-
+
 -classifier 'x'::
 (Input) One of "pruner", "full".
-
-
+
+
 -U 'unicharset'::
 (Input) The unicharset for the language.

@@ -42,7 +42,7 @@ OPTIONS
 (Input) x heights file, each line is of the following form, where xheight is calculated as the pixel x height of a character drawn at 32pt on 300 dpi. [ That is, if base x height + ascenders + descenders = 133, how much is x height? ]

 *font_name* *xheight*
-
+
 -output_trainer 'trainer'::
 (Output, Optional) Filename for output trainer.

doc/combine_lang_model.1.asc

+20 -20
@@ -8,54 +8,54 @@ combine_lang_model - generate starter traineddata

 SYNOPSIS
 --------
-*combine_lang_model* --input_unicharset 'filename' --script_dir 'dirname' --output_dir 'rootdir' --lang 'lang' [--lang_is_rtl] [pass_through_recoder] [--words file --puncs file --numbers file]
+*combine_lang_model* --input_unicharset 'filename' --script_dir 'dirname' --output_dir 'rootdir' --lang 'lang' [--lang_is_rtl] [pass_through_recoder] [--words file --puncs file --numbers file]

 DESCRIPTION
 -----------
 combine_lang_model(1) generates a starter traineddata file that can be used to train an LSTM-based neural network model. It takes as input a unicharset and an optional set of wordlists. It eliminates the need to run set_unicharset_properties(1), wordlist2dawg(1), some non-existent binary to generate the recoder (unicode compressor), and finally combine_tessdata(1).
-
+
 OPTIONS
 -------
 '-l lang'::
-The language to use.
+The language to use.
 Tesseract uses 3-character ISO 639-2 language codes. (See LANGUAGES)

-'--script_dir PATH'::
+'--script_dir PATH'::
 Directory name for input script unicharsets. It should point to the location of langdata (github repo) directory. (type:string default:)
-
-'--input_unicharset FILE'::
+
+'--input_unicharset FILE'::
 Unicharset to complete and use in encoding. It can be a hand-created file with incomplete fields. Its basic and script properties will be set before it is used. (type:string default:)
-
+
 '--lang_is_rtl BOOL'::
 True if language being processed is written right-to-left (eg Arabic/Hebrew). (type:bool default:false)
-
+
 '--pass_through_recoder BOOL'::
 If true, the recoder is a simple pass-through of the unicharset. Otherwise, potentially a compression of it by encoding Hangul in Jamos, decomposing multi-unicode symbols into sequences of unicodes, and encoding Han using the data in the radical_table_data, which must be the content of the file: langdata/radical-stroke.txt. (type:bool default:false)

-'--version_str STRING'::
+'--version_str STRING'::
 An arbitrary version label to add to traineddata file (type:string default:)
-
-'--words FILE'::
+
+'--words FILE'::
 (Optional) File listing words to use for the system dictionary (type:string default:)
-
-'--numbers FILE'::
+
+'--numbers FILE'::
 (Optional) File listing number patterns (type:string default:)
-
-'--puncs FILE'::
+
+'--puncs FILE'::
 (Optional) File listing punctuation patterns. The words/puncs/numbers lists may be all empty. If any are non-empty then puncs must be non-empty. (type:string default:)
-
-'--output_dir PATH'::
+
+'--output_dir PATH'::
 Root directory for output files. Output files will be written to <output_dir>/<lang>/<lang>.* (type:string default:)
-
+
 HISTORY
 -------
-combine_lang_model(1) was first made available for tesseract4.00.00alpha.
+combine_lang_model(1) was first made available for tesseract4.00.00alpha.

 RESOURCES
 ---------
 Main web site: <https://github.com/tesseract-ocr> +
 Information on training tesseract LSTM: <https://github.com/tesseract-ocr/tesseract/wiki/TrainingTesseract-4.00>
-
+
 SEE ALSO
 --------
 tesseract(1)