05 Jul 2020 - tsp
Last update 05 Jul 2020
3 mins
Now that my page has grown larger and larger - and most articles are written
late at night or in short breaks - I decided that it was time to include some
kind of spell checking in my writing and publishing process. The typical
tool used on Unices is aspell.
It’s easily installable on FreeBSD using the textproc/aspell package, together
with the desired dictionaries:
sudo pkg install textproc/aspell
sudo pkg install textproc/en-aspell
The basic interactive usage on the command line to check a markdown document is rather simple:
aspell --dont-backup -p dictionary.pwd -M check 2020-06-12-nacaprofiles3d.md
The -p switch allows one to specify a user dictionary. This allows me to keep
a user dictionary inside my repository so the fully automated pipeline
automatically uses the same user dictionary without modifying the master
dictionary on the build machine manually.
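To keep such a user dictionary under version control it first has to exist. The following is a minimal sketch, assuming the plain text personal word list format described in the aspell manual (a header line `personal_ws-1.1 <language> <word count>` followed by one word per line, where the count is only a hint). The helper names `create_wordlist` and `add_word` are made up for illustration and are not part of aspell.

```shell
#!/bin/sh
# Hypothetical helper: create an empty personal word list if it does not
# exist yet. Assumed header format: personal_ws-1.1 <language> <word count>
create_wordlist() {
    wordlist="$1"
    if [ ! -f "${wordlist}" ]; then
        printf 'personal_ws-1.1 en 0\n' > "${wordlist}"
    fi
}

# Hypothetical helper: append a word unless an identical line already exists.
add_word() {
    wordlist="$1"
    word="$2"
    grep -qxF -- "${word}" "${wordlist}" || printf '%s\n' "${word}" >> "${wordlist}"
}
```

Calling `create_wordlist dictionary.pwd` once and then `add_word dictionary.pwd FreeBSD` for every accepted term would produce a file that can be committed next to the articles. Words accepted during an interactive `check` session should end up in the same file.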
The --dont-backup switch suppresses the generation of a backup file that has
the same filename as the checked file, just with a .bak extension. The -M
switch enables the markdown filter, which suppresses error detection inside
markdown markup and inside code tags.
For batch mode checking there are some different options:
The basic idea of the second approach is to modify the build script to
execute a simple command that counts the spelling error candidates
for every .md file:
cat ${FILENAME} | aspell -p dictionary.pwd -M list | wc -l
To get a total error candidate count:
find ./_posts/ -name "*.md" -exec cat {} \; | aspell -p dictionary.pwd -M list | wc -l
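When the total count is high it can help to see which candidates occur most often, since frequent ones are usually names or technical terms that belong in the user dictionary. A minimal sketch, assuming a hypothetical helper function `top_candidates` that reads the word list produced by `aspell list` on standard input:

```shell
#!/bin/sh
# Hypothetical helper: reads one word per line on stdin and prints the
# most frequent words together with their number of occurrences.
# $1: number of words to show (defaults to 10)
top_candidates() {
    sort | uniq -c | sort -rn | head -n "${1:-10}"
}

# Possible usage together with the pipeline from above:
# find ./_posts/ -name "*.md" -exec cat {} \; \
#     | aspell -p dictionary.pwd -M list | top_candidates 20
```

The helper only post-processes the word list, so it does not change what aspell reports.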
Another option is to execute the spellchecker for every file separately,
which is far more useful. I’ve done this by implementing a small shell script
that accepts either a directory or a filename. When a directory name is
specified, the script simply iterates over that directory and executes itself
for every file. I didn’t use find since a POSIX conforming find does not pass
on the return value of any program or script executed via -exec ... \; in its
own exit status.
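The behaviour of find can be checked with a command that always fails. This is a small sketch and not part of the build: it runs `false` for the current directory and records how find itself exits. The `{} +` form is shown only for comparison. POSIX requires find to exit non-zero there when an invocation fails, but whether a particular find implementation does so is something to verify on the build machine.

```shell
#!/bin/sh
# "-exec ... \;": the result of the command only decides whether the
# primary evaluates to true, find itself still exits with 0.
find . -prune -exec false \;
STATUS_SEMICOLON=$?

# "-exec ... {} +": POSIX requires a non-zero exit status of find if any
# invocation of the command returned non-zero.
if find . -prune -exec false {} +; then
    STATUS_PLUS="success"
else
    STATUS_PLUS="failure"
fi

echo "semicolon form exit status: ${STATUS_SEMICOLON}"
echo "plus form reported: ${STATUS_PLUS}"
```

Even where the `{} +` form reports a failure, it only signals that some file failed. The loop used in the script keeps the per-file result available.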
If the script is called with a filename, the aspell -M list
command gets executed and its output lines get counted. If the count of
spelling error candidates is above a threshold (10 in the script below), the
script returns an error code.
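#!/bin/sh

# Usage: pass either a directory containing .md files or a single file
if [ $# -lt 1 ]; then
    echo "Specify directory or filename"
    exit 1
fi

if [ -d "${1}" ]; then
    # Not used, the result of the executed script would get lost:
    # find ${1} -type f -name "*.md" -exec ${0} {} \;
    FAILING=0
    for FNAME in "${1}"/*.md; do
        # Skip the unexpanded pattern if the directory contains no .md files
        [ -f "${FNAME}" ] || continue
        # Run this script again for the single file
        if ! "${0}" "${FNAME}"; then
            FAILING=1
        fi
    done
    if [ ${FAILING} -ne 0 ]; then
        echo "Aborting - too many spellchecking errors"
    fi
    exit ${FAILING}
else
    # aspell prints one spelling error candidate per line
    ERRCOUNT=$(aspell -p dictionary.pwd -M list < "${1}" | wc -l)
    echo "${ERRCOUNT} ${1}"
    # More than 10 candidates in a single file count as failure
    if [ ${ERRCOUNT} -gt 10 ]; then
        exit 1
    fi
    exit 0
fi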
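The limit of 10 candidates is hard-coded in the script above. A minimal sketch of how it could be made configurable, assuming a hypothetical environment variable `SPELLCHECK_MAXERRORS` and a hypothetical helper `within_threshold` (neither is part of the original script):

```shell
#!/bin/sh
# SPELLCHECK_MAXERRORS is a hypothetical variable name. When it is unset
# it falls back to 10, the limit used in the script above.
: "${SPELLCHECK_MAXERRORS:=10}"

# Returns 0 if the candidate count in $1 does not exceed the limit.
within_threshold() {
    # Arithmetic expansion drops the padding some wc implementations emit
    count=$(($1 + 0))
    [ "${count}" -le "${SPELLCHECK_MAXERRORS}" ]
}
```

Inside the script the comparison against 10 would then become `if ! within_threshold "${ERRCOUNT}"; then`, and a build stage could set the variable in its environment to relax or tighten the limit without touching the script.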
This script is then called as usual from the Makefile in its own Jenkins
stage - in parallel to the automatic tag page generation.
Dipl.-Ing. Thomas Spielauer, Wien (webcomplains389t48957@tspi.at)
This webpage is also available via TOR at http://rh6v563nt2dnxd5h2vhhqkudmyvjaevgiv77c62xflas52d5omtkxuid.onion/