05 Jul 2020 - tsp
Last update 05 Jul 2020
Now that my page has grown larger and larger, and since most articles are written
late at night or in short breaks, I decided it was time to include some
kind of spell checking in my writing and publishing process. The typical
tool for this on Unices is aspell.
It’s easily installable on FreeBSD using the textproc/aspell package as
well as the desired dictionaries.
sudo pkg install textproc/aspell
sudo pkg install textproc/en-aspell
The basic interactive usage on the command line to check a markdown document is rather simple:
aspell --dont-backup -p dictionary.pwd -M check 2020-06-12-nacaprofiles3d.md
The command line switch -p specifies a personal (user) dictionary.
This allows me to keep the user dictionary inside my
repository so that the fully automated pipeline uses the same
user dictionary without anyone modifying the master dictionary on the build
machine manually.
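As a rough sketch of what such a repository-local dictionary could look like: aspell's personal word lists are plain text files with a header line followed by one word per line. The helper name make_dictionary and the example words are made up for illustration, and the language code en is assumed to match the installed dictionary. As far as I know the word count in the header is only used as a size hint.

```shell
# Create a personal word list in aspell's plain-text format.
# Usage: make_dictionary OUTPUT_FILE WORD...
# (make_dictionary is a hypothetical helper, not part of aspell.)
make_dictionary() {
    out="$1"
    shift
    # Header line: format tag, language code, number of words
    printf 'personal_ws-1.1 en %d\n' "$#" > "$out"
    # One accepted word per line
    for word in "$@"; do
        printf '%s\n' "$word" >> "$out"
    done
}

# Example: make_dictionary dictionary.pwd FreeBSD Jenkins aspell
```

Since the file is plain text it can be versioned and reviewed like any other file in the repository.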
The --dont-backup switch suppresses the generation of a backup file with
the same filename as the checked file, just with a .bak
extension. The -M switch enables the markdown filter so that markdown
markup and the contents of code tags are not reported as spelling errors.
For batch mode checking there are some different options:
The basic idea of the second approach is to modify the build script so
that it executes a simple command that counts the spelling error candidates
for every .md file:
aspell -p dictionary.pwd -M list < "${FILENAME}" | wc -l
To get a total error candidate count:
find ./_posts/ -name "*.md" -exec cat {} \; | aspell -p dictionary.pwd -M list | wc -l
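A plain count does not show which words are responsible for it. As a small sketch, the word list that aspell prints in list mode can be piped through a hypothetical helper (rank_candidates is a name chosen here, not an aspell feature) that groups identical words and sorts them by frequency. Words that show up often are usually good candidates for the user dictionary.

```shell
# Read one word per line on stdin and print "COUNT WORD" lines,
# most frequent word first.
# (rank_candidates is a hypothetical helper name.)
rank_candidates() {
    sort | uniq -c | sort -rn | awk '{ print $1, $2 }'
}

# Example usage together with aspell:
#   aspell -p dictionary.pwd -M list < "${FILENAME}" | rank_candidates
```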
Another option, which is far more useful, is to execute the spell checker for
every file separately. I've done this by implementing a small shell script
that accepts either a directory or a filename. If a directory name is
specified, the script simply iterates over the .md files in that directory and
executes itself for every file. I didn't use find since a POSIX-conforming find
does not pass on the return value of a program or script executed via
-exec ... \; as its own exit status.
If the script has been called with a filename, the aspell -M list
command gets executed and its output lines get counted. If the spelling error
count is above a configurable threshold (10 in the script below) the script
returns an error code.
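The threshold check on its own can be sketched as a small function. This is only an illustration: the environment variable SPELL_MAX_ERRORS and the function name exceeds_threshold are assumptions made here, with 10 as the default value.

```shell
# Succeed (exit status 0) if COUNT is above the allowed maximum.
# SPELL_MAX_ERRORS is a hypothetical environment variable; it
# defaults to 10 when unset or empty.
exceeds_threshold() {
    count="$1"
    limit="${SPELL_MAX_ERRORS:-10}"
    [ "$count" -gt "$limit" ]
}

# Example:
#   if exceeds_threshold "${ERRCOUNT}"; then exit 1; fi
```

Reading the limit from the environment would allow the build job to tighten or relax it without touching the script.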
#!/bin/sh
# Spell check a single markdown file or every *.md file in a directory.
if [ $# -lt 1 ]; then
    echo "Specify directory or filename"
    exit 1
fi
if [ -d "${1}" ]; then
    # find ${1} -type f -name "*.md" -exec ${0} {} \;
    FAILING=0
    for FNAME in "${1}"/*.md; do
        # Run this script again for a single file, remember any failure
        if ! "${0}" "${FNAME}"; then
            FAILING=1
        fi
    done
    if [ "${FAILING}" -ne 0 ]; then
        echo "Aborting - too many spellchecking errors"
    fi
    exit "${FAILING}"
else
    # Count the spelling error candidates reported by aspell
    ERRCOUNT=$(aspell -p dictionary.pwd -M list < "${1}" | wc -l)
    echo "${ERRCOUNT} ${1}"
    # Fail if there are more candidates than the threshold allows
    if [ "${ERRCOUNT}" -gt 10 ]; then
        exit 1
    fi
    exit 0
fi
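The commented-out find line in the script hints at a recursive variant. As a hedged sketch, not what the script above does: with the {} + form of -exec, POSIX specifies that find returns a non-zero exit status if any invocation returned non-zero, so a small inline sh loop can call a checker once per file and report a combined result. The function name check_tree is made up here, and CHECKER is assumed to be the path of an executable that takes one filename, such as the script above.

```shell
# Run CHECKER once for every *.md file below DIR (recursively).
# Returns non-zero if at least one run of CHECKER failed.
# (check_tree is a hypothetical helper name.)
check_tree() {
    dir="$1"
    checker="$2"
    # Inside the inline script, $0 is the checker and "$@" are the
    # files that find collected for this invocation.
    find "$dir" -type f -name '*.md' -exec sh -c '
        rc=0
        for f in "$@"; do
            "$0" "$f" || rc=1
        done
        exit "$rc"
    ' "$checker" {} +
}

# Example: check_tree ./_posts ./spellcheck.sh
```

Unlike the glob in the script above, this also descends into subdirectories, which may or may not be desired.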
This script is then called as usual from
the Makefile in its own Jenkins stage, in parallel to the
automatic tag page generation.
Dipl.-Ing. Thomas Spielauer, Wien (webcomplains389t48957@tspi.at)