                                              26 March 2010

The numbers appearing in a document can often provide a
lot of information about the application domain it addresses.

The numbers program extracts numeric values (the current release
limits itself to values containing a decimal point or exponent)
from a file and compares them against the numbers contained in an
'interesting' numbers database.  A match causes information
about the number to be output; for example, a match corresponding
to the diameter of the Sun or the orbit of Mercury suggests an
astronomy-related domain.
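
As a rough illustration of the extraction step, the following
Python sketch picks out literals that contain a decimal point or
an exponent.  It is not taken from the numbers source; the pattern
and the names NUMBER_RE and extract_values are assumptions made
for this example, and the real program may accept a different
syntax.

```python
import re

# Hypothetical pattern: a literal qualifies only if it contains a
# decimal point or an exponent, mirroring the restriction above.
NUMBER_RE = re.compile(r"""
    (?<![\w.])                      # not glued to a preceding word or dot
    [+-]?
    (?:
        \d+\.\d*(?:[eE][+-]?\d+)?   # 1.5   1.   1.5e3
      | \.\d+(?:[eE][+-]?\d+)?      # .5    .5e-3
      | \d+[eE][+-]?\d+             # 1e9   (exponent, no decimal point)
    )
""", re.VERBOSE)


def extract_values(text):
    """Return the qualifying literals found in text as floats."""
    return [float(m.group()) for m in NUMBER_RE.finditer(text)]


if __name__ == "__main__":
    sample = "The Sun is 1.392e6 km wide; 42 items, 3.14 and 6E23."
    print(extract_values(sample))
```

Plain integers such as 42 are skipped because they contain neither
a decimal point nor an exponent.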

The 'interesting' numbers database is a collection of text files
that users are encouraged to add to.  The default directory for
these files is called ndb and is initially searched for in the
directory that contains the numbers program executable.

The project's home page is: www.coding-guidelines.com/numbers

See doc/numbers.1 for the man page.


Using numbers on a single file:

 > numbers stuff.txt

If the argument is a directory, all files in that directory and
in any subdirectories it contains will be processed.

 > numbers dir_path


By default the names of the files and directories being
processed are printed.  To switch off this progress trace
use the option '-trace progress-'.

To obtain a list of numbers contained in the input file that did not
match use the -print option:

 > numbers stuff.txt -print nomatch


By default numbers compares values and looks for a match that
is within 0.01% of a known value.  The fuzziness of the match
can be changed using the -fuzz option, e.g., to match within 0.1%:

 > numbers stuff.txt -fuzz 0.001
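
The comparison can be pictured as a relative-tolerance test.  The
Python sketch below is an illustration only: the function name
fuzzy_value_match is made up, and it assumes the tolerance is
measured relative to the known (database) value, which this
README does not state.  The tolerance is written as a fraction,
so 0.0001 corresponds to 0.01% and 0.001 to 0.1%.

```python
def fuzzy_value_match(value, known, fuzz=0.0001):
    """True if value lies within the fraction fuzz of known.

    Assumption: the tolerance is relative to the known value.
    """
    if known == 0.0:
        # A relative tolerance around zero is empty, so require equality.
        return value == 0.0
    return abs(value - known) <= fuzz * abs(known)


if __name__ == "__main__":
    sun_diameter_km = 1.392e6
    print(fuzzy_value_match(1.3921e6, sun_diameter_km))
    print(fuzzy_value_match(1.393e6, sun_diameter_km))
    print(fuzzy_value_match(1.393e6, sun_diameter_km, fuzz=0.001))
```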


Sometimes a value is scaled by some power of ten and a value
match would not be found, but a comparison of the digits appearing
in the mantissa would match.  This form of comparison can be
obtained using the -match option:

 > numbers stuff.txt -match mantissa

To switch off value matching, which is on by default:

 > numbers stuff.txt -match mantissa -match value-
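
To show what comparing mantissa digits means, here is a small
Python sketch.  It assumes that sign, scale, and leading and
trailing zeros are all ignored; the helper names mantissa_digits
and mantissa_match are hypothetical and the numbers program may
normalise literals differently.

```python
from decimal import Decimal


def mantissa_digits(literal):
    """Significant digits of a numeric literal, ignoring sign and scale."""
    # normalize() strips trailing zeros; as_tuple() exposes the digits.
    digits = Decimal(literal).normalize().as_tuple().digits
    return "".join(str(d) for d in digits)


def mantissa_match(a, b):
    """True if two literals share the same significant digits."""
    return mantissa_digits(a) == mantissa_digits(b)


if __name__ == "__main__":
    print(mantissa_digits("1.392e6"))
    print(mantissa_match("1392.0", "0.001392"))
```

With this approach 1392.0 and 0.001392 match on their digits even
though their values differ by a factor of a million.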


It is also possible to perform a fuzzy mantissa match.  The fuzziness
can be controlled either by specifying the number of significant
digits that must match (default 5) or by specifying the Levenshtein
distance of the comparison (default 2).  For instance, to specify
that 7 significant digits must match:

 > numbers stuff.txt -match mantissa -match 7

or to specify a Levenshtein distance of 3 (option names can be
abbreviated to three letters):

 > numbers stuff.txt -lev 3

When one digit is replaced by a digit adjacent to it on the
keyboard, the distance is 1.  Every other kind of difference has
a distance of 2.
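
The weighting described above can be sketched as a Levenshtein
distance with a custom substitution cost.  This Python example is
an approximation made under stated assumptions: 'adjacent on the
keyboard' is taken to mean neighbours on the digit row
1 2 3 4 5 6 7 8 9 0, and insertions, deletions and non-adjacent
substitutions are each assumed to cost 2.  The names DIGIT_ROW,
substitution_cost and weighted_distance are invented here.

```python
# Assumed layout: the digit row of a keyboard.
DIGIT_ROW = "1234567890"


def substitution_cost(a, b):
    """0 for equal characters, 1 for neighbouring digit keys, else 2."""
    if a == b:
        return 0
    if a in DIGIT_ROW and b in DIGIT_ROW:
        if abs(DIGIT_ROW.index(a) - DIGIT_ROW.index(b)) == 1:
            return 1
    return 2


def weighted_distance(s, t):
    """Edit distance between s and t; insertions and deletions cost 2."""
    prev = [2 * j for j in range(len(t) + 1)]
    for i, a in enumerate(s, start=1):
        curr = [2 * i]
        for j, b in enumerate(t, start=1):
            curr.append(min(
                prev[j] + 2,                            # delete a
                curr[j - 1] + 2,                        # insert b
                prev[j - 1] + substitution_cost(a, b),  # substitute or keep
            ))
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    print(weighted_distance("1392", "1492"))  # 3 and 4 are neighbours
    print(weighted_distance("1392", "1792"))  # 3 and 7 are not
```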

Words that occur before a value may provide some context about
the domain it comes from.  A specified number of words (-words <number>
option) that occurred before a match can be output using:

 > numbers stuff.txt -print words
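
One way to picture collecting the preceding words is shown in the
Python sketch below.  It assumes that a word is any run of
non-whitespace characters, which may differ from the definition
numbers uses; the function name words_before is hypothetical.

```python
def words_before(text, match_start, count):
    """Return up to count whitespace-separated words before an offset."""
    if count <= 0:
        return []
    return text[:match_start].split()[-count:]


if __name__ == "__main__":
    line = "The diameter of the Sun is 1.392e6 km"
    start = line.index("1.392e6")
    print(words_before(line, start, 3))
```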

To see the list of supported options and their current settings:

 > numbers -v


Sometimes the digits within a single literal may be separated by
a single space character.  By default a space character is treated
as a delimiter, i.e., it causes space-separated digit sequences
to be treated as two separate values.  The option '-allow space+'
tells numbers to skip a single space character if it occurs
between two sequences of digit characters.
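
The effect of skipping such a space can be imitated with a small
Python sketch.  It is a preprocessing approximation rather than
the program's own method, and the names SPACE_BETWEEN_DIGITS and
join_space_separated_digits are made up for the example.

```python
import re

# Exactly one space that has a digit immediately on each side.
SPACE_BETWEEN_DIGITS = re.compile(r"(?<=\d) (?=\d)")


def join_space_separated_digits(text):
    """Remove a single space that sits between two digit sequences."""
    return SPACE_BETWEEN_DIGITS.sub("", text)


if __name__ == "__main__":
    print(join_space_separated_digits("299 792.458 km/s"))
    print(join_space_separated_digits("10  20"))  # two spaces: unchanged
```

Note that this also merges values that really were separate, such
as '3 4.5', which is presumably why the behaviour is optional.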

