26 March 2010

The numbers appearing in a document can often provide a lot of information about the application domain it addresses.

The program numbers extracts numeric values (the current release limits itself to values containing a decimal point or exponent) from a file and compares them against the numbers contained in an 'interesting' numbers database. A match causes information about the number to be output; for example, a match corresponding to the diameter of the Sun or the orbit of Mercury suggests an astronomy-related domain.

The 'interesting' numbers database is a collection of text files that users are encouraged to add to. The default directory for these files is called ndb and is initially searched for in the directory that contains the numbers program executable.

The project's home page is: www.coding-guidelines.com/numbers

See doc/numbers.1 for the man page.

Using numbers on a single file:

> numbers stuff.txt

If the argument is a directory, all files in that directory and any contained subdirectories will be processed.

> numbers dir_path

By default the names of the files and directories being processed are printed. To switch off this progress trace, use the option -trace progress-

To obtain a list of the numbers contained in the input file that did not match, use the -print option:

> numbers stuff.txt -print nomatch

By default numbers compares values and looks for a match that is within 0.01% of a known value. The fuzziness of the match can be changed using the -fuzz option, e.g., to match within 0.1%:

> numbers stuff.txt -fuzz 0.001

Sometimes a value has been scaled by some power of ten, so a value match is not found even though a comparison of the digits appearing in the mantissa would match. This form of comparison can be obtained using the -match option:

> numbers stuff.txt -match mantissa

To also switch off value matching, which defaults to on:

> numbers stuff.txt -match mantissa -match value-

It is also possible to perform a fuzzy mantissa match.
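As a rough illustration of the difference between value and mantissa comparison, the following Python sketch shows one way the two checks could work. It is not taken from the numbers source code: the function names are hypothetical, and both the use of a tolerance relative to the known value and the stripping of trailing zeros from the mantissa are assumptions.

```python
from decimal import Decimal


def value_match(found: float, known: float, fuzz: float = 0.0001) -> bool:
    """Return True when found lies within a relative tolerance of known.

    fuzz=0.0001 corresponds to the documented default of 0.01%.
    Measuring the tolerance relative to the known value is an assumption.
    """
    return abs(found - known) <= fuzz * abs(known)


def mantissa_digits(literal: str) -> str:
    """Return the significant digits of a literal as a string.

    The sign, decimal point, exponent and trailing zeros are ignored
    (an assumption about how a mantissa comparison might normalise).
    """
    digits = Decimal(literal).normalize().as_tuple().digits
    return "".join(str(d) for d in digits)


def mantissa_match(found: str, known: str) -> bool:
    """Return True when two literals share exactly the same mantissa digits."""
    return mantissa_digits(found) == mantissa_digits(known)


if __name__ == "__main__":
    # Same digits, different power of ten: the value check fails,
    # the mantissa check succeeds.
    print(value_match(1391.4, 1.3914e6))
    print(mantissa_match("1391.4", "1.3914e6"))
```

In this sketch the literals 1391.4 and 1.3914e6 fail the value comparison but share the mantissa digits 13914, which is the situation the -match mantissa option is meant to catch.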
The fuzziness of a mantissa match can be controlled either by specifying the number of significant digits that must match (default 5) or by specifying the Levenshtein distance of the comparison (default 2). For instance, to specify that 7 significant digits must match:

> numbers stuff.txt -match mantissa -match 7

or to specify a Levenshtein distance of 3 (option names can be abbreviated to three letters):

> numbers stuff.txt -lev 3

The distance between two changed digits that are adjacent on the keyboard is 1. Every other kind of difference has a distance of 2.

Words that occur before a value may provide some context about the domain it comes from. A specified number of words (-words option) that occurred before a match can be output using:

> numbers stuff.txt -print words

To see the list of supported options and their current settings:

> numbers -v

Sometimes the digits within a single literal may be separated by a single space character. By default a space character is treated as a delimiter, i.e., space-separated literals are treated as two separate values. The option '-allow space+' tells numbers to skip a single space character if it occurs between two sequences of digit characters.
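The distance rule used for fuzzy mantissa matching (1 for a change between keyboard-adjacent digits, 2 for every other difference) can be sketched as a weighted edit distance. This Python sketch is illustrative only and is not the code used by numbers. It assumes that "adjacent on the keyboard" refers to the top digit row 1234567890 rather than a numeric keypad, and that an inserted or deleted digit counts as an "other kind of difference". All names are hypothetical.

```python
DIGIT_ROW = "1234567890"  # assumed keyboard layout: the top digit row
OTHER_COST = 2            # every difference that is not an adjacent-key change


def substitution_cost(a: str, b: str) -> int:
    """Cost of replacing character a with character b."""
    if a == b:
        return 0
    if a in DIGIT_ROW and b in DIGIT_ROW:
        if abs(DIGIT_ROW.index(a) - DIGIT_ROW.index(b)) == 1:
            return 1  # neighbouring keys, e.g. 4 and 5, or 9 and 0
    return OTHER_COST


def digit_distance(s: str, t: str) -> int:
    """Weighted Levenshtein distance between two digit strings."""
    # prev[j] holds the distance between the processed prefix of s and t[:j].
    prev = [OTHER_COST * j for j in range(len(t) + 1)]
    for i, a in enumerate(s, start=1):
        cur = [OTHER_COST * i]
        for j, b in enumerate(t, start=1):
            cur.append(min(
                prev[j] + OTHER_COST,                 # delete a from s
                cur[j - 1] + OTHER_COST,              # insert b into s
                prev[j - 1] + substitution_cost(a, b),
            ))
        prev = cur
    return prev[-1]


def fuzzy_mantissa_match(found: str, known: str, max_distance: int = 2) -> bool:
    """Return True when the digit strings are within max_distance of each other."""
    return digit_distance(found, known) <= max_distance


if __name__ == "__main__":
    print(digit_distance("12345", "12355"))  # 4 -> 5 are adjacent keys
    print(digit_distance("12345", "12395"))  # 4 -> 9 are not adjacent
```

Under these assumptions, the default distance of 2 would accept either two adjacent-key slips or one difference of any other kind, while -lev 3 would also accept one of each.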