Wright Information Indexing and Taxonomy Tools

The software tools used to generate indexes come in many varieties. Which technique is used depends on variables such as budget, eventual reuse of the source material, translation needs, time constraints, the media used to publish the material, file sizes and file-transfer issues, and individual preferences.

There are several different methodologies for indexing and taxonomy development:

Standalone tools, usually used for back-of-the-book indexes, allow indexers to work from page-numbered galleys. The indexing is completely separate from the published material. Wright Information uses CINDEX for these tasks, which allows formatting of RTF files and the generation of a database of entries for any translation purposes.

Embedding tools allow indexing codes to be embedded in the electronic text of a book or file, and allow the index's locators to be updated as the text changes. Indexers must work in the same files as the publishers. Wright Information has expertise in FrameMaker, PageMaker and Microsoft Word.

Tagging tools allow indexing codes to be embedded in the electronic text after the indexing is complete. The indexer inserts numbered dummy tags in the files, and then builds the index separately. The final step uses macros to insert the indexing at each tag in the files. Many of these tools are developed in-house to fit the publishing group's needs.
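The tagging workflow can be sketched in a few lines of Python. This is only an illustration of the general idea, not any particular in-house tool: the `[[IX:n]]` tag syntax, the `<index>` output markup, and the function name are all hypothetical and would be replaced by whatever the publishing group's file format requires.

```python
import re

# Hypothetical tag syntax: the indexer drops numbered placeholders such as
# [[IX:1]] into the files, then writes the entries separately, keyed by number.
TAG_PATTERN = re.compile(r"\[\[IX:(\d+)\]\]")


def insert_index_entries(text, entries, marker="<index>{}</index>"):
    """Replace each numbered dummy tag with the index entries written for it.

    `entries` maps a tag number to a list of entry strings. Tags that have
    no entries are left in place so they are easy to find and fix.
    """
    def replace(match):
        tag_id = int(match.group(1))
        if tag_id not in entries:
            return match.group(0)
        return "".join(marker.format(entry) for entry in entries[tag_id])

    return TAG_PATTERN.sub(replace, text)


# Step 1: the text with numbered dummy tags inserted.
tagged_text = "Cells [[IX:1]] hold values. Formulas [[IX:2]] compute them."

# Step 2: the index, built separately and keyed by tag number.
entries = {
    1: ["cells, defined"],
    2: ["formulas, entering", "calculations. See formulas"],
}

# Step 3: the macro step that merges the index back into the files.
result = insert_index_entries(tagged_text, entries)
print(result)
```

Keeping the entries in a separate structure until the final step is what lets the indexer work away from the publisher's files; only the merge needs to touch them.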

Keywording is used primarily in online help materials. Keywords can take the form of hard-coded jumps, similar to HTML jumps, or they can be inserted as embedded codes and compiled into a list by the software. Wright Information uses RoboHELP, RoboHTML, and other tools to keyword help files.

Weighted-text search tools, similar to the intelligence in agents or Microsoft's Office Assistant, involve building terminology sets that help the intelligence work. An example would be helping an agent identify the difference between a cell in an Excel spreadsheet and a cell in a jail. Often terminology sets are built specifically for the information system, outlining all the synonyms and special meanings that a particular product uses. Indexing thought and practice come into play in the building of these terminology sets.
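As a rough illustration of what a terminology set might contain, the Python sketch below stores synonyms and context cues for the "cell" example and picks a meaning by counting cue words in the query. It does not describe how any real agent or the Office Assistant works; the data layout, the cue words, and the `pick_sense` function are assumptions made for the example.

```python
import re

# Hypothetical terminology set: each preferred term lists its possible
# meanings along with cue words that suggest each meaning.
SENSES = {
    "cell": [
        {"sense": "spreadsheet cell",
         "cues": {"excel", "spreadsheet", "row", "column", "formula"}},
        {"sense": "jail cell",
         "cues": {"jail", "prison", "inmate", "guard"}},
    ],
}

# Variant forms and synonyms mapped to the preferred term.
SYNONYMS = {"cells": "cell"}


def pick_sense(term, query, default=None):
    """Return the meaning of `term` whose cue words best match the query.

    Returns `default` when the term is unknown or no cue word appears.
    """
    words = set(re.findall(r"[a-z]+", query.lower()))
    preferred = SYNONYMS.get(term.lower(), term.lower())
    best, best_score = default, 0
    for candidate in SENSES.get(preferred, []):
        score = len(candidate["cues"] & words)
        if score > best_score:
            best, best_score = candidate["sense"], score
    return best


print(pick_sense("cell", "How do I merge a cell in my Excel spreadsheet?"))
print(pick_sense("cells", "How many cells does the jail have?"))
```

The first query resolves to the spreadsheet meaning and the second to the jail meaning. The code itself is trivial; the quality of the result depends on the synonyms and cue words chosen, which is where indexing judgment is needed.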

Automated indexing software builds a concordance, or a word list, from processed files. Although the manufacturers often claim these packages build indexes, the actual results are a list of words and phrases, sometimes useful in the beginning stages of building an index. Usability tests of these packages have shown that the word lists omit many key ideas and phrases, and cannot fine-tune terminology for easy retrieval or build the needed hierarchies of ideas that professional indexing can. Free-text search, also produced automatically by software, is useful in some environments, but tests have shown that retrieval is much better with a human-generated index. Wright Information owns software that will generate concordances, but doesn't use it for a finished index.
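To make the limitation concrete, here is a minimal Python sketch of a concordance builder. It is not modeled on any commercial package; the stop-word list, the function name, and the sample pages are invented for the example.

```python
import re
from collections import defaultdict

# A tiny, hypothetical stop-word list; real packages ship much longer ones.
STOP_WORDS = {"the", "before", "you", "to", "your", "as", "a", "an", "and", "of"}


def build_concordance(pages):
    """Map each word to the sorted list of page numbers where it occurs.

    `pages` is a dict of page number -> page text.
    """
    occurrences = defaultdict(set)
    for page_number, text in pages.items():
        for word in re.findall(r"[a-z]+", text.lower()):
            if word not in STOP_WORDS:
                occurrences[word].add(page_number)
    return {word: sorted(found) for word, found in sorted(occurrences.items())}


pages = {
    1: "Save the file before you close the workbook.",
    2: "To store your work, choose Save As.",
}

concordance = build_concordance(pages)
for word, locators in concordance.items():
    print(word, locators)
```

The output lists "save" on pages 1 and 2 and "store" on page 2 as unrelated words. Nothing in the list connects the two, and there is no entry such as "files, saving" because that phrase never appears in the text. Supplying those connections and concepts is the work a human indexer does.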

Abstracting and citation-control software aids in building abstracts with associated keywords. Wright Information uses ProCite for abstracting needs.

Taxonomy, thesaurus, and controlled-language software packages aid in building controlled languages and sets of keywords for metadata and web sites. Wright Information uses both MultiTes and TermTree software for these needs.

Web indexing software aids in building HTML web indexes. Wright Information uses a variety of proprietary tools and RoboHTML as needed by the client to build metadata sets, Web-based indexes, and compiled scripted Web indexes, such as those found in WebHelp.

For more information about software for indexing and taxonomy work, try these sources:

"Sources of Software" -- lists tools that go beyond basic back-of-the-book tools.

"Jan Wright's and Nancy Mulvany's Software Setups Compared" -- compares the software packages chosen and used by two leading indexers. Goes beyond just indexing tools.

"ASI's Software Tools Page" -- lists basic indexing tools and some add-ons.

Jan C. Wright
info@wrightinformation.com