The software tools used to generate indexes come in many varieties. Which technique is used depends on variables such as budget, eventual reusability of the source material, translation needs, time constraints, the media used to publish the material, file sizes and file-transfer issues, and individual preferences.
There are several different methodologies for indexing and taxonomy development:
Standalone tools, usually used for back-of-the-book indexes,
allow indexers to work from page-numbered galleys. The indexing is completely
separate from the published material. Wright Information uses CINDEX
for these tasks, which allows formatting of RTF files and the generation
of a database of entries for translation purposes.
Embedding tools allow indexing codes to be embedded in the electronic
text of a book or file, and allow the index's locators to be updated
as the text changes. Indexers must work in the same files as the publishers.
Wright Information has expertise in FrameMaker, PageMaker and Microsoft
Tagging tools allow indexing codes to be embedded in the electronic
text after the indexing is complete. The indexer inserts numbered dummy
tags in the files, and then builds the index separately. The final step
uses macros to insert the indexing at each tag in the files. Many of
these tools are developed in-house to fit the publishing group's needs.
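The final macro step of a tagging workflow can be sketched as follows. This is only an illustration, not any publisher's actual tool: the `{{IDX:n}}` dummy-tag format, the `<index term="..."/>` embedded code, and the function name `insert_index_entries` are all assumptions made for the example.

```python
import re

# Hypothetical dummy-tag format: {{IDX:1}}, {{IDX:2}}, ... placed in the text.
TAG = re.compile(r"\{\{IDX:(\d+)\}\}")

def insert_index_entries(text, entries, marker='<index term="{}"/>'):
    """Replace each numbered dummy tag with the index codes built for it."""
    def replace(match):
        number = int(match.group(1))
        terms = entries.get(number, [])
        if not terms:
            # Leave tags with no entries in place so they can be reviewed.
            return match.group(0)
        return "".join(marker.format(term) for term in terms)
    return TAG.sub(replace, text)

# The indexer builds this mapping separately, keyed by tag number.
entries = {1: ["cells, spreadsheet"], 2: ["formulas", "calculations"]}
text = "Cells{{IDX:1}} hold values. Formulas{{IDX:2}} compute results."
result = insert_index_entries(text, entries)
print(result)
```

Keeping the entries in a separate mapping mirrors the workflow described above: the index is written away from the files, and only the last pass touches the publisher's text.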
Keywording is used primarily in online help materials. It can
be hard-coded jumps, similar to HTML jumps, or it can be inserted as
embedded coding and compiled into a list by the software. Wright Information
uses RoboHELP, RoboHTML, and other tools to keyword help files.
Weighted-text search tools, similar to the intelligence in agents
or Microsoft's Office Assistant, involve building terminology sets for
helping the intelligence work. An example would be helping an agent
identify the difference between a cell in an Excel spreadsheet and a
cell in a jail. Often terminology sets are built specifically for the
information system, outlining all the synonyms and special meanings
that a particular product uses. Indexing thought and practice come
into play in the building of these terminology sets.
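A terminology set of this kind can be pictured as a small lookup structure. The sketch below is a simplified assumption, not how any particular agent works: the `TERMINOLOGY` table, its cue words, and the `disambiguate` function are invented for illustration, and the scoring is a plain count of matching cue words rather than true weighting.

```python
# Hypothetical terminology set: each sense of an ambiguous term is listed
# with cue words that suggest that sense.
TERMINOLOGY = {
    "cell": {
        "spreadsheet cell": {"excel", "spreadsheet", "worksheet", "formula"},
        "jail cell": {"jail", "prison", "inmate", "guard"},
    }
}

def disambiguate(term, query):
    """Return the sense of `term` whose cue words best match the query."""
    words = set(query.lower().split())
    best_sense, best_score = None, 0
    for sense, cues in TERMINOLOGY.get(term, {}).items():
        score = len(words & cues)  # count of cue words present in the query
        if score > best_score:
            best_sense, best_score = sense, score
    return best_sense  # None when no cue word matches

print(disambiguate("cell", "format a cell in my excel worksheet"))
print(disambiguate("cell", "how big is a prison cell"))
```

The indexer's contribution is the table itself: choosing the senses, the synonyms, and the cue words that reflect how a product's users actually phrase things.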
Automated indexing software builds a concordance, or a word list,
from processed files. Although the manufacturers often claim these packages
build indexes, the actual results are a list of words and phrases, sometimes
useful in the beginning stages of building an index. Usability tests
of these packages have shown that the word lists omit many key ideas
and phrases, and cannot fine-tune terminology for easy retrieval, or
build the needed hierarchies of ideas that professional indexing can.
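To show what a concordance amounts to, here is a minimal sketch that builds a word list with page numbers. It is not modeled on any commercial package; the stopword list, the `pages` sample data, and the function name `build_concordance` are assumptions for the example.

```python
import re
from collections import defaultdict

# A tiny, hypothetical stopword list; real tools use much longer ones.
STOPWORDS = {"the", "a", "an", "and", "of", "in", "is", "to"}

def build_concordance(pages):
    """Map each non-stopword to the sorted page numbers where it appears."""
    found = defaultdict(set)
    for page_number, text in pages.items():
        for word in re.findall(r"[a-z]+", text.lower()):
            if word not in STOPWORDS:
                found[word].add(page_number)
    # Alphabetize the words and sort each list of page numbers.
    return {word: sorted(nums) for word, nums in sorted(found.items())}

pages = {1: "The cell holds a value.", 2: "The prisoner sat in a cell."}
concordance = build_concordance(pages)
for word, locators in concordance.items():
    print(word, locators)
```

In this sample, the spreadsheet cell on page 1 and the jail cell on page 2 both land under the single word "cell". The list records where a string occurs, not which idea is being discussed, which is the gap a human indexer fills.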
Free-text search, also produced automatically by software, is
useful in some environments, but tests have shown that retrieval is much
better with a human-generated index. Wright Information owns software
that will generate concordances, but doesn't use it for a finished index.
Taxonomy, thesaurus, and controlled-language software packages aid in building controlled
languages and sets of keywords for metadata and web sites. Wright Information
uses both MultiTes and TermTree software for these needs.
Web indexing software aids in building HTML web indexes. Wright Information uses a variety of proprietary tools and RoboHTML as needed by the client to build metadata sets, Web-based indexes, and compiled scripted Web indexes, such as those found in WebHelp.
For more information about software for indexing and taxonomy work, try these sources:
"Sources of Software" -- lists tools that go beyond basic back-of-the-book tools.
"Jan Wright's and Nancy Mulvany's Software Setups Compared" -- compares the software packages chosen and used by two leading indexers. Goes beyond just indexing tools.
"ASI's Software Tools Page" -- lists basic indexing tools and some add-ons.