What is a "Functions thesaurus"?

The term "thesaurus" has, unfortunately, been applied to various things which do not conform to the description of an information retrieval thesaurus as set out in the international standard for thesaurus construction, ISO 2788, and the USA national standard, ANSI/NISO Z39.19. One of these is the "functions thesaurus" developed in Australia and used primarily for classification of items in records management systems. The main example is Keyword AAA.

The problem is that a "functions thesaurus" like Keyword AAA is not really a thesaurus in the sense of these standards, despite a statement (p.6) that it complies with ISO 2788 "as much as is practicable". It might be considered to be a classification scheme or a scheme for constructing pre-coordinated alphabetical subject headings, and indeed the descriptive texts about it do not appear to recognise the distinction between these types of product, mixing the terms thesaurus and classification somewhat indiscriminately. It uses some of the terminology of the standards for thesaurus construction, but with meanings very different from those which the standards define, so that there is great potential for confusion.

This is not to decry Keyword AAA as a product, for it appears to be a perfectly valid scheme, widely used and well suited to the purposes for which it is intended; it is just misleading to call it a thesaurus. It is primarily a tool for constructing subject headings, for use in file titles or for browsing a sequence of items in an "alphabetico-classed" arrangement. It prescribes certain terms that should be used first in a subject heading string, and then says what subheadings you can use under these, to several levels. In this respect it is more akin to a system like Library of Congress Subject Headings than to a standard thesaurus.

The standards cited above are primarily concerned with inherent relationships between concepts. They would allow relationships such as:

agenda papers
BT documents

because "agenda papers" are inherently documents in any context. They are a "kind of" document, so the specific-generic relationship is valid.

A "functions thesaurus", however, might show the relationship

agenda papers
BT meetings

which is to be interpreted to mean that "agenda papers" may be used as a subheading under "meetings" when constructing an indexing string such as "MEETINGS - agenda papers". There is no implication that "agenda papers" are a "kind of" meeting.

This is very confusing for people who are familiar with, or in the process of learning about, standard thesauri, and is likely to cause problems if an attempt is made to use a thesaurus which does follow the standards to provide a wider range of subject indexing terms to supplement a "functions thesaurus".

A "functions thesaurus" emphasises its coverage of functions, which are the largest units of business activity in an organisation or jurisdiction. It uses three levels: "Keyword" (i.e. function), "Activity" and "Subject". These labels do not show the nature of the terms clearly, especially the first (the expression "keyword" is used with so many different meanings that I think it is best avoided altogether). In a standard thesaurus there is no need to label "levels" or assign terms to specific levels; doing so introduces problems because the level of a term is essentially arbitrary, and the level of a term or whole sub-tree can easily change if a broader term is inserted or deleted. The only significant level is that of top terms, and terms at this level can easily be distinguished by the fact that they have no broader terms.
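The point about levels can be illustrated with a short Python sketch. The terms and the `broader` mapping below are invented for illustration and are not taken from Keyword AAA or from either standard, and the sketch assumes for simplicity that each term has at most one broader term. It shows that top terms and levels can be derived from the BT links alone, and that a derived level shifts as soon as a broader term is inserted.

```python
# Hypothetical BT (broader term) links: narrower term -> broader term.
broader = {
    "agenda papers": "documents",
    "minutes": "documents",
    "accident report forms": "forms",
    "forms": "documents",
}

def top_terms(broader):
    """Return the terms that have no broader term."""
    terms = set(broader) | set(broader.values())
    return sorted(t for t in terms if t not in broader)

def level(term, broader):
    """Count BT steps from a term up to its top term (top terms are level 0)."""
    steps = 0
    while term in broader:
        term = broader[term]
        steps += 1
    return steps

print(top_terms(broader))               # ['documents']
print(level("agenda papers", broader))  # 1

# Inserting a new broader term changes the derived level
# although nothing about "agenda papers" itself has changed.
broader["agenda papers"] = "meeting papers"
broader["meeting papers"] = "documents"
print(level("agenda papers", broader))  # 2
```

Because the level is computed rather than stored, there is nothing to relabel when the hierarchy is edited.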

It is perfectly possible to create a standard thesaurus with good coverage of functions. Such a thesaurus could be used to create pre-coordinated indexing strings by grouping terms together in facets according to their type and specifying a citation order of facets in building index strings. If we interpret the Keyword AAA "levels" as "facets", we could have three facets: functions, activities and subjects, which could be combined into an indexing string with the following citation order

function : activity : subject

for example

fleet management : accidents : accident report forms

This is fine as a pre-coordinated indexing string, but as there is no inherent hierarchical relationship between function and activity, or between activity and subject, they should not be shown in the thesaurus as having a BT/NT relationship. This type of hierarchical relationship can apply only within a facet, so that one type of activity can be a narrower term of another type of activity. It is misleading to show accidents as a narrower term of fleet management, because accidents are events and not a kind of fleet management. Similarly, accident report forms are not a kind of accident; they are physical items that are used in connection with accidents.
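As a rough sketch of how facets and a citation order might be handled in practice, the Python fragment below assigns each term to a facet, builds an indexing string in citation order, and refuses BT/NT links that cross facets. The facet assignments, the function names and the extra term "forms" are assumptions made for illustration; placing "accidents" in the activity facet simply follows the reading of Keyword AAA levels as facets.

```python
# Hypothetical facet assignments for a handful of terms.
facet_of = {
    "fleet management": "function",
    "accidents": "activity",
    "accident report forms": "subject",
    "forms": "subject",
}

# Citation order of facets: function : activity : subject
CITATION_ORDER = ["function", "activity", "subject"]

def build_string(terms):
    """Arrange terms by the citation order of their facets and join them."""
    ordered = sorted(terms, key=lambda t: CITATION_ORDER.index(facet_of[t]))
    return " : ".join(ordered)

def valid_bt(narrower, broader_term):
    """Allow a BT/NT link only within one facet.

    This is a necessary check, not a sufficient one: two terms in the
    same facet still need a genuine "kind of" relationship.
    """
    return facet_of[narrower] == facet_of[broader_term]

print(build_string(["accidents", "accident report forms", "fleet management"]))
# fleet management : accidents : accident report forms
print(valid_bt("accident report forms", "forms"))  # True
print(valid_bt("accidents", "fleet management"))   # False
```

Keeping the citation order separate from the hierarchy means the same terms can be combined into strings without implying any BT/NT relationship between them.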

A pre-coordinate system is well suited to browsing and scanning a list arranged in a logical order. To use it for retrieving specific topics, a searcher has to construct a string with the components in the specified order, so that it matches the one constructed by the indexer. If the heading "fleet management : accidents" is used, someone searching for accidents would not find it unless the system provided some form of permuted or free text searching, in which case the sequence of terms in the string becomes irrelevant. Post-coordinate indexing, as more widely used in computer systems, would assign the two terms "fleet management" and "accidents" separately to the document, and a searcher would find it whether looking for one or both terms. In the latter case the search statement would be ("fleet management" AND "accidents").
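The difference between the two approaches can be sketched in a few lines of Python. The document identifier, the dictionaries and the function names are hypothetical, and the pre-coordinate lookup is deliberately naive (it matches only on the start of the heading, as a browsing sequence would).

```python
# Hypothetical index: the same document, indexed both ways.
precoordinated = {"doc1": "fleet management : accidents"}
postcoordinated = {"doc1": {"fleet management", "accidents"}}

def search_heading(query):
    """Pre-coordinate lookup: find headings that begin with the query string."""
    return [doc for doc, heading in precoordinated.items()
            if heading.startswith(query)]

def search_terms(*terms):
    """Post-coordinate lookup: boolean AND over separately assigned terms."""
    return [doc for doc, assigned in postcoordinated.items()
            if set(terms) <= assigned]

print(search_heading("accidents"))                    # []
print(search_heading("fleet management"))             # ['doc1']
print(search_terms("accidents"))                      # ['doc1']
print(search_terms("fleet management", "accidents"))  # ['doc1']
```

The first search fails because "accidents" is not the lead term of the heading, while the post-coordinate searches succeed regardless of term order.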

A thesaurus and a classification scheme can be closely related, and can be alternative presentations and arrangements of terms representing the same concepts, as in a "thesaurofacet". They are not the same thing, though, and it is unfortunate that the expression "functions thesaurus" has been used. The currently fashionable term "taxonomy" is used for a wide variety of schemes that have varying degrees of conformity to standards and principles of thesaurus and classification scheme construction, so it might be better to talk about a "functions taxonomy" - nobody agrees sufficiently on what exactly a taxonomy is, so they will not argue that this is not one.


