Library functions should be named according to the following
convention.
Rules:
(1) The name should consist of at most five parts, in the order
<domain>_<adverb><verb><adjective><noun> (see Rules 8 and 10).
(2) The verb should reflect the action in a meaningful way, e.g. Draw,
Load, Save, Print. For performing string searches, the
verb should be Search. If we are aligning sequences, it should
be Align. If we are loading information from a file, it should
be Load. If we are creating a graphics file, it should be Draw.
If we require a generic verb (such as Create, Build,
Do, Compute, Make), we recommend Make be chosen.
For example, ReadRawFile loads a ``raw file''. This should be
changed to LoadRawFile. The function DrawHistogram creates a
histogram. This should be changed to MakeHistogram.
(3) The noun is typically the object of the verb when the action is
stated as a complete sentence. It will typically be a
type or structured type. If the routine works on a particular type,
then this type should be placed in the name as the noun.
For example, ReadDb should be changed to LoadDatabase.
(4) A noun should be chosen that represents the
generic object and is mathematical in nature, e.g. you would not choose
DrawLarson but instead DrawTree.
(5) The adjective should only be included when its absence would
fail to distinguish between the objectives of two or more routines.
There are currently two routines in Darwin to create phylogenetic
trees: DrawUnrootedTree (unrooted) and DrawTree (rooted).
We would choose the most generic mathematical name (Rule 4). This would lead
to the choices DrawRootedTree and DrawUnrootedTree.
(6) The first and only the first letter of each word should be
capitalised unless it is part of an abbreviation common in the
biology/biochemistry literature, e.g. DNA, RNA, PAM and maybe
AA. See list of abbreviations below.
(7) The adverb indicates a qualified action. For example,
currently the function ApprxTextSearch looks for an approximate
match of a string against a body of text. This should be changed to
ApproximateSearchString (Approximate is the adverb, Search is
the verb, String is the object type.)
(8) The domain is a special identifier used to indicate that the
routine that follows (i.e. the <adverb><verb><adjective><noun>)
applies to a special type of object. There are only two special
domains in Darwin: Inter Processor Communication, abbreviated to
IPC, and Nucleotide Peptide, abbreviated to NucPep.
The domain is essentially only a short form for the name. The
information could be encoded as adjectives and adverbs in the name of
the function, but this leads to very long names. As a concrete example,
the routines GlobalAlign, LocalAlign, LocalAlignBestPam etc. are used
to match amino acid sequences with amino acid sequences. Most of the
routines have analogs in the nucleotide/peptide matching arena. We
would rename LocalAlignBestPam to AlignBestPamMatch according to the
new naming conventions, but when we enter the nucleotide/peptide
setting this becomes AlignNucPepBestPamMatch, which is a little too
long (well over twenty letters) and rather cumbersome. Instead, we
suggest NucPep_AlignPamMatch for various reasons. Now the same function
name is retained except prefixed by the new domain. (It is still too
long but this requires that ShakeAlignBestPamMatch be abbreviated,
possibly ShakeAlignPamMatch.)
The same problems arise with the IPC protocols.
(9) Abbreviations should be avoided. When function names are too
long, the adverb and adjective should be the first to be abbreviated.
All abbreviations should follow the rule above.
(10) Underscore characters should be avoided except to separate
<domain> from the rest of the name. Of course, underscore characters
are needed for polymorphism but this poses no problem with our
conventions.
(11) Nouns should be singular.
(12) Functions which perform ``conversion'' require a bit of extra
attention. In the worst case, we might require the following
extension to the grammar for names.
<adverb><verb><adjective><noun>To<adverb><verb><adjective><noun>
For example, the Strings function takes an offset from DB[string] and
returns the associated sequence from the database. First we would
apply Rule 11 to change Strings to String. In the full form, this would
become OffsetToString. However, the type specification in the Strings
function is offset : integer, so reasonably intelligent readers will
immediately know that Strings takes an Offset and returns a String.
Here we may leave out the OffsetTo entirely. When any ambiguity arises,
we suggest the above <obj>To<obj> syntax.
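
The structural rules above (capitalisation in Rule 6, domains in
Rule 8, underscores in Rule 10) lend themselves to a mechanical check.
The following is a minimal sketch in Python, not part of Darwin itself:
the helper names split_words and check_name are hypothetical, the
abbreviation list contains only the examples named in Rule 6, and
underscores used for polymorphism are deliberately not handled. It
checks the form of a name only; it cannot tell whether the verb or noun
was well chosen.

```python
import re

# Assumptions for illustration: the two domains named in Rule 8 and the
# example abbreviations named in Rule 6. Extend both as needed.
DOMAINS = {"IPC", "NucPep"}
ABBREVIATIONS = ("DNA", "RNA", "PAM", "AA")

# A word is either a listed abbreviation (not followed by a lowercase
# letter) or one capital letter followed by lowercase letters.
_WORD = re.compile("(?:" + "|".join(ABBREVIATIONS) + ")(?![a-z])|[A-Z][a-z]+")


def split_words(body):
    """Split the part after the domain into words, or return None if
    some stretch of it is neither a capitalised word nor a listed
    abbreviation."""
    words, pos = [], 0
    while pos < len(body):
        match = _WORD.match(body, pos)
        if match is None:
            return None
        words.append(match.group())
        pos = match.end()
    return words


def check_name(name):
    """Return a list of rule violations found in name (empty if none)."""
    problems = []
    domain, sep, body = name.rpartition("_")
    if sep:
        if "_" in domain:
            problems.append("Rule 10: more than one underscore")
        elif domain not in DOMAINS:
            problems.append("Rule 8: unknown domain " + repr(domain))
    if not split_words(body):
        problems.append("Rule 6: each word needs exactly one leading capital")
    return problems


if __name__ == "__main__":
    for candidate in ["LoadRawFile", "NucPep_AlignPamMatch", "loadRawFile"]:
        print(candidate, check_name(candidate))
```

A name passes when check_name returns an empty list. The split into
words could also be used to count the parts of a name, although mapping
words to adverb, verb, adjective and noun would need a vocabulary that
this sketch does not attempt to supply.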