In this chapter, we discuss the file handling routines available in Darwin. Darwin recognizes three types of files.
The first is the standard file, which contains Darwin programs, data structures, or variable assignments. Such files are accessed via the ReadProgram and ReadLibrary commands.
The second file type is the raw data file. It is sometimes convenient to load the entire contents of a file into an array so that it can be processed and formatted later (as we do in the chapter on Genetic Databases). These files are read via the ReadRawFile command or the readlines command. Extremely large files (for example, DNA flat files such as EMBL) are impractical to load in one piece and must instead be read line by line; for this purpose, Darwin offers several commands for establishing and reading from UNIX pipes.
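The difference between loading a raw file whole and reading it line by line through a pipe can be sketched outside Darwin. The following is a minimal illustration in Python, not Darwin code: it does not use ReadRawFile or any Darwin pipe command, and the function names (`read_whole_file`, `stream_lines_from_pipe`, `count_entries`), the use of `cat`, and the `ID` line prefix used to count entries are all assumptions made for the example.

```python
import os
import subprocess
import tempfile
from pathlib import Path


def read_whole_file(path):
    """Load the entire file into a list of lines (reasonable for small files)."""
    return Path(path).read_text().splitlines()


def stream_lines_from_pipe(command):
    """Yield lines one at a time from the standard output of a child process.

    Only the current line is held in memory, so the size of the underlying
    file does not matter.
    """
    with subprocess.Popen(command, stdout=subprocess.PIPE, text=True) as proc:
        for line in proc.stdout:
            yield line.rstrip("\n")


def count_entries(lines, prefix="ID"):
    """Count lines that start an entry (assumed here to begin with `prefix`)."""
    return sum(1 for line in lines if line.startswith(prefix))


if __name__ == "__main__":
    # Write a tiny stand-in for a flat file so the sketch is self-contained.
    with tempfile.NamedTemporaryFile("w", suffix=".dat", delete=False) as fh:
        fh.write("ID   A1\nSQ   acgt\n//\nID   B2\nSQ   ttga\n//\n")
        sample = fh.name

    # Small file: load everything, then process.
    print(count_entries(read_whole_file(sample)))

    # Large file: stream through a UNIX pipe, one line at a time.
    print(count_entries(stream_lines_from_pipe(["cat", sample])))

    os.remove(sample)
```

Both calls count the same entries; the streaming version simply never materializes the whole file, which is the property that matters for files of EMBL size.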
The third type of file is the genetic database file. A genetic database is a collection of sequence entries, each annotated with information about the sequence. The genetic database is a cornerstone of Darwin's sequence comparison operations and, as such, is stored in a special data structure that allows fast access. We defer discussion of this third type of file to the chapter on Genetic Databases.