 
                   A SIMPLE MACRO PROCESSOR
                   - ------ ----- ---------
 
 
                         JOHN R. RICE
                        CALVIN RIBBENS
                       COMPUTER SCIENCE
                      PURDUE UNIVERSITY
 
                       WILLIAM A. WARD
                        EXXON RESEARCH
 
 
 
                           ABSTRACT
                           --------
 
           THE DESIGN OBJECTIVE FOR THIS MACRO PROCESSOR
      IS  TO  BE  AS POWERFUL AS POSSIBLE AND YET REMAIN
      SIMPLE TO USE AND  IMPLEMENT.   IT  WAS  DEVELOPED
      PRIMARILY  TO  MANIPULATE  COMPUTER PROGRAMS WHERE
      THE PROCESSOR TAKES A SYMBOL TABLE PLUS A  PROGRAM
      TEMPLATE CONTAINING MACROS AND PRODUCES A SPECIFIC
      PROGRAM.  THIS APPROACH IS APPLIED  TO  THE  MACRO
      PROCESSOR  ITSELF;  THE  ALGORITHM  CONSISTS  OF A
      PORTABLE FORTRAN 66 VERSION OF THE PROCESSOR  PLUS
      A  PROGRAM  TEMPLATE  OF THE PROCESSOR.  THE MACRO
      PROCESSOR TEMPLATE MAY BE RUN THROUGH THE PORTABLE
      MACRO  PROCESSOR  TO PRODUCE A VERSION TAILORED TO
      THE LOCAL COMPUTING ENVIRONMENT.   IN  PARTICULAR,
      IT  IS EASY TO PRODUCE A FORTRAN 77 VERSION OF THE
      MACRO PROCESSOR.
 
 
 
 1.  THE MACRO PROCESSOR
     --- ----- ---------
 
      A MACRO PROCESSOR IS A TOOL TO SUBSTITUTE VALUES FROM A
 SYMBOL TABLE INTO A TEXT.  THUS, IF DATE HAS THE VALUE 'JULY
 10, 1983' AND PLACE HAS THE VALUE 'HONG KONG' THEN THE  TEXT
 FRAGMENT

      DATELINE: $DATE, $PLACE.  THIS OBSERVER...

 WOULD BE TRANSFORMED INTO

      DATELINE: JULY 10, 1983, HONG KONG.  THIS OBSERVER...

 SUBSTITUTION IS SIMPLE TO UNDERSTAND AND IMPLEMENT;
 COMPLEXITY IN A MACRO PROCESSOR ARISES FROM FACILITIES
 TO  CONTROL THE SUBSTITUTION.  SEE [COLE, 1976] FOR A SURVEY
 OF MACRO PROCESSORS; SOME ARE  ALMOST  COMPLETE  PROGRAMMING
 LANGUAGES.   THE  MACRO PROCESSOR PRESENTED HERE IS DESIGNED
 TO BE AS POWERFUL AS POSSIBLE WHILE REMAINING SIMPLE TO  USE
 AND  IMPLEMENT.  IT IS EXPRESSLY DESIGNED TO MANIPULATE FOR-
 TRAN CODE ALTHOUGH IT IS SUITABLE FOR GENERAL TEXT  PROCESS-
 ING.
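
      THE BASIC SUBSTITUTION STEP DESCRIBED ABOVE CAN BE
 SKETCHED IN A FEW LINES OF PYTHON.  THIS IS ONLY AN
 ILLUSTRATION AND NOT THE DISTRIBUTED FORTRAN CODE; THE
 FUNCTION NAME, THE USE OF A DICTIONARY AS THE SYMBOL TABLE,
 AND THE DECISION TO LEAVE UNDEFINED NAMES UNCHANGED ARE ALL
 ASSUMPTIONS MADE FOR THE EXAMPLE.

```python
import re

# Matches $(NAME) or $NAME.  Treating NAME as letters, digits or
# underscore is an assumption; the user's guide has the real rules.
PATTERN = re.compile(r"\$(?:\((\w+)\)|(\w+))")

def substitute(text, table):
    """Replace each $NAME or $(NAME) in text by its value in table."""
    def lookup(match):
        name = match.group(1) or match.group(2)
        # Assumption: an undefined name is left in the text untouched.
        return table.get(name, match.group(0))
    return PATTERN.sub(lookup, text)

table = {"DATE": "JULY 10, 1983", "PLACE": "HONG KONG"}
print(substitute("DATELINE: $DATE, $PLACE.  THIS OBSERVER...", table))
# prints: DATELINE: JULY 10, 1983, HONG KONG.  THIS OBSERVER...
```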
 
      THE TWO INGREDIENTS OF MACRO PROCESSING ARE THE  SYMBOL
 TABLE  AND  THE INPUT TEXT.  THIS PROCESSOR HAS A VERY SMALL
 INITIAL SYMBOL TABLE (MOSTLY CONSISTING OF PROCESSOR  OPTION
 SWITCHES)  SO  THE  INPUT  TEXT  CONTAINS THE INFORMATION TO
 BUILD THE SYMBOL TABLE.  THE FACILITIES ARE OF  FOUR  KINDS:
 (1)  SUBSTITUTION  OF  TEXT,  (2) MANIPULATION OF THE SYMBOL
 TABLE, (3) CONTROL OF THE SUBSTITUTION, AND (4) OTHERS (E.G.
 COMMENTS, PROCESSOR OPTIONS).  THE PROCESSOR IS KEYED TO TWO
 SPECIAL CHARACTERS: $, THE SUBSTITUTION PREFIX  AND  *,  THE
 DIRECTIVE  PREFIX.  THE INPUT CONSISTS OF LINES OF TEXT WITH
 PROCESSOR COMMANDS EMBEDDED IN THEM.  EACH  LINE  IS  FIRST
 SCANNED FOR SUBSTITUTIONS AND THESE ARE MADE.  THE LINE IS THEN
 SCANNED FOR DIRECTIVES (THE * MUST BE  THE  FIRST  NON-BLANK
 CHARACTER)  AND  THESE  ARE  EXECUTED.   IF  A  SUBSTITUTION
 INVOLVES MULTIPLE LINES  THEN  EACH  LINE  IS  PROCESSED  AS
 THOUGH IT WERE INPUT.  THIS ALLOWS FOR INDEFINITE NESTING OF
 SUBSTITUTIONS WHICH MAY INCLUDE CONTROL DIRECTIVES.
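
      THE PER-LINE PROCESSING ORDER JUST DESCRIBED (SUBSTITUTE
 FIRST, THEN LOOK FOR A DIRECTIVE, AND FEED MULTI-LINE VALUES
 BACK IN AS INPUT) MIGHT BE SKETCHED IN PYTHON AS FOLLOWS.
 THE SKETCH ONLY CLASSIFIES LINES AND DOES NOT EXECUTE ANY
 DIRECTIVE.  ALL NAMES IN IT ARE HYPOTHETICAL, AND STORING A
 MULTI-LINE VALUE AS ONE STRING WITH EMBEDDED NEWLINES IS AN
 ASSUMPTION, NOT THE STORAGE SCHEME OF THE FORTRAN VERSION.

```python
import re
from collections import deque

PATTERN = re.compile(r"\$(?:\((\w+)\)|(\w+))")

def substitute(line, table):
    """Replace $NAME and $(NAME); undefined names are left as they are."""
    return PATTERN.sub(
        lambda m: table.get(m.group(1) or m.group(2), m.group(0)), line)

def process(lines, table):
    """Classify each line as TEXT or DIRECTIVE after substitution."""
    pending = deque(lines)
    result = []
    while pending:
        line = substitute(pending.popleft(), table)
        parts = line.split("\n")
        if len(parts) > 1:
            # A multi-line value: put its lines back at the front of the
            # input so each one is processed as though it were input.
            pending.extendleft(reversed(parts))
            continue
        if line.lstrip().startswith("*"):   # * is the first non-blank
            result.append(("DIRECTIVE", line.strip()))
        else:
            result.append(("TEXT", line))
    return result

table = {"V": "1", "BODY": "X = $V\n*SET(V = 2)"}
for kind, text in process(["$BODY", "Y = $V"], table):
    print(kind, text)
```

 BECAUSE THE LINES OF BODY RE-ENTER THE INPUT, THE $V INSIDE
 BODY IS SUBSTITUTED AND THE *SET INSIDE BODY IS RECOGNIZED AS
 A DIRECTIVE, WHICH IS WHAT PERMITS NESTED SUBSTITUTIONS.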
 
      THE ALGORITHM CONTAINS A COMPLETE USER'S GUIDE FOR  THE
 MACRO PROCESSOR SO WE LIMIT FURTHER DESCRIPTION HERE TO A
 COMPACT TABULAR SUMMARY OF THE FACILITIES (TABLE 1) AND THE
 PROCESSOR OPTIONS (TABLE 2).
 
      THE PRINCIPAL DRAWBACKS TO A PORTABLE  MACRO  PROCESSOR
 IN  FORTRAN  ARE  (1) CHARACTERS MUST BE STORED ONE PER WORD
 AND (2) THE FORTRAN I/O PACKAGES ARE  USUALLY  VERY  INEFFI-
 CIENT.   THE INPUT/OUTPUT OF THE MACRO PROCESSOR IS ISOLATED
 IN THE SHORT ROUTINES IOERRM, IOLIST,  IOPAGE,  IORDLN,  AND
 IOWRLN.   THESE  MAY  BE REPLACED BY MORE EFFICIENT, MACHINE
 DEPENDENT ROUTINES WITHOUT MUCH DIFFICULTY.
 
      STORING ONE CHARACTER PER WORD  AND  USING  FORTRAN  66
 MAKE THE MACRO PROCESSOR INEFFICIENT IN SPACE.  THESE
 INEFFICIENCIES ARE NOT VERY SIGNIFICANT FOR SHORT TEXTS OR
 OCCASIONAL USE BUT BECOME IMPORTANT WITH HEAVY USE.  FOR THIS
 REASON A PROGRAM TEMPLATE OF THE MACRO PROCESSOR IS INCLUDED
 SO  THE  PORTABLE  FORTRAN  66 VERSION CAN PRODUCE A VERSION
 WHICH USES THE CHARACTER DATA TYPE FACILITIES OF FORTRAN 77.
 OTHER  TAILORING,  SUCH  AS RESETTING STANDARD UNIT NUMBERS,
 CAN BE MADE AT THE SAME TIME.  THE DETAILS OF THIS PROCEDURE
 ARE GIVEN IN THE USER'S MANUAL.
 
      TABLE 1 BELOW SUMMARIZES THE FACILITIES OF  THE  SIMPLE
 MACRO PROCESSOR.  TABLE 2 LISTS ITS OPTIONS.  THE NATURE AND
 USE OF THE PROCESSOR ARE ILLUSTRATED BY THE  SIMPLE  EXAMPLE
 APPLICATION IN THE NEXT SECTION.
 
 
 
       TABLE 1. SUMMARY OF MACRO PROCESSOR FACILITIES
       ----- -  ------- -- ----- --------- ----------
 
 1. TEXT SUBSTITUTION
 
 FACILITY                      DESCRIPTION
 
 $(NAME), $NAME                SUBSTITUTES VALUE OF NAME INTO TEXT
                               $(TYPE) A => REAL A OR INTEGER A
 
 $DEF(NAME)                    RETURNS .TRUE. OR .FALSE.  DEPENDING
                               ON  WHETHER  NAME IS DEFINED OR NOT.
                               USED FOR CONTROL IN *IF FACILITY.
 
 $LIST(NAME)                   SUBSTITUTES NEXT ITEM FROM LIST NAME
 
 *INCLUDE(NAME)                SUBSTITUTES LINES OF TEXT  OF  NAME.
                               SIMILAR  TO  $(NAME)  ON  A  LINE BY
                               ITSELF, BUT BEHAVES DIFFERENTLY WHEN
                               SUBSTITUTION FLAG IS OFF
 
 LABEL                         THIS IS A SPECIAL VARIABLE WHICH  IS
                               INCREMENTED  BY  1  EACH  TIME IT IS
                               ACCESSED.
                               *SET(MAINLOOP = LABEL)
                               *SET(EXIT     = LABEL)
                                     DO $MAINLOOP I = 1, $ITEMS
                                     ...
                                     GO TO $EXIT
                                     ...
                               $MAINLOOP CONTINUE
                               PRODUCES (FOR ITEMS = 200)
                                     DO 9004 I = 1, 200
                                     ...
                                     GO TO 9005
                                     ...
                               9004 CONTINUE
 
 2.  SYMBOL TABLE CONSTRUCTION AND MANIPULATION
 
 *SET(NAME1 = NAME2)           ASSIGNS VALUES TO NAME1 IN THE  SYM-
 *SET(NAME1 = 'LITERAL')       BOL  TABLE.   EXAMPLE TO SET SEVERAL
 *SET(NAME1 = INTEGER)         VALUES.
 *SET                          *SET
   ...                             MONTH = 'APRIL'
 *ENDSET                           DAY   = 20
                                   YEAR  = CURRENTYEAR
 *SET(NAME1)                   *ENDSET
   ...                         EXAMPLE TO SET VALUE TO SEVERAL LINES
 *ENDSET                       *SET(READTIME)
                                   IF(TIMER) CALL SECOND(TIME1)
                                   KTIME = KTIME+1
                                   TIME(KTIME) = TIME2-TIME1
                                   TIME1 = TIME2
                               *ENDSET
 
 *DELETE(NAME)                 REMOVE VARIABLE NAME FROM SYMBOL
                               TABLE
 
 *APPEND(NAME1, NAME2)         APPEND OR CONCATENATE TEXT TO NAME1.
 *APPEND(NAME1, 'LITERAL')     APPEND  IS  MUCH MORE EFFICIENT THAN
 *APPEND(NAME1)                *SET WHEN USED FOR  THE  SAME  TASK.
   ...                         MULTIPLE  LINES  MAY  BE APPENDED AS
 *ENDAPP                       FOLLOWS:
                               *APPEND(PROCESSACCOUNT)
                                    PRINT $LABELB, ACCOUNT, BALANCE
                               $LABELB FORMAT('ACCOUNT=',I8,
                                      A      /'BALANCE=',F12.2,
                                      B      /'ON $DAY $MONTH $YEAR')
                               *ENDAPP
                               ADDS FOUR LINES OF CODE  TO  PROCESS
                               AN ACCOUNT
 
 3. CONTROL
 
 *IF(LOGICAL)LINE              THE  TEXT  IN  LINE,  LINETRUE   AND
 *IF(LOGICAL)                  LINEFALSE  IS PROCESSED IF THE VALUE
     LINETRUE                  OF LOGICAL IS APPROPRIATE.   LOGICAL
   *ELSE                       CAN  BE  A LOGICAL CONSTANT (.TRUE.,
     LINEFALSE                 .FALSE.), A LOGICAL VARIABLE
 *ENDIF                        (INCLUDING $DEF(NAME)) OR EQUALITY
                               (NAME1 = NAME2, NAME1 = 'LITERAL',
                               NAME1 = INTEGER).  *IFS MAY BE
                               NESTED TO ANY DEPTH.
                               *IF(NOLIMIT) *SET(LIMIT = 1000)
                               *IF($DEF(LIMIT))
                                  *ELSE
                                     *SET(LIMIT = 1000)
                               *ENDIF
                               *IF(DEBUG) WRITE($(OUTPUT),66) X,Y,Z
                               *IF(ID = SUPERUSER)
                                  *SET(PRIORITY = HIGHESTPRIORITY)
                               *ENDIF
 
 *DO(NAME = I1,I2,I3)          DO-LOOP MUCH  AS  IN  FORTRAN.  NAME
    ...                        ASSUMES  INTEGER  VALUES  SO $(NAME)
 *ENDDO                        BECOMES 12, SAY, IN THE  TEXT.   THE
                               RANGE SPECIFICATIONS MUST BE INTEGER
                               LITERALS OR VARIABLES  WITH  INTEGER
                               VALUES.
                               *DO (K = 1, NLIST, 3)
                                  $(K), $LIST(A) $LIST(A)-$LIST(A)
                               *ENDDO
                               PRODUCES  (FOR   NLIST   =   9   AND
                               APPROPRIATE VALUES IN A)
                                  1, BIOLOGY 200-299
                                  4, MATHEMA 100-299
                                  7, PHYSICS 110-320
 
 4. OTHER
 
 *COMMENT                      COMMENT LINES. NO SUBSTITUTIONS  ARE
    ...                        MADE OR DIRECTIVES PROCESSED IN COM-
 *ENDCOM                       MENTS.
 
 *END                          TERMINATE PROCESSING (END-OF-FILE)
 
 *RESET(NAME)                  RESET  POINTER  FOR  LIST  NAME   TO
                               BEGINNING OF LIST
 
 *OPTIONS(NAME1 = NAME2)       SET MACRO  PROCESSOR  OPTION  NAME1.
 *OPTIONS(NAME1 = 'LITERAL')   NAME2   OR   LITERAL   MUST   BE  AN
                               APPROPRIATE VALUE. THE OPTIONS  WITH
                               POSSIBLE  NAME1  VALUES AND DEFAULTS
                               ARE GIVEN IN TABLE 2.
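
      THE BEHAVIOR OF THE SPECIAL VARIABLE LABEL IN TABLE 1 CAN
 BE PICTURED AS A COUNTER THAT ADVANCES ON EVERY ACCESS.  THE
 PYTHON SKETCH BELOW IS AN ILLUSTRATION ONLY: THE CLASS NAME IS
 HYPOTHETICAL, THE STARTING VALUE 9004 IS CHOSEN SIMPLY TO
 MATCH THE EXAMPLE IN THE TABLE, AND RETURNING THE VALUE BEFORE
 INCREMENTING IT IS AN ASSUMPTION.

```python
class SymbolTable(dict):
    """Dictionary whose LABEL entry advances by 1 on every access."""

    def __init__(self, first_label=9004):
        super().__init__()
        # Assumption: the starting label value is configurable.
        self._next_label = first_label

    def __getitem__(self, name):
        if name == "LABEL":
            value = self._next_label
            self._next_label += 1      # each access consumes one label
            return str(value)
        return super().__getitem__(name)

table = SymbolTable()
table["MAINLOOP"] = table["LABEL"]     # like *SET(MAINLOOP = LABEL)
table["EXIT"] = table["LABEL"]         # like *SET(EXIT     = LABEL)
print(table["MAINLOOP"], table["EXIT"])
# prints: 9004 9005
```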
 
 
 
       TABLE 2.  MACRO PROCESSOR OPTIONS
       ----- -   ----- --------- -------
 
  NAME    DEFAULT   DEFINITION
 
 CDIR        *      DIRECTIVE PREFIX CHARACTER
 CEOL        -      END-OF-LINE MARKER IS $-
 CEOR        /      LIST ITEM SEPARATOR IS $/
 CONC        +      CONTINUATION PREFIX CHARACTER
 CSUB        $      SUBSTITUTION PREFIX CHARACTER
 ICPLI      72      CHARACTERS PER LINE OF INPUT
 ICPLO      72      CHARACTERS PER LINE OF OUTPUT
 IUNITI      5      INPUT UNIT NUMBER
 LBREAK   .TRUE.    SWITCH TO BREAK OUTPUT AT NICE CHARACTER
 LCOL1    .TRUE.    ONLY CHECK COLUMN 1 FOR CDIR
 LFORT    .TRUE.    WRITE LINES WITH FORTRAN CONTINUATION
 LISTI    .TRUE.    LIST INPUT
 LISTO    .TRUE.    LIST OUTPUT
 LSUB     .TRUE.    PROCESS SUBSTITUTIONS AFTER THIS POINT
 L1TRIP   .TRUE.    USE ONE-TRIP DO-LOOPS
 
 
 
 2. APPLICATIONS
    ------------
 
      THIS MACRO PROCESSOR IS POWERFUL ENOUGH TO BE  APPLICA-
 BLE TO A WIDE RANGE OF TYPICAL MACRO PROCESSOR APPLICATIONS.
 THESE RANGE FROM PROCESSING SIMPLE FORM LETTERS  TO  COMPLEX
 "INSTRUMENTATIONS"  OF PROGRAMS AND TEXTS.  THE PROCESSOR IS
 TUNED TO FORTRAN IN SEVERAL WAYS  (E.G.  IT  HAS  A  SPECIAL
 VARIABLE  LABEL FOR CREATING FORTRAN LABELS) AND IS TARGETED
 TO FORTRAN CODE MANIPULATION.  TYPICAL APPLICATIONS INCLUDE:
 
      (1) IMPLEMENTATIONS OF VERY HIGH  LEVEL  LANGUAGES  VIA
 FORTRAN  PREPROCESSORS.   THESE  PREPROCESSORS HAVE TWO COM-
 PONENTS: LANGUAGE PARSING AND CODE GENERATION.  THE LANGUAGE
 PARSER  SAVES  VALUES IN A SYMBOL TABLE WHICH DEFINE WHAT IS
 TO BE DONE; THESE ARE THEN MERGED WITH  THE  TEMPLATE  OF  A
 FORTRAN  PROGRAM TO GENERATE THE SPECIFIC FORTRAN CODE.  THE
 MACRO PROCESSOR CAN IMPLEMENT THIS SECOND  COMPONENT.   SOME
 SUBSTANTIAL LANGUAGES HAVE BEEN IMPLEMENTED USING THIS MACRO
 PROCESSOR.
 
      (2) TAILORING PROGRAMS  TO  SPECIFIC  ENVIRONMENTS.   A
 FORTRAN PROGRAM CAN BE PUT INTO A TEMPLATE WITH MANY "PARAM-
 ETERS" TO BE INSERTED FOR A SPECIFIC VERSION.  THESE PARAME-
 TERS  MAY  RANGE  FROM  SOMETHING  SIMPLE  LIKE THE I/O UNIT
 NUMBERS OR THE  DIMENSIONS  OF  CERTAIN  ARRAYS  TO  COMPLEX
 THINGS  LIKE  WHOLE SUBROUTINES FOR SPECIFIC ENVIRONMENTS OR
 CHANGING THE PROGRAM TYPE, E.G. FROM REAL TO DOUBLE PRECISION.
 THE FOLLOWING EXAMPLE ILLUSTRATES THIS TYPE OF APPLICATION.
 
      CONSIDER THE LINPACK ROUTINES TO  FACTOR  AND  SOLVE  A
 SYSTEM  OF LINEAR EQUATIONS.  WE WANT TO BE ABLE TO CREATE A
 SPECIFIC PROGRAM WITH THE FOLLOWING OPTIONS:
 
      1.   THE CODE MAY BE SINGLE OR DOUBLE PRECISION,
 
      2.   THE MATRIX CONDITION NUMBER MAY BE ESTIMATED,
 
      3.   A RIGHT SIDE MAY BE READ  AND  THE  LINEAR  SYSTEM
           SOLVED.
 
 A PROGRAM TEMPLATE FOR THIS FOLLOWS:
 
 *IF (TYPE = 'SINGLE')
       *SET ( DECL = 'REAL')
       *SET ( PREFIX = 'S' )
 *ELSE
       *SET ( DECL = 'DOUBLE PRECISION' )
       *SET ( PREFIX = 'D' )
 *ENDIF
       $DECL A($N,$N)
 *IF (CONDNO)
       $DECL RCOND, WORK($N)
 *ENDIF
 *IF (SOLVE)
       $DECL B($N)
 *ENDIF
       INTEGER IPVT($N)
       READ(5,*) A
 *IF (CONDNO)
       CALL $(PREFIX)GECO (A, $N, $N, IPVT, RCOND, WORK)
       WRITE(6,*) RCOND
 *ELSE
       CALL $(PREFIX)GEFA (A, $N, $N, IPVT, INFO)
 *ENDIF
 *IF (SOLVE)
       READ(5,*) B
       CALL $(PREFIX)GESL (A, $N, $N, IPVT, B, 0)
       WRITE(6,*) B
 *ENDIF
       STOP
       END
 *END
 
 WE SEE THAT THE CODE IS PARAMETERIZED BY THE VARIABLES
 
       DECL   = FORTRAN DECLARATION KEYWORD
       PREFIX = LINPACK SUBROUTINE NAME PREFIX CHARACTER
       CONDNO = SWITCH FOR CONDITION NUMBER
       SOLVE  = SWITCH FOR SOLVING LINEAR SYSTEM
       TYPE   = VARIABLE FOR SINGLE OR DOUBLE PRECISION
       N      = ORDER OF THE MATRIX
 
 IF THE PROGRAM TEMPLATE IS PRECEDED BY  THE  MACRO  INSTRUC-
 TIONS
 
       *SET
           TYPE   = 'SINGLE'
           CONDNO = .FALSE.
           SOLVE  = .TRUE.
           N      = 10
       *ENDSET
 
 THEN THE MACRO PROCESSOR PRODUCES THE PROGRAM
 
       REAL A(10,10)
       REAL B(10)
       INTEGER IPVT(10)
       READ(5,*) A
       CALL SGEFA (A, 10, 10, IPVT, INFO)
       READ(5,*) B
       CALL SGESL (A, 10, 10, IPVT, B, 0)
       WRITE(6,*) B
       STOP
       END
 
 IF THE MACRO INSTRUCTIONS ARE CHANGED TO  TYPE  =  'DOUBLE',
 CONDNO = .TRUE., SOLVE = .FALSE. AND N=5 THEN THE MACRO PRO-
 CESSOR PRODUCES THE PROGRAM
 
       DOUBLE PRECISION A(5,5)
       DOUBLE PRECISION RCOND, WORK(5)
       INTEGER IPVT(5)
       READ(5,*) A
       CALL DGECO (A, 5, 5, IPVT, RCOND, WORK)
       WRITE(6,*) RCOND
       STOP
       END
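
      THE EXPANSION OF THE LINPACK TEMPLATE CAN BE REPRODUCED
 WITH A SMALL PYTHON SKETCH.  IT HANDLES ONLY THE DIRECTIVES
 USED IN THIS EXAMPLE (*IF, *ELSE, *ENDIF, THE ONE-LINE *SET
 WITH A LITERAL, AND *END) AND IS NOT THE DISTRIBUTED FORTRAN
 PROCESSOR.  THE FUNCTION NAME, THE REGULAR EXPRESSIONS, AND
 THE USE OF PYTHON BOOLEANS FOR .TRUE. AND .FALSE. ARE
 ASSUMPTIONS MADE FOR ILLUSTRATION.

```python
import re

SUB = re.compile(r"\$(?:\((\w+)\)|(\w+))")
SET = re.compile(r"\*SET\s*\(\s*(\w+)\s*=\s*'([^']*)'\s*\)$")
IF = re.compile(r"\*IF\s*\(\s*(\w+)\s*(?:=\s*'([^']*)')?\s*\)$")

TEMPLATE = """\
*IF (TYPE = 'SINGLE')
      *SET ( DECL = 'REAL')
      *SET ( PREFIX = 'S' )
*ELSE
      *SET ( DECL = 'DOUBLE PRECISION' )
      *SET ( PREFIX = 'D' )
*ENDIF
      $DECL A($N,$N)
*IF (CONDNO)
      $DECL RCOND, WORK($N)
*ENDIF
*IF (SOLVE)
      $DECL B($N)
*ENDIF
      INTEGER IPVT($N)
      READ(5,*) A
*IF (CONDNO)
      CALL $(PREFIX)GECO (A, $N, $N, IPVT, RCOND, WORK)
      WRITE(6,*) RCOND
*ELSE
      CALL $(PREFIX)GEFA (A, $N, $N, IPVT, INFO)
*ENDIF
*IF (SOLVE)
      READ(5,*) B
      CALL $(PREFIX)GESL (A, $N, $N, IPVT, B, 0)
      WRITE(6,*) B
*ENDIF
      STOP
      END
*END
"""

def expand(template, table):
    """Expand template with the given symbol table; return output lines."""
    table = dict(table)            # work on a copy
    out = []
    active = [True]                # active[-1]: is the current block emitted?
    for raw in template.splitlines():
        # Substitutions are made first, then the line is checked for *.
        line = SUB.sub(
            lambda m: str(table.get(m.group(1) or m.group(2), m.group(0))),
            raw)
        word = line.strip()
        if not word.startswith("*"):       # ordinary text line
            if active[-1]:
                out.append(line)
        elif word == "*END":
            break
        elif word == "*ELSE":
            was_true = active.pop()
            active.append(active[-1] and not was_true)
        elif word == "*ENDIF":
            active.pop()
        elif SET.match(word):
            name, value = SET.match(word).groups()
            if active[-1]:
                table[name] = value
        elif IF.match(word):
            name, literal = IF.match(word).groups()
            value = table.get(name)
            test = (value == literal) if literal is not None else (value is True)
            active.append(active[-1] and test)
        else:
            raise ValueError("unsupported directive: " + word)
    return out

settings = {"TYPE": "SINGLE", "CONDNO": False, "SOLVE": True, "N": 10}
print("\n".join(expand(TEMPLATE, settings)))
```

 WITH THESE SETTINGS THE SKETCH PRINTS THE TEN-LINE SINGLE
 PRECISION PROGRAM SHOWN ABOVE; CHANGING THE DICTIONARY TO
 TYPE = 'DOUBLE', CONDNO TRUE, SOLVE FALSE AND N = 5 GIVES THE
 DOUBLE PRECISION VERSION.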
 
 3. DISTRIBUTED MATERIAL
    ----------- --------
 
      THE ALGORITHM CONSISTS OF THE FOLLOWING FILES:
 
      (1)  PORTABLE, FORTRAN 66 VERSION OF THE MACRO  PROCES-
           SOR
 
      (2)  TEXT OF THIS PAPER
 
      (3)  USER'S GUIDE FOR THE MACRO PROCESSOR
 
      (4)  MACRO PROCESSOR TEMPLATE
 
      (5)  TEST CASES
 
           A.   EXHAUSTIVE TEST OF ALL FACILITIES
 
           B.   FORM LETTER TO AUTHORS TO REPORT PROBLEMS
 
           C.   THE LINPACK EXAMPLE GIVEN ABOVE
 
           D.   THE SIMPLE EXAMPLES FROM THE USER'S GUIDE
 
           E.   A COMPLEX EXAMPLE: THE  ELLPACK  SYSTEM  TEM-
                PLATE
 
 
 
 INSTALLERS SHOULD NOTE THAT  ROUTINES  UTCHKA,  UTCHKN,  AND
 UTCHKS  MAY HAVE TO BE MODIFIED IF THE PROCESSOR IS TAILORED
 BY SETTING TESTCH = .FALSE., AND IF THE DIGITS 0  TO  9  AND
 THE  LETTERS  A  TO Z ARE NONCONTIGUOUS IN THE CHARACTER SET
 USED.
 
 THIS WORK WAS SUPPORTED IN PART BY NSF GRANT MCS-79763L0
 
 
 4. REFERENCES
    ----------
 
 A.J. COLE, MACRO PROCESSORS, CAMBRIDGE UNIVERSITY PRESS,
      CAMBRIDGE, ENGLAND, 1976.
 
