The Art of
ASSEMBLY LANGUAGE PROGRAMMING

Chapter Sixteen (Part 8)

Table of Content

Chapter Sixteen (Part 10) 

CHAPTER SIXTEEN:
PATTERN MATCHING (Part 9)
16.8.2 - Processing Dates

16.8.2 Processing Dates

Another useful program that converts English text to numeric form is a date processor. A date processor takes strings like "Jan 23, 1997" and converts it to three integer values representing the month, day, and year. Of course, while we're at it, it's easy enough to modify the grammar for date strings to allow the input string to take any of the following common date formats:

		Jan 23, 1997
		January 23, 1997
		23 Jan, 1997
		23 January, 1997
		1/23/97
		1-23-97
		1/23/1997
		1-23-1997

In each of these cases the date processing routines should store one into the variable month, 23 into the variable day, and 1997 into the year variable (we will assume all years are in the range 1900-1999 if the string supplies only two digits for the year). Of course, we could also allow dates like "January twenty-third, nineteen hundred and ninety seven" by using an number processing parser similar to the one presented in the previous section. However, that is an exercise left to the reader.

The grammar to process dates is

Date EngMon Integer Integer |

Integer EngMon Integer |

Integer / Integer / Integer |

Integer - Integer - Integer

EngMon JAN | JANUARY | FEB | FEBRUARY |  | DEC | DECEMBER
Integer digit Integer | digit
digit 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9

We will use some semantic rules to place some restrictions on these strings. For example, the grammar above allows integers of any size; however, months must fall in the range 1-12 and days must fall in the range 1-28, 1-29, 1-30, or 1-31 depending on the year and month. Years must fall in the range 0-99 or 1900-1999.

Here is the 80x86 code for this grammar:

; datepat.asm
;
; This program converts dates of various formats to a three integer
; component value- month, day, and year.

                .xlist
                .286
                include                 stdlib.a
                includelib              stdlib.lib
                matchfuncs
                .list
                .lall


dseg            segment para public 'data'

; The following three variables hold the result of the conversion.

month           word    0
day             word    0
year            word    0

; StrPtr is a double word value that points at the string under test.
; The output routines use this variable. It is declared as two word
; values so it is easier to store es:di into it.

strptr          word    0,0

; Value is a generic variable the ConvertInt routine uses

value           word    0



; Number of valid days in each month (Feb is handled specially)

DaysInMonth     byte    31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31



; Some sample strings to test the date conversion routines.

Str0            byte    "Feb 4, 1956",0
Str1            byte    "July 20, 1960",0
Str2            byte    "Jul 8, 1964",0
Str3            byte    "1/1/97",0
Str4            byte    "1-1-1997",0
Str5            byte    "12-25-74",0
Str6            byte    "3/28/1981",0
Str7            byte    "January 1, 1999",0
Str8            byte    "Feb 29, 1996",0
Str9            byte    "30 June, 1990",0
Str10           byte    "August 7, 1945",0
Str11           byte    "30 September, 1992",0
Str12           byte    "Feb 29, 1990",0
Str13           byte    "29 Feb, 1992",0



; The following grammar is what we use to process the dates
;
; Date  ->      EngMon Integer Integer
;       |       Integer EngMon Integer
;       |       Integer "/" Integer "/" Integer
;       |       Integer "-" Integer "-" Integer
;
; EngMon->      Jan | January | Feb | February | ... | Dec | December
; Integer->     digit integer | digit
; digit ->      0 | 1 | ... | 9
;
; Some semantic rules this code has to check:
;
; If the year is in the range 0-99, this code has to add 1900 to it.
; If the year is not in the range 0-99 or 1900-1999 then return an error.
; The month must be in the range 1-12, else return an error.
; The day must be between one and 28, 29, 30, or 31. The exact maximum
; day depends on the month.


separators      pattern {spancset, delimiters}


; DatePat processes dates of the form "MonInEnglish Day Year"

DatePat         pattern {sl_match2, EngMon, DatePat2, DayYear}
DayYear         pattern {sl_match2, DayInteger, 0, YearPat}
YearPat         pattern {sl_match2, YearInteger}

; DatePat2 processes dates of the form "Day MonInEng Year"

DatePat2        pattern {sl_match2, DayInteger, DatePat3, MonthYear}
MonthYear       pattern {sl_match2, EngMon, 0, YearPat}

; DatePat3 processes dates of the form "mm-dd-yy"

DatePat3        pattern {sl_match2, MonInteger, DatePat4, DatePat3a}
DatePat3a       pattern {sl_match2, separators, DatePat3b, DatePat3b}
DatePat3b       pattern {matchchar, '-', 0, DatePat3c}
DatePat3c       pattern {sl_match2, DayInteger, 0, DatePat3d}
DatePat3d       pattern {sl_match2, separators, DatePat3e, DatePat3e}
DatePat3e       pattern {matchchar, '-', 0, DatePat3f}
DatePat3f       pattern {sl_match2, YearInteger}

; DatePat4 processes dates of the form "mm/dd/yy"

DatePat4        pattern {sl_match2, MonInteger, 0, DatePat4a}
DatePat4a       pattern {sl_match2, separators, DatePat4b, DatePat4b}
DatePat4b       pattern {matchchar, '/', 0, DatePat4c}
DatePat4c       pattern {sl_match2, DayInteger, 0, DatePat4d}
DatePat4d       pattern {sl_match2, separators, DatePat4e, DatePat4e}
DatePat4e       pattern {matchchar, '/', 0, DatePat4f}
DatePat4f       pattern {sl_match2, YearInteger}


; DayInteger matches an decimal string, converts it to an integer, and
; stores the result away in the Day variable.

DayInteger      pattern {sl_match2, Integer, 0, SetDayPat}
SetDayPat       pattern {SetDay}

; MonInteger matches an decimal string, converts it to an integer, and
; stores the result away in the Month variable.

MonInteger      pattern {sl_match2, Integer, 0, SetMonPat}
SetMonPat       pattern {SetMon}

; YearInteger matches an decimal string, converts it to an integer, and
; stores the result away in the Year variable.


YearInteger     pattern {sl_match2, Integer, 0, SetYearPat}
SetYearPat      pattern {SetYear}


; Integer skips any leading delimiter characters and then matches a
; decimal string. The Integer0 pattern matches exactly the decimal
; characters; the code does a patgrab on Integer0 when converting
; this string to an integer.

Integer         pattern {sl_match2, separators, 0, Integer0}
Integer0        pattern {sl_match2, number, 0, Convert2Int}
number          pattern {anycset, digits, 0, number2}
number2         pattern {spancset, digits}
Convert2Int     pattern {ConvertInt}




; A macro to make it easy to declare each of the 24 English month
; patterns (24 because we allow the full month name and an
; abbreviation).

MoPat           macro   name, next, str, str2, value
                local SetMo, string, full, short, string2, doMon

name            pattern {sl_match2, short, next}
short           pattern {matchistr, string2, full, SetMo}
full            pattern {matchistr, string, 0, SetMo}

string          byte str
                byte    0

string2         byte    str2
                byte    0

SetMo           pattern {MonthVal, value}
                endm


; EngMon is a chain of patterns that match one of the strings
; JAN, JANUARY, FEB, FEBRUARY, etc. The last parameter to the
; MoPat macro is the month number.

EngMon          pattern {sl_match2, separators, jan, jan}
                MoPat   jan, feb, "JAN", "JANUARY", 1
                MoPat   feb, mar, "FEB", "FEBRUARY", 2
                MoPat   mar, apr, "MAR", "MARCH", 3
                MoPat   apr, may, "APR", "APRIL", 4
                MoPat   may, jun, "MAY", "MAY", 5
                MoPat   jun, jul, "JUN", "JUNE", 6
                MoPat   jul, aug, "JUL", "JULY", 7
                MoPat   aug, sep, "AUG", "AUGUST", 8
                MoPat   sep, oct, "SEP", "SEPTEMBER", 9
                MoPat   oct, nov, "OCT", "OCTOBER", 10
                MoPat   nov, decem, "NOV", "NOVEMBER", 11
                MoPat   decem, 0, "DEC", "DECEMBER", 12




; We use the "digits" and "delimiters" sets from the standard library.

                include stdsets.a

dseg            ends



cseg            segment para public 'code'
                assume  cs:cseg, ds:dseg


; ConvertInt-   Matches a sequence of digits and converts them to an integer.

ConvertInt      proc    far
                push    ds
                push    es
                push    di
                mov     ax, dseg
                mov     ds, ax

                lesi Integer0           ;Integer0 contains the decimal
                patgrab                 ; string we matched, grab that
                atou                    ; string and convert it to an
                mov     Value, ax       ; integer and save the result.
                free                    ;Free mem allocated by patgrab.

                pop     di
                mov     ax, di          ;Required by sl_match.
                pop     es
                pop     ds
                stc                     ;Always succeed.
                ret

ConvertInt      endp


; SetDay, SetMon, and SetYear simply copy value to the appropriate
; variable.

SetDay          proc    far
                push    ds
                mov     ax, dseg
                mov     ds, ax
                mov     ax, value
                mov     day, ax
                mov     ax, di
                pop     ds
                stc
                ret
SetDay          endp


SetMon          proc    far
                push    ds
                mov     ax, dseg
                mov     ds, ax
                mov     ax, value
                mov     Month, ax
                mov     ax, di
                pop     ds
                stc
                ret
SetMon          endp


SetYear         proc    far
                push    ds
                mov     ax, dseg
                mov     ds, ax
                mov     ax, value
                mov     Year, ax
                mov     ax, di
                pop     ds
                stc
                ret
SetYear         endp


; MonthVal is a pattern used by the English month patterns.
; This pattern function simply copies the matchparm field to
; the month variable (the matchparm field is passed in si).

MonthVal        proc    far
                push    ds
                mov     ax, dseg
                mov     ds, ax
                mov     Month, si
                mov     ax, di
                pop     ds
                stc
                ret
MonthVal        endp



; ChkDate-      Checks a date to see if it is valid. Returns with the
;               carry flag set if it is, clear if not.

ChkDate         proc    far
                push    ds
                push    ax
                push    bx

                mov     ax, dseg
                mov     ds, ax

; If the year is in the range 0-99, add 1900 to it.
; Then check to see if it's in the range 1900-1999.

                cmp     Year, 100
                ja      Notb100
                add     Year, 1900
Notb100:        cmp     Year, 2000
                jae     BadDate
                cmp     Year, 1900
                jb      BadDate

; Okay, make sure the month is in the range 1-12

                cmp     Month, 12
                ja      BadDate
                cmp     Month, 1
                jb      BadDate

; See if the number of days is correct for all months except Feb:

                mov     bx, Month
                mov     ax, Day                 ;Make sure Day <> 0.
                test    ax, ax
                je      BadDate
                cmp     ah, 0                   ;Make sure Day < 256.
                jne     BadDate

                cmp     bx, 2                   ;Handle Feb elsewhere.
                je      DoFeb
                cmp     al, DaysInMonth[bx-1]   ;Check against max val.
                ja      BadDate
                jmp     GoodDate

; Kludge to handle leap years. Note that 1900 is *not* a leap year.

DoFeb:          cmp     ax, 29                  ;Only applies if day is
                jb      GoodDate                ; equal to 29.
                ja      BadDate                 ;Error if Day > 29.
                mov     bx, Year                ;1900 is not a leap year
                cmp     bx, 1900                ; so handle that here.
                je      BadDate
                and     bx, 11b                 ;Else, Year mod 4 is a
                jne     BadDate                 ; leap year.

GoodDate:       pop     bx
                pop     ax
                pop     ds
                stc
                ret

BadDate:        pop     bx
                pop     ax
                pop     ds
                clc
                ret
ChkDate         endp


; ConvertDate-  ES:DI contains a pointer to a string containing a valid
;               date. This routine converts that date to the three
;               integer values found in the Month, Day, and Year
;               variables. Then it prints them to verify the pattern
;               matching routine.

ConvertDate     proc    near

                ldxi    DatePat
                xor     cx, cx
                match
                jnc     NoMatch

                mov     strptr, di              ;Save string pointer for
                mov     strptr+2, es            ; use by printf

                call    ChkDate                 ;Validate the date.
                jnc     NoMatch

                printf
                byte    "%-20^s = Month: %2d Day: %2d Year: %4d\n",0
                dword   strptr, Month, Day, Year
                jmp     Done

NoMatch:        printf
                byte    "Illegal date ('%^s')",cr,lf,0
                dword   strptr

Done:           ret
ConvertDate     endp




Main            proc
                mov     ax, dseg
                mov     ds, ax
                mov     es, ax

                meminit                         ;Init memory manager.

; Call ConvertDate to test several different date strings.

                lesi    Str0
                call    ConvertDate
                lesi    Str1
                call    ConvertDate
                lesi    Str2
                call    ConvertDate
                lesi    Str3
                call    ConvertDate
                lesi    Str4
                call    ConvertDate
                lesi    Str5
                call    ConvertDate
                lesi    Str6
                call    ConvertDate
                lesi    Str7
                call    ConvertDate
                lesi    Str8
                call    ConvertDate
                lesi    Str9
                call    ConvertDate
                lesi    Str10
                call    ConvertDate
                lesi    Str11
                call    ConvertDate
                lesi    Str12
                call    ConvertDate
                lesi    Str13
                call    ConvertDate


Quit:           ExitPgm
Main            endp

cseg            ends

sseg            segment para stack 'stack'
stk             db      1024 dup ("stack ")
sseg            ends

zzzzzzseg       segment para public 'zzzzzz'
LastBytes       db      16 dup (?)
zzzzzzseg       ends
                end     Main

Sample Output:

Feb 4, 1956 			= Month:  2 Day:  4 Year: 1956
July 20, 1960 			= Month:  7 Day: 20 Year: 1960
Jul 8, 1964 			= Month:  7 Day:  8 Year: 1964
1/1/97 				= Month:  1 Day:  1 Year: 1997
1-1-1997 			= Month:  1 Day:  1 Year: 1997
12-25-74 			= Month: 12 Day: 25 Year: 1974
3/28/1981 			= Month:  3 Day: 28 Year: 1981
January 1, 1999 		= Month:  1 Day:  1 Year: 1999
Feb 29, 1996 			= Month:  2 Day: 29 Year: 1996
30 June, 1990 			= Month:  6 Day: 30 Year: 1990
August 7, 1945 			= Month:  8 Day:  7 Year: 1945
30 September, 1992 		= Month:  9 Day: 30 Year: 1992
Illegal date ('Feb 29, 1990')
29 Feb, 1992 			= Month:  2 Day: 29 Year: 1992

Chapter Sixteen (Part 8)

Table of Content

Chapter Sixteen (Part 10) 

Chapter Sixteen: Pattern Matching (Part 9)
29 SEP 1996