Tokenising Requirements
Overview
After the conditions have been processed and adjusted via our regex algorithm, the next step is to parse each condition into a list of tokens that our algorithm will then read from. Below is a rough overview of the logic involved in this step, with examples.
An opening and a closing bracket are added around every condition
Split on (, ), &&, ||
Split on keywords
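The three steps above can be sketched as a single pass with Python's re module. This is a hypothetical illustration, not the project's actual implementation: it wraps the condition in brackets, then walks over every occurrence of (, ), &&, || or the keyword "in", emitting the text between delimiters as its own token.

```python
import re

def tokenise(condition: str) -> list[str]:
    # Step 1: add an opening and closing bracket around the condition.
    condition = f"({condition})"
    # Step 2 & 3: delimiters are brackets, &&, ||, and the keyword "in".
    # \b around "in" stops it matching inside codes like "SENG1000".
    pattern = r"\(|\)|&&|\|\||\bin\b"
    tokens = []
    pos = 0
    for match in re.finditer(pattern, condition):
        # Anything between the previous delimiter and this one is a
        # token of its own (e.g. a course code or a UOC requirement).
        before = condition[pos:match.start()].strip()
        if before:
            tokens.append(before)
        tokens.append(match.group())
        pos = match.end()
    tail = condition[pos:].strip()
    if tail:
        tokens.append(tail)
    return tokens
```

Running the complex "in" example through this sketch yields the same token list as shown below, including the outer brackets added in step 1.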
Simple ||
Original: COMP1511 || DPST1091 || COMP1911 || COMP1917
Tokenised: [(, COMP1511, ||, DPST1091, ||, COMP1911, ||, COMP1917, )]
Notes: Split on ||
Simple &&
Original: COMP1511 && DPST1091 && COMP1911 && COMP1917
Tokenised: [(, COMP1511, &&, DPST1091, &&, COMP1911, &&, COMP1917, )]
Notes: Split on &&
Complex &&, ||, (, )
Original: (MMAN2400 || ENGG2400) && (MMAN2100 || DESN2000)
Tokenised: [(, MMAN2400, ||, ENGG2400, ), &&, (, MMAN2100, ||, DESN2000, )]
Notes: Split on all the key tokens
Simple "in"
Original: 24UOC in COMP
Tokenised: [(, 24UOC, in, COMP, )]
Notes: Treat "in" as a keyword
Complex "in"
Original: 96UOC in (COMP || SENG || MATH) && COMP1511 && COMP1521
Tokenised: [(, 96UOC, in, (, COMP, ||, SENG, ||, MATH, ), &&, COMP1511, &&, COMP1521, )]
Notes: Treat "in" as a keyword