tokeniser
Differences
This shows you the differences between two versions of the page.
| Next revision | Previous revision | ||
| tokeniser [2018/03/31 13:19] – external edit 127.0.0.1 | tokeniser [2024/01/05 00:21] (current) – external edit 127.0.0.1 | ||
|---|---|---|---|
| Line 2: | Line 2: | ||
| //by JGH, June 2006//\\ \\ BBC BASIC programs are tokenised, that is, BASIC keywords are stored as one-byte values. This results in programs which execute faster and are more compact.\\ \\ A tokenised line can easily be detokenised, | //by JGH, June 2006//\\ \\ BBC BASIC programs are tokenised, that is, BASIC keywords are stored as one-byte values. This results in programs which execute faster and are more compact.\\ \\ A tokenised line can easily be detokenised, | ||
| + | <code bb4w> | ||
| quote%=FALSE | quote%=FALSE | ||
| REPEAT | REPEAT | ||
| Line 8: | Line 9: | ||
| addr%=addr%+1 | addr%=addr%+1 | ||
| UNTIL ?addr%=13 | UNTIL ?addr%=13 | ||
| + | </ | ||
| Tokenising, however, is more fiddly. Tokens can be abbreviated on entry and characters are only tokenised at certain parts of the line. For instance, in the following line: | Tokenising, however, is more fiddly. Tokens can be abbreviated on entry and characters are only tokenised at certain parts of the line. For instance, in the following line: | ||
| + | <code bb4w> | ||
| ON NOON GOTO 1,2 | ON NOON GOTO 1,2 | ||
| + | </ | ||
| the first ' | the first ' | ||
| ==== In Windows BASIC: ==== | ==== In Windows BASIC: ==== | ||
| + | <code bb4w> | ||
| B%=EVAL(" | B%=EVAL(" | ||
| token$=$(!332+2) | token$=$(!332+2) | ||
| + | </ | ||
| This code may fail if an event interrupt (e.g. ON TIME) occurs between the two statements. To avoid this, use the following alternative, which (in //BBC BASIC for Windows// version 6 only) does not allow an intervening interrupt: | This code may fail if an event interrupt (e.g. ON TIME) occurs between the two statements. To avoid this, use the following alternative, which (in //BBC BASIC for Windows// version 6 only) does not allow an intervening interrupt: | ||
| + | <code bb4w> | ||
| IF EVAL(" | IF EVAL(" | ||
| + | </ | ||
| The input and output share the same memory buffer, which is OK so long as the tokenising process shortens the code (which is almost always the case) but can cause a crash if it lengthens the code. That can happen only in exceptional circumstances such as the following code: | The input and output share the same memory buffer, which is OK so long as the tokenising process shortens the code (which is almost always the case) but can cause a crash if it lengthens the code. That can happen only in exceptional circumstances such as the following code: | ||
| + | <code bb4w> | ||
| ON A% GOTO 10, | ON A% GOTO 10, | ||
| + | </ | ||
| The tokenising process encodes the line numbers in a special internal format, which increases the overall length from 25 to 31 bytes. To reduce the chance of this causing a crash, the tokenising routine can be adapted as follows: | The tokenising process encodes the line numbers in a special internal format, which increases the overall length from 25 to 31 bytes. To reduce the chance of this causing a crash, the tokenising routine can be adapted as follows: | ||
| + | <code bb4w> | ||
| IF EVAL(" | IF EVAL(" | ||
| + | </ | ||
| \\ | \\ | ||
| ==== In ARM BASIC: ==== | ==== In ARM BASIC: ==== | ||
| + | <code bb4w> | ||
| SYS " | SYS " | ||
| B%=EVAL(" | B%=EVAL(" | ||
| token$=$(A%-14) | token$=$(A%-14) | ||
| + | </ | ||
| \\ | \\ | ||
| ==== In 6502 BASIC: ==== | ==== In 6502 BASIC: ==== | ||
| + | <code bb4w> | ||
| A%=EVAL(" | A%=EVAL(" | ||
| token$=$((!4 AND & | token$=$((!4 AND & | ||
| + | </ | ||
| \\ By preceding the code you want to tokenise with " | \\ By preceding the code you want to tokenise with " | ||
| + | <code bb4w> | ||
| DEF FNTokenise_Win(A$): | DEF FNTokenise_Win(A$): | ||
| WHILELEFT$(A$, | WHILELEFT$(A$, | ||
| Line 40: | Line 57: | ||
| DEF FNTokenise_65(A$): | DEF FNTokenise_65(A$): | ||
| A%=EVAL(" | A%=EVAL(" | ||
| + | </ | ||
| \\ These functions are used in full in the ' | \\ These functions are used in full in the ' | ||
| + | <code bb4w> | ||
| in%=OPENIN(text$) | in%=OPENIN(text$) | ||
| out%=OPENOUT(basic$) | out%=OPENOUT(basic$) | ||
| Line 54: | Line 73: | ||
| CLOSE# | CLOSE# | ||
| CLOSE# | CLOSE# | ||
| + | </ | ||
| \\ | \\ | ||
| ==== Notes ==== | ==== Notes ==== | ||
| - | Acorn BBC BASIC programs are stored slightly differently. See [[/ | + | Acorn BBC BASIC programs are stored slightly differently. See [[/ |
| + | <code bb4w> | ||
| B%=EVAL(" | B%=EVAL(" | ||
| token$=$(!332+3) | token$=$(!332+3) | ||
| + | </ | ||
| \\ | \\ | ||
| ==== See also ==== | ==== See also ==== | ||
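
The page notes that tokenising can lengthen a line because line numbers are stored in a special internal format. The following is a minimal sketch in Python (not BBC BASIC) of that encoding, assuming the layout used by Acorn 6502 BBC BASIC: the token &8D followed by three bytes. Whether //BBC BASIC for Windows// uses exactly this layout is an assumption here, and the names `encode_line_number` and `decode_line_number` are hypothetical helpers, not part of any BASIC.

```python
LINE_TOKEN = 0x8D  # assumed token byte introducing an encoded line number


def encode_line_number(n: int) -> bytes:
    """Encode a line number as LINE_TOKEN plus three bytes (assumed Acorn layout)."""
    if not 0 <= n <= 0xFFFF:
        raise ValueError("line number out of range")
    lo, hi = n & 0xFF, n >> 8
    # The top two bits of each half are packed into the first byte and
    # scrambled with EOR &54, so all three bytes stay in the printable range.
    top = (((lo & 0xC0) >> 2) | ((hi & 0xC0) >> 4)) ^ 0x54
    return bytes([LINE_TOKEN, top, (lo & 0x3F) | 0x40, (hi & 0x3F) | 0x40])


def decode_line_number(data: bytes) -> int:
    """Reverse encode_line_number; expects exactly four bytes."""
    if len(data) != 4 or data[0] != LINE_TOKEN:
        raise ValueError("not an encoded line number")
    top = data[1] ^ 0x54
    lo = (data[2] & 0x3F) | ((top & 0x30) << 2)
    hi = (data[3] & 0x3F) | ((top & 0x0C) << 4)
    return (hi << 8) | lo


if __name__ == "__main__":
    for text in ("1", "10", "1000"):
        encoded = encode_line_number(int(text))
        print(text, len(text), "->", len(encoded), "bytes", encoded.hex())
```

Under this assumption every line number occupies four bytes after tokenising, however few digits it had as text. A one-digit number therefore grows by three bytes and a two-digit number by two, which can outweigh the bytes saved by turning keywords into single-byte tokens.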
tokeniser.1522502386.txt.gz · Last modified: 2024/01/05 00:16 (external edit)