Neural Networks

Discussions related to mathematics, numerical methods, graph plotting etc.
Richard Russell
Posts: 591
Joined: Tue 18 Jun 2024, 09:32

Neural Networks

Post by Richard Russell »

Artificial intelligence is a hot topic, and BBC BASIC is particularly well suited to it because a key building block of Neural Networks, the matrix dot product, is a native operation in the language. Below I've listed a program that demonstrates a Neural Network in action. Unlike some demonstrations, which are too simple to be of any practical value, this one actually does something useful: it recognises a hand-written digit (0-9).
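To make the connection concrete, here is a minimal illustration, in Python with made-up numbers, of the dot product that underlies every layer of such a network (BBC BASIC expresses the same operation as `z() = x() . W()`):

```python
# One dense layer: outputs = inputs . weights + biases
def dense(inputs, weights, biases):
    # inputs: n values; weights: n rows of m values; biases: m values
    return [sum(inputs[i] * weights[i][j] for i in range(len(inputs))) + biases[j]
            for j in range(len(biases))]

x = [1.0, 2.0]                   # 2 inputs (made-up values)
W = [[0.5, -1.0], [0.25, 0.75]]  # 2x2 weight matrix (made-up values)
b = [0.1, 0.2]
print(dense(x, W, b))            # approximately [1.1, 0.7]
```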

The program requires BBC BASIC for SDL 2.0 (although it could quite easily be adapted to run in BBC BASIC for Windows). Since it makes heavy use of the dot product, it will run significantly faster in the latest 64-bit build of BBCSDL, in which that operation was accelerated; if you want to play with it I suggest you install that version.

The large data file it uses for training, mnist.dat, can be downloaded from here. The existing program does not do so, because training is fairly fast (a couple of minutes), but you could of course save the post-training Neural Network parameters (weights and biases) to a file so that the program starts up instantly.
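The save-and-restore idea is straightforward: serialise the trained arrays once, then on later runs load them instead of retraining. A sketch of the pattern in Python (the file name params.bin and the pickle format are my assumptions, not from the program; in BBC BASIC you would write the arrays out with your own PROCs):

```python
import os
import pickle

def save_params(path, params):
    # params: dict mapping layer name -> trained values (weights/biases)
    with open(path, "wb") as f:
        pickle.dump(params, f)

def load_params(path):
    with open(path, "rb") as f:
        return pickle.load(f)

params = {"W1": [[0.01, -0.02]], "b1": [0.0, 0.1]}  # made-up values
save_params("params.bin", params)
restored = load_params("params.bin")
assert restored == params  # start-up can now skip the training phase
os.remove("params.bin")
```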

Code: Select all

      REM Neural net demo: find the decimal digit in a 28 x 28 pixel image.
      REM Coded in 'BBC BASIC for SDL 2.0' by Richard Russell, 15-Feb-2026.

      REM The neural network consists of a Multi-Layer Perceptron with 784
      REM inputs (the number of pixels in the source images), 10 outputs
      REM (the number of possibilities for the recognised digit) and two
      REM hidden intermediate layers having 128 and 64 nodes respectively.
      REM The best guess of the digit is whichever output is the largest.

      REM The training data is 12000 images of handwritten digits from the
      REM MNIST database, see https://en.wikipedia.org/wiki/MNIST_database

      PRINT "Initialising the digit-recognition neural network..."

      BATCH = 32  : REM Maximum number of images in each batch
      LEARN = 0.1 : REM Learning rate

      REM Neural network (784 inputs, 10 outputs, intermediate 128 and 64):
      DIM W1(783,127),  W2(127,63),  W3(63,9)  : REM weights
      DIM b1(127),      b2(63),      b3(9)     : REM biases
      DIM dW1(783,127), dW2(127,63), dW3(63,9) : REM delta weights
      DIM db1(127),     db2(63),     db3(9)    : REM delta biases
      DIM T1(127,783),  T2(63,127),  T3(9,63)  : REM transposed weights

      REM Training data:
      DIM x(BATCH-1, 783), y(BATCH-1, 9) : REM x() = input, y() = label

      REM Temporary arrays:
      DIM t0(783,BATCH-1),  t1(127,BATCH-1), t2(63,BATCH-1)
      DIM a1(BATCH-1,127),  a2(BATCH-1,63),  a3(BATCH-1,9)
      DIM z1(BATCH-1,127),  z2(BATCH-1,63),  z3(BATCH-1,9)
      DIM dz1(BATCH-1,127), dz2(BATCH-1,63), dz3(BATCH-1,9)

      REM Initialise the weights to random values (mean = 0, scale SQR(2/fan-in)):
      FOR X% = 0 TO 783 : FOR Y% = 0 TO 127 : W1(X%, Y%) = (RND(1)-0.5) * 0.050 : NEXT : NEXT X%
      FOR X% = 0 TO 127 : FOR Y% = 0 TO  63 : W2(X%, Y%) = (RND(1)-0.5) * 0.125 : NEXT : NEXT X%
      FOR X% = 0 TO  63 : FOR Y% = 0 TO   9 : W3(X%, Y%) = (RND(1)-0.5) * 0.177 : NEXT : NEXT X%

      REM Declare bitmap for image display:
      DIM bmp{bfType{l&,h&}, bfSize%, bfReserved%, bfOffBits%, \
      \       biSize%, biWidth%, biHeight%, biPlanes{l&,h&}, biBitCount{l&,h&}, \
      \       biCompression%, biSizeImage%, biXPelsPerMeter%, biYPelsPerMeter%, \
      \       biClrUsed%, biClrImportant%, palette%(255)}, pixels&(783)

      REM Initialise bitmap:
      bmp.bfType.l& = ASC"B"
      bmp.bfType.h& = ASC"M"
      bmp.bfOffBits% = ^pixels&(0) - bmp{}
      bmp.bfSize% = bmp.bfOffBits% + 784
      bmp.biSize% = 40
      bmp.biWidth% = 28
      bmp.biHeight% = -28
      bmp.biPlanes.l& = 1
      bmp.biBitCount.l& = 8

      REM Initialise palette:
      FOR C% = 0 TO 255
        bmp.palette%(C%) = &FF000000 OR C% OR C% * 256 OR C% * 65536
      NEXT

      *HEX 64

      REM Read and display training data:
      mnist% = OPENIN(@dir$ + "mnist.dat")
      fsize% = EXT#mnist%

      image% = 0
      REPEAT
        PRINT CHR$13 "Training the network ("; INT(100 * PTR#mnist% / fsize%) "%)...";
        FOR I% = 0 TO BATCH-1

          REM Read label and image:
          label& = BGET#mnist% : IF label& > 9 STOP
          PROCreadarray(mnist%, pixels&())

          REM Display image:
          OSCLI "MDISPLAY " + STR$~bmp{} + " 0, 0, 448, 448"
          image% += 1

          REM Copy image and label into float arrays:
          x(I%, 0 TO 783) = pixels&() / &FF
          y(I%, 0 TO 9) = 0.0
          y(I%, label&) = 1.0

          IF EOF#mnist% THEN EXIT FOR
        NEXT I%
        m = I% : REM Number of images in batch

        REM We should strictly randomly-shuffle the training data but we will
        REM assume the MNIST dataset is sufficiently randomised for our purpose.

        REM Forward pass:
        z1() = x() . W1()
        PROCnpadd(z1(), b1())
        PROCrelu(a1(), z1())

        z2() = a1() . W2()
        PROCnpadd(z2(), b2())
        PROCrelu(a2(), z2())

        z3() = a2() . W3()
        PROCnpadd(z3(), b3())
        PROCsoftmax(a3(), z3())

        REM Backward pass:
        dz3() = a3() - y()
        PROCtranspose(t2(), a2())
        dW3() = t2() . dz3()    : dW3() /= m
        PROCnpsum(db3(), dz3()) : db3() /= m

        PROCtranspose(T3(), W3())
        dz2() = dz3() . T3()
        PROCderiv(dz2(), z2())
        PROCtranspose(t1(), a1())
        dW2() = t1() . dz2()    : dW2() /= m
        PROCnpsum(db2(), dz2()) : db2() /= m

        PROCtranspose(T2(), W2())
        dz1() = dz2() . T2()
        PROCderiv(dz1(), z1())
        PROCtranspose(t0(), x())
        dW1() = t0() . dz1()    : dW1() /= m
        PROCnpsum(db1(), dz1()) : db1() /= m

        REM Update parameters:
        W1() -= dW1() * LEARN
        b1() -= db1() * LEARN
        W2() -= dW2() * LEARN
        b2() -= db2() * LEARN
        W3() -= dW3() * LEARN
        b3() -= db3() * LEARN

      UNTIL EOF#mnist%
      CLOSE #mnist%
      PRINT CHR$13 "Completed training the network with "; image% " images."

      REM Arrays for testing:
      DIM X(0, 783) : REM input
      DIM A1(0,127), A2(0,63), A3(0,9)
      DIM Z1(0,127), Z2(0,63), Z3(0,9)

      mnist% = OPENIN(@dir$ + "mnist.dat")
      fsize% = EXT#mnist%

      image% = 0
      ok% = 0
      REPEAT
        PRINT CHR$13 "Testing the network ("; INT(100 * PTR#mnist% / fsize%) "%)...";

        REM Read label and image:
        label& = BGET#mnist%
        PROCreadarray(mnist%, pixels&())

        REM Display image:
        OSCLI "MDISPLAY " + STR$~bmp{} + " 0, 0, 448, 448"
        image% += 1

        REM Copy image and label into float arrays:
        X(0, 0 TO 783) = pixels&() / &FF

        Z1() = X() . W1()
        PROCnpadd(Z1(), b1())
        PROCrelu(A1(), Z1())

        Z2() = A1() . W2()
        PROCnpadd(Z2(), b2())
        PROCrelu(A2(), Z2())

        Z3() = A2() . W3()
        PROCnpadd(Z3(), b3())
        PROCsoftmax(A3(), Z3())

        result% = -1
        maximum = -1E9
        FOR I% = 0 TO 9
          IF A3(0,I%) > maximum maximum = A3(0,I%) : result% = I%
        NEXT
        IF result% = label& ok% += 1

      UNTIL EOF#mnist%
      CLOSE #mnist%

      @%=&50A
      PRINT CHR$13 "Network tested: "; ok% " out of "; image% " correct ("; 100*ok%/image% "%)."
      PRINT ' "Try it for yourself:"
      PRINT   "Draw a digit 0-9 to fit just inside the red square;"
      PRINT   "Click outside the box to clear it and start again."
      VDU 5 : MOVE 510,500 : PRINT "0    1    2    3    4    5    6    7    8    9"
      COLOUR 7,255,128,0

      REPEAT
        GCOL 0 : RECTANGLE FILL 0, 0, 448, 448
        GCOL 9 : RECTANGLE 64, 64, 320, 320
        pixels&() = 0
        REPEAT
          MOUSE X%, Y%, B% : X% DIV= 16 : Y% DIV= 16
          IF B% THEN
            IF X% >= 28 OR Y% >= 28 EXIT REPEAT

            PROCpixel(pixels&(), X%,Y%, &FF)
            PROCpixel(pixels&(), X%-1,Y%, &80)
            PROCpixel(pixels&(), X%+1,Y%, &80)
            PROCpixel(pixels&(), X%,Y%-1, &80)
            PROCpixel(pixels&(), X%,Y%+1, &80)
            PROCpixel(pixels&(), X%-1,Y%-1, &40)
            PROCpixel(pixels&(), X%+1,Y%-1, &40)
            PROCpixel(pixels&(), X%-1,Y%+1, &40)
            PROCpixel(pixels&(), X%+1,Y%+1, &40)

            xc = 0 : yc = 0 : s = SUM(pixels&())
            FOR I% = 0 TO 783
              xc += I% MOD 28 * pixels&(I%)
              yc += I% DIV 28 * pixels&(I%)
            NEXT
            IF s <> 0 xc /= s
            IF s <> 0 yc /= s
            xc = INT(xc + 0.5)
            yc = INT(yc + 0.5)

            FOR I% = 0 TO 783
              X(0,I%) = pixels&((I% + 378 + xc + 28 * yc) MOD 784) / &FF
            NEXT

            Z1() = X() . W1()
            PROCnpadd(Z1(), b1())
            PROCrelu(A1(), Z1())

            Z2() = A1() . W2()
            PROCnpadd(Z2(), b2())
            PROCrelu(A2(), Z2())

            Z3() = A2() . W3()
            PROCnpadd(Z3(), b3())
            PROCsoftmax(A3(), Z3())

            OSCLI "MDISPLAY " + STR$~bmp{} + " 0, 0, 448, 448"
            GCOL 9 : RECTANGLE 64, 64, 320, 320
            FOR I% = 0 TO 9
              GCOL 15 : RECTANGLE FILL 480+I%*80, 0, 80, 448
              GCOL I% : RECTANGLE FILL 480+I%*80, 0, 80, A3(0,I%) * 448
            NEXT
          ELSE
            WAIT 10
          ENDIF
        UNTIL FALSE
      UNTIL FALSE

      END

      DEF PROCnpadd(b(),a()) LOCAL I%
      FOR I% = 0 TO DIM(b(),1) : b(I%,0 TO) += a() : NEXT
      ENDPROC

      DEF PROCnpsum(b(),a()) LOCAL I% : b() = 0
      FOR I% = 0 TO DIM(a(),1) : b() += a(I%,0 TO) : NEXT
      ENDPROC

      DEF PROCtranspose(t(),a()) LOCAL I%,J%
      FOR I% = 0 TO DIM(a(),1) : FOR J% = 0 TO DIM(a(),2)
          t(J%,I%)=a(I%,J%):NEXT:NEXT
      ENDPROC

      DEF PROCrelu(c(),a())
      REM ReLU(x) = (ABS(x) + x) / 2; squaring then square-rooting gives ABS:
      c() = a() ^ 2 : c() = c() ^ 0.5 : c() += a() : c() /= 2
      ENDPROC

      DEF PROCsoftmax(b(),a()) LOCAL I%,J%,s
      REM MOD() of an array is its modulus, always >= the largest element,
      REM so subtracting it stops EXP from overflowing:
      FOR I% = 0 TO DIM(a(),1) : a(I%,0 TO) -= MOD(a(I%,0 TO))
        FOR J% = 0 TO DIM(a(),2) : b(I%,J%) = EXP(a(I%,J%)) : NEXT
        s = SUM(b(I%,0 TO)) : IF s <> 0 b(I%,0 TO) /= s
      NEXT
      ENDPROC

      DEF PROCderiv(a(),b()) LOCAL I%,J%
      REM ReLU derivative: zero the gradient wherever the pre-activation was <= 0
      FOR I% = 0 TO DIM(a(),1) : FOR J% = 0 TO DIM(a(),2)
          IF b(I%,J%) <= 0 a(I%,J%)=0
        NEXT : NEXT
      ENDPROC

      DEF PROCreadarray(F%, a&()) LOCAL d$
      PTR(d$) = ^a&(0) : !(^d$+4) = DIM(a&(),1) + 1
      d$ = GET$#F% BY LEN(d$) : !(^d$+4) = 0
      ENDPROC

      DEF PROCpixel(p&(), X%, Y%, P%)
      IF X% < 0 OR Y% < 0 OR X% > 27 OR Y% > 27 ENDPROC
      IF p&(X% + 756 - 28 * Y%) < P% p&(X% + 756 - 28 * Y%) = P%
      ENDPROC
[Attached image: neuralnet.png]

Re: Neural Networks

Post by Richard Russell »

In case you want to adapt the program to other applications, here's a version with most of the numeric parameters replaced by named constants:

Code: Select all

      REM Neural net demo: find the decimal digit in a 28 x 28 pixel image.
      REM Coded in 'BBC BASIC for SDL 2.0' by Richard Russell, 17-Feb-2026.

      REM The neural network consists of a Multi-Layer Perceptron with 784
      REM inputs (the number of pixels in the source images), 10 outputs
      REM (the number of possibilities for the recognised digit) and two
      REM hidden intermediate layers having 128 and 64 nodes respectively.
      REM The best guess of the digit is whichever output is the largest.

      REM The training data is 12000 images of handwritten digits from the
      REM MNIST database, see https://en.wikipedia.org/wiki/MNIST_database

      PRINT "Initialising the digit-recognition neural network..."

      IMAGEW = 28  : REM Width of input images, pixels (must be multiple of 4)
      IMAGEH = 28  : REM Height of input images, pixels
      PIXELS = IMAGEW * IMAGEH : REM Number of inputs (pixels in input images)
      LAYER1 = 128 : REM Number of nodes in the first hidden layer
      LAYER2 = 64  : REM Number of nodes in the second hidden layer
      DIGITS = 10  : REM Number of outputs (decimal digits to recognise)
      BATCH = 32   : REM Batch size (number of input images in each batch)
      LEARN = 0.1  : REM Learning rate per batch

      REM Neural network:
      DIM W1(PIXELS-1,LAYER1-1), W2(LAYER1-1,LAYER2-1), W3(LAYER2-1,DIGITS-1) : REM weights
      DIM b1(LAYER1-1),          b2(LAYER2-1),          b3(DIGITS-1)          : REM biases
      DIM dW1(PIXELS-1,LAYER1-1),dW2(LAYER1-1,LAYER2-1),dW3(LAYER2-1,DIGITS-1): REM delta weights
      DIM db1(LAYER1-1),         db2(LAYER2-1),         db3(DIGITS-1)         : REM delta biases
      DIM T1(LAYER1-1,PIXELS-1), T2(LAYER2-1,LAYER1-1), T3(DIGITS-1,LAYER2-1) : REM transposed weights

      REM Training data (per batch):
      DIM x(BATCH-1, PIXELS-1),  y(BATCH-1, DIGITS-1) : REM x() = image data, y() = digit label

      REM Temporary arrays:
      DIM t0(PIXELS-1,BATCH-1),  t1(LAYER1-1,BATCH-1),  t2(LAYER2-1,BATCH-1)
      DIM a1(BATCH-1,LAYER1-1),  a2(BATCH-1,LAYER2-1),  a3(BATCH-1,DIGITS-1)
      DIM z1(BATCH-1,LAYER1-1),  z2(BATCH-1,LAYER2-1),  z3(BATCH-1,DIGITS-1)
      DIM dz1(BATCH-1,LAYER1-1), dz2(BATCH-1,LAYER2-1), dz3(BATCH-1,DIGITS-1)

      REM Initialise the weights to random values (mean = 0):
      FOR X% = 0 TO PIXELS-1 : FOR Y% = 0 TO LAYER1-1
          W1(X%, Y%) = (RND(1)-0.5) * SQR(2.0/PIXELS) : NEXT : NEXT X%
      FOR X% = 0 TO LAYER1-1 : FOR Y% = 0 TO LAYER2-1
          W2(X%, Y%) = (RND(1)-0.5) * SQR(2.0/LAYER1) : NEXT : NEXT X%
      FOR X% = 0 TO LAYER2-1 : FOR Y% = 0 TO DIGITS-1
          W3(X%, Y%) = (RND(1)-0.5) * SQR(2.0/LAYER2) : NEXT : NEXT X%

      REM Declare bitmap for image display:
      DIM bmp{bfType{l&,h&}, bfSize%, bfReserved%, bfOffBits%, \
      \       biSize%, biWidth%, biHeight%, biPlanes{l&,h&}, biBitCount{l&,h&}, \
      \       biCompression%, biSizeImage%, biXPelsPerMeter%, biYPelsPerMeter%, \
      \       biClrUsed%, biClrImportant%, palette%(255)}, pixels&(PIXELS-1)

      REM Initialise bitmap:
      bmp.bfType.l& = ASC"B"
      bmp.bfType.h& = ASC"M"
      bmp.bfOffBits% = ^pixels&(0) - bmp{}
      bmp.bfSize% = bmp.bfOffBits% + DIM(pixels&(),1) + 1
      bmp.biSize% = 40
      bmp.biWidth% = IMAGEW
      bmp.biHeight% = -IMAGEH
      bmp.biPlanes.l& = 1
      bmp.biBitCount.l& = 8

      REM Initialise greyscale palette:
      FOR C% = 0 TO 255
        bmp.palette%(C%) = &FF000000 OR C% OR C% * 256 OR C% * 65536
      NEXT

      *HEX 64

      REM Read and display training data:
      mnist% = OPENIN(@dir$ + "mnist.dat")
      fsize% = EXT#mnist%

      image% = 0
      REPEAT
        PRINT CHR$13 "Training the network ("; INT(100 * PTR#mnist% / fsize%) "%)...";
        FOR I% = 0 TO BATCH-1

          REM Read label and image:
          label& = BGET#mnist% : IF label& > DIGITS-1 STOP
          PROCreadarray(mnist%, pixels&())

          REM Display image:
          OSCLI "MDISPLAY " + STR$~bmp{} + " 0, 0, 448, 448"
          image% += 1

          REM Copy image and label into float arrays:
          x(I%, 0 TO PIXELS-1) = pixels&() / &FF
          y(I%, 0 TO DIGITS-1) = 0.0
          y(I%, label&) = 1.0

          IF EOF#mnist% THEN EXIT FOR
        NEXT I%
        m = I% : REM Number of images in batch

        REM Strictly we should randomly-shuffle the training data but we will
        REM assume the MNIST dataset is sufficiently randomised for our purpose.

        REM Forward pass:
        z1() = x() . W1()
        PROCnpadd(z1(), b1())
        PROCrelu(a1(), z1())

        z2() = a1() . W2()
        PROCnpadd(z2(), b2())
        PROCrelu(a2(), z2())

        z3() = a2() . W3()
        PROCnpadd(z3(), b3())
        PROCsoftmax(a3(), z3())

        REM Backward pass:
        dz3() = a3() - y()
        PROCtranspose(t2(), a2())
        dW3() = t2() . dz3()    : dW3() /= m
        PROCnpsum(db3(), dz3()) : db3() /= m

        PROCtranspose(T3(), W3())
        dz2() = dz3() . T3()
        PROCderiv(dz2(), z2())
        PROCtranspose(t1(), a1())
        dW2() = t1() . dz2()    : dW2() /= m
        PROCnpsum(db2(), dz2()) : db2() /= m

        PROCtranspose(T2(), W2())
        dz1() = dz2() . T2()
        PROCderiv(dz1(), z1())
        PROCtranspose(t0(), x())
        dW1() = t0() . dz1()    : dW1() /= m
        PROCnpsum(db1(), dz1()) : db1() /= m

        REM Update weights and biases:
        W1() -= dW1() * LEARN
        b1() -= db1() * LEARN
        W2() -= dW2() * LEARN
        b2() -= db2() * LEARN
        W3() -= dW3() * LEARN
        b3() -= db3() * LEARN

      UNTIL EOF#mnist%
      CLOSE #mnist%
      PRINT CHR$13 "Completed training the network with "; image% " images."

      REM Arrays for testing:
      DIM X(0, PIXELS-1) : REM input
      DIM A1(0,LAYER1-1), A2(0,LAYER2-1), A3(0,DIGITS-1)
      DIM Z1(0,LAYER1-1), Z2(0,LAYER2-1), Z3(0,DIGITS-1)

      mnist% = OPENIN(@dir$ + "mnist.dat")
      fsize% = EXT#mnist%

      image% = 0
      ok% = 0
      REPEAT
        PRINT CHR$13 "Testing the network ("; INT(100 * PTR#mnist% / fsize%) "%)...";

        REM Read label and image:
        label& = BGET#mnist%
        PROCreadarray(mnist%, pixels&())

        REM Display image:
        OSCLI "MDISPLAY " + STR$~bmp{} + " 0, 0, 448, 448"
        image% += 1

        REM Copy image and label into float arrays:
        X(0, 0 TO PIXELS-1) = pixels&() / &FF

        Z1() = X() . W1()
        PROCnpadd(Z1(), b1())
        PROCrelu(A1(), Z1())

        Z2() = A1() . W2()
        PROCnpadd(Z2(), b2())
        PROCrelu(A2(), Z2())

        Z3() = A2() . W3()
        PROCnpadd(Z3(), b3())
        PROCsoftmax(A3(), Z3())

        result% = -1
        maximum = -1E9
        FOR I% = 0 TO DIGITS-1
          IF A3(0,I%) > maximum maximum = A3(0,I%) : result% = I%
        NEXT
        IF result% = label& ok% += 1

      UNTIL EOF#mnist%
      CLOSE #mnist%

      @%=&50A
      PRINT CHR$13 "Network tested: "; ok% " out of "; image% " correct ("; 100*ok%/image% "%)."
      PRINT ' "Try it for yourself:"
      PRINT   "Draw a digit 0-9 to fit just inside the red square;"
      PRINT   "Click outside the box to clear it and start again."
      VDU 5 : MOVE 510,500 : PRINT "0    1    2    3    4    5    6    7    8    9"
      COLOUR 7,255,128,0

      REPEAT
        GCOL 0 : RECTANGLE FILL 0, 0, 448, 448
        GCOL 9 : RECTANGLE 64, 64, 320, 320
        pixels&() = 0
        REPEAT
          MOUSE X%, Y%, B% : X% DIV= 16 : Y% DIV= 16
          IF B% THEN
            IF X% >= IMAGEW OR Y% >= IMAGEH EXIT REPEAT

            PROCpixel(pixels&(), X%,Y%, &FF)
            PROCpixel(pixels&(), X%-1,Y%, &80)
            PROCpixel(pixels&(), X%+1,Y%, &80)
            PROCpixel(pixels&(), X%,Y%-1, &80)
            PROCpixel(pixels&(), X%,Y%+1, &80)
            PROCpixel(pixels&(), X%-1,Y%-1, &40)
            PROCpixel(pixels&(), X%+1,Y%-1, &40)
            PROCpixel(pixels&(), X%-1,Y%+1, &40)
            PROCpixel(pixels&(), X%+1,Y%+1, &40)

            xc = 0 : yc = 0 : s = SUM(pixels&())
            FOR I% = 0 TO PIXELS-1
              xc += I% MOD IMAGEW * pixels&(I%)
              yc += I% DIV IMAGEW * pixels&(I%)
            NEXT
            IF s <> 0 xc /= s
            IF s <> 0 yc /= s
            xc = INT(xc + 0.5)
            yc = INT(yc + 0.5)

            FOR I% = 0 TO PIXELS-1
              X(0,I%) = pixels&((I% + 378 + xc + IMAGEW * yc) MOD PIXELS) / &FF
            NEXT

            Z1() = X() . W1()
            PROCnpadd(Z1(), b1())
            PROCrelu(A1(), Z1())

            Z2() = A1() . W2()
            PROCnpadd(Z2(), b2())
            PROCrelu(A2(), Z2())

            Z3() = A2() . W3()
            PROCnpadd(Z3(), b3())
            PROCsoftmax(A3(), Z3())

            OSCLI "MDISPLAY " + STR$~bmp{} + " 0, 0, 448, 448"
            GCOL 9 : RECTANGLE 64, 64, 320, 320
            FOR I% = 0 TO DIGITS-1
              GCOL 15 : RECTANGLE FILL 480+I%*80, 0, 80, 448
              GCOL I% : RECTANGLE FILL 480+I%*80, 0, 80, A3(0,I%) * 448
            NEXT
          ELSE
            WAIT 10
          ENDIF
        UNTIL FALSE
      UNTIL FALSE

      END

      DEF PROCnpadd(b(),a()) LOCAL I%
      FOR I% = 0 TO DIM(b(),1) : b(I%,0 TO) += a() : NEXT
      ENDPROC

      DEF PROCnpsum(b(),a()) LOCAL I% : b() = 0
      FOR I% = 0 TO DIM(a(),1) : b() += a(I%,0 TO) : NEXT
      ENDPROC

      DEF PROCtranspose(t(),a()) LOCAL I%,J%
      FOR I% = 0 TO DIM(a(),1) : FOR J% = 0 TO DIM(a(),2)
          t(J%,I%)=a(I%,J%):NEXT:NEXT
      ENDPROC

      DEF PROCrelu(c(),a())
      REM ReLU(x) = (ABS(x) + x) / 2; squaring then square-rooting gives ABS:
      c() = a() ^ 2 : c() = c() ^ 0.5 : c() += a() : c() /= 2
      ENDPROC

      DEF PROCsoftmax(b(),a()) LOCAL I%,J%,s
      REM MOD() of an array is its modulus, always >= the largest element,
      REM so subtracting it stops EXP from overflowing:
      FOR I% = 0 TO DIM(a(),1) : a(I%,0 TO) -= MOD(a(I%,0 TO))
        FOR J% = 0 TO DIM(a(),2) : b(I%,J%) = EXP(a(I%,J%)) : NEXT
        s = SUM(b(I%,0 TO)) : IF s <> 0 b(I%,0 TO) /= s
      NEXT
      ENDPROC

      DEF PROCderiv(a(),b()) LOCAL I%,J%
      REM ReLU derivative: zero the gradient wherever the pre-activation was <= 0
      FOR I% = 0 TO DIM(a(),1) : FOR J% = 0 TO DIM(a(),2)
          IF b(I%,J%) <= 0 a(I%,J%)=0
        NEXT : NEXT
      ENDPROC

      DEF PROCreadarray(F%, a&()) LOCAL d$
      PTR(d$) = ^a&(0) : !(^d$+4) = DIM(a&(),1) + 1
      d$ = GET$#F% BY LEN(d$) : !(^d$+4) = 0
      ENDPROC

      DEF PROCpixel(p&(), X%, Y%, V%) : LOCAL P%
      IF X% < 0 OR Y% < 0 OR X% >= IMAGEW OR Y% >= IMAGEH ENDPROC
      P% = X% + PIXELS - IMAGEW - IMAGEW * Y%
      IF p&(P%) < V% p&(P%) = V%
      ENDPROC
hellomike
Posts: 201
Joined: Sat 09 Jun 2018, 09:47
Location: Amsterdam

Re: Neural Networks

Post by hellomike »

Now and then, I love to see BBCBASIC programs designed for scientific use, rather than games or fancy graphics demos.
I downloaded the database and it runs fine with BBCSDL 1.43a.

What (missing?) interpreter feature makes it fail to execute in BB4W 6.16a and throw a Type mismatch error?

Regards,

Mike

Re: Neural Networks

Post by Richard Russell »

hellomike wrote: Fri 20 Feb 2026, 19:35 What (missing?) interpreter feature makes it fail to execute in BB4W 6.16a and throw a Type mismatch error?
In which line (or statement) does it fail?

Re: Neural Networks

Post by hellomike »

It fails at this line:

Code: Select all

          x(I%, 0 TO PIXELS-1) = pixels&() / &FF
When I change it to:

Code: Select all

          x(I%, 0 TO PIXELS-1) = 0
it continues, but obviously I didn't check the output further.

Hope this helps.

Mike

Re: Neural Networks

Post by Richard Russell »

hellomike wrote: Sat 21 Feb 2026, 07:37 It fails at this line:

Code: Select all

          x(I%, 0 TO PIXELS-1) = pixels&() / &FF
Yes, attempting to assign an array of one type to an array of another type is an error in standard BBC BASIC (Acorn's ARM BASIC V, BBC BASIC for Windows and BBC BASIC for Z80 etc.).

It's not an error in Matrix Brandy BASIC or in BBC BASIC for SDL 2.0 version 1.43a or later (see the release announcement for that version).

It was only quite recently that I discovered that Matrix Brandy accepted this kind of assignment without error. Had I known sooner I would almost certainly have wanted to modify my BASICs accordingly some time ago. I've not seen it listed anywhere as a Brandy extension.

Re: Neural Networks

Post by hellomike »

Hi,

Oh, yes I see. The following program works in BBCSDL and BBCConsole, but in BB4W only the last line errors out.

Code: Select all

      x = 0
      y& = 0
      x = y&

      DIM x(10), y&(10)
      x(0) = y&(0)
      x() = y&()
So I guess it feels inconsistent, but no worries, as apparently it never bothered me in my 40+ years of coding in BBCBASIC!
Only if you feel the need, maybe you could remove that restriction some day.

Thanks for explaining.

Mike

Re: Neural Networks

Post by Richard Russell »

hellomike wrote: Sat 21 Feb 2026, 12:12 So I guess it feels inconsistent but no worries as apparently it never bothered me in my 40+ years of coding in BBCBASIC!
I'm going to speculate that it hasn't bothered you because, for the majority of those 40+ years, there were only two numeric data types, 32-bit integer and 40-bit float (or 64-bit float in some versions), and the restriction isn't so onerous in that situation. For example, although it's entirely possible that a dot product of two 32-bit integer arrays could overflow the integer range, it is unlikely that it would.

Things are very different now that both Matrix Brandy and BB4W/BBCSDL have a byte (unsigned 8-bit) data type. With that type it's very likely that a dot product will overflow the 0-255 range; you'd have to have very small arrays and/or very small numbers in them for it not to. So not being able to assign the result of the operation to a data type with a larger range is a major limitation.
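The overflow argument is easy to check; a quick Python illustration, masking to 8 bits to mimic what a byte-typed result would retain (the numbers are made up):

```python
# Even a two-element dot product of byte-sized values exceeds 0-255:
a = [200, 150]
b = [3, 2]
true_dot = sum(x * y for x, y in zip(a, b))  # 200*3 + 150*2 = 900
byte_dot = true_dot & 0xFF                   # what an 8-bit result would keep
print(true_dot, byte_dot)                    # 900 132
```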
Only if you feel the need, maybe you remove that restriction some day.
It's unlikely. As I have explained before, BBC BASIC for Windows, because it is coded in assembly language, is exquisitely sensitive to code alignment (especially when running on some AMD CPUs). That has meant that, for probably ten years or more, the changes I have made to the language have been so minor that I've managed to achieve them with no increase in the code size.

The changes required to allow an array expression to save its result to an array of a different data type cannot be achieved without increasing the code size. So there is no realistic possibility of adding that feature to BB4W. In any case, as you know very well, my age and cognitive decline make it far too risky to make anything but trivial changes.

And as you also know, I don't think BBC BASIC for Windows has a long term future because it's 32-bits only and Windows is bound to drop support for 32-bit apps sooner or later.
User avatar
hellomike
Posts: 201
Joined: Sat 09 Jun 2018, 09:47
Location: Amsterdam

Re: Neural Networks

Post by hellomike »

All very valid points, Richard.
The way forward is no doubt 64-bit BBCSDL.

Re: Neural Networks

Post by Richard Russell »

The 'digit recognition' neural net used here can give better results if trained more thoroughly. Two ways in which you can improve the training are:
  1. Use the full MNIST training dataset (60,000 images) rather than the subset of 12,000 images I use in the listed program. To that end you can download mnist_full.dat and edit the program to use that file rather than mnist.dat.

  2. Loop through the training images more than once, randomising the order in which the images are used, whilst at the same time using a lower value for the LEARN constant. This will reduce the tendency for images later in the training set to have a disproportionate influence on the weights.
Of course, the downside to doing more training is that it will take longer: perhaps an hour rather than a couple of minutes if you implement both of the above measures. Therefore I would suggest that, if you want to experiment with this, you split the program into two, with the slow training phase saving the weights and biases to a file and the running phase loading them from that file.
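The multi-epoch training in point 2 boils down to the following loop structure (a Python sketch of the idea only; train_on_batch stands in for the program's forward/backward pass, and halving the learning rate each epoch is just one plausible decay scheme, not from the program):

```python
import random

def train_on_batch(batch, learn):
    # stand-in for the forward and backward passes over one batch
    return len(batch)

def run_epochs(images, n_epochs, batch_size, learn):
    images = list(images)
    batches = 0
    for epoch in range(n_epochs):
        random.shuffle(images)             # randomise the order on every pass
        for i in range(0, len(images), batch_size):
            train_on_batch(images[i:i + batch_size], learn)
            batches += 1
        learn *= 0.5                       # lower the rate as training proceeds
    return batches

print(run_epochs(range(100), n_epochs=3, batch_size=32, learn=0.05))  # 12
```

Shuffling between passes is what prevents the last images seen from dominating the weights, which is the effect the lower LEARN value also mitigates.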