Saturday 15 March 2014

Plotting the Mandelbrot Set on the ZX Spectrum

ZX Spectrum MandelbrotZX Spectrum Mandelbrot

The Mandelbrot set is a fractal which iterates the equation zn+1 = zn² + c in the complex plane and plots which points tend to infinity. Plotting the set with Sinclair BASIC takes over 24 hours so I was curious how much faster it would be in assembly.

It turns out if we use fast 16-bit fixed-point arithmetic we can plot the Mandelbrot in about 5 minutes. To minimise multiplications each iteration is calculated as:

rn+1 = ( rn + in ) × ( rn - in ) + x

in+1 = 2 × in × rn + y

The following test is used to detect points which tend to infinity:

|in| + |rn| ≥ 2 × √ 2.
  org 60000
  ld de,255*256+191
XLOOP:
  push de
  ld hl,-180   ; x-coordinate
  ld e,d
  call SCALE
  ld (XPOS),bc
  pop de
YLOOP:
  push de
  ld hl,-96    ; y-coordinate
  call SCALE
  ld (YPOS),bc
  ld hl,0
  ld (IMAG),hl
  ld (REAL),hl
  ld b,15      ; iterations
ITER:
  push bc
  ld bc,(IMAG)
  ld hl,(REAL)
  or a
  sbc hl,bc
  ld d,h
  ld e,l
  add hl,bc
  add hl,bc
  call FIXMUL
  ld de,(XPOS)
  add hl,de
  ld de,(REAL)
  ld (REAL),hl
  ld hl,(IMAG)
  call FIXMUL
  rla
  adc hl,hl
  ld de,(YPOS)
  add hl,de
  ld (IMAG),hl
  call ABSVAL
  ex de,hl
  ld hl,(REAL)
  call ABSVAL
  add hl,de
  ld a,h
  cp 46        ; 46 ≅ 2 × √ 2 << 4
  pop bc
  jr nc,ESCAPE
  djnz ITER
  pop de
  call PLOT
  db 254       ; trick to skip next instruction
ESCAPE:
  pop de
  dec e
  jr nz,YLOOP
  dec d
  jr nz,XLOOP
  ret

FIXMUL:        ; hl = hl × de >> 24
  call MULT16BY16
  ld a,b
  ld b,4
FMSHIFT:
  rla
  adc hl,hl
  djnz FMSHIFT 
  ret

SCALE:         ; bc = (hl + e) × zoom
  ld d,0
  add hl,de
  ld de,48     ; zoom

MULT16BY16:    ; hl:bc (signed 32 bit) = hl × de
  xor a
  call ABSVAL
  ex de,hl
  call ABSVAL
  push af
  ld c,h
  ld a,l
  call MULT8BY16
  ld b,a
  ld a,c
  ld c,h
  push bc
  ld c,l
  call MULT8BY16
  pop de
  add hl,de
  adc a,b
  ld b,l
  ld l,h
  ld h,a
  pop af
  rra
  ret nc
  ex de,hl
  xor a
  ld h,a
  ld l,a
  sbc hl,bc
  ld b,h
  ld c,l
  ld h,a
  ld l,a
  sbc hl,de
  ret

MULT8BY16:     ; returns a:hl (24 bit) = a × de
  ld hl,0
  ld b,8
M816LOOP:
  add hl,hl
  rla
  jr nc,M816SKIP
  add hl,de
  adc a,0
M816SKIP:
  djnz M816LOOP
  ret

PLOT:          ; plot d = x-axis, e = y-axis
  ld a,7
  and d
  ld b,a
  inc b
  ld a,e
  rra
  scf
  rra
  or a
  rra
  ld l,a
  xor e
  and 248
  xor e
  ld h,a
  ld a,d
  xor l
  and 7
  xor d
  rrca
  rrca
  rrca
  ld l,a
  ld a,1
PLOTBIT:
  rrca
  djnz PLOTBIT
  or (hl)
  ld (hl),a
  ret

ABSVAL:        ; returns hl = |hl| and increments
  bit 7,h      ; a if the sign bit changed
  ret z
  ld b,h
  ld c,l
  ld hl,0
  or a
  sbc hl,bc
  inc a
  ret

XPOS:dw 0
YPOS:dw 0
REAL:dw 0
IMAG:dw 0

3 comments:

  1. It looks as though you've adapted Gauss's complex multiplication algorithm (see https://en.wikipedia.org/wiki/Multiplication_algorithm#Gauss.27s_complex_multiplication_algorithm) to your need for squaring followed by adding, then streamlined it. But that wikipedia article states "This algorithm uses only three multiplications, rather than four, and five additions or subtractions rather than two. If a multiply is more expensive than three adds or subtracts, as when calculating by hand, then there is a gain in speed. On modern computers a multiply and an add can take about the same time so there may be no speed gain." If there is a gain, you could gain even more by replacing that multiplication by 2 with a doubling by addition (I haven't looked at your code, maybe you already did).

    It also looks as though your test for unbounded divergence is based on the result that "... if the absolute value of P_c^n(0) ever becomes larger than 2, the sequence will escape to infinity" (see https://en.wikipedia.org/wiki/Mandelbrot_set#Basic_properties), with you simply testing for your simpler norm at a value that always lies outside the cartesian norm of 2 (since anything passing your test has met the requirements of the result, and anything that diverges unboundedly must necessarily get past your norm's boundary at some point). You could possibly simplify your test further by substituting a slightly larger but rational (better still, dyadic rational, i.e. with finitely many non-zero binary places) test criterion not involving the square root of 2 (again, maybe you already did).

    ReplyDelete
  2. You can speed it up considerably by not using multiplications at all, since you're just suqaring anyway. Squares can be determined from a lookup table, which, in case of the kind of 16-bit arithmetics that you use takes up 32 kilobytes and can be generated in a blink of an eye.

    Thus:
    rₙ₊₁ = x + (rₙ)² - (iₙ)²
    iₙ₊₁ = y + (rₙ + iₙ)² - [(rₙ)² + (iₙ)²]

    That makes a total of 3 lookups. Moreover, if you calculate the Euclidean norm [(rₙ)² + (iₙ)²] first, it can be used to abort the iteration if it grows beyond 4.

    ReplyDelete
  3. Is there a basic version i can speed up

    ReplyDelete

Note: only a member of this blog may post a comment.