Nice series! I've considered doing something similar due to the dearth of decent material on FPGA and hardware design on the internet, though I never quite got around to it...
For the sprites, did you consider a part on dealing with multiple overlapping sprites? (I just skimmed, so apologies if you did this somewhere and I missed it!) There are at least two approaches here, and I think exploring them is an interesting way to discuss design trade-offs.
The two approaches I can see are:
1. Load the relevant line of each sprite into flops during the blanking interval. When drawing, determine which sprites produce a pixel for a given output pixel, then apply ordering, taking transparent pixels into account, to determine which one wins. This needs a simultaneous read of all of the sprite flops for each output pixel.
2. Pre-render all of the sprite pixels for a line into a memory buffer during the blanking interval. If you get multiple sprite pixels per cycle from memory, and take advantage of the fact that your FPGA can probably run at some multiple of the pixel clock (say 100 MHz vs the ~25 MHz pixel clock for 640x480), you can do quite a lot in the blanking interval.
I reckon 2 wins when you want more sprites per line and higher bit depths per sprite pixel (8-bit, 256-colour sprites vs 2/4/8-colour sprites).
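To make the two approaches concrete, here is a rough behavioural model in Python (not HDL, and not anyone's actual design). It assumes colour index 0 means transparent and that sprites earlier in the list have higher priority; the function names and the sprite format `(x_position, row_of_pixels)` are made up for illustration. Both functions should produce the same line, the difference is only in when and how the work is done.

```python
TRANSPARENT = 0  # assumption: colour index 0 means "no pixel here"

def composite_per_pixel(sprites, width):
    """Approach 1: for each output pixel, examine every sprite and pick the
    highest-priority opaque one (lowest list index wins).
    In hardware the inner loop would be a parallel priority mux."""
    line = []
    for x in range(width):
        pixel = TRANSPARENT
        for xpos, row in sprites:
            offset = x - xpos
            if 0 <= offset < len(row) and row[offset] != TRANSPARENT:
                pixel = row[offset]
                break
        line.append(pixel)
    return line

def prerender_line(sprites, width):
    """Approach 2: draw sprites into a line buffer one after another,
    lowest priority first, so higher-priority sprites overwrite them.
    Transparent pixels are skipped rather than written."""
    buf = [TRANSPARENT] * width
    for xpos, row in reversed(sprites):
        for offset, colour in enumerate(row):
            x = xpos + offset
            if 0 <= x < width and colour != TRANSPARENT:
                buf[x] = colour
    return buf

# Hypothetical test data: three sprites, the first two overlapping.
sprites = [
    (2, [1, 0, 1, 1]),   # highest priority, with a transparent hole
    (3, [2, 2, 2, 2]),
    (10, [3, 3]),
]
line_a = composite_per_pixel(sprites, 16)
line_b = prerender_line(sprites, 16)
print(line_a == line_b)
```

The trade-off shows up in the loop structure: approach 1 does work proportional to the number of sprites on every pixel (wide logic, no buffer), while approach 2 does it sequentially ahead of time (narrow logic, but it needs a buffer and enough cycles).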
I've got a ~60% complete version of 2, maybe I should follow your lead and finish it off and write about how it works.
I’d strongly encourage you to finish your write-up and post it. As you say, there is a dearth of decent FPGA material, especially for hobbyists and those trying to learn hardware design.
You can do the pre-rendering on the previous line if you can spare two buffers. You just render into a buffer, then during the blanking interval latch it into a shift register. That gives you a lot more time to render.
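As a rough illustration of how much extra time that buys, here is a small Python model. The numbers assume standard 640x480 at 60 Hz timing (800 pixel clocks per line, 640 of them visible) and a render clock of roughly 4x the pixel clock, as in the 100 MHz vs ~25 MHz example above. The `LineRenderer` class and its method names are hypothetical, just a sketch of the render-then-latch idea.

```python
H_TOTAL = 800     # pixel clocks per line, standard 640x480 @ 60 Hz timing
H_VISIBLE = 640   # visible pixels per line
CLOCK_RATIO = 4   # assumption: ~100 MHz render clock vs ~25 MHz pixel clock

# Render clock cycles available per line in each scheme.
blanking_cycles = (H_TOTAL - H_VISIBLE) * CLOCK_RATIO  # blanking interval only
full_line_cycles = H_TOTAL * CLOCK_RATIO               # whole previous line

class LineRenderer:
    """Hypothetical model: one buffer to render into, one shift register
    that feeds the display."""

    def __init__(self, width):
        self.render_buf = [0] * width
        self.shift_reg = [0] * width

    def latch(self):
        """During blanking: copy the rendered line into the shift register
        and clear the render buffer ready for the next line."""
        self.shift_reg = list(self.render_buf)
        self.render_buf = [0] * len(self.render_buf)

    def shift_out(self):
        """One call per pixel clock: emit the next pixel of the latched line."""
        return self.shift_reg.pop(0)

print(blanking_cycles, full_line_cycles)
```

Under these assumptions that is 640 cycles if you only render in the blanking interval versus 3200 cycles if you render during the whole previous line, at the cost of the second buffer.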
You're right and I think I'm actually doing that on my 60% complete prototype. Haven't touched it for a couple of months and I've forgotten some of the details!
Having now actually looked at the code, it looks like you did a version of 1, but as you've only got 1-bit pixels the combination stage is trivial (if any of the 8 sprites are on, output a sprite pixel)?
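In other words, with 1-bit pixels the whole combination stage collapses to an OR across the sprites. A tiny Python sketch of that, assuming the eight per-sprite pixel bits are packed into one integer (the function name and packing are illustrative, not taken from the series' code):

```python
def any_sprite_on(sprite_bits: int, num_sprites: int = 8) -> bool:
    """Return True if at least one sprite produces a pixel at this position.
    sprite_bits holds one bit per sprite; bits above num_sprites are ignored.
    In Verilog this would be a reduction OR over the vector."""
    mask = (1 << num_sprites) - 1
    return (sprite_bits & mask) != 0

print(any_sprite_on(0b00010000))
```

With multi-bit colours this no longer works, because you need to know which sprite won, not just whether any did.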
I am planning another sprite post with overlapping coloured sprites combined with a framebuffer background. I will probably use approach 2, as you describe, but with hardware design, I’ve found that the best algorithm can surprise you in practice.
Doing VHDL designs for a living here, using SystemVerilog once in a while when absolutely required. Take my words with a good pinch of salt!
SV is still a hardware description language, so you think and code in terms of modules and signals (wires) to be connected. But if you want to create structures that use variables and have a more "procedural" style, I find that SV is more agile than VHDL for this specific purpose.
If you want real high level languages for hardware there are tools that provide synthesis from C/C++, Python and more software oriented languages.
SystemVerilog has several useful additions for FPGA engineers, but tool support has been slow to materialise. For the moment, I’ve kept to the few features that Xilinx Vivado and Yosys http://www.clifford.at/yosys/ are both happy with, such as enums, logic data type, and always_ff etc.
I’ve not used VHDL in anger, so can’t give you a useful opinion there.
None of the introductory FPGA books have impressed me, so I don’t have any recommendations there. Maybe some other HN readers will have some suggestions?
Thanks!
I feel the best way to learn could be a set of labs of increasing difficulty, with attached self-checking testbenches for simulation.
This specialization https://www.coursera.org/specializations/fpga-design has something like that, but unfortunately it's fairly shallow.
Not the author, but I also regularly get asked this question, and I usually recommend Digital Design and Computer Architecture, 2nd Ed. [1], supplemented with practical work on an FPGA.