This is the new version, with a faster MCU (STM32H7) and a cleaner design. It uses the common TDA9983 as the HDMI transmitter instead of the ADV7511WBSWZ, which is very hard to source, expensive and power inefficient.
Wouldn’t it be cool to carry your favorite computer with you at all times? 😀
Well, this is perhaps a little too geeky even for me, but that was the initial thought when starting this project. Personally I would love to have a tiny box behind my TV set at home to be able to play some good ol’ games whenever I feel like it, or just bring it along without feeling like I have anything extra with me at all. Ideally the box should fit on a keychain, but as it looks today it is too big for that. The form factor is therefore crucial, and custom hardware is definitely needed to achieve this tiny footprint. The motivation for the project was never to be a pioneer; many emulators of this kind have already been made (although in a bigger package). Rather, it was the challenge that comes with making something as complex as this using, at least for the purpose, very limited resources. It almost puts me in the same shoes as the guys back in the day who wrote software for the C64, which, as you all know, has its very own limitations.
That brings us to the question:
Is it possible to emulate a complete C64 computer (excluding SID), cycle perfect and at full frame rate, using an ARM Cortex-M7 processor at 216 MHz?
The answer to this question is… not really.
However, if you relax the requirements a bit, it is very much possible to get a decent emulator going. For example, as seen in this project, if you decrease the emulated frame rate by a factor of 2 and do not run everything cycle perfect, the performance of the Cortex-M7 keeps up with the emulation even with all 8 sprites activated and moving around.
I made the design using a 6-layer PCB. The MCU is absolutely tiny and fits perfectly together with the memory under the SID socket.
The circuit board (MK2) just arrived as you can see in the pictures. Excellent …
And here it is, assembled 🙂
MCU: STM32F756IGK6 (ARM Cortex-M7, 216 MHz, UFBGA176)
HDMI transmitter: ADV7511W (165 MHz, LQFP-64)
SDRAM: IS42S16400J-5BL (200 MHz, 64 Mbit, FBGA)
Part list (designator, quantity and description, part number):
U3, 1 MCU, STM32F756IGK6 (to be changed)
U4, 1 HDMI transmitter, ADV7511WBSWZ
U2, 1 SDRAM, IS42S16400J-5BL
CN1, 1 SDCARD, 104031-0811
P5, 1 Audio connector, SJ-3523-SMT-TR
P7, 1 DC power connector, PJ-037A
P1, 1 HDMI connector, 10029449-001RLF
P4, 1 USB-A connector, 62900416021
P6, 1 USB micro connector, 47590-0001
U1, 1 IC socket, DIL 28 SMD, 114-87-628-41-117101
C10, 1 Tantalum capacitor, TAJB226M016RNJ
C0-C2, 3 Tantalum capacitors, TMCMB0J227MTRF
C70, 1 Audio capacitor, RFS-50V010ME3#5
U6, 1 Voltage regulator (3.3V), NCP5662DS33R4G
U5, 1 Voltage regulator (1.8V), NCP5662DS18R4G
G1, 1 1 MHz oscillator (SID), FXO-HC735-1MHZ
X1, 1 8 MHz crystal, FQ7050B-8
U7-U9, 3 TPD4S010DQAR
C4-C5, 2 2.2uF (0603)
C21, 1 15pF (0603)
C22-C23, 2 10pF (0603)
C30-C33, 4 10nF (0603)
C40-C46, C47-C48, C80-C88, 14 100nF (0603)
C50-C54, 5 1.0uF (0603)
C60-C61, 2 2.2nF (0603)
R0-R5, 6 10K (0603)
R14, 1 0 (0603)
R15, 1 DNI (0603)
R20, 1 887 (0603)
R30-R34, 5 47K (0603)
R50, 1 1K (0603)
- USB keyboard
- 2 DSUB9 Joystick Ports
- 12V power input (when using original MOS6581)
- Micro USB connector (main power)
- External SID socket
- 3.5mm audio jack
- SD card push-pull socket
Using real SID
12V is needed to run with an original SID chip (MOS6581). It can be supplied either by an external source through the dedicated power connector on the board, or by a boost converter hooked up to the 3.3V, GND and 12V solder pads on the back of the board.
- Full C64 emulator with graphics (sound is handled separately)
- Full disk drive emulator
- Support for T64, D64, PRG and TAP files
- Firmware updates through SD card
- USB CDC support with custom command interface
- USB HID keyboard support
- Reset from keyboard
- Separate emulator libraries (host can be changed easily)
- Configurable palette (palette config file)
- Configurable key mapping (keyboard config file)
Limitations and bugs (V1.0.0):
- Half frame rate is needed for the emulator to reach the correct frequency (approx. 1 MHz/50 FPS). This has a major impact on some games where collisions are not detected at every frame. It also means that “blinking effects” (like the blinking sprites in Commando and Ghosts ’n Goblins) are not displayed as they should be.
- Due to performance reasons the emulator is not cycle perfect. This means it will queue up cycles for the emulated components to a specific threshold before acting on them. This basically makes the timing wrong but will, under normal circumstances, not lead to any big problems with the emulation as a whole.
- The granularity for keeping the correct frequency is poor. Today it is set to 40 ms (every second frame). This means that the emulator goes through 40 ms worth of emulation as fast as it can and then enters a wait state in order to meet the emulated speed of approx. 1 MHz. The drawback is obvious: the timing is adjusted 25 times a second, so the real speed is faster between these points, leading to somewhat fast-forwarded graphics and sound. Personally, I cannot really tell that it is doing this, and I do not see or hear anything negative related to this limitation.
- No sprites are rendered (or collided with) outside the display window, so titles like Wizball will not be emulated correctly when it comes to graphics.
- When the disk drive is turned on, there are actually 2 complete computers being emulated and talking to each other through the serial port. This has a negative effect on the speed, and the emulation will slow down a bit. Luckily the disk drive can safely be paused when it is not needed and turned on again when it is.
- There are many “bugs” left in the code that make many games either not work at all or only partly work. These are mostly graphics problems and crashes in the emulation process.
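The 40 ms pacing from the list above (run one slice of emulation as fast as possible, then wait) could look roughly like the sketch below. This is not the project's code: the hook functions in `pacer_hooks_t` are hypothetical stand-ins for a timer, the emulator core and a wait state, and the clock uses the approximate 1 MHz figure from the text.

```c
#include <stdint.h>

#define EMU_CLOCK_HZ  1000000u  /* approx. 1 MHz, the figure used in the text */
#define SLICE_US      40000u    /* 40 ms pacing granularity (every second frame) */
/* Emulated cycles that make up one 40 ms slice. */
#define SLICE_CYCLES  ((uint32_t)((uint64_t)EMU_CLOCK_HZ * SLICE_US / 1000000u))

/* Hypothetical platform hooks; the names are assumptions for this sketch. */
typedef struct {
    uint32_t (*now_us)(void);             /* free-running microsecond counter */
    void     (*run_cycles)(uint32_t n);   /* emulate n cycles at full speed */
    void     (*idle_us)(uint32_t us);     /* wait state */
} pacer_hooks_t;

/* Real time left to wait after a slice that took spent_us to emulate. */
uint32_t pace_wait_us(uint32_t spent_us)
{
    return (spent_us < SLICE_US) ? (SLICE_US - spent_us) : 0u;
}

/* Emulate one slice as fast as possible, then idle for the remainder. */
void pace_one_slice(const pacer_hooks_t *hooks)
{
    uint32_t start = hooks->now_us();
    hooks->run_cycles(SLICE_CYCLES);
    uint32_t spent = hooks->now_us() - start;  /* unsigned math survives wrap */
    hooks->idle_us(pace_wait_us(spent));
}
```

A smaller `SLICE_US` would spread the waiting more evenly, at the price of more frequent timer checks.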
The goal for the first release of the software was near-perfect emulation of the games below:
- Boulder Dash
- Bubble Bobble
- Giana Sisters
With the exceptions stated in “Limitations and bugs”, the goal was reached with software version 1.0.0 (according to me).
Interested in buying one?
There will be a very limited batch of these. The good folks who supported me in the Kickstarter campaign will come first. Visit the shop if it is open.
This project was made in memory of the old computer that I played around with as a young boy. I have a lot to thank this machine for; among other things, it made me understand what I wanted to do with my life. So in this project I created software and hardware to make it possible to play those wonderful games yet again.
I always wanted to write an emulator for the Commodore 64 and I finally pulled myself together to get one up and running. It is not flawless in any way, but it works quite decently considering the hours I spent on it and my almost non-existent prior knowledge of the hardware. Many shortcuts were also needed to get the emulation running for all components using as few clock cycles as possible. Full optimization, inlined code and code running from RAM are some examples that help save clock cycles and make the emulation acceptable. Also, as you may know, writing fast code is almost never compatible with code that has a nice structure and is pleasing to the eye. When making this project I was very excited and amused to see the progress. Since seeing this progress was sometimes the only thing that kept me going, the code was written quite fast and the quality suffered.
Most of the software was developed using Visual Studio. The reason is that Visual Studio is the only program from Microsoft that I like, and it is very easy to debug the software there compared to on-chip debugging on the Memwa board. When running under Visual Studio on a PC, the sound needs to be emulated, unlike on the Memwa board, which has the actual SID chip installed. This is done using the reSID C++ library.
The only piece of code that is downloaded to the Memwa board and not written by me is the Reed-Solomon algorithm that handles the MLC flash memory. So thanks to whoever wrote this code.
Emulation is very difficult, and the more accurate the emulation is, the more performance it consumes. Ideally, you want hardware powerful enough to emulate all components and still have cycles left over to render the screen at exactly 50 FPS. If the execution is too fast, you always have the option to stop it for a while.
In my opinion, an MCU running at 235 MHz is not enough to emulate a Commodore 64. In fact, the software created in this project can only render about 20 FPS, give or take, depending on the functionality used by the different components (e.g. graphics mode, number of timers etc.). Also, simultaneous emulation of this many components makes the timing fluctuate a lot, making a mess of the sound and movement.
A simple load balancer was created as a remedy for the performance fluctuations, ensuring that the frame rate stays constant at about 50 FPS. The load balancer looks at two factors: time and lines rendered, where time is the time it takes for each frame to be displayed. When a line is rendered, all pixels for this line are calculated individually using a set of conditions determined by the VIC component. This is very time (cycle) consuming. So by altering the number of lines rendered in each frame, it is possible to balance the load and get a correct frame rate (assuming that the hardware is never capable of rendering all lines needed at the correct FPS).
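One way to picture such a load balancer is a controller that scales the number of rendered lines by how far the last frame missed its time budget. The sketch below is an assumption about how this could be done, not the balancer used in the project; `balance_lines`, the 200-line maximum and the purely proportional rule are all made up for illustration.

```c
#include <stdint.h>

#define FRAME_BUDGET_US  20000u  /* 50 FPS gives 20 ms per frame */
#define LINES_MIN        1u      /* assumed lower bound */
#define LINES_MAX        200u    /* assumed: lines in the 320x200 display window */

/*
 * Given how many lines were rendered in the last frame and how long that
 * frame took, return how many lines to render in the next frame.
 * The line count is scaled by budget/actual and clamped to the valid range.
 */
uint32_t balance_lines(uint32_t lines, uint32_t frame_us)
{
    if (frame_us == 0u) {
        return LINES_MAX;  /* no measurable cost: render everything */
    }
    uint64_t next = (uint64_t)lines * FRAME_BUDGET_US / frame_us;
    if (next < LINES_MIN) {
        next = LINES_MIN;
    }
    if (next > LINES_MAX) {
        next = LINES_MAX;
    }
    return (uint32_t)next;
}
```

Since part of each frame's cost is independent of the line count (CPU, timers and so on), a proportional rule like this only approaches the target over several frames instead of hitting it in one step.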
So why not render only the pixels or lines that changed since the previous frame?
Getting a constant frame rate using this technique is very hard, but I think it is possible. I tried it twice with two different implementations but was not successful; I ran into problems when the whole screen was updated (e.g. memory pointer changed), when the border color changed, etc. I also tried it together with the load balancer, which I think is the correct approach, but still had major problems getting a stable FPS.
CPU C64, CPU 1541, VIC, CIA1, CIA2, TAP, VIA1, VIA2, SER, BUS
Illegal opcodes are not verified and the SID chip is not emulated.
The CPU of the Commodore 64 is emulated as a 6502. The actual CPU is another model (6510), but it does not matter much. The emulation mimics all the assembler instructions of the 6502 to give the same result on the STM32F4 MCU. Opcodes are 8 bits wide, so the instruction set is limited to 256 opcodes. Since it is specified exactly how many Commodore 64 cycles each instruction takes, the CPU component works as a clock for the other components in the system.
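To illustrate the idea of the CPU acting as a clock, the sketch below looks up the cycle cost of an opcode and hands those cycles to the other components, which queue them until a threshold is reached before doing any work. The cycle counts for the six opcodes shown are the documented 6502 values; everything else (`component_t`, `CYCLE_THRESHOLD`, the function names) is hypothetical and not taken from the project.

```c
#include <stdint.h>

#define CYCLE_THRESHOLD 8u  /* hypothetical batch size; the real value is not stated */

typedef struct {
    uint32_t pending;  /* cycles queued but not yet acted on */
    uint32_t total;    /* cycles the component has processed so far */
} component_t;

/* Base cycle cost of a few documented 6502 opcodes (0 = not in this sketch). */
uint8_t opcode_cycles(uint8_t opcode)
{
    switch (opcode) {
    case 0xEA: return 2u;  /* NOP */
    case 0xA9: return 2u;  /* LDA #imm */
    case 0xAD: return 4u;  /* LDA abs */
    case 0x8D: return 4u;  /* STA abs */
    case 0x4C: return 3u;  /* JMP abs */
    case 0x20: return 6u;  /* JSR abs */
    default:   return 0u;
    }
}

/* Queue cycles for a component and only act once the threshold is reached. */
void component_tick(component_t *c, uint32_t cycles)
{
    c->pending += cycles;
    if (c->pending >= CYCLE_THRESHOLD) {
        c->total += c->pending;  /* stand-in for real work, e.g. advancing timers */
        c->pending = 0u;
    }
}

/* Execute one instruction: the CPU's cycle count drives all other components. */
uint8_t cpu_step(uint8_t opcode, component_t *comps, uint32_t count)
{
    uint8_t cycles = opcode_cycles(opcode);
    for (uint32_t i = 0u; i < count; i++) {
        component_tick(&comps[i], cycles);
    }
    return cycles;
}
```

Batching like this saves calls into the components but means they see time advance in jumps, which is the timing inaccuracy accepted in exchange for speed.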
The graphics are a little complicated, and an exact emulation of the VIC chip would be very time consuming, hard and demanding on the hardware. This is why I took some shortcuts with this component, making the graphics less than perfect, but this is a reasonable tradeoff for getting something working with as little effort as possible. The VIC chip is so complex and flexible that it is hard to set any fixed specification; you can do a lot of tricks with this device. Having said this, the VIC chip, on paper, supports a resolution of 320×200 (not including borders) together with 8 sprites. The sprites are blocks of 24×21 pixels (twice the size if expanded) that can be positioned and controlled individually. The VIC can also detect collisions between sprites and other sprites or the background. It supports a number of different graphics modes.
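As a rough illustration of sprite-to-sprite collision for 24×21 blocks, the sketch below stores each sprite row as 24 bits and reports a collision when two set pixels overlap on screen. It covers only unexpanded, single-color sprites, and the `sprite_t` layout and function name are assumptions made for this example, not the project's data structures.

```c
#include <stdint.h>

#define SPRITE_W 24
#define SPRITE_H 21

typedef struct {
    int      x, y;            /* screen position of the top-left pixel */
    uint32_t rows[SPRITE_H];  /* low 24 bits used; bit 23 is the leftmost pixel */
} sprite_t;

/* Returns 1 if any set pixel of a overlaps a set pixel of b, otherwise 0. */
int sprites_collide(const sprite_t *a, const sprite_t *b)
{
    /* Make a the left-most sprite so the horizontal shift is never negative. */
    if (a->x > b->x) {
        const sprite_t *tmp = a;
        a = b;
        b = tmp;
    }
    int dx = b->x - a->x;
    if (dx >= SPRITE_W) {
        return 0;  /* no horizontal overlap at all */
    }
    for (int row = 0; row < SPRITE_H; row++) {
        int brow = row + a->y - b->y;  /* row of b on the same screen line */
        if (brow < 0 || brow >= SPRITE_H) {
            continue;
        }
        /* Shift b's pixels right by dx to express them in a's coordinates. */
        if (a->rows[row] & (b->rows[brow] >> dx)) {
            return 1;
        }
    }
    return 0;
}
```

Checking whole rows with a single AND is far cheaper than comparing pixel by pixel, which matters when cycles are this scarce.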
These peripheral components mainly handle timers (CIA1 and CIA2), keyboard input (CIA1) and the serial port (CIA2) on the Commodore 64.
The TAP component emulates a datasette device.
These peripheral components mainly handle timers (VIA1 and VIA2), the serial port (VIA1) and the disk drive controller (VIA2) on the 1541 disk drive.
Emulating the SID chip would take a humongous number of cycles, which I do not have. The hardware SID chip itself is used instead, so there is no emulation for sound.
This component handles all the memory in the 1541 disk drive. It works the same as the BUS component.
The BUS component handles all the memory in the Commodore 64. It provides an interface to read and write memory. It takes care of the connection between different components and makes the necessary modifications for the surrounding hardware configuration. For example, the connection between the disk drive and the Commodore 64 is done using a serial port. The register for the former is located at address 0x1800 in VIA1 and for the latter at 0xDD00 in CIA2. The lines between VIA1 and CIA2 are affected by hardware such as pull-ups, pull-downs and inverters. The registers also look very different. To take one example, the attention line is connected to bit 7 in VIA1 but to bit 3 in CIA2. When a component uses the BUS interface to modify memory, all other components connected to this memory (directly or by hardware lines) are notified through a simple callback.
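The callback idea can be sketched as a bus that notifies a listener whenever a watched address is written. The addresses and bit positions below come from the text; the `bus_t` type, the single-listener design and the function names are assumptions for illustration. The sketch also copies the attention level straight through, leaving out the inverters and pull-ups that sit on the real lines.

```c
#include <stddef.h>
#include <stdint.h>

#define CIA2_PORT     0xDD00u  /* C64 side of the serial port (from the text) */
#define VIA1_PORT     0x1800u  /* 1541 side of the serial port (from the text) */
#define CIA2_ATN_BIT  3u
#define VIA1_ATN_BIT  7u

typedef void (*bus_write_cb)(uint16_t addr, uint8_t value, void *ctx);

/* Hypothetical bus: flat 64 KB memory plus one watched address. */
typedef struct {
    uint8_t      mem[0x10000];
    uint16_t     watch_addr;
    bus_write_cb on_write;
    void        *ctx;
} bus_t;

uint8_t bus_read(const bus_t *bus, uint16_t addr)
{
    return bus->mem[addr];
}

/* Store the value, then notify the listener if the address is watched. */
void bus_write(bus_t *bus, uint16_t addr, uint8_t value)
{
    bus->mem[addr] = value;
    if (bus->on_write != NULL && addr == bus->watch_addr) {
        bus->on_write(addr, value, bus->ctx);
    }
}

/* Listener: mirror the attention line from CIA2 bit 3 into VIA1 bit 7. */
void atn_to_drive(uint16_t addr, uint8_t value, void *ctx)
{
    bus_t *drive = (bus_t *)ctx;
    uint8_t atn = (uint8_t)((value >> CIA2_ATN_BIT) & 1u);
    uint8_t reg = drive->mem[VIA1_PORT];
    reg = (uint8_t)((reg & ~(1u << VIA1_ATN_BIT)) | (atn << VIA1_ATN_BIT));
    drive->mem[VIA1_PORT] = reg;
    (void)addr;  /* only one address is watched in this sketch */
}
```

To wire the two machines together, the C64 bus would get `watch_addr = CIA2_PORT`, `on_write = atn_to_drive` and `ctx` pointing at the drive's bus.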
Compiler: GCC ARM Embedded
Development platforms: CooCox IDE, Visual Studio
Files that are supported:
T64, TAP, D64
Good stuff to know
The Commodore 64 has a number of ROM sections and, as the name suggests, also has 64 KB of RAM. Since the computer uses a 16-bit address bus, the maximum number of bytes that can be addressed is 64 KB. It handles the shortage by using banks and switching them in and out. This is done by manipulating the first two addresses (0x0000 and 0x0001). Regardless of the bank setup, the VIC always sees certain memory at fixed locations (0x1000-0x2000 and 0x9000-0xA000 will always hold the character ROM). Writing to an address where ROM is switched in results in a write to the same address in the RAM “underneath”.
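A minimal sketch of this banking from the CPU's point of view follows. It handles only the BASIC and KERNAL ROMs, selected by the two lowest bits of address 0x0001, and leaves out the character ROM, the I/O area and cartridges. The `c64_mem_t` type and function names are assumptions for illustration, and keeping the port value in plain RAM is a simplification.

```c
#include <stdint.h>

#define LORAM 0x01u  /* bit 0 of 0x0001: BASIC ROM select */
#define HIRAM 0x02u  /* bit 1 of 0x0001: KERNAL ROM select */

typedef struct {
    uint8_t ram[0x10000];        /* the full 64 KB of RAM */
    uint8_t basic_rom[0x2000];   /* visible at 0xA000-0xBFFF when banked in */
    uint8_t kernal_rom[0x2000];  /* visible at 0xE000-0xFFFF when banked in */
} c64_mem_t;

/* Reads see ROM where it is switched in, otherwise the RAM underneath. */
uint8_t mem_read(const c64_mem_t *m, uint16_t addr)
{
    uint8_t port = m->ram[0x0001];
    if (addr >= 0xE000u && (port & HIRAM)) {
        return m->kernal_rom[addr - 0xE000u];
    }
    if (addr >= 0xA000u && addr <= 0xBFFFu &&
        (port & LORAM) && (port & HIRAM)) {
        return m->basic_rom[addr - 0xA000u];
    }
    return m->ram[addr];
}

/* Writes always land in RAM, even when ROM is visible at that address. */
void mem_write(c64_mem_t *m, uint16_t addr, uint8_t value)
{
    m->ram[addr] = value;
}
```

A value written to 0xE000 while the KERNAL is banked in only becomes readable after clearing the HIRAM bit in 0x0001.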
Hardware is really fun! I enjoy doing PCB designs for projects like these. This is why I decided to create a board for this project, to see if I could produce something that would run the emulation well enough. The SID chip was the crown jewel of this board, no doubt about it 🙂
The pitch of the chips was sometimes below 0.5 mm, which made soldering a bitch to be honest. But with some patience, sweat and a lot of flux it was indeed possible.
MCU: STM32F407 (overclocked by 67 MHz, giving 235 MHz)
Display controller: SSD1963
Flash Memory: H27UBG8T2BTRBC
Display: 7″ TFT 4-WRT
This handy little 100-pin MCU is rated for a maximum of 168 MHz, but it seems stable at 235 MHz with the internal clock and without cooling. The MCU was used with FSMC, which made communication with the graphics chip and flash memory very easy and was necessary to get decent performance.
A Solomon Systech SSD1963 chip was used to handle the 7″ display (techtoys.com.hk).
An MLC 4 GB memory holds about 30k C64 games. Being an MLC memory, it contains a lot of bit errors by default and also some bad sectors. In this project I have ignored the bad sectors (an LED indicates if a bad sector is found when loading a game). The bit errors, on the other hand, cannot be ignored and must be corrected using some ECC algorithm. A Reed-Solomon algorithm is used in this project, configured so it can correct 5 bit errors for every 256 bytes read. Can be found on eBay.
This display has 800×480 resolution and comes with a 4-wire touch panel. Can be found at techtoys.
To make the initial prototype, sockets for the major parts were needed for testing.
Solomon 1963 EVK (techtoys)
QFP100 socket for MCU (waveshare)
TSOP48 socket for flash memory (ebay)
Design Software – CadSoft’s EAGLE PCB Design Software
Debugger – Olimex ARM-USB-TINY-H
I made the design using a 6-layer PCB with the same dimensions as the popular Raspberry Pi board. I liked the small dimensions and the different cases that are available for the Raspberry Pi. The board was manufactured by a fab called PCBCart, which I only have good things to say about.
To make use of the flash memory, an image had to be created that holds all the games and also the ROMs needed for the C64 and disk drive (kernel ROM, BASIC ROM, character ROM and 1541 kernel ROM). I made a simple program that took all the games I had in a specific folder (~30k games), made a TOC of these games, 64-byte aligned, and created an image based on this. The image is about 4 GB and is transferred to the board over a COM port (yes, it takes some time) using another simple utility program I made.
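The alignment step can be sketched as follows, under the assumption that each game's data starts on a 64-byte boundary in the image. The `toc_entry_t` layout and the function names are invented for illustration and do not describe the real image format.

```c
#include <stddef.h>
#include <stdint.h>

#define TOC_ALIGN 64u

/* Hypothetical TOC entry: where a game starts in the image and how long it is. */
typedef struct {
    uint64_t offset;  /* byte offset, always a multiple of TOC_ALIGN */
    uint32_t size;    /* unpadded size in bytes */
} toc_entry_t;

/* Round value up to the next multiple of align (align must be a power of two). */
uint64_t align_up(uint64_t value, uint64_t align)
{
    return (value + align - 1u) & ~(align - 1u);
}

/*
 * Assign an aligned offset to each game, starting at data_start.
 * Returns the offset just past the last (padded) game, i.e. the image size.
 */
uint64_t toc_build(const uint32_t *sizes, size_t count,
                   uint64_t data_start, toc_entry_t *toc)
{
    uint64_t offset = align_up(data_start, TOC_ALIGN);
    for (size_t i = 0; i < count; i++) {
        toc[i].offset = offset;
        toc[i].size = sizes[i];
        offset = align_up(offset + sizes[i], TOC_ALIGN);
    }
    return offset;
}
```

The offsets are 64-bit because an image of about 4 GB sits right at the limit of what a 32-bit offset can address.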