Haskell: The challenge

I’ve come up with a definitive version for the programming challenge for Haskell. It is as follows:

Write an LC_RLE1 decompressor. If that works out, attempt to write an LC_LZ2 decompressor. The key word is “attempt”, as I’m pretty sure I won’t be able to finish it in time due to the difficulty. I do have a rough idea of how to attempt the latter, though:

Write a function for each command. In pseudo code, it would roughly look something like this:

byte[] directCopy(byte[] input, int length)
byte[] directFill(byte data, int length)
byte[] wordFill(uint16 word, int length)
byte[] increasingFill(byte data, int length)
byte[] repeat(uint16 index, int length)

And then somehow put them together as one big “decompressLZ2” command. As for LC_RLE1 I suppose it’s just the first two functions.
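In Haskell terms, I imagine the plan would look something like the following (a sketch with my own naming, assuming Data.ByteString for the byte arrays; the repeat command is omitted because it would additionally need the output produced so far):

```haskell
import qualified Data.ByteString as B
import Data.Word (Word8, Word16)

-- Each command becomes a pure function producing a chunk of output bytes.
directCopy :: B.ByteString -> Int -> B.ByteString
directCopy input len = B.take len input

directFill :: Word8 -> Int -> B.ByteString
directFill byte len = B.replicate len byte

wordFill :: Word16 -> Int -> B.ByteString
wordFill w len = B.pack (take len (cycle [hi, lo]))
  where
    hi = fromIntegral (w `div` 256)
    lo = fromIntegral (w `mod` 256)

increasingFill :: Word8 -> Int -> B.ByteString
increasingFill start len = B.pack (take len (iterate (+ 1) start))
```

Whether this decomposition actually survives contact with the real format remains to be seen.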

The idea is that I work my way up from a simple format to a more complex format. LZ2 is quite complex though and it’s possible that I might actually get stuck solely due to my inexperience with Haskell rather than the format itself.

The biggest challenge will probably be reading the individual bits out of a byte, as well as loading/saving a file in Haskell (it might sound trivial, but I have absolutely no clue how this is done).
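From what I’ve gathered so far, Data.Bits should cover the bit twiddling, and Data.ByteString has readFile/writeFile for the file I/O. A small sketch of the bit-reading part, assuming an LZ2-style header byte (3-bit command in the top bits, 5-bit length in the low bits — I still need to verify that against the format docs):

```haskell
import Data.Bits (testBit, shiftR, (.&.))
import Data.Word (Word8)

-- Read a single bit (0 = least significant) out of a byte
bitOf :: Word8 -> Int -> Int
bitOf b n = if testBit b n then 1 else 0

-- Split a header byte into a command field (top 3 bits)
-- and a length field (low 5 bits)
splitHeader :: Word8 -> (Word8, Word8)
splitHeader b = (b `shiftR` 5, b .&. 0x1F)
```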

I presented my idea to the instructor and got his approval. Hopefully, I can finish at least the RLE1 challenge tomorrow already as I’m quite short on time.

Haskell: Session 2

Today I read day 2 of the Haskell chapter and learned some pretty nice features of Haskell, which might be useful for the programming challenge idea I had in the previous session. Here are the features which stuck with me the most.

Cheat sheet

In the previous blog post I mentioned that it’d be nice to make some sort of a cheat-sheet for Haskell. Turns out there is already one (sort of):


Anonymous functions

I don’t know much about anonymous functions besides the fact that they sort of… exist. I’ve heard they’re a nice feature, but personally, I’ve never had the need to use them so far. The book doesn’t go in-depth about them because they were already covered in the chapters for the other six languages, which is a bummer because I skipped straight to Haskell.

Haskell’s anonymous functions seem to be like the following:

(\ params -> function_body) input
(\ x -> x + 1) 4
Output: 5

This creates an anonymous function with one input parameter, which increases x by 1. Its C# equivalent would be something like the following (not 100% accurate; I’ve never used anonymous functions before):

delegate int TestDelegate(int x);

TestDelegate testDel = (x) => { return x + 1; };

var result = testDel(4); //returns 5

To add more parameters, you simply separate them with a space:

(\ x y -> x + y) 2 5
Output: 7

You can also assign anonymous functions to variables, which is pretty much like a function definition. I guess the only difference is that you can then pass them around as parameters?

addOne = \x -> x + 1
addOne 2
Output: 3

Map and Where

The book mentions that Haskell has a map. Unfortunately, I’m new to the concept of mapping but one look at the code and I almost immediately figured out what it does.

map (\x -> x * x) [1, 2, 3]

What this seems to do is apply the anonymous function to each element within the list, making it a “square the list” function. It seems pretty handy, applying a function on a list without having to manually code a loop.


Currying

There seems to be a feature in Haskell called currying. See the following code example:

let prod x y = x * y

This multiplies x with y. Now, if I define a function which partially uses this function’s declaration…

let double = prod 2

…it actually means something like calling:

double y = 2 * y

because the “2” substitutes the x. The C# equivalent of this would be roughly something like this:

// note: lowercase "double" is a reserved type name in C#
int Prod(int x, int y)
{
    return x * y;
}

int Double(int y)
{
    return Prod(2, y);
}

It’s a pretty interesting feature, but because of my inexperience with Haskell, I can’t think of a lot of practical applications.
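Thinking about it a bit more, one obvious use did come to mind: a partially applied function slots straight into map, so you don’t even need a lambda. A small sketch (my own example):

```haskell
prod :: Int -> Int -> Int
prod x y = x * y

double :: Int -> Int
double = prod 2

-- No lambda needed: "prod 2" is already a one-argument function
doubledAll :: [Int]
doubledAll = map (prod 2) [1, 2, 3]
```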


As with day 1, this day also has a self-study.

Write a sort that takes a list and returns a sorted list.

I decided to go with QuickSort as I’ve already seen a bunch of examples floating around on the internet. Admittedly, not 100% of the code is written by me but it’s easy enough to understand what it does.

quicksort :: (Ord a) => [a] -> [a]
quicksort [] = []
quicksort (h:t) = quicksort [x | x <- t, x <= h] ++ [h] ++ quicksort [x | x <- t, x > h]

This function takes a list of sortable items (Ord) and sorts them recursively. The quicksort pivot is always the first element of the list (which isn’t the smartest thing to do if the list is already sorted). Using list comprehension, the elements smaller (& equal) and greater than the pivot are separated. These lists are appended by using ++.

It’s amazing how the function is so tiny. No temporary variable declarations or anything, the code is straight to the point. For comparison, here is the (bloated) C# quicksort which I wrote for my programming class:

private T[] QuickSort(T[] list, int left, int right)
{
    if (list.Length <= 1)
        return list;

    T[] output = (T[])list.Clone();

    int i = left, j = right;
    T pivot = output[(left + right) / 2];

    while (i <= j)
    {
        while (output[i].CompareTo(pivot) < 0)
            i++;

        while (output[j].CompareTo(pivot) > 0)
            j--;

        if (i <= j)
        {
            // Swap
            T tmp = output[i];
            output[i] = output[j];
            output[j] = tmp;

            i++;
            j--;
        }
    }

    // Recursive calls
    if (left < j)
        output = QuickSort(output, left, j);

    if (i < right)
        output = QuickSort(output, i, right);

    return output;
}

In the C# version, I have to manually loop through the elements to check whether they’re greater or less than the pivot, and separate them accordingly. In Haskell’s version, list comprehension handles that for me. That’s extremely convenient.

Write a sort that takes a list and a function that compares its two arguments and then returns a sorted list.

I modified my quicksort algorithm to accept two anonymous functions:

quicksort :: (Ord a) => (a -> a -> Bool) -> (a -> a -> Bool) -> [a] -> [a]
quicksort _ _ [] = []
quicksort lt gt (h:t) = quicksort lt gt smaller ++ [h] ++ quicksort lt gt bigger 
 where smaller = filter (lt h) t
 bigger = filter (gt h) t

When I input the following into the GHC, I get the following output:

quicksort (\ a b -> if (a <= b) then False else True) (\ a b -> if (a > b) then False else True) [4, 6, 1, 3, 2]
Output: [1, 2, 3, 4, 6]
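Rereading the exercise, it actually only asks for one comparison function — the second one can be derived by negating the first. A variant sketch (my own naming):

```haskell
quicksortBy :: (a -> a -> Bool) -> [a] -> [a]
quicksortBy _ [] = []
quicksortBy lt (h:t) = quicksortBy lt smaller ++ [h] ++ quicksortBy lt bigger
  where
    smaller = filter (`lt` h) t                -- elements "less than" the pivot
    bigger  = filter (\x -> not (x `lt` h)) t  -- everything else
```

Passing (<) sorts ascending, and passing (>) sorts descending.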

Write a Haskell function to convert a string to a number. The string should be in the form of $2,345,678.99 and can possibly have leading zeros.

This solution is based on list comprehension. Haskell has a function called “read” which can convert a string into another type.

strToNum :: String -> Float
strToNum num = read [ x | x <- num, x /= ',', x /= '$'] :: Float

As a string really is just an array of chars, I go through each character, check whether it’s one of the undesirable characters (in this case, ‘,’ and ‘$’) and discard it. I specified the Float type for the read call, but it probably also works without it, as the type signature already defines Float as the output. Haskell is smart enough to figure that out by itself.

Funnily enough, to test this function I kept calling the following for a good ten minutes:

strToNum $2,345,678.99

…without the quotes around the input string. No wonder the GHC kept complaining about parsing a ‘,’.

Write a function that takes an argument x and returns a lazy sequence that has every third number, starting with x. Then, write a function that includes every fifth number, beginning with y. Combine these functions through composition to return every eighth number, beginning with x + y.

To be perfectly honest I didn’t understand this question due to my limited English, so I had to look it up. The solution is from here:

every3 x = x:(every3(x + 3))
every5 y = y:(every5(y + 5))
every8 x y = zipWith (+) (every3 x) (every5 y)

From what I understand, the first two functions generate an infinite list of numbers. The third function takes the two lists and “zips” them together by using the add function.
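Because these lists are infinite, evaluating one directly in GHCi would print forever; you have to use something like take to inspect a finite prefix:

```haskell
every3 :: Int -> [Int]
every3 x = x : every3 (x + 3)

every5 :: Int -> [Int]
every5 y = y : every5 (y + 5)

every8 :: Int -> Int -> [Int]
every8 x y = zipWith (+) (every3 x) (every5 y)

-- Laziness means only the demanded prefix is ever computed
firstFour :: [Int]
firstFour = take 4 (every8 1 2)
```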

Use a partially applied function to define a function that will return half of a number and another that will append \n to the end of any string.

half = (/ 2)

Input: half 4
Output: 2.0
newline = (++ "\n")

Input: newline "Hello"
Output: "Hello\n"

Partially applied functions seem to be easier than I thought. All I had to do was… partially apply the infix functions for division and for concatenating a newline.

The self-study solutions are included with this post as a zip: Haskell – Day 2

I’m slowly starting to get used to the language, and I’m beginning to think out how to program that LZ2 decompressor. The language seems to prefer recursion over looping, so I’m really hoping that coding a decompressor is possible at all. I’ll find out soon, hopefully.

Haskell: Session 1

As mentioned in my previous blog post, I decided to learn Haskell for my programming class. Why did I decide on Haskell? Mainly because the syntax really caught my attention. It’s unlike anything I’ve seen before, and I would definitely like to explore its possibilities. I’ve also heard that Haskell is mostly used in academia rather than in real-life practice, but that won’t stop me from learning such a different language.

The blog posts will be divided into “sessions”. Normally I was planning to do one blog post per day, but due to time restrictions and the high workload of my semester, I might have to cram multiple sessions into one post.

In order to get started, I decided to follow day 1’s instructions according to the book “Seven Languages in Seven Weeks”. I downloaded the Glasgow Haskell Compiler (GHC) from the download page (http://www.haskell.org/ghc/) and ran ghci.exe. For now, that’s all I need.

The chapter starts off with very simple instructions using primitive types, inputted into the GHC console. It then transitions to strings & chars, booleans and finally, functions. I won’t go into a lot of detail, else this’ll be one hell of a long blog post. Here’s how numbers work:

Input: 4
Output: 4

Input: 4 + 1
Output: 5

Input: 4 + 1.0
Output: 5.0

Input: 4 + 2.0 * 5
Output: 14.0

So far, seems simple enough. It’s processing basic math, and once you start using numbers with decimals, it’ll start outputting the result with decimals also. The GHC seems to append decimals only when necessary, not excessively.

Input: 1.0 + 1.0100
Output: 2.01

Order of operations seems to work as expected – based on the real math world. Parentheses are processed properly as well.

Input: 4 * 5 + 1
Output: 21

Input: 4 * (5 + 1)
Output: 24

Onwards to strings and characters. Strings are represented using double quotes:

Input: "Hello"
Output: "Hello"

However, you can’t concatenate strings using “+”. Rather, you have to use “++”.

Input: "Hello" ++ " World"
Output: "Hello World"

Characters are represented using single quotes:

Input: 'a'
Output: 'a'

However, strings really are just built up from an array of characters. From my experience, this is similar to C#.

Input: ['a', 'b']
Output: "ab"

Onwards to booleans. Seems like booleans are expressed using “True” and “False” (without the quotes). Not much can be said about them.

It looks like Haskell is a strongly-typed language. I can’t do something like 1 + True, or "Hello" + 1.

Assigning variables to a function in a local scope:

let x = 10

Declaring functions:

 double x = x * 2

Calling said function with a parameter:

double 5

Declaring above function with a type definition, which accepts and returns integers:

double :: Integer -> Integer
double x = x + x

Here, the first “Integer” is the accepted parameter, and the second “Integer” is the return type.

Haskell seems to have pattern matching. From what I understand, it seems to function like some sort of an if/elseif/else construction before the function itself is even called. Here’s an example of a factorial function which at the same time is recursive:

factorial :: Integer -> Integer
factorial 0 = 1
factorial x = x * factorial (x - 1)

To me, this reads as: “if the input is 0, then return 1. Otherwise, for any input x, execute the code after ‘x =’”. The order of the patterns is important. If the catch-all pattern (the third line) came before the “0” pattern, it would always match first, rendering the pattern for the input 0 unreachable. I guess it saves you from writing if-then-elses. If you wanted to reverse the order of the checks, you’d use guards:

factorial :: Integer -> Integer
factorial x
    | x > 1 = x * factorial (x - 1)
    | otherwise = 1

The format is basically this:

 | [condition] = [function]

If the condition isn’t satisfied, it’ll continue to the next guard. If that isn’t satisfied, it’ll continue to the next guard, and so on. It seems that you can use it as a base case for recursion as well, which is pretty nice.
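To check my understanding of the fall-through order, I tried a small guard example of my own:

```haskell
classify :: Int -> String
classify n
    | n < 0     = "negative"   -- tried first
    | n == 0    = "zero"       -- only reached if the first guard fails
    | otherwise = "positive"   -- catch-all, like a final else
```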

Haskell seems to support tuples, which are basically a bunch of values bundled together. I remember C# having a similar feature (e.g. “Tuple<int, string>”). Tuples can be used to bundle values together without making a separate class for them. In Haskell, they seem to be written in parentheses, like so:

(var1, var2, ...)
(1, "hello")
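For 2-tuples there are the built-in functions fst and snd to take the bundle apart again:

```haskell
pair :: (Int, String)
pair = (1, "hello")

-- fst and snd extract the components of a 2-tuple
number :: Int
number = fst pair

greeting :: String
greeting = snd pair
```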

The feature that I find interesting the most is “list comprehensions”. I’ve never heard of this before. The only way I can describe this is “apply an algorithm to a list”. For example, here is an algorithm which multiplies each element in a list of 1 to 10 by 2:

num = [1..10]
test = [multiplied * 2 | multiplied <- num]

which returns: [2, 4, 6, 8, 10, 12, 14, 16, 18, 20]

Or here is an algorithm which lists all numbers above 5:

num = [1..10]
test = [abovefive | abovefive <- num, abovefive > 5]

which returns: [6, 7, 8, 9, 10]
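The two can also be combined — transforming and filtering in the same comprehension:

```haskell
num :: [Int]
num = [1..10]

-- Multiply by 2, but only the elements above 5
test :: [Int]
test = [x * 2 | x <- num, x > 5]
```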

If I had to write such algorithms in, say, C#, I’d have to start writing LINQ queries or custom iterators which would be a lot of manual work. It’s interesting how you can solve this problem in Haskell with pretty much a single line.

From here on, I’ll skip to the end – the self-study. Else this blog post is going to be even longer.

The self-study results I have included with this post as a zip: Haskell – Day 1

To me, it feels like information about Haskell is all over the place right now. It’s like exploring a strange new world without a map. I’m thinking that I might actually have to make a map – a cheat-sheet – in order to speed up my coding a bit, at least until I get used to the language, else I’ll have to repeatedly look up trivial things I keep forgetting about, such as how to write a function with parameters.

While we’re learning this new language, we also have to think of a programming challenge which could fit the programming language’s problem-solving capabilities very well. I haven’t thought much about it yet, but I had a rather impulsive idea which I think is really funny but cool at the same time – writing an LZ2 decompressor (or a decompressor for some other format). The challenging part would be reading a binary file, reading bits from certain bytes and applying a certain algorithm to the processed data in order to get a decompressed file. I’ll have to look into this to see if it’s actually possible or not, though. I’ll also have to get approval from my instructor. Maybe for testing purposes, I’ll start with an extremely simple compression format which is RLE.

To me, it feels like this blog post is a bit rushed/unstructured, but hopefully, my future posts will be of better quality.

More activity soon?

It appears that in my current semester, for the programming class, we have to learn a new programming language and paradigm. What’s more, we actually have to blog about our progress in learning the new language and basically getting a grasp on the concepts. The blogging will be done on this blog. For now, I suppose this is blog post 0. I expect there will be about 7 blog posts or so.

We’re following this book called Seven Languages in Seven Weeks by Bruce A. Tate, and the book introduces the following languages: Ruby, Io, Prolog, Scala, Erlang, Clojure, and Haskell. From what I understand, we need to pick one of these (except that Ruby and Io are disallowed).

The point of the exercise is that we have to learn a programming language outside of our comfort zone and then write a small application in said language to prove that we’ve learned it and are able to use it.

I’m feeling ambitious so I might actually give Haskell a try. Despite the fact that “functional programming” is a paradigm completely new to me, I think that if I manage to comprehend the concept, I’ll prove to myself that I can really learn any language I’d like, given enough time and effort. I think Haskell is a good language for that as it’s purely functional.

Here’s hoping for a successful finish!

A Chip-8 emulator for the Super Nintendo Entertainment System

Oh hey. It has been ~2 years since my last post. Guess it’s time to bring a change to this.

So yeah, I coded a Chip-8 emulator for the SNES. You can view the project here:


And here is the playable SNES ROM!

The namesake is inspired by “Snes9x” (I never knew what the 9x really meant). I didn’t want to simply call it Super Chip-8 because there’s a variant of the Chip-8 called SCHIP (Super Chip) and it could cause confusion, and conveniently enough the name already had a number so I just stuck an “x” at the end.

How the idea came to be

Writing a (crude) SNES emulator has always been on my wishlist – so I started doing research on the SNES as well as emulation in general. Google seems to have taken notice of my emulation-related searches. During my daily routine of checking the Google app for stories to read, it suggested to me this certain article:

Writing a CHIP-8 emulator with Rust and WebAssembly

Upon seeing the title alone, I had three questions in mind: “What is Chip-8”, “what is Rust” and “what is WebAssembly”? I looked into the latter two and they didn’t really interest me. Looking into Chip-8 was another story, though.

I seem to have a soft spot for emulation, assembly and old/retro consoles in general. So when I read about the Chip-8 I noticed how simple it was (it’s considered the “hello world” of emulation) and I got really into the idea of writing an emulator for this. Then I found out there’s already like a thousand emulators for this. It was then that I had this impulsive thought: The Chip-8 is so simple, it could probably run even on the SNES. Amused by my own idea, I set up the project directory.

Setting up the project

To work on this project, all I needed was three tools:

  • A text editor: Notepad++
  • An assembler: asar
  • An accurate emulator/debugger: bsnes-plus

I also needed proper Chip-8 documentation, so I used “Cowgod’s Chip-8 Technical Reference”.

And… that’s about it. The rest was up to my coding and problem-solving skills.

Actual coding

Prior to this project, I had never finished a SNES homebrew ROM before. The greatest extent of my homebrewing was activating the screen and displaying things; in terms of gameplay or controller input, I had done nothing. This was going to be a whole new experience. I decided to tackle this project in my own way – take a really safe approach and define every RAM address and (almost) every magic number. I also decided not to optimize the code, because I think I would’ve lost control over it really fast if I did that from the very beginning.

The display of the Chip-8 is 64×32. The display of the (NTSC) SNES is 256×224 by default. p4plus2 gave me the idea to use mode 7 for the screen. It has a very simple graphics format (basically 1 byte per pixel) and you can scale the screen, so I could make the Chip-8 display 256×128. People won’t have to squint when using the emulator, at least. Because the (scaled-up) horizontal resolution is a perfect 256 pixels wide, I decided to use HDMA to color the screen boundaries.

In order to emulate the Chip-8, I had to allocate some RAM for its registers.

;;; Chip-8 memory and registers
		!Chip8Memory = $7E1000 ;0x1000 bytes
		!Chip8MemoryROM = !Chip8Memory+$200
		!Vreg = $70 ;16 bytes
		!Stack = $80 ; 32 bytes (16 words)
		!Ireg = $A0 ;word
		!DelayTimer = $A2 ;byte
		!SoundTimer = $A3 ;byte
		!StackPointer = $A4 ;byte
		!RNGOutput = $A5
		!ProgramCounter = $A6 ;word
		!PressedKey = $1E

These are all in the SNES direct page (except for the memory), which would allow for slightly faster access to the registers.

I also allocated some RAM for opcode parameters. Each opcode could be ‘dissected’ into 6 variables: The opcode itself and 5 parameters. All of these variables are filled in regardless of the opcode currently being processed.

;;; Chip-8 opcode parameters RAM

		!Opcode = $A8 ;word
		!NNN = $AA ;word
		!NN = $AC ;byte
		!N = $AD ;byte
		!X = $AE ;byte
		!Y = $AF ;byte

Inside the main game loop, I added a subroutine call to an opcode parser. Because every opcode is exactly two bytes, it was a matter of reading an opcode, then increasing the program counter by 2 to get to the next instruction.
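Purely as an illustration (the real code is 65816 assembly), the dissection step boils down to shifts and masks. In Haskell — the language from my earlier posts — it might look something like this:

```haskell
import Data.Bits (shiftR, (.&.))
import Data.Word (Word16, Word8)

-- Dissect a two-byte Chip-8 opcode into its parameter fields
dissect :: Word16 -> (Word8, Word16, Word8, Word8, Word8, Word8)
dissect op = (opHi, nnn, nn, n, x, y)
  where
    opHi = fromIntegral (op `shiftR` 12)           -- top nibble: selects the handler ($0-$F)
    nnn  = op .&. 0x0FFF                           -- 12-bit address
    nn   = fromIntegral (op .&. 0x00FF)            -- 8-bit constant
    n    = fromIntegral (op .&. 0x000F)            -- 4-bit constant
    x    = fromIntegral ((op `shiftR` 8) .&. 0xF)  -- register index X
    y    = fromIntegral ((op `shiftR` 4) .&. 0xF)  -- register index Y
```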

I made use of a pointer table indexed by !Opcode. Each opcode has its own function, but certain opcodes act as a ‘container’ for another group of opcodes. The greatest example is opcode $08 – “Arithmetic”. The SNES – in my opinion – is pretty okay with pointer tables, and I had no problems with making yet another pointer table for those specific container opcodes.

;;; This block handles opcodes $0-$F.
;;; Opcodes $8 and $F are handled in yet another table as
;;; they contain multiple other opcodes

		dw ClearOrReturn		; $00
		dw Jump				; $01
		dw CallSubroutine		; $02
		dw SkipIfXEqual			; $03
		dw SkipIfXNotEqual		; $04
		dw SkipIfXEqualY		; $05
		dw SetX				; $06
		dw AddX				; $07
		dw Arithmetic			; $08
		dw SkipIfXNotEqualY		; $09
		dw SetI				; $0A
		dw JumpWithOffset		; $0B
		dw Rnd				; $0C
		dw DrawSprite			; $0D
		dw SkipOnKey			; $0E
		dw Misc				; $0F

My biggest problem was thinking of a proper solution for the Chip-8 input. Officially, the system has sixteen keys. The SNES only has twelve buttons, and you can’t use all twelve for conventional input. Start and Select are in the middle of the controller, and you have to reach them with your thumb, forcing you to stop using the D-pad or the ABXY buttons. This leaves you with ten buttons, and there’s no real way to divide sixteen keys over ten buttons.

Then I got the idea of using button combinations. You could map the 16 keys to the ABXY buttons by using button combinations with the L+R buttons:

  • ABXY
  • L+[ABXY]
  • R+[ABXY]
  • LR+[ABXY]

However, using this controller scheme for every single ROM would be very awkward and uncomfortable. So I got the idea to make a custom controller scheme for every playable ROM.

		;pointer table: one controller scheme (and its size) per playable ROM
		dw CDefault : db CDefault_end-CDefault ;boot
		dw CDefault : db CDefault_end-CDefault ;fifteenpuzzle
		dw CBlinky : db CBlinky_end-CBlinky

		;controller button bitmasks
		!CUP = $0800
		!CDOWN = $0400
		!CLEFT = $0200
		!CRIGHT = $0100

		!CY = $4000
		!CX = $0040
		!CB = $8000
		!CA = $0080

		!CL = $0020
		!CR = $0010

		;default scheme: all 16 keys via button combinations
CDefault:
		dw !CY : db $01 ; Y
		dw !CX : db $02 ; X
		dw !CB : db $03 ; B
		dw !CA : db $0C ; A
		dw !CY|!CL : db $04 ; Y + L
		dw !CX|!CL : db $05 ; X + L
		dw !CB|!CL : db $06 ; B + L
		dw !CA|!CL : db $0D ; A + L
		dw !CY|!CR : db $07 ; Y + R
		dw !CX|!CR : db $08 ; X + R
		dw !CB|!CR : db $09 ; B + R
		dw !CA|!CR : db $0E ; A + R
		dw !CY|!CL|!CR : db $0A ; Y + LR
		dw !CX|!CL|!CR : db $00 ; X + LR
		dw !CB|!CL|!CR : db $0B ; B + LR
		dw !CA|!CL|!CR : db $0F ; A + LR
		;generally accepted directional keys
		dw !CUP : db $02 ; up
		dw !CLEFT : db $04 ; left
		dw !CRIGHT : db $06 ; right
		dw !CDOWN : db $08 ; down
CDefault_end:

		;example of a custom per-ROM scheme
CBlinky:
		dw !CUP : db $03
		dw !CDOWN : db $06
		dw !CLEFT : db $07
		dw !CRIGHT : db $08
		dw !CA : db $0F
		dw !CB : db $0F
		dw !CY : db $0F
		dw !CX : db $0F
CBlinky_end:

It involved extra work, but in the end, it was worth it. Technically the emulator supports all 16 keys, but at the same time, you can set intuitive controls for each playable ROM.

Closing words

In the end, I am surprised at myself for being able to complete this project at all. I think the very idea of coding an emulator for the SNES is what kept me going. Personally, I think it’s a pretty crazy idea.

SMWCentral’s C3 event was also nearing so I thought it was the perfect opportunity to finish a project and show it off to everyone.

The fact that I can tell people that I coded an emulator for the SNES, no matter how simple the Chip-8 system may be, is something that I can be proud of.

Idea: A potential game engine

Most SNES ROM hackers have probably realized how limiting the SNES can sometimes be. You can only have 8 (H)DMA channels, the number of OAM tiles on the screen is limited (think of the 33rd sprite’s Range Over and the 35th tile’s Time Over limitations), the VRAM is sometimes too small when you have a level with many varying graphics, etc. I guess that is understandable. Consoles didn’t have much processing power back in the day due to limited, and possibly expensive, technology.

Limited ROM space is also a major downer. Nintendo fit an entire game – Super Mario World – into a 512kB ROM using various tricks: they used LC_LZ2 compression, reduced graphics from 4bpp to 3bpp, used RLE compression, etc. They could only afford so much ROM space. As time passed, ROMs got bigger and bigger, but they were still too small to unlock the extreme potential of the SNES (like allowing movie sequences and/or making quick-time-event games).

The PPU registers are also pretty limiting. Certain registers can only be written to during V-blanks (like writing new graphical data). Write too much and you risk flashing scanlines at the top of the screen.

Sometimes, certain SNES limitations could be bypassed. There were enhancement chips which allow bankswitching for more ROM space for example, or chips which allow you to write bitmap data directly like the SuperFX. However, even with the enhancement chips, some limitations just don’t change. You can’t increase the amount of audio channels, you can’t increase the amount of layers, you can’t increase the amount of sprites shown on-screen, etc.

Finally, coding for the SNES has always been a pain to begin with. You code in ASM (Assembly). You write a bunch of LDAs and STAs and hope things work out. It’s very unreadable.

A question I had in mind for a long time is “How do I overcome the SNES limitations?” Of course, there are multiple answers to that, ranging from something as simple as “just don’t attempt it” to “even if you overcome it, it’ll have limited practical use”. So I decided to look at it from another point of view – emulators.

It should be possible to modify emulators to have fewer limitations, but the concept of “ROM hacking” wouldn’t remain the same anymore. Sure, you can allow 512 or more layers, but obviously this won’t run on the actual SNES. It’ll pretty much be like building your own game engine, and you’re still limited to ASM, of course.

This idea gave me another idea: how about building a game engine based on the SNES’ hardware, but with its limits gone? You’d have infinite graphics space, infinite layers, infinite music channels, infinite palettes, etc. You could have as many HDMA channels as you want affecting layers, brightness registers, and so on. You wouldn’t have to allocate RAM for variables manually (because declaring variables in a high-level language picks a free RAM address for you). You could display as many sprite/OAM tiles as you want, as long as your graphics card can handle it. And so on.

For example, want a level with very neat parallax scrolling? Pick 5 layers with 4bpp graphics. Or pick 5 layers with 8bpp graphics with one set of palettes for each 8bpp layer. Or you could have a mode 7 background while you have a fully playable level.

The only question is, how would this be built? Take bits and pieces from existing emulators and make your own engine, or start completely from scratch? Personally, I’d attempt the former, but I wouldn’t even know what language to start in. Oh well.