{"id":217,"date":"2022-04-11T08:59:47","date_gmt":"2022-04-11T08:59:47","guid":{"rendered":"https:\/\/wqrld.net\/blog\/?p=217"},"modified":"2022-04-18T09:28:22","modified_gmt":"2022-04-18T09:28:22","slug":"building-a-boot-sector-os","status":"publish","type":"post","link":"https:\/\/wqrld.net\/blog\/building-a-boot-sector-os\/","title":{"rendered":"Building a boot-sector OS"},"content":{"rendered":"\n<p><\/p>\n\n\n\n<p>Many technical people always want to re-invent the wheel. &#8220;Operating systems are bloated&#8221; and all that kind of stuff. I challenge you to make your own. (with a bit of help from this tutorial, I won&#8217;t leave you hanging \ud83d\ude42 )<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Tools<\/h2>\n\n\n\n<p>We are going to build a simple boot-sector operating system. This is basically an operating system that fits in the first 512 bytes of a disk: the boot sector, which the BIOS loads into memory and executes when it boots from that disk.<\/p>\n\n\n\n<p>To do this we will need 3 tools:<br>&#8211; NASM: An x86 assembler to turn your assembly language into a binary file.<br>&#8211; QEMU: The Quick Emulator. It can boot your raw binary file directly and can show its text output in your terminal.<br>&#8211; Any text editor. I like nano.<\/p>\n\n\n\n<p>On a Debian-based Linux distribution, these can be installed with:<br><code>apt -y install nasm qemu-system-x86 nano<\/code><\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Simple boot sector<\/h2>\n\n\n\n<pre class=\"wp-block-code\"><code>jmp $ ; jump to the current address: an infinite loop\n\ntimes 510-($-$$) db 0 ; fill the empty space with zeroes\n\ndw 0xaa55 ; write the magic bytes 55 aa at the end of our file\n<\/code><\/pre>\n\n\n\n<p>This is a simple piece of assembly that will jump infinitely to the current address. This will not do much apart from spinning up your fans, but it will be a great start!<\/p>\n\n\n\n<p>The middle line fills the empty space with zeroes. $ is the address of the current line and $$ is the address of the start of the current section, which here is the start of your program. 
So $-$$ is the number of bytes written so far, and with 510-($-$$) you can pad the program to be exactly 510 bytes (the 2 magic bytes that follow make our 512). See the <a href=\"https:\/\/nasm.us\/doc\/nasmdoc3.html#section-3.5\">docs<\/a>.<\/p>\n\n\n\n<p>As you can see, comments are added with the ; symbol. This works just like Python&#8217;s # or JavaScript&#8217;s \/\/. It is smart to add plenty of comments to your assembly code, as it will otherwise become quite hard to read.<\/p>\n\n\n\n<p>We have to fill the last 2 bytes with 0xaa55 to show the BIOS that this disk is bootable. x86 is little-endian, so the word 0xaa55 ends up on disk as the bytes 55 aa. It wouldn&#8217;t be good if your PC started booting from your game storage disk, would it?<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Running our first program<\/h2>\n\n\n\n<p>Save your code to a file called boot.asm and then run<\/p>\n\n\n\n<p><code>nasm boot.asm -f bin -o boot_sect.bin<\/code><\/p>\n\n\n\n<p>to assemble boot.asm into a boot_sect.bin that your PC can understand. Then you can use<\/p>\n\n\n\n<p><code>qemu-system-x86_64 boot_sect.bin -curses<\/code><\/p>\n\n\n\n<p>to boot it. The -curses option draws the VGA text output in your terminal.<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"https:\/\/i.wqrld.net\/static\/Sysadmin_Wwb.png\" alt=\"\"\/><figcaption>You should see the BIOS boot screen pop up and see one core shoot to 100% (your jmp loop)<\/figcaption><\/figure>\n\n\n\n<p>You might not be able to ctrl-c\/z out of this. 
If this is the case, you can open a separate shell and kill the QEMU process, for example with <code>pkill -f qemu-system-x86_64<\/code> (note that this kills every QEMU instance started with that command).<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">But that&#8217;s just a BIOS boot screen?!<\/h2>\n\n\n\n<pre class=\"wp-block-code\"><code>mov ah, 0x0e ; BIOS routine scrolling teletype\n\nmov al, 'H' ; move byte for 'H' into the al register\nint 0x10   ; execute the 0x10 \"print to screen\" interrupt\nmov al, 'E'\nint 0x10\nmov al, 'L'\nint 0x10\nmov al, 'L'\nint 0x10\nmov al, 'O'\nint 0x10\nmov al, '!'\nint 0x10\n\n\n\njmp $ ; freeze the screen so we can see our text\n\n\ntimes 510-($-$$) db 0\n\n\n\ndw 0xaa55 ; magic bytes<\/code><\/pre>\n\n\n\n<p>The code above will use a BIOS routine to print some text to your screen. Try it yourself and play around a bit with the text.<\/p>\n\n\n\n<p>The code first loads the routine number (0x0e, teletype output) into the AH register and then the ASCII code for the letter into the AL register. Then it calls the 0x10 interrupt, which tells the BIOS to use the AH and AL registers to print something to the screen. You can find more on registers <a href=\"https:\/\/en.wikibooks.org\/wiki\/X86_Assembly\/X86_Architecture\">here<\/a>. (absolutely worth reading after this article)<\/p>\n\n\n\n<p>&#8212;<\/p>\n\n\n\n<p>And then assemble and run:<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"https:\/\/i.wqrld.net\/static\/Games_rMx.png\" alt=\"\"\/><figcaption>Yay!<\/figcaption><\/figure>\n\n\n\n<h2 class=\"wp-block-heading\">Logic<\/h2>\n\n\n\n<p>Right now we are just manually printing out characters, but it wouldn&#8217;t be a computer without some compute. 
So let&#8217;s do a simple calculation:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>mov ah, 0x0e ; BIOS text typing routine\r\n\r\nmov al, '1' ; our text, to make it look better\r\nint 0x10\r\nmov al, ' '\r\nint 0x10\r\nmov al, '+'\r\nint 0x10\r\nmov al, ' '\r\nint 0x10\r\nmov al, '1'\r\nint 0x10\r\nmov al, ' '\r\nint 0x10\r\nmov al, '='\r\nint 0x10\r\nmov al, ' '\r\nint 0x10\r\n\r\n\r\n; We are incrementing the ASCII code here, not the actual integer\r\nmov cl, 49 ; ASCII code for '1', make sure to use another 8 bit register\r\nadd cl, 1 ; after '1' comes '2'\r\nmov al, cl ; move to the al register that is used for printing\r\nint 0x10 ; print!\r\n\r\n\r\njmp $ ; freeze the screen so we can see our text\r\n\r\ntimes 510-($-$$) db 0\r\n\ndw 0xaa55 ; magic bytes<\/code><\/pre>\n\n\n\n<p>The important part here is the block right above the jmp $ instruction.<\/p>\n\n\n\n<p>Computers use what&#8217;s called an &#8220;ASCII table&#8221;. This is basically a mapping between numbers (bytes) and characters.<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"https:\/\/www.asciitable.com\/asciifull.gif\" alt=\"ASCII Table\"\/><\/figure>\n\n\n\n<p>As you can see, we start by moving the decimal number 49 into the cl (8 bit) register. In the table this corresponds to a &#8216;1&#8217;. NASM also lets you move the actual character directly, but that does the same thing under the hood.<\/p>\n\n\n\n<p>The cool thing about ASCII characters is that they are just numbers: you can increment &#8216;1&#8217; (49) by one and get &#8216;2&#8217; (50). But you can also increment &#8216;9&#8217; (57) by one and end up with a &#8220;:&#8221; (58).<\/p>\n\n\n\n<p>The order of arguments might be a bit confusing to anyone who has never worked with assembly. In the Intel syntax that NASM uses, the first argument is the destination and the second is the source. 
You can see this quite well in the add and mov instructions.<\/p>\n\n\n\n<p>So let&#8217;s assemble it and try!<\/p>\n\n\n\n<figure class=\"wp-block-image size-full is-resized\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/wqrld.net\/blog\/wp-content\/uploads\/2022\/04\/image.png\" alt=\"\" class=\"wp-image-237\" width=\"272\" height=\"72\"\/><figcaption>yay!<\/figcaption><\/figure>\n\n\n\n<h2 class=\"wp-block-heading\">Functions and loops<\/h2>\n\n\n\n<p>(WIP chapter) It would be a lot easier if we could print a whole string at once. Here&#8217;s a function that does just that!<\/p>\n\n\n\n<p>Do not forget the org directive. It tells NASM that the BIOS loads your boot sector at address 0x7c00, so that labels such as str point to the right place in memory when you use indirect addressing with square brackets.<\/p>\n\n\n\n<p>Caveat: this function calls itself once per letter, and every call pushes a return address and a copy of all registers (pusha) onto the stack. To restore the stack once the null byte is found, each call has to end with popa and ret, so that the calls unwind one by one. For long strings, a plain loop would need far less stack.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>&#91;org 0x7c00]\r\n\r\nmov ah, 0x0e ; BIOS text typing routine\r\n\r\nmov bx, str\r\ncall printstring\r\n\r\nfreeze:\r\n  jmp $ ; freeze the screen so we can see our text\r\n\r\n\r\nprintstring:\r\n  pusha ; save all registers\r\n  cmp byte &#91;bx], 0 ; Stop when we see a null byte\r\n  je .done ; unwind once we've had all letters\r\n\r\n  mov al, &#91;bx] ; load the byte that bx points to into al\r\n  int 0x10 ; print!\r\n\r\n  add bx, 1 ; move to the next letter\r\n\r\n  call printstring ; recursively print the remaining letters\r\n\r\n.done:\r\n  popa ; restore the registers saved by pusha\r\n  ret ; return to the caller\r\n\r\n\r\nstr:\r\n  db 'Hello, Wqrld!',0 ; terminate with \\0\r\n\r\n\r\ntimes 510-($-$$) db 0\r\n\r\ndw 0xaa55 ; magic bytes\r<\/code><\/pre>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"https:\/\/i.wqrld.net\/static\/Linux_Ako.png\" alt=\"\"\/><figcaption>You can now print strings<\/figcaption><\/figure>\n\n\n\n<h2 class=\"wp-block-heading\">And onwards&#8230;<\/h2>\n\n\n\n<p><\/p>\n\n\n\n<p>You did it, you have now made your own very very very basic OS. 
The rest is up to you \ud83d\ude09<\/p>\n\n\n\n<p><\/p>\n\n\n\n<p>Just joking, you can read more awesome stuff at <a href=\"https:\/\/www.cs.bham.ac.uk\/~exr\/lectures\/opsys\/10_11\/lectures\/os-dev.pdf\">https:\/\/www.cs.bham.ac.uk\/~exr\/lectures\/opsys\/10_11\/lectures\/os-dev.pdf<\/a><\/p>\n\n\n\n<p><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Many technical people always want to re-invent the wheel. &#8220;Operating systems are bloated&#8221; and all that kind of stuff. I challenge you to make your own. (with a bit of help from this tutorial, I won&#8217;t leave you hanging \ud83d\ude42 ) Tools We are going to build a simple boot-sector operating system. This is &hellip; <a href=\"https:\/\/wqrld.net\/blog\/building-a-boot-sector-os\/\" class=\"more-link\">Continue reading<span class=\"screen-reader-text\"> &#8220;Building a boot-sector OS&#8221;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":245,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[],"class_list":["post-217","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-uncategorized"],"_links":{"self":[{"href":"https:\/\/wqrld.net\/blog\/wp-json\/wp\/v2\/posts\/217","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/wqrld.net\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/wqrld.net\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/wqrld.net\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/wqrld.net\/blog\/wp-json\/wp\/v2\/comments?post=217"}],"version-history":[{"count":22,"href":"https:\/\/wqrld.net\/blog\/wp-json\/wp\/v2\/posts\/217\/revisions"}],"predecessor-version":[{"id":247,"href":"https:\/\/wqrld.net\/blog\/wp-json\/wp\/v2\/posts\/217\/revisions\/247"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/wqrld.net\/blog\/wp-json\/
wp\/v2\/media\/245"}],"wp:attachment":[{"href":"https:\/\/wqrld.net\/blog\/wp-json\/wp\/v2\/media?parent=217"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/wqrld.net\/blog\/wp-json\/wp\/v2\/categories?post=217"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/wqrld.net\/blog\/wp-json\/wp\/v2\/tags?post=217"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}