Thursday, 17 November 2016

Building Your Own Boot loader:



I am one of those folks who are very interested in understanding how Operating Systems work, no matter if it is Windows, or Ubuntu (or other Linux distros), or MAC OSX. Building your own bootloader is a pretty easy thing to accomplish; however, I think it would be wise to explain the concept and the terminologies in this area.

Bootloader is a small program that is executed under boot up in 16-bit Real Mode. It is very small, in fact, it is only 512 bytes and it resides on the very first sector of a boot device. The bootloader is used to load more complex programs like an OS Kernel. The 512 byte bootloader code is stored with the Master Boot Record (MBR) and is loaded by the Basic Input/Output System (BIOS) via Interrupt (INT) 0x19 at 0x0000:0x7c00h. This is a standard sequence in booting for almost all x86 PCs, except that in some PCs it can be loaded to 0x7c00:0x0000h.

To be able to accomplish this successfully, you need the following tools installed on your computer machine:


  1. Notepad++ a freeware source code editor supports syntax color highlighting for all programming languages,
    as far as I can confirm it.
  2. Netwide Assembler (NASM) is a freeware assembler used to produce binary programs, or boot images.
  3. VMware Workstation (This one is not freeware), however, if you want a freeware I highly suggest Oracle’s VirtualBox.
    You will need a virtual machine to run this code one; I do not want you to run this code on your own machine.

Download:
Notepad++ (http://notepad-plus-plus.org/)
NASM (http://www.nasm.us/)
VMware Workstation (http://www.vmware.com/products/workstation/)
Virtual Box (https://www.virtualbox.org/)

Let’s build our own bootloader with assembly programming language. Open Notepad++ and created a new file, set the language to Assembly, and replace the content with the code below. Don’t be afraid, I will explain line-by-line below.

bits 16 ; 16-bit Real Mode
org 0x7c00 ; BIOS boot origin 

jmp main ;Jump to start main() entry-point 

;;;;;;;;;;;;;;
; Variables 
;;;;;;;;;;;;;;

Message db "Hello World, booting from low-level 16-bit...", 0x0 
MessageB db "Fisnik's own bootloader program written in x86 assembly language.", 0x0
AnyKey db "Press any key to reboot...", 0x0 

;Print characters to the screen 
Println:
    lodsb ;Load string 
    or al, al
    jz complete
    mov ah, 0x0e  
    int 0x10 ;BIOS Interrupt 0x10 - Used to print characters on the screen via Video Memory 
    jmp Println ;Loop    
complete:
    call PrintNwL

;Prints empty new lines like '\n' in C/C++  
PrintNwL: 
    mov al, 0 ; null terminator '\0'
    stosb       ; Store string 

    ;Adds a newline break '\n'
    mov ah, 0x0E
    mov al, 0x0D
    int 0x10
    mov al, 0x0A 
    int 0x10
 ret

;Reboot the Machine 
Reboot: 
    mov si, AnyKey
    call Println
    call GetPressedKey 

    ;Sends us to the end of the memory
    ;causing reboot 
    db 0x0ea 
    dw 0x0000 
    dw 0xffff 

;Gets the pressed key 
GetPressedKey:
    mov ah, 0
    int 0x16  ;BIOS Keyboard Service 
    ret 

;Bootloader entry-code 
main:
   cli ;Clear interrupts 
   ;Setup stack segments 
   mov ax,cs              
   mov ds,ax   
   mov es,ax               
   mov ss,ax                
   sti ;Enable interrupts 

   ;Print the first characters  
   mov si, Message 
   call Println 

   mov si, MessageB
   call Println 

   call PrintNwL
   call PrintNwL

   call Reboot 

   times 510 - ($-$$) db 0 ;Fill the rest of the bootloader with zeros 
   dw 0xAA55 ;Boot signature

That’s a lot of code, huh? Well, don’t worry, I will explain this line-by-line to make sure you understand it, and if you don’t, feel free to contact me via http://www.itknowledge24.com/contact.php.

The first two lines in this code, we tell NASM that we want it to assemble the output file for 16-bit, we set the boot origin to 0x7c00h which is the standard boot origin in which BIOS loads us at.
bits 16 ; 16-bit Real Mode
org 0x7c00 ; BIOS boot origin

In the next line we jump to the main() code-entry
jmp main ;Jump to start main() entry-point
In the next line we declare three text (strings) variables.
Message db "Hello World, booting from low-level 16-bit...", 0x0 
MessageB db "Fisnik's own bootloader program written in x86 assembly language.", 0x0
AnyKey db "Press any key to reboot...", 0x0
DB means declare byte.

Next, we make our own Println function with assembly code.
Println:
    lodsb
    or al, al
    jz complete
    mov ah, 0x0e  
    int 0x10 ;BIOS Interrupt 0x10 - Used to print screen on Video Memory 
    jmp Println ;Loop    
complete:
    call PrintNwL

The Println function, loads string bytes using lodsb, checks if AL==0 i.e. null, if null termination was found jz will jump to complete and call PrintNwL, but if AL!=0 i.e. does not equal null, make ah = 0x0E and invoke int 0x10, once character printed, keep looping until null termination. This is very straight forward to understand.

Next, we made our own Print New Line ‘\n’ function
PrintNwL: 
    mov al, 0 ; null terminator
    stosb

    ;Adds a newline break '\n'
    mov ah, 0x0E
    mov al, 0x0D
    int 0x10
    mov al, 0x0A 
    int 0x10
 ret

The PrintNwL function, sets al=0, causing ‘\0’ null termination, next it stores the string bytes using stosb. The we set ah = 0x0E, in addition we set al = 0x0D and invoke the int 0x10 BIOS function.

Next we make the Reboot function which reboot the PC when called.
Reboot: 
    mov si, AnyKey ;Store the variable AnyKey in SI (Source Index)
    call Println
    call GetPressedKey 

    ;Sends us to the end of the memory
    ;causing reboot 
    db 0x0ea 
    dw 0x0000 
    dw 0xffff

The Reboot function, stores AnyKey into the SI, and calls Println. Next, GetPressedKey function is called, and the program waits for GetPressedKey to return, once returned we set db 0x0ea, and point us to 0xffff:0x0000h, the end of memory (RAM) causing PC reboot. I explained what DB is, but what is DW? DW means Declare word.

Next we make the GetPressedKey function which waits for the user to press any key on the keyboard and returns when called.
GetPressedKey:
    mov ah, 0
    int 0x16
 ret

That’s it, now we’re done with all of the functions, let’s analyze the first 6 lines in main.
   cli ;Clear Interrupt Flag to equal 0 (Disable)
   ;Setup stack segments 
   mov ax,cs              
   mov ds,ax   
   mov es,ax               
   mov ss,ax                
   sti ;Set Interrupt Flag to equal = 1 (Enabled)

First, we clear interrupt and disable, then we store data from code segment (CS) to AX (Accumulator), next we store data from AX to data segment (DS), next we store data from AX to extra segment (ES), and finally we store data from AX to stack segment (SS), and we re-enable the interrupts by setting the interrupt flag to equal = 1.

Finally the last two lines in this code.
 times 510 - ($-$$) db 0 ;Fill the rest of the bootloader with zeros 
 dw 0xAA55 ;Boot signature

We use the TIMES directive to insert exactly enough 0 bytes into the output to move the assembly point up to 510. The single dollar sign ($) refers to the address of the current line being assembled, and the double dollar sign ($$) refers to the address of the current section. The $-$$ returns the size of the program. The last line in our bootloader is dw 0xAA55 which is the magic number for the boot signature, BIOS INT 0x19 searches for bootable disk, and looks for the following 511 == AA, and 512 == 55, when this is confirmed INT 0x19 loads the program and executes the code.

To build this using NASM, open the Windows Command Prompt (CMD) and move to the directory where you’ve stored the NASM files, make sure boot.asm is stored in the exact same location and execute the following command:
nasm -f bin boot.asm -o boot.img

This will produce boot.img, now open VMware Workstation, and select one of your virtual machines, and click the Floppy icon and select Use Floppy Image file, and browse and select the boot image file produced by NASM. Next power the virtual machine.


My bootloader running in VMware.

If everything worked, congratulations! If something did not work, it is probably because there is a difference between your machine and mine. Or you might have had some issues with the assembly code, but this is easy to fix.