Add MBR format #241

Open
wants to merge 1 commit into
base: master

@tlehman tlehman commented Apr 27, 2022

Related to issue #23

This is a first pass: it breaks the MBR down into the code_area, the partition_table, and the signature (the magic number marking the end of the boot record).

Future work

  • Disassemble the 16-bit opcodes in the code area, for example:
objdump -D -Mintel,i8086 -b binary -m i386 format/mbr/testdata/mbr.bin | head -15

format/mbr/testdata/mbr.bin:     file format binary


Disassembly of section .data:

00000000 <.data>:
   0:	fa                   	cli
   1:	bc 00 7c             	mov    sp,0x7c00
   4:	31 c0                	xor    ax,ax
   6:	8e d0                	mov    ss,ax
   8:	8e c0                	mov    es,ax
   a:	8e d8                	mov    ds,ax
   c:	52                   	push   dx
   d:	be 00 7c             	mov    si,0x7c00

Background

MBR is short for Master Boot Record. It is the legacy method of booting:
the code area is limited to 446 bytes, and it is all 16-bit x86 opcodes.
Legacy BIOS firmware knows how to load it and execute the instructions there.
The MBR also has a 64-byte partition table that can store up to 4 partition entries.
The GPT partitioning scheme supersedes this and allows for more partitions, but
booting from it generally relies on UEFI.

For more information on MBR, see: https://thestarman.pcministry.com/asm/mbr/PartTables.htm#mbr
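
A minimal Go sketch of the layout described above, using only the standard library rather than fq's decode API. The names MBR and splitMBR are hypothetical and not from this PR; the region sizes (446 and 64 bytes) and the 0x55 0xaa signature are the ones discussed here.

```go
package main

import (
	"errors"
	"fmt"
)

const (
	mbrSize            = 512 // one sector
	codeAreaSize       = 446
	partitionTableSize = 64
)

// MBR holds the three regions of the boot record.
// The type and field names are illustrative only.
type MBR struct {
	CodeArea       []byte
	PartitionTable []byte
	Signature      [2]byte
}

// splitMBR slices a 512-byte sector into its regions and checks the
// 0x55 0xaa boot signature stored in the last two bytes.
func splitMBR(sector []byte) (MBR, error) {
	if len(sector) < mbrSize {
		return MBR{}, fmt.Errorf("need %d bytes, got %d", mbrSize, len(sector))
	}
	m := MBR{
		CodeArea:       sector[:codeAreaSize],
		PartitionTable: sector[codeAreaSize : codeAreaSize+partitionTableSize],
		Signature:      [2]byte{sector[510], sector[511]},
	}
	if m.Signature != [2]byte{0x55, 0xaa} {
		return m, errors.New("missing 0x55aa boot signature")
	}
	return m, nil
}

func main() {
	// Synthetic sector: an empty code area and table plus a valid signature.
	sector := make([]byte, mbrSize)
	sector[510], sector[511] = 0x55, 0xaa

	m, err := splitMBR(sector)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(len(m.CodeArea), len(m.PartitionTable))
}
```

The slices share memory with the input sector, which is fine for read-only inspection; copy them if the input buffer is going to be reused.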

fq.go Outdated
@@ -6,7 +6,7 @@ import (
"github.com/wader/fq/pkg/cli"
)

-const version = "0.0.6"
+const version = "0.0.7"
Owner

Can probably leave this out, i usually increase this just before a new release

doc/dev.md
0x1b0|                                          00 00|              ..| partition_table: raw bits
0x1c0|00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00|................|
*    |until 0x1fd.7 (64)                             |                |
0x1f0|                                          55 aa|              U.| boot_record_sig: raw bits
Owner

Is this always 0x55aa for an MBR? If so, we could seek to it and assert, I guess.

Author

Yes, I'll do that

}

func mbrDecode(d *decode.D, in interface{}) interface{} {
d.FieldRawLen("code_area", 446*8)
Owner

Plan is to use some kind of x86_16 decoder here?

Author

I think so

Author

For the first iteration I'm going to focus on the partition table. Your work on the x86_16 decoder can be reused here.

Owner

Sounds good 👍


func mbrDecode(d *decode.D, in interface{}) interface{} {
d.FieldRawLen("code_area", 446*8)
d.FieldRawLen("partition_table", 64*8)
Owner

Do you know if this partition table format is used outside MBR? If not, I guess it's maybe not worth having it as a separate format.

Author

I don't think so, I just wanted to format it nicely enough that you can tell how big the partitions are

Owner

👍

Owner

wader commented Apr 27, 2022

Nice start! will be interesting to get started with ISA decoding in fq, have some ideas but nothing clear yet so we will probably have to try things out :)

Owner

wader commented Apr 27, 2022

Just comment here, IM or send me an email if you have any questions or want to discuss something

Owner

wader commented Apr 27, 2022

Tried the x86_16 decoder, seems to work ok. But i'm not that pleased with the current output, it would be nice to split out more operands and such.. hmm

$ go run fq.go -o line_bytes=4 -d raw 'tobytes[0:446] | x86_16 | dd' mbr.bin
     │00 01 02 03│0123│.[0:193]: (x86_16)
     │           │    │  [0]{}:
0x000│fa         │.   │    opcode: "cli" (raw bits)
     │           │    │    op: "cli" (0xfa000000)
     │           │    │  [1]{}:
0x000│   bc 00 7c│ ..|│    opcode: "mov sp, 0x7c00" (raw bits)
     │           │    │    op: "mov" (0xbc000000)
     │           │    │  [2]{}:
0x004│31 c0      │1.  │    opcode: "xor ax, ax" (raw bits)
     │           │    │    op: "xor" (0x31c00000)
     │           │    │  [3]{}:
0x004│      8e d0│  ..│    opcode: "mov ss, ax" (raw bits)
     │           │    │    op: "mov" (0x8ed00000)
     │           │    │  [4]{}:
0x008│8e c0      │..  │    opcode: "mov es, ax" (raw bits)
     │           │    │    op: "mov" (0x8ec00000)
     │           │    │  [5]{}:
0x008│      8e d8│  ..│    opcode: "mov ds, ax" (raw bits)
     │           │    │    op: "mov" (0x8ed80000)
     │           │    │  [6]{}:
0x00c│52         │R   │    opcode: "push dx" (raw bits)
     │           │    │    op: "push" (0x52000000)
     │           │    │  [7]{}:
0x00c│   be 00 7c│ ..|│    opcode: "mov si, 0x7c00" (raw bits)
     │           │    │    op: "mov" (0xbe000000)
     │           │    │  [8]{}:
0x010│bf 00 06   │... │    opcode: "mov di, 0x600" (raw bits)
     │           │    │    op: "mov" (0xbf000000)
....

Author

tlehman commented Jun 23, 2022

Probably going to pick up work on this next week, it's Hack Week at SUSE

Owner

wader commented Jun 23, 2022

🥳 let me know if you have questions or want to discuss something

sectorHighBits, err := d.Bits(2)
if err != nil {
d.IOPanic(err, "chs")
}
Owner

@wader wader Jun 30, 2022

You can simplify this to just d.U2() or d.U(2) and it will do the error check and panic. I should probably rename Bits to TryBits: the convention is usually that Try* functions return an error while the others just panic, and the functions usually come in pairs.

d.FieldU8("partition_type", partitionTypes)
d.FieldStrScalarFn("ending_chs_vals", decodeCHSBytes)
d.FieldStrScalarFn("starting_sector", decodeCHSBytes)
d.U8() // extra byte
Owner

@wader wader Jun 30, 2022

Maybe add a field for this byte also? Otherwise it will just be skipped and end up as an "unknown" field. fq figures out where there are gaps and fills them in with unknown fields, so decoders should only skip things that are truly unknown. The idea is to always make all bits "addressable", which is quite useful when digging into broken or strange files.

func decodePartitionTableEntry(d *decode.D) {
d.FieldU8("boot_indicator", scalar.UToDescription{
0x80: "active",
0x00: "inactive",
Owner

You can do a bit field struct here (struct with d.FieldBool("...")) if it's useful to be able to address individual bits. Sometimes i also use 0b1000_0000 etc literals for binary literals

Owner

Maybe this could use scalar.UToSym that way you can write queries as .boot_indicator == "active" and if you really want to compare the non-mapped value do (.boot_indicator | toactual) == 0x80

}
cylinder := (sectorHighBits << 2) | cylinderLowerBits
return scalar.S{Actual: fmt.Sprintf("CHS(%x, %x, %x)", cylinder, head, sector)}
}
Owner

Could it make sense to read CHS as a struct with cylinder, head , sector fields?

Author

Yes, and that suggestion is in line with this general tip

Owner

Ah yes, forgot i wrote that :) maybe should add something more about that it also makes it easier to write queries if things are split up into more individual fields

d.FieldStruct("entry_2", decodePartitionTableEntry)
d.FieldStruct("entry_3", decodePartitionTableEntry)
d.FieldStruct("entry_4", decodePartitionTableEntry)
}
Owner

Sometimes when I have trouble figuring out how to model something, I think about how it will feel and look to a user writing queries. For example, a query to get the partition size here would be .partition_table.entry_1.partition_size for a struct vs .partition_table[0].partition_size for an array.

@@ -0,0 +1,41 @@
# head -c 512 /dev/sda > mbr.bin
Owner

You can add a "$ fq -d mbr dv mbr.bin" line here and the test framework should be able to regenerate the expected output using WRITE_ACTUAL=1 go test ..., so you don't have to copy/paste or do it manually. My workflow is usually to run the tests, check the diffs on failure, and if they look ok, run again with WRITE_ACTUAL=1.

doc/formats.md

Owner

wader commented Jun 30, 2022

Cleaned up some of the Try* things in #302; if you switch to d.*U*() instead of d.Bits it should be fine.

Owner

wader commented Jun 30, 2022

Also fixed some of the doc typos #303

if err != nil {
d.IOPanic(err, "chs")
}
cylinder := (sectorHighBits << 2) | cylinderLowerBits
Owner

Oh, the cylinder bits are "continuous" as 16LE, which is a bit messy to represent :( I've run into this issue with some other formats like https://github.com/wader/fq/blob/master/format/vpx/vp8_frame.go#L36 and haven't figured out a nice way to handle it yet. It would also be nice if there were some decoding helpers for doing this.
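
For reference, a small standard-library Go sketch of reading a 3-byte CHS value as a struct with separate cylinder, head and sector fields. It assumes the commonly documented layout: byte 0 is the head, the low 6 bits of byte 1 are the sector, and the top 2 bits of byte 1 are bits 9-8 of the cylinder, with byte 2 supplying the low 8 bits. CHS and decodeCHS are hypothetical names, not fq helpers. Under that layout the two high bits are shifted left by 8, so if sectorHighBits in the diff holds the raw two-bit value, the shift by 2 there is probably worth double-checking.

```go
package main

import "fmt"

// CHS is a cylinder/head/sector address split into individual fields.
type CHS struct {
	Cylinder uint16 // 10 bits
	Head     uint8  // 8 bits
	Sector   uint8  // 6 bits
}

// decodeCHS unpacks the 3-byte on-disk form.
// Assumed layout: b[0] = head, b[1] = ccssssss, b[2] = cccccccc.
func decodeCHS(b [3]byte) CHS {
	return CHS{
		Head:   b[0],
		Sector: b[1] & 0x3f,
		// The top two bits of b[1] become bits 9-8 of the cylinder.
		Cylinder: uint16(b[1]>>6)<<8 | uint16(b[2]),
	}
}

func main() {
	c := decodeCHS([3]byte{0xfe, 0xff, 0xff})
	fmt.Printf("CHS(%d, %d, %d)\n", c.Cylinder, c.Head, c.Sector)
}
```

Keeping the three values as separate fields also means a query can compare the cylinder or head directly instead of parsing a formatted string.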

@@ -0,0 +1,7 @@
def _mbr__help:
{ notes: "Supports decoding Master Boot Record data",
Owner

Looking at the generated help text maybe this is redundant? I've used this to note missing/extra features etc

var mbrFS embed.FS

func init() {
registry.MustRegister(decode.Format{
Owner

This was refactored a bit in master, is now interp.RegisterFormat(...)

Author

Will rebase and fix

Owner

wader commented Sep 26, 2022

Just a note if you start working on this again: in master format documentation is now .md files again, so you can convert format/mbr/mbr.jq into a normal markdown file, see macho.md etc, and then embed it, see macho.go etc. Then it will be used both for generating documentation and cli help.

Owner

wader commented May 1, 2023

Format registration on master has changed a bit, but it should be straightforward to copy/paste from msgpack.go etc., rename some things, and do the same for formats.go.
