Fun Times Binary Patching the Mach-O Format

And a half-working solution

I’m gonna look into making my own images soon.

No filesystem this week because I never made it on time. It turns out the things that are left to do are actually pretty hard. So this week I’ve decided to indulge myself a little bit in my other, once-favorite hobby: reverse engineering (sort of). I actually wrote this article quite a long time ago but never got around to publishing it. It’s from a time when I had access to an OSX machine and wanted to dig deeper into its internals. It always felt so different from Linux, in ways that seemed silly, and that just gripped me.

So let’s call this my love letter to the OSX operating system. This one’s a full walkthrough of an attempt I made at patching code into a binary on Mac. Hope you enjoy. Also, I failed at this too but it was a lot of fun 😃.

Let’s take things back a year or so.

I found this small achievement a significant stepping stone in my career as a casual reverse engineer (dammit, I love that phrase). I say casual because those who know me also know I can’t bring myself to go all in, thanks to my aversion to heavyweight RE tools like IDA or Ghidra. I find them complex partly because of my love of the simple things in tech like vim and iTerm, but also because my poor eyesight makes them a bottleneck in my workflow.

I liked this project because for what I was trying to achieve, a simple text editor and gcc were all I needed to get the job done.

So let’s get into the what, why and how.

Let’s assess the target we’ll be lining up in our crosshairs; his name is Woody Woodpacker, a 42 school project that did quite a number on my sleep schedule and self-esteem. The one project I just couldn’t beat no matter how hard I tried and how many books I read. I’d say things are different this time round but you’ve already read this far and you know how this ends. But just like any other daredevil, I get paid for the attempt. :D

The goal of the original project was to take this output

And then after patching the binary, make it produce this;

without recompilation of any kind.

Sorcery at its finest.

I wanted to achieve the same thing but with the Mach-O format, as the original project targeted the ELF format that’s supported by Linux systems. I figured it might actually be easier in comparison because the format seemed so much easier to grasp. Naivete at its finest.

Let’s catch you up on a few basics of how the Mach-O format is structured.

This is the basic layout. For those in the know it must look like much less of a challenge than the ELF or even PE format because in comparison it’s relatively basic. We can think of it in three pieces: the Header, the Load Commands and the Data sections.

The only pieces I’ll cover in any detail here are the Load Commands, as they’re the main focus for figuring this thing out.
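To make the Header and Load Commands a bit more concrete, here’s a rough Python sketch (the real tools were written in C, so this is not the original code). It assumes a thin, 64-bit, little-endian Mach-O; a universal (fat) binary has an extra wrapper in front that this doesn’t handle. The function name `walk_load_commands` is made up for illustration; the field layout follows `mach_header_64` and `load_command` from loader.h.

```python
import struct

MH_MAGIC_64 = 0xFEEDFACF   # magic number of a 64-bit little-endian Mach-O
HEADER_FMT = "<IiiIIIII"   # mach_header_64: magic, cputype, cpusubtype,
                           # filetype, ncmds, sizeofcmds, flags, reserved
HEADER_SIZE = struct.calcsize(HEADER_FMT)  # 32 bytes


def walk_load_commands(data):
    """Return (file_offset, cmd, cmdsize) for every load command.

    Hypothetical helper: `data` is the whole file as bytes.
    """
    fields = struct.unpack_from(HEADER_FMT, data, 0)
    magic, ncmds = fields[0], fields[4]
    if magic != MH_MAGIC_64:
        raise ValueError("not a thin 64-bit little-endian Mach-O")

    offset = HEADER_SIZE  # the load commands start right after the header
    commands = []
    for _ in range(ncmds):
        # Every load command begins with the same two fields.
        cmd, cmdsize = struct.unpack_from("<II", data, offset)
        commands.append((offset, cmd, cmdsize))
        offset += cmdsize  # cmdsize covers the whole command
    return commands
```

Everything after the last load command is the “Data” part of the file, which the commands themselves point into.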

Also, no Go (or source code snippets) this week as I opted to code my solution in pure C. It’s a process I undertake when I’m looking to truly understand a technical concept because, in my opinion, it highlights the tiny details that a parse over with a tool might keep hidden and abstracted away. All my code will be linked at the end of this article so you can check it out if you’re interested.

Let’s get started with the first little program I wrote.

Here I wrote a program to show me every single section of the binary with each section’s start and end values. Its output looks like this:

Mind the typos, it was a long night.. :(

Basically, those blocks marked LOAD_COMMAND are all the Load Commands in the binary. They all have unique names, but I left out the names of the ones we didn’t need.

The names that all start with “__” are the names of various sections which are referenced in some way or another by the Load Commands. Under the hood, I got all their offsets by reading them out of the Load Command blocks that describe them.

The start and end index values tell us where these specific blocks of code are in the program; where they start and end relative to the start of the file.
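As a hedged illustration of where those start and end values come from, here’s a Python sketch (again, not the original C tool) that reads them out of the `LC_SEGMENT_64` load commands. It assumes a thin 64-bit little-endian Mach-O; `list_sections` is a hypothetical name, and the struct layouts follow `segment_command_64` and `section_64` from loader.h.

```python
import struct

LC_SEGMENT_64 = 0x19
SEGMENT_FMT = "<II16sQQQQiiII"      # segment_command_64 (72 bytes)
SECTION_FMT = "<16s16sQQIIIIIIII"   # section_64 (80 bytes)
SEGMENT_SIZE = struct.calcsize(SEGMENT_FMT)
SECTION_SIZE = struct.calcsize(SECTION_FMT)


def list_sections(data):
    """Return (segname, sectname, start, end) file ranges for each section."""
    ncmds = struct.unpack_from("<I", data, 16)[0]  # ncmds sits at byte 16
    offset = 32                                    # size of mach_header_64
    found = []
    for _ in range(ncmds):
        cmd, cmdsize = struct.unpack_from("<II", data, offset)
        if cmd == LC_SEGMENT_64:
            nsects = struct.unpack_from(SEGMENT_FMT, data, offset)[9]
            sect_off = offset + SEGMENT_SIZE  # sections follow the segment
            for _ in range(nsects):
                sect = struct.unpack_from(SECTION_FMT, data, sect_off)
                sectname = sect[0].rstrip(b"\0").decode()
                segname = sect[1].rstrip(b"\0").decode()
                size, start = sect[3], sect[4]  # size in bytes, file offset
                found.append((segname, sectname, start, start + size))
                sect_off += SECTION_SIZE
        offset += cmdsize
    return found
```

Each tuple is one line of a dump like the one above: which segment the section belongs to, its name, and where it starts and ends relative to the start of the file.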

These details were extremely important for the first approach I came up with where I found relatively minor success. Although, I’m not entirely convinced it was all for nothing.

So here’s how it went.

When a binary file or program is executed (think ls, python3, cat, env etc.), the code that actually runs lives in the __text section of the __TEXT segment. The Load Commands don’t hold that code themselves; they record where each segment and section sits as an offset from the start of the file.

The __TEXT segment’s Load Command lists a __text section entry which, in turn, holds the offset of the executable code in the file.

My first bright idea was to see if it was possible to simply add my own custom code to this section and run it.

The code to be attached had to be written completely in Assembly to ensure it was as small as possible. My first test was tiny. It didn’t do anything; I just needed to ensure that everything still worked after its insertion.

So, the target C code we’d be infectin — I mean editing looks like this:

And its assembly output after a quick compilation with gcc is;

This is just the code section. This file’s a lot bigger.

And lastly, the code I wrote that I’d use to patch the binary above;

NOP is an instruction that basically does nothing. There’s more to it sure but for this article, that explanation will do.

And a patch later, we have a successful prepend to the code.

But I’m sure you would’ve noticed that there’s a glaring issue with my newly patched version when compared to its predecessor’s output.

Yup. In the previous, pre-patched output, the string value at address 0x100000ed4 is not only no longer at this position in the newly patched binary (it’s been shifted to 0x100000ed8, by the 4 NOP bytes I prepended), it also no longer has its string value.
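Here’s a toy Python demonstration of that shift. It is not the real binary: the bytes are made up, and it assumes the single-byte x86 NOP (0x90), which matches four NOPs taking up four bytes. The helper name `prepend_bytes` is hypothetical.

```python
NOP = b"\x90"  # assumption: x86, where NOP is a single byte


def prepend_bytes(data, at, patch):
    """Insert `patch` at offset `at`, pushing everything after it along."""
    return data[:at] + patch + data[at:]


# Made-up stand-in for a code section followed by a string.
original = b"\x55\x48\x89\xe5\xc3" + b"Hello\x00"
patched = prepend_bytes(original, 0, NOP * 4)

# The string is still in the file, but 4 bytes further along than before,
# so anything that recorded its old position now looks in the wrong place.
shift = patched.index(b"Hello") - original.index(b"Hello")
print(shift)
```

Inserting bytes moves everything that follows, while every offset recorded elsewhere in the file still describes the old layout.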

Something’s gone horribly wrong and this binary no longer works.

Note: a header file in C is a way of importing separate functions in other files and objects into your program. These files (that end with a .h) tend to have a lot of helpful info if you’re trying to debug native C/C++ code on your system.

After further reading the loader.h header file that contains info on the structs needed to parse the binary, I discovered that there was a Load Command value I neglected. One called LC_MAIN. LC_MAIN points to the starting address of C’s main() function — the entrypoint — and where it is in the binary. This is basically where the program actually starts running. It doesn’t start at the very top of the __text section as I thought it did. Because my NOP bytes shifted every value down by 4, all these offsets were completely wrong, mangling the binary.
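For the curious, this is roughly how that entry offset can be read, sketched in Python rather than the C I’d have used at the time. It assumes a thin 64-bit little-endian Mach-O; `find_lc_main` is a made-up name, while `LC_MAIN` and the `entry_point_command` layout (cmd, cmdsize, entryoff, stacksize) come from loader.h.

```python
import struct

LC_MAIN = 0x80000028  # 0x28 | LC_REQ_DYLD in loader.h


def find_lc_main(data):
    """Return (offset_of_entryoff_field, entryoff), or None if absent."""
    ncmds = struct.unpack_from("<I", data, 16)[0]  # ncmds sits at byte 16
    offset = 32                                    # size of mach_header_64
    for _ in range(ncmds):
        cmd, cmdsize = struct.unpack_from("<II", data, offset)
        if cmd == LC_MAIN:
            # entry_point_command: cmd, cmdsize, entryoff (u64), stacksize (u64)
            entryoff, _stacksize = struct.unpack_from("<QQ", data, offset + 8)
            return offset + 8, entryoff
        offset += cmdsize
    return None
```

The first value returned is where the 8-byte entry offset lives in the file, which is the number that has to change if the entry point moves.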

Sure, I could’ve just experimented with updating LC_MAIN’s entry offset value by 4 bytes to accommodate my NOP values but I started to wonder if there were more invisible numbers and offsets I’d have neglected to update. Prepending wasn’t going to work. Or at least, not with the current skill and knowledge I had over the format.

So, after posting my problems on Reddit and StackOverflow, and a few mugs of coffee later, I finally had something to go on: the Mach-O file format has a lot of code caves.

And I mean a lot.

Here’s a pretty picture of a cave to break the monotony of this post.

A code cave is a stretch of a binary that’s filled with nothing but NULLs. That’s programmer speak for “nothing happens here” and is represented as a bunch of ‘0’s. This is great for patching because it implies that we can do whatever we want there without being penalized with a broken binary. We can’t break anything because nothing’s meant to run in these positions anyway.

This is an example of a few code caves in the ‘ls’ utility. See all those inviting zeroes? My patch is gonna go in one of those.

Output on running `hexdump -D /bin/ls` on OSX

And so I decided on my target. We were gonna go for a big one; we’ll be patching LS.

So, I proceeded to write yet another tool. This time, it finds all code caves in the binary file and shows me how big each cave is as well as its start index.

Its output isn’t glamorous but let’s face it, I was hacking away at this point as frustration hit fever pitch.

The code caves inside the ‘ls’ utility.

I kinda feel a little guilty for phoning it in the way I did with this one. But basically, see all those dashes lined up? Those are contiguous sections of long strings of zeroes (I didn’t use ‘0’s for the output because it started to hurt my eyes after a while). These are the caves with their starting index values.
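The cave hunt itself is simple enough to sketch in a few lines of Python. This is not the original C tool, just the same idea; `find_caves` and its `min_size` cutoff are names and choices made up for the example.

```python
import re


def find_caves(data, min_size=16):
    """Return (start_index, size) for every run of at least min_size zero bytes."""
    pattern = re.compile(rb"\x00{%d,}" % min_size)
    return [(m.start(), m.end() - m.start()) for m in pattern.finditer(data)]


def biggest_cave(data, min_size=16):
    """Return the largest cave as (start_index, size), or None if there is none."""
    caves = find_caves(data, min_size)
    return max(caves, key=lambda cave: cave[1]) if caves else None
```

Reading a file with `open(path, "rb").read()` and passing the bytes in gives the same kind of list as the output above. One caveat worth stating loudly: a run of zeroes in the file is only a candidate. Whether code placed there can actually execute depends on which segment covers it and that segment’s permissions.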

All I had to do now was find a cave that was bigger than the code I wanted to patch in. And considering the size of my patching efforts, I had a few options. Maybe out of paranoia or something, I opted for a rule in which any code that would be patched had to be as close to the __TEXT,__text section as it could be. I felt I knew a lot more about the sections in and around that part of the binary than anything further away, so crashes would be a lot easier to debug.

The cave I was going to patch started at index 1849: the one that went on forever and ever. So, I had my target, now for the patch. I wanted something a little more elaborate to christen my upcoming conquest over the format that had eluded me for so long.

And here it is;

Sure, it’s a classic “hello world”, but there are a lot of subtle tricks to it that make it relatively different from the standard Assembly “hello world”, and that take a bit of understanding of stack frames and how they work in execution and such (if you’d like to know more, feel free to drop me a message… Or I could, you know, link you to an article on it or whatever).

Apart from the hello world string itself, that exceptionally large number on line 18 is significant and we’ll spend some time talking about it.

After fiddling with LLDB (OSX’s binary debugger) a little bit — the details in this are enough to make a separate blog entry entirely — I was able to determine that the entry point for the main() function is the value of that exceptionally large number at line 18. After figuring that out, the rest of the patch was easy to write.

The basic anatomy of the patch goes like this:

⦁ Find the largest code cave and save its index.

⦁ Find the LC_MAIN load command in the binary and change its entry offset to point to this saved index as its starting point.

⦁ Patch my assembly code’s bytes at the code cave’s index.

⦁ Run the program.
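The steps above can be sketched like this in Python. It’s a minimal illustration under assumptions, not my actual packer: `apply_patch` and all of its parameters are hypothetical, and it expects the caller to have already located the cave and the file offset of LC_MAIN’s 8-byte entry offset field.

```python
import struct


def apply_patch(data, cave_start, cave_size, payload, entryoff_field):
    """Copy payload into a cave and point the entry offset at it.

    Returns (patched_bytes, original_entryoff). The file size never
    changes, so no other offsets in the binary are shifted.
    """
    if len(payload) > cave_size:
        raise ValueError("payload does not fit in the cave")
    if any(data[cave_start:cave_start + len(payload)]):
        raise ValueError("target range is not all zeroes")

    patched = bytearray(data)
    # Remember where main() used to start so the payload can jump back to it.
    original_entry = struct.unpack_from("<Q", patched, entryoff_field)[0]
    # Overwrite the zeroes in place rather than inserting new bytes.
    patched[cave_start:cave_start + len(payload)] = payload
    # Redirect the entry point to the start of the payload.
    struct.pack_into("<Q", patched, entryoff_field, cave_start)
    return bytes(patched), original_entry
```

The key difference from the prepend attempt is that this overwrites bytes instead of inserting them, so nothing else in the file moves.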

My program would start, jump to my patch, run my “hello world”, then at the end, jump back to the original main’s code and run that. Patch successful with flawless execution!

And then…

A segmentation fault is basically a crash… :/

It crashed but I don’t think this was a complete fail, not entirely. I’d made significant progress. I got the first part of the code to run; it printed my “Hello World”. But why did it fail? I thought I’d patched main’s original address into the code pretty accurately. I had to go back into LLDB. But then something even stranger happened.

It runs just fine inside LLDB which was strange at first.

It took me a while to realize that the process was being set up differently under the debugger.

When a debugger launches a program, it starts it as a separate process that it controls, and it gets a say in how that process is set up. LLDB, by default, launches its targets with address space layout randomization (ASLR) switched off.

The address I had for main() (that massive number on line 18 of my patch) was only correct for where the binary sat in memory when LLDB was running it. Outside LLDB the binary is most likely loaded at a different, randomized base address on every run, so that number pointed at nothing useful. Basically this would never work outside of LLDB with a hard-coded absolute address; I’d have to work out the offset (the slide) that gets added at load time, or avoid absolute addresses altogether.
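One possible way around a hard-coded address, which I haven’t tested on the real binary, is a relative jump. Since the cave and main() sit in the same segment, the distance between them shouldn’t change wherever the binary ends up in memory. Here’s a small Python sketch of the arithmetic, assuming an x86-64 target and its 5-byte `jmp rel32` encoding; `jmp_rel32` and the example offsets are made up for illustration.

```python
import struct


def jmp_rel32(jmp_offset, target_offset):
    """Encode an x86-64 `jmp rel32` placed at jmp_offset that lands on target_offset.

    The displacement is measured from the end of the 5-byte instruction,
    so only the distance between the two locations matters, not the
    address the binary was loaded at.
    """
    displacement = target_offset - (jmp_offset + 5)
    return b"\xE9" + struct.pack("<i", displacement)


# Hypothetical numbers: a jump at file offset 0x800 back to a main() at 0xED0.
print(jmp_rel32(0x800, 0xED0).hex())  # e9cb060000
```

Ending the payload with bytes like these, instead of loading an absolute address, would sidestep the question of where the loader put things.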

That’s that. A new thing to research on my journey towards perfectly patching the Mach-O format. In the end, I never got it working completely, only partly. It’s not really useful if it needs a debugger to work but I’ll come back to this some other time. It was a fun project at the time and perhaps now I could actually go back and beat it, as I’m a lot better at this engineering stuff than I was when I attempted this.

Or perhaps not. My filesystem won’t build itself.

So we’re back to it next week. Incidentally, if you’re new, check out my filesystem project and articles. You can start at its GitHub page:

Alright, that’s it.

Cheers 😃

Links to all this article’s stuff:

My first attempt:

The code that neatly dumps segments out:

The Code Cave semi-perfect (not even close) packer:

Mach-O file format basics:

I’m a software developer by day and tinkerer by night. Working on getting into open source stuff with a focus on C and Python. I’m also a Ratchet and Clank fan.