Biteye & Vix


Several years ago, when I started playing in random CTFs and wargames, I kept opening files to guess what was hidden inside and to understand the format and layout of their structures. I do the same at home whenever I need to dig into a core file, a broken partition or any file whose type I just don't know. Along the way I found an extremely useful way to display a file's contents.

Biteye (and its improved version, Vix) is yet another graphical (SDL-based) hexadecimal dump tool designed for GNU/Linux (although it may compile on other Unices) with one important feature: it lets you see the patterns formed by a file's bits. With some practice, you can train your eyes to spot sections within an executable file, a firmware image, or any file made up of blocks of different content.

Some say this sounds like science fiction, and others may say it is nothing new. However, there are not many tools like this for GNU/Linux and I use it quite often at work (I work as a security consultant, and reverse engineering firmware is part of my job), so I decided to make it available to the public under the terms of the GPLv3 license.

These tools are extremely straightforward to use (arrow up/down and page up/down to scroll). In these early versions they just display two bit views (bits arranged horizontally on byte rows in orange pixels, and bits arranged horizontally on bit columns in green pixels) and a hexdump, but I plan to add features such as interactive editing, a command line, bfd-based (or even radare-based) disassembly, real-time memory debugging of an external process... It will take time, but I'll prioritize my work on it if it ends up becoming popular. There are also some examples at the bottom of this page, showing what code from different architectures, some ELF structures, bitmaps and so on look like. Enjoy! :)
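To give a rough idea of what a bit view shows, here is a minimal sketch in Python (not the actual biteye/vix code, which is SDL-based). It assumes one possible layout: each byte becomes 8 cells, most significant bit first, and a fixed number of bytes is drawn per row. The function name `bit_rows`, the `bytes_per_row` parameter and the `#`/`.` characters are made up for this illustration.

```python
def bit_rows(data: bytes, bytes_per_row: int = 4) -> list[str]:
    """Render data as text rows: '#' for a set bit, '.' for a clear bit.

    Assumption for this sketch: bits are drawn MSB first and every row
    holds bytes_per_row bytes. The real tools may use another layout.
    """
    rows = []
    for offset in range(0, len(data), bytes_per_row):
        chunk = data[offset:offset + bytes_per_row]
        # format(b, "08b") gives the 8 bits of a byte, MSB first
        bits = "".join(format(b, "08b") for b in chunk)
        rows.append(bits.replace("1", "#").replace("0", "."))
    return rows


if __name__ == "__main__":
    # The ELF magic followed by a few header bytes
    for row in bit_rows(b"\x7fELF\x02\x01\x01\x00"):
        print(row)
```

Even in this crude text form, repeated structures show up as columns and stripes, which is the effect the pixel views exploit.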

Features

Some of the current features (yet still experimental) are:

These are the next features I have in the TODO right now and I plan to implement ASAP:

Download and source code

You can download the latest working version of biteye here, and the latest working version of vix here. You will need at least the SDL 1.2.5 development files to build them. Note that biteye will probably be discontinued if nobody seems to care about it, as vix does the same job better and with a fancier look.

These projects are both hosted on this website (check my git repos here) and on GitHub (here for biteye and here for vix). If you are interested in hacking on them and sending me a patch, the easiest way to get the source code is to download it directly from GitHub. To clone biteye's repository, just run:

% git clone https://github.com/BatchDrake/biteye.git

and for vix:

% git clone https://github.com/BatchDrake/vix.git

The configure script is missing in the git repos, so you'll need to generate it with libtoolize and autoreconf:


  % libtoolize
  % autoreconf -fvi

Once you have the code (either from git or from the link to the latest version) you can compile it with:


  % ./configure
  % make
  % sudo make install
  

Contact me

For questions, doubts, patches or anything else, you can find my e-mail address on the main page of this website here. You can also follow me on Twitter or follow/fork me on GitHub.

Screenshots

Biteye screenshots
/bin/ls, floppy boot record, the first bytes of a Win32 DLL, plain text file

Vix screenshots
/bin/ls, floppy boot record, the first bytes of a Win32 DLL, plain text file

What the Hilbert display will look like

I've been doing some research and implemented some tests to get the most useful Hilbert curve representation of a file, DWORD by DWORD. I came to the conclusion that there must be two kinds of Hilbert curve representations, one for little endian and one for big endian. This is because I decided to implement the Hilbert curve representation as a representation of all 32-bit words contained in the file, taking three bytes of each as the RGB components and ignoring the fourth. As small values are more likely to appear in most binary files, the most significant byte will be zero most of the time, so it is the one that gets ignored, and the position of that byte within the word depends on the endianness.

Although I haven't yet decided on the proper way to implement this in Vix, I've already done my experiments. For a little-endian representation of some files, showing the LSB in red, the second LSB in green and the third LSB in blue, we get results like these:

Hilbert curve display preview
MP4 video, ZIP compressed file, /bin/bash, /bin/busybox, ext2 partition
Note for inquiring minds: without the most significant bytes you won't be able to unzip that file, and in any case it contains nothing you can't find on celestiamotherlode.net. The ext2 partition has nothing important in it, just a kernel image I used for experiments. The answer to the question some of you have in mind is yes: you can gather a lot of information from 24 bits out of each 32-bit word (mainly from text files), so in the future try not to share too many views of a file if the information being displayed is secret. The same applies to bit views, of course.
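The representation described in this section can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code used for the previews above: the index-to-coordinate mapping is the well-known iterative Hilbert curve conversion, the names `hilbert_d2xy`, `dword_to_rgb` and `hilbert_pixels` are invented for this example, trailing bytes that don't fill a whole DWORD are simply dropped, and the big-endian case is assumed to use the same channel assignment by significance (red for the LSB) as the little-endian one.

```python
def hilbert_d2xy(order: int, d: int) -> tuple[int, int]:
    """Map index d along a Hilbert curve to (x, y) on a 2**order square."""
    x = y = 0
    t = d
    s = 1
    side = 1 << order
    while s < side:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:
            if rx == 1:
                # Flip the quadrant before rotating it
                x = s - 1 - x
                y = s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y


def dword_to_rgb(word: bytes, little_endian: bool = True) -> tuple[int, int, int]:
    """Use the three least significant bytes as (R, G, B); ignore the MSB."""
    value = int.from_bytes(word, "little" if little_endian else "big")
    return value & 0xFF, (value >> 8) & 0xFF, (value >> 16) & 0xFF


def hilbert_pixels(data: bytes, order: int, little_endian: bool = True) -> dict:
    """Return {(x, y): (r, g, b)} for every whole DWORD that fits the square."""
    side = 1 << order
    count = min(len(data) // 4, side * side)
    pixels = {}
    for d in range(count):
        word = data[4 * d:4 * d + 4]
        pixels[hilbert_d2xy(order, d)] = dword_to_rgb(word, little_endian)
    return pixels


if __name__ == "__main__":
    sample = bytes(range(64))  # 16 DWORDs fill a 4x4 square (order 2)
    for position, colour in sorted(hilbert_pixels(sample, 2).items()):
        print(position, colour)
```

Note that every pixel carries three of the four bytes of its DWORD unchanged, which is why a view like this leaks so much of the original file.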

Some commonly found patterns

One of the most typical use cases of biteye/vix is guessing the data type of a chunk or region within a binary file. To do this, we need to get used to the most commonly found data structures and what their bit patterns look like. This section is not meant to be an exhaustive list of everything we can find, but just a quick glimpse of the kind of patterns you need to get used to. I encourage you to spend some spare time opening files whose content you already know, "just for fun", and to ask yourself why the patterns you're seeing have that specific shape. Remember: these tools are not automagical; they require some experience and training before you can quickly identify patterns and data types.

ELF header, x86-64 (64 bits, little endian)


ELF header, SPARC64 (64 bits, big endian)


.dynsym section, x86-64 (64 bits, little endian)


Plain text


.plt section, x86-64 (64 bits, little endian)


ARM machine code, little endian


MIPS machine code, little endian


MIPS machine code, big endian


SPARC32 machine code, big endian


SPARC64 machine code, big endian


SuperH-4 machine code, little endian


PowerPC machine code, (usually) big endian


8086 (16 bit Intel) machine code, little endian


i386 (32 bit Intel) machine code, little endian


x86-64 (64 bit Intel) machine code, little endian


Z80 machine code, little endian


Motorola 68000 machine code, big endian


LZMA compressed data


Tux bitmap found in a Linux for Motorola 68000


DOS font file (ega.cpi) for FreeDOS


Pokémon Gold sprites and fonts found inside an old ROM



© Gonzalo J. Carracedo (BatchDrake) 2014