This article gives an overview of file system analysis. It explains the American Standard Code for Information Interchange (ASCII) and Unicode character encodings used in file systems, how hex editors and file carving help investigators examine data without file system support, and what popular image file formats look like in a hex view.
Understanding ASCII, Unicode, and Offset
1. American Standard Code for Information Interchange (ASCII)
Developed from telegraph codes, ASCII is a character encoding standard used in digital devices such as computers. The standard has 128 specified characters coded into 7-bit integers. Source code of a program, batch files, macros, scripts, HTML and XML documents are also ASCII files. The characters encoded are:
- Numbers 0 to 9
- Lowercase letters a to z
- Uppercase letters A to Z
- Basic punctuation symbols
- Control codes that originated with teletype machines
- A space
ASCII is a machine-readable encoding used in everyday digital operations such as sending and receiving email. The ASCII table has three divisions: non-printable (system codes between 0 and 31), lower ASCII (codes between 32 and 127), and higher ASCII (codes between 128 and 255). Graphics files and documents created in word processors, spreadsheets, or database programs, and sent as email attachments, contain non-ASCII characters.
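The three divisions of the ASCII table can be illustrated with a short Python sketch. The function name `ascii_division` is a hypothetical helper chosen for this example; it simply maps a byte value to the division named above.

```python
def ascii_division(code: int) -> str:
    """Return the ASCII table division for a single byte value (0-255)."""
    if not 0 <= code <= 255:
        raise ValueError("expected a byte value between 0 and 255")
    if code <= 31:
        return "non-printable"
    if code <= 127:
        return "lower ASCII"
    return "higher ASCII"


# Show the decimal, hexadecimal, and 7-bit binary form of a few characters.
for ch in ("A", "z", "7", " ", "\n"):
    code = ord(ch)
    print(f"{ch!r:6} dec={code:3d} hex=0x{code:02X} bin={code:07b} -> {ascii_division(code)}")
```

The 7-bit binary column reflects the fact that standard ASCII needs only seven bits per character; values from 128 upward need the eighth bit.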
2. Unicode
Unicode is a computing standard, developed together with the Universal Coded Character Set (UCS) standard, for the encoding, representation, and management of text in most of the world’s writing systems. It provides a unique number for every character, irrespective of platform, program, and language.
Unicode contains more than 128,000 characters from about 135 modern and historic scripts. Technologies such as modern operating systems, XML, Java, and the Microsoft .NET Framework have adopted the Unicode standard.
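As a rough illustration of "a unique number for every character", the Python sketch below prints the code point of a few characters and the bytes an investigator would see on disk. The UTF-8 and UTF-16 encodings are not discussed in this article; they are used here as an assumption because they are the most common ways Unicode text is stored. The helper name `describe` is hypothetical.

```python
def describe(ch: str) -> tuple:
    """Return the code point and the UTF-8 / UTF-16 (big-endian) bytes of one character."""
    code_point = f"U+{ord(ch):04X}"
    utf8_hex = ch.encode("utf-8").hex(" ")
    utf16_hex = ch.encode("utf-16-be").hex(" ")
    return code_point, utf8_hex, utf16_hex


for ch in ("A", "é", "€"):
    code_point, utf8_hex, utf16_hex = describe(ch)
    print(f"{ch}  {code_point}  utf-8: {utf8_hex}  utf-16-be: {utf16_hex}")
```

Note that the letter A keeps its ASCII value (41 in hex) in UTF-8, which is why plain ASCII text is also valid UTF-8.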
File System Analysis
1. Understanding Hex Editor
A hex editor is a program that lets users view and modify the raw binary data of a file. A hex editor has three display areas: an address area (the offset of each row), a hexadecimal area (the byte values), and a character area (the same bytes shown as text).
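The three display areas of a hex editor can be imitated in a few lines of Python. This is a minimal sketch, not the behavior of any particular tool; the function name `hex_dump` and the 16-bytes-per-row layout are assumptions made for the example.

```python
from typing import List


def hex_dump(data: bytes, width: int = 16) -> List[str]:
    """Format bytes as rows of: address area, hexadecimal area, character area."""
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_area = " ".join(f"{b:02X}" for b in chunk)
        # Bytes outside the printable ASCII range are shown as a dot.
        char_area = "".join(chr(b) if 32 <= b <= 126 else "." for b in chunk)
        lines.append(f"{offset:08X}  {hex_area:<{width * 3 - 1}}  {char_area}")
    return lines


for line in hex_dump(b"GIF89a\x01\x00\x01\x00\x80\x00\x00"):
    print(line)
```

The first column is the offset, that is, the distance in bytes from the start of the data to the first byte of the row.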
In digital forensic investigations, hex editors let investigators view the physical contents stored on a disk, including files, directories, and partitions, and search for the remnants of deleted files.
Other uses include cracking copy-protected software, studying how computer viruses work, and identifying and retrieving hidden information.
2. File Carving
File carving is the process of recovering files from fragments in the unallocated space of a hard disk when file system metadata is absent. In computer forensics, it helps investigators extract data from storage media without any support from the file system used to create the file.
Unallocated space is hard disk space that the file system does not assign to any file; it may still hold file data, but without any record of where that data is located. Investigators can identify the files using characteristics such as the file header (the first few bytes) and footer (the last few bytes).
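Header-and-footer carving can be sketched as follows. This is a deliberately simplified illustration, assuming unfragmented files and using the JPEG start and end markers (FF D8 and FF D9) described in the JPEG section of this article; the third header byte FF is an assumption based on every JPEG segment marker starting with FF. The function name `carve_jpegs` is hypothetical, and real carving tools handle fragmentation, embedded thumbnails, and false matches that this sketch ignores.

```python
JPEG_HEADER = b"\xff\xd8\xff"  # start of image, followed by the next marker's 0xFF
JPEG_FOOTER = b"\xff\xd9"      # end of image


def carve_jpegs(raw: bytes) -> list:
    """Return every byte run that starts with a JPEG header and ends with the next footer."""
    carved = []
    start = raw.find(JPEG_HEADER)
    while start != -1:
        end = raw.find(JPEG_FOOTER, start + len(JPEG_HEADER))
        if end == -1:
            break  # header without a footer: nothing more to carve
        stop = end + len(JPEG_FOOTER)
        carved.append(raw[start:stop])
        start = raw.find(JPEG_HEADER, stop)
    return carved


# Hypothetical block of unallocated space with two tiny fake images in it.
blob = (bytes(10) + b"\xff\xd8\xff\xe0abc\xff\xd9" + b"junk"
        + b"\xff\xd8\xff\xe1xyz\xff\xd9" + bytes(4))
print(len(carve_jpegs(blob)), "candidate files found")
```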
For example, a suspect may try to hide an image from investigators by changing the file extension from .jpg to .dll. However, changing the file extension does not change the file header, which reveals the actual file format when analyzed.
File carving methods vary with factors such as the data fragments present, the deletion technique used, and the type of storage media. The process relies on knowledge of the format of the files of interest and on assumptions about how file information is laid out on the device. Investigators can inspect file headers to verify the file format using tools such as 010 Editor, CI Hex Viewer, Hexinator, Hex Editor Neo, Qiew, and WinHex.
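The renamed-extension scenario can be checked programmatically by comparing a file's leading bytes with its extension. The sketch below uses the signatures quoted in this article (with PNG in its four-byte form 89 50 4E 47); the extension table and the names `identify` and `extension_matches` are hypothetical and exist only for this example. Short signatures such as the two-byte BM can match unrelated data, so a result like this is a lead to verify, not proof.

```python
import os

# Leading bytes for the formats covered in this article.
SIGNATURES = {
    bytes.fromhex("FFD8"): "JPEG",
    bytes.fromhex("424D"): "BMP",
    bytes.fromhex("474946"): "GIF",
    bytes.fromhex("89504E47"): "PNG",
    bytes.fromhex("25504446"): "PDF",
}

# Hypothetical table of extensions normally used for each format.
EXPECTED_EXTENSIONS = {
    "JPEG": {".jpg", ".jpeg"},
    "BMP": {".bmp"},
    "GIF": {".gif"},
    "PNG": {".png"},
    "PDF": {".pdf"},
}


def identify(header: bytes) -> str:
    """Return the format whose signature matches the start of the header."""
    for magic, name in SIGNATURES.items():
        if header.startswith(magic):
            return name
    return "unknown"


def extension_matches(filename: str, header: bytes) -> bool:
    """Return True when the extension is one normally used for the detected format."""
    extension = os.path.splitext(filename)[1].lower()
    return extension in EXPECTED_EXTENSIONS.get(identify(header), set())


# First bytes of a JPEG that has been renamed to look like a DLL.
disguised = bytes.fromhex("FFD8FFE000104A464946")
print(identify(disguised))                          # JPEG
print(extension_matches("holiday.dll", disguised))  # False
```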
3. Image File Formats: JPEG
JPEG is an abbreviation for Joint Photographic Experts Group, the committee that created the JPEG standard. JPEG is also the term used for any graphic image file produced using that standard.
It is a lossy compression method for digital images that lets users adjust the degree of compression, trading storage size against image quality. JPEG can compress an image by about 90%, to roughly one-tenth of its original size.
A JPEG bit stream contains a sequence of data chunks, or segments, and every segment starts with a marker. Each marker is a 16-bit integer stored in big-endian byte order. The most significant byte of the marker (the left-most byte) is 0xff, and the lower byte determines the type of marker.
The first bytes of a file indicate the file type. JPEG files start with the hexadecimal value 0xffd8 (SOI, start of image) and end with 0xffd9 (EOI, end of image). Therefore, ffd8 (the 0x is implied) at the beginning of a file viewed in a hex editor identifies it as a JPEG file.
The basic format of a segment is as follows: 0xff (1 byte); marker number (1 byte); data size (2 bytes); and data (n bytes).
For example, for the marker FF E1 00 0E, the marker (0xFFE1) has 0x000E (which equals 14) bytes of data. But the data size 14 includes the data size descriptor (2 bytes); thus, only 12 bytes of data follow after 0x000E. Figure 5-3 shows a JPEG file structure. Figure 5-4 shows possible kinds of segment markers in JPEGs.
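The FF E1 00 0E example can be reproduced with a small Python sketch that reads one segment. The function name `read_segment` is hypothetical, and the sketch assumes a segment that carries a size field; markers such as SOI and EOI stand alone and have no size or data.

```python
import struct


def read_segment(data: bytes, offset: int = 0) -> tuple:
    """Parse one segment: 0xFF, marker number, 2-byte big-endian size, then the data."""
    if data[offset] != 0xFF:
        raise ValueError("a segment must start with 0xFF")
    marker_number = data[offset + 1]
    (size,) = struct.unpack_from(">H", data, offset + 2)
    # The size counts its own two bytes, so the data is size - 2 bytes long.
    payload = data[offset + 4:offset + 2 + size]
    return marker_number, size, payload


# The example marker from the text, followed by 12 placeholder data bytes.
segment = bytes.fromhex("FFE1000E") + bytes(range(12))
marker_number, size, payload = read_segment(segment)
print(f"marker 0xFF{marker_number:02X}, size {size}, data bytes {len(payload)}")
```

The next segment, if any, would begin at offset + 2 + size, which is how a parser walks from one marker to the next.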
4. Image File Formats: BMP
The BMP file format, also called the bitmap image file format or device-independent bitmap (DIB) file format, is a standard graphics image file format used to store images on Windows operating systems. Microsoft developed this format so that Windows can display the image on any type of screen. The color depth of these images can vary from 1 bit per pixel (black and white) to 24-bit color (16.7 million colors).
BMP File Structure: Every bitmap file contains the following data structure:
- File header: The first part of the header that includes the data about the type, size, and layout of a file.
- Information header: A header component that contains the dimensions, compression type, and color format for the bitmap.
- The RGBQUAD array: A color table with one element for each color present in the bitmap; 24-bit bitmaps do not use this color table, because each pixel in the actual bitmap is represented directly by 24-bit RGB values.
- Image data: The array of bytes that contains bitmap image data; image data comprises color and shading information for each pixel.
A bitmap file always has 42 4D as its first two bytes in a hexadecimal representation. These bytes translate to BM in the ASCII code.
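The file header and information header can be read with Python's `struct` module. The field layout below is not given in this article; it is an assumption based on the common Windows layout (a 14-byte file header followed by a 40-byte information header, all little-endian). The function name `parse_bmp_header` and the sample bytes are hypothetical.

```python
import struct


def parse_bmp_header(data: bytes) -> dict:
    """Read the signature, file header, and the start of the information header."""
    if data[:2] != b"BM":
        raise ValueError("missing 42 4D (BM) signature")
    # File header: total file size, two reserved fields, offset of the image data.
    file_size, _res1, _res2, pixel_offset = struct.unpack_from("<IHHI", data, 2)
    # Information header: its own size, dimensions, planes, color depth, compression.
    _info_size, width, height, _planes, bits_per_pixel, compression = struct.unpack_from(
        "<IiiHHI", data, 14
    )
    return {
        "file_size": file_size,
        "pixel_offset": pixel_offset,
        "width": width,
        "height": height,
        "bits_per_pixel": bits_per_pixel,
        "compression": compression,
    }


# Hypothetical header of a 2x2, 24-bit, uncompressed bitmap built in memory.
sample = (b"BM" + struct.pack("<IHHI", 70, 0, 0, 54)
          + struct.pack("<IiiHHI", 40, 2, 2, 1, 24, 0) + bytes(36))
print(parse_bmp_header(sample))
```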
Hex View of Popular Image File formats
1. GIF File Format
GIF is a file format that stores up to 8 bits per pixel, allowing up to 256 colors per frame. CompuServe introduced the GIF format in 1987. GIF uses lossless data compression, which maintains the visual quality of the image.
There are currently two versions of GIF:
- GIF 87a: Developed in 1987; it is the first version of GIF, which uses the LZW file compression technique and supports features such as interlacing, 256-color palettes, and multiple image storage.
- GIF 89a: Created in 1989; this version supports features like background transparency, delay times, and image replacement parameters. These features are useful for storing multiple images as animations.
The GIF file structure includes a header, image data, optional metadata, and a footer. The hex value of a GIF image file starts with 47 49 46, which are the ASCII characters GIF.
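A short sketch can tell the two GIF versions apart from the header alone. It assumes the usual layout in which the three signature bytes are followed by a three-character version (87a or 89a) and then the image width and height as 16-bit little-endian integers; that layout is not spelled out in this article. The function name `parse_gif_header` is hypothetical.

```python
import struct


def parse_gif_header(data: bytes) -> tuple:
    """Return (version, width, height) from the first ten bytes of a GIF file."""
    signature, version = data[:3], data[3:6]
    if signature != b"GIF" or version not in (b"87a", b"89a"):
        raise ValueError("not a GIF 87a or GIF 89a header")
    width, height = struct.unpack_from("<HH", data, 6)
    return version.decode("ascii"), width, height


# Hypothetical header bytes for a 320x200 GIF 89a image.
header = b"GIF89a" + struct.pack("<HH", 320, 200) + bytes(3)
print(parse_gif_header(header))
```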
2. PNG File Format
PNG, short for Portable Network Graphics, is a lossless image format intended to replace the GIF format and, for many uses, the TIFF format. It is copyright and license free. The PNG file format supports 24-bit true color, transparency through an alpha channel, indexed (palette-based) images with palettes of 24-bit RGB or 32-bit RGBA colors, and grayscale images.
A PNG file consists of the PNG file signature followed by the remainder of the file, which holds a single PNG image. The image comprises a series of chunks, starting with an IHDR chunk and ending with an IEND chunk.
PNG file hex values begin with 89 50 4E 47; the bytes 50 4E 47 are the ASCII characters PNG.
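The chunk sequence from IHDR to IEND can be listed with a small parser. The sketch relies on details from the PNG specification that this article does not cover and that are therefore assumptions here: the full signature is eight bytes long (89 50 4E 47 0D 0A 1A 0A), and each chunk is a 4-byte big-endian length, a 4-byte type, the data, and a 4-byte CRC. The names `make_chunk` and `list_chunks` are hypothetical, and the sample image is built in memory only for the demonstration.

```python
import struct
import zlib

PNG_SIGNATURE = bytes.fromhex("89504E470D0A1A0A")


def make_chunk(chunk_type: bytes, payload: bytes) -> bytes:
    """Build one chunk: length, type, payload, and a CRC over type + payload."""
    crc = zlib.crc32(chunk_type + payload)
    return struct.pack(">I", len(payload)) + chunk_type + payload + struct.pack(">I", crc)


def list_chunks(data: bytes) -> list:
    """Return (type, payload length) for each chunk up to and including IEND."""
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("missing PNG signature")
    chunks = []
    offset = len(PNG_SIGNATURE)
    while offset + 8 <= len(data):
        length, chunk_type = struct.unpack_from(">I4s", data, offset)
        chunks.append((chunk_type.decode("ascii"), length))
        if chunk_type == b"IEND":
            break
        offset += 12 + length  # 4 length + 4 type + payload + 4 CRC
    return chunks


# Hypothetical 1x1 RGB image: IHDR (13 bytes), one IDAT, and an empty IEND.
ihdr = struct.pack(">IIBBBBB", 1, 1, 8, 2, 0, 0, 0)
idat = zlib.compress(b"\x00\xff\x00\x00")
sample = (PNG_SIGNATURE + make_chunk(b"IHDR", ihdr)
          + make_chunk(b"IDAT", idat) + make_chunk(b"IEND", b""))
print(list_chunks(sample))
```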
3. PDF File Format
Adobe Inc. developed the Portable Document Format (PDF) in 1992. It lets users view, save, and print a document independently of the platform, operating system, hardware, or software they are using. A PDF consists of text with media components such as images and links. Adobe Acrobat Reader can open and display PDF files.
- PDF files are device independent and work across different systems such as Mac and Linux
- These files support different compression algorithms
- They also support several multimedia elements
- They allow password protection
Hex values for a PDF begin with 25 50 44 46, the signature of every PDF file, which represents %PDF in hexadecimal form. The file version follows the signature, and the file ends with the %%EOF marker, representing the end of file.
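A quick check of the PDF signature, version, and end-of-file marker might look like the sketch below. It assumes the first line has the form %PDF-x.y and that the end-of-file marker is written with two percent signs (%%EOF), as in the PDF specification. The function name `inspect_pdf` and the sample bytes are hypothetical; the sample is a skeleton, not a complete valid document.

```python
def inspect_pdf(data: bytes) -> tuple:
    """Return (version, has_eof_marker) for data that starts with the PDF signature."""
    if not data.startswith(b"%PDF-"):
        raise ValueError("missing 25 50 44 46 (%PDF) signature")
    first_line = data.splitlines()[0]
    version = first_line[len(b"%PDF-"):].decode("ascii").strip()
    # Trailing whitespace or newlines may follow the end-of-file marker.
    has_eof_marker = data.rstrip().endswith(b"%%EOF")
    return version, has_eof_marker


# Hypothetical skeleton with a header line, a placeholder body, and the marker.
sample = b"%PDF-1.7\n1 0 obj\n<<>>\nendobj\ntrailer\n<<>>\n%%EOF\n"
print(inspect_pdf(sample))
```

A file that carries the signature but lacks the end-of-file marker may be truncated, which is worth noting when it was carved from unallocated space.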
Questions related to this topic
- What is the file signature for a JPEG file?
- What are the first 2 bytes of a BMP file?
- What type of files are JPG BMP GIF?
- How do I convert a JPG to BMP?
This Blog Article is posted by
Infosavvy, 2nd Floor, Sai Niketan, Chandavalkar Road Opp. Gora Gandhi Hotel, Above Jumbo King, beside Speakwell Institute, Borivali West, Mumbai, Maharashtra 400092
Contact us – www.info-savvy.com