Download Shareware and Freeware Software for Windows, Linux, Macintosh, PDA



Parsing binary files with regular expressions 1.0

  Date Added: May 10, 2013  |  Visits: 834





This script allows you to use the regular expression engine to parse binary files, especially those for which the struct module alone is inadequate.

The typical way to parse binary data in Python is to use the unpack function of the struct module. This works well for fixed-width fields, but becomes more complicated when you need to parse variable-width fields. Perl's implementation of unpack accepts "*" as the field length, and even allows grouping with parentheses, which mitigates this problem. Python does not currently offer these features. Although you can dynamically generate a format string for unpack with a lot of slicing and calls to calcsize, the resulting code will likely be hard to read and error-prone.

Fortunately, in some cases there is a simpler way to do it: use the regular expression engine to grab each field, and use struct.unpack on the results.

First, you construct a regular expression (RE) describing the entire record structure, grouping each field you'd like to extract with parentheses, and compile it. To create the regular expression, you just have to remember that one character in the RE equals one byte in the record. So, the expression ".." would match any short (2 bytes). To match a variable-width field, the RE engine has to be able to recognize where the field ends. In a null-terminated string, for example, the field ends with a zero byte. You'd therefore look for any number of characters followed by a null byte: "(.*?)\0". Note the non-greedy qualifier "?": with it, the group matches only up to the first null, rather than the last null in the buffer.

When compiling, make sure to pass the re.DOTALL flag to the compiler; otherwise "." will not match any byte that happens to equal ASCII '\n' (0x0A), because the engine treats it as a newline. Then, you use the findall method of the compiled expression object on your buffer. findall finds all non-overlapping matches, one match for each record.
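The pattern-building steps can be sketched as follows. This is a minimal illustration and not the original script: it assumes Python 3 (where the pattern must be a bytes literal to search a bytes buffer), and the make_record helper and the sample values are hypothetical, invented only to produce test data.

```python
import re
import struct

# Hypothetical helper: builds one record consisting of a little-endian
# short, a null-terminated string, and two more little-endian shorts.
def make_record(ident, name, x, y):
    return struct.pack("<h", ident) + name + b"\x00" + struct.pack("<hh", x, y)

# Two made-up records; the value 10 packs to the bytes 0x0A 0x00, so the
# buffer contains a byte that equals ASCII newline.
buffer = make_record(1, b"alpha", 10, 20) + make_record(2, b"be", 300, -4)

# One "." per byte: (..) is a short, (.*?)\x00 is a null-terminated string.
# re.DOTALL lets "." match the 0x0A byte as well.
record_re = re.compile(rb"(..)(.*?)\x00(..)(..)", re.DOTALL)

# One tuple per record, one element per parenthesized group, still raw bytes.
raw_records = record_re.findall(buffer)
for raw in raw_records:
    print(raw)
```

Each tuple still holds raw bytes at this point; the third field of the first record is the two bytes 0x0A 0x00, which is exactly the case that needs re.DOTALL.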
findall returns a list of tuples, one for each match; each tuple contains one element for each field you grouped in the RE.

You still need to unpack the fields in the tuples before using them, since they're still strings rather than usable values. Generally, you'll call unpack once for each field, with only one format character. (You can also group multiple consecutive fixed fields in one set of parentheses in the RE, and then unpack them in one call. But that may get confusing.)

The script demonstrates how to unpack a binary file that has an indeterminate number of variable-width records, each consisting of a little-endian short, a null-terminated string, and two more shorts. It drops the resulting values into a list and also into a dictionary.

This technique is useful when your variable-width fields are terminated with a sentinel, such as the zero-terminated strings described above. If your field length is embedded in the data, and you can't use the "p" (Pascal string) format character, you'll probably have to resort to slicing the buffer up manually.

This technique is also applicable when your fields are all fixed-width. The findall method operates on the entire buffer at once with a single regular expression, which saves you from having to dynamically create a long format string encapsulating all your data, or alternatively iterating over slices of the buffer.
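A sketch of the whole procedure for the record layout described above (a little-endian short, a null-terminated string, and two more shorts) might look like this. It is not the original script: it assumes Python 3, the names RECORD_RE, parse_records, and by_name are hypothetical, the sample bytes are made up, and decoding the string field as ASCII is an assumption about the data.

```python
import re
import struct

# Pattern for one record; bytes literal because the buffer is bytes.
RECORD_RE = re.compile(rb"(..)(.*?)\x00(..)(..)", re.DOTALL)

def parse_records(buffer):
    """Return (list of records, dict keyed by the string field)."""
    records = []
    by_name = {}
    for raw_id, raw_name, raw_x, raw_y in RECORD_RE.findall(buffer):
        # One unpack call per field, one format character each.
        ident = struct.unpack("<h", raw_id)[0]
        x = struct.unpack("<h", raw_x)[0]
        y = struct.unpack("<h", raw_y)[0]
        name = raw_name.decode("ascii")  # assumption: ASCII text
        record = (ident, name, x, y)
        records.append(record)
        by_name[name] = record
    return records, by_name

# Made-up sample buffer holding two records.
sample = (
    struct.pack("<h", 1) + b"alpha\x00" + struct.pack("<hh", 10, 20)
    + struct.pack("<h", 2) + b"be\x00" + struct.pack("<hh", 300, -4)
)

records, by_name = parse_records(sample)
print(records)
print(by_name["be"])
```

For a real file, the buffer would come from reading the file in binary mode, for example open(path, "rb").read(), where path is whatever file you are parsing.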

Requirements: No special requirements
Platforms: Windows, Mac, *nix, Mac OSX, Linux, Python, BSD, Solaris
Keyword: Binary Binary Files Parser Files Files Management Parser Regular Expressions
Users rating: 0/10

License: Freeware


PARSING BINARY FILES WITH REGULAR EXPRESSIONS RELATED
Log Analyzers  -  LMF 0.5
LMF project is a flexible log monitoring framework that allows the user to match text from log files, using perl regular expressions and capturing parentheses (pattern). An optional external command (trigger) will be executed when a...
16.38 KB  
ActiveX Components  -  Miraplacid Binary and Text DOM SDK 3.2
Miraplacid Binary and Text DOM SDK represents two complementary technologies - BinaryDOM and TextDOM. This is a redistributable software library component (DLL) that works on the .NET platform and comes with documentation, data files and examples....
548 KB  
Libraries  -  Miraplacid BinaryDOM SDK 1.0
Miraplacid Binary Document Object Model (Binary DOM) provides easy access to binary files in known formats. It can be used for binary data analysis and modification. It provides easy access to internal binary file structure hierarchy, navigation...
1.38 MB  
Development Editors  -  Free Hex Editor Neo 4.97.02.3667
View, Edit and Analyze Hexadecimal Data and Binary Files of any Size. Free Hex Editor Neo is a large files optimized freeware hex editor for everyone who works with ASCII, hex, decimal, float, double and binary data. Make patches with just two...
8.33 MB  
Boot Managers  -  bicl 0.1.0
bicl is a tool for editing the built-in command line boot arguments in binary files like the PPC64 Linux compressed kernel image and the PPC64 Xen compressed hypervisor boot image. The boot argument processing for powerpc Xen is much less complex...
3.07 KB  
Business  -  NewsGrab 0.5.0 Pre4
NewsGrab provides a tool to retrieve binary files from an NNTP server. NewsGrab is a small tool that uses regular expressions to download and uudecode/ydecode binary files from USENET. What's New in This Release: - Bumped version to 0.5.0pre4...
21.5 KB  
Programming  -  radare 0.8
radare is a toolchain that aims to create a complete set of utilities for handling binary files from the command line. The project is mainly a hexadecimal editor for the command line but with advanced features. There are extensions for...
245.76 KB  
Libraries  -  Pitfdll 0.8.2
Pitfdll is a GStreamer plugin that allows the use of binary files, such as Quicktime QTX or DMO DLL/Directshow files, for use as a playback codec in GStreamer-based media applications, such as Totem. With this plugin, people can playback...
593.92 KB  
Desktop Utilities  -  run in xterm 0.9.1
run in xterm is a service menu which adds "run in xterm" & "run in xterm as root" to the action menu on binary files, scripts etc. It has 2 languages: English and Polish. Installation: copy/save this file in...
 
Development Tools  -  UUDeview 0.5.20
UUDeview is a program that helps you transmit and receive binary files over the Internet, using electronic mail or newsgroups. The UUDeview package includes both an encoder and a decoder. The decoder automatically detects the type of encoding...
 