Posts with the “Code” tag

Rebuilding gallery.chaostangent.com

gallery.chaostangent.com front page
gallery.chaostangent.com is an application for storing and organising images – ostensibly a very simple desire but one I found not catered for by existing web applications when it was first conceived in 2005. The concept was an application that was simple and easy to use while still allowing for a degree of organisation to ensure images weren’t stored in a single “pool”.

With a small, well-defined feature set it seemed like a good time to address some of the issues which had crept in

Background

When I first started developing the application, PHP 5 hadn’t been released for very long and was receiving a mixed reception. Regardless, I started developing using a custom built framework I had cobbled together from scratch – one that would eventually go on to be refined and used in some of my work projects. With the lack of other mature frameworks to compare with, it was rough round the edges and did little more than segment out code into the MVC pattern and even then it wasn’t an entirely clean encapsulation; it was however useful.

Read the rest of this entry

Expectancy: PHP 5.3

Pretty Hard Panda

The release of PHP 5.3 is due sometime soon and with a feature freeze in place since the 24th of July and a pre-release alpha now available, it's worth exploring some of the many additions and changes that are going to be introduced.

As PHP is the language I most frequently work in and one which I've done all sorts with (from web applications, to file exploration to media player scripting), I like to think I'm sensitive to deficiencies and oddities in the released implementations. Version 5.3 contains a lot of elements backported from the still distant version 6, the most glaring omission being end-to-end Unicode support without mb_* fudges or iconv; being able to use string-backed functions like array_unique() without suspicion will be a big help, but I digress.

The most high-profile addition is that of namespaces, gone will be the warts that dot current frameworks (e.g. Zend_Db_Table_Rowset) which will make different frameworks and modules far easier to use and far more friendly when you want them to play nicely together.

PHP and MySQL have always been bedfellows despite their conflicting release licenses

Static functions have also been promoted to all a lot of the meta-programming niceties that member functions have including true overloading support which will allow first level abstractions such as database wrappers to not require instantiation before being called (which I discovered around the same time as my get_class exploration). For instance, if using an ORM, doing People::getAllById() will now be easier to achieve. Along side this many of the magic methods have been tightened up to make them less ambiguous (__get can only be public and not static, signatures enforced etc.)

Looking through some of the other changes detailed in the PHP Wiki it seems that a selection of new functions surrounding garbage collection are now being exposed including checking whether it is enabled, and selectively enabling or disabling it. Whether this is a mistake (close by get_extension_funcs() is detailed as a new function but appears to have been in since PHP4) and these are bleed-throughs from the Zend Engine is unclear, but without some surrounding memory management facilities, it would seem unwise to disable or allow disabling of garbage collection.

On the extension front numerous ones have been standardised and moved into the PECL system which goes some way to neatening things up; the change some are talking about is the choice between a local MySQL library (mysqlnd) versus the native libmysql library that comes when compiling against a MySQL release. PHP and MySQL have always been bedfellows despite their conflicting release licenses (especially so since Sun gobbled up MySQL) so this seems like a smart move for all concerned with separate code-base, better engine integration and statistical analysis now possible (PDF details).

What all of this adds up to is a release that's solid on paper, but the bum-rush for patches is sure to be as swift as any other PHP release. Especially with the OO enhancements though, it feels like these should have been included from day one, as not only will there now be a disjoint between PHP4 and PHP5 shared servers, but PHP5.2 and PHP5.3 as well. For someone who runs their own server this is not massive worry, especially when the list of backwards compatibility changes are so small, but for service providers (hosts, ISPs etc.) still dragging their feet over 4 > 5 > 5.2, this adds another step of complexity.

The real test will obviously be the frameworks and high profile applications that PHP utilises and with word that the Zend Framework won't be supporting namespaces until its 2.0 release next year the lead time could be immense, especially when you consider phpBB, what was once considered the yardstick of PHP usage, still supports 4.3 with its most recent version, the playing field for cutting edge PHP seems less than agile.

Read the rest of this entry

Deconstruction part 2

Attacking those "random" files a couple of days ago provided enough of a challenge to keep me interested for a few hours, especially as it seemed like I was treading new ground in terms of spec'ing out previously unexplored file formats. It turned out that the files had already been mapped and successfully decompressed and the only thing left to do was build an unpacker which was in the pipeline. It seemed my work wasn't exactly fruitless but other, probably smarter people had everything under control. I wasn't about to let that stop me though.

Note (2008-01-11): The full (official?) SDK for this file format has been located which includes both a packer and an unpacker as well as other tools I'm sure are useful for working on the file format. The full name of the file format is "Yaneurao" with the SDK going by the nomenclature of "yaneSDK" which is the stem for the file format signature of "yanepkDx". There is already a .NET version of the SDK so if you're interested in my deconstruction process then read on, otherwise I would recommend using the official/fully-featured SDKs.

Then, in that moment of lucid elation, I realised exactly what was going wrong.

The compression format was identified as LZSS and reading through several sites revealed that some of the data I had initially spotted but attributed to SHIFT JIS (or at one point a Unicode Byte Order Marker, perfect for a non-Unicode file) were the tell-tale signatures of LZSS; the gradual degradation into junk data was also typical of the algorithm as the further into the file the stream progresses, the more back references are present.

yanePkDX
While I hadn't heard of LZSS, it came as no surprise that it was a modified version of LZ77 which I had come across before though never toyed with. Having to dig through a dense PDF was not my idea of fun and my university days had proven that reading academic proofs rarely lead to workable implementations for me so I searched for a ready-made PHP version which (for reasons which will soon become glaringly apparent) didn't prove fruitful. After coming up against dead-ends with other languages I settled on the defacto C version which seemed most other versions I found were based off.

Read the rest of this entry

Deconstruction

Out of curiosity and a favour to someone, I decided to take a look at some random .dat files that were ripe for the translating; what ensued was a morning of head scratching, hex scrying and using some of the lesser used PHP functions.

Sample File 1, Sample File 2, Sample File 3

All screenshots taken from data1.dat, sample file 1 and the window is resized for the most appropriate screenshot rather than general workability.

so garbled that it sent a few hundred bell tones to my computer speaker

First thing I did was to crank open the lovely XVI32 hex editor and have a look at the sample files provided, their .dat extension more or less indicated they were a proprietary format and were unlikely to relinquish their secrets easily. What was known was that the files contained a header portion, a bundle of XML files in a contiguous stream and a lot of junk data. The XML files could be seen and their encoding was stated as SHIFT JIS and, after cursing its existence, I attributed the junk data to that which seemed like a good place to start.

01
The first eight bytes seemed to be a file signature, but Google searches for all or parts of the signature were fruitless which meant it was time to pick things apart.

02
The next four bytes were different for each file and at first I thought it was part of the block format that made up the header part of the file but the section repetition for the header block didn't match up so after converting it to a variety of different number formats (I'm no hex wizard and I originally thought it was only a two byte short rather than a four byte integer or long) and assumed it was an unisgned long (32 bits) in Little Endian order.

Read the rest of this entry

Screenshotter

An "automatic" screenshot taker is something that I've always wanted, but the commercial offerings leave much to be desired and the only other option seems to be the "manual" approach. I am of course talking about screenshots from video files rather than screenshots of your desktop, that sort of thing is well covered.

One of the problems with making your own is that the options are fairly limited on just how you go about opening video files and pulling out the candy frame goodness. For Windows users, the option is to use DirectShow which I can only describe as The Crystal Maze for it's Byzantine ways of operating are beyond mortal ken. The other option is to use a pre-built library such as ffmpeg or similar. This was out as well as not only was it a whole new way of working for me (Windows development files were few and far between) it was a whole new set of a programming challenges which made the learning curve more of a learning cliff.

"during testing I had a number of problems with this"

So I turned forlornly to existing media-players in the slim hope that one of them would have the abilities required for scripting a makeshift screenshotter. Media Player Classic has limited command line support, VLC is more geared towards client/server setup and I couldn't even figure out whether that route would lead to any semblance of success, BSPlayer... The list goes on as to the number of players which don't supply a full body of command line options.

The silver lining, the angel of hope was MPlayer. If you're prepared to wade through a bit of fudge to get there, MPlayer provides everything you need to script a screenshotter:

  • jump to any part of the file from the command line
  • output into different (static) formats such as PNG and JPEG
  • can output file information (length, dimensions etc.)

With these three functions MPlayer is almost all you need. Almost.

Read the rest of this entry