Building a better screenshotter

High definition snow

My previous forays into crafting an automatic screenshot taker were, at the time, very successful. The system managed to pump out usable images in a fraction of the time it would have taken me to seek out and capture them manually; I even extended the script to handle multiple input files, which made 'capping an entire series a breeze. Lamentably, this was a honeymoon period: cracks started to show, followed by gaping chasms.

The only workaround the first screenshotter needed was for a glitch with Windows Media files which meant the first frame sought was always blank; it swerved around this limitation by taking two shots and discarding the first. This symptom, however, was indicative of what would become a persistent problem.
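The two-shots-discard-one workaround can be sketched roughly as follows. This is not the original script, which isn't shown here: the function names and the example file name are hypothetical, and only the mplayer options used (`-ss`, `-frames`, `-vo png`, `-nosound`) are standard ones. The sketch only builds the command and picks the frame to keep; it does not run mplayer itself.

```python
def build_capture_command(video, seconds, frames=2):
    """Build an mplayer command line that seeks to `seconds` and writes
    `frames` PNG images (hypothetical helper, not the original script)."""
    return [
        "mplayer",
        "-nosound",              # audio is not needed for stills
        "-ss", str(seconds),     # seek to the requested position
        "-frames", str(frames),  # decode this many frames, then quit
        "-vo", "png",            # write each decoded frame as a numbered PNG
        str(video),
    ]


def usable_capture(written_files):
    """Given the PNG names mplayer wrote for one seek, skip the first
    (possibly blank) frame and return the last one."""
    ordered = sorted(written_files)
    if len(ordered) < 2:
        raise ValueError("expected at least two frames so the first can be discarded")
    return ordered[-1]


if __name__ == "__main__":
    print(" ".join(build_capture_command("episode01.wmv", 90)))
    print(usable_capture(["00000001.png", "00000002.png"]))
```

Asking for two frames costs almost nothing extra per seek, which is presumably why discarding the first was a cheap enough fix to leave in place for every file type.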

Background

The first significant problem I encountered with the setup was with the series Claymore. A great many of the resulting images seemed to have a lot of "bleed through", as if one frame were being intermingled with another; this was above and beyond the standard cross-fade transition screenshots that were common. At the time I assumed it was because the files I used were modern H264 MKV files rather than the standard XviD ones I had been using before, or that the encoding was particularly shoddy. After downloading an updated version of mplayer for Windows the problem seemed to disappear; I ended up regenerating a lot of the images for the episodes which were the most severe offenders.

The thing to understand is that seeking in video files is very difficult

After a spate of swift updates, I didn't blog anime any more, so the screenshotter shortcut on my desktop lay dormant until I decided to unleash some madness on Strawberry Panic. While the setup worked, it was producing an unusual number of exact duplicate images, despite the captures being over five seconds apart. I realised there was a fundamental underlying cause for this that an mplayer update wouldn't fix. True high-definition versions (not upscales) of certain releases were now readily available, namely the seminal Ghost in the Shell: Standalone Complex (and associated movie Solid State Society) and a selection of Makoto Shinkai works including 5 centimeters per second, which I wanted to pluck some quality captures from (for desktop wallpaper or other purposes). These files did not agree with the screenshotter at all and stoically produced correct-resolution but entirely black captures, which was less than useful.

Expectancy: PHP 5.3

Pretty Hard Panda

The release of PHP 5.3 is due sometime soon and with a feature freeze in place since the 24th of July and a pre-release alpha now available, it's worth exploring some of the many additions and changes that are going to be introduced.

As PHP is the language I most frequently work in and one which I've done all sorts with (from web applications, to file exploration to media player scripting), I like to think I'm sensitive to deficiencies and oddities in the released implementations. Version 5.3 contains a lot of elements backported from the still distant version 6, the most glaring omission being end-to-end Unicode support without mb_* fudges or iconv; being able to use string-backed functions like array_unique() without suspicion will be a big help, but I digress.

The most high-profile addition is that of namespaces: gone will be the warts that dot current frameworks (e.g. Zend_Db_Table_Rowset), which will make different frameworks and modules far easier to use and far more friendly when you want them to play nicely together.

Static functions have also been promoted to allow a lot of the meta-programming niceties that member functions have, including true overloading support, which will allow first level abstractions such as database wrappers to not require instantiation before being called (which I discovered around the same time as my get_class exploration). For instance, if using an ORM, doing People::getAllById() will now be easier to achieve. Alongside this, many of the magic methods have been tightened up to make them less ambiguous (__get can only be public and not static, signatures enforced etc.).

Looking through some of the other changes detailed in the PHP Wiki it seems that a selection of new functions surrounding garbage collection are now being exposed including checking whether it is enabled, and selectively enabling or disabling it. Whether this is a mistake (close by get_extension_funcs() is detailed as a new function but appears to have been in since PHP4) and these are bleed-throughs from the Zend Engine is unclear, but without some surrounding memory management facilities, it would seem unwise to disable or allow disabling of garbage collection.

On the extension front, numerous ones have been standardised and moved into the PECL system, which goes some way to neatening things up; the change some are talking about is the choice between the MySQL native driver (mysqlnd) and the libmysql client library that comes when compiling against a MySQL release. PHP and MySQL have always been bedfellows despite their conflicting release licenses (especially so since Sun gobbled up MySQL), so this seems like a smart move for all concerned, with a separate code-base, better engine integration and statistical analysis now possible (PDF details).

What all of this adds up to is a release that's solid on paper, but the bum-rush for patches is sure to be as swift as any other PHP release. The OO enhancements especially feel like they should have been included from day one, as not only will there now be a disjoint between PHP4 and PHP5 shared servers, but between PHP5.2 and PHP5.3 as well. For someone who runs their own server this is not a massive worry, especially when the list of backwards compatibility changes is so small, but for service providers (hosts, ISPs etc.) still dragging their feet over 4 > 5 > 5.2, this adds another step of complexity.

The real test will obviously be the frameworks and high profile applications that utilise PHP, and with word that the Zend Framework won't be supporting namespaces until its 2.0 release next year, the lead time could be immense. When you consider that phpBB, which was once considered the yardstick of PHP usage, still supports 4.3 with its most recent version, the playing field for cutting edge PHP seems less than agile.

Javascriptery: Tabbed forms

Forms are perhaps the bane of web development for me; you can't get them to look good, you can't find a foolproof way to make them act well and let's not even start on trying to get them into a pacified state, free from the dangers of user input (surprise ending: form input will never be completely trustworthy). A lot of sites would appear to have aesthetically pleasing forms; however, this is a careful ruse, as they sidestep the problem of forms by having only one or two of them, and even those usually have only a few fields. The monstrosities I am required to deal with almost daily are things of grotesque beauty, veritable Rube Goldberg machines of complexity.

Sayonara Zetsubou Sensei

How would a studio approach a manga known for its wordplay and focusing on a depressively suicidal teacher, a manga that was notoriously (even infamously) claimed to be untranslatable? Surely not even SHAFT, known for their off-the-wall adaptations of other, more straightforward manga such as Pani Poni and Negima, could manage such a feat? They did, and with reckless disregard for obstacles such as plot, continuity and sanity. Sayonara Zetsubou Sensei is bizarre, satirical, cynical and rambunctious, and solidifies SHAFT as a skilled and confident studio.

Each episode is a scatter-shot of styles and content; the speed and veracity of each bite-size skit causes as much humour as the subject matter

Describing the premise of the series would never be enough to encapsulate what it is actually about: the histrionically pessimistic Itoshiki Nozomu is thwarted in his attempts to kill himself by the outwardly naive and interminably optimistic Kafuka. This satisfies the first twelve minutes of the series, which then goes on a journey involving stalkers, hikikomori, escape routes and courting rituals, but most of the time it concerns itself with nothing in particular: a multicoloured collage of gags, perceptions on life and randomness. Sayonara Zetsubou Sensei has very little to say and has a damn good time saying it. The series doesn't cover a specific time frame or tell a coherent story; it is a staccato whimsy of wordplay and wonder, a möbius strip of pop-culture references and banter on the thralls of modern existence. If all this sounds like the series occupies a different existence to the rest of the world, you wouldn't be far off the mark. An episode can focus on one specific topic, often meandering along the way and veering off on tangents of logic, but ultimately digging through an obscure subject such as what can be accepted as minimal culture, clearing away impurities, or escaping from blame and responsibilities. Other episodes, which make up the majority of the twelve episode barrage, concern themselves with frittering away on whatever shiny issue takes their fancy. The opening episodes concern themselves with introducing the core set of characters and their associated archetypal personality quirks, then strobing through fanservice, insults, family members and all points in between. Episodes are sometimes over before one knows it; other times the closing animation can be just a punctuation mark before the episode continues, seemingly unabated.

Deconstruction part 2

Attacking those "random" files a couple of days ago provided enough of a challenge to keep me interested for a few hours, especially as it seemed like I was treading new ground in terms of spec'ing out previously unexplored file formats. It turned out that the files had already been mapped and successfully decompressed and the only thing left to do was build an unpacker which was in the pipeline. It seemed my work wasn't exactly fruitless but other, probably smarter people had everything under control. I wasn't about to let that stop me though.

Note (2008-01-11): The full (official?) SDK for this file format has been located which includes both a packer and an unpacker as well as other tools I'm sure are useful for working on the file format. The full name of the file format is "Yaneurao" with the SDK going by the nomenclature of "yaneSDK" which is the stem for the file format signature of "yanepkDx". There is already a .NET version of the SDK so if you're interested in my deconstruction process then read on, otherwise I would recommend using the official/fully-featured SDKs.

Then, in that moment of lucid elation, I realised exactly what was going wrong.

The compression format was identified as LZSS, and reading through several sites revealed that some of the data I had initially spotted but attributed to Shift JIS (or at one point a Unicode byte order mark, perfect for a non-Unicode file) were the tell-tale signatures of LZSS. The gradual degradation into junk data was also typical of the algorithm: the further into the file the stream progresses, the more back references are present.
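To make the idea of back references concrete, here is a minimal decoder sketch in Python. The layout is an assumption borrowed from the widely circulated LZSS.C by Haruhiko Okumura (4096-byte ring buffer pre-filled with spaces, a flag byte for every eight items, two-byte references holding a 12-bit offset and a 4-bit length); the files discussed here may well use different parameters, and the function name is my own.

```python
def lzss_decode(data: bytes) -> bytes:
    """Decode an LZSS stream laid out like Okumura's LZSS.C (assumed layout)."""
    N = 4096         # ring buffer size (assumption)
    F = 18           # longest match length (assumption)
    THRESHOLD = 2    # matches are always longer than this (assumption)

    window = bytearray(b" " * N)  # ring buffer starts out filled with spaces
    r = N - F                     # first write position in the ring buffer
    out = bytearray()
    pos = 0

    while pos < len(data):
        flags = data[pos]         # one flag bit per item, lowest bit first
        pos += 1
        for bit in range(8):
            if pos >= len(data):
                return bytes(out)
            if flags & (1 << bit):
                # Flag bit set: the next byte is a literal.
                c = data[pos]
                pos += 1
                out.append(c)
                window[r] = c
                r = (r + 1) & (N - 1)
            else:
                # Flag bit clear: the next two bytes are a back reference.
                if pos + 1 >= len(data):
                    return bytes(out)
                lo, hi = data[pos], data[pos + 1]
                pos += 2
                offset = lo | ((hi & 0xF0) << 4)       # 12-bit buffer position
                length = (hi & 0x0F) + THRESHOLD + 1   # 3 to 18 bytes
                for k in range(length):
                    c = window[(offset + k) & (N - 1)]
                    out.append(c)
                    window[r] = c
                    r = (r + 1) & (N - 1)
    return bytes(out)


if __name__ == "__main__":
    # Three literals followed by one reference back to them.
    print(lzss_decode(b"\x07abc\xee\xf0"))
```

Literal bytes pass through untouched, which is why the start of a compressed file can still look like readable text; once references dominate, the raw stream is mostly offset and length pairs and reads as junk.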

yanePkDX
While I hadn't heard of LZSS, it came as no surprise that it was a modified version of LZ77, which I had come across before though never toyed with. Having to dig through a dense PDF was not my idea of fun, and my university days had proven that reading academic proofs rarely led to workable implementations for me, so I searched for a ready-made PHP version which (for reasons which will soon become glaringly apparent) didn't prove fruitful. After coming up against dead-ends with other languages I settled on the de facto C version, which most other versions I found seemed to be based on.
