Wednesday, June 30, 2004

This is a post to remind myself as much as anything else (I'm currently in Romania).
Mark Pilgrim has just officially released Universal Feedparser 3.0. This is an RSS feed parser that can read all of the various competing RSS formats (including Atom) and present the contents of a feed in a standard way. That means the python programmer can write a script that reads (or presents) a feed and doesn't have to worry about what format it's in. Now that my CGI proxy is basically finished (for remotely fetching web pages in a restricted environment - glad I brought the PDA with me) - I might make decent use of it by building in an RSS aggregator.... hmm..
posted by Mike Foord on Wednesday, June 30, 2004

(0) comments

Tuesday, June 22, 2004

Final post for a couple of weeks.. I've just rediscovered - so great stuff for techie junkies there and I haven't visited for ages. I've also written the first working draft of a CGI-proxy script in python. It's not a scratch on the James Marshall one yet... but heck, it's Python.

See you all soon..
posted by Mike Foord on Tuesday, June 22, 2004

(0) comments

Friday, June 18, 2004

I've cleaned up the Voidspace CGI Homepage a bit. Or more to the point sorted out the guestbook scripts and the Nanagram-CGI anagram server.

I'm not sure if I mentioned but I completed my Ordered Dictionary.

Blurb from the docs :
It behaves as a drop in replacement for an ordinary dictionary in almost every circumstance. Many dictionary methods normally return a random value, or return values in a random order. Hopefully all those methods in oDict return values in an ordered and consistent manner. The ordering is applied in the keys() method and uses the Python sort() method of lists to do the sorting. You can also set it to apply the reverse method by passing in a parameter when you create the instance. See the oDict docstring for more details.

An ordered dictionary is useful where, for example, a consistent return order for the iterators and pop methods is helpful. I use it in FSDM markup structures (describing files and directories in a file structure) so that the markup files are built in a consistent order.

Methods which are now ordered are :

pop, popitem, keys, items, values, __iter__ ( for key in odict ), iteritems, iterkeys, itervalues

As oDict has methods defined for almost all the dictionary methods, and also has custom iterators, it would be a good template for anyone else who wanted to create a new dictionary type with custom access methods etc. It doesn't actually subclass dict, nor use the iter function, so it ought to be compatible with versions of Python pre 2.2
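To give a flavour of the idea, here is a minimal sketch of a dictionary-like class that does all of its ordering in keys(). It is not the oDict source: the class name SortedKeyDict and its reverse parameter are made up for illustration, it only covers a handful of methods, and it is written for current Python (where ordinary dicts keep insertion order rather than an arbitrary one).

```python
class SortedKeyDict:
    """Hypothetical sketch: a mapping whose keys() always come back sorted."""

    def __init__(self, data=None, reverse=False):
        self._data = dict(data or {})
        self._reverse = reverse  # assumed option, mirroring the 'reverse' idea in the blurb

    def __setitem__(self, key, value):
        self._data[key] = value

    def __getitem__(self, key):
        return self._data[key]

    def __delitem__(self, key):
        del self._data[key]

    def __len__(self):
        return len(self._data)

    def keys(self):
        # All the ordering lives here; every other method builds on keys().
        return sorted(self._data, reverse=self._reverse)

    def values(self):
        return [self._data[key] for key in self.keys()]

    def items(self):
        return [(key, self._data[key]) for key in self.keys()]

    def __iter__(self):
        return iter(self.keys())

    def popitem(self):
        # Pop the first key in sort order rather than an arbitrary one.
        keys = self.keys()
        if not keys:
            raise KeyError("popitem(): dictionary is empty")
        key = keys[0]
        return key, self._data.pop(key)
```

Because everything goes through keys(), changing the ordering rule in that one method changes the behaviour of the iterators and pop methods too.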

posted by Mike Foord on Friday, June 18, 2004

(0) comments

Thursday, June 17, 2004

Looks like Yahoo have finally responded to the Gmail Threat. Gmail is google's offering of email - with a 1 gigabyte inbox and powerful mail search facility. On the other hand, do you really want google searching through your email to bring you targeted adverts :-)

Anyway - this morning I received an email from yahoo (in my yahoo inbox of course). Saying this :

Thanks for using Yahoo! Mail. It's our goal to offer you an email experience that makes it easy to stay in touch. Periodically, we make service changes to enhance that experience for our users. As of June 16, 2004, you'll enjoy the following benefits:

* Increased storage capacity – from your current level to 100MB
* Increase in individual message send/receive size from 3MB to 10MB
* An improved layout that’s even easier to use

etc etc
That's almost enough to make yahoo mail useful for storage - maybe I'll have to get another couple of accounts. Maybe we could have shared inboxes and share the passwords as an alternative file store :-) I'm sure that's why they've resisted bigger inboxes before - so long as they don't take away POP3 capability.

By the way, the 'Joel On Software' article I linked to yesterday had some interesting points on Web User Interfaces... and why Microsoft had stopped developing Internet Explorer and features like DHTML that promised to offer a smoother and better user interface to web delivered applications.... and the answer is because Microsoft want to make money from selling client based applications (software you have to buy) rather than allowing people to develop web applications with a good user interface that make client based applications obsolete.. and it's all to do with the developer base moving away from programming with the ever-shifting Windows API. Anyway - it's a good article... read it.

Hey, I *almost* did a whole techie blog without mentioning python... *damn*

By the way, The Vaults of Parnassus has been updated after a couple of weeks break.. and some of my stuff is on it.
posted by Mike Foord on Thursday, June 17, 2004

(0) comments

Wednesday, June 16, 2004

Here we go, for the first time in a while a techie blog that isn't about python.... well almost - it's about programming. More specifically it's about how object oriented programming isn't all it was cracked up to be in the nineties. This is a great shame for me as I've just got to grips with it and love it *sigh* The trouble is that when you try and scale the object oriented model up to distributed systems it all starts to suck quite badly.. at least that's what the following article says anyway. It says that messages are the way forward and it's something that microsoft are having to address in Longhorn. Now personally I thought message passing was part of the textbook definition of an object oriented system (not that you notice it much practically in python - object methods are really just functions, not messages at all).
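A tiny illustration of that last aside, as a sketch only (the Greeter class is invented for the example): calling a method on an instance is the same as calling the function stored on the class and passing the instance in by hand.

```python
class Greeter:
    def __init__(self, name):
        self.name = name

    def greet(self):
        # 'self' is just the first argument of an ordinary function.
        return 'hello from ' + self.name


g = Greeter('python')
via_instance = g.greet()       # the usual method call
via_class = Greeter.greet(g)   # same function, instance passed explicitly
```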

Anyway - this article is about the future direction of longhorn and is from CNET.. interesting stuff.

I came across it in the latest article from Joel on Software - he does a lot of good stuff. His latest article is about the move away from the microsoft API (and towards which is still largely Microsoft, before you get too excited).. but it is interesting.

Oh, and as an aside, I've just completed my Ordered Dictionary. This is on the python utils page and is a small component in the FSDM module. It is a new implementation of the python dictionary - but dictionary methods that would normally return keys/values in a random order now do so in an ordered way. The ordering is implemented in the keys() method, so it's easy to see how it works. Because I've built from scratch basically *all* the dictionary methods, including a custom iterator, it's an excellent place to start if you want to build your own dictionary with custom behaviour.
posted by Mike Foord on Wednesday, June 16, 2004

(0) comments

Monday, June 14, 2004

My goodness, 3 posts in one day - that has to be some kind of record.... Decreases the chance of them ever getting read of course.

I forgot to mention that I've slightly revamped the pythonutils page. It has a nicer picture and is slightly less wordy... as well as a link to this blog page etc.

I've also done an executable version of my Pyshell. This is a replacement for the windows cmd shell program - which is rather pathetic. It doesn't have completion or anything (yet!) but it does allow python commands as well as windows one. This is a bit experimental (the executable version) - but it does expose the python interpreter in an executable file... albeit in a limited way (although it needn't be limited).. hmmmm...
posted by Mike Foord on Monday, June 14, 2004

(0) comments
Hurrah - a more cheerful tone to this post.

DirWatcher is ready. This is the Tkinter GUI for the filestruct module. There is a python source code zip (30k), and a windows executable version (2.6MB) available from the download page.

DirWatcher Home Page

DirWatcher is..... the GUI for the filestruct and FSDM tool and classes. It is a simple tool that helps you keep directories in different locations in sync. I have various directories - research projects, programming projects, reference material etc - that I keep at work and at home. If DirWatcher is installed on both machines, it can profile the directories at a point when you know they are both identical. After making changes DirWatcher can reprofile the directory and save any changes as a single zipfile. If you transfer that zipfile to the other location, DirWatcher can then make all the changes for you. The two directories you are 'syncing' don't even need to have the same name. Be careful though, when DirWatcher is making changes it will delete files and directories if that is what you've done - make sure you are certain you want it to make the changes and that you are getting it to make the changes in the right place.

The GUI is written using Tkinter which is the standard GUI toolkit for Python. The actual python source code for DirWatcher (without the filestruct module) is less than 20k.
posted by Mike Foord on Monday, June 14, 2004

(0) comments
Aaargh... I hate it when that happens... and it all comes of not properly testing *every* change. The upshot of it is that the Python source version of Nanagram that has been on my site for the last three months was broken.....

It all comes of me trying to fix the difficulties caused when putting together an executable to be distributed using Innosetup. When a program is launched from a windows menu option (put there by Innosetup) the current working directory for the script isn't the script directory.... which royally jiggers things. Added to which - when you turn a script into an executable using py2exe it bundles all the libraries together in a single zip file. It then uses the newfangled 'import from zip' feature (whose real name escapes me - zipimport ?). That means that the directory listed in sys.path[0] (which is normally the directory the script is in) is the zip file !!

So what I was doing was taking the first directory from sys.path (which is the path list that python searches for modules in) and using os.path.split to remove the last part of the path - which was the zipfile.... Then I change the current working directory to that directory. A bit of an ugly hack - but it works fine for anything built with py2exe and also then with innosetup. What I forgot (doh !) was that if we were running from source code - then using os.path.split would simply remove the last directory.. and bingo, everything fails.

*Damn* - I've fixed it now though. What's even more annoying is that python always gives you the path to the script in sys.argv[0] - so I could have used that at any time without problems....
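For the record, the sys.argv[0] approach could look something like the sketch below. It is not the actual Nanagram fix: the function name script_directory is made up, and it simply assumes (as noted above) that sys.argv[0] holds the path of whatever was launched.

```python
import os
import sys


def script_directory(argv0=None):
    """Return the absolute directory containing the launched script."""
    if argv0 is None:
        # Assumption from the post: sys.argv[0] is the path to the script.
        argv0 = sys.argv[0]
    return os.path.dirname(os.path.abspath(argv0))
```

A program that needs to find its data files could then call os.chdir(script_directory()) once at startup, without caring what sys.path[0] happens to contain.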
posted by Mike Foord on Monday, June 14, 2004

(0) comments

Friday, June 11, 2004

My filestruct program got featured on the Python Daily URL a couple of days ago.... I've had 400 visitors since - nearly back to my usual 20 a day though now :-)

About halfway through the GUI for it using Tkinter.
posted by Mike Foord on Friday, June 11, 2004

(0) comments

Tuesday, June 08, 2004

Ok... so this is a correction to my post a couple of days ago about subclassing the built in types (in python).

I *nearly* got it right. Because __new__ is the 'factory method' for creating new instances it is actually a static method and *doesn't* receive a reference to self as the first argument... it receives a reference to the class as the first argument. By convention in python this is a variable named cls rather than self (which refers to the instance itself). What it means is that the example I gave *works* fine, but the terminology is slightly wrong...

See the docs on the new style classes unifying types and classes. Also thanks to Paul McGuire on comp.lang.python for helping me with this.

My example ought to read :

class newstring(str):
    def __new__(cls, value, *args, **keywargs):
        return str.__new__(cls, value)
    def __init__(self, value, othervalue):
        self.othervalue = othervalue

See how the __new__ method collects all the other arguments (using the *args and **keywargs collectors) but ignores them - they are rightly dealt with by __init__. You *could* examine these other arguments in __new__ and even return an object that is an instance of a different class depending on the parameters - see the example Paul gives...
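A quick check of the corrected example, as a sketch in current Python. The class is the one from the post; the variable names underneath it are just for illustration.

```python
class newstring(str):
    def __new__(cls, value, *args, **keywargs):
        # Only the string value is passed on to str; the rest is left for __init__.
        return str.__new__(cls, value)

    def __init__(self, value, othervalue):
        self.othervalue = othervalue


astring = newstring('hello', 'othervalue')
shouted = astring.upper()  # ordinary str methods still work
```

Note that methods inherited from str, like upper(), hand back a plain str rather than a newstring, so the extra attribute doesn't travel with the result.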

Get all that then ? :-)
posted by Mike Foord on Tuesday, June 08, 2004

(0) comments

Monday, June 07, 2004

Another 'official release' jobby.

This time it's filestruct and splitter.
Both python modules that can be found at :
The Voidspace Pythonutils Page

filestruct is a command line tool for maintaining identical file/directory structures in different locations. It will profile a directory - and then later reprofile and save all the changes into a single zip file. You can use filestruct to then recreate those changes and only have to transfer the files that have changed - useful for when I have to maintain various project and research folders at both home and work.

It is also a set of three classes of objects (and functions) that can profile file structures, write them out as a basic markup language and then read them in and compare them. This could form the basis of an incremental archive, a simple version control system, or anything that wants to monitor or describe a file structure. I'll build a GUI for the command line tool soon and then probably work on a simple version control system....
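The profile-then-compare idea can be sketched roughly as below. This is not the filestruct code and doesn't touch the FSDM markup or the zip file: the function names profile_directory and compare_profiles are invented, and hashing file contents with SHA-256 is an assumption about how a change might be detected.

```python
import hashlib
import os


def profile_directory(root):
    """Map each file's path (relative to root) to a hash of its contents."""
    profile = {}
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            full_path = os.path.join(dirpath, name)
            relative = os.path.relpath(full_path, root)
            with open(full_path, 'rb') as handle:
                profile[relative] = hashlib.sha256(handle.read()).hexdigest()
    return profile


def compare_profiles(old, new):
    """Return sorted lists of added, removed and changed relative paths."""
    added = sorted(set(new) - set(old))
    removed = sorted(set(old) - set(new))
    changed = sorted(path for path in set(old) & set(new)
                     if old[path] != new[path])
    return added, removed, changed
```

Only the files listed as added or changed would need to travel to the other location; the removed list says what to delete there.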

Splitter is a simple class that has been written many times by many people - but dammit Python makes this kind of hacking so much fun !! It can split and automatically rejoin files with various attributes that control how it does it..... (Meant I could get the 132meg starseige file home on a 128meg memory card and a 64meg one....)...
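Splitting and rejoining boils down to something like the sketch below. It is not the Splitter class itself: the function names and the '.partNNN' naming scheme are made up for illustration, and each chunk is read into memory in one go.

```python
def split_file(path, chunk_size):
    """Write path out as numbered part files of at most chunk_size bytes."""
    part_paths = []
    with open(path, 'rb') as source:
        index = 0
        while True:
            chunk = source.read(chunk_size)
            if not chunk:
                break
            # Hypothetical naming scheme: bigfile.part000, bigfile.part001, ...
            part_path = '%s.part%03d' % (path, index)
            with open(part_path, 'wb') as part:
                part.write(chunk)
            part_paths.append(part_path)
            index += 1
    return part_paths


def join_files(part_paths, destination):
    """Concatenate the part files, in the order given, into destination."""
    with open(destination, 'wb') as target:
        for part_path in part_paths:
            with open(part_path, 'rb') as part:
                target.write(part.read())
```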
posted by Mike Foord on Monday, June 07, 2004

(0) comments

Wednesday, June 02, 2004

I've just spent ages trying to subclass string.... and I'm very proud to say I finally managed it !

The trouble is that the string type (str) is immutable - which means that new instances are created using the mysterious __new__ method rather than __init__ !! :-) You still following me.... ?

SO :

class newstring(str):
    def __init__(self, value, othervalue):
        str.__init__(self, value)
        self.othervalue = othervalue

astring = newstring('hello', 'othervalue')

fails miserably. This is because the __new__ method of the str is called *before* the __init__ method.... and it says it's been given too many values. What the __new__ method does is actually return the new instance - for a string the __init__ method is just a dummy.
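The failure is easy to reproduce. Here is a small sketch for current Python, where the exact wording of the error differs from the one I saw but it is still a TypeError raised before __init__ ever runs (the try_create helper is just for illustration):

```python
class newstring(str):
    def __init__(self, value, othervalue):
        self.othervalue = othervalue


def try_create():
    """Return the exception raised by the two-argument call, or None."""
    try:
        # str.__new__ receives both arguments and rejects the second one.
        newstring('hello', 'othervalue')
    except TypeError as error:
        return error
    return None
```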

The bit I couldn't get (and I didn't have access to a python manual at the time) - if the __new__ method is responsible for returning the new instance of the string, surely it wouldn't have a reference to self; since the 'self' wouldn't be created until after __new__ has been called......

Actually that's wrong - so, a simple string type might look something like this :

class newstring(str):
    def __new__(self, value):
        return str.__new__(self, value)
    def __init__(self, value):
        pass

See how the __new__ method returns the instance and the __init__ is just a dummy.
If we want to add the extra attribute we can do this :

class newstring(str):
    def __new__(self, value, othervalue):
        return str.__new__(self, value)
    def __init__(self, value, othervalue):
        self.othervalue = othervalue

The order of creation is that the __new__ method is called which returns the object *then* __init__ is called. Although the __new__ method receives the 'othervalue' it is ignored - and __init__ uses it. In practice __new__ could probably do all of this - but I prefer to mess around with __new__ as little as possible ! I was just glad I got it working..... What it means is that I can create my own class of objects - that in most situations will behave like strings, but have their own attributes. The only restriction is that the string value is immutable and must be set when the object is created. See the excellent path module by Jason Orendorff for another example object that behaves like a string but also has other attributes - although it doesn't use the __new__ method; or the __init__ method I think.
posted by Mike Foord on Wednesday, June 02, 2004

(0) comments
Hmmm... again too much to say, too little time (phew I hear you say ?). Mainly python related stuff.

Just got the latest version of FormEncode - which hopefully will help me write CGI scripts that use forms. Generating the forms using python code rather than HTML templates. It was unnecessarily difficult to get it though (I used 'unduly complicated' in my personal blog... have to find some heterogeneous phrases..). He has no http interface because he hasn't done a proper release yet - it's all alpha quality code. So I can't get it from work where I only have http (restricted) access and my home account is a dial up one. I've had to download Subversion (which looks like an excellent CVS replacement) and *then* use it to fetch FormEncode.... Not even sure it's going to do what I want.

I edit code at home and work. I also have a 60 meg folder of project research which I use at work and at home. The trouble is when I edit either I have to transport the entire folders home - which can be a royal pain with flash memory cards. So I've written a set of functions for describing and comparing file structures using a simple markup language - FSDM - File Structure Description Markup. It will then build into a single zip file only the files that have changed. I take that home - and zing it puts them all in the right place and deletes any folders that have gone. Because it can describe changes to file structures I can build a Simple Version Control (SVC !) system off the code too. I've tested it over the last couple of days and it works great... I haven't uploaded the code - will do shortly.

I also had a 132 meg file to transport using a 128meg flash card and a 32meg one. I hacked together some code to split a file and automatically recombine it. Probably loads of programs around to do this - but Python makes this kind of hacking so much fun !
posted by Mike Foord on Wednesday, June 02, 2004

(0) comments