Differences between revisions 5 and 6
Revision 5 as of 2003-09-15 20:25:45
Size: 10979
Editor: dial81-131-115-73
Comment: Fixed link to PyKDE wiki page.
Revision 6 as of 2004-01-02 03:03:48
Size: 27230
Editor: dial81-131-18-161
Comment: Completely changed the subject of the tutorial...
Line 1: Line 1:
= IOSlaves Tutorial =

Note: This is a work in progress.

== Abstract ==

A small library was written to allow IOSlaves, the protocol handlers for the
[http://www.kde.org/ K Desktop Environment], to be written in Python using the
PyQt and ["PyKDE"] modules. An example IOSlave was created for the purpose of
examining the user's Konqueror bookmarks using this library and a simple Python
class. Since Python works well as a "glue" language, it is hoped that the creation
of IOSlaves will be made more accessible to a wider range of users, leading to a
richer, more transparent user experience on the desktop.

Note: the installation procedure for IOSlaves in forthcoming versions of PyKDE will probably be different to that described below.
= An IOSlave Tutorial =

== Note ==

This document is a ''work in progress''. Please feel free to add
constructive comments, make additions and edit accordingly.
Line 19: Line 11:
In KDE's infrastructure, IOSlaves handle the transfer of data between
applications and remote servers using common protocols such as ''http'' and
''ftp'', and also handle more mundane protocols such as the ''file'' protocol for
local files. Many of the mainstream protocols provided in the standard KDE
distribution are implemented in C++, although some, like the ''finger''
IOSlave, rely on support scripts to handle various aspects of communication
with remote servers. Since ["PyKDE"] provides Python implementations (or
''wrappers'') for the relevant classes in the `kio` library, it is possible to
write IOSlaves almost completely in Python; a simple C++ handling function is
only required for dynamic linking purposes and to set up the interpreter.

Much of the documentation describing the creation of IOSlaves is,
naturally, written to assist the C++ programmer by providing examples of the
appropriate classes in use. When read alongside some of the distributed
examples in the {{{kdebase}}} package of the KDE distribution, these
tutorials provide most of the information required to write an IOSlave in
Python "from scratch". However, some aspects of their operation would benefit
from further description so it is useful to take this opportunity, when
translating the material for a new audience, to try and provide clear and
concise documentation to complement existing material. Note that I am not an
implementor of IOSlaves in C++ so there may be scope for future additions and
corrections to this document from KDE experts.

We will begin by describing the implementation of the Python module, since
the C++ handler should be transparent in use, before discussing potential
problems with IOSlaves, methods for debugging them and any known limitations
to their use.

== Implementation ==

=== Modules and the slave class ===

We begin by importing the necessary modules. Some of these are needed for
general interoperability with the KDE IOSlave infrastructure:
{{{
from qt import QString, QByteArray, QDataStream, IO_ReadOnly
The ADFS IOSlave presents the contents of ADFS floppy disk images to the user by
inspecting the ADFS filesystem stored within each image and using the KIOSlave API
to return this information to client applications. Although the underlying Python
module was written with a command line interface in mind, it provides an interface
which can be used by an IOSlave without too much of an overhead.

This document outlines the source code and structure of the ADFS IOSlave and aims
to be a useful guide to those wishing to write IOSlaves, either in Python or C++.


== Annotated code ==

It is convenient to examine the source code as it is written in the ''kio_adfs.py'' file.
However, we can at least group some of the methods in order to provide an overview of
the IOSlave and separate out the details of the initialisation.


=== Initialisation ===

Various classes and namespaces are required in order to communicate with the KIOSlave
framework. These are drawn from the `qt`, `kio` and `kdecore` modules:

{{{
from qt import QByteArray
Line 59: Line 39:
Other familiar Python modules are used when performing tasks specific to this
IOSlave:
{{{
import os, time
}}}

Additionally, we are going to use the XML module from the Qt libraries:
{{{
import qtxml
}}}

We define a class which will be instantiated when the ''bookmarks'' protocol is
used. This is a subclass of `KIO.SlaveBase` and relies on the facilities of
this base class for communication with applications. At this point, it is
useful to describe how the bookmark information will be encapsulated in the
form of files. We will be representing each bookmark as a Desktop file, so
we declare a template to be filled in for each bookmark and the MIME type for
these files:
The `os` and `time` modules provide functions which are relevant to the operation of
the IOSlave; the `sys` and `traceback` modules are useful for debugging the IOSlave:

{{{
import os, sys, traceback, time
}}}

The `ADFSlib` module is imported. This provides the functionality required to read disk
images:

{{{
import ADFSlib
}}}

We omit the debugging code to keep this tutorial fairly brief. This can be examined
in the distributed source code.


=== The slave class ===

We define a class which will be used to create instances of this IOSlave. The class
must be derived from the `KIO.SlaveBase` class so that it can communicate with
clients via the DCOP mechanism. Various operations are supported if the appropriate
methods (represented as virtual functions in C++) are reimplemented in our subclass.

Note that the name of the class is also declared in the ''details.py'' file so
that the library which launches the Python interpreter knows which class to
instantiate.
Line 79: Line 70:
    desktop_template = \

u"""[Desktop Entry]
Encoding=%(encoding)s
Icon=%(icon)s
Type=Link
URL=%(href)s
"""

    bookmark_mimetype = "application/x-desktop"
}}}

The `__init__` method for this class simply calls the corresponding method of
the base class and initialises some variables for later use. The `__del__`
method currently does nothing.

    """SlaveClass(KIO.SlaveBase)
    
    See kdelibs/kio/kio/slavebase.h for virtual functions to override.
    """
}}}

An initialisation method, or constructor, is written which calls the base class
and initialises some useful attributes, or instance variables. Note that the name
of the IOSlave is passed to the base class's `__init__` method:
Line 97: Line 84:
        KIO.SlaveBase.__init__(self, "bookmarks", pool, app)
        
        self.dcopClient().attach()
        # We must call the initialisation method of the base class.
        
        KIO.SlaveBase.__init__(self, "adfs", pool, app)
        
        # Initialise various instance variables.
Line 102: Line 91:
        self.document = None
        self.file = None
    
    def __del__(self):
    
        pass
}}}

=== General operations ===

Although the actions performed by an IOSlave will typically be closely related
to its purpose, we will define the methods required by many IOSlaves and only
later define the methods which are specific to the ''bookmarks'' protocol.

==== setHost ====

The `setHost` method is called when the bookmarks IOSlave is asked to perform an
operation in which a host is specified. Since we will only be looking at the user's
local bookmarks, the host name is both not required and not wanted. When a host name
is given, we indicate that an error has occurred using the base class's `error` method:
{{{
    def setHost(self, host, port, user, passwd):
    
        if unicode(host) != u"":
        
            self.closeConnection()
            self.error(KIO.ERR_MALFORMED_URL, host)
            return
}}}

==== openConnection and closeConnection ====

The `openConnection` method is called before an application tries to perform operations
on the file system presented by an IOSlave. For the bookmarks protocol, we take this
opportunity to read the user's bookmarks file from the appropriate place in their home
directory.
{{{
    def openConnection(self):
    
        # Don't call self.finished() in this method.
        
}}}
We find the user's home directory from an environment variable and look for a "bookmarks.xml" file in
a subdirectory beneath the ".kde" subdirectory.
{{{
        self.home = os.getenv("HOME")
        
        path = os.path.join(
            self.home, ".kde", "share", "apps", "konqueror",
            "bookmarks" + os.extsep + "xml"
            )
}}}
For convenience, we record the path used to obtain the bookmarks before trying to open the
file specified by that path.
{{{
        self.file = path
        
        self.disc_image = None
        self.adfsdisc = None
}}}

We create a method to parse any URLs passed to the IOSlave and return a path
into a disk image. This initially extracts the path from the `KURL` object
passed as an argument and converts it to a unicode string:

{{{
    def parse_url(self, url):
    
        file_path = unicode(url.path())
}}}

The presence of a colon character is determined. If one is present then it
will simply be discarded along with any preceding text; the remaining text
is assumed to be a path to a local file.

{{{
        at = file_path.find(u":")
        
        if at != -1:
        
            file_path = file_path[at+1:]
}}}

Since we are implementing a read-only IOSlave, we can implement a simple
caching system for operations within a single disk image. If we have cached
a URL for a disk image then we check whether the URL passed refers to an
item beneath it. This implies that the cached URL is a prefix of the
one given.

If the disk image has been cached then return the path within the image:

{{{
        if self.disc_image and file_path.find(self.disc_image) == 0:
        
            # Return the path within the image.
            return file_path[len(self.disc_image):]
}}}
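
The prefix test above can be tried outside the IOSlave. The following is a minimal sketch in Python 3 rather than the Python 2 used by the tutorial; the function name `path_within_cached_image` is hypothetical. It is slightly stricter than the original test, in that it also requires a slash (or nothing at all) after the cached image path, which is an assumption about the intended behaviour rather than something the original code does.

```python
def path_within_cached_image(file_path, cached_image):
    """Return the part of file_path below cached_image, or None.

    Hypothetical helper; it mirrors the cache test in parse_url.
    """
    if not cached_image or not file_path.startswith(cached_image):
        return None

    remainder = file_path[len(cached_image):]

    # Stricter than the original test: require a path separator (or
    # nothing) after the image name, so that "/tmp/a.adf2" is not
    # treated as lying beneath "/tmp/a.adf".
    if remainder and not remainder.startswith("/"):
        return None

    return remainder
```

Note that `file_path.find(prefix) == 0` and `file_path.startswith(prefix)` accept the same strings; the second form simply states the intention more directly.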

An uncached URL must be examined element by element, as far as possible,
comparing the path with the local filesystem. Since a valid path will
contain at least one slash character, we can immediately discard
any paths which do not contain one, returning `None` to the caller:

{{{
        elements = file_path.split(u"/")
        
        if len(elements) < 2:
        
            return None
}}}

Starting from the root directory, we apply each new path element to the
constructed path, testing for the presence of the objects it refers to.
If no object can be found at the path given then `None` is returned to
the caller to indicate that the URL was invalid.
If a file is found then it is assumed that an ADFS disk image has been
located; a check could be performed to verify this.
Finally, if all the path elements are added to the root directory, and
the object referred to is itself a directory, then the URL is treated
as invalid; it should have referred to a file.

{{{
        path_elements, elements = elements[:1], elements[1:]
        
        while elements != []:
        
            path_elements.append(elements.pop(0))
            
            path = u"/".join(path_elements)
            
            if os.path.isfile(path):
            
                break
            
            elif elements == [] and os.path.isdir(path):
            
                return None
            
            elif not os.path.exists(path):
            
                return None
}}}

At this point, it is assumed that a suitable image has been found at the
constructed path. The characters following this path correspond to the
path within the image file. We record the path to the image and construct
the path within the image:

{{{
        self.disc_image = path
        
        image_path = u"/".join(elements)
}}}
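
The steps described so far (discarding any text before a colon, then walking the path until a file is found) can be sketched as a standalone function. This is only an illustration in Python 3, assuming POSIX-style paths; the name `split_image_path` is hypothetical and the function does not use the cache or any of the KDE classes.

```python
import os


def split_image_path(file_path):
    """Split file_path into (image file, path inside image), or return None.

    Hypothetical helper that follows the walk performed by parse_url.
    """
    # Discard a leading "scheme:" part, if any.
    colon = file_path.find(":")
    if colon != -1:
        file_path = file_path[colon + 1:]

    elements = file_path.split("/")
    if len(elements) < 2:
        return None

    # elements[0] is the empty string before the leading slash.
    found, remaining = elements[:1], elements[1:]

    while remaining:
        found.append(remaining.pop(0))
        candidate = "/".join(found)

        if os.path.isfile(candidate):
            # Assume that the first file encountered is the disk image.
            return candidate, "/".join(remaining)

        if not os.path.exists(candidate):
            return None

    # Every element named a directory, so no image file was found.
    return None
```

Given an image at `/home/user/disc.adf`, the string `adfs:/home/user/disc.adf/dir/file` would be split into `/home/user/disc.adf` and `dir/file`.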

If not already open, it is necessary to open the image file, returning `None`
if the file cannot be found. (Its presence was detected earlier but it is
better to catch any exceptions.)

{{{
Line 161: Line 195:
            b = open(self.file, "r").read()
}}}
If the bookmarks file was not found at the expected location then an error is returned to the
application. This may be passed on to the user.
{{{
            adf = open(self.disc_image, "rb")
        
Line 168: Line 199:
            self.error(KIO.ERR_CANNOT_OPEN_FOR_READING, self.file)
            return
}}}
The XML contained in the file is converted to a DOM representation of the document which
is stored in an instance attribute.
{{{
        self.document = qtxml.QDomDocument()
        
        result, errorMsg, errorLine, errorColumn = self.document.setContent(b)
}}}
If the XML in the bookmarks file could not be interpreted then an error is returned.
{{{
        if result == 0:
        
            self.error(KIO.ERR_CANNOT_OPEN_FOR_READING, self.file)
            self.document = None
            return
}}}
For this IOSlave, the `closeConnection` method simply writes any bookmark information
to the user's bookmarks file and indicates that the document will need to be read again
before subsequent operations can be performed. Other IOSlaves may need to perform more
complicated operations at this point.
{{{
    def closeConnection(self):
    
        # Don't call self.finished() in this method.
        
        self._flush()
        
        self.document = None
}}}

==== get ====

The `get` method is called when an application requests an object, represented
by the URL given, using the relevant protocol; in this case the ''bookmarks''
protocol. When the calling application requests a bookmark, represented as a
file by the IOSlave, the contents of this file are generated using a template
and returned to the application.
            return None
}}}

We attempt to open the disk image using a class from the support module. This
will read and catalogue the files within the image, storing them in an
internal structure. However, if a problem is found with the image, then an
exception will be raised. We tidy up and return `None` to signal failure
in such a case, but otherwise return the path within the image:

{{{
        try:
        
            self.adfsdisc = ADFSlib.ADFSdisc(adf)
        
        except ADFSlib.ADFS_exception:
        
            adf.close()
            return None
        
        return image_path
}}}


=== The get file operation ===

Various fundamental operations are required if the IOSlave is going to
perform a useful function. The first of these is provided by the `get`
method which reads files in the disk image and sends their contents
to the client. The first thing this method does is check the URL supplied
using the previously defined `parse_url` method, reporting an error if the
URL is unusable:
Line 210: Line 234:
        self.openConnection()
}}}
The URL must be examined and the relevant node found from the document
describing the user's bookmarks. To do this, we call a
method which is specific to this IOSlave. If the URL refers to a valid object
then the path to the object, the name of the object and a DOM node are
returned to this method; otherwise, the name and object return values are
set to `None`.
{{{
        path, name, obj = self.find_object_from_url(url)
        
        if obj is None:
        path = self.parse_url(url)
        
        if path is None:
        
            self.error(KIO.ERR_DOES_NOT_EXIST, url.path())
            return
}}}

Having established that the disk image referred to is valid, we now have
a path which is supposed to refer to a file within the image. It is now
necessary to attempt to find this file. This is achieved by the use of
the as yet undeclared `find_file_within_image` method which will return
`None` if a suitable file cannot be found:

{{{
        adfs_object = self.find_file_within_image(path)
        
        if not adfs_object:
Line 226: Line 256:
If, on the other hand, an object was found then we can check whether it is a
bookmark or some other object. Objects such as folders cannot be fetched using
the get method.
{{{
        if unicode(obj.nodeName()) != u"bookmark":

Since, at this point, an object of some sort was located within the image,
we need to check whether it is a directory and return an error if so.

The details of the object returned by the above method are in the form of a
tuple which contains the name, the file data and some other metadata.

If the second element in the tuple is a list then the object found is a
directory:

{{{
        if type(adfs_object[1]) == type([]):
Line 236: Line 273:
With the object representing a bookmark determined, we can return the details
to the calling application in the form of a file with a MIME type as defined
earlier. We declare the MIME type before we send the file data.
{{{
        self.mimeType(self.bookmark_mimetype)
}}}
The bookmark's details such as its title, target URL and the icon used to represent it in a menu, are
read and converted into a form suitable for presentation.
{{{
        details = self.read_bookmark_details(obj)
        
        details[u"title"] = self.decode_name(details[u"title"])
}}}
The contents of the file to be returned are created using a template combined
with information from the DOM node found earlier.
{{{
        text = self.desktop_template % details
        output = text.encode(details[u"encoding"])
}}}
The constructed file is sent to the application through the use of the base class's `data` method.
Note the use of the `QByteArray` class for this purpose.
{{{
        self.data(QByteArray(output))
}}}
To report the end of the data, we must send the application an empty byte array:
For files, the second element of the tuple contains a string. In this method,
we are only interested in the file data. Using the base class's `data` method,
which we can access through the current instance, we send a `QByteArray` to the
client:

{{{
        self.data(QByteArray(adfs_object[1]))
}}}

The end of the data string is indicated by an empty `QByteArray` before we
indicate completion of the operation by calling the base class's `finished`
method:
Line 263: Line 288:
}}}
We must also report that we have finished the operation by calling the base class's `finished` method:
{{{
        
Line 270: Line 293:

==== Synchronisation ====

This IOSlave defines a `_flush` method to allow changes to the user's bookmarks to
be written to their bookmarks file. This is typically only performed in methods where
a request was made to change some aspect of the presented file system.
{{{
    def _flush(self):
    
        if self.document is None:
        
            return
        
        try:
        
            b = open(self.file, "w")
        
        except IOError:
        
            self.error(KIO.ERR_CANNOT_OPEN_FOR_WRITING, self.file)
            return
        
        b.write(str(self.document.toString()))
        b.close()
}}}
Resetting the document attribute for this instance will cause the bookmarks file to be read
from the user's home directory before further operations can be performed on the presented file
system:
{{{
        self.document = None
}}}
=== The stat operation ===

The `stat` method returns information about files and directories within
the disk image. It is very important that this method works properly as,
otherwise, the IOSlave will not work as expected and may appear to be
behaving in an unpredictable manner. For example, clients such as Konqueror
often use the `stat` method to find out information about objects before
calling `get`, so failure to read a file may actually be the result of a
misbehaving `stat` operation.

As for the `get` method, the `stat` method initially verifies that the URL
supplied is referring to a valid disk image, and that there is a path
within the image to use. Unlike the `get` method, it will redirect the
client to the path contained within the URL if `parse_url` fails to
handle it. This allows the user to use URL autocompletion on ordinary
filesystems while searching for images to read.

{{{
    def stat(self, url):
    
        path = self.parse_url(url)
        
        if path is None:
        
            # Try redirecting to the protocol contained in the path.
            redir_url = KURL(url.path())
            self.redirection(redir_url)
            self.finished()
            #self.error(KIO.ERR_DOES_NOT_EXIST, url.path())
            return
}}}

As before, non-existent objects within the image cause errors to be
reported:

{{{
        adfs_object = self.find_file_within_image(path)
        
        if not adfs_object:
        
            self.error(KIO.ERR_DOES_NOT_EXIST, path)
            return
}}}

In the tuple containing the object's details the second item may be in the
form of a list. This would indicate that a directory has been found which
we must deal with appropriately. However, for ordinary files we simply
generate a suitable description of the file to return to the client:

{{{
        if type(adfs_object[1]) != type([]):
        
            entry = self.build_entry(adfs_object)
}}}

If the object was not a file then we must ensure that the path given
contains a reference to a directory. If the path is not empty but does
not end with a slash, as expected for a directory, then it is useful to
redirect the client to an appropriate URL:

{{{
        elif path != u"" and path[-1] != u"/":
        
            # Directory referenced, but URL does not end in a slash.
            
            url.setPath(unicode(url.path()) + u"/")
            self.redirection(url)
            self.finished()
            return
}}}

If the URL referred to a directory then a description can be returned to
the client in a suitable form:

{{{
        else:
        
            entry = self.build_entry(adfs_object)
}}}

After a description of the object found has been constructed, it only
remains for us to return the description (or entry in the filesystem)
to the client by submitting it to the KIOSlave framework. This is
performed by the following call to the `statEntry` method, and is
followed by a `finished` call to indicate that there are no more
entries to process:

{{{
        if entry != []:
        
            self.statEntry(entry)
            
            self.finished()
        
        else:
        
            self.error(KIO.ERR_DOES_NOT_EXIST, path)
}}}
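
The decisions made by the `stat` method once a disk image has been found can be summarised without any of the KDE classes. The following Python 3 sketch uses a hypothetical function, `stat_action`, which only names the outcome for a given object and path; it assumes the same object layout as above, where a list in the second position marks a directory.

```python
def stat_action(obj, path):
    """Decide how a stat request should be answered.

    obj is a [name, data_or_children, load, exec, length] sequence, or
    None if nothing was found. Hypothetical helper for illustration.
    """
    if not obj:
        return "error"

    if not isinstance(obj[1], list):
        # An ordinary file: describe it.
        return "entry"

    if path != "" and not path.endswith("/"):
        # A directory named without a trailing slash.
        return "redirect"

    # A directory, including the root directory (empty path).
    return "entry"
```

Laying the cases out like this makes it easier to check that every path through the method ends in exactly one of `statEntry` followed by `finished`, `redirection` followed by `finished`, or `error`.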


=== The mimetype operation ===

In many virtual filesystems, the `mimetype` operation would require a certain
amount of work to determine MIME types of files, or sufficient planning to
ensure that data is returned in a format in line with a predetermined MIME
type. Since its use is optional, we do not define a method to determine the
MIME types of any files within our virtual filesystem. The client will
have to inspect the contents of such files in order to determine their MIME
types.


=== The listDir operation ===

The contents of a directory on our virtual filesystem are returned by the
`listDir` method. This works like the `stat` method, but returns information
on multiple objects within a directory.

As for the previous methods, the validity of the URL is checked. If no
suitable disk image can be found from the URL, the path component of the
original URL is extracted and the client redirected to this location.
Note that the `url` argument is a `kdecore.KURL` object.


{{{
    def listDir(self, url):
    
        path = self.parse_url(url)
        
        if path is None:
        
            redir_url = KURL(url.path())
            
            self.redirection(redir_url)
            self.finished()
            return
}}}

Having established that the path refers to a valid disk image, we try
to find the specified object within the image, returning an error if
nothing suitable is found:

{{{
        adfs_object = self.find_file_within_image(path)
        
        if not adfs_object:
        
            self.error(KIO.ERR_DOES_NOT_EXIST, path)
            return
}}}

If the path does not end in a slash then redirect the client to a
URL which does. This ensures that either a directory will be
retrieved or an error will be returned to the client:

{{{
        elif path != u"" and path[-1] != u"/":
        
            url.setPath(unicode(url.path()) + u"/")
            self.redirection(url)
            self.finished()
            return
}}}

If a file is referenced then an error is returned to the client because
we can only list the contents of a directory:

{{{
        elif type(adfs_object[1]) != type([]):
        
            self.error(KIO.ERR_IS_FILE, path)
            return
}}}

A list of files is kept in the second item of the object returned
from the support module. For each of these files, we must construct
an entry which is understandable to the KIOSlave infrastructure in
a manner similar to that used in the method for the `stat` operation.

{{{
        # Obtain a list of files.
        files = adfs_object[1]
        
        # Return the objects in the list to the application.
        
        for this_file in files:
        
            entry = self.build_entry(this_file)
            
            if entry != []:
            
                self.listEntry(entry, 0)
                
                # For old style disk images, return a .inf file, too.
                if self.adfsdisc.disc_type.find("adE") == -1:
                
                    this_inf = self.find_file_within_image(
                        path + "/" + this_file[0] + ".inf"
                        )
                    
                    if this_inf is not None:
                    
                        entry = self.build_entry(this_inf)
                        
                        if entry != []:
                        
                            self.listEntry(entry, 0)
        
        # We have finished listing the contents of a directory.
        self.listEntry([], 1)
        
        self.finished()
}}}
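
The order of calls at the end of `listDir` matters: each entry is sent with a false second argument, an empty entry with a true second argument marks the end of the list, and `finished` completes the operation. The sketch below, in Python 3, shows this sequence using a hypothetical `RecordingSlave` class that merely records calls; it is a stand-in for testing the order and is not part of PyKDE.

```python
class RecordingSlave:
    """Hypothetical stand-in for KIO.SlaveBase that records calls."""

    def __init__(self):
        self.calls = []

    def listEntry(self, entry, ready):
        self.calls.append(("listEntry", entry, ready))

    def finished(self):
        self.calls.append(("finished",))


def send_listing(slave, entries):
    """Send each non-empty entry, then the terminating empty entry."""
    for entry in entries:
        if entry:
            slave.listEntry(entry, False)

    # An empty entry with a true second argument ends the listing.
    slave.listEntry([], True)
    slave.finished()
```

A stub of this kind can also be handy when exercising the listing logic from the command line, away from Konqueror.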


=== The dispatch loop ===

Although not entirely necessary, we implement a `dispatchLoop` method
which simply calls the corresponding method of the base class:

{{{
    def dispatchLoop(self):
    
        KIO.SlaveBase.dispatchLoop(self)
}}}


== Utility methods ==

We define some methods which, although necessary for this IOSlave to
work, are not standard virtual methods to be reimplemented. However,
they do contain code which might be usefully reused in other IOSlaves.


=== Building file system entries ===

We create a method to assist in building filesystem entries to return
to the client via the KIOSlave infrastructure. For this example, some
basic details of each file or directory in the disk image are derived
from the information it contains and stored in standard
`KIO.UDSAtom` instances.

{{{
    def build_entry(self, obj):
    
        entry = []
}}}

We check the type of object passed to the method in order to
determine the nature of the information returned. For files
we do not provide a predetermined MIME type, leaving the client
to determine this from data retrieved from the disk image.

{{{
        if type(obj[1]) != type([]):
        
            # [name, data, load, exec, length]
            name = self.encode_name_from_object(obj)
            length = obj[4]
}}}

Files stored in old disk images require accompanying ''.inf'' files
to describe their original attributes. The following code provides
details of these files which are not actually present in the disk
image, but are generated automatically by this IOSlave:

{{{
            if self.adfsdisc.disc_type.find("adE") == -1 and \
                obj[0][-4:] == ".inf":
            
                # For .inf files, use a MIME type of text/plain.
                mimetype = "text/plain"
            
            else:
            
                # Let the client discover the MIME type by reading
                # the file.
                mimetype = None
        
        else:
        
            name = self.encode_name_from_object(obj)
            length = 0
            mimetype = "inode/directory"
}}}

Having determined the MIME type we now declare all the relevant
attributes of the object and return these to the caller:

{{{
        atom = KIO.UDSAtom()
        atom.m_uds = KIO.UDS_NAME
        atom.m_str = name
        
        entry.append(atom)
        
        atom = KIO.UDSAtom()
        atom.m_uds = KIO.UDS_SIZE
        atom.m_long = length
        
        entry.append(atom)
        
        atom = KIO.UDSAtom()
        atom.m_uds = KIO.UDS_MODIFICATION_TIME
        # Number of seconds since the epoch.
        atom.m_long = int(time.time())
        
        entry.append(atom)
        
        atom = KIO.UDSAtom()
        atom.m_uds = KIO.UDS_ACCESS
        # The usual octal permission information (rw-r--r-- in this case).
        atom.m_long = 0644
        
        entry.append(atom)
        
        # If the stat method is implemented then entries _must_ include
        # the UDS_FILE_TYPE atom or the whole system may not work at all.
        
        atom = KIO.UDSAtom()
        atom.m_uds = KIO.UDS_FILE_TYPE
        
        if mimetype != "inode/directory":
        
            atom.m_long = os.path.stat.S_IFREG
        
        else:
        
            atom.m_long = os.path.stat.S_IFDIR
        
        entry.append(atom)
        
        if mimetype:
        
            atom = KIO.UDSAtom()
            atom.m_uds = KIO.UDS_MIME_TYPE
            atom.m_str = mimetype
            
            entry.append(atom)
        
        return entry
}}}
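
The shape of an entry can be explored without PyKDE by using plain tuples in place of `KIO.UDSAtom` instances. The following Python 3 sketch is an illustration only: the function name and the field names are hypothetical, and only the values correspond to those used above. It also shows the modern spellings of two Python 2 idioms in the method: `0644` is written `0o644`, and the file type constants are taken from the `stat` module directly rather than through `os.path.stat`.

```python
import stat
import time


def build_entry_sketch(name, length, is_directory, mimetype=None):
    """Return a list of (field, value) pairs describing one object.

    Hypothetical helper; each pair plays the role of one UDSAtom.
    """
    entry = [
        ("name", name),
        ("size", length),
        # Number of seconds since the epoch.
        ("modification_time", int(time.time())),
        # The usual octal permission information (rw-r--r--).
        ("access", 0o644),
        ("file_type", stat.S_IFDIR if is_directory else stat.S_IFREG),
    ]

    # The MIME type is optional; clients can sniff the file contents.
    if mimetype:
        entry.append(("mime_type", mimetype))

    return entry
```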


=== Encoding filenames ===

The following two internal methods deal with the translation of
paths and filenames within the disk image to and from canonical
URL style paths. These are only of interest to those familiar with
ADFS style paths.

{{{
    def encode_name_from_object(self, obj):
    
        name = obj[0]
        
        # If the name contains a slash then replace it with a dot.
        new_name = u".".join(name.split(u"/"))
        
        if self.adfsdisc.disc_type.find("adE") == 0:
        
            if type(obj[1]) != type([]) and u"." not in new_name:
            
                # Construct a suffix from the object's load address/filetype.
                suffix = u"%03x" % ((obj[2] >> 8) & 0xfff)
                new_name = new_name + "." + suffix
        
        return unicode(KURL.encode_string_no_slash(new_name))
    
    def decode_name(self, name):
    
        return unicode(KURL.decode_string(name))
}}}
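
For readers without PyKDE to hand, the name translation can be approximated in Python 3 with the standard library. In this sketch, `urllib.parse.quote` with no safe characters is used as a rough stand-in for `KURL.encode_string_no_slash`; the two may not escape exactly the same set of characters. The `new_style_image` flag is an assumption which replaces the test on `disc_type`.

```python
from urllib.parse import quote, unquote


def encode_name(name, load_address, is_file, new_style_image):
    """Translate an ADFS object name into a URL path element (sketch)."""
    # If the name contains a slash then replace it with a dot.
    new_name = name.replace("/", ".")

    if new_style_image and is_file and "." not in new_name:
        # Bits 8 to 19 of the load address hold the 12-bit filetype.
        suffix = "%03x" % ((load_address >> 8) & 0xFFF)
        new_name = new_name + "." + suffix

    # Percent-encode everything that is not an unreserved character.
    return quote(new_name, safe="")


def decode_name(name):
    """Reverse the percent-encoding applied by encode_name."""
    return unquote(name)
```

For example, a file called `Prog` with a load address of `0xFFFFFB48` would be presented as `Prog.ffb`.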


=== Locating objects within a disk image ===

A key element in the construction of an IOSlave is the method used
to map between the URLs given by client applications and the
conventions of the virtual filesystems represented by the IOSlave.
In this instance, the disk image contains a working snapshot of an
ADFS filesystem which must be navigated in order to extract objects
referenced by the client.

Since the `ADFSlib` support module provides objects to contain the
directory structure contained within the disk image, only a minimal
amount of work is required to locate objects, and this mainly
involves a recursive examination of a tree structure. However, there
are a few special cases which are worth mentioning.

In this method, the `path` argument contains the path component of
the URL supplied by the client in the standard form used in URLs.

{{{
    def find_file_within_image(self, path, objs = None):
}}}

A convention we have adopted is the use of a default value of `None`
for the final argument of this method. Omission of this argument
indicates that we are starting a search from the root directory of
the disk image. As we descend into the directory structure,
recursive calls to this method will supply suitable values for this
argument but, for now, a reasonable value needs to be substituted
for `None`; this is a structure containing the entire filesystem:

{{{
        if objs is None:
        
            objs = self.adfsdisc.files
}}}

If an empty path was supplied then it is assumed that an object
corresponding to the root directory was being referred to. In such a
case a slash character is given as the name of the object, and the
list of objects supplied is given as the contents of the root
directory:

{{{
        if path == u"":
        
            # Special case for root directory.
            return [u"/", objs, 0, 0, 0]
}}}

For non-trivial paths, we split the path string into elements corresponding
to the names of files and directories expected as we descend into the
filesystem's hierarchy of objects, then we remove any empty elements:

{{{
        elements = path.split(u"/")
        
        elements = filter(lambda x: x != u"", elements)
}}}

We examine each object found in the current directory and compare
its name to the next path element expected:

{{{
        for this_obj in objs:
        
            if type(this_obj[1]) != type([]):
            
                # A file is found.
}}}

If a file is found in the filesystem, we translate its name so that
we can compare it with the next path element expected:

{{{
                obj_name = self.encode_name_from_object(this_obj)
                
                if obj_name == elements[0]:
                
                    # A match between names.
}}}

For files, we perform the simple test that the current path element
is the final one in the list, and return the corresponding object if
this is the case. If this is not the case then the URL is likely to
be invalid:

{{{
                    if len(elements) == 1:
                    
                        # This is the last path element; we have found the
                        # required file.
                        return this_obj
                    
                    else:
                    
                        # There are more elements to satisfy but we can
                        # descend no further.
                        return None
}}}

If a direct match between names is not possible then, for old-style
disk images, it is possible that the path is referring to a ''.inf''
file; we check for this possibility, applying the same check on the
remaining path elements as before:

{{{
                elif self.adfsdisc.disc_type.find("adE") == -1 and \
                     elements[0] == obj_name + u".inf":
}}}

If successfully matched, a ''.inf'' file is created and returned,
otherwise a `None` value is returned to indicate failure:

{{{
                    if len(elements) == 1:
                    
                        file_data = "%s\t%X\t%X\t%X\n" % \
                            tuple(this_obj[:1] + this_obj[2:])
                        
                        new_obj = \
                        (
                            this_obj[0] + ".inf", file_data,
                            0, 0, len(file_data)
                        )
                        
                        return new_obj
                    
                    else:
                    
                        # There are more elements to satisfy but we can
                        # descend no further.
                        return None
            
            else:
            
                # A directory is found.
}}}

As for files, the names of directories found in the filesystem are
translated for comparison with the next path element expected:

{{{
                obj_name = self.encode_name_from_object(this_obj)
                
                if obj_name == elements[0]:
                
                    # A match between names.
}}}

Unlike files, directories can occur at any point in the descent into
the filesystem. Therefore, we either return the object corresponding
to the last path element or descend into the directory found:

{{{
                    if len(elements) == 1:
                    
                        # This is the last path element; we have found the
                        # required directory.
                        return this_obj
                    
                    else:
                    
                        # More path elements need to be satisfied; descend
                        # further.
                        return self.find_file_within_image(
                            u"/".join(elements[1:]), this_obj[1]
                            )
}}}

If this point is reached then no matching object was found, so we
return `None` to indicate failure:

{{{
        return None
}}}

An IOSlave Tutorial

Note

This document is a work in progress. Please feel free to add constructive comments, make additions and edit accordingly.

Introduction

The ADFS IOSlave presents the contents of ADFS floppy disk images to the user by inspecting the ADFS filesystem stored within each image and using the KIOSlave API to return this information to client applications. Although the underlying Python module was written with a command line interface in mind, it provides an interface which can be used by an IOSlave without too much of an overhead.

This document outlines the source code and structure of the ADFS IOSlave and aims to be a useful guide to those wishing to write IOSlaves, either in Python or C++.

Annotated code

It is convenient to examine the source code as it is written in the kio_adfs.py file. However, we can at least group some of the methods in order to provide an overview of the IOSlave and separate out the details of the initialisation.

Initialisation

Various classes and namespaces are required in order to communicate with the KIOSlave framework. These are drawn from the qt, kio and kdecore modules:

from qt import QByteArray
from kio import KIO
from kdecore import KURL

The os and time modules provide functions which are relevant to the operation of the IOSlave; the sys and traceback modules are useful for debugging the IOSlave:

import os, sys, traceback, time

The ADFSlib module is imported. This provides the functionality required to read disk images:

import ADFSlib

We omit the debugging code to keep this tutorial fairly brief. This can be examined in the distributed source code.

The slave class

We define a class which will be used to create instances of this IOSlave. The class must be derived from the KIO.SlaveBase class so that it can communicate with clients via the DCOP mechanism. Various operations are supported if the appropriate methods (represented as virtual functions in C++) are reimplemented in our subclass.

Note that the name of the class is also declared in the details.py file so that the library which launches the Python interpreter knows which class to instantiate.

class SlaveClass(KIO.SlaveBase):

    """SlaveClass(KIO.SlaveBase)
    
    See kdelibs/kio/kio/slavebase.h for virtual functions to override.
    """

An initialisation method, or constructor, is written which calls the base class and initialises some useful attributes, or instance variables. Note that the name of the IOSlave is passed to the base class's __init__ method:

    def __init__(self, pool, app):
    
        # We must call the initialisation method of the base class.
        
        KIO.SlaveBase.__init__(self, "adfs", pool, app)
        
        # Initialise various instance variables.
        
        self.host = ""
        self.disc_image = None
        self.adfsdisc = None

We create a method to parse any URLs passed to the IOSlave and return a path into a disk image. This initially extracts the path from the KURL object passed as an argument and converts it to a unicode string:

    def parse_url(self, url):
    
        file_path = unicode(url.path())

The presence of a colon character is determined. If one is present then it will simply be discarded along with any preceding text; the remaining text is assumed to be a path to a local file.

        at = file_path.find(u":")
        
        if at != -1:
        
            file_path = file_path[at+1:]

Since we are implementing a read-only IOSlave, we can implement a simple caching system for operations within a single disk image. If we have cached the path to a disk image then we check whether the path passed refers to an item beneath it. This requires the cached path to be a prefix of the one given.

If the disk image has been cached then return the path within the image:

        if self.disc_image and file_path.find(self.disc_image) == 0:
        
            # Return the path within the image.
            return file_path[len(self.disc_image):]

An uncached URL must be examined element by element, as far as possible, comparing the path with the local filesystem. Since a valid path will contain at least one slash character, we can immediately discard any paths which do not contain one, returning None to the caller:

        elements = file_path.split(u"/")
        
        if len(elements) < 2:
        
            return None

Starting from the root directory, we apply each new path element to the constructed path, testing for the presence of the objects it refers to. If no object can be found at the path given then None is returned to the caller to indicate that the URL was invalid. If a file is found then it is assumed that an ADFS disk image has been located; a check could be performed to verify this. Finally, if all the path elements are added to the root directory, and the object referred to is itself a directory, then the URL is treated as invalid; it should have referred to a file.

        path_elements, elements = elements[:1], elements[1:]
        
        while elements != []:
        
            path_elements.append(elements.pop(0))
            
            path = u"/".join(path_elements)
            
            if os.path.isfile(path):
            
                break
            
            elif elements == [] and os.path.isdir(path):
            
                return None
            
            elif not os.path.exists(path):
            
                return None

At this point, it is assumed that a suitable image has been found at the constructed path. The characters following this path correspond to the path within the image file. We record the path to the image and construct the path within the image:

        self.disc_image = path
        
        image_path = u"/".join(elements)

If not already open, it is necessary to open the image file, returning None if the file cannot be found. (Its presence was detected earlier but it is better to catch any exceptions.)

        try:
        
            adf = open(self.disc_image, "rb")
        
        except IOError:
        
            return None

We attempt to open the disk image using a class from the support module. This will read and catalogue the files within the image, storing them in an internal structure. However, if a problem is found with the image, then an exception will be raised. We tidy up and return None to signal failure in such a case, but otherwise return the path within the image:

        try:
        
            self.adfsdisc = ADFSlib.ADFSdisc(adf)
        
        except ADFSlib.ADFS_exception:
        
            adf.close()
            return None
        
        return image_path
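
The path-walking part of parse_url can be tried outside KDE. The following is a minimal sketch in plain Python 3, not the IOSlave's own code: the function name split_image_path is hypothetical, and it only separates the path to the image file from the path within it, without opening or caching the image.

```python
import os
import tempfile


def split_image_path(file_path):
    """Split file_path into (image file path, path within the image).

    Returns None if no regular file is found along the path.
    """
    elements = file_path.split("/")
    if len(elements) < 2:
        return None

    # Start with the (empty) element before the leading slash.
    path_elements, elements = elements[:1], elements[1:]

    while elements:
        path_elements.append(elements.pop(0))
        path = "/".join(path_elements)

        if os.path.isfile(path):
            # Assume this file is the disk image; the rest lies inside it.
            return path, "/".join(elements)
        if not os.path.exists(path):
            return None

    # Every element was consumed without reaching a file.
    return None


# Demonstration with a temporary file standing in for a disk image.
with tempfile.TemporaryDirectory() as directory:
    directory = directory.replace(os.sep, "/")
    image = directory + "/games.adf"
    open(image, "wb").close()

    inside = split_image_path(image + "/Docs/ReadMe")
    at_root = split_image_path(image)
    missing = split_image_path(directory + "/nothing/here")
    is_dir = split_image_path(directory)
```

Here inside holds the image path and "Docs/ReadMe", at_root holds the image path and an empty string, and the last two calls return None.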

The get file operation

Various fundamental operations are required if the IOSlave is going to perform a useful function. The first of these is provided by the get method which reads files in the disk image and sends their contents to the client. The first thing this method does is check the URL supplied using the previously defined parse_url method, reporting an error if the URL is unusable:

    def get(self, url):
    
        path = self.parse_url(url)
        
        if path is None:
        
            self.error(KIO.ERR_DOES_NOT_EXIST, url.path())
            return

Having established that the disk image referred to is valid, we now have a path which is supposed to refer to a file within the image. It is now necessary to attempt to find this file. This is achieved by the use of the as yet undeclared find_file_within_image method which will return None if a suitable file cannot be found:

        adfs_object = self.find_file_within_image(path)
        
        if not adfs_object:
        
            self.error(KIO.ERR_DOES_NOT_EXIST, path)
            return

Since, at this point, an object of some sort was located within the image, we need to check whether it is a directory and return an error if so.

The details of the object returned by the above method are in the form of a tuple which contains the name, the file data and some other metadata.

If the second element in the tuple is a list then the object found is a directory:

        if type(adfs_object[1]) == type([]):
        
            self.error(KIO.ERR_IS_DIRECTORY, path)
            return

For files, the second element of the tuple contains a string. In this method, we are only interested in the file data. Using the base class's data method, which we can access through the current instance, we send a QByteArray to the client:

        self.data(QByteArray(adfs_object[1]))

The end of the data string is indicated by an empty QByteArray before we indicate completion of the operation by calling the base class's finished method:

        self.data(QByteArray())
        
        self.finished()
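
The order of calls made at the end of get can be checked without KDE. This is a sketch in plain Python 3 under stated assumptions: RecordingSlave is a hypothetical stand-in for KIO.SlaveBase which merely records calls, send_object is a hypothetical helper, and the error codes are plain strings rather than the KIO constants.

```python
class RecordingSlave:
    """Hypothetical stand-in for KIO.SlaveBase that records calls."""

    def __init__(self):
        self.calls = []

    def data(self, payload):
        self.calls.append(("data", payload))

    def finished(self):
        self.calls.append(("finished",))

    def error(self, code, text):
        self.calls.append(("error", code, text))


def send_object(slave, adfs_object, path):
    """Mirror the tail of get: reject directories, send file data."""
    if adfs_object is None:
        slave.error("ERR_DOES_NOT_EXIST", path)
    elif isinstance(adfs_object[1], list):
        slave.error("ERR_IS_DIRECTORY", path)
    else:
        slave.data(adfs_object[1])   # the file contents
        slave.data(b"")              # an empty block marks the end of the data
        slave.finished()


# Objects use the layout [name, data, load, exec, length].
file_slave = RecordingSlave()
send_object(file_slave, ["ReadMe", b"Hello", 0xFFFFFF00, 0, 5], "ReadMe")

dir_slave = RecordingSlave()
send_object(dir_slave, ["Docs", [], 0, 0, 0], "Docs")
```

For the file, the recorded calls are the data, an empty data block and finished, in that order; for the directory, only the error is recorded.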

The stat operation

The stat method returns information about files and directories within the disk image. It is very important that this method works properly as, otherwise, the IOSlave will not work as expected and may appear to be behaving in an unpredictable manner. For example, clients such as Konqueror often use the stat method to find out information about objects before calling get, so failure to read a file may actually be the result of a misbehaving stat operation.

As with the get method, the stat method initially verifies that the URL supplied is referring to a valid disk image, and that there is a path within the image to use. Unlike the get method, it will redirect the client to the path contained within the URL if parse_url fails to handle it. This allows the user to use URL autocompletion on ordinary filesystems while searching for images to read.

    def stat(self, url):
    
        path = self.parse_url(url)
        
        if path is None:
        
            # Try redirecting to the protocol contained in the path.
            redir_url = KURL(url.path())
            self.redirection(redir_url)
            self.finished()
            #self.error(KIO.ERR_DOES_NOT_EXIST, url.path())
            return

As before, non-existent objects within the image cause errors to be reported:

        adfs_object = self.find_file_within_image(path)
        
        if not adfs_object:
        
            self.error(KIO.ERR_DOES_NOT_EXIST, path)
            return

In the tuple containing the object's details the second item may be in the form of a list. This would indicate that a directory has been found which we must deal with appropriately. However, for ordinary files we simply generate a suitable description of the file to return to the client:

        if type(adfs_object[1]) != type([]):
        
            entry = self.build_entry(adfs_object)

If the object was not a file then it is a directory, and we must ensure that the path given refers to it as one. If the path is not empty but does not end in a slash, as expected for a directory, then it is useful to redirect the client to an appropriate URL:

        elif path != u"" and path[-1] != u"/":
        
            # Directory referenced, but URL does not end in a slash.
            
            url.setPath(unicode(url.path()) + u"/")
            self.redirection(url)
            self.finished()
            return

If the URL referred to a directory then a description can be returned to the client in a suitable form:

        else:
        
            entry = self.build_entry(adfs_object)

After a description of the object found has been constructed, it only remains for us to return the description (or entry in the filesystem) to the client by submitting it to the KIOSlave framework. This is performed by the following call to the statEntry method, and is followed by a finished call to indicate that there are no more entries to process:

        if entry != []:
        
            self.statEntry(entry)
            
            self.finished()
        
        else:
        
            self.error(KIO.ERR_DOES_NOT_EXIST, path)
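
The branches taken by stat can be summarised as a small decision function. This is only a sketch in plain Python 3; the name stat_action and the returned strings are hypothetical, chosen to label the three outcomes described above.

```python
def stat_action(path, adfs_object):
    """Return the action stat takes for an object found at path."""
    if adfs_object is None:
        return "error"
    if not isinstance(adfs_object[1], list):
        return "entry"       # an ordinary file
    if path != "" and not path.endswith("/"):
        return "redirect"    # a directory named without a trailing slash
    return "entry"           # a directory, or the root of the image


# Objects use the layout [name, data, load, exec, length].
a_file = ["ReadMe", b"Hello", 0, 0, 5]
a_dir = ["Docs", [a_file], 0, 0, 0]
root = ["/", [a_dir], 0, 0, 0]

actions = [
    stat_action("ReadMe", a_file),
    stat_action("Docs", a_dir),
    stat_action("Docs/", a_dir),
    stat_action("", root),
    stat_action("Missing", None),
]
```

The redirect case is the one that lets a client which asked for "Docs" retry with "Docs/" and receive a directory entry.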

The mimetype operation

In many virtual filesystems, the mimetype operation would require a certain amount of work to determine MIME types of files, or sufficient planning to ensure that data is returned in a format in line with a predetermined MIME type. Since its use is optional, we do not define a method to determine the MIME types of any files within our virtual filesystem. The client will have to inspect the contents of such files in order to determine their MIME types.

The listDir operation

The contents of a directory on our virtual filesystem are returned by the listDir method. This works like the stat method, but returns information on multiple objects within a directory.

As for the previous methods, the validity of the URL is checked. If the URL cannot be resolved to a disk image, the path component of the original URL is extracted and the client is redirected to this location. Note that the url argument is a kdecore.KURL object.

    def listDir(self, url):
    
        path = self.parse_url(url)
        
        if path is None:
        
            redir_url = KURL(url.path())
            self.redirection(redir_url)
            self.finished()
            return

Having established that the path refers to a valid disk image, we try to find the specified object within the image, returning an error if nothing suitable is found:

        adfs_object = self.find_file_within_image(path)
        
        if not adfs_object:
        
            self.error(KIO.ERR_DOES_NOT_EXIST, path)
            return

If the path does not end in a slash then redirect the client to a URL which does. This ensures that either a directory will be retrieved or an error will be returned to the client:

        elif path != u"" and path[-1] != u"/":
        
            url.setPath(unicode(url.path()) + u"/")
            self.redirection(url)
            self.finished()
            return

If a file is referenced then an error is returned to the client because we can only list the contents of a directory:

        elif type(adfs_object[1]) != type([]):
        
            self.error(KIO.ERR_IS_FILE, path)
            return

A list of files is kept in the second item of the object returned from the support module. For each of these files, we must construct an entry which is understandable to the KIOSlave infrastructure in a manner similar to that used in the method for the stat operation.

        # Obtain a list of files.
        files = adfs_object[1]
        
        # Return the objects in the list to the application.
        
        for this_file in files:
        
            entry = self.build_entry(this_file)
            
            if entry != []:
            
                self.listEntry(entry, 0)
                
                # For old style disk images, return a .inf file, too.
                if self.adfsdisc.disc_type.find("adE") == -1:
                
                    this_inf = self.find_file_within_image(
                        path + "/" + this_file[0] + ".inf"
                        )
                    
                    if this_inf is not None:
                    
                        entry = self.build_entry(this_inf)
                        
                        if entry != []:
                        
                            self.listEntry(entry, 0)
        
        # We have finished listing the contents of a directory.
        self.listEntry([], 1)
        
        self.finished()
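
The effect of the .inf handling on a directory listing can be illustrated with plain lists. This Python 3 sketch uses a hypothetical listed_names function and shows names unencoded; it assumes, as the lookup method does, that only files gain a generated .inf companion.

```python
def listed_names(objs, old_style):
    """Return the names listDir would report for a directory's objects."""
    names = []
    for obj in objs:
        names.append(obj[0])
        # Only files gain a generated .inf companion, and only in
        # old-style images.
        if old_style and not isinstance(obj[1], list):
            names.append(obj[0] + ".inf")
    return names


# Objects use the layout [name, data, load, exec, length].
contents = [["Docs", [], 0, 0, 0], ["ReadMe", b"Hello", 0, 0, 5]]
old_listing = listed_names(contents, old_style=True)
new_listing = listed_names(contents, old_style=False)
```

With these values, the old-style listing contains Docs, ReadMe and ReadMe.inf, while the new-style listing contains only Docs and ReadMe.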

The dispatch loop

Although not entirely necessary, we implement a dispatchLoop method which simply calls the corresponding method of the base class:

    def dispatchLoop(self):
    
        KIO.SlaveBase.dispatchLoop(self)

Utility methods

We define some methods which, although necessary for this IOSlave to work, are not standard virtual methods to be reimplemented. However, they do contain code which might be usefully reused in other IOSlaves.

Building file system entries

We create a method to assist in building filesystem entries to return to the client via the KIOSlave infrastructure. For this example, some basic details of each file or directory in the disk image are derived from the information it contains and stored within standard KIO.UDSAtom instances.

    def build_entry(self, obj):
    
        entry = []

We check the type of object passed to the method in order to determine the nature of the information returned. For files we do not provide a predetermined MIME type, leaving the client to determine this from data retrieved from the disk image.

        if type(obj[1]) != type([]):
        
            # [name, data, load, exec, length]
            name = self.encode_name_from_object(obj)
            length = obj[4]

Files stored in old disk images require accompanying .inf files to describe their original attributes. The following code provides details of these files which are not actually present in the disk image, but are generated automatically by this IOSlave:

            if self.adfsdisc.disc_type.find("adE") == -1 and \
                obj[0][-4:] == ".inf":
            
                # For .inf files, use a MIME type of text/plain.
                mimetype = "text/plain"
            
            else:
            
                # Let the client discover the MIME type by reading
                # the file.
                mimetype = None
        
        else:
        
            name = self.encode_name_from_object(obj)
            length = 0
            mimetype = "inode/directory"

Having determined the MIME type we now declare all the relevant attributes of the object and return these to the caller:

        atom = KIO.UDSAtom()
        atom.m_uds = KIO.UDS_NAME
        atom.m_str = name
        
        entry.append(atom)
        
        atom = KIO.UDSAtom()
        atom.m_uds = KIO.UDS_SIZE
        atom.m_long = length
        
        entry.append(atom)
        
        atom = KIO.UDSAtom()
        atom.m_uds = KIO.UDS_MODIFICATION_TIME
        # Number of seconds since the epoch.
        atom.m_long = int(time.time())
        
        entry.append(atom)
        
        atom = KIO.UDSAtom()
        atom.m_uds = KIO.UDS_ACCESS
        # The usual octal permission information (rw-r--r-- in this case).
        atom.m_long = 0644
        
        entry.append(atom)
        
        # If the stat method is implemented then entries _must_ include
        # the UDS_FILE_TYPE atom or the whole system may not work at all.
        
        atom = KIO.UDSAtom()
        atom.m_uds = KIO.UDS_FILE_TYPE
        
        if mimetype != "inode/directory":
        
            atom.m_long = os.path.stat.S_IFREG
        
        else:
        
            atom.m_long = os.path.stat.S_IFDIR
        
        entry.append(atom)
        
        if mimetype:
        
            atom = KIO.UDSAtom()
            atom.m_uds = KIO.UDS_MIME_TYPE
            atom.m_str = mimetype
            
            entry.append(atom)
        
        return entry
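
The information gathered by build_entry can be pictured without the KIO classes. In this Python 3 sketch, the describe function is hypothetical and returns (field, value) pairs in place of KIO.UDSAtom instances; the field names are illustrative labels, not KIO constants.

```python
import stat
import time


def describe(obj, old_style=False):
    """Return (field, value) pairs similar to those build_entry produces."""
    is_directory = isinstance(obj[1], list)

    if is_directory:
        length, mimetype = 0, "inode/directory"
    elif old_style and obj[0].endswith(".inf"):
        length, mimetype = obj[4], "text/plain"
    else:
        length, mimetype = obj[4], None   # left for the client to detect

    entry = [
        ("name", obj[0]),
        ("size", length),
        ("modification_time", int(time.time())),
        ("access", 0o644),
        ("file_type", stat.S_IFDIR if is_directory else stat.S_IFREG),
    ]
    if mimetype:
        entry.append(("mime_type", mimetype))
    return entry


# Objects use the layout [name, data, load, exec, length].
file_entry = dict(describe(["ReadMe", b"Hello", 0, 0, 5]))
dir_entry = dict(describe(["Docs", [], 0, 0, 0]))
```

The file type field is the one the stat operation depends on: it is stat.S_IFREG for files and stat.S_IFDIR for directories, regardless of whether a MIME type is supplied.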

Encoding filenames

The following two internal methods deal with the translation of paths and filenames within the disk image to and from canonical URL style paths. These are only of interest to those familiar with ADFS style paths.

    def encode_name_from_object(self, obj):
    
        name = obj[0]
        
        # If the name contains a slash then replace it with a dot.
        new_name = u".".join(name.split(u"/"))
        
        if self.adfsdisc.disc_type.find("adE") == 0:
        
            if type(obj[1]) != type([]) and u"." not in new_name:
            
                # Construct a suffix from the object's load address/filetype.
                suffix = u"%03x" % ((obj[2] >> 8) & 0xfff)
                new_name = new_name + "." + suffix
        
        return unicode(KURL.encode_string_no_slash(new_name))
    
    def decode_name(self, name):
    
        return unicode(KURL.decode_string(name))
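
For readers unfamiliar with ADFS names, the translation can be sketched in plain Python 3. The function encode_name and its parameters are hypothetical, and urllib.parse.quote with no safe characters is an assumed stand-in for KURL.encode_string_no_slash, which may differ in detail.

```python
from urllib.parse import quote, unquote


def encode_name(name, load_address, is_directory, new_style):
    """Translate an ADFS object name into a URL path element."""
    # ADFS allows "/" in names; swap it for "." so that it cannot be
    # mistaken for a path separator.
    new_name = name.replace("/", ".")

    if new_style and not is_directory and "." not in new_name:
        # As in the method above, bits 8 to 19 of the load address are
        # used as a three-digit hexadecimal filetype suffix.
        new_name += ".%03x" % ((load_address >> 8) & 0xFFF)

    # Assumption: quote() stands in for KURL.encode_string_no_slash.
    return quote(new_name, safe="")


text_name = encode_name("ReadMe", 0xFFFFFF52, False, True)
dotted_name = encode_name("Draw/File", 0xFFFFFF52, False, True)
spaced_name = encode_name("My File", 0xFFFFFB00, False, True)
old_name = encode_name("ReadMe", 0xFFFFFF52, False, False)
decoded = unquote(spaced_name)
```

Names which already contain a dot after translation, directories, and objects in old-style images are left without a suffix.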

Locating objects within a disk image

A key element in the construction of an IOSlave is the method used to map between the URLs given by client applications and the conventions of the virtual filesystems represented by the IOSlave. In this instance, the disk image contains a working snapshot of an ADFS filesystem which must be navigated in order to extract objects referenced by the client.

Since the ADFSlib support module provides objects which represent the directory structure held within the disk image, only a minimal amount of work is required to locate objects; this mainly involves a recursive examination of a tree structure. However, there are a few special cases which are worth mentioning.

In this method, the path argument contains the path component of the URL supplied by the client in the standard form used in URLs.

    def find_file_within_image(self, path, objs = None):

A convention we have adopted is the use of a default value of None for the final argument of this method. Omission of this argument indicates that we are starting a search from the root directory of the disk image. As we descend into the directory structure, recursive calls to this method will supply suitable values for this argument but, for now, a reasonable value needs to be substituted for None; this is a structure containing the entire filesystem:

        if objs is None:
        
            objs = self.adfsdisc.files

If an empty path was supplied then it is assumed that an object corresponding to the root directory was being referred to. In such a case a slash character is given as the name of the object, and the list of objects supplied is given as the contents of the root directory:

        if path == u"":
        
            # Special case for root directory.
            return [u"/", objs, 0, 0, 0]

For non-trivial paths, we split the path string into elements corresponding to the names of files and directories expected as we descend into the filesystem's hierarchy of objects, then we remove any empty elements:

        elements = path.split(u"/")
        
        elements = filter(lambda x: x != u"", elements)

We examine each object found in the current directory and compare its name to the next path element expected:

        for this_obj in objs:
        
            if type(this_obj[1]) != type([]):
            
                # A file is found. 

If a file is found in the filesystem, we translate its name so that we can compare it with the next path element expected:

                obj_name = self.encode_name_from_object(this_obj)
                
                if obj_name == elements[0]:
                
                    # A match between names.

For files, we perform the simple test that the current path element is the final one in the list, and return the corresponding object if this is the case. If this is not the case then the URL is likely to be invalid:

                    if len(elements) == 1:
                    
                        # This is the last path element; we have found the
                        # required file.
                        return this_obj
                    
                    else:
                    
                        # There are more elements to satisfy but we can
                        # descend no further.
                        return None

If a direct match between names is not possible then, for old-style disk images, it is possible that the path is referring to a .inf file; we check for this possibility, applying the same check on the remaining path elements as before:

                elif self.adfsdisc.disc_type.find("adE") == -1 and \
                     elements[0] == obj_name + u".inf":

If successfully matched, a .inf file is created and returned, otherwise a None value is returned to indicate failure:

                    if len(elements) == 1:
                    
                        file_data = "%s\t%X\t%X\t%X\n" % \
                            tuple(this_obj[:1] + this_obj[2:])
                        
                        new_obj = \
                        (
                            this_obj[0] + ".inf", file_data,
                            0, 0, len(file_data)
                        )
                        
                        return new_obj
                    
                    else:
                    
                        # There are more elements to satisfy but we can
                        # descend no further.
                        return None
            
            else:
            
                # A directory is found.

As for files, the names of directories found in the filesystem are translated for comparison with the next path element expected:

                obj_name = self.encode_name_from_object(this_obj)
                
                if obj_name == elements[0]:
                
                    # A match between names.

Unlike files, directories can occur at any point in the descent into the filesystem. Therefore, we either return the object corresponding to the last path element or descend into the directory found:

                    if len(elements) == 1:
                    
                        # This is the last path element; we have found the
                        # required directory.
                        return this_obj
                    
                    else:
                    
                        # More path elements need to be satisfied; descend
                        # further.
                        return self.find_file_within_image(
                            u"/".join(elements[1:]), this_obj[1]
                            )

If this point is reached then no matching object was found, so we return None to indicate failure:

        return None
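
The whole lookup can be exercised on a small nested list. This is a condensed sketch in plain Python 3, not the method itself: find_object is a hypothetical name, names are compared without encoding, and treating a path made only of slashes as the root is an assumption rather than something the method above does.

```python
def find_object(path, objs, old_style=False):
    """Return the object at path within the nested list objs, or None."""
    elements = [element for element in path.split("/") if element]
    if not elements:
        # Special case for the root directory.
        return ["/", objs, 0, 0, 0]

    for obj in objs:
        name, contents = obj[0], obj[1]
        is_directory = isinstance(contents, list)

        if name == elements[0]:
            if len(elements) == 1:
                return obj
            if is_directory:
                # Descend with the remaining path elements.
                return find_object("/".join(elements[1:]), contents, old_style)
            return None   # cannot descend into a file

        if old_style and not is_directory and elements[0] == name + ".inf":
            if len(elements) > 1:
                return None
            # Generate the .inf file: name, load, exec and length.
            text = "%s\t%X\t%X\t%X\n" % (name, obj[2], obj[3], obj[4])
            return (name + ".inf", text, 0, 0, len(text))

    return None


# Objects use the layout [name, data, load, exec, length].
readme = ["ReadMe", b"Hello", 0xFFFFFF00, 0, 5]
image = [["Docs", [readme], 0, 0, 0]]

found = find_object("Docs/ReadMe", image)
inf = find_object("Docs/ReadMe.inf", image, old_style=True)
```

Note that the .inf object does not exist in the tree; it is built on demand from the metadata of the file it describes.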

IoSlavesTutorial (last edited 2010-06-26 22:52:46 by PaulBoddie)
