HierConfig - Python Wiki

Many programs are built these days by assembling components together, and Python programs are no exception. In general, the designer may choose to expose multiple configuration points, and will benefit if there is one standard way of doing so. From the perspective which views programs as hierarchical constructions of configurable components, it would seem to follow logically that configuration of the components should also be hierarchical in nature. The two-level (section, key) model as exemplified by the present ConfigParser does not offer sufficient power. If it did, why does Windows need a registry?

I've been thinking about how to improve the configuration of the logging package (which currently uses ConfigParser) and playing with some ideas which may have more general applicability. I'm posting them here and seeking feedback.

I think a good configuration system should provide the following (in addition to being textual, easy to read and edit):

Allow a hierarchy of configuration information, with no specific limit on the depth of the hierarchy.
Allow inclusion of sub-configurations held in external files, at any point in the hierarchy.
Allow the definition of sequences of items as well as items accessed by key.
Allow late-bound references to any point in the hierarchy.
Allow simple expression evaluation, but not unrestrained eval()-type functionality.
Allow standard library entities (e.g. sys.stderr or os.sep) to be specified.

To illustrate these points, two example configuration files are given below. Please forgive the bias towards logging-related configuration. (My excuse is that I'm thinking about how to make logging configuration easier.)

The first is the application configuration file. It includes the logging configuration file using the notation @"logging.cfg".

app:
{
  name : MyApplication
  base: '/path/to/app/logs/'
  support_team: myappsupport
  mail_domain: '@my-company.com'
}
logging: @"logging.cfg"

The second file contains the logging configuration. It refers to the application configuration through $app.

root:
{
  level     : DEBUG
  handlers  : [$handlers.console, $handlers.file, $handlers.email]
}
handlers:
{
  console:  [ StreamHandler, { level : WARNING, stream  : `sys.stderr` } ]
  file:     [ FileHandler, { filename: $app.base + $app.name + '.log', mode : 'a' } ]
  socket:   [ `handlers.SocketHandler`, { host: localhost, port: `handlers.DEFAULT_TCP_LOGGING_PORT`} ]
  nt_eventlog: [`handlers.NTEventLogHandler`, { appname: $app.name, logtype : Application } ]
  email:    [ `handlers.SMTPHandler`,
              { level: CRITICAL,
                host: localhost,
                port: 25,
                from: $app.name + $app.mail_domain,
                to: [$app.support_team + $app.mail_domain, 'QA' + $app.mail_domain, 'product_manager' + $app.mail_domain],
                subject: 'Take cover' } ]
}
loggers:
{
  "input"     : { handlers: [$handlers.socket] }
  "input.xls" : { handlers: [$handlers.nt_eventlog] }
}

The $-notation resolves entries when they are required. The use of specific characters such as '@' and '$' is preliminary and can be easily changed. You can lay out the file with as much whitespace as you like - indent according to taste.
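To illustrate what "resolves entries when they are required" could mean, here is a minimal Python sketch. It is not the parser itself: it assumes the configuration has already been read into nested dicts, and the names `Ref` and `resolve` are hypothetical. A reference stores only a dotted path and is looked up against the root at access time.

```python
class Ref:
    """A late-bound reference such as $app.name (hypothetical helper)."""

    def __init__(self, path):
        self.path = path.split(".")


def resolve(root, value):
    """Return value, following Ref objects against root at access time.

    No cycle detection is attempted in this sketch.
    """
    while isinstance(value, Ref):
        target = root
        for part in value.path:
            target = target[part]
        value = target
    return value


config = {
    "app": {"name": "MyApplication"},
    "handlers": {"nt_eventlog": {"appname": Ref("app.name")}},
}

# The reference is only followed when resolve() is called, so a change
# made after loading is still picked up.
config["app"]["name"] = "RenamedApp"
appname = resolve(config, config["handlers"]["nt_eventlog"]["appname"])
print(appname)
```

Because the lookup is deferred, the order in which files or sections are loaded does not matter, as long as the target exists by the time the value is used.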

I'm currently working on a module to parse this format, and have just released the first version: see http://www.red-dove.com/python_config.html for more details.

-- VinaySajip

I (PeterOtten) have tried to translate the above sample configuration into a much simpler format that was recently suggested by SkipMontanaro.

I think that calculations like building an email address should not be performed in a config file; they are rather the task of the application that uses it. The example has therefore been dumbed down significantly.

# -*- coding: ascii -*-
#
# allows the usual comment style
#
loggers:
    root:
        level=DEBUG
        handler=console
        handler=file
        handler=email
    input: 
        handler=socket
    input_xls:
        name=input.xls
        handler=nt_eventlog
        
handlers:
    console:  
        class=StreamHandler
        level=WARNING
        stream=sys.stderr
    file:     
        class=FileHandler
        filename=myapp.log
        mode=append
    socket:   
        class=SocketHandler
        host=localhost
        port=DEFAULT_TCP_LOGGING_PORT
    nt_eventlog: 
        class=NTEventLogHandler
        appname=MyApplication
        logtype=Application
    email:    
        class=SMTPHandler
        level=CRITICAL
        host=localhost
        port=25
        from=myapp@mydomain.com
        to=support@pythonsolutions.com
        subject=Take cover

I (VinaySajip) like this syntax, except for:

I don't see why we need to (in the above example, under loggers/root) say "handler=XXX" three times, rather than using a notation like "handlers = [console, file, email]".
It may be restrictive to only allow identifiers as keys.

As for the use of expressions in the config file, I am ambivalent and curious to hear more opinions; sometimes a declarative approach is good. But more than the syntax, I'm interested in what people think about the semantics: for example, allowing cross-references between config elements (perhaps across multiple files), and the ability to perform restricted "special" evaluation of some elements (e.g. sys.stderr).
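One possible reading of restricted "special" evaluation is to resolve a dotted name against an explicit allow-list of modules, using attribute lookup only and never eval(). The sketch below is an assumption about the semantics, not a description of any existing module; `ALLOWED_MODULES` and `resolve_name` are hypothetical names.

```python
import logging.handlers
import os
import sys

# Only names reachable from these roots may be resolved (assumed policy).
ALLOWED_MODULES = {"sys": sys, "os": os, "handlers": logging.handlers}


def resolve_name(dotted):
    """Resolve e.g. 'sys.stderr' by attribute lookup, without eval()."""
    head, *rest = dotted.split(".")
    if head not in ALLOWED_MODULES:
        raise ValueError(f"module not allowed: {head!r}")
    obj = ALLOWED_MODULES[head]
    for attr in rest:
        if attr.startswith("_"):
            raise ValueError(f"private attribute not allowed: {attr!r}")
        obj = getattr(obj, attr)
    return obj


stream = resolve_name("sys.stderr")
port = resolve_name("handlers.DEFAULT_TCP_LOGGING_PORT")
```

Nothing is called here; the name is only looked up. Even so, whole modules such as os expose far more than a config file needs, so a stricter variant might list individual permitted names instead.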

[http://cvs.zope.org/Zope3/doc/zcml/ ZCML] (unlike .ini files) handles nested input fairly well. The syntax looks a little like an Apache config; though from much of what I've seen, it's straight XML. -- IanBicking

Is there a better link for ZCML than the one above? There appears to be nothing in the form of overview documentation on the Zope site, including dev.zope.org - I did site searches for "ZCML" and "configuration" and nothing came up which looked immediately useful. I don't have a particular problem with ZCML, other than the fact that by dint of being XML, it is fairly verbose; and I would guess (please correct me if I'm wrong) that it is Zope-centric rather than general-purpose. -- VinaySajip

Maybe I'm confusing ZCML and [http://www.zope.org/Members/fdrake/zconfig/ ZConfig]; there's a ZConfig presentation at http://zope.org/Members/mcdonc/Presentations/ZConfigPaper. ZCML is much more XML-based, whereas ZConfig is like an Apache config file. ZCML is used for a lot of configuration inside Zope 3, which has a very different idea of what configuration is. (Intended for "system integrators" as opposed to "system administrators".) I believe both were intended to be usable outside of Zope, but haven't been packaged as such (yet).

Another way to deal with nested configurations would be to parse the names. I think one package I've seen does this. So sections are just a sort of "with" statement. I.e.:

loggers.input = nt_eventlog
[loggers]
input.nt_eventlog.filename = /some/path

Well, not a very good example. Anyway, this creates the keys "loggers.input" and "loggers.input.nt_eventlog.filename". This works for keys that are nicely named; it might be harder for a virtual host, which I'd be apt to write like:

[vhost my.vhost.domain]
document_root = /some/path

Or something like that. Anyway, you want to create something like vhost['my.vhost.domain'].document_root, not vhost.my.vhost.domain.document_root. Maybe it could be like:

vhost[my.vhost.domain].document_root = /some/path
# or....
[vhost[my.vhost.domain]]
document_root = /some/path

Anyway, there are still some other possibilities when using the ini syntax.
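The name-parsing idea above can be sketched in Python roughly as follows. This is a hedged illustration only: the functions `split_path` and `parse` are hypothetical, section headers are treated as key prefixes, and a bracketed part such as [my.vhost.domain] is kept as a single key. A name used both as a value and as a parent (as in the first example above) is not handled.

```python
import re

# A path part is either [bracketed text] (kept whole) or a run without dots.
_PART = re.compile(r"\[([^\]]+)\]|([^.\[\]]+)")


def split_path(path):
    """Split 'vhost[my.vhost.domain].document_root' into key parts."""
    return [m.group(1) or m.group(2) for m in _PART.finditer(path)]


def parse(text):
    """Parse ini-like text into nested dicts; sections act as key prefixes."""
    root = {}
    prefix = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith("[") and line.endswith("]") and "=" not in line:
            prefix = split_path(line[1:-1])
            continue
        key, _, value = line.partition("=")
        *parents, leaf = prefix + split_path(key.strip())
        node = root
        for part in parents:
            node = node.setdefault(part, {})
        node[leaf] = value.strip()
    return root


sample = """
vhost[my.vhost.domain].document_root = /some/path
[loggers]
input.nt_eventlog.filename = /some/path
"""
config = parse(sample)
print(config["vhost"]["my.vhost.domain"]["document_root"])
```

With this reading, the section form [vhost[my.vhost.domain]] followed by document_root = /some/path produces the same nested result as the single-line form.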

I agree, but the approach is still more verbose than it need be. Java properties work like this; although they don't have sections, they give each property a hierarchical dotted name, and that can be used to create a hierarchy. For example:

app.name=GenericApp
app.varname=surveyMaster
app.logfile=/var/logs/${app.name}.log
app.servlet=default_servlet
app.templates=templates
entity.user=User
user.firstName.description=first name
user.lastName.description=last name
user.emailAddress.description=email address
user.emailAddress.validator=email
entity.group=Group
group.description=respondent group
group.name.description=name
group.description.description=description

You could use the above to indicate that a user entity is described by a class called User, and that a user instance has firstName, lastName and emailAddress properties ... and so on. It's workable, but hardly optimal.
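As a rough Python illustration of why this is workable but awkward: with flat dotted keys, everything that belongs to one entity has to be gathered by scanning for a prefix. The helper `subtree` is hypothetical and the properties are assumed to have been loaded into a plain dict already.

```python
def subtree(props, prefix):
    """Collect entries under prefix, keyed by the remainder of the name."""
    start = prefix + "."
    return {
        key[len(start):]: value
        for key, value in props.items()
        if key.startswith(start)
    }


props = {
    "entity.user": "User",
    "user.firstName.description": "first name",
    "user.emailAddress.description": "email address",
    "user.emailAddress.validator": "email",
}

email = subtree(props, "user.emailAddress")
user = subtree(props, "user")
print(email)
```

Note that the grouping exists only by naming convention: nothing in the format itself ties "entity.user" to the "user." entries.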

Using my scheme, you could express the virtual hosts either like this:

virtual_hosts :
{
  host1 : { domain: 'host1.my-company.com', root : '/var/www/domains/host1.my-company.com/docs' }
  'host two' : { domain: 'host2.my-company.com', root : '/var/www/domains/host2.my-company.com/docs' }
}

or like this:

virtual_hosts : 
[
  { domain: 'host1.my-company.com', root : '/var/www/domains/host1.my-company.com/docs' }
  { domain: 'host2.my-company.com', root : '/var/www/domains/host2.my-company.com/docs' }
]

The first form would be more useful for being able to refer to individual hosts by name (using $virtual_hosts.host1 or $virtual_hosts['host two']). You can also have your cake and eat it, by specifying both forms and having one refer to the other:

virtual_host_definitions :
{
  host1 : { domain: 'host1.my-company.com', root : '/var/www/domains/host1.my-company.com/docs' }
  'host two' : { domain: 'host2.my-company.com', root : '/var/www/domains/host2.my-company.com/docs' }
}

virtual_hosts : 
[
  $virtual_host_definitions.host1
  $virtual_host_definitions['host two']
]

You might use the dict to access a specific host, and the sequence for iterating over the hosts in a pre-determined order. -- VinaySajip
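In Python terms, the last arrangement might end up looking like the following once references are resolved. This only illustrates the intended result, assuming resolved references point at the same objects rather than copies; it is not output from any actual parser.

```python
# Hypothetical result of loading the configuration above.
virtual_host_definitions = {
    "host1": {
        "domain": "host1.my-company.com",
        "root": "/var/www/domains/host1.my-company.com/docs",
    },
    "host two": {
        "domain": "host2.my-company.com",
        "root": "/var/www/domains/host2.my-company.com/docs",
    },
}

# The sequence refers to the same dicts, in a fixed order.
virtual_hosts = [
    virtual_host_definitions["host1"],
    virtual_host_definitions["host two"],
]

# Keyed access to one host, ordered iteration over all of them.
host_two_root = virtual_host_definitions["host two"]["root"]
domains = [host["domain"] for host in virtual_hosts]
print(domains)
```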
