Differences between revisions 29 and 30
Revision 29 as of 2012-03-15 06:34:11
Size: 14392
Editor: techtonik
Comment: added performance report about logging being slow from Stanford VNS project - I have no reasons not to believe them
Revision 30 as of 2012-03-15 07:16:13
Size: 15127
Editor: techtonik
Comment: logging with 3rd party libraries is a PITA
Deletions are marked like this. Additions are marked like this.
Line 46: Line 46:

== Is a pain in the arse for working with libraries ==

Libraries should not output any logging information by default unless explicitly asked to do so. That's why they should not try to configure logging. But on the first call to any log() function logging configures itself automatically. This is only documented in thread safety note for logging.log() function and in logging.basicConfig() (one of the examples why '''Docs are not complete''' speculation is not corrent - the docs are there, they are just hard to find). So, if any used library uses logging, you ''''have to configure root logger'''. So, instead of providing an unobtrisive way to debug library behavior, it just makes their usage harder. -- techtonik

----

This page should cover pros and cons of current stdlib logging package.

Pros

  • Almost every feature you could want from a logging package is there

Cons

  • See below. Vinay made sections of each of the points in the original bulleted list for discussion.

Docs are not complete

False. There is pretty much everything in the docs. If you think something is missing, please, be more specific about the missing stiff.

Docs are too long

True.

Config files are a little bit hard to comprehend

An alternate configuration mechanism is provided by the ZConfig package (PyPI). (Note ZConfig is not a small package to pull in.)

Another alternative is provided by the config package, which is a single module and easy to incorporate into logging. However, what say people to the question of backward compatibility? -- VinaySajip

  • So the answer is True, but can't not be changed, because of backward compatibility. -- techtonik

If you want to suggest an alternative mechanism which uses ConfigParser, please suggest alternatives to the format. -- VinaySajip

I'd rather avoid using config files (for logging) entirely in almost every project I work on. If logging has its own configuration files, that's a smell. If I need configfiles for my project, I'll choose the format I need, and expose the necessary settings to the user coherently, including logging settings. That it has its own configuration file to me is a smell, it suggests configuring in code is too hard, and too necessary. -- JoshuaRodman

  • It's not a smell - it's not necessary to use, but allows for easier configuration in some scenarios. You can certainly put other settings in the same file, or have separate files for other settings. I don't know of any convention that limits the number of configuration files in an application to zero or one. -- VinaySajip

API uses camelCase (goes against PEP8 recommendation and most of the stdlib)

PEP8 says - consistency with this style guide is important. Consistency within a project is more important. Consistency within one module or function is most important.

  • So True, but can't not be changed, because of backward compatibility. logging2 maybe. -- techtonik

    • It's a low priority right now, unless there's an initiative to ensure the rest of the stdlib is made to conform to PEP8. -- VinaySajip

Rather slow considering the large number of function calls performed internally to check which handler to use

There is no official confirmation, but users report 25% boost in performance after commenting logging stuff. So, this stays to be True until somebody proves otherwise.

Vinay argues that 'logging can be too slow in some specialised scenarios', and asks people to provide more objective metrics than "lot of function calls". Although we don't speak about specialized scenarios, it would be nice if at least these function calls were counted. There are (very simple) timing test results below.

Is a pain in the arse for working with libraries

Libraries should not output any logging information by default unless explicitly asked to do so. That's why they should not try to configure logging. But on the first call to any log() function logging configures itself automatically. This is only documented in thread safety note for logging.log() function and in logging.basicConfig() (one of the examples why Docs are not complete speculation is not corrent - the docs are there, they are just hard to find). So, if any used library uses logging, you 'have to configure root logger. So, instead of providing an unobtrisive way to debug library behavior, it just makes their usage harder. -- techtonik


Doesn't have runtime scoping (i.e., log messages handled based on the call stack)

Please give some more details on what you mean here. How exactly would you want to log messages based on the call stack? -- VinaySajip

Difficult to extend log records

LogRecords need to be pickleable to be sent across the wire. You can add arbitrary attributes to LogRecords by using the "extra" keyword parameter to logging calls. Exactly how would you like to extend LogRecords in a way which is difficult at the moment, and why? -- VinaySajip

I don't really know what this offers, but I usually prefer to simply add attributes to the python objects if I need to decorate them. Perhaps unconventional, but it feels pythonic to me. -- JoshuaRodman

  • Yes, but you don't normally have access to a LogRecord when logging an event, so you can't simply add attributes to it. -- VinaySajip

Difficult to add general context to log messages (e.g., add the request URL to all logging messages during the request)

There's information on this very topic at Adding contextual information to your logging output. Can you provide more details on how the mechanism provided fails to meet your needs? -- VinaySajip

I'd prefer some log-invocation magic for the common cases, like the function name I'm in as a token to be expanded. -- JoshuaRodman

  • The funtion name is already available without magic, have you checked the documentation? -- VinaySajip

By default it does nothing; basicConfig makes it do something but makes it hard to tweak logging.

How would you want to tweak logging in a way which basicConfig doesn't make easy? Suggest improvements which can be made to basicConfig, I'm receptive. Remember to consider backward compatibility. -- VinaySajip

A logger with no handlers should default to sending to stderr or something like this. Perhaps it already has a handler. Perhaps there is a default handler set by default. I shouldn't have to configure anything for the simplest case of a script. Further, this means that I can test modules independently from my larger project and get a sane behavior. -- JoshuaRodman

  • Don't agree with your suggestions - a third party library which uses logging should not spew logging output by default which may not be wanted by a developer/user of an application which uses it. For the simplest case of a script, just use logging.info() etc. and it invokes basicConfig() under the covers. Have you checked the documentation? -- VinaySajip

filters are an abstract class instead of callables

Filter is not an abstract class - it's a concrete class with a reasonable default implementation. Is this really a major design flaw? It would be easy to modify the package to add callables to the filter list, and the system could expect either a Filter instance or a callable. -- VinaySajip

Thread-local handlers aren't easy to setup

Please suggest how you would like these to work, and why you need thread-local handlers? You already have the ability to insert thread-local context information into messages using non-thread-local handlers. -- VinaySajip

Nothing like keywords or tags for doing multi-dimensional categorization or grouping of messages

How many use cases need this? How would you see this being implemented? Buffering of messages is already possible. Talk is cheap, so please describe your use cases in more detail and how you think an "ideal" API for "multi-dimensional categorization" or "grouping" would look. -- VinaySajip

No clear way to introduce HTML-formatted messages (or an HTML formatting option)

You can post messages to web sites, and present messages in HTML format in numerous ways; is this a common use case and would we get common agreement on how this would work (i.e. a specific HTML representation) ? -- VinaySajip

Not fast enough to do pervasive logging in libraries (we've encountered this with Paste/Pylons, where we'd like to put lots of logging in that people could turn on, but it becomes a notable performance hit).

Please clarify - how much of a performance hit is it if logging is turned off for particular modules? When you encountered performance issues, what mitigating strategies did you try? (e.g. isEnabledFor) -- VinaySajip

I set up a very simple script to time logging calls, available here. Output looks like this (Python 2.6 on Windows, Core Duo E8400):

log_noop              0.13 microseconds
log_simple           57.36 microseconds
log_filtered          4.19 microseconds
log_mitigated         3.78 microseconds
log_disabled          0.98 microseconds
No caller, thread, process info...
log_simple           49.89 microseconds
log_filtered          4.19 microseconds
log_mitigated         3.79 microseconds

which means that a simple logging call takes around 57 microseconds, reduced to around 50 microseconds if you opt to not collect caller, thread or process info. That's for logging to a file. So what are your expectations for performance? These numbers don't look too shabby to me. Of course, it's a simplistic test, but at least there are some actual numbers there. --VinaySajip

Too complicated, with logging levels, namespaces, handlers, and other things -- they took all the terrible things from java.util.logging and log4j, which are incredibly slow (but powerful) in python

Good ideas don't only come from Python people, folks. These ideas were proven in log4j and other packages and they are based on the ideas of "what happened? where did it happen? how important is it? who wants to know?" and if you think about it, these ideas are hardly Java-specific. OTOH, they are pretty central to the problem domain addressed by logging. So - "what happened?" is the details of the logging call, "where did it happen?" is the namespace, "how important is it?" is the level, and "who wants to know?" is the handler. Hardly that complicated, and AFAICT pretty much a minimum requirement for any logging package that aspires to the name.

And, "incredibly slow" is pretty emotive. Care to back that up with some hard data? See the numbers above for some ideas. Also, anyone who bothers to look at log4j in detail will see that Python logging is not a mindless translation - it's fairly Pythonic. Beyond the basic abstractions of "What? Where? How Important? Who Needs To Know?", there's no real correspondence between the Java artifacts and the Python ones. Using David A. Wheeler's SLOCCount, log4j 1.2.15 = 168 source files, around 16K SLOC; Python 2.6 logging = 3 source files, < 1.5K SLOC. To me the Java connection and inferences that people draw from the "Java heritage" is bordering on FUD a lot of the time, I have to say. But feel free to put me right with specific comments rather than vague arm-waving, and you'll find me receptive. -- VinaySajip

It is too complicated. More defaulting is needed, and simpler construction. I should be able to get a logger talking to a rolling logfile in a single line, and that line should not be overly verbose. Sure, it might not be what I'll eventually want, but I need it to start working right away, so I can continue to figure out what I need. -- JoshuaRodman

  • Then write your own utility function in a utility library of your own to make this happen. Not everybody thinks the same way, so it's better if people write their own simple-to-use (for them) wrappers on top of the existing functionality. Once written, you can import from your utility package into other applications you write. -- VinaySajip

The appropriate roles for args and kwargs in loggerAdapter are ambiguous.

I'm finding the paradigms for augmenting log arguments with relevant context info confusing. The docs and general web info are helpful but slim on the topic so far. Adding contextual info via the 'extra' parameter with loggingAdapter(. . . ,extra=myClassObj | extra=my.dict ) is slick for some use cases, not so much for others. It seems that the extra= is set in stone when the loggerAdapter object is instantiated, then passed to the individual LogRecords as needed (true?/false?). Context specific to the location of the log() call is superseded by the context of the loggerAdapter instantiation. Any kwargs['extra'] in the log() calls are overwritten silently. Perhaps that's a feature, and context info specific to the log() call location should be Youincorporated using log(... args) and having Formatters manage and replace the the actual args? At present, the use of the kw extra is handled by logging in the reverse behavior of most inherited object properties, where lower level objects supercede the higher level class values. Fortunately, the logging mechanisms are so flexible that it's not difficult to use args in place of kwargs (even for class arguments), or to alter the default behavior of logging for the extra kw. Still, it's confusing as is when the use case is not to pass down the context of the logger instantiation to all subsequent log() calls. I have code for both solutions if there is interest. (This comment seems long and detailed relative to the intended context of this page. Is there another place for similar discussion? Also: where to post demo code that is neither patch nor bug?)

  • Please give details of the use case which is giving you problems, on comp.lang.python rather than here: that's a better platform for receiving support. The "extra= is set in stone" only applies to the default implementation of LoggerAdapter - you're free to override this behaviour by creating a subclass of LoggerAdapter. The "overwritten silently" is clearly documented at

    http://docs.python.org/library/logging.html#adding-contextual-information-to-your-logging-output

    You should only really be using LoggerAdapter for specialised cases e.g. when you are logging information relating to database or network connections and need to log the context of individual connections as well as the specifics of an individual event. For most other uses, just using a %-format string and arguments should suffice. Without knowing the details of your use case, it's hard for me to know whether there is a real problem with LoggerAdapter functionality or just with your understanding of it; your comments above indicate to me that it could well be the latter. So, please post what you are trying to do on comp.lang.python, and what problems you are having, with "logging" in the title/subject of your post. You can post short snippets of code there directly; if you want to post longer pieces you can use any public pastebin (e.g. dpaste, LodgeIt, gist.github.com, ...) and link to the posted snippets from your mailing-list post. You can either mail into comp.lang.python using the email address python-list@python.org, or use e.g. Google Groups' web interface via

    http://groups.google.com/group/comp.lang.python/topics

    -- VinaySajip

LoggingPackage (last edited 2012-03-15 12:46:11 by VinaySajip)

Unable to edit the page? See the FrontPage for instructions.