Distutils Version Fight
Sorry for the crazy moon proposal here, but I have example code that I think even works. (Though I'm not operating on enough sleep, so sorry 'cause I already know the code is crappy.) The code is in lch.version.py.
The basic idea:
- A version string should be transformable into a version tuple.
- Version tuples are tuples of arbitrary length containing only integers.
- A version tuple conceptually ends with an infinite number of 0 fields, so (1, 3) == (1, 3, 0, 0, 0).
- While you see digits, accumulate them; when the run ends, convert it with int().
- While you see alphabetic characters, accumulate them; when the run ends, map the string to a number if it is a known string ("alpha", "beta"), otherwise break it up into individual characters and turn each one into a number.
- Pre-releases (alpha, beta) are represented by having a negative number in the version tuple.
- When you compare version tuples, if there are any negative fields, split there and compare the non-negative parts first.
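The rules above can be sketched roughly as follows. This is not the code from lch.version.py; the function names, the concrete negative values for "alpha", "beta" and "rc", and the use of ord() for unknown strings are all assumptions made for illustration.

```python
import re
from itertools import zip_longest

# Assumed mapping: the proposal only says pre-release tags become negative.
LCH_KNOWN_TAGS = {"alpha": -3, "beta": -2, "rc": -1}


def lch_parse(version):
    """Turn a version string into a tuple of integers."""
    fields = []
    for token in re.findall(r"\d+|[A-Za-z]+", version):
        if token.isdigit():
            fields.append(int(token))
        elif token.lower() in LCH_KNOWN_TAGS:
            fields.append(LCH_KNOWN_TAGS[token.lower()])
        else:
            # Unknown word: one number per character (assumption: ord()).
            fields.extend(ord(ch) for ch in token)
    return tuple(fields)


def _split_at_negative(fields):
    """Split into (release part, pre-release part) at the first negative field."""
    for i, field in enumerate(fields):
        if field < 0:
            return fields[:i], fields[i:]
    return fields, ()


def _padded_cmp(a, b):
    """Compare two tuples as if both ended in an infinite run of zeros."""
    for x, y in zip_longest(a, b, fillvalue=0):
        if x != y:
            return -1 if x < y else 1
    return 0


def lch_compare(a, b):
    """Return -1, 0 or 1; the non-negative parts are compared first."""
    head_a, tail_a = _split_at_negative(a)
    head_b, tail_b = _split_at_negative(b)
    return _padded_cmp(head_a, head_b) or _padded_cmp(tail_a, tail_b)
```

The split matters for cases like "1.0beta2" versus "1.0.0alpha1": compared field by field, the -2 would meet a 0 and lose, but after splitting the release parts are equal and beta correctly sorts above alpha.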
I doubt y'all will go for this. But I thought you should at least consider supporting a more flexible format. People like to express their version numbers in a wide variety of ways, and most folks could find a way to map their personal weird approach to something this approach would make consistent.
Just throwing this out there since it's a little different from what was discussed earlier, but I think has some merit:
* Versions are a series of integers, i.e. 0.0.1, 1.0.0, or 220.127.116.11. Versions define the "intended API level" of the software in question.
* Pre-release versions are denoted by a version, followed by a string (I suggest "pre" but it doesn't really matter), followed by a series of integers, separated by periods if needed. Pre-releases are important because while they may be an implementation of a future API version, they are likely to be buggy or incomplete.
* For instance, 1.0.0pre2.1 > 1.0.0pre2 > 1.0.0pre1 < 1.0.0.
* Or, for the case of daily builds, 1.0.0pre1 > 1.0.0pre0.20090327 < 1.0.0.
This obviously trades away the flexibility of roll-your-own naming entirely, but makes up for it by defining a standard that leaves enough flexibility to represent most cases easily. It is easy to parse and explain, extensible, and able to cover most use cases aside from that of "backported bug fixes". I believe that eliminating words will be a net positive because it eliminates any complaints along the lines of "you included alpha, beta, and rc but where's pre-release or testing?!".
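A minimal sketch of how the "pre" scheme could be turned into a sort key. The function name and regular expression are made up for this example, and "pre" is hard-coded as the marker string.

```python
import re

# integers separated by periods, optionally followed by "pre" and more integers
_PRE_PATTERN = re.compile(r"(\d+(?:\.\d+)*)(?:pre(\d+(?:\.\d+)*))?")


def pre_sort_key(version):
    """Return a key that sorts pre-releases below the release they precede."""
    match = _PRE_PATTERN.fullmatch(version)
    if match is None:
        raise ValueError("not a valid version: %r" % (version,))
    release = tuple(int(part) for part in match.group(1).split("."))
    if match.group(2) is None:
        # Final release: the middle field 1 sorts above any pre-release (0).
        return (release, 1, ())
    pre = tuple(int(part) for part in match.group(2).split("."))
    return (release, 0, pre)
```

With this key, sorted(versions, key=pre_sort_key) reproduces the orderings in the examples above, including the daily-build case.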
The proposal that I believe we were tentatively agreeing on towards the end of last night's Open Space is basically distutils' StrictVersion plus the allowance of 'c' for "release candidates".
Good:
    1.2 # equivalent to "1.2.0"
    1.2.0
    1.2a1
    1.2.3a2
    1.2.3b1
    1.2.3c1
Bad:
    1 # minimum two numbers
    18.104.22.168 # max 3 numbers
    1.2a # release level must have a release serial
    1.2.3b
Code for this: verlib.py
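For readers without verlib.py at hand, here is a rough sketch of the rules in the Good/Bad list. It is not the verlib code: the names strict_parse and strict_sort_key are invented for this example, and the returned tuple layout is only modelled on the .info output shown further down.

```python
import re

# two or three numbers, optionally followed by a/b/c plus a serial number
_STRICT_PATTERN = re.compile(r"(\d+)\.(\d+)(?:\.(\d+))?(?:([abc])(\d+))?")

# a < b < c < final release (None means "no pre-release level")
_LEVEL_RANK = {"a": 0, "b": 1, "c": 2, None: 3}


def strict_parse(version):
    """Return (major, minor, patch, level, serial) or raise ValueError."""
    match = _STRICT_PATTERN.fullmatch(version)
    if match is None:
        raise ValueError("not a valid version: %r" % (version,))
    major, minor, patch, level, serial = match.groups()
    return (int(major), int(minor), int(patch or 0), level, int(serial or 0))


def strict_sort_key(version):
    """Key that sorts alphas, betas and candidates below the final release."""
    major, minor, patch, level, serial = strict_parse(version)
    return (major, minor, patch, _LEVEL_RANK[level], serial)
```

The level is mapped to a rank before comparing because None and a string cannot be ordered against each other directly.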
Two stabs at what we discussed. We should pick *one* of these (or, of course, one of the other proposals here):
- RationalVersion1: close to distutils' StrictVersion in that it only allows between 2 and 3 <num>. sections, i.e. "1.2.3", but not "22.214.171.124".
- RationalVersion2: allows any number of <num>. sections (there still must be at least 2 of them).
    >>> from verlib import *
    >>> V = RationalVersion1
    >>> V('1.2.3')
    1.2.3
    >>> V('1.2.3').info
    (1, 2, 3, None, 0)
    >>> V('1.2.3a4')
    '1.2.3a4'
    >>> V('1.2.3a4').info
    (1, 2, 3, 'a', 4)
    >>> (V('1.0.0') > V('1.0.0c2') > V('1.0.0c1') > V('1.0.0b2') > V('1.0.0b1')
    ... > V('1.0.0a2') > V('1.0.0a1'))
    True
Some stats against the list of current PyPI versions that MvL provided:
    -- matches against current PyPI versions
    count: 4975
    RationalVersion1 matches: 1986 (39.92%)
    RationalVersion2 matches: 2386 (47.96%)
    -- with some naive cleaning up of PyPI versions (e.g. '1.0-alpha1' -> '1.0a1')
    cleaned RationalVersion1 matches: 2499 (50.23%)
    cleaned RationalVersion2 matches: 3003 (60.36%)
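The exact cleanup rules behind the "cleaned" numbers are not spelled out here. Purely as an illustration of the kind of rewrite meant by '1.0-alpha1' -> '1.0a1', a guess at a naive cleaner might look like this; the function name, the set of words handled, and the separators tolerated are all assumptions.

```python
import re

# Assumed spellings to shorten; only the alpha case is given in the stats above.
_SHORT_LEVELS = {"alpha": "a", "beta": "b", "rc": "c"}

# optional separator, a spelled-out level, optional separator, a serial number
_LONG_SUFFIX = re.compile(r"[-._ ]?(alpha|beta|rc)[-._ ]?(\d+)$", re.IGNORECASE)


def naive_clean(version):
    """Rewrite a trailing '-alpha1' style suffix into the short 'a1' form."""

    def shorten(match):
        return _SHORT_LEVELS[match.group(1).lower()] + match.group(2)

    return _LONG_SUFFIX.sub(shorten, version.strip())
```

Anything that does not end in one of the recognised suffixes is returned unchanged.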
A link from RubyGems that might be interesting:
* Version Policy: http://rubygems.org/read/chapter/7
Because RubyGems provides support for version comparisons, we want to pick a policy that works well with the RubyGems comparisons and gives the end user what they expect. We call such a policy “rational”. Also, if we call non-working policies “irrational”, then we apply a little bit of social engineering to gently prod offenders to conform.
Simple modification of Larry's proposal: split the tuple at the "tag" (defined by an alpha part in the version number), and allow only one tag. Map alpha, beta and rc to known negative values, and every other tag to a value lower than all of these (so that setuptools' .dev-r102 would still work correctly).
Sample implementations at gb.version.py.
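A rough sketch of the single-tag idea, independent of gb.version.py. The name tagged_key, the concrete negative values, and the decision to treat everything between the release numbers and the next digit (such as "dev-r") as one tag are assumptions made for this example.

```python
import re

# Assumed values: known tags are negative, unknown tags sort below all of them.
TAG_VALUES = {"alpha": -3, "beta": -2, "rc": -1}
UNKNOWN_TAG_VALUE = -4


def tagged_key(version):
    """Return (release, tag part); a final release gets the tag part (0,)."""
    first_letter = re.search(r"[A-Za-z]", version)
    if first_letter is None:
        head, rest = version, ""
    else:
        head, rest = version[:first_letter.start()], version[first_letter.start():]

    release = [int(n) for n in re.findall(r"\d+", head)]
    while release and release[-1] == 0:
        release.pop()  # (1, 0) and (1, 0, 0) should compare equal

    if not rest:
        return (tuple(release), (0,))

    # The tag is everything up to the next digit, e.g. "rc" or "dev-r".
    tag_match = re.match(r"\D+", rest)
    tag = tag_match.group().strip(".-_").lower()
    after = rest[tag_match.end():]
    if re.search(r"[A-Za-z]", after):
        raise ValueError("only one tag is allowed: %r" % (version,))
    serial = tuple(int(n) for n in re.findall(r"\d+", after))
    return (tuple(release), (TAG_VALUES.get(tag, UNKNOWN_TAG_VALUE),) + serial)
```

Because the tag part of a final release is (0,), every tagged version of the same release sorts below it, and a .dev-r102 snapshot sorts below any alpha.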
The setuptools versioning scheme is described here. It is pretty similar to the schemes which have been proposed so far. The setuptools versioning scheme should meet most of our needs and has found wide acceptance in the community. Neither of the two current distutils versioning schemes is widely used. The setuptools versioning scheme is geared to the needs of developers, who are the primary audience of distutils, and provides a useful scheme for controlling software releases.
Third party packagers each have their own version schemes with their own features and idiosyncrasies. We cannot accommodate every third party packager within the Python versioning scheme. We should focus primarily on the needs of developers, as without developer buy-in the scheme will not be adopted. We can accommodate third parties by allowing inclusion of packager-specific version numbers in the metadata that is distributed with the Python application distribution. This would look something like:
    version = 1.0.pre1
    [rpm]
    version = 1.0
This would enable a version numbering scheme per packaging scheme and would also allow for extensibility as new packaging schemes are developed. The information in the tags can be processed downstream by packaging organisations.
The idea put forward by Larry Hastings is an excellent method for converting pre-release and post release tags to a numeric based schema. It could be included as an example of tag scheme conversion from the setuptools versioning scheme to a completely numeric packaging scheme.
Rough sketch of an idea:
A version is a series of "."-separated numeric fields. The introduction of a non-numeric character anywhere in the version string marks the version as a pre-release.
For sorting: Compare numeric fields as tuples of integers. For equivalent numeric versions, prefer ones that do not have any trailing non-numeric fields.
In essence: '3.1' == '3.1.0' > '3.1.rc' > '3.1.0.funtime32'
This simply and flexibly handles pre-release versioning. Post-releases are handled by incrementing the least significant numeric field.
This is, in essence, the inverse of LooseVersion in distutils -- it handles pre-releases instead of post-releases, but still has room for "non-standard" annotations.
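A small sketch of that sorting rule. The sketch above does not say how two pre-releases of the same numeric version should be ordered against each other, so comparing the trailing text as a plain string is an assumption here (it happens to reproduce '3.1.rc' > '3.1.0.funtime32'); the function name is also made up.

```python
import re


def prerelease_key(version):
    """Sort key: numeric fields first, then final releases above pre-releases."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError("version must start with a number: %r" % (version,))

    numeric = [int(field) for field in match.group().split(".")]
    while numeric and numeric[-1] == 0:
        numeric.pop()  # '3.1' and '3.1.0' are equivalent

    # Anything left after the numeric fields marks a pre-release.
    trailing = version[match.end():].lstrip(".")
    is_final = trailing == ""
    # Assumption: ties between pre-releases are broken by plain string order.
    return (tuple(numeric), is_final, trailing)
```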
Pointing to the version numbering scheme used by Debian, which has been in production use for many years. Some points worth noting:
- It is a strict system, not relying on heuristics and any special meanings of strings like alpha, beta, rc, pre.
- It allows correction of a bad version number by prefixing the version with an epoch (separated by a : character).
- It allows writing a version number very close to the upstream version, while handling parts of the version number which should be considered less than another version (using the tilde character), e.g. 1.0~alpha1 < 1.0.
- The paragraph about the "debian_revision" mentioned in the referenced URL is not appropriate for a Python version. It would be nice if the version system could handle these revisions, which are usually added by packagers for operating system distributions.
- Characters which cannot directly be encoded on the filesystem can be encoded using %<hexvalue> and be used when the version number needs to be encoded in a file name.
Examples:
    1.0 < 1.0.0 < 1.0.1 < 1.1 < 1.10
    2.1 < 1:1.0
    3.1~~svn20090328 < 3.1~alpha1 < 3.1
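For reference, the Debian comparison can be sketched in a few lines of Python. This is a simplified reading of the algorithm that ignores the debian_revision part entirely (the whole string after the epoch is treated as the upstream version); the function names are invented for this example.

```python
import re


def _char_order(ch):
    """Tilde sorts before everything, then letters, then other characters."""
    if ch == "~":
        return -1
    if ch.isalpha():
        return ord(ch)
    return ord(ch) + 256


def _cmp_non_digit(a, b):
    """Compare two non-digit runs; end of string counts as 0 (above '~')."""
    for i in range(max(len(a), len(b))):
        x = _char_order(a[i]) if i < len(a) else 0
        y = _char_order(b[i]) if i < len(b) else 0
        if x != y:
            return -1 if x < y else 1
    return 0


def _split_epoch(version):
    epoch, sep, rest = version.partition(":")
    return (int(epoch), rest) if sep else (0, version)


def debian_compare(a, b):
    """Return -1, 0 or 1 for two Debian-style version strings."""
    epoch_a, rest_a = _split_epoch(a)
    epoch_b, rest_b = _split_epoch(b)
    if epoch_a != epoch_b:
        return -1 if epoch_a < epoch_b else 1

    # Alternate between a non-digit run and a digit run until both are empty.
    while rest_a or rest_b:
        text_a = re.match(r"\D*", rest_a).group()
        text_b = re.match(r"\D*", rest_b).group()
        result = _cmp_non_digit(text_a, text_b)
        if result:
            return result
        rest_a, rest_b = rest_a[len(text_a):], rest_b[len(text_b):]

        digits_a = re.match(r"\d*", rest_a).group()
        digits_b = re.match(r"\d*", rest_b).group()
        num_a, num_b = int(digits_a or 0), int(digits_b or 0)
        if num_a != num_b:
            return -1 if num_a < num_b else 1
        rest_a, rest_b = rest_a[len(digits_a):], rest_b[len(digits_b):]
    return 0
```

Note that no string such as "alpha" or "rc" gets special treatment; the only special character is the tilde.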
The approach I'd like to see is simple and avoids all complicated and error-prone parsing of string versions:
The package must provide a string version of the package version in any format the author likes to use. This is for humans to read and should provide enough information for a potential user to identify the version as right for his or her use.
The package must provide a tuple version which follows a strict ordering. The tuple may contain integers and/or ASCII strings and can (theoretically) have arbitrary length. It is meant for computers to read and must therefore provide the version information in a lexically correct order. The Python sys.version_info and its approach to providing the version information in lexically correct order is a good example of such a version tuple: (major, minor, patch_level, status, status_version). Some authors may also want to include the release date/time and/or the repository revision used to cut the release as extra tuple entries.
Package management software should always use the tuple version for version comparison and display the string version to the user.
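To make the split between the two forms concrete, here is a tiny sketch. The release data and function names are invented; the tuple layout follows the sys.version_info pattern mentioned above, whose status strings ("alpha", "beta", "candidate", "final") already sort in the right order as plain strings.

```python
# (string version for humans, tuple version for computers) -- made-up data
RELEASES = [
    ("0.9.1", (0, 9, 1, "final", 0)),
    ("1.0 beta 2", (1, 0, 0, "beta", 2)),
    ("1.0", (1, 0, 0, "final", 0)),
    ("1.0 release candidate 1", (1, 0, 0, "candidate", 1)),
]


def newest_release(releases):
    """Pick the newest release by comparing tuple versions only."""
    return max(releases, key=lambda release: release[1])


def display_sorted(releases):
    """Return the string versions, oldest first, ordered by tuple version."""
    return [text for text, info in sorted(releases, key=lambda r: r[1])]
```

No parsing of the string version happens anywhere; it is only ever shown to the user.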
Package repositories must provide a way to:
- get a list of available versions (in tuple form)
- convert a version tuple for a package to a download URL
Ideally, they should display the version string to the user and allow searches based on these as well.
@MAL: One of the issues is being able to specify dependencies (e.g. in the current "setup_requires" or whatever). For example: "simplejson > 3.0.1". That means either a deterministic translation between version tuple and version string is necessary OR dependencies need to specify version *tuples*. The latter might be painful. --TrentMick
@Trent: Right, dependency definitions will need to use the tuple version, probably using a version matching function or instance, e.g. requires=PythonPackage('simplejson', supported_versions=((3,0), (3,1))). Note that just specifying a minimum version is likely not going to provide a robust setup, e.g. "simplejson > 3.0.1" would also match simplejson 4.0, but that may have a completely incompatible interface. -- MarcAndreLemburg
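One way such a matching instance could behave, as a sketch only: PythonPackage is the hypothetical name used in the reply above, not an existing API, and treating each entry of supported_versions as a prefix of the version tuple is an assumption.

```python
class PythonPackage:
    """Hypothetical dependency declaration using tuple versions."""

    def __init__(self, name, supported_versions):
        self.name = name
        self.supported_versions = tuple(tuple(v) for v in supported_versions)

    def matches(self, version_info):
        """True if the version tuple starts with one of the supported prefixes."""
        version_info = tuple(version_info)
        return any(
            version_info[:len(prefix)] == prefix
            for prefix in self.supported_versions
        )


requires = PythonPackage("simplejson", supported_versions=((3, 0), (3, 1)))
```

With prefix matching, a 4.0 release of simplejson is rejected unless the dependent package explicitly adds (4, 0) to its supported versions.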
This is more a statement of what we want/don't want than a specific implementation:
- A method of expressing "alpha", "beta"... releases in ways that sort lower than the final release
- Would be nice if Python version numbers were directly usable by system packagers, or, in the worst case, reliably transformed
- No special casing of certain words, e.g. "pre" or "alpha", i.e. a simple/reliable sorting that you can explain to people
- A standard way to format the version number to fit on the filesystem
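For the filesystem point, one possible approach (an assumption, not something agreed on here) is percent-encoding in the style of the %<hexvalue> idea from the Debian notes, which the standard library already provides. The function names are made up.

```python
from urllib.parse import quote, unquote


def version_to_filename_part(version):
    """Percent-encode characters such as ':' or '/' so the version fits in a file name."""
    return quote(version, safe="")


def filename_part_to_version(part):
    """Reverse of version_to_filename_part."""
    return unquote(part)
```

For example, an epoch-style version such as "1:1.0" becomes "1%3A1.0", and decoding gives the original string back.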