PyPI Download Meta Data Proposal

This page collects comments and ideas for a proposal to migrate away from having PyPI-based installers such as easy_install and pip crawl arbitrary links on the PyPI /simple/ index.

Here's the original posting to the catalog-sig:

Sketch of the proposal

Version: 0.1 (older versions can be found in the page history)

Here's an approach that would work to start the transition away from crawling websites while not breaking old tools:

Limiting scans to download_url

Installers and similar tools should preferably no longer scan all the links on the /simple/ index. Instead, for packages that don't host files on PyPI, they should only look at the download links (which can be defined in the package meta data).
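As a rough illustration of the restriction described above, here is a minimal sketch of how a tool might pick out only the download links from a /simple/ index page. It assumes the download links are marked with rel="download" on the anchor tags; the proposal itself does not specify the markup, and the names DownloadLinkCollector and download_links are made up for this example.

```python
from html.parser import HTMLParser


class DownloadLinkCollector(HTMLParser):
    """Collect hrefs of <a> tags marked rel="download" (assumed markup)."""

    def __init__(self):
        super().__init__()
        self.download_links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attributes = dict(attrs)
        # rel may hold several space-separated values, or be missing entirely.
        rel_values = (attributes.get("rel") or "").split()
        href = attributes.get("href")
        if href and "download" in rel_values:
            self.download_links.append(href)


def download_links(index_html):
    """Return only the download links found on an index page (hypothetical helper)."""
    collector = DownloadLinkCollector()
    collector.feed(index_html)
    return collector.download_links


if __name__ == "__main__":
    page = (
        '<a href="http://example.com/" rel="homepage">home</a>'
        '<a href="http://example.com/foo-1.0.tar.gz" rel="download">dl</a>'
    )
    print(download_links(page))
```

Links without the download marker, such as the home page link, are simply ignored, so the tool never crawls them.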

Going only one level deep

If a download link points to a meta-file named "<packagename>-<version>-downloads.html#sha256=<sha256-hashvalue>" (the downloads.html file for the purpose of this proposal), the installer downloads that file and checks whether its hash value matches. If it does, the installer scans the file in the same way it would parse the /simple/ index page of the package. Think of the downloads.html file as a symlink that extends the search to an external location, but in a predefined and safe way.
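The check-then-scan step could look roughly like the sketch below. It is only an illustration under stated assumptions: the file content is passed in as bytes (fetching is left out), the page is decoded as UTF-8, and the names LinkCollector, expected_sha256 and links_from_downloads_file are hypothetical, not part of any existing tool.

```python
import hashlib
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin


class LinkCollector(HTMLParser):
    """Collect href values from <a> tags, as when parsing a /simple/ page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def expected_sha256(download_url):
    """Return the hex digest from a '...-downloads.html#sha256=<hex>' URL, else None."""
    base, fragment = urldefrag(download_url)
    if not base.endswith("-downloads.html"):
        return None
    if not fragment.startswith("sha256="):
        return None
    return fragment[len("sha256="):].lower()


def links_from_downloads_file(download_url, content):
    """Verify content (bytes) against the URL's hash; return absolute links.

    Returns an empty list if the URL is not a downloads.html link or if the
    hash does not match, so an altered file is never scanned.
    """
    expected = expected_sha256(download_url)
    if expected is None:
        return []
    if hashlib.sha256(content).hexdigest() != expected:
        return []
    parser = LinkCollector()
    parser.feed(content.decode("utf-8", errors="replace"))
    base, _ = urldefrag(download_url)
    # Resolve relative links against the location of the downloads.html file.
    return [urljoin(base, href) for href in parser.links]
```

To stay one level deep, a caller would treat the returned links as candidate files only and would not pass them back into links_from_downloads_file.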

Notes

Comments

The above is a sketch, not a fully worked out proposal, so feedback is welcome. Please add your comments here:

...

PyPI/DownloadMetaDataProposal (last edited 2013-02-28 16:28:35 by MarcAndreLemburg)
