Differences between revisions 38 and 39
Revision 38 as of 2020-06-02 17:32:04
Size: 20969
Comment: more details on the PyPI API revamp
Revision 39 as of 2020-06-05 22:32:24
Size: 808
Comment: moved to GitHub
Line 3: Line 3:
This page lists specific things that The [[https://github.com/psf/fundable-packaging-improvements|Fundable Packaging Improvements repository]] lists specific things that
Line 9: Line 9:
That list [[https://wiki.python.org/psf/Fundable%20Packaging%20Improvements?action=info|was previously on this wiki page]] and has now been moved [[https://github.com/psf/fundable-packaging-improvements|to GitHub]].
Line 10: Line 12:


== Foundational tool improvements ==

=== Better specifications, toolchain, and services for building distributions ===

!PyTorch, !TensorFlow, and many other Python packages (especially science packages) suffer from cross-platform installability problems, which affect both users and developers. Packagers and users prefer using built distributions (usually in the wheel format); publishing built distributions increases convenience for end users because source code is pre-compiled, which significantly reduces install time (e.g., from 10+ minutes to several seconds).

Supporting the multifarious Linux platforms is something we've been lagging on; we are [[https://discuss.python.org/t/where-we-are-on-manylinux2010-and-how-that-relates-to-manylinux2014/1987/|still finishing up the rollout of manylinux2010]] and [[https://www.python.org/dev/peps/pep-0599/|recently approved the new standard manylinux2014]]. But even so, packagers will have to build their own wheels to release packages, which can be fiddly, brittle, and time-consuming.

We'd like help to:

 * [[https://github.com/pypa/manylinux/issues/338|Implement the manylinux2014 standard]] throughout the toolchain, to help users move off already end-of-life'd Linux distributions and get on a better foundation for security patches
 * [[https://www.python.org/dev/peps/pep-0600/|Finish the "perennial" manylinux PEP]] and get it approved and implemented to reduce the churn of hardcoded, brittle manylinux standards and react better to ongoing platform change
 * [[https://github.com/pypa/packaging-problems/issues/25|Create a generic wheel-building service]] to make releases faster and more robust

We need funding for specification research and writing, backend and frontend development, testing, DevOps/infrastructure/platform services, user experience work, technical writing for end users, project management, and community outreach.
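
To make the platform problem concrete: a built distribution advertises where it can be installed through the tags in its filename, and the manylinux standards define one family of those platform tags. The following is a minimal sketch, assuming the filename layout from the wheel specification (PEP 427); the helper names and the example filenames are hypothetical and are not part of any packaging tool.

```python
from typing import NamedTuple


class WheelTags(NamedTuple):
    distribution: str
    version: str
    python_tag: str
    abi_tag: str
    platform_tag: str


def parse_wheel_filename(filename: str) -> WheelTags:
    """Split a wheel filename into its components (an optional build tag is ignored)."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel: {filename}")
    parts = filename[: -len(".whl")].split("-")
    # name-version-python-abi-platform, with an optional build tag after the version
    if len(parts) not in (5, 6):
        raise ValueError(f"unexpected wheel filename: {filename}")
    # The last three fields are always the python, abi, and platform tags.
    python_tag, abi_tag, platform_tag = parts[-3:]
    return WheelTags(parts[0], parts[1], python_tag, abi_tag, platform_tag)


def is_manylinux(tags: WheelTags) -> bool:
    """Report whether any of the (dot-separated) platform tags is a manylinux tag."""
    return any(p.startswith("manylinux") for p in tags.platform_tag.split("."))


# Hypothetical filenames, for illustration only.
binary_wheel = parse_wheel_filename("example_pkg-1.0-cp38-cp38-manylinux2014_x86_64.whl")
pure_wheel = parse_wheel_filename("example_pkg-1.0-py3-none-any.whl")
```

A packager who wants to cover many Linux systems has to produce one such file per Python version and platform tag, which is where a generic wheel-building service would save time.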

=== Robust interoperability testing ===

We need funding to ensure core packaging tools work well with each other; currently they aren't seamlessly interoperable. See [[https://github.com/pypa/integration-test|the integration-test project]]. This will help us get faster at testing and rolling out bugfixes and features for '''all''' [[https://packaging.python.org/key_projects/|Python packaging and distribution tools]]: well-known projects like pip, virtualenv, and wheel, but also all the downstream projects that depend on them.

=== Revamp PyPI API ===

The Python Package Index, a key platform for Python developers, has [[https://warehouse.readthedocs.io/api-reference/|a minimal API]] that does not implement [[https://github.com/pypa/warehouse/labels/APIs%2Ffeeds|many features that users have requested]]. The lack of a full-featured API in Warehouse (the PyPI codebase) blocks many improvements, including:

 * [[https://github.com/pypa/warehouse/issues/474|Light-bandwidth metadata-only API calls]] and [[https://github.com/pypa/pip/issues/7406#issuecomment-583891169|JSON standardization]] that would enable better downloads, installations, dependency resolution features, and troubleshooting for pip and other clients
 * [[https://github.com/pypa/warehouse/issues/7730|Asynchronous uploads]] and thus [[https://github.com/pypa/warehouse/issues/5420|some validation checks]]
 * [[https://github.com/pypa/warehouse/pull/7013|RSS feeds]] that other platforms could reuse to get PyPI updates in user tooling
 * [[https://github.com/pypa/warehouse/issues/6378|Release security features (via token generation)]]
 * [[https://github.com/pypa/warehouse/issues/798|Security notification feeds]]
 * [[https://github.com/pypa/warehouse/issues/284|Caching for the bandersnatch mirroring client]]

This requires backend development work, technical writing, user experience research, and publicity and coordination work within Python's community.
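
As an illustration of what a metadata-only call could give a client, the sketch below extracts dependency information from a payload shaped like the response of PyPI's existing JSON endpoint (`https://pypi.org/pypi/<project>/json`). The sample project and its values are made up, the function name is hypothetical, and a revamped API might expose different fields.

```python
import json

# Illustrative payload only: the project, version, and requirements are invented.
SAMPLE_RESPONSE = """
{
  "info": {
    "name": "example-pkg",
    "version": "1.0",
    "requires_python": ">=3.6",
    "requires_dist": ["requests (>=2.0)", "idna ; extra == 'intl'"]
  },
  "urls": []
}
"""


def dependency_metadata(payload: str) -> dict:
    """Return only the fields a resolver needs, without downloading any artifact."""
    info = json.loads(payload)["info"]
    return {
        "name": info["name"],
        "version": info["version"],
        "requires_python": info.get("requires_python"),
        # The field can be null for projects whose metadata was never extracted.
        "requires_dist": info.get("requires_dist") or [],
    }


metadata = dependency_metadata(SAMPLE_RESPONSE)
```

A client such as pip could use this kind of small response to plan an installation before fetching any distribution file.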

=== Make setuptools the reference implementation of the distutils API ===

There is a part of the Python standard library called `distutils`, and some users directly use it. [[https://github.com/pypa/packaging-problems/issues/127|We want users to instead switch to the supported toolchain]], which uses `setuptools`, and move all the functionality from `distutils` into `setuptools`. This requires backend development work, technical writing, project management, and publicity work within Python's community.

=== Provide more standardized editable installations ===

Developers of Python projects want to be able to use "editable installations" -- changing the code of applications while simultaneously running those applications. Right now, the support for that kind of usage is rough and not standardized across different tools. [[https://discuss.python.org/t/specification-of-editable-installation/1564|Packaging tools maintainers have rough plans for how to standardize the feature and support for it]] using distutils and setuptools. We would like funding for developing a proof of concept and coordinating subsequent standards changes, tool improvements, and documentation. This requires backend development work, technical writing, and coordination and publicity work within Python's community.

=== Add support for pyproject.toml as a way to configure setuptools ===

`setuptools` [[https://github.com/pypa/setuptools/issues/1688|does not yet allow]] project creators to use the new `pyproject.toml` standard configuration file to configure `setuptools` behavior. This distracts and confuses package creators, and prevents platforms and tools from depending on the presence of standard `pyproject.toml` metadata in packages. We'd like to implement `pyproject.toml` configuration support in `setuptools`. This requires backend development work, technical writing, and coordination and publicity work within Python's community.

=== Audit and update package metadata ===

If we [[https://github.com/pypa/warehouse/issues/474#issuecomment-370986838|audit and update PyPI metadata for existing projects based on already-uploaded artifacts]], we can publish information about what packages depend on each other and on certain environments, and ensure a high-quality API for many tools to reuse and build upon. The current PyPI upload API relies on the upload client extracting the metadata and supplying it with the first upload request, and that isn't a valid assumption for older upload clients. Currently, our constraint is a combination of developer time, compute resources, and privileged backend database access; funding would break this bottleneck.

=== Improve user experience of packaging ===
User experience research, followed by design and development work, would [[https://github.com/pypa/packaging-problems/issues/1|make it easier for packagers to create configuration files]]. We aim to take the UX research from [[https://wiki.python.org/psf/Fundable%20Packaging%20Improvements#Improve_pip_user_experience|improvements in pip's user experience]] and build on it to improve the larger experience of packaging for Python in general.

=== Improve specificity of license classifiers ===

Our packaging ecosystem relies on [[https://pypi.org/classifiers/|a particular structured data format (classifiers)]] to indicate a package's legal license. However, our current system [[https://github.com/pypa/warehouse/issues/2996|allows for ambiguity that makes some downstream data display incoherent or very difficult, and doesn't allow for some license specificity that downstream consumers need]] ([[https://libraries.io/|Libraries.io]] and similar projects). Fixing this is a fairly small project, involving Python development, public communications, project management, and potentially a few hours of legal counsel for review.

=== Standardize and implement a lockfile format ===

`pip` currently uses `requirements.txt` to specify dependencies; it can specify ''versions'' of packages but not ''hashes''. The [[https://github.com/pypa/pipfile|newer pipfile format]] can include hashes, which some users prefer. But `pip` [[https://github.com/pypa/pip/issues/4732|doesn't yet support]] `pipfile`, so many users are blocked from using hashes to better secure their Python runtimes. We have [[https://github.com/uranusjr/lock-file|made some progress toward standardizing an interoperable lockfile format]], but we need to finish [[https://discuss.python.org/t/structured-exchangeable-lock-file-format-requirements-txt-2-0/876/|that design standardization and consensus-gathering work]] and implement it in `pip`, `pipenv`, and related tools. We'd need Python engineering work and project management to develop and deploy this.
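
To show what recording hashes buys a user, here is a minimal sketch of the check an installer can run against a lockfile. The in-memory `LOCK` mapping, the function names, and the artifact bytes are all hypothetical; this is not the Pipfile format or any proposed lockfile standard.

```python
import hashlib
from typing import Dict, Set


def sha256_hex(data: bytes) -> str:
    """Hex-encoded SHA-256 digest of an artifact's bytes."""
    return hashlib.sha256(data).hexdigest()


def verify_artifact(filename: str, data: bytes, lock: Dict[str, Set[str]]) -> bool:
    """Accept the artifact only if its digest is one of the locked digests."""
    allowed = lock.get(filename)
    if not allowed:
        # Unknown files are rejected rather than silently trusted.
        return False
    return sha256_hex(data) in allowed


# Stand-in for a downloaded wheel; real tools would read the file from disk.
artifact = b"pretend wheel contents"

# Hypothetical lock data: filename -> set of acceptable SHA-256 digests.
LOCK = {"example_pkg-1.0-py3-none-any.whl": {sha256_hex(artifact)}}

ok = verify_artifact("example_pkg-1.0-py3-none-any.whl", artifact, LOCK)
tampered = verify_artifact("example_pkg-1.0-py3-none-any.whl", artifact + b"!", LOCK)
```

Pinning a version alone says which release to fetch; pinning a hash also detects a file whose contents changed after the lockfile was written.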

=== Package preview feature for PyPI ===

Right now, there are ways for package maintainers to test and share draft versions of their upcoming releases, but they cause friction and confusion. So we want to add [[https://github.com/pypa/warehouse/issues/726|staged releases -- a temporary state that a release can be in, where PyPI ''has'' it and can evaluate it, but hasn't ''published'' it yet]].

This will:

 * let project owners/maintainers preview/[[https://github.com/pypa/warehouse/issues/720|test]] how their package metadata displays on the website, and review where their fresh releases are out of compliance with site and interoperability requirements (preventing the problem of [[https://github.com/pypa/packaging-problems/issues/74|maintainers wanting to re-upload removed files]])
 * help cross-platform package maintainers [[https://github.com/pypa/warehouse/issues/4056|coordinate dozens of wheels built on multiple machines]] for simultaneous release
 * [[https://github.com/pypa/warehouse/issues/2286|Provide an interoperability check for toolchain developers, and a testing site for people learning packaging]]
 * [[https://github.com/takluyver/flit/issues/125|Simplify packagers' upload configuration files]]
 * reduce complexity that currently forces maintainers to use [[https://github.com/pypa/warehouse/issues/5707|confusing "dev" or prerelease version numbers]]
 * [[https://github.com/pypa/warehouse/issues/6378|Improve security of package uploads, by allowing maintainers to scope upload API tokens to the newly staged package]]
 * [[https://github.com/pypa/packaging-problems/issues/114|Prevent package name conflicts]]
 * [[https://github.com/pypa/warehouse/issues/918|Streamline infrastructure maintenance and confusing documentation by letting us take down the separate test.pypi.org staging site]]
 * Provide pre-release warnings to maintainers of packages that [[https://github.com/pypa/packaging-problems/issues/264|fail metadata checks]] (such as rejecting or warning for [[https://github.com/pypa/warehouse/issues/3889|packages without Python requirements metadata]], or [[https://github.com/pypa/warehouse/issues/5420|manylinux wheels that fail auditwheel checks]]). As we increase the packaging ecology's strictness regarding metadata standards compliance, there will be an intermediate period in which we warn maintainers/owners about failing the new, stricter checks but do not yet block releases on them; the package preview feature will help us deliver those soft warnings.

We'll need database support for understanding the release state ("is this published or not"), user experience and developer support, and testing, security, infrastructure, and project management support.
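
The release state mentioned above ("is this published or not") can be pictured as a small state machine. The sketch below is only an illustration: the class, state names, and rules are assumptions, not Warehouse's data model.

```python
from enum import Enum
from typing import List


class ReleaseState(Enum):
    STAGED = "staged"        # PyPI has the files and can evaluate them
    PUBLISHED = "published"  # installers can see the release


class Release:
    def __init__(self, project: str, version: str) -> None:
        self.project = project
        self.version = version
        self.state = ReleaseState.STAGED  # every release starts out staged
        self.files: List[str] = []

    def add_file(self, filename: str) -> None:
        """Attach an uploaded file, e.g. one wheel out of many built elsewhere."""
        self.files.append(filename)

    def publish(self) -> None:
        """Make the release visible; an empty release is refused."""
        if not self.files:
            raise ValueError("cannot publish a release with no files")
        self.state = ReleaseState.PUBLISHED


def visible_to_installers(release: Release) -> bool:
    return release.state is ReleaseState.PUBLISHED


release = Release("example-pkg", "1.0")
release.add_file("example_pkg-1.0-py3-none-any.whl")
visible_before = visible_to_installers(release)
release.publish()
visible_after = visible_to_installers(release)
```

Collecting all files while the release is staged and publishing once is what would let maintainers coordinate many wheels for a simultaneous release.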

=== Feature flag system on PyPI ===

It's difficult to roll out new features gradually to PyPI's test site or to selected test users. A [[https://github.com/pypa/warehouse/issues/5869|feature flag system]] would help us do targeted outreach to particular groups of users, deploy more confidently, and roll back changes when needed. We'd need user experience, frontend and backend engineering, data analytics, and project management support to develop and deploy this.
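
One common way to build such a system is to combine an explicit allowlist of test users with a deterministic percentage rollout. The sketch below assumes that design; the function, flag name, and usernames are hypothetical and do not describe how Warehouse works.

```python
import hashlib
from typing import FrozenSet


def flag_enabled(
    flag: str,
    user_id: str,
    rollout_percent: int,
    allowlist: FrozenSet[str] = frozenset(),
) -> bool:
    """Decide whether a feature is on for a user.

    Allowlisted users always get the feature; everyone else is placed in a
    stable bucket from 0 to 99 derived from the flag name and user id.
    """
    if user_id in allowlist:
        return True
    digest = hashlib.sha256(f"{flag}:{user_id}".encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100
    return bucket < rollout_percent


# Hypothetical flag and users, for illustration only.
testers = frozenset({"alice"})
alice_sees_it = flag_enabled("new-upload-flow", "alice", 0, testers)
nobody_else = flag_enabled("new-upload-flow", "bob", 0, testers)
everyone = flag_enabled("new-upload-flow", "bob", 100, testers)
```

Because the bucket depends only on the flag name and user id, a user keeps the same answer between requests, and setting the percentage back to zero rolls the change back for everyone outside the allowlist.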

=== User support ticket system ===

Python packagers who need help currently create [[https://sourceforge.net/p/pypi/support-requests/|SourceForge]] and [[https://github.com/pypa/warehouse/issues/|GitHub]] tickets, email mailing lists, tweet at maintainers, and so on. A [[https://github.com/pypa/warehouse/issues/3231#issuecomment-405561741|unified user support ticket system]], integrated into Warehouse, would:

 * help managers, entrepreneurs, and academics [[https://github.com/pypa/warehouse/issues/2082|reserve specific package names]]
 * [[https://github.com/pypa/warehouse/issues/1190|support username changes]]
 * give users [[https://github.com/pypa/warehouse/issues/2982|a reporting system to quickly flag malware and spam]]
 * provide a [[https://github.com/pypa/warehouse/issues/1506|transfer system for abandoned/unmaintained projects]]
 * reduce work for PyPI's core developers who currently have to sift through user support issues to find bug reports and feature requests
 * enable PyPI admins to better delegate support and moderation work to volunteers

We need funding for backend and frontend development, testing and security checks, DevOps/infrastructure/platform services (including API/email integration), user experience work, technical writing for end users, project management, and community outreach.

== Security improvements and prerequisites ==

=== System to label projects on PyPI with administrative statuses/attributes ===

To scale up our anti-abuse moderation and help package maintainers with security response, we need to be able to, for instance, mark a release as deprecated or a project as unsupported. This means we need a generic system to add, edit, and remove administrative attributes ("flags" or "statuses") to individual projects and releases. We need support to do the architectural design to implement this. (See [[PackagingWG/2019-03-22-Warehouse|notes from this meeting]].)

=== Security notifications for vulnerable packages ===

To keep PyPI's users secure, we want to give them [[https://github.com/pypa/warehouse/issues/798|an opt-in communication channel to hear about security vulnerabilities for the packages they use]]. Implementing this would also give us architectural support to ''warn or stop'' `pip` users who try to install a PyPI package that's been found to be broken or to contain malware. We need funding for user experience work, development, testing, infrastructure, potentially platform services (e.g., SMS), and community outreach.

= Items that have now been funded =

Some TODOs that were on this page have now received funding!

== Foundational tool improvements ==

=== Finish dependency resolver for pip ===
''[[https://pyfound.blogspot.com/2019/11/seeking-developers-for-paid-contract.html|This is now funded]] and [[https://github.com/python/request-for/blob/master/2020-pip/RFP.md|we seek developers to work on this project]] (apply by 22 November 2019).''

We're partway through a next-generation rewrite of the dependency resolver within pip, Python's package download and installation tool. The project ran into massive technical debt, but the refactoring is nearly finished and prototype functionality is in alpha now. (See [[https://docs.google.com/document/d/1x_VrNtXCup75qA3glDd2fQOB2TakldwjKZ6pXaAjAfg/edit|Sebastien Awwad's in-depth explanation of the problem & our approach]], [[https://gist.github.com/pradyunsg/5cf4a35b81f08b6432f280aba6f511eb|lead developer Pradyun Gedam's initial plan]], the [[https://pradyunsg.me/gsoc-2017/|2017 status updates]], the [[https://pradyunsg.me/blog/2019/06/23/pip-update/|June 2019 status update]], [[https://github.com/pypa/pip/issues/988|GitHub issue #988 tracking progress]], and [[https://github.com/pypa/pip/issues/6536|issue #6536 for planning rollout]].)

Funding would support user experience, communications/publicity, and testing work (including developing robust testing/CI infrastructure) as well as core feature development and review.

We need to finish the resolver because so many other improvements are blocked on it:
 * [[https://github.com/pypa/pip/issues/4551|adding an "upgrade-all" command to pip]]
 * [[https://github.com/pypa/pip/issues/5497|warning when trying to download or build wheels from incompatible set of packages/requirements]]
 * [[https://github.com/pypa/pip/issues/4745|adding a no-implicit-upgrades strategy]]
 * [[https://github.com/pypa/packaging-problems/issues/264|making PyPI and pip enforce metadata compliance more strictly]]
 * [[https://github.com/pypa/pip/issues/4681|warning the user when uninstalling a package that other packages depend on]]
 * [[https://github.com/pypa/pip/issues/6495|properly respecting constraints]]
 * [[https://github.com/pypa/packaging-problems/issues/215|recording requested and installed extras]]
 * [[https://github.com/pypa/pip/issues/53|option to show what versions of packages are currently available]]
 * [[https://github.com/pypa/packaging-problems/issues/54|listing packages' dependencies and dependents on PyPI]]
 * [[https://mail.python.org/archives/list/distutils-sig@python.org/thread/2QECNWSHNEW7UBB24M2K5BISYJY7GMZF/#2QECNWSHNEW7UBB24M2K5BISYJY7GMZF|minimizing duplication of work between pip and pipenv]]
 * [[https://github.com/pypa/pipenv/issues?q=is%3Aopen+is%3Aissue+label%3A%22Category%3A+Dependency+Resolution%22|better pipenv functionality]]
 * [[https://discuss.python.org/t/namespace-support-in-pypi/1609/35|package namespace support]]
 * [[https://discuss.python.org/t/if-python-started-moving-more-code-out-of-the-stdlib-and-into-pypi-packages-what-technical-mechanisms-could-packaging-use-to-ease-that-transition/1738/24|moving more code out of Python's standard library so we can release improvements faster]]

and it would fix so many dependency issues for our users:
 * [[https://github.com/pypa/pip/issues/4907|Django installation conflict]]
 * [[https://github.com/pradyunsg/zazo/issues/2|cherrypy/six/cheroot installation conflict]]
 * [[https://github.com/pypa/pip/issues/5043|Spyder downgrade requirement]]
 * [[https://github.com/pradyunsg/zazo/issues/4|boto3/bravado dependency failure]]
 * [[https://github.com/pypa/pip/issues/5313|Ansible/PyOpenSSL/cryptography failure]]
 * [[https://github.com/pypa/pip/issues/4957|extras installation failure]]
 * [[https://github.com/pypa/pip/issues/4391|extras upgrade failure]]
 * [[https://github.com/pypa/pip/issues/6494|breaking installed packages]]
 * [[https://github.com/pradyunsg/zazo/issues/14|elasticsearch/requests failure]]
 * [[https://github.com/ofek/hatch/issues/47|hatch, another packaging tool]]

And in our larger ecology, this causes installation problems for:
 * [[https://github.com/conda/conda/issues/8657|conda's compatibility with pip]]
 * [[https://github.com/servo/servo/issues/10611|the Servo browser engine]]
 * [[https://github.com/pypa/pip/issues/4582|numpy and scipy]]
 * [[https://github.com/juju/python-libjuju/issues/45|Canonical's DevOps tool Juju]]
 * [[https://github.com/antocuni/capnpy/issues/16|a Cap'n Proto implementation]]
 * [[https://github.com/DataBiosphere/toil/issues/2230|toil, awscli, and boto3]]
 * [[https://github.com/mozilla/bedrock/issues/5967|the Mozilla website & icalendar]]
 * [[https://github.com/certbot/certbot/issues/5195|certbot, in the past and possibly the future]]
 * [[https://github.com/TurboGears/tg2devtools/issues/13|TurboGears]]
 * [[https://github.com/pycontribs/jira/pull/744|a JIRA API client library]]
 * [[https://github.com/crossbario/autobahn-testsuite/issues/55|a WebSocket protocol test suite]]
 * [[https://github.com/gerkey/ros1_external_use/issues/7|Robot Operating System tooling]]
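
To make the kind of conflict listed above concrete, here is a toy backtracking resolver over a made-up package index. It is a sketch of the general technique only, not pip's resolver or its planned design; the index, the package names, and the use of plain integers as versions are all assumptions for illustration.

```python
from typing import Callable, Dict, List, Optional, Tuple

Requirement = Callable[[int], bool]  # predicate over a candidate version
Index = Dict[str, Dict[int, Dict[str, Requirement]]]

# Hypothetical index: package -> version -> {dependency: requirement}.
INDEX: Index = {
    "app": {1: {"lib_a": lambda v: v >= 1, "lib_b": lambda v: v >= 1}},
    "lib_a": {1: {"shared": lambda v: v < 2}, 2: {"shared": lambda v: v >= 2}},
    "lib_b": {1: {"shared": lambda v: v < 2}},
    "shared": {1: {}, 2: {}},
}


def resolve(
    index: Index,
    todo: List[Tuple[str, Requirement]],
    pinned: Dict[str, int],
) -> Optional[Dict[str, int]]:
    """Return a consistent {package: version} mapping, or None if none exists."""
    if not todo:
        return pinned
    (name, accepts), rest = todo[0], todo[1:]
    if name in pinned:
        # Already chosen: the existing pin must satisfy this requirement too.
        return resolve(index, rest, pinned) if accepts(pinned[name]) else None
    for version in sorted(index[name], reverse=True):  # prefer the newest version
        if not accepts(version):
            continue
        deps = list(index[name][version].items())
        result = resolve(index, rest + deps, {**pinned, name: version})
        if result is not None:
            return result
    return None  # no candidate worked, so the caller must try another choice


solution = resolve(INDEX, [("app", lambda v: True)], {})
```

In this example the newest `lib_a` needs a `shared` version that `lib_b` cannot accept. A tool that simply takes the first match for each requirement ends up with an inconsistent set, while the backtracking search returns to the `lib_a` choice and picks the older version instead.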

=== Improve pip user experience ===
''This is now funded, thanks to [[https://chanzuckerberg.com/eoss/proposals/improving-user-experience-and-debuggability-of-pip-for-all-python-users/|the Chan Zuckerberg Initiative]] and [[https://www.mozilla.org/en-US/moss/|Mozilla Open Source Support]].''

`pip`'s user experience needs to improve by providing [[https://github.com/pypa/pip/milestone/25|better error messages]] and prompts, logs, output, and reporting, and becoming more consistent across features, to fit the user's mental model better, make hairy problems easier to untangle, and reduce unintended data loss. `pip`'s maintainers have [[https://github.com/pypa/pip/milestone/10|a list of TODOs]] and need funding so that user experience researchers, UX designers, developers, and technical writers can spend dedicated time addressing them.

Packaging improvements that could be funded

The Fundable Packaging Improvements repository lists specific things that

  1. the Python packaging community wants
  2. are fairly well-scoped
  3. would happen much faster if the Packaging Working Group got funding to achieve them (through donations or grants/directed gifts)

That list was previously on this wiki page and has now been moved to GitHub.

Please contact the Packaging WG to ask us to estimate how much one of these improvements would cost; we'll get back to you within a few business days.

Fundable Packaging Improvements (last edited 2020-06-05 22:32:24 by SumanaHarihareswara)
