Test discussion meeting March 27, 2020

Participants:

* Ilan
* Paul
* Pradyun
* Sumana
* Tzu-Ping

Current status - test infrastructure & type & number of automated tests

* Paul: current situation: we have the basics of the new resolver linked in and accessible from pip. There are a lot of bits of functionality we haven't yet implemented.
    * Basic functionality -- we say "install x y z" and it goes to find some dependencies - that is present.
    * Now: worth talking about this -- as things stand, we've written a very small number of simplistic tests to make sure that the test suite confirms that the new resolver can do an install (a sketch of that style follows this list).
    * We've also tried running the full pip test suite with the new resolver. It probably passes ~50% of the tests. Not THAT indicative -- it's unclear how many of those tests particularly exercise the resolver. So maybe not 50% of the capability is in place....
* Ilan: the other tests .... are just failing? Might they be failing for reasons unrelated to the resolver?

On failures:

* Paul: probably all failing for resolver-related reasons. There are enough tests in this suite that looking at even a sample of failing tests would be a LOT of work. We have critical work to do re: specific things we know the resolver is failing at.
    * Upon inspection: some tests we know why, e.g., extras are not yet supported.
* Now: at some stage we will want to tackle: how do we ensure the functionality is covered?
    * We are writing a lot of code right now with very limited tests.
    * We could do with improving test coverage of the new code.... we don't want to spend a lot of our time duplicating the test-delivery work you are going to do.
* New resolver implementation (ongoing work): https://github.com/pypa/pip/tree/master/src/pip/_internal/resolution
* New resolver tests: https://github.com/pypa/pip/blob/master/tests/functional/test_new_resolver.py
* Older resolver/dependency YAML tests: https://github.com/pypa/pip/tree/master/tests/yaml
* It would be useful for us to understand -- what sort of tests do you anticipate delivering, and when? How can we use them? When will we have something we can start to work with?
* In the time between now and when that happens, how do we avoid investing effort [duplicatively]?
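For context, the new-resolver tests linked above are pytest functional tests that build tiny local wheels and run pip install end to end. A minimal sketch in that style, assuming pip's test helpers (the `script` fixture and `create_basic_wheel_for_package`) and the `--unstable-feature=resolver` opt-in flag as they existed at the time; exact names and signatures may differ:

```python
import json

from tests.lib import create_basic_wheel_for_package


def test_new_resolver_installs_dependencies(script):
    # Serve two tiny wheels from a local directory: "base" depends on "dep".
    create_basic_wheel_for_package(script, "base", "0.1.0", depends=["dep"])
    create_basic_wheel_for_package(script, "dep", "0.1.0")

    script.pip(
        "install",
        "--unstable-feature=resolver",  # opt in to the new resolver
        "--no-cache-dir", "--no-index",
        "--find-links", script.scratch_path,
        "base",
    )

    # The requested package and its dependency should both be installed.
    result = script.pip("list", "--format=json")
    installed = {(p["name"], p["version"]) for p in json.loads(result.stdout)}
    assert ("base", "0.1.0") in installed
    assert ("dep", "0.1.0") in installed
```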
Where we want to be by the end of May

Main thing:

* Paul: add more tests that exercise the resolver.
    * The main test suite just exercises _pip_ -- it's not specifically resolver-related.
* Ilan: tests would only exercise the _resolver_? Unit tests or integration tests? Probably both?
* Paul: we only need tests for the resolver in principle.... in practice, everything is coupled enough that, to get very far, you quickly have to run end-to-end tests. We have small unit tests, but for anything that exercises the real dependency resolution, I have found the easiest way is to run pip install in a test harness and see what comes out. Might not be the best way. I believe they will have to be integration tests, not just whitebox unit-level tests.
* Ilan: this also means the tests are quite slow. pip installing is slow. I think it would be beneficial to have some sort of tests closer to the resolver itself.
* Tzu-Ping: wondering if we could feed the resolver a list of requirements and just check that what it spits out is correct .... that would be the only blackbox test. Avoids pulling in all of pip itself.... or does pip have enough internal global state that this won't work?
* Paul: I'd agree.... at the end of the day, I am fine with whatever Ilan thinks is best. Tests will be slow. Right now: 1800 tests, pip install in all of them. Takes 30 min on my 8-core laptop (?). Good enough. I will be practical: I'd rather have tests and not worry about speed. I'd be quite happy for you to try to do something black-box, unit-test level, and if you manage, brilliant! If you don't, rethink and do something higher level. See what you can deliver. It will be useful, whatever it is.
* Pradyun: right now, the tests we've started writing for the new resolver are in a different form from the ones we had earlier [?]
    * Consolidating both of those forms?
    * The older resolver tests were written in YAML. They cover some of the same functionality as the new resolver tests we are writing or have written.
    * One area where it would be useful to have Ilan work: consolidating [old and new] tests.
    * New resolver tests: https://github.com/pypa/pip/blob/master/tests/functional/test_new_resolver.py
    * Older resolver/dependency YAML tests: https://github.com/pypa/pip/tree/master/tests/yaml
* Paul: the reason for the duplication: when I wrote the tests we have, I wasn't familiar with how the YAML stuff worked. Did what worked. I am feeling a bit uncomfortable - we ought to have someone looking at doing what's best for the test suite, not just "it works".
* Ilan: when I looked at the tests a month or so ago, I liked the way the YAML tests are done. A concise way of covering many scenarios (their shape is sketched after the TODO list below).
    * I'd use those tests for the new resolver as well.
    * Flagging a concern: we might need to improve whichever test harness we work on, since the two likely don't match 100% in functionality -- the YAML tests can't do uninstalls, for example.
* TODOs:
    * TODO - Pradyun: make a list of some of the tests that need deduplicating -- as a starter.
    * TODO - Ilan: start this afternoon -- deduplicate one test, submit a PR, work through the list of tests that need deduplicating, and speak up if he is confused or unsure of anything or needs direction.
    * TODO - Ilan: speak up if he runs out of tests to consolidate.
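For reference, the YAML cases under tests/yaml are roughly of this shape -- an index of available packages (optionally with dependencies), plus install requests and the expected resulting state. This is a from-memory approximation; the harness in the pip repo defines the authoritative syntax:

```yaml
# Approximate shape of a tests/yaml case (not authoritative):
base:
  available:
    - simple 0.1.0
    - simple 0.2.0
    - base 0.1.0; depends simple

cases:
-
  request:
    - install: base
  response:
    - state:
      - base 0.1.0
      - simple 0.2.0
```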
What Ilan specifically would be best placed to do next

* (checked availability of Ilan)
* > Adding more tests for exercising the resolver
* SH: next step, Ilan does this consolidation, and a good first step of that would be to have Ilan work on adding more tests to the existing tests.
* SH: (to Ilan) Do we need some requirements gathering or specification before we can get started?
* Ilan: probably need to write some tests first before answering. Not sure where the problems might be.
* SH: sounds like we could talk to you again late Monday? To work toward making sure the resolver is better exercised.
* Ilan: yes; will it be another call?
* SH: a call, or email, or Zulip, etc.
* Ilan: yes, I could answer on Zulip by Monday about what further specifications I need in order to write more tests.
* SH: you could be the one writing the specifications. It could be that you do the information gathering for writing specifications.
* Ilan: I understand.
* SH: we're really interested in bringing in your experience from past work on package managers -- edge cases we'd need to be concerned with. What might make sense would be for you to do this gathering such that it enables all of us to better understand... we'd want your experience to come in.
* Pradyun: edge cases.... cases where pip gets stuck... where [inaudible] the resolver search would produce suboptimal results.... issues in resolution strategy ....
* TODO - Pradyun (today): share with Ilan a set of places to look for current edge cases/bugs (such as the zazo GitHub issues list) -- DONE
* Paul: I'd like to get some insights from Ilan's experience into good ways of doing error reporting - how to tell the user what happened when things go wrong, in a way that helps the user understand/address the issue.
* Ilan: I'm struggling with those kinds of problems with the conda resolver as well. Initially the conda resolver would just say "it didn't work", "unable to satisfy". In the last year it was improved, but now the problem is the opposite - it lists too much info, much of which is not relevant. The user doesn't really understand where the problem is.
* Paul: sounds like the insight is: either too much or too little.
* SH: any guidance to share on how to do this well?
* Ilan: my guidance would be: be as specific as possible, to the user, about what went wrong. Don't just tell the user "something went wrong", and don't give a list of many different things that COULD be wrong (a tiny illustration appears at the end of these notes).
* SH: yes/no -- any guidance on how to walk the line between being specific and being inaccurate?
* Ilan: depends on the resolver + how easy/difficult it is to get that information. E.g., with conda it is hard to say what went wrong, easy to say that it went wrong. Need to see what [information] the resolver could produce.
* SH: so it depends on debuggability + user visibility, etc. PM+PG+TP could look into this; debugging documentation?
* Paul: figuring out what hooks we should add .... we're working on that.
* Sumana: (to Ilan) can I ask you in Zulip to talk about guidance on what hooks, based on your experience, could be useful?
* Ilan: sssuuuureeee.....
* TODO - Sumana: ask Ilan to talk about that in Zulip .... on Monday or Tuesday.
* Ilan has almost all of his hours remaining.
    * 66 hours left
* TODO: start with deduplicating the tests listed by Pradyun (see above).
* TODO: group to check in with Ilan late Monday (US time) to get some more info about what he needs to get going on writing more tests exercising the resolver.

Summary: we have a way forward
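To close, a tiny illustration of the "be specific" error-reporting guidance discussed above: name the exact requirements that clash rather than emitting a generic failure or an exhaustive dump. All names here are invented for illustration and are not pip internals:

```python
def format_conflict(project, requirers):
    """Explain a resolution failure by naming the exact clashing requirements.

    `requirers` maps each requiring package to the specifier it imposes on
    `project`, e.g. {"app 1.0": ">=2.0", "plugin 3.1": "<2.0"}.
    """
    lines = ["Cannot find a version of {} that satisfies:".format(project)]
    for who, spec in requirers.items():
        lines.append("    {} requires {}{}".format(who, project, spec))
    lines.append("Consider loosening one of the requirements above.")
    return "\n".join(lines)


print(format_conflict("dep", {"app 1.0": ">=2.0", "plugin 3.1": "<2.0"}))
# Cannot find a version of dep that satisfies:
#     app 1.0 requires dep>=2.0
#     plugin 3.1 requires dep<2.0
# Consider loosening one of the requirements above.
```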