Python – Turning a personal Python project into a releasable library

development-process, programming-practices, python

I'm an academic rather than a programmer, and I have many years' experience writing Python programs for my own use, to support my research. My latest project is likely to be useful to many others as well as me, and I'm thinking of releasing it as an open-source Python library.

However, there seem to be quite some hurdles to cross in going from a functioning personal project to a library that can be installed and used painlessly by others. This question is about the first steps I should take in order to start working toward a public release.

Currently, I have a single git repository that contains my code that uses the library as well as the library itself, and I use git as an emergency undo button in case anything breaks. All of this works fine for a single user but is obviously not appropriate if I want to release it. Where I want to end up is that my library is in a separate repository and can be installed by others using pip, and has a stable API.

Learning to use setuptools etc. is probably not so hard once I'm at the point of wanting to publish it – my problem is in knowing how I should be working in order to get to that point.

So my question is, what are the first steps one should take in order to start preparing a Python library project for public consumption? How should I reorganise my directory structure, git repository etc. in order to start working towards a public release of the library?

More generally, it would be very helpful if there are resources that are known to be helpful when attempting this for the first time. Pointers toward best practices and mistakes to avoid, etc., would also be very helpful.

Some clarification: the current answers are addressing a question along the lines of "how can I make my Python library a good one for others to use?" This is useful, but it's different from the question I intended to ask.

I'm currently at the start of a long journey towards releasing my project. The core of my implementation works (and works really well), but I'm feeling overwhelmed by the amount of work ahead of me, and I'm looking for guidance on how to navigate the process. For example:

  • My library code is currently coupled to my own domain-specific code that uses it. It lives in a subfolder and shares the same git repository. Eventually, it will need to be made into a stand-alone library and put into its own repository, but I keep procrastinating this because I don't know how to do it. (Neither how to install a library in 'development mode' so that I can still edit it, nor how to keep the two git repos in sync.)

  • My docstrings are terse, because I know that eventually I will have to use Sphinx or some other tool. But these tools seem not to be simple to learn, so this becomes a major sub-project and I keep putting it off.

  • At some point I need to learn to use setuptools or some other tool to package it and track the dependencies, which are quite complex. I'm not sure whether I need to do this now or not, and the documentation is an absolute maze for a new user, so I keep deciding to do it later.

  • I've never had to do systematic testing, but I definitely will for this project, so I have to (i) learn enough about testing to know which methodology is right for my project; (ii) learn what tools are available for my chosen methodology; (iii) learn to use my chosen tool; (iv) implement test suites etc. for my project. This is a project in itself.

  • There may well be other things I have to do as well. For example, jonrsharpe posted a helpful link that mentions git-flow, tox, TravisCI, virtualenv and CookieCutter, none of which I'd heard of before. (The post is from 2013, so I also have to do some work to find out how much is still current.)

When you put this all together it's a huge amount of work, but I'm sure I can get it all done if I keep plugging away at it, and I'm not in a hurry. My problem is knowing how to break it down into manageable steps that can be done one at a time.

In other words, I'm asking which are the most important concrete steps I can take now, in order to reach a releasable product eventually. If I have a free weekend, which of these things should I focus on? Which (if any) can be done in isolation from the others, so that I can at least get one step done without needing to do the whole thing? What's the most efficient way to learn these things so that I will still have time to focus on the project itself? (Bearing in mind that all of this is essentially a hobby project, not my job.) Is there any of it that I don't actually need to do, thus saving myself a huge amount of time and effort?

All answers are greatly appreciated, but I would especially welcome answers that focus on these project management aspects, with specific reference to modern Python development.

Best Answer

Adding a setup.py, while necessary, is not the most important step if you want your library to be used. More important is to add documentation and advertise your library. Since the second point depends strongly on the library, let me focus on the documentation aspect.

  1. You know everything about your library, and this is problematic. You already know how to install and use it, so many things may seem intuitive or plainly obvious to you. Unfortunately, the same things may be neither obvious nor intuitive to your users. Try to look at your library as if you knew nothing about it and, more importantly, ask other people to use it and note every difficulty they run into.

  2. Explain, in plain English, what your library is about. Too many libraries assume that everybody already knows about them. When this is not the case, it can be difficult to grasp what the purpose of the library is.

  3. Write detailed technical documentation, but also don't forget about short pieces of code which show how to do some of the tasks with your library. Most developers are in a hurry, and if they need to spend hours trying to understand how to do a basic thing, they may tend to switch to other libraries.

  4. Include your contact information. If your library is a success, people will run into difficulties with it: either bugs or simply trouble understanding or using some parts of it. (My own experience has shown that this happens even with rather obscure libraries.) Their feedback is often useful for improving your library: for every person who reports a problem, there are possibly hundreds who, on encountering it, would simply switch to another library.
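As an illustration of point 3, here is a minimal sketch of a docstring that carries its own short usage example. The function `moving_average` is hypothetical and stands in for whatever your library actually offers. The standard-library `doctest` module can run the example, so the documentation fails loudly if it drifts out of date.

```python
"""Hypothetical module showing a docstring that doubles as a usage example."""
import doctest


def moving_average(values, window):
    """Return the simple moving average of ``values``.

    Each output element is the mean of ``window`` consecutive inputs.

    >>> moving_average([1, 2, 3, 4], window=2)
    [1.5, 2.5, 3.5]
    """
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


if __name__ == "__main__":
    # Run the examples embedded in the docstrings and report any mismatch.
    doctest.testmod()
```

A reader in a hurry can copy the example straight from the docstring, and running the module as a script checks that it still produces the documented output.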

In addition:

  1. Make it clear if your library works with Python 2 or 3 or both.

  2. If the library doesn't work on Windows, say so.

  3. Ensure you follow the official conventions (PEP 8; the pep8 tool, since renamed pycodestyle, can check this). Where you deviate, either explain why clearly or fix it.

  4. Take care of handling edge cases. When your library is called with a wrong type or with a value which is not supported, it should say, in plain English, what exactly is wrong. What it shouldn't do is to raise a cryptic exception ten levels down the stack and let the user figure out what went wrong.
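The last point can be sketched as follows. This is only an illustration under assumed names: `scale` and its non-negativity rule are hypothetical, not part of any real library. The idea is to validate arguments at the public entry point and raise an exception whose message names the argument and the offending value.

```python
import numbers


def scale(values, factor):
    """Multiply every element of ``values`` by ``factor`` (hypothetical API)."""
    # Validate at the public boundary so the error names the offending argument.
    # bool is excluded explicitly because it is a subclass of int.
    if isinstance(factor, bool) or not isinstance(factor, numbers.Real):
        raise TypeError(
            f"factor must be a real number, got {type(factor).__name__}: {factor!r}"
        )
    # Assumed rule for this sketch: negative factors are not supported.
    if factor < 0:
        raise ValueError(f"factor must be non-negative, got {factor!r}")
    return [v * factor for v in values]
```

With this check in place, calling `scale([1, 2], "3")` fails immediately with a `TypeError` that says `factor` must be a real number, rather than silently returning a surprising result or failing somewhere deeper in the library.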
