Packaging and Distributing Code (for python)

Kevin Gullikson

Why Package code?

  • Get citations. Profit.
  • Make things easy on collaborators
    • They can install things more easily
    • They can all use the same code base as you
    • They can contribute to the code
    • Motivate them to switch to python
  • Make things easy on YOU
    • Easily install on new computers
    • Documentation --> helps you remember what your two-year-old code does
  • Overall good experience for industry jobs (I presume)

Setting up a Python Package

Minimal Package Structure

root
+-- setup.py
+-- README
+-- LICENSE
+-- package_name/
|   +-- __init__.py
|   +-- foo.py
|   +-- bar.py

What is the setup.py?

  • This tells python how to install the program
  • You have probably done this before:
python setup.py install
  • Here is a good tutorial on setting it up.
  • An example of a setup.py I created:
from setuptools import setup

setup(name='fitting_utilities',
      version='0.1.0',
      description='Various useful classes for fitting stuff.',
      author='Kevin Gullikson',
      author_email='kevin.gullikson@gmail.com',
      license='BSD',
      classifiers=[
          'Development Status :: 3 - Alpha',
          'Intended Audience :: Science/Research',
          'License :: OSI Approved :: BSD License',
          'Programming Language :: Python',
          'Topic :: Scientific/Engineering :: Astronomy',
          ],
      packages=['fitters'],
      install_requires=['numpy', 'astropy'])

setup.py arguments

  • name: This is what goes on pypi (more on that later)
  • classifiers: Think of these like the keywords you put in your abstract. You want to make your code searchable. A list of classifiers is available here
  • packages: In the simple/standard case, it is just a list of the packages you are making available. This is the name you import
    import fitters
    
    NOT
    import fitting_utilities
    
    • Having different things for the 'name' and 'packages' field can lead to confusion

python setup.py develop

  • Connects the package to your python environment without installing it.
  • Changes you make are instantly available anywhere on your system
  • Don't be like 1st-year me and hack at your PYTHONPATH!
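
One way to confirm that the development copy is the one being imported is to ask Python where it finds the package. A minimal sketch using only the standard library; `package_location` is a hypothetical helper name, and `json` stands in for your own package.

```python
import importlib.util


def package_location(name):
    """Return the file Python would load for ``name``, or None if not importable."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    # For a regular package, origin is the path to its __init__.py.
    return spec.origin


if __name__ == "__main__":
    # Replace 'json' with your package name; after `python setup.py develop`
    # the path should point into your source checkout, not site-packages.
    print(package_location("json"))
```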

What is the __init__.py?

  • It tells python that this directory is a package
  • Does not need to have anything in it - an empty file is fine
  • Lets you do:
import package_name
package_name.foo.foofunction()
  • You CAN put some stuff in it, though. Putting this in the __init__.py
    from .foo import foofunction
    
    Lets you do:
    import package_name
    package_name.foofunction()
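
To see both spellings work, here is a self-contained sketch that writes a throwaway package to a temporary directory and imports it. The names mirror the example above; the helper functions and the body of `foofunction` are made up for illustration. On Python 3 the re-export in `__init__.py` is written as an explicit relative import (`from .foo import ...`).

```python
import importlib
import sys
import tempfile
from pathlib import Path


def build_demo_package(root):
    """Write a tiny package to disk (names mirror the example above)."""
    package_dir = Path(root) / "package_name"
    package_dir.mkdir()
    (package_dir / "foo.py").write_text(
        "def foofunction():\n    return 'hello from foo'\n"
    )
    # The explicit relative import re-exports foofunction at package level.
    (package_dir / "__init__.py").write_text("from .foo import foofunction\n")


def call_both_ways(root):
    """Import the demo package and call foofunction via both spellings."""
    sys.path.insert(0, str(root))
    try:
        package = importlib.import_module("package_name")
        return package.foo.foofunction(), package.foofunction()
    finally:
        sys.path.remove(str(root))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        build_demo_package(tmp)
        print(call_both_ways(tmp))
```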
    

Some final thoughts about package setup

  • Always have a README
    • github will initialize one for you when you make a repository
    • github works with markdown or reStructuredText (.rst files)
    • pypi only works with reStructuredText
  • Always have a LICENSE

    "Because I did not explicitly indicate a license, I declared an implicit copyright without explaining how others could use my code. Since the code is unlicensed, I could theoretically assert copyright at any time and demand that people stop using my code. Experienced developers won't touch unlicensed code because they have no legal right to use it. That's ironic, considering the whole reason I posted the code in the first place was so other developers could benefit from that code. I could have easily avoided this unfortunate situation if I had done the right thing and included a software license with my code." -- Jeff Atwood, (codinghorror)

  • The main choices are:

    • BSD/MIT: Permissive. Anyone can use for any purpose. The only legalese is saying that I don't guarantee this will work.
    • GPL: Copyleft. Anyone can use the code, but anything they build on it must be released under the GPL as well.
    • Choosing between the two gets nerds as riled up as vim vs emacs.

Documenting Your Code

Readme

  • Should contain general information about the package
  • Should include how to install it (even if that is just python setup.py install)
  • A simple usage example is a good idea too

Docstrings

  • One of the best parts of python. USE THEM
  • From astropy:
def blackbody_nu(in_x, temperature):
    """Calculate blackbody flux per steradian, :math:`B_{\\nu}(T)`.

    .. note::

        Use `numpy.errstate` to suppress Numpy warnings, if desired.

    .. warning::

        Output values might contain ``nan`` and ``inf``.

    Parameters
    ----------
    in_x : number, array-like, or `~astropy.units.Quantity`
        Frequency, wavelength, or wave number.
        If not a Quantity, it is assumed to be in Hz.
    temperature : number, array-like, or `~astropy.units.Quantity`
        Blackbody temperature.
        If not a Quantity, it is assumed to be in Kelvin.

    Returns
    -------
    flux : `~astropy.units.Quantity`
        Blackbody monochromatic flux in
        :math:`erg \\; cm^{-2} s^{-1} Hz^{-1} sr^{-1}`.
    """

Docstring conventions

  • Give a short description of what the class/function/method does
    """Calculate blackbody flux per steradian, :math:`B_{\\nu}(T)`.
    """
    
  • Describe each parameter (both what the variable type should be and what the parameter means)
    """
      Parameters
      ----------
      in_x : number, array-like, or `~astropy.units.Quantity`
          Frequency, wavelength, or wave number.
          If not a Quantity, it is assumed to be in Hz.
      temperature : number, array-like, or `~astropy.units.Quantity`
          Blackbody temperature.
          If not a Quantity, it is assumed to be in Kelvin.
      ...
    """
    
  • Explain what the function returns, if applicable
    """
      Returns
      -------
      flux : `~astropy.units.Quantity`
          Blackbody monochromatic flux in
          :math:`erg \\; cm^{-2} s^{-1} Hz^{-1} sr^{-1}`.
    """
    

Sphinx (+readthedocs)

  • Can probably skip this step unless you are widely publishing the code
  • Builds documentation from reStructuredText
  • Similar stuff to the README on the main page
  • Include tutorials
  • Auto-documentation of the API

Sphinx setup

Run sphinx-quickstart in your root directory.

root
+-- setup.py
+-- README
+-- LICENSE
+-- package_name/
|   +-- __init__.py
|   +-- foo.py
|   +-- bar.py
+-- docs/
|   +-- conf.py
|   +-- Makefile
|   +-- index.rst
|   +-- foo.rst
|   +-- baz.rst

index.rst

  • This is what people will open the documentation to.
  • My index.rst:
Welcome to TelFit's documentation!
==================================

Contents:

.. toctree::
   :maxdepth: 2

   Intro
   Installation
   Tutorial
   API
   Updating the atmosphere profile <GDAS_atmosphere>

API (application programming interface)

  • Set up in the conf.py (mostly generated when you run sphinx-quickstart)
  • Include the autodoc extension:
# -- General configuration ------------------------------------------------

# Add any Sphinx extension module names here, as strings. They can be
# extensions coming with Sphinx (named 'sphinx.ext.*') or your custom
# ones.
extensions = [
    'sphinx.ext.autodoc',
    'sphinx.ext.coverage',
    'sphinx.ext.mathjax',
]
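
autodoc imports your package to read its docstrings, so conf.py has to be able to find it. A common approach, sketched here under the assumption that conf.py lives in docs/ one level below the package (as in the layout above), is to add the parent directory to `sys.path`. The name `PROJECT_ROOT` is just illustrative.

```python
import os
import sys

# Assumes the layout shown earlier: conf.py sits in docs/, and the package
# directory lives one level up, next to setup.py.
PROJECT_ROOT = os.path.abspath("..")
sys.path.insert(0, PROJECT_ROOT)
```

If the package is already installed in the environment that builds the docs, this step may be unnecessary.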

Docstring format for autodoc

You can use this format:

"""
Parameters
==========
- wave            wavelength (in nanometers)
"""

But this format will render nicely with sphinx:

"""
:param wave: wavelength (in nanometers)
"""

Sam Harrold informs me that you can use the former, if you add this to your conf.py (for details see here):

extensions = [..., 'sphinx.ext.napoleon']  # 'sphinxcontrib.napoleon' before Sphinx 1.3
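
For comparison, here is a small function documented in the field-list style that Sphinx renders without extra extensions. The function `nm_to_angstrom` is a made-up example, not taken from any package mentioned here.

```python
import inspect


def nm_to_angstrom(wave):
    """Convert a wavelength from nanometers to angstroms.

    :param wave: wavelength (in nanometers)
    :type wave: float
    :returns: wavelength (in angstroms)
    :rtype: float
    """
    return wave * 10.0


if __name__ == "__main__":
    # Print the docstring with its indentation cleaned up.
    print(inspect.getdoc(nm_to_angstrom))
```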

Readthedocs integration

  • Set up an account, connect to your github
  • Add commit hooks to rebuild the documentation every time you commit to master
  • Mostly will just work if the documentation builds on your own computer
  • readthedocs won't install random code (understandably)
  • You will almost certainly need to hack the conf.py file to make it work on readthedocs:
# Mock a few modules
import sys
from mock import Mock as MagicMock  # on Python 3: from unittest.mock import Mock as MagicMock

class Mock(MagicMock):
    @classmethod
    def __getattr__(cls, name):
        return Mock()

MOCK_MODULES = ['FittingUtilities', 'numpy', 'scipy', 'matplotlib', 'scipy.interpolate', 'numpy.polynomial',
                'lockfile', 'scipy.optimize', 'astropy', 'pysynphot', 'fortranformat', 'cython', 'requests',
                'scipy.linalg', 'matplotlib.pyplot']
sys.modules.update((mod_name, Mock()) for mod_name in MOCK_MODULES)
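
Sphinx 1.3 and later also offer a built-in alternative to the hand-written mock class: the `autodoc_mock_imports` option in conf.py. A minimal sketch; the module list below is only an illustrative subset of the one above, and whether it covers everything your package imports is something to check against your own build.

```python
# conf.py fragment: let autodoc stub out heavy dependencies that
# readthedocs cannot (or should not) install.
# Recent Sphinx versions only need the top-level module names here.
autodoc_mock_imports = ["numpy", "scipy", "matplotlib", "astropy"]
```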

Distributing Your Code

Pypi

  • De facto standard for installing python packages.
pip install astropy
  • Works great for pure-python packages, especially if they don't depend on anything too complicated.
  • Works for more complicated things as well, but can get hairy...

Tutorial (mostly stolen from here)

  • One-time stuff:
    1. Create an account on pypi and pypi testing
    2. Create a .pypirc file in the home directory (to make your life easier):
[distutils] # this tells distutils what package indexes you can push to
index-servers =
  pypi
  pypitest

[pypi]
repository: https://pypi.python.org/pypi
username: your_username
password: your_password

[pypitest]
repository: https://testpypi.python.org/pypi
username: your_username
password: your_password
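
A typo in the .pypirc usually only shows up when an upload fails, so it can be worth checking the file first. This is a sketch using the standard-library `configparser`; `check_pypirc` is a hypothetical helper, and the sample text repeats the placeholder values from above.

```python
import configparser

# Example contents; the username and password values are placeholders.
SAMPLE_PYPIRC = """\
[distutils]
index-servers =
  pypi
  pypitest

[pypi]
repository: https://pypi.python.org/pypi
username: your_username
password: your_password

[pypitest]
repository: https://testpypi.python.org/pypi
username: your_username
password: your_password
"""


def check_pypirc(text):
    """Return {server: repository URL}, raising KeyError on a missing entry."""
    # RawConfigParser avoids treating '%' in a password as interpolation.
    parser = configparser.RawConfigParser()
    parser.read_string(text)
    servers = parser["distutils"]["index-servers"].split()
    # Every server named in index-servers needs its own section.
    return {name: parser[name]["repository"] for name in servers}


if __name__ == "__main__":
    for name, url in check_pypirc(SAMPLE_PYPIRC).items():
        print(name, "->", url)
```

To check a real file, read `~/.pypirc` from disk and pass its text to the same function.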

Tutorial (continued)

  • For each package

    1. Make a setup.py if you haven't already. Make sure there is a version number in there!
    2. Register to pypitest:

      python setup.py register -r pypitest
      
    3. Upload to pypitest

      python setup.py sdist upload -r pypitest
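
Since every upload needs a version number, one common pattern is to keep it in a single place (for example a `__version__` string in the package's __init__.py) and have setup.py read it from there. A sketch under that assumption; `read_version` is a hypothetical helper, not something setuptools provides.

```python
import re

# Matches a line such as: __version__ = '0.1.0'
VERSION_PATTERN = re.compile(
    r"^__version__\s*=\s*['\"]([^'\"]+)['\"]", re.MULTILINE
)


def read_version(source_text):
    """Extract the __version__ string from the text of a Python file."""
    match = VERSION_PATTERN.search(source_text)
    if match is None:
        raise ValueError("no __version__ assignment found")
    return match.group(1)


if __name__ == "__main__":
    # In a real setup.py you would read package_name/__init__.py from disk
    # and pass the result to setup(version=...).
    example_init = "__version__ = '0.1.0'\n"
    print(read_version(example_init))
```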
      

Tutorial (continued)

  1. Test:

    # Make a new environment to isolate this from the installation you probably already have working
    conda create -n package_test python=3 numpy astropy ...

    # Switch to the new environment
    source activate package_test

    # Install your new package
    pip install -i https://testpypi.python.org/pypi <package name>

    # Test that it works. At the very least, make sure you can import the package
    python -c 'import package_name'
    
    • If it works, move on.
    • If not, figure out what went wrong, increment the version number, and start again from step 3 (upload to pypitest)
  2. Upload to pypi (decrement the version number if you had to update it while testing)

    python setup.py register -r pypi
    python setup.py sdist upload -r pypi
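
The "make sure you can import the package" check can be pushed one step further by importing every submodule as well, which catches files that were left out of the upload. A rough sketch using the standard library; `import_everything` is a hypothetical helper, and `collections` is only a stand-in for your own package.

```python
import importlib
import pkgutil


def import_everything(package_name):
    """Import a package and all of its submodules; return the names that failed."""
    failures = []
    package = importlib.import_module(package_name)
    # __path__ only exists on packages; a single-file module has no submodules.
    search_path = getattr(package, "__path__", [])
    for info in pkgutil.walk_packages(search_path, prefix=package_name + "."):
        try:
            importlib.import_module(info.name)
        except Exception as error:  # report any import-time failure
            failures.append((info.name, repr(error)))
    return failures


if __name__ == "__main__":
    # 'collections' is a stand-in; use the name you import, not the name on pypi.
    print(import_everything("collections"))
```

An empty list means every module imported cleanly in the test environment.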
    

Anaconda

  • Quickly starting to rival pypi for installation
  • Installs binaries rather than compiling from source
  • Knows more about dependencies, can install other things in the right order
  • Can easily install non-python things too (like the HDF5 library needed by h5py/pytables)

Tutorial

  • I will assume you have your package on pypi already. There are other ways to make conda packages...
  • One-time stuff:

    1. Make an account on anaconda.org
    2. Log in:

      anaconda login
      # Enter username and password when prompted
      

Tutorial (continued)

  • For every package

    1. cd to your home directory
    2. conda skeleton pypi <package_name>
    3. Look at the meta.yaml in the new package_name directory. Make sure the information, and especially the required packages, are correct
    4. If you have the "install_requires" keyword in your setup.py, you may need to edit the build.sh to have:

      $PYTHON setup.py install --single-version-externally-managed --record=/tmp/record.txt
      

      (that fix came from this issue)

    5. Build the package. The full path name to the package will be printed to the screen

      conda build <package_name>
      

Tutorial (continued)

  • Uploading to anaconda.org

    1. Convert to work for other platforms (I am not sure this is guaranteed to work)

      conda convert -f --platform all full/path/to/package -o output_directory
      
    2. Upload all of the packages to anaconda.org (must be logged in)

      for f in output_directory/*/*
      do
         anaconda upload "$f"
      done
      

Publicize your code!

How to publicize depends on how big the package you are developing is. Options are:

  • Astrophysics Source Code Library - Needs a paper to link to, but it doesn't have to be a code paper. Pick any paper where you used the code in a reasonable way.
  • Write a full-fledged paper - This is mostly for big packages.
  • Advertise on the Python Users in Astronomy facebook group
  • Advertise on the astropy mailing list

Questions?