This post originated from an RSS feed registered with Agile Buzz
by Travis Swicegood.
Original Post: Using Basketweaver with GitHub
Feed Title: Travis Swicegood
Feed URL: http://travisswicegood.com/atom/
Feed Description: Posts on Git from Travis Swicegood, author of Pragmatic Version Control using Git.
Last month I blogged about using Travis CI with
Armstrong. Things had been going along fine until the last few weeks.
Tests were failing due to network timeouts while talking to PyPI. Never
one to take failing tests lightly, I set out to fix it.
From local testing, it appeared that there was some sort of selective filtering
happening at the server level on PyPI that was causing our tests to fail. All
of our tests in the CI environment follow these steps:
1. Install all of the development requirements with pip install -r requirements/dev.txt
2. Install the local package
3. Execute the tests using fab test
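Scripted, those three steps look roughly like the following. This is only a sketch: the first and third commands are the ones named above, but the command for installing the local package is an assumption (pip install . here), and CI_STEPS and run_ci_steps are names made up for illustration.

```python
import subprocess

# Steps 1 and 3 are the commands named in the post; step 2 is an assumed command.
CI_STEPS = [
    ["pip", "install", "-r", "requirements/dev.txt"],  # development requirements
    ["pip", "install", "."],                           # local package (assumption)
    ["fab", "test"],                                   # run the test suite
]


def run_ci_steps(steps=CI_STEPS, runner=subprocess.check_call):
    """Run each step in order; check_call raises on the first failing command."""
    for step in steps:
        runner(step)
```

Calling run_ci_steps() with no arguments executes the commands for real; passing a different runner makes it easy to see what would be run without touching the network.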
I could follow these steps to the letter locally in a fresh virtualenv, but the
second they hit the Travis CI server they would time out while trying to
install everything. We’ve seen similar behavior at the Tribune when we roll
out new servers. PyPI appears to be up, but installs fail due to timeouts.
Once I confirmed this, I started looking at alternatives to pypi.python.org as
our main index for testing. My initial thought was to have a dynamic server
that would act as a proxy to PyPI and cache everything locally. This requires
the least amount of work long-term—assuming the server stays up. The problem
was that nothing worked quite the way I wanted. The closest I found was
collective.eggproxy. It felt a little odd and wasn’t very configurable
without going the Paster route, so I decided to fall back on basketweaver.
Basketweaver builds a static index suitable for use with pip via the
--index-url option. It takes a directory of files, then generates the HTML
that pip can scrape to determine if the package exists. This HTML can be
hosted anywhere that can serve a static HTML page, such as GitHub Pages.
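To make the idea of a static index concrete, here is a minimal sketch of generating one page of links per project from a directory of files. It is not basketweaver's code or its exact output. The function names, the ../../files/ relative link, and the rule for guessing a project name from a filename (everything before the last hyphen) are all simplifying assumptions.

```python
import os
from collections import defaultdict
from html import escape


def project_name(filename):
    """Guess the project name from a filename such as 'South-0.7.3.tar.gz'.

    Simplification: everything before the last hyphen is treated as the name.
    """
    return filename.rsplit("-", 1)[0]


def render_project_page(name, filenames, files_url="../../files/"):
    """Return an HTML page linking to each downloadable file of one project."""
    links = [
        '<a href="{0}{1}">{1}</a><br/>'.format(files_url, escape(filename))
        for filename in sorted(filenames)
    ]
    return (
        "<html><head><title>Links for {0}</title></head><body>\n"
        "<h1>Links for {0}</h1>\n{1}\n</body></html>\n"
    ).format(escape(name), "\n".join(links))


def build_index(files_dir, index_dir):
    """Write index_dir/<project>/index.html for every file found in files_dir."""
    projects = defaultdict(list)
    for filename in os.listdir(files_dir):
        projects[project_name(filename)].append(filename)
    for name, filenames in projects.items():
        project_dir = os.path.join(index_dir, name)
        os.makedirs(project_dir, exist_ok=True)
        with open(os.path.join(project_dir, "index.html"), "w") as fh:
            fh.write(render_project_page(name, filenames))
    return sorted(projects)
```

Because the result is nothing but directories and HTML files, any static host can serve it.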
Working with GitHub
There are a few hoops to jump through when deploying to GitHub Pages. First,
make sure you include an empty .nojekyll file. GitHub assumes everything you
want to publish is a Jekyll site, but this file tells GitHub not to run your
files through Jekyll.
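Creating the marker file is a one-liner in most tools. As a small sketch in Python, with disable_jekyll being a hypothetical helper name and the site root assumed to be the directory you publish from:

```python
from pathlib import Path


def disable_jekyll(site_root="."):
    """Create an empty .nojekyll file at the root of the published site."""
    marker = Path(site_root) / ".nojekyll"
    marker.touch()  # creates the file if missing, leaves an existing one alone
    return marker
```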
Next, and I can’t count the number of times I’ve done this, GitHub Pages
doesn’t give you directory indexes. Basketweaver generates its index in the
/index/ directory so you can’t hit the plain GitHub Pages URL and expect to
see anything more than an error message. Make sure to add /index/ after
your GitHub Pages URL to view the index once you’ve published your changes.
The next thing I do is rework where basketweaver looks for files to build the
indexes. I really don’t want to look at a full directory of files at my root
directory; instead, I want all of the files stored in the creatively named
./files/ directory. Basketweaver installs a script called makeindex, a name I
can never remember, so I created a run.py file that remembers it for me.
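A wrapper along these lines would do the job. This is a minimal sketch rather than the exact script: it assumes makeindex takes the package files as command-line arguments, and build_makeindex_command and main are names chosen for illustration.

```python
import glob
import os
import subprocess


def build_makeindex_command(files_dir="./files"):
    """Return the makeindex command line covering every file in files_dir.

    Assumes makeindex accepts the package files as positional arguments.
    """
    files = sorted(glob.glob(os.path.join(files_dir, "*")))
    return ["makeindex"] + files


def main():
    # Hand the collected files to basketweaver's makeindex script.
    subprocess.check_call(build_makeindex_command())
```

Calling main() from the root of the repository regenerates the index from whatever is currently in ./files/.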
The last thing to do is to use the newly created index when installing
packages. For Armstrong, we do this with:
I haven’t gone to the trouble of setting up a CNAME for
pypi.armstrongcms.org yet, so we’re using the main github.com-based address.
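As a rough sketch of pointing pip at a static index, the helper below builds the command using pip's --index-url option. The address is a placeholder, not Armstrong's actual index, and pip_install_command is a hypothetical name; the one detail taken from above is that the URL has to end in /index/.

```python
# Placeholder address, not a real index; note the trailing /index/.
EXAMPLE_INDEX_URL = "http://USERNAME.github.com/REPOSITORY/index/"


def pip_install_command(requirements_file, index_url=EXAMPLE_INDEX_URL):
    """Build a pip command that installs from index_url instead of PyPI."""
    return ["pip", "install", "--index-url", index_url, "-r", requirements_file]
```

The returned list can be passed straight to subprocess.check_call, or joined with spaces and pasted into a CI configuration.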
There’s one final gotcha: PyPI uses routing that treats
http://pypi.python.org/pypi/South/ and
http://pypi.python.org/pypi/south/ as the same URL. That’s why
pip install Django and pip install django both work even though the former
is the correct package name. The URL spec is ambiguous as to whether this is
correct, but most web servers are case sensitive, including GitHub Pages.
This will get you if you have dependencies on packages that don’t use all
lowercase names, such as South, Fabric, or Django. All three of these are
dependencies of Armstrong. The fix is to make sure that your
install_requires and requirements files have the correct case. The easiest
way to determine this is to look at the output of pip freeze and make sure
you’re using the same package name as it generates.
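That comparison is easy to automate. The sketch below takes the lines of a requirements file and the lines printed by pip freeze and reports names that differ only by case. It assumes pip freeze prints the properly cased name, as described above; the function names and the simple name-matching regex are illustrative and ignore things like editable installs and URLs.

```python
import re


def requirement_name(line):
    """Extract the project name from a line like 'South==0.7.3' or 'Django>=1.3'.

    Returns None for comments, options such as '-r base.txt', and blank lines.
    """
    match = re.match(r"\s*([A-Za-z0-9][A-Za-z0-9._-]*)", line)
    return match.group(1) if match else None


def find_case_mismatches(requirement_lines, freeze_lines):
    """Return (written, canonical) pairs whose names differ only by case."""
    canonical = {}
    for line in freeze_lines:
        name = requirement_name(line)
        if name:
            canonical[name.lower()] = name
    mismatches = []
    for line in requirement_lines:
        name = requirement_name(line)
        if name is None:
            continue
        expected = canonical.get(name.lower())
        if expected is not None and expected != name:
            mismatches.append((name, expected))
    return mismatches
```

An empty result means every requirement that appears in the freeze output is already spelled the way the case-sensitive index expects.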
Conclusion
At the end of the day, this keeps our tests from being held hostage whenever
PyPI goes on the fritz or starts randomly filtering requests as it seemed to do
this past week. All that said, we’re still borrowing other people’s
infrastructure. GitHub had a little blip while I was writing this post,
underlining that you get what you pay for.
While you can use Basketweaver and GitHub to create a mirror of sorts for your
packages, make sure you control the infrastructure if it’s mission critical that
everything always stays up. That, or pay for it so there’s someone to call when
it goes down.