I tend to use GitHub’s star feature pretty liberally: I flag projects that look immediately useful, those that are interesting because of the language/ecosystem they’re part of, and those that may be useful down the road. Starring a project is a simple fire-and-forget operation, whereas adding a bookmark to Pinboard is moderately distracting since you need to add a description, tag it, and so on. The downside is that my collection of GitHub stars lives in its own little silo, cut off from the ‘canonical’ collection of links that I have built up in Pinboard.
I had first thought of resolving this problem by using IFTTT but the triggering events supported by its GitHub channel are attached to major repository-level actions such as pull requests and issues, so that was out. Plan B was to write a script to do it myself. How hard can it be? As it happens, not all that difficult, though there were a few little corner cases to consider.
Run it
You can get the script on GitHub. While this information is also in the README, in order to run the script, you will need:
- the Requests library
- your Pinboard API token
- a personal API token on GitHub
The last item is not strictly necessary as a) stars are part of your public profile and b) GitHub’s API can be used without authentication, but usage is limited to 60 requests/hour vs 5000 requests/hour for authenticated use; a token avoids issues with rate limiting. Finally, your GitHub username is used by the script as the HTTP user-agent string, as required by GitHub’s API.
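As a rough sketch of how the username and optional token might end up in the request headers (the helper name build_github_headers is hypothetical and not taken from the script; the "token ..." Authorization form is one of the schemes GitHub's API accepts for personal tokens):

```python
def build_github_headers(username, token=None):
    """Build headers for GitHub API calls (illustrative helper, not from the script)."""
    # GitHub's API expects a User-Agent on every request; the username is used here.
    headers = {'User-Agent': username}
    if token:
        # Authenticated requests count against the higher rate limit.
        headers['Authorization'] = 'token ' + token
    return headers
```

Without a token the function simply omits the Authorization header, which matches the unauthenticated, lower-limit case described above.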
You can either pass the API tokens as command line parameters or stick them in specially-named dotfiles in your home directory:
# Assuming you copy the token to the clipboard before running each command,
# pbpaste dumps the clipboard contents to stdout
$ pbpaste > ~/.github_api_token
$ pbpaste > ~/.pinboard_api_token
# Keep prying eyes out
$ chmod 600 ~/.*_api_token
Running with dotfiles:
$ python pin-github-stars.py -u GITHUB_USERNAME
Running without dotfiles:
$ python pin-github-stars.py -g GITHUB_TOKEN -p PINBOARD_TOKEN -u GITHUB_USERNAME
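One way the "command line first, dotfile second" lookup could work is sketched below. This is an assumption about the approach, not the script's actual code; resolve_token and its home parameter are hypothetical names introduced for illustration.

```python
import os

def resolve_token(cli_value, dotfile_name, home=None):
    """Return the command-line token if given, else read it from a dotfile."""
    if cli_value:
        return cli_value
    # home is overridable only to make the sketch easy to try out
    home = home or os.path.expanduser('~')
    path = os.path.join(home, dotfile_name)
    with open(path) as f:
        # The pasted token may or may not end with a newline; strip either way
        return f.read().strip()
```

Calling resolve_token(None, '.github_api_token') would read the file created by the pbpaste command above, while passing a value from -g skips the file entirely.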
Implementation details
Requests
I love Requests. It takes its motto of “HTTP for humans” seriously, and it’s one of the first tools I reach for when I start doing anything HTTP-related in Python: it’s that good, and the standard library’s mess of options for doing the same is, comparatively, that painful. It’s so good that, for cases like this where I only need a subset of a service’s REST API, I typically don’t bother with the service-specific library. (It also doesn’t hurt that GitHub’s API in particular is among the better ones you may come across.)
While adding external dependencies is rarely without cost, Requests makes a strong argument for itself: the resulting code is extremely concise and the library is largely self-contained.
Generators
This is perhaps a bit indulgent but I’ve found that practical examples of Python generator usage are sometimes hard to find. Since GitHub’s API makes extensive use of paging for calls that may return large numbers of results, it matches up well with the “on demand” nature of a generator. I’ve slightly modified a chunk of the script here:
import re

import requests


def get_github_stars(headers, sort_dir):
    page = 1
    last = 1
    params = {'direction': sort_dir}
    while page <= last:
        params['page'] = page
        r = requests.get('https://api.github.com/user/starred',
                         params=params,
                         headers=headers)
        yield r.json()
        # A single page of results carries no 'link' header, so we're done
        if 'link' not in r.headers:
            break
        m = re.search(r'page=(\d+)>; rel="next",.*page=(\d+)>; rel="last"',
                      r.headers['link'])
        # Lazy way to see if we hit the last page, which only has 'first' and 'prev' links
        if m is None:
            break
        # groups() returns strings; convert so the loop compares numbers
        page, last = (int(n) for n in m.groups())


# Simple usage of the generator:
for results_page in get_github_stars(my_header_dict, 'desc'):
    pass  # Do the thing with the stuff
When execution reaches the yield statement, a page of results is returned to the for loop and execution of the generator is suspended (though its state is preserved). When the for loop finishes iterating over the contents of results_page, execution returns to the generator and picks up after the yield statement, where the function uses the link header to find the page numbers of the next and last pages of results. The while loop continues, exiting once the final page has been fetched.
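To make the link header handling a little more concrete, here is a small standalone sketch that pulls page numbers out of a header value. The function name parse_link_pages and the sample header are made up for illustration; the sample only mimics the general shape of what GitHub sends. Requests also exposes parsed link headers as r.links, which may be a tidier option than a hand-rolled regex.

```python
import re

def parse_link_pages(link_header):
    """Map each rel value in a Link header to its page number."""
    pages = {}
    # Each entry looks like: <https://...&page=2>; rel="next"
    for url, rel in re.findall(r'<([^>]*)>;\s*rel="([^"]+)"', link_header):
        m = re.search(r'[?&]page=(\d+)', url)
        if m:
            pages[rel] = int(m.group(1))
    return pages

# Hypothetical header value in the style of a paginated GitHub response
sample = ('<https://api.github.com/user/starred?direction=desc&page=2>; rel="next", '
          '<https://api.github.com/user/starred?direction=desc&page=5>; rel="last"')
pages = parse_link_pages(sample)
```

With a mapping like this, the absence of a 'next' key is the signal that the last page has been reached.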
It’s important to keep in mind that a generator function differs from a normal function: calling a generator function directly returns a generator object; it does not start running the generator. To do that, you’d call next() on the generator object (in Python 2, generator objects also have a next() method); in a for loop, that’s handled automatically.
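A tiny experiment shows the difference. The generator below is a toy invented for this illustration (two_pages and log are not part of the script); it records when its body actually runs.

```python
def two_pages(log):
    log.append('started')
    yield 'page 1'
    log.append('resumed')
    yield 'page 2'

log = []
gen = two_pages(log)           # creates the generator object; the body has not run
log_before_next = list(log)    # snapshot taken before any next() call
first = next(gen)              # runs the body up to the first yield
log_after_next = list(log)     # snapshot taken after the first next() call
rest = list(gen)               # resumes after the yield and drains the generator
```

Nothing is appended to log until next() is called, and the second append only happens once the generator is resumed.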
If you’d like to dig deeper into the world of generators, this David Beazley presentation is a good place to start.