Reading List Referenced at Usenix Talk

My Usenix talk this year uses various books I've drawn on for inspiration as backgrounds for my slides.  The goal of this was to share some of the broader world beyond what we usually look to as computer scientists.  Some of these books are accessible, while others are extremely dense.  I recommend picking things up and putting them down if they don't speak to you.  It's all about what is useful, helpful and challenging to you where ever you are right now.

When Buying Computer Components

I'm doing a workshop on putting together computers in two hours, and rather than do hand outs I figured I'd toss the links up on my blog. When buying components for a computer, I usually read:

Ars Technica

If you are buying the components for a machine, they do systems guilds that are a useful report on the state-of-the-art each December-ish.  The one for 2012 is available here:

Video Card Benchmarks

They have performance statistics and a useful ranking of video cards by performance per price that I find particularly useful.  It is helpful to remember that many of these are relatively arbitrary, so if you have some specific game or application in mind it is useful to find reviews specifically for that application.


This is the easy way to find the RAM that goes with your motherboard.


Tom’s Hardware 

Has comparative reviews of various components, like hard drives, though I find their comparisons less easy-to-read than the video card benchmarks site.


Fractal Design Patterns

The difference between Architecture and Code blurs quickly when refactoring becomes sufficiently common, so the distinction made between various pattern languages never seemed especially helpful to me. Between Architecture and Service the line is firmer: this code is mine, that code is yours, here is the interface. At the same time, I've found that the design patterns that work when I'm writing methods and classes still apply when I'm working with services. The goal is still to increase cohesion and decrease coupling, even if often I have no control over half of the code.

Thus, the idea of a Fractal Design Patterns. Instead of the usual pattern description, which describes the pattern at a specific level of abstraction, a Fractal Pattern would illustrate it at multiple levels and try to get at the underlying principle.

For example, I'll take the algorithm-swapping-base-on-state that is described by the Strategy pattern.

The Internet is Convincing Women Not To Study Computer Science

A summary from YodasEvilTwin on Slashdot:

"The internet is dominated by sexist men, which discourages women from getting involved in related fields."  

I add a bunch more caveats, references and empirical data, but that is a good summary of how I interpret the evidence.


There is currently a responsibility-dodging contest between industry and academia over who is to blame for the declining enrollment of women in Computer Science and declining employment of women in software development. I hear people in industry bemoan the "empty pipeline", while academics maintain that women aren't entering their programs because of perceptions of the industry.  I have compiled some data that may help resolve the question by highlighting a third factor common to both: access to an Internet-based culture of computing.

Assumptions Make Programming Possible

Scott McCloud, in Understanding Comics, uses a simple image to explain how people employ assumptions when reading comics:

I may have drawn an axe being raised in this example, but I'm not the one who let it drop or decided how hard the blow or who screamed or why.  That, dear reader, was your special crime, each of you committing it in your own style.

I argue that the same is true when reading code.  The difference, however, is that with executables we can check those assumptions against our invented reality.

An Introduction to Services From This Coder's Perspective

First, there is the question of “what is a Service?”  

W3C defines a web service as:
a software system designed to support interoperable machine-to-machine interaction over a network.”[1]

They go on to specify some specific technologies mandated by this particular standard and the role a service plays:
“The purpose of a Web service is to provide some functionality on behalf of its owner -- a person or organization."
This definition describes the infrastructure, focusing on how we deploy services and what purpose they serve right now to MBAs.  It is as though they are defining a program as "a sequence of machine instructions that perform mathematical or logical operations on behalf of a computer operator”.  I want a definition that tells me how to address the technical questions that emerge and provides guidance on how we will be writing and employing services going forward.  My current personal definition is:
A Service is an Object bound at run-time by a networking protocol.

Note that I mean “object” here under Alan Kay’s definition [2] and not in the particular sense that the word is used any specific language like Java or C++.  Despite varying dynamics, all service frameworks encapsulate some combination of data and behavior (not necessarily *well*, of course, but when they fail it is usually clear from the pain that follows.)  They hide implementation details while fulfilling either implicit or explicit contracts in response to messages sent by other services.[3]

The interesting change from current Object-Oriented programming paradigms is that they also encapsulate the location they are executing and the provider of the object.  Because we bind to the specific implementation at runtime using a networking protocol of some kind, the object can be running on any machine, virtual or physical, anywhere reachable by that protocol.  I don’t think this fundamentally changes the paradigm any more than dynamically linked libraries did, but it is really, really cool.

Swapping from Google Reader to Tiny Tiny RSS

Now that Google has broken integrated Google Reader with Google+, I was looking for a replacement that would let me use my daily reading of feeds the way I always had: as a way to share long-form content with other folks who specifically wanted to read the long-form content I shared this way (opt-in broadcast). Google+ defeats the purpose: I like my RSS feed and my friends' shared items specifically because of the high signal-to-noise ratio and the lack of dilution with other content.

My search led me to Tiny Tiny RSS. It offers a similar feature to Google's shared items, except instead of a specific social area it “publishes” items to your own RSS feed. It does not replace the comment or discussion capabilities of Google Reader, but it has the advantage of being something I can host myself and open source; if I ever have some free time I can address any flaws that continue to bother me.

A Preface to My Future Work on the Economics of Open Source

“Economics is basically about incentives and interaction — or, as Schelling put it, micromotives and macrobehavior. You try to think about what people will do in certain circumstances, and you try to understand how individual behavior adds up to an overall result.” – Paul Krugman

The economics of open source software has generally been approached from the perspective of “why would people do this thing?”  This makes some sense; classical economic models leave non-monetary considerations to the realm of game theory and sociology and the question of micromotives initially looks exceptionally opaque.  The result, however, has been a skeptical approach to applying an economic lens to open source and a general failure to explain the macrobehavior involved.  Most papers I’ve found attempt to explain away open source as human irrationality, rather than demonstrate the way it fits with, and indeed validates, our existing models.  I'm one of those people who think that if reality clashes with a model, the problem is probably not reality.


CMake Tips & Tricks: Drop Down List

In a recent CMake project I was setting up, I wanted users to be able to choose one of several possible libraries at project generation, to make performance comparisons easy on multiple platforms.  This is easy enough to do with a configuration parameter, but since the libraries available were a limited set offering the available options seemed better.  I discovered that in the CMake GUI it is possible to have a drop down menu of options for a given property, and it’s actually quite easy.  The only thing to keep in mind is that this approach doesn’t enforce anything; the user could still enter other options.  Since this is only used by developers to generate projects, I didn’t particularly care.  They break it, they bought it, as it were.

First, we use a cache variable and enumerate the options for our drop-down list:

SET(LIBRARY_TO_USE "Option1" CACHE STRING "library selected at CMake configure time")


After that it’s just a matter of changing the things that should change when this option changes.  There are a couple possible approaches here, though none of them completely satisfy my aesthetic sense.

1.       The first is simply to call everything on every invocation, but that defeats the purpose of the caching in the first place. 

2.      I can make sure this .cmake file is included at the top of the project CMakeLists.txt, so it is called before anything else that might include this library.  In that case I can check the LIBRARY_FOUND variable, which is set the first time any of these libraries are loaded during a build.  The upside of this is that if multiple files include this .cmake file it will only reload everything once per project generation.  The downside is that it relies on not having someone load the library before this file is included, and that was a deal breaker; I don’t want to rely on implicit assumptions.  Also, it still reloads the cache once per build. On the up side, if I want to vary non-cache values this allows me to group all the change logic in one place.

3.       The final option is explicitly checking to see if the variable has changed by caching the last value inside of the has-changed if statement. This requires using a second cached variable to hold state and initializing it if it is undefined. Additionally, this variable should never be changed by a user, so I use MARK_AS_ADVANCED to hide it from the GUI.

I used option three, which looks like:


     SET(LIBRARY_LAST "NotAnOption" CACHE STRING "last library loaded")





     SET(LIBRARY_LAST ${LIBRARY_TO_USE} CACHE STRING "Updating Library Project Configuration Option" FORCE)


The important part of this is “UNSET”.  Any cached variables that are set in the Find.cmake file will need to be explicitly cleared in order for them to be actually updated.  The rest of it is simply determining whether or not the parameter changed.

Finally, we need to change parameters on the basis of what option is selected.  If only cache variables change, we can include this in the “if changed” loop, but I was using non-cached variables accessed by the Find.cmake files, so I set these each time.  It would be cleaner to separate these into their own CMake files with a regular naming scheme, but since I was only setting one parameter I didn’t bother.  This looked like:








Etc.  The path naming conventions were not actually that regular either, or wouldn’t have needed the switch statement.

All of these went into a SetLibraryOptions.cmake file, and I added INCLUDE(SetLibraryOptions.cmake) to the root level CMakeLists.txt file.  When I included this library in a future target, I used the regular package syntax with ${LIBRARY_TO_USE} as the package name.  This is why it was so useful to have a drop down menu here: each package name must exactly match the format of the Find.cmake file.  T

Now, when I use the library in another package the include will look something like:




           MESSAGE(FATAL_ERROR “failed to find “ ${LIBRARY_TO_USE})



And that’s it; when the user selects a different library all of the projects will be regenerated with the new option.  The final result looks like:


@dharmesh Polls Twitter: What Do We Call People Who Code?

For those of you that write code, what term do you prefer? Programmer? Engineer? Developer? Something else?

Dharmesh Shah asked this question Twitter yesterday, and I did a quick compilation of the public responses.  For the answers with more than one vote I include the total votes and also a score. About a third of responces used some form of, “I like X, but sometimes I use Y”, and instead of throwing that information away I awarded 3 points for a first choice, 2 for a second choice and 1 point for a third choice.

  • Developer – 17 votes/score 52
  • Engineer –11 votes/score 29
  • Programmer – 4 votes/score 11
  • Hacker – 4 votes/score 11
  • Coder – 3 votes/score 8

The answers that appeared only once were:

  • Byte Surgeon
  • Architect
  • Tinkerer
  • Code Monkey
  • Codewright
  • Professional Geek
  • Someone who types on a keyboard all day in air conditioning
  • Chief Ideas Officer


It definitely looks like “Developer” is the standard, but what immediately jumped out at me was the way some people embrace the same aspects of the job others try to avoid. Some people reported that “Programmer” sounded too much like someone who just wrote code and didn’t think about it, whereas someone else described their job as “Code Monkey”, which revels in that role. Some of the creative responses, like “Chief Ideas Officer” didn’t imply any contact with code at all, where as others, like “Byte Surgeon”, implied a visceral, low-level involvement.

It seems like sone of the trade off is between “code” and “prestige”, which is always disappointing for me to discover.  Several people suggested they would use different words if talking to a fellow coder rather than someone outside the profession, usually preferring "Engineer" when talking to people who don't write code themselves. This is perhaps why “Developer” wins out in the end: it seems to suggest a job that involves typing things that get executed, one way or another, without also suggesting that someone handed you pseudocode to implement. Which may be to say, it is uniformly bland and uninformative, conveying as little information about the tasks performed and the role plays as humanly possible.

 It is clear that there are multiple jobs that would fall in this category, though, even if we don’t yet have the language to articulate the differences.  Certainly independence vs. subordination is a common theme, but I also noticed there were no terms proposed that specifically called out “team member” or “collaborator”. I would personally prefer such a term to either the independence of “Hacker” or the subordination of “Code Monkey”.  Unfortunately, any such word runs the risk of stepping too far from the technical roots,and implying that the code writer is no longer elbow-deep in bloody code.