I grew up a hacker (in the original sense) and thus a True Believer in open knowledge. And so, when it came time to start publishing science, I figured I’d make all my products Open. But it turns out that there’s a bewildering array of things to think about if you want to do so. More recently, I’ve been wanting to incorporate other people’s creations in my own, and have encountered various difficulties in using Open products. I’m writing this post, in part, so I have notes I can easily reference in the future. But I figure if it helps me, it can help others, so here you go.
I have to put a note here that I am not a lawyer, and so this is not legal advice. This is just my good faith understanding of the intersection of U.S. copyright law, licensing, and academic products.
What is copyright, and why do I care?
When you make a Thing, you get to decide how it’s used and how it’s distributed to other people. That’s copyright. The sorts of Things we’re concerned with here are scientific writing (journal papers, reports, dissertations, etc.) and other media (photos, video, audio, etc.), scientific data, and software. You’ll see these Things referred to as “creative works” if you read a lot about copyright. Copyright is a type of intellectual property, and is different from patents, which cover inventions (specifically a physical thing or a process), and trademarks, which distinguish products and services from similar ones. And most likely, if you make a scientific Thing, you are automatically granted copyright. There are exceptions, though. If you work for the U.S. government, the Things you make as part of your job will automatically be in the public domain. And if you are the employee of a University or other institute, you may have signed away your rights in that flurry of paperwork you got when you were hired; in other words, your institution may own the copyright on Things you make, not you.
What do I do with my copyright?
Whatever you want.
The historical use of copyright goes something like this… I wrote a scientific paper and now Journal of Things (JoT) wants to publish it. I assign a license to JoT saying that they can use my writing to make a new Thing (a journal article) and that this journal article can be disseminated as JoT sees fit. Note that I retain the copyright to my actual writing, but JoT has copyright to the formatted, spiffed-up, published version. Now, let’s say someone else wants to use a figure from the published article. They now need JoT to assign a license to them for the use of that figure.
This model of assignment can work fine if the Thing you make is just used once or twice by others, or if you feel strongly about how your Thing is used and distributed. But otherwise, it can get cumbersome. Instead of (or in addition to) assigning licenses on a case-by-case basis, you can assign a general non-exclusive license that automatically allows people to use and disseminate your Things.
How do I assign one of these general non-exclusive licenses?
The first thing you have to do is pick one. And sadly, there are a lot of options for you out there. I really like the Choose A License site to get a sense of what the possibilities are. But if you just have time for a single blog post, here’s a quick run-down. Answer these questions:
- Are you willing to let your Thing be distributed to anyone who wants it, free of charge?
- Are you willing to let your Thing be modified into some other Thing by others? (e.g. If you take a picture that someone else wants to use, is it okay if they crop it differently or change the lighting or include it in a collage?)
- Are you willing to let your Thing and its modifications be distributed by someone else for commercial purposes? (i.e. They might make money off of it.)
- Do you require attribution? (i.e. You require that your name be attached to your Thing.)
- Do you want to make sure everyone who uses or distributes your Thing (or modifications of it) uses the same set of answers to these questions as you do?
This seems straightforward enough until you realize that your answers to these questions might have complicated ramifications. For example, if you decide you do not want your beautiful photo of a rail to be used for commercial purposes without your explicit permission, I would totally understand that. But what that means is that when I want to use it in my Ecology article, I probably still need to contact you for explicit permission. That’s because Ecology, although a publication of the non-profit Ecological Society of America, is published by Wiley, a for-profit publisher. This is, of course, a murky area, but none of us are lawyers, right? So I should ask permission. Now, if you had put an open license on that image that didn’t curtail commercial use, then I could have used it in my article without asking. Even within the Open Source community, there are arguments about which are the best licenses to use. (That’s why there are so many of them.)
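If it helps to see the questions above as a decision procedure, here is a rough sketch of how the answers might map onto Creative Commons license names. The function and parameter names are made up for illustration, and the mapping is my own simplification, not an official chooser. It assumes the standard Creative Commons elements: BY (attribution), NC (non-commercial), SA (share alike), and ND (no derivatives, which this post doesn’t otherwise discuss), plus CC0 as a public domain dedication.

```python
def suggest_cc_license(allow_distribution: bool,
                       allow_modification: bool,
                       allow_commercial: bool,
                       require_attribution: bool,
                       share_alike: bool) -> str:
    """Map answers to the five questions onto a license name (illustrative only)."""
    # If the Thing can't be freely distributed, no open license fits.
    if not allow_distribution:
        return "no open license (default copyright)"
    # No conditions at all: a public domain dedication.
    if (allow_modification and allow_commercial
            and not require_attribution and not share_alike):
        return "CC0"
    # Every current Creative Commons license includes attribution (BY),
    # so any other combination starts from BY, even if attribution
    # wasn't the condition you cared about.
    parts = ["BY"]
    if not allow_commercial:
        parts.append("NC")
    if not allow_modification:
        # ND forbids sharing modified versions, so SA would have nothing to apply to.
        parts.append("ND")
    elif share_alike:
        parts.append("SA")
    return "CC " + "-".join(parts)


# Free to share and modify, commercial use fine, attribution required:
print(suggest_cc_license(True, True, True, True, False))   # CC BY
# Same, but no commercial use and same-license required:
print(suggest_cc_license(True, True, False, True, True))   # CC BY-NC-SA
```

The sketch only names a license. It says nothing about the murky parts, like what actually counts as “commercial.”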
Ugh, this all sounds like a lot of effort. What if I just don’t do anything?
If you don’t do anything, you retain the strictest copyright allowable under law. In other words, if you don’t assign a general license to your Thing, then legally, it can’t be used, modified, or disseminated by anyone else without getting explicit permission from you.
Well, huh. I’d like to be more Open than that. What do you suggest?
Here’s where I’m at in my thinking of open licenses, though my thoughts may continue to evolve. For creative things I write, such as blog posts, scientific articles, and so forth, I usually retain full copyright, and don’t assign an open license.
For other media, such as photos, videos, and audio, I typically assign Creative Commons license CC BY. I used to care more about commercial use and so some of my stuff is licensed CC BY-NC. But as someone who’s been stymied by the NC (“non-commercial”) designation when trying to use something for not-for-profit purposes because there’s an awful gray area, I’ve given it up. If there is something that I think might have actual commercial value (such as our Snapshot Serengeti photos), I am more conservative with licensing and will slap on an NC. If anyone does want to use it for a commercial purpose, they can ask and I can issue a separate non-exclusive commercial license that provides me with some slice of the income (as royalties or a one-time payment).
I also used to be a fan of Creative Commons’ “share alike” (SA) restriction, e.g. CC BY-NC-SA, which forces people who use your Thing to use the same license as you. But I’ve found that such “copylefts” are severely limiting for reuse of material. For example, I am never going to be able to persuade a publisher — even a clearly non-profit one — to make a journal article CC BY-NC-SA, so if you give that license to your rail photo, I’m going to have to ask you for explicit permission if I want to use it in an article. Every. Single. Time. So for me, CC BY is where it’s at, unless I think my Thing has actual commercial value. It essentially mirrors what we do in academia already: reuse and distribute work with attribution.
For data, I make it truly Open. I assign it to the public domain, meaning that anyone can use it for any purpose, without attribution. I do this both because it aligns with standard academic practice and because I don’t want anything to get in the way of anyone using my data. (Note: please use my data! Of course, there are potential ramifications of doing so.)
I divide code into two types: code that I consider “end code” that is very specific to a particular scientific study and “general code” that might reasonably be expected to be built upon by others. An example of the former is the specific agent-based model I used for a paper on disease dynamics. And for this sort of code, I tend towards a CC BY license because it’s simple and easy and I don’t have much expectation of reuse. An example of the latter is an R package. For this sort of code, I lean towards GPL-compatible licenses to make sure that my code license meshes easily with the code licenses of others. And since I’m no longer a fan of copyleft, the MIT license works just fine most of the time. It essentially says, “go ahead and use my code as you like, but I’m not providing any guarantees that it’s any good.”
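Since I mean these as notes to come back to, here are the choices from this section as a quick lookup. This is just a sketch: the category keys and the helper name are labels I invented for illustration, and the values are my current preferences as described above, not recommendations.

```python
# Current licensing preferences, keyed by kind of Thing.
# Keys are hypothetical labels; values summarize the choices described above.
MY_LICENSE_CHOICES = {
    "writing": "full copyright (no open license)",
    "media": "CC BY",
    "media_with_commercial_value": "CC BY-NC",
    "data": "public domain",
    "end_code": "CC BY",
    "general_code": "MIT",
}


def license_for(kind: str) -> str:
    """Return the preferred license for a kind of Thing, or fail loudly."""
    try:
        return MY_LICENSE_CHOICES[kind]
    except KeyError:
        raise ValueError(f"unknown kind of Thing: {kind!r}") from None


print(license_for("general_code"))  # MIT
```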
Still seems complicated. Any other thoughts?
I have read a convincing argument (which I can’t find now, despite lots of searching; if you know it, can you send me the link?) that as academics we might reasonably put everything under a public domain or MIT license (which limits liability). The reasoning is essentially that (1) academic culture already provides for attribution by default; (2) there are lots of murky gray waters in the copyright code such that definitions may vary between people (e.g. my definition of “commercial” may be different than yours), meaning that it’s hard to know what people’s real intentions are when they choose an Open license; and (3) we aren’t prone to go around suing each other over copyright infringement. After all, copyright only really matters if you’re willing to enforce it. And that takes time and money and effort.
I’m still chewing on this argument.
And I’m happy to hear others. How do you license your scientific Things?