Legal and Ethical Issues of Social Media Collecting: Annotated Bibliography

Along with creating a tool to capture social media posts for research purposes, the Social Feed Manager team sought to identify key policy issues that should be considered before collecting or providing access to collections of social media. Seemantani Sharma, a recent graduate of George Washington University’s Law School, conducted legal and policy research as background for this project. This bibliography represents a part of her work with the Social Feed Manager project team. It provides a resource for those interested in using social media as a research tool and for cultural heritage institutions interested in creating collections of social media for use by future researchers. Its aim is to direct others to resources for understanding the legal and ethical issues they may encounter. This bibliography does not constitute legal advice.

In addition, the team has developed Building Social Media Archives: Collection Development Guidelines for guidance in thinking through a social media archiving program. We also encourage those interested in doing this work to look at the annotated bibliography by North Carolina State University Libraries, created in support of its Social Media Archives Toolkit. Finally, there is a growing literature within particular disciplines and the social media research community that is relevant to research and archiving of social media.

Articles, Books, and Web-Based Resources

  1. Society of American Archivists. “ALA-SAA Joint Statement of Access: Guidelines for Access to Original Research Materials,” revised August 2009. http://www2.archivists.org/statements/ala-saa-joint-statement-of-access-guidelines-for-access-to-original-research-materials-au
    This document provides guidelines for archivists when providing access to manuscript or archival materials. Although intended to address access to all original research materials held by an archive, it provides guidance that is especially pertinent to collecting social media, specifically sections 2a, 2c, 4d, and 4e.

  2. Alger, J. E. (2014, July 23). re: Tweet #452. Washington, DC: U.S. Copyright Office. https://topromotetheprogress.files.wordpress.com/2014/07/tweet_reg_denied.pdf
    A notice issued by the Copyright Office in response to a Twitter user’s request to register a tweet. The claim was refused on grounds that a tweet lacks a sufficient degree of original, copyrightable material.

  3. Allen, Eric. “Update on the Twitter Archive at the Library of Congress”, Library of Congress Blog, January 4, 2013. https://blogs.loc.gov/loc/2013/01/update-on-the-twitter-archive-at-the-library-of-congress/
    An update on the Library of Congress’ Twitter Archive. Highlights the status of the project and states its future course of action. A white paper linked at the bottom of the page clarifies LoC’s intention to make content within the Twitter archive available to viable researchers six months after it was posted to Twitter.

  4. Association of Internet Researchers, Ethical Decision-Making and Internet Research, 2012. http://aoir.org/reports/ethics2.pdf
    Critical guidelines, definitions, and processes for conducting research ethically on the Internet.

  5. Baker, A.E. “Ethical considerations in web 2.0 archives,” Student Research Journal, 1 no. 1 (2011): 1-14. http://scholarworks.sjsu.edu/slissrj/vol1/iss1/4
    In this paper, Baker delves into some of the pressing ethical issues in building an archive of ephemeral web-based materials. Using the example of the Library of Congress’ Twitter archive, she explores ethical issues in archiving web 2.0 content. Her paper considers issues such as the imperative to preserve the web-based historical record, the problem of collecting content from “blind-donors”, and possible ways of restricting access as solutions to these problems.

  6. Band, Jonathan. “A New Day for Website Archiving: Field v. Google and Parker v. Google.” Association of Research Libraries, 2006. http://www.arl.org/storage/documents/publications/band-web-archive-2006.pdf.
    This paper explores the implications that two district court cases, Field v. Google and Parker v. Google, could have for web archiving, focusing especially on copyright. Field established that it was fair use to create an exact duplicate of an image from a website. This was a clarification of a previous case, Kelly v. Arriba Soft, in which the court ruled that it was fair use to display only a thumbnail of an image, leaving users to go to the image owner’s website to retrieve the original image. The Field court only considered the volitional act of allowing access to content, not harvesting it. It did, however, state that by not setting a “no-archive” meta-tag, the content owner was providing an implicit license to harvest it. Parker v. Google built on Field by finding that Google’s automated web crawler removes the necessary element of volition, a human act.

  7. Behrnd-Klodt, Menzi. Navigating Legal Issues in Archives, Chicago: Society of American Archivists, 2008.
    This book examines various legal issues faced by American archivists.

  8. Eds. Behrnd-Klodt, Menzi and Peter J. Wosh. Privacy & Confidentiality Perspectives: Archivists & Archival Records, Chicago: Society of American Archivists, 2015.
    This is an anthology of selected essays intended to help cultural heritage institutions navigate the legal, ethical, administrative, and institutional issues they face in administering records that contain personal information.

  9. Breslawski, Tara M. “Privacy in Social Media: To Tweet or Not to Tweet?” Touro Law Review 29 no. 4 (2014): Article 16. http://digitalcommons.tourolaw.edu/lawreview/vol29/iss4/16.
    This paper asks whether it is reasonable to have an expectation of privacy in publicly posted social media content. Breslawski analyses two pertinent precedents, People v. Harris and Romano v. Steelcase. The author concludes that given the nature of social networking sites, it is counterintuitive to expect social media posts to remain private, even to governments.

  10. Cadavid, Jhonny Antonio Pabón C, Johnkhan Sathik Basha, Gandhimani Kaleeswaran. Legal and Technical Difficulties of Web Archival in Singapore. International Federation of Library Associations and Institutions July, 2013. http://library.ifla.org/217/1/198-cadavid-en.pdf.
    This paper analyses the legal, ethical, and technical issues of the web archiving initiative of the National Library of Singapore (NLS), and calls for clear and precise policies to direct Singapore’s web archiving programs. It identifies ethical and legal questions a web archiving program should be prepared to face, both when collecting and when providing access. It highlights some of the methods that Singapore’s National Library Board (NLB) and NLS have used to overcome a lack of legislation clarifying their right to archive the web, and explains the criteria under which an archived website is made accessible. There is a brief mention of NLS’ taxonomic categories, although it is not clear if these are used to facilitate discovery. The paper finishes with policy recommendations it believes NLB and NLS should implement.

  11. Chou, Sophie. “To scrape or not to scrape: technical and ethical challenges of collecting data off the web,” Storybench, April 4, 2016. http://www.storybench.org/to-scrape-or-not-to-scrape-the-technical-and-ethical-challenges-of-collecting-data-off-the-web/
    Chou poses a series of questions to answer when approaching a possible web-scraping endeavor.

  12. Clarida, Robert W. “Beware the Right of Publicity”, Graphic Artists Guild. https://graphicartistsguild.org/tools_resources/beware-the-right-of-publicity
    A blog post on the meaning of right of publicity, its exceptions and other ancillary issues. It serves as an excellent resource for non-lawyers to understand the different statutes, including state-by-state comparisons.

  13. “Code of Best Practices in Fair Use for Academic and Research Libraries.” Association of Research Libraries, January 2012. http://www.arl.org/storage/documents/publications/code-of-best-practices-fair-use.pdf
    This document identifies eight recurrent situations and describes the library community’s carefully derived consensus about how fair use rights should apply in each. The eighth principle addresses web archiving specifically: it notes that failing to archive websites and other material from the web would be a loss to scholarship, and states that archiving web-based material is fair use. A short list of suggested practices is included.

  14. Cohen, Noam. “Use My Photo? Not Without Permission”, The New York Times, October 1, 2007. http://www.nytimes.com/2007/10/01/technology/01link.html?_r=0.
    Alison Chang’s church youth counselor posted a photograph of her to his Flickr page. Virgin Mobile Australia then used the image for an advertisement. She was a minor at the time, and neither her consent nor her parents’ consent to use the photograph was requested or given. The article mentions that the photograph was covered by a Creative Commons license, something true for many Flickr images, and that Chang’s lawyer claims Creative Commons does not do enough to address commercial use of images.
    The article also lists a few other cases where Flickr images were sold or repurposed without the consent of either the photographer or the subject of the photograph.

  15. Costello, Kaitlin L, and Jason Priem. “Archiving Scholars’ Tweets.” Chicago: Society of American Archivists, April 2011. http://www2.archivists.org/sites/all/files/KCFinal.pdf
    This paper provides the results of a survey of scholars who were asked how they feel about their tweets being preserved and who they think should be responsible for preserving them. It provides a background for what to expect from content creators when building a Twitter archive. The survey results included stated reasons why an individual either supported, partially supported, or opposed the idea that their tweets should be archived. Those who opposed an outside organization archiving their tweets did so for a variety of reasons, including a fear of professional consequences or a sense that tweets are trivial and have no enduring value. Some believed their academic freedom was threatened by preserving their tweets. One of the respondents believed it was an ethical imperative to seek the express permission of a creator before preserving his or her tweets, and others believed only the tweets of public figures should be preserved.

  16. Daxton, R Stewart. “Can I Use This Photo I Found on Facebook? Applying Copyright Law and Fair Use Analysis to Photographs on Social Networking Sites Republished for News Reporting Purposes” Journal on Telecommunications and High Technology Law, 10 No. 1 (2012): 93-121. http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2440810
    This article applies the four-part fair use analysis to news media organizations’ use of photographs taken from social networking sites. The four factors to consider are: 1) the purpose and character of use; 2) the nature of the copyrighted work; 3) the amount and substantiality of the portion taken; and 4) the effect of the use upon the potential market. The paper goes through each factor and explains what a court would look for when applying each to a case. Although the paper focuses on newspapers reproducing photographs, organizations collecting and preserving social media will find it useful. Some relevant points include an explanation of why reproducing a factual work is more likely to be considered fair use than reproducing a creative work, and the importance given to how much of an author’s work is archived and what effect the collection would have on the market value of the work. The paper concludes that it would be difficult to imagine a court supporting a newspaper’s claim that reproducing a photograph taken from a social media site is covered by fair use.

  17. Digital Media Law Project. Using the Name or Likeness of Another, Digital Media Law Project, May 16, 2016. http://www.dmlp.org/legal-guide/using-name-or-likeness-another.
    This blog explains the legal standard for claims of misappropriation and of right of publicity. Despite being distinct, they are so similar that courts will often confuse them. Only human beings, and not corporations or other organizations, can sue for either. Misappropriation is described as a violation of an individual’s privacy, while right of publicity is described as the right to control the commercial value of one’s persona.
    The elements of a claim for unlawful use of a name or likeness are, however, the same for both. To prove liability, the injured party must show that: 1) a protected attribute, the definition of which varies from state to state, was used; 2) the use was for an exploitative purpose, which the blog explains differs depending on whether the case is one of misappropriation or of right of publicity; and 3) the use was made without consent.
    Even though there are some exceptions to legal liability for violating a person’s right of publicity, it is good practice to seek the consent of the person depicted in a photograph before using it. The same principle applies to using someone’s personal information in a blog post, particularly if done for commercial or promotional purposes.

  18. Georas, Chloe S. “Networked Memory Project: A Policy Thought Experiment for the Archiving of Social Networks by the Library of Congress of the United States.” Laws, 3 (2014): 469-508. http://www.mdpi.com/2075-471X/3/3/469
    This paper explores the challenges of archiving social media, and considers whether it is prudent to collect it at all. The author focuses on Facebook. The Library of Congress follows a policy of allowing individual political figures to opt in to having their Facebook page archived. This contrasts with the Library of Congress Twitter archive, which is also mentioned, including the Library’s as-yet unfulfilled intention to make tweets available for research six months after date of creation. She suggests the Library of Congress is ideally suited to serve as a publicly available archive of social networking. Georas opines that content creators, rather than platform owners, own copyright for anything they post to social media, which leaves the copyright status unclear should the creator delete his or her handle. Delving into the ethical and legal debates surrounding privacy on social networks, she suggests that privacy serves not only individual interests but also common, public, and collective purposes.

  19. Griffin, Jodie C. “Save the Tweets: Library Acquisition of Online Materials” AIPLA Quarterly Law Journal, 39 no. 2 (2011): 270-294. https://papers.ssrn.com/sol3/papers.cfm?abstract_id=1865425
    Calling the internet a “repository for public ideas,” Griffin believes that “[d]ocumenting online public conversations…in a searchable, stable format would be immensely valuable for historical and cultural purposes.” She says that libraries can make online discussions available for research, but that this requires the legal authority to do so. She suggests legislation be passed granting the right to preserve and give access to online works libraries deem important to scholarship, and that this right be given only to libraries as keepers of knowledge for the public good. This includes a recommendation that Congress amend Section 108 of the Copyright Act, which governs limitations and exceptions for reproduction and distribution by libraries.

  20. Henderson, Stephen E. “Expectations of Privacy in Social Media”, Mississippi College Law Review, 31 (2012): 227-247. http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2018425
    This paper traces the constitutional underpinnings of information privacy with a special focus on the distinction between public and private social media content. The author concludes that there can be no legitimate expectation of privacy in content posted to the public web, but that there is an expectation of privacy for email and instant messaging. The paper then considers content that might fall between these two poles. This information is divided into four categories: 1) subscriber information–identifying information the social media provider would require in order to provide any service; 2) transactional information–information which the social media provider requires to facilitate desired communications; 3) non-public communications, such as a post that has a limited target audience as opposed to the world at large, e.g. Facebook wall posts visible only to a limited number of friends and protected tweets; and 4) content that was once public and is now either deleted or no longer public. For each of these, the author considers relevant legal arguments to support the content creator’s reasonable expectation of privacy.

  21. Herzfeld, Oliver. “Are Website Terms of Use Enforceable?”, Forbes, January 22, 2013. http://www.forbes.com/sites/oliverherzfeld/2013/01/22/are-website-terms-of-use-enforceable/
    This article considers whether website terms of service are enforceable against users. The Second Circuit Court of Appeals’ decision in Schnabel v. Trilegiant Corp. held that the terms of use of a website were not binding on the plaintiffs when they were inconspicuous and there was no evidence that the offeree knew or should have known of the terms.

  22. Hilfer, Kyle-Beth. “Tweet Tweet: Can I Copyright That??”, WebCite, January 19, 2010. http://www.webcitation.org/5ovHc7j9z
    This blog post analyses whether tweets are copyrightable subject matter. It states that due to issues of length and originality, Twitter users face a challenge in getting copyright protection for their tweets. However, in exceptional cases, protection cannot be ruled out.

  23. Hirtle, Peter B., Emily Hudson, and Andrew T. Kenyon. Copyright and Cultural Institutions: Guidelines for Digitization for U.S. Libraries, Archives, and Museums. Ithaca: Cornell University Library, 2009. https://ecommons.cornell.edu/bitstream/handle/1813/14142/Hirtle-Copyright_final_RGB_lowres-cover1.pdf;jsessionid=66551B63E1852789DB8154F1775F7A51?sequence=2
    This book serves as background material for deciphering the various legal issues affecting cultural heritage institutions.

  24. Hookway, Nicholas. “Entering the Blogosphere: Some Strategies for Using Blogs in Social Research”, SAGE Publications, 8 no. 1: 91-113. http://qrj.sagepub.com/content/8/1/91.full.pdf+html
    Hookway’s article highlights how the blogosphere can expand the social researcher’s toolkit and some of the practical, theoretical and methodological issues that arise from it. The author considers ethical and legal challenges inherent in collecting and using this data. While he suggests contacting blog authors, this seems to come from a desire to find material relevant to his research rather than a desire to get content creators’ consent before using their material. In fact, in the Ethics of Blog Research section, he argues that blogs are firmly rooted in the public domain and the need for consent should be waived. Hookway also believes that researchers using blogs do not have to worry about copyright, as their work would be comfortably covered by fair use. He adds that bloggers would not suffer financial loss, as the material is used for research alone.

  25. Humphreys, Lee, Phillipa Gill, Balachander Krishnamurthy. How Much is Too Much? Privacy Issues on Twitter. http://www.cs.utoronto.ca/~phillipa/papers/ica10.pdf
    By analyzing the content of tweets, Humphreys et al. found that Twitter users have a propensity to write about themselves, and that about a quarter of tweets included time and location information. They conclude that this has the potential to infringe on a user’s privacy, especially if coupled with other kinds of publicly available information.

  26. Knutson, Alyssa N. “Proceed with Caution: How Digital Archives Have Been Left in the Dark.” Berkeley Technology Law Journal, 24 no. 1 (2009): 437-473. http://scholarship.law.berkeley.edu/btlj/vol24/iss1/17
    The author highlights the potential legal liabilities that digital archives may encounter in light of Internet Archive v. Shell, 505 F. Supp. 2d 755 (D. Colo. 2007), especially as the case was non-precedential. Knutson provides an overview of practices, including archives’ use of opt-out policies allowing content creators to request removal of their material from archives. The paper also mentions Healthcare Advocates, Inc. v. Harding, Earley, Follmer & Frailey, in which the Internet Archive was named as co-defendant for failing to respect a robots.txt request to not provide access. Knutson says it is uncertain whether courts are likely to apply the fair use exemption to copying and providing access to digital content, and suggests Congressional legislation would bring clarity to a gray area.

  27. Library of Congress. Privacy and Publicity Rights, The Library of Congress, http://lcweb2.loc.gov/ammem/copothr.html.
    This blog post explains how privacy and publicity rights are separate legal issues from copyright. Publicity rights cover an individual’s right to their own image; for example, a photographer may give a person permission to use one of their photographs for commercial purposes, but any individual in those photographs must also give permission to have their image used. The right to privacy concerns an individual’s right to be sure their “likeness [will] not be cast before the public eye without [their] consent.” Patrons desiring to use materials from LOC’s website bear responsibility for making individualized determinations on potential privacy and publicity rights in any materials they use.

  28. Library of Congress. “Twitter Gift Agreement”, Library of Congress Blog, April 2010. http://blogs.loc.gov/loc/files/2010/04/LOC-Twitter.pdf
    This is the Deed of Gift between the Library of Congress and Twitter, transferring to the Library the entire public collection of tweets from Twitter’s inception to the effective date of the agreement. Notable clauses include: 1) a requirement that researchers agree in writing not to use the collection for commercial purposes; and 2) an agreement by the Library to respect access restrictions such as robots.txt and to dispose of any portion of the collection considered inappropriate for retention.

  29. Lovrics, Catherine. “Copyright and Privacy Questions Around Your Public Tweets and the New Library of Congress Archive and Google Replay.” Slaw, April 27th, 2010. http://www.slaw.ca/2010/04/27/copyright-and-privacy-questions-around-your-public-tweets-and-the-new-library-of-congress-archive-and-google-replay/.
    A blog post covering the Library of Congress’ Twitter Archive and its potential privacy implications. Lovrics believes that Twitter’s non-personal nature, made explicit in the Terms of Service, means it is safe to assume that making tweets publicly available would not illegally violate anyone’s privacy. She argues, however, that changes in Twitter’s Terms of Service mean early adopters had stronger expectations of privacy.

  30. Mandal, Tissya. Copyright in Quotes, October 12, 2010, http://papers.ssrn.com/sol3/papers.cfm?abstract_id=1818985
    The paper considers whether or not quotes meet the de minimis non curat lex threshold to be considered copyrightable. The de minimis threshold, a standard that says the law does not concern itself with trifles, has not been consistently applied in copyright cases. From this, we can assume that the copyrightability of short works, such as a quote or an individual tweet, is not yet settled.

  31. Markham, Annette. “OKCupid data release fiasco,” Points. May 18, 2016. https://points.datasociety.net/okcupid-data-release-fiasco-ba0388348cd
    Markham describes how university educators and others can highlight and foster discussion of ethical aspects of social media research.

  32. Marshall, Catherine C., Frank M. Shipman. 2012. “On the Institutional Archiving of Social Media.” Paper presented at Joint Conference on Digital Libraries: #Preserving #Linking #Using #Sharing, Washington, D.C., June 10-14. http://research.microsoft.com/pubs/208008/jcdl12-loc-final.pdf
    Using responses from six different surveys of social media users, this paper summarizes public reaction to the Library of Congress’ decision to archive tweets. Among the surveys’ findings are that: 1) respondents feel that social media postings should be seen in context. As one respondent put it, “[e]ven though the comments are public, people making them think they will remain on that site and for the site members, not be archived”; and 2) respondents believe that content creators’ permission should be sought and their privacy respected, and any potential commercial value should be protected.

  33. Michael, Gabriel. “Can you Copyright a Tweet?”, To Promote the Progress?, July 30th, 2014. https://topromotetheprogress.wordpress.com/2014/07/30/can-you-copyright-a-tweet/
    The author describes his failed attempt to register a tweet with the Copyright Office. The tweet for which copyright protection was sought read, “Monkey bar fallacy: a bad person using something makes it bad. E.g., users of monkey bars include: children, TERRORISTS #tor.” Although Michael’s request to register the tweet was denied by the Copyright Office because the content “represents less than the required minimum amount of original literary…material”, he mentions that his tweet works as a haiku, a traditional form of Japanese poetry consisting of 17 syllables, leaving him to wonder if his tweet would be copyrightable if he had written it in a haiku format.

  34. Miller, Joe. “New York Times Ad Prompts Debate Over Twitter Copyright”, BBC News, January 7th, 2014. http://www.bbc.com/news/technology-25636009#?utm_source=twitterfeed&utm_medium=twitter
    This article discusses a case where a full-page advertisement in the New York Times displayed an edited version of a tweet written by film critic A.O. Scott to promote the movie Inside Llewyn Davis. Scott said his permission was sought but not given. A copyright lawyer interviewed for the article says Scott would hold copyright on his tweet, but his rights in this case are still an uncertain area of the law.

  35. Minow, Mary. “Copyright Protection for Short Phrases”, Copyright & Fair Use, Stanford University Libraries, September 9, 2003. http://fairuse.stanford.edu/2003/09/09/copyright_protection_for_short/
    This blog post is an excellent source for information and analysis on the copyrightability of short phrases. It summarizes many cases with commentary.

  36. Minow, Mary. “How I learned to Love Fair Use?” Copyright & Fair Use, Stanford University Libraries, July 6, 2003. http://fairuse.stanford.edu/2003/07/06/how_i_learned_to_love_fair_use/
    An informational blog post providing librarians with the tools necessary to assess copyright liability for their institutions. Using a chart, it breaks down the four factors that will determine if a copy was made under fair use.

  37. North Carolina State University Libraries. Legal and Ethical Implications, North Carolina State University Libraries. https://www.lib.ncsu.edu/social-media-archives-toolkit/legal.
    North Carolina State University Libraries undertook a project to give researchers the necessary tools to harvest and preserve social media collections. This blog provides information on both the legal and ethical challenges faced by institutions and individuals seeking to collect social media.

  38. Nissenbaum, Helen. “Privacy as Contextual Integrity” Washington Law Review, 79:1 (Feb. 2004): 119-158 http://digital.law.washington.edu/dspace-law/bitstream/handle/1773.1/61/volume79.pdf?sequence=1&isAllowed=y
    Nissenbaum’s article presents a theory that conceptions of privacy evolve over time and must be seen in context. She uses the example of various parties who, for differing reasons, object to information about them that is already publicly available and/or not sensitive in nature being made more accessible by being posted to a digital, internet-accessible medium. She wonders why they would object to simply improving the efficiency of sharing this kind of information.
    She identifies three independent principles of privacy: 1) protection against government agents who overzealously collect; 2) protection against dissemination of intimate, sensitive, or confidential information; and 3) curtailing intrusions into spaces or spheres deemed private or personal.
    When the three principles are applied together, one arrives at what she calls contextual integrity, the central tenet of which is that there are no arenas of life not governed by two norms of information flow. She identifies these as: 1) norms of appropriateness, which dictate what types of information are appropriate to share about a person in any given situation; and 2) norms of distribution, which dictate who can share information about another person, who they can share it with, and what they can share.
    From this, Nissenbaum concludes that whether or not a specific action is a violation of someone’s privacy is a function of many variables.

  39. North, Stephanie Teebagy, “TwitterRight: Finding Protection in 140 Characters or Less”, Journal of High Technology Law, 11 (2011).
    This paper provides an analysis of the copyrightability of tweets. North considers the copyrightability of short phrases, e.g. haikus, and how these could apply to tweets. Relying on Religious Technology Center v. Lerma, 1996 U.S. Dist. LEXIS 15454, where the United States District Court for the Eastern District of Virginia held that poems, haikus, and musical scores were copyrightable, the author infers that a sufficiently creative Twitter status update would also be protectable. The author claims the legal doctrine of scènes à faire, which holds that certain elements of a creative work cannot be copyrighted if they are customary elements of the work’s genre, could be an important impediment to the copyrightability of a tweet. To overcome this, she suggests that a Twitter user seeking protection for their tweets should register them as a compilation.

  40. Ohio Library Council. Implications of Right of Publicity on Library Activities. http://www.olc.org/pdf/VorysRightOfPublicityLibraryActivities112408.pdf.
    An article on right of publicity under Ohio law. Noteworthy points include: 1) the right of publicity may extend to libraries’ use of an individual’s persona, even though the use is not commercial in nature; and 2) there are safe harbor provisions under which libraries are permitted to use a persona for a commercial purpose without obtaining the written consent of the individual.

  41. O’Keeffe, Hope. Legal Issues in Building Social Media Collections, Association of Research Libraries, May 2011. https://web.archive.org/web/20150930105944/http://www.arl.org/storage/documents/publications/mm11sp-okeeffe.pdf.
    This presentation highlights what O’Keeffe feels are the major legal issues in building social media collections. It refers to case studies involving traditional web archiving to show how similar problems have been addressed in the past. At the end of the document is a decision tree for determining if a work can be made accessible, as well as a risk assessment checklist and tips for reducing risk.

  42. Peet, Lisa. “Documenting the Now Builds Social Media Archive”, Library Journal, May 2, 2016. http://lj.libraryjournal.com/2016/05/academic-libraries/documenting-the-now-builds-social-media-archive/
    A blog post about the DocNow project, a collaboration by three universities to collect, preserve, and provide access to tweets chronicling historically significant current events, particularly issues related to social justice. The post highlights how the project principals are seeking to practice ethical, conscientious collection. They are concerned not only with respecting copyright and the platform’s Terms of Service, but also with ensuring content creators’ personal security. Twitter has proven to be a valuable tool for coordinating protests worldwide; as such, individuals’ tweets may later be used to identify them for reprisal. The ability for individuals to opt out of having their tweets in the archive is suggested as a solution.

  43. Press Trust of India. “Twitter Starts Deleting Tweets Of Stolen Jokes Over Copyright Infringement”, The Economic Times, Jul 27, 2015. http://articles.economictimes.indiatimes.com/2015-07-27/news/64919001_1_twitter-jokes-tweets-copyright-infringement
    This article mentions that Twitter has started to delete tweets containing jokes copied from another user, with copyright infringement given as the reason.

  44. Ray, David M., The Copyright Implications of Web Archiving and Caching, Syracuse Journal of Science and Technology Law, (Spring 2006) http://jost.syr.edu/wp-content/uploads/the-copyright-implications-of-web-archiving-and-caching.pdf.
    The paper delves into copyright implications of archiving and caching of websites. Works published on the Internet are fully protected under the Copyright Act, subjecting them to the same qualifications and limitations applicable to works fixed in other media; therefore, any archive of content from the web must fall under fair use if it is to be legal. The author believes the Internet Archive’s work falls under fair use.
    The previously mentioned Field v. Google and Kelly v. Arriba Soft Corp. cases are considered. The author mentions that the court in Field found that the plaintiff was acting in bad faith, as he admitted he was aware of Google’s caching practice at the time he allowed his website to be captured. Other examples are used to show support for an opt-out recourse that lets people remove their content from an archive or cache.

  45. Research Library Issues 279 (June, 2012). http://publications.arl.org/rli279/
    In this special edition of Research Library Issues, every article highlights the legal and contractual impediments cultural heritage institutions must consider before undertaking large-scale digitization projects. In the opening article, Peter B. Hirtle, Anne R. Kenney, and Judy Ruttenberg survey recent literature that may help institutions navigate potential legal issues. In the final article, Kevin L. Smith offers a four-fold strategy for evaluating risk. He says institutions should 1) consider that they may hold a lot of material in the public domain; 2) seek the permission of people who are most likely to object to public display of material; 3) have a takedown policy in place; and 4) recognize that many reproductions of many collections will fall under fair use.
    In between the two articles are three model agreements that an institution can use to cover digitization of collections: two Deeds of Gift and a digitization agreement.

  46. Rustad, Michael L. “Copyrights in Cyberspace: A Roundup of Recent Cases”, Suffolk University Journal of High Technology Law, 12 (2011): 106-160. http://papers.ssrn.com/sol3/papers.cfm?abstract_id=1992104
    The paper is a survey of internet-related copyright cases heard between 1990 and 2010, from the first case to mention the word internet, through the passage of the Digital Millennium Copyright Act, to the creation of social media. The author notes that, as of the paper’s publication, no court has ruled on the copyrightability of a tweet, although he mentions Agence Fr. Presse v. Morel, in which the court ruled on the copyrightability of a photograph embedded in a tweet. In Rustad’s view, the next wave of litigation will focus on whether or not tweets are copyrightable.

  47. Scola, Nancy. “Library of Congress’s Twitter Archive Is a Huge #Fail”, Politico, July 11, 2015. http://www.politico.com/story/2015/07/library-of-congress-twitter-archive-119698.html
    More than five years after it was first announced, the Library of Congress’ Twitter Archive is in limbo, with nothing available to researchers. The blog post offers no definitive reason why the tweets have not been released, but suggests that technical difficulties in handling half a trillion tweets have made it impossible for LOC to allow access.

  48. Stim, Rich. “Measuring Fair Use: The Four Factors”, Stanford University Libraries. http://fairuse.stanford.edu/overview/fair-use/four-factors/
    This article summarizes the four-factor test applied by courts to determine whether the fair use exception applies.

  49. Taylor, Nicholas. “Web and Twitter Archiving at the Library of Congress”, Web Archive Globalization Workshop, July 16, 2011. http://www.slideshare.net/nullhandle/web-and-twitter-archiving-at-the-library-of-congress.
    Nicholas Taylor, a member of the Library of Congress’ Web Archiving Team at the time, summarizes and presents the challenges in LOC’s web and Twitter archiving program in this presentation. The growth of Twitter use from its first tweet in 2006 through 2011 presents a sustainability challenge, and Taylor wonders what kind of access LOC will be able to provide.
    The section covering the web archive program mentions the technical and legal challenges LOC has faced in that area, as well.

  50. “Twagiarism”, MacMillan Dictionary, 2011. http://www.macmillandictionary.com/dictionary/british/twagiarism
    Dictionary listing for the word twagiarism, defined as plagiarism in or of a tweet.

  51. Trotter, J.K. “Twitter Just Killed Politwoops?” Gawker, June 3, 2015. http://tktk.gawker.com/twitter-just-killed-politwoops-1708842376.
    This article discusses Twitter’s decision to deny Politwoops - a site that surfaced politicians’ deleted tweets - access to its API. A statement by Twitter clarifies that access was denied because Politwoops violated the developer agreement. Since this article was written, Politwoops’ access has been restored.

  52. United States Copyright Office. Circular 34: Copyright Protection Not Available for Names, Titles, or Short Phrases, Washington, D.C.: U.S. Copyright Office, revised 2015. http://copyright.gov/circs/circ34.pdf
    This is a Circular issued by the Copyright Office stating that there is no copyright protection available for names, titles or short phrases even if novel or distinctive.

  53. Weller, Katrin, Axel Bruns, Jean Burgess, Merja Mahrt, Cornelius Puschmann. “Twitter and Society”, Digital Formations 89 (2014). http://nancybaym.com/TwitterandSociety.pdf.
    An anthology of essays describing how Twitter has been used as a research tool since its launch in 2006. Michael Beurskens, in Chapter 10, considers legal questions that come with using tweets for research. He says that while ethics and the law generally agree, it is not necessarily so, and laws are not uniform among states and nations. This makes a complete legal analysis impossible. He believes the ethical requirements may be stricter than the legal ones. For example, he believes researchers should make a concerted effort to anonymize any datasets they share.
    Beurskens also says that tweets may not be copyrightable, as the majority of them lack both originality and “sweat of the brow”, but that the length of a tweet would be inconsequential in determining if it should be protected. He thinks reuse would be allowed without worrying about violating intellectual property law.
    In Chapter 13, Michael Zimmer and Nicholas Proferes delve into potential privacy concerns regarding personal or sensitive information Twitter users may have stored or shared. In contrast to Facebook and Google+, Twitter offers a simple binary privacy control: a user’s activity is either public to everyone or restricted. Up to 90% of Twitter users keep the default settings, which allow public access. The authors provide use cases that suggest that users lack a sufficient understanding of how public tweets actually are. Even when users restrict access to their tweets, leakages can occur: users who have been granted access to restricted accounts can easily retweet protected tweets by copying and pasting them into their own, unprotected feeds.

  54. Wolfson, Stephen. “The Library of Congress Twitter Archive”, Tarlton Library News Blog, January 9, 2013. https://sites.utexas.edu/Tarlton-library-news/2013/01/the-library-of-congress-twitter-archive/
    A blog post on LOC’s Twitter Archive and its still uncertain future. The author notes that, as of 2013, the Twitter archive contains 133 terabytes of data.

  55. Zansberg, Steven D. and Janna K. Fischer. Privacy Expectations in Online Social Media - An Emerging Generational Divide? Levine, Sullivan, Koch & Schulz, L.L.P, http://www.lskslaw.com/documents/EvolvingPrivacyExpectations(00458267).pdf.
    This article examines the role of social norms, customs, and mores in determining users’ expectations of privacy, especially in relation to social media. It begins with an example of the difference in how Americans and Europeans view privacy, but the authors assert that expectations of privacy vary not just by national boundaries, but by age, as well. Younger generations’ view of privacy is being altered by their prolific use of social media. The authors consider a few recent cases to shed light on how American courts have treated privacy concerns related to social media. In Moreno v. Hanford Sentinel, the court found that republishing a poem posted to MySpace did not violate the privacy of the author even though she expected it to be seen only by her close friends, as her “potential audience” was vast. In Romano v. Steelcase, the court held that a defendant could be given access to the plaintiff’s private Facebook page, as the “very nature and purpose” of the site was to share information. In contrast with Romano, the court in Crispin v. Christian Audigier, Inc. found that the plaintiff’s private social media postings may not be subject to a civil subpoena, and remanded the case to a trial court to answer this question. In Pietrylo v. Hillstone Restaurant Group and Konop v. Hawaiian Airlines, Inc., both courts found that a manager had acted inappropriately by accessing an employee-only messageboard without permission from the messageboard creators.
    The authors believe that the expectations of “digital natives”, i.e. younger generations of people who have grown up using social media, differ from those of “digital immigrants” who come to social media later in life. While the latter have a binary view of privacy on the internet as either complete secrecy or no privacy, younger people expect that whatever privacy settings they request will be respected. The article makes the case that as the “digital native” generation matures, courts may rule in support of a stronger defense of a limited expectation of privacy for social media posts.

  56. Zimmer, Michael. “Open Questions about Library of Congress Archiving Twitter Streams”, michaelzimmer.org, April 14, 2010. http://www.michaelzimmer.org/2010/04/14/open-questions-about-library-of-congress-archiving-twitter-streams/
    This blog post by Michael Zimmer raises several questions about the Library of Congress’ Twitter archive and its threat to the privacy of Twitter’s users. Some questions the author raises are: 1) will user profile information and follow/follower lists, including any historical changes, also be archived and made accessible? 2) is geolocation included in the capture? 3) will users have the ability to remove deleted (unwanted) tweets from the archive?

  57. Zimmer, Michael. “The Twitter Archive at the Library of Congress: Challenges for Information Practice and Information Policy”, firstmonday.org, July 6, 2015. http://firstmonday.org/ojs/index.php/fm/article/view/5619/4653
    Zimmer’s paper delves into technical and policy challenges the Library of Congress will face in building a Twitter archive. He identifies three main areas where LOC will be forced to make difficult policy decisions. First, the agreement between Twitter and LOC allows access only by “bona fide” researchers, a category that will have to be defined. Though the reason for this restriction appears to be the prevention of commercial reuse of tweets, any restriction on open access will be controversial. Second, the agreement grants the Library the ability to “dispose” of material in the archive it considers “inappropriate for retention”, but does not specify how such a determination would be made, how disposal would take place, and how, if at all, such decisions and actions would be recorded. Third, Zimmer cites research showing that between 40% and 50% of all tweets include information about the author. When made accessible, this generates threats to privacy. He notes that LOC’s response when this question is raised has been to say that the archive only includes tweets already available on the web, an example of what Zimmer calls a “false dichotomy of privacy”: the idea that something is either entirely private or entirely public. He says LOC should acknowledge the contextual norms a Twitter user might expect their tweet to follow after it has been sent.

  58. Zimmer, Michael. “How Your Private Tweets Might Be Included in the Library of Congress Public Archive”, michaelzimmer.org, April 14, 2010. http://www.michaelzimmer.org/2010/04/14/how-your-private-tweets-might-be-included-in-the-library-of-congress-public-archive/.
    In this post, Zimmer explains why he believes LOC’s Twitter archive raises privacy concerns. In rejecting the idea that the archive presents no privacy concerns because the tweets are already publicly available on the web, he invokes Nissenbaum’s theory of contextual integrity, which is covered elsewhere in this bibliography.

Case Law

The following cases relate to social media, copyright, fair use, privacy, or other topics of relevance to social media collecting.

  1. American Geophysical Union v. Texaco Inc., 60 F.3d 913 (1994). Date of Decision : October 28, 1994
    In this case, the court held that a for-profit organization’s making of unauthorized copies of copyrighted materials for use by company employees was not fair use. The court opined that even though the copies were used for a socially beneficial purpose, the research was for commercial gain and thus not protected by fair use.

  2. Agence Fr. Presse v. Morel, 934 F. Supp. 2d 584 (S.D.N.Y. 2013). Date of Decision : May 21, 2013
    In this case, the United States District Court for the Southern District of New York ruled in favor of photographer Daniel Morel, clarifying that media houses could not assume that photographs he shared via Twitter were rights-free and could be used as though they were in the public domain. While the copyrightability of tweets has never been challenged in court, this is the first case where the copyrightability of a photograph embedded in a tweet has been adjudicated.

  3. Chang v. Virgin Mobile USA, 2009 U.S. Dist. LEXIS 3051. Date of Decision : January 16, 2009
    Virgin Mobile Australia obtained a photograph of Alison Chang from Flickr, where it was posted with a Creative Commons Attribution License. This gave Virgin Mobile permission to use the image in a commercial setting as long as the photographer who took the image was attributed. Virgin Mobile used the photograph in an advertising campaign promoting its free text messaging and other mobile services without seeking Chang’s or her parents’ permission.
    In 2007, Chang’s parents sued Virgin Mobile in a state court in Texas for misappropriating Chang’s likeness and violating her right of publicity. The case was subsequently dismissed for lack of personal jurisdiction. A separate suit brought against Creative Commons was also dismissed.

  4. Field v. Google, 412 F. Supp. 2d 1106. Date of Decision : January 12, 2006
    The court held that Google was not required to seek consent of copyright holders before allowing access to their works via cache, in part because the optional “no-archive” tag gave copyright holders a means to prevent this access.

  5. Graphic Design Marketing Inc v. Xtreme Enterprises Inc., 2011 U.S. Dist. LEXIS 57486. Date of Decision : March 1, 2011
    The United States District Court for the Eastern District of Wisconsin ruled that copyrightability of a very short textual work depended upon its creativity. The word “STICKERS” was denied copyright protection on grounds that it was too common, too short, and too general to be considered copyrightable subject matter.

  6. Hoepker v. Kruger, 200 F. Supp. 2d 340 (S.D.N.Y. 2002). Date of Decision : May 3, 2002
    In this case, the artist Barbara Kruger created an untitled work of art incorporating a photograph taken by Charlotte Dabny. The Court held that Dabny could not recover for misappropriation since Kruger had added sufficient transformative elements.

  7. Kelly v. Arriba Soft Corp., 336 F.3d 811. Date of Decision : July 7, 2003
    In this case, a search engine was sued over inclusion of thumbnail images from a site it indexed. The United States Court of Appeals for the Ninth Circuit found that, by improving access to information on the internet, the use of thumbnails in search engine results was transformative fair use.

  8. Moreno v. Hanford Sentinel, 172 Cal. App. 4th 1125. Date of Decision : April 2, 2009
    A woman who lived in Coalinga, California posted a poem titled “An Ode to Coalinga” to her MySpace page. It was derisive of her hometown and its residents. She deleted the post within six days of posting it. During the period it was posted, the principal of Coalinga High School discovered it and sent it to the editor of a local newspaper. It was published with the author’s last name appended. Following this, the author and her family received death threats, prompting them to move out of Coalinga. The Court held that the principal did not invade the author’s privacy by sharing the post with the newspaper, as posting it to MySpace diminished her expectation of privacy.

  9. People v. Harris, 945 N.Y.S.2d 505 (2012). Date of Decision : April 20, 2012
    The court, in deciding a case of disorderly conduct against Malcolm Harris, sent a subpoena duces tecum to Twitter seeking Harris’s account information and tweets, as it was believed they were relevant to the case. Mr. Harris, with support from Twitter, initially opposed the subpoena. The Court held that there was no reasonable expectation of privacy in a publicly posted tweet, and that even when deleted, a tweet could be discovered through a variety of tools, including Untweetable, Tweleted, and Politwoops. Ultimately, Twitter complied with the subpoena.

  10. Religious Technology Center v. Lerma, 1996 U.S. Dist. LEXIS 15454. Date of Decision : October 4, 1996
    In this case, the United States District Court for the Eastern District of Virginia held that poems, haikus, and musical scores were copyrightable subject matter. Haiku is a very short form of Japanese poetry consisting of three lines, easily within the 140 character limit of a single tweet.

  11. Romano v. Steelcase, 30 Misc. 3d 426. Date of Decision : September 21, 2010
    In this case, the Supreme Court of New York held that when the plaintiff created her Facebook and MySpace accounts, she consented to her personal information being shared with others, even when she used privacy settings to limit access to her accounts. The very nature and purpose of social networking sites prevented her from having a reasonable expectation of privacy.

  12. Salinger v. Random House, Inc, 811 F.2d 90; 1987 U.S. App. LEXIS 1554. Date of Decision : January 29, 1987
    In this case, the United States Court of Appeals for the Second Circuit held that though ordinary phrases may be quoted without fear of infringement, a copier may not quote or paraphrase the sequence of creative expression that includes such phrases.

  13. Stern v. Does, 978 F. Supp. 2d 1031 (C.D. Cal. 2011). Date of Decision : February 10, 2011
    This case concerned the copyrightability of an email sent to a listserv. The court held that while “the amount of creative input by the author required to meet the originality standard is low, it is not negligible”. The “vast majority of works make the grade quite easily, as they possess some creative spark, ‘no matter how crude, humble or obvious’ it might be.” Even single sentences can contain enough creativity to meet the standard.

  14. Vanginderen v. Cornell University, 2009 U.S. App. LEXIS 26919 (9th Cir. Cal., Dec. 10, 2009). Date of Decision : December 10, 2009
    Kevin Vanginderen, a Cornell graduate, sued Cornell University for libel and publication of private facts after the university posted digitized issues of the Cornell Chronicle to the web. In one issue, the newspaper reported that Vanginderen had been charged with third degree burglary in connection with incidents on campus. The Court held that there was nothing illegal in Cornell’s act as Vanginderen’s record was publicly available and the article dealt with a matter of legitimate public concern.

  15. Yath v. Fairview Clinics, 767 N.W.2d 34, 43 (Minn. Ct. App. 2009). Date of Decision : June 23, 2009
    The Court of Appeals of Minnesota held that posting of another person’s private, personal medical information on a social media website constituted publicity for the purposes of an invasion of privacy action.

