Postsecondary Learning

Internet Archive Hopes to Help Libraries Make Available Books Once Thought Trapped By Copyright

By Jennifer Howard     Oct 19, 2017

As libraries work to scan their collections, they tread carefully around books published after 1923 and likely to still be under copyright. But researchers at the Internet Archive and Tulane University say they’ve found a legal way to scan and make available many in-copyright works published between 1923 and 1941, which will come as welcome news to those who want to dive into a digital copy of a book like In Love With a T-Man, a 1937 pulp novel by Rob Eden.

Nothing frustrates researchers and librarians like material they know exists but can’t be easily accessed. That’s true for thousands of works like In Love With a T-Man, which happens to be available to buy on Amazon in a used hardcover edition but has been, until now, out of reach digitally for library patrons.

“We hope this will encourage libraries that have been reticent to scan beyond 1923 to start mass scanning their books and other works, at least up to 1942,” wrote Brewster Kahle, the founder and digital librarian of the Internet Archive, in a blog post describing how the group is “liberating” works that are otherwise off-limits.

Kahle and Elizabeth Townsend Gard, a Tulane law professor, are calling attention to Section 108(h) of the U.S. copyright code, a provision that has received little public attention. As the Internet Archive understands the provision, nonprofit educational institutions like libraries and archives are allowed “to reproduce, distribute, display and publicly perform a work if it meets the criteria of: a published work in the last twenty years of copyright, and after conducting a reasonable investigation, no commercial exploitation or copy at a reasonable price could be found.”

Using this provision, the Internet Archive has created The Sonny Bono Memorial Collection, named for the California representative who co-sponsored a 1998 bill that extended the copyright term on many works that would otherwise now be in the public domain. The Sonny Bono Act, as it’s sometimes called, delighted a lot of rights holders and displeased public-domain advocates, who challenged it in court and lost.

Kahle and his colleagues are calling the Sonny Bono Collection a “Last Twenty” collection because it contains works identified as being in the last 20 years of their copyright term and therefore eligible for public access of the kind libraries provide. If Kahle et al. prove they can create such collections efficiently, and on a large enough scale, their approach could put many once hard-to-get works in the hands of people who’d like to use them—and encourage libraries to take a chance on digitizing more copyrighted content from the post-1923 era. That could especially help students and professors looking to use such works in research and teaching.

Nothing about copyright is ever easy, though, and the “Last Twenty” approach has its risks, and its skeptics.

Making sure a work meets all the requirements takes time, and depends on accurate data, for instance. And who decides what counts as a reasonable investigation or reasonable price?

“Doing the work to do it right is hard, and doing it at scale is really hard,” says Michael Wolfe, scholarly communications officer at the Shields Library at the University of California at Davis. “I have a healthy dose of skepticism.”

According to Wolfe, libraries have already been taking advantage of 108(h) but in a limited way. “Mostly it’s pretty quiet,” he says. “In nine out of ten cases, you’re working with special collections or some other item unique to the library, and you do it on an ad-hoc basis.”

Checking Copyright Status

That’s where Tulane’s Elizabeth Townsend Gard comes in. The key, she thinks, is simplifying the copyright-vetting process for libraries, museums and individual users. She sees 108(h) as a means to achieve what she calls Library Public Domain.

She and her husband, Ron Gard, founded a company called Limited Times that does the legwork to determine the copyright status of various works. They built a software system, The Durationator, which includes the details of copyright laws from around the world and can deliver answers in minutes rather than the hours or days it might take a human to find them. Humans still need to be on hand to check results and troubleshoot as needed.

“It turns out there’s a lot of weird questions that come up,” Townsend Gard says. The substitution of a comma for a colon in a book’s title can be enough to skew the results. “That the data is that picky is a problem.”

Tulane provided some $200,000 in support and a steady stream of graduate students to help get the enterprise running. Some of those students worked with the Internet Archive on the “Last Twenty” project. To give themselves the most freedom, Townsend Gard and her husband have kept the company a limited for-profit enterprise, but they say they’re not doing it for the money. Townsend Gard has vivid memories of being a graduate student in history trying to figure out if she could use works from the 1920s, ’30s, and ’40s for her dissertation on narratives about World War I.

“For us it’s a social mission of making things more accessible,” she says. The goal remains “how to help people with copyright in a really affordable way.”

Working with the Internet Archive on the “Last Twenty” project represents the culmination of a decade’s worth of work. “It took a long time to think through the problems, it took us a long time to gather the laws, and it took us a long time to code it,” Townsend Gard says. The software can handle requests for all types of potentially copyrighted material, including text, photographs, and recordings. “We’re up to the challenge,” she says.

Limited Times did testing with a hundred institutions in the spring, according to Townsend Gard, and is now offering subscription packages as well as one-off services. They’ve been working closely with several major institutions, including the Frick Art Reference Library of the Frick Collection, and are signing up other institutional clients. “We’re a little afraid of too many people saying yes, because we’re still in the early stages,” she says.

Some experts say that libraries are better off spending their limited resources working on scanning materials that are in the public domain. One of them is Peter B. Hirtle, a longtime copyright policy advisor for Cornell University Library. (Hirtle spelled out the tortuous requirements of copyright in a 2012 article, “When Is 1923 Going to Arrive and Other Complications of the U.S. Public Domain.”) He’s now an alumni fellow of the Berkman Klein Center for Internet and Society at Harvard University.

Still, Hirtle says, the Internet Archive deserves credit for putting itself on the line in a way that libraries might not be willing or able to. Making a mistake about copyright status could be punished with a $30,000 statutory fine. Even if that’s an unlikely outcome, it is enough to give many institutions pause. The Internet Archive “is willing to absorb more risk than libraries can,” he says. “IA should be praised for taking on a challenging project, but I would not fault other librarians for having passed on it.”

Lila Bailey, policy counsel for the Internet Archive, says that the group is taking “active steps” to avoid interfering with anyone’s commercial interest in the works they scan and add to the Sonny Bono Memorial Collection. “The statute only calls for a ‘reasonable’ search,” she says. “It does not have to be a perfect search.” Rights holders or readers are free to flag works erroneously included.

Copyright status can shift over time, Bailey says, and “we’re going to have to stay vigilant about that.”

One potentially affected group has yet to weigh in on the “Last Twenty” approach: publishers. Attempts to get around copyright can make them nervous, and it remains to be seen whether any will object.

To Gita Devi Manaktala, editorial director of MIT Press, the Internet Archive’s approach makes sense. “I really don’t think this is a project that should concern a lot of publishers,” Manaktala says. “If the works are not currently available and they haven’t made the investment to bring them back into print, I don’t see why they would object.”

For libraries, she says, “this gives them some cover, and it encourages what seems to be a fair use. It’s just labor—checking the rights and doing the work.”
