Page 218 - A Study on the Role of UGC Platforms in Copyright Law：An Intermediary-oriented Approach


Chapter 7 Platform Users’ Entitlement to UGCs: Human Use and Web Scraping

Despite the negative consequences of web scraping, it would be unreasonable for the
law to allow a UGC platform to prohibit web scraping of content that is not owned by the
platform and that is publicly available to human browsers. What is more, web scraping is
conducive to public welfare. By collecting and aggregating all information in one place, web
crawlers allow many users to find relevant information at a single location instead of having
to log into multiple websites. One-stop shopping significantly reduces customers’ cost of
searching. The search engines that underpin the Internet world today are essentially scrapers
that link a user to relevant webpage results according to the search terms a user has entered.
Considering the public availability of individual UGCs, a general ban on web scraping would
stifle freedom of speech, raise fears of an information monopoly and diminish competition
in UGC platform markets in the context of big data. Nevertheless, due to the significant
value of UGC databases and UGC platforms’ huge investment in obtaining and presenting
them, UGC platforms should have some legitimate interests in UGC databases.

Considering the intricate balance between a UGC platform’s interest in recapturing
the investment in obtaining and presenting the UGC database, the web scraper’s interest in
freely participating in competition with publicly available UGC, and the public’s interest
in gaining free information to save search costs, the line between the lawful and unlawful
web scraping of UGC databases has become ambiguous. The legal framework that governs
the web scraping of UGC databases remains in flux. The following account
explores the existing efforts to construct a boundary for web scraping:[61] copyright, sui generis
database rights, trespass law, anti-unfair competition law and contract.

7.3.2 Copyright

Two types of copyright claims are available to curtail web scraping. The first is the UGC
creator’s copyright claim to individual UGC, which is usually brought to combat human users’
infringing use of the UGC. For the sake of efficiency, the UGC platform can serve as the agent
to file lawsuits on behalf of the UGC creator, as discussed in Section 6.2.4. However, the
UGC platform’s right to sue is derived from UGC creators’ copyright claim. The second type
of copyright claim is the UGC platform’s claim to the UGC database as a collective work if
the UGC database meets the originality requirement. The UGC platform has an independent
copyright claim in the UGC database because it arranges or selects individual UGC to create a
collective work. Collective works have been included in the list of copyrightable subject matter
in many jurisdictions. As the TRIPS Agreement states, the copyright protection of collective
works rests on the original selection or arrangement of content, not the content itself.[62]

61 Jeffrey Kenneth Hirschey, ‘Symbiotic Relationships: Pragmatic Acceptance of Data Scraping’ (2014) 29 Berkeley Technology Law
Journal 897, 926; James Snell and Nicola Menaldo, ‘Web Scraping in an Era of Big Data 2.0’ (Bloomberg BNA) 1
<https://www.perkinscoie.com/images/content/1/5/v2/156775/Snell-web-scraping-BNAI.pdf> accessed 17 May 2019; Riley (n 6) 246.
62 TRIPS, art 10(2), ‘Compilations of data or other material, whether in machine readable or other form, which by reason of the
selection or arrangement of their contents constitute intellectual creations shall be protected as such. Such protection, which
shall not extend to the data or material itself, shall be without prejudice to any copyright subsisting in the data or material itself’.
