Page 217 - A Study on the Role of UGC Platforms in Copyright Law: An Intermediary-oriented Approach

Chapter 7 Platform Users’ Entitlement to UGCs: Human Use and Web Scraping

reviews.⁵⁵ LinkedIn, a social networking site for business connections, has also complained about the scraping of its users’ information by hiQ Labs, a company that aims to help other companies with human capital.⁵⁶
As mentioned at the beginning of this chapter, web scraping is characterised by its speed and can extract UGC databases much faster than human users.⁵⁷ Nevertheless, web scrapers (e.g. web scraping tools) can access no more information than a human user, and the web scraper’s entitlement to the scraped information is no greater than the human user’s. As the United States District Court for the District of Columbia recently held in Sandvig v. Sessions, web scraping ‘is merely a technological advance that makes information collection easier; it is not meaningfully different from using a tape recorder instead of taking written notes, or using the panorama function on a smartphone instead of taking a series of photos from different positions’.⁵⁸
The quantitative difference in speed between human use and web scraping has brought about qualitatively different responses from UGC platforms. Compared with human users, who have the privilege of using UGCs via ToUs/ToSs as third-party beneficiaries, web scraping tools have received more hostility than hospitality from ToUs/ToSs. In my dataset of 30 UGC platforms, nine directly prohibited users from accessing and collecting UGCs through automated means, including, without limitation, robots, botnets, scripts, apps, plugins, extensions, spiders, offline readers, scrapers, crawlers and similar data gathering and extraction tools.⁵⁹ Some UGC platforms, although lacking a general ban on web scraping, have crafted stringent conditions for web scrapers, such as limits on the speed and purpose of the scraping.⁶⁰ Human use and web scraping have been treated differently because web scraping causes an unexpected load on the scraped website by sending more requests per second than a human would, and because it deprives the scraped UGC platform of its competitive advantage by copying its entire UGC database.

55 Dianping v. Aibang (2008) Hai Min Chu Zi No. 16204 Civil Judgement; Dianping v. Aibang (2009) Yi Zhong Min Zhong Zi No. 5031 Civil Ruling; Dianping v. Aibang (2010) Hai Min Chu Zi No. 24463 Civil Judgement; Dianping v. Aibang (2010) Hai Min Chu Zi No. 4253 Civil Judgement; Dianping v. Aibang (2011) Yi Min Zhong Zi No. 7512 Civil Judgement; Aibang v. Dianping (2014) Gao Min Shen Zi No. 00837 Civil Ruling; Xiecheng v. Qunaer (2009) Yi Zhong Min Zhong Zi No. 5525 Civil Ruling; Weimeng v. Taoyou (2016) Jing 73 Min Zhong No. 588 Civil Judgement.
56 Knowmad Law, ‘HiQ v. LinkedIn and the Legality of Web Scraping’ (24 April 2018) <https://medium.com/@chris_70736/hiq-v-linkedin-and-the-legality-of-web-scraping-e80b9ab06f1d> accessed 19 May 2019.
57 See supra notes 5-6 and accompanying text.
58 Sandvig v. Sessions, 315 F. Supp. 3d 1, 16 (D.D.C. 2018).
59 SoundCloud, Facebook, Myspace, Dianping.com, Weibo, WeChat, TikTok, IMDb, Craigslist.
60 Terms of Service in YouTube (n 17) art 4H (‘You agree not to use or launch any automated system, including without
limitation, “robots”, “spiders”, or “offline readers”, that accesses the Service in a manner that sends more request messages
to the YouTube servers in a given period of time than a human can reasonably produce in the same period by using a
conventional on-line web browser. Notwithstanding the foregoing, YouTube grants the operators of public search engines
permission to use spiders to copy materials from the site for the sole purpose of and solely to the extent necessary for creating
publicly available searchable indices of the materials, but not caches or archives of such materials’.)
Terms of Service in Fanfiction, art 4E <https://www.fanfiction.net/tos/> accessed 19 May 2019. (‘You agree not to use or
launch any automated system, including without limitation, “robots”, “spiders”, or “offline readers”, that accesses the Service
in a manner that sends more request messages to the FanFiction.Net servers in a given period of time than a human can
reasonably produce in the same period by using a conventional on-line web browser’.)
