Newsletter

Does Using Crawlers to Download Content from a Law Database Infringe Copyright?



The rise of the Internet and AI has spurred a trend of developing novel technologies and business models that leverage public data. However, using this data can lead to disputes regarding potential violations of the Copyright Act or the Fair Trade Act. Recent court judgments have affirmed copyright protection for a legal database and have even applied the Criminal Code to the scraping of its content. A start-up company that used web scraping to download database content and offer innovative search services was found to have infringed copyright and violated the Criminal Code. The company and its operators are facing imprisonment, criminal fines, and asset forfeiture, along with over NT$100 million in damages payable to the database operator. Key points of contention in this case include (1) whether the data in the legal database, generally believed to be widely available on public government websites, qualifies for copyright protection; (2) whether violating a database's terms of use through scraping should elevate a breach of contract to a criminal offense; and (3) whether database creation and maintenance costs are equivalent to the operator's actual losses, justifying full compensation from the start-up company. These issues have sparked considerable public debate.

 

1.     Facts and the Parties' Arguments 

(1)     The criminal complainant operates LB, a legal database. The accused are the company and its founders, who provide online legal information and search services under the Ln brand. The two companies are competitors. 

(2)     The complainant claimed that the accused were fully aware that LB's terms of use prohibited saving or copying LB content for personal or third-party use; nonetheless, the accused breached the terms of use by reproducing the legislative history, laws and regulations, and their attachments in LB with the intent to sell, without the complainant's consent, authorization, or any legitimate legal basis. These actions infringed the complainant's copyright and misappropriated the complainant's electromagnetic records without justification. 

(3)     The accused argued against copyright infringement for the following reasons: (a) the legislative history, laws and regulations, and their attachments are public government information belonging to the public domain and thus not subject to copyright; (b) the limited ways to express "legislative history" data mean that, under the merger doctrine, the complainant lacks copyright protection for its expression of the legislative history; and (c) LB allows non-members to access legislative history content free of charge, so regardless of how the accused used such content on Ln, the use did not adversely affect the potential market for or current value of the legislative history content on LB and therefore constitutes fair use under Article 65 of the Copyright Act. The accused also disputed the fairness of the prohibition on copying and compiling in the complainant's terms of use, arguing that the complainant's own website clearly states that its information was collected from the government.

(4)     The complainant (plaintiff in the civil case) additionally filed a supplementary civil action in the criminal proceedings against the accused (defendants in the civil case), seeking to hold them jointly and severally liable for damages under the Fair Trade Act, the Copyright Act, and the Civil Code. The plaintiff also claimed triple damages under the Fair Trade Act. The defendants argued that the criminal proceedings found no violation of the Fair Trade Act, which precluded the plaintiff from relying on that Act in the supplementary civil action.

 

2.     Holding and Reasoning: 

The court determined that the legislative history content in LB is a compilation protected under the Copyright Act. The legislative history, laws and regulations, and their attachments are electromagnetic records protected under the Criminal Code. The accused used web crawler programs to copy and download these records. Such acts constitute an offense under Paragraph 2, Article 91 of the Copyright Act for infringing upon another's copyright by unauthorized reproduction with intent to sell (regarding the legislative history) and an offense under Article 359 of the Criminal Code for unlawfully obtaining another's computer electromagnetic records (regarding the legislative history, laws and regulations, and their attachments).

(1)     Article 7 of the Copyright Act states that a compilation is a work formed by the creative selection and arrangement of materials and is protected as an independent work. How can "selection" and "arrangement" meet the "creativity" requirement for protection as a compilation work? The following opinions can be referenced: (a) The Taiwan Intellectual Property Office's ruling: The "minimal requirement of creativity" or "lowest degree of creativity" (also referred to as the principle of aesthetic non-discrimination) is sufficient to meet the standard of creativity. (b) The Supreme Court's civil judgment (Case No. 99-Tai-Shang-225): "Creativity" does not have to be unprecedented; a work need only contain a distinguishable variation from pre-existing works sufficient to express the author's personality based on society's common understanding. (c) In Feist Publications, Inc. v. Rural Telephone Service Company, Inc., 499 U.S. 340 (1991), the U.S. Supreme Court held that a "database" need only possess "a minimum level of creativity" to qualify as a "compilation" protected under the United States Copyright Act; the requisite level of creativity is extremely low, and even a slight amount will suffice.

(2)     Accordingly, whether LB is a compilation protected under the Copyright Act hinges on whether the "selection" and "arrangement" of the data demonstrate at least a minimal degree of creativity or personal expression. The legislative history content published by the complainant on LB demonstrates a considerable degree of creativity, differing significantly from government versions and sufficiently reflecting an author's personality. Thus, it enjoys copyright protection. 

(3)     There are many ways to express the legislative history of various laws and regulations. Different persons indeed express "legislative history" in very different ways. The "entire legislative history" of each law or regulation on LB is compiled by selecting, arranging, and rewriting original materials that government agencies have publicly released at different times (including government gazettes, official documents, provisions of laws and regulations, government-edited works, etc.). It is neither part of the provisions in laws or regulations nor a government's official document or edited work. Furthermore, as its "expression" is clearly different from the "single amendment information" or "legislative history" provided by government agencies, it constitutes a "compilation work" receiving copyright protection under Article 7 of the Copyright Act due to its creativity. 

(4)     The accused incorporated the complainant's "Regulatory History" content into their own products for commercial purposes. The quantity and quality of the "Regulatory History" content crawled by the accused from LB account for 100% of the "Regulatory History" content in both parties' legal databases. The accused argued that customers prefer Ln mainly because of its powerful search engine, which allows users to easily search for legal information by entering keywords. However, if Ln did not copy the "Regulatory History" and other information crawled from LB, how could customers easily find the "Regulatory History" and other information they are searching for simply by entering keywords? In other words, a database's content is the most important foundation for any legal information retrieval service. Without this foundation, no matter how powerful the search engine, what information could it search for? Since the accused obtained the "Regulatory History" content from LB at almost no cost, they can compete with the complainant at a lower price, which undoubtedly harms the potential market and current value of the complainant's service. 

(5)     The court referred to the decision of a United States federal court in Thomson Reuters v. Ross Intelligence, rendered on February 11, 2025, which confirmed the unlawfulness of training AI with data from another party without consent. In this case, the accused used a web crawler to extract as many as 98,068 entries from the "Regulatory History" compilation work on LB. The accused did not use the data to train or enable AI learning, but instead directly incorporated it into their own Ln database, thereby engaging in commercial competition with the complainant at a lower price. Such conduct harmed the complainant more directly and seriously than if the accused had used the data for AI training. To maintain the transactional order and business ethics of the domestic database market, and to prevent a mentality of reaping without sowing, the court denied the accused's fair use argument.

(6)     Under Article 359 of the Criminal Code, "without justification" in the context of unlawfully obtaining another person's computer electromagnetic records includes circumstances such as "without legitimate cause," "without the owner's permission," "without disposition authority," "contrary to the owner's intent," and "exceeding the scope of authorization" (see Supreme Court criminal judgment Case No. 110-Tai-Shang-90). The accused, without legitimate cause or disposition authority and without the complainant's permission, acted contrary to the complainant's intent and violated LB's terms of use. By doing so, the accused were able to gain an unfair market advantage, avoiding data compilation costs and potentially harming the complainant's commercial interests or opportunities, thereby causing loss and damage to the complainant. Therefore, the accused's conduct constitutes an offense under Article 359 of the Criminal Code (Taiwan New Taipei District Court criminal judgment, Case No. 111-Zhi-Su-8, June 24, 2025).

(7)     The court denied the plaintiff's right to claim triple damages under the Fair Trade Act in the supplementary civil action filed in the criminal proceedings. However, the court held the defendants jointly and severally liable under the Civil Code for damages, treating the plaintiff's costs in establishing the legislative history content as its lost licensing profits. This is based on Article 216 of the Civil Code, which provides that damages must compensate a creditor for actual damage and lost profits. Profits that can be expected under ordinary circumstances, or based on any definite plan, equipment, or other particular circumstances, shall be deemed lost profits. The plaintiff asserted that the benefit obtained by the defendants is an unjust sharing of the plaintiff's expenditure costs, and thus the plaintiff may claim damages based solely on those costs. The court held that, had the defendants purchased the content at issue from the plaintiff at cost price and the plaintiff made no profit from the defendants, the plaintiff's profits would equal the cost of producing the content at issue. In this case, the defendants did not pay the plaintiff any price, resulting in the plaintiff's loss of the profit that could have been obtained under ordinary circumstances. Therefore, the plaintiff is entitled to claim compensation for this lost profit (Taiwan New Taipei District Court supplementary civil judgment in criminal proceedings, Case No. 112-Zhi-Chang-Fu-Min-1, June 24, 2025).

 

3.     Discussion: 

This is not the first case in which a court has recognized that a legal database contains a compilation protected by the Copyright Act. On April 22, 2010, the Intellectual Property Court issued a criminal judgment (Case No. 97-Sang-Su-41) recognizing that the "Judicial Interpretations and Rulings" sub-database of LB demonstrates a high degree of professional intellectual input in the selection and compilation of data and is not merely the result of mechanical labor. The "Notes on Changes in Laws and Regulations" sub-database, which traces and compiles historical changes in "laws and regulations" and "court precedents and regulatory rulings" and further edits those already created as compilation works, also possesses originality and creativity. In the past, judicial practice in Taiwan referenced German legal theory, which holds that a work must possess a certain level of creativity (i.e., "obviously exceeding the level that an ordinary creator could achieve"). However, this requirement for copyright protection is rather stringent. Subsequently, the so-called "small coin" principle was developed, under which protection is granted as long as the selection and arrangement of data demonstrate a minimum degree of creativity. In this case, the court relied on "the minimum degree of creativity or personal expression" to recognize that LB is protected by copyright.

However, Ln, which is operated by the accused, is regarded as an innovative brand in the legal database industry. The court imposed criminal liability on the accused and ordered them to compensate the plaintiff for more than NT$100 million. This has sparked public debate as to whether the judiciary is stifling the development of innovative industries. The issues under discussion also include the following:

(1)     Whether, when determining that the legislative history content on LB should be protected by the Copyright Act, the court should also consider safeguarding the public's right to access legal information from the perspectives of public interest and fair use; 

(2)     Whether the copyright holders of databases should be protected only with civil remedies, without imposing criminal liability on users; and whether AI development will be affected if using web crawlers to extract content from another database could result in criminal liability for obtaining another's electromagnetic records without justification; and

(3)     Whether the entire cost of establishing and maintaining the legislative history content on LB should be regarded as the plaintiff's reasonable expectation of profit from licensing the accused.

 

These issues await clarification by the legislature from the perspective of industrial development and by the appellate court from the perspective of fairness in individual cases.
