Drug discovery lifecycles within the pharmaceutical and biotech industries are getting shorter every year. As a result, saving time has become an increasingly important factor at every stage of product development, including patent landscape analysis. An environment in which “first to file” has now largely replaced “first to invent” demands more sophisticated and efficient patent mining technology.
However, the vast majority of valuable information in patents is stored as unstructured text, with the data required by patent analysis tools scattered throughout lengthy descriptions employing language that is often highly technical and designed to obfuscate.
I2E's suite of patent portfolio analysis tools allows you to create powerful and bespoke queries for patent landscape reports, white space analysis, patent freedom to operate (FTO) searches, competitive intelligence and state-of-the-art reviews.
Traditional Patent Portfolio Analysis
A traditional patent landscape search involves extracting documents directly from databases such as Questel, PatBase, MicroPatent, USPTO, EPO and WIPO/PCT.
These documents are searched using keywords and the results filtered to identify those that might be of interest. This process is analogous to using an Internet search engine such as Google or Bing, and the result is usually a large collection of documents which must then be reviewed manually.
Traditional patent landscape analysis seldom employs specialised ontologies, which could make the process more efficient, improve precision and help identify patents that would benefit from further analysis.
A Modern Solution for Patent Landscape Analysis
Linguamatics' comprehensive patent mining service removes the need to access patent databases individually by offering a fully-maintained, cloud-based patent index containing full-text patents mined directly using I2E.
Our partnership with IFI Claims® allows us to provide coverage for over 90 countries with patent records available from the US Patent Office (USPTO), European Patent Office (EPO) and World Intellectual Property Organization (WIPO).
As the format of patent documents varies between authorities, IFI Claims employ structured XML to provide a unified format for all patent records. Our index can be accessed by individual authority or subdivided by era, for example: last 12 months, last 5 years or last 20 years.
Our cloud-based patent index is updated weekly, thereby ensuring you always have access to the latest patent documents, legal status and classifications.
Our patent index is complemented by:
- A completely secure environment in which to conduct patent landscape analysis.
- The latest, hosted version of the I2E software maintained by us.
- A comprehensive suite of vocabularies, dictionaries and ontologies.
- An out-of-the-box library of commonly-used queries plus the ability to develop bespoke queries of your own.
- Access to chemical structure information.
Advantages of I2E's Patent Analysis Tools
I2E offers a powerful yet flexible solution to patent portfolio analysis, with the capability to search whole documents or focus on specific sections of each patent.
I2E comes with a built-in suite of vocabularies, allowing you to quickly express concepts such as the type of disease or organisation, and then cluster results into areas of interest. The data from I2E queries can also be visualized in tables and graphs or exported to third-party tools such as Pipeline Pilot™.
I2E's text mining technology fast-tracks the process of patent landscape analysis, the end result of which is a more targeted set of documents that have to be reviewed.
Linguistic and Patent Analysis Tools
One key advantage of I2E is its ability to understand the context in which terms and entities appear within a patent:
- Linguistic wildcards allow open questions and searches for entities, verbs and relationships. Using wildcards you can construct patent landscape searches such as “what kinds of antibodies are mentioned” without having to know - in advance - all the antibodies that may be present in the patents being searched.
- Linguistic patterns: I2E’s Natural Language Processing (NLP) can extract information from text using linguistic patterns and terminologies, including specific parts of speech such as verbs, nouns and pronouns.
- Measurements: I2E’s queries can be used to identify the amounts, dosages or concentrations cited in patent documents.
- Numeric data: Provides the ability to construct queries based on chemical concentration and ratios.
Results table from an I2E query to extract the concentrations for a particular drug, Olanzapine, as part of patent analysis
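To make the idea of measurement extraction concrete, here is a minimal Python sketch. It is not I2E's implementation or query syntax; the regular expression, the unit list and the function name `extract_measurements` are all assumptions chosen for illustration, and a production system would need a far richer treatment of units and ranges.

```python
import re

# Illustrative pattern only: a number, an optional upper bound, and a unit.
# The unit list is an assumption and far from exhaustive.
MEASUREMENT = re.compile(
    r"(?P<low>\d+(?:\.\d+)?)"                        # first number, e.g. 2.5
    r"(?:\s*(?:-|to)\s*(?P<high>\d+(?:\.\d+)?))?"    # optional range end
    r"\s*(?P<unit>mg/kg|mg/ml|mg|µg|ug|g|ml|mM|µM|nM|%)"
    r"(?![A-Za-z])"                                  # unit must not run into a word
)

def extract_measurements(text):
    """Return (low, high, unit) tuples; low == high for single values."""
    results = []
    for m in MEASUREMENT.finditer(text):
        low = float(m.group("low"))
        high = float(m.group("high")) if m.group("high") else low
        results.append((low, high, m.group("unit")))
    return results

sample = ("Olanzapine was administered at 2.5 to 20 mg per day, "
          "or 0.1 mg/kg in a 5% solution.")
measurements = extract_measurements(sample)
```

Returning ranges and single values in the same tuple shape keeps the output easy to load into a results table.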
Domain Specific Ontologies
I2E includes over a million built-in terms commonly encountered in patent landscape analysis within the pharmaceutical and biotech sectors. These cover classes such as chemicals, diseases, Entrez Gene, MeSH (Medical Subject Headings), National Cancer Institute (NCI) Thesaurus, organisations (by sector or type) and measurements.
I2E incorporates an especially powerful vocabulary for identifying chemicals, including common or layman’s terms (e.g. Chile saltpetre), IUPAC names (e.g. sodium nitrate) and formulae (e.g. NaNO3). Common marketing and even slang terms are also identified.
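The principle behind a synonym-aware vocabulary can be sketched in a few lines of Python. This is an illustration only, not I2E's vocabulary format: the `VOCABULARY` dictionary, its two entries and the function names are hypothetical, and simple case-insensitive matching stands in for real linguistic processing.

```python
import re

# Hypothetical mini-vocabulary: each preferred term lists its synonyms.
VOCABULARY = {
    "sodium nitrate": ["sodium nitrate", "Chile saltpetre", "NaNO3"],
    "olanzapine": ["olanzapine", "Zyprexa"],
}

def build_lookup(vocabulary):
    """Map every lower-cased synonym to its preferred term."""
    return {
        synonym.lower(): preferred
        for preferred, synonyms in vocabulary.items()
        for synonym in synonyms
    }

def find_concepts(text, vocabulary=VOCABULARY):
    """Return (matched text, preferred term) pairs in order of appearance."""
    lookup = build_lookup(vocabulary)
    # Try longer synonyms first so multi-word names win over shorter overlaps.
    alternatives = sorted(lookup, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(a) for a in alternatives) + r")\b",
        re.IGNORECASE,
    )
    return [(m.group(1), lookup[m.group(1).lower()])
            for m in pattern.finditer(text)]

concepts = find_concepts(
    "Zyprexa tablets were stored away from chile saltpetre (NaNO3)."
)
```

Because every synonym resolves to one preferred term, results that mention a brand name, a common name or a formula can be clustered under the same concept.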
Learn how I2E's powerful chemical vocabulary was put to practical use during the 2014 Ebola crisis.
Fast-tracking patents: from pain to gain
Extracting information from patents can be a slow and labor-intensive process, owing to the size of the documents involved and the style in which they are written. Often only a small amount of data is required from any particular patent, but finding it can be very difficult. An interactive text mining system like I2E speeds up this process by using domain-specific ontologies, such as genes, diseases and compounds; by extracting numerical information like amounts and concentrations; and by providing the power to limit searches to within certain regions of the patents.
For example, searches can be constrained to find some terms in the Title, other terms in the Claims section, and a particular company in the Agent field. A key advantage of I2E is understanding the context in which target entities are mentioned in addition to uncovering relationships between them. This is of particular interest in journal articles as well as patents. The flexibility of querying provided by the underlying linguistics and the use of regular expressions gives the user huge potential to find this information fast.
Find out more about how I2E can support patent landscape reporting and analysis.
Reviewing patents for drug dosage amounts and extracting contextual metadata
"Once you see what I2E can do, you won't want to go back to wading through irrelevant documents"
Associate Director, Safety Assessment, Top-10 Global Pharmaceutical Company
Cooperative Patent Classification (CPC) Support
I2E supports patent landscape searches using the Cooperative Patent Classification (CPC) system, jointly developed by the European Patent Office (EPO) and the United States Patent and Trademark Office (USPTO).
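As a rough illustration of how CPC symbols can be used to narrow a search, the Python sketch below splits a symbol into its section, class, subclass, main group and subgroup. It is not part of I2E; the names `CpcSymbol`, `parse_cpc` and `in_subclass` are hypothetical, and the pattern only handles the common "A61K 31/00" style of notation.

```python
import re
from typing import NamedTuple, Optional

class CpcSymbol(NamedTuple):
    section: str     # one letter, A-H or Y
    cls: str         # two digits
    subclass: str    # one letter
    main_group: str  # digits before the slash
    subgroup: str    # digits after the slash

# Assumed layout: section, class, subclass, optional space, group/subgroup.
CPC = re.compile(r"^([A-HY])(\d{2})([A-Z])\s*(\d{1,4})/(\d{2,6})$")

def parse_cpc(symbol: str) -> Optional[CpcSymbol]:
    """Parse a CPC symbol, returning None if it does not fit the pattern."""
    m = CPC.match(symbol.strip().upper())
    return CpcSymbol(*m.groups()) if m else None

def in_subclass(symbol: str, subclass: str) -> bool:
    """True if the symbol belongs to the given subclass, e.g. 'A61K'."""
    parsed = parse_cpc(symbol)
    return (parsed is not None
            and parsed.section + parsed.cls + parsed.subclass == subclass)

parsed = parse_cpc("A61K 31/00")
```

Filtering on the subclass prefix is a coarse first pass; finer filters can compare the main group and subgroup fields in the same way.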
Selective Patent Reporting
When undertaking a patent landscape search, it is often the case that only a small amount of data is required from each patent. However, extracting this information can be a slow and labor-intensive process owing to the size of the documents, the technical nature of their content and the style in which they are written.
I2E enables you to search within a standard section or sections of each patent, for example, the Title, Abstract, Description, Claims or Agent fields.
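Section-restricted searching can be pictured with the following Python sketch. It assumes each patent is already available as a dictionary keyed by section name, and it uses plain substring matching as a stand-in for I2E's linguistic querying; the `matches` function and the sample records are hypothetical.

```python
def matches(patent, constraints):
    """Return True if every constrained section contains its term.

    `patent` maps section names to text; `constraints` maps section names
    to a term that must appear (case-insensitively) in that section.
    """
    return all(
        term.lower() in patent.get(section, "").lower()
        for section, term in constraints.items()
    )

# Invented sample records for illustration only.
patents = [
    {
        "id": "P1",
        "Title": "Antibody formulations",
        "Claims": "A composition comprising an antibody and a buffer.",
    },
    {
        "id": "P2",
        "Title": "Buffer systems",
        "Claims": "A buffer comprising citrate.",
        "Description": "Useful for antibody storage.",
    },
]

# Require one term in the Title and another in the Claims.
hits = [p["id"] for p in patents
        if matches(p, {"Title": "antibody", "Claims": "buffer"})]
```

The second record mentions antibodies only in its Description, so it is excluded even though a whole-document keyword search would have returned it.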
Identifying Novel Chemical Compounds
I2E allows users to extract all the chemical structures within a document or to filter it based on substructure or similarity. It allows the extraction of chemicals, associated properties and the relationship with other entities.
Using I2E you can find both known and novel compounds via:
- Exact, substructure and similarity search;
- A dictionary-based search for known chemicals, for example, via a common name;
- Name to structure to find novel chemicals.
Learn more about I2E’s chemistry-enabled text mining capabilities.
Patent Claim Chaining
Patent portfolio analysis is made substantially more complex and time-consuming in cases where the claims in one patent are dependent on claims in an earlier patent or patents, often referred to as “claim chaining”.
I2E provides powerful tools to identify claim chains, allowing you to review dependent claims or work back along an entire chain of claims.
I2E results showing (top) the structured table following a particular patent claim chain and (bottom) a cached copy of the original patent, with the relevant text marked up
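The idea of working back along a chain of claims can be sketched in Python as follows. This is a simplified illustration rather than I2E's method: it assumes the claims of a single document have already been split into a numbered dictionary, follows only the first claim reference in each claim, and uses invented claim text and hypothetical function names.

```python
import re

# Matches references such as "of claim 1" or "according to claim 3".
CLAIM_REF = re.compile(r"\bclaims?\s+(\d+)", re.IGNORECASE)

def parent_of(claim_text):
    """Return the number of the first claim referenced, or None."""
    m = CLAIM_REF.search(claim_text)
    return int(m.group(1)) if m else None

def claim_chain(claims, number):
    """Follow references from a claim back to its independent claim."""
    chain = [number]
    seen = {number}
    while True:
        parent = parent_of(claims[chain[-1]])
        # Stop at an independent claim, a missing claim, or a cycle.
        if parent is None or parent in seen or parent not in claims:
            break
        chain.append(parent)
        seen.add(parent)
    return chain

# Invented claim set for illustration only.
claims = {
    1: "A compound of formula (I).",
    2: "The compound of claim 1, wherein R1 is methyl.",
    3: "A pharmaceutical composition comprising the compound of claim 2.",
    4: "The composition according to claim 3, in tablet form.",
}
chain = claim_chain(claims, 4)
```

Claims that depend on several earlier claims would need a tree rather than a single chain; that case is deliberately left out of this sketch.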
From the start of the patent landscape analysis process, I2E can help pharmaceutical and biotech companies understand which areas have already attracted heavy investment and which offer the potential for new drug candidates.
This information is vital in deciding where to invest human and financial resources and thereby maximize your return on investment.
Discover how Bristol Myers Squibb used I2E to analyze industry trends and benchmark their capabilities against other pharmaceutical companies, with key questions including:
- What are the trends for different therapeutic areas?
- What are the trends for technology platforms used by the big pharmaceutical companies?
White Space Analysis
White space analysis is an important aspect of any company’s innovation strategy. An organization contemplating the development of a new product or technology needs to know at an early stage whether it can own the fruits of its efforts, and if it is at risk of being sued for infringement by a third party.
White space analysis is closely related to patent landscape analysis, in that it’s important to gain an up-to-date perspective on what has already been patented, in order to spot the gaps and hence potential opportunities. As with patent landscaping, white space analysis demands a comprehensive overview of a specific technology or other invention area. I2E queries can be designed to extract the necessary information, then normalise and standardise the data elements to provide a substrate for visualisation tools.
Scientific papers are primarily written in English. However, as the use of text mining has become broader, there is an increasing need to be able to deal with other languages.
Linguamatics recognizes this need, and I2E provides a platform that can deal with multiple languages, including single patent documents written in multiple languages, ensuring that an English synonym for adverse events such as “die” does not hit the German determiner “die”.
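The “die” example can be illustrated with a short Python sketch in which each vocabulary is applied only to text in its own language. It assumes the document has already been split into segments tagged with a language code; the term sets, sample sentences and `find_terms` function are hypothetical and do not reflect how I2E handles languages internally.

```python
import re

# Hypothetical per-language adverse-event terms.
ADVERSE_EVENT_TERMS = {
    "en": {"die", "died", "death"},
    "de": {"sterben", "tod"},
}

def find_terms(segments):
    """Match each segment only against the vocabulary for its language.

    `segments` is a list of (language code, text) pairs; the language
    tags are assumed to be supplied by an earlier processing step.
    """
    hits = []
    for lang, text in segments:
        vocab = ADVERSE_EVENT_TERMS.get(lang, set())
        for word in re.findall(r"\w+", text.lower()):
            if word in vocab:
                hits.append((lang, word))
    return hits

segments = [
    ("en", "Two animals in the high-dose group die within 24 hours."),
    ("de", "Die Erfindung betrifft die Behandlung von Schizophrenie."),
]
term_hits = find_terms(segments)
```

The German sentence contains “die” twice, but neither occurrence is reported because the English vocabulary is never applied to German text.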
I2E can be used with third-party patent sources such as PatBase, Thomson Reuters and Minesoft. Patent landscape reports can be run seamlessly on patents stored in both cloud-based and third-party sources. Results from I2E can be exported and the data formatted within popular third-party products such as Microsoft Excel, TIBCO Spotfire® or KNIME™.
An example of using I2E with third-party software is Sanofi, who developed a workflow combining I2E, PatBase and Intellixir to create a patent landscape for Antibody-Drug Conjugate (ADC) therapeutics.
Sanofi's US Global Patents department commented:
"The agility and flexibility of [I2E’s] NLP-based querying is remarkable. Its uses in text analysis are practically unlimited; in our project we take advantage of ontologies to categorize patent documents."
Read more about Sanofi’s experience in combining multiple software resources in a large patent landscaping project.
Patent Supplementary Files
Many patents are now associated with supplementary files such as .mol. I2E can process .mol files and automatically generate the associated SMILES and IUPAC names in the correct position within the patent document.
Pfizer Case Study
I2E's patent analysis tools deliver more relevant results in a much shorter timeframe, with accuracy similar to or better than that of traditional methods.
Read the case study and discover how Pfizer was able to:
- Improve their patent portfolio analysis of novel drug targets and indications tenfold;
- Speed up the process, thereby saving time and reducing costs;
- Stay up-to-date on recent findings in the industry literature.