There is a wealth of data available in international patent databases that is often underutilised by the research community. Patent analytics extracts, analyses, visualises and interprets patent data to help Australian research organisations increase collaboration, make strategic use of IP and improve commercialisation outcomes.

Protein crystallography continues to be essential for understanding proteins in health and disease, and enabling rational design of drug targets. This article provides a short demonstration of some of our patent analytics techniques, using PCT applications from the EPO PATSTAT database related to 3D structures of proteins.

Time Series Analysis

Time series analysis can be used to visualise trends in inventive activity in a field. In the field of 3D structures of proteins, the first PCT applications were published in 1994 but it was not until the turn of the century that the volume of applications reached significant levels (more than 10 applications per year), reflecting the increasing availability of high-resolution crystal structures at that time.

The number of PCT applications peaked in 2003 and has gradually declined, with a minor peak in 2009 (Figure 1). The surge of patent applications is likely to have receded once easy-to-crystallise and/or commercially relevant structures were completed. It is also possible that the perception that protein crystal structures are now “obvious” to make has slowed recent patenting activity in the field.

Figure 1: PCT applications for protein 3D structures since 2000, by publication year.
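
The counting behind a time series like Figure 1 can be sketched in a few lines of Python. This is only an illustrative sketch, not the method used for the figure: the record layout (publication number, publication year), the function name and the toy records are all assumptions, and the threshold is lowered from the article's 10 applications per year to suit the toy data.

```python
from collections import Counter

# Hypothetical toy records (publication_number, publication_year); not real data.
records = [
    ("WO-A", 2002), ("WO-B", 2003), ("WO-C", 2003),
    ("WO-D", 2003), ("WO-E", 2004), ("WO-F", 2009),
    ("WO-B", 2003),  # duplicate row, counted only once
]

def applications_per_year(rows):
    """Count distinct applications per publication year, sorted by year."""
    seen = set()
    counts = Counter()
    for number, year in rows:
        if number not in seen:
            seen.add(number)
            counts[year] += 1
    return dict(sorted(counts.items()))

per_year = applications_per_year(records)

# Year with the most applications in the toy data.
peak_year = max(per_year, key=per_year.get)

# Years exceeding a volume threshold (the article uses more than 10 per year;
# the toy data uses 2 so that something is returned).
threshold = 2
significant_years = [year for year, n in per_year.items() if n > threshold]
```

Deduplicating on the publication number keeps one count per application even if the source table repeats a row, for example once per applicant.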

The classification breakdown of the applications in the early versus late 2000s suggests that the majority of the applications in the early peak were related to structural genes, particularly bacterial and viral proteins, while the applications in the late 2000s are predominantly related to antibody structures (data not shown).

The major companies filing in each period of the 2000s are shown in Figure 2. The single largest patent applicant in the surge period of 2003-2005 was Affinium Pharmaceuticals, which developed narrow-spectrum antimicrobials by in silico drug design based on bacterial metabolic enzyme 3D structures.

Figure 2: Top applicants grouped by 3-year periods. Applicants having five or more PCT applications on 3D structures in each period are shown. No applicant had five or more applications in the period 2012-2014.
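
The grouping described in the Figure 2 caption can be sketched as follows. This is a minimal sketch under stated assumptions: the periods are assumed to start in 2000 (consistent with the 2003-2005 and 2012-2014 periods named in the text), each input row is assumed to be one (applicant, publication year) pair per application, and the applicant names and counts are hypothetical.

```python
from collections import Counter, defaultdict

def period_label(year, origin=2000, width=3):
    """Map a year to its period label, e.g. 2004 -> '2003-2005'."""
    start = origin + ((year - origin) // width) * width
    return f"{start}-{start + width - 1}"

def top_applicants_by_period(rows, min_apps=5):
    """Return, per period, the applicants with at least min_apps applications."""
    counts = defaultdict(Counter)
    for applicant, year in rows:
        counts[period_label(year)][applicant] += 1
    return {
        period: {a: n for a, n in c.items() if n >= min_apps}
        for period, c in sorted(counts.items())
    }

# Hypothetical toy rows; one row per (applicant, application).
rows = (
    [("Applicant A", 2003)] * 3
    + [("Applicant A", 2005)] * 2
    + [("Applicant B", 2004)] * 4
    + [("Applicant B", 2013)]
)

top = top_applicants_by_period(rows)
```

In this toy data only Applicant A reaches five applications in 2003-2005, and the 2012-2014 period is present but empty, mirroring the situation the caption describes.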

To show the technology Affinium was pursuing at that time, the CPC classes assigned to its applications are visualised in Figure 3. This indicates that the majority of Affinium’s applications are directed to the structures of bacterial proteins, in particular Pseudomonas and Streptococcus enzymes, and also to testing for antimicrobial activity.

Figure 3: Top Cooperative Patent Classifications (CPC) of the top applicant (Affinium Pharmaceuticals). Classifications found in five or more of Affinium’s PCT applications are shown. 

Geographical Analysis

Geographical analysis can be used to visualise hotspots of inventive activity. The OECD REGPAT database assigns regional locations to all patent applicants of PCT applications, based on address and postcode as listed on the application. The greatest concentrations of applications are in the Northeastern United States, Toronto and California, with further activity spread throughout western Europe.

The highest ranked cities can be seen in Figure 4. The patent applications arising from San Diego, California, and Cambridge, Massachusetts, are shared between a number of different organisations, whereas Toronto, Canada, is dominated by a single company, Affinium Pharmaceuticals (see Figure 2). Copenhagen, Denmark, is the leading European centre for inventive activity in this field, and its activity is attributable to Novozymes and Novo Nordisk (data not shown).

Figure 4: Geographical distribution of applicants of PCT applications in the dataset. The OECD REGPAT database assigns each patent applicant to an OECD Region and Locality based on the applicant’s address and postcode as listed on the patent.
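
A ranking of localities such as the one in Figure 4 can be approximated by counting distinct applications per applicant location. The sketch below assumes a simplified row layout of (application id, city, country code); this is not the REGPAT file format, and the cities and ids are placeholders.

```python
from collections import defaultdict

def applications_by_region(rows):
    """Rank (city, country) pairs by the number of distinct applications."""
    apps = defaultdict(set)
    for app_id, city, country in rows:
        apps[(city, country)].add(app_id)
    # Sort by descending count, then alphabetically for a stable order.
    ranked = sorted(apps.items(), key=lambda kv: (-len(kv[1]), kv[0]))
    return [(region, len(ids)) for region, ids in ranked]

# Hypothetical toy rows; one row per applicant on each application.
rows = [
    ("P1", "City X", "AA"),
    ("P1", "City X", "AA"),  # two applicants in the same city on one application
    ("P2", "City X", "AA"),
    ("P2", "City Y", "BB"),
    ("P3", "City Z", "CC"),
]

ranking = applications_by_region(rows)
```

Using a set per region means an application with several applicants in the same city is counted once for that city, while an application with applicants in two cities counts towards both.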


Australia ranks seventh among countries in terms of overall applications (data not shown), and Melbourne is the sixth most active region in the field (Figure 4). This is most likely due to the presence of the Australian Synchrotron in Melbourne, although the direct involvement of the synchrotron cannot be determined from this patent data set, since it does not generally retain rights to the IP generated and is therefore not listed as a patent applicant.

Patenting and collaboration in Australia

The major Australian patent applicants in the crystallography field are publicly funded research organisations, and they have collaborated on roughly a quarter of their PCT applications in the field.

The most active applicant in Australia in 3D structures of proteins is the Walter and Eliza Hall Institute of Medical Research (WEHI). WEHI’s PCT applications include WO 2007/147213 and WO 2014/047673 relating to the structure of the insulin receptor, WO 2003/014159, WO 2003/025017 and WO 2004/031232 relating to the structure of cytokine receptors, enabling design of anti-cancer agents, and WO 2007/128080 and WO 2007/147213 relating to the structure of apoptotic peptides (MCL-1 and BCL-W).

The structural biology division of WEHI is continuing to work with the Australian Synchrotron to obtain high-resolution structural information on proteins and peptides related to apoptosis and necroptosis. This will assist in the development of targeted treatments for conditions including autoimmune disease and cancer.

Because patent applications can be filed in the name of more than one entity, analysis of co-applicants can be used to provide evidence of collaborative activity.

Consistent with Melbourne’s status as one of the hot spots for protein crystallography patent activity, there is a high degree of collaboration between Australian applicants. Figure 6 shows all collaborations between Australian institutions (which occurred on 27 per cent of all Australian applications). The three WEHI applications on cytokine receptors mentioned above were made in collaboration with CSIRO and the Ludwig Institute for Cancer Research. CSIRO has also collaborated with the US-based Dyax Corporation on antibodies to IGF-II. Another notable collaboration is between St Vincent’s Institute in Melbourne and Medvet Science in Adelaide. This relates to a 2008 patent application (WO 2008/052277) on the crystallisation of GM-CSF complexed with its receptor and methods of in vitro and in silico screening using this crystal structure.

Figure 6: Collaboration map (Thomson Data Analyser) of patenting collaborations involving Australian organisations on protein crystal structures.
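
Co-applicant analysis of this kind reduces to counting pairs of applicants that appear on the same application. The following is a minimal sketch with hypothetical organisation names and a made-up input mapping (application id to list of applicants); it is not how the collaboration map in Figure 6 was produced.

```python
from collections import Counter
from itertools import combinations

def collaboration_summary(applicants_by_app):
    """Return (pair counts, share of applications with more than one applicant)."""
    pair_counts = Counter()
    collaborative = 0
    for applicants in applicants_by_app.values():
        unique = sorted(set(applicants))  # sort so each pair has one canonical order
        if len(unique) > 1:
            collaborative += 1
            pair_counts.update(combinations(unique, 2))
    share = collaborative / len(applicants_by_app) if applicants_by_app else 0.0
    return pair_counts, share

# Hypothetical toy data: application id -> applicants named on it.
applicants_by_app = {
    "P1": ["Org A", "Org B", "Org C"],
    "P2": ["Org A"],
    "P3": ["Org B", "Org A"],
    "P4": ["Org D"],
}

pairs, share = collaboration_summary(applicants_by_app)
```

The pair counts give the edge weights for a collaboration map, and the share is the kind of figure quoted above as the percentage of applications involving collaboration.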

Classification Analysis

The different types of technical activity within the field of protein crystal structures can be visualised by classification analysis. CPC classification captures all aspects of technology and so counting the number of applications in each classification allows visualisation of the most commonly patented technologies, for example the type of the protein under study and the species from which the protein is obtained.

Figure 7 shows the top CPC marks. The most common mark relates to computational techniques for determination of molecular structure, indicating that a considerable proportion of inventive activity in the field is related to structural assignment algorithms. Computational proteomics is also found among the most common CPCs. Medicinal applications are also seen in the form of protein-based therapies and antibody combination therapies. The most commonly patented molecular structures are of antibodies and kinases.

The level of detail in the CPC shows that different applications focus on different properties of antibodies, for example humanised antibodies, high affinity antibodies and structures of linear epitopes. Other well-represented CPCs relate to analysis of ligand-target interactions and in silico screening methods. Methods of single crystal growth round out the top 15, featuring in 30 patent applications.

Figure 7: Top Cooperative Patent Classification (CPC) categories. All applications have the classification C07K 2299/00 (protein 3D structures), so it is not shown.
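
Classification counts like those in Figure 7 can be sketched by counting each CPC mark once per application and leaving out the mark used to select the data set. In this sketch the input mapping and the placeholder marks ("MARK-A" and so on) are hypothetical; only C07K 2299/00 comes from the article.

```python
from collections import Counter

SELECTION_MARK = "C07K 2299/00"  # present on every application, so excluded

def top_cpc_marks(cpc_by_app, top_n=15, exclude=(SELECTION_MARK,)):
    """Return the top_n CPC marks by number of applications carrying them."""
    counts = Counter()
    for marks in cpc_by_app.values():
        # set() ensures a mark repeated on one application is counted once.
        counts.update(set(marks) - set(exclude))
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]

# Hypothetical toy data: application id -> CPC marks assigned to it.
cpc_by_app = {
    "P1": ["C07K 2299/00", "MARK-A", "MARK-B", "MARK-A"],
    "P2": ["C07K 2299/00", "MARK-A"],
    "P3": ["C07K 2299/00", "MARK-B", "MARK-C"],
}

top_marks = top_cpc_marks(cpc_by_app, top_n=2)
```

Ties are broken alphabetically here purely to make the output deterministic; a real analysis might instead show all tied marks.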

Glossary of Terms

The Patent Cooperation Treaty (PCT) is a mechanism that allows a single patent application to be used as a basis for filing later applications around the world. Each PCT application receives an international (WO) publication and a preliminary examination report. The PCT entered into force in 1978 and WO publications have been issued since 1980.

PCT applications are a useful metric for inventive activity because in general there will be one PCT application for one invention, regardless of how many later applications may be filed around the world to cover that invention.

PCT applications are also considered to represent valuable inventions in the sense that their applicants intend to obtain patent protection overseas.

To assist patent searching and examination, patents are classified into technical categories. This study uses the Cooperative Patent Classification (CPC), which has been developed jointly by the US and European patent offices and combines both offices’ systems into a single classification. Each published patent application is assigned one or more CPC marks representing the technical fields to which it relates.

Methods and databases

The data set for analysis was obtained by searching the EPO PATSTAT database (Spring 2015 edition) for all PCT applications that contained the CPC mark C07K 2299/00, which is specific for inventions relating to the 3D structures of proteins. We identified 660 PCT applications having this CPC mark. We then used the OECD REGPAT database (September 2015 edition) to obtain the applicant names and addresses and publication dates for these applications. CPC marks (alphanumeric codes) were replaced with descriptive text in all figures.
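
The selection step can be illustrated with a small in-memory filter. This is a sketch only: the row layout (application id, PCT flag, CPC mark) is a simplifying assumption and not the PATSTAT schema, the helper names are hypothetical, and whitespace is normalised because CPC marks are often stored with varying spacing.

```python
def normalise_mark(mark):
    """Remove all whitespace and upper-case, so 'C07K 2299/00' matches 'c07k2299/00'."""
    return "".join(mark.split()).upper()

def select_applications(rows, mark="C07K 2299/00"):
    """Return sorted ids of PCT applications carrying the given CPC mark."""
    target = normalise_mark(mark)
    selected = set()
    for app_id, is_pct, cpc in rows:
        if is_pct and normalise_mark(cpc) == target:
            selected.add(app_id)
    return sorted(selected)

# Hypothetical toy rows: (application id, is it a PCT application?, CPC mark).
rows = [
    ("P1", True, "C07K 2299/00"),
    ("P1", True, "MARK-A"),
    ("P2", False, "C07K 2299/00"),  # not a PCT application, so excluded
    ("P3", True, "C07K2299/00"),    # same mark, different spacing
    ("P4", True, "MARK-B"),
]

dataset = select_applications(rows)
```

The resulting list of application ids would then be the key for looking up applicant names, addresses and publication dates in a second source.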

The Patent Analytics Hub

The Patent Analytics Hub at IP Australia aims to help Australian innovators make the most of their intellectual property (IP). The Hub provides analysis, visualisation and interpretation of data included in patent documents.

For more information, visit our website.