The Computer Scientist's Guide to Designing mRNA Vaccines

Part 2: Antigen Selection Pipeline

2.5. Pipeline Alternatives

While matching results is an easy way to check pipeline quality, improvements may also come from comparing the current structure, which originates from Mehmood et al., with the structures of the other pipelines.

Across pipelines, PSORTb is the most popular tool for subcellular localization, with some papers complementing its results with CELLO. BLASTp is usually used over DIAMOND, and VFDB is widely cited. In some works, VFDB was complemented with the microbial virulence database MvirDB (note: its last updated snapshot dates from 2017) and the VirulentPred database (released in 2023).
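To make the idea of "complementing" one tool with another concrete, here is a minimal sketch in Python. It assumes the tool outputs have already been parsed into plain dictionaries and sets keyed by protein ID. The function names, the fallback rule, and the "Unknown" label are illustrative assumptions, not the method of any cited pipeline and not the API of PSORTb, CELLO, VFDB, or MvirDB.

```python
from typing import Dict, Optional, Set


def merge_localization(
    primary: Dict[str, str],
    secondary: Dict[str, str],
    unknown_label: str = "Unknown",  # assumed label for "no confident prediction"
) -> Dict[str, Optional[str]]:
    """Prefer the primary tool's label; fall back to the secondary tool
    when the primary has no entry or reports the unknown label."""
    merged: Dict[str, Optional[str]] = {}
    for protein_id in sorted(set(primary) | set(secondary)):
        label = primary.get(protein_id)
        if label is not None and label != unknown_label:
            merged[protein_id] = label
        else:
            # May still be None if the secondary tool has no entry either.
            merged[protein_id] = secondary.get(protein_id)
    return merged


def union_virulence_hits(*hit_sets: Set[str]) -> Set[str]:
    """A protein counts as a virulence candidate if any database reports a hit."""
    combined: Set[str] = set()
    for hits in hit_sets:
        combined |= hits
    return combined


if __name__ == "__main__":
    # Hypothetical, hand-written inputs for illustration only.
    psortb_like = {"p1": "OuterMembrane", "p2": "Unknown"}
    cello_like = {"p2": "Extracellular", "p3": "Cytoplasmic"}
    print(merge_localization(psortb_like, cello_like))
    print(union_virulence_hits({"p1"}, {"p1", "p3"}))
```

A fallback rule keeps the primary tool authoritative and only widens coverage. A stricter alternative would require both tools to agree, which trades recall for precision.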

The greatest variety is found in the first part, the ‘homology filter’, where approaches range from the tools used by Mehmood et al. to in-house pipelines. Essential proteins are usually extracted using bioinformatics tools, most of which require licenses or can handle only limited amounts of data within short runtimes. Some used Clusters of Orthologous Groups (COG) distribution analysis, which, in addition to showing how conserved proteins are, also indicates their functional group (e.g., metabolism), allowing for better filtering in the first step (at the expense of more data and longer runtime).

The remaining steps are otherwise similar, with some pipelines adding steps toward the end. Namely, some evaluated the antigenicity of many candidates in animal models, and some performed physicochemical analysis of vaccine constructs and mapped B- and T-cell epitopes, with antigenicity and virulence prediction, using ABCPred, ProPred1, and ProPred. For the K. pneumoniae pipeline, molecules were also screened using BepiPred-3.0, DiscoTope, and NetMHCII, using mouse-specific information. The lab would then test the antigens.
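The COG-based filtering mentioned above can be sketched roughly as follows. The sketch assumes each protein already carries its COG functional category letters (COG uses single-letter codes, and a protein may have more than one) plus a conservation score. The record layout, the `strain_fraction` field, the threshold, and the choice of excluded categories are all hypothetical and only illustrate the idea of filtering on conservation and functional group together.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Iterable, List, Set


@dataclass(frozen=True)
class CogAnnotation:
    protein_id: str
    categories: str         # one or more single-letter COG categories, e.g. "M" or "EG"
    strain_fraction: float  # hypothetical: fraction of strains carrying an ortholog


def category_distribution(annotations: Iterable[CogAnnotation]) -> Counter:
    """Count how often each COG category letter occurs across all proteins."""
    counts: Counter = Counter()
    for ann in annotations:
        counts.update(ann.categories)  # a string updates the counter per letter
    return counts


def filter_candidates(
    annotations: Iterable[CogAnnotation],
    min_fraction: float,
    excluded_categories: Set[str],
) -> List[str]:
    """Keep proteins that are conserved enough and fall in no excluded category."""
    return [
        ann.protein_id
        for ann in annotations
        if ann.strain_fraction >= min_fraction
        and not (set(ann.categories) & excluded_categories)
    ]


if __name__ == "__main__":
    # Hypothetical, hand-written annotations for illustration only.
    proteins = [
        CogAnnotation("p1", "M", 1.0),
        CogAnnotation("p2", "EG", 0.95),
        CogAnnotation("p3", "M", 0.4),
    ]
    print(category_distribution(proteins))
    print(filter_candidates(proteins, 0.9, {"C", "E", "G"}))
```

Which categories to exclude and what conservation threshold to use are design decisions for the pipeline author; the values in the example are placeholders.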