Welcome to AFFIPred
AFFIPred is a web application for predicting the pathogenicity of single-nucleotide variants (SNVs) based on AlphaFold structures. It provides predictions for all possible amino acid variations in the human proteome.
Structural information holds immense potential for predicting the pathogenicity of missense variations, but structure-based pathogenicity classifiers remain limited compared to their sequence-based counterparts because of the well-known gap between sequence and structure data. Leveraging the highly accurate protein structure prediction method AlphaFold2 (AF2), we introduce AFFIPred, an ensemble machine learning classifier that combines sequence and AF2-based structural characteristics to predict the pathogenicity of missense variants.
AFFIPred predicts the pathogenicity of human missense variations by merging sequence and structural information. Using AF2 predictions instead of PDB structures not only enables AFFIPred to tackle inherent limitations of traditional structure-based classifiers, but also ensures that structural information is retrieved in a more systematic and precise manner. The related preprint can be accessed here: https://doi.org/10.1101/2024.05.13.593840
There are 2 main ways to query the database:
AF-based Functional Impact Prediction (AFFIPred) accepts multiple input types, including genomic position, amino acid position, and variant ID (rs ID). An example query for each format is available in the box.
Collect the necessary information about the variants you want to analyze and organize it in a format compatible with AFFIPred's requirements. Ensure that the query clearly specifies the gene/protein and variant information.
Genomic position: 19 1010539 G C
Protein position: P21817 Y116A
Variant ID: rs1042779
You can also upload a VCF file, or a txt file containing variants in one of these formats. If you upload a VCF file, you should select 'Genomic position' on the right panel.
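As a rough illustration of the three query formats above, the following Python sketch labels a single input line by format. The regular expressions are inferred only from the example queries on this page, and the function name classify_query is hypothetical; this is not AFFIPred's own input validation, which may accept or reject other forms.

```python
import re

# Patterns inferred from the example queries on this page (assumptions,
# not AFFIPred's actual validation rules).
# Genomic position: chromosome, position, reference base, alternate base.
GENOMIC = re.compile(r"^(\d{1,2}|X|Y|MT?)\s+(\d+)\s+([ACGT])\s+([ACGT])$")
# Protein position: accession, then wild-type residue, position, new residue.
PROTEIN = re.compile(r"^([A-Z0-9]+)\s+([A-Z])(\d+)([A-Z])$")
# Variant ID: dbSNP-style identifier such as rs1042779.
VARIANT_ID = re.compile(r"^rs\d+$")


def classify_query(line):
    """Return 'genomic', 'protein', 'variant_id', or None for one query line."""
    text = line.strip()
    if GENOMIC.match(text):
        return "genomic"
    if PROTEIN.match(text):
        return "protein"
    if VARIANT_ID.match(text):
        return "variant_id"
    return None


if __name__ == "__main__":
    for query in ["19 1010539 G C", "P21817 Y116A", "rs1042779"]:
        print(query, "->", classify_query(query))
```

A check like this can help confirm that every line of a txt file uses a single format before uploading it.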
Site saturation predictions can be visualized for a protein. Available input types are UniProt ID, gene name, and Ensembl gene ID.
UniProt ID: Q04206
Gene name: TMEM145
Ensembl Gene ID: ENSG00000175899
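To make the idea of site saturation concrete, the Python sketch below enumerates every single amino acid substitution for a protein sequence, using the same notation as the protein position example (for instance Y116A). It only illustrates the concept; the function name saturation_variants is hypothetical, and the sketch makes no claim about how AFFIPred generates or stores its predictions.

```python
# The 20 standard amino acids as one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def saturation_variants(sequence):
    """List all single-residue substitutions, e.g. 'M1A', for a sequence."""
    variants = []
    # Positions are 1-based, matching the usual protein variant notation.
    for position, wild_type in enumerate(sequence, start=1):
        for substitute in AMINO_ACIDS:
            if substitute != wild_type:
                variants.append(f"{wild_type}{position}{substitute}")
    return variants


if __name__ == "__main__":
    # A toy two-residue sequence yields 2 x 19 substitutions.
    toy = saturation_variants("MK")
    print(len(toy), toy[:3])
```

A protein of N standard residues yields 19 x N substitutions, which is the set a site saturation view covers for that protein.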
To report a bug or make a suggestion, please either open an issue at https://github.com/mustafapir/AFFIPred-cli/issues or send an email to mustafapir29@gmail.com
For other inquiries: https://timucinlab.com
If you use AFFIPred in your publication, please cite:
Pir, Mustafa Samet, & Timucin, Emel. (2024). AFFIPred: AlphaFold2 Structure-based Functional Impact Prediction of Missense Variations. bioRxiv, 2024.05.13.593840. doi: 10.1101/2024.05.13.593840
@article {Pir2024.05.13.593840,
author = {Pir, Mustafa Samet and Timucin, Emel},
title = {AFFIPred: AlphaFold2 Structure-based Functional Impact Prediction of Missense Variations},
elocation-id = {2024.05.13.593840},
year = {2024},
doi = {10.1101/2024.05.13.593840},
publisher = {Cold Spring Harbor Laboratory},
URL = {https://www.biorxiv.org/content/early/2024/05/15/2024.05.13.593840},
eprint = {https://www.biorxiv.org/content/early/2024/05/15/2024.05.13.593840.full.pdf},
journal = {bioRxiv}
}
First, install the tool using pip.
pip install affipred
Provide the input and output files as relative or absolute paths:
affipred variants.vcf -o affipred_results.csv
The input should be a .vcf file, while the output file name should have a .csv extension.
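When the command is launched from a script, the extension rules can be checked beforehand. The Python sketch below validates both file names and assembles the argument list for the command shown above. The helper name build_affipred_command is hypothetical, and treating the extension check as case-insensitive is an assumption; the only CLI syntax used is the documented form with the input file followed by -o and the output file.

```python
from pathlib import Path


def build_affipred_command(input_path, output_path):
    """Check file extensions and return the affipred argument list."""
    # Assumption: extensions are compared case-insensitively.
    if Path(input_path).suffix.lower() != ".vcf":
        raise ValueError(f"input must be a .vcf file: {input_path}")
    if Path(output_path).suffix.lower() != ".csv":
        raise ValueError(f"output must be a .csv file: {output_path}")
    # Mirrors the documented call: affipred variants.vcf -o affipred_results.csv
    return ["affipred", str(input_path), "-o", str(output_path)]


if __name__ == "__main__":
    command = build_affipred_command("variants.vcf", "affipred_results.csv")
    print(" ".join(command))
    # To actually run it (requires affipred to be installed):
    # import subprocess
    # subprocess.run(command, check=True)
```

Passing the arguments as a list rather than a single string avoids shell quoting problems when paths contain spaces.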
The output file will contain all the features used to predict the impact of the variants, alongside the AFFIPred scores and the predicted functional impact of each variant.
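The exact column names of the output file are not listed here, so a reasonable first step is to load the CSV and inspect its header. The Python sketch below does this with the standard library only; the function name load_results is hypothetical, and no particular column name is assumed.

```python
import csv


def load_results(csv_path):
    """Return (column_names, rows) for an AFFIPred output CSV."""
    with open(csv_path, newline="") as handle:
        reader = csv.DictReader(handle)
        rows = list(reader)  # each row is a dict keyed by column name
        columns = reader.fieldnames or []
    return columns, rows


if __name__ == "__main__":
    import os

    path = "affipred_results.csv"  # output name from the command above
    if os.path.exists(path):
        columns, rows = load_results(path)
        print("columns:", columns)
        print("number of variants:", len(rows))
```

Once the header is known, the score and prediction columns can be selected by name for filtering or plotting.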