FEPS is a web-based tool for extracting the most widely used sequence-derived features from protein sequences. First released in 2016, it organizes these features into 7 major groups covering 48 feature extraction methods. Altogether, 2765 descriptors can be computed through FEPS (Ismail et al., 2022).
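To make "sequence-derived feature" concrete, here is a minimal sketch of amino acid composition, a common descriptor that gives the fraction of each of the 20 standard residues in a sequence. It is only an illustration: the function name `amino_acid_composition` is hypothetical, and whether FEPS computes this descriptor in exactly this way is an assumption.

```python
# Illustrative sketch only; not taken from the FEPS code base.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def amino_acid_composition(sequence):
    """Return 20 fractions, one per standard amino acid, in AMINO_ACIDS order."""
    seq = sequence.upper()
    # Count only standard residues so the fractions sum to 1.
    total = sum(1 for residue in seq if residue in AMINO_ACIDS)
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [seq.count(residue) / total for residue in AMINO_ACIDS]


composition = amino_acid_composition("AACD")
```

Each sequence maps to a fixed-length numeric vector regardless of its length, which is what lets sequence-derived features be fed to standard classifiers.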
The extracted features can be used with both traditional machine learning algorithms, such as Support Vector Machine (SVM), Random Forest (RF), and K-Nearest Neighbors (KNN), and modern deep learning approaches. These computational techniques enable a wide range of bioinformatics classification tasks, including:
👉 Click here to download and read the full FEPS article.
👉 Click here to access the FEPS repository on GitHub.
The input to the web server is a FASTA-formatted protein sequence file. In a typical classification scenario, you may have protein sequences for several groups (download the tutorial). The sequences belonging to the same group are saved together in a single multi-sequence FASTA file. The input sequences must meet the following guidelines:
Select FASTA files:
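As a rough illustration of the input layout described above (one multi-sequence FASTA file per protein group), the sketch below parses FASTA text and tags every sequence with its group. The group names, sequences, and the helper `read_fasta` are hypothetical, in-memory strings stand in for uploaded files, and the server's own input guidelines are not checked here.

```python
from io import StringIO


def read_fasta(handle):
    """Yield (header, sequence) pairs from a FASTA-formatted text stream."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)  # sequences may span several lines
    if header is not None:
        yield header, "".join(chunks)


# Hypothetical data: one multi-sequence FASTA "file" per group.
group_files = {
    "group_A": StringIO(">seq1\nMKTAYIAK\nQRQISFVK\n>seq2\nGSHMLEDP\n"),
    "group_B": StringIO(">seq3\nMVLSPADKTN\n"),
}

# Every sequence inherits the group of the file it came from.
labelled = [
    (group, header, seq)
    for group, handle in group_files.items()
    for header, seq in read_fasta(handle)
]
```

Keeping one file per group means the class label never has to be written into the FASTA headers; it follows from which file a sequence belongs to.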
Some feature types have options (see the supporting document). You may keep the default options or choose your own. Whenever 'ID Number' is an option, you can either select one of the 544 amino acid physicochemical properties from the drop-down menu or enter the property's ID number directly.
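The idea of choosing a physicochemical property by ID can be sketched as a lookup table that maps an ID to one numeric value per amino acid. In the example below the ID `1` and the table `PROPERTY_TABLE` are hypothetical and do not correspond to the FEPS numbering; the values are the Kyte-Doolittle hydropathy scale, used purely as a familiar example of such a property.

```python
# Hypothetical property table: the ID 1 is illustrative, not a FEPS ID.
PROPERTY_TABLE = {
    1: {  # Kyte-Doolittle hydropathy
        "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
        "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
        "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
        "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
    },
}


def mean_property(sequence, property_id):
    """Average the selected property over the standard residues of a sequence."""
    scale = PROPERTY_TABLE[property_id]  # KeyError if the ID is unknown
    values = [scale[residue] for residue in sequence.upper() if residue in scale]
    if not values:
        raise ValueError("sequence contains no standard amino acids")
    return sum(values) / len(values)


average_hydropathy = mean_property("AI", property_id=1)
```

Swapping the ID changes which scale is applied while the rest of the calculation stays the same, which is why a single 'ID Number' option can cover many properties.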
You can choose one or more output file formats. The options are the most common feature file formats accepted by machine learning packages (e.g. Weka, SVM-light). Whenever the input file contains the sequences of a protein group, the last column of the output file holds the class labels.
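A downstream script can rely on the "class label in the last column" convention to split a feature file into a feature matrix and a label vector. The sketch below assumes a comma-separated file without a header row; the values and group names are made up, and the exact layout of each FEPS output format is not reproduced here.

```python
import csv
from io import StringIO

# Hypothetical feature file: three descriptors per sequence, label last.
feature_file = StringIO(
    "0.12,0.30,0.58,groupA\n"
    "0.40,0.10,0.50,groupB\n"
)

features, labels = [], []
for row in csv.reader(feature_file):
    if not row:
        continue  # skip empty lines
    *values, label = row  # last column is the class label
    features.append([float(v) for v in values])
    labels.append(label)
```

The resulting `features` and `labels` lists line up row by row, so they can be passed on to whichever classifier you use.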