DeepFEPS: Deep Learning-Oriented Feature Extraction for Biological Sequences
DeepFEPS is a high-performance bioinformatics platform for extracting advanced sequence-based features from DNA, RNA, and protein data. It integrates modern machine learning and deep learning techniques to transform raw biological sequences into rich numerical representations suitable for classification, clustering, and predictive modeling.
Each feature extractor below represents biological sequences in a different way: sequence embedding models such as Word2Vec, FastText, and Doc2Vec; Transformer-based architectures; autoencoder-derived features; and graph-based embeddings. These deep learning and graph representation techniques can capture sequence patterns and relationships that simple k-mer counts miss, which supports functional annotation, motif discovery, and predictive modeling.
Simply select the method that best fits your research goals, upload your sequences, configure the parameters, and download your processed features.
Autoencoder features
Compressed representations learned by autoencoders trained on k-mer bag-of-words (BoW) or fixed-length one-hot encodings.
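As a rough illustration of the two input encodings named above, the sketch below builds a k-mer bag-of-words count vector and a fixed-length one-hot vector in plain Python. The function names, the DNA alphabet, and the zero-padding scheme are assumptions made for this example; the source does not say how DeepFEPS implements these encodings. An autoencoder would then be trained on vectors like these and its bottleneck activations used as features.

```python
from collections import Counter
from itertools import product


def kmer_bow(seq, k=3, alphabet="ACGT"):
    """Count overlapping k-mers and return counts in a fixed vocabulary order."""
    # Hypothetical helper: the vocabulary is every k-mer over the alphabet.
    vocab = ["".join(p) for p in product(alphabet, repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    return [counts.get(kmer, 0) for kmer in vocab]


def one_hot_fixed(seq, length=8, alphabet="ACGT"):
    """One-hot encode a sequence, truncated or zero-padded to `length` positions."""
    index = {ch: i for i, ch in enumerate(alphabet)}
    vec = [0] * (length * len(alphabet))
    for pos, ch in enumerate(seq[:length]):
        if ch in index:  # unknown symbols (e.g. N) stay all-zero
            vec[pos * len(alphabet) + index[ch]] = 1
    return vec


bow = kmer_bow("ACGTACGT", k=2)          # 16-dimensional count vector
onehot = one_hot_fixed("ACG", length=4)  # 4 positions x 4 symbols
```

Both vectors have a length that depends only on `k`, `length`, and the alphabet, never on the sequence itself, which is what lets a fixed-size autoencoder input layer accept sequences of different lengths.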
Doc2Vec embeddings
Sequence-as-document embeddings (PV-DM / PV-DBOW) over k-mers; each sequence gets a single vector that captures its global context.
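A minimal sketch of the sequence-as-document idea, assuming the gensim library (version 4 or later) is used for training. The source does not state which library DeepFEPS uses, and the helper names and parameter values here are hypothetical. Each sequence is split into overlapping k-mer "words" and tagged with its identifier; `dm=1` selects PV-DM and `dm=0` selects PV-DBOW.

```python
def kmer_tokens(seq, k=3):
    """Split a sequence into overlapping k-mers, treated as words."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def train_doc2vec(seqs, k=3, dm=1, vector_size=64, epochs=20):
    """Train Doc2Vec on {name: sequence} and return {name: vector}.

    Assumes gensim >= 4 is installed; imported lazily so the tokenizer
    above can be used without it.
    """
    from gensim.models.doc2vec import Doc2Vec, TaggedDocument

    docs = [
        TaggedDocument(words=kmer_tokens(s, k), tags=[name])
        for name, s in seqs.items()
    ]
    # Passing the documents to the constructor builds the vocabulary and trains.
    model = Doc2Vec(docs, vector_size=vector_size, window=5,
                    min_count=1, dm=dm, epochs=epochs)
    return {name: model.dv[name] for name in seqs}


tokens = kmer_tokens("ACGTAC", k=3)
```

Calling `train_doc2vec({"seq1": "ACGTACGTAC", "seq2": "TTGACCATGA"})` would return one `vector_size`-dimensional vector per sequence; on a corpus this small the vectors are not meaningful, so treat the call as a shape check only.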
Graph embeddings
k-mer graph embeddings (DeepWalk/Node2Vec/Graph2Vec) pooled to fixed-size features.
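The sketch below illustrates the first half of a DeepWalk-style pipeline under stated assumptions: nodes are k-mers, an undirected edge links k-mers that are adjacent in the sequence, and walks are uniform random walks. The graph definition and all names are assumptions for illustration, since the source does not specify how DeepFEPS constructs its k-mer graph. In a full pipeline the walks would be fed to a skip-gram model to learn node vectors, which are then pooled (for example, averaged) into one fixed-size feature vector per sequence.

```python
import random


def kmer_graph(seq, k=3):
    """Undirected graph whose edges link k-mers that are adjacent in the sequence."""
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    neighbors = {kmer: set() for kmer in kmers}
    for a, b in zip(kmers, kmers[1:]):
        neighbors[a].add(b)
        neighbors[b].add(a)
    # Sorted lists keep the output deterministic for a fixed seed.
    return {node: sorted(adj) for node, adj in neighbors.items()}


def random_walks(graph, walk_length=5, walks_per_node=2, seed=0):
    """Uniform random walks from every node (the walk step of DeepWalk)."""
    rng = random.Random(seed)
    walks = []
    for start in sorted(graph):
        for _ in range(walks_per_node):
            walk = [start]
            # Stop early only if the current node has no neighbors.
            while len(walk) < walk_length and graph[walk[-1]]:
                walk.append(rng.choice(graph[walk[-1]]))
            walks.append(walk)
    return walks


graph = kmer_graph("ACGTACGA", k=3)
walks = random_walks(graph, walk_length=5, walks_per_node=2, seed=0)
```

Node2Vec differs from this sketch by biasing each step with return and in-out parameters, and Graph2Vec embeds whole graphs directly rather than pooling node vectors.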