The ever-increasing volume of biological data deposited in databases worldwide calls for bioinformatics tools that can quickly reveal the information it contains. We describe the Protein Sequence Profiler (PsP), which was developed to characterize protein sequence entries stored in a database. PsP gives the user a simplified description of the protein sequences and the capability to generate new data sets for prediction purposes, with or without a redundancy check. The system is implemented in PHP and uses arrays as its data structure. It can filter and retrieve entries from the protein sequence database according to the following groupings, alone or in combination: signal peptide, taxonomy, protein type, transmembrane type, non-membrane type, and evidence level. The filtered entries can then be downloaded, which in effect creates a new data set, or passed to the integrated redundancy checker to remove “highly” similar protein sequences.
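The filter-then-deduplicate workflow described above can be illustrated with a minimal sketch. It is written in Python for brevity rather than in PHP, and it is not PsP's implementation. The field names, the example entries, the 0.9 threshold, and the use of `difflib.SequenceMatcher` as the similarity measure are all assumptions made for illustration. The abstract does not state how “highly” similar sequences are defined.

```python
from difflib import SequenceMatcher

# Hypothetical entries; field names are illustrative, not PsP's schema.
ENTRIES = [
    {"id": "P1", "taxonomy": "Eukaryota", "signal_peptide": True,
     "evidence_level": 1, "sequence": "MKTAYIAKQRQISFVKSHFSRQ"},
    {"id": "P2", "taxonomy": "Eukaryota", "signal_peptide": True,
     "evidence_level": 1, "sequence": "MKTAYIAKQRQISFVKSHFSRQ"},
    {"id": "P3", "taxonomy": "Bacteria", "signal_peptide": False,
     "evidence_level": 2, "sequence": "MSDNELLKQWERTYPLVAGHC"},
    {"id": "P4", "taxonomy": "Eukaryota", "signal_peptide": True,
     "evidence_level": 1, "sequence": "GGHWPLNCVDTREAYKMSLIQF"},
]


def filter_entries(entries, **criteria):
    """Keep entries whose fields equal every given criterion (logical AND)."""
    return [e for e in entries
            if all(e.get(key) == value for key, value in criteria.items())]


def remove_redundant(entries, threshold=0.9):
    """Greedily keep an entry only if its similarity to every entry
    already kept is below the (assumed) threshold."""
    kept = []
    for entry in entries:
        if all(SequenceMatcher(None, entry["sequence"], k["sequence"]).ratio()
               < threshold for k in kept):
            kept.append(entry)
    return kept


# Combine two groupings, then apply the optional redundancy check.
selected = filter_entries(ENTRIES, taxonomy="Eukaryota", signal_peptide=True)
non_redundant = remove_redundant(selected)
print([e["id"] for e in selected])       # ['P1', 'P2', 'P4']
print([e["id"] for e in non_redundant])  # ['P1', 'P4']
```

Here P2 is dropped because its sequence is identical to that of P1. A production redundancy checker would more plausibly rely on an alignment-based identity score than on a generic string ratio.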

Theory and Practice of Computation, pp. 44-58. DOI: 10.1142/9789813234079_000