A software bug is an error, fault, failure or flaw in a computer program that causes it to produce an incorrect result or to behave in an unintended way. In this proposed work, a software bug classification algorithm, Classification of Software Bugs Using Bug Attribute Similarity (CLUBAS), is used to categorize bugs according to their phase and the cost associated with their severity, both of which are assigned to each bug. CLUBAS is a classification technique that first clusters bugs by the textual similarity of their descriptions and then generates a label for each cluster. The cluster labels are then mapped to bug classes using bug taxonomic terms, so that the assigned categories reflect bug severity. Text preprocessing includes tokenization, the act of breaking a sequence of strings into pieces such as words, keywords, phrases, symbols and other elements called tokens. In the process, some characters, such as punctuation marks, are discarded. The tokens then serve as input to further processes such as parsing and text mining. This project extends the CLUBAS algorithm with advanced text preprocessing techniques for classification.
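The tokenization and similarity-based clustering steps described above can be illustrated with a minimal Python sketch. It is not the CLUBAS implementation: the function names `tokenize`, `jaccard` and `cluster`, the choice of Jaccard similarity as the textual similarity measure, the greedy single-pass grouping, and the threshold of 0.3 are all assumptions made for illustration only.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens.

    Punctuation marks are discarded, as described in the preprocessing step.
    """
    return re.findall(r"[a-z0-9]+", text.lower())


def jaccard(tokens_a, tokens_b):
    """Jaccard similarity between two token lists (an assumed measure)."""
    set_a, set_b = set(tokens_a), set(tokens_b)
    if not set_a and not set_b:
        return 0.0
    return len(set_a & set_b) / len(set_a | set_b)


def cluster(descriptions, threshold=0.3):
    """Greedily group bug descriptions by textual similarity.

    Each cluster is a list of indices into `descriptions`; the first index
    acts as the cluster representative. The threshold is hypothetical.
    """
    tokens = [tokenize(d) for d in descriptions]
    clusters = []
    for i, toks in enumerate(tokens):
        for members in clusters:
            if jaccard(toks, tokens[members[0]]) >= threshold:
                members.append(i)
                break
        else:
            # No existing cluster is similar enough, so start a new one.
            clusters.append([i])
    return clusters


bugs = [
    "App crashes on login, null pointer!",
    "Crash on login with null pointer exception",
    "Button colour is wrong in settings page",
]
print(tokenize(bugs[0]))
print(cluster(bugs))
```

In this sketch the first two descriptions share enough tokens to fall into one cluster, while the third starts its own. Label generation and the mapping of labels to bug classes through taxonomic terms are not shown.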