Performance with RandomForest Classifier
Machine Learning to Solve Multi-class Classification Problem
Machine learning algorithms normally assume that the classes contain roughly similar numbers of objects. In real-life scenarios, however, the data distribution is often skewed, and some classes appear much more frequently than others. When facing such disproportions, we must design an intelligent system that can overcome this bias.
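To make the idea of skew concrete, here is a minimal sketch that measures how unbalanced a set of labels is. The label values and counts are hypothetical and chosen only for illustration; they are not taken from the glass data. The weighting formula, n_samples / (n_classes * class_count), is the common inverse-frequency heuristic, which scikit-learn also uses for class_weight="balanced".

```python
from collections import Counter

# Hypothetical labels for illustration only (not the glass data):
# class "a" dominates, class "c" is rare.
labels = ["a"] * 70 + ["b"] * 25 + ["c"] * 5

counts = Counter(labels)
n_samples = len(labels)
n_classes = len(counts)

# Share of the data set that each class occupies.
proportions = {cls: n / n_samples for cls, n in counts.items()}

# Ratio between the largest and the smallest class; 1.0 means perfectly balanced.
imbalance_ratio = max(counts.values()) / min(counts.values())

# Inverse-frequency weights: rare classes receive larger weights.
class_weights = {cls: n_samples / (n_classes * n) for cls, n in counts.items()}

print("Proportions:", proportions)
print("Imbalance ratio:", imbalance_ratio)
print("Class weights:", class_weights)
```

Weights of this kind can be passed to classifiers that accept per-class weights, so that errors on rare classes cost more during training.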
Here, we will work with a multi-class problem using data taken from the UCI Machine Learning Repository, loaded as shown below.
import pandas as pd

# Glass data file from the UCI repository; the file has no header row.
url = ("https://archive.ics.uci.edu/ml/machine-learning-"
       "databases/glass/glass.data")
df = pd.read_csv(url, header=None)
df.columns = ['Id', 'RI', 'Na', 'Mg', 'Al', 'Si', 'K', 'Ca', 'Ba', 'Fe', 'type']
df.set_index('Id', inplace=True)
print('Data loading:')
df.head()
Here, the features are chemical compositions and the target classes are the different types of glass. The problem presents the chemical compositions of various types of glass, and the objective is to determine the use for…