Image classification is a fundamental machine learning task that assigns labels or classes to images based on their content. It is often performed with convolutional neural networks (CNNs), which can learn and generalize patterns from large amounts of data. When the data is not sufficiently voluminous, however, overfitting can occur, and classical machine learning techniques are recommended instead. Yet a dataset that is too small for deep learning may still exceed the processing capacity of the machine, which poses significant challenges in terms of storage, memory availability, and the computational power required for training. Our proposed approach addresses these challenges by optimizing the content of the dataset while preserving the information essential for classification. Identical or highly similar images are identified, grouped together, and represented by the most representative image among them. At the same time, image sizes can be reduced. A further significant challenge addressed by our approach is managing class imbalance within the dataset. Our approach has been evaluated, and the results are promising.
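The grouping step described above can be illustrated with a minimal sketch. The abstract does not specify the similarity measure, the grouping procedure, or how the representative is chosen, so everything here is an assumption made for illustration: images are taken to be already converted to feature vectors, similarity is cosine similarity, grouping is a simple greedy pass with a hypothetical `threshold` parameter, and the representative is the group member most similar to the rest of its group. The function names `group_similar` and `pick_representatives` are likewise hypothetical.

```python
import numpy as np


def group_similar(features, threshold=0.95):
    """Greedily group rows whose cosine similarity to a seed row is >= threshold.

    Assumption: `features` holds one feature vector per image.
    Returns the groups (lists of row indices) and the similarity matrix.
    """
    x = np.asarray(features, dtype=float)
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    x = x / np.where(norms == 0, 1.0, norms)  # avoid division by zero
    sim = x @ x.T                             # pairwise cosine similarity

    unassigned = list(range(len(x)))
    groups = []
    while unassigned:
        seed = unassigned[0]
        # The seed always joins its own group; others join if similar enough.
        members = [seed] + [i for i in unassigned[1:] if sim[seed, i] >= threshold]
        groups.append(members)
        unassigned = [i for i in unassigned if i not in members]
    return groups, sim


def pick_representatives(groups, sim):
    """For each group, keep the member with the highest total similarity
    to the other members (one possible notion of 'most representative')."""
    reps = []
    for members in groups:
        sub = sim[np.ix_(members, members)]
        reps.append(members[int(np.argmax(sub.sum(axis=1)))])
    return reps


if __name__ == "__main__":
    # Toy 2-D "feature vectors" standing in for image descriptors.
    toy = [[1.0, 0.0], [0.99, 0.01], [0.0, 1.0], [0.01, 0.99], [1.0, 0.02]]
    groups, sim = group_similar(toy, threshold=0.95)
    reps = pick_representatives(groups, sim)
    print("groups:", groups)
    print("representatives:", reps)
```

Only the rows listed in `reps` would be kept in the reduced dataset under these assumptions. A greedy pass depends on the order of the rows, so a clustering method may be preferable in practice.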
Unsupervised linear/non-linear dimensionality reduction, data visualization technique, unsupervised learning algorithm, dataset optimization
Primary Language | English |
---|---|
Subjects | Software Engineering (Other) |
Journal Section | Articles |
Authors | |
Early Pub Date | December 25, 2023 |
Publication Date | December 30, 2023 |
Published in Issue | Year 2023 |