Towards Classifying HTML-embedded Product Data Based On Machine Learning Approach

dc.contributor.author: Matveiev, O. M. [en]
dc.contributor.author: Zubenko, A. [en]
dc.contributor.author: Yevtushenko, D. [en]
dc.contributor.author: Cherednichenko, O. Yu. [en]
dc.contributor.author: Матвєєв, О. М. [uk]
dc.contributor.author: Чередніченко, О. Ю. [uk]
dc.date.accessioned: 2023-05-06T09:59:17Z
dc.date.available: 2023-05-06T09:59:17Z
dc.date.issued: 2021
dc.description.abstract: In this paper we explore machine learning approaches that use product descriptions and titles to classify footwear by brand. The data were collected from many different online stores. In particular, we created a pipeline that automatically classifies product brands from these data. The dataset is provided in JSON format and contains more than 40,000 rows. The categorization component was implemented using the K-Nearest Neighbour (K-NN) and Support Vector Machine (SVM) algorithms. The pipeline was evaluated using the classification report, with particular attention to the weighted-average precision, which reached 79.0% for SVM and 72.0% for K-NN. [en]
dc.identifier.citation: Matveiev, O., Zubenko, A., Yevtushenko, D., & Cherednichenko, O. (2021). Towards Classifying HTML-embedded Product Data Based On Machine Learning Approach. MoMLeT+DS 2021 : 3rd International Workshop on Modern Machine Learning Technologies and Data Science. CEUR Workshop Proceedings, 2917, 85–95. [en]
dc.identifier.citation: Matveiev O., Zubenko A., Yevtushenko D., Cherednichenko O. Towards Classifying HTML-embedded Product Data Based On Machine Learning Approach. MoMLeT+DS 2021 : 3rd International Workshop on Modern Machine Learning Technologies and Data Science. CEUR Workshop Proceedings. 2021. Vol. 2917. P. 85–95. [en]
dc.identifier.orcid: https://orcid.org/0000-0001-5907-3771
dc.identifier.orcid: https://orcid.org/0000-0001-9178-0847
dc.identifier.orcid: https://orcid.org/0000-0001-6250-4616
dc.identifier.orcid: https://orcid.org/0000-0002-9391-5220
dc.identifier.uri: https://dspace.mipolytech.education/handle/mip/206
dc.language.iso: en [en]
dc.publisher: CEUR Workshop Proceedings [en]
dc.relation.ispartof: MoMLeT+DS 2021 : 3rd International Workshop on Modern Machine Learning Technologies and Data Science. CEUR Workshop Proceedings. Vol. 2917 : 85–95. [en]
dc.subject: Product classification [en]
dc.subject: SVM [en]
dc.subject: K-Nearest Neighbour [en]
dc.subject: TF-IDF [en]
dc.subject: machine learning [en]
dc.subject: vectorization [en]
dc.subject: item matching [en]
dc.title: Towards Classifying HTML-embedded Product Data Based On Machine Learning Approach [en]
dc.type: Article
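
The approach summarized in the abstract (vectorizing product text with TF-IDF, classifying by brand with K-NN, and scoring with weighted-average precision) might look roughly like the sketch below. This is an illustration only, not the authors' implementation: the toy records, all function names, the whitespace tokenizer, the smoothed IDF formula, cosine similarity, and k=3 are assumptions made for this example, and the SVM variant from the paper is not shown.

```python
import math
from collections import Counter

# Hypothetical toy records (title text, brand). The paper's real dataset
# of 40,000+ JSON rows is not reproduced here.
TRAIN = [
    ("nike air max running shoe", "Nike"),
    ("nike revolution mens running sneaker", "Nike"),
    ("adidas ultraboost running shoe", "Adidas"),
    ("adidas superstar classic sneaker", "Adidas"),
]
TEST = [
    ("nike air zoom sneaker", "Nike"),
    ("adidas ultraboost sneaker", "Adidas"),
]


def tokenize(text):
    """Lowercase whitespace tokenizer (an assumption, kept deliberately simple)."""
    return text.lower().split()


def fit_idf(docs):
    """Smoothed inverse document frequency for every term seen in training."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(tokenize(doc)))
    return {term: math.log((1 + n) / (1 + count)) + 1 for term, count in df.items()}


def tfidf(text, idf):
    """L2-normalized TF-IDF vector as a sparse dict; unseen terms are dropped."""
    tf = Counter(t for t in tokenize(text) if t in idf)
    vec = {t: c * idf[t] for t, c in tf.items()}
    norm = math.sqrt(sum(v * v for v in vec.values())) or 1.0
    return {t: v / norm for t, v in vec.items()}


def cosine(a, b):
    """Cosine similarity of two already-normalized sparse vectors."""
    return sum(w * b.get(t, 0.0) for t, w in a.items())


def knn_predict(vec, train_vecs, train_labels, k=3):
    """Majority vote among the k most similar training vectors."""
    ranked = sorted(
        zip(train_vecs, train_labels),
        key=lambda pair: cosine(vec, pair[0]),
        reverse=True,
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def weighted_precision(y_true, y_pred):
    """Per-class precision averaged with weights equal to each class's support."""
    support = Counter(y_true)
    total = 0.0
    for label, n in support.items():
        predicted = sum(1 for p in y_pred if p == label)
        correct = sum(1 for t, p in zip(y_true, y_pred) if t == p == label)
        total += (correct / predicted if predicted else 0.0) * n
    return total / len(y_true)


idf = fit_idf([text for text, _ in TRAIN])
train_vecs = [tfidf(text, idf) for text, _ in TRAIN]
train_labels = [brand for _, brand in TRAIN]

y_true = [brand for _, brand in TEST]
y_pred = [knn_predict(tfidf(text, idf), train_vecs, train_labels) for text, _ in TEST]
print(y_pred, weighted_precision(y_true, y_pred))
```

On real data the same three stages would more typically come from a machine learning library; the hand-written version here only makes each step explicit.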

Files

File bundle

Name: Towards Classifying HTML-embedded Product Data Based On Machine Learning Approach.pdf
Size: 1.69 MB
Format: Adobe Portable Document Format
