Text Codification for Statistical Production using Machine Learning

Authors

  • Helda Curma
  • Valentina Sinaj

Keywords:

Algorithms; Data; Classification

Abstract

Objective

The main objective of this research is to apply and evaluate Machine Learning algorithms for automatic text codification in the statistical production process.

Prior Work

This is an evolving area of research, driven by rapid changes in technology and by the new data ecosystem. The paper builds on previous research on text classification techniques.

Approach

In this paper, Machine Learning algorithms will be used and evaluated for text codification. Natural Language Processing and classification algorithms will be implemented in Python.
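As a rough illustration of what text codification with a classification algorithm can look like in Python, the sketch below trains a small multinomial Naive Bayes model that maps free-text descriptions to codes. It is not the authors' implementation: the choice of Naive Bayes, the class name NaiveBayesCoder, the toy descriptions and the codes "EDU", "DRV" and "HLT" are all assumptions made for this example.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and keep only runs of ASCII letters as tokens."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayesCoder:
    """Hypothetical multinomial Naive Bayes coder with add-one smoothing."""

    def fit(self, texts, codes):
        # Number of training descriptions per code (used for the prior).
        self.class_counts = Counter(codes)
        # Token frequencies per code (used for the likelihood).
        self.token_counts = defaultdict(Counter)
        for text, code in zip(texts, codes):
            self.token_counts[code].update(tokenize(text))
        self.vocab = {t for c in self.token_counts.values() for t in c}
        return self

    def predict(self, text):
        # Tokens never seen in training carry no information and are dropped.
        tokens = [t for t in tokenize(text) if t in self.vocab]
        total_docs = sum(self.class_counts.values())
        best_code, best_score = None, -math.inf
        for code, n_docs in self.class_counts.items():
            counts = self.token_counts[code]
            denom = sum(counts.values()) + len(self.vocab)
            # Log prior plus sum of smoothed log likelihoods.
            score = math.log(n_docs / total_docs)
            for token in tokens:
                score += math.log((counts[token] + 1) / denom)
            if score > best_score:
                best_code, best_score = code, score
        return best_code


# Invented toy data: occupation descriptions and made-up codes.
train_texts = [
    "primary school teacher",
    "teacher of mathematics in secondary school",
    "truck driver for a transport company",
    "taxi driver",
    "nurse in a regional hospital",
    "hospital nurse night shift",
]
train_codes = ["EDU", "EDU", "DRV", "DRV", "HLT", "HLT"]

coder = NaiveBayesCoder().fit(train_texts, train_codes)
print(coder.predict("school teacher"))
print(coder.predict("bus driver"))
```

In practice a library implementation and a much larger labelled data set would be used; the point here is only the shape of the task, where a free-text answer goes in and a classification code comes out.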

Results

Machine Learning is a powerful tool for automating and modernizing the statistical production lifecycle. The performance of Machine Learning algorithms on text classification varies from one algorithm to another. Data pre-processing and class balance in the training data set are important for achieving good results.
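One common way to address imbalance in a training set is random oversampling of the under-represented codes. The sketch below shows that technique only as an example; the abstract does not state which balancing method was used, and the function name oversample, the toy descriptions and the codes are hypothetical.

```python
import random
from collections import Counter, defaultdict


def oversample(texts, codes, seed=0):
    """Duplicate randomly chosen examples until every code is equally frequent."""
    rng = random.Random(seed)
    by_code = defaultdict(list)
    for text, code in zip(texts, codes):
        by_code[code].append(text)
    # Every code is brought up to the size of the largest one.
    target = max(len(items) for items in by_code.values())
    out_texts, out_codes = [], []
    for code, items in by_code.items():
        extra = [rng.choice(items) for _ in range(target - len(items))]
        for text in items + extra:
            out_texts.append(text)
            out_codes.append(code)
    return out_texts, out_codes


# Invented, deliberately imbalanced toy data: four "EDU" rows, one "DRV" row.
texts = [
    "primary school teacher",
    "secondary school teacher",
    "kindergarten teacher",
    "university lecturer",
    "truck driver",
]
codes = ["EDU", "EDU", "EDU", "EDU", "DRV"]

balanced_texts, balanced_codes = oversample(texts, codes)
print(Counter(codes))
print(Counter(balanced_codes))
```

Oversampling should be applied to the training portion only, after any train/test split, so that duplicated rows do not leak into the evaluation data.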

Implications

This study shows that Machine Learning can be used to automate part of the statistical codification process. The results of this paper will support the work of the Albanian administration, and statistical production in particular.

Value

This research contributes to the use of Machine Learning for modernizing the codification process. It will serve as initial work towards improving timeliness and lowering the costs of statistical production.

Published

2023-07-24

Section

The European Citizen and Public Administration