   Solving the difficult problem of topic extraction in Thai tweets  
   
Authors: Nararatwong R., Legaspi R., Cooharojananone N., Okada H., Maruyama H.
Source: Journal of Telecommunication, Electronic and Computer Engineering - 2016 - Volume: 8 - Issue: 6 - Pages: 141-145
Abstract: In this study we tackled the difficult problem of topic extraction from Thai tweets about the country's historic flood in 2011. After using latent Dirichlet allocation (LDA) to extract the topics, the first difficulty we faced was the inaccuracy of the word segmentation task, which affected our interpretation of the LDA result. To solve this, we refined the stop word list based on the LDA result, removing uninformative words caused by the word segmentation, which resulted in a more relevant and comprehensible outcome. With the improved results, we then constructed a rule-based categorization model and used it to categorize all the collected tweets on a per-week scale to observe changes in tweeting trends. Not only did the categories reveal the most relevant and compelling topics that people raised at that time, they also allowed us to understand how people perceived the situation as it unfolded over time.
Keywords: LDA; Thai tweets; Topic extraction
Addresses: Graduate University for Advanced Studies, Kanagawa, Japan, and National Institute of Informatics, Tokyo, Japan; Research Organization of Information and Systems, Transdisciplinary Research Integration Center, Institute of Statistical Mathematics, Tokyo, Japan; Chulalongkorn University, Thailand; National Institute of Informatics, Tokyo, Japan; Research Organization of Information and Systems, Transdisciplinary Research Integration Center, Institute of Statistical Mathematics, Tokyo, Japan
 
     
   
  
 
 
