Please use this identifier to cite or link to this item: https://publication.npru.ac.th/jspui/handle/123456789/2327
Title: Development of a Prototype of the Speech-to-Text Web Application for Electronic Meetings for the Research and Development Institute of Nakhon Pathom Rajabhat University
Other Titles: การพัฒนาต้นแบบเว็บแอปพลิเคชันแปลงเสียงเป็นข้อความสำหรับการประชุมอิเล็กทรอนิกส์ สถาบันวิจัยและพัฒนา มหาวิทยาลัยราชภัฏนครปฐม
Authors: Saliew, Pimonpan
Buhuatchai, Jirundon
Thammasiri, Dech
Keywords: Web Application
Speech-to-Text
Electronic Meetings
Large Language Models (LLMs)
Whisper
Issue Date: 21-Aug-2025
Publisher: The 17th NPRU National Academic Conference Nakhon Pathom Rajabhat University
Series/Report no.: Proceedings of the 17th NPRU National Academic Conference;426-438
Abstract: The objectives of this research are 1) to develop a prototype of a speech-to-text web application for electronic meetings for the Research and Development Institute of Nakhon Pathom Rajabhat University, and 2) to compare the accuracy of the models considered for that prototype. Previously, staff had to spend a long time transcribing audio recordings of meetings to produce summaries, and the institute did not want to process the audio files in the cloud, in order to keep them confidential within the organization. The researchers therefore developed a prototype speech-to-text web application with PHP and Python, built on the large language model Whisper. The test data were generated voice recordings of three messages with lengths of 10, 42, and 121 characters. The experiment was run on a personal computer and repeated three times for each of six Whisper model sizes: 1) Tiny, 2) Base, 3) Small, 4) Medium, 5) Large, and 6) Turbo. The results show that the Turbo model achieved an average accuracy of 67.81% and an average processing time of 38.54 seconds, making it the most suitable of the six models. For the 10-character message, the average accuracy was 100% with an average processing time of 28.72 seconds; for the 42-character message, 57.14% and 30.41 seconds; and for the 121-character message, 46.28% and 56.50 seconds. The researchers then adopted the Turbo model and refined the post-processing of the Thai speech-to-text output to achieve better accuracy.
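The overall Turbo figures quoted in the abstract can be cross-checked against the three per-length results. The sketch below assumes that the per-length figures belong to the Turbo model and that the overall values are unweighted means of those three averages; the abstract does not state either point explicitly. The names `turbo_results` and `overall_averages` are illustrative and do not come from the paper.

```python
from statistics import mean

# Per-length results as reported in the abstract (assumed to be Turbo's):
# message length (characters) -> (average accuracy %, average time in seconds)
turbo_results = {
    10: (100.00, 28.72),
    42: (57.14, 30.41),
    121: (46.28, 56.50),
}


def overall_averages(results):
    """Return the unweighted mean accuracy and time, rounded to 2 decimals.

    Assumption: each message length counts equally in the overall figure.
    """
    accuracy = mean(acc for acc, _ in results.values())
    seconds = mean(sec for _, sec in results.values())
    return round(accuracy, 2), round(seconds, 2)


avg_accuracy, avg_seconds = overall_averages(turbo_results)
print(f"accuracy: {avg_accuracy:.2f}%  time: {avg_seconds:.2f} s")
# prints "accuracy: 67.81%  time: 38.54 s"
```

Under these assumptions the computed values match the 67.81% and 38.54 seconds reported above, which suggests the overall figures are plain averages across the three message lengths.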
URI: https://publication.npru.ac.th/jspui/handle/123456789/2327
ISBN: 978-974-7063-48-6
Appears in Collections: Proceedings of the 17th NPRU National Academic Conference

Files in This Item:
File Description Size Format
426.pdf DTI-P1548.04 kB Adobe PDF

