Development of a Prototype of the Speech-to-Text Web Application for Electronic Meetings for the Research and Development Institute of Nakhon Pathom Rajabhat University

Saliew, Pimonpan; Buhuatchai, Jirundon; Thammasiri, Dech

Please use this identifier to cite or link to this item: https://publication.npru.ac.th/jspui/handle/123456789/2327

Full metadata record

DC Field	Value	Language
dc.contributor.author	Saliew, Pimonpan	-
dc.contributor.author	Buhuatchai, Jirundon	-
dc.contributor.author	Thammasiri, Dech	-
dc.date.accessioned	2025-09-01T07:08:46Z	-
dc.date.available	2025-09-01T07:08:46Z	-
dc.date.issued	2025-08-21	-
dc.identifier.isbn	978-974-7063-48-6	-
dc.identifier.uri	https://publication.npru.ac.th/jspui/handle/123456789/2327	-
dc.description.abstract	The objectives of this research are 1) to develop a prototype of a speech-to-text web application for electronic meetings for the Research and Development Institute of Nakhon Pathom Rajabhat University, and 2) to compare the accuracy of models for that prototype. Previously, the researchers had to spend a long time transcribing audio files of meeting summaries, and they did not want to process the audio files in the cloud, in order to maintain confidentiality within the organization. The researchers therefore developed a prototype speech-to-text web application with PHP and Python, based on the large language model Whisper. The prototype was tested on generated voice data for 3 messages with lengths of 10, 42, and 121 characters. On a personal computer, the experiment was repeated 3 times for each of 6 Whisper model sizes: 1) Tiny, 2) Base, 3) Small, 4) Medium, 5) Large, and 6) Turbo. The results show that the Turbo Whisper model, with an average accuracy of 67.81% and an average processing time of 38.54 seconds, is the most suitable of the models. For the 10-character message, the average accuracy is 100%, with an average processing time of 28.72 seconds. For the 42-character message, the average accuracy is 57.14%, with an average processing time of 30.41 seconds. For the 121-character message, the average accuracy is 46.28%, with an average processing time of 56.50 seconds. The researchers then adopted the Turbo Whisper model and improved the post-processing of the Thai speech-to-text output to achieve better accuracy.	en_US
dc.publisher	The 17th NPRU National Academic Conference Nakhon Pathom Rajabhat University	en_US
dc.relation.ispartofseries	Proceedings of the 17th NPRU National Academic Conference;426-438	-
dc.subject	Web Application	en_US
dc.subject	Speech-to-Text	en_US
dc.subject	Electronic Meetings	en_US
dc.subject	Large Language Models: LLMs	en_US
dc.subject	Whisper	en_US
dc.title	Development of a Prototype of the Speech-to-Text Web Application for Electronic Meetings for the Research and Development Institute of Nakhon Pathom Rajabhat University	en_US
dc.title.alternative	การพัฒนาต้นแบบเว็บแอปพลิเคชันแปลงเสียงเป็นข้อความสำหรับการประชุมอิเล็กทรอนิกส์ สถาบันวิจัยและพัฒนา มหาวิทยาลัยราชภัฏนครปฐม	en_US
dc.type	Other	en_US
Appears in Collections:	Proceedings of the 17th NPRU National Academic Conference
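
The overall figures quoted in the abstract (67.81% and 38.54 seconds) can be reproduced from the three per-message figures. The short Python check below is only a sketch: it assumes the overall values are unweighted means across the three message lengths and that the per-message figures belong to the Turbo model, which the abstract does not state explicitly but which the arithmetic is consistent with. The variable names are illustrative and not taken from the paper.

```python
# Per-message results quoted in the abstract:
# (message length in characters, average accuracy in %, average time in seconds)
# Assumption: these rows describe the Turbo model.
reported_results = [
    (10, 100.00, 28.72),
    (42, 57.14, 30.41),
    (121, 46.28, 56.50),
]


def mean(values):
    """Unweighted arithmetic mean of an iterable of numbers."""
    values = list(values)
    return sum(values) / len(values)


# Assumption: the overall figures are simple means over the three messages.
mean_accuracy = round(mean(acc for _, acc, _ in reported_results), 2)
mean_seconds = round(mean(sec for _, _, sec in reported_results), 2)

print(f"mean accuracy: {mean_accuracy:.2f}%")
print(f"mean processing time: {mean_seconds:.2f} s")
```

Under these assumptions the computed means agree with the overall values reported in the abstract.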

Files in This Item:

File	Description	Size	Format
426.pdf	DTI-P1	548.04 kB	Adobe PDF
