Development of LVQ for Speech Signal Compression

เกรียงศักดิ์ พัฒนบุรี

Please use this identifier to cite or link to this item: http://www.repository.rmutt.ac.th/xmlui/handle/123456789/2222

Title:	Development of LVQ for Speech Signal Compression (การพัฒนา LVQ สำหรับการบีบอัดสัญญาณเสียง)
Other Titles:	Enhancement LVQ for speech compression
Authors:	เกรียงศักดิ์ พัฒนบุรี
Keywords:	speech signal processing; speech processing
Issue Date:	2555 BE (2012 CE)
Publisher:	Rajamangala University of Technology Thanyaburi. Faculty of Engineering. Department of Electrical Engineering
Abstract:	The aim of speech compression is to reduce a speech signal to as little information as possible while keeping the speech quality as high as possible. This thesis presents a way to improve the efficiency of speech compression by applying the Learning Vector Quantization (LVQ) technique to speech coding schemes such as Linear Predictive Coding coefficients of order 10 (LPC-10) and Line Spectral Pairs of order 10 (LSP-10), in order to increase the compression rate of the speech signal. The experiment uses 10 speech signals, collected from 5 male and 5 female speakers, as input. The signals were recorded through a microphone at a sampling rate of 8 kHz and then analyzed to obtain the LPC-10 and LSP-10 coefficients, respectively. LVQ is then used to compress both sets of coefficients so that their compression performance can be compared. The Normalized Root Mean Squared Error (NRMSE) measures the coding error, while the Peak Signal-to-Noise Ratio (PSNR) and the Signal-to-Noise Ratio (SNR) measure the quality of the synthesized speech. In terms of NRMSE, PSNR, and SNR, compressing the LSP-10 coefficients with LVQ (LVQ-LSP-10) gives the best compression: the minimum NRMSE is 0.0111, the maximum PSNR is 41.0208 dB, and the maximum SNR is 36.6372 dB.
URI:	http://www.repository.rmutt.ac.th/dspace/handle/123456789/2222
Appears in Collections:	Theses (Thesis - EN)
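
The abstract describes quantizing coefficient vectors against a codebook and scoring the result with NRMSE, PSNR, and SNR. The Python sketch below illustrates only that general idea and is not the thesis's implementation. The record does not say how the LVQ codebook was trained or how each metric was normalized, so the following are assumptions: the codebook is given rather than learned, NRMSE is the RMSE divided by the range of the reference values, and the PSNR peak is the largest absolute reference value. All function names and the toy 2-dimensional data are hypothetical (the thesis uses order-10 coefficient vectors).

```python
import math


def nearest_index(vector, codebook):
    """Index of the codeword closest to `vector` (squared Euclidean distance)."""
    return min(
        range(len(codebook)),
        key=lambda i: sum((v - c) ** 2 for v, c in zip(vector, codebook[i])),
    )


def encode(vectors, codebook):
    """Replace each coefficient vector by the index of its nearest codeword."""
    return [nearest_index(v, codebook) for v in vectors]


def decode(indices, codebook):
    """Rebuild approximate vectors by looking the indices up in the codebook."""
    return [list(codebook[i]) for i in indices]


def flatten(vectors):
    return [x for v in vectors for x in v]


def mse(reference, reconstructed):
    return sum((x - y) ** 2 for x, y in zip(reference, reconstructed)) / len(reference)


def nrmse(reference, reconstructed):
    # Assumption: RMSE normalized by the range of the reference values.
    return math.sqrt(mse(reference, reconstructed)) / (max(reference) - min(reference))


def psnr_db(reference, reconstructed):
    # Assumption: the peak is the largest absolute reference value.
    # Requires a non-zero reconstruction error.
    peak = max(abs(x) for x in reference)
    return 10.0 * math.log10(peak ** 2 / mse(reference, reconstructed))


def snr_db(reference, reconstructed):
    # Ratio of signal energy to error energy; requires a non-zero error.
    signal = sum(x * x for x in reference)
    noise = sum((x - y) ** 2 for x, y in zip(reference, reconstructed))
    return 10.0 * math.log10(signal / noise)


# Toy data (hypothetical): two codewords and two vectors to compress.
codebook = [[1.0, -1.0], [-1.0, 1.0]]
vectors = [[0.9, -0.9], [-0.9, 0.9]]

indices = encode(vectors, codebook)    # one small integer per vector
restored = decode(indices, codebook)   # lossy reconstruction

reference, approx = flatten(vectors), flatten(restored)
print(indices, nrmse(reference, approx), psnr_db(reference, approx), snr_db(reference, approx))
```

Storing one index per vector instead of the coefficients themselves is where the compression comes from. A lower NRMSE and higher PSNR and SNR indicate a reconstruction closer to the original.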

Files in This Item:

File	Description	Size	Format
143582.pdf	Development of LVQ for Speech Signal Compression	54.06 MB	Adobe PDF
