Speaker recognition using neural networks.

number: 
286
English
department: 
Degree: 
Imprint: 
Computer Science
Author: 
Ala'a Hassan Harif
Supervisor: 
Dr. Mohammed Ali Shallal
Dr. Riyadh Abdul Kader Mehdi
year: 
1998
Abstract:

Speaker recognition is a very important field that can be used in many applications, such as controlling access to protected areas, banking transactions over telephone networks, database access services, voice mail, forensic investigations, and many other areas that make people's lives better. Speaker recognition is divided into two fields: speaker verification and speaker identification. In speaker verification, the claimed identity of the speaker is examined and either accepted or rejected. Speaker identification, on the other hand, is the process of determining which of the registered speakers a given utterance comes from. Neural networks have been shown to be effective for solving many problems in very diverse applications. The objective of this work is to investigate and introduce the use of neural networks (NN) in the construction of a speaker recognition system. The proposed NN was used as a speaker identification system. This thesis also discusses a number of features that can be effectively used to distinguish people by their voices. Linear Prediction Coding (LPC) using the autocorrelation method is the speech feature extraction approach used in this work, and it showed acceptable results. The thesis also discusses other features that might be used in building such a system. A feedforward backpropagation neural network structure is proposed for constructing this system. The basic neural structure unit used in this research consists of a three-layer perceptron. A number of these units (35 to 65) are combined in training and testing a speaker's extracted feature patterns. The input to the proposed system is a speech utterance of a sentence, and the output is the identity of the speaker. Many experiments and tests were carried out to improve the system performance.
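The LPC feature extraction by the autocorrelation method mentioned above can be illustrated with a short sketch. This is not the thesis code (which was written in Turbo Pascal); it is a minimal Python illustration of the general technique, using the standard Levinson-Durbin recursion. The function names, the Hamming window, and the predictor order used in the usage note are assumptions chosen for illustration, since the abstract does not state them.

```python
import math


def hamming(frame):
    """Apply a Hamming window (an assumed pre-processing step)."""
    n = len(frame)
    if n < 2:
        return list(frame)
    return [x * (0.54 - 0.46 * math.cos(2.0 * math.pi * i / (n - 1)))
            for i, x in enumerate(frame)]


def autocorrelation(frame, max_lag):
    """Return r[0..max_lag], where r[k] = sum over n of frame[n] * frame[n + k]."""
    n = len(frame)
    return [sum(frame[i] * frame[i + k] for i in range(n - k))
            for k in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the LPC normal equations from autocorrelation values r.

    Returns (coeffs, residual_energy), where coeffs[j - 1] is a_j in the
    predictor s[n] ~ sum_j a_j * s[n - j].
    """
    a = [0.0] * (order + 1)  # a[0] is unused
    error = r[0]
    for i in range(1, order + 1):
        if error <= 0.0:
            break  # degenerate (e.g. silent) frame: stop early
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / error  # reflection coefficient for this order
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        error *= (1.0 - k * k)
    return a[1:], error


def lpc_features(frame, order):
    """Window one speech frame and return its LPC coefficients."""
    r = autocorrelation(hamming(frame), order)
    coeffs, _ = levinson_durbin(r, order)
    return coeffs
```

As a usage example, calling `lpc_features(frame, 10)` on a list of samples from one speech frame returns ten predictor coefficients, which could then serve as one input pattern for a classifier. The order of 10 is only a placeholder.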
The training set of the system covered 20 speakers (12 male and 8 female), and one of the main distinguishing features of the proposed system is its ability to be trained on and identify an unlimited number of speakers. Tests showed a recognition rate of 97% for patterns never presented to the system and a 100% recognition rate for previously trained patterns. Training was performed using a maximum of 10 utterances of each sentence per speaker. The proposed system depends on a text-prompting approach using 9 different sentences. Other important features of the proposed system are: the ability to increase the number of sentences; the ability to delete any speaker without any influence on the other speakers; the ability to identify people with different health conditions if such patterns were presented to the system before; and the ability to work in random prompting mode. The proposed system was implemented on an 80486DX-2 66 MHz computer with Sound Blaster support, DOS version 6.0 or higher, and 4 MB RAM, using Turbo Pascal language version.
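The identification step described in the abstract (a feature pattern goes in, a speaker identity comes out) can be illustrated with a small three-layer feedforward network trained by backpropagation. The sketch below is a generic textbook version, not the thesis architecture: the class name, layer sizes, learning rate, number of epochs, and the two-dimensional toy patterns standing in for LPC feature vectors are all assumptions made for illustration.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class ThreeLayerPerceptron:
    """Input layer -> one sigmoid hidden layer -> sigmoid output layer."""

    def __init__(self, n_in, n_hidden, n_out, seed=0):
        rng = random.Random(seed)
        # One row of weights per unit; the last entry of each row is the bias.
        self.w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
                         for _ in range(n_hidden)]
        self.w_out = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
                      for _ in range(n_out)]

    def forward(self, x):
        xb = list(x) + [1.0]
        hidden = [sigmoid(sum(w * v for w, v in zip(row, xb)))
                  for row in self.w_hidden]
        hb = hidden + [1.0]
        out = [sigmoid(sum(w * v for w, v in zip(row, hb)))
               for row in self.w_out]
        return hidden, out

    def train_step(self, x, target, lr):
        hidden, out = self.forward(x)
        xb = list(x) + [1.0]
        hb = hidden + [1.0]
        # Output deltas for squared error with sigmoid units.
        d_out = [(o - t) * o * (1.0 - o) for o, t in zip(out, target)]
        # Hidden deltas: output deltas propagated back through w_out.
        d_hid = [h * (1.0 - h) *
                 sum(d_out[k] * self.w_out[k][j] for k in range(len(d_out)))
                 for j, h in enumerate(hidden)]
        for k, row in enumerate(self.w_out):
            for j in range(len(row)):
                row[j] -= lr * d_out[k] * hb[j]
        for j, row in enumerate(self.w_hidden):
            for i in range(len(row)):
                row[i] -= lr * d_hid[j] * xb[i]

    def identify(self, x):
        """Return the index of the speaker whose output unit is largest."""
        _, out = self.forward(x)
        return max(range(len(out)), key=lambda k: out[k])


def train(net, patterns, labels, n_speakers, epochs=3000, lr=0.5):
    for _ in range(epochs):
        for x, label in zip(patterns, labels):
            target = [1.0 if k == label else 0.0 for k in range(n_speakers)]
            net.train_step(x, target, lr)


# Toy stand-ins for feature vectors of three hypothetical speakers.
patterns = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]
labels = [0, 1, 2]
net = ThreeLayerPerceptron(n_in=2, n_hidden=4, n_out=3)
train(net, patterns, labels, n_speakers=3)
```

After training, `net.identify(pattern)` returns the index of the most likely speaker for a given pattern. In a real system the inputs would be feature vectors extracted from speech rather than these toy points.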