Spectral Analysis Of Speech Signal


Contents

1 Project description
1.1 Filter bank analysis
1.2 Conversion by discrete Fourier transform
1.3 Windowing of a signal
1.4 Spectrogram of signal
2 Code
3 Output
4 References






                              
SPECTRAL ANALYSIS OF SPEECH SIGNAL
INTRODUCTION:
Spectral analysis is an elementary operation in speech recognition. Speech recognition requires heavy computation because each analysis window contains a large number of samples. Methods based on the Fourier transform are commonly used in speech recognition, and one of the most widely used is the Fast Fourier Transform (FFT). The FFT is a basic digital signal processing technique for spectrum analysis, and it is often used to compute numerical approximations to the continuous Fourier transform. Another transformation is the Discrete Cosine Transform (DCT). The Discrete Tchebichef Transform (DTT) is a further transform method, based on discrete Tchebichef polynomials. The DTT has a lower computational complexity and, unlike continuous orthonormal transforms, does not require a complex transform. Preliminary experimental results show that the DTT has the potential to be a simpler and faster transformation for speech recognition.

OBJECTIVES:
Load, display and manipulate speech signals in both the time domain and the frequency domain.

MODULES:
1. Build and perform a filter bank analysis of a speech signal.
2. Use the discrete Fourier transform to convert a waveform to a spectrum and vice versa.
3. Divide a signal into overlapping windows.
4. Compute and display a spectrogram.









Time domain & Frequency domain:
Time-domain analysis studies mathematical functions, physical signals, or time series of economic or environmental data with respect to time. In the time domain, the value of the signal or function is known for all real numbers in the continuous-time case, or at separate instants in the discrete-time case. Frequency-domain analysis studies mathematical functions or signals with respect to frequency rather than time.
A time-domain graph shows how a signal changes over time, whereas a frequency-domain graph shows how much of the signal lies within each frequency band over a range of frequencies. A frequency-domain representation can also include information on the phase shift that must be applied to each sinusoid in order to recombine the frequency components and recover the original time signal.
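The two views can be illustrated with a minimal sketch in plain Python (standard library only). The helper names dft and idft, the 64-sample frame and the 4-cycle test tone are assumptions chosen for illustration. The sketch uses a slow, naive transform to show that the frequency-domain magnitudes and phases are enough to rebuild the time-domain samples.

```python
import cmath
import math

def dft(x):
    """Naive O(N^2) DFT: one complex amplitude per frequency bin."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(spectrum):
    """Inverse DFT: sum the sinusoids, each with its own amplitude and phase."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
                for k in range(n)) / n
            for t in range(n)]

# Time-domain view: 64 samples of a sine that completes 4 cycles in the frame.
N = 64
signal = [math.sin(2 * math.pi * 4 * t / N) for t in range(N)]

# Frequency-domain view: magnitude and phase of each bin.
spectrum = dft(signal)
magnitude = [abs(c) for c in spectrum]
phase = [cmath.phase(c) for c in spectrum]
peak_bin = max(range(N // 2), key=lambda k: magnitude[k])

# Recombining the components recovers the original time signal.
recovered = [c.real for c in idft(spectrum)]
max_error = max(abs(a - b) for a, b in zip(signal, recovered))
```

Here the energy of the tone shows up in bin 4, and the recovered samples differ from the originals only by floating-point rounding.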
  


Filter bank analysis:
The most flexible way to perform spectral analysis is to use a bank of band-pass filters. A filter bank can be designed to provide a spectral analysis with any degree of frequency resolution (wide or narrow), even with non-linear filter spacing and bandwidths. A disadvantage of filter banks is that they almost always take more calculation and processing time than discrete Fourier analysis using the FFT.
To use a filter bank for analysis we need one band-pass filter per channel to do the filtering, a means to perform rectification, and a low-pass filter to smooth the energies. In this example, we build a 19-channel filter bank using bandwidths that are modelled on human auditory bandwidths. We rectify and smooth the filtered energies and convert them to a decibel scale.
A band-pass filter is a device that passes frequencies within a certain range and rejects frequencies outside that range. "Band pass" is an adjective that describes a type of filter or filtering process; it is to be distinguished from "pass band", which refers to the actual portion of the affected spectrum. Hence, one might say "A dual band-pass filter has two pass bands." A band-pass signal is a signal containing a band of frequencies not adjacent to zero frequency, such as a signal that comes out of a band-pass filter.
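The filter, rectify, smooth and decibel steps described above can be sketched in plain Python as follows. This is not the 19-channel auditory filter bank of the project. It is a minimal sketch with three hypothetical channels. The second-order band-pass design, the one-pole smoother with a 50 Hz cutoff, the 16 kHz sampling rate, and the function names are all assumptions chosen for illustration.

```python
import math

def bandpass(x, fs, centre, bandwidth):
    """Second-order band-pass biquad with unity gain at the centre frequency."""
    w0 = 2 * math.pi * centre / fs
    q = centre / bandwidth
    alpha = math.sin(w0) / (2 * q)
    b0, b2 = alpha, -alpha                      # b1 is zero for this design
    a0, a1, a2 = 1 + alpha, -2 * math.cos(w0), 1 - alpha
    y, x1, x2, y1, y2 = [], 0.0, 0.0, 0.0, 0.0
    for s in x:
        out = (b0 * s + b2 * x2 - a1 * y1 - a2 * y2) / a0
        x2, x1, y2, y1 = x1, s, y1, out         # shift the delay lines
        y.append(out)
    return y

def smooth(x, fs, cutoff=50.0):
    """One-pole low-pass filter used to smooth the rectified signal."""
    a = 1 - math.exp(-2 * math.pi * cutoff / fs)
    y, state = [], 0.0
    for s in x:
        state += a * (s - state)
        y.append(state)
    return y

def channel_levels_db(x, fs, channels):
    """Band-pass, rectify, smooth, and convert each channel to decibels."""
    levels = []
    for centre, bandwidth in channels:
        rectified = [abs(v) for v in bandpass(x, fs, centre, bandwidth)]
        envelope = smooth(rectified, fs)
        # Use the final smoothed value; the floor avoids log10(0).
        levels.append(20 * math.log10(max(envelope[-1], 1e-10)))
    return levels

fs = 16000                                         # assumed sampling rate in Hz
channels = [(250, 100), (1000, 200), (4000, 800)]  # hypothetical (centre, bandwidth) in Hz
tone = [math.sin(2 * math.pi * 1000 * t / fs) for t in range(fs // 4)]
levels = channel_levels_db(tone, fs, channels)
```

For the 1000 Hz test tone, the channel centred on 1000 Hz reports the highest level, and the two neighbouring channels sit well below it.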



Spectral analysis using Fourier transforms:
The discrete-time, discrete-frequency version of the Fourier transform (DFT) converts an array of N sample amplitudes to an array of N complex harmonic amplitudes. If the sampling rate is fs, the N input samples are 1/fs seconds apart, and the output harmonic frequencies are fs/N Hertz apart. That is, the N output amplitudes are evenly spaced at frequencies between 0 and (N-1) fs/N Hertz. Perform the DFT on the speech signal, using power-of-two sizes such as 512 or 1024 for the fastest speed. Plot and display the magnitude and phase spectrum.
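As a worked instance of this arithmetic, the short Python sketch below lists the bin frequencies for an assumed sampling rate of 16 kHz and a transform size of 512. Both values and the helper name bin_frequencies are illustrative assumptions.

```python
def bin_frequencies(fs, n):
    """Frequencies in Hz of the n DFT output bins for sampling rate fs."""
    return [k * fs / n for k in range(n)]

fs = 16000                    # assumed sampling rate in Hz
n = 512                       # transform size
freqs = bin_frequencies(fs, n)

sample_period = 1 / fs        # time between input samples, in seconds
spacing = fs / n              # 31.25 Hz between adjacent harmonics
highest = freqs[-1]           # (n - 1) * fs / n = 15968.75 Hz
```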
To compute the DFT in MATLAB, we use the function fft(x,n). This function takes a waveform x and the number of samples n. When n is less than the length of x, x is truncated; when n is greater than the length of x, x is padded with zeros. The output is an array of complex amplitudes of length n. The magnitude of each spectral component is obtained with abs(), and its phase, in radians, with angle().
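The truncate-or-pad behaviour and the magnitude and phase steps can be imitated in plain Python. The sketch below is not MATLAB's implementation. It is a textbook recursive radix-2 FFT, which only accepts power-of-two sizes, wrapped in a hypothetical helper fft_n that mimics the behaviour described above. The 8 kHz sampling rate and the 1000 Hz test tone are assumptions.

```python
import cmath
import math

def fft_radix2(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    if n % 2:
        raise ValueError("length must be a power of two")
    even = fft_radix2(x[0::2])
    odd = fft_radix2(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out

def fft_n(x, n):
    """Truncate x to n samples, or zero-pad it up to n, then transform."""
    x = list(x[:n]) + [0.0] * max(0, n - len(x))
    return fft_radix2(x)

fs = 8000                     # assumed sampling rate in Hz
n = 512
# 300 samples of a 1000 Hz tone: shorter than n, so it is zero-padded.
x = [math.cos(2 * math.pi * 1000 * t / fs) for t in range(300)]

spectrum = fft_n(x, n)
magnitude = [abs(c) for c in spectrum]        # plays the role of abs()
phase = [cmath.phase(c) for c in spectrum]    # plays the role of angle(), radians
peak_bin = max(range(n // 2), key=lambda k: magnitude[k])
peak_hz = peak_bin * fs / n
```

Splitting the input into even and odd halves at every level is what makes power-of-two sizes such as 512 and 1024 convenient for a fast transform.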

Windowing a signal:
It is often desirable to analyze a long signal in overlapping short sections called “windows”, for example to calculate an average spectrum or a spectrogram. Unfortunately, the signal cannot simply be chopped into short pieces, because this causes sharp discontinuities at the edges of each section. It is preferable to have smooth joins between sections, and raised cosine windows are a popular shape for the join.
The speech signal is constantly changing (non-stationary), whereas signal processing algorithms usually assume that the signal is stationary. The usual compromise is piecewise stationarity: the speech signal is modelled as a sequence of frames, each assumed to be stationary.
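Cutting a signal into overlapping frames is mostly index arithmetic, as the Python sketch below shows. The 25 ms frame length, 10 ms hop and 16 kHz sampling rate are assumed values for illustration, and the function names are hypothetical.

```python
def frame_starts(num_samples, frame_len, hop):
    """Start indices of every full frame; a trailing partial frame is dropped."""
    return list(range(0, num_samples - frame_len + 1, hop))

def split_frames(signal, frame_len, hop):
    """Cut signal into frames of frame_len samples, spaced hop samples apart."""
    return [signal[s:s + frame_len]
            for s in frame_starts(len(signal), frame_len, hop)]

fs = 16000                       # assumed sampling rate in Hz
frame_len = fs * 25 // 1000      # 25 ms frame: 400 samples
hop = fs * 10 // 1000            # 10 ms hop: 160 samples, so frames overlap by 240
signal = list(range(fs))         # one second of placeholder samples
frames = split_frames(signal, frame_len, hop)
```

Each frame is short enough to be treated as stationary, and consecutive frames share samples because the hop is smaller than the frame length.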
Windowing: multiply the full waveform s[n] by a window w[n] in the time domain:

    x[n] = w[n] s[n]
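The product x[n] = w[n] s[n] can be sketched in plain Python with a raised-cosine (Hann) window. The periodic form of the window, the 8-sample frame and the function names are assumptions chosen to keep the example small.

```python
import math

def hann(n):
    """Periodic raised-cosine (Hann) window: 0.5 - 0.5*cos(2*pi*i/n)."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / n) for i in range(n)]

def apply_window(s, w):
    """x[n] = w[n] * s[n], sample by sample."""
    if len(s) != len(w):
        raise ValueError("frame and window must have the same length")
    return [wi * si for wi, si in zip(w, s)]

frame_len = 8
w = hann(frame_len)
s = [1.0] * frame_len            # a constant frame makes the window shape visible
x = apply_window(s, w)

# With a hop of half a frame, the tail of one window and the head of the
# next add up to 1, so overlapping frames join without a discontinuity.
half = frame_len // 2
overlap_sum = [w[i + half] + w[i] for i in range(half)]
```

The windowed frame starts at zero and rises to its full value in the middle, which is what removes the sharp edges that plain chopping would create.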
Bhanu Namikaze

