01 Oct 2017
Week Ending 1 October
This past week I helped schedule volunteers to participate in the study and worked with the team to advertise it. We hope to find another 20 participants this month.
We also discussed next steps for the audio analysis, which I will use to guide my literature review next week.
I’ve also continued my literature and application review, and have found the following tools:
24 Sep 2017
Week Ending 24 September
This week I was introduced to the FACES sketch software. It is software that
allows the user to build forensic sketches by selecting different facial components.
There are about 20 types of facial components, from hair to nose shape to wrinkles,
and for each component there are hundreds of options to choose from.
I started by building my own sketch. I found it quite difficult
and time consuming, both because of the sheer number of options and because I
tended to get stuck trying to pick the best match for a feature among similar
options, where all the choices were slightly off from reality. I found that my
brain would become desensitized to the differences, and I would have to move on to
a different feature and finish the more subtle choices later.
In general I found that the best way to create a sketch was to go feature by
feature, getting each one recognizably accurate before moving on to the next, and
then go back at the end to make final adjustments.
17 Sep 2017
Second Week
This week we discussed the timeline and trajectory of the project, especially with regard to data collection. Now that the details have been established, I will begin helping with data collection next week.
In addition, I continued exploring the existing code and working on my literature review. Summaries of a few of the papers are shown below.
- T. Drugman, Y. Stylianou, “Voice Activity Detection: Merging Source and Filter-based Information”, IEEE Signal Processing Letters, 2014.
  - They propose source-related features and merge them with filter-based features to produce VAD results. Their method outperformed previous ones, with an average F1 score of 93%. They attribute this to the fact that filter-based features (especially those based on power spectral density) tend not to deal well with “sporadic impulsive noise”.
  - Filter-based features:
    - spectral envelope
    - MFCC
    - PLP
    - low computational cost
  - Source-related features:
    - harmonics: SRH (summation of residual harmonics)
    - Cepstral Peak Prominence (CPP)
  - Information fusion and classification:
    - Artificial Neural Network (ANN)
- Martin, Rainer. “Spectral subtraction based on minimum statistics.” Proc. EUSIPCO, 1994.
  - They propose a spectral subtraction algorithm for the enhancement of noisy speech signals. To limit the suppression of low-energy phonemes, they use less oversubtraction at high frequencies and in high-SNR conditions. They found that this method “eliminates the need for a speech activity detector” and has an advantage over other methods at removing non-stationary noise.
- Kinnunen, Tomi, and Padmanabhan Rajan. “A practical, self-adaptive voice activity detector for speaker verification with noisy telephone and microphone data.” ICASSP. 2013.
  - They present an unsupervised VAD algorithm that trains models based on MFCCs. They found that it outperforms energy VAD, although accuracy still drops with decreasing SNR.
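For my own reference, here is a rough sketch of the kind of energy-based VAD that the last paper uses as its baseline. It is not the method from any of the papers above: the frame length, the 30 dB margin, and the function names are all assumptions I picked for illustration, and the input is a synthetic signal rather than real recordings.

```python
import math
import random


def frame_energies_db(samples, frame_len=160):
    """Return the log energy (in dB) of each non-overlapping frame."""
    energies = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        power = sum(x * x for x in frame) / frame_len
        # Small constant avoids log10(0) on silent frames.
        energies.append(10.0 * math.log10(power + 1e-12))
    return energies


def energy_vad(samples, frame_len=160, margin_db=30.0):
    """Mark a frame as speech if its energy is within margin_db of the loudest frame.

    The margin is an assumed value, not one taken from the literature.
    """
    energies = frame_energies_db(samples, frame_len)
    if not energies:
        return []
    threshold = max(energies) - margin_db
    return [e > threshold for e in energies]


# Synthetic test signal at an assumed 8 kHz rate:
# quiet noise, then a louder 200 Hz tone standing in for speech, then quiet noise.
random.seed(0)
rate = 8000
quiet_before = [random.gauss(0.0, 0.001) for _ in range(800)]
tone = [0.5 * math.sin(2 * math.pi * 200 * n / rate) for n in range(800)]
quiet_after = [random.gauss(0.0, 0.001) for _ in range(800)]

labels = energy_vad(quiet_before + tone + quiet_after)
print(labels)
```

A fixed margin like this is exactly what breaks down as SNR decreases, which is the weakness the unsupervised approach is meant to address.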
Plan for next week:
- Begin data collection
- Record ground-truth to determine accuracy
- Search for existing applications
- Continue literature review
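To make the ground-truth comparison concrete, below is a minimal sketch of how frame-level accuracy could be scored once labels are recorded. It assumes both the detector output and the ground truth are lists of per-frame booleans of equal length; the function name and the example labels are made up for illustration and are not from the project code.

```python
def precision_recall_f1(predicted, truth):
    """Compute frame-level precision, recall, and F1 for boolean speech labels."""
    if len(predicted) != len(truth):
        raise ValueError("predicted and truth must have the same number of frames")

    true_pos = sum(1 for p, t in zip(predicted, truth) if p and t)
    false_pos = sum(1 for p, t in zip(predicted, truth) if p and not t)
    false_neg = sum(1 for p, t in zip(predicted, truth) if not p and t)

    # Guard against division by zero when there are no positives at all.
    precision = true_pos / (true_pos + false_pos) if (true_pos + false_pos) else 0.0
    recall = true_pos / (true_pos + false_neg) if (true_pos + false_neg) else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical labels for six frames (True = speech).
predicted = [True, True, False, False, True, False]
truth = [True, False, False, True, True, False]

precision, recall, f1 = precision_recall_f1(predicted, truth)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

Reporting F1 rather than plain accuracy keeps the score comparable to the numbers quoted in the VAD papers above.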
09 Sep 2017
First Week
I was introduced to the project and the system. I took photos of myself to be added as training data, read through the code, and ran it on various inputs to get a better idea of how the current system works. I also began to familiarize myself with the literature.