Difference between revisions of "Speech Reading Group"

(Created page with "This is the wiki page for topics to be discussed in the Speech Reading Group, starting Winter 2013. The group will cover any topics with some relation to speech research, inc...")
 
(Prior Meetings)
 
(24 intermediate revisions by 2 users not shown)
Line 1: Line 1:
 
This is the wiki page for topics to be discussed in the Speech Reading Group, starting Winter 2013.
 
  
The group will cover any topics with some relation to speech research, including but not limited to:
+
The group is meant to offer an opportunity for people working on or interested in speech technology to interact and share ideas, with the goals of fostering a larger sense of community amongst speech researchers at UW, and allowing participants to become familiar with the breadth of research going on here and in the speech community at large.
Speech Recognition
 
Speech Production
 
Feature Extraction
 
Speech Perception
 
Cognitive Psychology of Speech
 
  
It is meant to offer an opportunity for people working on or interested in speech technology to interact and share ideas, with the goals of fostering a larger sense of community amongst speech researchers at UW, and allowing participants to become familiar with the breadth of research going on here and in the speech community at large.
 
  
 +
We will cover any topics with some relation to speech research, including but not limited to:
 +
* Speech Recognition
 +
* Speech Production
 +
* Feature Extraction
 +
* Speech Enhancement/Separation
 +
* Auditory scene analysis
 +
* Speech Perception
 +
* Cognitive Psychology of Speech
  
== Administrative Meeting ==
 
  
  
There will be an initial administrative meeting as follows:
+
For questions or other matters, please contact either of the following:
 +
 
 +
* Gabe Schubiner - gabeos [at] cs.wash....edu
 +
* Scott Wisdom  - swisdom [at] ee.wash....edu
 +
 
 +
 
 +
== Announcements ==
 +
 
 +
* The permanent meeting time is Thursdays at 5:00 pm.
 +
 
 +
== Meeting Schedule ==
 +
 
 +
'''Next Meeting'''
  
'''When: Thursday, 4:00 pm'''
+
'''When: Thursday, January 24, 5:00 pm'''
'''Where: CSE 303'''
 
  
This meeting will be to gauge interest and size of the group, as well as to discuss some administrative matters, such as organization of the group, meeting times, location, and paper schedule.
+
'''Where: CSE 503'''
  
 +
The paper for each meeting should be sent to the mailing list no later than the Sunday before the meeting.
  
For questions, or other, please contact either:
+
== Email list ==
  
Gabe Schubiner - gabeos@cs.washington.edu
+
You can subscribe here: https://mailman.cs.washington.edu/mailman/listinfo/speech-rg
Scott Wisdom  - scott.thomas.wisdom@gmail.com
 
  
 +
== Prior Meetings ==
 +
{| class="wikitable"
 +
! Date
 +
! Paper
 +
! Authors
 +
! Venue
 +
! Leader
 +
! Info
 +
|-
 +
| 1/10/13
 +
| Pilot Meeting
 +
| N/A
 +
| CSE 303
 +
| N/A
 +
| Discussed organization of group.
 +
|-
 +
| 1/17/13
 +
| [https://research.microsoft.com/pubs/160916/selfridge2011sigdial.pdf Stability and Accuracy in Incremental Speech Recognition]
 +
| Ethan O. Selfridge, Iker Arizmendi, Peter A. Heeman, and Jason D. Williams
 +
| CSE 303
 +
| Shiri Azenkot
 +
|
 +
|-
 +
| 1/24/13
 +
| [http://aclweb.org/anthology-new/D/D12/D12-1008.pdf Optimising Incremental Dialogue Decisions Using Information Density for Interactive Systems]
 +
| Nina Dethlefs; Helen Hastie; Verena Rieser; Oliver Lemon
 +
| CSE 503
 +
| Ben Hixon
 +
|
 +
|-
 +
| 1/31/13
 +
| Analysing the correspondence between automatic prosodic segmentation and syntactic structure
 +
| Szaszák, G., Nagy, K., & Beke, A.
 +
| CSE 503
 +
| Gabe Schubiner
 +
|
 +
|-
 +
| 2/7/13
 +
| [http://laslab.org/upload/monaural_speech_separation_and_recognition_challenge.pdf Monaural Speech Separation and Recognition Challenge]
 +
| Martin Cooke, John R. Hershey, and Steven J. Rennie
 +
| CSE 503
 +
| Scott Wisdom
 +
|
 +
|-
 +
| 2/14/13
 +
| [http://www.cs.illinois.edu/~paris/pubs/wilson-icassp08.pdf Speech Denoising Using Nonnegative Matrix Factorization with Priors]
 +
| Kevin W. Wilson, Bhiksha Raj, Paris Smaragdis, and Ajay Divakaran
 +
| CSE 503
 +
| Gabe Schubiner
 +
|
 +
|}
  
 
== Potential Papers ==
 
  
Coming soon...
+
* [http://www.cs.cmu.edu/~awb/papers/IS2012/1146_Paper.pdf Miranda, J., Neto, J., and Black, A. Parallel combination of speech streams for improved ASR. Interspeech 2012, Portland, OR.]
 +
 
 +
* [http://www.cs.cmu.edu/~awb/papers/is2011_spamf0.pdf Anumanchipalli, G., Oliveira, L., and Black, A., A Statistical Phrase/Accent Model for Intonation Modeling, Interspeech 2011, Florence, Italy]
 +
 
 +
* [http://www.cs.cmu.edu/~awb/papers/asru2009/AS090074.pdf Al-Haj, H., Hsiao, R., Lane, I., Black, A., and Waibel, A. "Pronunciation Modeling for Dialectal Arabic Speech Recognition" ASRU 2009, Merano, Italy.]
 +
 
 +
* [http://www1.cs.columbia.edu/~fadi/papers/Interspeech09_biadsy_hirschberg.pdf Fadi Biadsy, Julia Hirschberg, "Using Prosody and Phonotactics in Arabic Dialect Identification," In Proceedings of Interspeech 2009, Brighton, UK.]
 +
 
 +
* [http://www.cs.columbia.edu/speech/PaperFiles/2009/RosenbergAndHirschberg09.pdf Andrew Rosenberg, Julia Hirschberg, "Detecting Pitch Accents at the Word, Syllable, and Vowel Level," NAACL/HLT 2009, Boulder, CO.]
 +
 
 +
* [http://www.prosodylab.org/~chael/www/etap2/abstracts/lucente_etal.pdf Luciana Lucente, Julia Hirschberg and Plínio Barbosa, “Intonation, Discourse Structure and Information Status in Spontaneous Speech,” ETAP 2, Montreal. 2011]
 +
 
 +
* [http://www1.cs.columbia.edu/~sbenus/Research/Gravano_et_al_Social_behavior_IS11.pdf Agustín Gravano, Rivka Levitan, Laura Willson, Stefan Benus, Julia Hirschberg, and Ani Nenkova, “Acoustic and prosodic correlates of social behavior,” Interspeech 2011, Florence.]
 +
 
 +
* [http://mlsp.cs.cmu.edu/publications/ Sourish Chaudhuri, Bhiksha Raj. "Unsupervised Structure Discovery for Semantic Analysis of Audio", to appear in Neural Information Processing Systems (NIPS), 2012 ]
 +
 
 +
* [http://mlsp.cs.cmu.edu/publications/pdfs/spm.array.pdf Kenichi Kumatani, John McDonough, Bhiksha Raj. "Microphone Array Processing for Distant Speech Recognition: From Close-Talking Microphones to Far-field Sensors", To appear in IEEE Signal Processing Magazine, 2012 ]
 +
 
 +
* [http://mlsp.cs.cmu.edu/publications/pdfs/isc12.pdf  Manas Pathak, José Portelo, Bhiksha Raj, Isabel Trancoso. "Privacy-preserving speaker authentication", Information Security Conference, Passau, 2012. ]
 +
 
 +
* [http://mlsp.cs.cmu.edu/publications/pdfs/sapa12.2.pdf  Kamal Sahni, Pranay Dighe, Rita Singh, Bhiksha Raj. "Language identification using spectro-temporal patch features". Proc. 5th ISCA workshop on statistical and perceptual audition (SAPA2012). 2012. ]
 +
 
 +
* [http://mlsp.cs.cmu.edu/publications/pdfs/.pdf Gahgene Gweon, Mahaveer Jain, John McDonough, Carolyn Rosé, Bhiksha Raj "Predicting Idea Co-Construction in Speech Data using Insights from Sociolinguistics", International Conference of the Learning Sciences, 2012.]
 +
 
 +
* [http://mlsp.cs.cmu.edu/publications/pdfs/Kumatani_ASRU2011.pdf Kenichi Kumatani, John McDonough, Bhiksha Raj, "Maximum kurtosis beamforming with a subspace filter for distant speech recognition", Automatic Speech Recognition and Understanding (ASRU) 2011]
 +
 
 +
* [http://mlsp.cs.cmu.edu/publications/pdfs/evandro.intersp2011.pdf Evandro Gouvea. "Hybrid speech recognition for voice search; a comparative study", Interspeech 2011. Florence, 2011]
 +
 
 +
* Kshitiz Kumar, Bhiksha Raj, Rita Singh, Richard Stern. "An iterative least-squares technique for dereverberation", IEEE International Conference on Acoustics Speech and Signal Processing. Prague, 2011
 +
 
 +
 
 +
More coming soon, and feel free to add your own.

Latest revision as of 07:52, 13 February 2013

This is the wiki page for topics to be discussed in the Speech Reading Group, starting Winter 2013.

The group is meant to offer an opportunity for people working on or interested in speech technology to interact and share ideas, with the goals of fostering a larger sense of community amongst speech researchers at UW, and allowing participants to become familiar with the breadth of research going on here and in the speech community at large.


We will cover any topics with some relation to speech research, including but not limited to:

  • Speech Recognition
  • Speech Production
  • Feature Extraction
  • Speech Enhancement/Separation
  • Auditory scene analysis
  • Speech Perception
  • Cognitive Psychology of Speech


For questions or other matters, please contact either of the following:

  • Gabe Schubiner - gabeos [at] cs.wash....edu
  • Scott Wisdom - swisdom [at] ee.wash....edu


Announcements

  • The permanent meeting time is Thursdays at 5:00 pm.

Meeting Schedule

Next Meeting

When: Thursday, January 24, 5:00 pm

Where: CSE 503

The paper for each meeting should be sent to the mailing list no later than the Sunday before the meeting.

Email list

You can subscribe here: https://mailman.cs.washington.edu/mailman/listinfo/speech-rg

Prior Meetings

{| class="wikitable"
! Date !! Paper !! Authors !! Venue !! Leader !! Info
|-
| 1/10/13 || Pilot Meeting || N/A || CSE 303 || N/A || Discussed organization of group.
|-
| 1/17/13 || Stability and Accuracy in Incremental Speech Recognition || Ethan O. Selfridge, Iker Arizmendi, Peter A. Heeman, and Jason D. Williams || CSE 303 || Shiri Azenkot ||
|-
| 1/24/13 || Optimising Incremental Dialogue Decisions Using Information Density for Interactive Systems || Nina Dethlefs; Helen Hastie; Verena Rieser; Oliver Lemon || CSE 503 || Ben Hixon ||
|-
| 1/31/13 || Analysing the correspondence between automatic prosodic segmentation and syntactic structure || Szaszák, G., Nagy, K., & Beke, A. || CSE 503 || Gabe Schubiner ||
|-
| 2/7/13 || Monaural Speech Separation and Recognition Challenge || Martin Cooke, John R. Hershey, and Steven J. Rennie || CSE 503 || Scott Wisdom ||
|-
| 2/14/13 || Speech Denoising Using Nonnegative Matrix Factorization with Priors || Kevin W. Wilson, Bhiksha Raj, Paris Smaragdis, and Ajay Divakaran || CSE 503 || Gabe Schubiner ||
|}

Potential Papers

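  • Miranda, J., Neto, J., and Black, A. Parallel combination of speech streams for improved ASR. Interspeech 2012, Portland, OR.
  • Anumanchipalli, G., Oliveira, L., and Black, A., A Statistical Phrase/Accent Model for Intonation Modeling, Interspeech 2011, Florence, Italy
  • Al-Haj, H., Hsiao, R., Lane, I., Black, A., and Waibel, A. "Pronunciation Modeling for Dialectal Arabic Speech Recognition" ASRU 2009, Merano, Italy.
  • Fadi Biadsy, Julia Hirschberg, "Using Prosody and Phonotactics in Arabic Dialect Identification," In Proceedings of Interspeech 2009, Brighton, UK.
  • Andrew Rosenberg, Julia Hirschberg, "Detecting Pitch Accents at the Word, Syllable, and Vowel Level," NAACL/HLT 2009, Boulder, CO.
  • Luciana Lucente, Julia Hirschberg and Plínio Barbosa, “Intonation, Discourse Structure and Information Status in Spontaneous Speech,” ETAP 2, Montreal. 2011
  • Agustín Gravano, Rivka Levitan, Laura Willson, Stefan Benus, Julia Hirschberg, and Ani Nenkova, “Acoustic and prosodic correlates of social behavior,” Interspeech 2011, Florence.
  • Sourish Chaudhuri, Bhiksha Raj. "Unsupervised Structure Discovery for Semantic Analysis of Audio", to appear in Neural Information Processing Systems (NIPS), 2012
  • Kenichi Kumatani, John McDonough, Bhiksha Raj. "Microphone Array Processing for Distant Speech Recognition: From Close-Talking Microphones to Far-field Sensors", To appear in IEEE Signal Processing Magazine, 2012
  • Manas Pathak, José Portelo, Bhiksha Raj, Isabel Trancoso. "Privacy-preserving speaker authentication", Information Security Conference, Passau, 2012.
  • Kamal Sahni, Pranay Dighe, Rita Singh, Bhiksha Raj. "Language identification using spectro-temporal patch features". Proc. 5th ISCA workshop on statistical and perceptual audition (SAPA2012). 2012.
  • Gahgene Gweon, Mahaveer Jain, John McDonough, Carolyn Rosé, Bhiksha Raj "Predicting Idea Co-Construction in Speech Data using Insights from Sociolinguistics", International Conference of the Learning Sciences, 2012.
  • Kenichi Kumatani, John McDonough, Bhiksha Raj, "Maximum kurtosis beamforming with a subspace filter for distant speech recognition", Automatic Speech Recognition and Understanding (ASRU) 2011
  • Evandro Gouvea. "Hybrid speech recognition for voice search; a comparative study", Interspeech 2011. Florence, 2011
  • Kshitiz Kumar, Bhiksha Raj, Rita Singh, Richard Stern. "An iterative least-squares technique for dereverberation", IEEE International Conference on Acoustics Speech and Signal Processing. Prague, 2011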


More coming soon, and feel free to add your own.