Difference between revisions of "Speech Reading Group"

From PublicWiki

Latest revision as of 07:52, 13 February 2013

This is the wiki page for topics to be discussed in the Speech Reading Group, starting Winter 2013.

The group is meant to offer an opportunity for people working on or interested in speech technology to interact and share ideas, with the goals of fostering a larger sense of community amongst speech researchers at UW, and allowing participants to become familiar with the breadth of research going on here and in the speech community at large.

We will cover any topics with some relation to speech research, including but not limited to:

Speech Recognition
Speech Production
Feature Extraction
Speech Enhancement/Separation
Auditory scene analysis
Speech Perception
Cognitive Psychology of Speech

For questions or other matters, please contact either of the following:

Gabe Schubiner - gabeos [at] cs.wash....edu
Scott Wisdom - swisdom [at] ee.wash....edu

Announcements

The permanent meeting time is Thursdays at 5:00 pm.

Meeting Schedule

Next Meeting

When: Thursday, January 24, 5:00 pm

Where: CSE 503

The paper for each meeting should be sent to the mailing list by Sunday at the latest.

Email list

You can subscribe here: https://mailman.cs.washington.edu/mailman/listinfo/speech-rg

Prior Meetings

Date	Paper	Authors	Venue	Leader	Info
1/10/13	Pilot Meeting	N/A	CSE 303	N/A	Discussed organization of group.
1/17/13	Stability and Accuracy in Incremental Speech Recognition	Ethan O. Selfridge, Iker Arizmendi, Peter A. Heeman, and Jason D. Williams	CSE 303	Shiri Azenkot
1/24/13	Optimising Incremental Dialogue Decisions Using Information Density for Interactive Systems	Nina Dethlefs; Helen Hastie; Verena Rieser; Oliver Lemon	CSE 503	Ben Hixon
1/31/13	Analysing the correspondence between automatic prosodic segmentation and syntactic structure	Szaszák, G., Nagy, K., & Beke, A.	CSE 503	Gabe Schubiner
2/7/13	Monaural Speech Separation and Recognition Challenge	Martin Cooke, John R. Hershey, and Steven J. Rennie	CSE 503	Scott Wisdom
2/14/13	Speech Denoising Using Nonnegative Matrix Factorization with Priors	Kevin W. Wilson, Bhiksha Raj, Paris Smaragdis, and Ajay Divakaran	CSE 503	Gabe Schubiner

Potential Papers

Miranda, J., Neto, J., and Black, A. Parallel combination of speech streams for improved ASR. Interspeech 2012, Portland, OR.

Anumanchipalli, G., Oliveira, L., and Black, A. A Statistical Phrase/Accent Model for Intonation Modeling. Interspeech 2011, Florence, Italy.

Al-Haj, H., Hsiao, R., Lane, I., Black, A., and Waibel, A. "Pronunciation Modeling for Dialectal Arabic Speech Recognition" ASRU 2009, Merano, Italy.

Fadi Biadsy, Julia Hirschberg, "Using Prosody and Phonotactics in Arabic Dialect Identification," In Proceedings of Interspeech 2009, Brighton, UK.

Andrew Rosenberg, Julia Hirschberg, "Detecting Pitch Accents at the Word, Syllable, and Vowel Level," NAACL/HLT 2009, Boulder, CO.

Luciana Lucente, Julia Hirschberg and Plínio Barbosa, “Intonation, Discourse Structure and Information Status in Spontaneous Speech,” ETAP 2, Montreal, 2011.

Agustín Gravano, Rivka Levitan, Laura Willson, Stefan Benus, Julia Hirschberg, and Ani Nenkova, “Acoustic and prosodic correlates of social behavior,” Interspeech 2011, Florence.

Sourish Chaudhuri, Bhiksha Raj. "Unsupervised Structure Discovery for Semantic Analysis of Audio", to appear in Neural Information Processing Systems (NIPS), 2012

Kenichi Kumatani, John McDonough, Bhiksha Raj. "Microphone Array Processing for Distant Speech Recognition: From Close-Talking Microphones to Far-field Sensors", To appear in IEEE Signal Processing Magazine, 2012

Manas Pathak, José Portelo, Bhiksha Raj, Isabel Trancoso. "Privacy-preserving speaker authentication", Information Security Conference, Passau, 2012.

Kamal Sahni, Pranay Dighe, Rita Singh, Bhiksha Raj. "Language identification using spectro-temporal patch features". Proc. 5th ISCA workshop on statistical and perceptual audition (SAPA2012). 2012.

Gahgene Gweon, Mahaveer Jain, John McDonough, Carolyn Rosé, Bhiksha Raj. "Predicting Idea Co-Construction in Speech Data using Insights from Sociolinguistics", International Conference of the Learning Sciences, 2012.

Kenichi Kumatani, John McDonough, Bhiksha Raj, "Maximum kurtosis beamforming with a subspace filter for distant speech recognition", Automatic Speech Recognition and Understanding (ASRU) 2011

Evandro Gouvea. "Hybrid speech recognition for voice search; a comparative study", Interspeech 2011. Florence, 2011

Kshitiz Kumar, Bhiksha Raj, Rita Singh, Richard Stern. "An iterative least-squares technique for dereverberation", IEEE International Conference on Acoustics Speech and Signal Processing. Prague, 2011

More coming soon, and feel free to add your own.

@@ Line 13: / Line 13: @@
 * Cognitive Psychology of Speech
-== Administrative Meeting ==
-There will be an initial administrative meeting as follows:
+For questions, or other, please contact either:
+* Gabe Schubiner - gabeos [at] cs.wash....edu
+* Scott Wisdom   - swisdom [at] ee.wash....edu
-'''When: Thursday, January 10, 4:00 pm'''
-'''Where: CSE 303'''
+== Announcements ==
-This meeting will be to gauge interest and size of the group, as well as to discuss some administrative matters, such as organization of the group, meeting times, location, and paper schedule.
+* Permanent meeting time has been chosen as Thursdays, at 5:00pm
+== Meeting Schedule ==
-For questions, or other, please contact either:
+'''Next Meeting'''
+'''When: Thursday, January 24, 5:00 pm'''
+'''Where: CSE 503'''
+The paper for our meetings should be sent out to the listserve by no later than Sunday.
+== Email list ==
+You can subscribe here: https://mailman.cs.washington.edu/mailman/listinfo/speech-rg
-* Gabe Schubiner - gabeos [at] cs.wash....edu
+== Prior Meetings ==
-* Scott Wisdom   - swisdom [at] uw.edu
+{| class="wikitable"
+! Date
+! Paper
+! Authors
+! Venue
+! Leader
+! Info
+|-
+| 1/10/13
+| Pilot Meeting
+| N/A
+| CSE 303
+| N/A
+| Discussed organization of group.
+|-
+| 1/17/13
+| [https://research.microsoft.com/pubs/160916/selfridge2011sigdial.pdf Stability and Accuracy in Incremental Speech Recognition]
+| Ethan O. Selfridge, Iker Arizmendi, Peter A. Heeman, and Jason D. Williams
+| CSE 303
+| Shiri Azenkot
+|
+|-
+| 1/24/13
+| [http://aclweb.org/anthology-new/D/D12/D12-1008.pdf Optimising Incremental Dialogue Decisions Using Information Density for Interactive Systems]
+| Nina Dethlefs; Helen Hastie; Verena Rieser; Oliver Lemon
+| CSE 503
+| Ben Hixon
+|
+|-
+| 1/31/13
+| Analysing the correspondence between automatic prosodic segmentation and syntactic structure
+| Szaszák, G., Nagy, K., & Beke, A.
+| CSE 503
+| Gabe Schubiner
+|
+|-
+| 2/7/13
+| [http://laslab.org/upload/monaural_speech_separation_and_recognition_challenge.pdf Monaural Speech Separation and Recognition Challenge]
+| Martin Cooke, John R. Hershey, and Steven J. Rennie
+| CSE 503
+| Scott Wisdom
+|
+|-
+| 2/14/13
+| [http://www.cs.illinois.edu/~paris/pubs/wilson-icassp08.pdf Speech Denoising Using Nonnegative Matrix Factorization with Priors]
+| Kevin W. Wilson, Bhiksha Raj, Paris Smaragdis, and Ajay Divakaran
+| CSE 503
+| Gabe Schubiner
+|
+|}
 == Potential Papers ==
-Coming soon...
+* [http://www.cs.cmu.edu/~awb/papers/IS2012/1146_Paper.pdf Miranda, J., Neto, J., and Black, A. Parallel combination of speech streams for improved ASR. Interspeech 2012, Portland, OR.]
+* [http://www.cs.cmu.edu/~awb/papers/is2011_spamf0.pdf Anumanchipalli, G., Oliveira, L., and Black, A. A Statistical Phrase/Accent Model for Intonation Modeling. Interspeech 2011, Florence, Italy.]
+* [http://www.cs.cmu.edu/~awb/papers/asru2009/AS090074.pdf Al-Haj, H., Hsiao, R., Lane, I., Black, A., and Waibel, A. "Pronunciation Modeling for Dialectal Arabic Speech Recognition" ASRU 2009, Merano, Italy.]
+* [http://www1.cs.columbia.edu/~fadi/papers/Interspeech09_biadsy_hirschberg.pdf Fadi Biadsy, Julia Hirschberg, "Using Prosody and Phonotactics in Arabic Dialect Identification," In Proceedings of Interspeech 2009, Brighton, UK.]
+* [http://www.cs.columbia.edu/speech/PaperFiles/2009/RosenbergAndHirschberg09.pdf Andrew Rosenberg, Julia Hirschberg, "Detecting Pitch Accents at the Word, Syllable, and Vowel Level," NAACL/HLT 2009, Boulder, CO.]
+* [http://www.prosodylab.org/~chael/www/etap2/abstracts/lucente_etal.pdf Luciana Lucente, Julia Hirschberg and Plínio Barbosa, “Intonation, Discourse Structure and Information Status in Spontaneous Speech,” ETAP 2, Montreal, 2011.]
+* [http://www1.cs.columbia.edu/~sbenus/Research/Gravano_et_al_Social_behavior_IS11.pdf Agustín Gravano, Rivka Levitan, Laura Willson, Stefan Benus, Julia Hirschberg, and Ani Nenkova, “Acoustic and prosodic correlates of social behavior,” Interspeech 2011, Florence.]
+* [http://mlsp.cs.cmu.edu/publications/ Sourish Chaudhuri, Bhiksha Raj. "Unsupervised Structure Discovery for Semantic Analysis of Audio", to appear in Neural Information Processing Systems (NIPS), 2012 ]
+* [http://mlsp.cs.cmu.edu/publications/pdfs/spm.array.pdf Kenichi Kumatani, John McDonough, Bhiksha Raj. "Microphone Array Processing for Distant Speech Recognition: From Close-Talking Microphones to Far-field Sensors", To appear in IEEE Signal Processing Magazine, 2012 ]
+* [http://mlsp.cs.cmu.edu/publications/pdfs/isc12.pdf  Manas Pathak, José Portelo, Bhiksha Raj, Isabel Trancoso. "Privacy-preserving speaker authentication", Information Security Conference, Passau, 2012. ]
+* [http://mlsp.cs.cmu.edu/publications/pdfs/sapa12.2.pdf  Kamal Sahni, Pranay Dighe, Rita Singh, Bhiksha Raj. "Language identification using spectro-temporal patch features". Proc. 5th ISCA workshop on statistical and perceptual audition (SAPA2012). 2012. ]
+* [http://mlsp.cs.cmu.edu/publications/pdfs/.pdf Gahgene Gweon, Mahaveer Jain, John McDonough, Carolyn Rosé, Bhiksha Raj. "Predicting Idea Co-Construction in Speech Data using Insights from Sociolinguistics", International Conference of the Learning Sciences, 2012.]
+* [http://mlsp.cs.cmu.edu/publications/pdfs/Kumatani_ASRU2011.pdf Kenichi Kumatani, John McDonough, Bhiksha Raj, "Maximum kurtosis beamforming with a subspace filter for distant speech recognition", Automatic Speech Recognition and Understanding (ASRU) 2011]
+* [http://mlsp.cs.cmu.edu/publications/pdfs/evandro.intersp2011.pdf Evandro Gouvea. "Hybrid speech recognition for voice search; a comparative study", Interspeech 2011. Florence, 2011]
+* Kshitiz Kumar, Bhiksha Raj, Rita Singh, Richard Stern. "An iterative least-squares technique for dereverberation", IEEE International Conference on Acoustics Speech and Signal Processing. Prague, 2011
+More coming soon, and feel free to add your own.
