Ideas and Voices from MIT This Month: Web Class of 2003
July/August 2003
 

In This Edition

People, Information, and Mediating Technologies

Part 1: Representing Information

Part 2: Moving Information

Part 3: Interpreting Information

Interviews

Cameron Marlow SM '01
Working on new communication technologies

Professor Joseph A. Paradiso PhD '81
Director of the Responsive Environments Group and co-director of the Media Lab's Things That Think Consortium

Andrew Pollack SM '77
Technology and biotechnology reporter for the New York Times

Han Shu '96, MEng '97
Contributed to the development of the technology of handwriting recognition, fully automated telephone number retrieval, face recognition, and speech recognition

Professor Sherry Turkle
Founder and current director of the MIT Initiative on Technology and Self


Interview with:

Han Shu '96, MEng '97


Han Shu '96, who moved from mainland China to the US when he was 15, is pursuing a PhD with the Spoken Language Systems group at MIT's Laboratory for Computer Science. As an MIT undergrad, he was a 6A co-op student with BBN's Speech and Language Processing Department and contributed to the development of technology for handwriting recognition, fully automated telephone number retrieval, face recognition, and speech recognition in both English and Mandarin Chinese.

What problems in speech and language processing are you working on in your PhD research?

My PhD research focuses on modeling the dynamics of speech sounds for speech recognition. The currently dominant approach extracts features from the speech signal at a constant rate. However, the acoustic cues important for phonetic classification typically are not uniformly present in the speech signal. Motivated by this observation, another approach extracts features from possible phonetic segments. I am attempting to put the two approaches in a common framework so that they can be combined.
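As a rough illustration of the two styles of feature extraction described above, here is a minimal Python sketch. It is not Han Shu's method or any system's actual front end. The log-energy feature stands in for the richer spectral features a real recognizer would use, and the synthetic signal, the function names, and the hand-picked segment boundary are all assumptions made for this example.

```python
import math


def frame_energies(signal, frame_len, hop):
    """Constant-rate features: one log-energy value per fixed-size frame."""
    feats = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]
        energy = sum(x * x for x in frame) / frame_len
        # Small constant avoids log(0) on silent frames.
        feats.append(math.log(energy + 1e-10))
    return feats


def segment_features(frame_feats, boundaries):
    """Segment-based features: one averaged value per hypothesized segment.

    `boundaries` lists frame indices in increasing order; [0, 7, 15]
    defines the segments [0, 7) and [7, 15).
    """
    feats = []
    for lo, hi in zip(boundaries, boundaries[1:]):
        seg = frame_feats[lo:hi]
        feats.append(sum(seg) / len(seg))
    return feats


def toy_signal():
    """Hypothetical signal: a quiet stretch followed by a louder one."""
    quiet = [0.01 * math.sin(0.3 * n) for n in range(400)]
    loud = [0.5 * math.sin(0.3 * n) for n in range(400)]
    return quiet + loud


signal = toy_signal()
# Fixed-rate analysis: a frame every 50 samples, wherever the cues fall.
frames = frame_energies(signal, frame_len=100, hop=50)
# Segment analysis: one feature per hypothesized segment (boundary is hand-picked).
segments = segment_features(frames, boundaries=[0, 7, len(frames)])
print(len(frames), "frame features;", len(segments), "segment features")
```

The frame-based view produces many evenly spaced measurements regardless of where the signal changes, while the segment-based view produces one measurement per hypothesized unit. A common framework would need to score both kinds of evidence together.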

What are some differences between representing speech in English and in Mandarin Chinese?

People in the speech community have found that techniques used for recognizing and synthesizing speech in one language generally carry over to another. However, there are still some differences due to the different attributes of various languages. For example, Mandarin is tonal while English is not, so modeling pitch information improves the discriminability between Chinese characters, but it is not as helpful for English.

What are some potential societal benefits from learning to synthesize and recognize speech?

Air travel has brought people with different cultural backgrounds together with lightning speed, but in many cases language barriers still prevent people from communicating with one another. Speech recognition and synthesis, language translation, and computer-assisted language learning technologies will enable people without a common language to communicate with greater ease. Then we will truly live in a small global village.

According to a recent New York Times article, 6 million people, or 5% of the US labor force, are currently in the telesales and service industry. Advances in speech technology and in the understanding of human dialog have started to change this, but a more pervasive shift to human-like automated information agents is still to come. As we understand more about speech user interfaces, interacting with computers, cell phones, and other devices using speech will become commonplace. These new technologies will only be possible with more fundamental research. Hopefully government-funded research by DARPA and NSF at universities and corporate research laboratories will continue to play a pivotal role.

Cameron Marlow SM '01
"Webloggers are a great leading indicator of trends in the news simply by being part of a group that intends to keep each other informed."

Professor Joseph A. Paradiso PhD '81
"Graduate students come to my group to work at the frontier where sensor technologies meet the human-computer interface."

Andrew Pollack SM '77
"[The Internet] hasn't changed the basic concept of reporting. But it has certainly made it easier to track down information..."

Han Shu '96
"As we understand more about speech user interfaces, interacting with computers, cell phones, and other devices using speech will become commonplace."

Professor Sherry Turkle
"...recently the pace and depth of technology's effects on identity have increased. The Internet has become a space for new forms of self-exploration and social encounter."


Copyright ©2003 Massachusetts Institute of Technology
Comments and questions to opendoor@mit.edu