N03-2020

A Robust Retrieval Engine for Proximal and Structural Search

Katsuya Masuda, Takashi Ninomiya, Yusuke Miyao, Tomoko Ohta, Junichi Tsujii

Department of Computer Science, Graduate School of Information Science and Technology, University of Tokyo, Hongo 7-3-1, Bunkyo-ku, Tokyo 113-0033, Japan
CREST, JST (Japan Science and Technology Corporation), Honcho 4-1-8, Kawaguchi-shi, Saitama 332-0012, Japan
{kmasuda,ninomi,yusuke,okap,tsujii}@is.s.u-tokyo.ac.jp

1 Introduction

In the text retrieval area, including XML and Region Algebra, many researchers have pursued models for specifying what kinds of information should appear in specified structural positions and linear positions (Chinenyanga and Kushmerick, 2001; Wolff et al., 1999; Theobald and Weilkum, 2000; Clarke et al., 1995). These models attracted many researchers because they are considered to be basic frameworks for retrieving or extracting complex information such as events. However, unlike keyword-based IR, these models are not robust: they support only exact matching of queries, whereas we would like to know to what degree the contents in specified structural positions are relevant to those in the query even when the structure does not exactly match the query.

This paper describes a new ranked retrieval model that enables proximal and structural search for structured texts. We extend the model proposed in Region Algebra to be robust by i) incorporating the idea of ranking used in keyword-based search, and ii) expanding queries. While in ordinary ranked retrieval models relevance measures are computed in terms of words, our model assumes that they are defined over more general structural fragments, i.e., extents (continuous fragments in a text) as proposed in Region Algebra. We decompose queries into subqueries to allow the system to retrieve not only exactly matched extents but also partially matched ones.
Our model is robust like keyword-based search, and it also enables us to specify structural and linear positions in texts, as Region Algebra does.

The significance of this work lies not in the development of a new relevance measure, nor in showing the superiority of structure-based search over keyword-based search, but in the proposal of a framework for integrating proximal and structural ranking models. Since the model treats all types of structures in texts, not only ordinary text structures such as title, abstract, and authors, but also semantic tags corresponding to recognized named entities or events, can be used for indexing text fragments and can contribute to the relevance measure. Since extents are treated similarly to keywords in traditional models, our model can be integrated with any ranking and scalability techniques used by keyword-based models.

We have implemented the ranking model in our retrieval engine and conducted preliminary experiments to evaluate it. Unfortunately, we used a rather small corpus for the experiments, mainly because there is no test collection of structured queries and tag-annotated text. Instead, we used the GENIA corpus (Ohta et al., 2002) as the structured text; it is an XML document annotated with semantic tags in the field of biomedical science. The experiments show that our model succeeded in retrieving relevant answers that an exact-matching model fails to retrieve because of its lack of robustness, and relevant answers that a non-structured model fails to retrieve because of its lack of structural specification.

2 A Ranking Model for Structured Queries and Texts

This section describes the definition of the relevance between a document and a structured query represented in the region algebra. The key idea is that a structured query is decomposed into subqueries, and the relevance of the whole query is represented as a vector of the relevance measures of the subqueries.
The region algebra (Clarke et al., 1995) is a set of operators that represent relations between extents (i.e., regions in texts). In this paper, we suppose the region algebra has seven operators: four containment operators ($\triangleright$, $\ntriangleright$, $\triangleleft$, $\ntriangleleft$) representing the containment relation between extents, two combination operators ($\bigtriangleup$, $\bigtriangledown$) corresponding to the "and" and "or" operators of the boolean model, and an ordering operator ($\Diamond$) representing the order of words or structures in the text. For convenience of explanation, we represent a query as a tree structure, as shown in Figure 1.¹ This query means "Retrieve the books whose title has the word retrieval."

Figure 1: Subqueries of the query [book] $\triangleright$ ([title] $\triangleright$ retrieval)

Our model assigns the relevance measure of a structured query as a vector of the relevance measures of its subqueries. In other words, the relevance is defined by the number of portions of a document that match the subqueries. If an extent matches a subquery of query $q$, the extent is somewhat relevant to $q$ even when it does not exactly match $q$. Figure 1 shows an example of a query and its subqueries. In this example, even when an extent does not match the whole query exactly, if it matches "retrieval" or [title] $\triangleright$ retrieval, the extent is considered relevant to the query. Subqueries are formally defined as follows.

Definition 1 (Subquery) Let $q$ be a given query and $n_1, \ldots, n_m$ be the nodes of $q$. Subqueries $q_1, \ldots, q_m$ of $q$ are the subtrees of $q$. Each $q_i$ has node $n_i$ as its root node.

When a relevance $\sigma(q_i, d)$ between a subquery $q_i$ and a document $d$ is given, the relevance of the whole query is defined as follows.

Definition 2 (Relevance of the whole query) Let $q$ be a given query, $d$ be a document, and $q_1, \ldots, q_m$ be the subqueries of $q$. The relevance vector $\Sigma(q, d)$ of $d$ is defined as follows:

$$\Sigma(q, d) = \langle \sigma(q_1, d), \sigma(q_2, d), \ldots, \sigma(q_m, d) \rangle$$

The relevance of a subquery should be defined similarly to that of keyword-based queries in traditional ranked retrieval.
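
The decomposition in Definitions 1 and 2 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Node` type, the function names, the pre-order enumeration, and the use of ">" for the containment operator are all hypothetical choices made here, and the per-subquery relevance function is passed in by the caller rather than defined.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Node:
    """One node of a query tree: an operator or a leaf term (hypothetical type)."""
    label: str
    children: List["Node"] = field(default_factory=list)


def subqueries(q: Node) -> List[Node]:
    """Return the subtree rooted at every node of q, in pre-order (Definition 1)."""
    result = [q]
    for child in q.children:
        result.extend(subqueries(child))
    return result


def relevance_vector(q: Node, doc: str,
                     relevance: Callable[[Node, str], float]) -> List[float]:
    """Apply a caller-supplied relevance function to each subquery (Definition 2)."""
    return [relevance(qi, doc) for qi in subqueries(q)]


def render(q: Node) -> str:
    """Write a subtree in infix form; this sketch assumes binary operators."""
    if not q.children:
        return q.label
    left, right = q.children
    return f"({render(left)} {q.label} {render(right)})"


# The example query [book] > ([title] > retrieval); ">" stands in for containment.
query = Node(">", [Node("[book]"),
                   Node(">", [Node("[title]"), Node("retrieval")])])

for sub in subqueries(query):
    print(render(sub))
```

The tree has five nodes, so the sketch enumerates five subqueries: the whole query, `[book]`, `([title] > retrieval)`, `[title]`, and `retrieval`. An extent that matches only one of the smaller subtrees still contributes a nonzero entry to the vector, which is what makes the ranking robust to partial matches.
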
For example, TFIDF, which is used in our experiments in Section 3, is the simplest and most straightforward one, while other recently proposed relevance measures (Robertson and Walker, 2000) can also be applied. The TF value is calculated using the number of extents matching the subquery, and the IDF value is calculated using the number of documents that include extents matching the subquery.

¹ In this query, [x] is syntactic sugar for <x> $\Diamond$ </x>.

Having defined the relevance of a structured query as a vector, we need to sort the documents according to their relevance vectors. In this paper, we first map a vector into a scalar value, and then sort the documents according to this scalar measure. Three methods are introduced for the mapping from the relevance vector to the scalar measure. The first simply sums the elements of the relevance vector.

Definition 3 (Simple Sum)

$$\rho_{sum}(q, d) = \sum_{i=1}^{m} \sigma(q_i, d)$$

The second represents the rareness of the structures. When the query is $A \triangleright B$ or $A \triangleleft B$, if the number of extents matching the query is close to the number of extents matching $A$, matching the query does not seem very important, because it means that the extents that match $A$ mostly match $A \triangleright B$ or $A \triangleleft B$ as well. The case of the other operators is the same as with $\triangleright$ and $\triangleleft$.

Definition 4 (Structure Coefficient) When the operator $op$ is $\bigtriangleup$, $\bigtriangledown$ or $\Diamond$, the structure coefficient of the query $A\ op\ B$ is:

$$sc_{A\,op\,B} = \frac{C(A) + C(B) - C(A\ op\ B)}{C(A) + C(B)}$$

and when the operator $op$ is $\triangleright$ or $\triangleleft$, the structure coefficient of the query $A\ op\ B$ is:

$$sc_{A\,op\,B} = \frac{C(A) - C(A\ op\ B)}{C(A)}$$

where $A$ and $B$ are queries and $C(A)$ is the number of extents that match $A$ in the document collection.
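
To make Definitions 3 and 4 concrete, the following Python sketch computes the simple sum and the structure coefficient from extent counts. The operator names (`"both_of"`, `"containing"`, and so on) and the function names are hypothetical labels chosen for this illustration, and the counts are assumed to have been obtained elsewhere, for example from an index.

```python
from typing import Sequence

# Hypothetical string names for the operators covered by Definition 4.
SYMMETRIC_OPS = {"both_of", "one_of", "followed_by"}   # combination and ordering
CONTAINMENT_OPS = {"containing", "contained_in"}


def simple_sum(relevances: Sequence[float]) -> float:
    """Definition 3: add up the elements of the relevance vector."""
    return sum(relevances)


def structure_coefficient(op: str, c_a: int, c_b: int, c_a_op_b: int) -> float:
    """Definition 4: the coefficient is small when most extents matching A
    (or A and B) already match the combined query A op B.

    c_a, c_b and c_a_op_b are extent counts over the document collection.
    The denominators are assumed to be nonzero in this sketch.
    """
    if op in SYMMETRIC_OPS:
        return (c_a + c_b - c_a_op_b) / (c_a + c_b)
    if op in CONTAINMENT_OPS:
        return (c_a - c_a_op_b) / c_a
    raise ValueError(f"operator not covered by this sketch: {op}")


# 90 of the 100 extents matching A also match "A containing B":
# the structure is common, so it receives a low coefficient.
common = structure_coefficient("containing", c_a=100, c_b=400, c_a_op_b=90)

# Only 5 of them match: the structure is rare and receives a high coefficient.
rare = structure_coefficient("containing", c_a=100, c_b=400, c_a_op_b=5)
```

With these made-up counts the common structure scores 0.1 and the rare one 0.95, which mirrors the intuition in the text that a structural match is informative only when it is not already implied by matching $A$.
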
The scalar measure $\rho_{sc}(q, d)$ is then defined as

$$\rho_{sc}(q, d) = \sum_{i=1}^{m} sc_{q_i} \cdot \sigma(q_i, d)$$

The third is a combination of the measure of the query itself and the measure of its subqueries. Although we calculate the score of extents by subqueries instead of using only the whole query, the score of one subquery cannot be compared directly with the score of another. We assume a normalized weight for each subquery and interpolate between the weight of a parent node and the weights of its child nodes.

Definition 5 (Interpolated Coefficient) The interpolated coefficient of the query $q_i$ is recursively defined as follows:

$$\rho_{ic}(q_i, d) = \lambda \cdot \sigma(q_i, d) + (1 - \lambda) \frac{\sum_{c_i} \rho_{ic}(q_{c_i}, d)}{l}$$

where $c_i$ is a child of node $n_i$, $l$ is the number of children of node $n_i$, and $0 \le \lambda \le 1$. This formula means that the weight of each node is defined by a weighted average of the weight of the query and the weights of its subqueries. When $\lambda = 1$, the weight of each query is the normalized weight of the query itself. When $\lambda = 0$, the weight of each query is calculated from the weights of its subqueries, i.e., only from the weights of the words used in the query.

Table 1: Queries submitted in the experiments

1  ([cons]([sem]G#DNA_domain_or_region)) (in $\Diamond$ ([cons]([sem](G#tissue G#body_part))))
2  ([event]([obj]gene)) (in $\Diamond$ ([cons]([sem](G#tissue G#body_part))))
3  ([event]([obj] $\Diamond$ ([sem]G#DNA_domain_or_region))) (in $\Diamond$ ([cons]([sem](G#tissue G#body_part))))
4  ([event]([dummy]G#DNA_domain_or_region)) (in $\Diamond$ ([cons]([sem](G#tissue G#body_part))))

3 Experiments

In this section, we show the results of our preliminary text retrieval experiments using our model. Because there is no test collection of structured queries and tag-annotated text, we used the GENIA corpus (Ohta et al., 2002) as the structured text; it is an XML document composed of paper abstracts in the field of biomedical science. The corpus consisted of 1,990 articles, 873,087 words (including tags), and 16,391 sentences.
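
As a rough illustration of the second and third scalar mappings defined in Section 2 (the coefficient-weighted sum and the interpolated coefficient of Definition 5), the sketch below implements both in Python. The `QNode` type, the function names, and the dictionary of per-subquery relevances are hypothetical. The leaf case is also an assumption: Definition 5 divides by the number of children, which is zero for a leaf, so this sketch simply returns the leaf's own relevance there.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Sequence


@dataclass
class QNode:
    """A query-tree node identified by a unique name (hypothetical type)."""
    name: str
    children: List["QNode"] = field(default_factory=list)


def weighted_sum(coefficients: Sequence[float], relevances: Sequence[float]) -> float:
    """Sum of per-subquery relevances, each scaled by its structure coefficient.
    How leaf subqueries get a coefficient is left to the caller."""
    if len(coefficients) != len(relevances):
        raise ValueError("one coefficient is needed per subquery")
    return sum(c * r for c, r in zip(coefficients, relevances))


def interpolated(node: QNode, relevance: Dict[str, float], lam: float) -> float:
    """Definition 5: blend a node's own relevance with the mean of its children."""
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    own = relevance[node.name]
    if not node.children:
        # Assumption of this sketch: a leaf has no children to average over.
        return own
    child_mean = sum(interpolated(c, relevance, lam)
                     for c in node.children) / len(node.children)
    return lam * own + (1.0 - lam) * child_mean


# A root operator with two leaf children and made-up relevance values.
tree = QNode("root", [QNode("a"), QNode("b")])
scores = {"root": 1.0, "a": 0.5, "b": 0.0}

only_root = interpolated(tree, scores, lam=1.0)     # uses the root's relevance only
only_leaves = interpolated(tree, scores, lam=0.0)   # uses the leaves only
blended = interpolated(tree, scores, lam=0.5)
```

With these values, `lam=1.0` gives the root's own relevance (1.0), `lam=0.0` gives the mean of the two leaves (0.25), and `lam=0.5` gives 0.625, matching the reading of the two extreme cases given after Definition 5.
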
We compared three retrieval models: i) our model, ii) exact matching of the region algebra (exact), and iii) a non-structured flat model. In the flat model, the query was submitted as a query composed of the words in the queries in Table 1 connected by the "and" operator ($\bigtriangleup$). The queries submitted to our system are shown in Table 1, and each document was a sentence, delimited by sentence tags. Queries 1, 2, and 3 are real queries made by an expert in the field of biomedicine. Query 4 is a toy query that we made so that the robustness of our model relative to the exact model is easy to see. The system output the ten results that had the highest relevance for each model.² Table 2 shows the number of results that were judged relevant in the top ten when the ranking was done using $\rho_{sum}$. The results show that our model w...
