Psych 215L: Language Acquisition
Lecture 13 – Poverty of the Stimulus: Anaphoric One
An induction problem by any other name…

One of the most controversial claims in linguistics is that children face an induction problem:
• “Poverty of the Stimulus” (Chomsky 1980, Crain 1991, Lightfoot 1989, Valian 2009)
• “Logical Problem of Language Acquisition” (Baker 1981, Hornstein & Lightfoot 1981)
• “Plato’s Problem” (Chomsky 1988, Dresher 2003)

Basic claim: The data encountered are compatible with multiple hypotheses.
[Diagram: the data encountered pick out a region of hypothesis space that contains hypothesis 1, hypothesis 2, and the correct hypothesis.]

The induction problem
Extended claim: Given this, the data are insufficient for identifying the correct hypothesis as quickly as children do (Legate & Yang 2002) – or at all.
Big question: How do children do it, then?

One answer: Children come prepared
• Children are not unbiased learners.
• But if children come equipped with helpful learning biases, what is the nature of these necessary biases?
– Are they innate, or derived from the input somehow?
– Are they domain-specific or domain-general?
– Are they about what is being learned, or about how to learn?
• The Universal Grammar (UG) hypothesis (Chomsky 1965, Chomsky 1975): these biases are innate and domain-specific.

Traditional idea: induction problems → Universal Grammar (UG).

The direct evidence assumption
If you want to learn linguistic knowledge L, you learn it by observing examples of L in your input (and possibly also by being sensitive to indirect negative evidence about which examples are missing from the input).
Traditional assumption: Only directly related data are informative data. These data are often rare, and that is why induction problems occur.

Learning complex yes/no questions
• Direct evidence of L: “Is the boy who is in the corner t_is happy?”
• Possible indirect negative evidence: *“Is the boy who t_is in the corner is happy?”

Learning the representation of English anaphoric one
• Direct evidence of L: “I see a red bottle… but there isn’t another one around.”
• Possible indirect negative evidence: *“I like the student of linguistics and he likes the one of computer science.”

Learning syntactic islands
• Direct evidence of L:
“What did the teacher think t_what inspired the students?”
“Who did the teacher think the letter from the soldier inspired t_who?”
“Who t_who thought the letter from the soldier inspired the students?”
• Possible indirect negative evidence (island violation): *“Who did the teacher think [[the letter from t_who] inspired the students]?”

A broader set of informative data
Indirect evidence: other kinds of data that may also be relevant, thereby broadening the set of informative data.

Mapping out UG & the acquisition process
Big questions:
– When induction problems exist, what does it take to solve them?
• What indirect evidence is available?
How might a child leverage this evidence?
• What learning biases can get the job done, given the available data? Are they necessarily innate and domain-specific (UG)?
– How can the necessary learning biases inform us about how the acquisition process works?

Recent computational models have been exploring this:
• Complex yes/no questions (Perfors, Tenenbaum, & Regier 2006, 2011)
• Anaphoric one (Regier & Gahl 2004, Pearl & Lidz 2009, Foraker et al. 2009)

Anaphoric One
“Look – a red bottle! Do you see another one?”
Process: First determine the antecedent of one (what string one refers to): “red bottle”. Because the antecedent “red bottle” includes the modifier “red”, the property RED is important for the referent of one to have: referent of one = RED BOTTLE.
Two steps:
(1) Identify the syntactic antecedent (based on the syntactic category of one).
(2) Identify the semantic referent (based on the syntactic antecedent).

Anaphoric One: Syntactic Category
Standard linguistic theory claims that one in these kinds of utterances is a syntactic category smaller than an entire noun phrase, but larger than just a noun (N0). This category is sometimes called N′, and it includes strings like “bottle” and “red bottle”.
[NP another [N′ [N0 bottle]]]
[NP another [N′ red [N′ [N0 bottle]]]]
Importantly, one is not N0. If it were, it could only have strings like “bottle” as its antecedent, and could never have strings like “red bottle” as its antecedent. Since one’s antecedent can be “red bottle”, and “red bottle” cannot be N0, one must not be N0.

Anaphoric One: Interpretations based on Syntactic Category
If one were N0, we would have a different interpretation of “Look – a red bottle! Do you see another one?” Because one’s antecedent could only be “bottle”, we would have to interpret the second part as “Do you see another bottle?” – and a purple bottle would be a fine referent for one.

Anaphoric One: Children’s Knowledge
Lidz, Waxman, & Freedman (2003) [LWF] found that 18-month-olds have a preference for the red bottle in this situation: “Look – a red bottle! Do you see another one?”
LWF’s interpretation & conclusion: Preference for the RED BOTTLE means the preferred syntactic antecedent is “red bottle”. LWF conclude that 18-month-old knowledge includes: the syntactic category of one is N′; the syntactic antecedent includes the modifier (e.g., red) when a modifier is present; and so the referent has the modifier’s property.
Anaphoric One: The induction problem
Acquisition: Children must learn the right syntactic category for one, and the right interpretation preference for one in situations with more than one option.

Problem: Most data children encounter are ambiguous.
• Syntactically (SYN) ambiguous data: “Look – a bottle! Oh, look – another one.”
one’s referent = BOTTLE
one’s antecedent = [N′ [N0 bottle]] or [N0 bottle]?
• Semantically and syntactically (SEM-SYN) ambiguous data: “Look – a red bottle! Oh, look – another one.”
one’s referent = RED BOTTLE or BOTTLE?
one’s antecedent = [N′ red [N′ [N0 bottle]]] or [N′ [N0 bottle]] or [N0 bottle]?

Problem: Unambiguous data are rare (<0.25%, based on LWF’s analysis).
• Unambiguous (UNAMB) data: “Look – a red bottle! Hmmm – there doesn’t seem to be another one here, though.”
one’s referent = BOTTLE? If so, one’s antecedent = “bottle”. But it’s strange to claim there’s not another bottle here. So one’s referent must be RED BOTTLE, and one’s antecedent = [N′ red [N′ [N0 bottle]]].

Previous proposals for learning about one
Baker (1978) [Baker] (also Hornstein & Lightfoot 1981, Lightfoot 1982, Hamburger & Crain 1984, Crain 1991): Only unambiguous data are informative. Because they’re so rare, they can’t be responsible for the acquisition of one.
How then? Children have innate, domain-specific knowledge restricting the hypotheses about one: one cannot be syntactic category N0.
What about when there are multiple N′ antecedents – [N′ red [N′ [N0 bottle]]] or [N′ [N0 bottle]]? (No specific proposal for this.)

Regier & Gahl 2004 [R&G]: Sem-Syn ambiguous data can be leveraged, in addition to unambiguous data. “Look – a red bottle! Oh, look – another one!”
How? Use innate domain-general statistical learning abilities to track how often one’s referent has the mentioned property (e.g., red). If the referent often has the property (RED BOTTLE), this is a suspicious coincidence unless the antecedent really does include the modifier (“red bottle”) and one’s category is N′.

Pearl & Lidz 2009 [P&L]: Syn ambiguous data must not be leveraged, even if Sem-Syn ambiguous and unambiguous data are used. “Look – a bottle! Oh, look – another one!”
Why? These data cause an “equal-opportunity” (EO) probabilistic learner to think one’s category is N0.
How? P&L propose a domain-specific learning bias to ignore just these ambiguous data, though they speculate that this bias could be derived from an innate domain-general preference for learning only when there is local uncertainty.

Foraker et al. 2009 [F&al]: Leverage the syntactic distribution of one with innate domain-general statistical learning, by using subtle domain-specific semantic distinctions that indicate syntactic category.
“ball with stripes” / “one with dots” [modifiers; head noun = N′]
“side of the road” / *“one of the river” [complements, conceptually evoked by the head noun; head noun = N0]
How? Indirect negative evidence (never seeing one with a complement, even though other nouns take complements) indicates one is not N0.

A new proposal: Broadening the data set
Pearl & Mis, submitted [P&M]: Other pronouns in the language can also be used anaphorically: him, her, it, …
“Look at the cute penguin. I want to hug him/her/it.”
[NP the [N′ cute [N′ [N0 penguin]]]] … [NP him/her/it]
“Look! A cute penguin. I want one.”
[NP a [N′ cute [N′ [N0 penguin]]]] … [NP one]
Note: The issue of one’s category only arises when one is used in a syntactic environment indicating it is smaller than an NP (<NP).

P&M: Track how often the referent of the anaphoric element (one, him, her, it, etc.) has the property mentioned in the potential antecedent, using innate domain-general statistical learning abilities.
Important: This applies even when the syntactic category is known.
“Look at the cute penguin. I want to hug him/her/it.” Is the referent cute? Yes! So it’s important that the antecedent include the modifier “cute”.

Data set comparisons: Learners using syntactic and semantic information
• Unamb <NP: “Look – a red bottle! Hmmm – there doesn’t seem to be another one here, though.”
Learners using these data: Baker, R&G, P&L’s EO, P&M
• Sem-Syn Amb: “Look – a red bottle! Oh, look – another one!”
Learners: R&G, P&L’s EO, P&M
• Syn Amb: “Look – a bottle! Oh, look – another one!”
Learners: P&L’s EO, P&M
• Unamb NP: “Look – a red bottle! I want one/it.”
Learners: P&M
Data points like these will always include the modifier in the antecedent, since the category of the pronoun is NP and so the antecedent is the entire NP. These data are unambiguous: the referent must have the mentioned property.

Information in the data
[Originally a series of graphical-model diagrams relating the previous context (e.g., “…a red bottle…”) to the current pronoun usage (e.g., “…another one…”), with nodes grouped under REFERENTIAL INTENT and SYNTACTIC USAGE.]
• Observed variables: whether a property is mentioned, the pronoun used, the syntactic environment, and the object referred to.
• Latent variables: whether the property is important, whether the antecedent string includes the property, the syntactic category of the pronoun, whether the antecedent string includes the modifier, and the actual antecedent string.
Possible values of the latent variables (for “…a red bottle… another one…”):
• Syntactic category of pronoun: NP, N′, or N0 (a <NP environment rules out NP)
• Antecedent string includes modifier? N′ = yes or no; N0 = no
• Property important? yes or no
• Antecedent string includes property? yes or no
• Actual antecedent string: “red bottle” or “bottle”

Instantiating the model for each data type:
• Unamb <NP (“…a red bottle… another one…”): property mentioned = yes; pronoun = one; environment = <NP; property important = yes; category = N′; antecedent includes modifier = yes; antecedent includes property = yes; antecedent string = “red bottle”
• Sem-Syn Amb (“…a red bottle… another one…”): property mentioned = yes; pronoun = one; environment = <NP; property important = no or yes; category = N′ or N0; antecedent includes modifier = no or yes; antecedent includes property = no or yes; antecedent string = “red bottle” or “bottle”
• Syn Amb (“…a bottle… another one…”): property mentioned = no; pronoun = one; environment = <NP; property important = no; category = N′ or N0; antecedent includes modifier = no; antecedent includes property = no; antecedent string = “bottle”
• Unamb NP (“…a red bottle… want one…”): property mentioned = yes; pronoun = one; environment = NP; property important = yes; category = NP; antecedent includes modifier = yes; antecedent includes property = yes; antecedent string = “a red bottle”

The online probabilistic framework
The learner tracks two probabilities:
• $p_I$ = the probability that a property mentioned in the potential antecedent is important
• $p_{N'}$ = the probability that the pronoun’s syntactic category is N′ when the syntactic environment indicates it is smaller than NP

General form of the update equations for $p_x$ (adapted from Chew 1971):

$p_x = \frac{\alpha + \text{data}_x}{\alpha + \beta + \text{totaldata}_x}$, with $\alpha = \beta = 1$ (a very weak prior)

After every informative data point encountered:
$\text{data}_x = \text{data}_x + \phi_x$ (incremented by the probability that the data point suggests x is true)
$\text{totaldata}_x = \text{totaldata}_x + 1$ (one more informative data point seen with respect to x)

Updating $p_I$: the value of $\phi_I$ per data type
• Unamb <NP: $\phi_I = 1$ (property definitely important)
• Unamb NP: $\phi_I = 1$ (property definitely important)
• Syn Amb: N/A (not informative for $p_I$)
• Sem-Syn Amb: $\phi_I = \frac{\phi_1}{\phi_1 + \phi_2 + \phi_3}$ (probability the property is important), where
$\phi_1 = p_{N'} \cdot \frac{m}{m+n} \cdot p_I$ (category = N′, choose the N′ antecedent with the modifier, property is important)
$\phi_2 = p_{N'} \cdot \frac{n}{m+n} \cdot (1 - p_I) \cdot \frac{1}{t}$ (category = N′, choose the N′ antecedent without the modifier, property is not important, choose an object with the property by chance)
$\phi_3 = (1 - p_{N'}) \cdot (1 - p_I) \cdot \frac{1}{t}$ (category = N0, property is not important, choose an object with the property by chance)
Here $\frac{m}{m+n}$ vs. $\frac{n}{m+n}$ weight the choice between the N′ antecedent with and without the modifier, and $\frac{1}{t}$ is the chance of picking an object with the property at random from t candidates.

Updating $p_{N'}$: the value of $\phi_{N'}$ per data type
• Unamb <NP: $\phi_{N'} = 1$ (category definitely N′)
• Unamb NP: N/A (not informative for $p_{N'}$)
• Sem-Syn Amb: $\phi_{N'} = \frac{\phi_1 + \phi_2}{\phi_1 + \phi_2 + \phi_3}$ (probability the category is N′)
• Syn Amb: $\phi_{N'} = \frac{\phi_4}{\phi_4 + \phi_5}$ (probability the category is N′), where
$\phi_4 = p_{N'} \cdot \frac{n}{m+n}$ (category = N′, choose the N′ antecedent without the modifier)
$\phi_5 = 1 - p_{N'}$ (category = N0)
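Since this framework is just beta-binomial updating with data-type-specific weights, it can be simulated in a few lines. Below is a minimal sketch in Python: the class name OneLearner, the data-type labels, and the defaults m = 1, n = 3, t = 5 (P&L’s example values) are our illustrative choices, not P&M’s actual implementation.

```python
# A minimal sketch of the online probabilistic framework described above
# (beta-binomial updating adapted from Chew 1971). All names are ours.

ALPHA = BETA = 1.0  # a very weak prior

class OneLearner:
    def __init__(self):
        self.data_I = 0.0    # evidence that a mentioned property is important
        self.total_I = 0.0   # informative data points seen w.r.t. p_I
        self.data_N1 = 0.0   # evidence that one's category is N' (when < NP)
        self.total_N1 = 0.0  # informative data points seen w.r.t. p_N'

    @property
    def p_I(self):
        return (ALPHA + self.data_I) / (ALPHA + BETA + self.total_I)

    @property
    def p_N1(self):
        return (ALPHA + self.data_N1) / (ALPHA + BETA + self.total_N1)

    def update(self, data_type, m=1, n=3, t=5):
        pI, pN1 = self.p_I, self.p_N1
        if data_type == "unamb_ltNP":    # property important, category N'
            phi_I, phi_N1 = 1.0, 1.0
        elif data_type == "unamb_NP":    # property important; silent on p_N'
            phi_I, phi_N1 = 1.0, None
        elif data_type == "syn_amb":     # silent on p_I
            phi4 = pN1 * n / (m + n)     # N', antecedent without modifier
            phi5 = 1.0 - pN1             # N0
            phi_I, phi_N1 = None, phi4 / (phi4 + phi5)
        elif data_type == "semsyn_amb":
            phi1 = pN1 * m / (m + n) * pI                  # N' + modifier, important
            phi2 = pN1 * n / (m + n) * (1 - pI) * (1 / t)  # N', chance match
            phi3 = (1 - pN1) * (1 - pI) * (1 / t)          # N0, chance match
            total = phi1 + phi2 + phi3
            phi_I, phi_N1 = phi1 / total, (phi1 + phi2) / total
        else:
            return  # uninformative data point
        if phi_I is not None:
            self.data_I += phi_I
            self.total_I += 1
        if phi_N1 is not None:
            self.data_N1 += phi_N1
            self.total_N1 += 1

# Reproducing the example updates below: one Sem-Syn ambiguous data point
learner = OneLearner()
learner.update("semsyn_amb", m=1, n=3, t=5)
print(round(learner.p_N1, 2), round(learner.p_I, 2))  # 0.56 0.47
```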
Example updates
Start with $p_{N'} = p_I = 0.50$ (ambiguous-data updates computed with m=1, n=3, t=5, from P&L):
• One Unamb <NP data point: $p_{N'} = 0.67$, $p_I = 0.67$
• One Unamb NP data point: $p_{N'} = 0.50$, $p_I = 0.67$
• One Sem-Syn Amb data point: $p_{N'} = 0.56$, $p_I = 0.47$ (if instead m=1, n=3, t=20: $p_{N'} = 0.62$, $p_I = 0.58$)
• One Syn Amb data point: $p_{N'} = 0.48$, $p_I = 0.50$

Corpus Analysis & Learner Input
Brown/Eve corpus (CHILDES: MacWhinney 2000), starting at 18 months: 17,521 utterances of child-directed speech, 2,874 referential pronoun utterances.

Distribution: Unamb <NP 0.00%, Sem-Syn Amb 0.66%, Syn Amb 7.52%, Unamb NP 8.42%, Uninformative 83.4%

Pearl & Lidz (2009): Children learn one’s representation between 14 and 18 months. Based on estimates of the number of utterances children hear from birth until 18 months (Akhtar et al. 2004), we can calculate the data distribution in their input: 36,500 referential pronoun utterances total (e.g., 36,500 × 0.66% ≈ 242 Sem-Syn Amb data points).

Learner input based on Brown/Eve corpus distributions:

Learner     Unamb <NP   Sem-Syn Amb   Syn Amb   Unamb NP   Uninformative
Baker       0           0             0         0          36,500
R&G         0           242           0         0          36,258
P&L’s EO    0           242           2,743     0          33,515
P&M         0           242           2,743     3,073      30,442

Measures of Success: LWF children’s behavior
In addition to directly assessing $p_I$ and $p_{N'}$, we can measure how often a learner would reproduce the behavior in the LWF experiment: “Look – a red bottle! Do you see another one?” (2 choices of bottle, so t = 2)

$\phi_1 = p_{N'} \cdot \frac{m}{m+n} \cdot p_I$ (category = N′, antecedent = “red bottle”)
$\phi_2 = p_{N'} \cdot \frac{n}{m+n} \cdot (1 - p_I) \cdot \frac{1}{t}$ (category = N′, antecedent = “bottle”)
$\phi_3 = (1 - p_{N'}) \cdot (1 - p_I) \cdot \frac{1}{t}$ (category = N0, antecedent = “bottle”)

p(LWF behavior) = any outcome where the learner looks at the red bottle:
$\frac{\phi_1 + \phi_2 + \phi_3}{\phi_1 + 2\phi_2 + 2\phi_3}$
($\phi_2$ and $\phi_3$ each have an additional, equally weighted outcome where the learner looks at the other bottle, hence the doubled terms in the denominator.)

p(correct representation when producing LWF behavior) = the probability that the look to the red bottle is because the learner has the correct representation (N′, “red bottle”), given that the learner looks at the red bottle:
$\frac{\phi_1}{\phi_1 + \phi_2 + \phi_3}$

Testing LWF’s assumption about what behavior means
In addition to directly assessing the learner’s behavior, we can assess LWF’s assumption that correct behavior indicates children have the correct representation.
• Is it possible to get correct behavior in the LWF experiment without having the correct representation for one in general (as measured by $p_I$ and $p_{N'}$)?
• Is it possible to get correct behavior in the LWF experiment without having the correct representation for one at the time the behavior is being produced?
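Both behavioral measures follow directly from a learner’s current $p_{N'}$ and $p_I$. Here is a hedged sketch (the function name and the default m, n are our illustrative assumptions; the published results average 1,000 full simulations over the input corpus, so exact matches are not expected):

```python
def lwf_behavior_probs(p_N1, p_I, m=1, n=3, t=2):
    """Sketch of the two LWF measures above (names are ours).
    Returns p(looks at red bottle) and p(correct representation,
    i.e. N' with antecedent "red bottle", given that look)."""
    phi1 = p_N1 * m / (m + n) * p_I                  # N', antecedent "red bottle"
    phi2 = p_N1 * n / (m + n) * (1 - p_I) * (1 / t)  # N', "bottle", red by chance
    phi3 = (1 - p_N1) * (1 - p_I) * (1 / t)          # N0, "bottle", red by chance
    looks_red = phi1 + phi2 + phi3
    # phi2 and phi3 each have a mirror outcome (a look to the other
    # bottle) with the same weight, hence the doubling below.
    p_behavior = looks_red / (phi1 + 2 * phi2 + 2 * phi3)
    p_correct_given_look = phi1 / looks_red
    return p_behavior, p_correct_given_look

# A learner stuck at the priors (roughly Baker's learner):
print(lwf_behavior_probs(0.5, 0.5))      # ~ (0.56, 0.22); cf. Baker's row below
# A learner with high p_I but low p_N' (roughly P&M's learner):
print(lwf_behavior_probs(0.37, 0.999))   # ~ (0.995, 0.995); cf. P&M's row below
```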
Learner Results
Averages over 1,000 simulations, standard deviations in parentheses.

                                             Baker         R&G           P&L’s EO      P&M
p_I                                          0.50 (<.01)   0.95 (<.01)   0.02 (<.01)   >0.99 (<.01)
p_N′                                         0.50 (<.01)   0.97 (<.01)   0.17 (.02)    0.37 (.04)
p(LWF behavior)                              0.53 (<.01)   0.93 (<.01)   0.50 (<.01)   >0.99 (<.01)
p(correct representation when producing
LWF behavior)                                0.22 (<.01)   0.92 (<.01)   <0.01 (<.01)  >0.99 (<.01)

As previous studies found: Traditional unambiguous data alone fail (Baker). Leveraging Sem-Syn ambiguous data succeeds (R&G, P&L). Leveraging Syn ambiguous data in addition fails (P&L’s EO).

New result: Leveraging Unamb NP data (P&M) does not yield the correct representation in general ($p_{N'}$ is low), but the probability of producing the LWF behavior with this incorrect representation is high.

How does this work? If $p_I$ is high, then when a property is mentioned (like “red”), the learner believes that property is relevant – which means the referent must include that property (RED BOTTLE).

What this means: LWF’s assumption that correct behavior indicates the child has the correct representation does not seem to hold. Or does it? When the child produces the correct behavior in the LWF experiment, the probability that the child has the correct representation when making that interpretation is very high, even if the probability of the correct representation in general (e.g., when there is no modifier present) is very low.

Upshot: LWF were not wrong about children’s representation when they interpret utterances like those in their experiment. Also, the other learners behave as LWF expect: when they show the correct behavior, they have the correct representation; when they show incorrect behavior, they have the incorrect representation.

Recap & Implications
• Children may be able to learn the correct interpretation for one in certain situations (such as the LWF experiment) by broadening the set of data they consider relevant.
• While children must eventually learn the correct representation of one, they do not necessarily need to do so by 18 months. Instead, they may realize that one’s category is N′ (rather than N0) at some later point.
• Just because children demonstrate that they have the correct interpretation some of the time does not mean they have the correct representation all of the time.
The Acquisition Trajectory
“I want it.” / “I want one.” “Another one!” “Do you see him?” / “Do you see one?”

Before 18 months: Need domain-specific knowledge – recognize that one is similar to other anaphoric elements (it, him, etc.).
How to get it? Derive it by using innate domain-general statistical learning abilities to observe the distribution of one compared to these other elements.

Before 18 months: Track how often a mentioned property is important for a referent to have. (“Look – a red bottle! Oh, look – another one!”)
How to get it? Use innate domain-general statistical learning abilities to track this.

18 months: Be able to assign the correct interpretation to utterances like those in the LWF experiment (know that one is N′ in these cases). “Look – a red bottle! Do you see another one?”

After 18 months: Need domain-specific knowledge about subtle semantic distinctions that indicate syntactic category, in order to leverage the syntactic distribution of one with innate domain-general statistical learning. One possibility [F&al]:
“ball with stripes” / “one with dots” [modifiers; head noun = N′]
“side of the road” / *“one of the river” [complements, conceptually evoked by the head noun; head noun = N0]
How? This knowledge may come from innate domain-specific knowledge (UG) about language.

Back to the bigger questions
– When induction problems exist, what does it take to solve them?
• What indirect evidence is available? How might a child leverage this evidence? Broader data sets that are identifiable via innate domain-general learning abilities may be additional sources of useful information.
• What learning biases can get the job done, given the available data? Are they necessarily innate and domain-specific (UG)? In this case study, the first step may not involve this kind of knowledge, although achieving the final adult knowledge state may.
– How can the necessary learning biases inform us about how the acquisition process works? Identify the learning biases needed to achieve 18-month-old behavior, identify the knowledge state those biases suggest, and suggest a two-stage acquisition process for learning anaphoric one:
Stage I (18-month-old behavior): derived domain-specific knowledge + innate domain-general statistical learning
Stage II: innate domain-specific knowledge? + innate domain-general statistical learning
The big picture
• Indirect evidence does not necessarily mean indirect negative evidence – it can come from considering a broader pool of informative data.
• Indirect evidence does not necessarily negate the need for learning biases (of whatever kind).
• Considering indirect evidence and its impact on acquisition can help define concrete proposals about what is necessarily innate and domain-specific, and thus what is in Universal Grammar.
• Knowing the impact of the necessary learning biases on acquisition may also inform us about the acquisition trajectory.