Lecture11-PoS2


Psych 215L: Language Acquisition
Lecture 11: Poverty of the Stimulus

Reminder: Poverty of the Stimulus

Language can be thought of as the set of legal items in the language (sentences, strings, etc.). The child’s job: figure out the rules that generate that legal set and don’t generate illegal items.

Legal items:
- Hoggle is an ornery dwarf
- Fairies bite adventurers
- Can the girl who can summon the Goblin King solve the Labyrinth?

Illegal items:
- * Hoggle a dwarf ornery is
- * Bite adventurers fairies
- * Can the girl who summon the Goblin King can solve the Labyrinth?

The Logic of Poverty of the Stimulus (The Logical Problem of Language Acquisition)

The argument for having innate biases to guide language acquisition:
1) Suppose there are some data.
2) Suppose there is an incorrect hypothesis compatible with the data.
3) Suppose children behave as if they never entertain the incorrect hypothesis.
   Addendum (interpretation): Or children converge on the correct hypothesis much earlier than expected (Legate & Yang 2002).
Conclusion: Children possess prior knowledge ruling out the incorrect hypothesis from the hypothesis space considered.
   Addendum (interpretation): The initial hypothesis space does not include all hypotheses. Specifically, the incorrect ones of a particular kind are not in the child’s hypothesis space.

[Diagram: the Items Encountered are a subset of the Legal Items. Example items shown: “A fairy who flies around the Labyrinth walls bites anyone who passes by.”, “Fairies bite”, “Hoggle is an ornery dwarf”, “Can the girl solve the Labyrinth?”, “Can the girl who can summon the Goblin King solve the Labyrinth?”]

Idea: The data available to the child are compatible with a number of generalizations. However, children only seem to pick the right ones. Therefore, they must have some other constraints guiding their language learning.

The innate part: The guiding information must be available prior to learning.
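The idea that the encountered data are compatible with more than one generalization can be made concrete with the auxiliary-fronting sentences used in this lecture. What follows is only a minimal sketch in Python, not a model from the lecture: the `Token` type, the hand-annotated `is_aux` and `in_relative` flags, and all function names are assumptions introduced for illustration. It compares two candidate rules that the lecture discusses later under Case 4 ("move first auxiliary" and "move main-clause auxiliary").

```python
from typing import List, NamedTuple


class Token(NamedTuple):
    word: str
    is_aux: bool = False       # hand annotation: is this word an auxiliary?
    in_relative: bool = False  # hand annotation: is it inside the relative clause?


def annotate(words: str, aux_at, relative_at=()) -> List[Token]:
    """Attach the hand annotations to a whitespace-separated sentence."""
    return [Token(w, i in aux_at, i in relative_at)
            for i, w in enumerate(words.split())]


def front(tokens: List[Token], index: int) -> str:
    """Form a yes/no question by moving the token at `index` to the front."""
    words = [tokens[index].word]
    words += [t.word for i, t in enumerate(tokens) if i != index]
    sentence = " ".join(words)
    return sentence[0].upper() + sentence[1:] + "?"


def move_first_aux(tokens: List[Token]) -> str:
    """Hypothesis A (linear): front the first auxiliary in the string."""
    index = next(i for i, t in enumerate(tokens) if t.is_aux)
    return front(tokens, index)


def move_main_clause_aux(tokens: List[Token]) -> str:
    """Hypothesis B (structural): front the auxiliary of the main clause."""
    index = next(i for i, t in enumerate(tokens)
                 if t.is_aux and not t.in_relative)
    return front(tokens, index)


simple_decl = annotate("the girl can solve the Labyrinth", aux_at={2})
complex_decl = annotate(
    "the girl who can summon the Goblin King can solve the Labyrinth",
    aux_at={3, 8},
    relative_at=range(2, 8),  # "who can summon the Goblin King"
)

for name, decl in [("simple", simple_decl), ("complex", complex_decl)]:
    print(name)
    print("  first aux:      ", move_first_aux(decl))
    print("  main-clause aux:", move_main_clause_aux(decl))
```

On the simple sentence both rules produce the same question, so that sentence cannot distinguish them. Only the sentence with a relative clause separates the two rules, and there the "first auxiliary" rule yields the illegal item listed above.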
Reminder: LPLA

Induction Problem: Logical Problem of Language Acquisition (Standard Theory)

Children don’t get access to all the data in the language by the time they have the correct generalization. They learn from a subset of the legal items in the language. And still they seem to converge on the right generalizations…without trying out many (or all) of the wrong ones.

[Diagram: the Items Encountered are a subset of the Legal Items. Example items shown: “Can the girl solve the Labyrinth?”, “Can the girl who can summon the Goblin King solve the Labyrinth?”. The data are compatible with: move first, move second, move last, move odd one counting from the beginning, …]

Pullum & Scholz (2002): Frustration with PoS Proponents

“…linguistic nativism is the view…that human infants have at least some linguistically specific innate knowledge”

“…issue is whether a full description of that predisposition incorporates anything that entails specific contingent facts about natural languages”

[poverty of the stimulus] “…argument…turns on the claim that during the language acquisition process, children often come to know things about the language they are acquiring despite not having access to the crucial evidence that shows these things to be true of the language.”

“Instead of clarifying the reasoning, each successive writer on this topic shakes together an idiosyncratic cocktail of claims about children’s learning of language, and concludes nativism is thereby supported. Most of the frequently encountered claims are about children’s observable accomplishments or aspects of the child’s environment.”

Children’s observable accomplishments (with the interpretation relevant to PoS, where one is given):
- Speed: Children learn so fast. Relevant interpretation: Faster than expected, given available data.
- Reliability: Children always succeed.
- Productivity: Children learn a system.
- Selectivity: Children pick the correct option from a bunch of incorrect (and “seductive”) alternatives. Relevant interpretation: Seductive because also compatible with data.
- Underdetermination: Children arrive at systems of knowledge underdetermined by the data. Relevant interpretation: Alternative hypotheses also compatible with data.
- Convergence: Children end up with the right system.
- Universality: The system acquired has a lot of properties in common with other language systems of the world.

Aspects of the child’s environment (with the interpretation relevant to PoS, where one is given):
- Ingratitude: No explicit payoff for correct language usage.
- Finiteness: Children don’t get infinite data to learn from. Relevant interpretation: Make generalization from incomplete data set.
- Idiosyncrasy: The subset of data children encounter varies from child to child. Relevant interpretation: Make generalization from incomplete data set.
- Incompleteness: Children don’t hear everything in the language. Relevant interpretation: Make generalization from incomplete data set.
- Positivity: No explicit instruction of what isn’t in the language. Relevant interpretation: Make generalization from incomplete data set.
- Degeneracy: Input to children has noise.

Pullum & Scholz (2002): The Version Chosen To Attack

“People attain knowledge of the structure of their language for which no evidence is available in the data to which they were exposed as children.” - Hornstein & Lightfoot (1981)

“We replace total lack of evidence by lack of evidence that is adequate to the task…would not emerge in conversational data near often enough to guarantee that any particular child would ever encounter it.” - Pullum & Scholz

“…‘the APS’ to stand for ‘the Argument Selected by Pullum & Scholz’ ”

Pullum & Scholz (2002): How to Support APS

Step 1: Describe in detail what is known.
Step 2a: Identify the crucial data that would lead a data-driven learner to that knowledge.
Step 2b: Give reason to believe that’s the crucial data.
Step 3: Show learners don’t have access to that crucial data.
   One way: Look for really rare data types. These are likely to be close enough to absent.
Step 4: Show that learners nonetheless acquire the right knowledge.

Pullum & Scholz (2002): Case Studies

Case 1: Plurals in noun-noun compounds

3-6 yr olds’ behavior:
- Irregular plural pattern (plural marker on first noun okay): 1 tooth-eater or 1 teeth-eater; 1 mouse-eater or 1 mice-eater
- Regular plural pattern (plural marker on first noun not okay): 1 toy-eater (but not 1 toys-eater); 1 rat-eater (but not 1 rats-eater)

Knowledge of incomplete paradigm:
   tooth-eater   teeth-eater
   toy-eater     *toys-eater
Important point: No generalization to regular plural nouns.

Gordon (1986): Brown corpus (1,000,000 words):
- irregular sg compounds [tooth-eater] (153 tokens)
- irregular pl compounds [teeth-eater] (3 tokens)
- regular sg compounds [toy-eater] (…more…?)
- regular pl compounds [*toys-eater] (0 tokens)

Argument: Irregular pl compounds appear so rarely that they are similar in frequency to regular pl compounds (which never appear because they’re ungrammatical). But children still produce the irregular pl compounds and do not produce the regular pl compounds. This is hard to explain if they’re data-driven. (Though see Foraker et al. (2009) for an example where an ideal learner can make use of even slight differences in data distribution, and then Lidz, Waxman, & Freedman (2003) and Pearl & Lidz (2009) for discussion about how “informative” data that have very low frequency may not be helpful to real learners…)

P&S rebuttals:
- Not clear 3-6 yr old behavior was really true outside the experimental setup (method flaws). Point: Not children’s behavior.
- Not clear that regular pl compounds are ungrammatical.
  Examples: “rules committee”, “chemicals-maker”, “citizens-sponsored” appear in the Wall Street Journal corpus. Point: Not adults’ behavior either.

Case 2: Auxiliary sequences

Kimball 1973: It rains, It may rain, It may have rained, It may be raining, It has rained, It has been raining, It is raining, It may have been raining,…

Rule: … Aux Verb: {rains, may rain, may have rained, …}
   Aux --> Tense (Modal) (have +en) (be +ing)
   Tense = {present}; Modal = {may, might}; have +en = {have VERBen}; be +ing = {be VERBing}

Worked examples:
- {present} + rain = rains
- {present} + {Modal} = may rain
- {present} + {Modal} + {have +en} = may have rained
- {past} + {have +en} = had rained
- {present} + {Modal} + {have +en} + {be +ing} = may have been raining

Crucial data to get the proper rule sequence have all three optional components: “It may have been raining”. No examples in a 1,000,000 word corpus, vanishingly rare in conversation…

P&S rebuttal: Is that rule really what children are acquiring? Instead, children may be able to abstract the necessary sequence from other sequences of not exactly that type. Also, the data are not so vanishingly rare: hundreds of examples in adult literature (Moby Dick, Wuthering Heights) and many in children’s literature (Peter Pan, Alice in Wonderland, The Wizard of Oz). Estimate: approximately 1 every 3000-4000 sentences.

The real question: How many is enough? Need a quantitative claim from the linguists (see Legate & Yang 2002, Hsu & Chater 2010 for some attempts at this).

Case 3: Anaphoric One

Originally, Baker 1978. Recently described accessibly in Foraker et al. (2007, 2009) and Pearl & Mis (2011).

“I liked the debate about acquisition. You liked the one about modeling.”
* “I’ll walk by the side of the road and you can walk by the one of the river.”

Syntactic distribution distinction: Difference between complement-taking nouns (side) and modifier-taking nouns (debate).
Syntactic distinction: complement-taking nouns = Category N0, modifier-taking nouns = Category N’ (larger than N0).
Semantic distinction: Complement-taking nouns are conceptually different from modifier-taking nouns. (side = side of what?, debate can stand by itself)

Anaphoric One structure
[Tree diagrams: NP structures built from det, adj, N’, N0, and PP nodes for “the (good) debate”, “the (good) side”, and anaphoric one. In “debate about acquisition” the PP is a modifier; in “side of the road” the PP is a complement.]

Necessary Data: Baker 1978: rule out one = N0
Need specific utterance & world situation:
   Utterance: “Look – a red bottle! We want another one and there doesn’t seem to be one here.”
   Situation: [picture not preserved in this text]
   Reasoning: one ≠ “bottle”, since another bottle is present. Therefore, one = “red bottle”, which can only be N’ (not N0).

18-month olds behave as if they have the right interpretations (Lidz, Waxman, & Freedman 2003).

Unambiguous data are pretty rare in child-directed speech (Lidz, Waxman, & Freedman 2003; Pearl & Lidz 2009): ~0.25% of anaphoric one utterances are unambiguous for one != category N0, but instead one = something larger like N’.

Similar P&S rebuttal as before: How rare is too rare? (see Legate & Yang 2002, Hsu & Chater 2010 for some suggestions for how to quantify “too rare”)

Rebuttal of another kind: this is not the crucial evidence.
- “Can be learned from other available data” (Regier & Gahl 2004, Foraker et al. 2009)
- “…but not without some other learning constraints/knowledge, too” (Foraker et al. 2009, Pearl & Lidz 2009)
- “…unless you broaden your idea of what counts as informative data…and even then, you may still need some additional knowledge” (Pearl & Mis 2011)

Case 4: Auxiliary Fronting

Chomsky 1971: Adult Knowledge
   The girl is easily fooled.
   Is the girl easily fooled?
Candidate rules compatible with this pair:
- Rule: Move first auxiliary?
- Rule: Move main-clause auxiliary?
- Rule: Move odd-numbered auxiliary?
- Rule: Move auxiliary next to female noun?
- …
More complex pairs:
   The girl who can solve the labyrinth is easily fooled.
   Is the girl who can solve the labyrinth easily fooled?
   Someone who is not easily fooled could trick someone who is.
   Could someone who is not easily fooled trick someone who is?

Chomsky 1971: Child Behavior (Crain & Nakayama 1987)
   The girl who can solve the labyrinth is easily fooled.
   Is the girl who can solve the labyrinth easily fooled?
   * Can the girl who solve the labyrinth is easily fooled?
Rule: Move main-clause auxiliary

Chomsky 1971: Child Data
- Very frequent: The girl is easily fooled. Is the girl easily fooled?
- Very infrequent: The girl who can solve the labyrinth is easily fooled. Is the girl who can solve the labyrinth easily fooled? (* Can the girl who solve the labyrinth is easily fooled?)

P&S rebuttal: data that rule out the “front first aux” hypothesis are not limited to the very infrequent type above.
- I could borrow your pencil when you’re done. When you’re done, could I borrow your pencil?
  Rules out “front first aux” hypothesis: Should be very frequent.
- The changes these events portend are how fundamental. How fundamental are the changes these events portend?
  Rules out “front first aux” hypothesis, though not in yes/no questions: 15th sentence in WSJ corpus.
- What I’m doing is in the shareholders’ best interest. Is what I’m doing in the shareholders’ best interest?
  Rules out “front first aux” hypothesis: 180th sentence in WSJ corpus.
- The other dolly that was in here is where. Where’s the other dolly that was in here?
  Rules out “front first aux” hypothesis, though not a yes/no question: Child-directed speech.

Legate & Yang (2002): Rebuttal

Data set = 500 sentences of the Wall Street Journal (“How fundamental are the changes these events portend?”, “Is what I’m doing in the shareholders’ best interest?”). Not really a good sample of child-directed speech, unless you have a very unusual child. Also, Legate & Yang (2002) found that 1% are of this data type (5 of the 500 sentences).

Looking at the Nina corpus: 46,499 sentences; 20,651 questions; 14 unambiguous data examples (all of wh-question type). Frequency of unambiguous data: 0.068%
Looking at the Brown-Adam corpus: 20,372 sentences; 8,889 questions; 4 unambiguous data examples (all of wh-question type). Frequency of unambiguous data: 0.045%

P&S rebuttal, main point (similar to previous ones): How much data is enough? Legate & Yang (2002) and Hsu & Chater (2010) offer some ideas for how to assess this.

Pullum & Scholz (2002): Summary

Linguists should be careful about what knowledge they think children are acquiring.

It’s not that there is no evidence for the child to learn from in most cases. It’s just that it’s rare. It’s an open question how rare is too rare.

Additional Note: Larger point about PoS (from Crain & Pietroski (2002))

“…it’s not enough to mention ways in which children could learn some things without Universal Grammar. To rebut poverty-of-the-stimulus arguments, one has to show how children could learn [everything] adults actually know; and as close investigation reveals, adults know a lot more than casual inspection suggests. That is the nativist’s main point.”

Examples of linguistic knowledge that some researchers believe are hard to learn:
- restrictions on meaning interpretation (Crain & Pietroski 2002)
- restrictions on syntactic case (Valian 2009, ch2, section 2.4)
- restrictions on syntactic islands (Pearl & Sprouse 2011)

Exploring the Nature of the Necessary Bias(es): Computational Modeling Work

Domain-general biases explored:
- prefer subset hypothesis: Regier & Gahl 2004
- prefer simplicity: Perfors, Tenenbaum, & Regier 2006, 2011
- use only maximally informative data: Pearl & Weinberg 2007, Pearl 2008, Pearl & Lidz 2009, Pearl & Mis 2011
- prefer highly probable sequences: Reali & Christiansen 2005, Kam et al. 2008, Pearl & Sprouse 2011

Domain-specific biases explored:
- ignore certain kinds of ambiguous data that are identified using domain-specific (linguistic) knowledge: Regier & Gahl 2004, Pearl & Lidz 2009, Pearl & Mis 2011
- ignore embedded clause data: Pearl & Weinberg 2007
- prefer syntactic information over semantic information: Foraker et al. 2009

Innate Bias = Domain-Specific?

Poverty of the Stimulus (the existence of an induction problem) is usually used as the motivation for Universal Grammar. But just because an induction problem exists doesn’t mean innate domain-specific knowledge like UG is required to solve it. The knowledge required could be derived from prior knowledge (domain-specific or domain-general) or simply be domain-general to begin with. ...
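Since the recurring question in the case studies is how rare is too rare, it can help to recompute the frequencies reported for Case 4 from the raw counts. The short Python sketch below is only an arithmetic check; the `percent` helper and the dictionary layout are assumptions introduced here. It suggests that the reported 0.068% and 0.045% correspond to unambiguous examples divided by the number of questions, not by the total number of sentences.

```python
def percent(count: int, total: int) -> float:
    """Return count / total expressed as a percentage."""
    return 100.0 * count / total


# Counts as reported in the lecture for Legate & Yang (2002).
corpora = {
    "Nina": {"sentences": 46_499, "questions": 20_651, "unambiguous": 14},
    "Brown-Adam": {"sentences": 20_372, "questions": 8_889, "unambiguous": 4},
}

for name, counts in corpora.items():
    per_question = percent(counts["unambiguous"], counts["questions"])
    per_sentence = percent(counts["unambiguous"], counts["sentences"])
    print(f"{name}: {per_question:.3f}% of questions, "
          f"{per_sentence:.3f}% of all sentences")

# The Wall Street Journal sample: 5 of 500 sentences.
wsj_rate = percent(5, 500)
print(f"WSJ sample: {wsj_rate:.1f}% of sentences")
```

Measured against all sentences in each corpus, the rates would be lower still, which is worth keeping in mind when comparing them with the 1% figure from the 500-sentence newspaper sample.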

This note was uploaded on 12/12/2011 for the course PSYCH 215l taught by Professor Pearl during the Fall '11 term at UC Irvine.
