The agreement-based approach yields the best accuracy. These results are consistent with the findings of Clark et al. [1] on the Penn Treebank with a small seed corpus: agreement-based co-training is superior to naive co-training, since agreement-based co-training dynamically selects the new examples that both taggers label consistently and rejects the ones that cannot fulfill the agreement requirements of retraining, which reduces the noise from the automatically tagged speech transcripts during learning. Naive co-training, in contrast, adds all new examples, typically an order of magnitude larger in amount, together with the noise they introduce during learning. Interestingly, the max-t-min-s approach produces comparable performance to the agreement-based approach, which is encouraging considering that this approach is much simpler to apply in other co-training tasks. Also, the max-t-min-s approach outperforms max-score.

In future work, we will investigate the combination of co-training and active learning. We will also examine the effect of the corpus and the performance of different co-training setups, and apply effective co-training for POS tagging (and parsing) to more difficult genres, like spontaneous speech [Wang, Huang, Harper, 2007].

Table 3. Comparison of the tagging accuracy (%) of the HMM tagger and ME tagger when trained on the entire CTB corpus and the additional Mandarin BN seed corpus and tested on the Mandarin BN POS-eval test set. Known word, unknown word, and overall accuracies are included.

Tagger           Known   Unknown   Overall
HMM  CTB          80.0    69.2      79.0
     CTB+seed     90.5    75.1      89.6
ME   CTB          79.2    66.8      78.5
     CTB+seed     89.2    74.0      88.1

Table 4. Overall POS tagging accuracy (%) on the Mandarin BN POS-eval test set after applying self-training and co-training.

Training Condition                HMM     ME
Initial (i.e., CTB+seed)          89.6    88.1
self-training                     90.8    90.2
co-training  naive                91.9    91.8
             agreement-based      94.1    94.1
             max-score            93.2    93.1
             max-t-min-s          94.1    93.9

5. ACKNOWLEDGEMENTS...
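To make the example-selection strategies compared in Table 4 concrete, the following is a minimal Python sketch of the retraining-pool selection step. It is not the authors' implementation: the Tagger interface, its tag() and score() methods, and the reading of max-t-min-s as "prefer sentences scored high by the teacher tagger and low by the student tagger" are assumptions introduced purely for illustration.

    # Minimal sketch (not the authors' code) of the selection step that
    # distinguishes naive, agreement-based, and score-based co-training.
    # The Tagger interface below is hypothetical.
    from dataclasses import dataclass
    from typing import List, Protocol


    class Tagger(Protocol):
        def tag(self, sentence: List[str]) -> List[str]: ...
        def score(self, sentence: List[str], tags: List[str]) -> float: ...


    @dataclass
    class Example:
        sentence: List[str]
        tags: List[str]


    def select_for_retraining(teacher: Tagger,
                              student: Tagger,
                              unlabeled: List[List[str]],
                              strategy: str = "agreement",
                              top_k: int = 1000) -> List[Example]:
        """Choose newly labeled examples to add to the student's training data.

        "naive":        keep every teacher-labeled sentence.
        "agreement":    keep only sentences where both taggers produce the
                        same tag sequence; reject the rest.
        "max-t-min-s":  rank sentences by teacher score minus student score
                        and keep the top_k (one plausible reading of the
                        abbreviation; an assumption in this sketch).
        """
        labeled = [Example(s, teacher.tag(s)) for s in unlabeled]

        if strategy == "naive":
            return labeled

        if strategy == "agreement":
            return [ex for ex in labeled
                    if student.tag(ex.sentence) == ex.tags]

        if strategy == "max-t-min-s":
            def margin(ex: Example) -> float:
                t = teacher.score(ex.sentence, ex.tags)
                s = student.score(ex.sentence, student.tag(ex.sentence))
                return t - s
            return sorted(labeled, key=margin, reverse=True)[:top_k]

        raise ValueError(f"unknown strategy: {strategy}")

Under this sketch, the agreement criterion discards any sentence the two taggers label differently, so it adds far fewer but cleaner examples per round than the naive variant, which is consistent with the accuracy gap between those two rows of Table 4.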