21 sample documents related to CS 533
-
# Refinder3.txt
# to change field divider from / to _ in Genia 3.0 POS file
# and add _SYM tag to =
open(INPUT,"GENIA3_0_pos.txt");
open(OUTPUT,">genia30pos4.txt");
@lines = <INPUT>;
$n=0;
while ($lines[$n]) {
  $_ = $lines[$n];
  s/(.*)
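The header comments above describe two transformations: replacing the word/tag divider `/` with `_`, and giving a bare `=` the tag `SYM`. The preview is cut off before the substitution itself, so the following is only a minimal Python sketch of that idea, not the authors' Perl. It assumes each token has the form `word/TAG` and that the divider is the last slash in the token; the function name `retag_token` is hypothetical.

```python
def retag_token(token: str) -> str:
    """Rewrite one 'word/TAG' token as 'word_TAG' (assumed token format)."""
    if token == "=":
        # Assumption: a bare '=' carries no tag and should become '=_SYM'.
        return "=_SYM"
    # Split on the LAST slash so words that contain '/' (e.g. JAK/STAT) keep it.
    word, sep, tag = token.rpartition("/")
    if not sep or not word:
        return token  # no usable divider; leave the token unchanged
    return f"{word}_{tag}"


if __name__ == "__main__":
    for tok in ["ER/NN", "JAK/STAT/NN", "="]:
        print(tok, "->", retag_token(tok))
```

Splitting on the last slash is a design choice made here so that terms such as JAK/STAT, which appear in the corpus word lists, are not broken apart.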
-
# Refinder2.txt
# to change field divider from / to _ in Genia 3.0 POS file
# and add _SYM tag to =
open(INPUT,"GENIA3_0_pos.txt");
open(OUTPUT,">genia30pos2.txt");
open(OUTPUT2,">genia30pos3.txt");
@lines = <INPUT>;
$n=0;
while ($lines[$n]) {
-
# Refinder4.txt
# to print records with .*_.*\/.* or .*_.*_.* in genia30pos4.txt
open(INPUT,"genia30pos4.txt");
@lines = <INPUT>;
$n=0;
while ($lines[$n]) {
  $_ = $lines[$n];
  if (/.*_.*\/.*/) { print "$_"; }
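The two patterns named in the Refinder4 header flag records that still contain a slash after the underscore divider, or that contain more than one underscore. A minimal Python sketch of the same check follows. The regular expressions are taken from the header comment; the function name `suspicious_records` and the sample records are assumptions for illustration.

```python
import re

# Patterns copied from the Refinder4 header comment.
SLASH_AFTER_DIVIDER = re.compile(r".*_.*/.*")
DOUBLE_UNDERSCORE = re.compile(r".*_.*_.*")


def suspicious_records(records):
    """Return the records that match either pattern (hypothetical helper)."""
    return [
        rec for rec in records
        if SLASH_AFTER_DIVIDER.search(rec) or DOUBLE_UNDERSCORE.search(rec)
    ]


if __name__ == "__main__":
    sample = ["ER_NN", "a_b_NN", "x_NN/JJ", "JAK/STAT_NN"]
    for rec in suspicious_records(sample):
        print(rec)
```

A record such as `JAK/STAT_NN` is not flagged, because its slash comes before the divider rather than after it.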
-
4) 323 ] 7, 8, 38 : 10 79 Both 221 Neither 492 both 787 either 1243 neither 48 AND 1291 or 5 +/6+ 1362 plus 49 AND 1292 or 1851 x 80 Both 493 both 222 Neither 494 both 788 either 50 AND 9 -2 1793 two13 1 19 2 25 3 27 4 35 5 167 I 18 247464 28 4 37 9.3 127
-
# Refinder5.txt
# to clean up illegal tags in genia30pos1.txt > genia30pos2.txt
open(INPUT,"genia30pos1.txt");
@lines = <INPUT>;
close(INPUT);
open(OUTPUT,">genia30pos2.txt");
$n=0;
while ($lines[$n]) {
  $_ = $lines[$n];
  if (/.*\_\-$/) {
    chop;
    pr
-
if (!open(INPUT,"GeniaOut1.txt")) {goto end;}
@textlines=<INPUT>;
close (INPUT);
print @textlines[0,1,2,3];
$nx=scalar(@textlines);
print "Number of text lines = $nx\n";
#
# split input lines into tag and dtag
#
for ($j=0;$j<$nx;$j++) {
  $tag[$j]=(spli
-
BioNLP Tools from the GENIA Corpora Mark Sharp highly technical terminology; changes rapidly; but simple syntax (~100% declarative) Text only, no sp
-
7+ 11 18 20-epi 39 9-cis 55 AP-2 61 A 81 B 117 E. 130 ERK1 131 ERK2 132 ER 142 FCS 148 G. 150 GAS-motif 154 GR 174 Hence 192 JAK/STAT 203 LPS 205 LTR 210 M. 212 M. 225 Murine 233 NFAT 239 NXS 243 Northern 244 Northern 251 P. 252 PBL 280 S. 297 Southern 34
-
CS533 TERM PROJECT PROPOSAL Mark Sharp & Lu Liu Spring 2003 We want to use the GENIA bioNLP corpora (http:/www-tsujii.is.s.u-tokyo.ac.jp/~genia/) to train a tagger and see how well it tags an arbitrary biomedical text, say, a set of Medline abstracts. The
-
CS533 TERM PROJECT Mark Sharp & Lu Liu msharp@scils.rutgers.edu luliu@scils.rutgers.edu Spring 2003 GENIA is an information extraction project targeted to the biomedical domain. The project has made available to the BioNLP community a variety of resources
-
GENIA 3.0 POS BAD TAGS AND CORRECTIONS (N=193)
C:\PERL\MARK>perl refinder5.txt
LINE#   BAD TAG                      CORRECTION
-1202   ER_NN.                       ER_NN
25466   A_NN.                        A_NN
27556   transcription_NN.            transcription_NN
42300   B_NN.                        B_NN
50755   granulocyte-macrophage_JJ!   granulocyte-macrophage_
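Most of the visible rows show the same kind of repair: a stray period attached to the tag is dropped (`ER_NN.` becomes `ER_NN`). The Python sketch below illustrates only that one rule. It is not the logic of refinder5.txt, whose preview is truncated, and it deliberately leaves other bad tags (such as the `JJ!` case, whose correction is cut off) untouched. The name `strip_trailing_period` and the tag character set are assumptions.

```python
import re

# Assumption: a tag is a run of uppercase letters (or '$') after the last
# underscore, and the only defect handled here is one trailing period.
TRAILING_PERIOD = re.compile(r"^(.+_[A-Z$]+)\.$")


def strip_trailing_period(token: str) -> str:
    """Drop a period stuck to the end of a word_TAG token (hypothetical helper)."""
    match = TRAILING_PERIOD.match(token)
    return match.group(1) if match else token


if __name__ == "__main__":
    for tok in ["ER_NN.", "transcription_NN.", "ER_NN", "._."]:
        print(tok, "->", strip_trailing_period(tok))
```

A sentence-final period tagged as `._.` is left alone, since the pattern requires at least one letter in the tag before the trailing period.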