Rutgers | CS 533
 
 

21 sample documents related to CS 533

  • Rutgers CS 533
    # Refinder3.txt # to change field divider from / to _ in Genia 3.0 POS file # and add _SYM tag to = open(INPUT,"GENIA3_0_pos.txt"); open(OUTPUT,">genia30pos4.txt"); @lines = <INPUT>; $n=0; while ($lines[$n]) { $_ = $lines[$n]; s/(.*)
     
  • Rutgers CS 533
    GENIA 3.0 POS BAD TAGS AND CORRECTIONS (N=193) C:\PERL\MARK>perl refinder5.txt LINE# BAD TAG CORRECTION -1202 ER_NN. ER_NN 25466 A_NN. A_NN 27556 transcription_NN. transcription_NN 42300 B_NN. B_NN 50755 granulocyte-macrophage_JJ! granulocyte-macrop
     
  • Rutgers CS 533
    CS533 TERM PROJECT Mark Sharp & Lu Liu msharp@scils.rutgers.edu luliu@scils.rutgers.edu Spring 2003 GENIA is an information extraction project targeted to the biomedical domain. The project has made available to the BioNLP community a variety of reso
     
  • Rutgers CS 533
    BioNLP Tools from the GENIA Corpora Mark Sharp highly technical terminology; changes rapidly; but simple syntax (~100% declarative) Text on
     
  • Rutgers CS 533
    CS533 TERM PROJECT PROPOSAL Mark Sharp & Lu Liu Spring 2003 We want to use the GENIA bioNLP corpora (http://www-tsujii.is.s.u-tokyo.ac.jp/~genia/) to train a tagger and see how well it tags an arbitrary biomedical text, say, a set of Medline abstracts
     
  • Rutgers CS 533
    7+ 11 18 20-epi 39 9-cis 55 AP-2 61 A 81 B 117 E. 130 ERK1 131 ERK2 132 ER 142 FCS 148 G. 150 GAS-motif 154 GR 174 Hence 192 JAK/STAT 203 LPS 205 LTR 210 M. 212 M. 225 Murine 233 NFAT 239 NXS 243 Northern 244 Northern 251 P. 252 PBL 280 S. 297 Southe
     
  • Rutgers CS 533
    4) 323 ] 7, 8, 38 : 10 79 Both 221 Neither 492 both 787 either 1243 neither 48 AND 1291 or 5 +/6+ 1362 plus 49 AND 1292 or 1851 x 80 Both 493 both 222 Neither 494 both 788 either 50 AND 9 -2 1793 two13 1 19 2 25 3 27 4 35 5 167 I 18 247464 28 4 37 9.
     
  • Rutgers CS 533
    if (!open(INPUT,"GeniaOut1.txt")) {goto end;} @textlines=<INPUT>; close (INPUT); print @textlines[0,1,2,3]; $nx=scalar(@textlines); print "Number of text lines = $nx\n"; # # split input lines into tag and dtag # for ($j=0;$j<$nx;$j++) { $tag[$j
     
  • Rutgers CS 533
    if (!open(INPUT,"GeniaOut1.txt")) {goto end;} @textlines=<INPUT>; close (INPUT); print @textlines[0,1,2,3]; $nx=scalar(@textlines); print "Number of text lines = $nx\n"; # # split input lines into tag and dtag # for ($j=0;$j<$nx;$j++) { $tag[$j
     
  • Rutgers CS 533
    # Refinder2.txt # to change field divider from / to _ in Genia 3.0 POS file # and add _SYM tag to = open(INPUT,"GENIA3_0_pos.txt"); open(OUTPUT,">genia30pos2.txt"); open(OUTPUT2,">genia30pos3.txt"); @lines = <INPUT>; $n=0; while ($lines[$n]) {
     
  • Rutgers CS 533
    # Refinder4.txt # to print records with .*_.*\/.* or .*_.*_.* in genia30pos4.txt open(INPUT,"genia30pos4.txt"); @lines = <INPUT>; $n=0; while ($lines[$n]) { $_ = $lines[$n]; if (/.*_.*\/.*/) { print "$_"; }
     
  • Rutgers CS 533
    # Refinder5.txt # to clean up illegal tags in genia30pos1.txt > genia30pos2.txt open(INPUT,"genia30pos1.txt"); @lines = <INPUT>; close(INPUT); open(OUTPUT,">genia30pos2.txt"); $n=0; while ($lines[$n]) { $_ = $lines[$n]; if (/.*\_\-$/
     
  • Rutgers CS 533
    4) 323 ] 7, 8, 38 : 10 79 Both 221 Neither 492 both 787 either 1243 neither 48 AND 1291 or 5 +/6+ 1362 plus 49 AND 1292 or 1851 x 80 Both 493 both 222 Neither 494 both 788 either 50 AND 9 -2 1793 two13 1 19 2 25 3 27 4 35 5 167 I 18 247464 28 4 37 9.3 127
     
  • Rutgers CS 533
    # Refinder5.txt # to clean up illegal tags in genia30pos1.txt > genia30pos2.txt open(INPUT,"genia30pos1.txt"); @lines = <INPUT>; close(INPUT); open(OUTPUT,">genia30pos2.txt"); $n=0; while ($lines[$n]) { $_ = $lines[$n]; if (/.*\_\-$/) { chop; pr
     
  • Rutgers CS 533
    if (!open(INPUT,"GeniaOut1.txt")) {goto end;} @textlines=<INPUT>; close (INPUT); print @textlines[0,1,2,3]; $nx=scalar(@textlines); print "Number of text lines = $nx\n"; # # split input lines into tag and dtag # for ($j=0;$j<$nx;$j++) { $tag[$j]=(spli
     
  • Rutgers CS 533
    if (!open(INPUT,"GeniaOut1.txt")) {goto end;} @textlines=<INPUT>; close (INPUT); print @textlines[0,1,2,3]; $nx=scalar(@textlines); print "Number of text lines = $nx\n"; # # split input lines into tag and dtag # for ($j=0;$j<$nx;$j++) { $tag[$j]=(spli
     
  • Rutgers CS 533
    BioNLP Tools from the GENIA Corpora Mark Sharp highly technical terminology; changes rapidly; but simple syntax (~100% declarative) Text only, no sp
     
  • Rutgers CS 533
    7+ 11 18 20-epi 39 9-cis 55 AP-2 61 A 81 B 117 E. 130 ERK1 131 ERK2 132 ER 142 FCS 148 G. 150 GAS-motif 154 GR 174 Hence 192 JAK/STAT 203 LPS 205 LTR 210 M. 212 M. 225 Murine 233 NFAT 239 NXS 243 Northern 244 Northern 251 P. 252 PBL 280 S. 297 Southern 34
     
  • Rutgers CS 533
    CS533 TERM PROJECT PROPOSAL Mark Sharp & Lu Liu Spring 2003 We want to use the GENIA bioNLP corpora (http://www-tsujii.is.s.u-tokyo.ac.jp/~genia/) to train a tagger and see how well it tags an arbitrary biomedical text, say, a set of Medline abstracts. The
     
  • Rutgers CS 533
    CS533 TERM PROJECT Mark Sharp & Lu Liu msharp@scils.rutgers.edu luliu@scils.rutgers.edu Spring 2003 GENIA is an information extraction project targeted to the biomedical domain. The project has made available to the BioNLP community a variety of resources
     
  • Rutgers CS 533
    GENIA 3.0 POS BAD TAGS AND CORRECTIONS (N=193) C:\PERL\MARK>perl refinder5.txt LINE# BAD TAG CORRECTION -1202 ER_NN. ER_NN 25466 A_NN. A_NN 27556 transcription_NN. transcription_NN 42300 B_NN. B_NN 50755 granulocyte-macrophage_JJ! granulocyte-macrophage_
     
 
 
 
 