
AS (loc, wban, date:chararray, temp, count);
grunt> data_flt = FILTER data_raw BY date != 'YEARMODA';
grunt> data = FOREACH data_flt GENERATE (long)loc, (int)wban,
           date, (float)temp, (int)count;
grunt> temps = FOREACH data GENERATE ((temp-32.0)*5.0/9.0);
grunt> temps_group = GROUP temps ALL;
grunt> max_temp = FOREACH temps_group GENERATE MAX(temps);
grunt> DUMP max_temp;
(43.27777862548828)
Note also that the mean daily temperatures were obtained by averaging a variable number of measurements; that number is given in the 5th column, the variable count. You might want to filter out all mean values obtained from fewer than, say, 5 measurements. This is left as an exercise to the reader.

5.4 Some extra Pig commands

Some relational operators

FILTER     Use it to work with tuples or rows of data
FOREACH    Use it to work with columns of data
GROUP      Use it to group data in a single relation
ORDER      Sort a relation based on one or more fields
...

Some built-in functions

AVG        Calculate the average of numeric values in a single-column bag
COUNT      Calculate the number of tuples in a bag
MAX/MIN    Calculate the maximum/minimum value in a single-column bag
SUM        Calculate the sum of values in a single-column bag
...
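The operators and functions listed above can be combined in one short script. What follows is only a minimal sketch, not the handout's own solution to the exercise. It assumes the relation data from the session above, with its count field cast to int. The aliases data_ok, temps_c, all_temps, stats, temps_sorted and top5 are made up for illustration, and the threshold of 5 is the example value suggested in the text. LIMIT is a standard Pig operator that is not listed in the table above.

```pig
-- Assumes `data` from the earlier session, with fields
-- (loc:long, wban:int, date:chararray, temp:float, count:int).

-- FILTER: keep only daily means built from at least 5 measurements.
data_ok = FILTER data BY count >= 5;

-- FOREACH: keep the wban field and the temperature converted to Celsius.
temps_c = FOREACH data_ok GENERATE wban, (temp - 32.0) * 5.0 / 9.0 AS celsius;

-- GROUP ... ALL: put every tuple into a single group for global aggregates.
all_temps = GROUP temps_c ALL;

-- Built-in functions applied to the grouped bag.
stats = FOREACH all_temps GENERATE
    COUNT(temps_c)       AS n,
    MIN(temps_c.celsius) AS min_c,
    AVG(temps_c.celsius) AS avg_c,
    MAX(temps_c.celsius) AS max_c;
DUMP stats;

-- ORDER: sort by temperature, hottest first, then show the first 5 tuples.
temps_sorted = ORDER temps_c BY celsius DESC;
top5 = LIMIT temps_sorted 5;
DUMP top5;
```

Filtering before the GROUP keeps the low-count days out of every aggregate at once, so the maximum reported here may differ from the unfiltered value obtained earlier.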
Hands-on block 6
Extras

6.1 Installing your own Hadoop

The Hadoop community has its main online presence in:

Although you can download the latest source code and release tarballs from that location, we strongly suggest that you use the more production-ready Cloudera distribution:

Cloudera provides ready-to-use Hadoop Linux packages for several distributions, as well as a Hadoop Installer for configuring your own Hadoop cluster, and also a VMware appliance preconfigured with Hadoop, Hue, HBase and more.
Appendix A
Additional Information

Hadoop Homepage
    Internet:

Cloudera Hadoop Distribution
    Internet:

Documentation
    Tutorial: mapred_tutorial.html
    Hadoop API:
    Pig:

Recommended books
    Hadoop: The Definitive Guide
        Tom White, O’Reilly Media, 2010 (2nd Ed.)
    Hadoop in Action
        Chuck Lam, Manning, 2011