2012-05-23 11:27:17,923 [main] INFO org.apache.pig.Main - Apache Pig version 0.11.0-SNAPSHOT (rexportiert) compiled Mai 23 2012, 10:32:21
2012-05-23 11:27:17,924 [main] INFO org.apache.pig.Main - Logging error messages to: /home/schwenk/Desktop/pig-debug/pig_1337765237921.log
2012-05-23 11:27:18,104 [main] INFO org.apache.hadoop.security.UserGroupInformation - JAAS Configuration already set up for Hadoop, not re-installing.
2012-05-23 11:27:18,161 [main] INFO org.apache.pig.impl.util.Utils - Default bootup file /home/schwenk/.pigbootup not found
2012-05-23 11:27:18,238 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: file:///
#-----------------------------------------------
# New Logical Plan:
#-----------------------------------------------
b: (Name: LOStore Schema: id#10:int,grp#11:int,additional#12:int,referer#13:chararray)
|
|---b: (Name: LOFilter Schema: id#10:int,grp#11:int,additional#12:int,referer#13:chararray)
    |   |
    |   (Name: Or Type: boolean Uid: 23)
    |   |
    |   |---(Name: UserFunc(com.adition.pig.filtering.string.CONTAINS) Type: boolean Uid: 20)
    |   |   |
    |   |   |---referer:(Name: Project Type: chararray Uid: 13 Input: 0 Column: 3)
    |   |   |
    |   |   |---(Name: Constant Type: chararray Uid: 19)
    |   |
    |   |---(Name: UserFunc(com.adition.pig.filtering.string.CONTAINS) Type: boolean Uid: 22)
    |       |
    |       |---referer:(Name: Project Type: chararray Uid: 13 Input: 0 Column: 3)
    |       |
    |       |---(Name: Constant Type: chararray Uid: 21)
    |
    |---a: (Name: LOForEach Schema: id#10:int,grp#11:int,additional#12:int,referer#13:chararray)
        |   |
        |   (Name: LOGenerate[false,false,false,false] Schema: id#10:int,grp#11:int,additional#12:int,referer#13:chararray)ColumnPrune:InputUids=[10, 11, 12, 13]ColumnPrune:OutputUids=[10, 11, 12, 13]
        |   |   |
        |   |   (Name: Cast Type: int Uid: 10)
        |   |   |
        |   |   |---id:(Name: Project Type: bytearray Uid: 10 Input: 0 Column: (*))
        |   |   |
        |   |   (Name: Cast Type: int Uid: 11)
        |   |   |
        |   |   |---grp:(Name: Project Type: bytearray Uid: 11 Input: 1 Column: (*))
        |   |   |
        |   |   (Name: Cast Type: int Uid: 12)
        |   |   |
        |   |   |---additional:(Name: Project Type: bytearray Uid: 12 Input: 2 Column: (*))
        |   |   |
        |   |   (Name: Cast Type: chararray Uid: 13)
        |   |   |
        |   |   |---referer:(Name: Project Type: bytearray Uid: 13 Input: 3 Column: (*))
        |   |
        |   |---(Name: LOInnerLoad[0] Schema: id#10:bytearray)
        |   |
        |   |---(Name: LOInnerLoad[1] Schema: grp#11:bytearray)
        |   |
        |   |---(Name: LOInnerLoad[2] Schema: additional#12:bytearray)
        |   |
        |   |---(Name: LOInnerLoad[3] Schema: referer#13:bytearray)
        |
        |---a: (Name: LOLoad Schema: id#10:bytearray,grp#11:bytearray,additional#12:bytearray,referer#13:bytearray)RequiredFields:null
#-----------------------------------------------
# Physical Plan:
#-----------------------------------------------
b: Store(fakefile:org.apache.pig.builtin.PigStorage) - scope-22
|
|---b: Filter[bag] - scope-14
    |   |
    |   Or[boolean] - scope-21
    |   |
    |   |---POUserFunc(com.adition.pig.filtering.string.CONTAINS)[boolean] - scope-17
    |   |   |
    |   |   |---Project[chararray][3] - scope-15
    |   |   |
    |   |   |---Constant(obama) - scope-16
    |   |
    |   |---POUserFunc(com.adition.pig.filtering.string.CONTAINS)[boolean] - scope-20
    |       |
    |       |---Project[chararray][3] - scope-18
    |       |
    |       |---Constant(praesident) - scope-19
    |
    |---a: New For Each(false,false,false,false)[bag] - scope-13
        |   |
        |   Cast[int] - scope-2
        |   |
        |   |---Project[bytearray][0] - scope-1
        |   |
        |   Cast[int] - scope-5
        |   |
        |   |---Project[bytearray][1] - scope-4
        |   |
        |   Cast[int] - scope-8
        |   |
        |   |---Project[bytearray][2] - scope-7
        |   |
        |   Cast[chararray] - scope-11
        |   |
        |   |---Project[bytearray][3] - scope-10
        |
        |---a: Load(file:///home/schwenk/Desktop/pig-debug/TestCONTAINS-testFilteringCluster-input.txt:org.apache.pig.builtin.PigStorage) - scope-0
2012-05-23 11:27:18,979 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - File concatenation threshold: 100 optimistic? false
2012-05-23 11:27:19,011 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size before optimization: 1
2012-05-23 11:27:19,011 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size after optimization: 1
#--------------------------------------------------
# Map Reduce Plan
#--------------------------------------------------
MapReduce node scope-23
Map Plan
b: Store(fakefile:org.apache.pig.builtin.PigStorage) - scope-22
|
|---b: Filter[bag] - scope-14
    |   |
    |   Or[boolean] - scope-21
    |   |
    |   |---POUserFunc(com.adition.pig.filtering.string.CONTAINS)[boolean] - scope-17
    |   |   |
    |   |   |---Project[chararray][3] - scope-15
    |   |   |
    |   |   |---Constant(obama) - scope-16
    |   |
    |   |---POUserFunc(com.adition.pig.filtering.string.CONTAINS)[boolean] - scope-20
    |       |
    |       |---Project[chararray][3] - scope-18
    |       |
    |       |---Constant(praesident) - scope-19
    |
    |---a: New For Each(false,false,false,false)[bag] - scope-13
        |   |
        |   Cast[int] - scope-2
        |   |
        |   |---Project[bytearray][0] - scope-1
        |   |
        |   Cast[int] - scope-5
        |   |
        |   |---Project[bytearray][1] - scope-4
        |   |
        |   Cast[int] - scope-8
        |   |
        |   |---Project[bytearray][2] - scope-7
        |   |
        |   Cast[chararray] - scope-11
        |   |
        |   |---Project[bytearray][3] - scope-10
        |
        |---a: Load(file:///home/schwenk/Desktop/pig-debug/TestCONTAINS-testFilteringCluster-input.txt:org.apache.pig.builtin.PigStorage) - scope-0--------
Global sort: false
----------------
2012-05-23 11:27:19,061 [main] INFO org.apache.pig.tools.pigstats.ScriptState - Pig features used in the script: FILTER
2012-05-23 11:27:19,085 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - File concatenation threshold: 100 optimistic? false
2012-05-23 11:27:19,088 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size before optimization: 1
2012-05-23 11:27:19,088 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size after optimization: 1
2012-05-23 11:27:19,099 [main] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Initializing JVM Metrics with processName=JobTracker, sessionId=
2012-05-23 11:27:19,111 [main] INFO org.apache.pig.tools.pigstats.ScriptState - Pig script settings are added to the job
2012-05-23 11:27:19,129 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
2012-05-23 11:27:19,134 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Using reducer estimator: org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.InputSizeReducerEstimator
2012-05-23 11:27:19,136 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.InputSizeReducerEstimator - BytesPerReducer=1000000000 maxReducers=999 totalInputFileSize=505
2012-05-23 11:27:19,136 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting Parallelism to 1
2012-05-23 11:27:19,159 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job
2012-05-23 11:27:19,210 [main] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
2012-05-23 11:27:19,211 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 1 map-reduce job(s) waiting for submission.
2012-05-23 11:27:19,239 [Thread-5] INFO org.apache.hadoop.util.NativeCodeLoader - Loaded the native-hadoop library
2012-05-23 11:27:19,265 [Thread-5] WARN org.apache.hadoop.mapred.JobClient - No job jar file set. User classes may not be found. See JobConf(Class) or JobConf#setJar(String).
2012-05-23 11:27:19,304 [Thread-5] INFO org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
2012-05-23 11:27:19,305 [Thread-5] INFO org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
2012-05-23 11:27:19,315 [Thread-5] WARN org.apache.hadoop.io.compress.snappy.LoadSnappy - Snappy native library is available
2012-05-23 11:27:19,315 [Thread-5] INFO org.apache.hadoop.io.compress.snappy.LoadSnappy - Snappy native library loaded
2012-05-23 11:27:19,318 [Thread-5] INFO org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths (combined) to process : 1
2012-05-23 11:27:19,597 [Thread-6] INFO org.apache.hadoop.util.ProcessTree - setsid exited with exit code 0
2012-05-23 11:27:19,603 [Thread-6] INFO org.apache.hadoop.mapred.Task - Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@c45aa2c
2012-05-23 11:27:19,616 [Thread-6] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader - Current split being processed file:/home/schwenk/Desktop/pig-debug/TestCONTAINS-testFilteringCluster-input.txt:0+505
2012-05-23 11:27:19,641 [Thread-6] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapOnly$Map - Aliases being processed per job phase (AliasName[line,offset]): M: a[3,4],a[-1,-1],b[4,4] C: R:
2012-05-23 11:27:19,646 [Thread-6] INFO org.apache.hadoop.mapred.Task - Task:attempt_local_0001_m_000000_0 is done. And is in the process of commiting
2012-05-23 11:27:19,650 [Thread-6] INFO org.apache.hadoop.mapred.LocalJobRunner -
2012-05-23 11:27:19,650 [Thread-6] INFO org.apache.hadoop.mapred.Task - Task attempt_local_0001_m_000000_0 is allowed to commit now
2012-05-23 11:27:19,653 [Thread-6] INFO org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter - Saved output of task 'attempt_local_0001_m_000000_0' to file:/tmp/temp-390375712/tmp-1337884501
2012-05-23 11:27:19,654 [Thread-6] INFO org.apache.hadoop.mapred.LocalJobRunner -
2012-05-23 11:27:19,654 [Thread-6] INFO org.apache.hadoop.mapred.Task - Task 'attempt_local_0001_m_000000_0' done.
2012-05-23 11:27:19,655 [Thread-6] WARN org.apache.hadoop.mapred.FileOutputCommitter - Output path is null in cleanup
2012-05-23 11:27:19,712 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - HadoopJobId: job_local_0001
2012-05-23 11:27:19,712 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Processing aliases a,b
2012-05-23 11:27:19,712 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - detailed locations: M: a[3,4],a[-1,-1],b[4,4] C: R:
2012-05-23 11:27:19,718 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 0% complete
2012-05-23 11:27:20,226 [main] WARN org.apache.pig.tools.pigstats.PigStatsUtil - Failed to get RunningJob for job job_local_0001
2012-05-23 11:27:20,234 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100% complete
2012-05-23 11:27:20,235 [main] INFO org.apache.pig.tools.pigstats.SimplePigStats - Detected Local mode. Stats reported below may be incomplete
2012-05-23 11:27:20,240 [main] INFO org.apache.pig.tools.pigstats.SimplePigStats - Script Statistics:

HadoopVersion    PigVersion         UserId     StartedAt              FinishedAt             Features
0.20.2-cdh3u3    0.11.0-SNAPSHOT    schwenk    2012-05-23 11:27:19    2012-05-23 11:27:20    FILTER

Success!

Job Stats (time in seconds):
JobId             Alias    Feature     Outputs
job_local_0001    a,b      MAP_ONLY    file:/tmp/temp-390375712/tmp-1337884501,

Input(s):
Successfully read records from: "file:///home/schwenk/Desktop/pig-debug/TestCONTAINS-testFilteringCluster-input.txt"

Output(s):
Successfully stored records in: "file:/tmp/temp-390375712/tmp-1337884501"

Job DAG:
job_local_0001

2012-05-23 11:27:20,243 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Success!
2012-05-23 11:27:20,248 [main] INFO org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
2012-05-23 11:27:20,248 [main] INFO org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
(4,323,242,http://www.google.com/url&url=http%3A%2F%2Fwww.tagesschau.de&q=obama)
(5,423,342,http://www.google.com/url&url=http%3A%2F%2Fwww.bild.de&q=obama)
(6,523,442,http://www.google.com/url&url=http%3A%2F%2Fwww.example.com%2Fmypage.htm&q=praesident)