@davies
Contributor

@davies davies commented Jan 27, 2016

  1. Enable whole-stage codegen during tests even when only one operator supports it.
  2. Split doProduce() into two APIs: upstream() and doProduce().
  3. Generate a prefix for the fresh names of each operator.
  4. Pass UnsafeRow to the parent directly (avoids calling getters and creating an UnsafeRow again).
  5. Fix bugs and tests.
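
Item 3 may be easier to picture with a small sketch. The class below is hypothetical (FreshNames and freshName are illustrative names, not Spark's API). It assumes one shared counter per generated class and an operator-name prefix, which matches the shape of identifiers such as Range_number8 in generated code; the actual implementation may differ.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper, not Spark's actual API: it only illustrates how an
// operator prefix plus a shared counter yields unique, readable names.
final class FreshNames {
    // One counter shared by all operators, so ids never repeat.
    private final AtomicInteger nextId = new AtomicInteger(0);

    /** Returns operatorPrefix + "_" + name + id, using a new id on every call. */
    String freshName(String operatorPrefix, String name) {
        return operatorPrefix + "_" + name + nextId.getAndIncrement();
    }
}
```

With such a scheme, two operators can both ask for a variable called "value" without clashing, and the prefix shows which operator emitted each variable.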

@SparkQA

SparkQA commented Jan 27, 2016

Test build #50183 has finished for PR 10944 at commit b4db006.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@davies
Contributor Author

davies commented Jan 27, 2016

cc @nongli @rxin

@rxin
Contributor

rxin commented Jan 27, 2016

Can you paste some generated code? (Actually, I think that's useful for most of the codegen PRs.)

@davies
Contributor Author

davies commented Jan 27, 2016

Here is the generated code for sqlContext.range(values).filter("(id & 1) = 1").count():

/* 001 */
/* 002 */ public Object generate(Object[] references) {
/* 003 */   return new GeneratedIterator(references);
/* 004 */ }
/* 005 */
/* 006 */ class GeneratedIterator extends org.apache.spark.sql.execution.BufferedRowIterator {
/* 007 */
/* 008 */   private Object[] references;
/* 009 */   private boolean TungstenAggregate_initAgg0;
/* 010 */   private boolean TungstenAggregate_bufIsNull1;
/* 011 */   private long TungstenAggregate_bufValue2;
/* 012 */   private boolean Range_initRange6;
/* 013 */   private long Range_partitionEnd7;
/* 014 */   private long Range_number8;
/* 015 */   private boolean Range_overflow9;
/* 016 */   private UnsafeRow TungstenAggregate_result29;
/* 017 */   private org.apache.spark.sql.catalyst.expressions.codegen.BufferHolder TungstenAggregate_holder30;
/* 018 */   private org.apache.spark.sql.catalyst.expressions.codegen.UnsafeRowWriter TungstenAggregate_rowWriter31;
/* 019 */
/* 020 */   private void initRange(int idx) {
/* 021 */     java.math.BigInteger index = java.math.BigInteger.valueOf(idx);
/* 022 */     java.math.BigInteger numSlice = java.math.BigInteger.valueOf(1L);
/* 023 */     java.math.BigInteger numElement = java.math.BigInteger.valueOf(209715200L);
/* 024 */     java.math.BigInteger step = java.math.BigInteger.valueOf(1L);
/* 025 */     java.math.BigInteger start = java.math.BigInteger.valueOf(0L);
/* 026 */
/* 027 */     java.math.BigInteger st = index.multiply(numElement).divide(numSlice).multiply(step).add(start);
/* 028 */     if (st.compareTo(java.math.BigInteger.valueOf(Long.MAX_VALUE)) > 0) {
/* 029 */       Range_number8 = Long.MAX_VALUE;
/* 030 */     } else if (st.compareTo(java.math.BigInteger.valueOf(Long.MIN_VALUE)) < 0) {
/* 031 */       Range_number8 = Long.MIN_VALUE;
/* 032 */     } else {
/* 033 */       Range_number8 = st.longValue();
/* 034 */     }
/* 035 */
/* 036 */     java.math.BigInteger end = index.add(java.math.BigInteger.ONE).multiply(numElement).divide(numSlice)
/* 037 */     .multiply(step).add(start);
/* 038 */     if (end.compareTo(java.math.BigInteger.valueOf(Long.MAX_VALUE)) > 0) {
/* 039 */       Range_partitionEnd7 = Long.MAX_VALUE;
/* 040 */     } else if (end.compareTo(java.math.BigInteger.valueOf(Long.MIN_VALUE)) < 0) {
/* 041 */       Range_partitionEnd7 = Long.MIN_VALUE;
/* 042 */     } else {
/* 043 */       Range_partitionEnd7 = end.longValue();
/* 044 */     }
/* 045 */   }
/* 046 */
/* 047 */
/* 048 */   private void TungstenAggregate_doAgg5() {
/* 049 */     // initialize aggregation buffer
/* 050 */     /* 0 */
/* 051 */
/* 052 */     TungstenAggregate_bufIsNull1 = false;
/* 053 */     TungstenAggregate_bufValue2 = 0L;
/* 054 */
/* 055 */
/* 056 */
/* 057 */     // initialize Range
/* 058 */     if (!Range_initRange6) {
/* 059 */       Range_initRange6 = true;
/* 060 */       if (input.hasNext()) {
/* 061 */         initRange(((InternalRow) input.next()).getInt(0));
/* 062 */       } else {
/* 063 */         return;
/* 064 */       }
/* 065 */     }
/* 066 */
/* 067 */     while (!Range_overflow9 && Range_number8 < Range_partitionEnd7) {
/* 068 */       long Range_value10 = Range_number8;
/* 069 */       Range_number8 += 1L;
/* 070 */       if (Range_number8 < Range_value10 ^ 1L < 0) {
/* 071 */         Range_overflow9 = true;
/* 072 */       }
/* 073 */
/* 074 */       /* ((input[0, bigint] & 1) = 1) */
/* 075 */       /* (input[0, bigint] & 1) */
/* 076 */       /* input[0, bigint] */
/* 077 */
/* 078 */       /* 1 */
/* 079 */
/* 080 */       long Filter_value14 = -1L;
/* 081 */       Filter_value14 = Range_value10 & 1L;
/* 082 */       /* 1 */
/* 083 */
/* 084 */       boolean Filter_value12 = false;
/* 085 */       Filter_value12 = Filter_value14 == 1L;
/* 086 */       if (!false && Filter_value12) {
/* 087 */
/* 088 */
/* 089 */
/* 090 */
/* 091 */         // do aggregate and update aggregation buffer
/* 092 */
/* 093 */         /* (input[0, bigint] + 1) */
/* 094 */         /* input[0, bigint] */
/* 095 */
/* 096 */         /* 1 */
/* 097 */
/* 098 */         long TungstenAggregate_value22 = -1L;
/* 099 */         TungstenAggregate_value22 = TungstenAggregate_bufValue2 + 1L;
/* 100 */         TungstenAggregate_bufIsNull1 = false;
/* 101 */         TungstenAggregate_bufValue2 = TungstenAggregate_value22;
/* 102 */
/* 103 */
/* 104 */
/* 105 */       }
/* 106 */
/* 107 */     }
/* 108 */
/* 109 */   }
/* 110 */
/* 111 */
/* 112 */   public GeneratedIterator(Object[] references) {
/* 113 */     this.references = references;
/* 114 */     TungstenAggregate_initAgg0 = false;
/* 115 */
/* 116 */
/* 117 */     Range_initRange6 = false;
/* 118 */     Range_partitionEnd7 = 0L;
/* 119 */     Range_number8 = 0L;
/* 120 */     Range_overflow9 = false;
/* 121 */     TungstenAggregate_result29 = new UnsafeRow(1);
/* 122 */     this.TungstenAggregate_holder30 = new org.apache.spark.sql.catalyst.expressions.codegen.BufferHolder(TungstenAggregate_result29, 0);
/* 123 */     this.TungstenAggregate_rowWriter31 = new org.apache.spark.sql.catalyst.expressions.codegen.UnsafeRowWriter(TungstenAggregate_holder30, 1);
/* 124 */   }
/* 125 */
/* 126 */   protected void processNext() throws java.io.IOException {
/* 127 */
/* 128 */     if (!TungstenAggregate_initAgg0) {
/* 129 */       TungstenAggregate_initAgg0 = true;
/* 130 */       TungstenAggregate_doAgg5();
/* 131 */
/* 132 */       // output the result
/* 133 */
/* 134 */
/* 135 */
/* 136 */       TungstenAggregate_rowWriter31.zeroOutNullBytes();
/* 137 */
/* 138 */       /* input[0, bigint] */
/* 139 */
/* 140 */       if (TungstenAggregate_bufIsNull1) {
/* 141 */         TungstenAggregate_rowWriter31.setNullAt(0);
/* 142 */       } else {
/* 143 */         TungstenAggregate_rowWriter31.write(0, TungstenAggregate_bufValue2);
/* 144 */       }
/* 145 */       currentRow = TungstenAggregate_result29;
/* 146 */       return;
/* 147 */
/* 148 */     }
/* 149 */
/* 150 */   }
/* 151 */ }
/* 152 */
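
For readers skimming the listing above, here is a hand-written sketch of what it appears to compute: the Range, Filter and count aggregate are fused into a single loop with no intermediate rows. The class and method names are hypothetical and not part of Spark. The sketch assumes a positive step and generalizes the hard-coded constants (1 slice, step 1, start 0) into parameters.

```java
import java.math.BigInteger;

// Hand-written approximation of the generated code; all names are hypothetical.
final class FusedRangeFilterCount {

    // Partition boundary: idx * numElement / numSlice * step + start, computed
    // in BigInteger so the intermediate product cannot overflow, then clamped
    // to the long range as initRange() does.
    static long partitionBound(long idx, long numSlice, long numElement, long step, long start) {
        BigInteger bound = BigInteger.valueOf(idx)
            .multiply(BigInteger.valueOf(numElement))
            .divide(BigInteger.valueOf(numSlice))
            .multiply(BigInteger.valueOf(step))
            .add(BigInteger.valueOf(start));
        if (bound.compareTo(BigInteger.valueOf(Long.MAX_VALUE)) > 0) return Long.MAX_VALUE;
        if (bound.compareTo(BigInteger.valueOf(Long.MIN_VALUE)) < 0) return Long.MIN_VALUE;
        return bound.longValue();
    }

    // Range + Filter("(id & 1) = 1") + count() in one loop.
    static long countOdd(long number, long partitionEnd, long step) {
        if (step <= 0) throw new IllegalArgumentException("sketch assumes step > 0");
        long count = 0L;
        boolean overflow = false;
        while (!overflow && number < partitionEnd) {
            long value = number;
            number += step;
            // Same test as the generated code: '<' binds tighter than '^',
            // so this is (number < value) XOR (step < 0).
            if ((number < value) ^ (step < 0)) {
                overflow = true; // number wrapped around; stop after this row
            }
            if ((value & 1L) == 1L) {
                count += 1L; // aggregation buffer update
            }
        }
        return count;
    }
}
```

For example, FusedRangeFilterCount.countOdd(0L, 10L, 1L) counts the ids 1, 3, 5, 7 and 9. The overflow flag matters near Long.MAX_VALUE: without it, a wrapped counter would be negative and the loop would keep running.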

@davies
Contributor Author

davies commented Jan 28, 2016

@nongli Does this one look good to you? It is blocking others.

Contributor

Can you add a comment explaining what references means? references is a very generic name.

@nongli
Contributor

nongli commented Jan 28, 2016

The generated code has a ton of extra blank lines. If they are easy to remove, doing so would make the generated code easier to debug.

LGTM, feel free to address the comments in follow ups.
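
One possible way to address the blank-line comment is sketched below, under assumptions: collapse each run of blank lines to a single one before the /* NNN */ line-number prefixes are added. BlankLineCollapser is a hypothetical helper, not Spark's formatter, and the eventual fix may look different.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical post-processing step; not Spark's actual code formatter.
final class BlankLineCollapser {

    // Replaces every run of blank (empty or whitespace-only) lines with a
    // single empty line and leaves all other lines untouched.
    static String collapse(String code) {
        List<String> kept = new ArrayList<>();
        boolean previousBlank = false;
        for (String line : code.split("\n", -1)) {
            boolean blank = line.trim().isEmpty();
            if (!(blank && previousBlank)) {
                kept.add(blank ? "" : line);
            }
            previousBlank = blank;
        }
        return String.join("\n", kept);
    }
}
```

Keeping one blank line per run preserves the visual separation between operators while avoiding the long empty stretches seen in the pasted output.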

@davies
Contributor Author

davies commented Jan 28, 2016

Thanks, merging this into master to unblock others; the comments will be addressed in a follow-up.

@asfgit asfgit closed this in cc18a71 Jan 28, 2016
asfgit pushed a commit that referenced this pull request Jan 29, 2016
1. Enable whole-stage codegen during tests even when only one operator supports it.
2. Split doProduce() into two APIs: upstream() and doProduce().
3. Generate a prefix for the fresh names of each operator.
4. Pass UnsafeRow to the parent directly (avoids calling getters and creating an UnsafeRow again).
5. Fix bugs and tests.

This PR reopens #10944 and fixes the bug.

Author: Davies Liu

Closes #10977 from davies/gen_refactor.