assignment7.html
+141-31
@@ -80,30 +80,29 @@ <h1>Assignments <small>CS 489/698 Big Data Infrastructure (Winter 2017)</small><
80
80
<div>
81
81
<h3>Assignment 7: Inverted Indexing (Redux) <small>due 1:00pm April 3</small></h3>
82
82
83
-
<p><b>Note:</b> This assignment is a draft and incomplete. If you
84
-
begin working on this assignment, be aware that parts can still
85
-
change. This message will be removed once the assignment is
86
-
complete.</p>
87
-
88
83
<p>In this assignment you'll revisit the inverted indexing and boolean
89
84
retrieval program in <a href="assignment3.html">assignment 3</a>. In
90
85
assignment 3, your indexer program wrote postings to HDFS
91
86
in <code>MapFile</code>s and your boolean retrieval program read
92
87
postings from those <code>MapFile</code>s. In this assignment, you'll
93
88
write postings to and read postings from HBase instead. In other
94
-
words, the program logic should not change, except for the backend
89
+
words, the core program logic should not change, except for the backend
95
90
storage that you are using. This assignment is to be completed using
96
91
MapReduce in Java.</p>
97
92
98
-
<p>Due to the complexities of setting up HBase in Altiscale, this
99
-
assignment will be completed entirely in the Linux Student CS
100
-
Environment, where we have stood up a single-node HBase cluster. Check
101
-
out <a href="http://ubuntu1404-010.student.cs.uwaterloo.ca:16010/master-status"><code>http://ubuntu1404-010.student.cs.uwaterloo.ca:16010/master-status</code></a>. As
102
-
a result, you won't be able to play with HBase on sizeable
103
-
collections, although the assignment will still give you some
104
-
experience of what developing against HBase "feels like".</p>
105
-
106
-
<p>For this assignment ssh
93
+
<p><b>Note</b>: Due to the complexities of setting up HBase in
94
+
Altiscale, we have not been able to stand up an HBase cluster yet. For
95
+
now, work in the Linux Student CS environment (more details below). If
96
+
we manage to get a stable HBase cluster running on Altiscale, you will
97
+
complete the Altiscale portion of the assignment; otherwise, you will
98
+
not be responsible for getting code to run on Altiscale.</p>
99
+
100
+
<p>We have stood up a single-node HBase cluster in the Linux Student
101
+
CS environment. Check
102
+
out <a href="http://ubuntu1404-010.student.cs.uwaterloo.ca:16010/master-status"><code>http://ubuntu1404-010.student.cs.uwaterloo.ca:16010/master-status</code></a>. You
103
+
won't be able to play with HBase on sizeable collections, although the
104
+
assignment will still give you some experience of what developing
105
+
against HBase "feels like". For this assignment ssh
107
106
into <code>ubuntu1404-010.student.cs.uwaterloo.ca</code> and work
108
107
specifically on that host.</p>
109
108
@@ -138,15 +137,15 @@ <h4 style="padding-top: 10px">HBase Word Count</h4>
138
137
</pre>
139
138
140
139
<p>Use the <code>-config</code> option to specify the HBase config
141
-
file: point to a version on the Altiscale workspace that we've
142
-
prepared for you. This config file tells the program how to connect to
143
-
the HBase cluster. Use the <code>-table</code> option to name the
144
-
table you're inserting the word counts into. The other options should
145
-
be straightforward to understand.</p>
140
+
file: point to a version that we've prepared for you. This config file
141
+
tells the program how to connect to the HBase cluster. Use
142
+
the <code>-table</code> option to name the table you're inserting the
143
+
word counts into. The other options should be straightforward to
144
+
understand.</p>
146
145
147
-
<p><B>Note:</b> Since HBase is a shared resource across the cluster,
148
-
please make your tables unique by using your username as part of the
149
-
table name, per above.</p>
146
+
<p><B>Note:</b> Since HBase is a shared resource, please make your
147
+
tables unique by using your username as part of the table name, per
148
+
above.</p>
150
149
151
150
<p>You should then be able to fetch the word counts from HBase:</p>