|
| 1 | +<!DOCTYPE html> |
| 2 | +<html> |
| 3 | +<head> |
| 4 | + |
| 5 | + <meta name="viewport" content="width=device-width, initial-scale=1"> |
| 6 | + |
| 7 | + <!-- Latest compiled and minified CSS --> |
| 8 | + <link rel="stylesheet" href="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.7/css/bootstrap.min.css" integrity="sha384-BVYiiSIFeK1dGmJRAkycuHAHRg32OmUcww7on3RYdg4Va+PmSTsz/K68vbdEjh4u" crossorigin="anonymous"> |
| 9 | + |
| 10 | + <!-- Optional theme --> |
| 11 | + <link rel="stylesheet" href="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.7/css/bootstrap-theme.min.css" integrity="sha384-rHyoN1iRsVXV4nD0JutlnGaslCJuC7uwjduW9SVrLvRYooPp2bWYgmgJQIXwl/Sp" crossorigin="anonymous"> |
| 12 | + |
| 13 | + <!-- Latest compiled and minified JavaScript --> |
| 14 | + <script src="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.7/js/bootstrap.min.js" integrity="sha384-Tc5IQib027qvyjSMfHjOMaLkfuWVxZxUPnCJA7l2mCWNIpG9mGCD8wGNIcPD7Txa" crossorigin="anonymous"></script> |
| 15 | + |
| 16 | + <link rel="stylesheet" href="../style.css" /> |
| 17 | + |
| 18 | + <title>Data Cleaning</title> |
| 19 | + |
| 20 | +</head> |
| 21 | +<body> |
| 22 | +<p><a href="../index.html">Back to Homepage</a></p> |
| 23 | + |
| 24 | +<!-- UPDATE --> |
| 25 | +<img src="../images/bricks.jpg" class="img-responsive" /> |
| 26 | +<small>Credit: Takeshi Hirano</small> |
| 27 | + |
| 28 | +<!-- UPDATE --> |
| 29 | +<h1>Data Cleaning with Wrangler</h1> |
| 30 | +<div class="lead">Benji Xie &amp; Greg Nelson</div> |
| 31 | + |
| 32 | +<p>Today you're going to practice using Trifacta Wrangler to clean some data.</p> |
| 33 | + |
| 34 | +<h2>Download Wrangler</h2> |
| 35 | + |
| 36 | +<p>Wrangler is free to download: <a href="https://www.trifacta.com/products/wrangler/" target="_blank">Wrangler download page</a></p> |
| 37 | + |
| 38 | +<h2>Transform obituary data from disease simulation</h2> |
| 39 | + |
| 40 | +<p>Do the following:</p> |
| 41 | + |
| 42 | +<ol> |
| 43 | + <li>Export the <a href="https://goo.gl/GKxUB7" target="_blank">obituary data</a> from Google Sheets as a CSV file.</li> |
| 44 | + <li>Import the data into Wrangler.</li> |
| 45 | + <li>Start transforming the data!</li> |
| 46 | +</ol> |
| 47 | + |
| 48 | +<p>Tips for data cleaning with Wrangler:</p> |
| 49 | + |
| 50 | +<ul> |
| 51 | + <li>Try highlighting part of the data. Wrangler recommends "recipes" based on what you highlighted, and you can modify the recommendations if necessary.</li> |
| 52 | + <li>Split the problem into sub-steps. Do one step in one new column and the next step in another, then combine the two new columns.</li> |
| 53 | + <li>Consult the documentation. Trifacta has great <a href="https://docs.trifacta.com/display/PE/Workflow+Basics" target="_blank">documentation</a> |
| 54 | + and <a href="https://www.trifacta.com/support/articles/topics/125211-online-training/" target="_blank">online training</a> to teach you how to use Wrangler.</li> |
| 55 | +</ul> |
| 56 | + |
| 57 | +<p>Be sure to clean your data with a specific goal in mind!</p> |
| 58 | + |
| 59 | +</body> |
| 60 | + |
| 61 | +</html> |