```{r, include = FALSE}
# enable python code previews; must use python 3
library(reticulate)
use_python("/usr/bin/python3")
ottrpal::set_knitr_image_path()
```
# Solutions {-}
## Refactoring Hands-On Exercise Solutions {-}
Create an AI prompt that fixes any formatting issues with the code that would cause it not to run.
:::{.query}
Can you help me fix any formatting issues in the following python code?
:::
:::{.ai_response}
```{python eval = FALSE, python.reticulate = FALSE}
import tweepy
import pandas
# Enter your API keys and access tokens here
consumer_key = 'your_consumer_key'
consumer_secret = 'your_consumer_secret'
access_token = 'your_access_token'
access_token_secret = 'your_access_token_secret'
# Authenticate with Twitter API
auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_token_secret)
# Search for tweets containing the search term
tweets = tweepy.Cursor(tweepy.API(auth).search_tweets, q="Mr. Trash Wheel", tweet_mode='extended').items(100)
# Create empty list to store tweet data
date_data = []
location_data = []
text_data = []
def get_tweet_length(tweet):
    # Return the length of the tweet text
    return len(tweet.full_text)
# Loop through each tweet and extract desired data
for tweet in tweets:
    date_info = {
        'date': tweet.created_at
    }
    date_data.append(date_info)
# Search for tweets containing the search term
tweets = tweepy.Cursor(tweepy.API(auth).search_tweets, q="Mr. Trash Wheel", tweet_mode='extended').items(100)
# Loop through each tweet and extract desired data
for tweet in tweets:
    location_info = {
        'location': tweet.user.location
    }
    location_data.append(location_info)
# Search for tweets containing the search term
tweets = tweepy.Cursor(tweepy.API(auth).search_tweets, q='Mr. Trash Wheel', tweet_mode='extended').items(100)
# Loop through each tweet and extract desired data
for tweet in tweets:
    text_info = {
        'text': tweet.full_text
    }
    text_data.append(text_info)
# Combine lists into a dictionary
data = {'date': date_data, 'location': location_data, 'text': text_data}
# Store results in pandas dataframe
df = pandas.DataFrame(data)
# Print dataframe
print(df)
```
:::
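
One way to confirm that a formatting fix worked, without calling the Twitter API, is to check that the code at least parses. The sketch below uses Python's built-in `ast` module; the helper name `syntax_problem` and the two snippets are hypothetical and only illustrate the idea. Parsing does not run the code, so it catches indentation and syntax errors but not wrong variable names or bad API calls.

```python
import ast

def syntax_problem(source):
    """Return None if `source` parses as Python, else a short error description."""
    try:
        ast.parse(source)
    except SyntaxError as err:  # IndentationError is a subclass of SyntaxError
        return f"line {err.lineno}: {err.msg}"
    return None

# Hypothetical snippets: the first has lost its indentation, the second is fixed.
broken = "for tweet in tweets:\ndate_data.append(tweet)\n"
fixed = "for tweet in tweets:\n    date_data.append(tweet)\n"

print(syntax_problem(broken))  # a description of the indentation error
print(syntax_problem(fixed))   # None, because the snippet parses
```

The exact error message differs between Python versions, so treat it as a hint about where to look rather than something to match exactly.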
Devise an AI prompt that removes any dead code from your sample above. What gets removed?
:::{.query}
Can you remove any unused functions from this python code?
:::
:::{.ai_response}
Yes, the `get_tweet_length()` function is not being used in this code. It can be removed.
```{python eval = FALSE, python.reticulate = FALSE}
import tweepy
import pandas
# Enter your API keys and access tokens here
consumer_key = 'your_consumer_key'
consumer_secret = 'your_consumer_secret'
access_token = 'your_access_token'
access_token_secret = 'your_access_token_secret'
# Authenticate with Twitter API
auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_token_secret)
# Search for tweets containing the search term
tweets = tweepy.Cursor(tweepy.API(auth).search_tweets, q="Mr. Trash Wheel", tweet_mode='extended').items(100)
# Create empty list to store tweet data
date_data = []
location_data = []
text_data = []
# Loop through each tweet and extract desired data
for tweet in tweets:
    date_info = {
        'date': tweet.created_at
    }
    date_data.append(date_info)
# Search for tweets containing the search term
tweets = tweepy.Cursor(tweepy.API(auth).search_tweets, q="Mr. Trash Wheel", tweet_mode='extended').items(100)
# Loop through each tweet and extract desired data
for tweet in tweets:
    location_info = {
        'location': tweet.user.location
    }
    location_data.append(location_info)
# Search for tweets containing the search term
tweets = tweepy.Cursor(tweepy.API(auth).search_tweets, q='Mr. Trash Wheel', tweet_mode='extended').items(100)
# Loop through each tweet and extract desired data
for tweet in tweets:
    text_info = {
        'text': tweet.full_text
    }
    text_data.append(text_info)
# Combine lists into a dictionary
data = {'date': date_data, 'location': location_data, 'text': text_data}
# Store results in pandas dataframe
df = pandas.DataFrame(data)
# Print dataframe
print(df)
```
:::
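
It can be worth double-checking an AI's claim that a function is unused. The sketch below is a deliberately loose check built on the standard `ast` module: it lists functions that are defined but whose names never appear anywhere else in the source. The helper `unused_functions` and the `sample` string (including the `get_date` function) are hypothetical and exist only for this illustration.

```python
import ast

def unused_functions(source):
    """Return names of functions defined in `source` that are never referenced."""
    tree = ast.parse(source)
    defined = {node.name for node in ast.walk(tree)
               if isinstance(node, ast.FunctionDef)}
    # Any bare name or attribute access counts as a use (a loose heuristic).
    used = {node.id for node in ast.walk(tree) if isinstance(node, ast.Name)}
    used |= {node.attr for node in ast.walk(tree) if isinstance(node, ast.Attribute)}
    return sorted(defined - used)

# Hypothetical sample: one helper is called, the other is dead code.
sample = '''
def get_tweet_length(tweet):
    return len(tweet.full_text)

def get_date(tweet):
    return tweet.created_at

dates = [get_date(t) for t in []]
'''

print(unused_functions(sample))  # ['get_tweet_length']
```

This heuristic can miss cases, for example a function that only calls itself, so dedicated linters are a better choice for real projects.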
Create a prompt that makes the code less repetitive, adhering to the DRY principle. What aspect of the code was repetitive?
:::{.query}
Can you help make this code more DRY?
:::
:::{.ai_response}
Yes, you can make this code more DRY by removing the code repetition. You can combine the three for loops into a single loop and extract all the desired data in one go. Here's an updated version of the code:
```{python eval = FALSE, python.reticulate = FALSE}
import tweepy
import pandas
# Enter your API keys and access tokens here
consumer_key = 'your_consumer_key'
consumer_secret = 'your_consumer_secret'
access_token = 'your_access_token'
access_token_secret = 'your_access_token_secret'
# Authenticate with Twitter API
auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_token_secret)
# Search for tweets containing the search term
tweets = tweepy.Cursor(tweepy.API(auth).search_tweets, q="Mr. Trash Wheel", tweet_mode='extended').items(100)
# Create empty list to store tweet data
data = []
# Loop through each tweet and extract desired data
for tweet in tweets:
    tweet_data = {
        'date': tweet.created_at,
        'location': tweet.user.location,
        'text': tweet.full_text
    }
    data.append(tweet_data)
# Store results in pandas dataframe
df = pandas.DataFrame(data)
# Print dataframe
print(df)
```
This code extracts the date, location, and text in a single loop and stores them in one `tweet_data` dictionary per tweet. Each dictionary is appended to the `data` list, which is then used to create the pandas dataframe.
:::
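
A refactor should change how the code is written, not what it produces. The sketch below compares a repetitive three-loop version with a single-loop version on stand-in data. The `SimpleNamespace` objects are hypothetical substitutes for tweets and only carry the three attributes used here; the function names `three_loops` and `one_loop` are also made up for this example.

```python
from types import SimpleNamespace

# Hypothetical stand-ins for tweet objects; real results carry many more fields.
tweets = [
    SimpleNamespace(created_at="2023-01-01",
                    user=SimpleNamespace(location="Baltimore"),
                    full_text="Hello"),
    SimpleNamespace(created_at="2023-01-02",
                    user=SimpleNamespace(location=""),
                    full_text="Trash wheel!"),
]

def three_loops(tweets):
    """Repetitive version: one pass over the tweets per field."""
    dates, locations, texts = [], [], []
    for tweet in tweets:
        dates.append(tweet.created_at)
    for tweet in tweets:
        locations.append(tweet.user.location)
    for tweet in tweets:
        texts.append(tweet.full_text)
    return [{'date': d, 'location': l, 'text': t}
            for d, l, t in zip(dates, locations, texts)]

def one_loop(tweets):
    """DRY version: collect all three fields in a single pass."""
    rows = []
    for tweet in tweets:
        rows.append({
            'date': tweet.created_at,
            'location': tweet.user.location,
            'text': tweet.full_text,
        })
    return rows

print(three_loops(tweets) == one_loop(tweets))  # True
```

Running both versions on a small fixed input like this is a cheap way to catch a refactor that silently drops or reorders data.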
Construct a prompt that makes the code more concise. What are some trade-offs that appear in this code between readability and brevity?
:::{.query}
Can you make the following python code more concise?
:::
:::{.ai_response}
```{python eval = FALSE, python.reticulate = FALSE}
import tweepy
import pandas as pd
# Enter your API keys and access tokens here
consumer_key = 'your_consumer_key'
consumer_secret = 'your_consumer_secret'
access_token = 'your_access_token'
access_token_secret = 'your_access_token_secret'
# Authenticate with Twitter API
auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_token_secret)
# Create the API client and search for tweets containing the search term
api = tweepy.API(auth)
tweets = tweepy.Cursor(api.search_tweets, q="Mr. Trash Wheel", tweet_mode='extended').items(100)
# Extract desired data from each tweet and store in a list
data = [{'date': tweet.created_at, 'location': tweet.user.location, 'text': tweet.full_text} for tweet in tweets]
# Store results in pandas dataframe
df = pd.DataFrame(data)
# Print dataframe
print(df)
```
This version removes the duplicate search code by combining the three search loops into one, and simplifies the loop to a list comprehension.
The list comprehension format might be less readable or intuitive to some programmers.
:::
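
One possible middle ground between brevity and readability is to keep the list comprehension but move the field selection into a small named function. The sketch below assumes hypothetical stand-in tweet objects built with `SimpleNamespace`, and the helper name `tweet_to_record` is invented for this example.

```python
from types import SimpleNamespace

def tweet_to_record(tweet):
    """Pick out the fields we want from one tweet-like object."""
    return {
        'date': tweet.created_at,
        'location': tweet.user.location,
        'text': tweet.full_text,
    }

# Hypothetical stand-in for a tweet; real results carry many more fields.
tweets = [
    SimpleNamespace(created_at="2023-01-01",
                    user=SimpleNamespace(location="Baltimore"),
                    full_text="Hello"),
]

# The comprehension stays short, and the helper's name documents what it does.
data = [tweet_to_record(tweet) for tweet in tweets]
print(data[0]['location'])  # Baltimore
```

The code is a few lines longer than the one-line comprehension, but the loop body now has a name and can be tested on its own.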