-
Notifications
You must be signed in to change notification settings - Fork 13
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix wahoo speed and import more meta data #81
Conversation
Wahoo Element Road saves its speed in m/s instead of km/h, so to get proper values the conversion has to be done explicitly. We use a simple heuristic to check if we do need to convert the units.
Extract further meta data like the sport from Wahoo FIT files. Use that data for the filename and to fill missing data.
Thank's for the PR, I'm not sure when I can review it, but I hope I will get to it soon. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for all the suggested changes. I've added a bunch of comments. I will take your ideas and come up with an architecture that is a bit more general and allows growing the metadata extraction concept to the other file formats as well.
@@ -3,3 +3,6 @@ __pycache__/ | |||
dist | |||
site | |||
data | |||
.venv | |||
.DS_Store |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've added those two files to the ignore on master
.
@@ -3,3 +3,6 @@ __pycache__/ | |||
dist | |||
site | |||
data | |||
.venv | |||
.DS_Store | |||
.gitignore |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a particular reason you include the gitignore in itself?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, just a habit, so that I don't have to take care of strange OS and editors of contributors, like you have to right now :)
@@ -89,6 +90,19 @@ def get_time_series(self, id: int) -> pd.DataFrame: | |||
* 3.6 | |||
) | |||
changed = True | |||
else: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a nice idea, but in #82 there came the suggestion to use the units
field from the FIT files. I think that makes more sense to use the actual data instead of a heuristic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I do agree. Shall I rebase on the updated master which includes PR #82 ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think that rebasing makes sense. I've already manually cherry-picked some of your ideas into the master
branch. If you want to want to make further changes I'd suggest that you base them on the latest master
. And perhaps let me know what you plan, then I can perhaps already implement it based on your idea.
@@ -20,7 +20,9 @@ class ActivityParseError(BaseException): | |||
|
|||
def read_activity(path: pathlib.Path) -> pd.DataFrame: | |||
suffixes = path.suffixes | |||
if suffixes[-1] == ".gz": | |||
if not suffixes: # Skip files without extensions like .DS_Store files on macos. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a good idea, I've added a skip.
elif file_type in [".kml", ".kmz"]: | ||
df = read_kml_activity(path, opener) | ||
elif file_type == ".csv": # Simra csv export | ||
df = read_simra_activity(path) | ||
else: | ||
raise ActivityParseError(f"Unsupported file format: {file_type}") | ||
raise ActivityParseError(f"Unsupported file format: {file_type} of file: {path}") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The path will be added in the import_from_directory
function, so we don't need to repeat it here.
@@ -120,13 +123,26 @@ def read_fit_activity(path: pathlib.Path, open) -> pd.DataFrame: | |||
row["altitude"] = fields["altitude"] | |||
if "enhanced_altitude" in fields: | |||
row["altitude"] = fields["enhanced_altitude"] | |||
if "grade" in fields: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Those three additional fields are now added to the time series. There is no analysis on them yet, though.
df = pd.DataFrame(rows) | ||
if metadata: | ||
for key, value in metadata.items(): | ||
setattr(df, key, value) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see what you want to do, but I am not too happy with adding this to the data frame object. I will think about a more general way to extract metadata from the files as there are also some GPX files with metadata.
|
||
return pd.DataFrame(rows) | ||
elif "wkt_name" in fields and "sport" in fields and "sub_sport" in fields: | ||
metadata["wkt_name"] = fields["wkt_name"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is “wkt name”? Can you give examples for the values that it takes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I've found some documentation for this:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Apparently the contents are just strings. I've also added that to master
now.
return pd.DataFrame(rows) | ||
elif "wkt_name" in fields and "sport" in fields and "sub_sport" in fields: | ||
metadata["wkt_name"] = fields["wkt_name"] | ||
metadata["sport"] = (fields["sport"], fields["sub_sport"]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How does sport and sub sport differ from the activity kind (ride, walk, hike)? Can you give examples?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, so it is just something “cycling” and “generic”. I've added that into the kind
field.
row = { | ||
"id": activity_id, | ||
"commute": commute, | ||
"distance": distance, | ||
"name": path.stem, | ||
"filename": path.stem, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Adding the filename to the metadata seems very sensible!
Convert speed from m/s to km/h
Use name and kind from metadata