feat: split session time into human hours vs machine hours #3921
marcusquinn merged 3 commits into main
Replace opaque session duration with precise human/machine time split using SQLite LAG() window functions on message timestamps:
- human_time = gap between assistant completing and next user message (reading + thinking + typing)
- machine_time = assistant.completed - assistant.created (AI generating)

Interactive sessions are ~70-85% human time; workers are ~0%. This surfaces the actual human effort for transparency.

Add quarter period support (day/week/month/quarter/year). Add PR #N pattern to worker session classification.
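The two definitions above can be tried out in isolation. The following is a minimal sketch, not the script's actual query: it assumes a simplified, hypothetical `message(session_id, role, created, completed)` table with plain integer millisecond columns instead of the JSON `data` column, and it keeps the 1 hour cap on human gaps that the reviewed query uses.

```python
import sqlite3

def split_times(rows):
    """Return {session_id: (human_ms, machine_ms)}.

    rows: iterable of (session_id, role, created_ms, completed_ms).
    The flat table layout is an assumption made for this sketch.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE message "
        "(session_id TEXT, role TEXT, created INTEGER, completed INTEGER)"
    )
    conn.executemany("INSERT INTO message VALUES (?, ?, ?, ?)", rows)
    query = """
        WITH msg_data AS (
            SELECT session_id, role, created, completed,
                   LAG(role) OVER (PARTITION BY session_id ORDER BY created)
                       AS prev_role,
                   LAG(completed) OVER (PARTITION BY session_id ORDER BY created)
                       AS prev_completed
            FROM message
        )
        SELECT session_id,
               -- human time: gap between assistant finishing and next user message
               SUM(CASE WHEN role = 'user' AND prev_role = 'assistant'
                         AND prev_completed IS NOT NULL
                         AND (created - prev_completed) BETWEEN 1 AND 3600000
                        THEN created - prev_completed ELSE 0 END) AS human_ms,
               -- machine time: how long each assistant message took to generate
               SUM(CASE WHEN role = 'assistant' AND completed IS NOT NULL
                         AND (completed - created) > 0
                        THEN completed - created ELSE 0 END) AS machine_ms
        FROM msg_data
        GROUP BY session_id
    """
    result = {sid: (human, machine) for sid, human, machine in conn.execute(query)}
    conn.close()
    return result
```

Gaps longer than one hour contribute nothing to `human_ms`, so an overnight pause in a session is treated as idle time rather than human effort.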
Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the session time reporting by introducing a detailed breakdown of human versus machine interaction time. This change provides a clearer understanding of how time is spent in interactive and worker sessions, moving beyond a single, aggregated duration metric. Additionally, it extends the reporting capabilities by adding support for quarterly periods, offering more flexibility in analyzing activity trends.
🔍 Code Quality Report
[MONITOR] Code Review Monitoring Report
[INFO] Latest Quality Status:
[INFO] Recent monitoring activity:
📈 Current Quality Metrics
Generated on: Sun Mar 8 20:44:37 UTC 2026
Generated by AI DevOps Framework Code Review Monitoring
Walkthrough

The contributor activity helper script now computes per-session human vs machine interaction time using message-level timestamps and SQL windowed functions. Period support extends to quarters, and JSON/markdown outputs expand to include separate human and machine hour metrics for interactive and worker session categories, with cross-repo aggregations using set unions and per-type accumulation.
Sequence Diagram

sequenceDiagram
participant Input as Message Feed<br/>(Timestamps + Roles)
participant SQL as SQL Engine<br/>(LAG Window)
participant Python as Session<br/>Classifier
participant Output as Output<br/>Formatter
Input->>SQL: Raw messages with timestamps
SQL->>SQL: Compute LAG() for gaps
SQL->>SQL: Calculate human_ms, machine_ms<br/>per session
SQL->>SQL: Cap human gaps to 1hr,<br/>filter idle
SQL->>Python: Aggregated session metrics<br/>(human/machine hours)
Python->>Python: Classify by worker patterns<br/>(with PR detection)
Python->>Python: Accumulate per-type<br/>human/machine times
Python->>Output: Classified sessions<br/>with time metrics
Output->>Output: Generate markdown table<br/>(Type, Hours, Sessions)
Output->>Output: Generate JSON<br/>(nested aggregations)
Output-->>Input: Console & JSON output
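The classification and accumulation steps in the diagram can be sketched as below. This is an illustration only: the `PR #N` title pattern is the one named in this PR, while `WORKER_PATTERNS`, `classify`, and `accumulate` are hypothetical names and the real script may match further worker title patterns that are not shown here.

```python
import re

# Only the "PR #N" pattern comes from the PR description; the script's
# other worker patterns are not reproduced in this sketch.
WORKER_PATTERNS = [re.compile(r"\bPR #\d+\b")]

def classify(title):
    """Label a session 'worker' if its title matches a worker pattern."""
    if any(p.search(title or "") for p in WORKER_PATTERNS):
        return "worker"
    return "interactive"

def accumulate(sessions):
    """Sum session counts and human/machine milliseconds per session type."""
    totals = {
        "interactive": {"count": 0, "human_ms": 0, "machine_ms": 0},
        "worker": {"count": 0, "human_ms": 0, "machine_ms": 0},
    }
    for s in sessions:
        bucket = totals[classify(s["title"])]
        bucket["count"] += 1
        bucket["human_ms"] += s["human_ms"]
        bucket["machine_ms"] += s["machine_ms"]
    return totals
```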
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks: ✅ 3 passed
Code Review
This pull request introduces more granular session time tracking by splitting it into 'human hours' and 'machine hours', providing a clearer picture of user engagement versus automated processing. It also adds support for a 'quarter' time period and expands the classification for worker sessions. A critical security concern is the potential for SQL injection due to direct variable interpolation in SQLite queries; a more robust parameterized approach is recommended. Additionally, minor code style improvements are suggested for better readability in the embedded Python scripts.
  query_result=$(sqlite3 -json "$db_path" "
+   WITH msg_data AS (
+     SELECT
+       s.id AS session_id,
+       s.title,
+       json_extract(m.data, '\$.role') AS role,
+       m.time_created AS created,
+       json_extract(m.data, '\$.time.completed') AS completed,
+       LAG(json_extract(m.data, '\$.role'))
+         OVER (PARTITION BY s.id ORDER BY m.time_created) AS prev_role,
+       LAG(json_extract(m.data, '\$.time.completed'))
+         OVER (PARTITION BY s.id ORDER BY m.time_created) AS prev_completed
+     FROM session s
+     JOIN message m ON m.session_id = s.id
+     WHERE s.parent_id IS NULL
+       AND m.time_created > ${since_ms}
+       AND (s.directory = '${safe_path}'
+         OR s.directory LIKE '${like_path}.%' ESCAPE '\\'
+         OR s.directory LIKE '${like_path}-%' ESCAPE '\\')
+   )
    SELECT
-     s.title,
-     (max(m.time_created) - min(m.time_created)) as duration_ms
-   FROM session s
-   JOIN message m ON m.session_id = s.id
-   WHERE s.parent_id IS NULL
-     AND m.time_created > ${since_ms}
-     AND (s.directory = '${safe_path}'
-       OR s.directory LIKE '${like_path}.%' ESCAPE '\\'
-       OR s.directory LIKE '${like_path}-%' ESCAPE '\\')
-   GROUP BY s.id
-   HAVING count(m.id) >= 2
-     AND duration_ms > 5000
+     session_id,
+     title,
+     SUM(CASE
+       WHEN role = 'user' AND prev_role = 'assistant'
+         AND prev_completed IS NOT NULL
+         AND (created - prev_completed) BETWEEN 1 AND 3600000
+       THEN created - prev_completed
+       ELSE 0
+     END) AS human_ms,
+     SUM(CASE
+       WHEN role = 'assistant' AND completed IS NOT NULL
+         AND (completed - created) > 0
+       THEN completed - created
+       ELSE 0
+     END) AS machine_ms
+   FROM msg_data
+   GROUP BY session_id
+   HAVING human_ms + machine_ms > 5000
  ") || query_result="[]"
The SQL query on line 540 (in the session_time function) is constructed by interpolating shell variables ${since_ms}, ${safe_path}, and ${like_path} directly into the query string. This direct interpolation is vulnerable to SQL injection if an attacker can control the input (e.g., the repo path argument), potentially allowing arbitrary SQL commands against the Claude.db database. To prevent SQL injection in shell scripts using sqlite3, it is recommended to create a helper function that uses .param set for safe parameterized bindings instead of direct string interpolation. This approach is more robust than manual escaping.
References
- To prevent SQL injection in shell scripts using sqlite3, create a helper function that uses .param set for safe parameterized bindings instead of direct string interpolation.
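To make the idea of bound parameters concrete, here is a minimal sketch. Note the swap: it uses placeholder binding from Python's `sqlite3` module rather than the sqlite3 CLI's `.param set` that the reviewer recommends, and it runs against a simplified, hypothetical `session(id, parent_id, directory, time_created)` table. The helper names `escape_like` and `sessions_for_repo` are illustrative and do not exist in the script.

```python
import sqlite3

def escape_like(value, escape="\\"):
    """Escape LIKE wildcards so the path is matched literally."""
    for ch in (escape, "%", "_"):
        value = value.replace(ch, escape + ch)
    return value

def sessions_for_repo(conn, repo_path, since_ms):
    """Return ids of top-level sessions for a repo path or its worktrees."""
    like_path = escape_like(repo_path)
    query = """
        SELECT id FROM session
        WHERE parent_id IS NULL
          AND time_created > :since_ms
          AND (directory = :path
               OR directory LIKE :dot_prefix ESCAPE '\\'
               OR directory LIKE :dash_prefix ESCAPE '\\')
        ORDER BY id
    """
    params = {
        "since_ms": since_ms,
        "path": repo_path,
        "dot_prefix": like_path + ".%",    # e.g. /repo.worktree
        "dash_prefix": like_path + "-%",   # e.g. /repo-feature
    }
    # Values travel as bound parameters and are never spliced into the SQL text.
    return [row[0] for row in conn.execute(query, params)]
```

Because the path is bound rather than interpolated, a quote character in `repo_path` is compared as data and cannot terminate the string literal.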
print(f'| Interactive | {i[\"count\"]} | {i_human_h}h | {i_machine_h}h |')
print(f'| Workers/Runners | {w[\"count\"]} | — | {w_machine_h}h |')
The backslashes (\) used to escape the double quotes within these f-strings are unnecessary. Since the f-strings are delimited by single quotes ('...'), the inner double quotes for dictionary key access (["count"]) are treated as literal characters by Python. The shell also passes them through correctly because the entire Python script is wrapped in double quotes. Removing these redundant escapes improves code readability.
-print(f'| Interactive | {i[\"count\"]} | {i_human_h}h | {i_machine_h}h |')
-print(f'| Workers/Runners | {w[\"count\"]} | — | {w_machine_h}h |')
+print(f'| Interactive | {i["count"]} | {i_human_h}h | {i_machine_h}h |')
+print(f'| Workers/Runners | {w["count"]} | — | {w_machine_h}h |')
print(f'| Interactive | {i[\"interactive_sessions\"]} | {i[\"interactive_human_hours\"]}h | {i[\"interactive_machine_hours\"]}h |')
print(f'| Workers/Runners | {i[\"worker_sessions\"]} | — | {i[\"worker_machine_hours\"]}h |')
Similar to the other comment, the backslashes (\) for escaping double quotes here are redundant. The f-strings use single quotes, so the inner double quotes for dictionary keys don't need to be escaped for either the shell or Python. Removing them will make the code cleaner and easier to maintain.
-print(f'| Interactive | {i[\"interactive_sessions\"]} | {i[\"interactive_human_hours\"]}h | {i[\"interactive_machine_hours\"]}h |')
-print(f'| Workers/Runners | {i[\"worker_sessions\"]} | — | {i[\"worker_machine_hours\"]}h |')
+print(f'| Interactive | {i["interactive_sessions"]} | {i["interactive_human_hours"]}h | {i["interactive_machine_hours"]}h |')
+print(f'| Workers/Runners | {i["worker_sessions"]} | — | {i["worker_machine_hours"]}h |')
Actionable comments posted: 1
🧹 Nitpick comments (2)
.agents/scripts/contributor-activity-helper.sh (2)
668-668: Minor doc nit: missing quarter in period list.

The comment says --period day|week|month|year but quarter is now supported (it passes through to session_time, which handles it). Consider updating for consistency.

-# --period day|week|month|year (optional, default: month)
+# --period day|week|month|quarter|year (optional, default: month)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.agents/scripts/contributor-activity-helper.sh at line 668, Update the inline comment that lists supported --period values to include "quarter" so it reflects current behavior; specifically change the comment string "# --period day|week|month|year (optional, default: month)" to include quarter (e.g. "# --period day|week|month|quarter|year (optional, default: month)") so it matches how the flag is passed through to session_time and supported by the script.
625-643: Note: worker_human_hours is computed but not exposed in JSON.

The code correctly accumulates w['human_ms'] (line 630) into total_human_h, but doesn't expose worker_human_hours as a separate JSON field. This is fine for single-repo output (workers have ~0% human time anyway), but creates an aggregation gap in cross_repo_session_time; see the related comment there.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.agents/scripts/contributor-activity-helper.sh around lines 625 - 643, The result JSON is missing worker_human_hours: compute worker_human_h = ms_to_h(w['human_ms']) (using the existing ms_to_h) and add a 'worker_human_hours': worker_human_h entry to the result dict so worker human time is exposed (refer to variables i, w, ms_to_h and the result dictionary).
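A rough sketch of what the suggested result dictionary could look like once `worker_human_hours` is exposed. The names `i`, `w`, `ms_to_h`, and the `*_hours` keys come from the review text; the body of `ms_to_h` (one-decimal rounding), the `build_result` wrapper, and the `total_machine_hours` key are assumptions made for this illustration.

```python
def ms_to_h(ms):
    """Convert milliseconds to hours; rounding to one decimal is an assumption."""
    return round(ms / 3_600_000, 1)

def build_result(i, w):
    """Build the per-repo JSON result from interactive (i) and worker (w) totals."""
    return {
        "interactive_sessions": i["count"],
        "interactive_human_hours": ms_to_h(i["human_ms"]),
        "interactive_machine_hours": ms_to_h(i["machine_ms"]),
        "worker_sessions": w["count"],
        # The field the review asks to expose for schema completeness.
        "worker_human_hours": ms_to_h(w["human_ms"]),
        "worker_machine_hours": ms_to_h(w["machine_ms"]),
        "total_human_hours": ms_to_h(i["human_ms"] + w["human_ms"]),
        # Hypothetical key, included only to mirror total_human_hours.
        "total_machine_hours": ms_to_h(i["machine_ms"] + w["machine_ms"]),
    }
```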
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.agents/scripts/contributor-activity-helper.sh:
- Around line 747-748: The cross-repo total_human_h is missing worker human
hours; in cross_repo_session_time() update the aggregation to include worker
human time as session_time() does (i.e., sum interactive human hours and worker
human hours from the per-repo totals) — use the per-repo totals keys (e.g.,
totals['interactive_human_hours'] plus totals['worker_human_hours'] or
totals['total_human_hours'] if present) when computing total_human_h so the
cross-repo report matches session_time()’s definition.
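The aggregation fix described in the inline comment might look roughly like this. It is a sketch under assumptions: `cross_repo_totals` is a hypothetical stand-in for the Python embedded in `cross_repo_session_time()`, and the per-repo dictionaries are assumed to carry the `*_hours` keys named in the review.

```python
SUM_KEYS = [
    "interactive_sessions",
    "interactive_human_hours",
    "interactive_machine_hours",
    "worker_sessions",
    "worker_human_hours",
    "worker_machine_hours",
]

def cross_repo_totals(per_repo):
    """Sum per-repo totals; total human hours include worker human hours."""
    totals = {key: 0 for key in SUM_KEYS}
    for repo in per_repo:
        for key in SUM_KEYS:
            # Older per-repo results may lack worker_human_hours; treat as 0.
            totals[key] += repo.get(key, 0)
    for key in SUM_KEYS:
        if key.endswith("_hours"):
            totals[key] = round(totals[key], 1)
    # Same definition as the per-repo report: interactive plus worker human time.
    totals["total_human_hours"] = round(
        totals["interactive_human_hours"] + totals["worker_human_hours"], 1
    )
    return totals
```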
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 8c3fdbac-63ac-4c24-acf1-ff8a45add38d
📒 Files selected for processing (1)
.agents/scripts/contributor-activity-helper.sh
Ensures cross-repo total includes worker human hours (negligible but consistent with per-repo session_time() which sums both).
🔍 Code Quality Report
[MONITOR] Code Review Monitoring Report
[INFO] Latest Quality Status:
[INFO] Recent monitoring activity:
📈 Current Quality Metrics
Generated on: Sun Mar 8 20:53:04 UTC 2026
Generated by AI DevOps Framework Code Review Monitoring
Finding addressed in follow-up commit
…comment

Address CodeRabbit nitpicks:
- Expose worker_human_hours in session_time() JSON result for schema completeness
- Add worker_human_hours aggregation in cross_repo_session_time()
- Update cross_repo_session_time comment to include quarter in period list
🔍 Code Quality Report
[MONITOR] Code Review Monitoring Report
[INFO] Latest Quality Status:
[INFO] Recent monitoring activity:
📈 Current Quality Metrics
Generated on: Sun Mar 8 20:58:43 UTC 2026
Generated by AI DevOps Framework Code Review Monitoring



Summary
LAG() window functions on message timestamps
Why
The previous session time metric (total duration) was misleading — it mixed human effort with machine generation time and idle gaps. For transparency of effort, you need to know: "how many hours did the human actually spend reading and directing?"
Results (aidevops repo)
Interactive sessions are ~70-85% human time. Worker sessions are ~0% human time (confirmed: 2.3h human vs 100.1h machine for workers last month).
Changes
contributor-activity-helper.sh
- session_time() SQL to use LAG() window function computing human_ms and machine_ms per session
- cross_repo_session_time() to aggregate new fields
- interactive_human_hours, interactive_machine_hours, worker_machine_hours
- quarter (90 days) to period options
- PR #N to worker session title patterns

No changes to pulse-wrapper.sh
The body assembly embeds markdown from the helper; the new table format flows through automatically.
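How a period name could translate into the `since_ms` cutoff used by the query is sketched below. Only `quarter` = 90 days and the `month` default are stated in this PR; the other day counts, the `PERIOD_DAYS` table, and the Python `since_ms` helper are assumptions for illustration, since the real script computes the cutoff in shell.

```python
import time

# quarter = 90 days is stated in the PR; the remaining values are assumed.
PERIOD_DAYS = {"day": 1, "week": 7, "month": 30, "quarter": 90, "year": 365}

MS_PER_DAY = 86_400_000

def since_ms(period="month", now_ms=None):
    """Return the epoch-millisecond cutoff for the start of the period."""
    if period not in PERIOD_DAYS:
        raise ValueError(f"unsupported period: {period}")
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    return now_ms - PERIOD_DAYS[period] * MS_PER_DAY
```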