👋 @loujr - Thank you for this amazing write-up! 🎉
This was the inspiration I needed to write an automated pre-script to do just that ⬆️. My hope is that others find this next piece useful 🔽
run_discovery.sh
#!/bin/bash
#
# Example Usage:
# $ bash run_discovery.sh
# Please enter the organization name:
# My-Super-Cool-ORG
# Fail early if no API token is available.
if [ -z "$TOKEN" ]; then
  echo "Error: Please set the GitHub API token in the TOKEN environment variable."
  echo "Example: $ export TOKEN=ghp_****"
  exit 1
fi
echo "Please enter the organization name: "
read -r orgName
url="https://api.github.com/orgs/$orgName/repos?per_page=100"
# Build the output filename once so every repository lands in the same file.
date=$(date +"%Y-%m-%d_%H-%M")
filename="/tmp/${date}_discovered_repositories.tmp"
touch "$filename"
repos=()
page=1
while true; do
  # Collect name/private pairs, one value per line, for the current page.
  page_repos=$(curl -s -H "Authorization: token $TOKEN" "$url&page=$page" | jq -r '.[] | .name, .private')
  while IFS= read -r repo; do
    [ -n "$repo" ] && repos+=("$repo")
  done <<< "$page_repos"
  # Header names may arrive in any case, so match "link:" case-insensitively.
  headers=$(curl -s -I -H "Authorization: token $TOKEN" "$url&page=$page")
  link_header=$(echo "$headers" | awk 'tolower($0) ~ /^link:/ {print $0}')
  echo ""
  echo "Discovering Repo Listing Standby: "
  echo ""
  echo "$link_header"
  # Stop when the Link header no longer advertises a next page.
  if ! echo "$link_header" | grep -q 'rel="next"'; then
    break
  fi
  ((page++))
  sleep 6
done
# repos holds alternating name/private values; only the name is written out.
for ((i=0; i<${#repos[@]}; i+=2)); do
  visibility="Public"
  if [ "${repos[$i+1]}" = "true" ]; then
    visibility="Private"
  fi
  repoName="${repos[$i]}"
  echo "$repoName" >> "$filename"
done
echo ""
echo "Discovered Repository Count: $(wc -l < "$filename")"
echo "Repo Listing located in: $filename"
For full inline notes please see here.
-
Navigating large datasets can present a challenge when making some API calls. To make navigation and parsing easier, GitHub uses pagination, which divides large datasets into smaller chunks. These chunks become pages that you can navigate through, much like pages on the web. This guide covers the two methods for pagination within GitHub:
Cursor Based Pagination
Page Based Pagination
To start with, it's important to know a few facts about receiving paginated items:
Different API calls respond with different defaults. For example, a call to List public repositories provides paginated items in sets of 30, whereas a call to the GitHub Search API provides items in sets of 100.
You can specify how many items to receive (up to a maximum of 100); but, for technical reasons, not every endpoint behaves the same. For example, events won't let you set a maximum for items to receive. Be sure to read the documentation on how to handle paginated results for specific endpoints.
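As a minimal sketch of the second point, the helper below appends a per_page value to a request URL and clamps it to the 1 to 100 range mentioned above. The function name with_per_page is hypothetical (it is not part of any GitHub tooling), and an endpoint may still ignore the parameter, so check its documentation.

```shell
# with_per_page is a hypothetical helper for illustration only.
# It appends per_page to a URL, clamping the value to the range 1..100.
with_per_page() {
  local url="$1" per_page="$2"
  if [ "$per_page" -gt 100 ]; then per_page=100; fi
  if [ "$per_page" -lt 1 ]; then per_page=1; fi
  # Use '&' when the URL already has a query string, '?' otherwise.
  case "$url" in
    *\?*) printf '%s&per_page=%s\n' "$url" "$per_page" ;;
    *)    printf '%s?per_page=%s\n' "$url" "$per_page" ;;
  esac
}
```

The resulting URL can be passed to curl in the same way as the requests shown in this guide.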
Pagination begins with the headers of the response. The following is an example of an authenticated curl request to view the audit log of our organization:
This is standard HTTP output; the link section forms the Link Header of the API call. The -I parameter returns only the header information and not the contents.
In examining the header information, the Link Header of this request is located in this section of the response:
Let's break down this Link Header. The audit log uses the pagination terms before and after. These terms will be explained in Navigation Through the Pages. rel=next says that the next page is located at after=MS42NjQzODM5MTkzNDdlKzEyfDM0MkI6NDdBNDo4RTFGMEM6NUIyQkZCMzo2MzM0N0JBRg%3D%3D&before=>.
This is an example of a Link Header that uses page. Notice that instead of being provided cursor links, you are given page numbers to reference. In this example, rel="next" shows that the next page is 2 (page=2), while the last page is 34 (page=34). This is in contrast to before and after, which do not contain these references. This means that you are on page one, as pagination defaults to the first page, and there are 33 more pages of information about addClass.
Note: Always rely on these link relations provided to you. Don't try to guess or construct your own URL.
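One way to follow that note is to read the URL for a given relation straight out of the Link Header instead of building it by hand. The sketch below assumes the usual `<url>; rel="name"` layout; link_rel_url is a hypothetical helper name, and the sample header is illustrative, assembled from the page numbers described above rather than copied from a real response.

```shell
# link_rel_url is a hypothetical helper for illustration only.
# Given a Link Header value and a relation name (next, prev, first, last),
# it prints the URL attached to that relation, or nothing if it is absent.
link_rel_url() {
  local header="$1" rel="$2"
  printf '%s\n' "$header" \
    | sed -n "s/.*<\([^>]*\)>; *rel=\"$rel\".*/\1/p"
}

# Illustrative header (not real output) using the page numbers from the text.
sample='<https://api.github.com/search/code?q=addClass+user%3Amozilla&page=2>; rel="next", <https://api.github.com/search/code?q=addClass+user%3Amozilla&page=34>; rel="last"'

link_rel_url "$sample" next   # prints the page=2 URL
link_rel_url "$sample" last   # prints the page=34 URL
```

The same function works for cursor links, because it only looks at the relation name and never inspects the query string.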
Using Cursor Based Pagination
There are two ways to navigate using pagination. Which one applies depends on the output of your Link Header.
before= indicates that your pagination terms use before and after.
Before and After
To navigate using before and after, copy the Link Header generated above into your curl request:
This will generate a page of 100 items and new header information that you can use to make the next request. The important part here is that the link needs to be generated rather than manually entered. Copy the entire link into the following output.
rel="next" provides the next 100 items of results.
rel="prev" provides the previous 100 items of results.
Using Page Based Pagination
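With page numbers, the total number of pages can be read from the rel="last" link. The following is a minimal sketch, assuming the Link Header has the `<url>; rel="last"` layout and that the URL carries a page= query parameter; last_page_number is a hypothetical helper name.

```shell
# last_page_number is a hypothetical helper for illustration only.
# It prints the value of the page= parameter found in the rel="last" URL
# of a Link Header, or nothing if there is no rel="last" entry.
last_page_number() {
  local header="$1"
  printf '%s\n' "$header" \
    | sed -n 's/.*[?&]page=\([0-9][0-9]*\)[^>]*>; *rel="last".*/\1/p'
}
```

Cursor based headers have no rel="last" page number, so the function prints nothing for them.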
Now that you know how many pages there are to receive, you can start navigating through the pages to consume the results. You do this by passing in a page parameter. By default, page always starts at 1. Let's jump ahead to page 14 and see what happens:
$ curl -I "https://api.github.com/search/code?q=addClass+user:mozilla&page=14"
Here's the link header once more:
As expected, rel="next" is at 15, and rel="last" is still 34. But now we've got some more information: rel="first" indicates the URL for the first page, and more importantly, rel="prev" lets you know the page number of the previous page. Using this information, you could construct some UI that lets users jump between the first, previous, next, or last list of results in an API call.
Changing the number of items received
By passing the per_page parameter, you can specify how many items you want each page to return, up to 100 items. Let's try asking for 50 items about addClass:
$ curl -I "https://api.github.com/search/code?q=addClass+user:mozilla&per_page=50"
Notice what it does to the header response:
As you might have guessed, the rel="last" information says that the last page is now 20. This is because we are asking for more information per page about our results.
Conclusion
Pagination is a critical tool for navigating API queries on large datasets. In this article, we outlined the two pagination methods within GitHub's REST API: cursor and page based pagination. When navigating using cursor based pagination, it is important to use the generated header links before and after. Page based pagination uses page numbers to navigate those datasets. Some REST API endpoints use page based pagination, some only respond to cursor based pagination, while others might respond to both. If you are unsure about which pagination method to use, you can get additional instructions from the header information of the REST API call that you just made.
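As a closing sketch, the advice to follow generated links can be expressed as a loop that keeps requesting the rel="next" URL until the header stops offering one. This works the same way for cursor and page based headers. The names fetch_headers, next_link, and walk_pages are hypothetical, and TOKEN is assumed to hold a GitHub API token; the loop only walks the header chain and leaves fetching each page body to the caller.

```shell
# All three function names below are hypothetical, for illustration only.

# Print the response headers for a URL. Assumes TOKEN holds an API token.
fetch_headers() {
  curl -s -I -H "Authorization: token $TOKEN" "$1"
}

# Read headers on stdin and print the rel="next" URL, or nothing if absent.
# Header names are matched case-insensitively and carriage returns removed.
next_link() {
  tr -d '\r' \
    | sed -n 's/^[Ll][Ii][Nn][Kk]:.*<\([^>]*\)>; *rel="next".*/\1/p'
}

# Starting from the URL in $1, print each page URL in the order visited.
walk_pages() {
  local url="$1"
  while [ -n "$url" ]; do
    printf '%s\n' "$url"
    url=$(fetch_headers "$url" | next_link)
  done
}
```

A call such as walk_pages "https://api.github.com/orgs/my-org/repos?per_page=100" (my-org being a placeholder) would list every page URL; adding a pause between requests, as the script in the reply does, is a reasonable precaution against rate limits.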