Confluence Snapshot

Confluence Snapshot is a Python application that downloads the pages of a Confluence space as PDF files, together with their attachments. It may be useful if you don't have access to the PDF export API or to other page export and automation APIs (a Confluence admin may have disabled them).

Main features:

- Pages are downloaded as PDF files using Selenium.
- Attachments are downloaded separately using the Confluence REST API.
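
To illustrate the attachment step, here is a minimal sketch of how a page's attachments could be listed through the Confluence REST API. It is not taken from the project's code. The `content/{id}/child/attachment` resource and the `results`, `title` and `_links.download` fields are part of the Confluence REST API; the function names, the sample data, and the assumption that the download link is relative to `web_url` are illustrative.

```python
def attachments_endpoint(api_url: str, page_id: str) -> str:
    """Return the REST endpoint that lists the attachments of one page."""
    return f"{api_url.rstrip('/')}/content/{page_id}/child/attachment"


def attachment_downloads(listing: dict, web_url: str) -> list:
    """Extract (title, download URL) pairs from a decoded listing response.

    Assumption: the relative download link resolves against web_url.
    """
    pairs = []
    for item in listing.get("results", []):
        relative = item["_links"]["download"]
        pairs.append((item["title"], web_url.rstrip("/") + relative))
    return pairs


# Hypothetical response body, shaped like the API's JSON after decoding.
sample_listing = {
    "results": [
        {"title": "pic1.png", "_links": {"download": "/download/attachments/123/pic1.png"}},
    ]
}

endpoint = attachments_endpoint("https://confluence.example.com/rest/api", "123")
downloads = attachment_downloads(sample_listing, "https://confluence.example.com")
```

Each download URL would then be fetched with the same credentials that are used for the listing request.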

Downloaded page tree example:

IPH
├── IPhone
│   ├── IPhone13
│   │   ├── Specification.attachments
│   │   │   ├── pic1.png
│   │   │   ├── pic2.png
│   │   │   └── pic3.png
│   │   └── Specification.pdf
│   ├── IPhone13.pdf
│   ├── IPhone14
│   │   ├── Prices.attachments
│   │   │   └── prices.xlsx
│   │   ├── Prices.pdf
│   │   ├── Specification.attachments
│   │   │   └── pic1.png
│   │   └── Specification.pdf
│   └── IPhone14.pdf
└── IPhone.pdf
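
The layout above can be described as a simple rule: a page's PDF and its `.attachments` folder sit in a directory named after the chain of its parent pages. The sketch below reproduces that rule under a few assumptions: `IPH` is taken to be the space key, `page_targets` is a hypothetical helper rather than a function from the project, and page titles are assumed to be usable as file names without sanitizing.

```python
from pathlib import PurePosixPath


def page_targets(download_path, space, ancestors, title):
    """Return (pdf_path, attachments_dir) for one page.

    ancestors lists the titles of the parent pages, outermost first.
    """
    folder = PurePosixPath(download_path, space, *ancestors)
    return folder / f"{title}.pdf", folder / f"{title}.attachments"


# The "Specification" page under IPhone > IPhone13 from the tree above.
pdf_path, attachments_dir = page_targets(
    "/tmp/out", "IPH", ["IPhone", "IPhone13"], "Specification"
)
```

A top-level page has an empty `ancestors` list, so its PDF is placed directly inside the space folder, as `IPhone.pdf` is in the example tree.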

Requirements

- Python 3.8+
- External dependencies from requirements.txt
- Google Chrome (tested on v121) and the matching WebDriver (for Selenium)

Installation

Clone the repository:

git clone https://github.com/yourusername/your-repository.git

Install the required Python libraries:
```
pip install -r requirements.txt
```
Install Google Chrome and the matching ChromeDriver if they are not already installed:
- Google Chrome
- Webdriver

Configuration

Fill in the config.yaml file in the project directory using the following format:

username: your_confluence_username
password: your_confluence_password
api_url: https://your-confluence-instance-url/rest/api
space: your_confluence_space_key
download_path: /path/to/download/directory
with_attachments: true
lazy_mode: true
web_url: https://your-confluence-instance-url
user_data_dir: /path/to/chrome/user/data
profile_directory: Default

Fill in the necessary information:
- username: Your Confluence username.
- password: Your Confluence password.
- api_url: The URL of your Confluence API.
- space: The key of the Confluence space you want to download.
- download_path: The directory where downloaded pages and attachments will be saved.
- with_attachments: Set to true if you want to download attachments along with pages.
- lazy_mode: Set to true if you want to enable lazy mode (adds a delay between downloads).
- web_url: The URL of your Confluence instance.
- user_data_dir: The path to the Chrome user data directory for Selenium.
- profile_directory: The Chrome profile directory for Selenium.
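
As a rough illustration of how these settings might be consumed, the sketch below loads the file, checks that every key is present, and turns the two Chrome settings into command-line arguments. It makes several assumptions: PyYAML is one of the dependencies in requirements.txt, all ten keys are mandatory, and the helper names are hypothetical. `--user-data-dir` and `--profile-directory` are standard Chrome switches.

```python
REQUIRED_KEYS = (
    "username", "password", "api_url", "space", "download_path",
    "with_attachments", "lazy_mode", "web_url",
    "user_data_dir", "profile_directory",
)


def missing_keys(config: dict) -> list:
    """Return the required keys that are absent from the config."""
    return [key for key in REQUIRED_KEYS if key not in config]


def chrome_arguments(config: dict) -> list:
    """Build the Chrome switches that select the user data dir and profile."""
    return [
        f"--user-data-dir={config['user_data_dir']}",
        f"--profile-directory={config['profile_directory']}",
    ]


def load_config(path: str = "config.yaml") -> dict:
    """Read config.yaml and fail early if any key is missing."""
    import yaml  # PyYAML; assumed to be installed from requirements.txt

    with open(path, encoding="utf-8") as handle:
        config = yaml.safe_load(handle) or {}
    missing = missing_keys(config)
    if missing:
        raise ValueError(f"config.yaml is missing: {', '.join(missing)}")
    return config


# Example with an in-memory config instead of a file.
example = {"user_data_dir": "/path/to/chrome/user/data", "profile_directory": "Default"}
arguments = chrome_arguments(example)
```

With Selenium, each of these strings can be passed to `Options.add_argument` from `selenium.webdriver.chrome.options` before the driver is created.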

Usage

Run the confluence-snapshot.py script:

python confluence-snapshot.py

The script will start downloading pages and attachments from the specified Confluence space according to the configuration provided in config.yaml.
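
The effect of `lazy_mode` on a run can be pictured as a download loop that pauses between pages. This is only a sketch: the one-second default delay, the `download_all` name and the callback-style `download_page` argument are assumptions, and the script's actual loop may differ.

```python
import time


def download_all(pages, download_page, lazy_mode=False,
                 delay_seconds=1.0, sleep=time.sleep):
    """Download every page, pausing between downloads when lazy_mode is on."""
    done = []
    for index, page in enumerate(pages):
        if lazy_mode and index > 0:
            sleep(delay_seconds)  # throttle requests to the Confluence server
        download_page(page)
        done.append(page)
    return done


# Dry run: record calls instead of downloading or actually sleeping.
downloaded, pauses = [], []
result = download_all(
    ["IPhone", "IPhone13", "IPhone14"],
    downloaded.append,
    lazy_mode=True,
    delay_seconds=2,
    sleep=pauses.append,
)
```

Passing `sleep` as a parameter keeps the loop easy to test without waiting for real delays.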
