🌐 Website Downloader

Website Downloader is a powerful Python script designed to download entire websites along with all their assets. This tool allows you to create a local copy of a website, including HTML pages, images, CSS, JavaScript files, and other resources. It ensures that all internal links and resources are downloaded for offline viewing. Perfect for web archiving and offline browsing!

✨ Features

🔄 Recursively download entire websites.
💾 Save HTML pages with all linked assets.
🔗 Convert links for offline viewing.
🖼️ Handle various file types including images, CSS, and JavaScript.
📜 Log download progress and errors.
✔️ Verify the completeness of the downloaded website and re-download missing files if necessary.

📥 Installation

Clone the repository:

git clone https://github.com/PKHarsimran/website-downloader.git
cd website-downloader

Install the required dependencies:
```
 pip install -r requirements.txt
```

🚀 Usage

Run the website downloader script:
```
python website-downloader.py
```
Follow the prompts to enter the URL of the website to download and the destination folder.
After the download is complete, you will be prompted to check if the website is correctly downloaded.

🛠️ Libraries Used

requests: 🌐 A simple and elegant HTTP library for Python. Used to send HTTP requests to download HTML pages and resources.
wget: 📥 A utility for non-interactive download of files from the web. Used to download resources such as images, CSS, and JavaScript files.
BeautifulSoup: 🍜 A library for parsing HTML and XML documents. Used to extract links to resources from HTML pages.
logging: 📝 A standard Python library for generating log messages. Used to log download progress and errors.
subprocess: ⚙️ A standard Python library to spawn new processes, connect to their input/output/error pipes, and obtain their return codes. Used to run the verification script.
argparse: 🛠️ A standard Python library for parsing command-line arguments. Used in the verification script to handle input parameters.

🗂️ Project Structure

website-downloader.py: The main script for downloading the website and its resources.
check_download.py: The verification script for checking the completeness of the downloaded website.
requirements.txt: A file listing the required dependencies.

🤝 Contributing

Contributions are welcome! Please open an issue or submit a pull request for any improvements or bug fixes.

📜 License

This project is licensed under the MIT License.

Name		Name	Last commit message	Last commit date
Latest commit History 11 Commits
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
check_download.py		check_download.py
requirements.txt		requirements.txt
website-downloader.py		website-downloader.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

🌐 Website Downloader

✨ Features

📥 Installation

🚀 Usage

🛠️ Libraries Used

🗂️ Project Structure

🤝 Contributing

📜 License

About

Releases 1

Packages

Languages

License

PKHarsimran/website-downloader

Folders and files

Latest commit

History

Repository files navigation

🌐 Website Downloader

✨ Features

📥 Installation

🚀 Usage

🛠️ Libraries Used

🗂️ Project Structure

🤝 Contributing

📜 License

About

Topics

Resources

License

Stars

Watchers

Forks

Releases 1

Packages 0

Languages

Packages