# Auto-Scrolling Webpage Capture to PDF

GitHub: 0xarchit/Scroll-To-Pdf

A desktop application built with PyQt6 that automatically scrolls through a webpage, captures screenshots, stitches them, and exports the result as a multi-page PDF.

# Overview

When you need a PDF of an entire webpage (including content below the fold), manual scrolling and screenshotting is tedious. AutoScrollCapturePDF automates:

- Scrolling through the page
- Taking successive screenshots
- Detecting end of page via image similarity
- Concatenating and saving as PDF

Ideal for long articles, reports, dashboards, and documentation pages.

# Features

- Configurable scroll delay, height, and max scroll count
- Automatic end-of-page detection (image similarity threshold)
- Preview stitched result before export
- Primary and fallback PDF export methods (Pillow and img2pdf)
- Dark-themed, responsive PyQt6 GUI
- Cross-platform (Windows/macOS/Linux)

# Installation

Clone this repository:

```
git clone https://github.com/0xarchit/Scroll-To-Pdf.git
cd Scroll-To-Pdf
```

(Optional) Create a virtual environment:

```
python -m venv .venv
.venv\Scripts\activate   # Windows
source .venv/bin/activate  # macOS/Linux
```

Install dependencies:
```
pip install -r requirements.txt
```
Ensure app_icon.ico sits alongside main.py in the project root.

# Usage

# Launching the Application

```
python main.py
```

The main window appears. Adjust settings then click Start Capture.

# Settings Panel

| Setting | Description |
| --- | --- |
| Delay between scrolls | Seconds to wait after each scroll (0.1 - 5.0) |
| Max scrolls (0 = unlimited) | Maximum number of scroll actions before stopping |
| Scroll height (0 = auto) | Vertical pixels per scroll (0 uses default heights) |
| Fullscreen mode (F11) | Toggle auto height for full-screen vs windowed browsing |

# Capture Workflow

1. Click Start Capture. A 3-second countdown runs, then the window minimizes.
2. Focus the browser window during the countdown; the app then scrolls and captures automatically.
3. The status bar displays live updates and the screenshot count.
4. Capture stops at the end of the page or when the max scroll count is reached.
5. The window restores with Preview, Save, and Clear options.

# Previewing

Click Preview Screenshots to open a stitched image of all screenshots. Close preview to return.
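
Stitching for the preview amounts to stacking the screenshots top to bottom on one tall canvas. The sketch below is an assumption about how that could be done with Pillow, not the app's actual code; `vertical_layout` and `stitch_vertical` are hypothetical names.

```python
from typing import List, Sequence, Tuple


def vertical_layout(sizes: Sequence[Tuple[int, int]]) -> Tuple[int, int, List[int]]:
    """Return (canvas_width, canvas_height, y_offsets) for stacking images top to bottom."""
    if not sizes:
        raise ValueError("no screenshots to stitch")
    width = max(w for w, _ in sizes)  # canvas is as wide as the widest screenshot
    offsets, y = [], 0
    for _, h in sizes:
        offsets.append(y)  # each image starts where the previous one ended
        y += h
    return width, y, offsets


def stitch_vertical(images):
    """Paste Pillow images onto one tall RGB canvas (hypothetical helper)."""
    from PIL import Image  # imported lazily; requires Pillow

    width, height, offsets = vertical_layout([img.size for img in images])
    canvas = Image.new("RGB", (width, height), "white")
    for img, y in zip(images, offsets):
        canvas.paste(img, (0, y))
    return canvas
```

Keeping the layout arithmetic separate from the Pillow calls makes the offsets easy to check without opening any images.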

# Exporting to PDF

1. Click Save as PDF.
2. Choose the destination file path (.pdf).
3. The application attempts a Pillow export; on failure, it uses the img2pdf fallback.
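
The primary-then-fallback export could look roughly like the sketch below. It assumes the screenshots are available as image files on disk (the app may keep them in memory instead), and the function names are hypothetical rather than taken from `main.py`.

```python
from typing import Callable, List


def export_with_pillow(image_paths: List[str], pdf_path: str) -> None:
    """Write one PDF page per screenshot using Pillow."""
    from PIL import Image  # lazy import; requires Pillow

    pages = [Image.open(p).convert("RGB") for p in image_paths]
    pages[0].save(pdf_path, "PDF", save_all=True, append_images=pages[1:])


def export_with_img2pdf(image_paths: List[str], pdf_path: str) -> None:
    """Write one PDF page per screenshot using img2pdf."""
    import img2pdf  # lazy import; requires img2pdf

    with open(pdf_path, "wb") as fh:
        fh.write(img2pdf.convert(image_paths))


def save_as_pdf(
    image_paths: List[str],
    pdf_path: str,
    primary: Callable[[List[str], str], None] = export_with_pillow,
    fallback: Callable[[List[str], str], None] = export_with_img2pdf,
) -> str:
    """Try the primary exporter; on any error, use the fallback. Returns which one ran."""
    if not image_paths:
        raise ValueError("no screenshots to export")
    try:
        primary(image_paths, pdf_path)
        return "primary"
    except Exception:
        # Any failure in the primary path triggers the fallback exporter.
        fallback(image_paths, pdf_path)
        return "fallback"
```

Passing the exporters in as arguments keeps the fallback logic testable without either library installed.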

# Architecture

- `main.py`: GUI, settings, thread orchestration
- `CaptureThread`: Runs scrolling & screenshot logic on background thread
- Image Similarity: Grayscale thumbnails compared to detect end of page
- PDF Export: Multi-page via Pillow or img2pdf fallback

# How It Works

1. Initial delay: a 3-second countdown gives you time to switch focus to the browser.
2. Scroll loop:
   - Grab a full-screen screenshot.
   - Compare it with the previous screenshot, using 100×100 grayscale thumbnails.
   - If the two are more than 95% similar, capture the final part and exit.
   - Otherwise, scroll down by the configured height and repeat.
3. Signal UI: the `screenshot_taken`, `status_update`, and `capture_complete` signals update progress in the UI.
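
The control flow of the scroll loop can be sketched without any GUI or screenshot library by passing the side effects in as callables. This is an illustrative assumption, not the code in `CaptureThread`: `capture_loop` and its parameters are hypothetical, `on_status` stands in for the `status_update` signal, and the sketch simply drops the near-duplicate frame where the app captures a "final part".

```python
import time
from typing import Any, Callable, List


def capture_loop(
    grab: Callable[[], Any],
    scroll: Callable[[], None],
    is_similar: Callable[[Any, Any], bool],
    delay: float = 0.5,
    max_scrolls: int = 10,
    on_status: Callable[[str], None] = lambda msg: None,
    sleep: Callable[[float], None] = time.sleep,
) -> List[Any]:
    """Grab, compare, scroll, repeat until the page stops changing or max_scrolls is hit."""
    shots: List[Any] = []
    scrolls = 0
    while True:
        shot = grab()
        # A frame that matches the previous one means scrolling no longer moves the page.
        if shots and is_similar(shots[-1], shot):
            on_status("End of page detected")
            break
        shots.append(shot)
        on_status(f"Captured {len(shots)} screenshot(s)")
        # max_scrolls == 0 means unlimited: only end-of-page detection stops the loop.
        if max_scrolls and scrolls >= max_scrolls:
            on_status("Reached max scrolls")
            break
        scroll()
        scrolls += 1
        sleep(delay)  # give the page time to settle before the next grab
    return shots
```

In a real run, `grab` would wrap a screen-grab call and `scroll` would send the scroll input to the focused browser; here they are left abstract on purpose.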

# Configuration Parameters

| Parameter | Default | Unit | Description |
| --- | --- | --- | --- |
| `delay` | 0.5 | seconds | Wait time between scroll & capture |
| `max_scrolls` | 10 | counts | Zero = infinite until end-of-page detected |
| `manual_height` | 0 | pixels | Override auto scroll height if > 0 |
| `is_fullscreen` | False | boolean | Use fullscreen height (1300px) vs windowed (1245px) |
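
As a way to read the table, the parameters and the height-selection rule can be modelled as a small dataclass. The `CaptureSettings` class and `scroll_height` method are hypothetical names for illustration; only the defaults and the two pixel heights come from the table.

```python
from dataclasses import dataclass

FULLSCREEN_HEIGHT = 1300  # px, fullscreen value from the table
WINDOWED_HEIGHT = 1245    # px, windowed value from the table


@dataclass
class CaptureSettings:
    delay: float = 0.5           # seconds between scroll and capture
    max_scrolls: int = 10        # 0 = unlimited
    manual_height: int = 0       # 0 = auto
    is_fullscreen: bool = False

    def scroll_height(self) -> int:
        """Manual height wins when positive; otherwise pick by window mode."""
        if self.manual_height > 0:
            return self.manual_height
        return FULLSCREEN_HEIGHT if self.is_fullscreen else WINDOWED_HEIGHT
```

For example, `CaptureSettings(manual_height=800, is_fullscreen=True).scroll_height()` returns the manual value, since a positive `manual_height` overrides both automatic heights.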

# Packaging as Executable

This project includes a PyInstaller spec (main.spec):

```
pip install pyinstaller
pyinstaller main.spec
```

The generated executable is written to dist/; build/ holds only intermediate build files.

# Troubleshooting

- Blank screenshots: Ensure the browser window is not minimized and is in focus.
- OCR or hidden UI elements: Use headless capture libraries or adjust scroll offsets.
- Permission errors: On macOS, grant screen recording privileges.
- PDF export fails: Install img2pdf via `pip install img2pdf`.

# FAQ

Q: Can I capture only part of a page?

A: Not currently; feature coming soon (scroll region selection).

Q: Why are some screenshots repeated?

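A: Consecutive screenshots overlap when the scroll height is smaller than the visible page area, or when the page has not finished scrolling before the next capture. Adjust the scroll height or increase the delay.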

Q: How do I change similarity threshold?

A: The threshold is hardcoded to 0.95; to change it, edit `images_are_similar()` in `CaptureThread`.
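
To show what a configurable threshold might look like, here is a minimal pure-Python sketch. It is not the app's `images_are_similar()`: the metric (fraction of pixels within a grayscale tolerance), the `tolerance` parameter, the `>=` comparison, and the signatures are all assumptions made for illustration.

```python
from typing import Sequence


def similarity(a: Sequence[int], b: Sequence[int], tolerance: int = 10) -> float:
    """Fraction of pixels whose grayscale values differ by at most `tolerance`."""
    if len(a) != len(b) or not a:
        raise ValueError("thumbnails must be non-empty and the same size")
    close = sum(1 for x, y in zip(a, b) if abs(x - y) <= tolerance)
    return close / len(a)


def images_are_similar(a: Sequence[int], b: Sequence[int], threshold: float = 0.95) -> bool:
    """True when at least `threshold` of the pixels match within the tolerance."""
    return similarity(a, b) >= threshold
```

With Pillow, a suitable input for each screenshot could be produced with `img.convert("L").resize((100, 100)).tobytes()`, which yields 10,000 grayscale values per thumbnail.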

# Contributing

1. Fork the repo
2. Create a feature branch (`git checkout -b feature/YourFeature`)
3. Commit your changes (`git commit -m 'Add feature'`)
4. Push to your branch (`git push origin feature/YourFeature`)
5. Open a Pull Request

Please follow PEP 8 and include tests for new functionality.

# License

This project is licensed under the MIT License. See LICENSE for details.