Explained
An HTTP proxy is the most common type of proxy server. It understands HTTP at the application layer: when your client sends a request, the proxy can read the URL, headers, and (for plain HTTP) the body, before forwarding the request to the destination. For HTTPS, the proxy uses the CONNECT method to establish a TCP tunnel to the destination, after which it just relays encrypted bytes without seeing the content.
Most commercial proxy services, including residential, ISP, and datacenter providers, expose HTTP proxy endpoints because nearly every HTTP client and library supports them natively. Setting the `HTTP_PROXY` and `HTTPS_PROXY` environment variables, passing `proxies={...}` to Python's `requests`, or setting the `proxy` launch option in Playwright all work out of the box with an HTTP proxy endpoint like `http://user:pass@gate.shifter.io:10000` (Playwright takes the credentials as separate `username` and `password` fields rather than embedded in the URL).
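As a minimal sketch of the `requests` case, the helper below builds the mapping that the `proxies=` argument expects. The function name `build_proxies` is hypothetical, and the host, port, and credentials are just the placeholder values from the example URL above.

```python
from urllib.parse import quote


def build_proxies(host: str, port: int, username: str, password: str) -> dict:
    """Return a mapping in the shape accepted by requests' `proxies=` argument."""
    # Percent-encode the credentials so characters such as '@' or ':'
    # cannot be mistaken for URL delimiters.
    auth = f"{quote(username, safe='')}:{quote(password, safe='')}"
    proxy_url = f"http://{auth}@{host}:{port}"
    # The same HTTP proxy URL is used for both schemes; HTTPS traffic is
    # tunneled through it with CONNECT.
    return {"http": proxy_url, "https": proxy_url}


proxies = build_proxies("gate.shifter.io", 10000, "user", "pass")
print(proxies["https"])  # http://user:pass@gate.shifter.io:10000
```

With `requests` installed, the result can be passed directly, for example `requests.get("https://example.com", proxies=proxies)`.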
The difference between HTTP and SOCKS5 is mostly architectural. An HTTP proxy operates at the application layer and can parse HTTP; a SOCKS5 proxy operates at a lower level (the session layer in OSI terms) and forwards TCP or UDP traffic without interpreting it. For HTTPS scraping the practical difference is small, since both end up tunneling encrypted bytes, and HTTP proxy support is more universal across tooling.
How It Works
For plain HTTP, your client sends the full request to the proxy with an absolute URL in the request line (`GET http://example.com/path HTTP/1.1`). The proxy reads the URL, opens a connection to the destination, forwards the request, and relays the response back. For HTTPS, the client first sends a `CONNECT example.com:443 HTTP/1.1` request to the proxy. The proxy opens a TCP connection to the destination and replies with a `200` status to signal that the tunnel is ready. From that point on, the client and server speak TLS end-to-end through the proxy, which just shuffles encrypted bytes.
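To make the two request shapes concrete, the sketch below builds the bytes a client would send to the proxy first in each case. It is illustrative only: the function name `proxy_request_head` is hypothetical, no network connection is made, and real clients send additional headers.

```python
from urllib.parse import urlsplit


def proxy_request_head(url: str) -> bytes:
    """Build the first request a client sends to an HTTP proxy for `url`."""
    parts = urlsplit(url)
    host = parts.hostname
    if parts.scheme == "https":
        # HTTPS: ask the proxy for a tunnel to host:port. The TLS handshake
        # and the real request follow inside the tunnel.
        port = parts.port or 443
        head = f"CONNECT {host}:{port} HTTP/1.1\r\nHost: {host}:{port}\r\n\r\n"
    else:
        # Plain HTTP: the request line carries the absolute URL so the proxy
        # knows where to forward the request.
        host_header = host if parts.port is None else f"{host}:{parts.port}"
        head = f"GET {url} HTTP/1.1\r\nHost: {host_header}\r\n\r\n"
    return head.encode("ascii")


print(proxy_request_head("http://example.com/path").decode().splitlines()[0])
# GET http://example.com/path HTTP/1.1
print(proxy_request_head("https://example.com/").decode().splitlines()[0])
# CONNECT example.com:443 HTTP/1.1
```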
Authentication usually happens via the `Proxy-Authorization` header (Basic auth, with `username:password` Base64-encoded) or by encoding credentials in the proxy URL (`http://user:pass@host:port`), which the client typically converts into that same header. Geo-targeting and session parameters in commercial services are typically encoded in the username (`customer-USER-country-us-session-12345`).
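The following sketch shows how a Basic `Proxy-Authorization` value is formed and how a targeting username might be assembled. Both function names are hypothetical, and the username layout simply mirrors the example string above; each provider documents its own format, so treat it as an assumption.

```python
import base64


def proxy_authorization(username: str, password: str) -> str:
    """Return the value of a Basic Proxy-Authorization header."""
    # Basic auth is the Base64 encoding of "username:password".
    raw = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


def targeting_username(customer: str, country: str, session: str) -> str:
    """Assemble a username in the layout shown in the text (assumed format)."""
    return f"customer-{customer}-country-{country}-session-{session}"


user = targeting_username("USER", "us", "12345")
print(user)  # customer-USER-country-us-session-12345
print(proxy_authorization("user", "pass"))  # Basic dXNlcjpwYXNz
```

Note that Base64 is an encoding, not encryption, so these credentials are readable by anyone who can observe an unencrypted connection to the proxy.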