How to Use Proxies to Rotate IP Addresses in Python

Chris Prosser

Are you looking for a way to rotate IPs using proxies in Python? Then you are on the right page: I will show you how to get it done and the best practices to follow.

Web scraping and other forms of automation can’t be done on any reasonable scale with a single IP address. After a few requests, your IP address will be blocked, and you will be denied access to the data you need to collect. The practical way to get past the IP tracking and blocking mechanisms built into websites' anti-spam systems is to have multiple IP addresses and rotate them. This lets you send more requests in total than the website allows per IP address, without any single IP crossing the limit that triggers a block.

Proxies offer you a way to get as many IP addresses as you want. You can then rotate between these proxy/IP addresses so you don’t send too many requests with a single IP address within a short period. In this article, I will show you how to use proxies to rotate IP addresses in Python. You will also learn tips on how to best do this and other ways to avoid getting detected and banned.

Proxies As a Source of Multiple IP Addresses — an Overview

When your Internet Service Provider (ISP) connects your device to the Internet, it assigns you a single IP address, and for that period, no other device/user is assigned that IP on the Internet. This is what makes IP addresses good for identification online. However, proxies undermine this form of identification.

Proxies are servers that allow you to route your requests through them to your target site, effectively bypassing IP tracking as you get to use the IP address of the proxy server and not your own. If you have access to multiple proxy servers, it means you now have access to multiple IP addresses that you can use to access the Internet.

Let's see how this will help you in web scraping. Let's say your target site blocks IPs that send more than 50 requests in a minute, but your scraper can send 200 requests per minute. This means that shortly after the 50th request, you will get blocked. However, if you have 10 proxies/IPs and spread the requests evenly, only about 20 requests will be sent through each IP, so you won't exceed the request limit that leads to getting blocked.
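The arithmetic above can be checked with a short sketch. It assumes simple round-robin assignment and uses placeholder addresses from the documentation IP range rather than real proxies; the function name distribute_requests is made up for illustration.

```python
from collections import Counter
from itertools import cycle


def distribute_requests(total_requests, proxies):
    """Assign requests to proxies in round-robin order and count the load per proxy."""
    rotation = cycle(proxies)
    return Counter(next(rotation) for _ in range(total_requests))


# Ten placeholder proxies (documentation-range addresses, not real servers).
proxies = ["203.0.113.{}:8080".format(n) for n in range(1, 11)]
load = distribute_requests(200, proxies)

limit_per_minute = 50
print(max(load.values()))  # requests sent through the busiest proxy -> 20
print(all(count <= limit_per_minute for count in load.values()))  # -> True
```

With 200 requests and 10 proxies, every proxy carries 20 requests, well under the 50-request limit from the example.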

Step-by-Step Guide on How to Rotate IP Addresses in Python

By now, you already know that when I mention rotating IP addresses, proxies are the source of the IP addresses as your ISP wouldn’t provide you more than one IP address per device. In this section of the article, I will guide you on how to write a script in Python that rotates the IP address provided by proxies. Let’s get started.

Step 1: Get Your Proxies Ready

Before you begin coding, you need to have your proxies ready to use. There are some free proxy providers out there whose proxies you can use for testing and unimportant hobby projects. I wouldn’t recommend them for serious projects, as they are unreliable, terribly slow, already blocked by some websites, and can even expose you to security and privacy risks. For this guide, you can use free proxies. I use the ones provided by Geonode, a paid provider that also offers some free proxies (Geonode free proxies).

In real projects, you shouldn’t use them unless you are ready to deal with their unreliable nature and the risks they can expose your project to. I recommend you use commercial proxies that are always online and strong in terms of performance, security, and privacy. Some of the best providers of these are Proxy-Seller, MyPrivateProxy, and Proxy-cheap.

Step 2: Set up Necessary Tools and Libraries

Since I will be using Python in this guide, it is the first tool you should install, as everything else depends on it. Depending on your operating system, Python may already be installed. You can verify by running “python --version” or “python3 --version” in the command line or Terminal. If the version reported is Python 2, note that it exists only for legacy reasons; you need version 3. If you don’t have that, install it from the official Python website.

Next, you need your libraries/framework for web scraping. Since this is just a guide, using a framework would be overkill. Instead, I will use Requests (“HTTP for Humans”), a library for sending web requests. Python’s standard library includes urllib for the same job, but this third-party library is easier to use and makes errors easier to handle. Because it is third-party, you need to install it with pip (“pip install requests”). I will be accessing an API in this guide, so I don’t need to install a parsing library.

Step 3: Create a New Empty File and Import the Necessary Libraries

I am using an IDE for this guide, but any plain-text editor such as Notepad will do. Create an empty file and name it guide.py, then import the necessary libraries. In this case, they are the Requests library plus a few modules from the standard library.

import json
import random

import requests

Step 4: Retrieve Proxies or Define Them

You need to decide how you want to import the proxies to use. There are multiple options available to you. Some of the providers offer you an API to retrieve proxies. For some, you will need to download the proxies either as a JSON, CSV, or TXT file. In some cases, you will need to copy the proxies and add them to your script manually.

  • Retrieving IPs via an API

Below is a sample code for retrieving proxies via API or URL. I used the free proxies from Geonode.

# get proxies from URL/API
proxies_request = requests.get("https://proxylist.geonode.com/api/proxy-list?limit=500&page=1&sort_by=lastChecked&sort_type=desc", timeout=10)
parsed_proxies = json.loads(proxies_request.text)["data"]
# keep only "ip:port" for each proxy
api_proxies = ["{}:{}".format(i["ip"], i["port"]) for i in parsed_proxies]

As you can see above, I used a list comprehension to iterate through the proxies and collect only the information I needed, which is the proxy IP and port. The response contains other information, such as location and much more, but the IP and port are enough for what I want to do in this guide.
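If some entries in the response are incomplete, the list comprehension will raise a KeyError or produce unusable addresses. Here is a minimal sketch of a more defensive parser. It assumes only the "data", "ip", and "port" fields used above; the sample payload and the function name extract_proxies are made up for illustration.

```python
import json


def extract_proxies(payload_text):
    """Return "ip:port" strings from a payload shaped like {"data": [{"ip": ..., "port": ...}]}."""
    entries = json.loads(payload_text).get("data", [])
    result = []
    for entry in entries:
        ip = entry.get("ip")
        port = str(entry.get("port", ""))
        # Skip entries with a missing IP or a port that is not a plain number.
        if ip and port.isdigit():
            result.append("{}:{}".format(ip, port))
    return result


# Hypothetical sample payload with one good entry and two broken ones.
sample = json.dumps({"data": [
    {"ip": "203.0.113.10", "port": "8080", "country": "US"},
    {"ip": "203.0.113.11", "port": "n/a"},
    {"port": "3128"},
]})
print(extract_proxies(sample))  # -> ['203.0.113.10:8080']
```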

  • Using IPs from Downloaded Files

I downloaded the free proxies from the Geonode site with the file name Free_Proxy_List.txt and copied it to the same folder as the guide.py file. Below is how I retrieve only the proxy address and port.

txt_proxies = []
with open("Free_Proxy_List.txt", "r") as file:
    lines = file.readlines()
    for line in lines[1:]:  # skip the header row
        # split into fields, then strip whitespace and the surrounding double quotes
        fields = [field.strip().replace('"', "") for field in line.split(",")]
        # keep only rows that have a numeric port in the eighth column
        if len(fields) > 7 and fields[7].isnumeric():
            txt_proxies.append("{}:{}".format(fields[0], fields[7]))

I read the txt file, in which each line holds the details of one proxy. I use the split method to separate the details of each proxy, the strip method to remove whitespace, and replace to remove the double quotes (") surrounding each detail. Some of the proxies have non-numeric strings as ports, so I had to filter them out.
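Splitting on commas breaks if a quoted field itself contains a comma. As an alternative sketch, Python's csv module handles the quoting for you. The column positions (0 for the address, 7 for the port) follow the file described above; the header names and rows in the sample are invented for illustration.

```python
import csv
import io


def parse_proxy_csv(lines, ip_col=0, port_col=7):
    """Read quoted, comma-separated proxy rows and return "ip:port" strings."""
    reader = csv.reader(lines)
    next(reader, None)  # skip the header row
    found = []
    for row in reader:
        if len(row) <= max(ip_col, port_col):
            continue  # row is too short to contain both columns
        ip, port = row[ip_col].strip(), row[port_col].strip()
        if ip and port.isdigit():
            found.append("{}:{}".format(ip, port))
    return found


# Hypothetical file contents; only the column positions mirror the real file.
sample = io.StringIO(
    '"ip","c1","c2","c3","c4","c5","c6","port"\n'
    '"203.0.113.10","a","b","c","d","e","f","8080"\n'
    '"203.0.113.11","a","b, with comma","c","d","e","f","3128"\n'
    '"203.0.113.12","a","b","c","d","e","f","unknown"\n'
)
print(parse_proxy_csv(sample))  # -> ['203.0.113.10:8080', '203.0.113.11:3128']
```

To use it on the downloaded file, open the file with newline="" and pass the file object to parse_proxy_csv.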

  • Using a Proxy List

Sometimes, all you need to do is get the proxy details yourself and create a list of strings, each string holding the proxy address and port. You can also include the username and password if the proxy requires authentication. In our case, we are using free proxies, so there is no need for that.

proxy_list = ['107.180.101.226:41675', '154.65.39.8:80', '38.54.71.67:80', '212.107.29.43:80']
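For proxies that do require authentication, Requests accepts the credentials inside the proxy URL, in the form scheme://user:password@host:port. The sketch below builds such a URL; the helper name build_proxy_url, the address, and the credentials are all placeholders. Percent-encoding the credentials is a precaution for passwords that contain characters such as "@" or ":".

```python
from urllib.parse import quote


def build_proxy_url(host, port, username=None, password=None, scheme="http"):
    """Build a proxy URL, embedding percent-encoded credentials when both are given."""
    if username and password:
        credentials = "{}:{}@".format(quote(username, safe=""), quote(password, safe=""))
    else:
        credentials = ""
    return "{}://{}{}:{}".format(scheme, credentials, host, port)


# Placeholder host and credentials, for illustration only.
proxy_url = build_proxy_url("203.0.113.10", 8080, "user", "p@ss:word")
proxies = {"http": proxy_url, "https": proxy_url}
print(proxy_url)  # -> http://user:p%40ss%3Aword@203.0.113.10:8080
```

The resulting dictionary can be passed as the proxies argument of a Requests call or assigned to a session's proxies attribute.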

Step 5: Create a Function to Return a Random IP from the List

Below is my custom function. It chooses from the list of IPs at random and does not keep track of the IPs it has already assigned you. You can write your own version that keeps a list of already assigned IPs to make it more robust and give you better rotation. Take a cue from my function below.

def get_random_ip(full_proxy_list):
    # pick one "ip:port" entry at random; the same proxy serves both schemes
    proxy = random.choice(full_proxy_list)
    # Requests expects proxy URLs to include a scheme; these are assumed to be HTTP proxies
    proxy_url = "http://{}".format(proxy)
    return {
        "http": proxy_url,
        "https": proxy_url,
    }
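One way to keep track of already assigned IPs is to hand out every proxy once, in random order, before any proxy repeats. The class below is a minimal sketch of that idea, not the only way to do it; the name ProxyRotator is made up, and the proxies are assumed to be HTTP proxies given as "ip:port" strings.

```python
import random


class ProxyRotator:
    """Hand out every proxy once, in random order, before any proxy repeats."""

    def __init__(self, proxy_list):
        self._all = list(dict.fromkeys(proxy_list))  # drop duplicates, keep order
        if not self._all:
            raise ValueError("proxy_list must not be empty")
        self._queue = []
        self._last = None

    def next_proxy(self):
        if not self._queue:
            # Start a new pass over the full list in a fresh random order.
            self._queue = random.sample(self._all, len(self._all))
            # Avoid handing out the same proxy twice in a row across two passes.
            if len(self._queue) > 1 and self._queue[-1] == self._last:
                self._queue[0], self._queue[-1] = self._queue[-1], self._queue[0]
        self._last = self._queue.pop()
        proxy_url = "http://{}".format(self._last)
        return {"http": proxy_url, "https": proxy_url}


rotator = ProxyRotator(['107.180.101.226:41675', '154.65.39.8:80', '38.54.71.67:80', '212.107.29.43:80'])
for _ in range(4):
    print(rotator.next_proxy()["http"])
```

Each call to next_proxy returns a dictionary in the same shape as get_random_ip, so it can be assigned to session.proxies in the same way.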

Step 6: Create a Function to Send a Request

With the proxies ready and a function to choose one at random for you, you can then start scraping. Below is an example of how to get it done. In my case, I sent a request to the httpbin API that responds with the IP address of the client.

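def send_web_request():
    # each call picks a fresh random proxy, so consecutive calls rotate IPs
    session = requests.Session()
    session.proxies = get_random_ip(api_proxies)
    # a timeout keeps a dead or slow proxy from hanging the script
    response = session.get("https://httpbin.org/ip", timeout=10).text
    print(response)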

Full Code for Rotating IP Addresses with Proxies

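import json
import random

import requests

# Option 1: get proxies from URL/API
proxies_request = requests.get("https://proxylist.geonode.com/api/proxy-list?limit=500&page=1&sort_by=lastChecked&sort_type=desc", timeout=10)
parsed_proxies = json.loads(proxies_request.text)["data"]
api_proxies = ["{}:{}".format(i["ip"], i["port"]) for i in parsed_proxies]

# Option 2: read proxies from a downloaded file
txt_proxies = []
with open("Free_Proxy_List.txt", "r") as file:
    lines = file.readlines()
    for line in lines[1:]:  # skip the header row
        # split into fields, then strip whitespace and the surrounding double quotes
        fields = [field.strip().replace('"', "") for field in line.split(",")]
        # keep only rows that have a numeric port in the eighth column
        if len(fields) > 7 and fields[7].isnumeric():
            txt_proxies.append("{}:{}".format(fields[0], fields[7]))

# Option 3: define the proxies by hand
proxy_list = ['107.180.101.226:41675', '154.65.39.8:80', '38.54.71.67:80', '212.107.29.43:80']

def get_random_ip(full_proxy_list):
    # pick one "ip:port" entry at random; the same proxy serves both schemes
    proxy = random.choice(full_proxy_list)
    # Requests expects proxy URLs to include a scheme; these are assumed to be HTTP proxies
    proxy_url = "http://{}".format(proxy)
    return {
        "http": proxy_url,
        "https": proxy_url,
    }

def send_web_request():
    # each call picks a fresh random proxy, so consecutive calls rotate IPs
    session = requests.Session()
    session.proxies = get_random_ip(api_proxies)
    # a timeout keeps a dead or slow proxy from hanging the script
    response = session.get("https://httpbin.org/ip", timeout=10).text
    print(response)

send_web_request()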

Tips to Make IP Rotation More Effective and Efficient

At best, the code you see above will just work. It does not handle any form of error or exception and would not be good for use in a production environment. Below are some tips you should follow to make your IP rotation system as robust and effective as possible.

  • Constantly Monitor your Proxy List

You need a system, probably a script run by a scheduler, that constantly monitors all of the proxies in your list. Without it, you will waste time retrying requests through dead proxies. In some instances, proxies can also leak your real IP address. Regularly send test requests through each proxy and make sure it is not just working but also remains highly anonymous. You can also integrate speed-testing logic so that proxies that become too slow are kept out of your list.
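A monitoring pass could look roughly like the sketch below. It assumes HTTP proxies given as "ip:port" strings, uses the httpbin endpoint from earlier (which reports the caller's address in its "origin" field), and treats a proxy as leaking when your real IP shows up in that field. The function names, the 3-second speed threshold, and the fetch parameter are all illustrative choices; fetch exists so the check can be exercised with a stand-in function.

```python
import time


def fetch_ip_via_proxy(proxy, timeout=5):
    """Ask httpbin which IP address it sees when the request goes through `proxy`."""
    import requests  # third-party; imported here so the helpers below work without it

    proxy_url = "http://{}".format(proxy)
    response = requests.get(
        "https://httpbin.org/ip",
        proxies={"http": proxy_url, "https": proxy_url},
        timeout=timeout,
    )
    response.raise_for_status()
    return response.json()["origin"]


def check_proxy(proxy, real_ip, fetch=fetch_ip_via_proxy, max_seconds=3.0):
    """Report whether a proxy is alive, fast enough, and hiding `real_ip`."""
    started = time.monotonic()
    try:
        seen = fetch(proxy)
    except Exception:
        # Deliberately broad: any failure means the proxy is unusable right now.
        return {"proxy": proxy, "alive": False, "fast": False, "leaks": False}
    elapsed = time.monotonic() - started
    seen_ips = [part.strip() for part in seen.split(",")]
    return {
        "proxy": proxy,
        "alive": True,
        "fast": elapsed <= max_seconds,
        "leaks": real_ip in seen_ips,
    }


def healthy_proxies(proxies, real_ip, fetch=fetch_ip_via_proxy):
    """Keep only proxies that are alive, fast, and not leaking the real IP."""
    reports = [check_proxy(proxy, real_ip, fetch) for proxy in proxies]
    return [r["proxy"] for r in reports if r["alive"] and r["fast"] and not r["leaks"]]
```

Calling healthy_proxies(api_proxies, real_ip="<your real IP>") on a schedule, for example from cron, and replacing your working list with the result keeps dead, slow, and leaking proxies out of rotation.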

  • Use IPs from Multiple Subnets and Networks

If you purchase IPs from the same proxy provider, the chances of getting all or most of them from the same subnet are high. At first, there is no problem with using multiple IPs from the same subnet. However, when a website discovers that several of your IPs are being used for automated access to its platform, it can block the whole subnet, and when a subnet is blocked, even IPs that haven’t been used yet will be blocked from accessing that site.

When purchasing proxies, always request that the IPs be drawn from multiple subnets. Better still, purchase the IPs from different providers. Doing this ensures you have enough subnet diversity to survive a subnet ban.
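To see how diverse a list really is, you can group the proxies by subnet. The sketch below uses the standard ipaddress module and assumes IPv4 "ip:port" strings and a /24 grouping; websites may block wider or narrower ranges. The last address in the sample list is a hypothetical extra entry added to show two proxies sharing a subnet.

```python
import ipaddress
from collections import defaultdict


def group_by_subnet(proxies, prefix=24):
    """Group "ip:port" strings by the network that contains each IPv4 address."""
    groups = defaultdict(list)
    for proxy in proxies:
        host = proxy.rsplit(":", 1)[0]
        # strict=False lets us pass a host address and get its enclosing network back.
        network = ipaddress.ip_network("{}/{}".format(host, prefix), strict=False)
        groups[str(network)].append(proxy)
    return dict(groups)


proxy_list = [
    '107.180.101.226:41675', '154.65.39.8:80', '38.54.71.67:80', '212.107.29.43:80',
    '38.54.71.90:8080',  # hypothetical extra proxy in the same /24 as 38.54.71.67
]
for subnet, members in group_by_subnet(proxy_list).items():
    print(subnet, len(members))
```

Any subnet that holds a large share of your proxies is a single point of failure if that subnet gets banned.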

  • IP Assignment Should Not be Absolutely Random

There is a problem with randomness: the same IP address can be assigned to you over and over again, especially if the selection pool is small. If you leave things to randomness, you will find yourself getting the same IP reassigned. Instead, you need a system that keeps a history of IP addresses that have already been assigned. You also need to keep track of each IP's subnet and location. The goal is a systematic but still random assignment engine that doesn’t reuse IPs so frequently that any of them exceeds the request limit per IP. You will also want consecutive assignments to come from different subnets to keep things more natural and less suspicious.
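A rough sketch of such an assignment engine follows. It still picks at random, but only among proxies that have not used up a per-IP request budget in the current time window, and it prefers a proxy from a different subnet than the previous pick. The class name, the 50-requests-per-60-seconds defaults, and the crude /24 grouping by the first three octets are assumptions for illustration; location tracking is left out.

```python
import random
import time
from collections import defaultdict, deque


class BudgetedRotator:
    """Random selection with a per-proxy request budget and subnet alternation."""

    def __init__(self, proxies, max_per_window=50, window_seconds=60, clock=time.monotonic):
        self._proxies = list(proxies)
        self._max = max_per_window
        self._window = window_seconds
        self._clock = clock
        self._history = defaultdict(deque)  # proxy -> timestamps of its recent uses
        self._last_subnet = None

    @staticmethod
    def _subnet(proxy):
        # Crude /24 key for IPv4 "ip:port" strings: the first three octets.
        return proxy.rsplit(":", 1)[0].rsplit(".", 1)[0]

    def _recent_uses(self, proxy, now):
        uses = self._history[proxy]
        # Forget uses that have fallen out of the time window.
        while uses and now - uses[0] >= self._window:
            uses.popleft()
        return len(uses)

    def pick(self):
        now = self._clock()
        available = [p for p in self._proxies if self._recent_uses(p, now) < self._max]
        if not available:
            raise RuntimeError("every proxy has used up its request budget for this window")
        # Prefer a proxy from a different subnet than the previous pick.
        preferred = [p for p in available if self._subnet(p) != self._last_subnet]
        proxy = random.choice(preferred or available)
        self._history[proxy].append(now)
        self._last_subnet = self._subnet(proxy)
        return proxy
```

When pick raises RuntimeError, the caller can wait for the window to roll over or add more proxies, rather than pushing an IP past the limit.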

  • Handle Exceptions

When you write code or scripts, keep in mind that things will not always go as expected, and define how your code should behave in such cases. This is known as error handling or exception handling. You can’t possibly catch every exception, but there are some key cases you should handle. What happens when your list of IPs/proxies is empty? How should the code behave if you get blocked? And what happens if your proxies become slow? There are more cases you need to handle to build a more robust system.
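The three questions above can be handled with a retry loop along these lines. This is a sketch under several assumptions: HTTP proxies as "ip:port" strings, status codes 403 and 429 treated as signs of a block, and a timeout standing in for "too slow". The names NoProxiesLeft, default_send, and fetch_with_rotation are made up. Requests' own exceptions (timeouts, proxy errors, connection errors) derive from OSError, which is why catching OSError covers them here.

```python
import random


class NoProxiesLeft(Exception):
    """Raised when the pool is empty or every attempted proxy has failed."""


def default_send(url, proxy, timeout=10):
    """Send one GET request through `proxy` and return (status_code, body)."""
    import requests  # third-party; imported here so the retry logic works without it

    proxy_url = "http://{}".format(proxy)
    response = requests.get(url, proxies={"http": proxy_url, "https": proxy_url}, timeout=timeout)
    return response.status_code, response.text


def fetch_with_rotation(url, proxies, send=default_send, max_attempts=3, blocked_statuses=(403, 429)):
    """Try up to `max_attempts` different proxies until one returns a usable response."""
    pool = list(proxies)
    if not pool:
        raise NoProxiesLeft("the proxy list is empty")
    random.shuffle(pool)
    for proxy in pool[:max_attempts]:
        try:
            status, body = send(url, proxy)
        except OSError:
            continue  # dead or slow proxy (connection error or timeout): try the next one
        if status in blocked_statuses:
            continue  # looks like a block or rate limit: try another IP
        return body
    raise NoProxiesLeft("no proxy returned a usable response")
```

A call such as fetch_with_rotation("https://httpbin.org/ip", api_proxies) then either returns the response body or raises NoProxiesLeft, which the calling code can log or use as a signal to refresh the proxy list.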

A Better Way to Handle IP Rotation with Proxies

In most cases, there is no point in rotating IPs yourself. It will cost you more, it requires effort to manage the rotation system, and it will still not be as effective as a done-for-you solution. I recommend you ditch the idea of rotating IPs yourself and embrace rotating proxies, more specifically rotating residential proxies. With this kind of proxy, what you get is a single proxy endpoint that you configure once.

Anytime you send a web request, the provider will choose a random IP address to assign to you. The best providers have IPs from all over the world and offer you reliable and fast Internet connectivity while hiding the fact that you are using a proxy. The IPs are residential, making them difficult to detect as proxies. Most of these providers also have support for session management which will retain an IP address for a limited period before rotating to a different IP. Currently, the best rotating residential proxy services are Bright Data and Smartproxy. If you are on a tight budget, you can consider using Proxy-cheap which is also a good choice.

Conclusion

From the above, you can see that even though it is possible and easy to rotate IPs with proxies using Python, I recommend you don’t do it, because of the financial and management costs attached. Instead, I recommend you use a rotating residential proxy, as IP rotation is automatic and management is minimal since you have just one proxy endpoint to use. My recommendation is Bright Data or Smartproxy, but if you are on a tight budget, Proxy-cheap is a good choice.

FAQs

IP rotation is not illegal, just as masking your real IP address is not illegal. However, many websites do not allow it, and if you are discovered, you will get blocked on suspicion, or even be told outright that it is because you are using a proxy. To rotate IPs successfully, you need IPs that are not detectable as proxies; if your IP gets detected as a proxy, you will get blocked. Also, while rotating IPs is not illegal, using rotation for illegal tasks can still get you into legal trouble.

Rotating residential proxies are the best for IP rotation. This is because IP rotation for them is automatic, with support for customisation depending on your requirements. The IPs also enjoy a high level of legitimacy and acceptance compared to datacenter proxies. Currently, Bright Data is the best provider of this kind of proxy. Their pricing has become a lot cheaper, as you can purchase 1GB for $8. However, there are even cheaper options, such as Proxy-cheap at $5 per GB.

Free proxies are not reliable, and they can go offline without notice. One might think that a bunch of free proxies will do the trick, as the script can keep trying a new proxy until one goes through. The danger is that you will waste a lot of time, and even when you eventually get a working proxy, you don’t know how fast it is or what its history on your target site is. Eventually, you might not only end up wasting your time; the bad proxies can actually ruin your project.
