Optimizations for Popular Libraries¶
aiohttp¶
Install it as aiohttp[speedups] (in some cases, this might require a local C compiler toolchain to be available).
Use the AsyncResolver in a custom connector for your ClientSessions.
Requires aiodns, which is installed by the [speedups] option.
Because it hits your resolver directly, customizations in your OS’ host name resolution path (like /etc/hosts and LMHOSTS) won’t work.
async_resolver = aiohttp.AsyncResolver()
async with aiohttp.ClientSession(
    connector=aiohttp.TCPConnector(resolver=async_resolver)
) as http:
    await http.get('https://justinarthur.com/')
On CPython, pass a custom loads function from ujson or python-rapidjson when decoding JSON responses.
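As a rough sketch of the loads tip: aiohttp's `ClientResponse.json()` accepts a `loads` callable, so a faster decoder can be passed on each call. The helper name `fetch_json` and the fallback to the standard-library `json` module are illustrative assumptions, not something the original text prescribes.

```python
import json

# Prefer ujson when it is installed; fall back to the standard library
# so the sketch still runs without it (the fallback is an assumption).
try:
    import ujson
    fast_loads = ujson.loads
except ImportError:
    fast_loads = json.loads


async def fetch_json(session, url):
    """Fetch `url` with an existing aiohttp.ClientSession and decode the body."""
    async with session.get(url) as response:
        # ClientResponse.json() takes the decoder via its `loads` keyword.
        return await response.json(loads=fast_loads)
```

The session is passed in rather than created inside the helper, so one ClientSession (and its connector) can be shared across many calls.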
boto3¶
Serializing and deserializing numbers between Python and DynamoDB is slow when using boto3’s DynamoDB “resource” features. If you end up working with numbers a lot, consider using the raw DynamoDB “client” features instead of the “resource” ones.
This means constructing the raw DynamoDB item structure yourself in dicts. For example, a serializer like

def serialize_number(value):
    return {"N": str(value)}

is about 2x faster than boto3’s built-in serializer for numbers, and it works for floats without needing an intermediate Decimal conversion.
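To make the client-style item structure concrete, here is a minimal sketch that builds a raw item by hand. The table name `scores`, the attribute names, and the helper `build_item` are hypothetical; the commented-out `put_item` call only indicates where the dict would be handed to a boto3 DynamoDB client.

```python
def serialize_number(value):
    # DynamoDB's wire format carries numbers as strings under the "N" key.
    return {"N": str(value)}


def build_item(player_id, score, accuracy):
    """Build a raw DynamoDB item dict (hypothetical attribute names)."""
    return {
        "player_id": {"S": player_id},           # string attribute
        "score": serialize_number(score),        # int
        "accuracy": serialize_number(accuracy),  # float, no Decimal round-trip
    }


item = build_item("p-1", 42, 0.75)

# With a boto3 client (not created here), the item would be written as:
# client = boto3.client("dynamodb")
# client.put_item(TableName="scores", Item=item)
```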
Caveats:
You lose some value-checking performed by boto3’s built-in serializer. Be careful to only supply this serializer with numbers you know will conform to DynamoDB’s limitations (e.g. 38 digits of precision).
DynamoDB’s Number type was intended for storing exact numbers, which is why boto3’s built-in serializer doesn’t accept floats. Floats use a binary fraction behind the scenes to approximate a value, and there’s no way for boto3 to know what exact value you’re trying to approximate. If you were to pass it 1.3, it wouldn’t know if you meant 1.3 or 1.3000000000000000444089209850062616169452667236328125; the same float represents either. By simply doing str(a_float_value), you’ll get a DynamoDB Number equal to the shortest decimal that the float could represent (e.g. 1.3 in the aforementioned case). Make sure that’s what you want.
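The float caveat can be checked with the standard library alone. This sketch compares the exact value held by a float with what `str()` produces; nothing in it is specific to DynamoDB or boto3.

```python
from decimal import Decimal

value = 1.3

exact = Decimal(value)          # the exact binary value the float holds
shortest = Decimal(str(value))  # the value str() would send to DynamoDB

print(exact)     # a long decimal slightly above 1.3
print(shortest)  # 1.3

# The two differ, so str() silently picks the "short" interpretation.
assert exact != shortest
```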
A similar speed-up is available on deserialization from DynamoDB if you know ahead of time what type of number you stored in DynamoDB (e.g. int, float, Decimal). The built-in deserializer always produces a Decimal, as this is the safest way to convey the exact value of the DynamoDB Number without making assumptions.
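A deserialization counterpart might look like the following sketch. The function names are hypothetical, and it assumes each attribute is the raw `{"N": "..."}` dict returned by the DynamoDB client and that you already know which Python type was stored.

```python
from decimal import Decimal


def deserialize_int(attribute):
    # Only safe when the stored Number is known to be an integer.
    return int(attribute["N"])


def deserialize_float(attribute):
    # Accepts that the result is the nearest float, not the exact Number.
    return float(attribute["N"])


def deserialize_decimal(attribute):
    # Exact conversion, for when no assumption about the type can be made.
    return Decimal(attribute["N"])


raw_item = {"score": {"N": "42"}, "accuracy": {"N": "0.75"}}
score = deserialize_int(raw_item["score"])
accuracy = deserialize_float(raw_item["accuracy"])
```

Choosing the converter per attribute keeps the type assumption explicit at the call site instead of hiding it in a generic deserializer.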
requests¶
Requests was made to be easy, quick, and concise. If you’re willing to make your code a bit more verbose, you can squeeze better performance out of some situations.
If you will be making requests to the same host or hosts repeatedly, create a session object and use it for all requests.
import requests

visited = set()
to_visit = {'https://thespecial.place/'}
with requests.Session() as http:
    while to_visit:
        url = to_visit.pop()
        visited.add(url)  # mark first so a self-link is not queued again
        response = http.get(url)
        # the response needs to be read (e.g. with .content) or closed
        # for its connection to be released to the pool for fast re-use
        # extract_urls is a placeholder for your own link extraction
        new_urls = set(extract_urls(response.content))
        to_visit |= new_urls - visited